Machine Learning Functions#

Surfaces provides test functions based on real machine learning models. These functions offer realistic hyperparameter optimization landscapes derived from actual model training tasks.

Overview#

ML-based test functions evaluate the performance of machine learning models with given hyperparameters. Unlike mathematical functions with known global optima, these functions represent realistic optimization challenges.

Category

Count

Description

Tabular Classification

5

Classification models on tabular data

Tabular Regression

5

Regression models on tabular data

Image Classification

5

Image classification models

Time Series

6

Time series classification and forecasting

Why ML-Based Functions?#

Traditional mathematical test functions are useful but don’t capture the characteristics of real hyperparameter optimization:

  • Noise: Real ML evaluations have variance from data splits

  • Discrete parameters: Many hyperparameters are categorical

  • Complex interactions: Parameters often interact non-linearly

  • Expensive: Real training takes significant time

Surfaces’ ML functions provide these realistic properties in a standardized, reproducible way.

Tabular Classification#

Functions for optimizing classification model hyperparameters on tabular datasets.

Example: K-Neighbors Classifier#

from surfaces.test_functions.machine_learning import KNeighborsClassifierFunction

# Create the function
func = KNeighborsClassifierFunction()

# Evaluate with hyperparameters
score = func({
    "n_neighbors": 5,
    "weights": "distance",
    "p": 2
})

print(f"Accuracy: {score}")

Hyperparameters:

  • n_neighbors: Number of neighbors (integer)

  • weights: Weight function (‘uniform’ or ‘distance’)

  • p: Power parameter for Minkowski distance

Tabular Regression#

Functions for optimizing regression model hyperparameters on tabular datasets.

Example: Gradient Boosting Regressor#

from surfaces.test_functions.machine_learning import GradientBoostingRegressorFunction

func = GradientBoostingRegressorFunction()

score = func({
    "n_estimators": 100,
    "max_depth": 3,
    "learning_rate": 0.1
})

Hyperparameters:

  • n_estimators: Number of boosting stages

  • max_depth: Maximum tree depth

  • learning_rate: Shrinkage parameter

Image Classification#

Functions for optimizing image classification model hyperparameters.

Time Series Functions#

Functions for optimizing time series models.

Classification#

Forecasting#

Using ML Functions#

Loss vs Score Mode#

ML functions naturally return scores (higher is better). The objective parameter controls the sign:

func = KNeighborsClassifierFunction(objective="maximize")
score = func(params)  # Returns accuracy (0-1)

func = KNeighborsClassifierFunction(objective="minimize")
loss = func(params)  # Returns negative accuracy

Getting the Search Space#

The search space includes both continuous and categorical parameters:

func = KNeighborsClassifierFunction()
space = func.search_space()

print(space.keys())
# dict_keys(['n_neighbors', 'weights', 'p'])

print(space['weights'])
# ['uniform', 'distance']

scipy Integration Limitations#

ML functions with categorical parameters cannot be directly converted to scipy format. Use optimization libraries that support mixed parameter types:

# For ML functions, use libraries like:
# - Hyperactive
# - Optuna
# - scikit-optimize

Performance Considerations#

ML function evaluations involve actual model training, so they’re slower than mathematical functions:

  • KNeighbors: Fast (milliseconds per evaluation)

  • GradientBoosting: Slower (seconds per evaluation)

  • CNN/Deep models: Much slower (requires GPU for practical use)

For benchmarking optimization algorithms, consider:

  1. Using fewer iterations

  2. Testing on faster functions (KNeighbors)

  3. Adding artificial delays to mathematical functions for comparison

Importing Functions#

# Import specific functions
from surfaces.test_functions.machine_learning import (
    KNeighborsClassifierFunction,
    KNeighborsRegressorFunction,
    GradientBoostingRegressorFunction,
)

# Import all ML functions
from surfaces.test_functions.machine_learning import machine_learning_functions

for func_class in machine_learning_functions:
    print(func_class.__name__)