Bayesian optimisation with known observational noise

In this example, NUBO is used for sequential single-point optimisation in situations where the observational noise of the measurements is known. We assume the variance of the observational noise of our black-box function (simulated here by the Hartmann6D function) to be \(\sigma^2 = 0.025\) for all observations. However, we could also specify individual noise levels for each observation. Compared to the case with unknown noise, the Bayesian optimisation loop differs only in the likelihood that we use. Here, we use the FixedNoiseGaussianLikelihood and specify the observation noise variance for each data point (in this example, the same variance for all points), while also allowing the likelihood to estimate any additional noise. The optimisation loop is run for 40 iterations and finds a solution close to the true optimum of -3.3224.
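
For example, if every measurement came with its own known uncertainty, a vector of per-observation noise variances could be passed to the likelihood instead of a single shared value. A minimal sketch (the variances below are purely illustrative):

import torch
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood

# hypothetical per-observation noise variances, one entry per training point
noise_vars = torch.tensor([0.010, 0.025, 0.040, 0.015, 0.030])

# fix a different noise variance for each observation; learn_additional_noise
# lets the likelihood estimate any extra noise on top of the specified values
likelihood = FixedNoiseGaussianLikelihood(noise=noise_vars,
                                          learn_additional_noise=True)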

[1]:
import torch
from nubo.acquisition import ExpectedImprovement, UpperConfidenceBound
from nubo.models import GaussianProcess, fit_gp
from nubo.optimisation import single
from nubo.test_functions import Hartmann6D
from nubo.utils import gen_inputs
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood


# test function
func = Hartmann6D(minimise=False, noise_std=0.025)
dims = 6

# specify bounds
bounds = torch.tensor([[0., 0., 0., 0., 0., 0.], [1., 1., 1., 1., 1., 1.]])

# training data
x_train = gen_inputs(num_points=dims*5,
                     num_dims=dims,
                     bounds=bounds)
y_train = func(x_train)

# Bayesian optimisation loop
iters = 40

for iter in range(iters):

    # specify Gaussian process
    likelihood = FixedNoiseGaussianLikelihood(noise=torch.ones(x_train.size(0))*0.025, learn_additional_noise=True)
    gp = GaussianProcess(x_train, y_train, likelihood=likelihood)

    # fit Gaussian process
    fit_gp(x_train, y_train, gp=gp, likelihood=likelihood, lr=0.1, steps=200)

    # specify acquisition function
    # acq = ExpectedImprovement(gp=gp, y_best=torch.max(y_train))
    acq = UpperConfidenceBound(gp=gp, beta=1.96**2)

    # optimise acquisition function
    x_new, _ = single(func=acq, method="L-BFGS-B", bounds=bounds, num_starts=5)

    # evaluate new point
    y_new = func(x_new)

    # add to data
    x_train = torch.vstack((x_train, x_new))
    y_train = torch.hstack((y_train, y_new))

    # print new best
    if y_new > torch.max(y_train[:-1]):
        print(f"New best at evaluation {len(y_train)}: \t Inputs: {x_new.numpy().reshape(dims).round(4)}, \t Outputs: {-y_new.numpy().round(4)}")

# results
best_iter = int(torch.argmax(y_train))
print(f"Evaluation: {best_iter+1} \t Solution: {-float(y_train[best_iter]):.4f}")

New best at evaluation 31:       Inputs: [0.3889 1.     0.5357 0.5685 0.8369 0.    ],    Outputs: [-2.6202]
New best at evaluation 33:       Inputs: [0.391  1.     0.5254 0.5547 0.8366 0.    ],    Outputs: [-2.6486]
New best at evaluation 39:       Inputs: [0.3754 1.     1.     0.5115 0.     0.    ],    Outputs: [-2.6733]
New best at evaluation 41:       Inputs: [0.3804 1.     1.     0.5882 0.     0.    ],    Outputs: [-2.7625]
New best at evaluation 42:       Inputs: [0.3894 0.9043 1.     0.5801 0.     0.    ],    Outputs: [-3.085]
New best at evaluation 47:       Inputs: [0.4179 0.8636 1.     0.5875 0.     0.    ],    Outputs: [-3.0931]
New best at evaluation 51:       Inputs: [0.4216 0.8661 1.     0.5709 0.     0.    ],    Outputs: [-3.1223]
New best at evaluation 53:       Inputs: [0.4138 0.8594 1.     0.5552 0.     0.    ],    Outputs: [-3.1599]
New best at evaluation 60:       Inputs: [0.4135 0.8788 1.     0.5735 0.     0.039 ],    Outputs: [-3.2312]
Evaluation: 60   Solution: -3.2312