Parallel multi-point joint Bayesian optimisation

In this example, NUBO is used for multi-point optimisation, computing batches of candidate points that can be evaluated on the objective function in parallel. Multi-point optimisation is implemented in NUBO through Monte Carlo acquisition functions. The script below uses the MCUpperConfidenceBound acquisition function with 512 Monte Carlo samples and resamples the base samples at every evaluation (the default). Each batch of 4 points is found jointly with the multi_joint() function by optimising the acquisition function with the stochastic Adam optimiser. We could also fix the base samples in MCUpperConfidenceBound and use a deterministic optimiser, such as L-BFGS-B; a sketch of this variant is given after the example. The Hartmann6D synthetic test function acts as a substitute for a black-box objective function, such as an experiment or a simulation. The optimisation loop is run for 10 iterations, returning a batch of 4 candidate points per iteration (40 evaluations in total), and finds a solution close to the true optimum of -3.3224.

[1]:
import torch
from nubo.acquisition import MCExpectedImprovement, MCUpperConfidenceBound
from nubo.models import GaussianProcess, fit_gp
from nubo.optimisation import multi_joint
from nubo.test_functions import Hartmann6D
from nubo.utils import gen_inputs
from gpytorch.likelihoods import GaussianLikelihood


# test function
func = Hartmann6D(minimise=False)
dims = 6

# specify bounds
bounds = torch.tensor([[0., 0., 0., 0., 0., 0.], [1., 1., 1., 1., 1., 1.]])

# training data
x_train = gen_inputs(num_points=dims*5,
                     num_dims=dims,
                     bounds=bounds)
y_train = func(x_train)

# Bayesian optimisation loop
iters = 10

for _ in range(iters):

    # specify Gaussian process
    likelihood = GaussianLikelihood()
    gp = GaussianProcess(x_train, y_train, likelihood=likelihood)

    # fit Gaussian process
    fit_gp(x_train, y_train, gp=gp, likelihood=likelihood, lr=0.1, steps=200)

    # specify acquisition function
    # acq = MCExpectedImprovement(gp=gp, y_best=torch.max(y_train), samples=512)
    acq = MCUpperConfidenceBound(gp=gp, beta=1.96**2, samples=512)

    # optimise acquisition function
    x_new, _ = multi_joint(func=acq,
                           method="Adam",
                           batch_size=4,
                           bounds=bounds,
                           lr=0.1,
                           steps=200,
                           num_starts=5)

    # evaluate new batch of points
    y_new = func(x_new)

    # add to data
    x_train = torch.vstack((x_train, x_new))
    y_train = torch.hstack((y_train, y_new))

    # print new best if the latest batch improved on all earlier evaluations
    if torch.max(y_new) > torch.max(y_train[:-y_new.size(0)]):
        best_eval = torch.argmax(y_train)
        print(f"New best at evaluation {best_eval+1}: \t Inputs: {x_train[best_eval, :].numpy().reshape(dims).round(4)}, \t Outputs: {-y_train[best_eval].numpy().round(4)}")

# results
best_eval = int(torch.argmax(y_train))
print(f"Evaluation: {best_eval+1} \t Solution: {-float(y_train[best_eval]):.4f}")

New best at evaluation 44:       Inputs: [0.3957 0.8149 0.1248 0.5297 0.0373 0.0101],    Outputs: -2.9052
New best at evaluation 55:       Inputs: [0.4088 0.8735 0.9928 0.5549 0.0142 0.0588],    Outputs: -3.162
New best at evaluation 60:       Inputs: [0.4068 0.8819 0.9951 0.5724 0.004  0.0335],    Outputs: -3.1936
Evaluation: 60   Solution: -3.1936
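
As mentioned above, fixing the base samples makes the Monte Carlo acquisition function deterministic, so a deterministic optimiser can take over from Adam. The snippet below is a minimal sketch of this variant; it assumes the fix_base_samples argument of MCUpperConfidenceBound and the "L-BFGS-B" method of multi_joint(), and would replace the corresponding two steps inside the loop above.

# specify acquisition function with fixed base samples
# (fix_base_samples=True is assumed to freeze the base samples,
# making repeated evaluations of the acquisition function deterministic)
acq = MCUpperConfidenceBound(gp=gp, beta=1.96**2, samples=512, fix_base_samples=True)

# optimise acquisition function with the deterministic L-BFGS-B optimiser
x_new, _ = multi_joint(func=acq,
                       method="L-BFGS-B",
                       batch_size=4,
                       bounds=bounds,
                       num_starts=5)

With fixed base samples, the acquisition function returns the same value for the same inputs, which is what gradient-based deterministic optimisers such as L-BFGS-B require; resampling the base samples at every evaluation, as in the example above, instead calls for a stochastic optimiser such as Adam.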