Parallel multi-point joint Bayesian optimisation
In this example, NUBO is used to perform multi-point optimisation, where each batch of candidate points can be evaluated on the objective function in parallel. Multi-point optimisation is implemented in NUBO through Monte Carlo acquisition functions. The script below uses the MCUpperConfidenceBound
acquisition function with 512 samples and resamples the base samples at every evaluation (the default). Each batch of 4 points is found jointly with the multi_joint()
function by optimising the acquisition function with the
stochastic Adam optimiser. We could also fix the base samples in MCUpperConfidenceBound
and use a deterministic optimiser, such as L-BFGS-B; a sketch of this variant is shown after the example output below. The Hartmann6D
synthetic test function acts as a substitute for a black-box objective function, such as an experiment or a simulation. The optimisation loop runs for 10 iterations, returning a batch of 4 points each time (40 evaluations in total), and finds a solution close to the true optimum of -3.3224.
[1]:
import torch
from nubo.acquisition import MCExpectedImprovement, MCUpperConfidenceBound
from nubo.models import GaussianProcess, fit_gp
from nubo.optimisation import multi_joint
from nubo.test_functions import Hartmann6D
from nubo.utils import gen_inputs
from gpytorch.likelihoods import GaussianLikelihood


# test function
func = Hartmann6D(minimise=False)
dims = 6

# specify bounds
bounds = torch.tensor([[0., 0., 0., 0., 0., 0.], [1., 1., 1., 1., 1., 1.]])

# training data
x_train = gen_inputs(num_points=dims*5,
                     num_dims=dims,
                     bounds=bounds)
y_train = func(x_train)


# Bayesian optimisation loop
iters = 10

for iter in range(iters):

    # specify Gaussian process
    likelihood = GaussianLikelihood()
    gp = GaussianProcess(x_train, y_train, likelihood=likelihood)

    # fit Gaussian process
    fit_gp(x_train, y_train, gp=gp, likelihood=likelihood, lr=0.1, steps=200)

    # specify acquisition function
    # acq = MCExpectedImprovement(gp=gp, y_best=torch.max(y_train), samples=512)
    acq = MCUpperConfidenceBound(gp=gp, beta=1.96**2, samples=512)

    # optimise acquisition function
    x_new, _ = multi_joint(func=acq, method="Adam", batch_size=4, bounds=bounds, lr=0.1, steps=200, num_starts=5)

    # evaluate new points
    y_new = func(x_new)

    # add to data
    x_train = torch.vstack((x_train, x_new))
    y_train = torch.hstack((y_train, y_new))

    # print new best
    if torch.max(y_new) > torch.max(y_train[:-y_new.size(0)]):
        best_eval = torch.argmax(y_train)
        print(f"New best at evaluation {best_eval+1}: \t Inputs: {x_train[best_eval, :].numpy().reshape(dims).round(4)}, \t Outputs: {-y_train[best_eval].numpy().round(4)}")


# results
best_iter = int(torch.argmax(y_train))
print(f"Evaluation: {best_iter+1} \t Solution: {-float(y_train[best_iter]):.4f}")
New best at evaluation 44: Inputs: [0.3957 0.8149 0.1248 0.5297 0.0373 0.0101], Outputs: -2.9052
New best at evaluation 55: Inputs: [0.4088 0.8735 0.9928 0.5549 0.0142 0.0588], Outputs: -3.162
New best at evaluation 60: Inputs: [0.4068 0.8819 0.9951 0.5724 0.004 0.0335], Outputs: -3.1936
Evaluation: 60 Solution: -3.1936
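As mentioned above, the base samples can also be fixed so that the Monte Carlo acquisition function becomes deterministic, which allows a deterministic optimiser such as L-BFGS-B to be used instead of Adam. The snippet below is a minimal sketch of how the acquisition and optimisation calls inside the loop could be changed; it assumes that MCUpperConfidenceBound accepts a fix_base_samples argument and that multi_joint accepts method="L-BFGS-B", so check the NUBO API reference before using it.

# fix base samples so the acquisition is deterministic across evaluations
# (fix_base_samples is assumed here; see the NUBO API reference)
acq = MCUpperConfidenceBound(gp=gp, beta=1.96**2, samples=512, fix_base_samples=True)

# optimise the now deterministic acquisition with L-BFGS-B instead of Adam
x_new, _ = multi_joint(func=acq, method="L-BFGS-B", batch_size=4, bounds=bounds, num_starts=5)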