Inquiry on initial dataset generation. #948
-
Hello, I have a question regarding the initial dataset generation approach for Bayesian Optimization in general. In both Ax and BoTorch, Sobol is used by default. I'm curious why the team did not consider other approaches such as Latin Hypercube sampling. Was it due to factors such as speed (i.e. Sobol can generate an initial dataset essentially instantly, whereas Latin Hypercube sampling may take longer), or were there more fundamental concerns such as overall optimization performance?
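For reference, here is a rough sketch of the kind of speed comparison I have in mind, using SciPy's qmc module rather than anything from Ax/BoTorch (the dimension and sample count are arbitrary choices for illustration):

```python
# Rough timing sketch: raw sample-generation time for Sobol vs. Latin Hypercube.
import time
from scipy.stats import qmc

dim, n = 20, 128  # arbitrary dimension and (power-of-two) sample count

sobol = qmc.Sobol(d=dim, scramble=True, seed=0)
lhs = qmc.LatinHypercube(d=dim, seed=0)

t0 = time.perf_counter()
sobol_points = sobol.random(n)  # (n, dim) points in [0, 1)
t1 = time.perf_counter()
lhs_points = lhs.random(n)      # (n, dim) points in [0, 1)
t2 = time.perf_counter()

print(f"Sobol: {t1 - t0:.4f}s, LHS: {t2 - t1:.4f}s")
```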
-
Hi, thanks for your question - it's a legitimate one.

One reason is simply that we already have a very performant Sobol sampler hooked up and can just use that (that sampler is crucial for the MC acquisition function computation under the hood, though not necessarily for initial data set generation).

But there is actually some work arguing that using a space-filling design can be detrimental to optimization performance: https://arxiv.org/abs/1812.02794. I think this is interesting and somewhat intuitive, but I'm not sure about the practical implications (note also that that paper does not consider Sobol sampling, only random sampling and LHS). Further, since we typically use a rather small number of initial data points, we care about the space being covered reasonably uniformly so that we have an idea of where the good regions are (whether the uniform distances between the points make it harder to estimate the lengthscales of the GP seems like a second-order concern to me in that context).
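In case it helps, here's a minimal sketch of how a Sobol initial design can be drawn and rescaled to the search-space bounds, using torch's built-in SobolEngine. This is just an illustration, not the actual generation code inside Ax/BoTorch; the helper name and bounds are made up:

```python
# Minimal sketch: draw a small space-filling initial design with a scrambled
# Sobol engine and rescale it to the search-space bounds.
import torch
from torch.quasirandom import SobolEngine

def sobol_initial_points(bounds: torch.Tensor, n: int, seed: int = 0) -> torch.Tensor:
    """Draw n quasi-random points inside `bounds` (a 2 x d tensor: lower row, upper row)."""
    d = bounds.shape[1]
    unit = SobolEngine(dimension=d, scramble=True, seed=seed).draw(n)  # n x d in [0, 1)
    lower, upper = bounds[0], bounds[1]
    return lower + (upper - lower) * unit  # rescale to the search space

# Example: 8 initial points in the 3-dimensional box [-5, 5]^3.
bounds = torch.tensor([[-5.0] * 3, [5.0] * 3])
X_init = sobol_initial_points(bounds, n=8)
print(X_init.shape)  # torch.Size([8, 3])
```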