
Feature: implement simulated campaigning for "hyper parameter tuning" #33

Open · 4 tasks
matthewcarbone opened this issue Aug 11, 2023 · 7 comments
Labels: enhancement (New feature or request)

@matthewcarbone (Collaborator):
@ziatdinovmax as we discussed, I plan on implementing a simulated campaigning loop for tuning the "hyper parameters" of an optimization loop. I first want to learn this library inside and out, so it might take some time. Anyway, the executive summary of the tasks at hand looks something like this:

  • Develop a Campaign class for storing the state of and running the campaign
  • Parallelize the campaigning (since many simulations will have to be run); might want to consider mpi4py but more likely multiprocessing will be enough
  • Implement a smart checkpoint-restart system for the same reason
  • Write appropriate tests
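A very rough sketch of how the first three items could fit together (every name here is hypothetical, not an existing gpax API): a `Campaign` dataclass that pickles its state for checkpoint-restart and fans simulated runs out over a `multiprocessing.Pool`, skipping runs a previous checkpoint already completed:

```python
import pickle
from dataclasses import dataclass, field
from multiprocessing import Pool
from pathlib import Path


def _simulate(params):
    """Stand-in for one simulated optimization loop (hypothetical)."""
    return sum(v * v for v in params.values())


@dataclass
class Campaign:
    """Holds campaign state; all names here are illustrative."""
    param_sets: list                              # one hyperparameter dict per run
    results: dict = field(default_factory=dict)   # run index -> score

    def run(self, n_workers=2):
        # Only evaluate runs not already completed, so a restarted
        # campaign picks up where the last checkpoint left off.
        todo = [i for i in range(len(self.param_sets)) if i not in self.results]
        batch = [self.param_sets[i] for i in todo]
        if n_workers == 1:
            scores = [_simulate(p) for p in batch]
        else:
            with Pool(n_workers) as pool:
                scores = pool.map(_simulate, batch)
        self.results.update(dict(zip(todo, scores)))

    def checkpoint(self, path):
        Path(path).write_bytes(pickle.dumps(self))

    @classmethod
    def restart(cls, path):
        return pickle.loads(Path(path).read_bytes())
```

Pickling the whole object keeps the checkpoint logic trivial; a real implementation would probably want atomic writes and versioned state.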
@ziatdinovmax (Owner):

@matthewcarbone - sounds great, and I will be happy to help.

ziatdinovmax added the enhancement label on Aug 14, 2023
@matthewcarbone (Collaborator, Author):

@ziatdinovmax Quick update: I have not forgotten about this. I have funding starting in October and I'll be building on this.

Btw, unrelated question (we can open a new issue if you want): can gpax do batch sampling? That is, instead of sequential experiments ("given the data at hand, find me the next experiment to maximize the acquisition function"), can gpax do "given the data at hand, find me the next q experiments that jointly maximize the acquisition function"?

@ziatdinovmax (Owner):

@matthewcarbone - Thanks for the update.
Yes, there is a batch-mode acquisition: https://github.com/ziatdinovmax/gpax/blob/main/gpax/acquisition/batch_acquisition.py
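For intuition only (a toy illustration, not the gpax implementation linked above): sequential selection takes the top-q points individually, whereas batch selection scores q-point subsets jointly, here with a made-up redundancy penalty so the batch spreads out instead of clustering around one peak:

```python
from itertools import combinations

# Toy acquisition values over a 1-D candidate grid (numbers are made up).
candidates = [0.0, 0.1, 0.2, 0.5, 0.9]
acq = {0.0: 1.0, 0.1: 0.9, 0.2: 0.8, 0.5: 0.6, 0.9: 0.5}

q = 2

# Sequential ("greedy") selection: take the top-q points individually.
greedy = sorted(candidates, key=acq.get, reverse=True)[:q]  # [0.0, 0.1]

def joint_score(batch):
    """Hypothetical joint acquisition: individual scores minus a
    redundancy penalty for points that sit close together."""
    penalty = sum(5.0 * max(0.0, 0.3 - abs(a - b))
                  for a, b in combinations(batch, 2))
    return sum(acq[x] for x in batch) - penalty

# Joint selection: score q-point subsets together, so the batch spreads out.
joint = max(combinations(candidates, q), key=joint_score)  # (0.0, 0.5)
```

The greedy batch clusters at the two nearby maxima, while the joint score prefers one high point plus a well-separated one; real batch acquisition functions (qEI and friends) achieve the same effect through the joint posterior rather than an explicit distance penalty.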

@ziatdinovmax (Owner):

On the "parallelize the campaigning": assuming this is a single program that runs with different input parameters, can this be done with JAX built-in tools for parallel evaluation?
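If each simulated campaign can be written as a pure function of its input parameters, `jax.vmap` batches the evaluations in a single call (and `jax.pmap` does the same across devices). A minimal sketch, where the objective is a made-up stand-in:

```python
import jax
import jax.numpy as jnp

def objective(params):
    """Made-up stand-in for one campaign evaluated at one parameter vector."""
    return jnp.sum(params ** 2)

# Eight candidate parameter vectors with three hyperparameters each.
param_grid = jnp.arange(24.0).reshape(8, 3)

# vmap vectorizes over the leading axis: one call evaluates all eight settings.
scores = jax.vmap(objective)(param_grid)
```

The caveat is that this only pays off when the whole inner loop is traceable by JAX; a campaign driven by Python-side control flow (checkpointing, MCMC warm-up decisions) is likely a better fit for `multiprocessing`.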

@matthewcarbone (Collaborator, Author) commented Oct 5, 2023:

It's funny, I was thinking something similar, but I don't quite know how to do this. The tough part is that it's a combination of continuous and bandit optimization; there's almost a tree of decisions. For example, do you choose EI or UCB? If you choose UCB, you also need to choose beta. How does one go about optimizing over that space? I know it's possible, but I'm not sure how to implement it.
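One common way to handle that tree is to treat it as a conditional search space: sample the branch (the acquisition function) first, then sample only the hyperparameters that branch actually has. A minimal random-search sketch (names and ranges are illustrative):

```python
import random

# Conditional search space: each acquisition function carries only its
# own hyperparameters (names and ranges here are illustrative).
SPACE = {
    "EI": {},                        # EI needs no extra hyperparameter
    "UCB": {"beta": (0.1, 10.0)},    # beta is sampled only if UCB is chosen
}

def sample_config(rng):
    acq = rng.choice(sorted(SPACE))            # pick a branch of the tree
    config = {"acquisition": acq}
    for name, (lo, hi) in SPACE[acq].items():  # then its own hyperparameters
        config[name] = rng.uniform(lo, hi)
    return config

rng = random.Random(0)
configs = [sample_config(rng) for _ in range(10)]
```

Random search over the conditional space is only a baseline; tree-aware optimizers (e.g. TPE-style samplers) handle the same structure more sample-efficiently.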

Btw, I also have concerns about speed. Fitting gpax to ~400 5-dimensional data points took quite a few minutes. I realize that's a lot of data, but I noticed that other codes are much faster. Is there any way we can speed up mcmc.run?

@ziatdinovmax (Owner):

One can use stochastic variational inference GP (viGP) or deep kernel learning (viDKL) for large datasets and high dimensions. The MCMC implementation (more precisely, HMC with the NUTS sampler) is already dramatically faster than what the pymc or pyro packages offer. I generally recommend it in situations where specific physics-based priors are available or one wants a detailed analysis of posterior distributions.

@matthewcarbone (Collaborator, Author) commented Oct 12, 2023:

Is there a way to do this already in GPax?

Edit: whoops please disregard. 😁

ziatdinovmax added this to the v0.3 milestone on Oct 18, 2023