Improve benchmarking workflow #1
Labels: discussion
Comments
I have transferred this issue to the new benchmark tool repository.
Before we start writing all the code for this ourselves, we should have a look around to see whether there already exists something that ticks off most of the boxes. There very well might be: distributed benchmarking isn't too niche a topic.
Is your feature request related to a problem? Please describe
We currently benchmark PyVRP on five different instance sets. Each instance set requires a specific build, round function, stopping criterion, etc., as described here. I currently have a separate folder for each instance set. At each new version release, I run the automated benchmarks from each of these folders. Once all instances are solved, I move the results to my local environment and run a notebook to compute the gaps.
With many more instance sets to come, this benchmarking workflow requires a lot of manual work. It would be nice to automate this workflow further, and to make it publicly available so that anyone can reproduce these steps.
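For illustration, a minimal sketch of how these per-set settings could be collected in one place. The set names, round functions, and time limits below are made-up placeholders, not the actual values used in PyVRP's benchmarks:

```python
# Hypothetical mapping from instance set to its benchmark settings.
# All names and values are placeholders for illustration only.
BENCHMARK_CONFIGS = {
    "cvrp": {"round_func": "round", "max_runtime": "01:00:00"},
    "vrptw": {"round_func": "dimacs", "max_runtime": "02:00:00"},
}
```

Having all settings together like this would remove the need to keep a separate folder per instance set.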
Describe the solution you'd like
The benchmarking process looks like this:
1. Run the benchmarks for each instance set on a cluster.
2. Collect the results and compute the gaps to the best known solutions.
3. Publish the new results on the benchmark page.
I think step 1 is relatively straightforward because it's just a simple Python/shell script. It will include some custom code that depends on the cluster environment that one runs on.
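A minimal sketch of what such a script could look like, assuming a SLURM cluster and a hypothetical `benchmark.py` entry point; the submission command is exactly the part that depends on one's cluster environment:

```python
# Sketch of step 1: submit one cluster job per instance of an instance
# set. Assumes SLURM (sbatch) and a hypothetical benchmark.py script.
import subprocess
from pathlib import Path


def submit_instance_set(instance_dir: str, round_func: str, max_runtime: str):
    """Submits one job per instance file in the given directory."""
    for instance in sorted(Path(instance_dir).glob("*.vrp")):
        command = f"python benchmark.py {instance} --round_func {round_func}"
        subprocess.run(
            ["sbatch", f"--time={max_runtime}", "--wrap", command],
            check=True,
        )


# Example (hypothetical paths and settings):
# submit_instance_set("instances/cvrp", "round", "01:00:00")
```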
Step 2 is somewhat more cumbersome. I currently have several Jupyter notebooks that compute the gaps for each instance set. Besides requiring manual effort, this is just a bit messy and hard to maintain. Ideally, we keep something like a spreadsheet: each instance set lives in a separate tab, and each new version release is added as a new column. What's also nice is that we can store reference solutions, so that the gaps are updated whenever a new best known solution (BKS) is found. Instead of using an Excel spreadsheet, we could have an automated workflow that updates a set of CSV files with the new benchmark results.
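A rough sketch of what the CSV-based variant of step 2 could look like, assuming a hypothetical layout where the raw results have columns `instance` and `cost`, the reference solutions file has columns `instance` and `bks`, and each instance set keeps a running history CSV with one column per released version:

```python
# Sketch of step 2: compute gaps against the best known solutions and
# append them as a new column to the instance set's history CSV.
# File layouts and names are assumptions, not PyVRP's actual format.
import pandas as pd


def update_gaps(results_csv: str, bks_csv: str, history_csv: str, version: str):
    results = pd.read_csv(results_csv)  # columns: instance, cost
    bks = pd.read_csv(bks_csv)  # columns: instance, bks
    merged = results.merge(bks, on="instance")

    # Percentage gap of the found cost to the best known solution.
    merged[version] = 100 * (merged["cost"] - merged["bks"]) / merged["bks"]

    history = pd.read_csv(history_csv, index_col="instance")
    history[version] = merged.set_index("instance")[version]
    history.to_csv(history_csv)
```

Storing the raw costs rather than the computed gaps in the history files would make it easy to recompute all gaps whenever a best known solution improves.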
Step 3 can still be done manually by editing the benchmark page. I'm OK with that.
I will try to work on this for PyVRP/PyVRP#435.
Additional context
There are two open issues that will simplify the benchmarking process further:
cli.py