Improve benchmarking workflow #1

Open
leonlan opened this issue Feb 28, 2024 · 6 comments
Labels: discussion (Discussion thread)

Comments

leonlan commented Feb 28, 2024

Is your feature request related to a problem? Please describe

We currently benchmark PyVRP on five different instance sets. Each instance set requires a specific build, round function, stopping criterion, etc., as described here. I currently have a separate folder for each instance set. At each new version release, I run the automated benchmarks from each of these folders. Once all instances are solved, I move the results to my local environment and run a notebook to compute the gaps.

With many more instance sets to come, this benchmarking workflow requires a lot of manual work. It would be nice to have a more automated benchmarking workflow, and to make it publicly available so that anyone can reproduce these steps.

Describe the solution you'd like

The benchmarking process looks like this:

  1. For each instance set:
    • Build PyVRP using the correct problem type and precision.
    • Solve with the correct stopping criterion and round function.
  2. Compute gaps for each instance set.
  3. Update the benchmarking results.

I think step 1 is relatively straightforward because it's just a simple Python/shell script. It will include some custom code that depends on the cluster environment one runs on.
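
Here's a minimal sketch of what step 1 could look like, assuming a hypothetical per-instance-set configuration and placeholder build.sh / solve.py commands; these are illustrative names, not PyVRP's actual build or solver interfaces:

```python
import subprocess
from pathlib import Path

# Hypothetical per-instance-set settings. The actual build options, round
# functions, and stopping criteria are the ones listed on the benchmark page.
CONFIGS = {
    "vrptw": {"problem": "vrptw", "precision": "integer", "round_func": "dimacs", "max_runtime": 3600},
    "cvrp": {"problem": "cvrp", "precision": "integer", "round_func": "round", "max_runtime": 3600},
}


def run_benchmarks(results_dir: Path):
    results_dir.mkdir(exist_ok=True)

    for name, cfg in CONFIGS.items():
        # Placeholder build step: rebuild PyVRP for this problem type and
        # precision. The real command depends on the build system in use.
        subprocess.run(["./build.sh", cfg["problem"], cfg["precision"]], check=True)

        # Placeholder solve step: solve every instance in the set and write a
        # CSV with one row per instance. On a cluster this would instead be a
        # batch job submission.
        subprocess.run(
            [
                "python", "solve.py",
                "--instance_dir", f"instances/{name}",
                "--round_func", cfg["round_func"],
                "--max_runtime", str(cfg["max_runtime"]),
                "--out", str(results_dir / f"{name}.csv"),
            ],
            check=True,
        )


if __name__ == "__main__":
    run_benchmarks(Path("results"))
```

Only the build and solve commands are cluster- and repository-specific; the per-set configuration itself could live in a small file that anyone can reproduce from.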

Step 2 is somewhat more cumbersome. I currently have several Jupyter notebooks that compute the gaps for each instance set. Besides requiring manual effort, it's just a bit messy and hard to maintain. Ideally, we keep something like a spreadsheet: each instance set gets a separate tab, and each new version release is added as a new column. What's also nice is that we can store reference solutions, so that the gaps can be updated when new best-known solutions (BKS) are found. Instead of using an Excel spreadsheet, we can have an automated workflow that updates a set of CSV files with the new benchmark results.
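
For step 2, something along these lines could replace the notebooks. It assumes one CSV of best-known solutions per instance set (columns instance, bks) and one results CSV per release (columns instance, cost); this file layout is an assumption, not an agreed format:

```python
import pandas as pd


def update_results(set_name: str, version: str) -> float:
    """
    Append a release's results as a new column to the instance set's history
    file and return the mean percentage gap to the best-known solutions.
    """
    bks = pd.read_csv(f"bks/{set_name}.csv", index_col="instance")
    new = pd.read_csv(f"results/{set_name}_{version}.csv", index_col="instance")

    # One row per instance, one column per release; created on first use.
    try:
        history = pd.read_csv(f"history/{set_name}.csv", index_col="instance")
    except FileNotFoundError:
        history = pd.DataFrame(index=bks.index)

    history[version] = new["cost"]
    history.to_csv(f"history/{set_name}.csv")

    gaps = 100 * (new["cost"] - bks["bks"]) / bks["bks"]
    return gaps.mean()
```

Because the reference solutions live in their own CSVs, a BKS update only touches those files; the stored results stay put and the gaps can simply be recomputed.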

Step 3 can still be done manually by editing the benchmark page. I'm OK with that.

I will try to work on this for PyVRP/PyVRP#435.

Additional context

There are two open issues that will simplify the benchmarking process further:

N-Wouda commented Feb 28, 2024

Related: #3, #2.

N-Wouda transferred this issue from PyVRP/PyVRP on Feb 28, 2024

N-Wouda commented Feb 28, 2024

I have transferred this issue to the new benchmark tool repository.

N-Wouda added the discussion label on Feb 28, 2024

N-Wouda commented Feb 29, 2024

Before we start writing all of this code ourselves, we should look around to see whether something already exists that ticks most of the boxes. There very well might be: distributed benchmarking isn't too niche a topic.


leonlan commented Feb 29, 2024
