Improve benchmarking workflow #1

Open
leonlan opened this issue Feb 28, 2024 · 6 comments
Labels: discussion (Discussion thread)

Comments

leonlan commented Feb 28, 2024

Is your feature request related to a problem? Please describe

We currently benchmark PyVRP on five different instance sets. Each instance set requires a specific build, round function, stopping criterion, etc., as described here. I currently have a separate folder for each instance set. At each new version release, I run the automated benchmarks from each of these folders. Once all instances are solved, I move the results to my local environment and run a notebook to compute the gaps.

With many more instance sets to come, this benchmarking workflow requires a lot of manual work. It would be nice to have a more automated benchmarking workflow, and to make it publicly available so that anyone can reproduce these steps.

Describe the solution you'd like

The benchmarking process looks like this:

  1. For each instance set:
    • Build PyVRP using the correct problem type and precision.
    • Solve with the correct stopping criterion and round function.
  2. Compute gaps for each instance set.
  3. Update the benchmarking results.

I think step 1 is relatively straightforward because it's just a simple Python/shell script. It will include some custom code that depends on the cluster environment one runs on.
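
Here's a minimal sketch of what step 1 could look like, assuming a hypothetical per-instance-set configuration and placeholder build.sh / solve.py commands; these are illustrative names, not PyVRP's actual build or solver interfaces:

```python
import subprocess
from pathlib import Path

# Hypothetical per-instance-set settings. The actual build options, round
# functions, and stopping criteria are the ones listed on the benchmark page.
CONFIGS = {
    "vrptw": {"problem": "vrptw", "precision": "integer", "round_func": "dimacs", "max_runtime": 3600},
    "cvrp": {"problem": "cvrp", "precision": "integer", "round_func": "round", "max_runtime": 3600},
}


def run_benchmarks(results_dir: Path):
    results_dir.mkdir(exist_ok=True)

    for name, cfg in CONFIGS.items():
        # Placeholder build step: rebuild PyVRP for this problem type and
        # precision. The real command depends on the build system in use.
        subprocess.run(["./build.sh", cfg["problem"], cfg["precision"]], check=True)

        # Placeholder solve step: solve every instance in the set and write a
        # CSV with one row per instance. On a cluster this would instead be a
        # batch job submission.
        subprocess.run(
            [
                "python", "solve.py",
                "--instance_dir", f"instances/{name}",
                "--round_func", cfg["round_func"],
                "--max_runtime", str(cfg["max_runtime"]),
                "--out", str(results_dir / f"{name}.csv"),
            ],
            check=True,
        )


if __name__ == "__main__":
    run_benchmarks(Path("results"))
```

Only the build and solve commands are cluster- and repository-specific; the per-set configuration itself could live in a small file that anyone can reproduce from.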

Step 2 is somewhat more cumbersome. I currently have several Jupyter notebooks that compute the gaps for each instance set. Besides requiring manual effort, it's just a bit messy and hard to maintain. Ideally, we keep something like a spreadsheet: each instance set gets a separate tab, and each new version release is added as a new column. What's also nice is that we can store reference solutions, so that the gaps can be updated when new best-known solutions (BKS) are found. Instead of using an Excel spreadsheet, we can have an automated workflow that updates a set of CSV files with the new benchmark results.
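
For step 2, something along these lines could replace the notebooks. It assumes one CSV of best-known solutions per instance set (columns instance, bks) and one results CSV per release (columns instance, cost); this file layout is an assumption, not an agreed format:

```python
import pandas as pd


def update_results(set_name: str, version: str) -> float:
    """
    Append a release's results as a new column to the instance set's history
    file and return the mean percentage gap to the best-known solutions.
    """
    bks = pd.read_csv(f"bks/{set_name}.csv", index_col="instance")
    new = pd.read_csv(f"results/{set_name}_{version}.csv", index_col="instance")

    # One row per instance, one column per release; created on first use.
    try:
        history = pd.read_csv(f"history/{set_name}.csv", index_col="instance")
    except FileNotFoundError:
        history = pd.DataFrame(index=bks.index)

    history[version] = new["cost"]
    history.to_csv(f"history/{set_name}.csv")

    gaps = 100 * (new["cost"] - bks["bks"]) / bks["bks"]
    return gaps.mean()
```

Because the reference solutions live in their own CSVs, a BKS update only touches those files; the stored results stay put and the gaps can simply be recomputed.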

Step 3 can still be done manually by editing the benchmark page. I'm OK with that.

I will try to work on this for PyVRP/PyVRP#435.

Additional context

There are two open issues that will simplify the benchmarking process further:

N-Wouda commented Feb 28, 2024

Related: #3, #2.

N-Wouda transferred this issue from PyVRP/PyVRP on Feb 28, 2024

N-Wouda commented Feb 28, 2024

I have transferred this issue to the new benchmark tool repository.

N-Wouda added the discussion label on Feb 28, 2024

N-Wouda commented Feb 29, 2024

Before we start writing all of this code ourselves, we should look around to see whether something already exists that ticks most of the boxes. There very well might be: distributed benchmarking isn't too niche a topic.


leonlan commented Feb 29, 2024
