simulator: add counterexample minimization #623

alpaylan · 2025-01-06T15:28:21Z

This PR introduces counterexample minimization (shrinking) in the simulator. It will require changes to the current structure in various places, so I've opened it as a draft PR for now so as not to overwhelm the reviewers all at once.

- Turn interaction plans into sequences of properties instead of sequences of interactions, adding a semantic layer.
- Add assumptions to the properties, rendering a property invalid if its assumptions do not hold.
- Record the failure point in a failing assertion, as shrinking by definition works only when the same assertion fails for a smaller input.
- Add shrinking at three levels:
  - top level (removing whole properties),
  - property level (removing interactions within properties),
  - interaction level (shrinking values in interactions to smaller ones).
- Add marauders as a dev dependency, and inject custom mutations into a testing branch for evaluating the simulator performance.
- Integrate the simulator evaluation with the CI.
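
As a rough illustration of the first three items, here is a minimal sketch of a plan made of properties, where a failing assumption skips the rest of its property and a failing assertion reports where it failed. All type and field names (`Interaction`, `Property`, `Plan`, `FailurePoint`, `run`) are hypothetical and not taken from the simulator, and the boolean `holds` flags stand in for actually executing queries against the database.

```rust
/// One step of a property. All names in this sketch are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Interaction {
    /// A precondition; if it does not hold, the rest of the property is skipped.
    Assumption { name: String, holds: bool },
    /// A SQL statement sent to the database (kept as plain text here).
    Query(String),
    /// A check on the results; `holds` stands in for evaluating it.
    Assertion { name: String, holds: bool },
}

/// A coherent group of interactions that is meaningful by itself.
#[derive(Debug, Clone, PartialEq)]
struct Property {
    name: String,
    interactions: Vec<Interaction>,
}

/// A plan is a sequence of properties rather than a flat list of interactions.
#[derive(Debug, Clone, PartialEq)]
struct Plan {
    properties: Vec<Property>,
}

/// Where a plan failed: which property, and which interaction inside it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FailurePoint {
    property: usize,
    interaction: usize,
}

/// Walk the plan in order and return the first failing assertion, if any.
fn run(plan: &Plan) -> Result<(), FailurePoint> {
    for (p, property) in plan.properties.iter().enumerate() {
        for (i, interaction) in property.interactions.iter().enumerate() {
            match interaction {
                // A failed assumption abandons this property and moves on.
                Interaction::Assumption { holds: false, .. } => break,
                // A failed assertion is a real failure; record its position.
                Interaction::Assertion { holds: false, .. } => {
                    return Err(FailurePoint { property: p, interaction: i });
                }
                _ => {}
            }
        }
    }
    Ok(())
}
```

Recording the position of the failure is what lets a shrinker check that a smaller plan fails at the same assertion rather than at some unrelated one.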

…perations that did not reflect the internal structure, in which they were actually concatenations of properties, which are a coherent set of interactions that are meaningful by themselves. this commit introduces this semantic layer into the data model by turning interaction plans into a sequence of properties, which are a sequence of interactions

pereman2 · 2025-01-07T11:29:07Z

This looks quite cool @alpaylan. Could you post some literature you think is relevant for this?

alpaylan · 2025-01-07T12:56:59Z

> This looks quite cool @alpaylan. Could you post some literature you think is relevant for this?

Thanks @pereman2! Of course, here is some rather informal writing discussing different shrinking strategies (warning: it is quite opinionated with respect to the author's position, which I don't agree with)

Two other informal articles,

ECOOP20 paper from the Hypothesis author,

https://drmaciver.github.io/papers/reduction-via-generation-preview.pdf

I view shrinking as closely related to delta debugging, which I think is much more widely used in the literature. I have a submission that discusses shrinking too; I'm not sure whether sharing it here would break double-blind review, but I can send it over email if you would like.

jussisaurio · 2025-01-07T15:03:30Z

I thought this page had a nice layman's definition of shrinking:

> In property-based testing, the initially found failing case may contain a lot of complexity that does not actually cause the test to fail. Shrinking is the mechanism through which a property-based testing framework can simplify failing cases in order to find out what the minimal reproducible case is.

We could even add it to our code

simulator/generation/plan.rs

alpaylan · 2025-01-08T08:17:44Z

The shrinking is a bit harder than I hoped for, particularly due to the fact that we cannot shrink closures, which limits our ability to minimize aggressively by removing columns from tables, because we wouldn't be able to modify the assertion that checks it.

The solution is to turn the assertions themselves into a small DSL, but at that point it turns into too much of a research project, which I don't think is the right choice for now. I've instead focused on other shrinking mechanisms:

- Removing properties after the failing one,
- Removing properties that do not refer to the tables in the failing interaction,
- Removing extensional parts in the property (I'll also add these with a bit of explanation, probably today).
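
The first two mechanisms can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Property` struct, its `tables` field, and `shrink_plan` are hypothetical names, and each property is assumed to know which tables it touches.

```rust
use std::collections::HashSet;

/// Hypothetical property record: a name plus the tables it reads or writes.
#[derive(Debug, Clone, PartialEq)]
struct Property {
    name: String,
    tables: HashSet<String>,
}

/// Eagerly shrink a plan given the index of the failing property and the
/// tables referenced by the failing interaction:
/// 1. drop every property after the failing one,
/// 2. drop earlier properties that touch none of the failing tables.
/// Panics if `failing` is out of bounds.
fn shrink_plan(
    plan: &[Property],
    failing: usize,
    failing_tables: &HashSet<String>,
) -> Vec<Property> {
    plan[..=failing]
        .iter()
        .enumerate()
        // The failing property itself is always kept.
        .filter(|(i, p)| *i == failing || !p.tables.is_disjoint(failing_tables))
        .map(|(_, p)| p.clone())
        .collect()
}
```

A shrunk plan is only useful if it still fails at the same assertion, so in practice the result would be re-executed before being reported.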

better counterexample minimization.
- it separates interaction plans from their state of execution
- it removes closures from the property definitions, encoding properties as an enum variant, and deriving the closures from the variants.
- it adds some naive counterexample minimization capabilities to the Limbo simulator and reduces the plan sizes considerably.
- it makes small changes to various points of the simulator for better error reporting, enhancing code readability, small fixes to handle previously missed cases
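
To illustrate why encoding properties as enum variants helps shrinking, here is a small sketch in which the assertion closure is derived from the variant's data on demand. The variant names, the `Outcome` type, and the `assertion` method are assumptions made for this example, not the simulator's actual definitions.

```rust
/// Rows are vectors of integers in this sketch.
type Row = Vec<i64>;

/// What a query produced: either rows or an error message.
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Rows(Vec<Row>),
    Error(String),
}

/// Properties as plain data (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
enum Property {
    /// After inserting `row`, a select should return it.
    InsertSelect { table: String, row: Row },
    /// Creating the same table twice should produce an error.
    DoubleCreateFailure { table: String },
}

impl Property {
    /// Derive the assertion from the variant's data. Since the closure is
    /// rebuilt from data each time, a shrinker can edit the variant (for
    /// example, replace `row` with a smaller one) and the assertion follows.
    fn assertion(&self) -> Box<dyn Fn(&Outcome) -> bool + '_> {
        match self {
            Property::InsertSelect { row, .. } => Box::new(move |out: &Outcome| {
                matches!(out, Outcome::Rows(rows) if rows.contains(row))
            }),
            Property::DoubleCreateFailure { .. } => {
                Box::new(|out: &Outcome| matches!(out, Outcome::Error(_)))
            }
        }
    }
}
```

A stored closure is opaque, whereas a variant can be cloned, compared, printed, and modified, which is what a shrinker needs.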

alpaylan · 2025-01-10T23:31:25Z

Oh no. How do I fix that. (I accidentally pushed the extra commits I hadn't pulled on top of mine, fixed it now)

- previous query generation method was faulty, producing wrong assertions
- this commit adds a new arbitrary_from implementation for predicates
- new implementation takes a table and a row, and produces a predicate that would evaluate to true for the row
- this commit makes small changes to the main for increasing readability
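
The idea of generating a predicate that is guaranteed to hold for a given row can be sketched as below. This is not the `arbitrary_from` implementation from the commit: the `Predicate` type, the `Lcg` generator, and `predicate_true_for` are hypothetical, rows are simplified to integer columns, and the table argument is omitted.

```rust
/// Tiny deterministic generator so the sketch needs no external crates.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Predicates over integer columns, addressed by column index.
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Eq(usize, i64),
    Gt(usize, i64),
    Lt(usize, i64),
    And(Vec<Predicate>),
}

impl Predicate {
    fn eval(&self, row: &[i64]) -> bool {
        match self {
            Predicate::Eq(c, v) => row[*c] == *v,
            Predicate::Gt(c, v) => row[*c] > *v,
            Predicate::Lt(c, v) => row[*c] < *v,
            Predicate::And(ps) => ps.iter().all(|p| p.eval(row)),
        }
    }
}

/// Build a conjunction with one atom per column, each chosen so that it is
/// true for `row` by construction.
fn predicate_true_for(rng: &mut Lcg, row: &[i64]) -> Predicate {
    let atoms: Vec<Predicate> = row
        .iter()
        .enumerate()
        .map(|(col, &v)| {
            let delta = (rng.next_u64() % 10) as i64 + 1;
            // Fall back to equality when the bound would overflow.
            match (rng.next_u64() % 3, v.checked_sub(delta), v.checked_add(delta)) {
                (1, Some(lo), _) => Predicate::Gt(col, lo),
                (2, _, Some(hi)) => Predicate::Lt(col, hi),
                _ => Predicate::Eq(col, v),
            }
        })
        .collect();
    Predicate::And(atoms)
}
```

Because every atom is derived from the row's own values, a select using the predicate must return that row, so an assertion built on it cannot be wrong by construction.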

alpaylan · 2025-01-13T11:59:45Z

This has been going well; there's some more work left:

- the error reporting from the shrinking needs to get better
- shrinking is currently eager: it tries to shrink aggressively all at once. The ideal version would have multiple possible minimizations, apply each of them separately, shrink progressively, and stop only when none of the possible minimizations reproduces the bug.
- I still need to finish some todo items in the original list
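
The progressive variant described in the second item could look roughly like the loop below. It is a generic sketch, not the planned implementation: `shrink_progressively`, `drop_first`, and `drop_last` are hypothetical names, and each candidate minimization is assumed to return either a strictly smaller input or `None`, which is what guarantees termination.

```rust
/// Try each candidate minimization on its own; keep its result only if the
/// smaller input still reproduces the failure. Stop when a full pass over
/// the candidates makes no progress.
fn shrink_progressively<T>(
    mut current: T,
    candidates: &[fn(&T) -> Option<T>],
    still_fails: impl Fn(&T) -> bool,
) -> T {
    loop {
        let mut progressed = false;
        for candidate in candidates {
            if let Some(smaller) = candidate(&current) {
                if still_fails(&smaller) {
                    current = smaller;
                    progressed = true;
                }
            }
        }
        if !progressed {
            return current;
        }
    }
}

/// Example minimization: remove the first element, if there is one.
fn drop_first(v: &Vec<i32>) -> Option<Vec<i32>> {
    if v.is_empty() { None } else { Some(v[1..].to_vec()) }
}

/// Example minimization: remove the last element, if there is one.
fn drop_last(v: &Vec<i32>) -> Option<Vec<i32>> {
    if v.is_empty() { None } else { Some(v[..v.len() - 1].to_vec()) }
}
```

With these two candidates and a failure condition of "the list contains 7", the input `[1, 2, 7, 3, 4]` shrinks step by step to `[7]`; the eager approach would instead apply all removals at once and give up if the combined result no longer fails.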

I propose we split the PR into two parts, finish the first part here, and move the second part to another PR:

- The existing work + better error reporting
- The progressive shrinking (requires rearchitecting the system a bit) + value shrinkage + simulator evaluation (the mutation injection tool I'm working on also needs some work, so this part will take longer)

The reason for the split is that I feel like this is going to go stale if I try to finish everything on the list, and I think implementers would probably benefit from the improved testing experience this PR has enabled so far.

What do you think @jussisaurio @pereman2?

edit: moved value shrinkage to the next work package; it's hard to do without progressive shrinking, and we might even need further improvements to the system.

alpaylan added 3 commits January 6, 2025 18:16

add assumptions to the interactions, where a failing assumption stops

daa77fe

the execution of the current property and switches to the next one.

add execution history to the simulator, the history records

cc56276

three indexes(connection, interaction pointer, and secondary pointer) that can uniquely identify the executed interaction at any point. we will use the history for shrinking purposes.
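
A minimal sketch of such a history is shown below. The struct and method names are hypothetical, and reading the interaction pointer as a position in the plan and the secondary pointer as a position within that entry is an assumption, not something the commit message states.

```rust
/// One entry per executed interaction (field names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExecutionPoint {
    /// Which connection executed the interaction.
    connection: usize,
    /// Assumed here to be the position in the plan.
    interaction_pointer: usize,
    /// Assumed here to be the position inside that plan entry.
    secondary_pointer: usize,
}

#[derive(Debug, Default)]
struct ExecutionHistory {
    points: Vec<ExecutionPoint>,
}

impl ExecutionHistory {
    /// Append the point that was just executed.
    fn record(&mut self, connection: usize, interaction_pointer: usize, secondary_pointer: usize) {
        self.points.push(ExecutionPoint {
            connection,
            interaction_pointer,
            secondary_pointer,
        });
    }

    /// The most recent point; after a failing run this is where it stopped.
    fn last(&self) -> Option<ExecutionPoint> {
        self.points.last().copied()
    }
}
```

A shrinker can use the last recorded point to decide which parts of the plan were never reached and are therefore safe to remove.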

jussisaurio reviewed Jan 7, 2025

simulator/generation/plan.rs (outdated, resolved)

alpaylan force-pushed the main branch from 60e1b04 to 191b586 Compare January 10, 2025 23:32

alpaylan added 4 commits January 11, 2025 09:46

Merge branch 'tursodatabase:main' into main

7b2f65f

update properties to add extensional interactions between them

1344280

fix arbitrary_from ergonomics by removing the implicit reference in t…

43f6c34

…he trait signature

alpaylan force-pushed the main branch from 3c14f77 to 43f6c34 Compare January 13, 2025 11:43

alpaylan added 2 commits January 13, 2025 15:56

- add doc comments to generation traits and functions

c3ea027

- remove pick_index from places where it's possible to use pick instead - allow multiple values to be inserted in the insert-select property

fix non-determinism bug arising from a call to thread_rng while pic…

fb937ef

…king which row to check existence for in the result of the select query

alpaylan marked this pull request as ready for review January 13, 2025 15:01

alpaylan mentioned this pull request Jan 14, 2025

Simulator terminates with zero exit code on error #689

Closed

alpaylan force-pushed the main branch 2 times, most recently from fcdfb27 to 6f27442 Compare January 15, 2025 07:55

Merge branch 'main' of https://github.com/tursodatabase/limbo

ecb0f78

alpaylan force-pushed the main branch from 6f27442 to ecb0f78 Compare January 15, 2025 08:00

alpaylan added 2 commits January 15, 2025 11:42

add missed updates from the merge

c446e29

remove debug print

ea6ad8d

penberg closed this in ffe6514 Jan 15, 2025
