simulator: add counterexample minimization #623

Closed
wants to merge 13 commits into from

alpaylan
Contributor

@alpaylan alpaylan commented Jan 6, 2025

This PR introduces counterexample minimization (shrinking) in the simulator. It requires changes to the current structure in several places, so I've opened it as a draft PR for now to avoid overwhelming the reviewers all at once.

  • Turn interaction plans into sequences of properties instead of sequences of interactions, adding a semantic layer.
  • Add assumptions to the properties, rendering a property invalid if its assumptions are not valid.
  • Record the failure point in a failing assertion, as shrinking by definition works when the same assertion fails for a smaller input.
  • Add shrinking at three levels:
    • top level (removing whole properties),
    • property level (removing interactions within properties),
    • interaction level (shrinking values in interactions to smaller ones).
  • Add marauders as a dev dependency and inject custom mutations into a testing branch to evaluate the simulator's performance.
  • Integrate the simulator evaluation with the CI.
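The three shrinking levels in the list above can be sketched as three candidate generators. This is a minimal illustration, not the simulator's actual code: the `Interaction`, `Property`, and `Plan` types and all three function names are hypothetical stand-ins for whatever the simulator defines.

```rust
// Hypothetical, simplified data model: a plan is a sequence of properties,
// and each property is a sequence of interactions.
#[derive(Debug, Clone, PartialEq)]
pub enum Interaction {
    Insert { table: String, value: i64 },
    Select { table: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Property {
    pub name: String,
    pub interactions: Vec<Interaction>,
}

pub type Plan = Vec<Property>;

/// Top level: every plan obtained by dropping one whole property.
pub fn drop_one_property(plan: &Plan) -> Vec<Plan> {
    (0..plan.len())
        .map(|i| {
            let mut smaller = plan.clone();
            smaller.remove(i);
            smaller
        })
        .collect()
}

/// Property level: every property obtained by dropping one interaction.
pub fn drop_one_interaction(prop: &Property) -> Vec<Property> {
    (0..prop.interactions.len())
        .map(|i| {
            let mut smaller = prop.clone();
            smaller.interactions.remove(i);
            smaller
        })
        .collect()
}

/// Interaction level: smaller candidate values, moving toward zero.
pub fn shrink_value(v: i64) -> Vec<i64> {
    let mut out = Vec::new();
    if v != 0 {
        out.push(0);
        // Integer division truncates toward zero, so this also works for negatives.
        if v / 2 != 0 {
            out.push(v / 2);
        }
    }
    out
}
```

Each generator only proposes candidates; a candidate counts as a valid shrink only if re-running it fails at the same recorded assertion.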

…perations that

did not reflect the internal structure, in which they were actually concatenations of
properties, which are a coherent set of interactions that are meaningful by themselves.
this commit introduces this semantic layer into the data model by turning interaction plans
into a sequence of properties, which are a sequence of interactions
the execution of the current property and switches to the next one.
three indexes(connection, interaction pointer, and secondary pointer)
that can uniquely identify the executed interaction at any point.
we will use the history for shrinking purposes.
@pereman2
Collaborator

pereman2 commented Jan 7, 2025

This looks quite cool @alpaylan. Could you post some literature you think is relevant for this?

@alpaylan
Contributor Author

alpaylan commented Jan 7, 2025

This looks quite cool @alpaylan. Could you post some literature you think is relevant for this?

Thanks @pereman2! Of course, here are some rather informal writings discussing different shrinking strategies (warning: they're quite opinionated in favor of the author's position, which I don't agree with)

Two other informal articles,

ECOOP 2020 paper from the Hypothesis author,

I view shrinking as closely related to delta debugging, which I think is used a lot more in the literature. I have a submission that discusses shrinking too. I'm not sure whether sharing it here would break double-blind review, but I can send it over email if you would like.

@jussisaurio
Collaborator

I thought this page had a nice layman definition for shrinking:

In property-based testing, the initially found failing case may contain a lot of complexity that does not actually cause the test to fail. Shrinking is the mechanism through which a property-based testing framework can simplify failing cases in order to find out what the minimal reproducible case is.

We could even add it to our code

@alpaylan
Contributor Author

alpaylan commented Jan 8, 2025

Shrinking is a bit harder than I had hoped, mainly because we cannot shrink closures. That limits our ability to minimize aggressively by removing columns from tables, since we would not be able to modify the assertion that checks them.

The solution is to turn the assertions themselves into a small DSL, but at that point it becomes too much of a research project, which I don't think is the right choice for now. I've instead focused on other shrinking mechanisms:

  • Removing properties after the failing one,
  • Removing properties that do not refer to the tables in the failing interaction,
  • Removing extensional parts of the property (I'll also add these with a bit of explanation, probably today)
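The first two mechanisms in the list above can be sketched as a single filter over the plan. This is an illustration under assumptions: `Interaction`, `Property`, `shrink_plan`, and the idea that each interaction carries the names of the tables it touches are all hypothetical, not taken from the simulator.

```rust
use std::collections::HashSet;

// Hypothetical, simplified types: each interaction lists the tables it touches.
#[derive(Debug, Clone, PartialEq)]
pub struct Interaction {
    pub tables: Vec<String>,
    pub sql: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Property {
    pub name: String,
    pub interactions: Vec<Interaction>,
}

/// Keep only the properties up to and including the failing one, and among
/// those only the ones that touch a table used by the failing interaction.
/// Panics if the indices are out of bounds.
pub fn shrink_plan(
    plan: &[Property],
    failing_property: usize,
    failing_interaction: usize,
) -> Vec<Property> {
    let failing = &plan[failing_property].interactions[failing_interaction];
    let relevant: HashSet<&str> = failing.tables.iter().map(String::as_str).collect();

    plan.iter()
        .take(failing_property + 1) // drop everything after the failing property
        .enumerate()
        .filter(|(idx, prop)| {
            // Always keep the failing property itself.
            *idx == failing_property
                || prop
                    .interactions
                    .iter()
                    .any(|i| i.tables.iter().any(|t| relevant.contains(t.as_str())))
        })
        .map(|(_, prop)| prop.clone())
        .collect()
}
```

The result is only a candidate: it still has to be re-executed to confirm that the same assertion fails.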

better counterexample minimization.

- it separates interaction plans from their state of execution
- it removes closures from the property definitions, encoding properties as an enum variant, and deriving the closures from the variants.
- it adds some naive counterexample minimization capabilities to the Limbo simulator and reduces the plan sizes considerably.
- it makes small changes to various points of the simulator for better error reporting, enhancing code readability, small fixes to handle previously missed cases
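The idea of encoding properties as enum variants and deriving the closures from them can be sketched as follows. The variant names, the `Row` type, and the `assertion` method are hypothetical; the point is only that the assertion is computed from plain data, so shrinking the data automatically changes the check.

```rust
/// Hypothetical row representation: one integer per column.
pub type Row = Vec<i64>;

/// Properties as plain data instead of opaque closures.
#[derive(Debug, Clone, PartialEq)]
pub enum Property {
    /// After inserting `row`, selecting from `table` must return it.
    InsertSelect { table: String, row: Row },
    /// `table` must contain exactly `count` rows.
    RowCount { table: String, count: usize },
}

impl Property {
    /// Derive the assertion closure from the data stored in the variant.
    /// The closure receives the rows returned by the query under test.
    pub fn assertion(&self) -> Box<dyn Fn(&[Row]) -> bool + '_> {
        match self {
            Property::InsertSelect { row, .. } => {
                Box::new(move |rows: &[Row]| rows.contains(row))
            }
            Property::RowCount { count, .. } => {
                Box::new(move |rows: &[Row]| rows.len() == *count)
            }
        }
    }
}
```

Because a variant is ordinary data, a shrinker can rewrite `row` or `count` and call `assertion()` again, which is not possible when the check exists only as a closure.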
@alpaylan
Contributor Author

alpaylan commented Jan 10, 2025

Oh no. How do I fix that? (I accidentally pushed the extra commits I hadn't pulled on top of mine; fixed it now.)

- previous query generation method was faulty, producing wrong assertions
- this commit adds a new arbitrary_from implementation for predicates
- new implementation takes a table and a row, and produces a predicate that would evaluate to true for the row
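Generating a predicate that is true for a given row by construction, as the commit message above describes, might look roughly like this. The `Predicate` type and `predicate_true_for` are hypothetical and are not the simulator's `arbitrary_from` implementation; the `choices` slice stands in for a random number generator so the sketch needs no external crates.

```rust
/// Hypothetical predicate over a row of integer columns, addressed by index.
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    Eq(usize, i64),
    Gt(usize, i64),
    Lt(usize, i64),
    And(Vec<Predicate>),
}

impl Predicate {
    /// Evaluate the predicate against a row; a missing column makes it false.
    pub fn eval(&self, row: &[i64]) -> bool {
        match self {
            Predicate::Eq(c, v) => row.get(*c) == Some(v),
            Predicate::Gt(c, v) => row.get(*c).map_or(false, |x| x > v),
            Predicate::Lt(c, v) => row.get(*c).map_or(false, |x| x < v),
            Predicate::And(ps) => ps.iter().all(|p| p.eval(row)),
        }
    }
}

/// Build a conjunction that holds for `row` by construction.
/// `choices` stands in for an RNG and is cycled over the columns;
/// an empty `choices` yields an empty (trivially true) conjunction.
pub fn predicate_true_for(row: &[i64], choices: &[u8]) -> Predicate {
    let clauses = row
        .iter()
        .enumerate()
        .zip(choices.iter().cycle())
        .map(|((col, &value), &choice)| {
            match choice % 3 {
                // value > value - 1, unless the subtraction would overflow
                1 => value.checked_sub(1).map(|bound| Predicate::Gt(col, bound)),
                // value < value + 1, unless the addition would overflow
                2 => value.checked_add(1).map(|bound| Predicate::Lt(col, bound)),
                _ => None,
            }
            // Fall back to plain equality, which always holds for the row.
            .unwrap_or(Predicate::Eq(col, value))
        })
        .collect();
    Predicate::And(clauses)
}
```

Building the predicate from the row, rather than generating one independently and hoping it matches, is what guarantees that the derived assertion is correct.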
this commit makes small changes to the main for increasing readability
@alpaylan
Contributor Author

alpaylan commented Jan 13, 2025

This has been going well; there is still some extra work left:

  • the error reporting from the shrinking needs to get better
  • shrinking is currently eager: it tries to shrink aggressively all at once. The ideal version would have multiple possible minimizations, apply each of them separately, shrink progressively, and stop only when none of the possible minimizations reproduces the bug.
  • I still need to finish some todo items from the original list
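The progressive shrinking described in the list above is essentially a greedy fixed-point loop. A minimal sketch, assuming a hypothetical `shrink_progressively` function in which `candidates` proposes strictly smaller inputs and `still_fails` re-runs the test and reports whether the same assertion fails:

```rust
/// Greedy progressive shrinking: try each candidate minimization separately,
/// commit to the first one that still reproduces the failure, and repeat
/// until no candidate reproduces it.
///
/// Termination relies on `candidates` returning only strictly smaller inputs.
pub fn shrink_progressively<T>(
    initial: T,
    candidates: impl Fn(&T) -> Vec<T>,
    still_fails: impl Fn(&T) -> bool,
) -> T {
    let mut current = initial;
    loop {
        // Find the first smaller input that still triggers the same failure.
        let next = candidates(&current).into_iter().find(|c| still_fails(c));
        match next {
            Some(smaller) => current = smaller,
            // No candidate reproduces the bug: `current` is locally minimal.
            None => return current,
        }
    }
}
```

For example, with a vector as the input, "drop one element" as the candidate generator, and a failure that needs both 9 and 7 to be present, the loop reduces `[1, 9, 2, 7, 3]` to `[9, 7]`. The result is a local minimum with respect to the given candidates, not necessarily the global one.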

I propose we split the PR into two parts: finish the first part here and move the second part to another PR.

  • The existing work + better error reporting
  • Progressive shrinking (requires rearchitecting the system a bit) + value shrinkage + simulator evaluation (the mutation injection tool I'm working on also needs some work, so this part will take longer)

The reason for the split is that I feel this PR will go stale if I try to finish everything on the list, and I think implementers would benefit from the improved testing experience this PR has enabled so far.

What do you think @jussisaurio @pereman2 ?

edit: moved value shrinkage to the next work package; it's hard to do without progressive shrinking, and we might even need further improvements to the system.

- remove pick_index from places where it's possible to use pick instead
- allow multiple values to be inserted in the insert-select property
…king

which row to check existence for in the result of the select query
@alpaylan alpaylan marked this pull request as ready for review January 13, 2025 15:01
@alpaylan alpaylan force-pushed the main branch 2 times, most recently from fcdfb27 to 6f27442 Compare January 15, 2025 07:55
@penberg penberg closed this in ffe6514 Jan 15, 2025