@pedrocarlo pedrocarlo commented Oct 18, 2025

Depends on #3775 - to remove noise from this PR.

Motivation

As part of my continued effort to make the simulator more accessible and simpler to work with, I have over time simplified and optimized parts of the codebase, such as query generation and decision making, so that more people from the community can contribute to and enhance the simulator. This PR is one more step in that direction.

Before this PR, our InteractionPlan stored Vec<Interactions>. Interactions is a higher-level collection that generates a list of Interaction (yes, I know the naming can be slightly confusing. Maybe we can change it later as well, especially because Interactions are mainly just Property). However, this architecture became a problem once MVCC entered the picture. MVCC requires us to make sure that DDL statements are executed serially. To avoid adding even more complexity to plan generation, I opted in previous PRs to check, before emitting an Interaction for execution, whether the interaction is a DDL statement and, if it is, to emit a Commit for each connection still in a transaction. This worked reasonably well, but because the interaction plan does not store the interactions that are actually executed, only the higher-level Interactions, I had to resort to workarounds that modify the Interactions inside the plan to persist the Commit I generated on demand.
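A minimal sketch of the commit-before-DDL check described above. The types here (`Query`, `Interaction`, the `in_tx` flags) and the `emit` function are hypothetical simplifications for illustration and are not the simulator's actual definitions:

```rust
// Hypothetical, simplified types; the simulator's real definitions differ.
#[derive(Debug, Clone, PartialEq)]
enum Query {
    Begin,
    Commit,
    CreateTable(String),
    Insert(String),
}

impl Query {
    // Assumption: only CREATE TABLE counts as DDL in this sketch.
    fn is_ddl(&self) -> bool {
        matches!(self, Query::CreateTable(_))
    }
}

#[derive(Debug, Clone, PartialEq)]
struct Interaction {
    connection: usize,
    query: Query,
}

/// Returns everything that must be executed for `next`.
/// If `next` is a DDL statement, a `Commit` is emitted first for every
/// connection that still has an open transaction, so the DDL runs serially.
/// `in_tx[i]` records whether connection `i` is inside a transaction.
fn emit(next: Interaction, in_tx: &mut [bool]) -> Vec<Interaction> {
    let mut out = Vec::new();
    if next.query.is_ddl() {
        for (conn, open) in in_tx.iter_mut().enumerate() {
            if *open {
                out.push(Interaction { connection: conn, query: Query::Commit });
                *open = false;
            }
        }
    }
    // Keep the transaction bookkeeping in sync with what is emitted.
    match next.query {
        Query::Begin => in_tx[next.connection] = true,
        Query::Commit => in_tx[next.connection] = false,
        _ => {}
    }
    out.push(next);
    out
}
```

If the plan only stores the higher-level collection, the extra `Commit` entries returned here have nowhere natural to live. Appending the returned `Vec<Interaction>` directly to a plan that stores `Interaction` values avoids that.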

Problem

However, I was stupid and overlooked the fact that for certain properties that allow queries to be generated in the middle (referred to as extensional queries in the code), we cannot specify the connection that should execute that query. This means that if a DDL statement occurred there, the simulator could emit the query but could not save it properly in the plan to reproduce it during shrinking. So, to fix this and make interaction generation/emission less brittle, I refactored the InteractionPlan so that it stores Vec<Interaction> instead.

Implications

  • Interaction is currently not serializable with Serde because it stores a function in Assertion. This means that we cannot serialize the plan into a plan.json, which to me is honestly fine, as the only things that used plan.json were the --load and --watch options, which almost nobody really used.

  • For load, instead of generating the whole plan, the simulator just read the plan from disk. The workaround for now is to load the cli_opts that were last run for that particular seed and use those exact options to run the simulation.

  • For watch, there is currently no workaround, but @alpaylan told me there are plans to make assertions serializable by embedding a custom language into the plan.sql file, meaning we will probably not need a JSON file at all to store the interaction plan. This embedded language will also make it much easier to bring back a more proper watch mode.

  • The current shrinking algorithms all have some notion of properties and of removing properties, but Interaction does not have this concept. So I added some metadata and an origin ID to each Interaction so that we can use binary search over the list of interactions to find all of the interactions that are part of the same Property. To support this, I added an InteractionBuilder and some utilities to iterate over and remove properties in the InteractionPlan.
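The origin-ID lookup from the last bullet can be sketched roughly as follows. This assumes interactions are appended in non-decreasing origin-ID order, which is what makes binary search valid. The field name `origin_id` and the methods `property_range`, `property`, and `remove_property` are hypothetical names for illustration, not the PR's actual API:

```rust
use std::ops::Range;

// Hypothetical, simplified types; the real Interaction carries more metadata.
#[derive(Debug, Clone, PartialEq)]
struct Interaction {
    origin_id: u64, // ID of the Property this interaction was generated from
    sql: String,
}

struct InteractionPlan {
    // Assumption: sorted by `origin_id` because properties are appended in order.
    interactions: Vec<Interaction>,
}

impl InteractionPlan {
    /// Index range of all interactions generated by property `id`,
    /// found with two binary searches (`partition_point`).
    fn property_range(&self, id: u64) -> Range<usize> {
        let start = self.interactions.partition_point(|i| i.origin_id < id);
        let end = self.interactions.partition_point(|i| i.origin_id <= id);
        start..end
    }

    /// All interactions that belong to property `id` (empty if none).
    fn property(&self, id: u64) -> &[Interaction] {
        &self.interactions[self.property_range(id)]
    }

    /// Removes a whole property, as a shrinking step might, and
    /// returns how many interactions were dropped.
    fn remove_property(&mut self, id: u64) -> usize {
        let range = self.property_range(id);
        let removed = range.len();
        self.interactions.drain(range);
        removed
    }
}
```

Because removal takes out one contiguous range, the remaining interactions stay sorted by origin ID, so later lookups can keep using binary search.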

Conclusion

Overall, this change simplifies the emission of interactions and ensures the InteractionPlan always stores the actual interactions that get executed. It also further decouples query generation logic from query emission logic.

nyrkio bot commented Oct 18, 2025

Nyrkiö Report for Commit: a0c7abb

No performance changes detected.

Remember that Nyrkiö results become more precise when more commits are merged. So please check back in a few days.

  will track `Interaction` instead of `Interactions` in the Plan, this
  change will make it impossible to serialize the InteractionPlan with Serde JSON.
- make --load just load the previous cli args
@pedrocarlo pedrocarlo force-pushed the sim-refactor-interaction branch 2 times, most recently from 2bec7a0 to 4808a9b Compare October 18, 2025 16:59
@pedrocarlo pedrocarlo marked this pull request as draft October 18, 2025 17:00
  Modify `generation/property.rs` to use the Builder
- add additional metadata to `Interaction` to give more context for
  shrinking and iterating over interactions that originated from the
  same interaction.
- add Iterator-like utilities for `InteractionPlan` to facilitate
  iterating over interactions that came from the same property:
  to calculate metrics per generation step 
- simplify generation as we now only store `Interaction`. So now we can 
  funnel most of the logic for interaction generation, metric update,
  and Interaction append in the `PlanGenerator::next`.
…te over properties, instead of handrolling property iteration
@pedrocarlo pedrocarlo force-pushed the sim-refactor-interaction branch from 4808a9b to eb0327b Compare October 18, 2025 22:05
@pedrocarlo pedrocarlo force-pushed the sim-refactor-interaction branch from eb0327b to 7a4498f Compare October 18, 2025 23:04
@pedrocarlo pedrocarlo marked this pull request as ready for review October 18, 2025 23:04
@pedrocarlo pedrocarlo force-pushed the sim-refactor-interaction branch from 7a4498f to a0c7abb Compare October 20, 2025 01:21