@alpaylan
Contributor

This PR is a working doc on a roadmap for the simulator. @pedrocarlo @LeMikaelF please take a look.

Actionable items:

- [ ] Implement generation and/or shadowing for one of the language features in [COVERAGE.md](./COVERAGE.md)
- [ ] At the moment, implementing a feature requires both adding a generation for it as well as

Contributor

If removing shadowing is an option, I think the sooner we remove it, the easier development will become: every function or query shape we add to the generator also has to be reimplemented in the simulator, and in my short experience that's far from trivial. In the last few days, I've had to fight with affinity and type coercion, column projections, and other things I've since forgotten.

Contributor Author

I have an idea that removes shadowing, which is essentially using SQLite as our shadow. Attach the same database to both SQLite and Turso, and just use SQLite anywhere we would use the shadow table. What do you think?

Contributor

So we would do writes with Turso, and reads with both Turso and SQLite. This sounds like a simple solution that would remove a lot of complexity from the simulator and would let us focus on the more important parts (query generation and properties). This sounds fantastic.

I wonder if there might be issues with concurrency or locking, especially once we start having multiple concurrent clients?

Another question is, at that point, is this oracle significantly different from differential? I know I keep circling back to this, but I don't fully comprehend what it is that the shadow oracle provides that the differential oracle doesn't.

Contributor Author

> I wonder if there might be issues with concurrency or locking, especially once we start having multiple concurrent clients?

Yeah, that's the part that makes me question it the most. I don't know what happens to the database state in a concurrent setting, e.g., whether it's possible to read the ephemeral state.

> Another question is, at that point, is this oracle significantly different from differential? I know I keep circling back to this, but I don't fully comprehend what it is that the shadow oracle provides that the differential oracle doesn't.

I think it's a fair question, let me explain my perspective. For one thing, not all queries are deterministic. If you have a `LIMIT` query, Turso actually doesn't guarantee that you'll get the same output (#2024). So we cannot rely purely on differential testing.

Another issue is locality. Bugs found by differential testing are very non-local: they provide little to no useful information for debugging. Properties give you much more local information.

The last one is that you want this infrastructure for the long term, because at some point the projects will diverge: Turso will be SQLite-compatible, but it won't be sqlite-rs. Having a bespoke PBT infrastructure will be very useful at that point.

Contributor

About your second point, that's true, I didn't think of that. If you do a `SELECT ... LIMIT 5`, what you need to check is that the rows are a subset of the unlimited `SELECT`, not that another database gives the same result.
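
To make that check concrete, here is a minimal sketch of the subset property in Python, using the standard `sqlite3` module as a stand-in for whichever engine is under test. The helper name `limit_is_consistent` and the table `t` are hypothetical and not part of the simulator; the sketch also assumes a non-negative limit.

```python
import sqlite3
from collections import Counter


def limit_is_consistent(conn, table, limit):
    """Check a LIMIT query against the unlimited query on the same connection."""
    # `table` is interpolated directly, so only pass trusted identifiers.
    all_rows = Counter(conn.execute(f"SELECT * FROM {table}").fetchall())
    limited = conn.execute(
        f"SELECT * FROM {table} LIMIT ?", (limit,)
    ).fetchall()
    # Without ORDER BY, any `limit` rows are a valid answer, so compare the
    # results as multisets instead of expecting one specific row order.
    is_sub_multiset = not (Counter(limited) - all_rows)
    has_expected_size = len(limited) == min(limit, sum(all_rows.values()))
    return is_sub_multiset and has_expected_size


# Hypothetical example data: eight rows in a single table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(i, f"row{i}") for i in range(8)]
)
print(limit_is_consistent(conn, "t", 5))
```

The property only talks about one database, so it stays valid even when two engines legitimately return different rows for the same `LIMIT` query.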

This makes me think that doublecheck could also give false positives on queries like `insert into t select * from s limit 5`. I'm currently working on extending the generator to produce queries like this, but without a way to have oracle-specific generation (like you suggested in this document), this won't work.


I'm nearing the end of my day here, I'll pick this up tomorrow.

Contributor Author

Back to the question of shadowing. I think I need to be educated on how MVCC works, but I presume it's possible to read the ephemeral (non-committed) state from outside? It's not going to be the default behavior, but surely it's something that's accessible? Would SQLite be able to read the MVCC state at a given point from a Turso-generated database?

Contributor

@pedrocarlo Nov 14, 2025

So MVCC is a construct that mainly exists in memory. We could maybe create some custom hooks that let us read rows from MVCC, but that would be slightly complicated, I imagine.

> Would SQLite be able to read the MVCC state at a given point

Physically (e.g., on disk), a Turso DB file using MVCC should theoretically have no differences from a SQLite DB. The only thing that is currently different is that we have a logical log file that is used for MVCC, but that is only used for checkpointing and for faster appends. After you close the db connection and checkpoint the WAL and Logical Log, it should just be a SQLite file.

Now, I am not sure how feasible it is to make SQLite read the in-memory state from Turso while the simulation is running.

Contributor Author

I see. The reason I thought about SQLite is that (as far as I understand), whatever happens, Turso will be compatible with the SQLite file format. As such, we should always be able to read a Turso-generated file from SQLite to determine the database state. So the fact that semantics diverge isn't a problem, because we aren't thinking in terms of functional equivalence, only state equivalence.
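
As a rough illustration of what "state equivalence" could mean in practice, the sketch below reads every user table from a database file with Python's standard `sqlite3` module and compares two files by content rather than by query behavior. The function names `snapshot_state` and `same_state` are hypothetical, and the sketch assumes the file has been closed and checkpointed so that plain SQLite can open it.

```python
import sqlite3


def snapshot_state(path):
    """Return {table_name: sorted rows} for every user table in a database file."""
    conn = sqlite3.connect(path)
    try:
        # Skip SQLite's internal tables, whose names start with "sqlite_".
        tables = [
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND substr(name, 1, 7) != 'sqlite_' "
                "ORDER BY name"
            )
        ]
        state = {}
        for name in tables:
            quoted = '"' + name.replace('"', '""') + '"'
            rows = conn.execute(f"SELECT * FROM {quoted}").fetchall()
            # Sort by repr so rows with NULLs or mixed types still order cleanly.
            state[name] = sorted(rows, key=repr)
        return state
    finally:
        conn.close()


def same_state(path_a, path_b):
    """True when both files hold the same tables with the same rows."""
    return snapshot_state(path_a) == snapshot_state(path_b)
```

Comparing sorted rows instead of raw bytes is a deliberate choice here: two files can hold the same logical state while differing in page layout.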

Contributor

If I understand correctly, the double-read (Turso+SQLite) strategy you suggest would work only:

- in the absence of concurrent queries (because it would be hard to synchronize SQLite with the other processes)
- in the absence of MVCC
- in the absence of transactions (because SQLite wouldn't see uncommitted data or the same row versions as the transaction under test)

But there are ways around these limitations.

- For concurrent queries, we would probably need a custom scheduler. I don't know if we want to do that.
- For MVCC, we could just validate it using a doublecheck-like strategy, where we run an interaction plan with MVCC and run it again using WAL and SQLite, checking that the resulting binaries are identical.
- For transactions, we could patch SQLite to be able to specify the "end mark" used by transactions. Something like Oracle's `SELECT ... AS OF :timestamp`.

The main benefit of the double-read oracle is that it could save us from having to maintain a shadow model, but the drawbacks may outweigh the benefits.
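
The transaction limitation can be reproduced with stock SQLite alone. The sketch below uses Python's standard `sqlite3` module with two connections to one file: a second connection playing the role of the SQLite reader does not see a row until the writer commits. The function name `visible_row_counts` is hypothetical, and this only shows SQLite-to-SQLite behavior in the default rollback-journal mode, not anything specific to Turso or MVCC.

```python
import os
import sqlite3
import tempfile


def visible_row_counts():
    """Return the reader's row count before and after the writer commits."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None turns off the module's implicit transactions,
        # so the BEGIN/COMMIT below are exactly what SQLite executes.
        writer = sqlite3.connect(path, isolation_level=None)
        reader = sqlite3.connect(path, isolation_level=None)
        try:
            writer.execute("CREATE TABLE t (a INTEGER)")
            writer.execute("BEGIN")
            writer.execute("INSERT INTO t VALUES (1)")
            # The insert is still uncommitted, so the reader cannot see it.
            before = reader.execute("SELECT count(*) FROM t").fetchall()[0][0]
            writer.execute("COMMIT")
            after = reader.execute("SELECT count(*) FROM t").fetchall()[0][0]
            return before, after
        finally:
            reader.close()
            writer.close()


print(visible_row_counts())  # prints (0, 1)
```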

Contributor Author

Another idea I have is to enable shadowing on both SQLite and the custom shadow. We can keep the custom shadow inside the concurrent sections and use SQLite in the other cases. Having a clean architecture for doing this is crucial, but I believe it's doable.

Actionable items:

- [ ] Implement generation and/or shadowing for one of the language features in [COVERAGE.md](./COVERAGE.md)
- [ ] At the moment, implementing a feature requires both adding a generation for it as well as

Contributor

What do you think of separating the `Arbitrary` trait into `ArbitraryShadowed` and `Arbitrary`?

It could be an easy refactor to rename every implementation of `Arbitrary` to `ArbitraryShadowed` and add a separate `Arbitrary` trait that would delegate by default to `ArbitraryShadowed`, and then all or most new development could go into `Arbitrary`.

We could add this as a GH issue and flag it with "help wanted", or "good first issue"?

Contributor Author

I think we should first settle the "do we want to keep shadowing?" question you posed, because it's an important one.

- [ ] Multi-client generation: TODO
- [ ] LLM-guided generation: TODO
- [ ] Custom feedback guided generation: TODO
- [ ] Coverage-guided generation: TODO

Contributor

Does this mean AFL? I've been dying to try it out on Turso.

Fuzzing SQLite with AFL was a major success story a few years ago:

In the second link, they found 22 crashes in "30 minutes of actual work", by running the existing SQLite test cases through AFL.

Contributor Author

What I had in mind was AFL plus structured mutation/generation. I've previously built bespoke coverage infrastructure, so I know it's not that hard to do, even though it might be a bit slow the way we do it. We can use the custom coverage we obtain to guide the generation. Isn't Turso currently fuzzed? I would expect it to be.

Contributor

It has custom fuzzers for many statements and behaviours, and if you look around you will find even more fuzzers that are written but, I think, not actively used. But there is no centralized fuzzer.

@pedrocarlo
Contributor

Nice stuff! Loved the ideas and the Coverage document.


## Fault Injection

- [ ] TODO

Contributor

In case you missed it, short writes are now taken care of in unreliable-libc (#3569).

There's also clock skew, but it's always simulated, it's not a separate explicit fault.

Contributor Author

I kept the fault injection part TODO on purpose because I really haven't touched it much. We can iterate on that part even more. We should probably have a better description of how that works too.
