diff --git a/COMPAT.md b/COMPAT.md index d4243613e8..77e55f38f2 100644 --- a/COMPAT.md +++ b/COMPAT.md @@ -4,7 +4,7 @@ This document describes the compatibility of Turso with SQLite. ## Table of contents -- [Turso compatibility with SQLite](#limbo-compatibility-with-sqlite) +- [Turso compatibility with SQLite](#turso-compatibility-with-sqlite) - [Table of contents](#table-of-contents) - [Overview](#overview) - [Features](#features) @@ -389,7 +389,7 @@ Modifiers: | json_type(json) | Yes | | | json_type(json,path) | Yes | | | json_valid(json) | Yes | | -| json_valid(json,flags) | | | +| json_valid(json,flags) | Yes | | | json_quote(value) | Yes | | | json_group_array(value) | Yes | | | jsonb_group_array(value) | Yes | | diff --git a/simulator/COVERAGE.md b/simulator/COVERAGE.md new file mode 100644 index 0000000000..f7913b72d0 --- /dev/null +++ b/simulator/COVERAGE.md @@ -0,0 +1,489 @@ +# Simulator coverage of Turso + +This document describes how much of the Turso capabilities +simulator covers and tests. + +## Table of contents + +- [Simulator coverage of Turso](#simulator-coverage-of-turso) + - [Table of contents](#table-of-contents) + - [SQLite query language](#sqlite-query-language) + - [Statements](#statements) + - [PRAGMA](#pragma) + - [Expressions](#expressions) + - [SQL functions](#sql-functions) + - [Scalar functions](#scalar-functions) + - [Mathematical functions](#mathematical-functions) + - [Aggregate functions](#aggregate-functions) + - [Date and time functions](#date-and-time-functions) + - [JSON functions](#json-functions) + - [SQLite journaling modes](#sqlite-journaling-modes) + - [Extensions](#extensions) + - [UUID](#uuid) + - [regexp](#regexp) + - [Vector](#vector) + - [Time](#time) + +## SQLite query language + +### Statements + +| Statement | Status | Comment | +| ------------------------- | ------- | --------------------------------------------------------------------------------- | +| ALTER TABLE | Partial | TODO | +| ANALYZE | No | | +| ATTACH DATABASE | No | | +| BEGIN TRANSACTION | Partial | TODO | +| COMMIT TRANSACTION | Partial | TODO | +| CREATE INDEX | Partial | TODO | +| CREATE TABLE | Partial | TODO | +| CREATE TABLE ... STRICT | No | | +| CREATE TRIGGER | No | | +| CREATE VIEW | No | | +| CREATE VIRTUAL TABLE | No | | +| DELETE | Partial | TODO | +| DETACH DATABASE | No | | +| DROP INDEX | Partial | TODO | +| DROP TABLE | Partial | TODO | +| DROP TRIGGER | No | | +| DROP VIEW | No | | +| END TRANSACTION | Partial | TODO | +| EXPLAIN | Partial | | +| INDEXED BY | No | | +| INSERT | Partial | TODO | +| ON CONFLICT clause | No | | +| REINDEX | No | | +| RELEASE SAVEPOINT | No | | +| REPLACE | No | | +| RETURNING clause | No | TODO | +| ROLLBACK TRANSACTION | Partial | TODO | +| SAVEPOINT | No | | +| SELECT | Partial | TODO | +| SELECT ... WHERE | Partial | TODO | +| SELECT ... WHERE ... LIKE | No | | +| SELECT ... LIMIT | Partial | TODO | +| SELECT ... ORDER BY | Partial | TODO | +| SELECT ... GROUP BY | No | | +| SELECT ... HAVING | No | | +| SELECT ... JOIN | Partial | TODO | +| SELECT ... CROSS JOIN | No | SQLite CROSS JOIN means "do not reorder joins". We don't support that yet anyway. | +| SELECT ... INNER JOIN | Partial | TODO | +| SELECT ... OUTER JOIN | No | | +| SELECT ... JOIN USING | No | | +| SELECT ... NATURAL JOIN | No | | +| UPDATE | Partial | | +| VACUUM | No | | +| WITH clause | No | | +| WINDOW functions | No | | + +#### [PRAGMA](https://www.sqlite.org/pragma.html) + +| Statement | Status | Comment | +| -------------------------------- | ---------- | ------------------------------------------------ | +| PRAGMA analysis_limit | No | | +| PRAGMA application_id | No | | +| PRAGMA auto_vacuum | Yes | | +| PRAGMA automatic_index | No | | +| PRAGMA busy_timeout | No | | +| PRAGMA busy_timeout | No | | +| PRAGMA cache_size | No | | +| PRAGMA cache_spill | No | | +| PRAGMA case_sensitive_like | Not Needed | deprecated in SQLite | +| PRAGMA cell_size_check | No | | +| PRAGMA checkpoint_fullsync | No | | +| PRAGMA collation_list | No | | +| PRAGMA compile_options | No | | +| PRAGMA count_changes | Not Needed | deprecated in SQLite | +| PRAGMA data_store_directory | Not Needed | deprecated in SQLite | +| PRAGMA data_version | No | | +| PRAGMA database_list | No | | +| PRAGMA default_cache_size | Not Needed | deprecated in SQLite | +| PRAGMA defer_foreign_keys | No | | +| PRAGMA empty_result_callbacks | Not Needed | deprecated in SQLite | +| PRAGMA encoding | No | | +| PRAGMA foreign_key_check | No | | +| PRAGMA foreign_key_list | No | | +| PRAGMA foreign_keys | No | | +| PRAGMA freelist_count | No | | +| PRAGMA full_column_names | Not Needed | deprecated in SQLite | +| PRAGMA fullsync | No | | +| PRAGMA function_list | No | | +| PRAGMA hard_heap_limit | No | | +| PRAGMA ignore_check_constraints | No | | +| PRAGMA incremental_vacuum | No | | +| PRAGMA index_info | No | | +| PRAGMA index_list | No | | +| PRAGMA index_xinfo | No | | +| PRAGMA integrity_check | No | | +| PRAGMA journal_mode | No | | +| PRAGMA journal_size_limit | No | | +| PRAGMA legacy_alter_table | No | | +| PRAGMA legacy_file_format | No | | +| PRAGMA locking_mode | No | | +| PRAGMA max_page_count | No | | +| PRAGMA mmap_size | No | | +| PRAGMA module_list | No | | +| PRAGMA optimize | No | | +| PRAGMA page_count | No | | +| PRAGMA page_size | No | | +| PRAGMA parser_trace | No | | +| PRAGMA pragma_list | No | | +| PRAGMA query_only | No | | +| PRAGMA quick_check | No | | +| PRAGMA read_uncommitted | No | | +| PRAGMA recursive_triggers | No | | +| PRAGMA reverse_unordered_selects | No | | +| PRAGMA schema_version | No | For writes, emulate defensive mode (always noop) | +| PRAGMA secure_delete | No | | +| PRAGMA short_column_names | Not Needed | deprecated in SQLite | +| PRAGMA shrink_memory | No | | +| PRAGMA soft_heap_limit | No | | +| PRAGMA stats | No | Used for testing in SQLite | +| PRAGMA synchronous | No | | +| PRAGMA table_info | No | | +| PRAGMA table_list | No | | +| PRAGMA table_xinfo | No | | +| PRAGMA temp_store | No | | +| PRAGMA temp_store_directory | Not Needed | deprecated in SQLite | +| PRAGMA threads | No | | +| PRAGMA trusted_schema | No | | +| PRAGMA user_version | No | | +| PRAGMA vdbe_addoptrace | No | | +| PRAGMA vdbe_debug | No | | +| PRAGMA vdbe_listing | No | | +| PRAGMA vdbe_trace | No | | +| PRAGMA wal_autocheckpoint | No | | +| PRAGMA wal_checkpoint | No | | +| PRAGMA writable_schema | No | | + +### Expressions + +Feature support of [sqlite expr syntax](https://www.sqlite.org/lang_expr.html). + +| Syntax | Status | Comment | +| ------------------------- | ------- | ------- | +| literals | Yes | | +| schema.table.column | No | | +| unary operator | Yes | | +| binary operator | Partial | TODO | +| agg() FILTER (WHERE ...) | No | | +| ... OVER (...) | No | | +| (expr) | Yes | | +| CAST (expr AS type) | No | | +| COLLATE | No | | +| (NOT) LIKE | Yes | | +| (NOT) GLOB | No | | +| (NOT) REGEXP | No | | +| (NOT) MATCH | No | | +| IS (NOT) | No | | +| IS (NOT) DISTINCT FROM | No | | +| (NOT) BETWEEN ... AND ... | No | | +| (NOT) IN (SELECT...) | No | | +| (NOT) EXISTS (SELECT...) | No | | +| x (SELECT...)) | No | | +| CASE WHEN THEN ELSE END | No | | +| RAISE | No | | + +### SQL functions + +#### Scalar functions + +| Function | Status | Comment | +| ---------------------------- | ------ | ------- | +| abs(X) | No | | +| changes() | No | | +| char(X1,X2,...,XN) | No | | +| coalesce(X,Y,...) | No | | +| concat(X,...) | No | | +| concat_ws(SEP,X,...) | No | | +| format(FORMAT,...) | No | | +| glob(X,Y) | No | | +| hex(X) | No | | +| ifnull(X,Y) | No | | +| iif(X,Y,Z) | No | | +| instr(X,Y) | No | | +| last_insert_rowid() | No | | +| length(X) | No | | +| like(X,Y) | No | | +| like(X,Y,Z) | No | | +| likelihood(X,Y) | No | | +| likely(X) | No | | +| load_extension(X) | No | | +| load_extension(X,Y) | No | | +| lower(X) | No | | +| ltrim(X) | No | | +| ltrim(X,Y) | No | | +| max(X,Y,...) | No | | +| min(X,Y,...) | No | | +| nullif(X,Y) | No | | +| octet_length(X) | No | | +| printf(FORMAT,...) | No | | +| quote(X) | No | | +| random() | No | | +| randomblob(N) | No | | +| replace(X,Y,Z) | No | | +| round(X) | No | | +| round(X,Y) | No | | +| rtrim(X) | No | | +| rtrim(X,Y) | No | | +| sign(X) | No | | +| soundex(X) | No | | +| sqlite_compileoption_get(N) | No | | +| sqlite_compileoption_used(X) | No | | +| sqlite_offset(X) | No | | +| sqlite_source_id() | No | | +| sqlite_version() | No | | +| substr(X,Y,Z) | No | | +| substr(X,Y) | No | | +| substring(X,Y,Z) | No | | +| substring(X,Y) | No | | +| total_changes() | No | | +| trim(X) | No | | +| trim(X,Y) | No | | +| typeof(X) | No | | +| unhex(X) | No | | +| unhex(X,Y) | No | | +| unicode(X) | No | | +| unlikely(X) | No | | +| upper(X) | No | | +| zeroblob(N) | No | | + +#### Mathematical functions + +| Function | Status | Comment | +| ---------- | ------ | ------- | +| acos(X) | No | | +| acosh(X) | No | | +| asin(X) | No | | +| asinh(X) | No | | +| atan(X) | No | | +| atan2(Y,X) | No | | +| atanh(X) | No | | +| ceil(X) | No | | +| ceiling(X) | No | | +| cos(X) | No | | +| cosh(X) | No | | +| degrees(X) | No | | +| exp(X) | No | | +| floor(X) | No | | +| ln(X) | No | | +| log(B,X) | No | | +| log(X) | No | | +| log10(X) | No | | +| log2(X) | No | | +| mod(X,Y) | No | | +| pi() | No | | +| pow(X,Y) | No | | +| power(X,Y) | No | | +| radians(X) | No | | +| sin(X) | No | | +| sinh(X) | No | | +| sqrt(X) | No | | +| tan(X) | No | | +| tanh(X) | No | | +| trunc(X) | No | | + +#### Aggregate functions + +| Function | Status | Comment | +| ----------------- | ------ | ------- | +| avg(X) | No | | +| count() | No | | +| count(*) | No | | +| group_concat(X) | No | | +| group_concat(X,Y) | No | | +| string_agg(X,Y) | No | | +| max(X) | No | | +| min(X) | No | | +| sum(X) | No | | +| total(X) | No | | + +#### Date and time functions + +| Function | Status | Comment | +| ----------- | ------ | ------- | +| date() | No | | +| time() | No | | +| datetime() | No | | +| julianday() | No | | +| unixepoch() | No | | +| strftime() | No | | +| timediff() | No | | + +Modifiers: + +| Modifier | Status | Comment | +| -------------- | ------- | ------------------------------------------- | +| Days | Yes | | +| Hours | Yes | | +| Minutes | Yes | | +| Seconds | Yes | | +| Months | Yes | | +| Years | Yes | | +| TimeOffset | Yes | | +| DateOffset | Yes | | +| DateTimeOffset | Yes | | +| Ceiling | Yes | | +| Floor | No | | +| StartOfMonth | Yes | | +| StartOfYear | Yes | | +| StartOfDay | Yes | | +| Weekday(N) | Yes | | +| Auto | No | | +| UnixEpoch | No | | +| JulianDay | No | | +| Localtime | Partial | requires fixes to avoid double conversions. | +| Utc | Partial | requires fixes to avoid double conversions. | +| Subsec | Yes | | + +#### JSON functions + +| Function | Status | Comment | +| ---------------------------------- | ------ | ------- | +| json(json) | No | | +| jsonb(json) | No | | +| json_array(value1,value2,...) | No | | +| jsonb_array(value1,value2,...) | No | | +| json_array_length(json) | No | | +| json_array_length(json,path) | No | | +| json_error_position(json) | No | | +| json_extract(json,path,...) | No | | +| jsonb_extract(json,path,...) | No | | +| json -> path | No | | +| json ->> path | No | | +| json_insert(json,path,value,...) | No | | +| jsonb_insert(json,path,value,...) | No | | +| json_object(label1,value1,...) | No | | +| jsonb_object(label1,value1,...) | No | | +| json_patch(json1,json2) | No | | +| jsonb_patch(json1,json2) | No | | +| json_pretty(json) | No | | +| json_remove(json,path,...) | No | | +| jsonb_remove(json,path,...) | No | | +| json_replace(json,path,value,...) | No | | +| jsonb_replace(json,path,value,...) | No | | +| json_set(json,path,value,...) | No | | +| jsonb_set(json,path,value,...) | No | | +| json_type(json) | No | | +| json_type(json,path) | No | | +| json_valid(json) | No | | +| json_valid(json,flags) | No | | +| json_quote(value) | No | | +| json_group_array(value) | No | | +| jsonb_group_array(value) | No | | +| json_group_object(label,value) | No | | +| jsonb_group_object(name,value) | No | | +| json_each(json) | No | | +| json_each(json,path) | No | | +| json_tree(json) | No | | +| json_tree(json,path) | No | | + +## [SQLite journaling modes](https://www.sqlite.org/pragma.html#pragma_journal_mode) + +We currently don't have plan to support the rollback journal mode as it locks the database file during writes. +Therefore, all rollback-type modes (delete, truncate, persist, memory) are marked are `Not Needed` below. + +| Journal mode | Status | Comment | +| ------------ | ------ | ------------------------------ | +| wal | No | | +| wal2 | No | experimental feature in sqlite | +| delete | No | | +| truncate | No | | +| persist | No | | +| memory | No | | + +## Extensions + +**We currently do not support testing any Turso extensions.** + +### UUID + +UUID's in Turso are `blobs` by default. + +| Function | Status | Comment | +| --------------------- | ------ | ------- | +| uuid4() | No | | +| uuid4_str() | No | | +| uuid7(X?) | No | | +| uuid7_timestamp_ms(X) | No | | +| uuid_str(X) | No | | +| uuid_blob(X) | No | | + +### regexp + +The `regexp` extension is compatible with [sqlean-regexp](https://github.com/nalgeon/sqlean/blob/main/docs/regexp.md). + +| Function | Status | Comment | +| -------------------------------------------- | ------ | ------- | +| regexp(pattern, source) | No | | +| regexp_like(source, pattern) | No | | +| regexp_substr(source, pattern) | No | | +| regexp_capture(source, pattern[, n]) | No | | +| regexp_replace(source, pattern, replacement) | No | | + +### Vector + +The `vector` extension is compatible with libSQL native vector search. + +| Function | Status | Comment | +| --------------------------------------- | ------ | ------- | +| vector(x) | No | | +| vector32(x) | No | | +| vector64(x) | No | | +| vector_extract(x) | No | | +| vector_distance_cos(x, y) | No | | +| vector_distance_l2(x, y) | No | | +| vector_concat(x, y) | No | | +| vector_slice(x, start_index, end_index) | No | | + +### Time + + + +| Function | Status | Comment | +| ------------------------------------------------------------------- | ------ | ------- | +| time_now() | No | | +| time_date(year, month, day[, hour, min, sec[, nsec[, offset_sec]]]) | No | | +| time_get_year(t) | No | | +| time_get_month(t) | No | | +| time_get_day(t) | No | | +| time_get_hour(t) | No | | +| time_get_minute(t) | No | | +| time_get_second(t) | No | | +| time_get_nano(t) | No | | +| time_get_weekday(t) | No | | +| time_get_yearday(t) | No | | +| time_get_isoyear(t) | No | | +| time_get_isoweek(t) | No | | +| time_get(t, field) | No | | +| time_unix(sec[, nsec]) | No | | +| time_milli(msec) | No | | +| time_micro(usec) | No | | +| time_nano(nsec) | No | | +| time_to_unix(t) | No | | +| time_to_milli(t) | No | | +| time_to_micro(t) | No | | +| time_to_nano(t) | No | | +| time_after(t, u) | No | | +| time_before(t, u) | No | | +| time_compare(t, u) | No | | +| time_equal(t, u) | No | | +| time_add(t, d) | No | | +| time_add_date(t, years[, months[, days]]) | No | | +| time_sub(t, u) | No | | +| time_since(t) | No | | +| time_until(t) | No | | +| time_trunc(t, field) | No | | +| time_trunc(t, d) | No | | +| time_round(t, d) | No | | +| time_fmt_iso(t[, offset_sec]) | No | | +| time_fmt_datetime(t[, offset_sec]) | No | | +| time_fmt_date(t[, offset_sec]) | No | | +| time_fmt_time(t[, offset_sec]) | No | | +| time_parse(s) | No | | +| dur_ns() | No | | +| dur_us() | No | | +| dur_ms() | No | | +| dur_s() | No | | +| dur_m() | No | | +| dur_h() | No | | diff --git a/simulator/ROADMAP.md b/simulator/ROADMAP.md new file mode 100644 index 0000000000..70c044786c --- /dev/null +++ b/simulator/ROADMAP.md @@ -0,0 +1,164 @@ +# Simulator Roadmap + +This document describes the current roadmap for the simulator. There +are 5 crucial aspects in the development of the simulator: + +1. Coverage: The reliability provided by random testing +is as strong as the input space of the random generation. We keep a detailed +[`COVERAGE.md`](./COVERAGE.md) file that describes how each feature of Turso is +covered by the simulator testing. +2. Input generation: For an input space as large as SQL, and a system as complex as +Turso, we cannot naively generate the whole input space, we must be selective and smart. +3. Correctness: What does it mean for Turso to be correct? We primarily rely on +[*properties*](#properties) as well as several [oracles](#oracles). +4. Debugging: Finding bugs is only the first half of testing a system, we must also +aim to help fix them. This is why simulator provides input shrinking as well as interactive +debugging capabilities. +5. Fault Injection: The simulator natively supports fault injection into the system, +simulating low level failures in the network or file system layer. + +## Coverage + +For detailed information about how much of the input space we currently cover, please +check the [`COVERAGE.md`](./COVERAGE.md) file. Currently, most of SQLite query language +is either not supported or partial, and functions aren't supported. + +Actionable items: + +- [ ] Implement generation and/or shadowing for one of the languages features in [COVERAGE.md](./COVERAGE.md) +- [ ] At the moment, implementing a feature requires both adding a generation for it as well as + a shadow operation for keeping track. For some oracles such as `doublecheck` or `differential`, we do not + need the shadowing. So we should separate the generation into `shadowable/not shadowable` paths, and generate + based on the current oracle. +- [ ] Run the simulator with coverage measurement, highlighting parts of Turso not touched by the simulator. + +## Input Generation + +- [x] Single client generation: In the current form, we generate a single client, although that may interact with Turso +from multiple connections. +- [ ] Multi-client generation: Instead of a single client making multiple connections, what would be more manageable is +having multiple clients, and the orchestrator picking a next client. This would also follow into concurrent client +generation in the future. +- [ ] Invalid Input generation: With the default oracle that relies on properties, invalid inputs did not make much sense. +For other oracles such as `differential` or `doublecheck`, they do make sense, because even in the case of errors the database +state should be equivalent with SQLite. It would make sense to deliberately break working queries to construct possibly +invalid inputs for testing unhappy paths. +- [ ] LLM-guided generation: There's quite a bit of potential in pushing an LLM to guide the generation. We had some +preliminary discussions on this, but a well-planned proposal will be necessary for implementing it. +- [ ] Custom feedback guided generation: SQLancer analyzes the `EXPLAIN PLAN`s to decide if a database state isn't diverse +enough for constructing interesting DQLs and instead starts to generate DMLs. We can use a similar approach. +- [ ] Coverage-guided generation: Coverage guidance has been a striking force in the last decade of random testing. Any +proposal on advancing the simulator in this area will be welcomed. +- [ ] Continuation Simulations: Simulations that load an existing DB file and start testing on top of that. +- [ ] Long-term simulations: Most of our simulations are bounded by minutes, why shouldn't we run OSS-Fuzz style continuous campaigns? +This requires implementing continuation simulations so we can preserve progress over time if the underlying machine fails. + +## Correctness + +- [ ] External property language: Currently, properties are hardcoded into the simulator as new variants that define +how to generate a sequence of database interactions and assertions that will be used by the simulator for testing +correctness properties defined in [#properties](#properties). There's a sketch of an external language for expressing +properties using [*generation actions*](https://alperenkeles.com/documents/d4.pdf) in our previous paper submission. +Ideally, we'll implement this language as a frontend for expressing properties, after which we'll be able to dynamically +include/exclude certain properties, change them in small ways based on the requirements. +- [ ] SQL with Contracts (SQL-C): There's a work-in-progress pull request ([https://github.com/tursodatabase/turso/pull/3715](https://github.com/tursodatabase/turso/pull/3715)) +for a SQL extension that includes inline bindings and assertions reasoning about dataframes. As this is done, the simulator will +output SQL-C files, which can be consumed by any SQL-C executor, decoupling the generation/execution in the simulator. +- [ ] Shadow State Abstraction: We currently maintain a shadow Turso, a simple in-memory key-value store that mimics the semantics +of the SQL statements. We use this shadow state for both random input generation as well as correctness checking in some of the +properties. However, maintaining a shadow state for all of SQL is virtually impossible, hence we would like to have alternative +solutions that make it easier to extend our reach, because a shadow implementation of any feature is required in order to use the simulator. +In this direction, we shall abstract away the shadow state, allowing multiple implementation of shadowing, and the ability to switch between them. +- [ ] SQLite as shadow state: Currently, we have a custom shadow state implementation that mimics Turso's behavior. +Another approach would be to use SQLite as the shadow state, using SQLite to be the source of truth for the contents +of the database. We can generally rely on this fact because Turso binary format is compatible with SQLite, but +the ephemeral state that's in-process will be an issue. + +### Properties + +There are a number of properties currently implemented. We implement existing properties from literature (mainly SQLancer), +as well as bespoke properties that is specific and niche. We also have a number of properties that rely on fault injection, +which is a unique feature of the simulator. + +#### SQLancer Properties + +SQLancer has a number of properties implemented for testing various database systems. +We have ported some of them to the simulator, and we plan to port more in the future. The current list of +ported properties is as follows: + +- [ ] [Pivoted Query Synthesis (PQS)](https://www.usenix.org/system/files/osdi20-rigger.pdf) + - [ ] InsertSelect: This property is a variation of the Pivoted Query Synthesis paper, where we first + insert some values into the database, and try to read them back by crafting a SELECT query that should return + the values. + - [x] `INSERT VALUES ...; SELECT ...:`: We can currently test inserting some literal values and reading them back. + - [ ] `INSERT SELECT ...; SELECT ...;`: We cannot test using the result of another SELECT for picking the inserted values. + - [ ] TODO: (there are other variants we currently do not support) +- [ ] [Non-optimizing Reference Engine Construction (NoREC)](https://arxiv.org/abs/2007.08292) + - [x] SelectSelectOptimizer: This is another SQLancer-inspired property, where we check that `SELECT * from t WHERE p` is equivalent to + `SELECT p from t` by asserting that the number of returned rows in the first one is equal to the number of `TRUE` values in the second one. +- [ ] [Ternary Logic Partitioning (TLP)](https://dl.acm.org/doi/pdf/10.1145/3428279) + - [x] WhereTrueFalseNull: This property relies on the three-valued logic of SQL, where `p OR (NOT P) OR (p IS NULL)` should always be `TRUE`. + - [x] UNIONAllPreservesCardinality: This property asserts that merging the results of multiple queries via `UNION ALL` is equivalent to +- [ ] [Differential Query Execution (DQE)](https://ieeexplore.ieee.org/document/10172736) + - [ ] TODO: This paper is not open access, if anyone has access, please update this. +- [ ] [Cardinality Estimation Restriction Testing (CERT)](https://arxiv.org/pdf/2306.00355) +- [ ] [Differential Query Plans (DQP)](https://dl.acm.org/doi/pdf/10.1145/3654991) +- [ ] [Constant Optimization Driven Database System Testing (CODDTest)](https://arxiv.org/abs/2501.11252) + +#### Bespoke Properties + +- [x] ReadYourUpdatesBack: This property is similar to InsertSelect, the main difference being we use UPDATE for changing some existing +values in the database and then checking the result of the UPDATE. +- [x] DeleteSelect: This property is similar to ReadYourUpdatesBack, the main difference being we use DELETE for removing some existing +values in the database and then checking the result of the DELETE, mainly by checking that the deleted values are not present anymore. +- [x] DropSelect: This is a failure property, where we drop a table and then check that any SELECT queries on the dropped table fail as expected. +- [x] DoubleCreateFailure: This is a failure property, where we try to create a table that already exists and check that the operation fails as expected. +- [x] SelectLimit: This property checks that the LIMIT clause in SELECT statements is respected by checking the cardinality of the returned results. + +#### Shadow State Properties + +These properties are invariants that check the consistency between Turso and the shadow state. + +- [x] TableHasExpectedContent: This property checks that a specific table in Turso has the same content as the shadow state. +- [x] AllTableHaveExpectedContent: This property checks that all tables in Turso have the same content as the shadow state. + +#### Fault Injection Properties + +- [x] FsyncNoWait: TODO +- [x] FaultyQuery: TODO + +### Oracles + +- [x] Property: The `property` oracle tests the properties presented above by generating random SQL statements and assertions that follow them. +- [x] Differential: The `differential` oracle tests Turso against SQLite. +- [x] Doublecheck: The `doublecheck` oracle tests Turso against itself for checking determinism across runs. +- [ ] Oracle-based Generation: Currently, we generate regardless the oracle, but `property` oracle actually poses limitations compared to `differential` +and `doublecheck` oracles, in those oracles we have much more freedom to generate random inputs. We must leverage this freedom. + +## Debugging + +- [ ] Automated Shrinking + - [x] Statement Removal + - [ ] Heuristic removal + - [x] Removing unused tables + - [ ] Removing all DQLs + - [x] Backtracking removal + - [ ] Statement shrinking + - [ ] Value removal + - [ ] Removing inserted values from `INSERT` statements + - [ ] Value shrinking + - [ ] Removing parts of SQL expressions (e.g `x AND y` into `x` or `y`) +- [ ] Interactive debugging (currently broken) +- [ ] Fault localization + +## Fault Injection + +- [ ] TODO + +## Data Collection + +At the moment, we collect no data from the simulator, which makes it hard to make future decisions. Some examples are: + +- Which properties are the most useful for finding bugs? +- What are the bottlenecks of the simulator speed, how can we optimize them? +- Which features/profiles are used, which are not, so we can focus on simplification by removing unused features or understanding why they aren't used. \ No newline at end of file