Commit 2752c77

Merge 'simulator: Add Bug Database(BugBase)' from Alperen Keleş
Previously, the simulator used `tempfile` to store the resulting interaction plans, database file, seeds, and all other relevant information. This made the information ephemeral, so we could not properly use the results of previous runs to optimize future runs. This PR removes the CLI option `output_dir` and bases the storage infrastructure on the `BugBase` interface.

Reviewed-by: Pere Diaz Bou <[email protected]>

Closes #1276
2 parents d67e1b6 + 0bee24e commit 2752c77

13 files changed, +560 -215 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -35,3 +35,4 @@ dist/
 testing/limbo_output.txt
 **/limbo_output.txt
 testing/test.log
+.bugbase

Cargo.lock

Lines changed: 36 additions & 4 deletions

docs/testing.md

Lines changed: 56 additions & 1 deletion
@@ -74,8 +74,63 @@ This will enable trace-level logs for the limbo_core crate and disable logs else
 
 ## Deterministic Simulation Testing (DST):
 
-TODO!
+Limbo simulator uses randomized deterministic simulations to test the Limbo database behaviors.
+
+Each simulation begins with a random configurations:
+
+- the database workload distribution(percentages of reads, writes, deletes...),
+- database parameters(page size),
+- number of reader or writers, etc.
+
+Based on these parameters, we randomly generate **interaction plans**. Interaction plans consist of statements/queries, and assertions that will be executed in order. The building blocks of interaction plans are:
+
+- Randomly generated SQL queries satisfying the workload distribution,
+- Properties, which contain multiple matching queries with assertions indicating the expected result.
+
+An example of a property is the following:
+
+```sql
+-- begin testing 'Select-Select-Optimizer'
+-- ASSUME table marvelous_ideal exists;
+SELECT ((devoted_ahmed = -9142609771.541502 AND loving_wicker = -1246708244.164486)) FROM marvelous_ideal WHERE TRUE;
+SELECT * FROM marvelous_ideal WHERE (devoted_ahmed = -9142609771.541502 AND loving_wicker = -1246708244.164486);
+-- ASSERT select queries should return the same amount of results;
+-- end testing 'Select-Select-Optimizer'
+```
+
+The simulator starts from an initially empty database, adding random interactions based on the workload distribution. It can
+add random queries unrelated to the properties without breaking the property invariants to reach more diverse states and respect the configured workload distribution.
 
+The simulator executes the interaction plans in a loop, and checks the assertions. It can add random queries unrelated to the properties without
+breaking the property invariants to reach more diverse states and respect the configured workload distribution.
+
+## Usage
+
+To run the simulator, you can use the following command:
+
+```bash
+RUST_LOG=limbo_sim=debug cargo run --bin limbo_sim
+```
+
+The simulator CLI has a few configuration options that you can explore via `--help` flag.
+
+```txt
+The Limbo deterministic simulator
+
+Usage: limbo_sim [OPTIONS]
+
+Options:
+-s, --seed <SEED> set seed for reproducible runs
+-d, --doublecheck enable doublechecking, run the simulator with the plan twice and check output equality
+-n, --maximum-size <MAXIMUM_SIZE> change the maximum size of the randomly generated sequence of interactions [default: 5000]
+-k, --minimum-size <MINIMUM_SIZE> change the minimum size of the randomly generated sequence of interactions [default: 1000]
+-t, --maximum-time <MAXIMUM_TIME> change the maximum time of the simulation(in seconds) [default: 3600]
+-l, --load <LOAD> load plan from the bug base
+-w, --watch enable watch mode that reruns the simulation on file changes
+--differential run differential testing between sqlite and Limbo
+-h, --help Print help
+-V, --version Print version
+```
 
 ## Fuzzing
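
The `Select-Select-Optimizer` property documented in this file can be modelled in a few lines. The sketch below is not the simulator's code: the `Row` type, the function names, and the interpretation of "same amount of results" (rows whose projected predicate is true versus rows returned by the filtered query) are all assumptions made for illustration.

```rust
// Illustrative model only: a row is a pair of f64 columns, not Limbo's real row type.
type Row = (f64, f64);

/// Models `SELECT (<pred>) FROM t WHERE TRUE`: one boolean per table row.
fn select_predicate(table: &[Row], pred: &dyn Fn(&Row) -> bool) -> Vec<bool> {
    table.iter().map(|row| pred(row)).collect()
}

/// Models `SELECT * FROM t WHERE <pred>`: only the matching rows.
fn select_star_where(table: &[Row], pred: &dyn Fn(&Row) -> bool) -> Vec<Row> {
    table.iter().copied().filter(|row| pred(row)).collect()
}

/// Assumed reading of the assertion: the number of `true` values in the
/// projected query equals the number of rows in the filtered query.
fn select_select_optimizer_holds(table: &[Row], pred: &dyn Fn(&Row) -> bool) -> bool {
    let projected = select_predicate(table, pred);
    let filtered = select_star_where(table, pred);
    let true_count = projected.iter().filter(|&&matched| matched).count();
    true_count == filtered.len()
}

fn main() {
    // Values borrowed from the example property in the docs.
    let table: Vec<Row> = vec![(-9142609771.541502, -1246708244.164486), (0.0, 1.0)];
    let pred = |row: &Row| row.0 == -9142609771.541502 && row.1 == -1246708244.164486;
    assert!(select_select_optimizer_holds(&table, &pred));
    println!("property holds on the model table");
}
```

In this model the property can never fail, because both queries share one predicate implementation. Against a real engine the two queries may take different execution paths, which is presumably what the property is meant to exercise.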

simulator/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,6 @@ limbo_core = { path = "../core" }
 rand = "0.8.5"
 rand_chacha = "0.3.1"
 log = "0.4.20"
-tempfile = "3.0.7"
 env_logger = "0.10.1"
 regex = "1.11.1"
 regex-syntax = { version = "0.8.5", default-features = false, features = [
@@ -31,3 +30,4 @@ serde = { version = "1.0", features = ["derive"] }
 serde_json = { version = "1.0" }
 notify = "8.0.0"
 rusqlite = { version = "0.34", features = ["bundled"] }
+dirs = "6.0.0"

simulator/README.md

Lines changed: 63 additions & 29 deletions
@@ -15,20 +15,18 @@ Based on these parameters, we randomly generate **interaction plans**. Interacti
 
 An example of a property is the following:
 
-```json
-{
-"name": "Read your own writes",
-"queries": [
-"INSERT INTO t1 (id) VALUES (1)",
-"SELECT * FROM t1 WHERE id = 1"
-],
-"assertions": [
-"result.rows.length == 1",
-"result.rows[0].id == 1"
-]
-}
+```sql
+-- begin testing 'Select-Select-Optimizer'
+-- ASSUME table marvelous_ideal exists;
+SELECT ((devoted_ahmed = -9142609771.541502 AND loving_wicker = -1246708244.164486)) FROM marvelous_ideal WHERE TRUE;
+SELECT * FROM marvelous_ideal WHERE (devoted_ahmed = -9142609771.541502 AND loving_wicker = -1246708244.164486);
+-- ASSERT select queries should return the same amount of results;
+-- end testing 'Select-Select-Optimizer'
 ```
 
+The simulator starts from an initially empty database, adding random interactions based on the workload distribution. It can
+add random queries unrelated to the properties without breaking the property invariants to reach more diverse states and respect the configured workload distribution.
+
 The simulator executes the interaction plans in a loop, and checks the assertions. It can add random queries unrelated to the properties without
 breaking the property invariants to reach more diverse states and respect the configured workload distribution.
 
@@ -44,36 +42,72 @@ The simulator code is broken into 4 main parts:
 To run the simulator, you can use the following command:
 
 ```bash
-cargo run
-```
-
-This prompt (in the future) will invoke a clap command line interface to configure the simulator. For now, the simulator runs with the default configurations changing the `main.rs` file. If you want to see the logs, you can change the `RUST_LOG` environment variable.
-
-```bash
-RUST_LOG=info cargo run --bin limbo_sim
+RUST_LOG=limbo_sim=debug cargo run --bin limbo_sim
 ```
 
-## Adding new properties
+The simulator CLI has a few configuration options that you can explore via `--help` flag.
 
-Todo
+```txt
+The Limbo deterministic simulator
 
-## Adding new generation functions
+Usage: limbo_sim [OPTIONS]
 
-Todo
-
-## Adding new models
+Options:
+-s, --seed <SEED> set seed for reproducible runs
+-d, --doublecheck enable doublechecking, run the simulator with the plan twice and check output equality
+-n, --maximum-size <MAXIMUM_SIZE> change the maximum size of the randomly generated sequence of interactions [default: 5000]
+-k, --minimum-size <MINIMUM_SIZE> change the minimum size of the randomly generated sequence of interactions [default: 1000]
+-t, --maximum-time <MAXIMUM_TIME> change the maximum time of the simulation(in seconds) [default: 3600]
+-l, --load <LOAD> load plan from the bug base
+-w, --watch enable watch mode that reruns the simulation on file changes
+--differential run differential testing between sqlite and Limbo
+-h, --help Print help
+-V, --version Print version
+```
 
-Todo
+## Adding new properties
 
-## Coverage with Limbo
+The properties are defined in `simulator/generation/property.rs` in the `Property` enum. Each property is documented with
+inline doc comments, an example is given below:
+
+```rust
+/// Insert-Select is a property in which the inserted row
+/// must be in the resulting rows of a select query that has a
+/// where clause that matches the inserted row.
+/// The execution of the property is as follows
+/// INSERT INTO <t> VALUES (...)
+/// I_0
+/// I_1
+/// ...
+/// I_n
+/// SELECT * FROM <t> WHERE <predicate>
+/// The interactions in the middle has the following constraints;
+/// - There will be no errors in the middle interactions.
+/// - The inserted row will not be deleted.
+/// - The inserted row will not be updated.
+/// - The table `t` will not be renamed, dropped, or altered.
+InsertValuesSelect {
+/// The insert query
+insert: Insert,
+/// Selected row index
+row_index: usize,
+/// Additional interactions in the middle of the property
+queries: Vec<Query>,
+/// The select query
+select: Select,
+},
+```
 
-Todo
+If you would like to add a new property, you can add a new variant to the `Property` enum, and the corresponding
+generation function in `simulator/generation/property.rs`. The generation function should return a `Property` instance, and
+it should generate the necessary queries and assertions for the property.
 
 ## Automatic Compatibility Testing with SQLite
 
-Todo
+You can use the `--differential` flag to run the simulator in differential testing mode. This mode will run the same interaction plan on both Limbo and SQLite, and compare the results. It will also check for any panics or errors in either database.
 
 ## Resources
+
 - [(reading) TigerBeetle Deterministic Simulation Testing](https://docs.tigerbeetle.com/about/vopr/)
 - [(reading) sled simulation guide (jepsen-proof engineering)](https://sled.rs/simulation.html)
 - [(video) "Testing Distributed Systems w/ Deterministic Simulation" by Will Wilson](https://www.youtube.com/watch?v=4fFDFbi3toc)
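
The "Adding new properties" section of the README describes a two-step recipe: add a variant to the `Property` enum, then add a generation function that returns it. The sketch below illustrates that shape with simplified stand-in types. The `Query` wrapper, the `DeleteSelect` variant, and the `delete_select` and `statements` functions are hypothetical and are not taken from the simulator.

```rust
/// Simplified stand-in for the simulator's real `Insert`, `Select`, and `Query` types.
#[derive(Debug, Clone, PartialEq)]
struct Query(String);

#[derive(Debug)]
enum Property {
    /// Mirrors the shape of the documented variant, with simplified fields.
    InsertValuesSelect { insert: Query, queries: Vec<Query>, select: Query },
    /// HYPOTHETICAL new variant: after a delete, a select with the same
    /// predicate is expected to return no rows.
    DeleteSelect { delete: Query, select: Query },
}

impl Property {
    /// HYPOTHETICAL generation function for the new variant.
    fn delete_select(table: &str, predicate: &str) -> Property {
        Property::DeleteSelect {
            delete: Query(format!("DELETE FROM {table} WHERE {predicate}")),
            select: Query(format!("SELECT * FROM {table} WHERE {predicate}")),
        }
    }

    /// Flattens a property into the ordered statements it would execute.
    fn statements(&self) -> Vec<&Query> {
        match self {
            Property::InsertValuesSelect { insert, queries, select } => {
                let mut out = vec![insert];
                out.extend(queries.iter());
                out.push(select);
                out
            }
            Property::DeleteSelect { delete, select } => vec![delete, select],
        }
    }
}

fn main() {
    let existing = Property::InsertValuesSelect {
        insert: Query("INSERT INTO t VALUES (1)".to_string()),
        queries: vec![Query("SELECT 1".to_string())],
        select: Query("SELECT * FROM t WHERE id = 1".to_string()),
    };
    let added = Property::delete_select("t", "id = 1");
    for property in [&existing, &added] {
        for Query(sql) in property.statements() {
            println!("{sql};");
        }
    }
}
```

Because every `match` on the enum must be exhaustive, the compiler points at each place that needs updating when a variant is added, which keeps the recipe mechanical.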

simulator/generation/plan.rs

Lines changed: 2 additions & 3 deletions
@@ -38,7 +38,7 @@ impl InteractionPlan {
 let interactions = interactions.lines().collect::<Vec<_>>();
 
 let plan: InteractionPlan = serde_json::from_str(
-std::fs::read_to_string(plan_path.with_extension("plan.json"))
+std::fs::read_to_string(plan_path.with_extension("json"))
 .unwrap()
 .as_str(),
 )
@@ -71,7 +71,6 @@ impl InteractionPlan {
 let _ = plan[j].split_off(k);
 break;
 }
-
 if interactions[i].contains(plan[j][k].to_string().as_str()) {
 i += 1;
 k += 1;
@@ -86,7 +85,7 @@ impl InteractionPlan {
 j += 1;
 }
 }
-
+let _ = plan.split_off(j);
 plan
 }
 }
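
Two standard-library behaviours drive the `plan.rs` changes: `Path::with_extension` replaces only the final extension, and `Vec::split_off(j)` leaves the first `j` elements in place while returning the rest. The sketch below demonstrates both. The path `bugbase/42/plan.sql` and the helper `truncate_from` are made up for illustration, since the real value of `plan_path` is not visible in this diff.

```rust
use std::path::{Path, PathBuf};

/// Keeps the first `j` elements and drops the tail, as
/// `let _ = plan.split_off(j);` does in the diff.
/// Note: `split_off` panics if `j > items.len()`.
fn truncate_from<T>(mut items: Vec<T>, j: usize) -> Vec<T> {
    let _ = items.split_off(j);
    items
}

fn main() {
    // Hypothetical plan path; the actual layout is not shown in the diff.
    let plan_path = Path::new("bugbase/42/plan.sql");

    // New behaviour: the extension is swapped for "json".
    assert_eq!(
        plan_path.with_extension("json"),
        PathBuf::from("bugbase/42/plan.json")
    );

    // Old behaviour: "plan.json" becomes the whole extension after the stem.
    assert_eq!(
        plan_path.with_extension("plan.json"),
        PathBuf::from("bugbase/42/plan.plan.json")
    );

    // Dropping everything from index 2 onward.
    assert_eq!(truncate_from(vec!["a", "b", "c", "d"], 2), vec!["a", "b"]);
    println!("ok");
}
```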

simulator/generation/property.rs

Lines changed: 1 addition & 1 deletion
@@ -407,7 +407,7 @@ impl Property {
 match (select_predicate, select_star) {
 (Ok(rows1), Ok(rows2)) => {
 // If rows1 results have more than 1 column, there is a problem
-if rows1.iter().find(|vs| vs.len() > 1).is_some() {
+if rows1.iter().any(|vs| vs.len() > 1) {
 return Err(LimboError::InternalError(
 "Select query without the star should return only one column".to_string(),
 ));
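
The `property.rs` change swaps `find(..).is_some()` for `any(..)`, which is a behaviour-preserving simplification: both stop at the first matching element. A minimal sketch of the equivalence follows, using `Vec<i64>` rows as a stand-in for the simulator's row type; the function names are illustrative.

```rust
/// Old form: search for a matching row, then test whether one was found.
fn has_wide_row_find(rows: &[Vec<i64>]) -> bool {
    rows.iter().find(|vs| vs.len() > 1).is_some()
}

/// New form: ask directly whether any row matches.
fn has_wide_row_any(rows: &[Vec<i64>]) -> bool {
    rows.iter().any(|vs| vs.len() > 1)
}

fn main() {
    let single_column = vec![vec![1], vec![0], vec![1]];
    let mixed = vec![vec![1], vec![1, 2]];
    let empty: Vec<Vec<i64>> = Vec::new();

    // Both forms agree on every input, including the empty case.
    for rows in [&single_column, &mixed, &empty] {
        assert_eq!(has_wide_row_find(rows), has_wide_row_any(rows));
    }
    println!("find().is_some() and any() agree");
}
```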
