Description
Assume we have the following SQLMesh project (using a DuckDB connection):
models/a.sql
MODEL (name s.a, kind INCREMENTAL_BY_TIME_RANGE (time_column ds), start '2023-01-02');
SELECT ds FROM VALUES ('2023-01-01'), ('2023-01-02'), ('2023-01-03') AS t(ds)
models/b.sql
MODEL (name s.b, kind INCREMENTAL_BY_TIME_RANGE (time_column ds), start '2020-01-01');
SELECT ds FROM s.a
models/c.sql
MODEL (name s.c, kind INCREMENTAL_BY_TIME_RANGE (time_column ds), start '2023-01-02');
SELECT ds FROM s.b
- Run `sqlmesh plan --no-prompts --auto-apply`
- Change `start` to `'2023-01-01'` for `s.a` and `s.c`
- Run `sqlmesh plan --no-prompts --auto-apply`
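For anyone who wants to script the reproduction, here is a minimal sketch that writes the three model files for each step. The helper `write_models` and the dictionaries `FIRST_PLAN` / `SECOND_PLAN` are hypothetical names introduced for illustration; a `config.yaml` with a DuckDB connection is assumed to exist already and is not generated here, and the `sqlmesh plan` command still has to be run between the two writes.

```python
import tempfile
from pathlib import Path

# One-line MODEL header plus query, matching the files in the description.
MODEL_TEMPLATE = (
    "MODEL (name {name}, kind INCREMENTAL_BY_TIME_RANGE (time_column ds), "
    "start '{start}');\n{query}\n"
)

QUERIES = {
    "s.a": "SELECT ds FROM VALUES ('2023-01-01'), ('2023-01-02'), ('2023-01-03') AS t(ds)",
    "s.b": "SELECT ds FROM s.a",
    "s.c": "SELECT ds FROM s.b",
}


def write_models(project_dir: Path, starts: dict[str, str]) -> None:
    """Write models/a.sql, models/b.sql and models/c.sql with the given start dates."""
    models_dir = project_dir / "models"
    models_dir.mkdir(parents=True, exist_ok=True)
    for name, query in QUERIES.items():
        path = models_dir / f"{name.split('.')[-1]}.sql"
        path.write_text(
            MODEL_TEMPLATE.format(name=name, start=starts[name], query=query)
        )


# Start dates before and after the change described in the steps.
FIRST_PLAN = {"s.a": "2023-01-02", "s.b": "2020-01-01", "s.c": "2023-01-02"}
SECOND_PLAN = {**FIRST_PLAN, "s.a": "2023-01-01", "s.c": "2023-01-01"}

project = Path(tempfile.mkdtemp())  # scratch directory, assumption for the sketch
write_models(project, FIRST_PLAN)
# ... run `sqlmesh plan --no-prompts --auto-apply` inside `project` here ...
write_models(project, SECOND_PLAN)
# ... run `sqlmesh plan --no-prompts --auto-apply` again ...
```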
A couple of things are off:
- The modified models are disconnected in the evaluation DAG: `{('"db"."s"."a"', ((1672531200000, 1672617600000), 0)): set(), ('"db"."s"."c"', ((1672531200000, 1672617600000), 0)): set()}`. This can lead to out-of-order execution, even though `s.c` depends on `s.a` indirectly.
- After applying the second plan, I don't see the date `2023-01-01` in either `s.b` or `s.c`; only `s.a` has it. This is a side effect of the previous bullet, because SQLMesh sees that the interval `2023-01-01` has already been computed for `s.b`, since its `start` is earlier than that, so it doesn't recompute it. Hence `s.c`, which sources from `s.b`, ends up with stale data.
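To make the first point concrete, here is a small standalone sketch. It is not SQLMesh's actual implementation; `FULL_DEPS`, `direct_only_subgraph` and `transitive_subgraph` are hypothetical names used only to illustrate the idea. It decodes the epoch-millisecond interval from the evaluation DAG shown above, then contrasts a subgraph that keeps only direct edges between the selected models with one that also keeps edges to indirect ancestors.

```python
from datetime import datetime, timezone
from graphlib import TopologicalSorter

# Model-level dependencies from the project above (model -> direct upstream models).
FULL_DEPS = {"s.a": set(), "s.b": {"s.a"}, "s.c": {"s.b"}}


def ms_to_date(ms: int) -> str:
    """Render an epoch-millisecond timestamp as a UTC date string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")


def upstream_closure(node: str, deps: dict[str, set[str]]) -> set[str]:
    """Return all direct and indirect upstream models of `node`."""
    seen: set[str] = set()
    stack = list(deps[node])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(deps[parent])
    return seen


def direct_only_subgraph(selected: set[str], deps: dict[str, set[str]]) -> dict[str, set[str]]:
    """Keep only direct edges between selected models (drops s.a -> s.c via s.b)."""
    return {n: deps[n] & selected for n in sorted(selected)}


def transitive_subgraph(selected: set[str], deps: dict[str, set[str]]) -> dict[str, set[str]]:
    """Keep an edge to every selected ancestor, direct or indirect."""
    return {n: upstream_closure(n, deps) & selected for n in sorted(selected)}


selected = {"s.a", "s.c"}  # the two models whose `start` changed

# The interval in the DAG keys covers exactly the newly added day.
print(ms_to_date(1672531200000), ms_to_date(1672617600000))

# With direct edges only, both models have no upstream nodes, as in the issue.
print(direct_only_subgraph(selected, FULL_DEPS))

# With transitive edges, s.c waits for s.a, so the order is well defined.
ordered = list(TopologicalSorter(transitive_subgraph(selected, FULL_DEPS)).static_order())
print(ordered)
```

Note that restoring the ordering alone would not necessarily fix the second point: `s.b` would still need its `2023-01-01` interval recomputed for `s.c` to pick up the new row.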