add event_time page #6383
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,6 +31,15 @@ dbt reports the comparison differences in: | |
|
||
<Lightbox src="/img/docs/dbt-cloud/example-ci-compare-changes-tab.png" width="85%" title="Example of the Compare tab" /> | ||
|
||
### Considerations | ||
It's common for CI jobs to only [build a subset of data](/best-practices/best-practice-workflows#limit-the-data-processed-when-in-development), for example only the last 7 days of data. When an [`event_time`](/reference/resource-configs/event-time) column is specified on your model, compare changes can: | ||
|
||
|
||
- Compare data in CI against production for only the overlapping times, avoiding false positives and returning results faster. | ||
Review comment: I think both of these bullets have the same benefit of "using only the overlapping timeframe, which avoids incorrect row-count changes and returns results faster". I would distinguish the 2 scenarios as:
Rather than nesting the second scenario within the first - lmk if that makes sense!
Reply: changed it to this; It's common for CI jobs to only build a subset of data (for example only the last 7 days of data). When an This is useful in scenarios like:
|
||
- Handle scenarios where CI contains fresher data than production by using only the overlapping timeframe, which avoids incorrect row-count changes. | ||
- Coming soon, you'll be able to add a flag to the command list allowing you to select the specific time slice to compare. | ||
|
||
|
||
<Lightbox src="/img/docs/deploy/apples_to_apples.png" title="event_time ensures the same time-slice of data is accurately compared between your CI and production environments." /> | ||
|
||
|
||
## About the cached data | ||
|
||
After [comparing changes](#compare-changes), dbt Cloud stores a cache of no more than 100 records for each modified model for preview purposes. By caching this data, you can view the examples of changed data without rerunning the comparison against the data warehouse every time (optimizing for lower compute costs). To display the changes, dbt Cloud uses a cached version of a sample of the data records. These data records are queried from the database using the connection configuration (such as user, role, service account, and so on) that's set in the CI job's environment. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,247 @@ | ||
--- | ||
title: "event_time" | ||
id: "event-time" | ||
sidebar_label: "event_time" | ||
resource_types: [models, seeds, source] | ||
description: "dbt uses event_time to understand when an event occurred. When defined, event_time enables microbatch incremental models and more refined comparison of datasets during Advanced CI." | ||
datatype: string | ||
--- | ||
|
||
Available in dbt Cloud Versionless and dbt Core v1.9 and higher. | ||
|
||
<Tabs> | ||
<TabItem value="model" label="Models"> | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
models: | ||
[resource-path:](/reference/resource-configs/resource-path) | ||
+event_time: my_time_field | ||
``` | ||
</File> | ||
|
||
|
||
<File name='models/properties.yml'> | ||
|
||
```yml | ||
models: | ||
- name: model_name | ||
[config](/reference/resource-properties/config): | ||
event_time: my_time_field | ||
``` | ||
</File> | ||
|
||
<File name="models/modelname.sql"> | ||
|
||
```sql | ||
{{ config( | ||
event_time='my_time_field' | ||
) }} | ||
``` | ||
|
||
</File> | ||
|
||
</TabItem> | ||
|
||
<TabItem value="seeds" label="Seeds"> | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
seeds: | ||
[resource-path:](/reference/resource-configs/resource-path) | ||
+event_time: my_time_field | ||
``` | ||
</File> | ||
|
||
<File name='seeds/properties.yml'> | ||
|
||
```yml | ||
seeds: | ||
- name: seed_name | ||
[config](/reference/resource-properties/config): | ||
event_time: my_time_field | ||
``` | ||
|
||
</File> | ||
</TabItem> | ||
|
||
<TabItem value="snapshot" label="Snapshots"> | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
snapshots: | ||
[resource-path:](/reference/resource-configs/resource-path) | ||
+event_time: my_time_field | ||
``` | ||
</File> | ||
|
||
<VersionBlock firstVersion="1.9"> | ||
<File name='snapshots/properties.yml'> | ||
|
||
```yml | ||
snapshots: | ||
- name: snapshot_name | ||
[config](/reference/resource-properties/config): | ||
event_time: my_time_field | ||
``` | ||
</File> | ||
</VersionBlock> | ||
|
||
<VersionBlock lastVersion="1.8"> | ||
|
||
<File name="models/modelname.sql"> | ||
|
||
```sql | ||
|
||
{{ config( | ||
event_time='my_time_field' | ||
) }} | ||
``` | ||
|
||
</File> | ||
|
||
|
||
import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md'; | ||
|
||
<SnapshotYaml/> | ||
</VersionBlock> | ||
|
||
|
||
|
||
</TabItem> | ||
|
||
<TabItem value="sources" label="Sources"> | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
sources: | ||
[resource-path:](/reference/resource-configs/resource-path) | ||
+event_time: my_time_field | ||
``` | ||
</File> | ||
|
||
<File name='models/properties.yml'> | ||
|
||
```yml | ||
sources: | ||
- name: source_name | ||
[config](/reference/resource-properties/config): | ||
event_time: my_time_field | ||
``` | ||
|
||
</File> | ||
</TabItem> | ||
</Tabs> | ||
|
||
## Definition | ||
|
||
Set `event_time` to the name of the field that represents the timestamp of the event itself (when it occurred), as opposed to a date such as the data loading date. You can configure `event_time` for a [model](/docs/build/models), [seed](/docs/build/seeds), or [source](/docs/build/sources) in your `dbt_project.yml` file, property YAML file, or config block. | ||
|
||
|
||
`event_time` is required for [Incremental microbatch](/docs/build/incremental-microbatch) and [Advanced CI's compare changes](/docs/deploy/advanced-ci#considerations) in CI/CD workflows, where it ensures the same time-slice of data is correctly compared between your CI and production environments. | ||
|
||
|
||
When you configure `event_time`, it enables compare changes to: | ||
|
||
|
||
- Compare data in CI versus production for overlapping times only, reducing false discrepancies. | ||
- Handle scenarios where CI has "fresher" data than production by using only the overlapping timeframe, allowing you to avoid incorrect row-count changes. | ||
|
||
- Account for subset data builds in CI without flagging filtered-out rows as "deleted" when compared with production. | ||
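
The effect of restricting a comparison to the overlapping timeframe can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not how dbt Cloud implements compare changes: the helper names `event_window` and `overlapping_rows` and the sample rows are hypothetical, and only the `session_start_time` column name is borrowed from the examples on this page.

```python
from datetime import date, timedelta


def event_window(rows, event_time):
    """Return the (earliest, latest) event_time value found in rows."""
    values = [row[event_time] for row in rows]
    return min(values), max(values)


def overlapping_rows(ci_rows, prod_rows, event_time):
    """Keep only the rows on each side that fall inside the shared time window."""
    ci_start, ci_end = event_window(ci_rows, event_time)
    prod_start, prod_end = event_window(prod_rows, event_time)
    # The shared window starts at the later start and ends at the earlier end.
    start, end = max(ci_start, prod_start), min(ci_end, prod_end)

    def in_window(row):
        return start <= row[event_time] <= end

    return (
        [row for row in ci_rows if in_window(row)],
        [row for row in prod_rows if in_window(row)],
    )


today = date(2024, 10, 31)
# Hypothetical data: production holds 30 days of sessions up to yesterday.
prod = [
    {"session_id": n, "session_start_time": today - timedelta(days=n)}
    for n in range(1, 31)
]
# The CI build covers only the last 7 days and already includes today's rows.
ci = [
    {"session_id": n, "session_start_time": today - timedelta(days=n)}
    for n in range(0, 7)
]

# Comparing everything reports a large row-count difference.
naive_diff = len(ci) - len(prod)

# Comparing only the overlapping time slice reports no difference.
ci_slice, prod_slice = overlapping_rows(ci, prod, "session_start_time")
sliced_diff = len(ci_slice) - len(prod_slice)

print(naive_diff, sliced_diff)  # -23 0
```

In this sketch, the 23 missing rows in the naive comparison come from the CI build's 7-day filter and from today's fresher rows rather than from any model change, which is the kind of false discrepancy the bullets above describe.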
|
||
|
||
## Examples | ||
|
||
<Tabs> | ||
|
||
<TabItem value="model" label="Models"> | ||
|
||
Here's an example in the `dbt_project.yml` file: | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
models: | ||
my_project: | ||
user_sessions: | ||
+event_time: session_start_time | ||
``` | ||
</File> | ||
|
||
Example in a properties YAML file: | ||
|
||
<File name='models/properties.yml'> | ||
|
||
```yml | ||
models: | ||
- name: user_sessions | ||
config: | ||
event_time: session_start_time | ||
``` | ||
|
||
</File> | ||
|
||
Example in a SQL model config block: | ||
|
||
<File name="models/user_sessions.sql"> | ||
|
||
```sql | ||
{{ config( | ||
event_time='session_start_time' | ||
) }} | ||
``` | ||
|
||
</File> | ||
|
||
This configuration sets `session_start_time` as the `event_time` for the `user_sessions` model, so that compare changes and incremental microbatching use this timestamp for time-slice comparisons and batching. | ||
|
||
</TabItem> | ||
|
||
<TabItem value="seeds" label="Seeds"> | ||
|
||
Here's an example in the `dbt_project.yml` file: | ||
|
||
<File name='dbt_project.yml'> | ||
|
||
```yml | ||
seeds: | ||
my_project: | ||
my_seed: | ||
+event_time: record_timestamp | ||
``` | ||
|
||
</File> | ||
|
||
Example in a seed properties YAML file: | ||
|
||
<File name='seeds/properties.yml'> | ||
|
||
```yml | ||
seeds: | ||
- name: my_seed | ||
config: | ||
event_time: record_timestamp | ||
``` | ||
</File> | ||
|
||
This configuration sets `record_timestamp` as the `event_time` for `my_seed`, so that `record_timestamp` is used consistently for compare changes or incremental microbatching. | ||
|
||
|
||
</TabItem> | ||
<TabItem value="sources" label="Sources"> | ||
|
||
Here's an example in a source properties YAML file: | ||
|
||
<File name='models/properties.yml'> | ||
|
||
```yml | ||
sources: | ||
- name: source_name | ||
tables: | ||
- name: table_name | ||
config: | ||
event_time: event_timestamp | ||
``` | ||
</File> | ||
|
||
This setup sets `event_timestamp` as the `event_time` for the specified source table. | ||
|
||
</TabItem> | ||
</Tabs> |
|