-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Description
Is this your first time submitting a feature request?
- I have read the expectations for open source contributors
- I have searched the existing issues, and I could not find an existing issue for this feature
- I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
Currently in Snowflake, if I create a dbt snapshot, the dbt_valid_from
, dbt_valid_to
, and dbt_updated_at
columns are created as a UTC timestamp but stored as a TIMESTAMP_NTZ
type (exact SQL below):
to_timestamp_ntz(convert_timezone('UTC', current_timestamp()))
This can cause discrepancies in downstream usage for teams that primarily use TIMESTAMP_TZ
. For example all TIMESTAMP_NTZ
columns are evaluated using the session/account timezone (default is America/Los Angeles
). That means '2020-01-01 00:00:00'::TIMESTAMP_NTZ
is treated as greater than '2020-01-01 00:00:00 +0000'::TIMESTAMP_TZ
.
When creating a snapshot, users should have the ability to specify the data type of the dbt-generated columns.
- Snowflake: Choose from
TIMESTAMP_TZ
,TIMESTAMP_NTZ
, orTIMESTAMP_LTZ
- BigQuery: Choose from
TIMESTAMP
orDATETIME
- Databricks: Choose from
TIMESTAMP
orTIMESTAMP_NTZ
- Redshift: Choose from
TIMESTAMP
orTIMESTAMPTZ
Describe alternatives you've considered
In the meantime, this requires one of four solutions:
- Manually override dbt snapshots to enable the above functionality (not desired)
- Create a view on top of every snapshot to change the data type
- Let each downstream model decide if the data type should be updated
- Let downstream consumers handle it
Who will this benefit?
Any users that work downstream of snapshots that require consistency in their timestamp data types. That should cover most users of dbt + dashboard users
Are you interested in contributing this feature?
No
Anything else?
No response