You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Jul 14, 2023. It is now read-only.
Tabelas particionadas e/ou clusterizadas têm menor custo de consulta e melhor performance (Ver https://www.youtube.com/watch?v=wapi0aR4BZE). É possível implementar tanto o particionamento quanto a clusterização na hora da materialização usando DBT.
Para um exemplo, veja o modelo da tabela twitter_metrics:
{{
config(
materialized='incremental',
partition_by={
"field": "upload_day",
"data_type": "date",
"granularity": "month",
}
)
}}
SELECT*FROM
(SELECT
SAFE_CAST(upload_ts AS INT64) upload_ts,
EXTRACT(DATEFROM TIMESTAMP_MILLIS(upload_ts*1000)) AS upload_day,
SAFE_CAST(id AS STRING) id,
SAFE_CAST(textAS STRING) text,
SAFE_CAST(created_at AS STRING) created_at,
SAFE_CAST(retweet_count AS INT64) retweet_count,
SAFE_CAST(reply_count AS INT64) reply_count,
SAFE_CAST(like_count AS INT64) like_count,
SAFE_CAST(quote_count AS INT64) quote_count,
SAFE_CAST(impression_count AS FLOAT64) impression_count,
SAFE_CAST(user_profile_clicks AS FLOAT64) user_profile_clicks,
SAFE_CAST(url_link_clicks AS FLOAT64) url_link_clicks,
SAFE_CAST(following_count AS INT64) following_count,
SAFE_CAST(followers_count AS INT64) followers_count,
SAFE_CAST(tweet_count AS INT64) tweet_count,
SAFE_CAST(listed_count AS INT64) listed_count
FROM`basedosdados-dev.br_bd_indicadores_staging.twitter_metrics`)
WHERE
upload_day <=CURRENT_DATE('America/Sao_Paulo')
{% if is_incremental() %}
{% set max_partition = run_query("SELECT gr FROM (SELECT IF(max(upload_day) > CURRENT_DATE('America/Sao_Paulo'), CURRENT_DATE('America/Sao_Paulo'), max(upload_day)) as gr FROM " ~ this ~ ")").columns[0].values()[0] %}
AND
upload_day > ("{{ max_partition }}")
{% endif %}
guialvesp1
changed the title
[infra] Implementar partição e cluster em tabelas materializadas
Épico: Implementar partição e cluster em tabelas materializadas
Jan 19, 2023
Tabelas particionadas e/ou clusterizadas têm menor custo de consulta e melhor performance (Ver https://www.youtube.com/watch?v=wapi0aR4BZE). É possível implementar tanto o particionamento quanto a clusterização na hora da materialização usando DBT.
Para um exemplo, veja o modelo da tabela
twitter_metrics
:Tabelas a serem particionadas:
br_ibge_pnad.microdados
#37The text was updated successfully, but these errors were encountered: