Prepare active_users_aggregates for a backfill with shredder mitigation. #6349

Open. 14 commits to be merged into base: main; this view shows changes from 13 commits.

14 changes: 7 additions & 7 deletions CODEOWNERS
@@ -1,13 +1,13 @@
# These datasets are subject to the additional change control procedures
# described in https://docs.google.com/document/d/1TTJi4ht7NuzX6BPG_KTr6omaZg70cEpxe9xlpfnHj9k/
# Active Users
- /sql_generators/active_users_aggregates_v3/templates/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/fenix_derived/active_users_aggregates_v3/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/firefox_desktop_derived/active_users_aggregates_v1/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/firefox_ios_derived/active_users_aggregates_v3/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/focus_android_derived/active_users_aggregates_v3/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/focus_ios_derived/active_users_aggregates_v3/ @mozilla/kpi_table_reviewers
- /sql/moz-fx-data-shared-prod/klar_ios_derived/active_users_aggregates_v3/ @mozilla/kpi_table_reviewers
+ /sql_generators/active_users_aggregates_v4/templates/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/fenix_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/firefox_desktop_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/firefox_ios_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/focus_android_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/focus_ios_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
+ /sql/moz-fx-data-shared-prod/klar_ios_derived/active_users_aggregates_v4/ @mozilla/kpi_table_reviewers
# Search
/sql/moz-fx-data-shared-prod/search_terms @whd @jasonthomas
/sql/moz-fx-data-shared-prod/search_terms_derived @whd @jasonthomas
@@ -7,7 +7,7 @@ WITH todays_metrics AS (
app_version AS app_version,
normalized_channel AS channel,
IFNULL(country, '??') country,
Review comment (Contributor):

nitpicky, but for consistency:

Suggested change
- IFNULL(country, '??') country,
+ IFNULL(country, '??') AS country,

- city,
+ IFNULL(city, '??') city,
Review comment (Contributor):

Suggested change
- IFNULL(city, '??') city,
+ IFNULL(city, '??') AS city,

COALESCE(REGEXP_EXTRACT(locale, r'^(.+?)-'), locale, NULL) AS locale,
EXTRACT(YEAR FROM first_seen_date) AS first_seen_year,
os,
@@ -69,4 +69,21 @@ SELECT
FROM
todays_metrics
GROUP BY
- ALL
+ segment,
+ app_name,
+ app_version,
+ channel,
+ country,
+ city,
+ locale,
+ first_seen_year,
+ os,
+ os_version,
+ os_version_major,
+ os_version_minor,
+ submission_date,
+ is_default_browser,
+ distribution_id,
+ attribution_source,
+ attribution_medium,
+ attributed
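
The hunk above replaces `GROUP BY ALL` with an explicit column list. One way to sanity-check such a list is to compare it with the non-metric columns of the table schema. The following is only a sketch: `group_by_mismatch` is a hypothetical helper, not part of bigquery-etl, and it assumes that every schema column other than the eight metric columns is a grouping dimension.

```python
# Hypothetical check, not part of bigquery-etl.
# Assumption: every schema column that is not a metric is a grouping dimension.
METRICS = {
    "daily_users", "weekly_users", "monthly_users",
    "dau", "wau", "mau", "uri_count", "active_hours",
}


def group_by_mismatch(schema_columns, group_by_columns, metrics=METRICS):
    """Return (missing, extra) columns of the GROUP BY list relative to the schema dimensions."""
    dimensions = [c for c in schema_columns if c not in metrics]
    missing = [c for c in dimensions if c not in group_by_columns]
    extra = [c for c in group_by_columns if c not in dimensions]
    return missing, extra


# Explicit GROUP BY list, copied from the hunk above.
GROUP_BY_COLUMNS = [
    "segment", "app_name", "app_version", "channel", "country", "city",
    "locale", "first_seen_year", "os", "os_version", "os_version_major",
    "os_version_minor", "submission_date", "is_default_browser",
    "distribution_id", "attribution_source", "attribution_medium", "attributed",
]

# Schema columns: the same dimensions followed by the metric columns.
SCHEMA_COLUMNS = GROUP_BY_COLUMNS + [
    "daily_users", "weekly_users", "monthly_users",
    "dau", "wau", "mau", "uri_count", "active_hours",
]

missing, extra = group_by_mismatch(SCHEMA_COLUMNS, GROUP_BY_COLUMNS)
print("missing from GROUP BY:", missing)
print("not in schema dimensions:", extra)
```

Both lists come back empty when the explicit list and the schema agree; a column added to the schema but not to the `GROUP BY` would show up in `missing`.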
@@ -2,78 +2,104 @@ fields:
  - name: segment
  type: STRING
  mode: NULLABLE
+ description: Classification of client_ids based on usage and active state.
  - name: app_name
  type: STRING
  mode: NULLABLE
+ description: Browser name.
  - name: app_version
  type: STRING
  mode: NULLABLE
+ description: Browser version installed on the client.
  - name: channel
  type: STRING
  mode: NULLABLE
+ description: Browser installation channel installed on the client.
  - name: country
  type: STRING
  mode: NULLABLE
+ description: Country reported by the client.
  - name: city
  type: STRING
  mode: NULLABLE
+ description: City reported by the client.
  - name: locale
  type: STRING
  mode: NULLABLE
+ description: Locale reported by the client, which is a combination of language and regional settings.
  - name: first_seen_year
  type: INTEGER
  mode: NULLABLE
+ description: Year extracted from the first_seen_date, that corresponds to the date when the first ping was received.
  - name: os
  type: STRING
  mode: NULLABLE
+ description: Operating system reported by the client.
  - name: os_version
  type: STRING
  mode: NULLABLE
+ description: OS version reported by the client.
  - name: os_version_major
  type: INTEGER
  mode: NULLABLE
+ description: Major or first part of the OS version reported by the client.
  - name: os_version_minor
  type: INTEGER
  mode: NULLABLE
+ description: Minor or second part of the OS version reported by the client.
  - name: submission_date
  type: DATE
  mode: NULLABLE
+ description: Date when ping is received on the server side.
  - name: is_default_browser
  type: BOOLEAN
  mode: NULLABLE
+ description: Whether the browser is set as the default browser on the client side.
  - name: distribution_id
  type: STRING
  mode: NULLABLE
+ description: The id of the browser distribution made available in installation sources.
  - name: attribution_source
  type: STRING
  mode: NULLABLE
+ description: The utm_term this install is attributed to. Reported by the install referrer service, not Adjust.
  - name: attribution_medium
  type: STRING
  mode: NULLABLE
+ description: The utm_medium this install is attributed to. Reported by the install referrer service, not Adjust.
  - name: attributed
  type: BOOLEAN
  mode: NULLABLE
+ description: True if the attribution source and medium are present.
  - name: daily_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who report a ping in a day.
  - name: weekly_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 7 days.
  - name: monthly_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 28 days.
  - name: dau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who reported a ping on the submission_date that qualify as active.
  - name: wau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 7 days and qualify as active.
  - name: mau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 28 days and qualify as active.
  - name: uri_count
  type: INTEGER
  mode: NULLABLE
+ description: Count of uri.
  - name: active_hours
  type: FLOAT64
  mode: NULLABLE
+ description: Count of active hours.
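
The schema hunk above gives every field a `description`. A small check along these lines could confirm that no field was missed. It is a sketch under stated assumptions: `fields_missing_description` is a hypothetical helper, not part of bigquery-etl, and the fields are assumed to be already parsed from the schema file into a list of dictionaries.

```python
# Hypothetical check, not part of bigquery-etl.
# Assumption: the schema's `fields` list has already been parsed into dicts.
def fields_missing_description(fields):
    """Return the names of fields whose description is absent, None, or blank."""
    return [
        field["name"]
        for field in fields
        if not (field.get("description") or "").strip()
    ]


# Three fields in the shape used by the schema above; the last one is
# deliberately left without a description to show what the check reports.
example_fields = [
    {"name": "segment", "type": "STRING", "mode": "NULLABLE",
     "description": "Classification of client_ids based on usage and active state."},
    {"name": "app_name", "type": "STRING", "mode": "NULLABLE",
     "description": "Browser name."},
    {"name": "uri_count", "type": "INTEGER", "mode": "NULLABLE"},
]

print(fields_missing_description(example_fields))
```

Run against the full field list, an empty result would mean every column is documented.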
@@ -13,6 +13,10 @@ description: |-

The table is labeled as "change_controlled", which implies
that changes require the approval of at least one owner.

+ The label "shredder mitigation" indicates that this table is set up for
+ managed backfill with shredder mitigation, as described in
+ https://mozilla.github.io/bigquery-etl/cookbooks/creating_a_derived_dataset/#initiating-the-backfill.

Proposal:
https://docs.google.com/document/d/1qvWO49Lr_Z_WErh3I3058A3B1YuiuURx19K3aTdmejM/edit?usp=sharing
@@ -22,9 +26,10 @@ owners:
labels:
incremental: true
change_controlled: true
+ shredder_mitigation: true
scheduling:
dag_name: bqetl_analytics_aggregations
- task_name: {{ app_name }}_active_users_aggregates
+ task_name: {{ app_name }}_active_users_aggregates_v3
date_partition_offset: -1
bigquery:
time_partitioning:
@@ -209,4 +209,23 @@ SELECT
FROM
todays_metrics
GROUP BY
- ALL
+ segment,
+ app_version,
+ attribution_medium,
+ attribution_source,
+ attributed,
+ city,
+ country,
+ distribution_id,
+ first_seen_year,
+ is_default_browser,
+ locale,
+ app_name,
+ channel,
+ os,
+ os_version,
+ os_version_major,
+ os_version_minor,
+ submission_date,
+ adjust_network,
+ install_source
@@ -2,84 +2,112 @@ fields:
  - name: segment
  type: STRING
  mode: NULLABLE
- - name: app_version
- type: STRING
- mode: NULLABLE
- - name: attribution_medium
+ description: Classification of client_ids based on usage and active state.
+ - name: app_name
  type: STRING
  mode: NULLABLE
- - name: attribution_source
+ description: Browser name.
+ - name: app_version
  type: STRING
  mode: NULLABLE
- - name: attributed
- type: BOOLEAN
- mode: NULLABLE
- - name: city
+ description: Browser version installed on the client.
+ - name: channel
  type: STRING
  mode: NULLABLE
+ description: Browser installation channel installed on the client.
  - name: country
  type: STRING
  mode: NULLABLE
- - name: distribution_id
+ description: Country reported by the client.
+ - name: city
  type: STRING
  mode: NULLABLE
- - name: first_seen_year
- type: INTEGER
- mode: NULLABLE
- - name: is_default_browser
- type: BOOLEAN
- mode: NULLABLE
+ description: City reported by the client.
  - name: locale
  type: STRING
  mode: NULLABLE
- - name: app_name
- type: STRING
- mode: NULLABLE
- - name: channel
- type: STRING
+ description: Locale reported by the client, which is a combination of language and regional settings.
+ - name: first_seen_year
+ type: INTEGER
  mode: NULLABLE
+ description: Year extracted from the first_seen_date, that corresponds to the date when the first ping was received.
  - name: os
  type: STRING
  mode: NULLABLE
+ description: Operating system reported by the client.
  - name: os_version
  type: STRING
  mode: NULLABLE
+ description: OS version reported by the client.
  - name: os_version_major
  type: INTEGER
  mode: NULLABLE
+ description: Major or first part of the OS version reported by the client.
  - name: os_version_minor
  type: INTEGER
  mode: NULLABLE
+ description: Minor or second part of the OS version reported by the client.
  - name: submission_date
  type: DATE
  mode: NULLABLE
+ description: Date when ping is received on the server side.
+ - name: is_default_browser
+ type: BOOLEAN
+ mode: NULLABLE
+ description: Whether the browser is set as the default browser on the client side.
+ - name: distribution_id
+ type: STRING
+ mode: NULLABLE
+ description: A string containing the distribution identifier. This was used to identify installs from Mozilla Online, but now also identifies partnership deal distributions.
+ - name: attribution_source
+ type: STRING
+ mode: NULLABLE
+ description: The utm_term this install is attributed to. Reported by the install referrer service, not Adjust.
+ - name: attribution_medium
+ type: STRING
+ mode: NULLABLE
+ description: The utm_medium this install is attributed to. Reported by the install referrer service, not Adjust.
+ - name: attributed
+ type: BOOLEAN
+ mode: NULLABLE
+ description: True if the attribution source and medium are present.
  - name: adjust_network
  type: STRING
  mode: NULLABLE
+ description: The source of a client installation.
  - name: install_source
  type: STRING
  mode: NULLABLE
+ description: The id of the browser distribution made available in installation sources.
  - name: daily_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who report a ping in a day.
  - name: weekly_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 7 days.
  - name: monthly_users
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 28 days.
  - name: dau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who reported a ping on the submission_date that qualify as active.
  - name: wau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 7 days and qualify as active.
  - name: mau
  type: INTEGER
  mode: NULLABLE
+ description: Count of users who have reported a ping over the last 28 days and qualify as active.
  - name: uri_count
  type: INTEGER
  mode: NULLABLE
+ description: Count of uri.
  - name: active_hours
  type: FLOAT64
  mode: NULLABLE
+ description: Count of active hours.