fix: remove invalid hostnames #1259 #1260
This PR will trigger a patch release when merged.
not sure what the RF01 is about.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1260   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            6         6
  Lines          764       764
=========================================
  Hits           764       764
Seems to be a bug in sqlfluff, see sqlfluff/sqlfluff#6521
@lydiapuric Then my suggestion is to fix it in EVENTS_V5 instead of in this run-query. That way all queries can benefit from the removal of invalid domains.
@langswei Agree, and I will adjust my PR
Closing this PR in favor of #1263 to implement this check on EVENTS_V5
Related Issues
#1259
The DaaS team built the Run Helix Query API to load monthly Content Requests (CR) across all hostnames. When the API encounters hostnames containing non-printing or illegal Unicode characters, it returns an empty dataset.
The problem originates in the BigQuery function helix_rum.EVENTS_V5 when the url parameter is set to "-". With that value the function retrieves data for all hostnames, and many of the returned records carry invalid hostname values (e.g., hostname?@a.png\u0027"\u003cbmt\u003e for host publish-p23952-e1363387.adobeaemcloud.net).
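To illustrate the kind of check being discussed, here is a minimal Python sketch of a hostname filter. It is not the code in this PR or in #1263: the helper names, the "hostname" row key, and the choice of RFC 1123 style label rules are all assumptions made for illustration, and the actual fix may use different criteria in SQL.

```python
import re

# Hypothetical helpers for illustration only; not taken from this repository.
# Assumption: a "valid" hostname is a dot-separated list of RFC 1123 style
# labels (ASCII letters, digits, hyphens; no leading or trailing hyphen).
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_HOSTNAME_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})*")


def is_valid_hostname(hostname: str) -> bool:
    """Return True if hostname consists only of well-formed ASCII labels."""
    if not hostname or len(hostname) > 253:
        return False
    # fullmatch rejects any stray character, including non-printing ones.
    return _HOSTNAME_RE.fullmatch(hostname) is not None


def drop_invalid_hostnames(rows):
    """Keep only rows whose (assumed) "hostname" field passes the check."""
    return [row for row in rows if is_valid_hostname(row.get("hostname", ""))]
```

With this sketch, publish-p23952-e1363387.adobeaemcloud.net passes, while the example value above is rejected because of characters such as "?", "@" and the quotes.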
Thanks for reviewing!