Open
Labels
- bug: Things that are just plain broken.
- dagster: Issues related to our use of the Dagster orchestrator
- developer experience: Things that make the developers' lives easier, but don't necessarily directly improve the data.
- performance: Make PUDL run faster!
- testing: Writing tests, creating test data, automating testing, etc.
Description
Overview
- I recently noticed that the integration tests seemed to be hanging when I run them locally.
- They have tons of output from both alembic and Arelle, but nothing from the FERC data extractions or ETL.
- However, what really seems to have happened is that we're no longer outputting the logs generated by Dagster.
- We've also accumulated a lot of uninformative logging output through bitrot and some debug level logs.
- There are also many deprecation/future warnings coming from libraries that we use which aren't our problem.
- We have `filterwarnings` set in `pyproject.toml` that help with some things.
- But the overall signal-to-noise ratio of the logs has gotten very bad, which makes it harder to debug real issues or notice warnings that we should take seriously.
- Example of the logging output in the comments.
- This problem happens in the CI on GitHub as well. See these logs from a recent merge queue.
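To illustrate the kind of filtering meant above, here is a minimal sketch using the standard library `warnings` module. It is not our actual `pyproject.toml` configuration: the helper name `ignore_third_party_deprecations` and the package prefixes are assumptions chosen for illustration. Note that a `module` filter only matches warnings attributed to the library's own modules, so warnings that a library attributes to the calling code (via `stacklevel`) would still come through.

```python
import re
import warnings

# Hypothetical package prefixes; the real list would need to be worked out
# from the actual test output.
THIRD_PARTY_PREFIXES = ("pandera", "pydantic")


def ignore_third_party_deprecations(prefixes=THIRD_PARTY_PREFIXES):
    """Ignore Deprecation/FutureWarnings attributed to the given packages."""
    for prefix in prefixes:
        for category in (DeprecationWarning, FutureWarning):
            warnings.filterwarnings(
                "ignore",
                category=category,
                # The regex is matched against the start of the module name
                # the warning is attributed to, e.g. "pydantic.fields".
                module=rf"{re.escape(prefix)}(\..*)?$",
            )
```

Warnings attributed to our own modules are unaffected, so deprecations we actually need to act on would still show up.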
Specific Issues
- Ensure that Dagster logs come through in pytest output.
- Every error appears to be coming from the `pudl_service_territories` script, simply because it's the first test collected by pytest, which triggers the ETL. Can we force `test/unit/etl_test.py::test_pudl_engine` to be first instead? Yes we can. See pytest-order -- fixed in Expand data validation docs and clean up validation tests #4474
- Address Pydantic/Pandera warnings & deprecations #4419
- Arelle logs like: `[DEBUG] arelle:23 Try #0: taxonomy_source=<arelle.FileSource.FileSource object`
- DEBUG output from dbt phoning home using `urllib3.connectionpool`
- Thousands of lines of unhelpful DEBUG output from `matplotlib.font_manager`
- Alembic INFO/DEBUG logs from `alembic.runtime.migration`
- `catalystcoop.pudl.output.ferc1:2138` Calculations in Exploded Metadata can not be represented as a forest! (it's fine. it's a DAG)
- Chained assignment / inplace warning from `pudl/etl/glue_assets.py:689`
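One possible way to silence the noisy third-party loggers listed above is to raise their level with the standard `logging` module. This is only a sketch: the logger names are taken from the log lines in this issue (assuming the `arelle:23` prefix means the logger is named `arelle`), while the helper name `quiet_noisy_loggers` and the `WARNING` threshold are assumptions.

```python
import logging

# Logger names as they appear in the test output quoted in this issue.
NOISY_LOGGERS = (
    "arelle",
    "urllib3.connectionpool",
    "matplotlib.font_manager",
    "alembic.runtime.migration",
)


def quiet_noisy_loggers(names=NOISY_LOGGERS, level=logging.WARNING):
    """Raise the threshold on the named loggers so DEBUG/INFO records are dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)
```

Something like this could be called once at the start of the test session, for example from a session-scoped fixture in `conftest.py`. Warnings and errors from these libraries would still be logged.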
Status
Backlog