
No Dagster logs but many uninformative logs in pytest #4482

@zaneselvans

Description


Overview

  • I recently noticed that the integration tests seemed to hang when I ran them locally.
  • They have tons of output from both alembic and Arelle, but nothing from the FERC data extractions or ETL.
  • However, what really seems to have happened is that we're no longer outputting the logs generated by Dagster.
  • We've also accumulated a lot of uninformative logging output through bitrot and some DEBUG-level logs.
  • There are also many deprecation/future warnings coming from libraries that we use which aren't our problem.
  • We have filterwarnings set in pyproject.toml that help with some things.
  • But the overall signal-to-noise ratio of the logs has gotten very bad, which makes debugging real issues or seeing warnings that we should take seriously harder.
  • Example of the logging output in the comments.
  • This problem happens in the CI on GitHub as well. See these logs from a recent merge queue.
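
To illustrate the kind of filtering the filterwarnings setting performs, here is a minimal sketch using Python's standard warnings module, whose filter semantics pytest's filterwarnings option builds on. The functions noisy_library_call and our_call and their messages are hypothetical stand-ins, not PUDL code; the point is only that a targeted "ignore" filter can drop a third-party FutureWarning while our own warnings still come through.

```python
import warnings


def noisy_library_call():
    # Hypothetical stand-in for a third-party library emitting a FutureWarning.
    warnings.warn("this API will change", FutureWarning, stacklevel=1)
    return 42


def our_call():
    # Hypothetical stand-in for a warning we actually want to see.
    warnings.warn("pudl-specific problem", UserWarning, stacklevel=1)
    return 1


def collect_warnings():
    """Run both calls and return the warnings that survive the filters."""
    with warnings.catch_warnings(record=True) as caught:
        # Start from "show everything" so the result does not depend on
        # whatever filters the surrounding process has installed.
        warnings.simplefilter("always")
        # Filters added later take precedence, so this targeted "ignore"
        # wins over the blanket "always" for the matching warning only.
        warnings.filterwarnings(
            "ignore",
            message=r"this API will change",
            category=FutureWarning,
        )
        noisy_library_call()
        our_call()
    return [(w.category.__name__, str(w.message)) for w in caught]
```

The message argument is a regular expression matched against the start of the warning text, so filters can be kept narrow enough that new, unrelated warnings are not hidden by accident.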

Specific Issues

  • Ensure that Dagster logs come through in pytest output.
  • Every error appears to come from the pudl_service_territories script, simply because it's the first test collected by pytest and therefore the one that triggers the ETL. Can we force test/unit/etl_test.py::test_pudl_engine to run first instead? Yes, we can: see pytest-order. Fixed in "Expand data validation docs and clean up validation tests" #4474.
  • Address Pydantic/Pandera warnings & deprecations #4419
  • Arelle logs like: [DEBUG] arelle:23 Try #0: taxonomy_source=<arelle.FileSource.FileSource object
  • DEBUG output from dbt phoning home using urllib3.connectionpool
  • Thousands of lines of unhelpful DEBUG output from matplotlib.font_manager
  • Alembic INFO/DEBUG logs from alembic.runtime.migration
  • catalystcoop.pudl.output.ferc1:2138 Calculations in Exploded Metadata can not be represented as a forest! (it's fine. it's a DAG)
  • Chained assignment / inplace warning from pudl/etl/glue_assets.py:689
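
One possible way to handle the noisy third-party loggers listed above is to raise their levels before the tests run. The sketch below uses only the standard logging module. The logger names are taken from the log output quoted in this issue; the NOISY_LOGGERS mapping, the quiet_noisy_loggers helper, and the choice of WARNING as the threshold are assumptions for illustration, not existing PUDL code. It does not address the missing Dagster logs, which is a separate problem.

```python
import logging

# Logger names as they appear in the log output quoted in this issue.
# The WARNING threshold is an assumption; pick whatever level is useful.
NOISY_LOGGERS = {
    "arelle": logging.WARNING,
    "urllib3.connectionpool": logging.WARNING,
    "matplotlib.font_manager": logging.WARNING,
    "alembic.runtime.migration": logging.WARNING,
}


def quiet_noisy_loggers(levels=NOISY_LOGGERS):
    """Raise the level of each named logger so DEBUG/INFO records are dropped."""
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level)
```

A helper like this could be called once from a session-scoped fixture or at import time in conftest.py. Because levels are set per logger name, our own loggers are left at whatever level the test configuration gives them.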

Metadata


Assignees

No one assigned

    Labels

    bug: Things that are just plain broken.
    dagster: Issues related to our use of the Dagster orchestrator
    developer experience: Things that make the developers' lives easier, but don't necessarily directly improve the data.
    performance: Make PUDL run faster!
    testing: Writing tests, creating test data, automating testing, etc.

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
