Handle high memory usage in CEMS parquet-to-duckdb conversion #3812

@jdangerx

Overview

Details here: #3741 (comment)

In short, the CEMS Parquet file is large enough that converting it to DuckDB runs into memory problems.

Success Criteria

  • Hourly CEMS makes it into DuckDB

Next steps

  • [ ] Figure out whether we can convert the Parquet file in chunks

Metadata

Assignees

No one assigned

    Labels

    epacems: Integration and analysis of the EPA CEMS dataset.
    performance: Make PUDL run faster!

    Type

    No type

    Projects

    Status

    Icebox

    Milestone

    No milestone
