
Load parquet files to a duckdb file #3739

Open

Description

@bendnorman

Superset does not support loading data from SQLite, so we want to use DuckDB instead! DuckDB is well suited to our data because it's designed for local data warehouses. It's also a cheaper option for Superset than something like BigQuery, where we'd have to pay for query compute costs.

### Success Criteria

  • all parquet files except CEMS and long-named tables are in a .duckdb file
  • .duckdb file is generated & distributed to S3/GCS in nightly builds
### Tasks
- [x] Move script from devtools to package and add as entry point. Add to tests
- [x] Run `parquet_to_duckdb.py` script in `gcp_pudl_etl.sh`. Add lines to copy pudl.duckdb to `s3://pudl.catalyst.coop`, `gs://pudl.catalyst.coop` and `gs://superset.catalyst.coop`
- [x] Create a duckdb schema from our metadata classes
- [x] Load all the parquet files into the .duckdb file (see the loading sketch after this list)
- [ ] Figure out why foreign keys slow down or freeze the loading step. Is it related to the enum issue?
- [ ] Agree on a solution for handling enums (see the enum sketch below)
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3812
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3811
- [ ] Bonus: Use the SQL Expression Language instead of raw text to specify constraints (see the SQLAlchemy sketch below)
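
For reference, a minimal sketch of the loading step, assuming a local directory of parquet files whose stems match the desired table names (the paths and naming convention here are hypothetical, not the actual `parquet_to_duckdb.py` implementation):

```python
from pathlib import Path

import duckdb

# Hypothetical input directory and output file; the real script would
# take these as arguments.
parquet_dir = Path("./parquet")
con = duckdb.connect("pudl.duckdb")

for path in sorted(parquet_dir.glob("*.parquet")):
    table = path.stem  # assumes file stem == table name
    # read_parquet() lets DuckDB infer the schema directly from the file.
    con.execute(
        f"CREATE TABLE {table} AS SELECT * FROM read_parquet('{path}')"
    )

con.close()
```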
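On the enum question: DuckDB does support native ENUM types, so one option would look roughly like this (the type, table, and column names are made up for illustration):

```python
import duckdb

con = duckdb.connect("pudl.duckdb")

# Define a native DuckDB ENUM type, then reference it from a column.
con.execute("CREATE TYPE fuel_type AS ENUM ('coal', 'gas', 'oil')")
con.execute(
    """
    CREATE TABLE plants_hypothetical (
        plant_id INTEGER PRIMARY KEY,
        fuel fuel_type
    )
    """
)
con.close()
```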
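And for the bonus item, a sketch of what constraints look like in SQLAlchemy's SQL Expression Language instead of raw DDL text. This only emits generic DDL strings, so it doesn't assume a DuckDB dialect is installed, and the tables are hypothetical stand-ins for what our metadata classes would generate:

```python
import sqlalchemy as sa
from sqlalchemy.schema import CreateTable

metadata = sa.MetaData()

# Hypothetical tables; in PUDL these would be generated from the
# metadata classes rather than written by hand.
utilities = sa.Table(
    "utilities",
    metadata,
    sa.Column("utility_id", sa.Integer, primary_key=True),
)
plants = sa.Table(
    "plants",
    metadata,
    sa.Column("plant_id", sa.Integer, primary_key=True),
    sa.Column(
        "utility_id",
        sa.Integer,
        sa.ForeignKey("utilities.utility_id"),
    ),
    sa.Column("fuel", sa.Enum("coal", "gas", "oil", name="fuel_type")),
)

# Compile each table definition to a DDL string, constraints included,
# without hand-writing any SQL text.
for table in metadata.sorted_tables:
    print(CreateTable(table))
```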
