Labels
community, data-types, good first issue
Description
Overview
What is the problem we're solving? For very simple items, this can be encapsulated in the success criteria.
Elsewhere we convert our units from thousands of dollars to dollars, and from thousands of lbs to lbs. However, we haven't done this consistently for all fields. Notably, these fields are still in "thousands of" units:
- `fgd_sorbent_consumption_1000_tons` (to be transformed in the `_core_eia923__fgd_operation_maintenance` function in `pudl.transform.eia923.py`)
- `max_steam_flow_1000_lbs_per_hour` (to be transformed in the `_core_eia860__boilers` function in `pudl.transform.eia860.py`)
- `steam_load_1000_lbs` (should be converted in the `pudl.extract.epacems.py` module)
Success Criteria
How will we know that we're done?
- `fgd_sorbent_consumption_1000_tons` -> `fgd_sorbent_consumption_tons`
- `max_steam_flow_1000_lbs_per_hour` -> `max_steam_flow_lbs_per_hour`
- `steam_load_1000_lbs` -> `steam_load_lbs`
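One way to check the criteria above is to scan a table's column names for any of the old "thousands of" names. The sketch below is illustrative only: `find_unconverted_columns` is a hypothetical helper, not part of PUDL, and it assumes you already have the list of column names from whichever asset you are inspecting.

```python
# Old -> new field names, taken from the success criteria above.
EXPECTED_RENAMES = {
    "fgd_sorbent_consumption_1000_tons": "fgd_sorbent_consumption_tons",
    "max_steam_flow_1000_lbs_per_hour": "max_steam_flow_lbs_per_hour",
    "steam_load_1000_lbs": "steam_load_lbs",
}


def find_unconverted_columns(columns):
    """Return the old "thousands of" column names still present in `columns`.

    Hypothetical helper for illustration; an empty list means none of the
    old names remain in the table being checked.
    """
    present = set(columns)
    return sorted(old for old in EXPECTED_RENAMES if old in present)


# Example column lists (made up for illustration).
before = ["plant_id_eia", "steam_load_1000_lbs"]
after = ["plant_id_eia", "steam_load_lbs"]

leftover_before = find_unconverted_columns(before)
leftover_after = find_unconverted_columns(after)
print(leftover_before)
print(leftover_after)
```

An empty result only shows that the old names are gone; the values and the field metadata still need to be checked separately.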
Get set up
- Fork the PUDL repository and follow the steps to set up the PUDL development environment
- Activate the pudl-dev environment: mamba activate pudl-dev
- Grab the latest raw data for any relevant tables, in this case EIA 860, EIA 923 and EPACEMS. To do so, run `pudl_datastore --dataset eia923`, `pudl_datastore --dataset eia860`, and `pudl_datastore --dataset epacems` (the last one pulls a large chunk of data and might take a while).
- For the first two variables, first generate the raw assets upstream of the transform: follow the instructions to open Dagster, and in the left-hand menu find the relevant raw asset group, e.g., "raw_eia923". Click "Materialize all".
Next steps
- For each field, identify the first `_core` or `core` table in which it is transformed (see `pudl.metadata.resources`).
- In the corresponding transform function, multiply this field by one thousand to convert it from "thousands of" units to base units.
- Update the field description, units and name
- Update the alembic schema
- For the last variable: CEMS processing happens all in one step. To test your implemented solution, you'll simply need to find the "core_epacems__hourly_emissions" asset in the "Asset" tab and materialize it. For the other two variables, you can generate the affected assets (e.g., `_core_eia923__fgd_operation_maintenance`) and inspect them in `devtools/inspect_assets.ipynb`.
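The conversion and rename steps above can be sketched with pandas. This is a minimal illustration, not the actual PUDL transform code: `convert_thousands_to_units` is a hypothetical helper, the example dataframe is made up, and the real change would live inside the existing transform functions named in the issue.

```python
import pandas as pd

# Old -> new field names, taken from the issue's success criteria.
THOUSANDS_RENAMES = {
    "fgd_sorbent_consumption_1000_tons": "fgd_sorbent_consumption_tons",
    "max_steam_flow_1000_lbs_per_hour": "max_steam_flow_lbs_per_hour",
    "steam_load_1000_lbs": "steam_load_lbs",
}


def convert_thousands_to_units(df: pd.DataFrame, renames: dict) -> pd.DataFrame:
    """Scale "thousands of" columns to base units, then rename them.

    Hypothetical helper for illustration. Columns listed in `renames` but
    absent from `df` are skipped; the input dataframe is not modified.
    """
    present = {old: new for old, new in renames.items() if old in df.columns}
    out = df.copy()
    for old in present:
        # A value of 1.5 (thousand lbs) becomes 1500.0 (lbs); NaN stays NaN.
        out[old] = out[old] * 1000
    return out.rename(columns=present)


# Made-up example data, including a missing value.
raw = pd.DataFrame(
    {"plant_id_eia": [1, 2], "steam_load_1000_lbs": [1.5, None]}
)
converted = convert_thousands_to_units(raw, THOUSANDS_RENAMES)
print(converted)
```

Scaling and renaming in the same step keeps a column's name and its units from drifting apart. The field description and units in the metadata still need to be updated separately.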