Standardize 1000 tons to tons in fgd_sorbent_consumption column #4426

msa856 · 2025-07-15T17:53:09Z

Overview

Closes #XXXX.

What problem does this address?

What did you change?

Documentation

Make sure to update relevant aspects of the documentation:

Update the release notes: reference the PR and related issues.
Update relevant Data Source jinja templates (see docs/data_sources/templates).
Update relevant table or source description metadata (see src/metadata).
Review and update any other aspects of the documentation that might be affected by this PR.

Testing

How did you make sure this worked? How can a reviewer verify this?

To-do list

If updating analyses or data processing functions: make sure to update row count expectations in dbt tests.
Run make pytest-coverage locally to ensure that the merge queue will accept your PR.
Review the PR yourself and call out any questions or issues you have.
For minor ETL changes or data additions, once make pytest-coverage passes, make sure you have a fresh full PUDL DB downloaded locally, materialize new/changed assets and all their downstream assets and run relevant data validation tests using pytest and --live-dbs.
For bigger ETL or data changes run the full ETL locally and then run the data validations using make pytest-validate.
Alternatively, run the build-deploy-pudl GitHub Action manually.

jdangerx

You're headed in the right direction! I think the main thing you still need to do is update the alembic schema.

Then you can materialize the asset in dagster and then check if the output data is actually in tons instead of kilotons!

jdangerx · 2025-07-16T16:16:12Z

src/pudl/metadata/fields.py

-        "unit": "1000_tons",
-        "description": "Quantity of flue gas desulfurization sorbent used, to the nearest 0.1 thousand tons.",
+        "unit": "tons",
+        "description": "Quantity of flue gas desulfurization sorbent used, to the nearest ton.",


I suppose this is actually the nearest 100 tons? Looking at the PUDL viewer it does seem like the original values are to the nearest 0.1 thousand (disregarding the floating point error...):

I see what you mean. In tons it would be 108,599.99 for the first value and we want it rounded to 108600 (re: comment below), correct?

I updated the alembic schema, ran dagster dev, but materializing _core_eia923__fdg_operation_maintenance failed because of the asset check (continuity check). Should I change that too prior to materialization (src/pudl/transform/eia923.py)?

jdangerx · 2025-07-16T16:17:18Z

src/pudl/transform/eia923.py

@@ -1350,6 +1350,10 @@ def _core_eia923__fgd_operation_maintenance(
    fgd_df.loc[:, fgd_df.columns.str.endswith("_1000_dollars")] *= 1000
    fgd_df.columns = fgd_df.columns.str.replace("_1000_dollars", "")  # Rename columns

+    # Convert thousands of tons to tons
+    fgd_df.loc[:, fgd_df.columns.str.endswith("_1000_tons")] *= 1000


And if this is to the nearest 100 tons we should probably round this appropriately.

I was able to load _core_eia923__fgd_operation_maintenance into my notebook and it seemed to be rounded already (I don't know why that differs from what you showed in the viewer). Do you recommend another line of code to ensure it's always rounded?

Also, I am not sure if this is the best place to ask but was curious: does etl.defs bypass the asset checks? Is that why I can see the updated column whereas in Dagster I cannot?

…bent

standardized 1000s to tons in transform func.

84b0fc6

github-project-automation bot added this to Catalyst Megaproject Jul 15, 2025

github-project-automation bot moved this to New in Catalyst Megaproject Jul 15, 2025

msa856 changed the title ~~standardized 1000s to tons in transform func.~~ Standardize 1000 tons to tons in fdg-sorbent-consumption column Jul 15, 2025

update fields, dbt for fgd sorbent column

d620ab4

jdangerx reviewed Jul 16, 2025

View reviewed changes

msa856 added 2 commits July 16, 2025 12:45

Merge remote-tracking branch 'upstream/main' into standardize-fdg-sor…

f2531c4

…bent

Migration: standardized fgd sorbent column

d2c1e5c

zaneselvans changed the title ~~Standardize 1000 tons to tons in fdg-sorbent-consumption column~~ Standardize 1000 tons to tons in fgd_sorbent_consumption column Jul 16, 2025

cmgosnell added the community Issues that contributors have volunteered to take on or fostering more community label Jul 21, 2025

cmgosnell moved this from New to In progress in Catalyst Megaproject Jul 21, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Standardize 1000 tons to tons in fgd_sorbent_consumption column #4426

Standardize 1000 tons to tons in fgd_sorbent_consumption column #4426

msa856 commented Jul 15, 2025

Uh oh!

jdangerx left a comment

Uh oh!

jdangerx Jul 16, 2025

Uh oh!

msa856 Jul 16, 2025

Uh oh!

msa856 Jul 16, 2025 •

edited

Loading

Uh oh!

jdangerx Jul 16, 2025

Uh oh!

msa856 Jul 16, 2025

Uh oh!

Uh oh!

Uh oh!

Standardize 1000 tons to tons in fgd_sorbent_consumption column #4426

Are you sure you want to change the base?

Standardize 1000 tons to tons in fgd_sorbent_consumption column #4426

Conversation

msa856 commented Jul 15, 2025

Overview

What problem does this address?

What did you change?

Documentation

Testing

To-do list

Uh oh!

jdangerx left a comment

Choose a reason for hiding this comment

Uh oh!

jdangerx Jul 16, 2025

Choose a reason for hiding this comment

Uh oh!

msa856 Jul 16, 2025

Choose a reason for hiding this comment

Uh oh!

msa856 Jul 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jdangerx Jul 16, 2025

Choose a reason for hiding this comment

Uh oh!

msa856 Jul 16, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

msa856 Jul 16, 2025 •

edited

Loading