Add NRELATB data source page #4396
Conversation
i exported this from here: https://atb.nrel.gov/electricity/2024/definitions
I tried loading some of these saved HTML documents and unlike the newer blank FERC forms (which come as HTML) they don't seem to work very well as stand-alone documents -- lots of links, but they're broken. The tabular formatting doesn't seem to have carried across, etc. I wonder if we might want to save ("print") these definitions pages as PDFs instead?
Mostly optional stuff, one major confusion area
primary keys. Subsets of the ``core_metric_parameter`` have unique values across the
data given specific primary keys.
Blocking: This is probably too compressed, and needs more words in it to make it clear what all the parts mean. Or maybe an example? or both.
- Subsets how? a subset of the rows sharing a metric parameter? a subset of the available metric parameters? a subset of the columns (in the pudl sense) within the rows sharing a metric parameter?
- Values of the parameters or of the primary keys?
- Does specific primary keys mean a set of primary keys from the previous sentence, so you have "given primary keys [collection of columns], ..."? or does it mean specific values in the primary key columns, so you have "given primary keys [row selection criteria], ..."
yea i was worried about this not being clear.. i will try to flesh it out and check to see if it helps.
okay i reworked this a fair amount to just explain how the original skinny table gets split into multiple wider tables. the thing i was tryyying to get at before was why we had to break the skinny table into multiple wider tables, which is to say specific values of core_metric_parameter pertained to different sets of primary keys. it was a little tedious to figure out which core_metric_parameter should be pulled into which of the final pudl tables bc it was a guessing game of "are you unique based on this set of PKs? if yes great, if no try another set, etc etc". I copied this directly from the transform docs and tbh i think that weirdness makes more sense over there than actually explained to a user.
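For readers following along, the "guessing game" described above can be sketched roughly as below. This is only an illustration, not the actual PUDL transform: it assumes pandas, and apart from core_metric_parameter every column name, candidate key set, and value is made up for the example.

```python
import pandas as pd

# Hypothetical skinny table: one row per parameter value. Key columns that do
# not apply to a given parameter are left null. All values are invented.
skinny = pd.DataFrame(
    {
        "report_year": [2023, 2024, 2024, 2024, 2024, 2024, 2024, 2024],
        "technology": [None, None, "wind", "solar", "wind", "wind", "solar", "solar"],
        "scenario": [None, None, None, None, "moderate", "advanced", "moderate", "advanced"],
        "core_metric_parameter": [
            "inflation_rate", "inflation_rate",
            "capacity_factor", "capacity_factor",
            "capex", "capex", "capex", "capex",
        ],
        "value": [0.025, 0.025, 0.40, 0.25, 1500.0, 1300.0, 1200.0, 1000.0],
    }
)

# Hypothetical candidate primary key sets, ordered from narrowest to widest.
CANDIDATE_KEYS = [
    ["report_year"],
    ["report_year", "technology"],
    ["report_year", "technology", "scenario"],
]


def smallest_unique_key(rows, candidate_keys):
    """Return the first candidate key set on which these rows are unique."""
    for keys in candidate_keys:
        if not rows.duplicated(subset=keys).any():
            return keys
    return None


def split_skinny_table(skinny, candidate_keys):
    """Group parameters by the key set they are unique on, then pivot each group wide."""
    params_by_keys = {}
    for param, rows in skinny.groupby("core_metric_parameter"):
        keys = smallest_unique_key(rows, candidate_keys)
        if keys is None:
            raise ValueError(f"No candidate key set makes {param} unique.")
        params_by_keys.setdefault(tuple(keys), []).append(param)

    wide_tables = {}
    for keys, params in params_by_keys.items():
        subset = skinny[skinny["core_metric_parameter"].isin(params)]
        wide = subset.pivot(
            index=list(keys), columns="core_metric_parameter", values="value"
        ).reset_index()
        wide.columns.name = None  # drop the leftover "core_metric_parameter" label
        wide_tables[keys] = wide
    return wide_tables


wide_tables = split_skinny_table(skinny, CANDIDATE_KEYS)
for keys, table in wide_tables.items():
    print(keys)
    print(table)
```

Each parameter lands in the wide table for the narrowest key set on which its rows are unique, so every resulting table gets a well-defined primary key. One caveat of this simplified check: a parameter with only a single row would trivially pass the narrowest key set.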
…s are available in viewer
Co-authored-by: Kathryn Mazaitis <[email protected]>
Add EPA CAMD - EIA crosswalk data source page
* feat: add dagster-dbt translation functionality
* docs: update typos, add a little more explanation remove extraneous project prefix in dbt selectors.
* Update conda-lock.yml and re-render conda environment files. Autoupdate pre-commit hooks.
* Bump gdal pin to 3.11.3
---------
Co-authored-by: zaneselvans <[email protected]>
Co-authored-by: Zane Selvans <[email protected]>
* Add new expect_columns_not_all_null table-level dbt test.
* Add new expect_columns_not_all_null test to all dbt models.
* Exclude some actually all null columns.
* Add more information to the failure outputs.
* Remove remaining no-null-cols validation tests.
* Rename id_dc_coupled_tightly to is_dc_coupled_tightly
* Fix conditional_columns checks to pass when zero records are selected.
* Add no-null-rows fast ETL conditions for _core_eia860__cooling_equipment
* Add no-null-rows fast ETL conditions for _core_eia860__fgd_equipment
* move so2 equipment into excluded columns
* Add no-null-rows fast ETL conditions for _core_eia923__cooling_system_information
* Update dbt/tests/data_tests/generic_tests/expect_columns_not_all_null.sql
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Clean up / simplify not all null test a little.
* Fix enum for NREL tax case (#4384)
* fix enum for nrelatb tax case
* Merge alembic migration heads.
---------
Co-authored-by: Zane Selvans <[email protected]>
* Add exceptions to no-null-cols test for _core_eia923__fgd_operation_maintenance
* Add exceptions to expect_columns_not_all_null test for core_eia860__assn_boiler_generator table
* Add exceptions to expect_columns_not_all_null test for core_eia860__assn_boiler_stack_flue table
* Add exceptions to expect_columns_not_all_null test for core_eia860__scd_generators_solar table
* Add exceptions to expect_columns_not_all_null test for core_eia860__scd_generators_wind table
* Add exceptions to expect_columns_not_all_null test for out_eia923__fuel_receipts_costs table
* Add exceptions to expect_columns_not_all_null for remaining tables failing fast ETL
* Remove pinned dbt version now that regression has been fixed.
* Remove dagster asset checks superseded by expect_columns_not_all_null
* Docstring cleanup.
* Add comment, shorten pudl.validate to pv
* Add missing boolean columns to EIA-860 multiful transform.
* Add exhaustive null blocks into SCD table expect_columns_not_all_null tests
* Change pandera import to be pandas specific.
* consolidate duplicate operating_switch and can_switch_when_oeprating columns
* Add special cases to expect_columns_not_all_null for core_eia860__scd_generators_multifuel table.
* Add special cases for expect_columns_not_all_null test in coalmine and boiler entity tables
* Relock conda dependencies
* Exclude static forward/backward filled boiler manufacturer fields from no-null-cols test.
* Add description explaining why certain columns are excluded in yearly boiler table.
* Relock dependencies
* Rename conditional_columns to row_conditions
* Fix name of columns_are_close test in docs file.
* relock dependencies
* Add descriptions to explain special cases in no-null-cols tests.
* Fix rolling average description wording.
* Add release notes about no-null-cols checks.
* Dynamically exclude null column checks related to lack of EIA-860M coverage
* Dynamically exclude null column checks related to lack of EIA-860M coverage
* Remove final reference to pv.no_null_cols() and the function itself.
* Deal with possibility of 1 or 2 EIA-860M only years.
* Simplify logic and document report_date requirement
* Add a real script for identifying null column row conditions
* Add explicit help options.
* Revert dependency updates to whatever is on main
* Update stale docstring and dependencies b/c pandera imports
* Remove log-level option
* Remove log-level option
* Re-lock dependencies based on main
* Add unit tests for pudl_null_cols script.A
* Update dbt/tests/data_tests/generic_tests/schema.yml
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update docs/release_notes.rst
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Update test/unit/scripts/test_pudl_null_cols.py
  Co-authored-by: Kathryn Mazaitis <[email protected]>
* Address comments from code review.
---------
Co-authored-by: Kathryn Mazaitis <[email protected]>
Co-authored-by: Christina Gosnell <[email protected]>
* wip first very draft of table name descriptor
* slight clean up of wip table name metadata
* Incorporate metadata checker and table name data extractor. Lightly-revise table name data stubs for use in templating. Includes minimal seed data for testing. Includes new 'label' attribute for DataSource metadata.
* Resource metadata: cosmetic adjustments for preview wizard
* Adjust templates for wizard
* split shared text from prompts
* Remove outdated 861 variance
* Add table description rendering step
* Add usage warnings
* Add command line access to default table description templates; fix corner cases
* Add table description build phase; support structured table description metadata in type checks.
* Fully-decomposed metadata build
* Update core_eia923__monthly_boiler_fuel with new decomposed table metadata
* Adapt new metadata build for previews
* Add support for quarterly; make descriptions render acceptably for unmigrated tables
* Revert sample migrations
* Handle usage warnings defaults at build time; skip usage warnings display if empty
* Permit the sphinx build to override default table description template location
* Try moving the table description template into the primary source tree
* Drop docs_dir args that are no longer needed
* Add docstrings to description rendering code
* Refactor with more sensible layout, naming, and documentation
* Add missing files
* replace lockfiles with copies from main. (we're not adding any new libraries so this should be fine?)
* Fix typos :D
* Fix docs build; fix missed escapes in description template; use more consistent vocabulary
* Fix API doc formatting and improve test coverage.
  - Remove extra indentation causing spurious blockquotes
  - Migrate one source so we can exercise the rest of the tests
  - Add test for resource_description console script, and fix some confusing naming
  - Fix bad field names in metadata test
* Add flag to turn metadata checks for primary key descriptions on/off
* Revisions from review.
* Switch to more consistent and informative key naming scheme
* Drop vestigal Datasource.label
* Clarify string aggregations
* Fix spacing more
  Co-authored-by: E. Belfer <[email protected]>
* Fix rst in docstrings
* Permit nonstandard table types if you have additional_summary_text set
* typo
* Fix docs
* Docs and test improvements from review.
  Co-authored-by: Dazhong Xia <[email protected]>
---------
Co-authored-by: Kathryn Mazaitis <[email protected]>
Co-authored-by: Kathryn Mazaitis <[email protected]>
Co-authored-by: E. Belfer <[email protected]>
Co-authored-by: Dazhong Xia <[email protected]>
* Update DOIs
* Update release notes
* fix: make yaml output indent lists properly for prettier
* chore: make diff output a little nicer to read
* fix: using __ within a class no work with module level function
* docs: update docstring with info about prettier jankiness
* docs: update release notes
* chore: switch to click.secho instead of logger only stop propagation of logger.parent in this script.
* wip: only run integration tests for non-doc changes for testing purposes, not restricting CI to merge-group only.
* chore: re-add the merge_group condition, add comment
* chore: realize that skipped jobs count as passing for branch protection
* revert-later: remove merge_group filter for testing
* Revert "revert-later: remove merge_group filter for testing" This reverts commit 50d3c33.
* docs: update release notes
* fix: add some other docs files; don't skip unit tests
* Update docs/release_notes.rst
  Co-authored-by: Zane Selvans <[email protected]>
---------
Co-authored-by: Zane Selvans <[email protected]>
Lil typos and then good to go!
Also 😟 so sorry I apparently merged the epacamd source page into this branch instead of main?? Made a mess. A temporary mess but still. Sorry.
Co-authored-by: Kathryn Mazaitis <[email protected]>
Overview
Closes #4373.
What problem does this address?
no page!
What did you change?
added page! plus added 2024 html definitions
Documentation
Make sure to update relevant aspects of the documentation:
docs/data_sources/templates).
Testing
How did you make sure this worked? How can a reviewer verify this?
make docs-build -> eyeball the built docs page
To-do list
dbt tests.
make pytest-coverage locally to ensure that the merge queue will accept your PR.
make pytest-coverage passes, make sure you have a fresh full PUDL DB downloaded locally, materialize new/changed assets and all their downstream assets and run relevant data validation tests using pytest and --live-dbs.
make pytest-validate.
build-deploy-pudl GitHub Action manually.