Clarification on the utility_id_ferc1 variable source in out_ferc1__yearly_utility_plant_summary_sched200 (and other Form 1 tables!)
#4363
-
Thanks for all the awesome work Catalyst does on the FERC Form 1 data. I had a quick question about the I've worked with the raw Form 1 data in the past, using an old PC and FoxPro to export the data to text files, as well as the older PUDL Datasette tables (last accessed in early 2025). While updating some code today to work with the new Parquet files, I noticed that the IDs in the current tables don't seem to align with the raw FERC data or the prior Datasette CSVs. For example:
I cross-checked a few investment values against PDFs of both utilities' filings and the values appear to align, so I think this is simply about the Question: Would you be able to clarify if the To illustrate the discrepancy, I'm attaching the set of names and respondent IDs I exported in 2022, for the 2020 filing, from the raw FERC Form 1 data: f1_respondent_id20.txt. The IDs here also align with the summary table at UT Austin: https://energy.utexas.edu/policy/fce/calculators. Thank you so much!
Replies: 2 comments 4 replies
-
Hey @myozwiak, sorry for the confusion here, and glad our FERC data has been useful! Please don't be shy about letting us know how it can be more useful, or if there are additional tables that you'd like to see fully integrated into PUDL, or any weird data issues you find.
The originally reported FERC 1 utility IDs are named Though really, for that denormalized output table (and all the others), I can see a good argument for us merging the DBF and XBRL IDs in across the board so they're always available if people are already familiar with them or want to merge in other data that's using the old IDs.
-
See discussion above!
Hey @myozwiak, sorry for the confusion here, and glad our FERC data has been useful! Please don't be shy about letting us know how it can be more useful, or if there are additional tables that you'd like to see fully integrated into PUDL, or any weird data issues you find.
utility_id_ferc1 is an ID assigned within PUDL which unifies the two different epochs of FERC Form 1 reporting. In the old DBF data that you've worked with in the past, each respondent had an integer ID. In the new XBRL-based Form 1 reporting, each respondent has a different ID that's a string and looks like C000528. We've manually matched the old and new IDs and assigned the utility_id_ferc1 so that a single ID can be used t…
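
To make the ID matching described above concrete, here is a rough sketch of how a crosswalk between the two reporting epochs could be used to translate a raw respondent ID into the unified one. Everything in it is an assumption for illustration: the `CROSSWALK` rows, the numeric values, and the helper names `build_lookups` and `to_pudl_id` are made up, and this is not how PUDL stores or exposes the mapping. Only the shape of the IDs (integers in the DBF era, strings like C000528 in the XBRL era) comes from the discussion.

```python
# Hypothetical crosswalk rows: (unified PUDL ID, legacy DBF respondent ID, XBRL entity ID).
# All values are invented for illustration; they are NOT real PUDL assignments.
CROSSWALK = [
    (101, 17, "C000528"),
    (102, 45, "C000999"),
    (103, 88, None),  # a respondent that only appears in the DBF era
]


def build_lookups(rows):
    """Build one dict per reporting epoch, each mapping a raw ID to the unified ID."""
    dbf_to_pudl = {}
    xbrl_to_pudl = {}
    for pudl_id, dbf_id, xbrl_id in rows:
        if dbf_id is not None:
            dbf_to_pudl[dbf_id] = pudl_id
        if xbrl_id is not None:
            xbrl_to_pudl[xbrl_id] = pudl_id
    return dbf_to_pudl, xbrl_to_pudl


def to_pudl_id(raw_id, dbf_to_pudl, xbrl_to_pudl):
    """Translate a raw respondent ID into the unified ID, or None if unmatched.

    Integers are treated as DBF-era IDs and strings as XBRL-era IDs.
    """
    if isinstance(raw_id, int):
        return dbf_to_pudl.get(raw_id)
    return xbrl_to_pudl.get(raw_id)


dbf_lookup, xbrl_lookup = build_lookups(CROSSWALK)
print(to_pudl_id(17, dbf_lookup, xbrl_lookup))
print(to_pudl_id("C000528", dbf_lookup, xbrl_lookup))
```

In this toy data both the old integer ID and the new string ID resolve to the same unified ID, which is the property that lets a single ID follow a respondent across both epochs.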