Revisit normalization of EIA-861 distributed generation fuel data #3324

@zaneselvans

Description

From this comment and this comment

The fuel_pct field comes from the core_eia861__yearly_distributed_generation_fuel table. In 2010, the form switched from reporting percent of total capacity to reporting MW of capacity for each individual technology type. In the transform module, we convert all the pre-2010 pct_capacity values to MW of capacity so that we have a uniform column of values across all years.
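For reference, the percent-to-MW conversion described above could look roughly like the sketch below. This is not the actual transform code: the function name `pct_to_mw`, the column names `wind_pct` and `hydro_pct`, and the sample values are hypothetical, and it assumes percentages are reported on a 0-100 scale.

```python
import pandas as pd


def pct_to_mw(
    df: pd.DataFrame, pct_cols: list[str], total_col: str = "capacity_mw"
) -> pd.DataFrame:
    """Convert percent-of-total-capacity columns into MW of capacity.

    Assumes percentages are on a 0-100 scale and that ``total_col`` holds
    the total capacity in MW for each record.
    """
    out = df.copy()
    for col in pct_cols:
        mw_col = col.replace("_pct", "_mw")
        out[mw_col] = out[col] / 100 * out[total_col]
    # Drop the percent columns so only the uniform MW columns remain.
    return out.drop(columns=pct_cols)


# Hypothetical pre-2010 records (illustrative values only).
old = pd.DataFrame(
    {
        "report_year": [2008, 2009],
        "capacity_mw": [10.0, 20.0],
        "wind_pct": [25.0, 50.0],
        "hydro_pct": [75.0, 50.0],
    }
)
converted = pct_to_mw(old, ["wind_pct", "hydro_pct"])
```

After this step the pre-2010 rows carry `wind_mw` and `hydro_mw` columns that can be concatenated with the post-2010 MW values.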

The form stops reporting fuel_pct values in 2007. This means that there was no incentive to convert this column to fuel_mw, because there were no post-2010 fuel_mw columns to combine it with.

This leaves me with 2 questions:

  1. The fuel_pct columns represent the percent of total capacity by fuel. We split this table off from the _tech and _misc tables so that we could normalize the data (i.e., make fuel_class and fuel_pct columns instead of having an individual column for each fuel type). However, the capacity_mw field that all the percentages are based on lives in the core_eia861__yearly_distributed_generation_tech table. How should we solve this issue? It feels like the capacity field should be in the same table, but the whole point of normalization is to de-duplicate data for storage. It also doesn't necessarily make sense to recombine the tables, because that would mean making it a wide table again, with tons of columns.

  2. Should we bother converting this fuel_pct to capacity_mw to match the newer tech_pct data that has been converted?
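One possible way to address both questions without duplicating capacity_mw in storage is to derive fuel_mw at transform time by joining the normalized fuel table to the tech table. The sketch below is only an illustration under assumptions: the key columns in `KEYS`, the helper name `add_fuel_mw`, and the sample rows are hypothetical and have not been checked against the real table schemas, and percentages are assumed to be on a 0-100 scale.

```python
import pandas as pd

# Assumed primary-key columns shared by the fuel and tech tables.
KEYS = ["utility_id_eia", "state", "report_date"]

# Hypothetical tech table: one row per utility-state-year with total capacity.
tech = pd.DataFrame(
    {
        "utility_id_eia": [1, 2],
        "state": ["CO", "WY"],
        "report_date": pd.to_datetime(["2005-01-01", "2005-01-01"]),
        "capacity_mw": [40.0, 10.0],
    }
)

# Hypothetical normalized fuel table: one row per fuel class.
fuel = pd.DataFrame(
    {
        "utility_id_eia": [1, 1, 2],
        "state": ["CO", "CO", "WY"],
        "report_date": pd.to_datetime(["2005-01-01"] * 3),
        "fuel_class": ["gas", "oil", "gas"],
        "fuel_pct": [25.0, 75.0, 100.0],
    }
)


def add_fuel_mw(fuel: pd.DataFrame, tech: pd.DataFrame) -> pd.DataFrame:
    """Derive fuel_mw from fuel_pct using capacity_mw looked up in tech."""
    merged = fuel.merge(
        tech[KEYS + ["capacity_mw"]],
        on=KEYS,
        how="left",
        validate="many_to_one",  # each fuel row must match at most one tech row
    )
    merged["fuel_mw"] = merged["fuel_pct"] / 100 * merged["capacity_mw"]
    # Drop the borrowed column so capacity_mw is still stored only in tech.
    return merged.drop(columns="capacity_mw")


result = add_fuel_mw(fuel, tech)
```

With this approach the fuel table stays long and normalized, and capacity_mw remains in a single table. The trade-off is that fuel_mw becomes a derived value that depends on the join keys matching cleanly between the two tables.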

Will update the description for this field once we have more clarity on what to do here.

It helps to look at the column maps for the distributed_generation_eia861 table.

Labels: data-cleaning (Tasks related to cleaning & regularizing data during ETL), eia861 (Anything having to do with EIA Form 861)