Add NSSP secondary source #2074
base: main
Some small nitpicks, but overall looks good!
@@ -2,6 +2,11 @@

We import the NSSP Emergency Department Visit data, including percentage and smoothed percentage of ER visits attributable to a given pathogen, from the CDC website. The data is provided at the county level, state level and national level; we do a population-weighted mean to aggregate from county data up to the HRR and MSA levels.

There are 2 sources we grab data from for nssp:
Mention that there are some significant differences; I actually found that 4% of the data differed by more than 0.1. I would also mention this to logan/daniel/etc. as an FYI.
Specifically, hhs region 7 seems to be the least similar, at least in this run.
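For reference, a minimal sketch of how a figure like the 4% above could be computed. It assumes the comparison is an absolute difference between primary and secondary values matched on geo and date; the function name and the sample values are hypothetical and not taken from the pipeline.

```python
def fraction_exceeding(primary, secondary, threshold=0.1):
    """Return the share of shared keys whose values differ by more than threshold.

    Both arguments map a (geo_id, date) key to a signal value. Keys that
    appear in only one source are ignored (an assumption of this sketch).
    """
    shared = primary.keys() & secondary.keys()
    if not shared:
        return 0.0
    large = sum(1 for key in shared if abs(primary[key] - secondary[key]) > threshold)
    return large / len(shared)


# Hypothetical values, for illustration only.
primary = {
    ("ia", "2024-10-05"): 0.95,
    ("ks", "2024-10-05"): 1.20,
    ("mo", "2024-10-05"): 0.80,
    ("ne", "2024-10-05"): 0.50,
}
secondary = {
    ("ia", "2024-10-05"): 0.95,
    ("ks", "2024-10-05"): 1.25,
    ("mo", "2024-10-05"): 0.80,
    ("ne", "2024-10-05"): 0.75,
}

share = fraction_exceeding(primary, secondary)
```

Here only the ne row differs by more than 0.1, so the share is one in four. Grouping the same check by geo would show which regions diverge most.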
Yeah, it's expected that hhs shows wilder differences, since the primary source signal values at the hhs geo level were (1) aggregated from state-level data and (2) computed using 2020 population weights, rather than being published directly by the source like the equivalent secondary signals for hhs geos. The public-facing docs already mention this weighting, but I'll update them and link more info here once the pipeline is in a good spot.
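To make the weighting concrete, here is a rough sketch of a population-weighted mean from state values up to one hhs region. It is not the pipeline's actual code: the function name is hypothetical, and the values and populations are made-up round numbers, not real 2020 figures.

```python
def population_weighted_mean(values, populations):
    """Aggregate per-state values into one regional value, weighted by population.

    States with no population entry are skipped (an assumption of this sketch).
    Returns None when no state contributes any weight.
    """
    weighted_sum = 0.0
    total_population = 0.0
    for state, value in values.items():
        population = populations.get(state)
        if population is None:
            continue
        weighted_sum += value * population
        total_population += population
    if total_population == 0:
        return None
    return weighted_sum / total_population


# Hypothetical state-level percentages and populations (in millions).
state_values = {"ia": 1.0, "ks": 2.0, "mo": 3.0, "ne": 4.0}
state_populations = {"ia": 3, "ks": 3, "mo": 6, "ne": 2}

region_value = population_weighted_mean(state_values, state_populations)
```

A value built this way depends on the weights chosen, so it can drift from a regional figure that the source publishes directly.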
Add a comment in the details noting that there are some differences (in the sample run, 4% of values differed by more than 0.1), and then it should be good.
Description
Add 4 signals to nssp: pct_ed_visits_covid_secondary, pct_ed_visits_influenza_secondary, pct_ed_visits_rsv_secondary, pct_ed_visits_combined_secondary.
These secondary signals are available at geo levels state, hhs and nation.
The 4 signals come from a secondary nssp source dataset. They report essentially the same data as the main dataset, but are only available at state level and above. However, this secondary source appears to have been updating more quickly than the main source recently.
More info at DETAILS.md
Context: https://delphi-org.slack.com/archives/C0130CSQRN3/p1730476295781409
Things to think about:
Associated Issue(s)