Update `key_colnames`, `revision_summary` #540

brookslogan · 2024-10-10T17:01:35Z

Checklist

Please:

Make sure this PR is against "dev", not "main" (unless this is a release
PR).
Request a review from one of the current main reviewers:
brookslogan, nmdefries.
Make sure to bump the version number in DESCRIPTION. Always increment
the patch version number (the third number), unless you are making a
release PR from dev to main, in which case increment the minor version
number (the second number).
Describe changes made in NEWS.md, making sure breaking changes
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
1.7.2, then write your changes under the 1.8 heading).
See DEVELOPMENT.md for more information on the development
process.

Change explanations for reviewer

Make key_colnames.epi_archive output its unique key colnames (including version).
Make key_colnames.data.frame require other_keys to be passed.
Remove key_colnames.default.
Require exclude = to be passed by name in key_colnames.
Require dots to be empty in key_colnames.
- Note: this triggers some errors in epipredict tests. See below.
Update revision_summary to use new key_colnames.epi_archive.
Fix and tweak some tidyeval handling in revision_summary.
Tweak some naming in revision_summary.
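
For reviewers, here is a rough base-R sketch of the calling convention the list above describes. It is not the epiprocess implementation: `key_colnames_sketch`, the `archive_sketch` class, and the exact key ordering (`geo_value`, other keys, `time_value`, then `version`) are assumptions made for illustration only.

```r
# Hypothetical stand-in for the generic. `exclude` sits after `...`, so it can
# only be matched by name; a positional second argument lands in `...`.
key_colnames_sketch <- function(x, ..., exclude = character()) {
  UseMethod("key_colnames_sketch")
}

# Plain data frames carry no key metadata, so `other_keys` has no default here.
key_colnames_sketch.data.frame <- function(x, ..., other_keys,
                                           exclude = character()) {
  # Require empty dots: this also rejects `exclude` passed positionally.
  if (...length() != 0L) {
    stop("`...` must be empty; pass `other_keys =` and `exclude =` by name")
  }
  if (missing(other_keys)) {
    stop("`other_keys` must be provided for a plain data frame")
  }
  setdiff(c("geo_value", other_keys, "time_value"), exclude)
}

# Archive-like objects (assumed here to store `other_keys` as a list element):
# the unique key includes `version`.
key_colnames_sketch.archive_sketch <- function(x, ..., exclude = character()) {
  if (...length() != 0L) {
    stop("`...` must be empty; pass `exclude =` by name")
  }
  setdiff(c("geo_value", x$other_keys, "time_value", "version"), exclude)
}

# Example usage with toy inputs.
df <- data.frame(geo_value = "ca", age_group = "0-17", time_value = 1, value = 2)
key_colnames_sketch(df, other_keys = "age_group")

arch <- structure(list(other_keys = "age_group"), class = "archive_sketch")
key_colnames_sketch(arch, exclude = "version")
```

There is no default method in the sketch, mirroring the removal of key_colnames.default: calling it on an unsupported class fails at dispatch rather than guessing at key columns.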

`epipredict` errors

Originally, I misread these errors and thought they came from passing a (hopefully redundant) other_keys = into key_colnames.epi_df and having it ignored. I was planning to do something like:

  expected_other_keys <- attr(x, "metadata")$other_keys
  if (is.null(other_keys)) {
    other_keys <- expected_other_keys
  } else {
    if (!identical(other_keys, expected_other_keys)) {
      cli_abort(c(
        "`other_keys` was provided, but didn't match expectations from inspecting `x`",
        "*" = "`other_keys` was {format_chr_with_quotes(other_keys)}",
        "*" = "`expected_other_keys` was {format_chr_with_quotes(expected_other_keys)}",
        "i" = "If you resolve this discrepancy by adjusting the metadata of `x`, you
               shouldn't have to pass `other_keys =` here anymore unless you want to
               continue to perform this check."
      ))
    }
  }

However, the error is actually from passing extra_keys = (which was also previously ignored, but it sounds like it has a different meaning). I am looking a bit more into this.

todo: decide what to do re. extra_keys =
- idea for now: forbid + PR to epipredict to specify other_keys = instead. If current behavior of other_keys = with epi_dfs breaks later, then decide whether to change things here or adjust any offending steps/layers.
todo: check whether epipredict#410 is caught... requiring other_keys = in key_colnames.data.frame should be flagging this, right?
- It is. Also need to include fix in epipredict PR before merging this.
todo: prepare epipredict PR and get it merged

Other work

todo: tests for additional revision_summary() adjustments
todo: break into 2 dependent PRs

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

Resolves key_colnames returns wrong values for archive data #565
Resolves epi_archive's compactify doesn't support distributions #541
Resolves key_colnames.epi_archive: double-check intention, fix implementation #539

* Make `key_colnames.epi_archive` output epikey-time-version rather than just epikey-time.
* Make `key_colnames.data.frame` require `other_keys` be provided.
* Remove `key_colnames.default`.
* Make `key_colnames` forbid passing `exclude` positionally.
* Update downstream `revision_summary`.

* Produce error rather than default selection when user provides a tidyselection in ... but it selects zero columns.
* Change time_within_x_latest to take `values` as a vector
* Use `.data` instead of `pick` etc. in some places


brookslogan added 4 commits October 9, 2024 16:37

fix(revision_summary): use selected value col, not last col

ec684f5

Clarify time_near_latest -> lag_near_latest

052854f

So it is not misinterpreted as "the amount of time that it has been near the latest".

brookslogan force-pushed the lcb/update-key_colnames.epi_archive branch from 97fdc29 to 052854f Compare October 10, 2024 19:27

brookslogan added 5 commits October 18, 2024 12:51

key_colnames: +flexible on dfs, +rigid on edfs, +tsibble, +archive

9f75e58

fix(revision_summary): consider units&class in lag filter

e994f52

and avoid unnecessary `abs()`

Fix compactification with dist_quantiles columns

839e921

Make extra_keys = into soft "deprecation" of a different behavior

34acd7e

to ease epipredict transition.

Tweak key_colnames.epi_df(other_keys =) mismatch error text

1353df9

brookslogan force-pushed the lcb/update-key_colnames.epi_archive branch from 592c3a2 to 1353df9 Compare October 22, 2024 21:24

brookslogan mentioned this pull request Oct 28, 2024

pass-through on revision_analysis docs #557

Open

brookslogan mentioned this pull request Nov 15, 2024

key_colnames returns wrong values for archive data #565

Open
