Fold fre pp rename-split into fre pp split-netcdf --rename#783
Fold fre pp rename-split into fre pp split-netcdf --rename#783Copilot wants to merge 12 commits intorename-splitfrom
fre pp rename-split into fre pp split-netcdf --rename#783Conversation
…ests Co-authored-by: ilaflott <[email protected]>
Co-authored-by: ilaflott <[email protected]>
fre pp rename-split into fre pp split-netcdf --rename
|
Oh, this PR should be targeting edit: got it |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## rename-split #783 +/- ##
================================================
- Coverage 84.04% 83.40% -0.64%
================================================
Files 71 71
Lines 4969 5057 +88
================================================
+ Hits 4176 4218 +42
- Misses 793 839 +46
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
…/tmp_path, move CLI tests Co-authored-by: ilaflott <[email protected]>
Co-authored-by: ilaflott <[email protected]>
Addressed all four review comments: moved rename logic into |
|
@copilot 4 spaces for indentation, not 2 |
Co-authored-by: ilaflott <[email protected]>
Converted |
|
These comments aren't tied to a specific line and tend to be more about design decisions, so I'm putting them here, rather than in the main review section:
|
cwhitlock-NOAA
left a comment
There was a problem hiding this comment.
This duplicates my comment from the main PR, but there are some design decisions we should talk about:
-
In the meeting that preceded this, one of the advantages of this approach that came up was doing the renaming + rewriting as a single file touch. That's not what the copilot rewrites are doing - it's splitting as the default and then copying. I can see that making sense from a code legibility standpoint, but it should be a decision by humans. It's also going to take a more comprehensive rewrite, since rename_file() in rename_script.py assumes that we have a preexisting file from split-netcdf to work with - it's splitting the filename and opening the file to get dates to parse.
-
We still have two separate files for tests of renaming and splitting: fre/pp/tests/test_split_netcdf_rename.py and fre/pp/tests/test_split_netcdf.py . That decision could make sense if we decided that certain options never get exercised with each other (e.g. --rename and --vars), but if that's the case we need code to make sure those options are mutually exclusive. I am also pretty sure we don't want that - the var list options + --rename is needed for adding variable filtering in the yamls.
-
Tests for get_freq_and_format_from_two_dates and get_duration_from_two_dates in rename_script.py need both examples that let you verify failed cases (these are not valid dates) and a slightly wider selection of tests (more calendars, documenting behavior on some edge cases). I'm happy to take on that bit. We should also consider renaming that file to something like "split and rename helper functions" if the main renaming functionality is moving to a different file.
-
Copilot has eliminated a LOT of tests from fre/pp/tests/test_split_netcdf_rename.py . I'm not as familiar with that part of the code, but it looks like the total cases for testing/determining frequency were reduced a lot. @ceblanton could you check on the reductions there?
Clarified the example of the nested directory structure for output files.
Added a comment to clarify the purpose of the fixture.
we're humans and we are here to make those decisions! and good catch, we'd rather it just write, rather than write + copy + delete an old copy.
agreed, should be one file here.
this is out of scope of this PR, those exceptions should get covered in #717, which this PR is targeting. i.e. this PR is a sub-PR of #717
no this is not the right take- this file did not exist before, so these tests were not taken away. i did shuffle some tests out of that file and into |
|
@copilot respond to feedback, it's paramount that the |
Co-authored-by: ilaflott <[email protected]>
Co-authored-by: ilaflott <[email protected]>
Co-authored-by: ilaflott <[email protected]>
Refactored |
ceblanton
left a comment
There was a problem hiding this comment.
Wow. I think this is what we want, yes!
| def split_netcdf(file, outputdir, variables): | ||
| @click.option('-r', '--rename', is_flag=True, default=False, | ||
| help='After splitting, rename output files into a nested directory structure ' | ||
| 'organized by frequency and duration under $outputdir.') |
There was a problem hiding this comment.
"During splitting", not "after splitting", to make it clear it's a single operation not two.
| ''' fre pp split-netcdf --help includes --diag-manifest option ''' | ||
| result = runner.invoke(fre.fre, args=["pp", "split-netcdf", "--help"]) | ||
| assert result.exit_code == 0 | ||
| assert "--diag-manifest" in result.output |
There was a problem hiding this comment.
This and the new test above it do not test anything other than the existence of the click options!
Line 104-117 of https://github.com/NOAA-GFDL/fre-cli/blob/rename-split/fre/pp/tests/test_rename_split_to_pp.py (rename-split) contain a set of parameterized tests that test a bunch of frequency/duration pairings. I am not seeing anything similar in https://github.com/NOAA-GFDL/fre-cli/blob/73a872d2c5632e2747d78f51e60abfba42275a96/fre/tests/test_fre_pp_cli.py . |
I take that back - the tests are still in lines 104-117 of https://github.com/NOAA-GFDL/fre-cli/blob/73a872d2c5632e2747d78f51e60abfba42275a96/fre/pp/tests/test_rename_split_to_pp.py . I got thrown off because that file wasn't showing as part of the changed files...which it wouldn't, if it was being left alone. My concern is now that we've currently got 3 different files with tests for split-netcdf. Can we consolidate those into a single file? |
Describe your changes
Takes the
fre pp rename-splitfunctionality from therename-splitbranch and folds it intofre pp split-netcdfvia a--renameflag. Without--rename,split-netcdfbehaves as before.fre/pp/split_netcdf_script.pyrenameanddiag_manifestparameters tosplit_file_xarray(): whenrename=True, each split file is written directly to its final nestedcomponent/freq/duration/path (no intermediate flat file, no copy, no delete)_compute_renamed_path()helper that uses an in-memory time-decoded dataset to determine frequency, duration, and date range before the write, enabling a single file touch per variabletry/finallyfor proper resource cleanup of the decoded datasetsplit_file_xarrayto 4-space indentation (PEP 8)fre/pp/frepp.py--rename(-r) flag and--diag-manifest(-d) option to thesplit-netcdfclick command, passing them through tosplit_file_xarray()fre pp rename-splitcommandfre/tests/test_fre_pp_cli.py— CLI tests viaCliRunner:--renameand--diag-manifestappear in help outputsplit-netcdf --rename(parametrized for timeseries and static cases)--rename→ flat output)split_rename_ncgenfixture andtmp_pathfor output directoriesfre/pp/tests/test_split_netcdf.py— Unit tests via standardimport(merged into existing test file):split_file_xarraywithrename=Truefor timeseries and static datasplit_file_xarraywithout renamencgen_setuppytest fixture andtmp_pathfor output directoriesIssue ticket number and link (if applicable)
Checklist before requesting a review
Original prompt
fre pp split-netcdf --renamecallsfre pp rename-splitfunctionality #782📱 Kick off Copilot coding agent tasks wherever you are with GitHub Mobile, available on iOS and Android.