Mhs/das 2216/quick fixes #19

Merged - 25 commits - Oct 30, 2024
Conversation

@joeyschultz (Collaborator) commented Oct 17, 2024

Description

This PR updates the Swath Projector to add support for TEMPO Level 2 data.

Jira Issue ID

DAS-2216

Local Test Steps

Check out this branch, build the Docker images, and run the test image.

./bin/build-image && ./bin/build-test && ./bin/run-test

Ensure the tests pass (one test is being skipped due to changes in this PR).

Ran 88 tests in 27.973s

OK (skipped=1)

Start up Harmony-In-A-Box. If you already have it running, reload Harmony-In-A-Box to pick up this latest image.

./bin/restart-services

Download the "TEMPO_O3TOT_L2_example.ipynb" notebook from DAS-2216 and run it to test the swath-projector in Harmony-In-A-Box. The notebook should run successfully.

Open the output file in Panoply and inspect it to ensure the geographic reprojection is correct.

Note: The following variables from the input are purposefully excluded via the varinfo configuration, and will be missing from the output because their dimensions do not match those of /geolocation/longitude and /geolocation/latitude (a sketch of such an exclusion follows the list):

  • /support_data/a_priori_layer_o3
  • /support_data/cal_adjustment
  • /support_data/dNdR
  • /support_data/layer_efficiency
  • /support_data/lut_wavelength
  • /support_data/N_value
  • /support_data/N_value_residual
  • /support_data/ozone_sensitivity_ratio
  • /support_data/step_1_N_value_residual
  • /support_data/step_2_N_value_residual
  • /support_data/temp_sensitivity_ratio
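
For reference, this kind of exclusion is expressed in the varinfo configuration with an ExcludedScienceVariables entry, roughly as sketched below (the Applicability values are illustrative assumptions, not copied from the repository's earthdata_varinfo_config.json):

    {
      "ExcludedScienceVariables": [
        {
          "Applicability": {
            "Mission": "TEMPO",
            "ShortNamePath": "TEMPO_O3TOT_L2"
          },
          "VariablePattern": [
            "/support_data/a_priori_layer_o3",
            "/support_data/cal_adjustment",
            "/support_data/dNdR",
            "/support_data/layer_efficiency",
            "/support_data/lut_wavelength",
            "/support_data/N_value",
            "/support_data/N_value_residual",
            "/support_data/ozone_sensitivity_ratio",
            "/support_data/step_1_N_value_residual",
            "/support_data/step_2_N_value_residual",
            "/support_data/temp_sensitivity_ratio"
          ]
        }
      ]
    }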

PR Acceptance Checklist

  • Jira ticket acceptance criteria met.
  • CHANGELOG.md updated to include high level summary of PR changes.
  • docker/service_version.txt updated if publishing a release.
  • Tests added/updated and passing.
  • Documentation updated (if needed).

@flamingbear (Member) left a comment:

I won't approve or request changes since I'm leaving, but I did leave a bunch of comments and questions.

CHANGELOG.md (Outdated):

    [v1.1.0]:(https://github.com/nasa/harmony-swath-projector/releases/tag/1.0.1)
    [v1.0.1]:(https://github.com/nasa/harmony-swath-projector/releases/tag/1.0.1)
    [v1.0.0]:(https://github.com/nasa/harmony-swath-projector/releases/tag/1.0.0)
    [v1.1.1]: (https://github.com/nasa/harmony-swath-projector/releases/tag/1.1.0)

Member:

You have to add your release here yourself.

Member:

For all I know this was me... but still.

Collaborator (Author):

Fixed in commit 735894d.

Comment on lines 97 to 100:

    -        logger.error(f'Cannot find "{dataset_file}".')
    -        raise MissingReprojectedDataError(variable_name)
    +        # QuickFix (DAS-2216) Ignore missing reprojections
    +        logger.error(f'Not Including "{variable_name}" in "{output_file}".')
    +        # raise MissingReprojectedDataError(variable_name)

Member:

I'm pretty sure this was me, but I left this branch without cleaning it up.

I'd get rid of the logger.error line and the commented raise line.

Collaborator (Author):

Fixed in commit 735894d.

         else:
             # Assumption: Array = (y, x)
    -        return variable[:].filled(fill_value=fill_value)
    +        return transpose_if_xdim_less_than_ydim(

Member:

I have no idea if that assumption is still correct, same with L45.

Contributor:

I think it is a reasonable assumption (transpose to get xdim < ydim) as a default behavior. A follow-up ticket on this point is to allow for a configuration entry that explicitly defines the along-track vs. across-track dimension. In that ticket I am proposing a swath feature metadata variable, similar to grid_mapping, that provides this definition - trying to move towards convergible metadata standards as discussed in the convergence tiger team.

@flamingbear (Member) commented Oct 18, 2024:

I was talking about the line that says (and if it's not true, it should be removed):

        # Assumption: Array = (y, x)

Contributor:

Perhaps clarify the comment to "First assume Array = (along-track, across-track), but transpose if along-track.size < across-track.size." Similarly for line 45.

Collaborator (Author):

To keep it more concise, I thought it would be best to modify the comments to:

    line 45: # Assumption: Array = (time, along-track, across-track)
    line 50: # Assumption: Array = (along-track, across-track)

Fixed in commit 78dd9df.
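
For context, a minimal sketch of what a helper with these assumptions could look like (an illustration based on the discussion in this thread, not the merged implementation):

    import numpy as np

    def transpose_if_xdim_less_than_ydim(variable):
        """Sketch only. Assumption: Array = (along-track, across-track).
        A 'short and wide' swath (fewer rows than columns) is transposed
        so that the larger, along-track dimension comes first; NumPy
        masked-array transposes normally carry the mask with the data.
        """
        if variable.ndim == 2 and variable.shape[0] < variable.shape[1]:
            return np.ma.transpose(variable).copy()
        return variable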

    @@ -62,7 +66,12 @@ def get_coordinate_variable(
             if coordinate_substring in coordinate.split('/')[-1] and variable_in_dataset(
                 coordinate, dataset
             ):
                 return dataset[coordinate]
    +    # QuickFix (DAS-2216) for short and wide swaths

Member:

Is this comment in the right place still? They are in the code so that they can be removed when the quick fixes are actual fixes.

Collaborator (Author):

Lines 70-73 are all part of the quickfix, so I think the comment is where we want it.

Contributor:

I'm actually on the fence about this, as I suspect these are essentially default behaviors and not necessarily going away with the quick fix. That said, there is at least the case of setting rows-per-scan = 2 that will need follow-up modification. We can go with it for now, but we do need to follow up.

    transposed_variable = np.ma.transpose(variable).copy()
    if hasattr(variable, 'mask'):
        transposed_variable.mask = np.ma.transpose(variable[:].mask)
    return transposed_variable

Member:

Couldn't this be deleted, and just let the var be returned on L246?

Collaborator (Author):

Fixed in commit 735894d.

Resolved threads: swath_projector/utilities.py, tests/unit/test_nc_merge.py

Comment on lines 402 to 413:

    def test_masked_array(self):
        # Test case with a masked array
        input_array = np.ma.array(
            [[1, 2, 3], [4, 5, 6]], mask=[[True, False, False], [False, True, False]]
        )
        expected_output = np.ma.array(
            [[1, 4], [2, 5], [3, 6]],
            mask=[[True, False], [False, True], [False, False]],
        )
        result = transpose_if_xdim_less_than_ydim(input_array)
        np.testing.assert_array_equal(result, expected_output)
        np.testing.assert_array_equal(result.mask, expected_output.mask)

Member:

Didn't I have this test in before you added the mask transpose that I was asking about above? I guess two questions: why didn't this fail before? And why does it still work with your changes to the function? (Three questions: where's the test for the changes you made to the function?)

Contributor:

This test wasn't there, but the test get_variable_values was, and it did fail until it was revised to handle the transpose. The transpose operation posed some challenges during development, when an early implementation lost the mask, so this test was added to verify that the transpose does not lose or change the mask, and that the mask is transposed along with the base data. This should be implicit in NumPy masked-array operations, but - as noted - it didn't always work as expected.
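
As a quick illustration of the behavior under discussion, np.ma.transpose normally carries the mask along with the data, matching the expectation in the test above:

    import numpy as np

    data = np.ma.array(
        [[1, 2, 3], [4, 5, 6]],
        mask=[[True, False, False], [False, True, False]],
    )
    transposed = np.ma.transpose(data)

    print(transposed.shape)  # (3, 2)
    print(transposed.mask)   # [[True False], [False True], [False False]]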

@flamingbear (Member) commented Oct 18, 2024:

Re: "This test wasn't there"

I'm asking about this branch that I handed off to Joey. If you look at commit 7622aa7, you will see that I added these tests, including this masked one I'm asking about. So my questions still stand, and I still wonder if the masks are being applied properly.

Collaborator (Author):

Matt, you are correct that this test existed and passed before I made the modifications in commit 72223b9 to how the masks are applied after the transpose. I'm trying to sort out why, in the early implementation, the transpose operation did not lose the mask in this unit test, but was losing the mask for the coordinate variables when running the granule through the Swath Projector.

Contributor:

Given the test failures, the updates seem reasonable. If we see a quick resolution and fix here, let's go for it; otherwise, we know this implementation works, and we can go with that.

@lyonthefrog (Collaborator) left a comment:

Overall, I think the changes look clean! I added a few questions and comments with minor fix suggestions. I'm not terribly familiar with the pyresample library, so I couldn't give a super thorough review of the bits that use it.

I ran the unit tests and they succeeded, along with the expected skipped test. I ran the TEMPO_O3TOT_L2_example.ipynb notebook, and the output is a little confusing to me. This is the first time I've looked at the Swath Projector, though, so it's likely I just haven't fully grasped what it's doing yet.

[Image: input-output-tempo]

I know one of the acceptance criteria is "Lat/Lon coordinates are configured and understood to be located in the /geolocation group," but I'm confused as to why the coordinate variables were all removed from /geolocation, with lat/lon-looking datasets added to the root group. Should this request have moved/deleted/added variables like this?

Also, I wasn't sure how to check whether or not the output was successfully reprojected, but I plotted /geolocation/solar_azimuth_angle. Is this expected?

[Image: input-output-tempo-plot]

Resolved threads: docs/TEMPO_O3TOT_L2_example.ipynb, swath_projector/earthdata_varinfo_config.json, swath_projector/exceptions.py, swath_projector/utilities.py, tests/unit/test_utilities.py, CHANGELOG.md, bin/project_local_granule.py

@owenlittlejohns (Member) left a comment:

Generally - nice work. In particular, I really appreciate the additional unit tests that really kick the tyres of the new function you added.

The changes I'm requesting here:

  • Primarily: the configuration rule needs to be tightened up - it applies to all variables in the /support_data group, but not all of those variables have the same dimensions as the coordinate variables. (You may need to override the coordinates for those variables, too, to set them to None, so that the Swath Projector doesn't think they are science variables.)
  • After doing the above, try re-enabling the removed MissingReprojectedDataError. Tweaking the configuration file rule might mean that things work out, but if they don't, we need to know how to re-enable that exception and its associated test, and document that in the test suite. (We also need to plan how to get things back to normal before completing this feature.)
  • Simplifying the use of np.ma.transpose - I'm only messing around locally, but it looks like it can be a lot simpler.

Resolved threads: bin/project_local_granule.py, swath_projector/earthdata_varinfo_config.json, swath_projector/nc_merge.py, tests/unit/test_utilities.py, docs/TEMPO_O3TOT_L2_example.ipynb

@owenlittlejohns (Member) commented:

Also, I just spotted that there are some pre-commit failures for your PR (see the link here). Do you have pre-commit configured locally? If not, there are some instructions in the README.md.

@owenlittlejohns (Member) left a comment:

Thanks for the updates, I think this is pretty close.

I like the ExcludedScienceVariables approach for the configuration file - that definitely saves the creation of a really complicated regular expression.

I've left another couple of comments on the whole transposing thing (including that the tests need to tweak their inputs/outputs a bit). The only other thing I had was a tiny nit (which you can ignore if you think I'm being overly verbose).

Resolved threads: swath_projector/utilities.py, swath_projector/earthdata_varinfo_config.json, tests/unit/test_nc_merge.py, tests/unit/test_utilities.py, CHANGELOG.md

@owenlittlejohns (Member) left a comment:

Warning - mixed bag ahead!

Good stuff:

  • The code changes look good. Thanks for bearing with the rabbit hole with the transpose stuff!
  • I built the service image locally, and the unit tests all passed.
  • I stood up my local Harmony instance, ran the notebook from DAS-2216 and the TEMPO L2 Ozone request completed successfully.

Interesting(?) stuff:

  • The output file from the notebook on DAS-2216 has three unexpected variables: /geolocation/lat_1, /geolocation/lon_1 and /geolocation/latitude_longitude_lat_1_lon_1 (the last one is a grid mapping variable). None of the variables in the file refer to these 1-D dimension variables or the grid mapping variable, so they seem to be things that shouldn't be in the output. Looking at the Docker logs, it looks like the calculations for resolution etc. are happening for /product/quality_flag and /support_data/ground_pixel_quality_flag - but both use the same coordinates in the source file, and they end up using lat and lon in their output, so I don't know quite why this is happening without digging in a bit more.

Worrying stuff:

    Reprojection failed with error: No variables could be reprojected

Digging into the Docker logs, I see something more helpful (this is from interpolation.py L163, which is called from interpolation.py L78):

    ValueError: EWA requires 2 or more rows_per_scan and must be a factor of the total number of input rows

I think what this is likely telling us is that for arrays with an odd number of rows, this change is going to fail.

@D-Auty commented Oct 28, 2024:

I think the quick fix for rows-per-scan = 2 causing failures for dimension extents that are not even is to implement a least-integral-divisor function (see below) and use that to determine rows-per-scan. I suspect a better answer is to allow rows-per-scan = 1 and artificially extend the rows by, e.g., duplicating the last row if necessary - but this requires some experimentation to see if it is acceptable in all cases. (I'm surprised pyresample doesn't offer a built-in solution for rows-per-scan = 1, since this would be a very common use case.) Experimentation to date has shown that using 2 for TEMPO data (which is even) has given better results. A minimal acceptable value for rows-per-scan seems appropriate.

def least_integral_divisor(n):
    """
    Finds the least integral divisor of a number n.
    """
    if n < 2:
        return 1  # 1 is the least divisor for 1 and 0
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return i
    return 0  # If no divisor is found, return 0 to imply: "use all rows"

@owenlittlejohns (Member) commented Oct 28, 2024:

Okay - it sounds like we have a couple of options:

  • Leave the previous implementation for all rows per scan, including for TEMPO data. This looks to have poorer results from David's testing. (So not preferred.)
  • Use the function David suggests above to get the smallest divisor. But I would recommend changing the final return value to be the number itself rather than zero, i.e., defaulting to all rows per scan. That would maybe mean the function would be better named something like get_rows_per_scan (see the sketch below). (Minor nitpick: Python style guides like PEP 8 recommend more verbose variable names - I think three characters is the recommended minimum length - so n and i would need to be renamed.)
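
For concreteness, that second option might look roughly like this (an illustrative sketch of the suggestion, not necessarily the code that was merged):

    def get_rows_per_scan(total_rows: int) -> int:
        """Return the smallest divisor of total_rows that is at least 2.
        If total_rows is prime, fall back to total_rows itself, i.e.,
        treat all rows as a single scan.
        """
        if total_rows < 2:
            return 1
        for divisor in range(2, int(total_rows**0.5) + 1):
            if total_rows % divisor == 0:
                return divisor
        # total_rows is prime: use all rows as one scan.
        return total_rows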

That leaves a couple of things:

  • The extraneous 1-D grid dimensions and grid mapping variables in the output (/geolocation/lat_1, /geolocation/lon_1 and /geolocation/latitude_longitude_lat_1_lon_1). What's the thinking for those?
  • Using the second option above (finding the minimum factor that is 2 or more), which sounds good to me, the reference files in the Swath Projector regression tests may need updating. Obviously, that's a separate PR, but I don't want to lose track of that as a thing to be done.

@D-Auty commented Oct 28, 2024:

The extra lat/lon/grid-mapping (_1) variables may be quite relevant in other use cases - e.g., global and polar datasets with different grid mappings (if SMAP were a swath, from when I did some testing). So I am not clear on why we did not get references to those extra lat/lon and grid-mapping variables, and/or why we did get the extra variables when they are not referenced. More research is required. In the meantime, can we merge this PR with a known issue, as it is not failing and the outputs are valid, if not quite as desired?

@owenlittlejohns (Member) commented:

Re: "can we merge this PR with a known issue"

I think that answer would probably have to come from the TRADE team saying whether they think the output is fine as it is. The additional grid mapping and 1-D dimension variables don't seem to be doing any harm; it's just that there's no real reason for them to be there. I could be convinced to merge things with those extra variables being produced, but we should probably pop a ticket in the cloud backlog to analyze/fix this.

@joeyschultz (Collaborator, Author) commented:

The latest commit bde29b6 implements the get_rows_per_scan function. I ran the Swath Projector Regression Test notebook and it is now completing the request, but failing on the comparison to the reference file, as Owen expected.

@joeyschultz (Collaborator, Author) commented Oct 28, 2024:

I've been analyzing why /geolocation/lat_1, /geolocation/lon_1 and /geolocation/latitude_longitude_lat_1_lon_1 are being produced. The science variables in the /geolocation group are not currently getting the MetadataOverride for their coordinates, so in resample_variable they have a unique coordinates_key of ('/geolocation/latitude', '/geolocation/longitude', '/geolocation/time'), while the /product and /support_data group science variables have a coordinates_key of ('/geolocation/latitude', '/geolocation/longitude'). The result is a separate target_area object being created for the geolocation variables, and two target_area objects in the reprojection_cache dictionary.

The downstream effect is that in nc_single_band.write_dimensions there is a "cache miss" on the dimensions for the ('/geolocation/latitude', '/geolocation/longitude', '/geolocation/time') coordinates_key, causing the extra lat/lon/grid-mapping (_1) dimensions to be created.

Changing the coordinates MetadataOverride VariablePattern to the following appears to fix this:

    "VariablePattern": "^/product/.*|^/support_data/.*|^/geolocation/(?!time$).*"

This ignores the /geolocation/time variable so it doesn't get recognized as a science variable, and applies the override to the rest of the /geolocation science variables.
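
In context, that pattern would sit in a varinfo MetadataOverrides entry along roughly these lines (an illustrative sketch - the Applicability values and the coordinates attribute value are assumptions, not copied from the repository's configuration):

    {
      "MetadataOverrides": [
        {
          "Applicability": {
            "Mission": "TEMPO",
            "ShortNamePath": "TEMPO_O3TOT_L2",
            "VariablePattern": "^/product/.*|^/support_data/.*|^/geolocation/(?!time$).*"
          },
          "Attributes": [
            {
              "Name": "coordinates",
              "Value": "/geolocation/latitude, /geolocation/longitude"
            }
          ]
        }
      ]
    }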

@owenlittlejohns (Member) commented:

Ahhhh! That's what it is with the two slightly different coordinates values! Nice work!

@owenlittlejohns (Member) left a comment:

Nice work!

The only change I'd suggest is possibly raising an exception when the number of rows per scan is less than 2 (because I think we know that will fail), but because an exception gets raised elsewhere, that's not a deal breaker for me. It's mainly just making sure that whatever does get raised is informative to an end user.

I guess the next step is to re-run the Swath Projector regression test notebook and update the reference files there (after checking the new versions look reasonable and make sense).

        found, return the total number of rows.
        """
        if total_rows < 2:
            return 1

Member:

A question I have here is whether returning 1 is the right thing to do when the number of rows per scan is less than 2. Won't the EWA algorithm fail if we ask it to use fewer than 2 rows? If so, perhaps a better approach is to raise an exception here saying there aren't enough rows. The block of code that calls pyresample.ewa.fornav will raise an exception already, just potentially a less informative one.

(I'm not going to hold the PR up over this, as there's already an exception being raised later on, but it felt worth thinking a bit about - my thinking is that if we know a return value of < 2 is problematic, then we should catch the problem as soon as we know about it.)
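
A minimal sketch of the guard being suggested (hypothetical - the merged helper returns 1 rather than raising):

    def get_rows_per_scan(total_rows: int) -> int:
        """Fail-fast variant: raise immediately when EWA cannot work."""
        if total_rows < 2:
            # Surface an informative error now, rather than letting
            # pyresample.ewa.fornav fail later with a less helpful one.
            raise ValueError(
                f'EWA requires at least 2 input rows; got {total_rows}.'
            )
        for divisor in range(2, int(total_rows**0.5) + 1):
            if total_rows % divisor == 0:
                return divisor
        return total_rows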

        np.testing.assert_array_equal(result.mask, expected_output.mask)


class TestGetRowsPerScan(TestCase):

Member:

Nice tests - I think these cover all the cases.

@joeyschultz merged commit 5c97edb into main on Oct 30, 2024 - 4 checks passed.
@joeyschultz deleted the mhs/DAS-2216/quick-fixes branch on October 30, 2024 at 17:37.