Contributor

@daniel-zamora daniel-zamora commented Nov 18, 2025

Overview

What is the objective?

Fix incorrect granule counts when searching collections with MultiPolygon shapefiles. The collections endpoint with include_granule_counts=true was returning 0 or incorrect counts for MultiPolygon shapefiles, while the granules endpoint correctly found matching granules. This occurred because MultiPolygon OR semantics were being broken during spatial condition extraction, causing the query to incorrectly require granules to match ALL polygons instead of ANY polygon.
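The difference between ANY and ALL semantics can be shown with a small standalone model. This is a sketch only, not the CMR query code: rectangles stand in for the polygons of a MultiPolygon, the rectangle coordinates are made up for illustration, and the names in-rect?, multipolygon, and count-matching are hypothetical.

```clojure
;; Illustrative model only: rectangles stand in for the polygons of a
;; MultiPolygon, and each granule is a single [lon lat] point.
(defn in-rect?
  "True when point [lon lat] lies inside the rectangle map."
  [{:keys [west east south north]} [lon lat]]
  (and (<= west lon east)
       (<= south lat north)))

;; Two disjoint rectangles (hypothetical coordinates, not the test fixture).
(def multipolygon
  [{:west 10.0 :east 30.0 :south 0.0 :north 10.0}
   {:west 40.0 :east 50.0 :south -25.0 :north -15.0}])

;; One point in each rectangle and one point outside both.
(def granules
  [[20.0 5.0] [47.0 -20.0] [-100.0 40.0]])

(defn count-matching
  "Counts points for which (combine pred polygons) is truthy."
  [combine polygons points]
  (count (filter (fn [pt] (combine #(in-rect? % pt) polygons)) points)))

;; OR semantics: a granule matches if it falls in ANY polygon.
(def or-count (count-matching some multipolygon granules))

;; AND semantics (the bug described above): a granule must fall in ALL polygons.
(def and-count (count-matching every? multipolygon granules))
```

With disjoint polygons no point can lie in all of them, so the AND form in this model counts nothing, while the OR form counts the two points that fall inside either rectangle.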

What are the changes?

Refactored the extract-spatial-conditions function to preserve OR group structure for MultiPolygon shapefiles.

What areas of the application does this impact?

  • Collection search with granule counts: /search/collections?include_granule_counts=true with MultiPolygon shapefiles
  • Spatial query processing: How OR groups from MultiPolygon geometries are handled in granule counting logic

Required Checklist

  • New and existing unit and int tests pass locally and remotely
  • clj-kondo has been run locally and all errors in changed files are corrected
  • I have commented my code, particularly in hard-to-understand areas
  • I have made changes to the documentation (if necessary)
  • My changes generate no new warnings

Additional Checklist

  • I have removed unnecessary/dead code and imports in files I have changed
  • I have cleaned up integration tests by doing one or more of the following:
    • migrated any are2 tests to are3 in files I have changed
    • de-duped, consolidated, removed dead int tests
    • transformed applicable int tests into unit tests
    • reduced number of system state resets by updating fixtures. Ex) (use-fixtures :each (ingest/reset-fixture {})) to be :once instead of :each

Summary by CodeRabbit

  • New Features

    • Spatial condition extraction now preserves OR-grouped spatial clauses when computing granule counts.
    • Added multipolygon GeoJSON/shapefile support for granule-count searches.
  • Tests

    • Added comprehensive unit tests for OR-group preservation, mixed contexts, duplicates, validation, and multipolygon extraction.
    • Added integration tests validating shapefile/multipolygon granule-count behavior and matching collection-level counts.



coderabbitai bot commented Nov 18, 2025

Walkthrough

Detects OR groups composed solely of SpatialCondition nodes and preserves those OR groups during spatial-condition extraction; adds unit tests, a MultiPolygon GeoJSON fixture, and an integration test that verifies shapefile-based granule counts.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Core spatial extraction: search-app/src/cmr/search/services/query_execution/granule_counts_results_feature.clj | Adds private helper is-spatial-or-group? and updates extract-spatial-conditions to preserve OR groups that contain only SpatialCondition nodes while also extracting ungrouped SpatialCondition instances with parent-path validation. |
| Unit tests: search-app/test/cmr/search/test/unit/services/query_execution/granule_counts_results_feature_test.clj | Adds tests and helper make-spatial-condition covering is-spatial-or-group? and extract-spatial-conditions across OR/AND groups, mixed/nested contexts, duplicates, parent-path validation, negation validation, and edge cases. |
| Integration test resource: system-int-test/resources/shapefiles/multipolygon_test.geojson | Adds a GeoJSON FeatureCollection containing a MultiPolygon composed of two rectangular polygons for integration testing. |
| Integration test: system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj | Adds multipolygon-shapefile-granule-counts-test: enables shapefile flag, ingests MultiPolygon granules, waits for indexing/reindex/cache refresh, posts shapefile multipart requests, and asserts granule search hits and collection-level include_granule_counts results match. |

Sequence Diagram(s)

sequenceDiagram
    participant Q as Query AST
    participant E as extract-spatial-conditions
    participant V as path-validator
    participant R as Result

    Note over Q,E: Preserve OR groups composed only of SpatialCondition nodes

    Q->>E: provide query tree
    E->>E: traverse tree, identify OR groups & SpatialCondition nodes
    alt OR group contains only SpatialCondition
        E->>V: validate group's parent path(s)
        V-->>E: validated
        E->>R: include entire OR group in results
    else single/ungrouped SpatialCondition
        E->>V: validate condition's parent path
        V-->>E: validated
        E->>R: include single SpatialCondition in results
    end
    Note right of R: returns concatenated [spatial OR groups + ungrouped SpatialConditions]

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay extra attention to:
    • Correct detection of OR vs AND groups and nesting rules.
    • Validation logic for parent paths and negation handling.
    • Unit tests covering duplicates and preserved OR-group structure.
    • Integration test timing and shapefile multipart handling.

Suggested labels

hacktoberfest-accepted

Suggested reviewers

  • DuJuan
  • jmaeng72
  • eereiter

Poem

🐰 I hopped through ORs where polygons play,
I kept whole groups safe so shapes could stay,
Shapefiles sent and granules counted two,
I twitched my nose — the tests ran true,
A carrot cheer for code that's new! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main issue being fixed: incorrect granule counts for MultiPolygon shapefiles when comparing collections endpoint with include_granule_counts vs granules endpoint. |
| Description check | ✅ Passed | The description covers all required template sections: objective explains the MultiPolygon OR semantics issue, changes describe the extract-spatial-conditions refactoring, impacted areas are listed, and all required and additional checklist items are marked complete. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (1)

757-794: MultiPolygon shapefile test is solid; consider resetting the shapefile flag to avoid state leakage.

The test correctly:

  • enables the shapefile parameter,
  • ingests granules inside and outside the MultiPolygon,
  • compares direct granule hits to collection-level include_granule_counts.

One concern: set-enable-shapefile-parameter-flag! is set to true via side/eval-form and never reset, which can leak configuration into subsequent system-int tests.

Wrapping the body in a try/finally and resetting the flag keeps the test self-contained:

 (deftest multipolygon-shapefile-granule-counts-test
-  (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! true))
-
-  (let [coll (make-coll 1 m/whole-world nil)]
-    ;; Granule in first polygon of MultiPolygon
-    (make-gran coll (p/point 20.0 5.0) nil)
-    ;; Granule in second polygon of MultiPolygon
-    (make-gran coll (p/point 47.0 -20.0) nil)
-    ;; Granule outside MultiPolygon
-    (make-gran coll (p/point -100.0 40.0) nil)
-    ...
-    (testing "MultiPolygon shapefile granule counts match direct granule search"
-      ...
-      (is (gran-counts/granule-counts-match? :xml {coll granule-hits} collection-response)
-          "MultiPolygon shapefile: collection granule counts should match granule search hits"))))))
+  (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! true))
+  (try
+    (let [coll (make-coll 1 m/whole-world nil)]
+      ;; Granule in first polygon of MultiPolygon
+      (make-gran coll (p/point 20.0 5.0) nil)
+      ;; Granule in second polygon of MultiPolygon
+      (make-gran coll (p/point 47.0 -20.0) nil)
+      ;; Granule outside MultiPolygon
+      (make-gran coll (p/point -100.0 40.0) nil)
+      ...
+      (testing "MultiPolygon shapefile granule counts match direct granule search"
+        ...
+        (is (gran-counts/granule-counts-match? :xml {coll granule-hits} collection-response)
+            "MultiPolygon shapefile: collection granule counts should match granule search hits")))
+    (finally
+      (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! false)))))

(You can inline the ... portions from the existing test body.)

This keeps global shapefile configuration from accidentally affecting other tests.

search-app/test/cmr/search/test/unit/services/query_execution/granule_counts_results_feature_test.clj (1)

12-139: Good coverage of spatial grouping semantics; consider an optional negative-path test.

The helper make-spatial-condition and the suite of tests around is-spatial-or-group? and extract-spatial-conditions thoroughly exercise:

  • OR vs AND ConditionGroups,
  • mixed spatial/non-spatial groups,
  • nested conditions, and
  • duplicate polygon scenarios.

Once extract-spatial-conditions restores validate-path-to-condition for ungrouped SpatialConditions (see comment in granule_counts_results_feature.clj), you might optionally add a small test that constructs a query where a spatial condition is wrapped in an unsupported parent (e.g., negation or another disallowed operator) and asserts that extraction fails as expected. That would fully lock in the intended safety behavior around parent-path validation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe7ccb9 and 2758b94.

📒 Files selected for processing (4)
  • search-app/src/cmr/search/services/query_execution/granule_counts_results_feature.clj (1 hunks)
  • search-app/test/cmr/search/test/unit/services/query_execution/granule_counts_results_feature_test.clj (1 hunks)
  • system-int-test/resources/shapefiles/multipolygon_test.geojson (1 hunks)
  • system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (2 hunks)
🔇 Additional comments (2)
system-int-test/resources/shapefiles/multipolygon_test.geojson (1)

1-32: GeoJSON MultiPolygon test resource looks structurally sound.

Rings are closed, coordinates are in lon/lat order, and the MultiPolygon matches the two-rectangle description used in the tests. No changes needed.
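The two properties mentioned here can be checked mechanically. The following is a minimal sketch, assuming rings are plain vectors of [lon lat] pairs; ring-closed?, lon-lat?, and example-ring are hypothetical names, and the coordinates are illustrative rather than copied from multipolygon_test.geojson.

```clojure
(defn ring-closed?
  "A GeoJSON linear ring has at least four positions and repeats its
  first position as its last."
  [ring]
  (and (>= (count ring) 4)
       (= (first ring) (last ring))))

(defn lon-lat?
  "Range check for a [lon lat] position. It cannot prove the axis order,
  but it catches a latitude outside [-90, 90], a common sign of swapped axes."
  [[lon lat]]
  (and (<= -180.0 lon 180.0)
       (<= -90.0 lat 90.0)))

;; Hypothetical rectangle ring, listed as [lon lat] pairs.
(def example-ring
  [[10.0 0.0] [30.0 0.0] [30.0 10.0] [10.0 10.0] [10.0 0.0]])

(def example-ring-ok?
  (and (ring-closed? example-ring)
       (every? lon-lat? example-ring)))
```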

system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (1)

4-9: New test dependencies align with MultiPolygon shapefile usage.

The added requires (io, mt, side, shapefile) match the new shapefile-based granule count test and look appropriate.


codecov-commenter commented Nov 18, 2025

Codecov Report

❌ Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 58.20%. Comparing base (d3fbc7b) to head (7b2b754).
⚠️ Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...query_execution/granule_counts_results_feature.clj | 94.11% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2348      +/-   ##
==========================================
+ Coverage   58.17%   58.20%   +0.02%     
==========================================
  Files        1063     1063              
  Lines       72197    72230      +33     
  Branches     2084     2081       -3     
==========================================
+ Hits        42003    42038      +35     
+ Misses      28254    28252       -2     
  Partials     1940     1940              


Contributor

@eereiter eereiter left a comment


Please check the CodeRabbit suggestion to insert (validate-path-to-condition query condition-path) on line 95 of search-app/src/cmr/search/services/query_execution/granule_counts_results_feature.clj to make sure it wasn't missed, and possibly write a test that covers that scenario if it was.

@daniel-zamora daniel-zamora force-pushed the CMR-10886 branch 2 times, most recently from 731c7ea to 10928ad on December 1, 2025 16:24

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (1)

757-794: Test logic is sound; consider adding cleanup for the global shapefile flag.

The test correctly verifies that MultiPolygon shapefile-based granule counts in collection searches match the counts from direct granule searches. The test structure follows existing patterns and uses appropriate assertions.

However, line 758 enables the shapefile parameter flag globally without cleanup. If this flag persists across tests, it could affect subsequent test behavior.

Consider adding cleanup to restore the original flag state:

 (deftest multipolygon-shapefile-granule-counts-test
-  (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! true))
-
-  (let [coll (make-coll 1 m/whole-world nil)]
+  (let [original-flag (side/eval-form `(shapefile/get-enable-shapefile-parameter-flag))
+        _ (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! true))]
+    (try
+      (let [coll (make-coll 1 m/whole-world nil)]
+        ;; ... rest of test logic ...
+        )
+      (finally
+        (side/eval-form `(shapefile/set-enable-shapefile-parameter-flag! ~original-flag))))))

Note: If the use-fixtures already handles this flag reset, or if a getter for the flag doesn't exist, then this cleanup may not be necessary.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10928ad and 7b2b754.

📒 Files selected for processing (4)
  • search-app/src/cmr/search/services/query_execution/granule_counts_results_feature.clj (1 hunks)
  • search-app/test/cmr/search/test/unit/services/query_execution/granule_counts_results_feature_test.clj (1 hunks)
  • system-int-test/resources/shapefiles/multipolygon_test.geojson (1 hunks)
  • system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • system-int-test/resources/shapefiles/multipolygon_test.geojson
  • search-app/test/cmr/search/test/unit/services/query_execution/granule_counts_results_feature_test.clj
🔇 Additional comments (3)
search-app/src/cmr/search/services/query_execution/granule_counts_results_feature.clj (2)

66-74: LGTM! Clean predicate for identifying spatial OR groups.

The function correctly identifies OR groups that should preserve their structure for MultiPolygon geometries. The logic validates all necessary conditions: instance type, OR operation, non-empty conditions, and that every child is a SpatialCondition.
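The four checks listed above can be sketched with plain maps. This is an illustration under assumptions, not the merged is-spatial-or-group? source: the real code tests record instances, whereas the :type, :operation, and :conditions keys and the names below are hypothetical stand-ins.

```clojure
;; Plain maps stand in for the query records in this sketch.
(defn spatial-condition? [node]
  (= :spatial (:type node)))

(defn spatial-or-group?
  "True for a non-empty OR group whose children are all spatial conditions."
  [node]
  (and (= :group (:type node))                 ; it is a condition group
       (= :or (:operation node))               ; joined with OR, not AND
       (boolean (seq (:conditions node)))      ; it has at least one child
       (every? spatial-condition? (:conditions node))))

;; Hypothetical nodes for illustration.
(def poly-a {:type :spatial :shape :polygon-a})
(def poly-b {:type :spatial :shape :polygon-b})
(def keyword-cond {:type :string :field :provider})

(def or-of-polygons {:type :group :operation :or  :conditions [poly-a poly-b]})
(def and-of-polygons {:type :group :operation :and :conditions [poly-a poly-b]})
(def mixed-or {:type :group :operation :or :conditions [poly-a keyword-cond]})
(def empty-or {:type :group :operation :or :conditions []})
```

In this model only or-of-polygons satisfies the predicate; the AND group, the mixed group, and the empty group are all rejected.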


76-98: LGTM! Two-step extraction preserves OR semantics and includes proper validation.

The refactored logic correctly:

  1. Preserves OR groups containing only SpatialConditions (fixing the MultiPolygon issue)
  2. Extracts standalone SpatialConditions not part of such groups
  3. Validates paths for both spatial OR groups (line 86) and ungrouped conditions (line 95)
  4. Prevents double-extraction by checking parent type (line 94)

This addresses the past review concern about missing path validation while maintaining the fix for MultiPolygon granule counts.
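The extraction behaviour described in this comment can be modelled roughly as follows. This is a simplified sketch and not the merged extract-spatial-conditions: it uses plain maps with assumed keys, a single recursive walk instead of two passes, and a stand-in validate-path! that only rejects negated ancestors. All names in it are hypothetical.

```clojure
(defn spatial? [node] (= :spatial (:type node)))

(defn spatial-or-group? [node]
  (and (= :group (:type node))
       (= :or (:operation node))
       (boolean (seq (:conditions node)))
       (every? spatial? (:conditions node))))

(defn child-nodes
  "Children of a node in this model: groups hold :conditions, negations hold :condition."
  [node]
  (case (:type node)
    :group (:conditions node)
    :not   [(:condition node)]
    []))

(defn validate-path!
  "Stand-in for parent-path validation: throws when any ancestor is a negation."
  [ancestors]
  (when (some #(= :not (:type %)) ancestors)
    (throw (ex-info "Spatial condition under unsupported parent" {}))))

(defn extract-spatial
  "Returns preserved spatial OR groups plus ungrouped spatial conditions, in tree order."
  ([node] (extract-spatial node []))
  ([node ancestors]
   (cond
     ;; Keep an all-spatial OR group whole and do not descend into it,
     ;; so its children are never extracted a second time.
     (spatial-or-group? node)
     (do (validate-path! ancestors) [node])

     ;; A spatial condition outside such a group is extracted on its own.
     (spatial? node)
     (do (validate-path! ancestors) [node])

     ;; Otherwise recurse into child conditions, if any.
     :else
     (vec (mapcat #(extract-spatial % (conj ancestors node))
                  (child-nodes node))))))

;; Hypothetical query: keyword AND (poly-a OR poly-b) AND bbox.
(def poly-a {:type :spatial :shape :polygon-a})
(def poly-b {:type :spatial :shape :polygon-b})
(def bbox {:type :spatial :shape :bbox})
(def or-group {:type :group :operation :or :conditions [poly-a poly-b]})
(def query {:type :group :operation :and
            :conditions [{:type :string :field :provider} or-group bbox]})
(def negated {:type :not :condition poly-a})
```

In this model, (extract-spatial query) yields the OR group as a single element followed by the standalone bbox, and (extract-spatial negated) throws. One difference from the walkthrough's description is ordering: this sketch returns results in tree order rather than listing all OR groups first.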

system-int-test/test/cmr/system_int_test/search/granule/granule_counts_search_test.clj (1)

4-9: LGTM! Necessary imports for shapefile test.

The new imports support the MultiPolygon shapefile test: io for resource loading, mt for MIME types, side for remote flag configuration, and shapefile for the parameter flag.

@daniel-zamora daniel-zamora merged commit 8588edf into master Dec 4, 2025
7 checks passed

5 participants