-
Notifications
You must be signed in to change notification settings - Fork 101
CMR-10801: As a metadata steward, I should not get a CMR validation error for a keyword long name when it’s not provided and not required #2350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds conditional long-name handling to KMS lookups for platforms, instruments, and projects by introducing a generic lookup that can strip ignorable long-names (nil/blank/"Not provided") before cache matching, plus new public wrappers and corresponding unit and integration tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Caller
participant Dispatcher as lookup-by-umm-c-keyword
participant Wrapper as wrapper (platforms/instruments/projects)
participant Generic as lookup-by-umm-c-keyword-with-long-name
participant Validator as skip-long-name-validation?
participant Cache as KMS Cache
Caller->>Dispatcher: lookup request (keyword-scheme, umm-c-keyword)
Dispatcher->>Wrapper: route to platform/instrument/project wrapper
Wrapper->>Generic: call with umm-c-keyword + optional transform-fn
Generic->>Validator: check if :long-name is ignorable
alt ignorable (nil/blank/"Not provided")
Generic->>Generic: remove :long-name from comparison/index
else use long-name
Generic->>Generic: include :long-name in comparison
end
Generic->>Cache: read using prepared comparison map
Cache-->>Generic: cached match or nil
Generic-->>Wrapper: return match (or propagate Carmine error)
Wrapper-->>Dispatcher: propagate result
Dispatcher-->>Caller: final lookup response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)
243-310: Long-name tests thoroughly exercise nil/blank/'Not provided' semanticsThe new
blank-and-not-provided-long-name-validation-test:
- Verifies that nil/empty/
"Not provided"all match KMS entries missing long-names for platforms, instruments, and projects.- Confirms that non-blank, non-
"Not provided"long-names still fail, even when short-name is valid.- Reasserts CMRS-4400 behavior for platforms with long-names in KMS (PLAT1) for both nil and exact long-name cases.
This gives solid unit-level coverage of the new lookup behavior. If we ever need to widen guarantees, optional follow-up would be to add analogous “KMS has long-name, UMM says 'Not provided'” tests for instruments/projects, but that’s not required for this PR.
common-app-lib/src/cmr/common_app/services/kms_lookup.clj (1)
279-333: Generic long-name-aware lookup and wrappers are sound; consider minor doc/perf clarificationsThe new
lookup-by-umm-c-keyword-with-long-nameplus the platform/instrument/project wrappers achieves:
- Consistent normalization (
kebab-case+ optional transform) across the three keyword schemes.- Conditional stripping of
:long-nameon both the UMM keyword and KMS index whenskip-long-name-validation?is true, so nil/blank/"Not provided"no longer cause false negatives.- Preservation of strict matching whenever
:long-nameis any other non-blank value, as verified by the new tests.A couple of small, non-blocking follow-ups you might consider:
- The
remove-long-name-from-kms-indexdocstring still talks only about “nil” long-names; updating it to mention blank/"Not provided"and all three schemes would better reflect current use.remove-long-name-from-kms-indexrebuilds the entire scheme index map on each lookup when skipping long-names. This was already true for the older platform path, but it now applies to instruments/projects as well. If lookups on these schemes become hot paths, precomputing or caching a “no-long-name” variant per scheme could avoid repeated work.- If KMS ever contained multiple entries that differ only by
:long-namefor the same short-name (and other fields), stripping:long-namecould collapse them; worth confirming that KMS schema/constraints prevent such duplicates.These are refinements rather than correctness issues; the current behavior matches the tests and PR intent.
Also applies to: 335-357, 367-368
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
common-app-lib/src/cmr/common_app/services/kms_lookup.clj(3 hunks)common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj(2 hunks)system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj(1 hunks)
🔇 Additional comments (3)
system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj (1)
477-529: New 'Not provided' integration tests are well-targeted and aligned with existing patternsThe added tests cleanly cover:
- Valid cases for Platforms/Instruments/Projects when long-name is
"Not provided".- Negative cases ensuring arbitrary non-blank long names still fail even when short names are valid.
Assertions reuse existing helpers and message formats, so they’ll stay in sync with prior behavior. No changes needed here.
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)
14-19: Sample KMS additions for PLAT2/INST2/PROJ2 look consistentThe extra platform/instrument/project entries cleanly model the “no long-name in KMS” scenarios without disturbing existing tests. Keys follow the same shape and casing as the rest of
sample-map.common-app-lib/src/cmr/common_app/services/kms_lookup.clj (1)
270-277: skip-long-name-validation? correctly captures the intended 'Not provided' semanticsThe helper cleanly treats nil, blank, and case/whitespace-insensitive
"Not provided"as “no long-name,” which matches the PR objective and keeps the behavior centralized. No changes needed here.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2350 +/- ##
=======================================
Coverage 58.17% 58.18%
=======================================
Files 1063 1063
Lines 72197 72214 +17
Branches 2084 2082 -2
=======================================
+ Hits 42003 42017 +14
- Misses 28254 28256 +2
- Partials 1940 1941 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
18888db to
55a5a73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
common-app-lib/src/cmr/common_app/services/kms_lookup.clj (1)
270-277: Optional: Redundant nil check.The explicit
(nil? long-name)check on line 274 is technically redundant becausestring/blank?already returnstruefor nil values. However, this makes the intent explicit and has no performance impact.If you prefer a more concise version, you could simplify to:
-(defn- skip-long-name-validation? - "Returns true if the long name should be skipped in validation. - Long name is skipped if it's nil, empty string, or 'Not provided' (case-insensitive)." - [long-name] - (or (nil? long-name) - (string/blank? long-name) - (when (string? long-name) - (= "not provided" (string/lower-case (string/trim long-name)))))) +(defn- skip-long-name-validation? + "Returns true if the long name should be skipped in validation. + Long name is skipped if it's nil, empty string, or 'Not provided' (case-insensitive)." + [long-name] + (or (string/blank? long-name) + (when (string? long-name) + (= "not provided" (string/lower-case (string/trim long-name))))))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
common-app-lib/src/cmr/common_app/services/kms_lookup.clj(3 hunks)common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj(2 hunks)system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj
🔇 Additional comments (8)
common-app-lib/src/cmr/common_app/services/kms_lookup.clj (6)
279-285: LGTM!The function correctly removes
:long-namefrom all keys in the KMS index map to enable matching on remaining fields when long-name validation should be skipped. The implementation is straightforward and correct.
308-333: LGTM!The generic lookup function correctly handles long-name validation by:
- Conditionally removing long-name from the input keyword when it should be skipped (nil, blank, or "Not provided")
- Conditionally removing long-name from KMS index keys to enable matching on remaining fields
- Preserving normal matching behavior when long-name has a real value
The logic correctly handles all scenarios:
- Accepts nil/blank/"Not provided" long-names even when KMS entry lacks a long-name
- Accepts exact long-name matches when both are present
- Rejects invalid long-name values
- Maintains consistent error handling with other lookup functions
335-345: LGTM!The refactoring correctly delegates to the new generic function while preserving the existing :type → :category transformation for platforms. This maintains backward compatibility while gaining the long-name validation improvements.
347-351: LGTM!New dedicated lookup function for instruments that correctly delegates to the generic long-name handler. This enables instruments to accept nil/blank/"Not provided" long-names as intended by the PR objectives.
353-357: LGTM!New dedicated lookup function for projects that correctly delegates to the generic long-name handler. This enables projects to accept nil/blank/"Not provided" long-names as intended by the PR objectives.
367-368: LGTM!The dispatch correctly routes instruments and projects to their new dedicated lookup functions, ensuring they benefit from the long-name validation logic. The integration is clean and maintains backward compatibility for other keyword schemes.
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (2)
14-19: LGTM!The test data additions correctly provide KMS entries without long-name fields (PLAT2, INST2, PROJ2) to enable testing the new validation logic. These complement the existing entries with long-names (PLAT1, INST1, PROJ1), providing comprehensive test coverage for both scenarios.
243-310: LGTM!Comprehensive test coverage for the new long-name validation logic:
- Tests all three keyword types (platforms, instruments, projects) with identical scenarios
- Covers all skip-long-name cases: nil, empty string, and case-insensitive "Not provided"
- Validates rejection of invalid long-names even with valid short-names
- Tests CMR-4400 behavior where nil long-name matches KMS entries that have long-names (lines 301-310)
The tests correctly verify that when long-name is skipped, matching occurs on remaining fields regardless of whether the KMS entry has a long-name.
55a5a73 to
dc860e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)
243-310: New long-name tests are solid; consider an extra case and minor dedupThe new
blank-and-not-provided-long-name-validation-testthoroughly covers nil, empty string,"Not provided"(case-insensitive), and invalid long-name cases for platforms, instruments, and projects, and also reasserts the CMR-4400 behavior for platforms with long-names in KMS. This is good coverage.Two optional follow-ups:
- Add a case where KMS has a long-name (e.g., PLAT1) and the UMM long-name is
"Not provided"to explicitly lock in that behavior, since the helper treats"Not provided"equivalently to nil/blank.- Consider folding the earlier CMR-4400 test (Lines 191–195) into this new group, or vice versa, to avoid testing the same nil-long-name platform behavior in two places.
common-app-lib/src/cmr/common_app/services/kms_lookup.clj (2)
270-278: skip-long-name-validation? matches requirements; minor robustness/doc tweaks possibleThe predicate correctly treats nil, blank, and
"Not provided"(case-insensitive, trimmed) as “not provided,” which aligns with the PR intent.If you want to harden this a bit and make the doc more precise, you could:
- Reflect that whitespace-only strings are also treated as blank.
- Guard
string/blank?behindstring?to avoid surprises if a non-string ever sneaks in.Example tweak:
(defn- skip-long-name-validation? - "Returns true if the long name should be skipped in validation. - Long name is skipped if it's nil, empty string, or 'Not provided' (case-insensitive)." + "Returns true if the long name should be skipped in validation. + Long name is skipped if it's nil, blank/whitespace, or 'Not provided' (case-insensitive)." [long-name] - (or (nil? long-name) - (string/blank? long-name) - (when (string? long-name) - (= "not provided" (string/lower-case (string/trim long-name)))))) + (or (nil? long-name) + (and (string? long-name) + (or (string/blank? long-name) + (= "not provided" + (-> long-name string/trim string/lower-case))))))
308-368: Generic long-name lookup refactor is correct; watch out for repeated full-map rewritesThe new
lookup-by-umm-c-keyword-with-long-namehelper plus the platform/instrument/project wrappers look logically sound:
- UMM keywords are kebab-cased, optionally transformed (e.g.,
:type→:categoryfor platforms), and only include:long-namein the comparison when it should be validated.- When long-name is “skipped” (nil/blank/
"Not provided"), you drop it from the comparison map and also from the KMS index keys, so KMS entries with and without long-names can match on the remaining fields.- Error handling and logging follow the existing pattern, and instruments/projects now get the same long-name behavior that platforms already had.
One thing to keep an eye on: whenever
skip-long-name?is true,remove-long-name-from-kms-indexwalks the entire scheme map and builds a new map on every lookup. If the KMS index for platforms/instruments/projects is large and these lookups are hot, that extra O(N) work per call could add up.If this becomes a hotspot, consider a follow-up refactor such as:
- Precomputing and storing a “no-long-name” index variant at index-build time, or
- Memoizing
remove-long-name-from-kms-indexper scheme (e.g., keyed by thevalueidentity or a stable hash) so repeated lookups reuse the same transformed map.Also, the docstring on
remove-long-name-from-kms-index(“only want to validate long-name if it is not nil”) is now slightly out of date given the broader skip semantics; updating that comment in a follow-up would avoid confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
common-app-lib/src/cmr/common_app/services/kms_lookup.clj(3 hunks)common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj(2 hunks)system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- system-int-test/test/cmr/system_int_test/ingest/collection/collection_keyword_validation_test.clj
🔇 Additional comments (1)
common-app-lib/test/cmr/common_app/test/services/kms_lookup.clj (1)
14-19: Sample KMS entries for PLAT2/INST2/PROJ2 look consistent with new scenariosThe added platform/instrument/project entries correctly model “no long-name in KMS” cases and align with the new tests (PLAT2 has
:categoryonly, INST2/PROJ2 have only:short-name). No issues here.
Overview
What is the objective?
Fix KMS keyword validation to properly handle "Not provided" as a valid long-name value for Platforms, Instruments, and Projects. Previously, only nil long-names were handled correctly - any non-nil value (including "Not provided") would fail validation if the KMS entry lacked a long-name field.
What are the changes?
What areas of the application does this impact?
common-app-lib: KMS keyword lookup service and validation logic)
Required Checklist
Additional Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.