Add entityStatus to all service entities + domains #26132

Open
yan-3005 wants to merge 9 commits into main from ram/add-entity-status-to-all-entities

yan-3005 commented Feb 26, 2026

Describe your changes:

Fixes https://github.com/open-metadata/openmetadata-collate/issues/3049

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Added entityStatus field: Added to 9 service entities (API, Dashboard, Database, Messaging, Metadata, MLModel, Pipeline, Search, Storage) and to domain entities, for lifecycle governance and approval workflows
  • Schema & mapping updates: Updated JSON schemas, Elasticsearch index mappings (4 language variants), and generated TypeScript interfaces with EntityStatus enum
  • Integration testing: Added comprehensive test_entityStatus() in BaseEntityIT covering all status transitions, persistence, and version increments with graceful fallback for unsupported entities

This will update automatically on new commits.

"type": "keyword"
},

"entityStatus": {

⚠️ Bug: entityStatus nested inside customPropertiesTyped in domain mappings

In all four locale variants of domain_index_mapping.json, the entityStatus field is incorrectly placed inside customPropertiesTyped.properties (after refFqn) rather than at the top level of mappings.properties (alongside descriptionStatus).

In every service index mapping, entityStatus is correctly placed as a sibling of descriptionStatus at the top level, and the entity schema (domain.json) also defines entityStatus as a top-level property. The domain ES mapping instead nests it inside the custom properties object, so Elasticsearch queries that filter domains by entityStatus would need a different field path (customPropertiesTyped.entityStatus) than queries for services (entityStatus). The result is inconsistent search behavior or missed results.

The field should be moved from inside customPropertiesTyped.properties to be a sibling of descriptionStatus at the top-level mappings.properties in all four domain mapping files.
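
A quick way to spot this kind of misplacement is to walk a mapping and list every path at which the field is declared. The sketch below is only an illustration: the two mappings are hypothetical, heavily trimmed stand-ins for the real files, and `find_field_paths` is a helper name invented here.

```python
from typing import Any, Dict, List


def find_field_paths(properties: Dict[str, Any], field: str, prefix: str = "") -> List[str]:
    """Return every dotted path at which `field` is declared in an ES mapping."""
    paths: List[str] = []
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        if name == field:
            paths.append(path)
        # Object and nested fields carry their own "properties" block.
        children = definition.get("properties") if isinstance(definition, dict) else None
        if isinstance(children, dict):
            paths.extend(find_field_paths(children, field, prefix=f"{path}."))
    return paths


# Hypothetical, trimmed mappings; the real files contain many more fields.
service_mapping = {
    "mappings": {
        "properties": {
            "descriptionStatus": {"type": "keyword"},
            "entityStatus": {"type": "keyword", "normalizer": "lowercase_normalizer"},
        }
    }
}
domain_mapping = {
    "mappings": {
        "properties": {
            "descriptionStatus": {"type": "keyword"},
            "customPropertiesTyped": {
                "type": "nested",
                "properties": {
                    "refFqn": {"type": "keyword"},
                    "entityStatus": {"type": "keyword", "normalizer": "lowercase_normalizer"},
                },
            },
        }
    }
}

service_paths = find_field_paths(service_mapping["mappings"]["properties"], "entityStatus")
domain_paths = find_field_paths(domain_mapping["mappings"]["properties"], "entityStatus")
print(service_paths)
print(domain_paths)
```

With the trimmed inputs above, the service mapping reports the field at the top level, while the domain mapping reports it only under customPropertiesTyped.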


"Version should increment when entityStatus is updated");
}

} catch (Exception e) {

⚠️ Bug: Catch-all Exception silently swallows real test failures

The test_entityStatus method wraps the entire test body (including assertions and PATCH operations) in a try/catch(Exception e) block that logs and silently passes. A failed assertEquals still propagates, because it throws AssertionError, which is a subclass of Error rather than Exception. Everything else thrown by the actual test logic (an HTTP error from patchEntity, a NullPointerException, a JsonProcessingException, a network error) is swallowed, and the test passes.

The intent is to skip entities that don't have getEntityStatus(), but that's already handled by the null check on line 2636. The broad catch block now masks legitimate failures for entities that do support the field.

A better approach: use Assumptions.assumeTrue(currentStatus != null) for the skip logic (already handled), and remove the catch-all. If a method-not-found case really must be caught (for entities whose class turns out to lack the method at runtime), catch only NoSuchMethodError or UnsupportedOperationException.

Suggested fix:

} catch (NoSuchMethodError e) {
  // If entity doesn't support entityStatus, skip this test
  log.info(
      "Entity "
          + entity.getClass().getSimpleName()
          + " does not support entityStatus: "
          + e.getMessage());
}
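
The same failure mode can be shown in a few lines of Python. This is an analogy, not the Java test itself: `FakeClient` and both helper functions are hypothetical, and AttributeError stands in for Java's NoSuchMethodError. One difference to keep in mind is that Python's AssertionError derives from Exception, so a broad `except Exception` in Python would hide failed assertions as well.

```python
class FakeClient:
    """Hypothetical stand-in for the test client; the patch call always fails."""

    def patch_entity_status(self, entity_id: str, status: str) -> None:
        raise RuntimeError("500 Internal Server Error")


def check_with_broad_catch(client: FakeClient) -> str:
    try:
        client.patch_entity_status("svc-1", "Unprocessed")
        return "passed"
    except Exception:  # too broad: a server error is reported as "unsupported"
        return "skipped"


def check_with_narrow_catch(client: FakeClient) -> str:
    try:
        client.patch_entity_status("svc-1", "Unprocessed")
        return "passed"
    except AttributeError:  # only the "method is missing" case counts as a skip
        return "skipped"


client = FakeClient()
broad_result = check_with_broad_catch(client)

narrow_error = None
try:
    check_with_narrow_catch(client)
except RuntimeError as exc:
    # The narrow version lets the real failure surface.
    narrow_error = str(exc)

print(broad_result, narrow_error)
```

The broad version turns the simulated server error into a skip, while the narrow version lets it reach the caller.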


"refFqn": {
"type": "keyword"
},


💡 Quality: Extra blank lines in domain mapping JSON files

The domain_index_mapping.json files across all locales have extraneous blank lines around the entityStatus field definition (empty lines before/after "type" and "normalizer"), while all the service mapping files use the compact 4-line format consistently. This appears to be an accidental formatting inconsistency.

Suggested fix:

"entityStatus": {
  "type": "keyword",
  "normalizer": "lowercase_normalizer"
}


@github-actions
Contributor

TypeScript types have been updated based on the JSON schema changes in the PR

@github-actions github-actions bot requested a review from a team as a code owner February 26, 2026 18:50
@yan-3005
Contributor Author

gitar fix all the review comments

@gitar-bot gitar-bot bot requested a review from a team as a code owner February 26, 2026 19:05
@yan-3005
Contributor Author

gitar , fix the review comments, add entity status to those services and add it to the corresponding index mappings as well

@gitar-bot

gitar-bot bot commented Feb 27, 2026

gitar , fix the review comments, add entity status to those services and add it to the corresponding index mappings as well

All review comments have been fixed. I've added entityStatus field to the three missing service types:

1. DriveService:

  • Added to entity schema: driveService.json
  • Added to create request: createDriveService.json
  • Added to ES mappings: drive_service_index_mapping.json (en, jp, ru, zh)

2. LLMService:

  • Added to entity schema: llmService.json
  • Added to create request: createLLMService.json
  • Added to ES mapping: llm_service_index_mapping.json (en only)

3. SecurityService:

  • Added to entity schema: securityService.json
  • Added to create request: createSecurityService.json
  • Added to ES mappings: security_service_index_mapping.json (en, jp, ru, zh)

Now all 12 service types have consistent entityStatus field support across entity schemas, create APIs, and Elasticsearch mappings.

@yan-3005
Contributor Author

gitar bot, check the py tests unit tests and static check failures, what is the issue, fix it properly

@gitar-bot

gitar-bot bot commented Feb 27, 2026

gitar bot, check the py tests unit tests and static check failures, what is the issue, fix it properly

Fixed the Python static check failure. The issue was that CreateDatabaseServiceRequest was missing the entityStatus parameter in builder_end_to_end.py:116.

Root cause: This PR added entityStatus field to all service entities, and the Pydantic model generation made it a required parameter. The example code wasn't updated to include it.

Fix applied: Added entityStatus=None to the constructor call. The backend will apply the default value "Unprocessed" when None is passed.

The static type checker should now pass on both Python 3.10 and 3.11.
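
To illustrate why a field that accepts None can still be a required constructor argument, here is a minimal sketch. It uses standard-library dataclasses as a stand-in for the generated Pydantic model; the class names and the `mysql_prod` value are hypothetical and do not come from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequiredStatusRequest:
    """Hypothetical model where the generated field has no default."""
    name: str
    entityStatus: Optional[str]


@dataclass
class OptionalStatusRequest:
    """Same shape, but the field defaults to None."""
    name: str
    entityStatus: Optional[str] = None


# Omitting the field fails when it has no default, even though None is allowed.
try:
    RequiredStatusRequest(name="mysql_prod")
    omitted_ok = True
except TypeError:
    omitted_ok = False

# Passing None explicitly satisfies the required parameter.
explicit = RequiredStatusRequest(name="mysql_prod", entityStatus=None)
# With a default, the caller does not need to mention the field at all.
defaulted = OptionalStatusRequest(name="mysql_prod")
print(omitted_ok, explicit.entityStatus, defaulted.entityStatus)
```

Passing entityStatus=None at the call site, as the fix does, corresponds to the `explicit` case; giving the generated field a default would correspond to the `defaulted` case.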

@gitar-bot

gitar-bot bot commented Feb 27, 2026

🔍 CI failure analysis for dc6f357: Playwright E2E test failed in 'Table › Set & update column-level custom property' due to timeout. Additionally, 3 flaky tests passed on retry. This commit also has Python static check fix applied and py-run-tests infrastructure failure.

Issue

CI job "playwright-ci-postgresql (4, 6)" failed on commit dc6f357 with 1 permanent test failure and 3 flaky tests.

Root Cause

Permanent Failure: Table Column-Level Custom Property Test

Failed Test:

[chromium] › playwright/e2e/Pages/Entity.spec.ts:2140:7 › Table › Set & update column-level custom property

Error:

Test timeout of 180000ms exceeded (3 minutes)

Error: page.waitForResponse: Target page, context or browser has been closed
Error: keyboard.type: Target page, context or browser has been closed
Error: locator.click: Target page, context or browser has been closed

The test timed out after 3 minutes while trying to set/update column-level custom properties. The browser context was closed unexpectedly, suggesting either:

  • Application crash or hang
  • Test infrastructure instability
  • Interaction with the UI that caused an unexpected state

Likelihood of being PR-related: Unknown/Low

This PR adds entityStatus field to service entities and domains - it doesn't modify column-level custom properties or table entity logic. However, the widespread Elasticsearch mapping changes (55+ files) could indirectly affect page load times or data fetching.

Flaky Tests (Passed on Retry)

  1. Search Index › Level 2 Deeply Nested Columns › should add and remove tags to nested column immediately without refresh

    • Test timeout in beforeEach hook (60s)
    • Failed to wait for selector
  2. Domain Rename Comprehensive Tests › Multiple consecutive domain renames preserve all associations

    • Expected "No Description" but received old description text
    • This is a domain-related test and this PR modified domain schemas and mappings
    • Potential relevance: Medium
  3. Ml Model › Description Add, Update and Remove for child entities

    • Expected "No Description" after removing description
    • Received: "Lorem ipsum... updated"
    • Description removal didn't persist properly

Test Summary:

  • 1 failed (permanent)
  • 3 flaky (passed on retry)
  • 16 skipped
  • 648 passed
  • Total duration: 1.3 hours

Previous CI Failures Still Applicable

  1. Python Static Checks (3.10 & 3.11): Missing entityStatus parameter - FIXED locally, will resolve once pushed
  2. py-run-tests (3.10): IBM server network timeout - Infrastructure issue, unrelated to PR
  3. Integration Tests (previous commit): Domain indexing timeout failures - potentially related to Elasticsearch mapping changes
  4. Playwright SearchIndexApplication (previous commit): Test timeout - potentially related to search service mapping changes

Details

Pattern observed across multiple CI runs:

  • Multiple timeout failures in tests that wait for indexing or search operations
  • Domain-related test flakiness (domain rename test)
  • This PR modified 55+ Elasticsearch mapping files for service entities and domains

Assessment:
The column-level custom property test failure appears unrelated to this PR's changes. However, the domain rename flaky test showing description persistence issues could be related to the domain schema changes in this PR. The widespread indexing/timeout pattern across multiple test suites suggests the Elasticsearch mapping changes may be causing performance issues or race conditions during reindexing.

Code Review ⚠️ Changes requested 2 resolved / 3 findings

Systematic addition of entityStatus to services and domains, but three service types (driveService, llmService, securityService) are still missing the field across schema, ES mapping, and create API layers — this inconsistency remains unresolved from prior review.

⚠️ Bug: entityStatus missing from driveService, llmService, securityService

📄 openmetadata-spec/src/main/resources/json/schema/entity/services/apiService.json:137

The PR adds entityStatus to 9 out of 12 service entity types, but three services are missed: driveService, llmService, and securityService. These services follow the same pattern as the updated ones (they have dataProducts, descriptionStatus in ES mappings, etc.), so the omission appears unintentional.

Missing across all three layers for each:

  • Entity schema: driveService.json, llmService.json, securityService.json
  • Create request schema: createDriveService.json, createLLMService.json, createSecurityService.json
  • Elasticsearch mappings: drive_service_index_mapping.json, llm_service_index_mapping.json, security_service_index_mapping.json (all language variants)

This means filtering/searching by entityStatus will not work for these service types, and the API won't accept entityStatus on create requests for them. This creates an inconsistent governance experience across services.
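
A gap like this can be caught mechanically by checking each schema for the property. The sketch below assumes the schemas have already been parsed into dictionaries; the contents shown are hypothetical, trimmed placeholders, and `schemas_missing_field` is a helper name invented for the example.

```python
from typing import Any, Dict, List


def schemas_missing_field(schemas: Dict[str, Dict[str, Any]], field: str) -> List[str]:
    """Return the names of schemas whose top-level properties lack `field`."""
    return sorted(
        name
        for name, schema in schemas.items()
        if field not in schema.get("properties", {})
    )


# Hypothetical, trimmed schema contents keyed by file name.
schemas = {
    "apiService.json": {"properties": {"name": {}, "entityStatus": {}}},
    "driveService.json": {"properties": {"name": {}}},
    "llmService.json": {"properties": {"name": {}}},
    "securityService.json": {"properties": {"name": {}}},
}

missing = schemas_missing_field(schemas, "entityStatus")
print(missing)
```

The same check could be pointed at the create request schemas and, with a different lookup path, at the Elasticsearch mappings.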

✅ 2 resolved
Bug: entityStatus in domain ES mappings nested inside wrong object

📄 openmetadata-spec/src/main/resources/elasticsearch/en/domain_index_mapping.json:442
📄 openmetadata-spec/src/main/resources/elasticsearch/jp/domain_index_mapping.json:380
📄 openmetadata-spec/src/main/resources/elasticsearch/ru/domain_index_mapping.json:460
📄 openmetadata-spec/src/main/resources/elasticsearch/zh/domain_index_mapping.json:380
In all 4 language variants of domain_index_mapping.json, the entityStatus field is incorrectly placed inside the customPropertiesTyped nested object rather than at the top-level properties alongside descriptionStatus.

For service mappings (e.g., api_service_index_mapping.json), entityStatus is correctly placed at the top-level properties (next to descriptionStatus). But in the domain mappings, it's nested inside customPropertiesTyped.properties — a nested type used for custom property values.

This means Elasticsearch queries filtering by entityStatus on domains will fail or return incorrect results, because:

  1. The field lives inside a nested object, requiring a nested query to access it
  2. The indexing path is customPropertiesTyped.entityStatus rather than just entityStatus
  3. The actual entityStatus value from the domain entity won't be indexed at this path at all — it will be dynamically mapped (or ignored) at the correct top level

The fix is to move the entityStatus field definition outside of customPropertiesTyped.properties and place it at the same level as descriptionStatus in all 4 domain mapping files (en, jp, ru, zh).
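
The difference in query shape can be sketched with plain dictionaries in the Elasticsearch query DSL. The function names are hypothetical, and "Unprocessed" is used only because it is the default value mentioned elsewhere in this thread.

```python
from typing import Any, Dict


def top_level_status_query(status: str) -> Dict[str, Any]:
    """Term query for a keyword field declared at the top level of the mapping."""
    return {"query": {"term": {"entityStatus": status}}}


def nested_status_query(status: str, path: str = "customPropertiesTyped") -> Dict[str, Any]:
    """Nested query that would be needed if the field lived inside a nested object."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"term": {f"{path}.entityStatus": status}},
            }
        }
    }


service_query = top_level_status_query("Unprocessed")
domain_query = nested_status_query("Unprocessed")
print(service_query)
print(domain_query)
```

A client that sends the first form to every index would get no domain matches from the misplaced mapping, which is why moving the field to the top level keeps one query shape valid for services and domains alike.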

Bug: Broad catch(Exception) silently swallows real test failures

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/BaseEntityIT.java:2704
The test_entityStatus method wraps the entire test body (including PATCH operations and GET calls) in a catch (Exception e) block that only logs and returns silently. While AssertionError from JUnit's assertEquals extends Error (not Exception) and would correctly fail the test, any runtime exception thrown by patchEntity(), getEntity(), or createEntity() calls would be silently caught and logged as "entity does not support entityStatus."

This means if patching or retrieval fails due to a server error, serialization issue, or network problem, the test would pass silently, hiding real bugs. The intent is to skip entities that don't have a getEntityStatus() method, but this is already handled by the null check on line 2636.

A narrower approach would be to catch only NoSuchMethodError or use reflection to check for the method, rather than catching all exceptions. Alternatively, re-throw exceptions that aren't related to the method-not-found case.



@github-actions
Contributor

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| libpam-modules | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-modules-bin | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-runtime | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam0g | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (34)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (21)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| Werkzeug | CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.12.12 | 3.13.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.13.2 | 3.13.3 |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| azure-core | CVE-2026-21226 | 🚨 HIGH | 1.37.0 | 1.38.0 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| google-cloud-aiplatform | CVE-2026-2472 | 🚨 HIGH | 1.130.0 | 1.131.0 |
| google-cloud-aiplatform | CVE-2026-2473 | 🚨 HIGH | 1.130.0 | 1.133.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| protobuf | CVE-2026-0994 | 🚨 HIGH | 4.25.8 | 6.33.5, 5.29.6 |
| pyasn1 | CVE-2026-23490 | 🚨 HIGH | 0.6.1 | 0.6.2 |
| python-multipart | CVE-2026-24486 | 🚨 HIGH | 0.0.20 | 0.0.22 |
| ray | CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| stdlib | CVE-2025-68121 | 🔥 CRITICAL | v1.25.5 | 1.24.13, 1.25.7, 1.26.0-rc.3 |
| stdlib | CVE-2025-61726 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2025-61728 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2025-61730 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found


Labels

backend
safe to test: Add this label to run secure Github workflows on PRs
To release: Will cherry-pick this PR into the release branch
