@minhmo1620 minhmo1620 commented Oct 14, 2025

Problem Statement

This PR aims to make the following workflow succeed:

  1. Registers a new schema for admin operations and rolls it out to all controllers.
  2. Serializes and sends an AdminOperation message using the new schema.
  3. Performs a rollback to the previous schema version.
  4. Ensures that the AdminConsumptionTask can fetch the required schema from the schema system store, even after local cache removal, and successfully consumes messages with the new schema ID.

Without this change, the admin topic is blocked because the new schema ID cannot be found in the local controller.

Solution

Create a system store for admin operation schemas. When a schema cannot be found in the local controller, the controller tries to download it from the system store into its local cache, which unblocks the admin topic.

Feature flag:

controller.admin.operation.system.store.enabled
default = false
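
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `AdminOperationSchemaLookup`, its methods, and the `systemStoreFetcher` callback are all hypothetical, and schemas are represented as plain strings for brevity. The only names taken from the description are the feature flag and the idea of a local cache backed by the schema system store.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical sketch of a flag-gated schema lookup; names are not from the PR. */
final class AdminOperationSchemaLookup {
  // Local cache of schema ID -> schema definition (a plain string here for simplicity).
  private final Map<Integer, String> localSchemas = new ConcurrentHashMap<>();
  // Stand-in for a read against the admin operation schema system store (assumed API).
  private final IntFunction<Optional<String>> systemStoreFetcher;
  // Mirrors controller.admin.operation.system.store.enabled (default = false).
  private final boolean systemStoreEnabled;

  AdminOperationSchemaLookup(boolean systemStoreEnabled, IntFunction<Optional<String>> systemStoreFetcher) {
    this.systemStoreEnabled = systemStoreEnabled;
    this.systemStoreFetcher = systemStoreFetcher;
  }

  void registerLocal(int schemaId, String schema) {
    localSchemas.put(schemaId, schema);
  }

  // Simulates the local cache losing a schema, for example after a rollback.
  void evictLocal(int schemaId) {
    localSchemas.remove(schemaId);
  }

  /** Returns the schema for the ID, falling back to the system store when the flag is on. */
  String getSchema(int schemaId) {
    String cached = localSchemas.get(schemaId);
    if (cached != null) {
      return cached;
    }
    if (!systemStoreEnabled) {
      // Without the fallback, consumption of the admin topic would stay blocked here.
      throw new IllegalStateException("Unknown admin operation schema ID: " + schemaId);
    }
    String fetched = systemStoreFetcher.apply(schemaId)
        .orElseThrow(() -> new IllegalStateException("Schema ID not in system store: " + schemaId));
    // putIfAbsent keeps the existing value if another thread cached it first.
    String existing = localSchemas.putIfAbsent(schemaId, fetched);
    return existing != null ? existing : fetched;
  }
}
```

With the flag enabled, a lookup for an evicted schema ID goes to the fetcher once and is served from the local cache afterwards. With the flag disabled, the same lookup fails, which corresponds to the blocked admin topic described in the problem statement.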

Code changes

  • Added new code behind a config. If so, list the config names and their default values in the PR description.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

@minhmo1620 minhmo1620 force-pushed the minnguye/system_store_admmin_op branch 2 times, most recently from a05786e to f1f49c9 Compare November 6, 2025 18:01
Change the protocolMap to ConcurrentHashMap

Fix spotbug

Fix spotbug
@minhmo1620 minhmo1620 force-pushed the minnguye/system_store_admmin_op branch from 39bf77a to 6c291ab Compare November 8, 2025 17:03
@minhmo1620 minhmo1620 marked this pull request as ready for review November 9, 2025 00:42