Skip to content

Conversation

HaoW30
Copy link
Contributor

@HaoW30 HaoW30 commented Aug 12, 2025

What problem does this PR solve?

Issue Number: Close #9638

What is changed and how does it work?

This PR adds a new set of PD HTTP endpoints (v1 API) to support a pitr-recovering marker, following the same pattern as the existing snapshot-recovering marker implementation.

New endpoints:

  • GET /admin/cluster/markers/pitr-recovering – check whether the pitr recovery marker is set
  • POST /admin/cluster/markers/pitr-recovering – mark the cluster as under pitr recovery••
  • DELETE /admin/cluster/markers/pitr-recovering – clear the pitr recovery marker

Implementation details:

  1. Keypath support: Added pitrRecoveringMarkPathFormat and PitrRecoveringMarkPath() function in pkg/utils/keypath/absolute_key_path.go

  2. Server functions: Added three new methods to server/server.go:

    • MarkPitrRecovering() - Sets the PITR recovery marker in etcd
    • IsPitrRecovering(ctx) - Checks if PITR recovery marker exists
    • UnmarkPitrRecovering(ctx) - Removes the PITR recovery marker
  3. HTTP handlers: Added corresponding HTTP handlers in server/api/admin.go:

    • markPitrRecovering - POST handler
    • isPitrRecovering - GET handler (returns JSON with marked boolean field)
    • unmarkPitrRecovering - DELETE handler
  4. Routing: Added three route registrations in server/api/router.go with proper audit logging

  5. Testing: Added comprehensive test TestMarkPitrRecovering that validates:

    • Default state (marker not set)
    • Setting the marker (POST)
    • Checking marker is set (GET returns true)
    • Clearing the marker (DELETE)
    • Verifying marker is cleared (GET returns false)

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

None.

…recovering)

Add new HTTP endpoints for PITR (Point-in-Time Recovery) coordination:
- GET /admin/cluster/markers/pitr-recovering
- POST /admin/cluster/markers/pitr-recovering••
- DELETE /admin/cluster/markers/pitr-recovering

This provides a centralized way for TiDB Operator to coordinate PITR
operations across multi-cluster deployments, following the same pattern
as existing snapshot-recovering endpoints.

Signed-off-by: h-wang4 <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the dco. labels Aug 12, 2025
Copy link
Contributor

ti-chi-bot bot commented Aug 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cabinfeverb for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added contribution This PR is from a community contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Aug 12, 2025
Copy link
Contributor

ti-chi-bot bot commented Aug 12, 2025

Hi @HaoW30. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 12, 2025
@ti-chi-bot ti-chi-bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Aug 12, 2025
@ti-chi-bot
Copy link
Member

Now you can start all CI jobs with /test all in comment or query the triggers with /test ?

Copy link

codecov bot commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 68.75000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.70%. Comparing base (4e36646) to head (a12dfdf).

❌ Your patch check has failed because the patch coverage (68.75%) is below the target coverage (74.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9655      +/-   ##
==========================================
+ Coverage   76.61%   76.70%   +0.09%     
==========================================
  Files         482      482              
  Lines       76184    76232      +48     
==========================================
+ Hits        58367    58474     +107     
+ Misses      14231    14179      -52     
+ Partials     3586     3579       -7     
Flag Coverage Δ
unittests 76.70% <68.75%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@okJiang okJiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

h.rd.Text(w, http.StatusInternalServerError, err.Error())
return
}
type resStruct struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using a public struct? It is same as L183

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@okJiang
Copy link
Member

okJiang commented Aug 13, 2025

FYI, If you submit a PR on tidb-operator, please link it here. I think it's okay to wait until both PRs are ready before merging.

@HaoW30
Copy link
Contributor Author

HaoW30 commented Aug 13, 2025

FYI, If you submit a PR on tidb-operator, please link it here. I think it's okay to wait until both PRs are ready before merging.

@okJiang sounds good, will do

@okJiang
Copy link
Member

okJiang commented Aug 26, 2025

@HaoW30 Hey, how’s the progress recently? Have you encountered any difficulties?

Copy link
Contributor

ti-chi-bot bot commented Aug 28, 2025

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456, multiple issues should use full syntax for each issue and be separated by a comma, like: Issue Number: close #123, ref #456.

📖 For more info, you can check the "Linking issues" section in the CONTRIBUTING.md.

@ti-chi-bot ti-chi-bot bot added dco-signoff: no Indicates the PR's author has not signed dco. and removed dco-signoff: yes Indicates the PR's author has signed the dco. labels Aug 28, 2025
Copy link
Contributor

ti-chi-bot bot commented Aug 28, 2025

Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.

📝 Please follow instructions in the contributing guide to update your commits with the DCO

Full details of the Developer Certificate of Origin can be found at developercertificate.org.

The list of commits missing DCO signoff:

  • b5a410c Rename pitr-recovering to pitr-restore-mode and address pr comments

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@HaoW30
Copy link
Contributor Author

HaoW30 commented Aug 28, 2025

@HaoW30 Hey, how’s the progress recently? Have you encountered any difficulties?

@okJiang Thanks! We've tested this patch with BR/operator changes locally end-to-end and made sure things work for the multi-k8s cluster restore scenario. We're preparing the relevant upstream PR for BR. Will link it here once ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor. dco-signoff: no Indicates the PR's author has not signed dco. do-not-merge/needs-linked-issue ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add PD Endpoints for PiTR Restore Mode Marker (pitr-recovering)
3 participants