Verify flushed data are recovered upon reopen in crash test #12787

Closed
wants to merge 1 commit into from

hx235
Contributor

@hx235 hx235 commented Jun 21, 2024

Context/Summary:

This addresses #12152. We persist the largest flushed seqno before a crash, the same way we persist the ExpectedState. After recovery, we verify that the DB's latest seqno is no smaller than this flushed seqno.
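
The idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FlushedSeqnoTracker`, `RecordFlush`, `LoadPersisted`, and `RecoveredEnough` are hypothetical names, and the plain-text file format is an assumption. In RocksDB the inputs would presumably come from `FlushJobInfo::largest_seqno` and `DB::GetLatestSequenceNumber()`, but that wiring is left out here.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper for illustration only; not the PR's implementation.
// Remembers the largest flushed sequence number and persists it to a file
// so that it survives a process crash.
class FlushedSeqnoTracker {
 public:
  explicit FlushedSeqnoTracker(std::string path) : path_(std::move(path)) {}

  // Record a completed flush. Only a strictly larger seqno is persisted.
  // Returns false if the value could not be written.
  bool RecordFlush(uint64_t largest_seqno) {
    std::lock_guard<std::mutex> lock(mu_);
    if (largest_seqno <= max_flushed_) return true;
    // Write to a temporary file, then rename over the real one, so a crash
    // mid-write does not leave a truncated value behind. (Assumes POSIX
    // rename semantics, where an existing target is replaced.)
    const std::string tmp = path_ + ".tmp";
    {
      std::ofstream out(tmp, std::ios::trunc);
      out << largest_seqno;
      out.flush();
      if (!out) return false;
    }
    if (std::rename(tmp.c_str(), path_.c_str()) != 0) return false;
    max_flushed_ = largest_seqno;
    return true;
  }

  // Read the persisted value back after a restart; 0 if nothing was saved.
  static uint64_t LoadPersisted(const std::string& path) {
    std::ifstream in(path);
    uint64_t seqno = 0;
    if (!(in >> seqno)) return 0;
    return seqno;
  }

 private:
  std::mutex mu_;
  std::string path_;
  uint64_t max_flushed_ = 0;
};

// The post-recovery check: the reopened DB's latest seqno must be at least
// the largest seqno that was known to be flushed before the crash.
inline bool RecoveredEnough(uint64_t latest_seqno_after_reopen,
                            uint64_t persisted_flushed_seqno) {
  return latest_seqno_after_reopen >= persisted_flushed_seqno;
}
```

On reopen, a test harness would call `LoadPersisted` and fail verification if `RecoveredEnough` returns false.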

Test:

  • Manually observe that the persisted sequence after flush completion is used to verify db's latest sequence
  • python3 tools/db_crashtest.py --simple blackbox --interval=30
  • CI

@facebook-github-bot
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from 6e5f737 to a4314e7 Compare June 21, 2024 21:06
@facebook-github-bot
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from a4314e7 to 4be510e Compare June 22, 2024 17:59
@hx235 hx235 requested a review from ajkr June 22, 2024 21:13
@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from 800b609 to 1dddd4a Compare July 22, 2024 22:34
@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from 1dddd4a to 9f98a86 Compare December 19, 2024 05:16

db_stress_tool/db_stress_listener.h (resolved)
db_stress_tool/db_stress_shared_state.h (outdated, resolved)
db_stress_tool/db_stress_shared_state.h (resolved)
db_stress_tool/db_stress_test_base.cc (outdated, resolved)
db_stress_tool/expected_state.cc (resolved)
db_stress_tool/expected_state.cc (outdated, resolved)
db_stress_tool/expected_state.cc (resolved)
db_stress_tool/expected_state.h (outdated, resolved)
db_stress_tool/db_stress_test_base.cc (resolved)
Contributor

@archang19 archang19 left a comment

This looks generally right to me.

I want to read through #12152 again to understand it better.

Some questions:

From #12152 (comment)

We should update the expected minimum value of that counter for (1) explicit flushes, and (2) OnFlushCompleted() events

Will a separate PR handle case 1? I only see case 2 being handled
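
For illustration, both cases could feed one monotonic lower bound, roughly as sketched below. All names here (`MinExpectedSeqno`, `OnExplicitFlushSucceeded`, `OnFlushCompletedEvent`) are hypothetical and not taken from db_stress. The sketch also assumes that a seqno sampled before a successful explicit flush is a valid lower bound on what that flush made durable, which may not hold for every column family and flush configuration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical type for illustration; not part of db_stress.
// Holds the minimum sequence number the DB is expected to recover to.
class MinExpectedSeqno {
 public:
  // Raise the bound to `seqno` if it is larger; the bound never decreases,
  // even when several threads report flushes concurrently.
  void RaiseTo(uint64_t seqno) {
    uint64_t cur = bound_.load(std::memory_order_acquire);
    while (cur < seqno &&
           !bound_.compare_exchange_weak(cur, seqno,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire)) {
      // compare_exchange_weak reloaded `cur`; retry while it is still smaller.
    }
  }

  uint64_t Get() const { return bound_.load(std::memory_order_acquire); }

 private:
  std::atomic<uint64_t> bound_{0};
};

// Case (1): an explicit flush returned successfully. The caller passes the
// seqno it sampled before issuing the flush (an assumption, see above).
inline void OnExplicitFlushSucceeded(MinExpectedSeqno& bound,
                                     uint64_t seqno_sampled_before_flush) {
  bound.RaiseTo(seqno_sampled_before_flush);
}

// Case (2): a flush-completed event reported the largest flushed seqno.
inline void OnFlushCompletedEvent(MinExpectedSeqno& bound,
                                  uint64_t largest_flushed_seqno) {
  bound.RaiseTo(largest_flushed_seqno);
}
```

Routing both paths through a single raise-only update means the order in which the two kinds of events arrive does not matter.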

From #12152 (comment)

WAL-disabled, atomic_flush-disabled may be worth testing at some future point but that is still an open topic to discuss. My thoughts are here - #11841 (comment).

Another case that may be worth testing at some point is WAL-disabled, atomic_flush-enabled, and manual flush triggered on a subset of column families (what you suggested we are already doing).

What is the current thinking on how many of these WAL {enabled, disabled}, atomic flush {enabled, disabled} combinations we will try to test?

@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from 9f98a86 to 18d84d1 Compare December 20, 2024 04:26
@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from 18d84d1 to c26ae4d Compare December 20, 2024 04:53
@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from c26ae4d to a8d9041 Compare December 20, 2024 04:55

Contributor

@archang19 archang19 left a comment

LGTM. I think we should be fine as long as we run the crash test for enough time before merging.

@hx235 hx235 force-pushed the verify_flushed_data_recovery branch from a8d9041 to fd22753 Compare December 24, 2024 23:51

@facebook-github-bot
Contributor

@hx235 merged this pull request in e3024e7.

3 participants