Fix possible out-of-order/inconsistent seqno-to-time mapping #13279

pdillinger · 2025-01-08T00:06:38Z

Summary: The crash test with COERCE_CONTEXT_SWITCH=1 is showing a failure:

db_stress: db/seqno_to_time_mapping.cc:480: bool rocksdb::SeqnoToTimeMapping::Append(rocksdb::SequenceNumber, uint64_t): Assertion `false' failed.

with DBImpl::SetOptions() in the call stack. This assertion and those around it are mostly there for catching systematic problems with recording the mappings, as small imprecisions here and there are not a problem in production. Nevertheless, we need to fix this to maintain the assertions for catching possible future systematic problems.

Because the seqno and time are acquired before holding the DB mutex, there could be a race where T1 acquires latest seqno, T2 acquires latest seqno, T2 acquires unix time, T1 acquires unix time. The resulting entries would not just be saved out-of-order, but would represent an inconsistent (time traveling) mapping if they were saved: T1 would pair the older seqno with the newer time, and T2 the newer seqno with the older time.

We can fix this by getting the seqno and unix times while under the mutex. (Hopefully this is not caused by non-monotonic clock adjustments.)
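As a rough illustration of the race and the fix, here is a minimal standalone sketch. It is not the RocksDB code: `ToyMapping`, `ToyDb`, `RecordUnderMutex`, and `ReplayRacyInterleaving` are hypothetical names, the clock is a plain counter, and `Append` returns `false` where the real code would hit an assertion. The replay function walks through the interleaving described above single-threaded, so the outcome is deterministic.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a seqno-to-time mapping; not RocksDB's class.
class ToyMapping {
 public:
  // Rejects a pair that goes backwards in seqno or in time.
  bool Append(uint64_t seqno, uint64_t time) {
    if (!pairs_.empty() &&
        (seqno < pairs_.back().first || time < pairs_.back().second)) {
      return false;
    }
    pairs_.emplace_back(seqno, time);
    return true;
  }
  std::size_t size() const { return pairs_.size(); }

 private:
  std::vector<std::pair<uint64_t, uint64_t>> pairs_;
};

// Hypothetical DB state: a seqno counter and a fake clock (assumptions).
struct ToyDb {
  std::mutex mu;
  uint64_t latest_seqno = 0;
  uint64_t clock = 0;
  ToyMapping mapping;

  // Fixed pattern: sample both values and append in one critical section,
  // so no other recorder can interleave between the two reads.
  bool RecordUnderMutex() {
    std::lock_guard<std::mutex> lock(mu);
    return mapping.Append(latest_seqno, clock);
  }
};

// Replays the racy interleaving step by step (no real threads needed).
// Returns true only if both appends are accepted.
bool ReplayRacyInterleaving(bool t1_appends_first) {
  ToyDb db;
  db.latest_seqno = 10;
  db.clock = 100;
  uint64_t t1_seqno = db.latest_seqno;  // T1 acquires latest seqno (10)
  db.latest_seqno = 20;                 // writes advance the seqno
  uint64_t t2_seqno = db.latest_seqno;  // T2 acquires latest seqno (20)
  uint64_t t2_time = db.clock;          // T2 acquires unix time (100)
  db.clock = 101;                       // time passes
  uint64_t t1_time = db.clock;          // T1 acquires unix time (101)

  bool first_ok;
  bool second_ok;
  if (t1_appends_first) {
    first_ok = db.mapping.Append(t1_seqno, t1_time);   // (10, 101)
    second_ok = db.mapping.Append(t2_seqno, t2_time);  // (20, 100): time goes back
  } else {
    first_ok = db.mapping.Append(t2_seqno, t2_time);   // (20, 100)
    second_ok = db.mapping.Append(t1_seqno, t1_time);  // (10, 101): seqno goes back
  }
  return first_ok && second_ok;
}
```

In this sketch the racy replay fails in either append order, which is the "inconsistent" part: no reordering of the two entries makes them agree. Calls to `RecordUnderMutex()` cannot produce such a pair as long as the seqno and clock themselves never go backwards.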

Test Plan: local run blackbox_crash_test with COERCE_CONTEXT_SWITCH=1. This is not really a production concern, and the conditions are not really reproducible in a unit test after the fix.

facebook-github-bot · 2025-01-08T00:09:07Z

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cbi42

LGTM

facebook-github-bot · 2025-01-08T02:30:34Z

@pdillinger merged this pull request in b341dc8.

pdillinger requested a review from cbi42 January 8, 2025 00:06

facebook-github-bot added the CLA Signed label Jan 8, 2025

cbi42 approved these changes Jan 8, 2025

facebook-github-bot closed this in b341dc8 Jan 8, 2025

facebook-github-bot added the Merged label Jan 8, 2025
