-
Notifications
You must be signed in to change notification settings - Fork 3.4k
With FILE based SFT, the list of HFiles we maintain in .filelist is n… #7361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
sharmaar12
wants to merge
1
commit into
apache:HBASE-29081
Choose a base branch
from
sharmaar12:sft_issue
base: HBASE-29081
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+8
−2
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…ot getting updated for read replica Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29611 Description: Steps to Repro (For detailed steps, check JIRA): - Create two clusters on the same storage location. - Create table on active, then refresh meta on the read replica to get the table meta data updated. - Add some rows and flush on the active cluster, do refresh_hfiles on the read replica and scan table. - If you now again add the rows in the table on active and do refresh_hfiles then the rows added are not visible in the read replica. Cause: The refresh store file is a two step process: 1. Load the existing store file from the .filelist (choose the file with higher timestamp for loading) 2. refresh store file internals (clean up old/compacted files, replace store file in .filelist) In the current scenario, what is happening is that for the first time read-replica is loading the list of Hfiles from the file in .filelist created by active cluster but then it is creating the new file with greater timestamp. Now we have two files in .filelist. On the subsequent flush from active the file in .filelist created by the active gets updated but the file created by read-replica is not. While loading in the refresh_hfiles as we take the file with higher timestamp the file created by read-replica for the first time gets loaded which does not have an updated list of hfiles. Fix: As we just wanted the file from active to be loaded anytime we perform refresh store files, we must not create a new file in the .filelist from the read-replica, in this way we will stop the timestamp mismatch. Also we don't want to initialize the tracker file (StoreFileListFile.java:load()) from read-replica as we are not writing it hence we have added check for read only property in StoreFileTrackerBase.java:load()
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
…ot getting updated for read replica
Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29611
Description:
Steps to Repro (For detailed steps, check JIRA):
Cause:
The refresh store file is a two step process:
In the current scenario, what is happening is that for the first time read-replica is loading the list of Hfiles from the file in .filelist created by active cluster but then it is creating the new file with greater timestamp. Now we have two files in .filelist. On the subsequent flush from active the file in .filelist created by the active gets updated but the file created by read-replica is not. While loading in the refresh_hfiles as we take the file with higher timestamp the file created by read-replica for the first time gets loaded which does not have an updated list of hfiles.
Fix:
As we just wanted the file from active to be loaded anytime we perform refresh store files, we must not create a new file in the .filelist from the read-replica, in this way we will stop the timestamp mismatch.
Also we don't want to initialize the tracker file (StoreFileListFile.java:load()) from read-replica as we are not writing it hence we have added check for read only property in StoreFileTrackerBase.java:load()