-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RavenDB-22088 - reduce the lock contention in SegmentReader #67
RavenDB-22088 - reduce the lock contention in SegmentReader #67
Conversation
lock (this) | ||
{ | ||
// This is initialized in the constructor. | ||
// The only time it isn't initialized is when termsIndexDivisor equals -1, which we don't use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we don't use? Can you elaborate on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TermInfosReader
holds an array of sorted, cached terms and their positions (note that this cache does not include all terms) to efficiently locate the correct term during searches.
The algorithm is roughly the following:
We use binary search to find the position of the closest term within the sorted cache.
private int GetIndexOffset(Term term) |
We then iterate until we find the exact match.
while (scanBuffer.CompareTo(termBuffer) > 0 && Next(state)) |
The size of the cache is determined by the termsIndexDivisor
. A larger number creates a smaller cache, requiring more time to find the correct term but reducing memory usage.
The default value is 1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard for me understand the answer. I was rather expecting you to tell my why is it the case that in our scenarios termsIndexDivisor
is never -1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm that it can never be -1
:
lucenenet/src/Lucene.Net/Index/IndexWriter.cs
Line 1522 in 34bc333
if (value <= 0) |
{ | ||
// This is initialized in the constructor. | ||
// The only time it isn't initialized is when termsIndexDivisor equals -1, which we don't use. | ||
var reader = tis; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if tis
will be already disposed? As I see it can be disposed by DecRef()
and since here we're no longer under the lock that seems to be possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can still get a reference to it and after that it will be disposed.
There is no protection against it in the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current impl if tis
is null
(which is the case after it got disposed:
lucenenet/src/Lucene.Net/Index/SegmentReader.cs
Lines 261 to 266 in 34bc333
if (tis != null) | |
{ | |
tis.Dispose(); | |
// null so if an app hangs on to us we still free most ram | |
tis = null; | |
} |
then we return tisNoIndex
.
After the change made here we can return tis
while DecRef()
will dispose it (don't know if those can be executed concurrently but I believe they can since both use lock()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get a reference to it (when calling GetTermsReader
).
And after that it can be disposed, so you'll be holding a disposed object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code doesn't protect you from getting a tis
this might be disposed afterwards.
return tis != null; | ||
} | ||
{ | ||
// If this is null, we call LoadTermsIndex, which uses a lock. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that LoadTermsIndex
has a comment "This method is not fully thread safe, and relies on the synchronization in IndexWriter". I wonder if we can relay on LoadTermsIndex
and the lock we have there, especially after the change you made there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lock is there because this method isn't thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And your change made it even more not thread save, didn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lock
was added there along with the comment.
Maybe I'm missing something but this the lock makes this method thread safe.
@@ -310,7 +317,10 @@ internal void DecRef() | |||
} | |||
|
|||
internal void OpenDocStores(SegmentInfo si, IState state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this method also affected (and you saw that in the stacktraces) or you modified it "just in case" because it might have the same issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added Console.WriteLine
in all methods in order to see which ones are being used.
I don't see a reason to hold the lock if fieldsReaderOrig != null
.
We talked with Grisha about all my comments. It looks fine although it's a risky change IMO. We'll see how it'll work on RavenDB's tests. |
No description provided.