Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RavenDB-22088 - reduce the lock contention in SegmentReader #67

Merged
merged 2 commits into from
Jun 17, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 21 additions & 11 deletions src/Lucene.Net/Index/SegmentReader.cs
Original file line number Diff line number Diff line change
Expand Up @@ -191,10 +191,16 @@ internal Directory GetCFSReader()
return cfsReader;
}
}

internal TermInfosReader GetTermsReader()
{
lock (this)
{
// This is initialized in the constructor.
// The only time it isn't initialized is when termsIndexDivisor equals -1, which we don't use.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we don't use? Can you elaborate on that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TermInfosReader holds an array of sorted, cached terms and their positions (note that this cache does not include all terms) to efficiently locate the correct term during searches.

The algorithm is roughly the following:
We use binary search to find the position of the closest term within the sorted cache.

private int GetIndexOffset(Term term)

We then iterate until we find the exact match.
while (scanBuffer.CompareTo(termBuffer) > 0 && Next(state))

The size of the cache is determined by the termsIndexDivisor. A larger number creates a smaller cache, requiring more time to find the correct term but reducing memory usage.
The default value is 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard for me understand the answer. I was rather expecting you to tell my why is it the case that in our scenarios termsIndexDivisor is never -1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that it can never be -1:

var reader = tis;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if tis will be already disposed? As I see it can be disposed by DecRef() and since here we're no longer under the lock that seems to be possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can still get a reference to it and after that it will be disposed.
There is no protection against it in the current implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current impl if tis is null (which is the case after it got disposed:

if (tis != null)
{
tis.Dispose();
// null so if an app hangs on to us we still free most ram
tis = null;
}
)

then we return tisNoIndex.

After the change made here we can return tis while DecRef() will dispose it (don't know if those can be executed concurrently but I believe they can since both use lock())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get a reference to it (when calling GetTermsReader).
And after that it can be disposed, so you'll be holding a disposed object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code doesn't protect you from getting a tis this might be disposed afterwards.

if (reader != null)
return reader;

lock (this)
{
if (tis != null)
{
Expand All @@ -208,20 +214,21 @@ internal TermInfosReader GetTermsReader()
}

internal bool TermsIndexIsLoaded()
{
lock (this)
{
return tis != null;
}
{
// If this is null, we call LoadTermsIndex, which uses a lock.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that LoadTermsIndex has a comment "This method is not fully thread safe, and relies on the synchronization in IndexWriter". I wonder if we can relay on LoadTermsIndex and the lock we have there, especially after the change you made there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock is there because this method isn't thread safe.

Copy link
Member

@arekpalinski arekpalinski Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And your change made it even more not thread save, didn't it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock was added there along with the comment.
Maybe I'm missing something but this the lock makes this method thread safe.

return tis != null;
}

// NOTE: only called from IndexWriter when a near
// real-time reader is opened, or applyDeletes is run,
// sharing a segment that's still being merged. This
// method is not fully thread safe, and relies on the
// synchronization in IndexWriter
internal void LoadTermsIndex(SegmentInfo si, int termsIndexDivisor, IState state)
{
internal void LoadTermsIndex(SegmentInfo si, int termsIndexDivisor, IState state)
{
if (tis != null)
return;

lock (this)
{
if (tis == null)
Expand Down Expand Up @@ -310,7 +317,10 @@ internal void DecRef()
}

internal void OpenDocStores(SegmentInfo si, IState state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this method also affected (and you saw that in the stacktraces) or you modified it "just in case" because it might have the same issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added Console.WriteLine in all methods in order to see which ones are being used.
I don't see a reason to hold the lock if fieldsReaderOrig != null.

{
{
if (fieldsReaderOrig != null)
return;

lock (this)
{

Expand Down
Loading