-
Notifications
You must be signed in to change notification settings - Fork 803
[SYCL][L0] Rework how we maintain per-thread queue groups #8896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Sergey V Maslov <[email protected]>
| auto TID = std::this_thread::get_id(); | ||
| auto &InitialGroup = Queue->ComputeQueueGroupsByTID.begin()->second; | ||
| const auto &Result = | ||
| Queue->ComputeQueueGroupsByTID.insert({TID, InitialGroup}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was doing wrong it used the same ZeQueues for the new entry so it is then was destroyed multiple times causing a segfault. Now, without per-thread setting we don't even create multiple entries, and we do per-thread then the new entry is properly initialized here: https://github.com/intel/llvm/pull/8896/files#diff-1f19baf7cc76d94bf4a1819c81e7b34292fb3ce215cd50873e8c9236cc73601cR501
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature was for immediate command lists only. So the zeQueues should have been unallocated. Have you extended it for command queues also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now the vector of ZeQueues is now separate for each entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See here, new entries are initialized with null: https://github.com/intel/llvm/pull/8896/files#diff-1f19baf7cc76d94bf4a1819c81e7b34292fb3ce215cd50873e8c9236cc73601cR501
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still for immediate command-lists only, but it is safer to initialize ZeQueues to null for safety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that value should have been a null, and the zecommandqueuedestroy should not have been called. But it did get called, which means that the vector of zequeus has to be explicitly set to nullptrs. OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was nullptr but then a call togetZeQueue created a non-null one: https://github.com/intel/llvm/pull/8896/files#diff-15dd1eb076d2164bd9e87d9057f05f652a716498e8cdf5975e564c65309a0985L2843
That non-null value was then copied to yet a new entry created for another thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The perThread and getNative don't mix well. Say you have a queue that uses multiple cmd lists, one per thread. Someone calls getNative, makes his own queue from it and assumes that a sync on that queue is also a sync on the original queue. But it isn't because only one cmd list will be synch'ed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the failure was even without per-thread requested. We still created multiple redundant entries, which additionally held the same queues. I don't think that syncing a queue returned by a get_native() gives any guarantees about the entire SYCL queue being synced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That insert was supposed to be "insert if not found".
Signed-off-by: Sergey V Maslov <[email protected]>
No description provided.