CR-1202587: Fixing xbutil validate with ml_timeline on client #8277
Problem solved by the commit
When an application used multiple hardware contexts with ml_timeline turned on, it hit an error while creating the second hw_context, because the ml_timeline plugin kept a reference to the hw_context shared pointer and never released it.
This change resolves the issue by no longer caching the hw_context; the plugin now creates one on the fly when it needs it.
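The lifetime problem described above can be illustrated with a minimal, self-contained sketch. It is not the XRT code: `HwContext`, `CachingPlugin`, `OnDemandPlugin`, and `live_count` are hypothetical names invented for this example, and the sketch only assumes that the context is shared through `std::shared_ptr`, as the description states.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter of live contexts (illustration only, not an XRT API).
inline int& live_count() { static int n = 0; return n; }

// Hypothetical stand-in for a hardware context.
struct HwContext {
  HwContext()  { ++live_count(); }
  ~HwContext() { --live_count(); }
  HwContext(const HwContext&) = delete;
  HwContext& operator=(const HwContext&) = delete;
};

// Old pattern: the plugin caches the shared pointer, so the context
// stays alive for as long as the plugin does.
struct CachingPlugin {
  std::shared_ptr<HwContext> cached;
  void on_context_created(const std::shared_ptr<HwContext>& ctx) { cached = ctx; }
};

// New pattern: the plugin obtains a context only for the duration of a flush.
struct OnDemandPlugin {
  template <typename Factory>
  void flush(Factory make_ctx) {
    std::shared_ptr<HwContext> ctx = make_ctx();
    // ... read buffers through ctx here ...
  } // ctx goes out of scope and the context is released
};

// Returns how many contexts are still alive after the application
// has dropped its own reference while the caching plugin still exists.
int live_after_app_release_with_cache() {
  CachingPlugin plugin;
  {
    auto ctx = std::make_shared<HwContext>();
    plugin.on_context_created(ctx);
  } // application reference dropped; the plugin's copy keeps the context alive
  return live_count();
}

// Returns how many contexts are still alive after an on-demand flush.
int live_after_on_demand_flush() {
  OnDemandPlugin plugin;
  plugin.flush([] { return std::make_shared<HwContext>(); });
  return live_count();
}
```

In this sketch the cached pointer keeps one context alive after the application has let go of it, while the on-demand variant leaves none behind once the flush returns.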
Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered
The bug is a side effect of driver changes to the debug buffer write-back mechanism, and it only appears when ml_timeline is turned on together with other debug features. It was discovered through regression testing.
How problem was solved, alternative solutions (if any) and why they were rejected
This is a temporary solution. The ml_timeline plugin cached the hw_context because it needs one when AIE profiling and/or AIE debug are enabled together with ml_timeline. In that configuration, ml_timeline must flush its data before profiling or debug can fetch values. Because those features can ask for data at any point during execution, we could not rely on a user-side hook to give us a live hw_context at the moment we need to flush.
We will have to revisit this with a different solution for that use case in the future.
Risks (if any) associated with the changes in the commit
High risk to the ml_timeline feature, as this changes how data is flushed from the device.
What has been tested and how, request additional testing if necessary
Testing in progress on the original failing test case and other designs.
Documentation impact (if any)
No documentation impact.