-
Notifications
You must be signed in to change notification settings - Fork 391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Various preparations in call tree code for fast inverted tree #4897
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #4897 +/- ##
=======================================
Coverage 88.42% 88.43%
=======================================
Files 304 304
Lines 27176 27181 +5
Branches 7334 7325 -9
=======================================
+ Hits 24030 24037 +7
+ Misses 2928 2926 -2
Partials 218 218 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so it was fairly recent after all! Thanks for finding it!
src/profile-logic/call-tree.js
Outdated
const leafBefore = callNodeLeaf[callNodeIndex]; | ||
const leafAfter = leafBefore + weight; | ||
callNodeLeaf[callNodeIndex] = leafAfter; | ||
rootTotalSummary += abs(leafAfter) - abs(leafBefore); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why you changed from the previous rootTotalSummary += abs(weight)
? It seems to me that the new computation may yield a different result if weight is negative but the leaf values are positive (the difference would be negative, and we'd subtract some value from rootTotalSummary
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said we don't seem to use it in the compare view. Anyway doing the difference of the time of adjacent samples doesn't make sense really in a merged thread.
But I'd still like to understand the change, if you remember why you did it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I agree this deserves some justification.
This was an optimization for the performance of dragging a preview selection. Computing the rootTotalSummary was done as follows: sum over callNodeIndex of abs(callNodeSelf[callNodeIndex]). But when dragging a preview selection, many call nodes have zero leaf time. So it was faster to compute the sum per sample rather than per call node, because it avoids iterating over all the zero-time call nodes.
Actually, writing it out like this, it's possible that this is just a tradeoff; for profiles with fewer call nodes but lots more samples, maybe the old way to compute it is faster.
Before:
weights: [1, 2, 3], leaf: 0+1=1, 1+2=3, 3+3=6, abs: 6
weights: [1, 1, -2], leaf: 0+1=1, 1+1=2, 2+(-2)=0, abs: 0
weights: [1, 3, -2], leaf: 0+1=1, 1+3=4, 4+(-2)=2, abs: 2
weights: [-4, 5, -2], leaf: 0+(-4)=-4, -4+5=1, 1+(-2)=-1, abs: 1
After:
weights: [1, 2, 3], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+2) - abs(1)) = 3, 3 + (abs(3+3) - abs(3)) = 6
weights: [1, 1, -2], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+1) - abs(1)) = 2, 2 + (abs(2+(-2)) - abs(2)) = 0
weights: [1, 3, -2], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+3) - abs(1)) = 4, 4 + (abs(4+(-2)) - abs(4)) = 2
weights: [-4, 5, -2], absleaf: 0 + (abs(0+(-4)) - abs(0)) = 4, 4 + (abs(-4+5) - abs(-4)) = 1, 1 + (abs(1+(-2)) - abs(1)) = 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to revert this change. We can still make the change once the fast-invert work is done, it's quite orthogonal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said we don't seem to use it in the compare view.
What did you mean by this? We don't use the traced timing in the compare view (because we detect that sample weights are present), but we do use rootTotalSummary (from the sample weights) in the diff call tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, I totally missed that we weren't looping over the same table. And I was confused with callNodeIndex vs sampleIndex too. (not enough sleep)
That said we don't seem to use it in the compare view.
What did you mean by this? We don't use the traced timing in the compare view (because we detect that sample weights are present), but we do use rootTotalSummary (from the sample weights) in the diff call tree.
I just wasn't sure where we used it, But I guess we use it to get the percentages in the call tree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly, this is the value that the percentages are relative to.
This became unused in firefox-devtools@e0edc96 , part of firefox-devtools#4807.
…bar. With the fast inverted tree implementation, getting the running/total for an inverted call node will require a function call and not just a lookup in a big array. This is a step towards that world.
This will let us add a second implementation of CallTreeInternal for the inverted tree. What's still in CallTree will be used for both the inverted and the non-inverted tree.
f764cbc
to
10074e3
Compare
for (let callNodeIndex = 0; callNodeIndex < callNodeCount; callNodeIndex++) { | ||
rootTotalSummary += abs(callNodeLeaf[callNodeIndex]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this could have stayed in the loop over the call nodes in computeCallTreeTimings like this was before? Now we're looping over call nodes twice :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. I did it this way so that rebasing #4900 would be easier.
[@canova Nazım Can Altınova] Return null instead of throwing an error if it fails to get the resource name in active tab view (#4905) [@fqueze Florian Quèze] Improve the precision of the getFittedText implementation. (#4893) [@fqueze Florian Quèze] Show marker count next to the marker name in the marker chart when a marker is hovered. (#4892) [@mstange Markus Stange] Convert some code to use the non-inverted call node table even while we're showing the inverted call tree (Merge PR #4899) [@canova Nazım Can Altınova] Bring the max stacking depth limit back for marker timings (#4898) [@mstange Markus Stange] Various preparations in call tree code for fast inverted tree (Merge PR #4897) [@mstange Markus Stange] Move call node path/index conversion functions into CallNodeInfo (Merge PR #4896) [@mstange Markus Stange] Make mergeFunction fast (Merge PR #4895) [@gthb Gunnlaugur Thor Briem] feat: support from-url profiles in compare (#4875) And all the locale contributors: de: Michael Köhler el: Jim Spentzos en-GB: Ian Neal es-CL: ravmn fr: Théo Chevalier fy-NL: Fjoerfoks ia: Melo46 it: Francesco Lodolo [:flod] nl: Fjoerfoks, Mark Heijl pt-BR: Marcelo Ghelman ru: Valery Ledovskoy sv-SE: Andreas Pettersson uk: Artem Polivanchuk zh-CN: 你我皆凡人, Olvcpr423, Pontoon zh-TW: Pin-guang Chen
This PR is on top of #4896.
This PR contains a collection of commits which move things around in
call-tree.js
and friends, to make the upcoming "fast inverted tree" PR smaller and easier to review. This PR can land first and will preserve the current performance characteristics.