Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various preparations in call tree code for fast inverted tree #4897

Merged
merged 8 commits into from
Jan 23, 2024

Conversation

mstange
Copy link
Contributor

@mstange mstange commented Jan 22, 2024

This PR is on top of #4896.

This PR contains a collection of commits which move things around in call-tree.js and friends, to make the upcoming "fast inverted tree" PR smaller and easier to review. This PR can land first and will preserve the current performance characteristics.

@mstange mstange requested a review from julienw January 22, 2024 22:38
@mstange mstange self-assigned this Jan 22, 2024
Copy link

codecov bot commented Jan 22, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (0f047ca) 88.42% compared to head (f764cbc) 88.43%.

❗ Current head f764cbc differs from pull request most recent head 10074e3. Consider uploading reports for the commit 10074e3 to get more accurate results

Files Patch % Lines
src/components/flame-graph/FlameGraph.js 71.42% 2 Missing ⚠️
src/components/stack-chart/index.js 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4897   +/-   ##
=======================================
  Coverage   88.42%   88.43%           
=======================================
  Files         304      304           
  Lines       27176    27181    +5     
  Branches     7334     7325    -9     
=======================================
+ Hits        24030    24037    +7     
+ Misses       2928     2926    -2     
  Partials      218      218           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@julienw julienw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: you removed _rootCount in e0edc96, part of #4807 :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so it was fairly recent after all! Thanks for finding it!

const leafBefore = callNodeLeaf[callNodeIndex];
const leafAfter = leafBefore + weight;
callNodeLeaf[callNodeIndex] = leafAfter;
rootTotalSummary += abs(leafAfter) - abs(leafBefore);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why you changed from the previous rootTotalSummary += abs(weight)? It seems to me that the new computation may yield a different result if weight is negative but the leaf values are positive (the difference would be negative, and we'd subtract some value from rootTotalSummary).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said we don't seem to use it in the compare view. Anyway doing the difference of the time of adjacent samples doesn't make sense really in a merged thread.
But I'd still like to understand the change, if you remember why you did it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I agree this deserves some justification.

This was an optimization for the performance of dragging a preview selection. Computing the rootTotalSummary was done as follows: sum over callNodeIndex of abs(callNodeSelf[callNodeIndex]). But when dragging a preview selection, many call nodes have zero leaf time. So it was faster to compute the sum per sample rather than per call node, because it avoids iterating over all the zero-time call nodes.

Actually, writing it out like this, it's possible that this is just a tradeoff; for profiles with fewer call nodes but lots more samples, maybe the old way to compute it is faster.

Before:
weights: [1, 2, 3], leaf: 0+1=1, 1+2=3, 3+3=6, abs: 6
weights: [1, 1, -2], leaf: 0+1=1, 1+1=2, 2+(-2)=0, abs: 0
weights: [1, 3, -2], leaf: 0+1=1, 1+3=4, 4+(-2)=2, abs: 2
weights: [-4, 5, -2], leaf: 0+(-4)=-4, -4+5=1, 1+(-2)=-1, abs: 1

After:
weights: [1, 2, 3], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+2) - abs(1)) = 3, 3 + (abs(3+3) - abs(3)) = 6
weights: [1, 1, -2], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+1) - abs(1)) = 2, 2 + (abs(2+(-2)) - abs(2)) = 0
weights: [1, 3, -2], absleaf: 0 + (abs(0+1) - abs(0)) = 1, 1 + (abs(1+3) - abs(1)) = 4, 4 + (abs(4+(-2)) - abs(4)) = 2
weights: [-4, 5, -2], absleaf: 0 + (abs(0+(-4)) - abs(0)) = 4, 4 + (abs(-4+5) - abs(-4)) = 1, 1 + (abs(1+(-2)) - abs(1)) = 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to revert this change. We can still make the change once the fast-invert work is done, it's quite orthogonal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said we don't seem to use it in the compare view.

What did you mean by this? We don't use the traced timing in the compare view (because we detect that sample weights are present), but we do use rootTotalSummary (from the sample weights) in the diff call tree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, I totally missed that we weren't looping over the same table. And I was confused with callNodeIndex vs sampleIndex too. (not enough sleep)

That said we don't seem to use it in the compare view.

What did you mean by this? We don't use the traced timing in the compare view (because we detect that sample weights are present), but we do use rootTotalSummary (from the sample weights) in the diff call tree.

I just wasn't sure where we used it, But I guess we use it to get the percentages in the call tree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, this is the value that the percentages are relative to.

src/profile-logic/call-tree.js Outdated Show resolved Hide resolved
…bar.

With the fast inverted tree implementation, getting the running/total
for an inverted call node will require a function call and not just a
lookup in a big array.
This is a step towards that world.
This will let us add a second implementation of CallTreeInternal for the
inverted tree. What's still in CallTree will be used for both the inverted
and the non-inverted tree.
@mstange mstange merged commit 9b4b26b into firefox-devtools:main Jan 23, 2024
15 of 16 checks passed
Comment on lines +519 to +521
for (let callNodeIndex = 0; callNodeIndex < callNodeCount; callNodeIndex++) {
rootTotalSummary += abs(callNodeLeaf[callNodeIndex]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this could have stayed in the loop over the call nodes in computeCallTreeTimings like this was before? Now we're looping over call nodes twice :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. I did it this way so that rebasing #4900 would be easier.

@julienw julienw mentioned this pull request Feb 6, 2024
julienw added a commit that referenced this pull request Feb 6, 2024
[@canova Nazım Can Altınova] Return null instead of throwing an error if it fails to get the resource name in active tab view (#4905)
[@fqueze Florian Quèze] Improve the precision of the getFittedText implementation. (#4893)
[@fqueze Florian Quèze] Show marker count next to the marker name in the marker chart when a marker is hovered. (#4892)
[@mstange Markus Stange] Convert some code to use the non-inverted call node table even while we're showing the inverted call tree (Merge PR #4899)
[@canova Nazım Can Altınova] Bring the max stacking depth limit back for marker timings (#4898)
[@mstange Markus Stange] Various preparations in call tree code for fast inverted tree (Merge PR #4897)
[@mstange Markus Stange] Move call node path/index conversion functions into CallNodeInfo (Merge PR #4896)
[@mstange Markus Stange] Make mergeFunction fast (Merge PR #4895)
[@gthb Gunnlaugur Thor Briem] feat: support from-url profiles in compare (#4875)

And all the locale contributors:
de: Michael Köhler
el: Jim Spentzos
en-GB: Ian Neal
es-CL: ravmn
fr: Théo Chevalier
fy-NL: Fjoerfoks
ia: Melo46
it: Francesco Lodolo [:flod]
nl: Fjoerfoks, Mark Heijl
pt-BR: Marcelo Ghelman
ru: Valery Ledovskoy
sv-SE: Andreas Pettersson
uk: Artem Polivanchuk
zh-CN: 你我皆凡人, Olvcpr423, Pontoon
zh-TW: Pin-guang Chen
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants