-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
performance observations while interacting with a 23k-tip tree #1878
Labels
Comments
jameshadfield
added a commit
that referenced
this issue
Oct 28, 2024
For big trees, attempting to animate changes results in poor performance and we're better off making all SVG changes in a single pass. "big" here depends on both the browser's performance and the size of the tree. Here we just use the latter as it's simpler to code, but in the future we may measure performance as the user interacts with the viz and update these flags accordingly. (This is one of the reasons we calculate these in middleware.) For future, we should add nuance to this performance flag. For instance, viewing a 2k tree and zooming into a clade has poor performance and we'd be better off skipping the animation, however once in a small clade it works well. For that same tree (viewing all 2k tips) filtering to a clade still looks good with the animation. 4000 tips is arbitrary, it was chosen after testing a variety of datasets locally. Note that there is a big difference between not setting a transition (within d3) and setting a transition of 0ms - the former is much faster. See <#1878> for more details here. I didn't implement this change for `modifySVGInStages` as it uses a more complex scheduling system; but the performance here is terrible for really large trees so we should revisit this in future work. Performance changes ------------------- Auspice built in production mode on Chrome 130.0 on a Apple M3 pro chip. Datasets: a large ncov tree (23k tips) (see <#1878> for more), and a flu tree of 2k tips. Both only rendering the tree panel. Flu performance is unchanged, as 2k tips isn't enough to trigger this performance toggle. For posterity: * Zooming into clade 2a. Initial frame ~160ms, then 30fps. * Filtering to clade 2a. Initial frame ~115ms, then 60fps. * Animation: 60fps ncov performance: * Zoom into clade 21M. * Previously: Initial frame 1500ms, second (final frame) 430ms later, main blocked for a further 700ms * Now: Single frame of 921ms. Main blocked for a further 550ms. * Change: Time to correct tree reduced by ~1000ms (50%). Time until main is no longer blocked: reduced by ~1100ms (45%). * Filter to clade 21M. * Previously: Initial frame 1050ms, second frame 930ms later, main blocked for a further 660ms * Now: Single frame of 1018ms, main blocked for a further 660ms * Change: Time to correct tree reduced by ~950ms (50%). Time until main is no longer blocked: reduced by ~950ms (35%). * Animate. This is unchanged - Auspice sets the transition time to 0 (see end of this commit message). We get around 4 frames at 70ms each (14fps) then a long commit phase blocking for ~650ms. For unknown reasons we trigger a timer flush but it doesn't show up in the performance snapshot, which is why these times are unchanged by this commit. Note that the timings for the initial frame are slightly variable in chrome. The JS time / blinkng pipeline parts shown on the main thread seem pretty consistent, but when the frame is drawn (relative to when the commit phase starts) is more variable. So take the above results with a grain of salt. P.S. there is a somewhat parallel implementation of this concept in Auspice where PhyloTree sets a transition time of zero if subsequent updates come in quick enough, but that is buggy. Firstly, if an update takes a long time the main thread will be blocked so the next update will be indistinguishable from one which was triggered an acceptable time later. Secondly, it only works for animation where we have a stream of updates.
This was referenced Oct 28, 2024
jameshadfield
added a commit
that referenced
this issue
Nov 6, 2024
For big trees, attempting to animate changes results in poor performance and we're better off making all SVG changes in a single pass. "big" here depends on both the browser's performance and the size of the tree. Here we just use the latter as it's simpler to code, but in the future we may measure performance as the user interacts with the viz and update these flags accordingly. (This is one of the reasons we calculate these in middleware.) For future, we should add nuance to this performance flag. For instance, viewing a 2k tree and zooming into a clade has poor performance and we'd be better off skipping the animation, however once in a small clade it works well. For that same tree (viewing all 2k tips) filtering to a clade still looks good with the animation. 4000 tips is arbitrary, it was chosen after testing a variety of datasets locally. Note that there is a big difference between not setting a transition (within d3) and setting a transition of 0ms - the former is much faster. See <#1878> for more details here. I didn't implement this change for `modifySVGInStages` as it uses a more complex scheduling system; but the performance here is terrible for really large trees so we should revisit this in future work. Performance changes ------------------- Auspice built in production mode on Chrome 130.0 on a Apple M3 pro chip. Datasets: a large ncov tree (23k tips) (see <#1878> for more), and a flu tree of 2k tips. Both only rendering the tree panel. Flu performance is unchanged, as 2k tips isn't enough to trigger this performance toggle. For posterity: * Zooming into clade 2a. Initial frame ~160ms, then 30fps. * Filtering to clade 2a. Initial frame ~115ms, then 60fps. * Animation: 60fps ncov performance: * Zoom into clade 21M. * Previously: Initial frame 1500ms, second (final frame) 430ms later, main blocked for a further 700ms * Now: Single frame of 921ms. Main blocked for a further 550ms. * Change: Time to correct tree reduced by ~1000ms (50%). Time until main is no longer blocked: reduced by ~1100ms (45%). * Filter to clade 21M. * Previously: Initial frame 1050ms, second frame 930ms later, main blocked for a further 660ms * Now: Single frame of 1018ms, main blocked for a further 660ms * Change: Time to correct tree reduced by ~950ms (50%). Time until main is no longer blocked: reduced by ~950ms (35%). * Animate. This is unchanged - Auspice sets the transition time to 0 (see end of this commit message). We get around 4 frames at 70ms each (14fps) then a long commit phase blocking for ~650ms. For unknown reasons we trigger a timer flush but it doesn't show up in the performance snapshot, which is why these times are unchanged by this commit. Note that the timings for the initial frame are slightly variable in chrome. The JS time / blinkng pipeline parts shown on the main thread seem pretty consistent, but when the frame is drawn (relative to when the commit phase starts) is more variable. So take the above results with a grain of salt. P.S. there is a somewhat parallel implementation of this concept in Auspice where PhyloTree sets a transition time of zero if subsequent updates come in quick enough, but that is buggy. Firstly, if an update takes a long time the main thread will be blocked so the next update will be indistinguishable from one which was triggered an acceptable time later. Secondly, it only works for animation where we have a stream of updates.
jameshadfield
added a commit
that referenced
this issue
Nov 6, 2024
For big trees, attempting to animate changes results in poor performance and we're better off making all SVG changes in a single pass. "big" here depends on both the browser's performance and the size of the tree. Here we just use the latter as it's simpler to code, but in the future we may measure performance as the user interacts with the viz and update these flags accordingly. (This is one of the reasons we calculate these in middleware.) For future, we should add nuance to this performance flag. For instance, viewing a 2k tree and zooming into a clade has poor performance and we'd be better off skipping the animation, however once in a small clade it works well. For that same tree (viewing all 2k tips) filtering to a clade still looks good with the animation. 4000 tips is arbitrary, it was chosen after testing a variety of datasets locally. Note that there is a big difference between not setting a transition (within d3) and setting a transition of 0ms - the former is much faster. See <#1878> for more details here. I didn't implement this change for `modifySVGInStages` as it uses a more complex scheduling system; but the performance here is terrible for really large trees so we should revisit this in future work. Performance changes ------------------- Auspice built in production mode on Chrome 130.0 on a Apple M3 pro chip. Datasets: a large ncov tree (23k tips) (see <#1878> for more), and a flu tree of 2k tips. Both only rendering the tree panel. Flu performance is unchanged, as 2k tips isn't enough to trigger this performance toggle. For posterity: * Zooming into clade 2a. Initial frame ~160ms, then 30fps. * Filtering to clade 2a. Initial frame ~115ms, then 60fps. * Animation: 60fps ncov performance: * Zoom into clade 21M. * Previously: Initial frame 1500ms, second (final frame) 430ms later, main blocked for a further 700ms * Now: Single frame of 921ms. Main blocked for a further 550ms. * Change: Time to correct tree reduced by ~1000ms (50%). Time until main is no longer blocked: reduced by ~1100ms (45%). * Filter to clade 21M. * Previously: Initial frame 1050ms, second frame 930ms later, main blocked for a further 660ms * Now: Single frame of 1018ms, main blocked for a further 660ms * Change: Time to correct tree reduced by ~950ms (50%). Time until main is no longer blocked: reduced by ~950ms (35%). * Animate. This is unchanged - Auspice sets the transition time to 0 (see end of this commit message). We get around 4 frames at 70ms each (14fps) then a long commit phase blocking for ~650ms. For unknown reasons we trigger a timer flush but it doesn't show up in the performance snapshot, which is why these times are unchanged by this commit. Note that the timings for the initial frame are slightly variable in chrome. The JS time / blinkng pipeline parts shown on the main thread seem pretty consistent, but when the frame is drawn (relative to when the commit phase starts) is more variable. So take the above results with a grain of salt. P.S. there is a somewhat parallel implementation of this concept in Auspice where PhyloTree sets a transition time of zero if subsequent updates come in quick enough, but that is buggy. Firstly, if an update takes a long time the main thread will be blocked so the next update will be indistinguishable from one which was triggered an acceptable time later. Secondly, it only works for animation where we have a stream of updates.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
A summary of observations while rendering & interacting with a huge tree. Ideally we'd like to have Auspice be performant on such big trees, but also this is a simple way to flag up slow parts of the code. The timings are just to indicate bottlenecks - while they are reasonably consistent for me they'll be different for you. (This issue is mainly for me to keep track of things in a public-facing way. If you'd like to work on anything please spin it out into a new issue.)
Setup:
Dataset:
groups/blab-private/ncov/north-america/usa/washington/king-county/large?c=ns_kc&d=tree&p=full
(i.e. only tree panel rendering). 23k tips. The dataset is only 45MB, so lots of nodes but (relatively) few mutations (45MB minimised but uncompressed).Auspice 2.59.1 at master branch (35e4ca6) running in production mode with the following changes:
Chrome 130.0 running in incognito mode on an Apple M3 Pro chip. GC triggered before each snapshot was taken.
Test 1:
Filter to
f_region=South%20America
(n=259 tips selected). Start a snapshot then inactivate the filter, immediately hover over a tip so we can see when that listenerr will fire/render. This shares a lot of similarities to dragging the date range / animating but is much easier to reproduce consistently & isolate.Trace-20241026T131320.json
Heap Size: 230MB to 330Mb.
Timings:
onTipHover
(1ms) + recalculate style, layout, pre-paint.TLDR: Total time to partial tree update painted: ~1250ms. Total time to tree fully painted & responsive: 2500ms.
Don't try to animate things UPDATE: done in #1880
Improvements: huge; should be an easy win if we can make a heuristic about when to animate vs not. Changing
genericSelectAndModify
to no longer have.transition().duration(transitionTime)
provides a huge improvement.genericSelectAndModify
.Note: Achieving the same thing by using a
transitionTime=0
and thus callingtimerFlush()
is worse. Step 1 takes ~900ms withtimerFlush
taking ~300ms.Recalculating entropy, despite the entropy panel not being shown: UPDATE: done in #1879
Improvements: large.
calcEntropyInView
takes 295ms. The entropy panel's not even rendered!We should add a concept of "thing X doesn't need to be recalculated" - e.g. when the entropy panel's off the screen or not even on. The data can then be recalculated when needed.
Even when the entropy panel is on the screen, it can have delayed updating. Seems perfect for web-workers - e.g. show a grey overlay on the panel itself to indicate data out-of-date, recalculate things in the worker, then trigger the redux update. (Aside: we don't have any web workers in Auspice, but there are many situations they would be useful. This blog post seems a good resource on how workers may access the tree structure.)
Somewhat related to delayed updating, the thunk which calls this -
updateEntropyVisibility
- is debounced. Simply changing this to not fire on the leading edge (leading: false
) shifts this to a separate task on the event loop. That won't necessarily improve responsiveness, but it'll speed up the tree being re-drawn.Unnecessary recalculation of filtering (sidebar) options UPDATE: issue #1882
Improvements: medium, but easy wins. Total time: 41ms. The sidebar filtering component
<FilterData>
updates on any filter change (not date range). Each time it calls its makeOptions method which recomputes all options (to show in the dropdown), even though most of them are unchanged. CallingcollectGenotypeStates
takes 30ms, but the result of this doesn't ever change so is completely unnecessary.Note: the options are calculated every render time, but we're debouncing the callback which hands them to
<AsyncSelect>
. Can we just defer the calculation ofoptions
to this callback?Expensive tooltips
The unmounting of the react tooltip (shown when hovering on the filter badge) takes 40ms and then it immediately mounts again (40ms). These aren't deferred / put onto the event loop for some reason (I don't quite get this). Probably hard to improve but seems egregious.
Test 2:
No filters. Hover over a branch to bring up the info-box, then mouse out to remove it (very slow re-rendering). Now do the same action over tips and it's extremely responsive.
Comparing timings, the big difference is a ~1000ms commit phase for the branch hover, and a <1ms commit for tip hover. (Similar to test 1, the branch hover is drawn to screen during that long commit, but the main thread remains blocked.) The tip-hover is 60fps on my machine.
With tip hover, we select the tip element (within the SVG) and change its radius as well as creating/destroying a react component (a
<div>
). The branch hover similarly selects the branch and changes it's stroke colour, creates/destroys the info-box<div>
, and additionally adds a confidence-interval<path>
to the SVG. Removing this CI code reduces commit phase time to match that of tip-hover, i.e. 60fps.This tells us something about property trees / display lists - selecting and modifying a single (SVG?) element is very very different from creating / removing one.
There are a few options to fix this:
Test 3: initial load
Load
http://localhost:4000/groups/blab-private/ncov/north-america/usa/washington/king-county/large?c=ns_kc&d=tree&p=full
. Running on localhost without any network slowness etc we can see the following timings (which seem to vary quite a bit more than the above tests):treeJsonToState
modifyControlsStateViaTree
countTraitsAcrossTree
calcEntropyInView
(see Test 1, above)renderTree
(over half of this indrawBranches
)Deferring
calcEntropyInView
is an obvious win here (see above). In the wild, the JSON fetch time will probably dominate this phase so I wouldn't prioritise this above the other observations in this issue. (One nice change would be to investigate embedding the spinner image in the JS - I often notice it's not loaded for a long time when the network is slower.)The text was updated successfully, but these errors were encountered: