Adding script for benchmarking the curvilinear search function #5
Meant to compare the spatial hashing (Parcels-code/Parcels#2132) against the old (vectorized) search.
Just posting a baseline bulk runtime estimate on my system for this benchmark, with parcels v4-dev at Parcels-code/Parcels@add8158. Runtime: 41 s. (Collapsed sections with system specs, OS details, and the tuned profile not shown.)
Doubling the hash grid cell size (computed in […]): nearly all the runtime savings here come from the initialization. This result makes sense, as increasing the hash cell size reduces the inner-loop size in the hash table initialization, giving fewer […]
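To make the inner-loop argument concrete, here's a minimal self-contained sketch (not the Parcels implementation; `build_hash_table` and its inputs are hypothetical) of how a bounding-box-to-hash-cell insertion loop shrinks when the cell size grows:

```python
import numpy as np

def build_hash_table(quad_bboxes, cell_size):
    """Map each quad's bounding box to the hash cells it overlaps.

    quad_bboxes: (N, 4) array of (xmin, ymin, xmax, ymax).
    Returns the table and the number of inner-loop iterations,
    i.e. the number of (cell, quad) insertions performed.
    """
    table = {}
    n_inner = 0
    for q, (xmin, ymin, xmax, ymax) in enumerate(quad_bboxes):
        i0, i1 = int(xmin // cell_size), int(xmax // cell_size)
        j0, j1 = int(ymin // cell_size), int(ymax // cell_size)
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                table.setdefault((i, j), []).append(q)
                n_inner += 1
    return table, n_inner

# Synthetic unit-sized quads scattered over a 100x100 domain
rng = np.random.default_rng(0)
lo = rng.uniform(0, 100, size=(1000, 2))
bboxes = np.hstack([lo, lo + 1.0])  # (xmin, ymin, xmax, ymax)

_, n_small = build_hash_table(bboxes, cell_size=1.0)
_, n_large = build_hash_table(bboxes, cell_size=2.0)
# Larger cells -> each quad overlaps fewer cells -> fewer insertions
```

Doubling the cell size roughly halves the number of cells a quad's bounding box touches in each direction, so the construction loop does proportionally less work.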
Here are a few results from playing with the hash grid cell size factor (the parameter that multiplies the median quad area) for the single-particle MOi benchmark. I've also split apart the timing for the hash grid initialization and the actual search.

@erikvansebille - tbh, I'm not sure how you're getting runtimes for multiple particles; it must not be on v4-dev. I'm crashing during the search as soon as I go to more than 1 particle in this benchmark.

Edit: my current work-around is to use the […]
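For splitting the initialization and search timings, a simple stdlib pattern like the following works; `build_hash_grid` and `search` are placeholder names here, not actual Parcels API:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    return out, time.perf_counter() - t0

# Hypothetical usage, with placeholder function names:
# grid, t_init = timed(build_hash_grid, lons, lats)
# idx, t_search = timed(search, grid, particle_lons, particle_lats)
# print(f"init: {t_init:.3f}s  search: {t_search:.3f}s")
```

Reporting the two numbers separately makes it clear whether a parameter change (like the cell size factor) is paying off in construction, in querying, or both.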
Thanks for this analysis, Joe!
Indeed, I do all the simulations on the vectorized kernel branch from Parcels-code/Parcels#2122; I hope to merge that branch into v4-dev any day now, once @VeckoTheGecko has done a final review. So indeed, best to already do all performance benchmarking under that branch.

On Friday, I started work on the VectorField interpolation (Parcels-code/Parcels#2152) and found there that I got significant speedups when loading the curvilinear […]
Potentially. I'll pull profiles this morning :)
I did also find a bug in how we treat wrap-arounds at the antimeridian. This is creating a few rows in the hash table that are quite long; this could very well be associated with the long construction times and long search times. Working on a fix now.
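The wrap-around failure mode can be shown with a toy example (this is the assumed mechanism, not the actual Parcels code): a quad straddling the antimeridian gets a near-global naive bounding box, so it would be inserted into almost every hash column, producing exactly the kind of long table rows described above.

```python
import numpy as np

# Corner longitudes of a ~1-degree-wide quad straddling the antimeridian
corner_lons = np.array([179.5, -179.5, -179.5, 179.5])

# A naive min/max bounding box spans almost the whole globe
naive_width = corner_lons.max() - corner_lons.min()   # 359.0 degrees

# One possible fix: unwrap longitudes so the quad is contiguous first
unwrapped = np.where(corner_lons < 0, corner_lons + 360.0, corner_lons)
true_width = unwrapped.max() - unwrapped.min()        # 1.0 degree
```

With a 359-degree-wide box, this one quad lands in nearly every longitude column of the hash table, inflating both construction and search time for those rows.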
In working through resolving the antimeridian-crossing issues, I've tracked down a few more problematic cells in the MOi dataset; there are some nearly collinear elements near the north pole that give us some additional headache. The bounding box and hash cell indices are given below. To avoid these kinds of issues, grids with the […]
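As a rough illustration of the near-pole problem (a hypothetical check, not Parcels code): quads whose corners are nearly collinear have near-zero area, which can be flagged with the shoelace formula before they poison area-based heuristics or point-in-quad tests.

```python
import numpy as np

def shoelace_area(x, y):
    """Area of a simple polygon from corner coordinates (shoelace formula)."""
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# A healthy unit quad vs. a nearly collinear one
good = shoelace_area(np.array([0., 1., 1., 0.]),
                     np.array([0., 0., 1., 1.]))
bad = shoelace_area(np.array([0., 1., 2., 3.]),
                    np.array([0., 1e-9, 2e-9, 3e-9]))
# good is 1.0; bad is ~0 and would be flagged as degenerate
```

Comparing each quad's area against a small tolerance (e.g. a fraction of the median quad area already computed for the cell size factor) is one way to detect such cells up front.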
Thanks for digging deeper, Joe!
What do you mean by that? I don't think I understand. Anything I can help with?
I've drafted a PR with a (failing) unit test at Parcels-code/Parcels#2153 that you could use to debug the spatial hash function. It uses the NEMO_Curvilinear dataset, which is a bit smaller than the MOi one used here, but has the same issue that the antimeridian lies in the domain; see the plot below.
Does this help you with debugging?
I just did the full benchmarking for the Morton encoding in Parcels-code/Parcels#2158. It really is a very impressive speedup compared to the spatial hashing that you introduced in Parcels-code/Parcels#2132, @fluidnumerics-joe! On my MacBook, the runtime is < 1 second up to 500k particles, and the initialisation is only 1 second too. For 1M particles, the runtime is 2 seconds, which is a factor 300(!) faster than the 604 seconds for the spatial hash. The performance worsens for the 5M-particle case (which takes 668 seconds), probably because of memory issues. See below the peak memory use. The memory for 5M particles is >6 GB, which may have led to some disk swapping on my laptop.
Note also that the memory load for the Morton encoding is in general 5 times higher than for the spatial hashing. Any initial idea why the memory footprint is so much larger, @fluidnumerics-joe? Reducing that is perhaps something to explore further down the line.
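For readers unfamiliar with the technique, here is a minimal 2-D Morton (Z-order) encoding sketch in NumPy. This is an illustration of the general idea only, not the Parcels implementation: the bits of the integer cell indices are interleaved so that spatially nearby cells get numerically nearby codes, which allows lookups via a sorted code array and binary search.

```python
import numpy as np

def part1by1(n):
    """Spread the lower 16 bits of n so a zero bit separates each bit."""
    n = n.astype(np.uint64) & np.uint64(0x0000FFFF)
    n = (n | (n << np.uint64(8))) & np.uint64(0x00FF00FF)
    n = (n | (n << np.uint64(4))) & np.uint64(0x0F0F0F0F)
    n = (n | (n << np.uint64(2))) & np.uint64(0x33333333)
    n = (n | (n << np.uint64(1))) & np.uint64(0x55555555)
    return n

def morton2d(ix, iy):
    """Interleave bits: ix bits go to even positions, iy bits to odd."""
    return part1by1(ix) | (part1by1(iy) << np.uint64(1))

codes = morton2d(np.array([3]), np.array([5]))
# ix=0b011 and iy=0b101 interleave to 0b100111 = 39
```

Because the whole encoding is a handful of vectorized bit operations, it is cheap per particle, but materializing 64-bit codes and the associated sorted index arrays for millions of particles is one plausible (unconfirmed) contributor to the larger memory footprint observed above.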
Here's an update of the search_indices timings with the new Morton implementation in Parcels-code/Parcels#2177 (cyan lines).
The initialisation is a bit slower (6 seconds on my MacBook), but the factor-10 speed-up for 5M particles is very nice (from 11 minutes to 50 seconds). The peak memory use is a lot higher again, though...
I'll see if there's room to trim back on memory usage. There are a few spots where […]
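One generic place to look (a hypothetical illustration, not a specific Parcels fix): large index arrays often default to int64, and downcasting them when the value range allows halves that array's footprint.

```python
import numpy as np

# A million particle/cell indices stored as int64 vs uint32.
# uint32 is safe as long as all indices are below 2**32.
n = 1_000_000
idx64 = np.arange(n, dtype=np.int64)
idx32 = idx64.astype(np.uint32)

mb64 = idx64.nbytes / 1e6  # 8.0 MB
mb32 = idx32.nbytes / 1e6  # 4.0 MB
```

The same reasoning applies to intermediate copies: operations that can work in place or on views avoid duplicating multi-million-element arrays at peak.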