
[WIP] : improve current timing script #114


Open · wants to merge 21 commits into main

Conversation

akshitasure12
Contributor

@akshitasure12 akshitasure12 commented May 31, 2025

Improves #51

@akshitasure12 akshitasure12 changed the title [WIP] : update makefile for running timing scripts [WIP] : improve current timing script May 31, 2025
Member

@dschult dschult left a comment


Any way to show text for both the time speedup and the time itself?
I often care more about speed-ups when the length of time is large -- the current values shown make it hard to see how the length of time changes between 50 and 300 nodes. Should I care that parallel is slower for 50 nodes?

Any ideas about why it would be slower for 50 nodes? It is using all processors (n_jobs=-1) -- what is making it slower?

@akshitasure12
Contributor Author

akshitasure12 commented Jun 2, 2025

Any way to show text for both the time speedup and the time itself?

We could display both of them by making an annotation dataframe that stores formatted strings. Something like this:

display.at[number_of_nodes[num], p] = f"{parallelTime:.2g}s\n{timesFaster:.2g}x"

Another way could be to create two dataframes (one for the parallel run times and another for the speed-ups) and overlay one heatmap on the other. For our use case, I think the annotation dataframe would be much simpler.
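For illustration, here's a minimal sketch of the annotation-dataframe approach (the numbers are made up, and pandas/seaborn/matplotlib are assumed, as in the timing script):

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up example data: parallel run times (seconds) and speedups over the
# sequential run, indexed by node count x edge probability.
nodes = [50, 100, 300]
probs = [1, 0.8, 0.6]
times = pd.DataFrame(
    [[0.5, 0.4, 0.3], [1.2, 1.0, 0.8], [9.5, 7.2, 5.1]],
    index=nodes, columns=probs,
)
speedups = pd.DataFrame(
    [[0.9, 1.0, 1.0], [1.5, 1.4, 1.3], [2.8, 2.6, 2.4]],
    index=nodes, columns=probs,
)

# Annotation dataframe of formatted strings so each cell shows both values.
annot = pd.DataFrame(index=nodes, columns=probs, dtype=object)
for n in nodes:
    for p in probs:
        annot.at[n, p] = f"{times.at[n, p]:.2g}s\n{speedups.at[n, p]:.2g}x"

# Color by speedup but print the combined string in each cell; fmt="" is
# required because the annotations are pre-formatted strings.
ax = sns.heatmap(speedups, annot=annot, fmt="", cmap="Greens")
ax.set_xlabel("Edge Probability")
ax.set_ylabel("Number of Vertices")
plt.savefig("heatmap_annotated.png")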

Should I care that parallel is slower for 50 nodes?

Although the parallel implementation appears to be slower for 50 nodes, the difference in their runtimes is minimal. This is likely due to parallel overhead, which tends to be more noticeable on smaller graphs. Still, the implementation remains efficient overall -- so it seems reasonable to me to overlook this small dip on small-scale graphs. LMKWYT

Contributor Author

@akshitasure12 akshitasure12 Jun 2, 2025


Not sure why the speed-up with 500 nodes is less than with 300 nodes. Could you please help me understand this?

Member


How long does it take to run and create this timing picture? If it isn't very long, try running it a few times and see if the dip in performance between 300 and 500 goes away.

If the results change from run to run, we should try to pick "better" parameters that don't show those slowdowns. Most computers these days have background processes running that can impact timing scripts. To see through the haze of such noise, we could try multiple runs and take the minimum time from those runs. Another approach is to choose larger problems so that the small differences caused by background processes look more like noise.

It's also nice to repeatedly double the parameter (or multiply by a constant factor other than 2, like 10). That allows you to see whether the time is increasing like n^2 or n^3. Doubling the nodes will take ~8 times longer for O(n^3). Of course, the big-O is only valid in the limit as n -> infinity, so take it all with a grain of salt. But if it is consistently a factor of 8 longer when doubling, you can be fairly confident it is running O(n^3) code.
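For instance, a rough sketch combining both ideas (the sizes mirror the log further down this thread, and the timed function is just an illustrative stand-in):

import time

import networkx as nx

def best_of(fn, repeats=5):
    # The minimum over several runs is the least affected by background noise.
    runs = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        runs.append(time.perf_counter() - t0)
    return min(runs)

prev = None
for n in [200, 400, 800]:  # doubling makes the growth rate visible
    G = nx.fast_gnp_random_graph(n, 0.5, seed=42)
    t = best_of(lambda: nx.betweenness_centrality(G))
    note = f" ({t / prev:.1f}x the previous size)" if prev else ""
    print(f"n={n}: {t:.3g}s{note}")  # a steady ~8x per doubling suggests O(n^3)
    prev = t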

In this case, since we don't care so much about graphs with 50 nodes and we do care about large graphs, we should probably increase the number of nodes. Maybe start with 250 nodes and double up to 4000 for this picture. Lots of problems should really be run on 10_000-100_000 nodes. But that would take a long time to do the timing... :)

I guess the point of these plots is to show the impact of parallel code. So the best results would show a factor near 2 for n_jobs=2 and near 5 for n_jobs=5. And the title or caption should say what n_jobs was for the experiment.

:)

Contributor Author


@dschult Thank you for your insights!

How long does it take to run and create this timing picture? If it isn't very long, try running it a few times and see if the dip in performance between 300 and 500 goes away.

It didn't take long to run, so I executed it a few more times to check if the dip was just a one-off. What I observed was that the speedup would initially peak, but then fall away for larger numbers of nodes.
[image: previous timing results]
After some debugging, I found the issue was in the is_reachable function, where num_in_chunk was being passed instead of n_jobs:

num_in_chunk = max(len(G) // n_jobs, 1)
node_chunks = nxp.chunks(G, num_in_chunk)

needs to be replaced with:

node_chunks = nxp.chunks(G, n_jobs)

This is being addressed in PR #112.

After applying that fix, I ran it 5 times and took the minimum time across those runs for both is_reachable (ref. heatmap) and betweenness_centrality (ref. heatmap). However, I still observe a decline in speedup at higher node counts. Could this be due to system limitations?

System details:

MacBook Pro 13-inch M2
macOS Sequoia 15.5
Apple M2 Chip
Memory 8GB

Member


Yay!! So good to hear you found one fix. Too bad there are still some mysteries.
Let's see -- are we running into memory issues? Probably not: 1600 nodes -> ~2.5M edges for the complete graph (prob=1), and each edge takes about 32-128 bytes, so I think the 8GB of memory would be OK unless the data has to be copied many times in the parallel version. I think the workers share memory, but I'm not sure.
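(Rough arithmetic for that estimate: a complete directed graph on 1600 nodes has 1600 × 1599 ≈ 2.56M edges; at 32-128 bytes each that is roughly 80-330 MB, comfortably under 8GB unless every worker gets its own copy.)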

Let's hope it is another bug -- one that involves code affecting both of these functions. I guess start with the utility functions that are called in both functions.

@dschult dschult added the type: Enhancement New feature or request label Jun 5, 2025
@dschult dschult force-pushed the improve-timing branch 2 times, most recently from d12a986 to 6174eb3 Compare June 5, 2025 17:33
@akshitasure12
Contributor Author

akshitasure12 commented Jun 5, 2025

Hi @dschult, I had previously added another commit to move numpy to the test dependencies, but I think it might not have been pulled. Would you like me to reapply that?

@dschult
Member

dschult commented Jun 6, 2025

Yes -- I'm very sorry to have lost that commit to the PR. I didn't pull the latest changes when I rebased on the main branch.

It all looks good now that you reapplied that. 🚀

@akshitasure12
Contributor Author

akshitasure12 commented Jun 8, 2025

In the recent commit, I tried to implement memmapping in nx.tournament.is_reachable.

I replaced NetworkX's graph object with a NumPy-based adjacency matrix, because the problem involved making multiple copies of the input graph for each worker. joblib.Parallel has a special way of handling large NumPy arrays: it passes them as memory-mapped references to the worker processes. The file is then accessed in read-only mode (mmap_mode="r") by each parallel worker, using joblib's dump and load.

With this change, the heatmap shows speedups ranging from 11x to 25x -- view heatmap.

Note: the current implementation assumes that the graph nodes are integers -- I plan to make this more dynamic soon. One potential drawback is the memory usage in the case of sparse graphs.
LMKWYT
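A minimal sketch of this memmapping pattern (the graph size, file name, and per-chunk work below are illustrative stand-ins, not the actual PR code):

import os
import tempfile

import networkx as nx
import numpy as np
from joblib import Parallel, delayed, dump, load

# Dump the adjacency matrix to disk once, then re-open it as a read-only
# memory map that all workers share instead of copying.
G = nx.tournament.random_tournament(500, seed=42)
A = nx.to_numpy_array(G, dtype=np.bool_)  # dense adjacency matrix

path = os.path.join(tempfile.mkdtemp(), "adj.joblib")
dump(A, path)
A_mm = load(path, mmap_mode="r")  # read-only memmap, shared by the workers

def process_chunk(adj, chunk):
    # Placeholder per-chunk computation on the shared adjacency matrix.
    return [int(adj[v].sum()) for v in chunk]

node_chunks = np.array_split(np.arange(A.shape[0]), 4)
results = Parallel(n_jobs=-1)(
    delayed(process_chunk)(A_mm, chunk) for chunk in node_chunks
)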

@dschult
Member

dschult commented Jun 9, 2025

Interesting that they have special tools for numpy arrays, but not for python objects.
For the nodes, you can map nodes to integers and back with tools like:

nodelist = list(G)  # map of integers to nodes
nodemap = {n: i for i, n in enumerate(nodelist)}

Those are good speeds -- so it looks like memory might be an issue more generally too.

It might be possible to use a numpy matrix in a more memory-efficient way than the adjacency matrix.
For example, if the max degree is M and the number of nodes is N, you could make an NxM numpy array and use it as an adjacency list. Make all entries -1 (or another "sentinel" value) after the last valid neighbor. So that is NxM instead of NxN.
Then A[n, A[n] != -1] gets the neighbors (there might be a more efficient way to get the neighbors, I'm not sure).
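A minimal sketch of that padded-array idea (illustrative only):

import networkx as nx
import numpy as np

G = nx.gnp_random_graph(6, 0.4, seed=1)
N = G.number_of_nodes()
M = max(deg for _, deg in G.degree()) or 1  # max degree

# NxM instead of NxN; -1 marks "no more neighbors" after the last valid entry.
A = np.full((N, M), -1, dtype=np.int64)
for n in G:
    nbrs = list(G.neighbors(n))
    A[n, : len(nbrs)] = nbrs

# Neighbors of node n are the entries before the sentinel values.
print(A[0, A[0] != -1])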

Another approach would be to use a scipy sparse array. So, there are lighter memory ways to handle adjacencies. But let's not worry about that yet.

Do I understand the memmap model? The array is written to a file and then each process has read access to that file. So it uses hard disk space instead of memory to share the data. It looks like there is a Python standard library version of this called mmap. I wonder if that works well with dicts. But numpy is very fast so it might be better to just stick with memmapping.

Very cool!

@akshitasure12
Contributor Author

akshitasure12 commented Jun 9, 2025

It might be possible to use a numpy matrix in a more memory-efficient way than the adjacency matrix.
For example, if the max degree is M and the number of nodes is N, you could make an NxM numpy array and use it as an adjacency list. Make all entries -1 (or another "sentinel" value) after the last valid neighbor. So that is NxM instead of NxN.

I like this idea of using an NxM numpy array as an adjacency list -- it would be an efficient way of expressing the graph. I'll work on this approach. Thanks for the insights :)
Also, I thought it would be a good idea to move this to a separate PR to keep things aligned with the original focus (ref PR#119).

Do I understand the memmap model? The array is written to a file, and then each process has read access to that file. So it uses hard disk space instead of memory to share the data.

Yep, that’s pretty much it!

@dschult
Member

dschult commented Jun 9, 2025

+1 for moving the adjacency stuff to a separate PR. :)

@akshitasure12 akshitasure12 marked this pull request as ready for review June 15, 2025 07:26
@akshitasure12
Contributor Author

Do we still need timing_comparison.md, or can we delete that as well?

@dschult
Member

dschult commented Jul 1, 2025

I think we can delete that file too.
(It is the output of the timing_all_functions.py script, right? I can't find any mention of it in the repo.)


I'm not sure if we should remove this file -- we do wish to someday have a heatmap (or some other visual) to showcase all (or most) of the algorithms' speedups. LMKWYT. Also, we still have the heatmap generated by this script: https://github.com/networkx/nx-parallel/blob/main/timing/heatmap_all_functions.png

Contributor Author


How would we manage the heatmap as the number of algorithms grows? Would we do something like regenerating this heatmap every time a new algorithm is added? I'm just wondering what the long-term plan might be!

Member

@Schefflera-Arboricola Schefflera-Arboricola Jul 4, 2025


I meant something like this: https://github.com/python-graphblas/graphblas-algorithms/blob/main/README.md -- it doesn't have to be a heatmap.

We don't have to showcase all algorithms -- only 1-2 algorithms of each kind (like shortest paths, centrality measures, etc.), and on real-world networks.


I think we should keep this file (and populate it with all the other heatmaps) because it contains important info like the machine's specifications and the heatmap results. I think we could also mention other software specifications here, like backend="loky", etc.

Maybe we should rename it to README.md (in the timing folder) so that it's visible as soon as someone opens the folder.

Also, currently not all heatmaps were generated on the same machine, and these are very old machine specifications. LMKWYT.

Comment on lines +19 to +23
# Default Config
joblib.parallel_config(n_jobs=-1)
# To use NetworkX's parallel backend, set the following configuration.
# nx.config.backends.parallel.active = True
# nx.config.backends.parallel.n_jobs = -1


No need for these comments -- we expect that the reader of the timing script knows how to use/configure nx-parallel. LMKWYT.

Suggested change:
- # Default Config
- joblib.parallel_config(n_jobs=-1)
- # To use NetworkX's parallel backend, set the following configuration.
- # nx.config.backends.parallel.active = True
- # nx.config.backends.parallel.n_jobs = -1
+ # Setting configs: Using all CPU cores
+ joblib.parallel_config(n_jobs=-1)

Contributor Author


We wouldn’t need this anymore since we’re already updating the NetworkX configs to -1, right?


Yes, but that PR is not merged yet -- what merge order for these PRs would you prefer? Should we merge this PR second?

stdTime = t2 - t1
tournament_funcs = ["is_reachable", "tournament_is_strongly_connected"]
bipartite_funcs = ["node_redundancy"]
random.seed(42)


Do we need to set the seed here?

Contributor Author


I had included it for reproducibility, but happy to remove it if there's no need.


Yes, but why do we need a global seed? Is it being used by any function(s) below?


Using "line" (top line, bottom line) is a bit confusing -- because usually in the context of graphs (not networks) a line might mean a line graph. So "value" or "number" might be more apt. Or we could also color-code the values.

Member

@Schefflera-Arboricola Schefflera-Arboricola Jul 3, 2025


I tried running this script and I got a lot of "findfont: Font family 'Arial Rounded MT Bold' not found." messages at the end.

Also, for betweenness_centrality it ran once and then it started running again --

(nxp-dev) @Schefflera-Arboricola ➜ /nx-parallel (improve-timing) $ python timing/timing_individual_function.py
200
0.24375876500016602
0.4948986529998365
Finished <function betweenness_centrality at 0x7173ebb4d440>
400
2.0439088319999428
3.804219898000156
Finished <function betweenness_centrality at 0x7173ebb4d440>
800
17.221894170999803
29.955433169000116
Finished <function betweenness_centrality at 0x7173ebb4d440>
1600
139.0610084220002
237.87486764899995
Finished <function betweenness_centrality at 0x7173ebb4d440>
200
0.2752337769998121
0.5823823759992592
Finished <function betweenness_centrality at 0x7173ebb4d440>
400
2.3135327339996365
4.701841838999826
Finished <function betweenness_centrality at 0x7173ebb4d440>
800
19.94758787899991
...

Please make sure the script runs with all kinds of algorithms, i.e., that all code blocks run fine.

Contributor Author


I think this is because for algorithms like betweenness_centrality, the script times them for multiple edge probabilities, i.e., p = [1, 0.8, 0.6, 0.4, 0.2], which isn't the case for tournament graphs (where the edge probability is always 1). I'm assuming this is what you meant when you said it started running again! Please let me know if I've misunderstood anything.


OK, thank you for clarifying -- it would be helpful to print out the edge probability in the output.

ax.set_xlabel("Number of Vertices", fontweight="bold", fontsize=12)
ax.set_ylabel("Edge Probability", fontweight="bold", fontsize=12)

n_jobs = nxp.get_n_jobs()


I'd recommend using get_active_backend()[1] instead of get_n_jobs here -- because the former gives the n_jobs that joblib has stored and used while running the code (and we are using joblib's config system in this script).
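For reference, a minimal sketch of the difference (assuming the config is set the way the timing script sets it):

import joblib
from joblib.parallel import get_active_backend

joblib.parallel_config(n_jobs=-1)
# get_active_backend() returns (backend, n_jobs); the second element is the
# effective n_jobs stored by joblib's own config system.
backend, n_jobs = get_active_backend()
print(backend, n_jobs)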
