Skip to content

Adding Embarrassingly Parallel Algorithms in NetworkX to nx-parallel #82

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Schefflera-Arboricola opened this issue Sep 23, 2024 · 10 comments
Labels
good first issue Good for newcomers Sprints Issues suitable for sprints or PRs from the sprints type: Enhancement New feature or request

Comments

@Schefflera-Arboricola
Copy link
Member

Embarrassingly parallel algorithms

Embarrassingly parallel algorithms are those that can be easily and efficiently divided into smaller, independent tasks that can be executed concurrently across multiple processing units, such as CPUs, GPUs, or nodes, without requiring significant communication or synchronisation between them. This type of parallelism is often referred to as “embarrassingly parallel” because it’s so straightforward and easy to implement. Refer this to see how to implement embarrassingly parallel algorithms using joblib.

Characteristics of Embarrassingly Parallel Problems(Algorithms):

  • Independent tasks: Each task can be executed independently, without relying on the output of other tasks.
  • No communication: Tasks do not require frequent or significant communication with each other.
  • No data dependencies: Tasks do not rely on shared data or intermediate results from other tasks.
  • Easy to divide: The problem can be easily split into smaller, parallelizable tasks.

Adding a new algorithm to nx-parallel

Step 1 : Find a NetworkX algorithm that is embarrassingly parallel

Step 2 : Complete the checklist in Contributor's guide

This checklist in Contributor's guide for adding a new algorithm to the package.

Some references:

It would be better to look at currently existing algorithms than previous PRs that add parallel algorithms(like #60 , #37 , #45 and many more) because we have changed a few things.

Feel free to ask questions if you get stuck anywhere or if anything is not clear or if you want something to be explained in a bit more detail here.

Thank you :)

@Schefflera-Arboricola Schefflera-Arboricola added type: Enhancement New feature or request good first issue Good for newcomers Sprints Issues suitable for sprints or PRs from the sprints labels Sep 23, 2024
@RohitP2005
Copy link

Hey , i would like to work on this issue

@Schefflera-Arboricola
Copy link
Member Author

Hi @RohitP2005 , let us know if you have any questions :)

Also, writing parallel versions of functions that return a non-generator(i.e. functions that return and not yield) would be easier to begin with.

@RohitP2005
Copy link

Thank you @Schefflera-Arboricola , I will reach out to you if i need anything

@RohitP2005
Copy link

@Schefflera-Arboricola i have made a pull request traversal
algorithm added

@RohitP2005
Copy link

hey @Schefflera-Arboricola can u clarify that am i supposed to add a alogrithmfrom in networkx that is embarissingly parallel to the nxparallel algorithms
If yes i also need a suggestion of such algorithm

@Rajendraprasad7
Copy link

Hi, I am trying to add a parallel algorithm for diameter of unweighted graphs based on this paper since the current algorithm which works for weighted graphs does not have much scope for parallelisation. When, I run the tests for the parallel algorithm it fails the test cases which are based on weighted graphs. Should I go ahead and submit the PR anyway?

@dschult
Copy link
Member

dschult commented Mar 13, 2025

It is fine to submit early-stage or not-quite-working PRs. You can indicate that they aren't quite ready to be merged by selecting the option of a "Draft Pull Request". Or you can just state in the PR post what is not quite working and how you've tried to make it work. Of course the PR comment should start with a description of the intent of the PR and description of what it does -- just like a PR with completed work would.
:)

@akshitasure12
Copy link
Contributor

akshitasure12 commented Mar 16, 2025

Hey @Schefflera-Arboricola,
I'm looking into parallelizing the number_of_cliques function in clique.py in NetworkX to improve performance.
Since each node’s clique count is independent, I was thinking of implementing it.
Would this be a good direction, or are there any concerns regarding efficiency, memory, or NetworkX's internal structure?
Please let me know if this is worth pursuing!

@SKADE2303
Copy link

Hello @Schefflera-Arboricola,

I am looking into implementing additional algorithms in nx-parallel and would appreciate your input. I came across an article mentioning that computing Closeness Centrality on large graphs is computationally expensive in NetworkX. Given that Betweenness Centrality is already parallelized in the repository, would it be worthwhile to explore parallelization for other centrality algorithms as well?

Please let me know if this approach is good.

@Schefflera-Arboricola
Copy link
Member Author

@akshitasure12 @SKADE2303

If you’re confident with your approach, please open a PR. We’ll review it. If you run into specific issues, feel free to ask, but otherwise, just go for it. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers Sprints Issues suitable for sprints or PRs from the sprints type: Enhancement New feature or request
Development

No branches or pull requests

6 participants