[WIP]: Refactor - consolidate and simplify (#7)
Conversation
…th new ParallelGraph logic, add a tutorial
…able, and cleanly annotated base classes to permit easy iteration
Note that this now more closely follows the proposed `@nx._dispatch` structure:
In [1]: import networkx as nx; import nx_parallel
In [2]: G = nx.barabasi_albert_graph(100, 3)
In [3]: H = nx_parallel.ParallelGraph(G)
In [4]: nx.betweenness_centrality(H)
Out[4]:
{0: 0.25151869905207225,
1: 0.049610587404427434,
2: 0.12300459374399848,
3: 0.08304068354565398,
4: 0.13909529780368793,
5: 0.033370757979639815,
6: 0.025224691529040805,
7: 0.010337586781618939,
8: 0.013377277246126217,
9: 0.05723111951263706,
10: 0.1619649220082718,
...
91: 0.000662037520417236,
92: 0.0004985754985754986,
93: 0.0006179450357121424,
94: 0.0005769500203637495,
95: 0.0020165815010068505,
96: 0.0016015554110792203,
97: 0.0015408701683211486,
98: 0.0019846075040880237,
 99: 0.0016253091741785242}
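For context, the `ParallelGraph` shown above is essentially a thin wrapper that holds the original `nx.Graph` so the dispatcher can hand it to the backend implementation. A minimal sketch of such a wrapper (the attribute name `graph_object` and the type check are illustrative assumptions, not necessarily the actual nx_parallel API):

```python
import networkx as nx


class ParallelGraph:
    """Illustrative wrapper: stores the wrapped graph for a parallel backend."""

    def __init__(self, graph):
        if not isinstance(graph, nx.Graph):
            raise TypeError("expected an nx.Graph instance")
        self.graph_object = graph  # hypothetical attribute name


G = nx.barabasi_albert_graph(100, 3)
H = ParallelGraph(G)
```

The wrapper itself implements no algorithms; it only signals which backend should handle dispatched calls and keeps a handle to the underlying graph.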
Thanks @dPys for this PR. It has really helped us shape nx-parallel nicely. I'm closing this because there are a lot of merge conflicts and a lot has changed since you opened this PR. But, please feel free to re-open :) and feel free to provide any kind of feedback on the current state of nx-parallel or any of the open issues.
Thank you very much @dPys :)
Hope to see you contribute here again :)
    "ipyparallel",
]


class Backend:

    def chunk(l: Union[List, Tuple], n: int) -> Iterable:
        """Divide a list `l` of nodes or edges into successive chunks of size `n`."""
        l_c = iter(l)
        while True:
            x = tuple(itertools.islice(l_c, n))
            if not x:
                return
            yield x
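For example, `chunk` yields successive size-`n` tuples (the last one may be shorter); a standalone version behaves like this:

```python
import itertools


def chunk(l, n):
    """Yield successive tuples of at most n items from l."""
    l_c = iter(l)
    while True:
        x = tuple(itertools.islice(l_c, n))
        if not x:  # the iterator is exhausted
            return
        yield x


print(list(chunk(range(7), 3)))  # -> [(0, 1, 2), (3, 4, 5), (6,)]
```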
    def create_iterables(self, G: nx.Graph, iterator: str) -> Iterable:
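A sketch of what a `create_iterables`-style helper might do: pick the items named by `iterator` out of the graph, then split them into chunks. The chunk count, default value, and the exact set of supported iterator names here are assumptions for illustration:

```python
import itertools
import networkx as nx


def create_iterables(G, iterator, n_chunks=4):
    """Hypothetical sketch: select items from G by name, split into roughly equal chunks."""
    if iterator == "nodes":
        items = list(G.nodes)
    elif iterator == "edges":
        items = list(G.edges)
    elif iterator == "isolates":
        items = list(nx.isolates(G))
    else:
        raise ValueError(f"unsupported iterator: {iterator!r}")
    size = max(len(items) // n_chunks, 1)
    it = iter(items)
    # call islice repeatedly until it returns the empty-tuple sentinel
    return list(iter(lambda: tuple(itertools.islice(it, size)), ()))


G = nx.path_graph(8)
chunks = create_iterables(G, "nodes", n_chunks=4)
```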
These two functions have been included in nx-parallel. Thanks @dPys!
        return joblib.Parallel(n_jobs=self.backend.processes)(calls)
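The `joblib.Parallel` call above fans the prepared `calls` out across worker processes and collects the per-chunk results. The same map/reduce shape can be sketched with the stdlib `concurrent.futures` (used here only to keep the example dependency-free; nx-parallel itself goes through the joblib API):

```python
from concurrent.futures import ThreadPoolExecutor

# toy adjacency list standing in for a graph
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}


def partial_degree(nodes):
    # "map" step: compute the degree of each node in one chunk
    return {n: len(adj[n]) for n in nodes}


chunks = [[0, 1], [2, 3]]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(partial_degree, chunks))

# "reduce" step: merge the per-chunk dicts into one result
degrees = {n: d for part in results for n, d in part.items()}
print(degrees)  # -> {0: 2, 1: 1, 2: 2, 3: 1}
```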
class NxReduce:
The NxMap and NxReduce architecture implemented here is being discussed in Issue #30.
Great, will take a look
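As a rough illustration of the map/reduce split under discussion, a chunking map stage paired with a dict-merging reduce stage could look like the following. The class and method names here are invented for the sketch, not the PR's actual API, and the map stage runs sequentially as a stand-in for a parallel executor:

```python
import itertools


class NxMap:
    """Hypothetical: apply func to fixed-size chunks of items, collecting partial results."""

    def __init__(self, func, chunk_size=2):
        self.func = func
        self.chunk_size = chunk_size

    def __call__(self, items):
        it = iter(items)
        chunks = iter(lambda: tuple(itertools.islice(it, self.chunk_size)), ())
        return [self.func(c) for c in chunks]  # sequential stand-in for a parallel map


class NxReduce:
    """Hypothetical: merge partial dict results into one dict."""

    def __call__(self, partials):
        merged = {}
        for p in partials:
            merged.update(p)
        return merged


squares = NxMap(lambda chunk: {x: x * x for x in chunk})
result = NxReduce()(squares(range(5)))
print(result)  # -> {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```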
I'm so glad to see the state of nx-parallel improve to what it is now. It's really starting to come together, @Schefflera-Arboricola! Even though this PR is being closed due to the merge conflicts, it would still be nice to be listed as a contributor. By the way, are you all still meeting regularly (Wednesdays, if I recall? I believe I still have the link...)? If so, I'd love to start sitting in on those discussions. This repo seems ripe for a code sprint, and perhaps even a conference submission.
Thank you! And thank you for your contributions. I apologize that we don't have you listed as a contributor on this repository; I should have imported the commits of this PR into mine when I was borrowing from it. And yes, we have weekly meetings; here's the calendar link: https://scientific-python.org/calendars/networkx.ics. We'd be delighted to have you join! Also, feel free to lead any sprints on nx-parallel at upcoming conferences, and if you could give any feedback as a user of nx-parallel, that would be great too :) Thank you, and hope to see you around!
You're very welcome! No worries about the commits; happy to have contributed. I joined last week's meeting and should also be there this afternoon. Thanks for the invite and the offer to lead sprints on nx-parallel at upcoming conferences. I don't have any conferences that I'm planning to attend in the next few months, but I will consider sprinting if/when I do. And, of course, I'd be glad to provide feedback on nx-parallel as I continue tinkering.
This PR includes the following changes:
- Moved the `Backends` class to a new module, `external`, for handling different parallelization backends ("multiprocessing", "dask", "ray", "loky", "threading", and "ipyparallel") through the joblib API.
- Added an `optional_package` function to the `nx_parallel/misc.py` file to accommodate the possibility of multiple backend dependencies.
- Added a `partition` module, comprising general `NxMap` and `NxReduce` classes that can be used to more easily and consistently chunk, map, and reduce the parallelizable components of most nx algorithms (as of now, "nodes", "edges", "isolates", and "neighborhoods" are supported).
- Kept the `algorithms` module as proposed by @20kavishs; with the help of the `partition` classes, this greatly simplifies the implementation of each parallel algorithm variant.
- Added a `dependencies` section in the `pyproject.toml` file to impose minimal version requirements for NetworkX and joblib.
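The `optional_package` helper mentioned above could be approximated with `importlib` (the return shape below is an assumption for illustration, not necessarily the PR's actual signature):

```python
import importlib


def optional_package(name):
    """Hypothetical sketch: import a backend package if it is available."""
    try:
        module = importlib.import_module(name)
        return module, True
    except ImportError:
        return None, False


math_mod, have_math = optional_package("math")
_, have_missing = optional_package("definitely_not_installed_pkg")
print(have_math, have_missing)  # -> True False
```

This lets backend-specific code degrade gracefully when, say, dask or ray is not installed, rather than failing at import time.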