Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend selection #106

Closed
wants to merge 3 commits into from
Closed

Backend selection #106

wants to merge 3 commits into from

Conversation

Jnelen
Copy link
Contributor

@Jnelen Jnelen commented Feb 25, 2024

Description

This is my attempt to adress issue #68
For the functions that work with _rmsd_isomorphic_core, there is now a specific option to select the backend: "graph_tool" or "networkx". There is also the "default" option, which will mimic the original behavior: first it tries Graph Tool followed by (trying to) fall back on NetworkX. If you specifically select a backend that isn't available, an error will be raised. I also had to update/modify many of the tests to implement this new behavior, but now everything is passing again.

Changelog

  • Added the (optional) backend selection for _rmsd_isomorphic_core (and symrmsd and rmsdwrapper)
  • Added (optional) backend selection for molecule.to_graph()
  • Removed most of the (imported) methods from graph.py and instead dynamically use the correct ones directly from gt.py or nx.py accordingly
  • Updated tests to accommodate the new behavior. I only modified molecule_test.py, for the to_graph() function, and the graph_test.py, to explicitly test everything with both NetworkX and graph-tool as a backend.

Documentation

The recognized arguments for the backend feature are:

  • Graph Tool: "graph_tool", "graphtool", "graph tool", "graph-tool" and "gt"
  • NetworkX: "networkx" and "nx"

They are all case insensitive, so things like "GraphTool" or "NetworkX" will also work without problem

Checklist

  • Tests (passes after modifying it for the new behavior)
  • Documentation
  • Changelog

…se (Graph Tool or NetworkX)

There is also the "default" option, which will mimic the original behaviour which is to try Graph Tool first followed by (trying to) fall back on NetworkX
If you specifically select a backend which isn't available, an error will be raised.
I also had to update/modify many of the tests to implement this new behaviour, but now everything is passing again.
@RMeli
Copy link
Owner

RMeli commented Feb 25, 2024

Thank you very much @Jnelen for your contribution. #68 is a very important issue I had not time to work on for a long time, so it's great you want to fix it! (It will also allow to easily support other backends, as discussed in #69).

However, I would like to have a set_beckend function that works transparently, i.e. something that does not require to temper with the API. This is somewhat hinted by the code snippet in #68 (where I tempered with cached modules), but I should have been more explicit. Apologies for the confusion!

Your current solution, for example, changes the very simple to_graph functions to a complex one with many different branches. IMHO, it is not the to_graph function's task to check the backend and select the appropriate one. Another hint that this is not an ideal solution, is that you had to add the same code to _rmsd_isomorphic_core.

Likewise, with a transparent set_beckend function it should be reasonably straightforward to have a test configuration that runs once with set_backend("nx") and another time with set_backend("gt"), without the need to explicitly change many different tests. One goal of #68 is to reduce the size of the CI matrix by testing both backends with the same configuration, and with the current solution this would require to also change all rmsd tests to explicitly use the desired backend.

I hope you won't feel discouraged by these comments. I just think that there is a better design which would be less intrusive and more scalable. Do you think you could change this PR into something on those lines? (If you want to discuss the design further before doing a lot of code changes, feel free to comment under the original issue or open a separate GitHub Discussion.)

@Jnelen
Copy link
Contributor Author

Jnelen commented Feb 26, 2024

Thank you very much @Jnelen for your contribution. #68 is a very important issue I had not time to work on for a long time, so it's great you want to fix it! (It will also allow to easily support other backends, as discussed in #69).

However, I would like to have a set_beckend function that works transparently, i.e. something that does not require to temper with the API. This is somewhat hinted by the code snippet in #68 (where I tempered with cached modules), but I should have been more explicit. Apologies for the confusion!

Your current solution, for example, changes the very simple to_graph functions to a complex one with many different branches. IMHO, it is not the to_graph function's task to check the backend and select the appropriate one. Another hint that this is not an ideal solution, is that you had to add the same code to _rmsd_isomorphic_core.

Likewise, with a transparent set_beckend function it should be reasonably straightforward to have a test configuration that runs once with set_backend("nx") and another time with set_backend("gt"), without the need to explicitly change many different tests. One goal of #68 is to reduce the size of the CI matrix by testing both backends with the same configuration, and with the current solution this would require to also change all rmsd tests to explicitly use the desired backend.

I hope you won't feel discouraged by these comments. I just think that there is a better design which would be less intrusive and more scalable. Do you think you could change this PR into something on those lines? (If you want to discuss the design further before doing a lot of code changes, feel free to comment under the original issue or open a separate GitHub Discussion.)

Hi @RMeli ,

I agree with all your points. I thought my approach would be fine for almost all end users, since the basic rmsd functions are unchanged. However, I do agree it adds some complexity, and anything using graph.py previously would break. I should have thought that through better! I will try to do it again, but would you mind if I do it in a seperate pull request? For now I'd prefer to keep this open as a reference. If I get the new version to work i'll put it as a seperate PR, and we can close/remove this one.

@RMeli
Copy link
Owner

RMeli commented Feb 26, 2024

I thought my approach would be fine for almost all end users, since the basic rmsd functions are unchanged.

I think for the end-user having to specify the backend is not a big problem, but I'd prefer to have a clean API (i.e. separation of concerns). And the main issue is maintainability. I would like to see the number of backends grow (even GPU-accelerated ones if needed), and that is why I would like to have them as loosely coupled as possible with the rest of the code base.

I should have thought that through better!

Designing software is often a long series of tries. I thought about this quite a bit, and that's why I have something in mind that I think would be clean. But happy to be proven wrong/be convinced otherwise!

The only thing I can suggest going forward, is to discuss big design changes beforehand to avoid wasting your time (although discussing on actual code on a PR is much easier and more concrete).

For now I'd prefer to keep this open as a reference. If I get the new version to work i'll put it as a seperate PR, and we can close/remove this one.

I have no issues keeping this PR open (although all the history/code will remain in the closed PR too). If you still want to work on this, I'd be happy to look at a separate PR. I can also try to better formulate the solution I'm looking for in #68, to avoid misunderstandings.

@Jnelen
Copy link
Contributor Author

Jnelen commented Feb 27, 2024

I'm going to give it another shot, with an approach that I hope will align better with your view. From there we can see how to further improve it if necessary! Thanks for the feedback and insights!

@Jnelen Jnelen closed this Mar 1, 2024
@Jnelen Jnelen deleted the backend_selection branch March 7, 2024 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants