feature: IMDReader Integration #4923
base: develop
Hello there, first-time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first-time contributors introduce themselves on GitHub Discussions so we can get to know you. You can learn more about participating here. Please also add yourself to `package/AUTHORS` as part of this PR.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #4923 +/- ##
===========================================
- Coverage 93.62% 93.58% -0.05%
===========================================
Files 177 178 +1
Lines 22001 22174 +173
Branches 3114 3138 +24
===========================================
+ Hits 20599 20752 +153
- Misses 947 958 +11
- Partials 455 464 +9
Your thoughts on this are appreciated - @orbeckst
Thanks for the initial PR.
- The first big step is to get the tests running properly so that the CI uses an imdclient without IMDReader. Otherwise we are not sure we're testing the code here.
- Minor initial comments while I skimmed.
- Simple thing: run `black` over all files to get the formatting and ordering of imports right.
imdclient:
  default: 'imdclient'
It's going to take time to get imdclient to a stage where the imdclient package does not actually affect MDAnalysis.
Is there a way that we could temporarily (for initial CI testing) install imdclient from a branch or tarball, e.g., in a `pip` section? Then we could fairly rapidly create a preliminary (unpublished) imdclient package without IMDReader.
By initial CI testing, do you mean "in this PR"?
There's a pip section just below, which should work if you put in the git location for pip install, but also you can just temporarily modify the CI script to do an additional pip install if it's for "testing within the PR itself".
If it's "after merge", this would require more discussion.
Just for right now to bootstrap the PR.
I don't want to merge without a solid conda-forge imdclient package in place.
Just put this code to where it's needed, e.g., directly into IMD.py.
Do not add a util.py here, we need to keep this module as clean as possible because it's already quite crowded.
@@ -79,6 +79,7 @@ extra_formats = [
"pytng>=0.2.3",
"gsd>3.0.0",
"rdkit>=2020.03.1",
"imdclient",
We almost certainly need to add a minimal version.
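Besides pinning a floor in `extra_formats`, the test suite could check the installed version before enabling the IMD tests. The following is only a sketch: `MIN_IMDCLIENT_VERSION`, `parse_version`, and `has_min_version` are hypothetical names, and the actual minimum version has not been decided in this thread.

```python
from importlib import metadata

# Hypothetical floor; the real minimum version is not settled in this PR.
MIN_IMDCLIENT_VERSION = (0, 1, 0)


def parse_version(text):
    """Turn '0.1.4' or '0.2.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            # Stop at the first component with no leading digits (e.g. 'dev3').
            break
        parts.append(int(digits))
    return tuple(parts)


def has_min_version(package, minimum):
    """Return True if `package` is installed at `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= minimum


# Flag in the spirit of the HAS_IMDCLIENT guard used by the tests.
HAS_IMDCLIENT = has_min_version("imdclient", MIN_IMDCLIENT_VERSION)
```

The tuple comparison is deliberately simplistic (it ignores pre-release ordering); if `packaging` is available, `packaging.version.Version` would be the more robust choice.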
I set the PR to Work in progress for the time being, just to indicate that we're not yet at the stage where the CI is working. Once the tests run properly, we can update the status. Obviously, this shouldn't discourage anyone from contributing and commenting.
A first quick look
I read through and nothing jumped out at me that wasn't already mentioned, except that I didn't see these changes in the documentation that was linked. The following needs to be added:
- Add `doc/sphinx/source/documentation_pages/coordinates/IMD.rst` containing `.. automodule:: MDAnalysis.coordinates.IMD`
- In `doc/sphinx/source/documentation_pages/coordinate_modules.rst`, add `coordinates/IMD`
- In `doc/sphinx/source/documentation_pages/references.rst`, add "If you use IMD capability..."
Sorry for the delay. I had a look over and will try to push some changes addressing some of these myself, but it would be good to pick up the momentum here again if possible.
@amruthesht do we have a plan for pushing ahead with suggested changes?
Working on the imdclient repository right now. I had a few small issues there with using the
1. Moved `parse_host_port` to `IMD.py` and deleted `util.py` 2. Cleaned up `test_imd.py` - changes to `assert_*` functions and simplified non-applicable test to pass automatically
@amruthesht kicking CI here.
@yuxuanzhuang would you be able to have a run through?
I will have another look at the PR this week. One immediate thing is to add documentation. Please refer to the comments below.
Need to figure out tests also @amruthesht
Fail here is due to a circular import which I think comes from https://github.com/Becksteinlab/imdclient/blob/main/imdclient/IMD.py#L11-L12. Removal of MDA as a direct dep in Becksteinlab/imdclient#65 (comment) will likely fix this issue, but I also could be wrong about the source of the circular import. Does that jibe with your understanding @orbeckst?
Yes, IMD.py seems to be the culprit. This problem will be gone once we have completed the split in imdclient.
The tests seem to be memory-consuming (which leads to the Windows machine error?):
##[warning]Free memory is lower than 5%; Currently used: 96.93%
Aside from that, this PR is in good shape. Maybe increase the coverage as well.
@amruthesht @ljwoods2 is high memory use a known thing with the IMDReader?
@hmacdope It looks like these VMs have 7GB RAM. In theory, the IMDClient should only be using 10MB of mem during each test by default for numpy arrays in IMDFrames. Could be a deeper issue, but first we should probably answer
If it's nothing else, it will require a deeper look at mem usage. I will try to reproduce if that's the case.
From memory it's run in parallel; I will look into forcing serial. Seems OK on ubuntu-latest, so we could just control it with an env var at worst.
.. code-block:: python

    import MDAnalysis as mda
    u = mda.Universe("topol.tpr", "imd://localhost:8889", buffer_size=10*1024*1024)
@amruthesht @ljwoods2 what does buffer size do? How to tune it needs to be described.
We should probably have an explicit kwarg for it on the reader, right now I think it just gets passed in to the client from **kwargs
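A rough sketch of what an explicit keyword could look like is below. `SketchReader` and `StubClient` are hypothetical stand-ins so the example runs without imdclient; they are not the actual `IMDReader` or `IMDClient` signatures, and the default simply mirrors the `10*1024*1024` value used in the docstring example.

```python
class StubClient:
    """Stand-in for the real client so the sketch runs without imdclient."""

    def __init__(self, host, port, n_atoms, **kwargs):
        self.host = host
        self.port = port
        self.n_atoms = n_atoms
        self.kwargs = kwargs


class SketchReader:
    """Hypothetical reader with an explicit, documented buffer_size."""

    def __init__(self, host, port, n_atoms, buffer_size=10 * 1024**2, **kwargs):
        # Validate here so a bad value fails with a clear message at the
        # reader level rather than somewhere inside the client.
        if buffer_size <= 0:
            raise ValueError("buffer_size must be a positive number of bytes")
        self.buffer_size = buffer_size
        # Forward the value explicitly instead of relying on **kwargs.
        self._client = StubClient(
            host, port, n_atoms, buffer_size=buffer_size, **kwargs
        )
```

Naming the argument in the signature makes it visible in the generated API docs and gives a natural place to document its meaning and default.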
Details about the IMD protocol and usage examples can be found in the
`imdclient <https://github.com/Becksteinlab/imdclient>`_ repository.
Also need a full example in user-guide most likely, but not blocking for this PR.
@amruthesht could you please raise an issue in https://github.com/MDAnalysis/UserGuide/issues to add documentation for using streaming with MDAnalysis? Include a short bullet-pointed outline of what will be needed so that users (who normally read the UserGuide and not the MDA API docs) can immediately use streaming with IMDv3. E.g., as a starting point
- summary of what it does
- why to use it,
- how to install (link to imdclient/MD engine),
- simple example iterating and pulling information from trajectory
- current limitations
Imagine being a new user. What do you want to see there?
Thanks!
"""

format = "IMD"
one_pass = True
No, as far as I remember that was something that exists for compatibility with the streambase we were shipping with the imdclient package (but analysis class modification will be in a different PR mda-side)
Parameters
----------
filename : a string of the form "imd://host:port" where host is the hostname
Add documentation for `buffer_size`, which gets passed to imdclient. Buffer size is a cap on the number of bytes allocated in memory for the `IMDFrameBuffer`. Increasing it will likely decrease the amount of time the simulation engine spends in a paused state, but requires more RAM.
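To give a feel for tuning, here is a back-of-the-envelope estimate of how many frames fit under a given cap. The per-frame layout is an assumption made for illustration (up to three `n_atoms x 3` arrays of 4-byte floats, e.g. positions, velocities, and forces); the real `IMDFrameBuffer` may hold additional per-frame data, so treat the result as an upper bound at best.

```python
def bytes_per_frame(n_atoms, arrays_per_frame=3, bytes_per_value=4):
    """Approximate bytes for one frame.

    Assumed layout (not taken from imdclient): `arrays_per_frame` arrays of
    shape (n_atoms, 3), each value occupying `bytes_per_value` bytes.
    """
    return n_atoms * 3 * arrays_per_frame * bytes_per_value


def frames_that_fit(buffer_size, n_atoms, **layout):
    """Whole number of frames that fit under the `buffer_size` byte cap."""
    return buffer_size // bytes_per_frame(n_atoms, **layout)


# The value from the docstring example: 10 MiB.
example_buffer = 10 * 1024 * 1024
print(frames_that_fit(example_buffer, n_atoms=100_000))
```

Under these assumptions, a 100,000-atom system needs about 3.6 MB per frame, so a 10 MiB buffer holds only two frames; a larger system or slower analysis would be a reason to raise `buffer_size`.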

# NOTE: think of other edge cases as well- should be robust
def parse_host_port(filename):
I think we moved this method to imdclient:
Should be removed then.
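For reference, the `imd://host:port` form and the edge cases mentioned in the NOTE can be handled with the standard library alone. This is a minimal sketch, not the function that now lives in imdclient; `parse_imd_url` is a hypothetical name.

```python
from urllib.parse import urlsplit


def parse_imd_url(filename):
    """Split 'imd://host:port' into (host, port), rejecting malformed input."""
    try:
        parts = urlsplit(filename)
        # Accessing .port raises ValueError for non-numeric or
        # out-of-range ports.
        port = parts.port
    except ValueError as err:
        raise ValueError(f"invalid IMD URL {filename!r}: {err}") from err
    if parts.scheme != "imd":
        raise ValueError(f"expected scheme 'imd' in {filename!r}")
    if not parts.hostname or port is None:
        raise ValueError(f"expected 'imd://host:port', got {filename!r}")
    return parts.hostname, port
```

Usage: `parse_imd_url("imd://localhost:8889")` returns `("localhost", 8889)`, while a missing port, a wrong scheme, or a non-numeric port raises `ValueError`.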
@pytest.mark.skipif(not HAS_IMDCLIENT, reason="IMDClient not installed")
def test_n_atoms_not_specified():
    universe = mda.Universe(COORDINATES_TOPOLOGY, COORDINATES_H5MD)
    port = get_free_port()
This method of finding a free port is not going to be perfectly reliable in highly parallel situations. Ideally, the `InThreadIMDServer` would bind a free port and return the number it bound somehow so that it could be passed to the reader.
These tests also rely heavily on the test methods of imdclient not changing... are we okay with that? Maybe it's worth reviewing them so we don't get stuck maintaining an API we don't like
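The port-binding idea above can be sketched with plain sockets: bind to port 0, let the OS choose, and keep the socket open so that no other process can take the port between "finding" it and using it. `bind_free_port` is a hypothetical helper, not part of `InThreadIMDServer` or imdclient.

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Bind a listening socket to an OS-chosen port.

    Returns (sock, port). The caller keeps `sock` open and serves on it,
    which avoids the race inherent in probing for a free port, closing the
    probe socket, and re-binding later.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any free port
    sock.listen(1)
    return sock, sock.getsockname()[1]


server_sock, port = bind_free_port()
try:
    # The port number would be passed on to the reader, e.g. in the URL.
    url = f"imd://localhost:{port}"
finally:
    server_sock.close()
```

A test server built this way would expose the bound port as an attribute or return value, and the test would build its `imd://` URL from it.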
For right now, some reliance on the imdclient testing infrastructure is unavoidable to move things along.
I think this mostly echoes what @hmacdope said on Discord.
With tests available and passing, this PR can definitely come out of draft status.
Super exciting! In theory should be working, just final tweaks.
@yuxuanzhuang would you be able to chip away at some of the docs tweaks and small things that need cleaning up?
Fixes #4827
This draft PR addresses the feature request discussed in #4827.
Note:
The `IMDReader` feature, which was previously a part of the `imdclient` package, has been moved into `MDAnalysis` below. Any other modules have been retained in `imdclient`, which has been added as an optional dependency here. We are currently in the process of splitting the `imdclient` package as mentioned above. (Issue, PR)
Major changes made in this Pull Request:
- `IMDReader`, other associated base classes and a utility function were added to coordinates in the main package.
- `test-imd.py`
- `*.yaml` files
- `imdclient` was added as an optional dependency
PR Checklist
- `package/CHANGELOG` file updated?
- `package/AUTHORS`? (If it is not, add it!)
Developers Certificate of Origin
I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.
📚 Documentation preview 📚: https://mdanalysis--4923.org.readthedocs.build/en/4923/