Skip to content

LinearDensity should have more reliable tests #3603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
BFedder opened this issue Apr 5, 2022 · 9 comments · May be fixed by #5013
Open

LinearDensity should have more reliable tests #3603

BFedder opened this issue Apr 5, 2022 · 9 comments · May be fixed by #5013

Comments

@BFedder
Copy link
Contributor

BFedder commented Apr 5, 2022

Is your feature request related to a problem?

Currently, some of the tests in test_lineardensity.py are basically tautologies because the expected values the tests check against were obtained by running lineardensity.py. This is the case for the following expected values:

expected_xchar_atoms
expected_xpos_residues
expected_xchar_residues
expected_xpos_segments
expected_xchar_segments
expected_xpos_fragments
expected_xchar_fragments

The current test Universe u = mda.Universe(waterPSF, waterDCD) is not ideal for determining charge densities by residues, segments, or fragments, because in these groupings the overall charge is zero.

Describe the solution you'd like

The expected values should be obtained independently from LinearDensity so that the tests are actually meaningful.

@PicoCentauri had the following suggestion for how this might be achieved:

Generate a trajectory based on random positions of particles in a cubic box. Draw the positions from a uniform distribution. Run the analysis and compare the densities. You could do three kinds of system: uncharged particles, charged particles and maybe charged dimers.

Describe alternatives you've considered

Currently, expected_xpos_atoms came from a different source. One could look into how these values were originally determined and replicate this for the other test cases.

Additional context

This came up in a discussion on PR #3572

@PicoCentauri
Copy link
Contributor

PicoCentauri commented Apr 6, 2022

Thanks for this issue. If somebody wants to start on this. The following lines can be used to generate a Universe of individual atoms.

import MDAnalysis as mda
import numpy as np
from MDAnalysis.core._get_readers import get_reader_for

def make_Universe():
    """Generate a reference universe of 100 atoms with uniformly drawn random positions."""
    n_atoms = 100
    n_frames = 100

    u = mda.Universe.empty(n_atoms=n_atoms,
                           n_residues=n_atoms,
                           n_segments=n_atoms,
                           atom_resindex=np.arange(n_atoms),
                           residue_segindex=np.arange(n_atoms))

    for attr in ["charges", "masses"]:
        u.add_TopologyAttr(attr, values=np.ones(n_atoms))

    shape = (n_frames, n_atoms, 3)
    coords = np.random.random(shape)

    u.trajectory = get_reader_for(coords)(coords,
                                          order='fac',
                                          n_atoms=n_atoms)

    for ts in u.trajectory:
        ts.dimensions = np.array([1, 1, 1, 90, 90, 90])

    return u

@olivia632
Copy link

i want to start on this

@PicoCentauri
Copy link
Contributor

@olivia632 cool!

I suggest that you calculate the density using the make_universe function and compare it to the values you would calculate based on the system parameters. The number of particles is 100, the volume is 1 Å^3, the mass per particle is 1 u and the charge is 1 e. If I am not mistaken the mass density in each bin and direction should be 100 u/Å^3 and the charge 100 e/Å^3.

@olivia632
Copy link

k I will work on that and then reach you back

@PicoCentauri
Copy link
Contributor

You can also reach me on discord if you have any general questions.

@olivia632
Copy link

there are 3 test cases failing in the original code can you clarify on that

@olivia632
Copy link

==================================== short test summary info ====================================
FAILED analysis/test_lineardensity.py::test_invalid_grouping - TypeError: 'method' object is n...
FAILED analysis/test_lineardensity.py::test_lineardensity[residues-expected_masses1-expected_charges1-expected_xpos1-expected_xchar1]
FAILED analysis/test_lineardensity.py::test_lineardensity[segments-expected_masses2-expected_charges2-expected_xpos2-expected_xchar2]
=========================== 3 failed, 2 passed, 15 warnings in 0.68s ============================

@PicoCentauri
Copy link
Contributor

@olivia632 did you figures out what is going wrong (Pull the to latest version from github etc?). The tests seem to work for the CI...

@EchoFia
Copy link

EchoFia commented Apr 3, 2025

Hello, my name is Echo (they/them), and I'm interested in having a look at this issue as part of my GSoC 2025 application (Project 5)!

My understanding of the problem:

  • For each Analysis module, there is an associated set of tests that checks most of the functionality of a module by comparing the calculated values from the Analysis module to a set of known values (for the same system).
  • For the Linear Density module, the 'known values' were originally calculated using the module (the code equivalent of using a word to define itself).
  • Also, the groupings of the used system (5x H2O) is not particularly demonstrative of charges & charge densities, as only the atoms are charged but segments (1x H2O), fragments (1x H2O) and segments (5x H2O) are all neutral.

Aim:

  • Create a new set of tests that look at 3 types of systems: uncharged particles, charged particles, and charged dimers.
  • For each system, we should calculate 4 points of data (total mass, total charge, mass density per slice, charge density per slice) for each of the 4 groupings (atoms, residues, segments, fragments) - overall, these are the '16 things'

Developing a solution:

  • We were given sample code to generate a Universe with 100 randomly positioned particles of mass and charge of +1 for 100 time steps.
  • To generate 3 system (above) with appropriate groupings:
    • Neutral particles: set the charge to 0
    • Charged particles: set the charges to be random (between -1 and 1, maybe), as this will make it unlikely that sets of particles will have neutral charge (which wouldn't be unreasonably if all charges were -1 and +1)
    • Charged dimers: define each dimer by its centre of mass (CoM) and the angles of its orientation, and then apply a motif to each point (i.e. for each point, add 2 atoms using the CoM and orientation to calculate their coords, and their masses/charges can be stored with the CoM and orientation)
  • Once the systems have been developed, we need to calculate the '16 things' (in Aim), with density being calculated as total mass (or charge) divided by volume of slice, and the numnber of slices will match with that in lineardensity.py.
  • We then use np.assert_allclose as used in the original code

Questions:

  • Are the systems supposed to be real or can we just create a theoretical system that demonstrates the mass and charge densities properly?
    • Example: Argon gas (modelled as perfect gas) for uncharged particles
    • The use of this would be in being able to compare to known values (such as the denisty of Argon at room temperature)
    • My understanding is that this is NOT relevant, as we are not simulating the particles with a force field, we are just randomising the co-ordinates at each time step, and thus the system will not be realistic
    • If so, to demonstrate the groupings, can we just create a set of x particles and divide it arbitrarily into the 3 larger groupings (with each group potentially being charged overall)?
  • Can the systems be randomised each time, or should we randomly generate a set of 3 systems (checking each demonstrates all the functionality), and then save this as hard code?
    • NOTE: I would also save the code somehow as documentation to demonstrate how the hard code was generated so it can be followed, as otherwise I find it quite hard to read the code initially as they seemingly just quoted some random numbers with no indication of their origin (that I saw, at least)
  • For a grouping of atoms, say a residue, it is my understanding that when calculating mass density, the slice that a residue is in is taken to be the slice where its CoM is in. is this correct? If not, what is the point of groupings in this module?
  • My understanding of pytest is next to nothing and I could thoroughly read through it, but it seems that this part of the code is fine, and so would it be alright to essentially copy and paste that part, and just change the systems (and the associated np.assert_allclose functions)?

...wow, that was a lot of words, sorry.

My wifi has unfortunately crashed for like a week which is a little awkward, but I'll hopefully be able to finish this before the GSoC 2025 deadline, but I'd like to keep working on it afterwards even if not. I'm also quite new to Git, and so will be asking a friend to help with how to work a Pull Request, and would appreciate any advice on how to structure comments and such as well (as this has gotten very long...).

But yeah, looking forward to working on this!

EchoFia pushed a commit to EchoFia/mdanalysis that referenced this issue Apr 5, 2025
…e feedback as this is just a rough attempt
@EchoFia EchoFia linked a pull request Apr 5, 2025 that will close this issue
5 tasks
EchoFia pushed a commit to EchoFia/mdanalysis that referenced this issue Apr 7, 2025
EchoFia pushed a commit to EchoFia/mdanalysis that referenced this issue Apr 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants