-
Notifications
You must be signed in to change notification settings - Fork 2
242 move wyckoff to positions #247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
- TODO: connect to other functionalities
-- TODO: resolve missing population of `site_symmetry` - Add standard naming to Wyckoff positions
- Remove site_symmetry quantity from WyckoffPosition class - Remove unpopulated degrees_of_freedom, equivalent_coordinates, and coordinate_expressions fields - Fix coordinate handling to only set fractional coordinates for crystallographic systems - Clean up _populate_wyckoff_positions method
This reverts commit d6b4dca.
This reverts commit 9bb52ca.
This reverts commit 7cb7682.
…ites`, i.e. geometric only.
Incorporate `log` decorator
Pull Request Test Coverage Report for Build 17124103302Details
💛 - Coveralls |
|
Main modifications:
Open questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses issue #242 by replacing the AtomicCell class with a more generic Cell class and moving Wyckoff position information from cell-level to system-level storage in ModelSystem. The change simplifies the model system representation while consolidating crystallographic data at the appropriate hierarchical level.
Key changes:
- Replace
AtomicCellwithCellclass, removing Wyckoff-specific quantities - Move Wyckoff position data to
ModelSystem.wyckoff_siteswith enhanced documentation - Update all references and tests to use the new
Cellclass
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/nomad_simulations/schema_packages/model_system.py | Core refactoring: removes AtomicCell class, adds wyckoff_sites to ModelSystem, updates method names and references |
| tests/*.py | Updates test files to use Cell instead of AtomicCell |
| tests/conftest.py | Updates fixture generation functions to return Cell objects |
| docs/model_system/model_system.md | Adds comprehensive documentation for the new ModelSystem structure |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| sec_symmetry = self.m_create(Symmetry) | ||
| sec_symmetry.normalize(archive, logger) | ||
|
|
||
| self.wyckoff_sites = self._compute_wyckoff_sites(sec_symmetry) |
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _compute_wyckoff_sites method is called with a sec_symmetry parameter, but the method definition does not accept any parameters. This will cause a TypeError at runtime.
| return True | ||
|
|
||
| @log | ||
| def _compute_wyckoff_sites(self) -> list[str]: |
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method signature indicates no parameters, but it's called with a sec_symmetry parameter on line 1357. Either add the parameter to the method signature or remove it from the call site.
| def _compute_wyckoff_sites(self) -> list[str]: | |
| def _compute_wyckoff_sites(self, sec_symmetry=None) -> list[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes this is correct
| type=str, | ||
| description=""" | ||
| Any verbose naming refering to the ModelSystem. Can be left empty if it is a simple | ||
| Any verbose naming referring to the `ModelSystem`. Can be left empty if it is a simple |
Copilot
AI
Aug 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The word 'referring' should be 'refering' to match the existing style, or consider using 'related to' for clarity.
| Any verbose naming referring to the `ModelSystem`. Can be left empty if it is a simple | |
| Any verbose naming related to the `ModelSystem`. Can be left empty if it is a simple |
|
One option for |
|
@ladinesa Feel free to comment on the use of the |
maybe do it in a separate pr. |
From the sprint meeting, we decided to have a single |
JFRudzinski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ndaelman-hu Unfortunately I don't have time for an in-depth review, but at a glance it looks good to me, and we discussed quite a bit about it, so I feel like we are on the same page.
The docs I completely did not read, but I will do a review anyway at a later time. So, from my side this is fine to merge, I hope someone else can give you a more detailed review.
One question: should we make Cell() non-repeating, or was there still some debate about this?
Thx, I'll give the other reviewers some more time.
Yes, it is addressed in issue #240. Will soon have a PR up there too. |
| 2. **Primitive Cell** (`type='primitive'`) | ||
| - Smallest possible unit cell that tiles the crystal | ||
| - Contains minimum number of lattice points | ||
| - Used for certain calculations (e.g., band structures) | ||
|
|
||
| 3. **Conventional Cell** (`type='conventional'`) | ||
| - Standardized cell following International Tables conventions | ||
| - Used for symmetry analysis and crystallographic databases | ||
| - **Typically used to define symmetry operators:** ["x,y,z", "-x,y,-z", "x+1/2,y+1/2,z", "-x+1/2,y+1/2,-z"] (conventional) vs ["x,y,z", "-x,-y,z", "-x,y,-z", "x,-y,-z"] (primitive). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be maybe noted here that primitive and conventional systems may not directly correspond to the original cell: they will have different densities and "idealized" lattice parameters to match space group that they were matched with certain precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add this as a clarification to the description.
they will have different densities
Not sure what you mean with densities here: mass, electronic? The former stays the same, the latter changes exactly like the geometry. However, k-samplings should change too, and this is not acknowledged (or addressed) anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ow, upon re-read, i think I understand it now. You're referring to deviations in the cell volume due to the symmetrization accuracy constraint.
| ### Wyckoff Site Annotations | ||
|
|
||
| `ModelSystem` provides Wyckoff site information through the `wyckoff_sites` quantity: | ||
|
|
||
| ```python | ||
| model_system.wyckoff_sites = ['a4', 'b2', 'c1'] # Format: letter + multiplicity | ||
| ``` | ||
|
|
||
| **Key features:** | ||
| - Format: <letter><multiplicity> (e.g., 'a1', 'b2') | ||
| - Crystallographic standard: Based on International Tables for Crystallography | ||
| - Geometric classification: Determined by atomic positions only (chemical species ignored) | ||
| - Per-atom assignment: Each entry corresponds to the position at the same index | ||
|
|
||
| Important: Wyckoff assignments are computed relative to the conventional cell representation, ensuring database compatibility and literature consistency. | ||
|
|
||
| For complete structural uniqueness, combine Wyckoff information with chemical composition data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest that wyckoff sites would have their own model that holds the letter, the multiplicity + any wyckoff parameters.
Also worth noting that also the primitive and conventional cells might need their own Wyckoff sites if we e.g. want to visualize them. But maybe this is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're suggesting splitting the format here? Is this for convention reasons, or maybe querying?
With "Wyckoff parameters", do you mean of the analysis itself, or the symmetry operations at each site?
The representation of these operations is indeed cell-dependent. The locations should simply match the geometric transformations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With wyckoff parameters, I mean that some Wyckoff positions are parametrized with up to three parameters (x, y, z). Now whether it is important to know these parameters or not is another question. MatID can give out this information if you explicitly ask, see and example here: https://nomad-coe.github.io/matid/docs/learn/symmetry-analysis#wyckoff-sets.
If you think that just having the letter+multiplicity is enough, then having a list of strings should be fine. But this structure cannot be easily extended later, unlike a list of objects (list of objects does have other problems, like the fact that in ES you require nested queries to combine information from several fields).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main application of Wyckoff would be in visualization, yes.
I can only imagine somebody looking for an element (or with a slab, an adsorbate) being present at a specific site. This would be a very sophisticated query, though. I think users would have to be taught about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the queries, I'll follow your suggestion up: I think we would want to support queries like
- find me materials where the representative position
['1/4', '0', '1/2']that has multiplicity 6. - find me materials (of symmetry ...) where
Feoccupies ab8site. This might be more complex under our current setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find me materials (of symmetry ...) where
Feoccupies ab8site. This might be more complex under our current setup.
To my understanding, this kind of query would have to be nested in ES. That is to say, we need an object like [{element: Xx, wyckoff_letter: x, wyckoff_number: x, ...}, ...], similar to the proposal above. A flat array mirroring positions as I had originally, is likely nicer for visualization: no need to scan and flatten the object. Considering compatibility with the current schema:
- high: the flat object and global symmetry descriptors.
- low: Wyckoff objects less so. They'd need particle data for the nested query, but they are only appropriate for crystalline system. An intermediate solution would be to have the
particle_indicestwice, but we've been trying to avoid double data, as it tends to confuse users.
If we want allow for nested queries in data (and that's a big IF), it might be nicer to have a specialized section, e.g. data/query, where we bundle any relevant data to power nested queries. This would clearly denote that this section solves a purely technical problem, as well as suggest a potential (more sophisticated) queries. The core schema can meanwhile grow in a more intuitive way for scientists and domain experts.
This is were my discussion with @JFRudzinski arrived at. @lauri-codes We would really appreciate your opinion here! The suggestion here could act as a template for future cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the summary @ndaelman-hu
I will just clarify that data/query or results/query are equally viable, or perhaps even a different structure, this is a matter of design to be dictated by you @lauri-codes
|
@lauri-codes There was some confusion about whether the Wyckoff analysis only considers the positions, or also accounts for the different atomic species at these positions. Can you confirm that it is in fact the latter? While including elements into the analysis is very sensible, it raises the question whether we should also consider species with different computational settings? There are cases where these settings still represent the same element, or those that were the settings model a new physical reality, e.g. core-hole excitations. We could also run multiple Wyckoff analyses with each possible setting (positions only, chemical elements, computational settings). |
|
Yes, Wyckoff analysis (and symmetry analysis in general) does take the species identity into account. You can yourself explicitly remove the species information of course before the analysis, but this completely changes the symmetries (e.g. rock salt becomes simple cubic). I guess it is difficult to draw the line when a certain particle has been altered in it's behaviour so much that it has become a different "species". I would go with some pragmatic approach here that does not completely break the searchability of data based on symmetry. |
If feasible, I would go with option 1. |
See issue #242