Commit aa770a5 (1 parent: 9568e18)

first pass of pre-commit hooks
a few small updates;

216 files changed: +3900 -2560 lines

.github/ISSUE_TEMPLATE/feature_request.yml (-1)

```diff
@@ -39,4 +39,3 @@ body:
     attributes:
       label: Additional notes
       description: Any additional context, screenshots, etc. that may help with the discussion and implementation.
-
```

.gitignore (+12)

```diff
@@ -7,6 +7,18 @@ logs
 experimental
 lightning_logs
 
+TEMP/
+TESTFILE.py
+data_explore.py
+examples_folder_log.txt
+faenet_test.py
+lips_splits.py
+matsciml/datasets/materials_project/devset-full/
+msl-ptl2-venv/
+pyg-venv/
+replace_substring.py
+run_examples.sh
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
```

.pre-commit-config.yaml (+1)

```diff
@@ -42,6 +42,7 @@ repos:
     hooks:
       - id: isort
         name: isort (python)
+        args: ["--profile=black"]
  - repo: https://github.com/psf/black
    rev: 23.7.0
    hooks:
```
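Passing `--profile=black` makes isort's import sorting agree with Black's formatting, so the two hooks stop rewriting each other's output. For context, a minimal sketch of how this pair of hooks fits together in a `.pre-commit-config.yaml`; the Black repo URL and rev come from the diff above, while the isort repo URL and rev shown here are placeholder assumptions, not values from this commit:

```yaml
repos:
  - repo: https://github.com/PyCQA/isort  # assumed source repo for the isort hook
    rev: 5.12.0  # placeholder rev; pin to whatever the project actually uses
    hooks:
      - id: isort
        name: isort (python)
        args: ["--profile=black"]  # align import style with Black
  - repo: https://github.com/psf/black
    rev: 23.7.0
    hooks:
      - id: black
```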

CONTRIBUTING.md (+5-5)

```diff
@@ -75,7 +75,7 @@ are not needed as explicit arguments.
 If variables/features are required by the model, one can override the `read_batch` method. See the [MPNN](https://github.com/IntelLabs/matsciml/blob/main/matsciml/models/dgl/mpnn.py)
 wrapper to see how this pattern can be used to check for data within a batch.
 
-Aside from implementing the `_forward` method of the model itself, the constituent building blocks should be broken up into their own files, respective to what their functions are. For example, layer based classes and utilities should be placed into a `layers.py` file, and other helpful functions can be placed in a `helper.py` or `utils.py` file.
+Aside from implementing the `_forward` method of the model itself, the constituent building blocks should be broken up into their own files, respective to what their functions are. For example, layer based classes and utilities should be placed into a `layers.py` file, and other helpful functions can be placed in a `helper.py` or `utils.py` file.
 
 Completed models can be added to the list of imports in `./matsciml/models/<framework>/__init__.py`, where `<framework>` can be `dgl` or `pyg`.
 
@@ -108,7 +108,7 @@ class AmazingModel(AbstractPyGModel):
 ### DGL models
 
 DGL does not provide a class to inherit from for the message passing step, and instead, relies
-on users to define user-defined functions (`udf`), and extensive use of graph scopes.
+on users to define user-defined functions (`udf`), and extensive use of graph scopes.
 
 We recommend reviewing the [MPNN](https://github.com/IntelLabs/matsciml/blob/main/matsciml/models/dgl/mpnn.py) wrapper
 to see a simplified case, and the [MegNet](https://github.com/IntelLabs/matsciml/tree/main/matsciml/models/dgl/megnet) implementation
@@ -132,7 +132,7 @@ for this type of model.
 - Provide proper documentation on how to access, use, and understand the data.
 - Make sure to include data preprocessing scripts if applicable.
 
-Adding a dataset usually involves interacting with an external API to query and download data. If this is the case, a separate `{dataset}_api.py` and `dataset.py` file can be used to separate out the functionalities. In the API file, a default query can be used to save data to lmdb files, and do any initial preprocessing necessary to get the data into a usable format. Keeping track of material ID's and the status of queries.
+Adding a dataset usually involves interacting with an external API to query and download data. If this is the case, a separate `{dataset}_api.py` and `dataset.py` file can be used to separate out the functionalities. In the API file, a default query can be used to save data to lmdb files, and do any initial preprocessing necessary to get the data into a usable format. Keeping track of material ID's and the status of queries.
 
 The main dataset file should take care of all of the loading, processing and collating needed to prepare data for the training pipeline. This typically involves adding the necessary key-value pairs which are expected, such as `atomic_numbers`, `pc_features`, and `targets`.
 
@@ -144,7 +144,7 @@ The existing dataset's should be used as a template, and can be expanded upon de
 - Follow our testing framework and naming conventions.
 - Verify that all tests pass successfully before making a pull request.
 
-Tests for each new model and datasets should be added to their respective tests folder, and follow the conventions of the existing tests. Task specific tests may be added to the model folder itself. All relevant tests must pass in order for a pull request to be accepted and merged.
+Tests for each new model and datasets should be added to their respective tests folder, and follow the conventions of the existing tests. Task specific tests may be added to the model folder itself. All relevant tests must pass in order for a pull request to be accepted and merged.
 
 Model tests may be added [here](https://github.com/IntelLabs/matsciml/tree/main/matsciml/models/dgl/tests), and dataset tests may be added to their respective dataset folders when created.
 
@@ -164,4 +164,4 @@ __If it is your first pull request, please ensure you add your name to the [cont
 
 We appreciate your dedication to making our project better and look forward to your contributions! If you have any questions or need assistance, feel free to reach out through the issue tracker or discussions section.
 
-Thank you for being a part of our open-source community!
+Thank you for being a part of our open-source community!
```
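The `read_batch`/`_forward` split that CONTRIBUTING.md describes can be illustrated with a small, self-contained sketch. `AbstractModel`, `AmazingModel`, and the batch keys below are illustrative stand-ins that mimic the pattern, not matsciml's actual classes or API:

```python
from abc import ABC, abstractmethod


class AbstractModel(ABC):
    """Illustrative stand-in for matsciml's abstract model base classes."""

    def read_batch(self, batch: dict) -> dict:
        # Default behaviour: pull out only the inputs every model needs,
        # so task-specific keys are not needed as explicit arguments.
        return {"features": batch["features"]}

    @abstractmethod
    def _forward(self, features):
        """Subclasses implement the actual computation here."""

    def forward(self, batch: dict):
        # The framework calls forward(); model authors only write _forward().
        return self._forward(**self.read_batch(batch))


class AmazingModel(AbstractModel):
    def read_batch(self, batch: dict) -> dict:
        # Override read_batch to check for extra data within a batch.
        data = super().read_batch(batch)
        data["scale"] = batch.get("scale", 1.0)
        return data

    def _forward(self, features, scale=1.0):
        return scale * sum(features)
```

The point of the pattern is that a model never has to know how a batch is assembled; it only declares, via `read_batch`, which keys it consumes.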

LICENSE.md (+1-1)

```diff
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+SOFTWARE.
```

README.md (+3-3)

````diff
@@ -97,9 +97,9 @@ For more advanced use cases:
 Checkout materials generation with CDVAE
 </summary>
 
-CDVAE [7] is a latent diffusion model that trains a VAE on the reconstruction
+CDVAE [7] is a latent diffusion model that trains a VAE on the reconstruction
 objective, adds Gaussian noise to the latent variable, and learns to predict
-the noise. The noised and generated features inlcude lattice parameters,
+the noise. The noised and generated features inlcude lattice parameters,
 atoms composition, and atom coordinates.
 The generation process is based on the annealed Langevin dynamics.
 
@@ -140,7 +140,7 @@ Multiple tasks trained using the same dataset
 python examples/tasks/multitask/single_data_multitask_example.py
 ```
 
-Utilizes Materials Project data to train property regression and material classification jointly
+Utilizes Materials Project data to train property regression and material classification jointly
 </details>
 
 <details>
````

Security.md (+1-2)

```diff
@@ -1,6 +1,5 @@
 # Security Policy
-Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation.
+Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation.
 
 ## Reporting a Vulnerability
 Please report any security vulnerabilities in this project [utilizing the guidelines here](https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html).
-
```

docker/Dockerfile (+3-3)

```diff
@@ -7,7 +7,7 @@ FROM nvidia/cuda:$CUDA_VERSION
 
 ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
 # Avoids some interactive prompts during apt-get install
-ARG DEBIAN_FRONTEND=noninteractive
+ARG DEBIAN_FRONTEND=noninteractive
 
 # clean up and refresh apt-get index
 RUN apt-get update && \
@@ -33,7 +33,7 @@ RUN apt-get update --fix-missing && \
     sudo \
     software-properties-common \
     python3.9 \
-    python3-pip \
+    python3-pip \
     virtualenv && \
     apt-get clean && rm -rf /var/cache/apt/archives /var/lib/apt/lists/*
 
@@ -64,4 +64,4 @@ RUN pip install matminer
 RUN pip install p_tqdm
 RUN pip install -U pytorch-lightning==1.8.6
 RUN pip install -U torchmetrics==0.11.4
-RUN pip install -U pytest
+RUN pip install -U pytest
```

examples/datasets/carolina_db/single_task_devset.py (+8-5)

```diff
@@ -1,11 +1,12 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
+from matsciml.datasets.transforms import PointCloudToGraphTransform
 from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import PLEGNNBackbone
 from matsciml.models.base import ScalarRegressionTask
-from matsciml.datasets.transforms import PointCloudToGraphTransform
-
 
 # configure a simple model for testing
 model_args = {
@@ -57,9 +58,11 @@
     dset_kwargs={
         "transforms": [
             PointCloudToGraphTransform(
-                "dgl", cutoff_dist=20.0, node_keys=["pos", "atomic_numbers"]
-            )
-        ]
+                "dgl",
+                cutoff_dist=20.0,
+                node_keys=["pos", "atomic_numbers"],
+            ),
+        ],
     },
 )
 
```

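This and the following example-script diffs all prepend `from __future__ import annotations` (PEP 563, postponed evaluation of annotations). Under this future import, annotations are stored as strings and never evaluated at runtime, so newer annotation syntax and not-yet-defined names stay harmless on older interpreters. A small illustrative sketch, where `Connection` is deliberately left undefined:

```python
from __future__ import annotations


def connect(host: str, retries: int | None = None) -> Connection:
    """'Connection' is never evaluated, so it need not be defined anywhere."""
    return (host, retries)


# With postponed evaluation, annotations are kept as plain strings
# instead of being evaluated at function-definition time:
assert connect.__annotations__["return"] == "Connection"
assert connect.__annotations__["retries"] == "int | None"
```

Without the future import, the `int | None` syntax and the undefined `Connection` name would both raise on Python versions older than 3.10.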
examples/datasets/materials_project/single_task_base.py (+3-1)

```diff
@@ -1,8 +1,10 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
-from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.datasets.transforms import PointCloudToGraphTransform
+from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import GraphConvModel
 from matsciml.models.base import ScalarRegressionTask
 
```

examples/datasets/materials_project/single_task_devset.py (+3-2)

```diff
@@ -1,10 +1,11 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 
+from matsciml.datasets.transforms import PointCloudToGraphTransform
 from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import GraphConvModel
 from matsciml.models.base import ScalarRegressionTask
-from matsciml.datasets.transforms import PointCloudToGraphTransform
-
 
 # configure a simple model for testing
 model = GraphConvModel(100, 1, encoder_only=True)
```

examples/datasets/materials_project/single_task_egnn.py (+9-5)

```diff
@@ -1,10 +1,12 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
-from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.datasets.transforms import PointCloudToGraphTransform
+from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import PLEGNNBackbone
-from matsciml.models.base import ScalarRegressionTask, BinaryClassificationTask
+from matsciml.models.base import ScalarRegressionTask
 
 pl.seed_everything(21616)
 
@@ -56,9 +58,11 @@
     dset_kwargs={
         "transforms": [
             PointCloudToGraphTransform(
-                "dgl", cutoff_dist=20.0, node_keys=["pos", "atomic_numbers"]
-            )
-        ]
+                "dgl",
+                cutoff_dist=20.0,
+                node_keys=["pos", "atomic_numbers"],
+            ),
+        ],
     },
     val_split=0.2,
     batch_size=16,
```

examples/datasets/materials_project/single_task_gala.py (+2-1)

```diff
@@ -1,11 +1,12 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
 from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import GalaPotential
 from matsciml.models.base import ScalarRegressionTask
 
-
 model_args = {
     "D_in": 100,
     "hidden_dim": 128,
```

examples/datasets/materials_project/single_task_symmetry.py (+9-5)

```diff
@@ -1,8 +1,10 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
-from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.datasets.transforms import PointCloudToGraphTransform
+from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import GraphConvModel
 from matsciml.models.base import CrystalSymmetryClassificationTask
 
@@ -25,13 +27,15 @@
 # the base set is required because the devset does not contain symmetry labels
 dm = MatSciMLDataModule(
     dataset="MaterialsProjectDataset",
-    train_path='./mp-project/base/train',
+    train_path="./mp-project/base/train",
     dset_kwargs={
         "transforms": [
             PointCloudToGraphTransform(
-                "dgl", cutoff_dist=20.0, node_keys=["pos", "atomic_numbers"]
-            )
-        ]
+                "dgl",
+                cutoff_dist=20.0,
+                node_keys=["pos", "atomic_numbers"],
+            ),
+        ],
     },
     val_split=0.2,
     batch_size=16,
```

examples/datasets/nomad/single_task_devset.py (+9-8)

```diff
@@ -1,13 +1,12 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
+from matsciml.datasets.transforms import PointCloudToGraphTransform
 from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import PLEGNNBackbone
-from matsciml.models.base import (
-    ScalarRegressionTask,
-)
-from matsciml.datasets.transforms import PointCloudToGraphTransform
-
+from matsciml.models.base import ScalarRegressionTask
 
 # configure a simple model for testing
 model_args = {
@@ -59,9 +58,11 @@
     dset_kwargs={
         "transforms": [
             PointCloudToGraphTransform(
-                "dgl", cutoff_dist=20.0, node_keys=["pos", "atomic_numbers"]
-            )
-        ]
+                "dgl",
+                cutoff_dist=20.0,
+                node_keys=["pos", "atomic_numbers"],
+            ),
+        ],
     },
 )
 
```

examples/datasets/oqmd/single_task_devset.py (+8-5)

```diff
@@ -1,11 +1,12 @@
+from __future__ import annotations
+
 import pytorch_lightning as pl
 from torch.nn import LayerNorm, SiLU
 
+from matsciml.datasets.transforms import PointCloudToGraphTransform
 from matsciml.lightning.data_utils import MatSciMLDataModule
 from matsciml.models import PLEGNNBackbone
 from matsciml.models.base import ScalarRegressionTask
-from matsciml.datasets.transforms import PointCloudToGraphTransform
-
 
 # configure a simple model for testing
 model_args = {
@@ -56,9 +57,11 @@
     dset_kwargs={
         "transforms": [
             PointCloudToGraphTransform(
-                "dgl", cutoff_dist=20.0, node_keys=["pos", "atomic_numbers"]
-            )
-        ]
+                "dgl",
+                cutoff_dist=20.0,
+                node_keys=["pos", "atomic_numbers"],
+            ),
+        ],
     },
     num_workers=0,
 )
```
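Across all of these example scripts, the same `PointCloudToGraphTransform(...)` call is reflowed into Black's one-argument-per-line style with trailing commas, and transforms are passed to the data module as an ordered list. The mechanics of that pattern are easy to reproduce in a standalone sketch; `ToyDataModule` and `scale_positions` below are hypothetical names that mimic (not reproduce) the `dset_kwargs={"transforms": [...]}` pattern above:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyDataModule:
    """Hypothetical stand-in for a data module that accepts transforms."""

    transforms: list = field(default_factory=list)

    def process(self, sample: dict) -> dict:
        # Apply each transform in the order it was listed.
        for transform in self.transforms:
            sample = transform(sample)
        return sample


def scale_positions(factor: float) -> Callable:
    # Hypothetical transform: rescale the "pos" entry of a sample.
    def _apply(sample: dict) -> dict:
        sample["pos"] = [factor * x for x in sample["pos"]]
        return sample

    return _apply


dm = ToyDataModule(
    transforms=[
        scale_positions(2.0),  # trailing comma keeps Black's vertical layout
    ],
)
```

The trailing commas are not cosmetic only: Black's "magic trailing comma" keeps such calls exploded one-per-line, which is why the reformatted diffs above all end each argument with a comma.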
