Commit 0058706

entropy renaming on README
1 parent 973c754 commit 0058706

2 files changed, 31 additions and 4 deletions

README.md

Lines changed: 4 additions & 4 deletions
@@ -61,13 +61,13 @@ To use the QUESTS package to create descriptors and compute entropies, you can u
 ```python
 from ase.io import read
 from quests.descriptor import get_descriptors
-from quests.entropy import perfect_entropy, diversity
+from quests.entropy import entropy, diversity

 dset = read("dataset.xyz", index=":")
 x = get_descriptors(dset, k=32, cutoff=5.0)
 h = 0.015
 batch_size = 10000
-H = perfect_entropy(x, h=h, batch_size=batch_size)
+H = entropy(x, h=h, batch_size=batch_size)
 D = diversity(x, h=h, batch_size=batch_size)
 ```

@@ -131,14 +131,14 @@ Note that this constraint requires the descriptors to be generated using the tra
 import torch
 from ase.io import read
 from quests.descriptor import get_descriptors
-from quests.gpu.entropy import perfect_entropy
+from quests.gpu.entropy import entropy

 dset = read("dataset.xyz", index=":")
 x = get_descriptors(dset, k=32, cutoff=5.0)
 x = torch.tensor(x, device="cuda")
 h = 0.015
 batch_size = 10000
-H = perfect_entropy(x, h=h, batch_size=batch_size)
+H = entropy(x, h=h, batch_size=batch_size)
 ```

 #### Computing overlap between datasets
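
Note on `h` and `batch_size` in the README snippets: the sketch below illustrates what a batched Gaussian-kernel entropy could look like in plain NumPy. It is not the QUESTS implementation. The function name `kernel_entropy_sketch` is hypothetical, and the estimator H = -mean_i log(mean_j K(x_i, x_j)) with K = exp(-|x_i - x_j|^2 / (2 h^2)) is an assumption made for illustration.

```python
import numpy as np


def kernel_entropy_sketch(x: np.ndarray, h: float = 0.015, batch_size: int = 10000) -> float:
    """Illustrative only (assumed estimator, hypothetical name).

    Computes H = -mean_i log( mean_j exp(-|x_i - x_j|^2 / (2 h^2)) ),
    processing the rows of `x` in batches so that only a (B, N) block of
    the distance matrix exists at any time.
    """
    n = x.shape[0]
    log_p = np.empty(n)
    for start in range(0, n, batch_size):
        xb = x[start:start + batch_size]  # (B, d) slice of rows
        # Squared Euclidean distances between the batch and the full set: (B, N)
        d2 = ((xb[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
        k = np.exp(-d2 / (2.0 * h ** 2))  # Gaussian kernel with bandwidth h
        log_p[start:start + batch_size] = np.log(k.mean(axis=1))
    return float(-log_p.mean())
```

Under this assumed estimator, a dataset of identical points gives an entropy of 0 and N points that are far apart relative to `h` give log(N); `batch_size` changes only memory use, not the result.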

quests/gpu/entropy.py

Lines changed: 27 additions & 0 deletions
@@ -11,6 +11,33 @@


 def perfect_entropy(
+    x: np.ndarray,
+    h: Union[float, List[float]] = DEFAULT_BANDWIDTH,
+    batch_size: int = DEFAULT_BATCH,
+    device: str = "cpu"
+):
+    """Deprecated. Please use `entropy`.
+
+    Computes the perfect entropy of a dataset using a batch distance
+    calculation. This is necessary because the full distance matrix
+    often does not fit in the memory for a big dataset. This function
+    can be SLOW, despite the optimization of the computation, as it
+    does not approximate the results.
+
+    Arguments:
+        x (np.ndarray): an (N, d) matrix with the descriptors
+        h (int or np.ndarray): bandwidth (value / vector) for the Gaussian kernel
+        batch_size (int): maximum batch size to consider when
+            performing a distance calculation.
+
+    Returns:
+        entropy (float): entropy of the dataset given by `x`.
+            or (np.ndarray): if 'h' is a vector
+    """
+    return entropy(x, h, batch_size, device=device)
+
+
+def entropy(
     x: torch.tensor,
     h: float = DEFAULT_BANDWIDTH,
     batch_size: int = DEFAULT_BATCH,
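
Note on the deprecated wrapper: as committed, `perfect_entropy` forwards to `entropy` and marks the deprecation only in its docstring. One common way to surface it at call time is a `DeprecationWarning`. The following is a minimal, self-contained sketch of that pattern, not the code in this commit. The `entropy` body is a stand-in that echoes its arguments, and the defaults `0.015` and `10000` are placeholders taken from the README example rather than the actual values of `DEFAULT_BANDWIDTH` and `DEFAULT_BATCH`.

```python
import warnings


def entropy(x, h=0.015, batch_size=10000, device="cpu"):
    """Stand-in for the renamed function; echoes its arguments for demonstration."""
    return (x, h, batch_size, device)


def perfect_entropy(x, h=0.015, batch_size=10000, device="cpu"):
    """Deprecated alias that forwards to `entropy` and warns the caller."""
    warnings.warn(
        "perfect_entropy is deprecated; use entropy instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return entropy(x, h, batch_size, device=device)
```

With `stacklevel=2` the warning points at the line that still calls `perfect_entropy`, which makes remaining call sites easier to find.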
