Change Neighborhood Size Calc in Find Orientations #714
Replies: 3 comments
-
For reference, here is the line. @donald-e-boyce, can you take a look at this?
-
I think Don’s suggestion of a statistics-based approach is the way to go. This heuristic was originally developed and tested with synthetic data, but I think it definitely merits a parameter study both with and without strong texture.
On Sep 23, 2024, at 08:24, Donald Boyce wrote:
In the create_clustering_parameters function, the parameters
min_samples and mean_rpg are returned. The min_samples
parameter is the minimum size for a cluster when using the
dbcluster package for clustering. The mean_rpg is the mean
number of reflections per grain (more below).
If min_samples is too large, then it may miss some valid grains.
If it is too small, then dbscan is less efficient, and some of the
clusters may not correspond to real grains.
The create_clustering_parameters function works by creating 100 random
orientations and creating synthetic grains with those orientations, centroid
at the origin, and zero strain. Then it computes the reflections for
each seed HKL, tracking the number of reflections for each grain and
seed. The mean_rpg value is just the mean of the number of reflections
on each grain. The min_samples value is computed from the minimum number
of reflections of the 100 grains and the completeness threshold, using the
formula:
    min_samples = max(
        int(np.floor(0.5*compl_thresh*min(seed_refl_per_grain))),
        2
    )
where compl_thresh is the minimum completeness for a grain.
The intent here is that from the random orientations, you expect almost all
of the grains to have at least compl_thresh*min(seed_refl_per_grain)
reflections, and cutting that in half gives a generous allowance.
The minimum number of samples is always at least 2.
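As a rough illustration of the formula above, here is a minimal sketch. It uses math.floor in place of np.floor so that it runs without NumPy; the function name current_min_samples and the example reflection counts are made up for illustration and are not taken from findorientations.py.

```python
import math


def current_min_samples(compl_thresh, seed_refl_per_grain):
    """Half of the expected worst-case reflection count, never below 2.

    Hypothetical helper mirroring the formula quoted above; the real
    code lives inside create_clustering_parameters.
    """
    worst_case = min(seed_refl_per_grain)
    return max(int(math.floor(0.5 * compl_thresh * worst_case)), 2)


# Made-up seed reflection counts for a handful of synthetic grains.
example_counts = [24, 30, 28, 36, 26]

print(current_min_samples(0.8, example_counts))  # floor(0.5 * 0.8 * 24) = 9
print(current_min_samples(0.1, [10]))            # floor(0.5) = 0, clamped to 2
```

The second call shows the clamp: whenever the product falls below 2, min_samples is held at 2.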
@darrencpagan suggests the formula:
    int(np.floor(0.75*(1-compl_thresh)*min(seed_refl_per_grain)))
to replace the existing
    int(np.floor(0.5*compl_thresh*min(seed_refl_per_grain))).
One thing immediately apparent is that the proposed formula decreases with
increasing completeness. In fact, for a completeness threshold of 1, it gives
a value of 0, corresponding to a min_samples value of 2 (the minimum).
And for a completeness threshold of 0, it gives its maximum value.
This seems to be the opposite of what you want. With a high completeness
threshold, you can use a higher cluster size because all the grains will
have lots of reflections; with a low completeness threshold, you have
to make it much smaller because many grains will have a small proportion
of reflections.
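To make the trend concrete, the following sketch tabulates both formulas over a few completeness thresholds. The minimum reflection count of 40 is an arbitrary value chosen for illustration, the function names are hypothetical, and math.floor stands in for np.floor.

```python
import math


def existing_formula(compl_thresh, min_refl):
    # 0.5 * compl_thresh * min(seed_refl_per_grain), clamped to at least 2
    return max(int(math.floor(0.5 * compl_thresh * min_refl)), 2)


def proposed_formula(compl_thresh, min_refl):
    # 0.75 * (1 - compl_thresh) * min(seed_refl_per_grain), clamped to at least 2
    return max(int(math.floor(0.75 * (1 - compl_thresh) * min_refl)), 2)


MIN_REFL = 40  # assumed value of min(seed_refl_per_grain), for illustration only
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]

existing = [existing_formula(c, MIN_REFL) for c in thresholds]
proposed = [proposed_formula(c, MIN_REFL) for c in thresholds]

for c, e, p in zip(thresholds, existing, proposed):
    print(f"compl_thresh={c:4.2f}  existing={e:2d}  proposed={p:2d}")
```

With these inputs the existing formula rises from 2 to 20 as the threshold goes from 0 to 1, while the proposed one falls from 30 to 2. The two cross near a threshold of 0.6 (where 0.5*c equals 0.75*(1-c)), so the proposed formula yields a smaller min_samples only above that point.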
Here are some other suggestions:
* set the min_samples based on the mean and standard deviation of
  the samples; knowing the standard deviation, you can set the cutoff
  value to correspond to an expected percentage of grains being found,
  e.g. to the 99% or 99.9% value, based on the
  random grain statistics.
* run more than 100 random grains; make that an option to the
  create_clustering_parameters function.
* in fact, add ngrains and pvalue arguments to the
  create_clustering_parameters function.
* maybe add an option to set min_samples directly to a fixed value, and make that
  available in the config file.
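One possible reading of the statistics-based suggestion is sketched below. It is only a sketch under several assumptions: the reflection counts are treated as roughly normally distributed, the cutoff is still scaled by compl_thresh, and the 0.5 safety factor is dropped because the percentile already supplies the margin. The function name stats_min_samples, the argument names, and the sample counts are hypothetical. The counts would come from however many random grains (ngrains) are simulated.

```python
import math
from statistics import NormalDist, mean, stdev


def stats_min_samples(seed_refl_per_grain, compl_thresh, pvalue=0.99,
                      floor_value=2):
    """Pick min_samples so that about `pvalue` of random grains exceed it.

    Assumes a normal approximation to the per-grain reflection counts;
    this is an illustrative choice, not the method used in hexrd.
    """
    mu = mean(seed_refl_per_grain)
    sigma = stdev(seed_refl_per_grain)      # sample standard deviation
    z = NormalDist().inv_cdf(pvalue)        # one-sided normal quantile
    cutoff = mu - z * sigma                 # count exceeded by ~pvalue of grains
    return max(int(math.floor(compl_thresh * cutoff)), floor_value)


# Made-up reflection counts standing in for the simulated random grains.
counts = [30, 32, 34, 36, 38]

at_99 = stats_min_samples(counts, compl_thresh=0.8, pvalue=0.99)
at_999 = stats_min_samples(counts, compl_thresh=0.8, pvalue=0.999)
print(at_99, at_999)
```

Raising pvalue lowers the cutoff, so fewer valid grains should be missed at the cost of a smaller minimum cluster size. Since the counts are discrete and bounded, an empirical percentile of the simulated counts may be a better fit than the normal approximation when many grains are simulated.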
In any case, I think we ought to experiment with these values a little
bit and see what happens. I'll work on that some this week, starting
with adding arguments to the create_clustering_parameters function.
-
The logic for determining the neighborhood size for clustering for find-orientations tends to overestimate the necessary size, leading to missed grains, particularly in textured materials.
Reilly Knox and I suggest changing line 693 of findorientations.py to:
int(np.floor(0.75*(1-compl_thresh)*min(seed_refl_per_grain)))