Define percent strong #21
Comments
@niranjchandrasekaran - at checkin today I think I may have answered your question about "percent strong" incorrectly. We are not calculating medians in |
@gwaygenomics, does that mean the current implementation of |
I think that is correct. The good news is that it should require relatively little extra code to implement the second type, given that this matrix is being computed in cytominer-eval/cytominer_eval/operations/percent_strong.py (lines 35 to 37 in 6f9d350). |
note: we've also renamed |
Potentially useful resource: https://github.com/shntnu/grit-benchmark/blob/rtests/1.calculate-metrics/cell-health/taxonomy.md |
I'll copy here text by @gwaygenomics from the paper https://github.com/broadinstitute/lincs-profiling-complementarity because it's the clearest description of the method I've come across!

Constructing an appropriate null distribution to calculate reproducibility metrics

Specifically, for percent replicating, for a given perturbation x with n replicates of dose p, we randomly sampled n non-replicate profiles from all 1,327 common perturbations treated with dose p. We performed this sampling procedure 1,000 times per replicate cardinality class (e.g. compounds with 3 replicates, 4 replicates, 5 replicates, etc.) with two additional restrictions: (1) the random sample did not include replicates for perturbation x, and (2) no two compounds of the same non-x perturbation were included in the same null group. For example, in cases where a compound treatment at a specific dose had five replicates, we sampled 1,000 groups of five randomly sampled non-replicate profiles of the same dose.

For percent replicating, we used level 4 profiles considering compound and dose information as replicates. We considered a replicating profile one in which the ground truth median pairwise replicate correlation was higher than 95% of the null distribution. We therefore calculate the percent replicating metric as the total number of replicating profiles over all common compounds.

For percent matching, we performed a similar procedure. The only differences were that we (1) used level 5 consensus signatures and (2) considered MOA and dose information as replicates. We subsequently constructed dose and MOA replicate count-specific null distributions to compare against. We considered a matched MOA one in which the ground truth MOA median pairwise correlation was higher than 95% of the null distribution. We therefore calculate the percent matching metric as the total number of matched MOAs over all common MOAs.

We used these null distributions to calculate a non-parametric p value.
First, for each compound, we calculated its median pairwise replicate correlation. We next calculated the median pairwise correlations of each randomly sampled group matched to the specific dose and replicate count. Lastly, we calculated a compound specific p value by dividing how many times the real median pairwise correlation of replicates was higher than all 1,000 randomly sampled groups of median pairwise correlations. |
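To make the quoted procedure concrete, here is a minimal NumPy sketch of percent replicating as I read it. It is not the cytominer-eval implementation and not the code used in the paper. All function names (`median_pairwise_corr`, `sample_null`, `percent_replicating`) are hypothetical. It assumes a single dose, so `labels` only encodes the perturbation, and it builds one null per perturbation rather than one per replicate cardinality class. The p value is taken as the fraction of null medians at or above the observed median, which is one common non-parametric reading of the last sentence and may differ from what the paper computed.

```python
import numpy as np


def median_pairwise_corr(profiles):
    """Median of the off-diagonal pairwise Pearson correlations between rows."""
    corr = np.corrcoef(profiles)
    return float(np.median(corr[np.triu_indices_from(corr, k=1)]))


def sample_null(profiles, labels, exclude, group_size, n_samples, rng):
    """Median pairwise correlations of random non-replicate groups.

    Each group holds one profile from each of `group_size` distinct
    perturbations and never draws from `exclude`.
    """
    candidates = np.array([p for p in np.unique(labels) if p != exclude])
    null = np.empty(n_samples)
    for i in range(n_samples):
        # Distinct perturbations, so no two profiles in a group are replicates.
        chosen = rng.choice(candidates, size=group_size, replace=False)
        rows = [rng.choice(np.flatnonzero(labels == p)) for p in chosen]
        null[i] = median_pairwise_corr(profiles[rows])
    return null


def percent_replicating(profiles, labels, n_samples=1000, quantile=0.95, seed=0):
    """Return (percent replicating, per-perturbation details)."""
    rng = np.random.default_rng(seed)
    details = {}
    for pert in np.unique(labels):
        idx = np.flatnonzero(labels == pert)
        if len(idx) < 2:
            continue  # a median pairwise correlation needs at least two replicates
        observed = median_pairwise_corr(profiles[idx])
        null = sample_null(profiles, labels, pert, len(idx), n_samples, rng)
        details[str(pert)] = {
            "observed": observed,
            "threshold": float(np.quantile(null, quantile)),
            # Assumed p value definition: share of null medians >= observed.
            "p_value": float(np.mean(null >= observed)),
        }
    if not details:
        raise ValueError("no perturbation has at least two replicates")
    n_replicating = sum(d["observed"] > d["threshold"] for d in details.values())
    return 100.0 * n_replicating / len(details), details
```

Here `profiles` is a (samples x features) array and `labels` is a NumPy array with one perturbation label per row. The sketch needs more distinct perturbations than replicates per perturbation, otherwise `rng.choice(..., replace=False)` raises. Percent matching would follow the same shape with MOA labels on consensus signatures.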
Quick note because this came up when reviewing @jccaicedo's paper: As of Oct 2021, the definitions in #21 (comment) might be inconsistent with the terminology used in the package. |
I am fairly sure they will be inconsistent - although I do think the differences will be very minor. We did not use this package in that paper, and I wrote the package implementation a couple months before |
Makes sense 👍 |
(Stubs for now, so we can add this documentation to code later)
Percent strong is reported in two ways. We should distinguish between them (they are similar but not the same).
The second version can be a bit confusing so here is an example: