Controls / cases counts inverted when using binary model #63
Comments
Hi @GACGAMA, Since you are working on variant-set analysis, is your Overall, this is possible, because the effect sizes of different variants in a variant set can point in different directions: some variants have a protective effect, while others have a deleterious effect on the outcome. As such, the heterozygous counts are not always enriched in the case samples.
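To make the point about mixed effect directions concrete, here is a minimal sketch using a generic per-variant score statistic, S_j = sum_i G_ij * (Y_i - mean(Y)), with no covariates. This is an assumption for illustration and not necessarily the exact quantity STAARpipeline computes; the function name and toy data are hypothetical.

```python
def score_stats(genotypes, phenotype):
    """Per-variant score statistic S_j = sum_i G_ij * (Y_i - mean(Y)).

    genotypes: list of per-sample rows, each a list of allele counts (0/1/2).
    phenotype: list of 0/1 labels, one per sample.
    """
    n = len(phenotype)
    y_bar = sum(phenotype) / n
    residuals = [y - y_bar for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(genotypes[i][j] * residuals[i] for i in range(n))
        for j in range(n_variants)
    ]


# Toy data (hypothetical): 4 controls (0) followed by 4 cases (1), two variants.
phenotype = [0, 0, 0, 0, 1, 1, 1, 1]
genotypes = [
    [1, 0],  # control carrying variant 0
    [1, 0],  # control carrying variant 0
    [1, 0],  # control carrying variant 0
    [0, 0],
    [0, 1],  # case carrying variant 1
    [0, 1],  # case carrying variant 1
    [0, 0],
    [0, 0],
]

stats = score_stats(genotypes, phenotype)
# Recoding cases as 0 and controls as 1 flips every sign.
flipped = score_stats(genotypes, [1 - y for y in phenotype])

print(stats)    # variant 0 is negative (control-enriched), variant 1 positive
print(flipped)  # same magnitudes, opposite signs
```

In this sketch a negative value means the variant is seen more often in controls and a positive value means it is seen more often in cases, so one variant set can contain both. Swapping the 0/1 coding only flips the signs; the squared statistics are unchanged.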
Hi @xihaoli
To jump in on this, @xihaoli, it seems the Score_Stat metric obtained for the variant sets after running the
@kwdoyle
Thank you both! @kwdoyle: @GACGAMA: Thanks for your patience. I agree that the simplest thing to look at is the number of samples (carriers) in each cohort for binary outcomes. Have you figured out a way to do that yet? Best,
@xihaoli I have figured out a way to do that in R, but I used the VCF data, which is not ideal because the .GDS files are available. I'm not sure how to manipulate the GDS files, but my scripts should be easy to adapt to them.
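For readers who want to see the carrier count discussed here, the following is a rough sketch, not the script mentioned above. It assumes the genotypes for one variant set are already loaded as a samples-by-variants matrix of allele counts (reading them from a VCF or GDS file is not shown), and all names are hypothetical.

```python
def carriers_by_group(genotypes, phenotype):
    """Count samples carrying at least one alternate allele in the variant set.

    genotypes: list of per-sample rows of allele counts (0/1/2, None if missing).
    phenotype: list of 0/1 labels, assuming 0 = control and 1 = case.
    """
    carriers = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for row, y in zip(genotypes, phenotype):
        totals[y] += 1
        # A sample is a carrier if any non-missing call has an alternate allele.
        if any(g is not None and g > 0 for g in row):
            carriers[y] += 1
    return {
        "controls": {"carriers": carriers[0], "n": totals[0]},
        "cases": {"carriers": carriers[1], "n": totals[1]},
    }


# Toy data (hypothetical): 3 controls and 2 cases, three variants in the set.
phenotype = [0, 0, 0, 1, 1]
genotypes = [
    [0, 0, 0],
    [1, 0, 0],
    [0, None, 0],
    [0, 1, 1],
    [0, 0, 0],
]

summary = carriers_by_group(genotypes, phenotype)
print(summary)
```

Reporting the group sizes next to the carrier counts makes it easier to compare proportions when cases and controls are unbalanced.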
Thank you @GACGAMA. I have added you as a collaborator of the STAARpipeline-Tutorial repo. Please feel free to contribute if there are any improvements that you wish to make.
Hello!
I've finished analyzing a large cohort using the numerical values 0/1 for control/case status, respectively.
Then I went to my VCF to count the distribution of variants in each group.
But what I saw was the opposite of what I expected: I found things that seemed enriched in controls instead of cases. Is that the expected direction for the enrichment?
When I compared the percentages in controls and cases, the results went both ways: some seemed enriched in controls (most of them) and some in cases, so I'm not sure how to interpret this.
Should I repeat the analysis with the coding inverted (0/1 as case/control, respectively) to find things enriched in cases?
One example:
To count, I summed the genotype calls for each group (case and control) and summarized them for each gene-category pair.
As you can see, I have many more total heterozygous calls for controls in one example and for cases in another.
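The counting described above can be sketched roughly as follows. This is an illustration under assumptions, not the original script: the record layout, the gene and category labels, and the sample IDs are all hypothetical placeholders. It also divides by group size, because raw totals can favor the larger group.

```python
from collections import defaultdict

# Genotype strings treated as heterozygous calls (unphased and phased).
HET_CALLS = {"0/1", "1/0", "0|1", "1|0"}


def het_counts(calls, group_of):
    """Sum heterozygous calls per (gene, category) pair, split by group."""
    counts = defaultdict(lambda: {"control": 0, "case": 0})
    for gene, category, sample, gt in calls:
        if gt in HET_CALLS:
            counts[(gene, category)][group_of[sample]] += 1
    return dict(counts)


def het_rates(counts, group_of):
    """Divide each count by the number of samples in that group."""
    sizes = {"control": 0, "case": 0}
    for group in group_of.values():
        sizes[group] += 1
    return {
        key: {group: c[group] / sizes[group] for group in sizes}
        for key, c in counts.items()
    }


# Toy data (hypothetical): 6 controls and 2 cases.
group_of = {
    "c1": "control", "c2": "control", "c3": "control",
    "c4": "control", "c5": "control", "c6": "control",
    "k1": "case", "k2": "case",
}
calls = [
    ("GENE_A", "plof", "c1", "0/1"),
    ("GENE_A", "plof", "c2", "0/1"),
    ("GENE_A", "plof", "c3", "0|1"),
    ("GENE_A", "plof", "c4", "0/0"),
    ("GENE_A", "plof", "k1", "0/1"),
    ("GENE_A", "plof", "k2", "1/0"),
]

counts = het_counts(calls, group_of)
rates = het_rates(counts, group_of)
print(counts)
print(rates)
```

In this toy example the raw count is higher in controls (3 versus 2), while the per-sample rate is higher in cases, which is why comparing percentages rather than totals matters when the groups differ in size.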