Controls / cases counts inverted when using binary model #63

@GACGAMA

Hello!

I've finished analyzing a large cohort, coding phenotype as 0 for controls and 1 for cases.
Then I went back to my VCF to count how the variants are distributed across the two groups.
But what I saw was the opposite of what I expected: I found variants that seemed enriched in controls rather than in cases. Is that the expected direction for the enrichment?

When I compare the percentages between controls and cases, the results go both ways: some categories seem enriched in controls (most of them) and some in cases, so I'm not sure how to interpret this.

Should I rerun the analysis with the coding inverted (0 = case, 1 = control) to find variants enriched in cases?

One example:

| gene_step | gene | Cases_Sum_nHet | Control_Sum_nHet | Total X more in controls |
| --- | --- | --- | --- | --- |
| MUC4_ODMS_WGS_WES_MUC4_synonymous | MUC4 | 261 | 2829 | 10.83908046 |
| METTL1_ODMS_WGS_METTL1_promoter_CAGE | METTL1 | 4 | 0 | 0 |

To count, I summed the heterozygous genotype calls within each group (cases and controls) and aggregated them for each gene-category pair.
As you can see, controls have far more total heterozygous calls in the first example, while cases have more in the second.
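
For reference, the counting described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual script: the function names (`sum_het_by_group`, `fold_more_in_controls`) and the dictionary layouts for genotypes, phenotypes and the variant-to-category mapping are hypothetical, and genotypes are assumed to be diploid VCF-style `GT` strings such as `0/1` or `0|1`.

```python
from collections import defaultdict


def sum_het_by_group(genotypes, phenotype, variant_to_step):
    """Sum heterozygous calls per gene-category pair, split by case/control.

    genotypes:       {variant_id: {sample_id: GT string, e.g. "0/1"}}  (hypothetical layout)
    phenotype:       {sample_id: 0 for control, 1 for case}
    variant_to_step: {variant_id: gene-category label, e.g. "MUC4_..._synonymous"}
    """
    counts = defaultdict(lambda: {"case": 0, "control": 0})
    for variant_id, calls in genotypes.items():
        step = variant_to_step[variant_id]
        for sample_id, gt in calls.items():
            # Treat phased ("|") and unphased ("/") genotypes the same way.
            alleles = gt.replace("|", "/").split("/")
            # Heterozygous: two called alleles that differ; missing calls are skipped.
            is_het = (
                len(alleles) == 2
                and "." not in alleles
                and alleles[0] != alleles[1]
            )
            if is_het:
                group = "case" if phenotype[sample_id] == 1 else "control"
                counts[step][group] += 1
    return dict(counts)


def fold_more_in_controls(case_sum, control_sum):
    """Ratio of control to case heterozygous calls; None when there are no case calls."""
    return control_sum / case_sum if case_sum else None


# The two rows reported in the table: 261 vs 2829 and 4 vs 0.
print(fold_more_in_controls(261, 2829))
print(fold_more_in_controls(4, 0))
```

Note that these are raw sums, so they also depend on how many samples each group contains.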
