Fitting PCs with nPC vs CV #93

Open
JandehF opened this issue Dec 6, 2023 · 3 comments

@JandehF
JandehF commented Dec 6, 2023

Hi, I am conducting a FarmCPU GWAS and would like to understand the difference between fitting PCs using nPC vs. CV. I read in the rMVP manual (CRAN repository) that when using nPC with FarmCPU, the PCs are fitted as fixed effects. Is this correct? My understanding of GWAS models is that we want to fit PCs as covariates to ensure that associations between SNPs and the trait are not confounded by population structure.

My question is: which argument should I use to add principal components to the model (nPC.FarmCPU or CV.FarmCPU)? I noticed that the associations changed slightly when performing the GWAS with the two different arguments. I've attached images of the resulting Manhattan and Q-Q plots.

Thank you in advance for your help.
Kindly,
Jandeh

[Attached image: Screen Shot 2023-12-06 at 6 11 22 PM]

@YinLiLin
Collaborator

YinLiLin commented Dec 8, 2023

Thank you for the question.
In fact, these two settings should not produce significantly different GWAS results. Did you check that the order of individuals in the phenotype, covariates, and genotype is fully consistent?
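
A minimal sketch of such an order check in base R, assuming each input carries a sample ID column. The IDs, the `cov` table, and the helper `same_order` are hypothetical names made up for illustration; they are not part of rMVP.

```r
# Hypothetical sample IDs, in the order they appear in each input.
pheno_ids <- c("ind1", "ind2", "ind3", "ind4")
geno_ids  <- c("ind1", "ind2", "ind3", "ind4")
cov <- data.frame(id  = c("ind3", "ind1", "ind4", "ind2"),
                  PC1 = c(0.3, 0.1, 0.4, 0.2))

# TRUE only if both vectors list the same IDs in the same positions.
same_order <- function(a, b) length(a) == length(b) && all(a == b)

same_order(pheno_ids, geno_ids)  # phenotype vs. genotype
same_order(pheno_ids, cov$id)    # phenotype vs. covariates (shuffled here)

# Reorder the covariate rows to follow the phenotype order.
cov_sorted <- cov[match(pheno_ids, cov$id), ]
stopifnot(!anyNA(cov_sorted$id), same_order(pheno_ids, cov_sorted$id))
```

Using `match()` also exposes IDs that are missing from the covariate table, since they come back as `NA`.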

@JandehF
Author

JandehF commented Dec 11, 2023

Hi, yes, I have double-checked that the order of individuals is consistent. While double-checking, I noticed that I am running into a problem with my code.

If I add the PCs as covariates using model.matrix, I end up with an extra variable in the covariates (it seems to be adding an intercept column). I think this is what led to the results in my original post, and it is not the correct coding for my needs.
PC<- read.delim("NAM_PC0.15het.txt",head=FALSE)
Covariates <- model.matrix(~as.factor(PC1)+as.factor(PC2)+as.numeric(PC3)+ as.numeric(PC4), data=PC)
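
For reference, a small sketch of where the extra column comes from and one way to drop it. The PC values below are randomly generated stand-ins for the contents of NAM_PC0.15het.txt, and the columns are assumed to be named PC1 to PC4. The sketch treats the PCs as numeric, which is an assumption; wrapping a PC in as.factor() would instead expand each distinct value into its own dummy column.

```r
# Hypothetical PC table; in the post these values come from NAM_PC0.15het.txt.
set.seed(1)
PC <- data.frame(PC1 = rnorm(6), PC2 = rnorm(6), PC3 = rnorm(6), PC4 = rnorm(6))

# The default formula includes an intercept, so a column of 1s is prepended.
with_intercept <- model.matrix(~ PC1 + PC2 + PC3 + PC4, data = PC)
colnames(with_intercept)  # "(Intercept)" "PC1" "PC2" "PC3" "PC4"

# Drop the intercept column by name, keeping only the four PC columns.
keep <- colnames(with_intercept) != "(Intercept)"
Covariates <- with_intercept[, keep, drop = FALSE]
dim(Covariates)  # 6 rows, 4 columns
```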

If I add the PCs as covariates using bigmemory::as.matrix (as outlined on the main GitHub page), the first row of PCs is removed and I end up with one fewer row than the total number of individuals, so the GWAS analysis will not run.
MVP.Data.PC("NAM_PC0.15het.txt", out='NAM', sep='\t')
covariates <- bigmemory::as.matrix(attach.big.matrix("NAM.pc.desc", head=FALSE))
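
One way a first row can go missing, shown in base R only: a headerless file read with a header-consuming setting loses its first data row. Whether this is what happens inside MVP.Data.PC is not confirmed in this thread; the file contents and `n_individuals` below are hypothetical.

```r
# Hypothetical headerless, tab-separated PC file: 3 individuals, 2 PCs.
pc_file <- tempfile(fileext = ".txt")
writeLines(c("0.11\t-0.20", "0.05\t0.31", "-0.42\t0.07"), pc_file)
n_individuals <- 3

# header = TRUE consumes the first data row as column names.
with_header <- read.delim(pc_file, header = TRUE)
nrow(with_header)  # one row short

# header = FALSE keeps every row.
no_header <- read.delim(pc_file, header = FALSE)
cv_check <- as.matrix(no_header)

# Fail early if the covariate matrix does not have one row per individual.
stopifnot(is.numeric(cv_check), nrow(cv_check) == n_individuals)
```

A row-count check like the last line, run before the GWAS, catches this kind of mismatch regardless of how the matrix was loaded.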

If I add the PCs as covariates by just reading them in as a matrix, the PC matrix seems to load with no problem. The associations are still different from those obtained with nPC, but closer than in my first run.
PC<- read.delim("NAM_PC0.15het.txt",head=FALSE)
cv <- as.matrix(PC)

My questions are:

  1. What is the statistical difference between fitting PCs as covariates vs. as fixed effects?
  2. What is the best way (in code) to load covariates to fit in the GWAS model?

@YinLiLin
Collaborator

Thank you for the detailed descriptions.
In rMVP, the PCs are fitted as covariates rather than fixed effects. For PCs, we recommend using the 'nPC.XXX' argument to fit the model; we believe it is safer and more stable.
