Dear scikit-allel developers,
I am working on calculating XP-nSL and XP-EHH using scikit-allel, and I want to confirm the correct MAF filtering practice, referencing comments from the selscan developer:
- The selscan author stated: "filtering for MAF when applying the two population statistics (XP-EHH/XP-nSL) hindered power quite a bit. This is why I didn't implement it in selscan." (comment on szpiech/selscan#147, "Question: XP-nSL maf filtering and interpretation of normalisation")
- They further clarified:
  - For XP-nSL: Retaining sites monomorphic in the two compared populations (but polymorphic in the species) is reasonable, as they may influence distance measures.
  - For XP-EHH: Monomorphic sites in both populations only slow down the software (no impact on results), so they can be filtered. (comment on szpiech/selscan#89, "The results are closely related with the monomorphic sites")
I would like to confirm whether this guidance also applies to scikit-allel:
- Should I completely avoid MAF filtering (e.g., MAF > 0.01/0.05) for both XP-nSL and XP-EHH in scikit-allel?
- For XP-EHH, is removing sites that are monomorphic in both populations the only acceptable filter (i.e., no MAF-based filtering at all)?
- For XP-nSL, should I retain even the monomorphic sites in the two compared populations?
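To make the questions concrete, here is a minimal sketch of the filtering options I am comparing. It is not a recommendation, only an illustration of what I mean. Assumptions: haplotypes are biallelic, coded 0/1 with no missing calls, and stored as `int8` arrays of shape (n_variants, n_haplotypes); "monomorphic in both populations" is taken to mean one allele fixed across the pooled sample; the helper names (`keep_polymorphic`, `keep_maf`, `xp_scans`) are hypothetical and not part of scikit-allel. Only `allel.xpehh` and `allel.xpnsl` are library calls.

```python
import numpy as np


def pooled_allele_freq(h1, h2):
    """Frequency of allele 1 at each site, pooling both populations."""
    pooled = np.concatenate([h1, h2], axis=1)
    return pooled.mean(axis=1)


def keep_polymorphic(h1, h2):
    """True for sites that are not fixed across the pooled sample."""
    af = pooled_allele_freq(h1, h2)
    return (af > 0) & (af < 1)


def keep_maf(h1, h2, min_maf):
    """True for sites whose pooled minor allele frequency is >= min_maf.

    This is the MAF filter the selscan author advises against for
    two-population statistics; it is shown only for comparison.
    """
    af = pooled_allele_freq(h1, h2)
    return np.minimum(af, 1 - af) >= min_maf


def xp_scans(h1, h2, pos):
    """Run XP-EHH on polymorphic sites only and XP-nSL on all sites."""
    import allel  # scikit-allel, imported here so the helpers work without it

    h1 = np.asarray(h1, dtype="i1")
    h2 = np.asarray(h2, dtype="i1")
    pos = np.asarray(pos)

    # XP-EHH: drop only sites monomorphic in the pooled sample, no MAF filter.
    keep = keep_polymorphic(h1, h2)
    xpehh = allel.xpehh(h1[keep], h2[keep], pos[keep], use_threads=False)

    # XP-nSL: keep every site, including those monomorphic in both populations.
    xpnsl = allel.xpnsl(h1, h2, use_threads=False)

    return pos[keep], xpehh, pos, xpnsl
```

In this sketch the unstandardized scores come back alongside the positions they correspond to, so the XP-EHH and XP-nSL results can be aligned even though they were computed on different site sets.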
Thank you very much for your clarification.
Best regards,
Zheng zhuqing