Selection paper outline #8
We have made some decisions about the manuscript's scope (ping me or correct me if I am wrong), based on the discussion we had today (02/22/22) during our biweekly meeting:
Part I: Demography inference with flavors of selection (background selection)
These analyses are halfway implemented in our current analysis2 repository, specifically in the n_t.snake workflow. We also want the multi-population analyses.
Part II: DFE inference excluding the positive portion
I think the implementations are almost finished; see the analysis2 repository for details. Thanks to @andrewkern, @petrelharp, @mufernando, and others.
Part III: Understanding/simulating beneficial mutations as in a sweep, using the positive portion of a DFE
This is still a work in progress, but two things could be evaluated here
As @andrewkern and @petrelharp have pointed out, there are multiple positive-selection features still to be added. They could either be covered in the discussion of this manuscript and implemented in the next paper, or be implemented for this paper, which would not be trivial. I would personally vote to simplify and leave them as future work, just so we don't lose momentum.
Update based on @izabelcavassim's previous post:
Mostly complete. Need production sims and final plots.
Mostly complete. Need production sims and final plots.
There is work left to do for this aim.
Human
Arabidopsis
Plots in each of the above areas could be kept relatively simple, with extra information reported in tables or supp mat; or they could get pretty big, including panels for different methods, species, and DFEs... TBD
It would be nice to do a non-gamma DFE for one of our simulations. Maybe a lognormal, even reaching back to Boyko 2008?
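To make the gamma vs. lognormal idea concrete, here is a minimal sketch using only NumPy. It draws deleterious selection coefficients from a gamma DFE and from a lognormal DFE matched to the same mean. The function names, the mean of 0.01, the gamma shape of 0.2, and the lognormal sigma of 1.0 are all hypothetical placeholders for illustration; they are not values from Boyko 2008 or from the stdpopsim catalog.

```python
import numpy as np


def gamma_dfe(rng, mean_s, shape, size):
    """Draw deleterious selection coefficients with gamma-distributed magnitude.

    NumPy parameterizes the gamma by shape and scale, with mean = shape * scale.
    """
    scale = abs(mean_s) / shape
    return -rng.gamma(shape, scale, size=size)


def lognormal_dfe(rng, mean_s, sigma, size):
    """Draw deleterious selection coefficients with lognormal magnitude.

    For a lognormal, E[X] = exp(mu + sigma**2 / 2), so mu is chosen to
    match the requested mean magnitude.
    """
    mu = np.log(abs(mean_s)) - sigma**2 / 2
    return -rng.lognormal(mu, sigma, size=size)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical parameter values, chosen only for illustration.
    g = gamma_dfe(rng, mean_s=0.01, shape=0.2, size=100_000)
    ln = lognormal_dfe(rng, mean_s=0.01, sigma=1.0, size=100_000)
    for name, s in (("gamma", g), ("lognormal", ln)):
        # Same mean, but very different mass near s = 0.
        frac_nearly_neutral = np.mean(np.abs(s) < 1e-4)
        print(name, "mean:", s.mean(), "fraction |s| < 1e-4:", frac_nearly_neutral)
```

Matching the means is only one possible way to make the two DFEs comparable; matching another summary (for example the proportion of nearly neutral mutations) would be an equally reasonable choice.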
What we're set up to do is "compare different sweep detection methods" in terms of FPR/TPR in windows across a chromosome. In particular, there's a working pipeline that uses sweepfinder2 to detect sweeps in windows across a chromosome (under simulated neutral/BGS/BGS+sweep scenarios). There's a start at a similar pipeline for diploshic, but it isn't finished. So, assuming that what'll go in the paper is a sweepfinder vs diploshic comparison, what remains to be done is:
(I think? Tagging @mufernando and @andrewkern as they're the ones who've put these workflows together.) This should serve as an illustration of what stdpopsim can do wrt sweeps, so maybe we don't need anything else? The other bullet points, while interesting, seem like a lot of work without clear questions in mind.
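As a rough illustration of the windowed FPR/TPR comparison described above, the sketch below scores a set of windows against a threshold and tallies true and false positive rates. It is not the analysis2 pipeline: the function name, the scores, the truth labels, and the threshold are all made up, and it assumes each method (sweepfinder2, diploshic, or anything else) can be reduced to one score per window where higher means more sweep-like.

```python
import numpy as np


def window_rates(scores, is_sweep, threshold):
    """Return (TPR, FPR) for per-window scores at a given threshold.

    scores    : one detection score per window (higher = more sweep-like)
    is_sweep  : True for windows that contain a simulated sweep
    threshold : windows with score >= threshold are called as sweeps
    """
    scores = np.asarray(scores, dtype=float)
    is_sweep = np.asarray(is_sweep, dtype=bool)
    called = scores >= threshold

    n_pos = int(is_sweep.sum())
    n_neg = int((~is_sweep).sum())
    tp = int(np.sum(called & is_sweep))
    fp = int(np.sum(called & ~is_sweep))

    # Rates are undefined if a class has no windows.
    tpr = tp / n_pos if n_pos else float("nan")
    fpr = fp / n_neg if n_neg else float("nan")
    return tpr, fpr


if __name__ == "__main__":
    # Made-up scores for six windows; two of them contain a sweep.
    scores = [0.1, 5.0, 0.3, 7.2, 2.5, 0.2]
    truth = [False, True, False, True, False, False]
    tpr, fpr = window_rates(scores, truth, threshold=2.0)
    print("TPR:", tpr, "FPR:", fpr)
```

Sweeping the threshold over a grid and calling the function once per scenario (neutral, BGS, BGS+sweep) would give one ROC-style curve per method and scenario.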
Notes from the meeting: proposal is for the sweeps section, discuss:
We also discussed, following @RyanGutenkunst's comment above, adding a non-gamma DFE for humans, and running the DFE inference pipeline on it: popsim-consortium/stdpopsim#1470
I started stubbing out a manuscript in a new repo here: https://github.com/popsim-consortium/analysis2_manuscript I'm planning on starting the writing today
Hey all, I'm opening up an issue for us to start bashing away at an outline for the second paper. A particular goal is to have a solid list of the analyses we want to do and then, later, to delegate those analyses to particular individuals/groups.
We have a Google Doc going for the outline here, but it might be preferable to just use this issue, so I've copied that text below
Selection & PopSim
Paper 2
Timeline for selection papers:
Late summer early fall
Companion papers on 1) sweeps & 2) rescaling. Also similar timeline.
Outline of main analyses for main paper:
Comparison of different DFE methods like FitDadi, polyDFE, and GRAPES (Ryan G’s group & Izabel can work on this). How is demography dealt with? Sample size?
Sweeps! (will be its own companion paper that Andy is leading, but some key results in the main paper).
Implement sweep models from literature. Make a model in StdPopSim “recurrent_sweeps”. Can put this model with different demographics & rec rates, etc.
Look at summary stats & power to detect sweeps in human genomes under different demographic models.
Look at power of ML methods
Confounders. Multiple sweeps. Sweeps & BGS.
How do DFE methods perform when sweeps are included?
Selection confounding demographic inference (can recycle a lot of pipelines from paper 1, just running them on models with selection).
What we need to do:
Decide what models to do:
DFE
Sweep
https://github.com/popsim-consortium/analysis2
Implement models
QC
Analyses
######################################################
Brainstorming of ideas for PopSim Selection paper from the call on 6/15 (not all will be in paper):
Comparison of different DFE methods (Ryan G’s group can work on this). How is demography dealt with? Sample size?
Scaling (maybe merits its own paper delving into theory of scaling...might be too ambitious for PopSim paper)
Ideally, PopSim paper will point to this companion paper. PopSim paper will have to mention scaling in some way. PopSim paper could connect it with applications...use guidelines from theory paper to do stuff for a particular organism
Do current models of DFEs/annotations in humans predict summaries of genetic variation (spatial pattern of pi, SFS, LD)? (Strength: leverages demographic models from before, annotations, DFE... all the fancy stuff together. Great way to showcase the whole resource! Guidance for how well the field is doing in terms of model adequacy.)
What if synonymous (or “neutral”) sites are actually under selection? Does that confound things?
Sweeps! (may be its own paper, but could put some key results in the main paper).
Implement sweep models from literature. Make a model in StdPopSim “recurrent_sweeps”. Can put this model with different demographics & rec rates, etc.
Look at summary stats & power to detect sweeps in human genomes under different demographic models.
Look at power of ML methods
Confounders. Multiple sweeps. Sweeps & BGS.
How do DFE methods perform when sweeps are included?
Selection confounding demographic inference
In the paper, say how stdpopsim can be used to test “your new method” for detecting selection. There is no one perfect statistic; it depends on biology, data, etc.
Try to show an example in the paper from a non-human species.