proposed production-level config for humans #97

chriscrsmith · 2023-03-21T17:53:00Z

Soliciting feedback on the choice of settings in the proposed config. This will be what @lntran26 and I use for the hopefully final simulations in humans without scaling.

Notes:

I made a separate config for humans. Thought it would be cleaner to break up sims by species?
100 samples is an increase from the 20 I've been using up to this point.
add demographic model OutOfAfricaArchaicAdmixture_5R19

stsmall · 2023-03-29T15:09:57Z

@chriscrsmith,
I don't recall how to line comment on the commit.
For the demofix, you need to follow the tiny_config.yaml format.
Specifically:

"mask_file": "workflows/masks/HapmapII_GRCh37.mask.bed"
# set any of the below to 'none' to skip annot masking
"stairway_annot_mask" : ""
"msmc_annot_mask" : ""
"gone_annot_mask" : ""
"smcpp_annot_mask" : ""
"methods" : ["stairwayplot", "gone", "smcpp", "msmc"]

stsmall · 2023-03-29T15:10:45Z

Also, you want "num_msmc_iterations" to be at least 20.

stsmall · 2023-03-29T15:12:44Z

The DFE and annotation lists need to be the same length:

"dfe_list": ["Gamma_H17", "Gamma_H17"]
"annotation_list": ["all_sites", "ensembl_havana_104_exons"]
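A small sanity check along these lines could catch a mismatch before a long run starts. This is only a sketch: the function name `pair_dfes_with_annotations` is hypothetical, and it assumes the two lists are matched by position (first DFE with first annotation), which is my reading of the comment above rather than something confirmed in this thread.

```python
def pair_dfes_with_annotations(config):
    """Pair each DFE with an annotation by position (assumed semantics).

    Raises ValueError if the two lists differ in length.
    """
    dfes = config["dfe_list"]
    annotations = config["annotation_list"]
    if len(dfes) != len(annotations):
        raise ValueError(
            f"dfe_list has {len(dfes)} entries but annotation_list has "
            f"{len(annotations)}; they must be the same length"
        )
    return list(zip(dfes, annotations))


# The two lists from the comment above, written as a plain dict.
config = {
    "dfe_list": ["Gamma_H17", "Gamma_H17"],
    "annotation_list": ["all_sites", "ensembl_havana_104_exons"],
}
pairs = pair_dfes_with_annotations(config)
```

Failing early with a clear message seems preferable to letting a job die partway through on an index error.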

stsmall · 2023-03-29T15:17:28Z

For 'replicates' I think it should be more than 3. Not sure what an upper limit is ... maybe 10 or at least 20?

chriscrsmith · 2023-03-30T01:20:33Z

Thanks @stsmall . Ok what do you think of the new version? I left reps=3 until we get input from others. 20 sounds like too many to me

stsmall · 2023-03-30T17:21:57Z

> Thanks @stsmall . Ok what do you think of the new version? I left reps=3 until we get input from others. 20 sounds like too many to me

The plots use seeds (reps) to create CI ribbons. 3 reps will just be noisy. IDK if 20 is too many or not enough, I just picked a number. Since the reps run in parallel, it shouldn't be too much of a slowdown to do more, right? We could always add more later, but then we would have to rerun the n_t and dfe pipelines on the full dataset.
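To make the ribbon point concrete, here is a rough sketch of how a per-time-point ribbon might be computed across replicates. Everything in it is illustrative: the `ribbon` and `simulate_reps` helpers, the 5%/95% cut-offs, and the toy data are assumptions, not the plotting code used in this workflow.

```python
import random
import statistics


def ribbon(estimates_by_rep):
    """Return (low, mid, high) lists, one value per time point.

    estimates_by_rep holds one equal-length list of estimates per replicate.
    low/high are the 5th/95th percentiles across replicates (an assumed
    choice); mid is the median.
    """
    low, mid, high = [], [], []
    for values in zip(*estimates_by_rep):
        cuts = statistics.quantiles(values, n=20, method="inclusive")
        low.append(cuts[0])    # 5th percentile
        mid.append(statistics.median(values))
        high.append(cuts[-1])  # 95th percentile
    return low, mid, high


def simulate_reps(n_reps, n_points=5, true_value=10_000, noise=0.2, seed=1):
    """Toy stand-in for per-replicate estimates: truth plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        [true_value * (1 + rng.gauss(0, noise)) for _ in range(n_points)]
        for _ in range(n_reps)
    ]


low3, mid3, high3 = ribbon(simulate_reps(3))
low20, mid20, high20 = ribbon(simulate_reps(20))
```

With only three values per time point, the ribbon edges are set almost entirely by the single smallest and largest replicate, which is one way to see why a 3-rep ribbon looks noisy.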

stsmall · 2023-03-30T17:25:46Z

Otherwise it looks good. :)
Do we want to do more variations for msmc2? Right now it is just 6. More haps (maybe the limit is 16?) will provide better resolution of <1000 gens, which is where msmc2 really seems to go awry. The run time will get way longer, and even though it is parallelized, it is still the last thing to finish.

chriscrsmith · 2023-03-30T17:35:17Z

gotcha, the ribbons. 20 reps sounds good

andrewkern · 2023-04-04T16:32:25Z

i'm a bit concerned about the compute cost of 20 reps up front. The way these runs go, we almost always have to rerun it.
i think we should start with 3 reps -- if that completes in a reasonable time we can generate more reps if we want to. One way we could do this would be to have two seeds -- 1 for the first 3 reps, then a second seed for the next 17 (or whatever number..)
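The two-seed idea could look roughly like the sketch below. It is an assumption-laden illustration: `replicate_seeds` is a hypothetical helper, the master seed values are made up, and this is not how the workflow currently derives its per-replicate seeds.

```python
import random


def replicate_seeds(master_seed, n_reps):
    """Derive n_reps per-replicate seeds deterministically from one master seed."""
    rng = random.Random(master_seed)
    return [rng.randrange(1, 2**31) for _ in range(n_reps)]


# Hypothetical master seeds: one for the initial batch, one for the top-up.
first_batch = replicate_seeds(12345, 3)
second_batch = replicate_seeds(67890, 17)

# The first three seeds are unchanged by adding the second batch, so the
# initial runs would not need to be redone when topping up to 20 reps.
all_seeds = first_batch + second_batch
```

One thing worth checking in practice is that the two batches share no seed, since a collision would silently duplicate a replicate.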

workflows/config/snakemake/production_config_HomSap.yml

chriscrsmith · 2023-04-04T17:43:47Z

> Otherwise it looks good. :) Do we want to do more variations for msmc2? Right now it is just 6. More haps (maybe the limit is 16?) will provide better resolution of <1000 gens, which is where msmc2 really seems to go awry. The run time will get way longer, and even though it is parallelized, it is still the last thing to finish.

If it's already the longest-running part of the analysis, I think we should leave it for now and update later as needed?

chriscrsmith · 2023-04-04T18:12:50Z

see new commit: changed genetic map, deleted some unused parameters

I have not done a full run yet, but if I turn on scaling it seems to get off the ground ok.

chriscrsmith · 2023-04-11T23:31:33Z

There was some talk in the Tuesday meeting about potentially doing the Papuan demographic model. What does everyone think?

RyanGutenkunst · 2023-04-12T03:34:56Z

That would be a flex. :-) I guess we'd assume the DFE was the same in Denisova and Neanderthal as modern humans. We'd lose the easy comparison with the previous paper, but if we run and include the neutral analysis here, that's no problem.

petrelharp · 2023-04-12T04:00:21Z

Say, @chriscrsmith - could you clarify what exactly is being proposed? Like, is there going to be just one demographic model? Or, more than one? And, what DFE(s)?

chriscrsmith · 2023-04-12T14:03:01Z

Demog.

I imagined at least running the same demographic model from the previous paper, for comparison. So, OutOfAfricaArchaicAdmixture_5R19.
However, we have been using OutOfAfrica_3G09. Is there something special about this one? Do we leave this model in the analysis?
Based on Ryan's feedback I'd lean towards skipping the Papuan model. But I was wondering if we should include it alongside the other one(s).

DFEs

Gamma_K17: Kim et al. (2017), https://doi.org/10.1534/genetics.116.197145
Gamma_H17: Huber et al. (2017), https://doi.org/10.1073/pnas.1619508114

petrelharp · 2023-04-12T20:20:55Z

I'm still a bit fuzzy here - are we deciding which single model to run, or are we deciding between having 1 or 2 models? Or what? And, concretely, what goes in the paper - is this the demographic model(s) that'll be used for both (a) inferring DFEs and (b) the effect of selection on demographic inference? The same one(s) for both?

chriscrsmith · 2023-04-12T20:41:31Z

Demog.

I imagined at least running the same demographic model from the previous paper, for comparison. So, OutOfAfricaArchaicAdmixture_5R19.
However, we have been using OutOfAfrica_3G09. Is there something special about this one? Here are the options: 1. Leave this model in the analysis. 2. Take it out.
Based on Ryan's feedback I'd lean towards skipping the Papuan model. But I was wondering if we should include it alongside the other one(s). Here are the options: 1. Use this model. 2. Skip this model.

DFEs

Gamma_K17: Kim et al. (2017), https://doi.org/10.1534/genetics.116.197145
Gamma_H17: Huber et al. (2017), https://doi.org/10.1073/pnas.1619508114

chriscrsmith · 2023-04-12T20:52:39Z

> I'm still a bit fuzzy here - are we deciding which single model to run, or are we deciding between having 1 or 2 models?

I imagined at least running the same demographic model from the previous paper.

> And, concretely, what goes in the paper - is this the demographic model(s) that'll be used for both (a) inferring DFEs and (b) the effect of selection on demographic inference? The same one(s) for both?

I think that makes sense.

petrelharp · 2023-04-13T00:40:24Z

I agree about using the same model as in the last paper. There is nothing special (besides being an early model and thus jumping to our minds more easily?) about OutOfAfrica_3G09.

I don't have a good sense about whether we've got room for results about more than one model - that depends on what figures we want?

chriscrsmith · 2023-04-13T15:58:23Z

Updated the PR to delete the human model we've been using, so it's now replaced with the model from the previous paper.

Here's the relevant post about our plan for the paper: #8

petrelharp · 2023-04-13T16:59:39Z

Thanks for finding the outline! =) So, the current proposal is to just have one model? That seems fine to me, really - unless there's a reason to think that methods might behave differently under some models than others? But, I guess if we're going to look at different scenarios I'd much rather look at different species than just different human models. So: I agree!

petrelharp · 2023-05-09T16:27:31Z

In the meeting just now we decided we can merge this.

chriscrsmith · 2023-09-26T16:39:10Z

In the meeting just now we agreed this looks good, minus the Gamma_H17 DFE.

petrelharp · 2023-10-10T16:30:12Z

@chriscrsmith says merge!

chriscrsmith and others added 2 commits March 21, 2023 10:45

proposed production-level config for humans

e659546

Update production_config_HomSap.yml

fd69e35

chriscrsmith added 3 commits March 29, 2023 18:10

scotts suggestions

21c2b57

del new thing

378d97e

modeling after tiny config

ee5e0d9

lntran26 reviewed Apr 4, 2023

workflows/config/snakemake/production_config_HomSap.yml (four review comments, now outdated)

using updated genetic map; cleaned up some unused parameters

79bd6db

chriscrsmith added 2 commits April 12, 2023 14:21

Update production_config_HomSap.yml

1ad00dc

Update production_config_HomSap.yml

4aa63f2

Update production_config_HomSap.yml

12e5f97

down to a single DFE model

1e61a6e

petrelharp merged commit 8ae1c1a into popsim-consortium:main Oct 10, 2023
1 check passed
