Merge pull request #573 from LStrachan/devel
Simplify vignette A
gregorgorjanc authored Dec 22, 2023
2 parents 1123148 + a03096b commit 316145f
Showing 1 changed file with 14 additions and 45 deletions.
59 changes: 14 additions & 45 deletions vignettes/A_Honeybee_biology.Rmd
@@ -50,18 +50,18 @@ library(package = "SIMplyBee")
Figure 1 visualizes the initiation of the simulation. First, we simulate some
honeybee genomes that represent the founder population. You can quickly generate
random genomes using AlphaSimR's `quickHaplo()`. These founder genomes are
rapidly simulated by sampling 0s and 1s, and do not include any species-specific
demographic history. This is equivalent to all loci having allele frequency 0.5
and being in linkage equilibrium. We use this approach only for demonstrations
and testing.
rapidly simulated by sampling chromosomes as series of 0s and 1s, and do not
include any species-specific demographic history. This is equivalent to all loci
having allele frequency 0.5 and being in linkage equilibrium. We use this approach
only for demonstrations and testing.
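
The claim that sampling 0s and 1s corresponds to allele frequency 0.5 and linkage equilibrium can be checked with a small base-R sketch. This is not how `quickHaplo()` is implemented internally; it only mimics the sampling idea, and the names `nHaplotypes`, `nLoci`, `alleleFreq`, and `ldR2` are hypothetical names introduced for this illustration.

```r
set.seed(42)  # arbitrary seed, only to make this sketch reproducible

nHaplotypes <- 2000  # hypothetical number of sampled haplotypes
nLoci <- 50          # hypothetical number of segregating sites

# Each allele is an independent draw of 0 or 1 with equal probability
haplotypes <- matrix(
  sample(c(0L, 1L), size = nHaplotypes * nLoci, replace = TRUE),
  nrow = nHaplotypes, ncol = nLoci
)

# Frequency of allele 1 at each locus; values should scatter around 0.5
alleleFreq <- colMeans(haplotypes)
summary(alleleFreq)

# Squared correlation between pairs of loci is a common measure of linkage
# disequilibrium; independent sampling should keep it close to zero
locusCor <- cor(haplotypes)
ldR2 <- locusCor[upper.tri(locusCor)]^2
mean(ldR2)
```

With independent draws, per-locus frequencies deviate from 0.5 only by sampling noise, and the average squared correlation between loci shrinks as the number of haplotypes grows.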

Alternatively, you can more accurately simulate honeybee genomes with
SIMplyBee's `simulateHoneyBeeGenomes()`. This function simulates the honeybee
genome using MaCS (Chen et al., 2009) for three subspecies: *A. m. ligustica*,
*A. m. carnica*, and *A. m. mellifera* according to the demographic model
described by Wallberg et al. (2014).
genome by coalescent simulation of whole chromosomes with MaCS (Chen et al., 2009)
for three subspecies: *A. m. ligustica*, *A. m. carnica*, and *A. m. mellifera*
according to the demographic model described by Wallberg et al. (2014).

As a first demonstration, we will use `quickHaplo()` and simulate genomes of two
As a demonstration, we will use `quickHaplo()` and simulate genomes of two
founding individuals. In this example, the genomes will be represented by only
three chromosomes and 100 segregating sites per chromosome. Honeybees have 16
chromosomes and far more segregating sites per chromosome, but we want a quick
@@ -71,43 +71,12 @@ simulation here.
founderGenomes <- quickHaplo(nInd = 2, nChr = 3, segSites = 100)
```

Alternatively, we use `simulateHoneyBeeGenomes()` to sample genomes of a founder
population including 4 *A. m. mellifera* (North) individuals and 2 *A. m.
carnica* individuals. The genomes will be represented by only three chromosomes
and 5 segregating sites per chromosome. These numbers are of course extremely
low because we want a quick example for demonstration purposes. This chunk of
code should take a few minutes to run.

```{r simulate honeybee genomes}
founderGenomes2 <- simulateHoneyBeeGenomes(nMelN = 4,
                                           nCar = 2,
                                           nChr = 3,
                                           nSegSites = 5)
```

Unfortunately, due to the complexity of this function, even such small
numbers take a while to run. Simulating a group of founder genomes with more
realistic numbers will therefore require a lot of time. We suggest
running this part on an external server and saving the outcome as an RData file,
which you can then load into your environment and work with.

```{r save Rdata file}
# Save the genomes on a server
save(founderGenomes2, file = "FounderGenomes2_3chr.RData")
# Load the saved genomes elsewhere
load(file = "FounderGenomes2_3chr.RData")
```

Besides the number of individuals, chromosomes, and segregating
sites, `simulateHoneyBeeGenomes()` also takes a number of genomic parameters:
effective population size, ploidy, length of chromosomes in base pairs, genetic
length of a chromosome in Morgans, mutation rate, recombination rate, and time
of population splits. The default values for these parameters follow published
references (Wallberg et al., 2014). While you can change these parameters, we
don't advise doing so because such demographic models and their parameters
are estimated jointly, so they should not be changed independently. You can
read more about these parameters in the help page
`help(simulateHoneyBeeGenomes)`.
As mentioned, `simulateHoneyBeeGenomes()` generates more realistic chromosome
samples, but also requires much more time. Hence, when you use `simulateHoneyBeeGenomes()`,
we suggest you save the output to an RData file that you can then load into your
environment and work with. See the function documentation using
`help(simulateHoneyBeeGenomes)` to learn about all the parameters of the
function.
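
One way to follow the save-and-load suggestion is a simple caching pattern: run the slow simulation only when no saved file exists yet. The sketch below uses base R only. `slowSimulation()` is a hypothetical stand-in for a long-running call such as `simulateHoneyBeeGenomes()`, and the file name and object name are likewise assumptions made for illustration.

```r
# Hypothetical stand-in for a slow call such as simulateHoneyBeeGenomes();
# it returns a small 0/1 matrix so that the sketch runs instantly
slowSimulation <- function() {
  matrix(rbinom(n = 20, size = 1, prob = 0.5), nrow = 4, ncol = 5)
}

# Assumed cache location; in practice this would be a path of your choice
cacheFile <- file.path(tempdir(), "FounderGenomes_cache.RData")

if (file.exists(cacheFile)) {
  # Reuse the previously saved object (restores founderGenomesCached)
  load(file = cacheFile)
} else {
  # Run the slow step once and save the result for later sessions
  founderGenomesCached <- slowSimulation()
  save(founderGenomesCached, file = cacheFile)
}
```

Because `load()` restores objects under the names they were saved with, keeping the object name stable across sessions avoids surprises when the cached file is reused.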

Now we are ready to set up global simulation parameters using `SimParamBee`.
`SimParamBee` builds upon AlphaSimR's `SimParam`, which includes genome and