Update to new versions, minor updates #3

Merged · 1 commit · Oct 31, 2024
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,5 @@
.Rproj.user
_book/
_dust/
odin_files
*_files/
*_cache/
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,3 +25,5 @@ which will open the site in a browser. Then edit the `.qmd` pages; changes will
The site will build when pushed to GitHub, so failures will show on a PR. It will push to `gh-pages` automatically when merged.

Documentation for quarto is [here](https://quarto.org/docs/guide/) and for books specifically [here](https://quarto.org/docs/books/)

We cache models in `_dust` so that it's not painful to edit things. You should remove this directory after substantial changes to any of the packages (you may want to delete `_book` too so that quarto rebuilds everything as expected).
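For example (a sketch, not part of this commit), from an R session at the repository root you could clear both caches with:

```r
# Remove the cached models and the built book so quarto rebuilds everything
unlink(c("_dust", "_book"), recursive = TRUE)
```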
1 change: 1 addition & 0 deletions WORDLIST
Original file line number Diff line number Diff line change
Expand Up @@ -19,5 +19,6 @@ nondifferentiable
nonlinear
odin
odin's
pMCMC
standalone
stochasticity
1 change: 0 additions & 1 deletion _quarto.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@ book:
repo-actions: [edit, issue, source]
chapters:
- index.qmd
- intro.qmd
- part: "Odin"
chapters:
- odin.qmd
Expand Down
8 changes: 2 additions & 6 deletions arrays.qmd
Original file line number Diff line number Diff line change
@@ -1,12 +1,8 @@
# Arrays
# Arrays {#sec-arrays}

```{r}
#| include: false
source("utils.R")
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
source("common.R")
```

::: {.callout-warning}
Expand Down
6 changes: 1 addition & 5 deletions colophon.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,13 @@

```{r}
#| include: false
source("common.R")
pkgs <- c("odin2", "dust2", "monty",
"pkgload", "pkgbuild", "cpp11", "cli", "rlang",
"brio", "decor", "fs", "knitr", "rmarkdown", "usethis")
for (p in pkgs) {
loadNamespace(p)
}
source("utils.R")
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```

Key versions:
Expand Down
20 changes: 20 additions & 0 deletions common.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)

lang_output <- function(x, lang) {
cat(c(sprintf("```%s", lang), x, "```"), sep = "\n")
}

cpp_output <- function(x) {
lang_output(x, "cpp")
}

r_output <- function(x) {
lang_output(x, "r")
}

plain_output <- function(x) {
lang_output(x, "plain")
}
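These helpers presumably get called from knitr chunks to emit syntax-highlighted output; a hypothetical usage sketch (the `results: asis` chunk option and the snippet contents are assumptions, not taken from the commit):

```r
source("common.R")

# In a quarto chunk with `#| results: asis`, print a small C++ snippet as a
# fenced, highlighted code block:
code <- c("#include <cstdio>", "int main() { std::puts(\"hi\"); return 0; }")
cpp_output(code)
```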
55 changes: 33 additions & 22 deletions data.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,14 @@

```{r}
#| include: false
source("utils.R")
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
source("common.R")
```

Before we start on inference, we need to think about how our models fit to data, and how we can get data into the models. This gets very close to the interface that we need to work with [monty](https://mrc-ide.github.io/monty), which is the focus of the next section of the book.

```{r}
#| include: false
source("common.R")
set.seed(42)
```

Expand Down Expand Up @@ -67,7 +64,7 @@ When running this model we'll see incidence produced:

```{r}
pars <- list(beta = 1, gamma = 0.6)
sys <- dust_system_create(sir(), pars, dt = 0.25)
sys <- dust_system_create(sir, pars, dt = 0.25)
dust_system_set_state_initial(sys)
t <- seq(0, 20, by = 0.25)
y <- dust_system_simulate(sys, t)
Expand All @@ -78,7 +75,7 @@ plot(t, y$incidence, type = "s", xlab = "Time", ylab = "Incidence")
Our step is a quarter of a day, so we see four levels of incidence per day, with the last being the daily peak. If we run again with a timestep of 1 day, we see:

```{r}
sys <- dust_system_create(sir(), pars, dt = 0.25)
sys <- dust_system_create(sir, pars, dt = 0.25)
dust_system_set_state_initial(sys)
t <- seq(0, 20)
y <- dust_system_simulate(sys, t)
Expand All @@ -98,7 +95,7 @@ tail(d)

We have a column for time (`time`) and one for observed cases (`cases`), spanning the time range 0 to 20.
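Purely to illustrate the expected shape (the chunk that builds `d` is collapsed in this view, and these case counts are made up):

```r
# Hypothetical stand-in with the same two columns as the real data
d_example <- data.frame(time = 0:20, cases = rpois(21, lambda = 10))
head(d_example)
```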

## Comparison to data
## Comparison to data {#sec-simple-sir-data}

The next step is to tell our model about this data:

Expand Down Expand Up @@ -135,10 +132,16 @@ First, `cases <- data()` says that `cases` is a special **data** variable. It w

Second, `cases ~ Poisson(incidence)` describes the per-data-point [likelihood calculation](https://en.wikipedia.org/wiki/Likelihood_function); the syntax may be familiar to you if you have read [Richard McElreath's *Statistical Rethinking*](https://xcelab.net/rm/) or used any number of Bayesian statistical frameworks.
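The chunk containing the updated model is collapsed in this view; purely as an illustration (not the exact model from this commit), a discrete-time SIR system with such a comparison section might look something like:

```r
# A sketch only: the parameter names, defaults and rates here are assumptions
sir <- odin2::odin({
  update(S) <- S - n_SI
  update(I) <- I + n_SI - n_IR
  update(R) <- R + n_IR
  update(incidence) <- incidence + n_SI

  p_SI <- 1 - exp(-beta * I / N * dt)
  p_IR <- 1 - exp(-gamma * dt)
  n_SI <- Binomial(S, p_SI)
  n_IR <- Binomial(I, p_IR)

  initial(S) <- N - I0
  initial(I) <- I0
  initial(R) <- 0
  initial(incidence, zero_every = 1) <- 0

  N <- parameter(1000)
  I0 <- parameter(10)
  beta <- parameter(0.2)
  gamma <- parameter(0.1)

  cases <- data()
  cases ~ Poisson(incidence)
})
```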

The generator will advertise that this system can be compared to data:

```{r}
sir
```

Once we have our new model, we can see how the data comparison works.

```{r}
sys <- dust_system_create(sir(), list(), n_particles = 10)
sys <- dust_system_create(sir, list(), n_particles = 10)
dust_system_set_state_initial(sys)
dust_system_run_to_time(sys, d$time[[1]])
dust_system_compare_data(sys, d[1, ])
Expand Down Expand Up @@ -178,14 +181,20 @@ As you can see, these are the same; the expression
cases ~ Poisson(incidence)
```

has done the same sort of thing as we can do ourselves by asking "what is the (log) probability of observing `cases` cases if the underlying rate of incidence is `incidence`".
has done the same sort of thing as we can do ourselves by asking "what is the (log) probability of observing `cases` cases if the underlying rate of incidence is `incidence`". Note that this can be `-Inf` (a probability of 0) in the case where the modelled incidence is zero.

```{r}
dpois(10, 0, log = TRUE)
```

This is fine, so long as not all particles are impossible.

## The likelihood filter {#sec-data-filter}

We include a simple bootstrap particle filter in `dust` for estimating the marginal likelihood in this case: the probability of the entire data series conditional on our model, marginalised (averaged) over all stochastic realisations. Because our model is stochastic, this is an **estimate** of the likelihood, but the estimate will get less noisy as the number of particles increases.
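In our notation (not taken from the dust documentation), the bootstrap filter's estimate of this marginal likelihood is the product over time points of the mean particle weight,

$$
\hat{p}(y_{1:T} \mid \theta) = \prod_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N} p\left(y_t \mid x_t^{(i)}, \theta\right),
$$

where $x_t^{(i)}$ is the state of particle $i$ at time $t$ and $N$ is the number of particles; the filter works with, and returns, the log of this quantity.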

```{r}
filter <- dust_filter_create(sir(), time_start = 0, data = d, n_particles = 200)
filter <- dust_filter_create(sir, time_start = 0, data = d, n_particles = 200)
```

Here, we've created a filter where the time series starts at 0, using our data set `d` and we've picked 200 particles to start with. We run the filter with `dust_likelihood_run`, and this returns a likelihood:
Expand All @@ -211,7 +220,7 @@ var(ll)
If we increase the number of particles, this variance will decrease, at the cost of taking longer to run:

```{r}
filter2 <- dust_filter_create(sir(), time_start = 0, data = d,
filter2 <- dust_filter_create(sir, time_start = 0, data = d,
n_particles = 2000)
ll2 <- replicate(100, dust_likelihood_run(filter2, list()))
mean(ll2)
Expand All @@ -220,37 +229,39 @@ var(ll2)

The log-likelihoods from different numbers of particles are not directly comparable, though.

You can extract trajectories from the filter after running it: this gives a *tree* showing the ancestry of the particles remaining in the sample. To enable this, you must run with `save_history = TRUE`, as this incurs a runtime cost.
You can extract trajectories from the filter after running it: this gives a *tree* showing the ancestry of the particles remaining in the sample. To enable this, you must run with `save_trajectories = TRUE`, as this incurs a runtime cost.

```{r}
ll <- dust_likelihood_run(filter, list(), save_history = TRUE)
h <- dust_likelihood_last_history(filter)
ll <- dust_likelihood_run(filter, list(), save_trajectories = TRUE)
h <- dust_likelihood_last_trajectories(filter)
```

The history will be an array with 3 dimensions representing (in turn) state, particle and time:
The trajectories will be an array with 3 dimensions representing (in turn) state, particle and time:

```{r}
dim(h)
```

We can plot the history of our incidence here:
We can plot the trajectories of our incidence here:

```{r}
matplot(d$time, t(dust_unpack_state(filter, h)$incidence), type = "l",
lty = 1, col = "#00000044", xlab = "Time", ylab = "Incidence")
lty = 1, col = "#00000044", ylim = c(0, 75),
xlab = "Time", ylab = "Incidence")
points(cases ~ time, d, pch = 19, col = "red")
```

This model does not fit the data very well! The points don't lie close to the modelled trajectories. We also see everything before time ~10 reduced to a single particle's history, which will have the effect of increasing the variance.
This model does not fit the data very well! The points don't lie close to the modelled trajectories. We also see everything before time ~10 reduced to a single particle's trajectories, which will have the effect of increasing the variance.

If we found better parameters things would look much better. We just so happen to know that if `beta` is 1 and `gamma` 0.6 then the model will fit better:

```{r}
pars <- list(beta = 1, gamma = 0.6)
ll <- dust_likelihood_run(filter, pars, save_history = TRUE)
h <- dust_likelihood_last_history(filter)
ll <- dust_likelihood_run(filter, pars, save_trajectories = TRUE)
h <- dust_likelihood_last_trajectories(filter)
matplot(d$time, t(dust_unpack_state(filter, h)$incidence), type = "l",
lty = 1, col = "#00000044", xlab = "Time", ylab = "Incidence")
lty = 1, col = "#00000044", ylim = c(0, 75),
xlab = "Time", ylab = "Incidence")
points(cases ~ time, d, pch = 19, col = "red")
```

Expand Down
31 changes: 29 additions & 2 deletions index.qmd
Original file line number Diff line number Diff line change
@@ -1,5 +1,32 @@
# Preface {.unnumbered}

This is a Quarto book.
```{r}
#| include: false
source("common.R")
```

To learn more about Quarto books visit <https://quarto.org/docs/books>.
This is a book about writing models in the [`odin`](https://mrc-ide.github.io/odin2) "domain specific language" (DSL) and fitting these models to data using [`monty`](https://mrc-ide.github.io/monty). These tools can be used entirely separately; you can create a dynamical model (or "system") in `odin` and never fit it to data, or you can use `monty` to fit models that are written in plain R code. We'll structure the book to cover the tools fairly separately at first, then describe their intersection. Along the way, you'll also use [`dust`](https://mrc-ide.github.io/dust2), which powers `odin`, so we'll spend some time describing how to use that, and also how to write models directly in `dust` if you need more control than `odin` provides.

## History

Originally, we designed odin around describing and solving systems of differential equations, with a small amount of support for working with discrete-time stochastic systems. Over time, and particularly during the COVID-19 response, we expanded the stochastic support and introduced the ability to fit these models using particle filters and other sequential Monte Carlo algorithms. You may have used `odin.dust` and `mcstate` through this period. As of 2024 we have rationalised our tools into a new set of packages, currently called `odin2`, `dust2` and `monty`.

## Installation

You can install these packages in one go from our R-universe:

```r
install.packages(
c("monty", "dust2", "odin2"),
repos = c("https://mrc-ide.r-universe.dev", "https://cloud.r-project.org"))
```

You will also need a functioning C++ toolchain; you can test this by running:

```{r}
pkgbuild::has_build_tools(debug = TRUE)
```

## About this book

This book exists to supplement the package documentation, and we will try to link extensively to it. It is designed as a fairly friendly walk-through of the tools, and to mirror courses that we run.