Commit 2c863ed

Updates to chapter 8 slides (#43)
1 parent ce01ab0 commit 2c863ed

6 files changed, +127 -221 lines changed


_freeze/slides/06/execute-results/html.json

Lines changed: 6 additions & 8 deletions
@@ -1,17 +1,15 @@
11
{
2-
"hash": "251ca1bf0aaa9a855eeeda3e85979c71",
2+
"hash": "79025dae1f6efd9d029d108bb5da036f",
33
"result": {
44
"engine": "knitr",
5-
"markdown": "---\nengine: knitr\ntitle: Docker for Data Science\n---\n\n## Learning objectives\n\n- Decide whether a container is the right tool for a given job.\n- Download and run pre-built Docker images.\n- Describe the stages of the Docker container lifecycle.\n- Build simple Dockerfiles for your own projects.\n\n## Why containers? {-}\n\n![](images/05_it-works-on-my-machine.jpg)\n\n- Containers are a way to save an entire machine's state, rather than just project components\n - This extends beyond packages and libraries to the R or Python version itself, as well as any other tools\n \n- While a container is similar to a VM, they are much more single purpose. They...\n - are quick to start\n - can be used for individual projects or scripts\n \n- The container's configuration is code (infrastructure as code!) and so it's easy to reproduce\n\n*https://www.reddit.com/r/ProgrammerHumor/comments/cw58z7/it_works_on_my_machine/*\n\n## Why containers for Data Science? {-}\n\n- Containers are mostly used for:\n 1. Packaging an environment for someone else to use\n 2. Packaging a finished product (project/app/whatever) for archiving, reproducibility, or production\n \n![](https://do4ds.com/chapters/sec1/images-docker/docker-usage.png)\n \nPotential examples:\n\n- When publishing, instead of providing code, data, and describing the environment used, you can include a Dockerfile so anyone can pick up exactly where you left off\n- You have a project at work that needs to be interacted with every week no matter who looks at it or when\n- You're publishing an R package and want to test specific features across different base R or package versions\n\n## Words of caution {-}\n\n- Docker is limited to the resources you give it. 
If your dev machine is less than awesome, your Docker container will be much less than awesome\n- Docker is only allowed access to what you give it and may take some extra work to get running\n- Some workplaces may not be comfortable with Docker\n- Some use-cases may require direct access to the hardware and are incompatible with a container system\n - Sometimes computers do math differently\n \n- Containers require proper setup!\n\n## Diving In {-}\n\n### Docker Run {-}\n\n`docker run [OPTIONS] IMAGE [COMMAND] [ARG...]`\n\n- `docker run` tells Docker to run the following image\n- Options are configured as needed\n- `IMAGE` is configured as `user/image` when pulling from docker hub (like CRAN but for Docker)\n\n\n::: {.cell}\n\n:::\n\n\n### Docker Compose {-}\n\n`docker-compose.yml` -> `docker compose [-f <arg>...] [options] [COMMAND] [ARGS...]`\n\n- The `docker-compose` file provides a structured way to describe a docker image\n- Easy way to combine multiple services (maybe you want R + Python)\n- Can be used with the `run` (do something) or `up` command (be ready to do something)\n- Options are similar to `docker run`\n\n\n::: {.cell}\n\n:::\n\n\n\n::: {.cell}\n\n:::\n\n\n## Container Lifecyle {-}\n\n![](https://do4ds.com/chapters/sec1/images-docker/docker-lifecycle.png)\n\n- Change the Dockerfile, not the image\n- Images can be shared like code\n - Think Git!\n- There are services to provide private image registries for companies\n- Containers usually auto-pull if it doesn't exist already\n\n## More on Docker Run {-}\n\n\n::: {.cell}\n\n:::\n\n\n- DockerHub containers are in the form `<user>/<name>` (`alexkgold/plumber`)\n - You can tag an image with a version number `user>/<name>:<version>`\n\n- `--rm` to remove the container on kill (probably not for production)\n- `-d` run in detached mode so the terminal is free for other uses\n- `-p <host>:<container>` publishes a port from inside the container to outside\n- `--name` to assign a name of your choice\n- `-v 
<outside/directory>:<inside/directory>` to expose a directory\n - `${PWD}` is your project directory\n \n## Build a Dockerfile {-}\n\n- Not the same as a `docker-compose.yml`\n\n- `FROM` the base image for the container\n- `RUN` run any command as though it's using the terminal\n - If using something fancy, you may need to install it first\n- `COPY` copy a file from host to container\n- `CMD` run a command at runtime\n\n![](images/05_container-layers.jpg)\n\n- Containers will rebuild from the top-most command that was changed\n\n\n::: {.cell}\n\n:::\n\n\n\n::: {.cell}\n\n:::\n\n\n## Trying out Docker {-}\n\n1. Try out plumber penguins [in your browser]( http://localhost:8000/__docs__/)\n\n\n::: {.cell}\n\n:::\n\n\n2. Kill it\n\n\n::: {.cell}\n\n:::\n\n\n3. Do it again\n\n\n::: {.cell}\n\n:::\n\n\n4. Poke around\n\n\n::: {.cell}\n\n:::\n\n\n5. Kill it again\n\n\n::: {.cell}\n\n:::\n\n\n## Meeting Videos {-}\n\n### Cohort 1 {-}\n\n<iframe src=\"https://www.youtube.com/embed/gzJ3eT6tcog\" width=\"100%\" height=\"400px\" data-external=\"1\"></iframe>\n\n<iframe src=\"https://www.youtube.com/embed/CHbaTCo4gQk\" width=\"100%\" height=\"400px\" data-external=\"1\"></iframe>\n",
6-
"supporting": [],
5+
"markdown": "---\nengine: knitr\ntitle: Demystyfing Docker \n---\n\n## Learning objectives\n\n- Decide whether a container is the right tool for a given job.\n- Download and run pre-built Docker images.\n- Describe the stages of the Docker container lifecycle.\n- Build simple Dockerfiles for your own projects.\n\n## Why docker matters for data science \n\n- Docker creates standardized environments that are:\n\n 1. Reproducible\n\n 2. Portable \n\n 3. Collaborative \n \n 4. Scalable\n\n![](images/06_docker-logo.png){.absolute width=480 height=270 left=500px top=200px fig-alt=\"The Docker logo, a whale with shipping containers on its back\"}\n\n::: aside\nSee [An Introduction to Docker for R Users](https://colinfay.me/docker-r-reproducibility/) for a guide on reproducibility with Docker for R users\n:::\n\n::: {.notes}\n\n- Docker allows us to set up infrastructure as code\n\n- Docker enhances reproducibility by creating a reproducible environment all the way down to the operating system - important for highly regulated industries\n\n- Docker allows you to develop your project using an image that may more closely match the production environment - important for Shiny apps and APIs\n\n:::\n\n## What is Docker? \n\n- An open-source tool for building, sharing, and running software\n\n![](images/06_docker-lifecycle.png){fig-align=\"center\" width=50% height=50% fig-alt=\"The Docker lifecycle and commands, showing that a Dockerfile produces a Docker Image, which leads to a Docker Container\"}\n\n::: {.notes}\n\n- Requires a Linux operating system or the Windows Subsystem for Linux (WSL)\n\n- Windows and macOS users download [Docker Desktop](https://www.docker.com/products/docker-desktop/), which comes with a Linux VM. Linux users are recommended to install the [Docker Engine](https://docs.docker.com/engine/install/) directly.\n\n- Three terms to be familiar with are DockerFile, Docker Image, and Container Instances. 
\n\n- Note that the names *container* and *instance* are often used interchangeably\n\n:::\n\n## Specify your environment via a `Dockerfile`\n\n- Dockerfiles build Docker images\n\n- Dockerfiles are plain text files using `FROM`, `RUN`, `COPY`, and `CMD` commands \n\n\n::: {.cell}\n\n```{.bash .cell-code}\nFROM ubuntu:latest # <1> \nCOPY my-data.csv /data/data.csv # <2>\nRUN [\"head\", \"/data/data.csv\"] # <3>\n```\n:::\n\n1. Declare the base image\n2. Copy `data.csv` from the host's working directory to the container's data directory\n3. Print the first few rows of `data.csv` \n\n::: {.notes}\n\n- Creating your own Dockerfile is optional - many standard Docker images exist on Dockerhub (e.g. [`rocker/tidyverse`]()) \n\n- Dockerfiles build images and also *pull* existing images\n\n:::\n\n## Docker images are a snapshot of your environment\n\n- Docker images contain the bundled software (e.g. OS, data, packages)\n\n- Docker images can be shared with others via [Docker hub](https://hub.docker.com/)\n\n- Docker images can, in theory, be a standalone project\n\n::: {.notes}\n\n- Other container registries exist (e.g. [Azure Container Registry](https://azure.microsoft.com/en-us/products/container-registry/?msockid=16f9494b817a63e60f275f7a80ce623d))\n\n:::\n\n## Containers are an ephemeral instance of a Docker Image\n\n- By default, changes made to containers are lost on shutdown\n\n- Data can be preserved from instance to instance of the same container using mounted volumes\n\n- Containers are a process that executes the layers of your Dockerfile \n\n## Meeting Videos {-}\n\n### Cohort 1 {-}\n\n<iframe src=\"https://www.youtube.com/embed/gzJ3eT6tcog\" width=\"100%\" height=\"400px\" data-external=\"1\"></iframe>\n\n<iframe src=\"https://www.youtube.com/embed/CHbaTCo4gQk\" width=\"100%\" height=\"400px\" data-external=\"1\"></iframe>\n",
6+
"supporting": [
7+
"06_files"
8+
],
79
"filters": [
810
"rmarkdown/pagebreak.lua"
911
],
10-
"includes": {
11-
"include-after-body": [
12-
"\n<script>\n // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n // slide changes (different for each slide format).\n (function () {\n // dispatch for htmlwidgets\n function fireSlideEnter() {\n const event = window.document.createEvent(\"Event\");\n event.initEvent(\"slideenter\", true, true);\n window.document.dispatchEvent(event);\n }\n\n function fireSlideChanged(previousSlide, currentSlide) {\n fireSlideEnter();\n\n // dispatch for shiny\n if (window.jQuery) {\n if (previousSlide) {\n window.jQuery(previousSlide).trigger(\"hidden\");\n }\n if (currentSlide) {\n window.jQuery(currentSlide).trigger(\"shown\");\n }\n }\n }\n\n // hookup for slidy\n if (window.w3c_slidy) {\n window.w3c_slidy.add_observer(function (slide_num) {\n // slide_num starts at position 1\n fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n });\n }\n\n })();\n</script>\n\n"
13-
]
14-
},
12+
"includes": {},
1513
"engineDependencies": {},
1614
"preserve": {},
1715
"postProcess": true
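
The frozen result in this diff keeps the slide source as a single JSON string under `result.markdown`, which makes the title change hard to see in the raw diff. As a minimal sketch (not part of the commit), the snippet below shows one way to pull the front-matter title out of such a file. The `frozen_text` value is a hypothetical, trimmed stand-in for `_freeze/slides/06/execute-results/html.json`, and `front_matter_title` is an illustrative helper, not a Quarto API.

```python
import json

# Hypothetical, trimmed stand-in for the frozen JSON file shown in the diff.
# Only the keys visible in the diff are reproduced here.
frozen_text = json.dumps({
    "hash": "79025dae1f6efd9d029d108bb5da036f",
    "result": {
        "engine": "knitr",
        "markdown": (
            "---\nengine: knitr\ntitle: Demystyfing Docker \n---\n\n"
            "## Learning objectives\n"
        ),
        "supporting": ["06_files"],
    },
})


def front_matter_title(markdown):
    """Return the title from a leading YAML front-matter block, or None."""
    lines = markdown.split("\n")
    # Front matter must open with a '---' fence on the very first line.
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing fence: stop before the slide body
        if line.startswith("title:"):
            return line[len("title:"):].strip()
    return None


frozen = json.loads(frozen_text)
title = front_matter_title(frozen["result"]["markdown"])
print(title)
```

To run the same check against a real checkout, one would replace `frozen_text` with the file's contents. The simple line scan assumes the title sits on a single unquoted line, as it does in this diff.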
