A docker-compose file for launching a prodi.gy and connected postgres containers.
- Docker
- docker-compose
- direnv (optional, but helpful)
- envsubst
- A valid prodigy subscription and access to the prodigy linux wheel
A .env
needs to be set to provide environment variables to docker compose. This should be based on .env.template
If you use direnv, you can create a .envrc
from the .env
file with make .envrc
. Note that docker-compose will only read the .env
, so this should be the canonical store of env vars, with .envrc
just created for the convenience of using direnv.
An .env file should look like:
# REMOTE tool (optional)
# See https://github.com/wellcometrust/remote
INSTANCE_NAME=XXXXXXXXXXXXXXXXXXXXXXXXX
FILTER_PREFIX=*
# AWS
NAME=XXXXXXXXXXXXXXXXX
AWS_PROFILE=default
AWS_REGION=eu-west-1
AWS_ACCOUNT_ID=XXXXXXXXXXXX
S3_BUCKET=s3://XXXXXXXXXXXXXXXXXX
S3_PATH=${S3_BUCKET}/prodigy/${PRODIGY_WHEEL}
REPO=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
# Docker / docker-compose envs
BASE_TAG=3.8.11-slim-buster
# Note that to use spaCy 3 you must use prodigy nightly version
SPACY_VERSION=3.0.0
SPACY_MODEL_URL=https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-${SPACY_VERSION}/en_core_web_sm-${SPACY_VERSION}.tar.gz
# Prodigy
VERSION=1.11.0a8
PRODIGY_WHEEL=prodigy-${VERSION}-cp36.cp37.cp38.cp39-cp36m.cp37m.cp38.cp39-linux_x86_64.whl
PRODIGY_HOME=/opt/prodigy
LOCAL_PORT=1337
# Postgres env vars
POSTGRES_DATA=/data/pg
POSTGRES_PASSWORD=password
POSTGRES_USER=prodigy
PGURL=postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@localhost:5432
Set the username and password values in .env
file. These will get automatically propagated into the prodigy.json
file to the folder matching the env var PRODIGY_HOME
.
To launch the containers you can run:
make up
# which is equivalent to
docker-compose up -d db
To run a prodigy commands just prefix the usual prodigy command with docker-compose run --rm
, e.g.:
docker-compose run --rm --service-ports prodigy stats -ls
If you wish to set an alias on your system to avoid having to call docker-compose
you can add the following to your .bashrc
or equivalent.
prodigy () {
docker-compose run --rm --service-ports prodigy $@
}
To update the prodigy container, for example with a new version of prodigy, add the appropriate prodigy wheel to ./prodigy
and update the PRODIGY_WHEEL
env var in .env
. You should also increment the VERSION
env var. It probably makes sense for this to follow the version of prodigy.
NOTE: you must have a prodigy license in order to get access to a prodigy wheel. See https://prodi.gy.
Build the containers with:
make build
# equiavent to
docker-compose build
To push the container to a docker repo (for example docker hub or AWS ECR) you must first run docker login
. If you are using ECR, you can do this automatically with the following command (you will need to set AWS_PROFILE
or AWS_SECRET_ACCESS_KEY
and AWS_ACCESS_KEY_ID
):
make push
This will run the following commands:
# Update the .envrc file from .env
direnv dotenv > .envrc
# Authorise the .envrc in this folder
direnv allow
# Log in to ECR repository. Env vars will be read from local env via .envrc file
aws ecr get-login-password --region=${AWS_REGION} | sudo docker login \
--username AWS --password-stdin ${REPO}
# Push the docker image to the repo
sudo docker-compose push