Merge pull request #39 from bird-house/autodeploy-the-autodeploy-phase-2
Autodeploy the autodeploy phase 2: everything operational but a few compatibility issues remain
Part of #27
Activating the `./components/scheduler` will do everything. All configurations are centralized in the `env.local` file.
One missing feature is piece-wise choice of platform-only or notebook-only autodeploy, as with the old manual `install-*` scripts under https://github.com/bird-house/birdhouse-deploy/tree/master/birdhouse/deployment. Right now it's all or nothing. I can work on this if you guys think it's needed.
Remaining compatibility issues with Medus (Vagrant box works fine):
* Notebook autodeploy does not work. With the `bash` docker image, I am unable to wget any https address. The same `docker run` command works fine on my Vagrant box, so the problem is specific to Medus.
```
$ docker run --rm --name debug_wget_httpS -u root bash bash -c "wget https://google.com -O -"
Connecting to google.com (172.217.13.206:443)
wget: error getting response: Connection reset by peer
```
* All the containers are being recreated when `./pavics-compose.sh` runs inside the container (first migration to the new autodeploy mechanism). Still to investigate, but I suspect this is due to the older versions of `docker` and `docker-compose` on Medus.
* This one looks like it is due to the older kernel on Medus:
```
sysctl: error: 'net.ipv4.tcp_tw_reuse' is an unknown key
sh: 0: unknown operand
```
* All the files updated by `git pull` are now owned by `root` (the user inside the container). I'll have to undo this ownership change somehow. This one is super weird, I should have hit it on my Vagrant box. Vagrant probably does some magic to ensure files under `/vagrant` are always owned by the user, even when changed by user `root`.
* Documentation: update README and list relevant configuration variables in `env.local` for this new `./components/scheduler`.
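One possible way to undo the `root` ownership change mentioned above is to record the owner of the checkout before running a command in it and restore that owner afterwards. This is only a minimal sketch, not what the PR implements: the helper names `owner_of` and `run_preserving_owner` are hypothetical, and it assumes a `stat` that supports `-c` (GNU coreutils or busybox).

```shell
# Hypothetical helper: print "uid:gid" of a path.
# Assumes a `stat` that understands `-c` (GNU coreutils or busybox).
owner_of() {
    stat -c '%u:%g' "$1"
}

# Hypothetical helper: run a command inside a checkout, then give every
# file back to whoever owned the checkout directory before the command ran.
# Usage: run_preserving_owner <repo_dir> <command> [args...]
run_preserving_owner() {
    repo_dir="$1"
    shift
    owner="$(owner_of "$repo_dir")"

    # Run the command in a subshell so the caller's working directory is kept.
    status=0
    ( cd "$repo_dir" && "$@" ) || status=$?

    # Restore ownership even if the command failed.
    chown -R "$owner" "$repo_dir"
    return "$status"
}
```

Inside the autodeploy container this could be called as `run_preserving_owner /path/to/checkout git pull`, where `/path/to/checkout` is a placeholder for the volume-mounted repository.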
Migrating to this new mechanism requires manual deletion of all the artifacts created by the old install scripts: `sudo rm /etc/cron.d/PAVICS-deploy /etc/cron.hourly/PAVICS-deploy-notebooks /etc/logrotate.d/PAVICS-deploy /usr/local/sbin/triggerdeploy.sh`. The old and new mechanisms cannot coexist.
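Before activating the scheduler, it may help to check whether any of those old artifacts are still present on the host. The following is a small sketch, not part of this PR: the function name `list_old_artifacts` and its optional root-prefix argument are hypothetical, while the four paths are the ones listed above.

```shell
# Hypothetical helper: print every leftover artifact of the old install
# scripts that still exists. The optional first argument is a root prefix
# (empty by default), which is only useful for dry runs against a test tree.
list_old_artifacts() {
    root="${1:-}"
    for f in \
        /etc/cron.d/PAVICS-deploy \
        /etc/cron.hourly/PAVICS-deploy-notebooks \
        /etc/logrotate.d/PAVICS-deploy \
        /usr/local/sbin/triggerdeploy.sh
    do
        if [ -e "$root$f" ]; then
            printf '%s\n' "$root$f"
        fi
    done
}
```

Running `list_old_artifacts` with no argument prints nothing on a clean host. Any path it prints should be removed with the `sudo rm` command above before the new mechanism is enabled.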
Maximum backward compatibility has been kept with the style of the old install scripts:
* Still log to the same existing log files under `/var/log/PAVICS`.
* Old single ssh deploy key is still compatible, but the new mechanism allows a different ssh deploy key for each extra repo (again, public repos should use the https clone path to avoid dealing with ssh deploy keys in the first place).
* Old install scripts are kept
Features missing in the old install scripts, or how this improves on them:
* Autodeploy of the autodeploy itself! This is the biggest win. Previously, if the `triggerdeploy.sh` or `PAVICS-deploy-notebooks` script changed, it had to be deployed manually, which was very annoying. Now they are volume-mounted in, so they are fresh on each run.
* `env.local` now drives absolutely everything: source control that file and we've got a true DevOps pipeline.
* Configurable platform and notebook autodeploy frequency. Previously, this meant manually editing the generated cron file, which is less ideal.
* Do not need any support on the local host other than `docker` and `docker-compose`. cron/logrotate/git/ssh versions are all locked-down in the docker images used by the autodeploy. Recall previously we had to deal with git version too old on some hosts.
* Each cron job runs in its own docker image, meaning the runtime environment is traceable and reproducible.
* The newly introduced scheduler component is made extensible so other jobs (ex: backup) can be added to it as well, via `env.local`, which should be source controlled, meaning all surrounding maintenance tasks can also be traceable and reproducible.
This is a rather large PR. For a less technical overview, start with the diff of README.md, env.local.example, common.env. If a change looks funny to you, read the description of the commit that introduces that change; the reasoning should be there.