Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions .github/workflows/precommit.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
name: pre-commit

on:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]

jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v3
- uses: pre-commit/[email protected]
13 changes: 13 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.14.9
hooks:
# Run the linter.
- id: ruff-check
files: birdhouse/components/jupyterhub/jupyterhub_custom/jupyterhub_custom/
args: [ --config=birdhouse/components/jupyterhub/jupyterhub_custom/ruff.toml ]
# Run the formatter.
- id: ruff-format
files: birdhouse/components/jupyterhub/jupyterhub_custom/jupyterhub_custom/
args: [ --config=birdhouse/components/jupyterhub/jupyterhub_custom/ruff.toml ]
54 changes: 54 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,60 @@

Also changes the format for `JUPYTERHUB_RESOURCE_LIMITS` to a yaml or JSON string.

- Refactor Jupyterhub configuration files

Previously the jupyterhub configuration was defined in `components/jupyterhub/jupyterhub_config.py.template`
except for the custom authenticator which was included in the `pavics/jupyterhub` docker image. This was fine,
except that the configuration was split across two projects and as the configuration got more complex,
maintaining these was getting more difficult.

This moves all configuration code including the authenticator to a python package named `jupyterhub_custom`
located at `components/jupyterhub/jupyterhub_custom/`. It moves all configurations for the authenticator
to the `magpie_authenticator.py` file and all the configurations for the spawner to the `custom_spawner.py`
file. All variables that are settable by environment variables are moved to the `constants.py` file. This
makes it much easier to see which variables are configurable using environment variables all in one place.

This change introduces unit tests and style policies (enforced by [ruff](https://docs.astral.sh/ruff/) and
[pre-commit](https://pre-commit.com/)) for this package to encourage better code practices.

This change should be fully backwards compatible with the exception of the change to how settings for
`deprecated-components/catalog` is handled (see below). However, it does deprecate some environment variables in
favour of better configuration solutions:

- using the `JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS` variable to set the `DockerSpawner.allowed_images` variable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the change, but I guess it could have been used for setting other definitions? If such custom overrides were applied/needed, should they be moved to JUPYTERHUB_CONFIG_OVERRIDE? (i.e.: should we recommend using it over JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS?).

Will the effect be equivalent, or other considerations would be necessary (eg: overrides not applying the same way since more changes could happen in between the "allowe_image" step and the final config object)? Some guidances along those lines could be relevant here if there is anything notable to consider.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could have been used for setting other definitions?

It could have but I hope that it wasn't since any changes made there were prone to being overridden later on in the file.

should we recommend using it over JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS?

Yes for sure, by deprecating JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS we should be communicating that but we can be more explicit about that if you think that people are adding additional code to JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS instead of JUPYTERHUB_CONFIG_OVERRIDE.

Will the effect be equivalent

It should be unless they're using that to set a variable that is interpreted differently later on in the code. But I would definitely consider that a hack since the rest of the code could change later on and break any customizations done there. It's much safer to add in any custom overrides at the end of the configuration so that you know for sure that your changes are applied last.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I guess let's see if @tlvu was using it in such way.
If not, I am fine with having limited/no details about unintended usage to keep the changes concise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlvu do you have a situation where you're using JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS like this?

is deprecated.
- why?: this variable could be used to insert any python code into the middle of the
`jupyterhub_config.py.template` file. This makes it too easy to accidentally insert code that breaks
the configuration settings later on. We should avoid code insertion like this whenever possible.
- new method: by default `DockerSpawner.allowed_images` will be set based on the values of
`JUPYTERHUB_IMAGE_SELECTION_NAMES` and `JUPYTERHUB_DOCKER_NOTEBOOK_IMAGES`. If more than one image
is specified by those variables then the user will be able to select which one they want. If you
want to specify images other than those in the default, those can be set using the `JUPYTERHUB_ALLOWED_IMAGES`
which is a yaml or JSON mapping of image names to jupyterlab docker image tags.
- using the `JUPYTERHUB_ADMIN_USERS` variable to set the `DockerSpawner.admin_users` variable is deprecated.
- why?: this also executes arbitrary code (like above). Also, jupyterhub encourages assigning admin permissions
based on
[roles](https://jupyterhub.readthedocs.io/en/latest/tutorial/getting-started/authenticators-users-basics.html#configure-admins-admin-users),
not by assigning them to the `admin_users` attribute. This makes it possible to easily revoke admin
permissions as well and does not require us to restart Jupyterhub to do so.
- new method: a new Magpie group is created. Its name is determined by the `JUPYTERHUB_ADMIN_GROUP_NAME`
variable (default is "jupyterhub-admin"). Add users to this group in Magpie in order to give them admin permissions on Jupyterhub.
- using lowercase constants in `JUPYTERHUB_CONFIG_OVERRIDE` is deprecated.
- why?: [PEP8 recommends](https://peps.python.org/pep-0008/#constants) constants be written with capitals
with underscores separating them.
- new method: all the constants that were previously named with lowercase have an uppercase equivalent
in `constants.py`. For example: `notebook_dir == constants.NOTEBOOK_DIR`. The uppercase version is
preferred.

Note: if your deployment is still using the `deprecated-components/catalog` component. You may want to manually set the
following in `JUPYTERHUB_CONFIG_OVERRIDE` since the automatic addition of the the `CATALOG_USERNAME` to the `blocked_users`
set is no longer part of the default settings (since the `deprecated-components/catalog` component has been deprecated for
a long time):

```python
c.MagpieAuthenticator.blocked_users.add("${CATALOG_USERNAME}")
```

[2.21.0](https://github.com/bird-house/birdhouse-deploy/tree/2.21.0) (2026-01-27)
------------------------------------------------------------------------------------------------------------------

Expand Down
5 changes: 3 additions & 2 deletions birdhouse/components/jupyterhub/.gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,10 @@ custom_templates/login.html
jupyterhub_config.py
config/proxy/conf.extra-service.d/jupyterhub.conf
config/canarie-api/canarie_api_monitoring.py
config/magpie/providers.cfg
config/magpie/config.yml
service-config.json

# Old paths. Keep these so that old config files remain uncommittable after updates.
jupyterhub_canarie_api_monitoring.py
config/proxy/canarie_api_monitoring.py
config/proxy/canarie_api_monitoring.py
config/magpie/providers.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,15 @@ providers:
c4i: false
type: api
sync_type: api

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, this rename from .cfg to .yml is just to clarify the file format or there are other reasons for the rename?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magpie doc used to have permissions.cfg and providers.cfg separately (still allowed and used by birdhouse-deploy), but later introduced config.yml that allows defining everything together since some users/groups permissions are often relevant along the specific services (providers) that get defined.

YAML was used at that point to reflect the actual data structure.

You can actually use many extensions interchangeably. Magpie doesn't care as long as it loads as JSON-like. https://pavics-magpie.readthedocs.io/en/latest/configuration.html#configuration-file-formats

groups:
- name: ${JUPYTERHUB_ADMIN_GROUP_NAME}
description: Members of this group are given admin permissions on jupyterhub
discoverable: false

permissions:
- service: jupyterhub
resource: /
permission: read
group: ${JUPYTERHUB_ADMIN_GROUP_NAME}
action: create
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
services:
magpie:
volumes:
- ./components/jupyterhub/config/magpie/providers.cfg:${MAGPIE_PROVIDERS_CONFIG_PATH}/jupyter.cfg:ro
- ./components/jupyterhub/config/magpie/config.yml:${MAGPIE_PROVIDERS_CONFIG_PATH}/jupyter.yml:ro
- ./components/jupyterhub/config/magpie/config.yml:${MAGPIE_PERMISSIONS_CONFIG_PATH}/jupyter.yml:ro
22 changes: 11 additions & 11 deletions birdhouse/components/jupyterhub/default.env
Original file line number Diff line number Diff line change
Expand Up @@ -69,8 +69,13 @@ export JUPYTERHUB_CRYPT_KEY=
# JUPYTERHUB_CRYPT_KEY is set.
export JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE=60

# Usernames that should be given admin access in jupyterhub
export JUPYTERHUB_ADMIN_USERS='{\"${MAGPIE_ADMIN_USERNAME}\"}' # python set syntax
# Usernames that should be given admin access in jupyterhub
# This option is DEPRECATED (assign jupyterhub admin users to magpie group named JUPYTERHUB_ADMIN_GROUP_NAME instead)
#export JUPYTERHUB_ADMIN_USERS='{\"${MAGPIE_ADMIN_USERNAME}\"}' # python set syntax

# Users that belong to this group will have admin permissions when they log in to jupyterhub.
# By setting this variable this group will be created if it doesn't exist
export JUPYTERHUB_ADMIN_GROUP_NAME=jupyterhub-admin

# See description in env.local.example for details
export JUPYTERHUB_RESOURCE_LIMITS=
Expand All @@ -86,13 +91,15 @@ export DELAYED_EVAL="
# add any new variables not already in 'VARS' or 'OPTIONAL_VARS' that must be replaced in templates here
VARS="
$VARS
\$JUPYTERHUB_ADMIN_USERS
\$JUPYTERHUB_ADMIN_GROUP_NAME
\$JUPYTER_DEMO_USER
\$JUPYTERHUB_USER_DATA_DIR
"

OPTIONAL_VARS="
$OPTIONAL_VARS
\$JUPYTERHUB_ADMIN_USERS
\$JUPYTERHUB_ENABLE_MULTI_NOTEBOOKS
\$JUPYTER_DEMO_USER
\$JUPYTER_LOGIN_BANNER_TOP_SECTION
\$JUPYTER_LOGIN_BANNER_BOTTOM_SECTION
\$JUPYTER_LOGIN_TERMS_URL
Expand All @@ -101,13 +108,6 @@ OPTIONAL_VARS="
\$JUPYTERHUB_VERSION
\$JUPYTERHUB_IMAGE
\$JUPYTERHUB_IMAGE_URI
\$JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL
\$JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE
\$JUPYTER_IDLE_SERVER_CULL_TIMEOUT
\$JUPYTER_IDLE_KERNEL_CULL_TIMEOUT
\$JUPYTER_IDLE_KERNEL_CULL_INTERVAL
\$JUPYTERHUB_USER_DATA_DIR
\$JUPYTERHUB_RESOURCE_LIMITS
"

# add any component that this component requires to run
Expand Down
14 changes: 13 additions & 1 deletion birdhouse/components/jupyterhub/docker-compose-extra.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ services:
DOCKER_NETWORK_NAME: jupyterhub_network
JUPYTERHUB_USER_DATA_DIR: ${JUPYTERHUB_USER_DATA_DIR}
WORKSPACE_DIR: ${JUPYTERHUB_USER_DATA_DIR}
JUPYTERHUB_ADMIN_USERS: ${JUPYTERHUB_ADMIN_USERS}
JUPYTERHUB_ADMIN_USERS: ${JUPYTERHUB_ADMIN_USERS} # DEPRECATED: see default.env for details
JUPYTER_DEMO_USER: ${JUPYTER_DEMO_USER}
JUPYTER_DEMO_USER_MEM_LIMIT: ${JUPYTER_DEMO_USER_MEM_LIMIT}
JUPYTER_DEMO_USER_CPU_LIMIT: ${JUPYTER_DEMO_USER_CPU_LIMIT}
Expand All @@ -20,7 +20,19 @@ services:
USER_WORKSPACE_GID: ${USER_WORKSPACE_GID}
JUPYTERHUB_CRYPT_KEY: ${JUPYTERHUB_CRYPT_KEY}
JUPYTERHUB_DOCKER_EXTRA_HOSTS: ${JUPYTERHUB_DOCKER_EXTRA_HOSTS:-}
JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL: ${JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL:-}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the new env vars here redefine a default value different than in default.env, is there a reason? And some do not redefine default value here, is there a reason to break the consistency?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example this JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL:${JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL:-} default here is empty string but the default in default.env is

export JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL='http://twitcher:8000/ows/verify/jupyterhub'

Then example JUPYTERHUB_ADMIN_GROUP_NAME: ${JUPYTERHUB_ADMIN_GROUP_NAME} do not redefine a default here.

So the consistency question is 2 folds. Why sometime we redefine the default and sometime we don't? And then when we do redefine the default, why it is not the same value as in default.env?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.
The default is still what is in default.env but docker compose will log a warning if an environment variable is missing in a docker-compose.yml file.
One way to tell docker compose that this variable is allowed to be empty is to use the :- syntax. That way it will not log the warning if this variable happens to be empty.

JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE: ${JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE:-}
JUPYTERHUB_ADMIN_GROUP_NAME: ${JUPYTERHUB_ADMIN_GROUP_NAME}
BIRDHOUSE_FQDN_PUBLIC: ${BIRDHOUSE_FQDN_PUBLIC}
BIRDHOUSE_PROXY_SCHEME: ${BIRDHOUSE_PROXY_SCHEME}
JUPYTER_IDLE_SERVER_CULL_TIMEOUT: ${JUPYTER_IDLE_SERVER_CULL_TIMEOUT:-}
JUPYTER_IDLE_KERNEL_CULL_TIMEOUT: ${JUPYTER_IDLE_KERNEL_CULL_TIMEOUT:-}
JUPYTER_IDLE_KERNEL_CULL_INTERVAL: ${JUPYTER_IDLE_KERNEL_CULL_INTERVAL:-}
JUPYTERHUB_ALLOWED_IMAGES: ${JUPYTERHUB_ALLOWED_IMAGES:-}
JUPYTERHUB_RESOURCE_LIMITS: ${JUPYTERHUB_RESOURCE_LIMITS:-}
PYTHONPATH: /srv/jupyterhub/jupyterhub_custom
volumes:
- ./components/jupyterhub/jupyterhub_custom:/srv/jupyterhub/jupyterhub_custom:ro
- ./components/jupyterhub/jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro
- ./components/jupyterhub/custom_templates:/custom_templates:ro
- ${JUPYTERHUB_USER_DATA_DIR}:${JUPYTERHUB_USER_DATA_DIR}
Expand Down
Loading