sso: integrate CERN login
ntarocco committed Sep 26, 2024
1 parent ffa8425 commit 1e0c6f4
Showing 16 changed files with 342 additions and 104 deletions.
116 changes: 105 additions & 11 deletions README.md
@@ -6,24 +6,118 @@

# Invenio-CERN-sync

Integrates CERN databases and login with Invenio.
Integrates CERN databases and SSO login with Invenio.

## Users sync
## SSO login

This module connects to LDAP to fetch users, updates already existing users
and inserts missing ones.
This module provides configurable integration with the CERN SSO login.

To integrate the CERN SSO, add this to your application configuration:

```python
from invenio_cern_sync.sso import cern_remote_app_name, cern_keycloak
OAUTHCLIENT_REMOTE_APPS = {
    cern_remote_app_name: cern_keycloak.remote_app,
}

CERN_APP_CREDENTIALS = {
    "consumer_key": "CHANGE ME",
    "consumer_secret": "CHANGE ME",
}

To get the extra user fields stored in the user profile, set the following:
from invenio_cern_sync.sso.api import confirm_registration_form
OAUTHCLIENT_SIGNUP_FORM = confirm_registration_form

from invenio_cern_sync.users.profile import CERNUserProfileSchema
ACCOUNTS_USER_PROFILE_SCHEMA = CERNUserProfileSchema()
OAUTHCLIENT_CERN_REALM_URL = cern_keycloak.realm_url
OAUTHCLIENT_CERN_USER_INFO_URL = cern_keycloak.user_info_url
OAUTHCLIENT_CERN_VERIFY_EXP = True
OAUTHCLIENT_CERN_VERIFY_AUD = False
OAUTHCLIENT_CERN_USER_INFO_FROM_ENDPOINT = True
```

You can also provide your own schema.
Use the following environment variables to inject the right configuration
for your environment (local, prod, etc.):

- INVENIO_CERN_SYNC_KEYCLOAK_BASE_URL
- INVENIO_SITE_UI_URL
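
As a rough illustration of how these two variables might be consumed, the sketch below reads them with fallbacks and builds the Keycloak logout URL. The helper name `build_logout_url` is hypothetical and not part of the module; the default values and the URL pattern are taken from the pre-configured SSO settings in this commit.

```python
import os
from urllib.parse import quote


def build_logout_url(environ=os.environ):
    """Build the CERN SSO logout URL from environment variables (illustrative helper)."""
    base_url = environ.get(
        "INVENIO_CERN_SYNC_KEYCLOAK_BASE_URL", "https://keycloak-qa.cern.ch/"
    )
    site_ui_url = environ.get("INVENIO_SITE_UI_URL", "https://127.0.0.1")
    # quote() percent-encodes the redirect target; "/" is left untouched by default.
    return "{}auth/realms/cern/protocol/openid-connect/logout?redirect_uri={}".format(
        base_url, quote(site_ui_url)
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check what a given environment would produce before deploying.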

Define
- ACCOUNTS_DEFAULT_USER_VISIBILITY
- ACCOUNTS_DEFAULT_EMAIL_VISIBILITY

## Sync users and groups

You can sync users and groups from the CERN AuthZ service or LDAP
with the local Invenio db.

First, decide what fields you would like to get from the CERN database.
By default, only the fields in `invenio_cern_sync.users.profile.CERNUserProfileSchema`
are kept when syncing.

If you need to customize that, you will need to:

1. Provide your own schema class, and assign it to the config variable `ACCOUNTS_USER_PROFILE_SCHEMA`.
2. Change the mappers to serialize the fetched users from the CERN format to your
   local format. If you are using AuthZ, assign your custom serializer function
   to `CERN_SYNC_AUTHZ_USERPROFILE_MAPPER`.
   If you are using LDAP, assign it to `CERN_SYNC_LDAP_USERPROFILE_MAPPER`.
3. You can also customize what extra data is stored in the `RemoteAccount.extra_data` field
   via the config `CERN_SYNC_AUTHZ_USER_EXTRADATA_MAPPER` or `CERN_SYNC_LDAP_USER_EXTRADATA_MAPPER`.
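
A minimal sketch of a custom AuthZ mapper for step 2, assuming a reduced profile schema that keeps only a handful of fields. The function name `custom_authz_userprofile_mapper` is hypothetical; the input keys are the ones read by the default AuthZ mapper, and the matching schema class from step 1 is not shown here.

```python
def custom_authz_userprofile_mapper(cern_identity):
    """Map a CERN identity dict to a reduced local profile (illustrative only)."""
    return dict(
        affiliations=cern_identity["instituteName"],
        family_name=cern_identity["lastName"],
        given_name=cern_identity["firstName"],
        # optional field: fall back to an empty string when it is missing
        orcid=cern_identity.get("orcid", ""),
        person_id=cern_identity["personId"],
    )


# In the application configuration:
CERN_SYNC_AUTHZ_USERPROFILE_MAPPER = custom_authz_userprofile_mapper
```

The keys of the returned dict would need to match the fields declared in whatever schema is assigned to `ACCOUNTS_USER_PROFILE_SCHEMA`.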

If you are using the CERN SSO as the only login method, you will probably also want to configure:

```python
ACCOUNTS_DEFAULT_USER_VISIBILITY = True
ACCOUNTS_DEFAULT_EMAIL_VISIBILITY = True
```

### AuthZ

In your app, define the following configuration:

```python
CERN_SYNC_KEYCLOAK_BASE_URL = "<url>"
CERN_SYNC_AUTHZ_BASE_URL = "<url>"
```

The `CERN_APP_CREDENTIALS` configuration shown above must already be set.
You will also need to make sure that those credentials are allowed to fetch
the entire CERN database of users and groups.

Then, create a new celery task and sync users:

```python
from invenio_cern_sync.users.sync import sync

def sync_users_task():
    user_ids = sync(method="AuthZ")
    # you can optionally pass extra kwargs for the AuthZ client APIs.

    # make sure that you re-index users if needed. For example, in InvenioRDM:
    # from invenio_users_resources.services.users.tasks import reindex_users
    # reindex_users.delay(user_ids)
```

To fetch groups:

```python
from invenio_cern_sync.groups.sync import sync

def sync_groups_task():
    roles_ids = sync()
```

### LDAP

You can use LDAP instead. Define the LDAP url:

```python
CERN_SYNC_LDAP_URL = "<url>"
```

Then, create a new celery task and sync users:

```python
from invenio_cern_sync.users.sync import sync

def sync_users_task():
    user_ids = sync(method="LDAP")
    # you can optionally pass extra kwargs for the LDAP client APIs.
```
8 changes: 5 additions & 3 deletions invenio_cern_sync/authz/client.py
@@ -45,9 +45,12 @@ class KeycloakService:
     def __init__(self, base_url=None, client_id=None, client_secret=None):
         """Constructor."""
         self.base_url = base_url or current_app.config["CERN_SYNC_KEYCLOAK_BASE_URL"]
-        self.client_id = client_id or current_app.config["CERN_SYNC_KEYCLOAK_CLIENT_ID"]
+        self.client_id = (
+            client_id or current_app.config["CERN_APP_CREDENTIALS"]["consumer_key"]
+        )
         self.client_secret = (
-            client_secret or current_app.config["CERN_SYNC_KEYCLOAK_CLIENT_SECRET"]
+            client_secret
+            or current_app.config["CERN_APP_CREDENTIALS"]["consumer_secret"]
         )

     def get_authz_token(self):
@@ -75,7 +78,6 @@ def get_authz_token(self):
     "cernGroup", # "CA"
     "cernSection", # "IR"
     "instituteName", # "CERN"
-    "instituteAbbreviation", # "CERN"
     "preferredCernLanguage", # "EN"
     "orcid",
     "primaryAccountEmail",
11 changes: 4 additions & 7 deletions invenio_cern_sync/authz/mapper.py
@@ -11,21 +11,18 @@
 def userprofile_mapper(cern_identity):
     """Map the CERN Identity fields to the Invenio user profile schema.
-    :param cern_identity: the identity dict
-    :param profile_schema: the Invenio user profile schema to map to
-    :return: a serialized dict, containing all the keys that will appear in the
-    User.profile JSON column. Any unwanted key should be removed.
-    """
+    The returned dict structure must match the user profile schema defined via
+    the config ACCOUNTS_USER_PROFILE_SCHEMA."""
     return dict(
+        affiliations=cern_identity["instituteName"],
         cern_department=cern_identity["cernDepartment"],
         cern_group=cern_identity["cernGroup"],
         cern_section=cern_identity["cernSection"],
         family_name=cern_identity["lastName"],
         full_name=cern_identity["displayName"],
         given_name=cern_identity["firstName"],
-        institute_abbreviation=cern_identity["instituteAbbreviation"],
-        institute=cern_identity["instituteName"],
         mailbox=cern_identity.get("postOfficeBox", ""),
+        orcid=cern_identity.get("orcid", ""),
         person_id=cern_identity["personId"],
     )
28 changes: 6 additions & 22 deletions invenio_cern_sync/config.py
@@ -13,31 +13,14 @@
 from .ldap.mapper import userprofile_mapper as ldap_userprofile_mapper

 ###################################################################################
-# Required config
-
-CERN_SYNC_KEYCLOAK_CLIENT_ID = ""
-"""Set the unique id/name of the CERN SSO app, also called `consumer_key`.
-This corresponds to the RemoteAccount `client_id` column.
-"""
-
-CERN_SYNC_REMOTE_APP_NAME = None
-"""Set the configured remote (oauth) app name for the CERN login.
-This corresponds to the UserIdentity `method` column.
-"""
-
-###################################################################################
+# CERN AuthZ
 # Required config when using the AuthZ method to sync users, or when syncing groups

-CERN_SYNC_KEYCLOAK_BASE_URL = ""
-"""."""
-
-CERN_SYNC_KEYCLOAK_CLIENT_SECRET = ""
-"""."""
+CERN_SYNC_KEYCLOAK_BASE_URL = "https://keycloak-qa.cern.ch/"
+"""Base URL of the CERN SSO Keycloak endpoint."""

-CERN_SYNC_AUTHZ_BASE_URL = ""
-"""."""
+CERN_SYNC_AUTHZ_BASE_URL = "https://authorization-service-api-qa.web.cern.ch/"
+"""Base URL of the Authorization Service API endpoint."""

 CERN_SYNC_AUTHZ_USERPROFILE_MAPPER = authz_userprofile_mapper
 """Map the AuthZ response to Invenio user profile schema.
@@ -50,6 +33,7 @@


 ###################################################################################
+# CERN LDAP
 # Required config when using the LDAP method to sync users

 CERN_SYNC_LDAP_URL = None
1 change: 0 additions & 1 deletion invenio_cern_sync/ldap/client.py
@@ -18,7 +18,6 @@
     "cernAccountType",
     "cernActiveStatus",
     "cernGroup",
-    "cernInstituteAbbreviation",
     "cernInstituteName",
     "cernSection",
     "cn", # username
10 changes: 3 additions & 7 deletions invenio_cern_sync/ldap/mapper.py
@@ -13,20 +13,16 @@
 def userprofile_mapper(ldap_user):
     """Map the LDAP fields to the Invenio user profile schema.
-    :param ldap_user: the ldap dict
-    :param profile_schema: the Invenio user profile schema to map to
-    :return: a serialized dict, containing all the keys that will appear in the
-    User.profile JSON column. Any unwanted key should be removed.
-    """
+    The returned dict structure must match the user profile schema defined via
+    the config ACCOUNTS_USER_PROFILE_SCHEMA."""
     return dict(
+        affiliations=first_or_default(ldap_user, "cernInstituteName"),
         cern_department=first_or_default(ldap_user, "division"),
         cern_group=first_or_default(ldap_user, "cernGroup"),
         cern_section=first_or_default(ldap_user, "cernSection"),
         family_name=first_or_default(ldap_user, "sn"),
         full_name=first_or_default(ldap_user, "displayName"),
         given_name=first_or_default(ldap_user, "givenName"),
-        institute_abbreviation=first_or_default(ldap_user, "cernInstituteAbbreviation"),
-        institute=first_or_default(ldap_user, "cernInstituteName"),
         mailbox=first_or_default(ldap_user, "postOfficeBox"),
         person_id=first_or_default(ldap_user, "employeeID"),
     )
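
The `first_or_default` helper used by the LDAP mapper is not part of this hunk. A minimal sketch of what such a helper might look like, assuming LDAP attribute values arrive as lists of bytes (as is typical for LDAP client libraries); the name `first_or_default_sketch` and its exact behaviour are assumptions, not the module's implementation.

```python
def first_or_default_sketch(ldap_user, key, default=""):
    """Return the first value of an LDAP attribute as text, or a default."""
    values = ldap_user.get(key) or []
    if not values:
        return default
    first = values[0]
    # decode bytes values; pass str values through unchanged
    return first.decode("utf-8") if isinstance(first, bytes) else first
```

With this shape, a missing or empty attribute maps to an empty string rather than raising, which keeps the mapper tolerant of sparse LDAP entries.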
62 changes: 62 additions & 0 deletions invenio_cern_sync/sso/__init__.py
@@ -0,0 +1,62 @@
# -*- coding: utf-8 -*-
#
# Copyright (C) 2024 CERN.
#
# Invenio-CERN-sync is free software; you can redistribute it and/or modify it under
# the terms of the MIT License; see LICENSE file for more details.

"""Invenio-CERN-sync SSO module."""

###################################################################################
# CERN SSO
# Pre-configured settings for CERN SSO

import os
from urllib.parse import quote

from invenio_oauthclient.contrib.keycloak import KeycloakSettingsHelper

from .api import (
    cern_groups_handler,
    cern_groups_serializer,
    cern_info_handler,
    cern_info_serializer,
    cern_setup_handler,
)

_base_url = os.environ.get(
    "INVENIO_CERN_SYNC_KEYCLOAK_BASE_URL", "https://keycloak-qa.cern.ch/"
)
_site_ui_url = os.environ.get("INVENIO_SITE_UI_URL", "https://127.0.0.1")

cern_remote_app_name = "cern"  # corresponds to the UserIdentity `method` column

cern_keycloak = KeycloakSettingsHelper(
    title="CERN",
    description="CERN SSO authentication",
    base_url=_base_url,
    realm="cern",
    app_key="CERN_APP_CREDENTIALS",  # config key for the app credentials
    logout_url="{}auth/realms/cern/protocol/openid-connect/logout?redirect_uri={}".format(
        _base_url, quote(_site_ui_url)
    ),
)

handlers = cern_keycloak.get_handlers()
handlers["signup_handler"] = {
    **handlers["signup_handler"],
    "info": cern_info_handler,
    "info_serializer": cern_info_serializer,
    "groups_serializer": cern_groups_serializer,
    "groups": cern_groups_handler,
    "setup": cern_setup_handler,
}
rest_handlers = cern_keycloak.get_rest_handlers()
rest_handlers["signup_handler"] = {
    **rest_handlers["signup_handler"],
    "info": cern_info_handler,
    "info_serializer": cern_info_serializer,
    "groups_serializer": cern_groups_serializer,
    "groups": cern_groups_handler,
    "setup": cern_setup_handler,
}