Bug description
We query the user's login name, UID, and GID from the Active Directory used for authentication, so that files and folders get consistent ownership both inside a Dockerised JupyterHub/JupyterLab and on the host that maps these volumes into the containers:
c.LDAPAuthenticator.auth_state_attributes = ["uid", "uidNumber", "gidNumber"]
This worked fine with the previous version, JupyterHub 4.1, but fails after the update to 5 because Active Directory returns multiple entries for a single user. I don't know whether this happens with every Active Directory combined with Exchange servers or only for users who have set up Exchange ActiveSync (EAS) clients.
How to reproduce
As this issue is caused by an Active Directory with Exchange ActiveSync enabled, it can be reproduced with the MWE at the bottom of the README and does not need Jupyter. Apart from copying my Hub config, I added a logger (import logging; logger = logging.getLogger(__name__); logging.basicConfig(level=logging.DEBUG)) to get more output:
DEBUG:asyncio:Using selector: EpollSelector
DEBUG:traitlets:Attempting to bind cn=ab12cde,ou=People,dc=example,dc=com
DEBUG:traitlets:Successfully bound cn=ab12cde,ou=People,dc=example,dc=com
DEBUG:traitlets:username:ab12cde Using dn cn=ab12cde,ou=People,dc=example,dc=com
ERROR:traitlets:Expected 1 but got 2 search response entries for DN 'cn=ab12cde,ou=People,dc=example,dc=com' when looking up attributes configured via auth_state_attributes. The user's auth state will not include any attributes.
After adding some debugging to print conn.entries in get_user_attributes(), one can see that this is caused by Exchange ActiveSync adding a container beneath the user's entry, so that its DN contains the user's CN:
[DN: CN=ab12cde,ou=People,dc=example,dc=com - STATUS: Read - READ TIME: 2024-12-17T15:56:55.187692
gidNumber: 1000000
uid: ab12cde
uidNumber: 1234567
, DN: CN=ExchangeActiveSyncDevices,CN=ab12cde,ou=People,dc=example,dc=com - STATUS: Read - READ TIME: 2024-12-17T15:56:55.187764
]
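As a rough illustration of the behaviour I would expect, the following sketch picks the one entry whose DN matches the bound user DN and ignores everything beneath it. It is not the authenticator's code: normalize_dn and pick_user_entry are hypothetical helpers, the entries are plain dictionaries standing in for the search response, and the DN comparison is deliberately simplified (case-insensitive, no handling of escaped commas).

```python
def normalize_dn(dn: str) -> str:
    """Lower-case a DN and drop whitespace around the RDN separators.

    Simplified on purpose: escaped commas and other RFC 4514 corner
    cases are not handled.
    """
    return ",".join(part.strip().lower() for part in dn.split(","))


def pick_user_entry(entries, userdn):
    """Return the single entry whose DN equals userdn, or None."""
    wanted = normalize_dn(userdn)
    matches = [e for e in entries if normalize_dn(e["dn"]) == wanted]
    return matches[0] if len(matches) == 1 else None


# Stand-in for the two entries shown in the debug output above.
entries = [
    {
        "dn": "CN=ab12cde,ou=People,dc=example,dc=com",
        "attributes": {"uid": "ab12cde", "uidNumber": 1234567, "gidNumber": 1000000},
    },
    {
        "dn": "CN=ExchangeActiveSyncDevices,CN=ab12cde,ou=People,dc=example,dc=com",
        "attributes": {},
    },
]

# The bound DN uses a lower-case "cn", the returned DN an upper-case "CN".
user = pick_user_entry(entries, "cn=ab12cde,ou=People,dc=example,dc=com")
print(user["attributes"])
```

With this kind of selection, the EAS container would be skipped and the user's uid, uidNumber, and gidNumber would still end up in the auth state.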
I could not find an ldapsearch query that provides information about these containers beneath the users, but they can be queried via PowerShell on a domain controller:
Get-ADObject -Filter "ObjectClass -eq 'msExchActiveSyncDevice' -or ObjectClass -eq 'msExchActiveSyncDevices' -or ObjectClass -eq 'top'" -searchbase "CN=ab12cde,ou=People,dc=example,dc=com" | Format-List
DistinguishedName : CN=ExchangeActiveSyncDevices,CN=ab12cde,ou=People,dc=example,dc=com
Name : ExchangeActiveSyncDevices
ObjectClass : msExchActiveSyncDevices
ObjectGUID : <guid>
DistinguishedName : CN=Android§<deviceid>,CN=ExchangeActiveSyncDevices,CN=ab12cde,ou=People,dc=example,dc=com
Name : Android§<deviceid>
ObjectClass : msExchActiveSyncDevice
ObjectGUID : <guid>
DistinguishedName : CN=TbSync§<deviceid>,CN=ExchangeActiveSyncDevices,CN=ab12cde,ou=People,dc=example,dc=com
Name : TbSync§<deviceid>
ObjectClass : msExchActiveSyncDevice
ObjectGUID : <guid>
Expected behaviour
I would like get_user_attributes() to ignore the EAS-specific containers: they are not useful for authentication and do not provide any additional attributes needed for spawned containers. One option would be to change search_filter to (objectClass=user), another to make it user-configurable (c.LDAPAuthenticator.search_filter?). Perhaps the most specific filter addressing this issue would exclude the two EAS object classes: (&(!(objectClass=msExchActiveSyncDevice))(!(objectClass=msExchActiveSyncDevices))(objectClass=*))
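If the filter became configurable, the exclusion variant could be generated rather than typed by hand. A minimal sketch, assuming the object class names contain no characters that need escaping in an LDAP filter; build_exclusion_filter and EAS_CLASSES are hypothetical names and not part of the authenticator:

```python
def build_exclusion_filter(excluded_classes):
    """Build an LDAP filter matching every entry except the given object classes.

    Assumption: the class names are plain identifiers, so no filter
    escaping is applied.
    """
    negations = "".join(f"(!(objectClass={cls}))" for cls in excluded_classes)
    return f"(&{negations}(objectClass=*))"


# The two EAS object classes seen in the PowerShell output.
EAS_CLASSES = ["msExchActiveSyncDevice", "msExchActiveSyncDevices"]

eas_filter = build_exclusion_filter(EAS_CLASSES)
print(eas_filter)
```

The resulting string is the same filter as the one proposed above, so it could be passed wherever the authenticator currently uses its fixed search filter.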