acls #835

phact · 2026-01-22T20:18:23Z

This pull request implements comprehensive extraction, propagation, and storage of Access Control List (ACL) information for documents ingested from Google Drive, OneDrive, and SharePoint connectors. It introduces connector-specific logic to fetch detailed user and group permissions from each provider's API and ensures that this ACL data is consistently passed through the document processing pipeline and indexed in OpenSearch. This enables fine-grained access control and auditing for ingested documents.

The most important changes are:

Connector-specific ACL Extraction:

Added _extract_google_drive_acl to google_drive/connector.py to fetch and parse user/group permissions from the Google Drive API for each file, and propagate this ACL into ConnectorDocument instances.
Added _extract_onedrive_acl to onedrive/connector.py to retrieve permissions from the Microsoft Graph API for OneDrive items, and use this ACL in document creation.
Added _extract_sharepoint_acl to sharepoint/connector.py to obtain permissions from the Microsoft Graph API for SharePoint files, and use this ACL in document creation.
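
For reviewers, here is a minimal sketch of the kind of parsing a Google Drive helper like this might perform. It is not the code in this PR. It assumes the Drive v3 permission fields `type`, `role`, and `emailAddress`; `DocumentACL` and `parse_drive_permissions` are hypothetical names, and the ACL shape (owner, users, groups) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DocumentACL:
    """Hypothetical ACL container; the PR's real structure may differ."""
    owner: Optional[str] = None
    allowed_users: List[str] = field(default_factory=list)
    allowed_groups: List[str] = field(default_factory=list)


def parse_drive_permissions(permissions: List[Dict[str, str]]) -> DocumentACL:
    """Convert a list of Drive-style permission dicts into a DocumentACL."""
    acl = DocumentACL()
    for perm in permissions:
        email = perm.get("emailAddress")
        if not email:
            # "anyone" and "domain" permissions carry no email address;
            # this sketch skips them rather than guessing a policy.
            continue
        kind = perm.get("type")
        if kind == "user":
            if perm.get("role") == "owner":
                acl.owner = email
            acl.allowed_users.append(email)
        elif kind == "group":
            acl.allowed_groups.append(email)
    # Sort so that the result does not depend on API response order.
    acl.allowed_users.sort()
    acl.allowed_groups.sort()
    return acl
```

How link-shared ("anyone") and domain-wide permissions should map into the ACL is a policy decision that this sketch leaves open.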

Pipeline and Metadata Propagation:

Modified the document processing pipeline (service.py and processors.py) to accept and propagate the acl field from connectors through to chunk indexing, ensuring ACLs are stored with each chunk in OpenSearch.
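
As an illustration of what per-chunk ACL storage could look like, the sketch below copies a document-level ACL onto every chunk body before indexing. `build_chunk_documents` and the field names (`document_id`, `chunk_index`, `allowed_users`, `allowed_groups`) are assumptions for illustration, not the names used in service.py or processors.py.

```python
from typing import Any, Dict, List


def build_chunk_documents(
    document_id: str,
    chunks: List[str],
    acl: Dict[str, List[str]],
) -> List[Dict[str, Any]]:
    """Attach the same document-level ACL to every chunk body."""
    return [
        {
            "document_id": document_id,
            "chunk_index": index,
            "text": text,
            # Copy the lists so chunks do not share mutable state.
            "allowed_users": list(acl.get("allowed_users", [])),
            "allowed_groups": list(acl.get("allowed_groups", [])),
        }
        for index, text in enumerate(chunks)
    ]
```

Duplicating the ACL on each chunk lets a search query filter on permissions without a second lookup, at the cost of rewriting every chunk when the ACL changes.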

Efficient ACL Indexing and Updates:

Refactored _update_connector_metadata in service.py to call a dedicated update_document_acl utility, optimizing ACL updates using hashing to skip unchanged ACLs and updating only when necessary. Other metadata is now updated via a single update_by_query call for efficiency.
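
The hash-based skip can be sketched roughly as follows. This is an assumption about the general technique, not the implementation of update_document_acl: `acl_hash` and `needs_acl_update` are hypothetical helpers, and the choice of SHA-256 over a canonical JSON encoding is illustrative.

```python
import hashlib
import json
from typing import Dict, List, Optional


def acl_hash(acl: Dict[str, List[str]]) -> str:
    """Return a stable digest that ignores key order and list order."""
    canonical = {key: sorted(values) for key, values in acl.items()}
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_acl_update(stored_hash: Optional[str], new_acl: Dict[str, List[str]]) -> bool:
    """True when the ACL differs from what the stored hash represents."""
    return stored_hash != acl_hash(new_acl)
```

Storing the digest alongside the document means a sync run can compare one short string per document and issue a write only when the permissions actually changed.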

These changes collectively provide end-to-end support for extracting, storing, and updating document-level ACLs from external storage providers, improving security and compliance in the document indexing pipeline.

acls

cb4ea7f

edwinjosechittilappilly self-requested a review January 22, 2026 21:07

phact requested a review from lucaseduoli January 26, 2026 17:33

Merge branch 'main' into connector-acls

13edcfc


2 participants
