-
Notifications
You must be signed in to change notification settings - Fork 5.5k
feat: Add access control for procedures #26803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: Add access control for procedures #26803
Conversation
Reviewer's GuideIntroduces EXECUTE-level access control for procedures (including Iceberg distributed procedures), wires Iceberg to a configurable security system (allow-all or file-based), and enforces additional checks such as table INSERT/DELETE permissions and row filter/column mask constraints while updating SPI/System/connector access control surfaces and tests. Sequence diagram for procedure CALL access controlsequenceDiagram
actor User
participant Client
participant Coordinator
participant StatementAnalyzer
participant CallTask
participant AccessControlManager as AccessControl
participant SystemAC as SystemAccessControl
participant ConnectorAC as ConnectorAccessControl
User->>Client: submit CALL schema.proc(...)
Client->>Coordinator: send query
Coordinator->>StatementAnalyzer: analyze Call
StatementAnalyzer->>AccessControlManager: checkCanCallProcedure(txnId, identity, ctx, procedureName)
AccessControlManager->>SystemAC: checkCanCallProcedure(identity, ctx, catalogSchemaTableName)
SystemAC-->>AccessControlManager: allow or deny (system-level)
alt catalog has ConnectorAccessControl
AccessControlManager->>ConnectorAC: checkCanCallProcedure(connectorTxn, connectorIdentity, ctx, schemaTableName)
ConnectorAC-->>AccessControlManager: allow or deny (connector-level)
end
AccessControlManager-->>StatementAnalyzer: result
StatementAnalyzer-->>Coordinator: analysis (distributed or regular)
note over StatementAnalyzer,AccessControlManager: For TABLE_DATA_REWRITE also register INSERT and DELETE checks
Coordinator->>CallTask: execute Call
CallTask->>AccessControlManager: checkCanCallProcedure(txnId, identity, ctx, procedureName)
AccessControlManager-->>CallTask: result
CallTask-->>Coordinator: invoke connector procedure
Coordinator-->>Client: return result
Class diagram for new procedure access control structuresclassDiagram
class AccessControl {
<<interface>>
+checkCanSelectFromColumns(transactionId, identity, context, tableName, columnOrSubfieldNames)
+checkCanCallProcedure(transactionId, identity, context, procedureName)
}
class SystemAccessControl {
<<interface>>
+checkCanSelectFromColumns(identity, context, table, columns)
+checkCanCallProcedure(identity, context, procedure)
}
class ConnectorAccessControl {
<<interface>>
+checkCanSelectFromColumns(transactionHandle, identity, context, tableName, columnOrSubfieldNames)
+checkCanCallProcedure(transactionHandle, identity, context, procedureName)
}
class AccessDeniedException {
+denySelectColumns(tableName, columnNames)
+denyCallProcedure(procedureName)
+denyCallProcedure(procedureName, extraInfo)
}
class AccessControlManager {
+checkCanCallProcedure(transactionId, identity, context, procedureName)
}
class StatsRecordingSystemAccessControl {
-delegate
-stats
+checkCanCallProcedure(identity, context, procedure)
}
class StatsRecordingSystemAccessControl_Stats {
+checkCanCallProcedure:SystemAccessControlStats
+getCheckCanCallProcedure()
}
class FileBasedAccessControl {
-schemaRules
-tableRules
-sessionPropertyRules
-procedureRules
+checkCanCallProcedure(transactionHandle, identity, context, procedureName)
-checkProcedurePermission(identity, procedureName, requiredPrivileges)
-isDatabaseOwner(identity, schemaName)
}
class ProcedureAccessControlRule {
-privileges:Set~ProcedurePrivilege~
-userRegex:Optional~Pattern~
-schemaRegex:Optional~Pattern~
-procedureRegex:Optional~Pattern~
+ProcedureAccessControlRule(privileges, userRegex, schemaRegex, procedureRegex)
+match(user, procedure):Optional~Set~ProcedurePrivilege~~
}
class ProcedurePrivilege {
<<enumeration>>
EXECUTE
}
class AccessControlRules {
-schemaRules:List~SchemaAccessControlRule~
-tableRules:List~TableAccessControlRule~
-sessionPropertyRules:List~SessionPropertyAccessControlRule~
-procedureRules:List~ProcedureAccessControlRule~
+getSchemaRules()
+getTableRules()
+getSessionPropertyRules()
+getProcedureRules()
}
class FileBasedSystemAccessControl {
+checkCanCallProcedure(identity, context, procedure)
}
class AllowAllSystemAccessControl {
+checkCanCallProcedure(identity, context, procedure)
}
class ReadOnlySystemAccessControl {
+checkCanCallProcedure(identity, context, procedure)
}
class DenyQueryIntegrityCheckSystemAccessControl {
+checkCanCallProcedure(identity, context, procedure)
}
class AllowAllAccessControl_plugin {
<<connector access control>>
+checkCanCallProcedure(transactionHandle, identity, context, procedureName)
}
class ReadOnlyAccessControl_plugin {
<<connector access control>>
+checkCanCallProcedure(transactionHandle, identity, context, procedureName)
}
class DenyAllAccessControl_spi {
+checkCanCallProcedure(transactionId, identity, context, procedureName)
}
class AllowAllAccessControl_spi {
+checkCanCallProcedure(transactionId, identity, context, procedureName)
}
class ViewAccessControl {
+checkCanCallProcedure(transactionId, identity, context, procedureName)
}
class SystemTableAwareAccessControl {
-delegate:ConnectorAccessControl
+checkCanCallProcedure(transactionHandle, identity, context, procedureName)
}
AccessControl <|.. AccessControlManager
SystemAccessControl <|.. FileBasedSystemAccessControl
SystemAccessControl <|.. AllowAllSystemAccessControl
SystemAccessControl <|.. ReadOnlySystemAccessControl
SystemAccessControl <|.. DenyQueryIntegrityCheckSystemAccessControl
SystemAccessControl <|.. StatsRecordingSystemAccessControl
ConnectorAccessControl <|.. FileBasedAccessControl
ConnectorAccessControl <|.. AllowAllAccessControl_plugin
ConnectorAccessControl <|.. ReadOnlyAccessControl_plugin
ConnectorAccessControl <|.. SystemTableAwareAccessControl
AccessControlRules o-- ProcedureAccessControlRule
FileBasedAccessControl o-- ProcedureAccessControlRule
ProcedureAccessControlRule ..> ProcedurePrivilege
StatsRecordingSystemAccessControl *-- StatsRecordingSystemAccessControl_Stats
AccessControlManager ..> SystemAccessControl
AccessControlManager ..> ConnectorAccessControl
AccessControlManager ..> AccessDeniedException
Class diagram for RewriteWriterTarget access checksclassDiagram
class RewriteWriterTarget {
-metadata:Metadata
-accessControl:AccessControl
+RewriteWriterTarget(metadata, accessControl)
+optimize(plan, session, types, variableAllocator, idAllocator, warningCollector):PlanOptimizerResult
}
class RewriteWriterTarget_Rewriter {
-session:Session
-metadata:Metadata
-accessControl:AccessControl
-planChanged:boolean
+Rewriter(session, metadata, accessControl)
+visitCallDistributedProcedure(node, context):PlanNode
+isPlanChanged():boolean
-findTableHandleForCallDistributedProcedure(startNode):Optional~TableHandle~
-hasFullDataAccessControl(tableHandle):boolean
}
class Metadata {
+getTableMetadata(session, tableHandle):TableMetadata
+getColumnHandles(session, tableHandle):Map~String,ColumnHandle~
+getColumnMetadata(session, tableHandle, columnHandle):ColumnMetadata
}
class AccessControl {
+getRowFilters(transactionId, identity, context, table):List~ViewExpression~
+getColumnMasks(transactionId, identity, context, table, columns):Map~ColumnMetadata,ViewExpression~
}
class TableMetadata {
+getConnectorId():ConnectorId
+getTable():SchemaTableName
}
class Session {
+getTransactionId():Optional~TransactionId~
+getRequiredTransactionId():TransactionId
+getIdentity():Identity
+getAccessControlContext():AccessControlContext
}
class StatementAnalyzer {
+visitCall(call, scope):Scope
}
class AccessControlInfoForTable {
+AccessControlInfoForTable(accessControl, identity, transactionId, context, tableName)
}
class Analysis {
+addAccessControlCheckForTable(operation, accessControlInfoForTable)
}
RewriteWriterTarget *-- RewriteWriterTarget_Rewriter
RewriteWriterTarget_Rewriter ..> Metadata
RewriteWriterTarget_Rewriter ..> AccessControl
RewriteWriterTarget_Rewriter ..> Session
RewriteWriterTarget_Rewriter ..> TableMetadata
StatementAnalyzer ..> AccessControl
StatementAnalyzer ..> Analysis
Analysis o-- AccessControlInfoForTable
AccessControlInfoForTable ..> AccessControl
AccessControlInfoForTable ..> Session
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
06b6b85 to
04be7a6
Compare
…igurable Iceberg catalog configuration file now supports the ``iceberg.security`` property with two valid values: ``allow-all`` and ``file``. The default value is ``allow-all``, which preserves the original behavior.
04be7a6 to
db67a29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 7 issues, and left some high level feedback:
- In
StatsRecordingSystemAccessControl.checkCanCallProcedure, the failure and duration are recorded oncheckCanAddColumninstead of the newcheckCanCallProcedurestat, so the metrics for procedure calls will be misreported; update both therecordFailure()andrecord(...)calls to usestats.checkCanCallProcedure. - In
ProcedureAccessControlRule's constructor, theschemaRegexnull-check uses the message "sourceRegex is null", which looks like a copy‑paste error and should be corrected to referenceschemaRegexfor clearer diagnostics. - In
TestIcebergFileBasedSecurity.testCallDistributedProceduresWithInsertDeletePermission, the expected error messages hardcodeschema.test_rewrite_tableinstead of using theschemavariable, which makes the test brittle if the schema changes; consider constructing the fully qualified table name fromschemaandtableNamein the assertions.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `StatsRecordingSystemAccessControl.checkCanCallProcedure`, the failure and duration are recorded on `checkCanAddColumn` instead of the new `checkCanCallProcedure` stat, so the metrics for procedure calls will be misreported; update both the `recordFailure()` and `record(...)` calls to use `stats.checkCanCallProcedure`.
- In `ProcedureAccessControlRule`'s constructor, the `schemaRegex` null-check uses the message "sourceRegex is null", which looks like a copy‑paste error and should be corrected to reference `schemaRegex` for clearer diagnostics.
- In `TestIcebergFileBasedSecurity.testCallDistributedProceduresWithInsertDeletePermission`, the expected error messages hardcode `schema.test_rewrite_table` instead of using the `schema` variable, which makes the test brittle if the schema changes; consider constructing the fully qualified table name from `schema` and `tableName` in the assertions.
## Individual Comments
### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/RewriteWriterTarget.java:187-196` </location>
<code_context>
+ private boolean hasFullDataAccessControl(TableHandle tableHandle)
</code_context>
<issue_to_address>
**issue (bug_risk):** Use a required transaction ID instead of `session.getTransactionId().get()` to avoid Optional misuse
In `hasFullDataAccessControl`, `getRowFilters` and `getColumnMasks` currently call `session.getTransactionId().get()`. If no transaction ID is present (e.g. some autocommit/non-transactional flows), this will throw `NoSuchElementException` instead of an access-control error. Please use `session.getRequiredTransactionId()` here (optionally passing it into the rewriter) to match the rest of the access-control code and avoid this failure mode.
</issue_to_address>
### Comment 2
<location> `presto-main-base/src/main/java/com/facebook/presto/security/StatsRecordingSystemAccessControl.java:498-507` </location>
<code_context>
}
+ @Override
+ public void checkCanCallProcedure(Identity identity, AccessControlContext context, CatalogSchemaTableName procedure)
+ {
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** checkCanCallProcedure is recording stats against the wrong counter (checkCanAddColumn)
In `checkCanCallProcedure`, the catch and finally blocks still use `stats.checkCanAddColumn` instead of the new `stats.checkCanCallProcedure`. This will misclassify procedure calls as "add column" operations and leave the procedure-specific metric unused. Please update those calls to `stats.checkCanCallProcedure.recordFailure()` and `stats.checkCanCallProcedure.record(duration)` to keep metrics accurate.
</issue_to_address>
### Comment 3
<location> `presto-plugin-toolkit/src/main/java/com/facebook/presto/plugin/base/security/ProcedureAccessControlRule.java:43` </location>
<code_context>
+ public ProcedureAccessControlRule(
+ @JsonProperty("privileges") Set<ProcedurePrivilege> privileges,
+ @JsonProperty("user") Optional<Pattern> userRegex,
+ @JsonProperty("schema") Optional<Pattern> schemaRegex,
+ @JsonProperty("procedure") Optional<Pattern> procedureRegex)
+ {
+ this.privileges = ImmutableSet.copyOf(requireNonNull(privileges, "privileges is null"));
+ this.userRegex = requireNonNull(userRegex, "userRegex is null");
+ this.schemaRegex = requireNonNull(schemaRegex, "sourceRegex is null");
+ this.procedureRegex = requireNonNull(procedureRegex, "procedureRegex is null");
+ }
</code_context>
<issue_to_address>
**nitpick (typo):** Constructor error message refers to `sourceRegex` instead of `schemaRegex`
`this.schemaRegex` is initialized with `requireNonNull(schemaRegex, "sourceRegex is null")`. For consistency and clarity, the message should reference `schemaRegex` instead.
```suggestion
this.schemaRegex = requireNonNull(schemaRegex, "schemaRegex is null");
```
</issue_to_address>
### Comment 4
<location> `presto-hive/src/test/java/com/facebook/presto/hive/TestHiveFileBasedSecurity.java:89` </location>
<code_context>
+ Session joe = Session.builder(getSession("joe"))
+ .setConnectionProperty(new ConnectorId("hive"), USE_LIST_DIRECTORY_CACHE, "true")
+ .build();
+ assertDenied(() -> queryRunner.execute(joe, "call hive.system.invalidate_directory_list_cache()"));
+ }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the negative assertion to check for access-denied semantics rather than any RuntimeException.
`assertDenied` currently passes on any `RuntimeException`, so this test would also succeed for unrelated failures (e.g., syntax errors or catalog misconfiguration). To better validate access control, tighten the expectation by asserting the specific access-denied exception type and/or checking that the exception message clearly indicates an access-denied condition for this procedure. That way the test only passes when the failure is truly due to access control.
Suggested implementation:
```java
Session bob = Session.builder(getSession("bob"))
.setConnectionProperty(new ConnectorId("hive"), USE_LIST_DIRECTORY_CACHE, "true")
.build();
queryRunner.execute(bob, "call hive.system.invalidate_directory_list_cache()");
Session joe = Session.builder(getSession("joe"))
.setConnectionProperty(new ConnectorId("hive"), USE_LIST_DIRECTORY_CACHE, "true")
.build();
try {
queryRunner.execute(joe, "call hive.system.invalidate_directory_list_cache()");
fail("Expected access to be denied for joe when calling invalidate_directory_list_cache");
}
catch (AccessDeniedException e) {
assertTrue(
e.getMessage() != null && e.getMessage().contains("invalidate_directory_list_cache"),
"AccessDeniedException message should indicate invalidate_directory_list_cache was denied, but was: " + e.getMessage());
}
```
```java
import com.facebook.presto.Session;
import com.facebook.presto.spi.ConnectorId;
import com.facebook.presto.spi.security.AccessDeniedException;
import com.facebook.presto.spi.security.Identity;
import com.facebook.presto.testing.QueryRunner;
import com.google.common.collect.ImmutableList;
import java.util.Optional;
```
```java
import static com.facebook.presto.hive.HiveQueryRunner.createQueryRunner;
import static com.facebook.presto.hive.HiveSessionProperties.USE_LIST_DIRECTORY_CACHE;
import static com.facebook.presto.testing.TestingSession.testSessionBuilder;
import static org.testng.Assert.assertTrue;
import static org.testng.Assert.fail;
```
</issue_to_address>
### Comment 5
<location> `presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergFileBasedSecurity.java:124-125` </location>
<code_context>
+ assertUpdate(icebergAdmin, format("call system.rewrite_data_files('%s', '%s')", schema, tableName), 2);
+ // `alice` and `bob` have the permission to execute `iceberg.system.rewrite_data_files`,
+ // but they lack the necessary permission to perform INSERT or DELETE on the target table
+ assertDenied(() -> assertUpdate(alice, format("call system.rewrite_data_files('%s', '%s')", schema, tableName)),
+ "Access Denied: Cannot delete from table schema.test_rewrite_table");
+ assertDenied(() -> assertUpdate(bob, format("call system.rewrite_data_files('%s', '%s')", schema, tableName)),
+ "Access Denied: Cannot insert into table schema.test_rewrite_table");
</code_context>
<issue_to_address>
**issue (testing):** Make the expected error messages resilient to catalog/schema naming and align them with the actual object name formatting.
These assertions hard-code `schema.test_rewrite_table`, while the actual error messages come from `QualifiedObjectName` (often including the catalog, e.g., `iceberg.<schema>.test_rewrite_table`) and use `getSession().getSchema()` instead of the literal `schema`.
To avoid brittle tests, either build the expected message with `format` using the actual schema (and catalog, if present), or match only the stable portion via regex, e.g.:
- `"Access Denied: Cannot delete from table .*\\." + tableName`
- `"Access Denied: Cannot insert into table .*\\." + tableName`
This will keep the tests stable if catalog/schema naming or formatting changes.
</issue_to_address>
### Comment 6
<location> `presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergFileBasedSecurity.java:168-174` </location>
<code_context>
+ // perform INSERT/DELETE operations on the target table involved in the procedure
+ assertUpdate(icebergAdmin, format("call system.rewrite_data_files('%s', '%s')", schema, tableName), 2);
+
+ assertions.executeExclusively(() -> {
+ accessControl.reset();
+ accessControl.rowFilter(
+ new QualifiedObjectName("iceberg", schema, tableName),
+ "iceberg",
+ new ViewExpression("iceberg", Optional.empty(), Optional.empty(), "a < 2"));
+ assertions.assertQuery(icebergAdmin, "SELECT count(*) FROM " + tableName, "VALUES BIGINT '1'");
+ assertions.assertFails(icebergAdmin, format("call system.rewrite_data_files('%s', '%s')", schema, tableName),
+ "Access Denied: Full data access is restricted by row filters and column masks");
+ });
</code_context>
<issue_to_address>
**suggestion (testing):** Consider also asserting normal read/masked behavior in the column-mask scenario to fully validate the access-control interaction.
In the row-filter case you both confirm the filter is enforced (via a `SELECT` count) and that `rewrite_data_files` is denied. In the column-mask case you only check the denial. Please also assert that a normal query still works and applies the mask (for example, selecting column `b` and verifying it reads as `'noop'`), so the test fails if masks are accidentally disabled instead of the rewrite being blocked.
```suggestion
assertions.executeExclusively(() -> {
accessControl.reset();
accessControl.columnMask(
new QualifiedObjectName("iceberg", schema, tableName),
"b",
"iceberg",
new ViewExpression("iceberg", Optional.empty(), Optional.empty(), "'noop'"));
// verify that normal reads still work and the column mask is applied
assertions.assertQuery(
icebergAdmin,
"SELECT b FROM " + tableName + " ORDER BY a",
"VALUES 'noop', 'noop'");
assertions.assertFails(
icebergAdmin,
format("call system.rewrite_data_files('%s', '%s')", schema, tableName),
"Access Denied: Full data access is restricted by row filters and column masks");
});
```
</issue_to_address>
### Comment 7
<location> `presto-docs/src/main/sphinx/connector/iceberg.rst:2408-2409` </location>
<code_context>
+================================================== ============================================================
+Property Value Description
+================================================== ============================================================
+``allow-all`` (default value) None authorization checks are enforced, thus allowing all
+ operations.
+
+``file`` Authorization checks are enforced using a config file specified
</code_context>
<issue_to_address>
**issue (typo):** Fix grammar in the `allow-all` description (`None authorization` → `No authorization`).
Also consider rewriting the full sentence to: `No authorization checks are enforced, thus allowing all operations.`
```suggestion
``allow-all`` (default value) No authorization checks are enforced, thus allowing all
operations.
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
...in-base/src/main/java/com/facebook/presto/sql/planner/optimizations/RewriteWriterTarget.java
Show resolved
Hide resolved
...-main-base/src/main/java/com/facebook/presto/security/StatsRecordingSystemAccessControl.java
Show resolved
Hide resolved
...olkit/src/main/java/com/facebook/presto/plugin/base/security/ProcedureAccessControlRule.java
Outdated
Show resolved
Hide resolved
presto-hive/src/test/java/com/facebook/presto/hive/TestHiveFileBasedSecurity.java
Outdated
Show resolved
Hide resolved
|
@sourcery-ai resolve |
Description
This PR implements the following functionality:
Add basic access control check for calling procedures. Procedures are now access control objects at the same level as tables, requiring EXECUTE privilege for invocation.
Make access control in Iceberg configurable and introduces file-based access control into Iceberg. The default remains
allow-all.Add target table's INSERT/DELETE access control checks for
TABLE_DATA_REWRITEdistributed procedure.Add row filters and column masks access control checks for
TABLE_DATA_REWRITEdistributed procedure.To be implemented: Enforce access control for CALL procedures in
RangerBasedAccessControlandSqlStandardAccessControlas well. (current behavior: calls are always allowed, but the necessary permissions on target tables will be checked.)Motivation and Context
See issue: #26680
Impact
allow-allandfile.Test Plan
Contributor checklist
Release Notes