
HIVE-28804: The user does not have the permission for the table hdfs, but can delete the metadata #5975

Open

zxl-333 wants to merge 2 commits into master

Conversation

@zxl-333 commented Jul 15, 2025


What changes were proposed in this pull request?

The user does not have permission on the table's HDFS directory, but can still delete the table's metadata.

Why are the changes needed?

When I create a table as the hdfs user and write data into it, and then drop the table as the hive user, the engine side reports that the drop succeeded. However, the metastore log shows that deleting the HDFS directory failed due to insufficient permissions, while the metadata was deleted anyway. This leaves the table's files behind as orphaned junk data.
2025-03-04 16:44:27,617 | WARN | org.apache.hadoop.hive.metastore.utils.FileUtils | Failed to move to trash: hdfs://myns/warehouse/tablespace/managed/hive/test_drop; Force to delete it.
2025-03-04 16:44:27,621 | ERROR | org.apache.hadoop.hive.metastore.utils.MetaStoreUtils | Got exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=ALL, inode="/warehouse/tablespace/managed/hive/test_drop":hdfs:hadoop:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSubAccess(FSPermissionChecker.java:455)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:356)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1943)
    at org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:105)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3300)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1153)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:725)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:614)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:566)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1116)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1060)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:983)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1890)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2997)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_352]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_352]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_352]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_352]
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) ~[hadoop-common-3.3.3.jar:?]
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) ~[hadoop-common-3.3.3.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1664) ~[hadoop-hdfs-client-3.3.3.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:992) ~[hadoop-hdfs-client-3.3.3.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:989) ~[hadoop-hdfs-client-3.3.3.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.3.3.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:999) ~[hadoop-hdfs-client-3.3.3.jar:?]
    at org.apache.hadoop.hive.metastore.utils.FileUtils.moveToTrash(FileUtils.java:97) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:41) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:363) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:351) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.deleteTableData(HiveMetaStore.java:2586) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2559) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2708) [hive-exec-3.1.2.jar:3.1.2]
    at sun.reflect.GeneratedMethodAccessor238.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_352]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_352]
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.2.jar:3.1.2]
    at com.sun.proxy.$Proxy27.drop_table_with_environment_context(Unknown Source) [?:?]
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15068) [hive-exec-3.1.2.jar:3.1.2]

Does this PR introduce any user-facing change?

No

How was this patch tested?

Use the existing unit tests.

@@ -3047,7 +3047,8 @@ private boolean drop_table_core(final RawStore ms, final String catName, final S
       tableDataShouldBeDeleted = checkTableDataShouldBeDeleted(tbl, deleteData);
       if (tableDataShouldBeDeleted && tbl.getSd().getLocation() != null) {
         tblPath = new Path(tbl.getSd().getLocation());
         if (!wh.isWritable(tblPath.getParent())) {
+          // HIVE-28804 drop table user should have table path and parent path permission
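For reference, here is a minimal standalone sketch of the check the added comment describes: verifying that the dropping user can write both the table path and its parent before touching the metadata. It uses the public Hadoop `FileSystem.access()` API rather than Hive's internal `wh.isWritable()`; the class name, `checkDroppable()`, and the `main()` wiring are illustrative assumptions, not code from this patch.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class DropTablePermissionCheck {

  // Throws AccessControlException (a subclass of IOException) if the current
  // user lacks the permissions that the HDFS delete itself would enforce.
  static void checkDroppable(Configuration conf, Path tblPath) throws IOException {
    FileSystem fs = tblPath.getFileSystem(conf);
    // Unlinking an entry requires WRITE on the parent directory
    // (FSDirDeleteOp in the stack trace above).
    fs.access(tblPath.getParent(), FsAction.WRITE);
    // A recursive delete also checks the directory inode itself; the NameNode
    // then walks the subtree (FSPermissionChecker.checkSubAccess).
    fs.access(tblPath, FsAction.ALL);
  }

  public static void main(String[] args) throws IOException {
    Path tblPath = new Path(args[0]); // e.g. /warehouse/tablespace/managed/hive/test_drop
    checkDroppable(new Configuration(), tblPath);
    System.out.println("current user should be able to drop " + tblPath);
  }
}
```

If either `access()` call fails, the metastore could raise a MetaException instead of dropping the metadata, which is the failure mode this PR argues for.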
Member

Tables in Hive are owned by the hive user. To implement fine-grained access control, consider using Apache Ranger.

Member

Also, I don’t understand why it’s necessary to check the parent directory, when the description says the table directory is owned by the hdfs user, not hive.

Member

Another point: since this is a MANAGED table (/warehouse/tablespace/managed/hive/test_drop), it should be owned by hive, and external attempts to create or load data into it aren't supported.

Author

When Ranger is not managing permissions, the exception above occurs. The check that the parent directory must be writable has always existed. Here, the hive user is merely a test user; it could also be another user such as spark.

Member

@zxl-333 this is not an EXTERNAL table but a MANAGED one! Only the hive user is supposed to have access, no one else!!! Please check the docs for MANAGED table access patterns.

@zxl-333 commented Jul 16, 2025

> oh, ok. so you need to propagate the exception back to the client? why add a check for the parent dir? Btw, since it's a managed table, the 'test_drop' table should have been created by the hive user, not the hdfs user.

> why do you even create a MANAGED table? just use EXTERNAL

Not having the permission should be treated as a failure, rather than deleting the metadata while leaving the HDFS directory behind.

A managed table is not only for the hive user; it can also belong to other users on the cluster. If the parent directory is not writable, the delete cannot complete successfully even when the directory itself is writable. Below is what I tested with HDFS commands.
parent directory:
drwxr-xr-x - hdfs hadoop 0 2025-07-16 17:48 /tmp/test_drop
child directory:
drwxrwxrwx - hdfs hadoop 0 2025-07-16 17:48 /tmp/test_drop/t1
delete the child directory as the hive user:
hdfs dfs -rm -r -f /tmp/test_drop/t1
rm: Failed to move to trash: hdfs://ns1/tmp/test_drop/t1: Permission denied: user=hive, access=WRITE, inode="/tmp/test_drop":hdfs:hadoop:drwxr-xr-x
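The same denial can be shown programmatically. A minimal sketch, assuming it runs as the hive user against the two directories listed above (the class name and hardcoded path are illustrative only):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.AccessControlException;

public class ParentDirDeleteRepro {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // World-writable child under a parent that is not writable by 'hive'.
    Path child = new Path("/tmp/test_drop/t1");
    FileSystem fs = child.getFileSystem(conf);
    try {
      // Recursive delete; unlike `hdfs dfs -rm`, this skips the trash step.
      fs.delete(child, true);
    } catch (AccessControlException e) {
      // Expected when run as 'hive': the unlink needs WRITE on /tmp/test_drop,
      // and the parent's drwxr-xr-x bits deny it regardless of t1's own mode.
      System.out.println("delete refused: " + e.getMessage());
    }
  }
}
```

This mirrors the CLI test: the child's rwxrwxrwx bits are irrelevant, because removing a directory entry is a write to its parent.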

@deniskuzZ commented Jul 16, 2025

The "manage" table is a table governed by Hive. All modifications must be performed through Hive. Any external changes, whether via CLI or other means are not supported !!!
Managed tables can only be created by the hive user.

Author

The "manage" table is a table governed by Hive. All modifications must be performed through Hive. Any external changes, whether via CLI or other means are not supported !!!
Managed tables can only be created by the hive user.

The "manage" table has different users for each database. A cluster cannot have only one user.

Member

An end-user cannot directly modify, create, or delete the table directory of a managed table. Their interaction is with the Hive service, not the file system: they ask Hive to perform the action, and Hive does it on their behalf after checking permissions.
The managed table directory should be owned by the hive user, not hdfs.

Author

It is quite normal for other users to create managed tables as well. Otherwise, every create or drop of a table would have to go through the hive superuser. A cluster can have hundreds of thousands of tables or even more, and contacting the hive superuser for every create or drop is not workable. Offline SQL scripts all contain create and drop statements, and on YARN, jobs are mapped to queues by user name; this ensures that offline SQL does not end up running its create and drop statements as the hive superuser, which would not be a reasonable approach. In particular, table creation often uses expressions like "create table tbl_name as select"; if other users are not allowed to create tables, the entire cluster has to rely solely on the hive superuser.

@deniskuzZ left a comment
see comments
