HIVE-20189: Separate metastore client code into its own module #5924
deniskuzZ merged 4 commits into apache:master
<dependencies>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-standalone-metastore-common</artifactId>
The new module depends on metastore-common, so it doesn't reduce the size of metastore-common or the dependencies it pulls in. I'm not sure that's OK. How will users benefit from the new module?
It's expected for a client module to depend on a common module. The client module delivers core client functionality and can be enhanced later with features like caching.
Offering a 'client' JAR aligns with common conventions and is expected to be user-friendly.
ATM we have 2 distinct cache wrappers in different Hive modules, just because there was no structure:
- org.apache.hadoop.hive.metastore.HiveClientCache
- org.apache.iceberg.hive.CachedClientPool
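To illustrate the kind of feature a dedicated client module could host once, instead of in per-module copies, here is a minimal sketch of a caching wrapper. It assumes clients can be keyed by metastore URI; `CachingClientFactory` and its methods are hypothetical names for illustration, not Hive or Iceberg APIs, and eviction and client shutdown are deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: a single shared cache wrapper keyed by metastore URI. */
final class CachingClientFactory<C> {
  // One cached client per URI; ConcurrentHashMap makes lookups thread-safe.
  private final Map<String, C> clients = new ConcurrentHashMap<>();
  // Creates a new client for a URI (assumed to be supplied by the caller).
  private final Function<String, C> connector;

  CachingClientFactory(Function<String, C> connector) {
    this.connector = connector;
  }

  /** Returns the cached client for the URI, creating it on first use. */
  C get(String metastoreUri) {
    return clients.computeIfAbsent(metastoreUri, connector);
  }

  /** Number of distinct URIs that currently have a cached client. */
  int size() {
    return clients.size();
  }
}
```

A real wrapper would also need expiry and would have to close evicted clients, which is where the two existing implementations may legitimately differ.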
Another point I don't understand: why do the ql and beeline modules depend on metastore-server? I think we should drop the dependency on the server and move the required classes into metastore-common or metastore-client.
I basically agree that we can potentially improve the structure in the future. We will then have clear guidelines about how to organize files:
- metastore-client: Client-specific files, which Server doesn't need
- metastore-server: Server-specific files, which Client doesn't need
- metastore-common: Common modules
> Another point I don't understand: why do the ql and beeline modules depend on metastore-server? I think we should drop the dependency on the server and move the required classes into metastore-common or metastore-client.
I guess ql requires metastore-server to use an embedded HMS.
Isn't the embedded HMS only used in tests?
No, it isn't. It can also be used when metastore.thrift.uris is empty. For example, our Hive Docker image (not the HMS Docker image) can set up a HiveServer2 without a separate HMS. It probably uses the embedded mode, which runs HMS-equivalent threads inside HS2.
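The rule described here, embedded mode when `metastore.thrift.uris` is empty, can be sketched as follows. This is an illustration only: `MetastoreMode` and `isEmbedded` are hypothetical names, the configuration is modeled with `java.util.Properties` rather than Hive's own configuration classes, and treating a missing or whitespace-only value as empty is an assumption.

```java
import java.util.Properties;

/** Hypothetical helper that mirrors the "empty URIs means embedded" rule. */
final class MetastoreMode {
  // Property name taken from the discussion above.
  static final String THRIFT_URIS = "metastore.thrift.uris";

  /** Returns true when no remote metastore URI is configured. */
  static boolean isEmbedded(Properties conf) {
    String uris = conf.getProperty(THRIFT_URIS);
    // Assumption: unset and blank values both count as "empty".
    return uris == null || uris.trim().isEmpty();
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(isEmbedded(conf) ? "embedded" : "remote");

    // Example URI; the host name is made up.
    conf.setProperty(THRIFT_URIS, "thrift://metastore-host:9083");
    System.out.println(isEmbedded(conf) ? "embedded" : "remote");
  }
}
```

In embedded mode the process needs the server-side classes on its classpath, which would explain why ql depends on metastore-server.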
Alternatively, we can say it is for testing purposes and separate the classes from hive-exec.
5461f07 to 24ca325
959dfa7 to bc40344
    Configuration conf, String userName) throws Exception {
  Token<org.apache.hadoop.mapreduce.security.token.delegation.DelegationTokenIdentifier> t;
  try (JobClient jcl = new JobClient(new JobConf(conf, HCatOutputFormat.class))) {
    t = jcl.getDelegationToken(new Text(userName));
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.TableType;
import org.apache.hadoop.hive.metastore.Warehouse;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
nit: I think the original import is sorted correctly, so we might not need this change.



What changes were proposed in this pull request?
Move the metastore client classes into their own module.
Why are the changes needed?
Improve the structure of the standalone-metastore classes.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Jenkins