-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-20189: Separate metastore client code into its own module #5924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
<dependencies> | ||
<dependency> | ||
<groupId>org.apache.hive</groupId> | ||
<artifactId>hive-standalone-metastore-common</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new module depends on the metastore-common
, which doesn't reduce the size or dependencies from metastore-common
, not sure if it's ok, for the user what will benefit from the new module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's expected for a client module to depend on a common module. The client module delivers core client functionality and can be enhanced later with features like caching.
Offering a 'client' JAR aligns with common conventions and is expected to be user-friendly.
ATM we have 2 distinct cache wrappers in different Hive modules, just because there was no structure:
- org.apache.hadoop.hive.metastore.HiveClientCache
- org.apache.iceberg.hive.CachedClientPool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another point I don't understand, why ql
and beeline
modules depend on metastore-server
? I think we should drop the dependency on server and move the classes into the metastore common or client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I basically agree that we can potentially improve the structure in the future. We will have obvious guidelines about how to organize files.
- metastore-client: Client-specific files, which Server doesn't need
- metastore-server: Server-specific files, which Client doesn't need
- metastore-common: Common modules
another point I don't understand, why ql and beeline modules depend on metastore-server? I think we should drop the dependency on server and move the classes into the metastore common or client.
I guess ql
requires metastore-server to use an embedded HMS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't embedded HMS only used in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it isn't. It can be used when metastore.thrift.uris
is empty. For example, our Hive docker image(not HMS docker image) can set up a HiveServer2 without HMS. It probably uses the embedded mode that runs HMS-equivalent threads in HS2.
Lines 2654 to 2663 in d7b3a5b
/** | |
* Check if metastore is being used in embedded mode. | |
* This utility function exists so that the logic for determining the mode is same | |
* in HiveConf and HiveMetaStoreClient | |
* @param msUri - metastore server uri | |
* @return true if the metastore is embedded | |
*/ | |
public static boolean isEmbeddedMetaStore(String msUri) { | |
return (msUri == null) || msUri.trim().isEmpty(); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can say it is for testing purposes and separate the classes from hive-exec.
5461f07
to
24ca325
Compare
959dfa7
to
bc40344
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Configuration conf, String userName) throws Exception { | ||
Token<org.apache.hadoop.mapreduce.security.token.delegation.DelegationTokenIdentifier> t; | ||
try (JobClient jcl = new JobClient(new JobConf(conf, HCatOutputFormat.class))) { | ||
t = jcl.getDelegationToken(new Text(userName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may directly return it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, +1
import org.apache.hadoop.hive.metastore.TableType; | ||
import org.apache.hadoop.hive.metastore.Warehouse; | ||
import org.apache.hadoop.hive.metastore.IMetaStoreClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think the original import is sorted correctly, so we might not need this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted
|
What changes were proposed in this pull request?
move client classes into it's own module
Why are the changes needed?
improve the structure of standalone-metastore classes
Does this PR introduce any user-facing change?
No
How was this patch tested?
jenkins