Aggarwal-Raghav
Contributor

@Aggarwal-Raghav Aggarwal-Raghav commented Sep 1, 2025

What changes were proposed in this pull request?

Although Hive has moved to JLine 3, the JLine 2.x jar was still being shipped in the packaging.

Why are the changes needed?

To ship a single JLine jar. See HIVE-29130 for more details.

Does this PR introduce any user-facing change?

No

How was this patch tested?

On a local setup.

@Aggarwal-Raghav
Contributor Author

Hadoop ships jline 3.9.0, and **in my setup** it was getting picked up ahead of the 3.25.0 jar from Hive, causing a NoSuchMethodError for org.jline.reader.EndOfFileException#getPartialLine() (check here).

I would like a reviewer to confirm this.

@Aggarwal-Raghav
Contributor Author

CC @ayushtkn @abstractdog

@abstractdog
Contributor

Thanks @Aggarwal-Raghav for taking care of this. Would you be so kind as to add an enforcer item to ban this dependency, even transitively, in the future? That would be the perfect solution, I believe.

@Aggarwal-Raghav
Contributor Author

> Thanks @Aggarwal-Raghav for taking care of this. Would you be so kind as to add an enforcer item to ban this dependency, even transitively, in the future? That would be the perfect solution, I believe.

sure. Will update the PR.
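
For readers unfamiliar with the request, a minimal sketch of such an enforcer item follows. It assumes the maven-enforcer-plugin with its built-in bannedDependencies rule; the execution id is a hypothetical name, and this is not the configuration that was actually committed.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <!-- hypothetical execution id, chosen for illustration -->
      <id>enforce-banned-dependencies</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <excludes>
              <!-- jline:jline is the JLine 1.x/2.x coordinate;
                   JLine 3.x is published under the org.jline groupId -->
              <exclude>jline:jline</exclude>
            </excludes>
            <!-- also check transitive dependencies (this is the default) -->
            <searchTransitive>true</searchTransitive>
          </bannedDependencies>
        </rules>
        <!-- fail the build when a banned dependency is found -->
        <fail>true</fail>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a rule like this in place, any module that pulls in jline:jline, directly or transitively, would fail the build until the dependency is excluded.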

@@ -48,8 +48,7 @@
<include>org.apache.hive:hive-service-rpc:jar</include>
<include>commons-cli:commons-cli:jar</include>
<include>commons-io:commons-io:jar</include>
<include>commons-logging:commons-logging:jar</include>
Contributor Author

Removing commons-logging, as it was getting packaged in the beeline tarball. See HIVE-24691 and HIVE-20019.

@@ -48,8 +48,7 @@
<include>org.apache.hive:hive-service-rpc:jar</include>
<include>commons-cli:commons-cli:jar</include>
<include>commons-io:commons-io:jar</include>
<include>commons-logging:commons-logging:jar</include>
<include>jline:jline:jar</include>
Contributor Author

Moved it to the JLine 3.x groupId.

<goals>
<goal>enforce</goal>
</goals>
<configuration>
Contributor Author

Removed the enforcer plugin from standalone-metastore/metastore-common/pom.xml and standalone-metastore/metastore-server/pom.xml and moved it to standalone-metastore/pom.xml, i.e. a common place, and updated it with the same list of banned dependencies as in the parent pom.xml.

- jline 2.x has been added to the enforcer plugin
- The direct commons-logging dependency has been removed from the standalone-metastore module
- The enforcer plugin configuration in standalone-metastore is now the same as in the parent pom.xml
@Aggarwal-Raghav
Contributor Author

Aggarwal-Raghav commented Sep 3, 2025

  1. The unit test failures in this PR are relevant. The reason for this is that Pig version 0.16.0 uses jline 1.x. The jline dependency is being excluded from Pig due to an enforcer rule, which is causing failures specifically in the hcatalog/hcatalog-pig-adapter module.

  2. I believe the current unit test scenario in the cluster/runtime environment might also be failing because we are not shipping jline 1.x. Assumption: This would only work if jline 2.x, which is being shipped without this PR, is backward compatible with version 1.x.

My recommendation for the next step is to remove the jline enforcer rule from the parent pom.xml, keeping the exclusions everywhere else except hcatalog/hcatalog-pig-adapter. Is that approach acceptable? Even with this approach, point 2 (whether Pig works in a cluster) remains unknown.

possible_follow_up.patch

CC @abstractdog
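
As an illustration of what "keeping the exclusions" means in a module other than hcatalog/hcatalog-pig-adapter, here is a sketch of a standard Maven dependency exclusion. The dependency coordinates and version below are placeholders for illustration only; the thread does not say which dependencies pull in the old jline.

```xml
<dependency>
  <!-- placeholder coordinates: stands in for any dependency
       that brings in jline:jline transitively -->
  <groupId>com.example</groupId>
  <artifactId>example-lib</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <!-- keep JLine 1.x/2.x off this module's classpath and out of the packaging -->
    <exclusion>
      <groupId>jline</groupId>
      <artifactId>jline</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Under the proposal above, hcatalog/hcatalog-pig-adapter would simply omit this exclusion on its Pig dependency, so that Pig 0.16.0 can still resolve jline 1.x there.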

@ayushtkn
Member

ayushtkn commented Sep 3, 2025

> Hadoop ships jline 3.9.0, and **in my setup** it was getting picked up ahead of the 3.25.0 jar from Hive, causing a NoSuchMethodError for org.jline.reader.EndOfFileException#getPartialLine() (check here).

Hadoop's JLine 3.9.0 was getting picked up first, so we should make sure the Hive classpath always comes before the Hadoop jars.

I don't understand why, in that scope or for this problem, we are playing with JLine 2.x.

I am not very convinced about enforcing in some places and not in others. We might not have a use case for or expertise with Pig, so it is better not to touch it in that case. We had a thread about deprecating/removing Pig some time back and we were shot down, so people do tend to use it.

Why not upgrade JLine in Hadoop?

@abstractdog
Contributor

> 1. The unit test failures in this PR are relevant. The reason for this is that Pig version 0.16.0 uses jline 1.x. The jline dependency is being excluded from Pig due to an enforcer rule, which is causing failures specifically in the hcatalog/hcatalog-pig-adapter module.
> 2. I believe the current unit test scenario in the cluster/runtime environment might also be failing because we are not shipping jline 1.x. Assumption: This would only work if jline 2.x, which is being shipped without this PR, is backward compatible with version 1.x.
>
> My recommendation for the next step is to remove the jline enforcer rule from the parent pom.xml, keeping the exclusions everywhere else except hcatalog/hcatalog-pig-adapter. Is that approach acceptable? Even with this approach, point 2 (whether Pig works in a cluster) remains unknown.
>
> possible_follow_up.patch
>
> CC @abstractdog

It is what it is. Thanks @Aggarwal-Raghav for checking, but before proceeding with this, can you please check whether bannedDependencies offers an include/exclude option that would allow the old jline transitively through Pig only? Does that make sense?

@Aggarwal-Raghav
Contributor Author

Thanks for the reply @ayushtkn.

> I don't understand in that scope or for this problem why are we playing with Jline-2.x?

In the Hive packaging, both jline 2.11 and jline 3.25.0 are getting shipped. IMO, that's wrong. The aim of this PR is to ship only the JLine 3.x jar. If we are OK with shipping both versions, then I'll close this PR.

> Why not upgrade JLine in Hadoop?

It's good to have the same dependency version across components. In my setup, JLine 3.9.0 is getting picked up first, which is causing the NoSuchMethodError. I want to confirm whether that is due to my setup or not.

In conclusion,

  • if the Hadoop classpath comes first, then jline 3.9.0 is picked => Hadoop has to upgrade jline to 3.25.0
  • if the Hive classpath comes first, then we are all good.

@ayushtkn
Member

ayushtkn commented Sep 3, 2025

> In conclusion,
> if Hadoop Classpath first, then jline 3.9.0 is picked => hadoop has to upgrade jline to 3.25.0
> if hive classpath is first, then we are all good.

So the solution is that, in an ideal setup, the Hive classpath should come first, and in that case we are sorted. Otherwise, for the problem that you reported, Hadoop needs to upgrade JLine, correct?

And this PR solves a different problem, the one reported in the ticket; it won't solve the problem that you reported?

@Aggarwal-Raghav
Contributor Author

Aggarwal-Raghav commented Sep 3, 2025

> > In conclusion,
> > if Hadoop Classpath first, then jline 3.9.0 is picked => hadoop has to upgrade jline to 3.25.0
> > if hive classpath is first, then we are all good.
>
> So, the solution is like in ideal setup Hive Classpath should be first. In that case we are sorted. Else for the problem that you reported. Hadoop needs to upgrade JLine, correct?
>
> And this PR is solving a different problem as reported in the ticket, this won't solve the problem that you reported?

Yes and yes.
In addition, I found 2 problems:

  1. commons-logging is also getting shipped, which is a big no-no based on HIVE-24691 and HIVE-20019.
  2. The enforcer plugin defined in the parent pom is not enforced on the standalone-metastore module. IMO, the banned dependencies should be the same.

@Aggarwal-Raghav
Contributor Author

> can you please check if there is an include/exclude possibility in bannedDependencies to allow old jline transitively through pig only, does it make sense?

We can use the include tag, but there are 2 ways to do it:

1. Do it at the parent pom level: include-tag.patch

  • Benefit: The enforcer is defined in a single place, so it is easy to maintain.
  • Drawback: jline 1.0 is allowed for all modules. It won't be shipped in the packaging, though: we have excluded it from all the problematic places flagged by the enforcer plugin, so it shouldn't be a problem.

2. Do it at the hcatalog/hcatalog-pig-adapter/pom.xml level: include-tag-child-pom.patch

  • Benefit: jline 1.0 is allowed only for hcatalog/hcatalog-pig-adapter; all other modules are not impacted.
  • Drawback: A copy of the enforcer plugin configuration needs to be defined in this child pom, and it can easily be missed when new bannedDependencies are added.

@abstractdog, please let me know which is better.

@abstractdog
Contributor

> > can you please check if there is an include/exclude possibility in bannedDependencies to allow old jline transitively through pig only, does it make sense?
>
> We can use the include tag, but there are 2 ways to do it:
>
> 1. Do it at the parent pom level: include-tag.patch
>
>   • Benefit: The enforcer is defined in a single place, so it is easy to maintain.
>   • Drawback: jline 1.0 is allowed for all modules. It won't be shipped in the packaging, though: we have excluded it from all the problematic places flagged by the enforcer plugin, so it shouldn't be a problem.
>
> 2. Do it at the hcatalog/hcatalog-pig-adapter/pom.xml level: include-tag-child-pom.patch
>
>   • Benefit: jline 1.0 is allowed only for hcatalog/hcatalog-pig-adapter; all other modules are not impacted.
>   • Drawback: A copy of the enforcer plugin configuration needs to be defined in this child pom, and it can easily be missed when new bannedDependencies are added.
>
> @abstractdog, please let me know which is better.

I would go for 1), I don't like the whole copied enforcer plugin config in 2), thanks a lot for investigating all the possibilities!

@Aggarwal-Raghav
Contributor Author

The unit tests passed locally.

cd hcatalog/hcatalog-pig-adapter ; mvn clean test -Dtest=TestTextFileHCatLoader,TestRCFileHCatLoader,TestAvroHCatLoader,TestSequenceFileHCatLoader,TestParquetHCatLoader,TestOrcHCatLoader -Drat.skip

Pushed the changes. Will wait for the CI output.

<exclude>jline:jline</exclude>
</excludes>
<includes>
<include>jline:jline:1.0</include>
Contributor

nit: please add a brief comment here explaining why we still allow 1.0, so it is clear in the future whether we can remove it

Contributor Author

Done
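
For context, a sketch of how the rule could look with such a comment in place. The excludes and includes entries are taken from the diff fragment under review; the wording of the XML comment is an assumption for illustration, not the text that was committed.

```xml
<bannedDependencies>
  <excludes>
    <!-- ban the old JLine 1.x/2.x coordinate everywhere -->
    <exclude>jline:jline</exclude>
  </excludes>
  <includes>
    <!-- Exception to the ban above (illustrative wording): Pig 0.16.0
         still depends on jline 1.0, which hcatalog/hcatalog-pig-adapter
         needs for its tests. This entry can be removed once Pig no
         longer requires jline 1.x. -->
    <include>jline:jline:1.0</include>
  </includes>
</bannedDependencies>
```

In the bannedDependencies rule, entries under includes act as exceptions to the excludes list, so only version 1.0 of jline:jline is let through while every other version stays banned.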

@abstractdog
Contributor

thanks @Aggarwal-Raghav, I'm good now with the pom.xml changes, but haven't followed if you've reached consensus with @ayushtkn on the other aspect of this PR, please let me know guys

@Aggarwal-Raghav
Contributor Author

@ayushtkn, can you please take a look once!

@Aggarwal-Raghav
Contributor Author

gentle ping for review @ayushtkn !

@ayushtkn
Member

I'm stuck with some other stuff. If @abstractdog is happy, I am happy. Feel free to move ahead :-)
