Skip to content

Fix thrift client connection for Kerberos Hive Client #1747

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 16, 2025

Conversation

kevinjqliu
Copy link
Contributor

@kevinjqliu kevinjqliu commented Mar 1, 2025

Closes #1744

TSaslClientTransport cannot be reopen. This PR changes the behavior to recreate a TSaslClientTransport when its already closed.

Note, _HiveClient should be used with context manager, but can be used without.

@kevinjqliu kevinjqliu requested a review from mnzpk March 2, 2025 18:48
Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, it would be better to define the client as a cached_property, and just return it.
+1 for moving the init out of the __init__ func, and converting the client to a cached_property will help to do this, where the user will be able to use the client with and without a context manager.

@Fokko Fokko added this to the PyIceberg 0.9.1 milestone Mar 26, 2025
@Fokko
Copy link
Contributor

Fokko commented Mar 26, 2025

@kevinjqliu I think this one is also good for 0.9.1

@Fokko
Copy link
Contributor

Fokko commented Apr 3, 2025

Gentle ping @kevinjqliu. Any thoughts on the cached_property suggested by @hussein-awala?

@mnzpk
Copy link

mnzpk commented Apr 11, 2025

IMHO, it would be better to define the client as a cached_property, and just return it. +1 for moving the init out of the __init__ func, and converting the client to a cached_property will help to do this, where the user will be able to use the client with and without a context manager.

Hi @hussein-awala, just to make sure I have this right, would this mean moving most of the logic in _init_thrift_client to a cached property named client with __enter__ returning self.client and __exit__ closing the transport and deleting self.client?

Edit: I'm asking because there might be cases where the above could actually result in a user invoking methods on a client whose underlying transport has been closed. For example:

hive_client = _HiveClient(...)

p_client = hive_client.client

with hive_client as open_client:
    print(open_client.get_all_databases())

print(p_client.get_all_databases())  # Results in TTransportException: Transport not open

So its likely that this is not what you meant.

@kevinjqliu
Copy link
Contributor Author

kevinjqliu commented Apr 12, 2025

Thanks for chiming in @hussein-awala @mnzpk. Please take a look at the new implementation.

The context manager (__enter__ & __exit__) now manages the underlying _transport. The _transport will be closed on exit and re-initialized and opened on enter.

@mnzpk could you give this a try?

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/thrift-client branch from 6b8d2ee to 7b21b5b Compare April 13, 2025 04:52
@kevinjqliu
Copy link
Contributor Author

kevinjqliu commented Apr 13, 2025

CI's currently failing for main branch, see https://github.com/apache/iceberg-python/pull/1899/files#r2040915222

make test-integration still works locally and ran successfully

@kevinjqliu kevinjqliu force-pushed the kevinjqliu/thrift-client branch from 7b21b5b to d788c8d Compare April 13, 2025 15:46
@kevinjqliu
Copy link
Contributor Author

cool CI passes now

@mnzpk please take a look when you have time :)

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @kevinjqliu 🙌

@Fokko Fokko merged commit 881b2d5 into apache:main Apr 16, 2025
7 checks passed
Fokko pushed a commit that referenced this pull request Apr 17, 2025
Closes #1744

`TSaslClientTransport` cannot be reopen. This PR changes the behavior to
recreate a `TSaslClientTransport` when its already closed.

Note, `_HiveClient` should be used with context manager, but can be used
without.
@kevinjqliu kevinjqliu deleted the kevinjqliu/thrift-client branch April 17, 2025 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SASL negotiation fails when invoking multiple methods on pyiceberg.catalog.hive.HiveCatalog
4 participants