
Commit 9daf74d

Alex Wojtowicz and pan3793 committed
[KYUUBI #6908] Connection class ssl context object parameter
**Why are the changes needed:**

Currently looking to connect to a HiveServer2 behind an NGINX proxy that is requiring mTLS communication. pyHive seems to lack the capability to establish an mTLS connection in applications such as Airflow directly communicating to the HiveServer2 instance. The change needed is to be able to pass in the parameters for a proper mTLS ssl context to be established. I believe that creating your own ssl_context object is the quickest and cleanest way to do so, leaving the responsibility of configuring it to further implementations and users. Also cuts down on code length.

**How was this patch tested:**

Corresponding pytest fixtures have been added, using the mock module to see if ssl_context object was properly accessed, or if the default one created in the Connection initialization was properly configured. Was not able to run pytest fixtures specifically, was lacking JDBC driver, first time contributing to open source, happy to run tests if provided guidance. Passed a clean build and test of the entire kyuubi project in local dev environment.

**Was this patch authored or co-authored using generative AI tooling**

Yes, Generated-by Cursor-AI with Claude Sonnet 3.5 agent

Closes #6935 from alexio215/connection-class-ssl-context-param.

Closes #6908

539b299 [Cheng Pan] Update python/pyhive/tests/test_hive.py
14c6074 [Alex Wojtowicz] Simplified testing, following pattern of other tests, need proper SSL setup with nginx to test ssl_context fully
b947f24 [Alex Wojtowicz] Added exception handling since JDBC driver will not run in python tests
11f9002 [Alex Wojtowicz] Passing in fully configured mock object before creating connection
009c5cf [Alex Wojtowicz] Added back doc string documentation
e3280bc [Alex Wojtowicz] Python testing
529de8a [Alex Wojtowicz] Added ssl_context object. If no obj is provided, then it continues to use default provided parameters

Lead-authored-by: Alex Wojtowicz <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
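
As a rough illustration of the use case described in the commit message, the sketch below builds an `ssl.SSLContext` for mutual TLS with the standard library and hands it to `hive.connect` through the new `ssl_context` parameter. The helper names `build_mtls_context` and `connect_via_proxy` and all file paths are hypothetical and not part of the patch; `scheme="https"` is assumed because the context is only relevant on the HTTP transport path.

```python
import ssl


def build_mtls_context(ca_file=None, cert_file=None, key_file=None):
    """Return a client-side SSLContext suitable for mutual TLS.

    All three arguments are hypothetical file paths supplied by the caller.
    """
    # create_default_context() enables hostname checking and CERT_REQUIRED.
    context = ssl.create_default_context(cafile=ca_file)
    if cert_file is not None:
        # Present a client certificate so the proxy can authenticate the client.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def connect_via_proxy(host, port, context):
    """Open a PyHive connection over HTTPS using a caller-built context."""
    # Imported lazily so build_mtls_context() works without PyHive installed.
    from pyhive import hive

    return hive.connect(host=host, port=port, scheme="https", ssl_context=context)
```

A caller would then do something like `connect_via_proxy("hs2.example.internal", 443, build_mtls_context("ca.pem", "client.pem", "client.key"))`, where the host name and file names are placeholders. Because the context is passed in fully configured, `check_hostname` and `ssl_cert` do not need to be set on the connection.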
1 parent d33aa0b commit 9daf74d


2 files changed, +30 -7 lines changed


Diff for: python/pyhive/hive.py

+9 -7
@@ -159,7 +159,8 @@ def __init__(
         password=None,
         check_hostname=None,
         ssl_cert=None,
-        thrift_transport=None
+        thrift_transport=None,
+        ssl_context=None
     ):
         """Connect to HiveServer2
 
@@ -172,19 +173,20 @@ def __init__(
         :param password: Use with auth='LDAP' or auth='CUSTOM' only
         :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
             Incompatible with host, port, auth, kerberos_service_name, and password.
-
+        :param ssl_context: A custom SSL context to use for HTTPS connections. If provided,
+            this overrides check_hostname and ssl_cert parameters.
         The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
         https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
         /impala/_thrift_api.py#L152-L160
         """
         if scheme in ("https", "http") and thrift_transport is None:
             port = port or 1000
-            ssl_context = None
             if scheme == "https":
-                ssl_context = create_default_context()
-                ssl_context.check_hostname = check_hostname == "true"
-                ssl_cert = ssl_cert or "none"
-                ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
+                if ssl_context is None:
+                    ssl_context = create_default_context()
+                    ssl_context.check_hostname = check_hostname == "true"
+                    ssl_cert = ssl_cert or "none"
+                    ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
             thrift_transport = thrift.transport.THttpClient.THttpClient(
                 uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
                     scheme=scheme, host=host, port=port
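
The precedence rule in this hunk can be sketched as a standalone function: a caller-supplied context is returned untouched, and the default context is built from `check_hostname` and `ssl_cert` only when none is given. The function name `resolve_ssl_context` is hypothetical, and the contents of `SSL_CERT_PARAMETER_MAP` are an assumption about what `ssl_cert_parameter_map` holds in `hive.py`; this is an illustration of the branch logic, not the library's code.

```python
import ssl

# Assumed to mirror ssl_cert_parameter_map in hive.py (not shown in the diff).
SSL_CERT_PARAMETER_MAP = {
    "none": ssl.CERT_NONE,
    "optional": ssl.CERT_OPTIONAL,
    "required": ssl.CERT_REQUIRED,
}


def resolve_ssl_context(scheme, check_hostname=None, ssl_cert=None, ssl_context=None):
    """Return the context the transport would receive for the given arguments."""
    if scheme == "https" and ssl_context is None:
        # No caller-supplied context: fall back to the pre-existing behaviour.
        ssl_context = ssl.create_default_context()
        # check_hostname arrives as a string; only the literal "true" enables it.
        ssl_context.check_hostname = check_hostname == "true"
        ssl_cert = ssl_cert or "none"
        ssl_context.verify_mode = SSL_CERT_PARAMETER_MAP.get(ssl_cert, ssl.CERT_NONE)
    # A caller-supplied context is passed through unchanged.
    return ssl_context
```

One consequence worth noting: when a custom context is supplied, any `check_hostname` or `ssl_cert` arguments are silently ignored, so verification settings have to be applied to the context itself before it is passed in.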

Diff for: python/pyhive/tests/test_hive.py

+21
@@ -15,6 +15,7 @@
 import time
 import unittest
 from decimal import Decimal
+import ssl
 
 import mock
 import pytest
@@ -238,6 +239,26 @@ def test_custom_connection(self):
             subprocess.check_call(['sudo', 'cp', orig_none, des])
             _restart_hs2()
 
+    @pytest.mark.skip(reason="Need a proper setup for SSL context testing")
+    def test_basic_ssl_context(self):
+        """Test that connection works with a custom SSL context that mimics the default behavior."""
+        # Create an SSL context similar to what Connection creates by default
+        ssl_context = ssl.create_default_context()
+        ssl_context.check_hostname = False
+        ssl_context.verify_mode = ssl.CERT_NONE
+
+        # Connect using the same parameters as self.connect() but with our custom context
+        with contextlib.closing(hive.connect(
+            host=_HOST,
+            port=10000,
+            configuration={'mapred.job.tracker': 'local'},
+            ssl_context=ssl_context
+        )) as connection:
+            with contextlib.closing(connection.cursor()) as cursor:
+                # Use the same query pattern as other tests
+                cursor.execute('SELECT 1 FROM one_row')
+                self.assertEqual(cursor.fetchall(), [(1,)])
+
 
 def _restart_hs2():
     subprocess.check_call(['sudo', 'service', 'hive-server2', 'restart'])
