[SPARK-51206][PYTHON][CONNECT] Move Arrow conversion helpers out of Spark Connect #49941

wengh · 2025-02-13T20:30:34Z

What changes were proposed in this pull request?

Refactor pyspark.sql.connect.conversion to move LocalDataToArrowConversion and ArrowTableToRowsConversion into pyspark.sql.conversion.

The reason is that pyspark.sql.connect.conversion checks for Spark Connect dependencies such as grpcio and pandas, but LocalDataToArrowConversion and ArrowTableToRowsConversion don't need these dependencies.

pyspark.sql.connect.conversion still re-exports the two classes for backward compatibility.
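The re-export described above can be sketched as follows. This is only an illustration, not the actual PySpark source: the module names `demo_sql_conversion` and `demo_sql_connect_conversion` are hypothetical stand-ins for `pyspark.sql.conversion` and `pyspark.sql.connect.conversion`, and the classes are empty placeholders. It shows that a legacy module which imports the names from their new home exposes the very same class objects, so old import paths keep working.

```python
import sys
import types

# Hypothetical stand-in for the new, dependency-light module
# (plays the role of pyspark.sql.conversion in this sketch).
new_home = types.ModuleType("demo_sql_conversion")


class LocalDataToArrowConversion:
    """Placeholder for the real conversion helper."""


class ArrowTableToRowsConversion:
    """Placeholder for the real conversion helper."""


new_home.LocalDataToArrowConversion = LocalDataToArrowConversion
new_home.ArrowTableToRowsConversion = ArrowTableToRowsConversion
sys.modules[new_home.__name__] = new_home

# Hypothetical stand-in for the legacy module
# (plays the role of pyspark.sql.connect.conversion).
# Its body is a single re-export line, executed in the module's namespace.
legacy = types.ModuleType("demo_sql_connect_conversion")
exec(
    "from demo_sql_conversion import "
    "LocalDataToArrowConversion, ArrowTableToRowsConversion",
    legacy.__dict__,
)
sys.modules[legacy.__name__] = legacy

# Callers using either import path get the same class object.
from demo_sql_connect_conversion import LocalDataToArrowConversion as OldPath
from demo_sql_conversion import LocalDataToArrowConversion as NewPath

same_class = OldPath is NewPath
```

Because the legacy module only re-binds the names, `isinstance` checks and subclassing behave identically regardless of which path a caller imports from.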

Why are the changes needed?

Python Data Sources should work without Spark Connect dependencies, but they currently import LocalDataToArrowConversion and ArrowTableToRowsConversion from pyspark.sql.connect.conversion, which pulls in unnecessary dependencies. This change moves the two classes to pyspark.sql.conversion so that Python Data Sources run without Spark Connect dependencies.

Does this PR introduce any user-facing change?

Yes. Python Data Sources no longer require Spark Connect dependencies such as grpcio and pandas.

How was this patch tested?

Existing tests should catch any regressions introduced by the move.

Manually tested to ensure that Python Data Sources can run without grpcio and pandas.
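One way to set up such a manual check is to first confirm that the Spark Connect dependencies really are absent from the environment. The sketch below is an assumption about how that could be done, not the procedure the author used; `missing_modules` is a hypothetical helper. Note that the grpcio package is imported under the name `grpc`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the Spark Connect dependencies mentioned in this PR.
connect_only = ["grpc", "pandas"]

# In an environment prepared for the manual test, both names would be listed.
print(missing_modules(connect_only))
```

If both names are reported missing and a Python Data Source still runs, the relaxed requirement holds in that environment.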

Was this patch authored or co-authored using generative AI tooling?

No

python/pyspark/sql/conversion.py

allisonwang-db · 2025-02-14T22:08:52Z

cc @HyukjinKwon

[SPARK-51206][PYTHON][CONNECT] Move Arrow conversion helpers out of Spark connect module

5439b4f

github-actions bot added SQL PYTHON CONNECT labels Feb 13, 2025

ueshin requested a review from allisonwang-db February 14, 2025 01:22

allisonwang-db reviewed Feb 14, 2025

python/pyspark/sql/conversion.py (outdated, resolved)

python/pyspark/sql/conversion.py (resolved)

python/pyspark/sql/conversion.py (outdated, resolved)

add arrow check and fix docstring

375f2a4

wengh requested a review from allisonwang-db February 14, 2025 23:04
