@zhengruifeng (Contributor) commented Oct 13, 2025

What changes were proposed in this pull request?

Limit Arrow batch sizes in SQL_GROUPED_AGG_PANDAS_UDF

Why are the changes needed?

To avoid potential OOM on the JVM side. (An iterator API for this will be introduced in separate PRs.)
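
To illustrate the general idea of capping batch sizes (this is not the code in this PR), here is a minimal sketch in plain Python. It assumes a simple row-count cap, similar in spirit to Spark's `spark.sql.execution.arrow.maxRecordsPerBatch` setting; the function name `split_into_batches` and the "non-positive means unlimited" convention are hypothetical and exist only for this example.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(rows: Sequence[T], max_records_per_batch: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `rows`, each with at most `max_records_per_batch` items."""
    if max_records_per_batch <= 0:
        # Assumption for this sketch only: a non-positive cap means "no limit".
        yield list(rows)
        return
    for start in range(0, len(rows), max_records_per_batch):
        # Each slice is materialized separately, so no single batch exceeds the cap.
        yield list(rows[start:start + max_records_per_batch])


# One "group" of 10 rows, sent as batches of at most 4 rows each.
group = list(range(10))
batches = list(split_into_batches(group, 4))
```

With a cap in place, the receiving side only ever needs to buffer one bounded batch at a time rather than a whole group, which is the kind of memory pressure the description refers to.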

Does this PR introduce any user-facing change?

no

How was this patch tested?

added tests

Was this patch authored or co-authored using generative AI tooling?

no

jobArtifactUUID,
conf.pythonUDFProfiler) with GroupedPythonArrowInput
} else {
new ArrowPythonWithNamedArgumentRunner(
Contributor Author

I am hitting an odd issue while working on SQL_GROUPED_AGG_ARROW_UDF; I will fix it separately in https://issues.apache.org/jira/browse/SPARK-53867
