[SPARK-51818][CONNECT] Move QueryExecution creation to AnalyzeHandler and don't Execute for AnalyzePlanRequests
### What changes were proposed in this pull request?
Analyze Plan Requests for Schema should not trigger an Execute on the Logical Plan. Currently, when an AnalyzePlanRequest carries a command that is executed eagerly, the Dataset.ofRows(logicalPlan) call executes the underlying command. This should not happen during AnalyzePlan. Instead, we construct the LogicalPlan with CommandExecutionMode.SKIP and return the resulting schema from it.
https://issues.apache.org/jira/browse/SPARK-51818
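The difference between eager execution and analysis-only handling can be illustrated with a small toy model. This is a sketch, not Spark code: `Catalog`, `DropTable`, and `analyze_schema` are hypothetical names made up for illustration, and the `CommandExecutionMode` enum here only mirrors the mode named above, not Spark's actual class.

```python
from dataclasses import dataclass, field
from enum import Enum


class CommandExecutionMode(Enum):
    """Toy stand-in for the mode named in this PR (not Spark's class)."""
    ALL = "all"    # commands are run eagerly
    SKIP = "skip"  # analysis only; commands are never run


@dataclass
class Catalog:
    """Hypothetical in-memory catalog: table name -> column names."""
    tables: dict = field(default_factory=dict)


@dataclass
class DropTable:
    """Hypothetical command plan: has a side effect and an empty schema."""
    name: str

    def schema(self, catalog):
        return []

    def run(self, catalog):
        catalog.tables.pop(self.name, None)


def analyze_schema(plan, catalog, mode):
    """Return the plan's schema; run the command only in ALL mode."""
    if mode is CommandExecutionMode.ALL:
        plan.run(catalog)
    return plan.schema(catalog)


catalog = Catalog({"t": ["id", "name"]})
plan = DropTable("t")

# Analysis-only path: the schema is returned and the table survives.
schema = analyze_schema(plan, catalog, CommandExecutionMode.SKIP)
table_survives_skip = "t" in catalog.tables

# Eager path (the behavior this PR removes for analyze requests):
# the side effect happens as part of answering the schema request.
analyze_schema(plan, catalog, CommandExecutionMode.ALL)
table_survives_all = "t" in catalog.tables
```

In this model a schema request in SKIP mode leaves the catalog untouched, which is the property the new test checks against a real DROP TABLE.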
### Why are the changes needed?
SQL commands sent via an AnalyzePlanRequest are currently executed eagerly. This PR fixes that.
### Does this PR introduce _any_ user-facing change?
Calling .schema on a DataFrame via Spark Connect no longer executes the plan stored in the DataFrame, as it did before. Example: spark.newDataFrame(plan: proto.Plan).schema, where plan encodes a SQL command that is executed eagerly, such as DROP TABLE. Previously this call would execute the SQL command; after this change it does not.
### How was this patch tested?
Added a test that sends an AnalyzePlanRequest with a DROP TABLE command and verifies that the table was not dropped.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #50605 from peterpashkin/peter-pashkin/MoveAnalyzeAndSkipExecution.
Authored-by: Peter Pashkin <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>