
zookeeper upgrade 3.9.3 #22

Triggered via push on November 12, 2024 13:27
Status: Failure
Total duration: 2h 21m 4s
Artifacts: 23

build_main.yml

on: push
Run / Breaking change detection with Buf (branch-3.4) (57s)
Run / Scala 2.13 build with SBT (18m 4s)
Run / Run TPC-DS queries with SF=1 (53m 17s)
Run / Run Docker integration tests (32m 54s)
Run / Run Spark on Kubernetes Integration test (1h 3m)
Matrix: Run / build
Matrix: Run / java-11-17
Run / Build modules: sparkr (29m 21s)
Run / Linters, licenses, dependencies and documentation generation (25m 52s)
Matrix: Run / pyspark

Annotations

87 errors and 29 warnings
Run / Build modules: catalyst, hive-thriftserver
Process completed with exit code 18.
Run / Linters, licenses, dependencies and documentation generation
Process completed with exit code 1.
Run / Build modules: sql - slow tests (10 identical annotations)
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
Run / Build modules: sql - other tests
Uncaught exception in thread stdout writer for python3
Run / Build modules: sql - other tests (3 identical annotations)
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
Run / Build modules: sql - other tests
Process completed with exit code 18.
Run / Build modules: sql - extended tests (8 identical annotations)
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
Run / Build modules: sql - extended tests (2 identical annotations)
Uncaught exception in thread stdout writer for python3
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-4e57599320a8c16d-exec-1".
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-fbc0189320a9a7e9-exec-1".
Run / Run Spark on Kubernetes Integration test
sleep interrupted
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/267958776@294ea2a6 rejected from java.util.concurrent.ThreadPoolExecutor@7b5d309a[Shutting down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 384]
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/267958776@24b5e12f rejected from java.util.concurrent.ThreadPoolExecutor@7b5d309a[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 385]
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-ed80019320bd4a97-exec-1".
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-8880a69320be381f-exec-1".
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-5ca6de9320c1f156-exec-1".
Run / Run Spark on Kubernetes Integration test
Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, kind=pods, name=spark-test-app-75535461897e4659a2df334bfb6fa9dd-driver, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods "spark-test-app-75535461897e4659a2df334bfb6fa9dd-driver" not found, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=NotFound, status=Failure, additionalProperties={})..
ArrowUtilsSuite.(It is not a test it is a sbt.testing.SuiteSelector): org/apache/spark/sql/util/ArrowUtilsSuite#L40
sbt.ForkMain$ForkError: java.lang.NoSuchFieldError: chunkSize
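This NoSuchFieldError appears to be the root cause behind every "Could not initialize class org.apache.spark.sql.util.ArrowUtils$" annotation in this run: ArrowUtils$'s static initializer builds Arrow's Netty-backed allocator, and the netty-buffer classes resolved after this ZooKeeper bump evidently no longer declare the chunkSize field that Arrow's PooledByteBufAllocatorL was compiled against. A minimal sketch of that linkage failure mode, using a hypothetical PoolConfig class rather than Netty's real internals:

    // PoolConfig.java -- stands in for the dependency; "V1" declares the field.
    class PoolConfig {
        int chunkSize = 16 * 1024 * 1024; // present when Client is compiled
    }

    // Client.java -- stands in for Arrow's PooledByteBufAllocatorL.
    public class Client {
        public static void main(String[] args) {
            // javac bakes the symbolic reference "getfield PoolConfig.chunkSize:I"
            // into the bytecode, and the JVM re-resolves it by name at link time:
            //   1. javac PoolConfig.java Client.java && java Client  -> prints 16777216
            //   2. delete the chunkSize field and recompile only PoolConfig.java
            //   3. java Client  -> java.lang.NoSuchFieldError: chunkSize
            System.out.println(new PoolConfig().chunkSize);
        }
    }

Recompiling the client against the new dependency would turn the same mismatch into an ordinary compile error, which is why this class of breakage only surfaces at runtime after a transitive version change.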
SQLQueryTestSuite.udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<(a + b):int,b:int,udaf((a - b)):int>"), but got Some("struct<>") Schema did not match for query #1 SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE: -- !query SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<a:int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #1 select a, udaf(b) from data group by 1: -- !query select a, udaf(b) from data group by 1 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udaf(a):int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #2 SELECT udaf(a), udaf(b) FROM testData: -- !query SELECT udaf(a), udaf(b) FROM testData -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<a:string,b:string,c:string,udaf(d):int>"), but got Some("struct<>") Schema did not match for query #1 SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (()): -- !query SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (()) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<avg_1:double>"), but got Some("struct<>") Schema did not match for query #0 SELECT avg(udf(four)) AS avg_1 FROM onek: -- !query SELECT avg(udf(four)) AS avg_1 FROM onek -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<min(udf(unique1)):int>"), but got Some("struct<>") Schema did not match for query #12 select min(udf(unique1)) from tenk1: -- !query select min(udf(unique1)) from tenk1 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<col:bigint>"), but got Some("struct<>") Schema did not match for query #1 select udf((select udf(count(*)) from (values (1)) t0(inner_c))) as col from (values (2),(3)) t1(outer_c): -- !query select udf((select udf(count(*)) from (values (1)) t0(inner_c))) as col from (values (2),(3)) t1(outer_c) -- !query schema struct<> -- !query output java.util.concurrent.ExecutionException org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 32829.0 failed 1 times, most recent failure: Lost task 0.0 in stage 32829.0 (TID 30981) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.ArrowEvalPythonExec.evaluate(ArrowEvalPythonExec.scala:92) at org.apache.spark.sql.execution.python.EvalPythonExec.$anonfun$doExecute$2(EvalPythonExec.scala:131) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
SQLQueryTestSuite.udf/postgreSQL/udf-case.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-case.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<One:string,Simple WHEN:int>"), but got Some("struct<>") Schema did not match for query #12 SELECT '3' AS `One`, CASE WHEN udf(1 < 2) THEN 3 END AS `Simple WHEN`: -- !query SELECT '3' AS `One`, CASE WHEN udf(1 < 2) THEN 3 END AS `Simple WHEN` -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-join.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-join.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<xxx:string,udf(i):int,udf(j):int,udf(t):string>"), but got Some("struct<>") Schema did not match for query #28 SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t) FROM J1_TBL AS tx: -- !query SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t) FROM J1_TBL AS tx -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(b):int,udf(c):string>"), but got Some("struct<>") Schema did not match for query #11 SELECT udf(b), udf(c) FROM test_having GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c): -- !query SELECT udf(b), udf(c) FROM test_having GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(c):string,udf(count(1)):bigint>"), but got Some("struct<>") Schema did not match for query #11 SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY udf(test_missing_target.c) ORDER BY udf(c): -- !query SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY udf(test_missing_target.c) ORDER BY udf(c) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-count.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-count.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(count(1)):bigint,udf(count(1)):bigint,udf(count(NULL)):bigint,udf(count(a)):bigint,udf(count(b)):bigint,udf(count((a + b))):bigint,udf(count(named_struct(a, a, b, b))):bigint>"), but got Some("struct<>") Schema did not match for query #1 SELECT udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) FROM testData: -- !query SELECT udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b))) FROM testData -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-cross-join.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-cross-join.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<k:string,v1:int,k:string,v2:int>"), but got Some("struct<>") Schema did not match for query #3 SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k): -- !query SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-except-all.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except-all.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(c1):int>"), but got Some("struct<>") Schema did not match for query #4 SELECT udf(c1) FROM tab1 EXCEPT ALL SELECT udf(c1) FROM tab2: -- !query SELECT udf(c1) FROM tab1 EXCEPT ALL SELECT udf(c1) FROM tab2 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-except.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(k):string,udf(v):int>"), but got Some("struct<>") Schema did not match for query #2 SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2: -- !query SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-group-analytics.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-analytics.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf((a + b)):int,b:int,udf(sum((a - b))):bigint>"), but got Some("struct<>") Schema did not match for query #1 SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE: -- !query SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-group-by.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-by.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<count(udf(a)):bigint,udf(count(b)):bigint>"), but got Some("struct<>") Schema did not match for query #2 SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData: -- !query SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-having.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-having.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<k:string,udf(sum(v)):bigint>"), but got Some("struct<>") Schema did not match for query #1 SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2: -- !query SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-inline-table.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inline-table.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(col1):string,udf(col2):int>"), but got Some("struct<>") Schema did not match for query #0 select udf(col1), udf(col2) from values ("one", 1): -- !query select udf(col1), udf(col2) from values ("one", 1) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-inner-join.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inner-join.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<a:int,tag:string>"), but got Some("struct<>") Schema did not match for query #6 SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag: -- !query SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-intersect-all.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-intersect-all.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(k):int,v:int>"), but got Some("struct<>") Schema did not match for query #2 SELECT udf(k), v FROM tab1 INTERSECT ALL SELECT k, udf(v) FROM tab2: -- !query SELECT udf(k), v FROM tab1 INTERSECT ALL SELECT k, udf(v) FROM tab2 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-join-empty-relation.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-join-empty-relation.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(udf(a)):int,a:int>"), but got Some("struct<>") Schema did not match for query #5 SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a)): -- !query SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a)) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-natural-join.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-natural-join.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<k:string,v1:int,v2:int>"), but got Some("struct<>") Schema did not match for query #2 SELECT * FROM nt1 natural join nt2 where udf(k) = "one": -- !query SELECT * FROM nt1 natural join nt2 where udf(k) = "one" -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-outer-join.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-outer-join.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(sum(udf(coalesce(int_col1, int_col0)))):bigint,(udf(coalesce(int_col1, int_col0)) * 2):int>"), but got Some("struct<>") Schema did not match for query #2 SELECT (udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))), (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2) FROM t1 RIGHT JOIN t2 ON udf(t2.int_col0) = udf(t1.int_col1) GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))), COALESCE(t1.int_col1, t2.int_col0) HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0))))) > (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2): -- !query SELECT (udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))), (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2) FROM t1 RIGHT JOIN t2 ON udf(t2.int_col0) = udf(t1.int_col1) GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))), COALESCE(t1.int_col1, t2.int_col0) HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0))))) > (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-pivot.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-pivot.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(year):int,dotNET:bigint,Java:bigint>"), but got Some("struct<>") Schema did not match for query #3 SELECT * FROM ( SELECT udf(year), course, earnings FROM courseSales ) PIVOT ( udf(sum(earnings)) FOR course IN ('dotNET', 'Java') ): -- !query SELECT * FROM ( SELECT udf(year), course, earnings FROM courseSales ) PIVOT ( udf(sum(earnings)) FOR course IN ('dotNET', 'Java') ) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-special-values.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-special-values.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(x):int>"), but got Some("struct<>") Schema did not match for query #0 SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x): -- !query SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-udaf.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-udaf.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<my_avg:double,my_avg2:double,my_avg3:double>"), but got Some("struct<>") Schema did not match for query #2 SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1: -- !query SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1 -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-union.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-union.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<c1:int,c2:string>"), but got Some("struct<>") Schema did not match for query #2 SELECT udf(c1) as c1, udf(c2) as c2 FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1 UNION ALL SELECT udf(c1) as c1, udf(c2) as c2 FROM t1): -- !query SELECT udf(c1) as c1, udf(c2) as c2 FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1 UNION ALL SELECT udf(c1) as c1, udf(c2) as c2 FROM t1) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
SQLQueryTestSuite.udf/udf-window.sql - Scalar Pandas UDF: org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-window.sql - Scalar Pandas UDF Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1 Expected Some("struct<udf(val):int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY udf(val) ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>"), but got Some("struct<>") Schema did not match for query #1 SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData ORDER BY cate, udf(val): -- !query SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData ORDER BY cate, udf(val) -- !query schema struct<> -- !query output java.lang.NoClassDefFoundError Could not initialize class org.apache.spark.sql.util.ArrowUtils$
ArrowConvertersSuite.(It is not a test it is a sbt.testing.SuiteSelector): org/apache/spark/sql/execution/arrow/ArrowConvertersSuite#L1435
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
ArrowWriterSuite.(It is not a test it is a sbt.testing.SuiteSelector): org/apache/spark/sql/execution/arrow/ArrowWriterSuite#L75
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
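The error mix across these suites is itself diagnostic: the JVM runs a class's static initializer exactly once, so only the first test to touch ArrowUtils$ records the underlying failure (the NoSuchFieldError above), while every later use of the class fails fast with "NoClassDefFoundError: Could not initialize class". A self-contained sketch of that caching behavior, with Holder standing in for ArrowUtils$:

    public class ClinitCascade {
        static class Holder {
            static {
                // Stand-in for ArrowUtils$'s initializer dying on NoSuchFieldError.
                if (true) throw new RuntimeException("simulated chunkSize failure");
            }
            static int value = 1; // non-constant, so reading it triggers class init
        }
        public static void main(String[] args) {
            for (int i = 0; i < 2; i++) {
                try {
                    System.out.println(Holder.value);
                } catch (Throwable t) {
                    // 1st attempt: java.lang.ExceptionInInitializerError (wraps the cause)
                    // 2nd attempt: java.lang.NoClassDefFoundError:
                    //              Could not initialize class ClinitCascade$Holder
                    System.out.println(t);
                }
            }
        }
    }

So the many ArrowUtils$ annotations are repeats of one initialization failure per test JVM, not independent bugs.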
PythonUDFSuite.SPARK-39962: Global aggregation of Pandas UDF should respect the column order: org/apache/spark/sql/execution/python/PythonUDFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 38.0 failed 1 times, most recent failure: Lost task 0.0 in stage 38.0 (TID 37) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.AggregateInPandasExec.$anonfun$doExecute$8(AggregateInPandasExec.scala:176) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
PythonUDTFSuite.Arrow optimized UDTF: org/apache/spark/sql/execution/python/PythonUDTFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 16) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153) at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49) at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51) at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108) at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98) at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772) at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24) at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83) at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47) at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24) at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485) at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61) at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34) at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73) at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52) at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158) at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
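This is the same initialization chain seen in ArrowUtilsSuite, and it pins the break to the first touch of Netty's pooled allocator (NettyAllocationManager -> PooledByteBufAllocatorL). A quick way to see which netty-buffer jar actually won dependency resolution, run on the failing test classpath (WhichNetty is an illustrative name; only standard JDK APIs are used):

    public class WhichNetty {
        public static void main(String[] args) {
            // Prints the location of the jar that supplied the Netty class Arrow
            // builds on, e.g. .../io/netty/netty-buffer/<ver>/netty-buffer-<ver>.jar
            Class<?> c = io.netty.buffer.PooledByteBufAllocator.class;
            System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
        }
    }

The build's dependency report (for example mvn dependency:tree on the Maven side) should show the same netty-buffer version, which can then be compared against the version arrow-memory-netty expects.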
PythonUDTFSuite.arrow optimized UDTF with lateral join: org/apache/spark/sql/execution/python/PythonUDTFSuite#L90
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 15.0 failed 1 times, most recent failure: Lost task 0.0 in stage 15.0 (TID 19) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73) at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52) at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158) at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
ArrowColumnVectorSuite.(It is not a test it is a sbt.testing.SuiteSelector): org/apache/spark/sql/vectorized/ArrowColumnVectorSuite#L31
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
FlatMapGroupsInPandasWithStateDistributionSuite.applyInPandasWithState should require StatefulOpClusteredDistribution from children - without initial state: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateDistributionSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 5252f80c-607b-40cc-a9e0-87cf3e96cecd, runId = 93a0a0f8-d7d3-48a5-ac5f-5798a09981d5] terminated with exception: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 7) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153) at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49) at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51) at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108) at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98) at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772) at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24) at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83) at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47) at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24) at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485) at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61) at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34) at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala) at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244) at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68) at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 927d8ac8-8cc4-43e4-b7ce-2779be25f287, runId = 3a78887e-f53e-4672-9dcb-87f7fc1fdf16] terminated with exception: Job aborted due to stage failure: Task 1 in stage 1.0 failed 1 times, most recent failure: Lost task 1.0 in stage 1.0 (TID 2) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244) at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68) at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming, multiple groups in partition, multiple outputs per grouping key: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = db828e32-826e-4f1c-b32f-2a64c43c2764, runId = ef6159f7-7887-4ecb-8e22-536ec9c0b982] terminated with exception: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 5) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244) at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68) at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming + aggregation: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = dfa8731a-8bd7-4e71-8ae6-db5d4a7f7686, runId = 4a0396e1-634d-4260-b95c-56c6808a9d33] terminated with exception: Job aborted due to stage failure: Task 1 in stage 5.0 failed 1 times, most recent failure: Lost task 1.0 in stage 5.0 (TID 8) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244) at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68) at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming with processing time timeout: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 79fe0515-369c-47de-a7e5-6cfdac166756, runId = f7f13073-159e-4ac4-93ff-f4d3892da4bd] terminated with exception: Job aborted due to stage failure: Task 1 in stage 8.0 failed 1 times, most recent failure: Lost task 1.0 in stage 8.0 (TID 12) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$ at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57) at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47) at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119) at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55) at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244) at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68) at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367) at org.apache.spark.rdd.RDD.iterator(RDD.scala:331) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166) at org.apache.spark.scheduler.Task.run(Task.scala:141) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=true: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 669a0c45-1579-4e8a-8f21-3b36b6491d08, runId = 40e27a92-e4e9-469a-af7a-6965fb177535] terminated with exception: Job aborted due to stage failure: Task 1 in stage 10.0 failed 1 times, most recent failure: Lost task 1.0 in stage 10.0 (TID 16) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=false: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 22804abd-797c-4287-87a3-0739c0f0747f, runId = e8f47582-9b09-4ca5-90fd-71e28894b995] terminated with exception: Job aborted due to stage failure: Task 1 in stage 12.0 failed 1 times, most recent failure: Lost task 1.0 in stage 12.0 (TID 20) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = NoTimeout: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = d31422d2-571d-486b-911a-fdf76b05daaf, runId = 1f20e80f-9042-4572-b59a-50499bcf201d] terminated with exception: Job aborted due to stage failure: Task 0 in stage 14.0 failed 1 times, most recent failure: Lost task 0.0 in stage 14.0 (TID 23) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = ProcessingTimeTimeout: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = f376ca1f-950a-4416-9e94-abc9d3dc3dec, runId = a0f108b1-30dc-494c-aa71-61d10c8cee5c] terminated with exception: Job aborted due to stage failure: Task 1 in stage 16.0 failed 1 times, most recent failure: Lost task 1.0 in stage 16.0 (TID 29) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - uses state format version 2 by default: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 01b9bcd4-4e21-419e-81cb-46a6c6b4e20d, runId = a568a7c9-d0e5-44c2-a46e-4e3786ca0c7c] terminated with exception: Job aborted due to stage failure: Task 1 in stage 18.0 failed 1 times, most recent failure: Lost task 1.0 in stage 18.0 (TID 34) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - arrow RecordBatch size with chunking: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 56444c14-cdcb-4ee4-a0d2-c0df328e7b1d, runId = 7907aa1f-c0a3-4c23-8277-57183074e09e] terminated with exception: Job aborted due to stage failure: Task 0 in stage 20.0 failed 1 times, most recent failure: Lost task 0.0 in stage 20.0 (TID 37) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - partial consume of iterator in user function: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 7badf2c8-5757-42df-9738-f3cdeb085dea, runId = 3d3f0e80-5cb1-42ea-b96a-c836a8873a8d] terminated with exception: Job aborted due to stage failure: Task 0 in stage 22.0 failed 1 times, most recent failure: Lost task 0.0 in stage 22.0 (TID 39) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
FlatMapGroupsInPandasWithStateSuite.SPARK-40670: applyInPandasWithState - streaming having non-null columns: org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 39c0c136-1b94-4990-a33b-68a25dee3500, runId = b82cf789-345f-4fb0-bb71-114b1f03a842] terminated with exception: Job aborted due to stage failure: Task 1 in stage 24.0 failed 1 times, most recent failure: Lost task 1.0 in stage 24.0 (TID 42) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    (executor stack trace identical to the previous failure)
Driver stacktrace:
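Every failure above has the same proximate cause: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$. A NoClassDefFoundError of this form means the static initializer of ArrowUtils$ already failed once (as an ExceptionInInitializerError) on first use; every later reference then fails with this message, and only the very first occurrence carries the real root cause. A minimal diagnostic sketch, assuming the suites write per-module target/unit-tests.log files as the unit-tests-log-* artifacts of this run suggest; the step itself is hypothetical and not part of build_main.yml:

    # Hypothetical step: surface the first static-initializer failure,
    # which carries the root cause that later NoClassDefFoundErrors hide.
    - name: Locate ArrowUtils initialization failure
      if: failure()
      run: |
        # Print the first match per log file, with 40 lines of context.
        # The log path pattern is an assumption based on the artifact names.
        grep -Rn -m 1 -A 40 --include=unit-tests.log \
          "ExceptionInInitializerError" . || true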
Run / Check changes
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Breaking change detection with Buf (branch-3.4)
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Base image build
The following actions use a deprecated Node.js version and will be forced to run on node20: docker/login-action@v2, actions/checkout@v3, docker/setup-qemu-action@v2, docker/setup-buildx-action@v2, docker/build-push-action@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Scala 2.13 build with SBT
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: catalyst, hive-thriftserver
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Run Docker integration tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Java 11 build with Maven
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Linters, licenses, dependencies and documentation generation
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: sparkr
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: hive - slow tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-errors
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-errors
No files were found with the provided path: **/target/test-reports/*.xml. No artifacts will be uploaded.
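This warning comes from actions/upload-artifact when its path glob matches nothing, here presumably because the pyspark-errors job produced no JUnit XML reports. A sketch of the kind of upload step involved (the step name and artifact name are illustrative, not taken from build_main.yml); the if-no-files-found input controls whether an empty glob is a warning, an error, or silent:

    - name: Upload test results
      if: always()  # upload reports even when the tests themselves fail
      uses: actions/upload-artifact@v4
      with:
        name: test-results-pyspark-errors
        path: "**/target/test-reports/*.xml"
        # "warn" (the default) produces the annotation above; use "ignore"
        # when an empty report set is expected, or "error" to fail the job.
        if-no-files-found: warn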
Run / Build modules: sql - slow tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Run TPC-DS queries with SF=1
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: sql - other tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Java 17 build with Maven
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: sql - extended tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Run Spark on Kubernetes Integration test
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-pandas
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: hive - other tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-core, pyspark-streaming, pyspark-ml
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-pandas-slow
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-pandas-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Run / Build modules: pyspark-pandas-slow-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
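All of the Node.js warnings above share one remediation: bump each action to a release that runs on node20. A minimal sketch of the version bumps (the job and step layout is illustrative; only the versions are the point):

    steps:
      - uses: actions/checkout@v4        # was @v3 (node16)
      - uses: actions/cache@v4           # was @v3
      - uses: actions/setup-java@v4      # was @v3
        with:
          distribution: temurin
          java-version: 8
      - uses: actions/setup-python@v5    # was @v4
        with:
          python-version: "3.9"
      - uses: actions/upload-artifact@v4 # was @v3; see the notice below
    # For the base image build job, the node20-compatible releases are
    # docker/login-action@v3, docker/setup-qemu-action@v3,
    # docker/setup-buildx-action@v3, and docker/build-push-action@v5.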
Deprecation notice: v1, v2, and v3 of the artifact actions
All 23 artifacts produced by this run (the full list appears under Artifacts below) were uploaded using a version of actions/upload-artifact that is scheduled for deprecation. Please update your workflow to use v4 of the artifact actions. Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
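The v3-to-v4 migration is mostly a version bump, with one behavioral caveat: v4 artifacts are immutable and every artifact name must be unique within the run, which this workflow already satisfies since each name encodes the module list and build profile. A minimal sketch (the artifact name is taken from the list below; the step layout and path are illustrative):

    - uses: actions/upload-artifact@v4   # was @v3
      with:
        name: test-results-tpcds--8-hadoop3-hive2.3
        path: "**/target/test-reports/*.xml"
        retention-days: 7  # optional; artifacts expire after this window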

Artifacts

Produced during runtime
Name | Size | Status
test-results-catalyst, hive-thriftserver--8-hadoop3-hive2.3 | 2.84 MB | Expired
test-results-core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx--8-hadoop3-hive2.3 | 2.8 MB | Expired
test-results-docker-integration--8-hadoop3-hive2.3 | 157 KB | Expired
test-results-hive-- other tests-8-hadoop3-hive2.3 | 1.11 MB | Expired
test-results-hive-- slow tests-8-hadoop3-hive2.3 | 948 KB | Expired
test-results-pyspark-connect--8-hadoop3-hive2.3 | 589 KB | Expired
test-results-pyspark-core, pyspark-streaming, pyspark-ml--8-hadoop3-hive2.3 | 380 KB | Expired
test-results-pyspark-pandas--8-hadoop3-hive2.3 | 1.21 MB | Expired
test-results-pyspark-pandas-connect--8-hadoop3-hive2.3 | 1020 KB | Expired
test-results-pyspark-pandas-slow--8-hadoop3-hive2.3 | 1.53 MB | Expired
test-results-pyspark-pandas-slow-connect--8-hadoop3-hive2.3 | 1.23 MB | Expired
test-results-pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing--8-hadoop3-hive2.3 | 386 KB | Expired
test-results-sparkr--8-hadoop3-hive2.3 | 280 KB | Expired
test-results-sql-- extended tests-8-hadoop3-hive2.3 | 3.55 MB | Expired
test-results-sql-- other tests-8-hadoop3-hive2.3 | 4.7 MB | Expired
test-results-sql-- slow tests-8-hadoop3-hive2.3 | 3.44 MB | Expired
test-results-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 | 337 KB | Expired
test-results-tpcds--8-hadoop3-hive2.3 | 22.6 KB | Expired
unit-tests-log-catalyst, hive-thriftserver--8-hadoop3-hive2.3 | 8.54 MB | Expired
unit-tests-log-sql-- extended tests-8-hadoop3-hive2.3 | 510 MB | Expired
unit-tests-log-sql-- other tests-8-hadoop3-hive2.3 | 297 MB | Expired
unit-tests-log-sql-- slow tests-8-hadoop3-hive2.3 | 383 MB | Expired
unit-tests-log-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 | 248 MB | Expired