ZooKeeper upgrade to 3.9.3 #22
build_main.yml
on: push
Run / Check changes (1m 33s)
Run / Breaking change detection with Buf (branch-3.4) (57s)
Run / Scala 2.13 build with SBT (18m 4s)
Run / Run TPC-DS queries with SF=1 (53m 17s)
Run / Run Docker integration tests (32m 54s)
Run / Run Spark on Kubernetes Integration test (1h 3m)
Matrix: Run / build
Matrix: Run / java-11-17
Run / Build modules: sparkr (29m 21s)
Run / Linters, licenses, dependencies and documentation generation (25m 52s)
Matrix: Run / pyspark
Annotations
87 errors and 29 warnings
Run / Build modules: catalyst, hive-thriftserver
Process completed with exit code 18.
|
Run / Linters, licenses, dependencies and documentation generation
Process completed with exit code 1.
|
Run / Build modules: sql - slow tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$ (repeated 10 times)
|
Run / Build modules: sql - other tests
Uncaught exception in thread stdout writer for python3
|
Run / Build modules: sql - other tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$ (repeated 3 times)
|
Run / Build modules: sql - other tests
Process completed with exit code 18.
|
Run / Build modules: sql - extended tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$ (repeated 8 times)
|
Run / Build modules: sql - extended tests
Uncaught exception in thread stdout writer for python3 (repeated 2 times)
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-4e57599320a8c16d-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-fbc0189320a9a7e9-exec-1".
|
Run / Run Spark on Kubernetes Integration test
sleep interrupted
|
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/267958776@294ea2a6 rejected from java.util.concurrent.ThreadPoolExecutor@7b5d309a[Shutting down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 384]
|
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/267958776@24b5e12f rejected from java.util.concurrent.ThreadPoolExecutor@7b5d309a[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 385]
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-ed80019320bd4a97-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-8880a69320be381f-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-5ca6de9320c1f156-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, kind=pods, name=spark-test-app-75535461897e4659a2df334bfb6fa9dd-driver, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods "spark-test-app-75535461897e4659a2df334bfb6fa9dd-driver" not found, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=NotFound, status=Failure, additionalProperties={})..
|
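Nearly all of the 87 errors above collapse into two signatures with a single root cause: java.lang.NoSuchFieldError: chunkSize, thrown while Arrow's Netty-backed allocator initializes, and the follow-on "Could not initialize class org.apache.spark.sql.util.ArrowUtils$" everywhere that class is touched afterwards. The plausible (unconfirmed) trigger is the ZooKeeper 3.9.3 upgrade pulling a newer Netty onto the test classpath than this branch's arrow-memory-netty supports; the Kubernetes decommission failures ("Set() did not contain ...") look like environment flakes rather than part of this breakage. A minimal build.sbt sketch of one possible remediation, assuming the new Netty really does arrive transitively via ZooKeeper (the excluded module names are assumptions; confirm with sbt's dependencyTree output before relying on them):

// Hypothetical build.sbt sketch: keep ZooKeeper 3.9.3 but exclude its transitive
// Netty jars, letting the Netty version Spark already manages win on the classpath.
libraryDependencies += ("org.apache.zookeeper" % "zookeeper" % "3.9.3")
  .exclude("io.netty", "netty-handler")
  .exclude("io.netty", "netty-transport-native-epoll")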
ArrowUtilsSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/util/ArrowUtilsSuite#L40
sbt.ForkMain$ForkError: java.lang.NoSuchFieldError: chunkSize
|
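Every SQLQueryTestSuite failure that follows reports an empty schema (struct<>) plus a bare java.lang.NoClassDefFoundError. That pattern is ordinary JVM behavior once a static initializer has failed: the first touch of ArrowUtils$ throws ExceptionInInitializerError carrying the real cause (the chunkSize error above), and every subsequent touch throws NoClassDefFoundError with no cause attached, so only the first trace is diagnostic. A self-contained Scala sketch of the mechanism (names are illustrative, not from this build):

object Boom {
  // The object's static initializer fails exactly once.
  val x: Int = throw new RuntimeException("static init fails")
}

object InitOnce {
  def main(args: Array[String]): Unit = {
    // First access: ExceptionInInitializerError wrapping the real cause.
    try { Boom.x } catch { case e: Throwable => println(s"first:  $e") }
    // Later accesses: bare NoClassDefFoundError "Could not initialize class Boom$".
    try { Boom.x } catch { case e: Throwable => println(s"second: $e") }
  }
}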
SQLQueryTestSuite.udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<(a + b):int,b:int,udaf((a - b)):int>"), but got Some("struct<>") Schema did not match for query #1
SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE: -- !query
SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #1
select a, udaf(b) from data group by 1: -- !query
select a, udaf(b) from data group by 1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udaf(a):int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udaf(a), udaf(b) FROM testData: -- !query
SELECT udaf(a), udaf(b) FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:string,b:string,c:string,udaf(d):int>"), but got Some("struct<>") Schema did not match for query #1
SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (()): -- !query
SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (())
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<avg_1:double>"), but got Some("struct<>") Schema did not match for query #0
SELECT avg(udf(four)) AS avg_1 FROM onek: -- !query
SELECT avg(udf(four)) AS avg_1 FROM onek
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<min(udf(unique1)):int>"), but got Some("struct<>") Schema did not match for query #12
select min(udf(unique1)) from tenk1: -- !query
select min(udf(unique1)) from tenk1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<col:bigint>"), but got Some("struct<>") Schema did not match for query #1
select udf((select udf(count(*))
from (values (1)) t0(inner_c))) as col
from (values (2),(3)) t1(outer_c): -- !query
select udf((select udf(count(*))
from (values (1)) t0(inner_c))) as col
from (values (2),(3)) t1(outer_c)
-- !query schema
struct<>
-- !query output
java.util.concurrent.ExecutionException
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 32829.0 failed 1 times, most recent failure: Lost task 0.0 in stage 32829.0 (TID 30981) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonExec.evaluate(ArrowEvalPythonExec.scala:92)
at org.apache.spark.sql.execution.python.EvalPythonExec.$anonfun$doExecute$2(EvalPythonExec.scala:131)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
SQLQueryTestSuite.udf/postgreSQL/udf-case.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-case.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<One:string,Simple WHEN:int>"), but got Some("struct<>") Schema did not match for query #12
SELECT '3' AS `One`,
CASE
WHEN udf(1 < 2) THEN 3
END AS `Simple WHEN`: -- !query
SELECT '3' AS `One`,
CASE
WHEN udf(1 < 2) THEN 3
END AS `Simple WHEN`
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<xxx:string,udf(i):int,udf(j):int,udf(t):string>"), but got Some("struct<>") Schema did not match for query #28
SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t)
FROM J1_TBL AS tx: -- !query
SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t)
FROM J1_TBL AS tx
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(b):int,udf(c):string>"), but got Some("struct<>") Schema did not match for query #11
SELECT udf(b), udf(c) FROM test_having
GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c): -- !query
SELECT udf(b), udf(c) FROM test_having
GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(c):string,udf(count(1)):bigint>"), but got Some("struct<>") Schema did not match for query #11
SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY
udf(test_missing_target.c)
ORDER BY udf(c): -- !query
SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY
udf(test_missing_target.c)
ORDER BY udf(c)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-count.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-count.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(count(1)):bigint,udf(count(1)):bigint,udf(count(NULL)):bigint,udf(count(a)):bigint,udf(count(b)):bigint,udf(count((a + b))):bigint,udf(count(named_struct(a, a, b, b))):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT
udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b)))
FROM testData: -- !query
SELECT
udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b)))
FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-cross-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-cross-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,v1:int,k:string,v2:int>"), but got Some("struct<>") Schema did not match for query #3
SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k): -- !query
SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-except-all.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except-all.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(c1):int>"), but got Some("struct<>") Schema did not match for query #4
SELECT udf(c1) FROM tab1
EXCEPT ALL
SELECT udf(c1) FROM tab2: -- !query
SELECT udf(c1) FROM tab1
EXCEPT ALL
SELECT udf(c1) FROM tab2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-except.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(k):string,udf(v):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2: -- !query
SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-group-analytics.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-analytics.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf((a + b)):int,b:int,udf(sum((a - b))):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE: -- !query
SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-group-by.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-by.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<count(udf(a)):bigint,udf(count(b)):bigint>"), but got Some("struct<>") Schema did not match for query #2
SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData: -- !query
SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-having.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-having.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,udf(sum(v)):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2: -- !query
SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-inline-table.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inline-table.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(col1):string,udf(col2):int>"), but got Some("struct<>") Schema did not match for query #0
select udf(col1), udf(col2) from values ("one", 1): -- !query
select udf(col1), udf(col2) from values ("one", 1)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-inner-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inner-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:int,tag:string>"), but got Some("struct<>") Schema did not match for query #6
SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag: -- !query
SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-intersect-all.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-intersect-all.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(k):int,v:int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(k), v FROM tab1
INTERSECT ALL
SELECT k, udf(v) FROM tab2: -- !query
SELECT udf(k), v FROM tab1
INTERSECT ALL
SELECT k, udf(v) FROM tab2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-join-empty-relation.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-join-empty-relation.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(udf(a)):int,a:int>"), but got Some("struct<>") Schema did not match for query #5
SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a)): -- !query
SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a))
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-natural-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-natural-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,v1:int,v2:int>"), but got Some("struct<>") Schema did not match for query #2
SELECT * FROM nt1 natural join nt2 where udf(k) = "one": -- !query
SELECT * FROM nt1 natural join nt2 where udf(k) = "one"
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-outer-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-outer-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(sum(udf(coalesce(int_col1, int_col0)))):bigint,(udf(coalesce(int_col1, int_col0)) * 2):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT
(udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))),
(udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
FROM t1
RIGHT JOIN t2
ON udf(t2.int_col0) = udf(t1.int_col1)
GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))),
COALESCE(t1.int_col1, t2.int_col0)
HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0)))))
> (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2): -- !query
SELECT
(udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))),
(udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
FROM t1
RIGHT JOIN t2
ON udf(t2.int_col0) = udf(t1.int_col1)
GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))),
COALESCE(t1.int_col1, t2.int_col0)
HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0)))))
> (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-pivot.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-pivot.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(year):int,dotNET:bigint,Java:bigint>"), but got Some("struct<>") Schema did not match for query #3
SELECT * FROM (
SELECT udf(year), course, earnings FROM courseSales
)
PIVOT (
udf(sum(earnings))
FOR course IN ('dotNET', 'Java')
): -- !query
SELECT * FROM (
SELECT udf(year), course, earnings FROM courseSales
)
PIVOT (
udf(sum(earnings))
FOR course IN ('dotNET', 'Java')
)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-special-values.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-special-values.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(x):int>"), but got Some("struct<>") Schema did not match for query #0
SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x): -- !query
SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-udaf.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-udaf.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<my_avg:double,my_avg2:double,my_avg3:double>"), but got Some("struct<>") Schema did not match for query #2
SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1: -- !query
SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-union.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-union.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<c1:int,c2:string>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(c1) as c1, udf(c2) as c2
FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1
UNION ALL
SELECT udf(c1) as c1, udf(c2) as c2 FROM t1): -- !query
SELECT udf(c1) as c1, udf(c2) as c2
FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1
UNION ALL
SELECT udf(c1) as c1, udf(c2) as c2 FROM t1)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-window.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-window.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(val):int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY udf(val) ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData
ORDER BY cate, udf(val): -- !query
SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData
ORDER BY cate, udf(val)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
ColumnarBatchSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite#L1585
sbt.ForkMain$ForkError: java.lang.NoSuchFieldError: chunkSize
|
ArrowConvertersSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/arrow/ArrowConvertersSuite#L1435
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
ArrowWriterSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/arrow/ArrowWriterSuite#L75
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
PythonUDFSuite.SPARK-39962: Global aggregation of Pandas UDF should respect the column order:
org/apache/spark/sql/execution/python/PythonUDFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 38.0 failed 1 times, most recent failure: Lost task 0.0 in stage 38.0 (TID 37) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.AggregateInPandasExec.$anonfun$doExecute$8(AggregateInPandasExec.scala:176)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
PythonUDTFSuite.Arrow optimized UDTF:
org/apache/spark/sql/execution/python/PythonUDTFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 16) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34)
at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73)
at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52)
at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158)
at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
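The PythonUDTFSuite trace above gives the clearest view of the failure chain: io.netty.buffer.PooledByteBufAllocatorL (shipped in arrow-memory-netty) trips NoSuchFieldError: chunkSize while NettyAllocationManager and BaseAllocator are class-initialized, so ArrowUtils$ never finishes loading. Since the chain begins inside Arrow's allocator, the incompatibility should be reproducible without Spark at all; a sketch, assuming the same arrow-memory-netty and Netty jars from this build are on the classpath:

// Standalone repro sketch: constructing any Arrow allocator class-loads BaseAllocator,
// whose static default config resolves NettyAllocationManager, the exact spot where
// NoSuchFieldError: chunkSize surfaces when the Netty on the classpath is too new.
import org.apache.arrow.memory.RootAllocator

object ArrowInitRepro {
  def main(args: Array[String]): Unit = {
    val allocator = new RootAllocator(Long.MaxValue)
    println(s"Arrow allocator initialized, limit = ${allocator.getLimit}")
    allocator.close()
  }
}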
PythonUDTFSuite.arrow optimized UDTF with lateral join:
org/apache/spark/sql/execution/python/PythonUDTFSuite#L90
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 15.0 failed 1 times, most recent failure: Lost task 0.0 in stage 15.0 (TID 19) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73)
at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52)
at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158)
at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
ArrowColumnVectorSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/vectorized/ArrowColumnVectorSuite#L31
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
FlatMapGroupsInPandasWithStateDistributionSuite.applyInPandasWithState should require StatefulOpClusteredDistribution from children - without initial state:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateDistributionSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 5252f80c-607b-40cc-a9e0-87cf3e96cecd, runId = 93a0a0f8-d7d3-48a5-ac5f-5798a09981d5] terminated with exception: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 7) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34)
at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 927d8ac8-8cc4-43e4-b7ce-2779be25f287, runId = 3a78887e-f53e-4672-9dcb-87f7fc1fdf16] terminated with exception: Job aborted due to stage failure: Task 1 in stage 1.0 failed 1 times, most recent failure: Lost task 1.0 in stage 1.0 (TID 2) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
[stack frames identical to the FlatMapGroupsInPandasWithStateDistributionSuite trace above, from PythonArrowOutput$$anon$1.<init> through Thread.run]
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming, multiple groups in partition, multiple outputs per grouping key:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = db828e32-826e-4f1c-b32f-2a64c43c2764, runId = ef6159f7-7887-4ecb-8e22-536ec9c0b982] terminated with exception: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 5) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
[stack frames identical to the FlatMapGroupsInPandasWithStateDistributionSuite trace above, from PythonArrowOutput$$anon$1.<init> through Thread.run]
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming + aggregation:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = dfa8731a-8bd7-4e71-8ae6-db5d4a7f7686, runId = 4a0396e1-634d-4260-b95c-56c6808a9d33] terminated with exception: Job aborted due to stage failure: Task 1 in stage 5.0 failed 1 times, most recent failure: Lost task 1.0 in stage 5.0 (TID 8) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming with processing time timeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 79fe0515-369c-47de-a7e5-6cfdac166756, runId = f7f13073-159e-4ac4-93ff-f4d3892da4bd] terminated with exception: Job aborted due to stage failure: Task 1 in stage 8.0 failed 1 times, most recent failure: Lost task 1.0 in stage 8.0 (TID 12) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
[stack frames identical to the FlatMapGroupsInPandasWithStateDistributionSuite trace above, from PythonArrowOutput$$anon$1.<init> through Thread.run]
Driver stacktrace:
|
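Every failure in this suite carries the same signature: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$. The JVM uses this exact wording only for the second and later references to a class whose static initializer has already failed; the very first reference instead throws ExceptionInInitializerError with the real root cause attached, which is why the underlying cause does not appear in any of these traces. A minimal Scala sketch of that JVM behavior (Broken and InitFailureDemo are hypothetical stand-ins, not Spark code):

```scala
// Broken is a hypothetical stand-in, not Spark code.
object Broken {
  // A Scala object's constructor body runs inside the static initializer of
  // the generated Broken$ class, so this throw fails class initialization.
  throw new RuntimeException("static init failed")
}

object InitFailureDemo {
  def main(args: Array[String]): Unit = {
    // First reference: java.lang.ExceptionInInitializerError,
    // with the RuntimeException above attached as the root cause.
    try println(Broken) catch { case t: Throwable => println(s"first:  $t") }
    // Every later reference: java.lang.NoClassDefFoundError:
    // Could not initialize class Broken$ -- the same shape as the
    // ArrowUtils$ errors in this log.
    try println(Broken) catch { case t: Throwable => println(s"second: $t") }
  }
}
```

ArrowUtils is itself a Scala object, which is why the failing class is reported with a trailing $ (ArrowUtils$): the object's initialization code runs in the static initializer of the generated ArrowUtils$ class.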
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=true:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 669a0c45-1579-4e8a-8f21-3b36b6491d08, runId = 40e27a92-e4e9-469a-af7a-6965fb177535] terminated with exception: Job aborted due to stage failure: Task 1 in stage 10.0 failed 1 times, most recent failure: Lost task 1.0 in stage 10.0 (TID 16) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=false:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 22804abd-797c-4287-87a3-0739c0f0747f, runId = e8f47582-9b09-4ca5-90fd-71e28894b995] terminated with exception: Job aborted due to stage failure: Task 1 in stage 12.0 failed 1 times, most recent failure: Lost task 1.0 in stage 12.0 (TID 20) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = NoTimeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = d31422d2-571d-486b-911a-fdf76b05daaf, runId = 1f20e80f-9042-4572-b59a-50499bcf201d] terminated with exception: Job aborted due to stage failure: Task 0 in stage 14.0 failed 1 times, most recent failure: Lost task 0.0 in stage 14.0 (TID 23) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = ProcessingTimeTimeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = f376ca1f-950a-4416-9e94-abc9d3dc3dec, runId = a0f108b1-30dc-494c-aa71-61d10c8cee5c] terminated with exception: Job aborted due to stage failure: Task 1 in stage 16.0 failed 1 times, most recent failure: Lost task 1.0 in stage 16.0 (TID 29) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - uses state format version 2 by default:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 01b9bcd4-4e21-419e-81cb-46a6c6b4e20d, runId = a568a7c9-d0e5-44c2-a46e-4e3786ca0c7c] terminated with exception: Job aborted due to stage failure: Task 1 in stage 18.0 failed 1 times, most recent failure: Lost task 1.0 in stage 18.0 (TID 34) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - arrow RecordBatch size with chunking:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 56444c14-cdcb-4ee4-a0d2-c0df328e7b1d, runId = 7907aa1f-c0a3-4c23-8277-57183074e09e] terminated with exception: Job aborted due to stage failure: Task 0 in stage 20.0 failed 1 times, most recent failure: Lost task 0.0 in stage 20.0 (TID 37) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - partial consume of iterator in user function:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 7badf2c8-5757-42df-9738-f3cdeb085dea, runId = 3d3f0e80-5cb1-42ea-b96a-c836a8873a8d] terminated with exception: Job aborted due to stage failure: Task 0 in stage 22.0 failed 1 times, most recent failure: Lost task 0.0 in stage 22.0 (TID 39) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
FlatMapGroupsInPandasWithStateSuite.SPARK-40670: applyInPandasWithState - streaming having non-null columns:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 39c0c136-1b94-4990-a33b-68a25dee3500, runId = b82cf789-345f-4fb0-bb71-114b1f03a842] terminated with exception: Job aborted due to stage failure: Task 1 in stage 24.0 failed 1 times, most recent failure: Lost task 1.0 in stage 24.0 (TID 42) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical stack trace omitted; same frames as the first failure above)
|
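Only the follow-up NoClassDefFoundError is captured in these annotations, so the underlying reason ArrowUtils$ failed to initialize (after a dependency bump such as this ZooKeeper upgrade, typically a missing or binary-incompatible jar on the test classpath) has to be recovered separately. One way is to force initialization of the suspect class in isolation on the same classpath; the sketch below uses only standard JDK reflection and is an assumed diagnostic approach, not part of the Spark build:

```scala
object ForceInit {
  def main(args: Array[String]): Unit = {
    try {
      // initialize = true runs the static initializer immediately, reproducing
      // the first-touch failure instead of the later
      // "Could not initialize class" follow-ups seen in this log.
      Class.forName("org.apache.spark.sql.util.ArrowUtils$", true,
        getClass.getClassLoader)
      println("ArrowUtils$ initialized OK")
    } catch {
      case e: ExceptionInInitializerError =>
        println(s"static initializer failed, root cause: ${e.getCause}")
        e.getCause.printStackTrace()
      case e: NoClassDefFoundError =>
        println(s"class or one of its dependencies is missing: $e")
    }
  }
}
```

For the result to be meaningful, run this with the failing module's test classpath (for example from an sbt console in the sql module) so the same jars are visible.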
Run / (all jobs)
The following actions use a deprecated Node.js version and will be forced to run on node20. Every job lists actions/checkout@v3; depending on the job, the list also includes actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3, and, for the base image build, docker/login-action@v2, docker/setup-qemu-action@v2, docker/setup-buildx-action@v2, docker/build-push-action@v3. The same warning is emitted by all 27 jobs in this run. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-errors
No files were found with the provided path: **/target/test-reports/*.xml. No artifacts will be uploaded.
|
Deprecation notice: v1, v2, and v3 of the artifact actions
The following artifacts were uploaded using a version of actions/upload-artifact that is scheduled for deprecation: all 23 artifacts listed in the Artifacts table below.
Please update your workflow to use v4 of the artifact actions.
Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
|
Artifacts
Produced during runtime
Name | Status | Size
---|---|---
test-results-catalyst, hive-thriftserver--8-hadoop3-hive2.3 | Expired | 2.84 MB
test-results-core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx--8-hadoop3-hive2.3 | Expired | 2.8 MB
test-results-docker-integration--8-hadoop3-hive2.3 | Expired | 157 KB
test-results-hive-- other tests-8-hadoop3-hive2.3 | Expired | 1.11 MB
test-results-hive-- slow tests-8-hadoop3-hive2.3 | Expired | 948 KB
test-results-pyspark-connect--8-hadoop3-hive2.3 | Expired | 589 KB
test-results-pyspark-core, pyspark-streaming, pyspark-ml--8-hadoop3-hive2.3 | Expired | 380 KB
test-results-pyspark-pandas--8-hadoop3-hive2.3 | Expired | 1.21 MB
test-results-pyspark-pandas-connect--8-hadoop3-hive2.3 | Expired | 1020 KB
test-results-pyspark-pandas-slow--8-hadoop3-hive2.3 | Expired | 1.53 MB
test-results-pyspark-pandas-slow-connect--8-hadoop3-hive2.3 | Expired | 1.23 MB
test-results-pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing--8-hadoop3-hive2.3 | Expired | 386 KB
test-results-sparkr--8-hadoop3-hive2.3 | Expired | 280 KB
test-results-sql-- extended tests-8-hadoop3-hive2.3 | Expired | 3.55 MB
test-results-sql-- other tests-8-hadoop3-hive2.3 | Expired | 4.7 MB
test-results-sql-- slow tests-8-hadoop3-hive2.3 | Expired | 3.44 MB
test-results-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 | Expired | 337 KB
test-results-tpcds--8-hadoop3-hive2.3 | Expired | 22.6 KB
unit-tests-log-catalyst, hive-thriftserver--8-hadoop3-hive2.3 | Expired | 8.54 MB
unit-tests-log-sql-- extended tests-8-hadoop3-hive2.3 | Expired | 510 MB
unit-tests-log-sql-- other tests-8-hadoop3-hive2.3 | Expired | 297 MB
unit-tests-log-sql-- slow tests-8-hadoop3-hive2.3 | Expired | 383 MB
unit-tests-log-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 | Expired | 248 MB