upgrade netty #21
build_main.yml
on: push
Run / Check changes (34s)
Run / Breaking change detection with Buf (branch-3.4) (51s)
Run / Scala 2.13 build with SBT (18m 33s)
Run / Run TPC-DS queries with SF=1 (53m 16s)
Run / Run Docker integration tests (31m 15s)
Run / Run Spark on Kubernetes Integration test (58m 31s)
Matrix: Run / build
Matrix: Run / java-11-17
Run / Build modules: sparkr (29m 20s)
Run / Linters, licenses, dependencies and documentation generation (4m 48s)
Matrix: Run / pyspark
Annotations
88 errors and 29 warnings
Run / Linters, licenses, dependencies and documentation generation
Process completed with exit code 1.
|
Run / Build modules: catalyst, hive-thriftserver
Process completed with exit code 18.
|
Run / Build modules: sql - slow tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical annotation repeated 10 times)
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-1baf0793209f349d-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-de40979320a00ed3-exec-1".
|
Run / Run Spark on Kubernetes Integration test
sleep interrupted
|
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/698485335@6de79405 rejected from java.util.concurrent.ThreadPoolExecutor@659bbcc4[Shutting down, pool size = 2, active threads = 2, queued tasks = 0, completed tasks = 371]
|
Run / Run Spark on Kubernetes Integration test
sleep interrupted
|
Run / Run Spark on Kubernetes Integration test
Task io.fabric8.kubernetes.client.utils.internal.SerialExecutor$$Lambda$544/698485335@1d080986 rejected from java.util.concurrent.ThreadPoolExecutor@659bbcc4[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 372]
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-a3290e9320b28bfb-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-0f773e9320b36bbc-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Set() did not contain "decomtest-4632639320b6febc-exec-1".
|
Run / Run Spark on Kubernetes Integration test
Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, kind=pods, name=spark-test-app-9f58430bef5844fa931b8ea147036381-driver, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods "spark-test-app-9f58430bef5844fa931b8ea147036381-driver" not found, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=NotFound, status=Failure, additionalProperties={})..
|
Run / Build modules: sql - other tests
Uncaught exception in thread stdout writer for python3
|
Run / Build modules: sql - other tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical annotation repeated 3 times)
|
Run / Build modules: sql - other tests
Process completed with exit code 18.
|
Run / Build modules: sql - extended tests
Uncaught exception in thread stdout writer for python3
|
Run / Build modules: sql - extended tests
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
(identical annotation repeated 9 times)
|
ArrowUtilsSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/util/ArrowUtilsSuite#L40
sbt.ForkMain$ForkError: java.lang.NoSuchFieldError: chunkSize
|
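Most of the annotations that follow repeat "Could not initialize class org.apache.spark.sql.util.ArrowUtils$", while only a handful (ArrowUtilsSuite above, and the full stack traces further down) show the underlying java.lang.NoSuchFieldError: chunkSize. That pattern is standard JVM class-initialization behavior: a Scala object's static initializer runs exactly once, the first caller sees the real failure, and every later caller gets NoClassDefFoundError for the half-initialized class. A minimal, self-contained sketch of that behavior (names here are hypothetical, not Spark code):

object Fragile {
  // Stands in for ArrowUtils$ touching a Netty field that no longer exists.
  val v: Int = sys.error("simulated NoSuchFieldError: chunkSize")
}

object ClinitDemo extends App {
  def touch(): Unit =
    try { println(Fragile.v) }
    catch { case t: Throwable => println(t) }

  touch() // first use: java.lang.ExceptionInInitializerError wrapping the real cause
  touch() // later uses: java.lang.NoClassDefFoundError: Could not initialize class Fragile$
}
|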
SQLQueryTestSuite.udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-analytics.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<(a + b):int,b:int,udaf((a - b)):int>"), but got Some("struct<>") Schema did not match for query #1
SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE: -- !query
SELECT a + b, b, udaf(a - b) FROM testData GROUP BY a + b, b WITH CUBE
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by-ordinal.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #1
select a, udaf(b) from data group by 1: -- !query
select a, udaf(b) from data group by 1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-group-by.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udaf(a):int,udaf(b):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udaf(a), udaf(b) FROM testData: -- !query
SELECT udaf(a), udaf(b) FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udaf/udaf-grouping-set.sql - Grouped Aggregate Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:string,b:string,c:string,udaf(d):int>"), but got Some("struct<>") Schema did not match for query #1
SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (()): -- !query
SELECT a, b, c, udaf(d) FROM grouping GROUP BY a, b, c GROUPING SETS (())
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part1.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<avg_1:double>"), but got Some("struct<>") Schema did not match for query #0
SELECT avg(udf(four)) AS avg_1 FROM onek: -- !query
SELECT avg(udf(four)) AS avg_1 FROM onek
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part2.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<min(udf(unique1)):int>"), but got Some("struct<>") Schema did not match for query #12
select min(udf(unique1)) from tenk1: -- !query
select min(udf(unique1)) from tenk1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-aggregates_part3.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<col:bigint>"), but got Some("struct<>") Schema did not match for query #1
select udf((select udf(count(*))
from (values (1)) t0(inner_c))) as col
from (values (2),(3)) t1(outer_c): -- !query
select udf((select udf(count(*))
from (values (1)) t0(inner_c))) as col
from (values (2),(3)) t1(outer_c)
-- !query schema
struct<>
-- !query output
java.util.concurrent.ExecutionException
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 32835.0 failed 1 times, most recent failure: Lost task 0.0 in stage 32835.0 (TID 30981) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonExec.evaluate(ArrowEvalPythonExec.scala:92)
at org.apache.spark.sql.execution.python.EvalPythonExec.$anonfun$doExecute$2(EvalPythonExec.scala:131)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
SQLQueryTestSuite.udf/postgreSQL/udf-case.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-case.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<One:string,Simple WHEN:int>"), but got Some("struct<>") Schema did not match for query #12
SELECT '3' AS `One`,
CASE
WHEN udf(1 < 2) THEN 3
END AS `Simple WHEN`: -- !query
SELECT '3' AS `One`,
CASE
WHEN udf(1 < 2) THEN 3
END AS `Simple WHEN`
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<xxx:string,udf(i):int,udf(j):int,udf(t):string>"), but got Some("struct<>") Schema did not match for query #28
SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t)
FROM J1_TBL AS tx: -- !query
SELECT udf('') AS `xxx`, udf(i), udf(j), udf(t)
FROM J1_TBL AS tx
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(b):int,udf(c):string>"), but got Some("struct<>") Schema did not match for query #11
SELECT udf(b), udf(c) FROM test_having
GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c): -- !query
SELECT udf(b), udf(c) FROM test_having
GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/postgreSQL/udf-select_implicit.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(c):string,udf(count(1)):bigint>"), but got Some("struct<>") Schema did not match for query #11
SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY
udf(test_missing_target.c)
ORDER BY udf(c): -- !query
SELECT udf(c), udf(count(*)) FROM test_missing_target GROUP BY
udf(test_missing_target.c)
ORDER BY udf(c)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-count.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-count.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(count(1)):bigint,udf(count(1)):bigint,udf(count(NULL)):bigint,udf(count(a)):bigint,udf(count(b)):bigint,udf(count((a + b))):bigint,udf(count(named_struct(a, a, b, b))):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT
udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b)))
FROM testData: -- !query
SELECT
udf(count(*)), udf(count(1)), udf(count(null)), udf(count(a)), udf(count(b)), udf(count(a + b)), udf(count((a, b)))
FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-cross-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-cross-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,v1:int,k:string,v2:int>"), but got Some("struct<>") Schema did not match for query #3
SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k): -- !query
SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-except-all.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except-all.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(c1):int>"), but got Some("struct<>") Schema did not match for query #4
SELECT udf(c1) FROM tab1
EXCEPT ALL
SELECT udf(c1) FROM tab2: -- !query
SELECT udf(c1) FROM tab1
EXCEPT ALL
SELECT udf(c1) FROM tab2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-except.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-except.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(k):string,udf(v):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2: -- !query
SELECT udf(k), udf(v) FROM t1 EXCEPT SELECT udf(k), udf(v) FROM t2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-group-analytics.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-analytics.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf((a + b)):int,b:int,udf(sum((a - b))):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE: -- !query
SELECT udf(a + b), b, udf(SUM(a - b)) FROM testData GROUP BY udf(a + b), b WITH CUBE
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-group-by.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-group-by.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<count(udf(a)):bigint,udf(count(b)):bigint>"), but got Some("struct<>") Schema did not match for query #2
SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData: -- !query
SELECT COUNT(udf(a)), udf(COUNT(b)) FROM testData
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-having.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-having.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,udf(sum(v)):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2: -- !query
SELECT udf(k) AS k, udf(sum(v)) FROM hav GROUP BY k HAVING udf(sum(v)) > 2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-inline-table.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inline-table.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(col1):string,udf(col2):int>"), but got Some("struct<>") Schema did not match for query #0
select udf(col1), udf(col2) from values ("one", 1): -- !query
select udf(col1), udf(col2) from values ("one", 1)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-inner-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-inner-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<a:int,tag:string>"), but got Some("struct<>") Schema did not match for query #6
SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag: -- !query
SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-intersect-all.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-intersect-all.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(k):int,v:int>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(k), v FROM tab1
INTERSECT ALL
SELECT k, udf(v) FROM tab2: -- !query
SELECT udf(k), v FROM tab1
INTERSECT ALL
SELECT k, udf(v) FROM tab2
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-join-empty-relation.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-join-empty-relation.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(udf(a)):int,a:int>"), but got Some("struct<>") Schema did not match for query #5
SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a)): -- !query
SELECT udf(udf(t1.a)), empty_table.a FROM t1 LEFT OUTER JOIN empty_table ON (udf(t1.a) = udf(empty_table.a))
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-natural-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-natural-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<k:string,v1:int,v2:int>"), but got Some("struct<>") Schema did not match for query #2
SELECT * FROM nt1 natural join nt2 where udf(k) = "one": -- !query
SELECT * FROM nt1 natural join nt2 where udf(k) = "one"
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-outer-join.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-outer-join.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(sum(udf(coalesce(int_col1, int_col0)))):bigint,(udf(coalesce(int_col1, int_col0)) * 2):int>"), but got Some("struct<>") Schema did not match for query #2
SELECT
(udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))),
(udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
FROM t1
RIGHT JOIN t2
ON udf(t2.int_col0) = udf(t1.int_col1)
GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))),
COALESCE(t1.int_col1, t2.int_col0)
HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0)))))
> (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2): -- !query
SELECT
(udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))),
(udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
FROM t1
RIGHT JOIN t2
ON udf(t2.int_col0) = udf(t1.int_col1)
GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))),
COALESCE(t1.int_col1, t2.int_col0)
HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0)))))
> (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-pivot.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-pivot.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(year):int,dotNET:bigint,Java:bigint>"), but got Some("struct<>") Schema did not match for query #3
SELECT * FROM (
SELECT udf(year), course, earnings FROM courseSales
)
PIVOT (
udf(sum(earnings))
FOR course IN ('dotNET', 'Java')
): -- !query
SELECT * FROM (
SELECT udf(year), course, earnings FROM courseSales
)
PIVOT (
udf(sum(earnings))
FOR course IN ('dotNET', 'Java')
)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-special-values.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-special-values.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(x):int>"), but got Some("struct<>") Schema did not match for query #0
SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x): -- !query
SELECT udf(x) FROM (VALUES (1), (2), (NULL)) v(x)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-udaf.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-udaf.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<my_avg:double,my_avg2:double,my_avg3:double>"), but got Some("struct<>") Schema did not match for query #2
SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1: -- !query
SELECT default.myDoubleAvg(udf(int_col1)) as my_avg, udf(default.myDoubleAvg(udf(int_col1))) as my_avg2, udf(default.myDoubleAvg(int_col1)) as my_avg3 from t1
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-union.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-union.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<c1:int,c2:string>"), but got Some("struct<>") Schema did not match for query #2
SELECT udf(c1) as c1, udf(c2) as c2
FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1
UNION ALL
SELECT udf(c1) as c1, udf(c2) as c2 FROM t1): -- !query
SELECT udf(c1) as c1, udf(c2) as c2
FROM (SELECT udf(c1) as c1, udf(c2) as c2 FROM t1
UNION ALL
SELECT udf(c1) as c1, udf(c2) as c2 FROM t1)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
SQLQueryTestSuite.udf/udf-window.sql - Scalar Pandas UDF:
org/apache/spark/sql/SQLQueryTestSuite#L346
org.scalatest.exceptions.TestFailedException: udf/udf-window.sql - Scalar Pandas UDF
Python: 3.8 Pandas: 2.0.3 PyArrow: 12.0.1
Expected Some("struct<udf(val):int,cate:string,count(val) OVER (PARTITION BY cate ORDER BY udf(val) ASC NULLS FIRST ROWS BETWEEN CURRENT ROW AND CURRENT ROW):bigint>"), but got Some("struct<>") Schema did not match for query #1
SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData
ORDER BY cate, udf(val): -- !query
SELECT udf(val), cate, count(val) OVER(PARTITION BY cate ORDER BY udf(val) ROWS CURRENT ROW) FROM testData
ORDER BY cate, udf(val)
-- !query schema
struct<>
-- !query output
java.lang.NoClassDefFoundError
Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
ColumnarBatchSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite#L1585
sbt.ForkMain$ForkError: java.lang.NoSuchFieldError: chunkSize
|
ArrowConvertersSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/arrow/ArrowConvertersSuite#L1435
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
ArrowWriterSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/execution/arrow/ArrowWriterSuite#L75
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
PythonUDFSuite.SPARK-39962: Global aggregation of Pandas UDF should respect the column order:
org/apache/spark/sql/execution/python/PythonUDFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 38.0 failed 1 times, most recent failure: Lost task 0.0 in stage 38.0 (TID 37) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonRunner.newReaderIterator(ArrowPythonRunner.scala:30)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.AggregateInPandasExec.$anonfun$doExecute$8(AggregateInPandasExec.scala:176)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:893)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:893)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
PythonUDTFSuite.Arrow optimized UDTF:
org/apache/spark/sql/execution/python/PythonUDTFSuite#L85
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 16) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34)
at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73)
at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52)
at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158)
at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
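The PythonUDTFSuite trace above shows the whole causal chain in one place: io.netty.buffer.PooledByteBufAllocatorL (Arrow's arrow-memory-netty shim over Netty's pooled allocator) trips NoSuchFieldError: chunkSize, which aborts the static initialization of NettyAllocationManager, BaseAllocator, and finally org.apache.spark.sql.util.ArrowUtils$. A candidate Netty/Arrow pairing can be checked without running the full suite by forcing that same initialization path directly; a sketch, assuming arrow-memory-netty is on the classpath as in these tests:

import org.apache.arrow.memory.RootAllocator

object ArrowNettyProbe extends App {
  // Constructing a RootAllocator walks the same path as the trace above
  // (BaseAllocator.<clinit> -> DefaultAllocationManagerOption -> NettyAllocationManager),
  // so an incompatible Netty fails here with NoSuchFieldError: chunkSize.
  val allocator = new RootAllocator(Long.MaxValue)
  val buf = allocator.buffer(64) // allocate to exercise the Netty-backed manager
  buf.close()
  allocator.close()
  println("Arrow/Netty allocation OK")
}
|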
PythonUDTFSuite.arrow optimized UDTF with lateral join:
org/apache/spark/sql/execution/python/PythonUDTFSuite#L90
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 15.0 failed 1 times, most recent failure: Lost task 1.0 in stage 15.0 (TID 20) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ArrowPythonUDTFRunner.newReaderIterator(ArrowPythonUDTFRunner.scala:33)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.ArrowEvalPythonUDTFExec.evaluate(ArrowEvalPythonUDTFExec.scala:73)
at org.apache.spark.sql.execution.python.EvalPythonUDTFExec.$anonfun$doExecute$2(EvalPythonUDTFExec.scala:96)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.sql.execution.SQLExecutionRDD.$anonfun$compute$1(SQLExecutionRDD.scala:52)
at org.apache.spark.sql.internal.SQLConf$.withExistingConf(SQLConf.scala:158)
at org.apache.spark.sql.execution.SQLExecutionRDD.compute(SQLExecutionRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
ArrowColumnVectorSuite.(It is not a test it is a sbt.testing.SuiteSelector):
org/apache/spark/sql/vectorized/ArrowColumnVectorSuite#L31
sbt.ForkMain$ForkError: java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
|
FlatMapGroupsInPandasWithStateDistributionSuite.applyInPandasWithState should require StatefulOpClusteredDistribution from children - without initial state:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateDistributionSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = ae86d4ae-8a48-4e77-a116-4ac148e3b7c1, runId = 4eeae99d-09b9-4cba-9b87-9b5955cae154] terminated with exception: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 7) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize
at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getFactory(DefaultAllocationManagerOption.java:108)
at org.apache.arrow.memory.DefaultAllocationManagerOption.getDefaultAllocationManagerFactory(DefaultAllocationManagerOption.java:98)
at org.apache.arrow.memory.BaseAllocator$Config.getAllocationManagerFactory(BaseAllocator.java:772)
at org.apache.arrow.memory.ImmutableConfig.access$801(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$InitShim.getAllocationManagerFactory(ImmutableConfig.java:83)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:47)
at org.apache.arrow.memory.ImmutableConfig.<init>(ImmutableConfig.java:24)
at org.apache.arrow.memory.ImmutableConfig$Builder.build(ImmutableConfig.java:485)
at org.apache.arrow.memory.BaseAllocator.<clinit>(BaseAllocator.java:61)
at org.apache.spark.sql.util.ArrowUtils$.<init>(ArrowUtils.scala:34)
at org.apache.spark.sql.util.ArrowUtils$.<clinit>(ArrowUtils.scala)
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
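The Arrow-related failures in this run all trace back to arrow-memory-netty being compiled against older Netty buffer internals, so the usual remedies are either upgrading Arrow to a release built for the new Netty or pinning Netty to the version Arrow expects until such a release is available. A hedged build.sbt sketch of the pinning approach (the artifact list and version are illustrative, not taken from this build):

// build.sbt (illustrative): keep every Netty artifact at one version so that
// arrow-memory-netty and Spark's transport layer see the same internals.
val nettyVersion = "4.1.96.Final" // placeholder: use the version Arrow was built against

dependencyOverrides ++= Seq(
  "io.netty" % "netty-buffer"  % nettyVersion,
  "io.netty" % "netty-common"  % nettyVersion,
  "io.netty" % "netty-handler" % nettyVersion
)
|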
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = e085d587-3be1-4775-a775-3d8617b0ad95, runId = 9d1d9bd8-c728-444f-a73c-8823bbcb0c4a] terminated with exception: Job aborted due to stage failure: Task 1 in stage 1.0 failed 1 times, most recent failure: Lost task 1.0 in stage 1.0 (TID 2) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming, multiple groups in partition, multiple outputs per grouping key:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 0fe32053-0c18-49fd-afc5-bb332c1cd583, runId = 9e7a94b2-38e7-48d4-a6a9-d3d6d50ac834] terminated with exception: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 5) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming + aggregation:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 820a9785-c93a-4141-810f-53eb20267928, runId = 00b8338b-7816-4bc3-8a74-b1bd640e563d] terminated with exception: Job aborted due to stage failure: Task 1 in stage 5.0 failed 1 times, most recent failure: Lost task 1.0 in stage 5.0 (TID 8) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming with processing time timeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 17739164-3032-451b-b794-cac2f5fb1d96, runId = aa2e9c6a-2a99-4d19-9887-fa935554cf81] terminated with exception: Job aborted due to stage failure: Task 1 in stage 8.0 failed 1 times, most recent failure: Lost task 1.0 in stage 8.0 (TID 12) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.<init>(PythonArrowOutput.scala:60)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator(PythonArrowOutput.scala:57)
at org.apache.spark.sql.execution.python.PythonArrowOutput.newReaderIterator$(PythonArrowOutput.scala:47)
at org.apache.spark.sql.execution.python.ApplyInPandasWithStatePythonRunner.newReaderIterator(ApplyInPandasWithStatePythonRunner.scala:53)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:212)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.process(FlatMapGroupsInPandasWithStateExec.scala:195)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec$$anon$1.processNewData(FlatMapGroupsInPandasWithStateExec.scala:133)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition(FlatMapGroupsWithStateExec.scala:144)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.processDataWithPartition$(FlatMapGroupsWithStateExec.scala:119)
at org.apache.spark.sql.execution.python.FlatMapGroupsInPandasWithStateExec.processDataWithPartition(FlatMapGroupsInPandasWithStateExec.scala:55)
at org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$doExecute$2(FlatMapGroupsWithStateExec.scala:244)
at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps.$anonfun$mapPartitionsWithStateStore$1(package.scala:68)
at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:127)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Driver stacktrace:
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=true:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = f1d87807-4d99-4029-bcf2-dcd249f5c1ed, runId = 5240b775-b344-406d-b19d-0c5229089716] terminated with exception: Job aborted due to stage failure: Task 1 in stage 10.0 failed 1 times, most recent failure: Lost task 1.0 in stage 10.0 (TID 16) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming w/ event time timeout + watermark ifUseDateTimeType=false:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 7ee0f92f-7dac-4c2c-8d4a-2e0adccff8e0, runId = 9f1bb3b4-a5b6-4afe-a70a-806a2f0d75b3] terminated with exception: Job aborted due to stage failure: Task 0 in stage 12.0 failed 1 times, most recent failure: Lost task 0.0 in stage 12.0 (TID 19) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = NoTimeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = bf806890-3110-44fd-b53f-c710291f09c1, runId = 4c63c032-4978-4cb6-83ff-e97e6a83542f] terminated with exception: Job aborted due to stage failure: Task 1 in stage 14.0 failed 1 times, most recent failure: Lost task 1.0 in stage 14.0 (TID 25) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.SPARK-20714: watermark does not fail query when timeout = ProcessingTimeTimeout:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = e4f48243-9eeb-407c-91f3-d23174758aa4, runId = 188d12cf-cbe8-4f07-a532-01e0763bf1f6] terminated with exception: Job aborted due to stage failure: Task 1 in stage 16.0 failed 1 times, most recent failure: Lost task 1.0 in stage 16.0 (TID 29) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - uses state format version 2 by default:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 8534c564-57ed-4cdd-bdd0-432577867b93, runId = d8fe506e-9a92-45d5-9bc4-46c224a43af2] terminated with exception: Job aborted due to stage failure: Task 1 in stage 18.0 failed 1 times, most recent failure: Lost task 1.0 in stage 18.0 (TID 33) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - arrow RecordBatch size with chunking:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 9baa87be-e58a-4849-bd0d-337245354772, runId = b2fa12e4-ab6f-4e37-99dd-a9dbeeebb135] terminated with exception: Job aborted due to stage failure: Task 0 in stage 20.0 failed 1 times, most recent failure: Lost task 0.0 in stage 20.0 (TID 36) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.applyInPandasWithState - streaming - partial consume of iterator in user function:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 32a351a7-2be8-4331-8f9d-a2f8703af4a0, runId = d747ccc7-80f9-4923-8a4a-8194cfb2a069] terminated with exception: Job aborted due to stage failure: Task 0 in stage 22.0 failed 1 times, most recent failure: Lost task 0.0 in stage 22.0 (TID 38) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
FlatMapGroupsInPandasWithStateSuite.SPARK-40670: applyInPandasWithState - streaming having non-null columns:
org/apache/spark/sql/streaming/FlatMapGroupsInPandasWithStateSuite#L1
org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 870be2de-0448-4b7f-b69f-328648c1809f, runId = b8761aa0-6cac-43ca-bb12-9d37e4249e35] terminated with exception: Job aborted due to stage failure: Task 1 in stage 24.0 failed 1 times, most recent failure: Lost task 1.0 in stage 24.0 (TID 41) (localhost executor driver): java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.sql.util.ArrowUtils$
    [stack trace identical to the first ArrowUtils$ failure above]
|
Run / Check changes
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Base image build
The following actions use a deprecated Node.js version and will be forced to run on node20: docker/login-action@v2, actions/checkout@v3, docker/setup-qemu-action@v2, docker/setup-buildx-action@v2, docker/build-push-action@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Breaking change detection with Buf (branch-3.4)
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Linters, licenses, dependencies and documentation generation
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Scala 2.13 build with SBT
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-errors
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-errors
No files were found with the provided path: **/target/test-reports/*.xml. No artifacts will be uploaded.
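This is actions/upload-artifact's default `if-no-files-found: warn` behaviour firing because the pyspark-errors job produced no JUnit XML under that glob. If an empty report set is expected for this module, the step can opt out of the warning; a sketch (the step name and artifact name here are assumed, not taken from build_main.yml):

```yaml
- name: Upload test results
  uses: actions/upload-artifact@v3
  if: always()                           # assumed: upload even when tests fail
  with:
    name: test-results-pyspark-errors--8-hadoop3-hive2.3
    path: "**/target/test-reports/*.xml"
    if-no-files-found: ignore            # suppress the "No files were found" warning
```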
|
Run / Build modules: catalyst, hive-thriftserver
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: sparkr
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Run Docker integration tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Java 17 build with Maven
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Java 11 build with Maven
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: hive - slow tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-pandas
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: sql - slow tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Run TPC-DS queries with SF=1
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Run Spark on Kubernetes Integration test
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-pandas-slow
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: sql - other tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-core, pyspark-streaming, pyspark-ml
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: sql - extended tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/setup-python@v4, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-pandas-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: hive - other tests
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
Run / Build modules: pyspark-pandas-slow-connect
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/cache@v3, actions/setup-java@v3, actions/upload-artifact@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
|
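All of the node16 warnings above call for the same workflow change: bump each named action to its node20-based major release. A minimal sketch of that bump, assuming a typical checkout/Java/cache setup sequence (the step list, Java version, and cache key below are illustrative placeholders, not values from build_main.yml):

```yaml
# Sketch: node20-compatible major versions of the actions named in the
# warnings above. Only the version pins are the point; the surrounding
# step configuration is assumed.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4          # was @v3 (node16)
      - uses: actions/setup-java@v4        # was @v3
        with:
          distribution: temurin
          java-version: '8'
      - uses: actions/cache@v4             # was @v3
        with:
          path: ~/.ivy2/cache
          key: ivy-${{ hashFiles('**/pom.xml') }}
      - uses: actions/setup-python@v5      # was @v4
        with:
          python-version: '3.9'
```

The docker/* actions flagged in the base-image job have equivalent node20 releases (docker/login-action@v3, docker/setup-qemu-action@v3, docker/setup-buildx-action@v3, docker/build-push-action@v5).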
Deprecation notice: v1, v2, and v3 of the artifact actions
The following artifacts were uploaded using a version of actions/upload-artifact that is scheduled for deprecation: "test-results-catalyst, hive-thriftserver--8-hadoop3-hive2.3", "test-results-core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx--8-hadoop3-hive2.3", "test-results-docker-integration--8-hadoop3-hive2.3", "test-results-hive-- other tests-8-hadoop3-hive2.3", "test-results-hive-- slow tests-8-hadoop3-hive2.3", "test-results-pyspark-connect--8-hadoop3-hive2.3", "test-results-pyspark-core, pyspark-streaming, pyspark-ml--8-hadoop3-hive2.3", "test-results-pyspark-pandas--8-hadoop3-hive2.3", "test-results-pyspark-pandas-connect--8-hadoop3-hive2.3", "test-results-pyspark-pandas-slow--8-hadoop3-hive2.3", "test-results-pyspark-pandas-slow-connect--8-hadoop3-hive2.3", "test-results-pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing--8-hadoop3-hive2.3", "test-results-sparkr--8-hadoop3-hive2.3", "test-results-sql-- extended tests-8-hadoop3-hive2.3", "test-results-sql-- other tests-8-hadoop3-hive2.3", "test-results-sql-- slow tests-8-hadoop3-hive2.3", "test-results-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3", "test-results-tpcds--8-hadoop3-hive2.3", "unit-tests-log-catalyst, hive-thriftserver--8-hadoop3-hive2.3", "unit-tests-log-sql-- extended tests-8-hadoop3-hive2.3", "unit-tests-log-sql-- other tests-8-hadoop3-hive2.3", "unit-tests-log-sql-- slow tests-8-hadoop3-hive2.3", "unit-tests-log-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3".
Please update your workflow to use v4 of the artifact actions.
Learn more: https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/
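The fix the notice asks for is likewise a per-step version bump. A hedged sketch of one migrated upload step, using one of the artifact names from the list above; the path glob is assumed to match the workflow's existing reports pattern:

```yaml
# Sketch: migrating a single upload step off the deprecated v3.
# v4 requires artifact names to be unique within a run, which the
# per-module names in this workflow already are.
- name: Upload test results
  uses: actions/upload-artifact@v4       # was @v3
  with:
    name: test-results-tpcds--8-hadoop3-hive2.3
    path: "**/target/test-reports/*.xml"
```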
|
Artifacts
Produced during runtime

Name | Size | Digest
---|---|---
test-results-catalyst, hive-thriftserver--8-hadoop3-hive2.3 (Expired) | 2.84 MB |
test-results-core, unsafe, kvstore, avro, network-common, network-shuffle, repl, launcher, examples, sketch, graphx--8-hadoop3-hive2.3 (Expired) | 2.8 MB |
test-results-docker-integration--8-hadoop3-hive2.3 (Expired) | 157 KB |
test-results-hive-- other tests-8-hadoop3-hive2.3 (Expired) | 1.11 MB |
test-results-hive-- slow tests-8-hadoop3-hive2.3 (Expired) | 948 KB |
test-results-pyspark-connect--8-hadoop3-hive2.3 (Expired) | 603 KB |
test-results-pyspark-core, pyspark-streaming, pyspark-ml--8-hadoop3-hive2.3 (Expired) | 380 KB |
test-results-pyspark-pandas--8-hadoop3-hive2.3 (Expired) | 1.21 MB |
test-results-pyspark-pandas-connect--8-hadoop3-hive2.3 (Expired) | 1020 KB |
test-results-pyspark-pandas-slow--8-hadoop3-hive2.3 (Expired) | 1.53 MB |
test-results-pyspark-pandas-slow-connect--8-hadoop3-hive2.3 (Expired) | 1.23 MB |
test-results-pyspark-sql, pyspark-mllib, pyspark-resource, pyspark-testing--8-hadoop3-hive2.3 (Expired) | 386 KB |
test-results-sparkr--8-hadoop3-hive2.3 (Expired) | 280 KB |
test-results-sql-- extended tests-8-hadoop3-hive2.3 (Expired) | 3.55 MB |
test-results-sql-- other tests-8-hadoop3-hive2.3 (Expired) | 4.7 MB |
test-results-sql-- slow tests-8-hadoop3-hive2.3 (Expired) | 3.44 MB |
test-results-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 (Expired) | 337 KB |
test-results-tpcds--8-hadoop3-hive2.3 (Expired) | 22.6 KB |
unit-tests-log-catalyst, hive-thriftserver--8-hadoop3-hive2.3 (Expired) | 8.53 MB |
unit-tests-log-sql-- extended tests-8-hadoop3-hive2.3 (Expired) | 510 MB |
unit-tests-log-sql-- other tests-8-hadoop3-hive2.3 (Expired) | 297 MB |
unit-tests-log-sql-- slow tests-8-hadoop3-hive2.3 (Expired) | 386 MB |
unit-tests-log-streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl, connect, protobuf--8-hadoop3-hive2.3 (Expired) | 254 MB |