
Error occurred when using Etl to slice a gridded TIFF file in HDFS #3508

@KiktMa

Description

An error occurred while using GeoTrellis's Etl to build a pyramid from raster data in HDFS and store it in an Accumulo database.

I am using GeoTrellis 2.1.0, Scala 2.11.7, Hadoop 2.7.7, Spark 2.3.4, and JDK 1.8.

After writing input.json, output.json, and backend-profiles.json, I submit the job with spark-submit using the main class geotrellis.spark.etl.SinglebandIngest:

./bin/spark-submit --class geotrellis.spark.etl.SinglebandIngest --master yarn /usr/local/app/spark/spark-2.3.4/jars/geotrellis-spark-etl_2.11-2.1.0.jar --input file:///app/tif/json/input.json --output file:///app/tif/json/output.json --backend-profiles file:///app/tif/json/backend-profiles.json

Error output:

TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, node1, executor 2): java.lang.NegativeArraySizeException
        at scala.reflect.ManifestFactory$$anon$6.newArray(Manifest.scala:93)
        at scala.reflect.ManifestFactory$$anon$6.newArray(Manifest.scala:91)
        at scala.Array$.ofDim(Array.scala:218)
        at geotrellis.raster.UByteArrayTile$.ofDim(UByteArrayTile.scala:239)
        at geotrellis.raster.UByteArrayTile$.empty(UByteArrayTile.scala:267)
        at geotrellis.raster.ArrayTile$.empty(ArrayTile.scala:431)
        at geotrellis.raster.io.geotiff.GeoTiffTile.mutable(GeoTiffTile.scala:698)
        at geotrellis.raster.io.geotiff.GeoTiffTile.toArrayTile(GeoTiffTile.scala:690)
        at geotrellis.spark.io.RasterReader$$anon$1.readFully(RasterReader.scala:67)
        at geotrellis.spark.io.RasterReader$$anon$1.readFully(RasterReader.scala:63)
        at geotrellis.spark.io.hadoop.HadoopGeoTiffRDD$$anonfun$apply$5$$anonfun$apply$6.apply(HadoopGeoTiffRDD.scala:148)
        at geotrellis.spark.io.hadoop.HadoopGeoTiffRDD$$anonfun$apply$5$$anonfun$apply$6.apply(HadoopGeoTiffRDD.scala:147)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:185)
        at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1021)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1019)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:109)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
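A sketch of what the stack trace suggests is happening (this is illustrative, not GeoTrellis code): `Array.ofDim` sizes the tile buffer as `cols * rows`, a JVM `Int`. When a GeoTiff is read fully as one untiled segment, a large enough raster makes that product overflow to a negative number, and allocating an array of negative size throws `NegativeArraySizeException`. The dimensions below are hypothetical:

```scala
// Illustrative only: why Array.ofDim can throw NegativeArraySizeException.
// UByteArrayTile.ofDim ultimately allocates cols * rows cells; the product
// is computed in 32-bit Int arithmetic, so it can silently overflow.
object NegativeSizeDemo {
  val cols = 60000               // hypothetical raster width in pixels
  val rows = 60000               // hypothetical raster height in pixels
  val cells: Int = cols * rows   // 3.6e9 > Int.MaxValue, wraps negative

  def main(args: Array[String]): Unit = {
    println(s"cols * rows = $cells") // prints a negative value
    // new Array[Byte](cells) would throw NegativeArraySizeException here
  }
}
```

Note that a 180 MB GeoTiff can still decompress to billions of cells, so the small file size does not rule this out.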

My TIFF file is only 180 MB. How can I solve this problem? I increased the driver memory to 2 GB, but the error persists.
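If the cause is an oversized single read, one workaround is to have the reader window the input instead of reading each GeoTiff whole. To the best of my understanding of the ETL config schema (`geotrellis.spark.etl.config.Input`), this is the `maxTileSize` field in input.json; the name and path below are placeholders, and the field name should be verified against your GeoTrellis 2.1.0 docs:

```json
[
  {
    "format": "geotiff",
    "name": "example-layer",
    "cache": "NONE",
    "maxTileSize": 256,
    "backend": {
      "type": "hadoop",
      "path": "hdfs://namenode:8020/path/to/tiffs"
    }
  }
]
```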
