Replies: 1 comment
@squalud, I note there is already some code to pass |
I use Alluxio's proxy to provide S3 interface access. By setting `spark.hadoop.fs.ks3.endpoint` to `http://<alluxio-proxy-service-name>:39999/api/v1/s3/` and setting the `spark.hadoop.fs.s3a.path.style.access` parameter to `true` to use path-style access, I can use pyspark to successfully read CSV files through URLs of the form `s3a://data/tmp/file.csv`. Note that `/data` is a path, not the bucket.

But when I switch to Gluten and set `spark.gluten.sql.native.arrow.reader.enabled` to `true` to read with Arrow's reader, I get an error. It seems that Arrow's reader treats the first-level path `data` as the bucket; that is, the configuration `spark.hadoop.fs.s3a.path.style.access` does not take effect in Gluten/Arrow. How can I use Gluten with Arrow's reader to access S3 with path-style addressing, just like Spark's original reader?
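For anyone trying to reproduce this, the setup described above can be sketched roughly as follows. This is a minimal sketch, not a verified reproduction: the helper names (`build_spark_conf`, `path_style_request_url`, `make_session`) and the service name `alluxio-proxy` are hypothetical, the configuration keys are copied exactly as written in the post, and the Gluten plugin setup itself is not shown. The second helper only illustrates where a path-style client would send the request for an `s3a://` URL, which is why `data` ends up as a path segment under the proxy endpoint.

```python
from urllib.parse import urlsplit


def build_spark_conf(proxy_service: str, use_arrow_reader: bool) -> dict:
    """Collect the settings named in the post into a plain dict.

    `proxy_service` is a placeholder for <alluxio-proxy-service-name>.
    """
    return {
        # Keys copied as written in the post; Gluten plugin setup is omitted.
        "spark.hadoop.fs.ks3.endpoint": f"http://{proxy_service}:39999/api/v1/s3/",
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.gluten.sql.native.arrow.reader.enabled": str(use_arrow_reader).lower(),
    }


def path_style_request_url(endpoint: str, s3a_url: str) -> str:
    """Illustrate path-style addressing: <endpoint>/<first segment>/<rest>."""
    parts = urlsplit(s3a_url)
    first_segment = parts.netloc        # "data" in s3a://data/tmp/file.csv
    rest = parts.path.lstrip("/")       # "tmp/file.csv"
    return f"{endpoint.rstrip('/')}/{first_segment}/{rest}"


def make_session(conf: dict):
    """Apply the settings to a SparkSession (requires pyspark at call time)."""
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("alluxio-s3-path-style")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session from `make_session(build_spark_conf("alluxio-proxy", False))`, the read in question would be `spark.read.csv("s3a://data/tmp/file.csv")`; passing `True` instead switches on the Arrow reader setting that leads to the error.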