Cannot configure Polaris access to MinIO: what is missing? #1901

Open
@kaurych

Description

Describe the bug

Hello.
I ran into problems configuring a Spark + Polaris + MinIO setup in which each component runs on its own virtual machine. I configured Polaris according to the manual:
https://polaris.apache.org/in-dev/0.9.0/quickstart/#building-and-deploying-polaris
At the end of that page there is a good working example of how to connect Spark to the Polaris catalog. I also figured out how to connect Spark to MinIO.
# To connect Spark to the Polaris catalog

pyspark \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.1 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
--conf spark.sql.catalog.quickstart_catalog.warehouse=quickstart_catalog \
--conf spark.sql.catalog.quickstart_catalog.header.X-Iceberg-Access-Delegation=vended-credentials \
--conf spark.sql.catalog.quickstart_catalog=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.quickstart_catalog.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
--conf spark.sql.catalog.quickstart_catalog.uri=http://10.129.0.66:8181/api/catalog \
--conf spark.sql.catalog.quickstart_catalog.credential='1712dd5b764d3581:aa6dd8e71c96ae1db7b6ace7c98b39cf' \
--conf spark.sql.catalog.quickstart_catalog.scope='PRINCIPAL_ROLE:ALL' \
--conf spark.sql.catalog.quickstart_catalog.token-refresh-enabled=true

and
# To connect Spark to MinIO

--conf spark.hadoop.fs.s3a.endpoint=http://10.129.0.12:9000/ \
--conf spark.hadoop.fs.s3a.access.key=minadmin \
--conf spark.hadoop.fs.s3a.secret.key=minadmin \
--conf spark.hadoop.fs.s3a.path.style.access=true \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider \
--conf spark.hadoop.fs.s3a.connection.ssl.enabled=false
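
For reference, the two sets of flags above could be combined into a single invocation roughly as follows. This is only a sketch of the client side: it does not tell Polaris itself how to reach MinIO, which is the part still missing. The variable names (`CATALOG`, `POLARIS_URI`, `MINIO_ENDPOINT`, `POLARIS_CREDENTIAL`, `RUN`) are made up for illustration, the credential is a placeholder, and the S3A settings assume that a matching `hadoop-aws` jar is already on the Spark classpath, which the commands above do not show.

```shell
#!/bin/sh
# Sketch: one pyspark invocation carrying both flag sets from this issue.
# Hosts, catalog name and MinIO keys are copied from the commands above.

CATALOG=quickstart_catalog                        # catalog name used above
POLARIS_URI=http://10.129.0.66:8181/api/catalog   # Polaris VM
MINIO_ENDPOINT=http://10.129.0.12:9000/           # MinIO VM
# Placeholder; export POLARIS_CREDENTIAL=<client_id>:<client_secret> beforehand.
POLARIS_CREDENTIAL=${POLARIS_CREDENTIAL:-CLIENT_ID:CLIENT_SECRET}

# Flags that connect Spark to the Polaris catalog.
set -- \
  --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf "spark.sql.catalog.${CATALOG}=org.apache.iceberg.spark.SparkCatalog" \
  --conf "spark.sql.catalog.${CATALOG}.catalog-impl=org.apache.iceberg.rest.RESTCatalog" \
  --conf "spark.sql.catalog.${CATALOG}.uri=${POLARIS_URI}" \
  --conf "spark.sql.catalog.${CATALOG}.warehouse=${CATALOG}" \
  --conf "spark.sql.catalog.${CATALOG}.header.X-Iceberg-Access-Delegation=vended-credentials" \
  --conf "spark.sql.catalog.${CATALOG}.credential=${POLARIS_CREDENTIAL}" \
  --conf "spark.sql.catalog.${CATALOG}.scope=PRINCIPAL_ROLE:ALL" \
  --conf "spark.sql.catalog.${CATALOG}.token-refresh-enabled=true"

# Append the flags that connect Spark (via Hadoop S3A) to MinIO.
set -- "$@" \
  --conf "spark.hadoop.fs.s3a.endpoint=${MINIO_ENDPOINT}" \
  --conf spark.hadoop.fs.s3a.access.key=minadmin \
  --conf spark.hadoop.fs.s3a.secret.key=minadmin \
  --conf spark.hadoop.fs.s3a.path.style.access=true \
  --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider \
  --conf spark.hadoop.fs.s3a.connection.ssl.enabled=false

# Dry run by default: print the command. Set RUN=1 to actually start pyspark.
if [ "${RUN:-0}" = "1" ]; then
  pyspark "$@"
else
  echo pyspark "$@"
fi
```

Running the script without `RUN=1` only prints the assembled command, so the flags can be checked before a session is started.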

But I can't find any way to give Polaris the connection parameters for MinIO so that table metadata is saved in MinIO. The Polaris documentation is still rough on this point.
How should I write the configuration, starting from what I have done, so that everything works together? I could not find anything useful on the Internet,
and AI assistants did not help either. Answering this could also help improve the documentation on polaris.apache.org.
Thank you in advance.

To Reproduce

No response

Actual Behavior

No response

Expected Behavior

No response

Additional context

No response

System information

DISTRIB_DESCRIPTION="Ubuntu 24.04.2 LTS"
Spark 3.5.6
Iceberg 1.9.1
Polaris 0.9.0
Polaris deployed as a standalone process:
cd ~/polaris && ./gradlew run

Metadata

Assignees: No one assigned
Labels: bug
Type: No type
Projects: No projects
Milestone: No milestone
