
New S3-FileSystem Implementation not compatible with s3connection resource reference #712

Open
therealslimjp opened this issue Feb 27, 2025 · 4 comments


@therealslimjp

Affected Stackable version

24.11.0

Affected Trino version

451

Current and expected behavior

We are currently migrating from the legacy S3 file system to the new native S3 file system (https://trino.io/docs/current/object-storage/file-system-s3.html#migration-from-legacy-s3-file-system).

We encountered an error when using a TrinoCatalog with an S3Connection reference like this:

apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: mycatalog
  namespace: myns
spec:
  configOverrides:
    fs.native-s3.enabled: "true"
    hive.iceberg-catalog-name: mycatalog-iceberg
    hive.metastore.authentication.type: KERBEROS
    hive.metastore.client.keytab: /stackable/config/keytab/keytab
    hive.metastore.client.principal: USER@REALM
    hive.metastore.service.principal: <redacted>
    hive.metastore.thrift.impersonation.enabled: "false"
    hive.partition-projection-enabled: "true"
    s3.endpoint: MYENDPOINT
    s3.max-connections: "40"
    s3.path-style-access: "true"
    s3.region: us-east-1
    s3.retry-mode: STANDARD
    s3.socket-connect-timeout: 300s
    s3.socket-read-timeout: 300s
    s3.streaming.part-size: 100MB
  connector:
    hive:
      metastore:
        configMap: product-mycatalog-hive
      s3:
        reference: trino

Here, trino is an S3Connection that looks like this:

apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: trino
  namespace: myns
spec:
  accessStyle: Path
  credentials:
    secretClass: s3-access-trino
  host: myhost
  port: myport
  tls:
    verification:
      server:
        caCert:
          secretClass: ca-cert

This is the error we obtain:

 
io.airlift.bootstrap.ApplicationConfigurationException: Configuration errors:
 
1) Error: Configuration property 'hive.s3.aws-access-key' was not used
 
2) Error: Configuration property 'hive.s3.aws-secret-key' was not used
 
3) Error: Configuration property 'hive.s3.endpoint' was not used
 
4) Error: Configuration property 'hive.s3.path-style-access' was not used
 
5) Error: Configuration property 'hive.s3.ssl.enabled' was not used
 
5 errors
        at io.airlift.bootstrap.Bootstrap.configure(Bootstrap.java:217)
        at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:246)
        at io.trino.plugin.iceberg.IcebergConnectorFactory.createConnector(IcebergConnectorFactory.java:116)
        at io.trino.plugin.iceberg.IcebergConnectorFactory.create(IcebergConnectorFactory.java:78)
        at io.trino.connector.DefaultCatalogFactory.createConnector(DefaultCatalogFactory.java:207)
        at io.trino.connector.DefaultCatalogFactory.createCatalog(DefaultCatalogFactory.java:124)
        at io.trino.connector.LazyCatalogFactory.createCatalog(LazyCatalogFactory.java:45)
        at io.trino.connector.StaticCatalogManager.lambda$loadInitialCatalogs$1(StaticCatalogManager.java:161)
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
        at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
        at java.base/java.util.concurrent.ExecutorCompletionService.submit(Unknown Source)
        at io.trino.util.Executors.executeUntilFailure(Executors.java:46)
        at io.trino.connector.StaticCatalogManager.loadInitialCatalogs(StaticCatalogManager.java:155)
        at io.trino.server.Server.doStart(Server.java:155)
        at io.trino.server.Server.lambda$start$0(Server.java:93)
        at io.trino.$gen.Trino_451____20250227_152637_1.run(Unknown Source)
        at io.trino.server.Server.start(Server.java:93)
        at io.trino.server.TrinoServer.main(TrinoServer.java:37)

It looks like the hive.s3.* properties are still added by the operator, but they are incompatible with fs.native-s3.enabled = true.
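
For reference, the legacy properties injected by the operator map roughly to these native S3 file system equivalents (a sketch based on the linked migration guide; endpoint and credential values are placeholders):

# Injected by trino-op (legacy S3 file system)
hive.s3.endpoint: https://myhost:myport
hive.s3.path-style-access: "true"
hive.s3.ssl.enabled: "true"
hive.s3.aws-access-key: <from secretClass s3-access-trino>
hive.s3.aws-secret-key: <from secretClass s3-access-trino>

# Native S3 file system equivalents (with fs.native-s3.enabled: "true")
s3.endpoint: https://myhost:myport
s3.path-style-access: "true"
s3.aws-access-key: <from secretClass s3-access-trino>
s3.aws-secret-key: <from secretClass s3-access-trino>
# TLS follows from the https:// endpoint scheme; there is no native s3.ssl.enabled property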

Possible solution

No response

Additional context

No response

Environment

No response

Would you like to work on fixing this bug?

None

@sbernauer
Member

Hi @therealslimjp,

trino-op 24.11.0 does not know (or care) about the native S3 implementation at all.
We are taking care of that as part of the bump to Trino 470 in #705.

Currently, trino-op always configures the hive.s3.* properties, which I would consider expected behavior.
In general, operators don't respect configOverrides in their internal code flow.

What I can offer you is the following:

  1. You can wait for feat: Add support for Trino 470 #705 and bump to Trino 470 (more on the experimental side and a bigger change ;)
  2. I raised the PR feat: Support removing properties from catalogs #713 to offer the possibility of removing certain properties (in this case the hive.s3.* configs).
    Please be aware that we have an internal process for CRD changes, so the PR will need some days - but you can use the 0.0.0-pr713 version early on.
  3. You might also be able to use the generic catalog (https://docs.stackable.tech/home/stable/trino/usage-guide/catalogs/generic/) to add all properties yourself; see the sketch below.
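
A minimal sketch of option 3, assuming the schema from the linked generic catalog docs (connectorName plus a properties map with value/valueFromSecret entries). The Secret name and keys are hypothetical, and the metastore properties from the original catalog would have to be added the same way:

apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: mycatalog
  namespace: myns
spec:
  connector:
    generic:
      connectorName: hive
      properties:
        fs.native-s3.enabled:
          value: "true"
        s3.endpoint:
          value: https://myhost:myport
        s3.path-style-access:
          value: "true"
        s3.region:
          value: us-east-1
        # hypothetical Secret holding the S3 credentials
        s3.aws-access-key:
          valueFromSecret:
            name: trino-s3-credentials
            key: accessKey
        s3.aws-secret-key:
          valueFromSecret:
            name: trino-s3-credentials
            key: secretKey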

@therealslimjp
Author

Thanks for the rapid response.

I guess we'll wait for 470.

Option 2 may be useful and would solve part of our problem, but we would still have to hardcode the new s3.aws-access-key and s3.aws-secret-key fields into the catalog YAML. Naively, I would rather have a replacement/mapping-like function that renames the keys of the properties coming into the catalog via the S3Connection. But that's just my problem, and it is not entirely thought through yet on my side 😄
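
For illustration, a loudly hypothetical sketch of what such a mapping could look like on the catalog CRD (the configMappings field does not exist; this is only to convey the idea):

spec:
  connector:
    hive:
      s3:
        reference: trino
  # hypothetical: rename operator-generated legacy keys to their native equivalents
  configMappings:
    hive.s3.endpoint: s3.endpoint
    hive.s3.path-style-access: s3.path-style-access
    hive.s3.aws-access-key: s3.aws-access-key
    hive.s3.aws-secret-key: s3.aws-secret-key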

@therealslimjp
Author

Anyway, we do not have an urgent need for a workaround on our currently used version. Waiting for 470 is fine for us, so no stress at all.

@sbernauer
Member

Thanks for the feedback!
