external storage url in tidb cloud #21058

Merged
1 change: 1 addition & 0 deletions TOC-tidb-cloud.md
@@ -692,6 +692,7 @@
- [Metadata Lock](/metadata-lock.md)
- [Use UUIDs](/best-practices/uuid.md)
- [TiDB Accelerated Table Creation](/accelerated-table-creation.md)
- [URI Formats of External Storage Services](/external-storage-uri.md)
- API Reference ![BETA](/media/tidb-cloud/blank_transparent_placeholder.png)
- [Overview](/tidb-cloud/api-overview.md)
- v1beta1
34 changes: 19 additions & 15 deletions external-storage-uri.md
@@ -21,14 +21,21 @@ The basic format of the URI is as follows:

- `access-key`: Specifies the access key.
- `secret-access-key`: Specifies the secret access key.
<CustomContent platform="tidb">
- `session-token`: Specifies the temporary session token. BR supports this parameter starting from v7.6.0.
</CustomContent>

<CustomContent platform="tidb-cloud">
- `session-token`: Specifies the temporary session token. TiDB supports this parameter starting from v7.6.0.
</CustomContent>
- `use-accelerate-endpoint`: Specifies whether to use the accelerate endpoint on Amazon S3 (defaults to `false`).
- `endpoint`: Specifies the URL of custom endpoint for S3-compatible services (for example, `<https://s3.example.com/>`).
- `force-path-style`: Use path style access rather than virtual hosted style access (defaults to `true`).
- `storage-class`: Specifies the storage class of the uploaded objects (for example, `STANDARD` or `STANDARD_IA`).
- `sse`: Specifies the server-side encryption algorithm used to encrypt the uploaded objects (value options: empty, `AES256`, or `aws:kms`).
- `sse-kms-key-id`: Specifies the KMS ID if `sse` is set to `aws:kms`.
- `acl`: Specifies the canned ACL of the uploaded objects (for example, `private` or `authenticated-read`).
<CustomContent platform="tidb">
- `role-arn`: When you need to access Amazon S3 data from a third party using a specified [IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html), you can specify the corresponding [Amazon Resource Name (ARN)](https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html) of the IAM role with the `role-arn` URL query parameter, such as `arn:aws:iam::888888888888:role/my-role`. For more information about using an IAM role to access Amazon S3 data from a third party, see [AWS documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_common-scenarios_third-party.html). BR supports this parameter starting from v7.6.0.
- `external-id`: When you access Amazon S3 data from a third party, you might need to specify a correct [external ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html) to assume [the IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html). In this case, you can use this `external-id` URL query parameter to specify the external ID and make sure that you can assume the IAM role. An external ID is an arbitrary string provided by the third party together with the IAM role ARN to access the Amazon S3 data. Providing an external ID is optional when assuming an IAM role, which means if the third party does not require an external ID for the IAM role, you can assume the IAM role and access the corresponding Amazon S3 data without providing this parameter.

@@ -48,28 +55,25 @@ tiup cdc:v7.5.0 cli changefeed create \
--config=cdc_csv.toml
```

The following is an example of an Amazon S3 URI for [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). In this example, you need to specify a specific filename `test.csv`.
</CustomContent>

```shell
s3://external/test.csv?access-key=${access-key}&secret-access-key=${secret-access-key}
```
<CustomContent platform="tidb-cloud">
- `role-arn`: When you need to access Amazon S3 data from a third party using a specified [IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html), you can specify the corresponding [Amazon Resource Name (ARN)](https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html) of the IAM role with the `role-arn` URL query parameter, such as `arn:aws:iam::888888888888:role/my-role`. For more information about using an IAM role to access Amazon S3 data from TiDB Cloud, see [Configure External Storage Access for TiDB Cloud Dedicated](/config-s3-and-gcs-access.md). TiDB supports this parameter starting from v7.6.0.
- `external-id`: When you access Amazon S3 data from TiDB Cloud, you must specify the correct [external ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html) to assume [the IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html). Use the `external-id` URL query parameter to specify the external ID so that you can assume the IAM role.

## GCS URI format
> **Note:**
> When configuring the IAM role, make sure to explicitly specify the trusted AWS account ID in the `Principal` field, and always include a unique `external-id` condition to prevent unauthorized access via [confused deputy attacks](https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-deputy.html).
> You can find the TiDB Cloud AWS account ID in TiDB Cloud and use AWS CloudFormation to create the IAM role securely in one click. For details, see [Configure External Storage Access for TiDB Cloud Dedicated](/config-s3-and-gcs-access.md).
> Optionally, you can also set a `max-session-duration` value to limit the lifetime of temporary credentials for enhanced security.

- `scheme`: `gcs` or `gs`
- `host`: `bucket name`
- `parameters`:

- `credentials-file`: Specifies the path to the credentials JSON file on the migration tool node.
- `storage-class`: Specifies the storage class of the uploaded objects (for example, `STANDARD` or `COLDLINE`)
- `predefined-acl`: Specifies the predefined ACL of the uploaded objects (for example, `private` or `project-private`)

The following is an example of a GCS URI for TiDB Lightning and BR. In this example, you need to specify a specific file path `testfolder`.
The following is an example of an Amazon S3 URI for [`BACKUP`](/sql-statements/sql-statement-backup.md) and [`RESTORE`](/sql-statements/sql-statement-restore.md). In this example, you need to specify a specific file path `testfolder`.

```shell
gcs://external/testfolder?credentials-file=${credentials-file-path}
s3://external/testfolder?access-key=${access-key}&secret-access-key=${secret-access-key}
```

low

The example S3 URI for BACKUP and RESTORE in the TiDB Cloud context (lines 71-73) uses access-key and secret-access-key. Given the strong emphasis on using role-arn and external-id for TiDB Cloud security in the preceding paragraphs (lines 61-67), would it be beneficial to also provide an example URI that showcases the role-arn and external-id parameters? This could help reinforce the recommended practice for TiDB Cloud users.

Alternatively, if the intention is to show that both methods are supported, perhaps a brief note clarifying this could be added. The linked document /config-s3-and-gcs-access.md does cover both methods, so this is more of a suggestion for enhancing direct clarity within this page.
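
A sketch of what such an example could look like, assuming the `role-arn` and `external-id` query parameters described earlier on the page. The ARN is the sample value from the `role-arn` description; the external ID value and the shell variable names are hypothetical placeholders, not values from the docs:

```shell
# Hypothetical placeholder values; replace them with your own IAM role ARN and external ID.
ROLE_ARN='arn:aws:iam::888888888888:role/my-role'
EXTERNAL_ID='my-external-id'

# Build an S3 URI that authenticates with an IAM role instead of an access key pair.
S3_URI="s3://external/testfolder?role-arn=${ROLE_ARN}&external-id=${EXTERNAL_ID}"
echo "${S3_URI}"
```

The resulting string could then take the place of the access-key URI in the `BACKUP` and `RESTORE` example. Whether special characters in either value need URL encoding is not addressed here and would need to be checked.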


</CustomContent>

The following is an example of a GCS URI for [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). In this example, you need to specify a specific filename `test.csv`.

```shell
Expand Down
10 changes: 0 additions & 10 deletions sql-statements/sql-statement-backup.md
@@ -112,18 +112,8 @@ BR supports backing up data to S3 or GCS:
BACKUP DATABASE `test` TO 's3://example-bucket-2020/backup-05/?access-key={YOUR_ACCESS_KEY}&secret-access-key={YOUR_SECRET_KEY}';
```

<CustomContent platform="tidb">

The URL syntax is further explained in [URI Formats of External Storage Services](/external-storage-uri.md).

</CustomContent>

<CustomContent platform="tidb-cloud">

The URL syntax is further explained in [external storage URI](https://docs.pingcap.com/tidb/stable/external-storage-uri).

</CustomContent>

When running in a cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`:

{{< copyable "sql" >}}
10 changes: 0 additions & 10 deletions sql-statements/sql-statement-restore.md
@@ -103,18 +103,8 @@ BR supports restoring data from S3 or GCS:
RESTORE DATABASE * FROM 's3://example-bucket-2020/backup-05/';
```

<CustomContent platform="tidb">

The URL syntax is further explained in [URI Formats of External Storage Services](/external-storage-uri.md).

</CustomContent>

<CustomContent platform="tidb-cloud">

The URL syntax is further explained in [external storage URI](https://docs.pingcap.com/tidb/stable/external-storage-uri).

</CustomContent>

When running in a cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`:

{{< copyable "sql" >}}