SF-1057: Update PrivateLink setup steps (new Databricks and Snowflake instructions) #7063

Merged on Sep 27, 2024 (6 commits; this view shows changes from 5 commits)

70 changes: 50 additions & 20 deletions src/connections/aws-privatelink.md
@@ -7,10 +7,14 @@ title: Amazon Web Services PrivateLink
> info ""
> Segment's PrivateLink integration is currently in private beta and is governed by Segment’s [First Access and Beta Preview Terms](https://www.twilio.com/en-us/legal/tos){:target="_blank”}. Only warehouses located in regions `us-east-1`, `us-west-2`, or `eu-west-1` are eligible for PrivateLink. You might incur additional networking costs while using AWS PrivateLink.

During the Private Beta, you can set up AWS PrivateLink for [Databricks](#databricks), [RDS Postgres](#rds-postgres), and [Redshift](#redshift).
During the Private Beta, you can set up AWS PrivateLink for [Databricks](#databricks), [RDS Postgres](#rds-postgres), [Redshift](#redshift), and [Snowflake](#snowflake).

## Databricks

The following Databricks integrations support PrivateLink:

Review comment: Don't we also support Profiles Sync?

Author reply: Yeah, we're supposed to also support Data Graph and Profiles Sync. They're even listed in our customer request template. When we first wrote this integrations list, we removed Data Graph and Profiles Sync because we hadn't tested them yet, so we weren't certain that they worked.

- [Databricks storage destination](/docs/connections/storage/catalog/databricks/)
- [Databricks Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/databricks-setup/)

> info "Segment recommends reviewing the Databricks documentation before attempting AWS PrivateLink setup"
> The setup required to configure the Databricks PrivateLink integration requires front-end and back-end PrivateLink configuration. Review the [Databricks documentation on AWS PrivateLink](https://docs.databricks.com/en/security/network/classic/privatelink.html){:target="_blank”} to ensure you have everything required to set up this configuration before continuing.

@@ -22,47 +26,73 @@ Before you can configure AWS PrivateLink for Databricks, complete the following
- Configure a [security group](https://docs.databricks.com/en/security/network/classic/customer-managed-vpc.html#security-groups){:target="_blank”} with bidirectional access to 0.0.0.0/0 and ports 443, 3306, 6666, 2443, and 8443-8451.
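
The port list in the security group prerequisite can be written down as data before you apply it. The following is a minimal sketch, not Segment's or Databricks' tooling: `build_permissions` is a hypothetical helper, the TCP protocol is an assumption, and the dictionaries follow the `IpPermissions` shape that the EC2 API accepts for security group rules. The code only builds the rules and does not call AWS.

```python
# Ports from the prerequisite above: four single ports plus the 8443-8451 range.
PORT_RANGES = [(443, 443), (3306, 3306), (6666, 6666), (2443, 2443), (8443, 8451)]


def build_permissions(cidr="0.0.0.0/0"):
    """Return one rule per port range in the EC2 IpPermissions shape.

    Hypothetical helper for illustration; TCP is assumed.
    """
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": start,
            "ToPort": end,
            "IpRanges": [{"CidrIp": cidr}],
        }
        for start, end in PORT_RANGES
    ]


permissions = build_permissions()
```

Because the prerequisite asks for bidirectional access, the same list would apply to both the inbound and outbound rules of the group.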

### Configure PrivateLink for Databricks
To configure PrivateLink for Databricks:
To implement Segment's PrivateLink integration for Databricks:
1. Follow the instructions in Databricks' [Enable private connectivity using AWS PrivateLink](https://docs.databricks.com/en/security/network/classic/privatelink.html){:target="_blank”} documentation. You must create a [back-end](https://docs.databricks.com/en/security/network/classic/privatelink.html#private-connectivity-overview){:target="_blank”} connection to integrate with Segment's front-end connection.
2. After you've configured a back-end connection for Databricks, request access to Segment's PrivateLink integration by reaching out to your Customer Success Manager (CSM).
3. Your CSM sets up a call with Segment R&D to continue the onboarding process.

The following Databricks integrations support PrivateLink:
- [Databricks storage destination](/docs/connections/storage/catalog/databricks/)
- [Databricks Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/databricks-setup/)
2. After you've configured a back-end connection for Databricks, let your Customer Success Manager (CSM) know that you're interested in PrivateLink.
3. Segment's engineering team creates a custom VPC endpoint on your behalf. Segment then provides you with the VPC endpoint's ID.
4. Follow the instructions in Databricks' [Register PrivateLink objects](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3-register-privatelink-objects){:target="_blank”} documentation. It'll instruct you to register the VPC endpoint in your Databricks account and to create or update your Private Access Setting to include the VPC endpoint.
5. Configure your Databricks workspace to [use the Private Access Setting object](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-4-create-or-update-your-workspace-with-privatelink-objects) from the previous step.
6. Reach back out to your CSM and provide them with your Databricks Workspace URL. Segment configures their internal DNS to reroute Segment traffic for your Databricks workspace to your VPC endpoint.
7. Your CSM notifies you that Segment's PrivateLink integration is complete. If you have any existing Segment Databricks integrations that use your Databricks workspace URL, they now automatically use PrivateLink. You can also create new Databricks integrations in the Segment app. All newly created integrations using your Databricks workspace URL will automatically use PrivateLink.

## RDS Postgres

The following RDS Postgres integrations support PrivateLink:
- [RDS Postgres storage destination](/docs/connections/storage/catalog/postgres/)
- [RDS Postgres Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/postgres-setup/)

### Prerequisites
Before you can configure AWS PrivateLink for RDS Postgres, complete the following prerequisites in your Databricks workspace:
Before you can configure AWS PrivateLink for RDS Postgres, complete the following prerequisites:
- **Set up a Network Load Balancer (NLB) to route traffic to your Postgres database**: Segment recommends creating a NLB that has target group IP address synchronization, using a solution like AWS Lambda.
If any updates are made to the Availability Zones (AZs) enabled for your NLB, please let your CSM know so that Segment can update the AZs of your VPC endpoint.
- **Configure your NLB with one of the following settings**:
- Disable the **Enforce inbound rules on PrivateLink traffic** setting
- If you must enforce inbound rules on PrivateLink traffic, add an inbound rule that allows traffic belonging to Segment's PrivateLink/Edge CIDR: `10.0.0.0/8`
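
If you keep inbound rule enforcement on, it can help to confirm which source addresses the `10.0.0.0/8` rule covers. This is a minimal sketch using Python's standard `ipaddress` module; `is_segment_edge_address` is a hypothetical helper name, and the sample addresses are illustrative only.

```python
import ipaddress

# Segment's PrivateLink/Edge CIDR, as stated in the prerequisite above.
SEGMENT_EDGE_CIDR = ipaddress.ip_network("10.0.0.0/8")


def is_segment_edge_address(address):
    """Return True if the address falls inside the 10.0.0.0/8 range."""
    return ipaddress.ip_address(address) in SEGMENT_EDGE_CIDR


inside = is_segment_edge_address("10.12.0.5")       # within 10.0.0.0/8
outside = is_segment_edge_address("192.168.1.10")   # outside 10.0.0.0/8
```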

### Configure PrivateLink for RDS Postgres
To implement Segment's PrivateLink integration for RDS Postgres:
1. Create a Network Load Balancer VPC endpoint service using the instructions in the [Create a service powered by AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/create-endpoint-service.html){:target="_blank”} documentation.
2. Reach out to your Customer Success Manager (CSM) for more details about Segment's AWS principal.
2. Let your Customer Success Manager (CSM) know that you're interested in PrivateLink. They will share information with you about Segment's AWS principal.
3. Add the Segment AWS principal as an “Allowed Principal” to consume the Network Load Balancer VPC endpoint service you created in step 1.
4. Reach out to your CSM and provide them with the Service name for the service that you created above. Segment's engineering team provisions a VPC endpoint for the service in the Segment Edge VPC.
5. After creating the VPC endpoint, Segment provides you with private DNS so you can update the **Host** in your Segment app settings or create a new Postgres integration. <br> The following RDS Postgres integrations support PrivateLink:
- [RDS Postgres storage destination](/docs/connections/storage/catalog/postgres/)
- [RDS Postgres Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/postgres-setup/)
4. Reach out to your CSM and provide them with the Service Name for the service that you created above. Segment's engineering team provisions a VPC endpoint for the service in the Segment Edge VPC.
5. Segment provides you with the VPC endpoint's private DNS name. Use the DNS name as the **Host** setting to update or create new Postgres integrations in the Segment app.

## Redshift

The following Redshift integrations support PrivateLink:
- [Redshift storage destination](/docs/connections/storage/catalog/redshift/)
- [Redshift Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/redshift-setup/)

### Prerequisites
Before you can configure AWS PrivateLink for Redshift, complete the following prerequisites:
- **You're using the RA3 node type**: To access Segment's PrivateLink integration, use an RA3 instance.
- **You've enabled cluster relocation**: Cluster relocation migrates your cluster behind a proxy and keeps the cluster endpoint unchanged, even if your cluster needs to be migrated to a new Availability Zone. A consistent cluster endpoint makes it possible for Segment's Edge account and VPC to remain connected to your cluster. To enable cluster relocation, follow the instructions in the AWS [Relocating your cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-recovery.html){:target="_blank”} documentation.
- **Your cluster is using a port within the ranges 5431-5455 or 8191-8215**: Clusters with cluster relocation enabled [might encounter an error if updated to include a port outside of this range](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-recovery.html#:~:text=You%20can%20change%20to%20another%20port%20from%20the%20port%20range%20of%205431%2D5455%20or%208191%2D8215.%20(Don%27t%20change%20to%20a%20port%20outside%20the%20ranges.%20It%20results%20in%20an%20error.)){:target="_blank”}.
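
The port prerequisite is easy to check before you enable cluster relocation. The following is a minimal sketch, assuming only the two ranges named above; `is_relocation_compatible_port` is a hypothetical helper, not part of any AWS SDK.

```python
# Inclusive port ranges from the prerequisite: 5431-5455 and 8191-8215.
ALLOWED_PORT_RANGES = [range(5431, 5456), range(8191, 8216)]


def is_relocation_compatible_port(port):
    """Return True if the port is inside one of the allowed ranges."""
    return any(port in allowed for allowed in ALLOWED_PORT_RANGES)


default_port_ok = is_relocation_compatible_port(5439)
```

Port 5439 is used here only as an example value that falls inside the first range.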

### Configure PrivateLink for Redshift
Implement Segment's PrivateLink integration by taking the following steps:
To implement Segment's PrivateLink integration for Redshift:
1. Let your Customer Success Manager (CSM) know that you're interested in PrivateLink. They will share information with you about Segment’s Edge account and VPC.
2. After you receive the Edge account ID and VPC ID, [grant cluster access to Segment's Edge account and VPC](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-cross-vpc-console-grantor.html){:target="_blank”}.
3. Reach back out to your CSM and provide them with the Cluster identifier for your cluster and your AWS account ID.
4. Segment creates a Redshift managed VPC endpoint within the Segment Redshift subnet on your behalf, which creates a PrivateLink Endpoint URL. Segment then provides you with the internal PrivateLink Endpoint URL.
5. After Segment provides you with the URL, use it to update or create new Redshift integrations. The following integrations support PrivateLink:
- [Redshift storage destination](/docs/connections/storage/catalog/redshift/)
- [Redshift Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/redshift-setup/)
3. Reach back out to your CSM and provide them with the Cluster Identifier for your cluster and your AWS account ID.
4. Segment's engineering team creates a Redshift managed VPC endpoint within the Segment Redshift subnet on your behalf, which creates a PrivateLink Endpoint URL. Segment then provides you with the internal PrivateLink Endpoint URL.
5. Use the provided PrivateLink Endpoint URL as the **Hostname** setting to update or create new Redshift integrations in the Segment app.

## Snowflake

The following Snowflake integrations support PrivateLink:
- [Snowflake storage destination](/docs/connections/storage/catalog/snowflake/)
- [Snowflake Reverse ETL source](/docs/connections/reverse-etl/reverse-etl-source-setup-guides/snowflake-setup/)

### Prerequisites
Before you can configure AWS PrivateLink for Snowflake, complete the following prerequisites:
- Your Snowflake account is on the Business Critical [Edition](https://docs.snowflake.com/en/user-guide/intro-editions){:target="_blank”} or higher.
- Your Snowflake account is hosted on the [AWS cloud platform](https://docs.snowflake.com/en/user-guide/intro-cloud-platforms){:target="_blank”}.

### Configure PrivateLink for Snowflake
To implement Segment's PrivateLink integration for Snowflake:
1. Follow Snowflake's PrivateLink documentation to [enable AWS PrivateLink](https://docs.snowflake.com/en/user-guide/admin-security-privatelink#enabling-aws-privatelink){:target="_blank”} for your Snowflake account.
2. Let your Customer Success Manager (CSM) know that you're interested in PrivateLink. They will provide you with Segment’s AWS Edge account ID.
3. Create a Snowflake Support Case to authorize PrivateLink connections from Segment's AWS account ID as a third party vendor to your Snowflake account.
4. After Snowflake support authorizes Segment, call the [SYSTEM$GET_PRIVATELINK_CONFIG](https://docs.snowflake.com/en/sql-reference/functions/system_get_privatelink_config) function while using the Snowflake ACCOUNTADMIN role. Reach back out to your Segment CSM and provide them with the **privatelink-vpce-id** and **privatelink-account-url** values from the function output. Note down for yourself the **privatelink-account-name** value.
5. Segment's engineering team creates a custom VPC endpoint on your behalf. Segment also creates a CNAME record to reroute Segment traffic to use your VPC endpoint. This ensures that Segment connections to your **privatelink-account-name** are made over PrivateLink.
6. Your CSM notifies you that the setup on Segment's side is complete. Use your **privatelink-account-name** as the **Account** setting to update or create new Snowflake integrations in the Segment app.
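
Step 4 above asks you to pull three values out of the `SYSTEM$GET_PRIVATELINK_CONFIG` output. The following is a minimal sketch, assuming the function output is a JSON object whose keys match the names used in that step. `extract_privatelink_values` is a hypothetical helper, and every value in `sample_output` is a made-up placeholder rather than real Snowflake output.

```python
import json

# Key names taken from step 4 above.
REQUIRED_KEYS = (
    "privatelink-vpce-id",
    "privatelink-account-url",
    "privatelink-account-name",
)


def extract_privatelink_values(config_json):
    """Parse the function output and return only the three values needed."""
    config = json.loads(config_json)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing keys in PrivateLink config: {missing}")
    return {key: config[key] for key in REQUIRED_KEYS}


# Placeholder values for illustration only.
sample_output = json.dumps({
    "privatelink-vpce-id": "com.amazonaws.vpce.us-west-2.vpce-svc-EXAMPLE",
    "privatelink-account-url": "example.us-west-2.privatelink.snowflakecomputing.com",
    "privatelink-account-name": "example.us-west-2.privatelink",
})

values = extract_privatelink_values(sample_output)
```

In this sketch, the first two values are what you would send to your CSM, and `privatelink-account-name` is the one you keep for the **Account** setting.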