Commit 64575f8

Mike Robins committed: Add updates + docs for locally running lake loader on LS S3

1 parent 49232f2 · commit 64575f8

File tree

2 files changed: +13 −4 lines changed


README.md

Lines changed: 7 additions & 1 deletion
````diff
@@ -69,7 +69,7 @@ If you would like to (optionally) run the Snowflake streaming loader as well you
 1. Configure your Snowflake private key and warehouse details in your `.env` file. You will need a [private key](https://docs.snowflake.com/en/user-guide/key-pair-auth) set up rather than a username / password as this is what the app uses for authentication.
 2. Launch docker compose with the warehouse you would like:
    * For Snowflake streaming loader use: `docker-compose --profile snowflake-loader up` which will launch the normal components + the Snowflake Kinesis loader.
-   * For the Lake loader use `--profile lake-loader`.
+   * For the Lake loader use `--profile lake-loader` (the Lake loader can load to a remote blob storage service, e.g., S3, or locally using Localstack).
    * For the BigQuery loader use `--profile bigquery-loader`.
 3. Send some events!

@@ -142,6 +142,12 @@ e.g., for BASH use
 export SERVICE_ACCOUNT_CREDENTIALS=$(cat /path/to/your/service-account-key.json)
 ```
 
+### Lake loader
+
+The Lake loader can use a remote object store (e.g., AWS S3, GCS, Azure Blob Storage) but works equally well writing to Localstack S3. An example configuration can be found in `loaders/lake_loader_config_iceberg_s3.hocon`.
+
+If you wish to load to a different (local) bucket, ensure that the resource is created in `init-aws.sh` before attempting to run the loader. Once loading has been set up, you can view the data the lake loader writes out in your browser at `https://snowplow-lake-loader.s3.localhost.localstack.cloud:4566/` (or the equivalent URL for your bucket).
+
 ## Incomplete events
 
 Currently incomplete events load into the same table as successful events. This is deliberate - but can be overwritten by specifying a different table in the `incomplete` loader HOCON configuration file.
````
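The `init-aws.sh` bucket-creation step mentioned in the Lake loader notes could be sketched as below. This is a hypothetical fragment, not copied from the repo: `my-lake-bucket` is an example name, and the script assumes the LocalStack `awslocal` wrapper is installed.

```shell
#!/bin/sh
# Hypothetical init-aws.sh addition: create an extra LocalStack S3 bucket
# before running the lake loader. "my-lake-bucket" is an example name.
BUCKET="my-lake-bucket"

# Only attempt the call when the LocalStack helper CLI is available.
if command -v awslocal >/dev/null 2>&1; then
  awslocal s3 mb "s3://${BUCKET}"
fi

# Any LocalStack bucket is then browsable at a virtual-host style URL
# following the same pattern as the snowplow-lake-loader one above.
BROWSE_URL="https://${BUCKET}.s3.localhost.localstack.cloud:4566/"
echo "${BROWSE_URL}"
```

The URL pattern mirrors the `snowplow-lake-loader` example in the README text: bucket name, then `.s3.localhost.localstack.cloud:4566/`.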

docker-compose.yml

Lines changed: 6 additions & 3 deletions
```diff
@@ -221,19 +221,22 @@ services:
       - enrich
       - iglu-server
     volumes:
-      - "./loaders/lake_loader_config.hocon:/loaders/lake_loader_config.hocon"
+      - "./loaders/lake_loader_config_iceberg_s3.hocon:/loaders/lake_loader_config.hocon"
       - "./iglu-client:/snowplow/iglu-client"
     profiles: [lake-loader]
     environment:
       - "ACCEPT_LICENSE=${ACCEPT_LICENSE}"
+      - "AWS_ENDPOINT_URL=http://localhost.localstack.cloud:4566"
+      - "AWS_ENDPOINT_URL_S3=http://s3.localhost.localstack.cloud:4566"
+      - "AWS_ACCOUNT_ID=000000000000"
       - "AWS_ACCESS_KEY_ID=localstack"
       - "AWS_SECRET_ACCESS_KEY=doesntmatter"
       - "AWS_REGION=ap-southeast-2"
-      - "AWS_ENDPOINT_S3=http://localhost.localstack.cloud:4566"
-      - "AWS_ENDPOINT_URL=http://localhost.localstack.cloud:4566"
 
     extra_hosts:
       - "localhost.localstack.cloud:host-gateway"
+      - "s3.localhost.localstack.cloud:host-gateway"
+      - "snowplow-lake-loader.s3.localhost.localstack.cloud:host-gateway"
 
   bigquery-loader:
     container_name: bigquery-loader
```
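With the endpoint and dummy credentials configured as in the compose file above, the loader's output can be inspected from the host. The sketch below is an assumption, not part of the commit: it uses the standard AWS CLI `--endpoint-url` flag pointed at LocalStack, and only lists anything while the stack is actually running.

```shell
#!/bin/sh
# Sketch: list lake loader output from the host by pointing the AWS CLI at
# the LocalStack endpoint declared in docker-compose.yml. The credentials
# and region below mirror the compose environment; the values are dummies
# that LocalStack accepts.
ENDPOINT="http://localhost.localstack.cloud:4566"
BUCKET="snowplow-lake-loader"
export AWS_ACCESS_KEY_ID=localstack
export AWS_SECRET_ACCESS_KEY=doesntmatter
export AWS_REGION=ap-southeast-2

# The listing only succeeds while LocalStack and the loader are running,
# so tolerate failure when they are not.
if command -v aws >/dev/null 2>&1; then
  aws --endpoint-url "${ENDPOINT}" s3 ls "s3://${BUCKET}/" || true
fi
```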
