Commits (49)
a6822ed
add directories and files
hernandezc1 Jun 17, 2025
8470a38
update classification field in bg table schema
hernandezc1 Jun 17, 2025
d9e5e1d
address codacy issue(s)
hernandezc1 Jun 17, 2025
415a33e
update `classification` and `properties` fields
hernandezc1 Jun 17, 2025
d768787
Merge branch 'develop' into u/ch/swift
hernandezc1 Jun 18, 2025
3562bce
Merge branch 'develop' into u/ch/swift
hernandezc1 Jun 19, 2025
8fa6c14
add new GCP resources
hernandezc1 Jun 26, 2025
d89a760
add ps_to_storage module for Swift
hernandezc1 Jun 26, 2025
241e4bc
update metadata key/value pairs
hernandezc1 Jun 26, 2025
d965999
address codacy issues
hernandezc1 Jun 26, 2025
dad2587
add IAM policy for BQ dataset
hernandezc1 Jun 27, 2025
08e9486
configure IAM policy for BQ dataset
hernandezc1 Jun 27, 2025
2633278
update script to accomodate swift alert schema
hernandezc1 Jun 27, 2025
1f8ec8f
ensures avro bucket does not exist before creating it
hernandezc1 Jun 27, 2025
d6fd04a
creates vm if it does not already exist
hernandezc1 Jun 27, 2025
03652f7
set up artifact registry
hernandezc1 Jun 27, 2025
86cdebc
update default ps topic
hernandezc1 Jun 27, 2025
e151221
update file metadata
hernandezc1 Jun 27, 2025
18072e8
use latest version of the kafka -> pubsub connector
hernandezc1 Jun 27, 2025
26547a1
update documentation
hernandezc1 Jun 27, 2025
48c7ae4
address codacy issue
hernandezc1 Jun 27, 2025
97d18b3
update documentation
hernandezc1 Jun 27, 2025
3a2655e
improve readability by updating parameter names
hernandezc1 Jun 29, 2025
9b746d2
update documentation
hernandezc1 Jun 30, 2025
f1d221e
update parameter names
hernandezc1 Jun 30, 2025
79805a5
squash bug
hernandezc1 Jun 30, 2025
face1ab
update documentation
hernandezc1 Jun 30, 2025
bf64911
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 1, 2025
76504c4
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 1, 2025
ae974ec
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 8, 2025
43ed80c
unpin requirements
hernandezc1 Jul 8, 2025
1f1f54b
rename GCP resources
hernandezc1 Jul 8, 2025
e710bd7
rename GCP resources
hernandezc1 Jul 8, 2025
efe7f18
squash bug and update resource names
hernandezc1 Jul 9, 2025
d4fbb19
use $PROJECT_ID directly where applicable
hernandezc1 Jul 9, 2025
2c25fd8
update documentation
hernandezc1 Jul 9, 2025
98f3ccf
add versiontag as an env_var
hernandezc1 Jul 9, 2025
7587e56
update resource name
hernandezc1 Jul 9, 2025
3c5a658
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 17, 2025
4bf33e5
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 17, 2025
1fca11f
assign dead letter topic
hernandezc1 Jul 17, 2025
cf3a4c7
update Pub/Sub dead letter topic name
hernandezc1 Jul 17, 2025
968493c
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 17, 2025
1122672
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 17, 2025
837f736
Merge branch 'develop' into u/ch/swift
hernandezc1 Jul 22, 2025
81bb094
update resource names
hernandezc1 Aug 6, 2025
45651e8
update resource names
hernandezc1 Aug 6, 2025
882187c
Merge branch 'develop' into u/ch/swift
hernandezc1 Sep 24, 2025
037ba42
Merge branch 'develop' into u/ch/swift
hernandezc1 Sep 26, 2025
21 changes: 21 additions & 0 deletions broker/cloud_run/swift/ps_to_storage/Dockerfile
@@ -0,0 +1,21 @@
# Use the official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.12-slim

# Allow statements and log messages to immediately appear in the Knative logs
ENV PYTHONUNBUFFERED True

# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

# Install production dependencies.
RUN pip install --no-cache-dir -r requirements.txt

# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
# Timeout is set to 0 to disable the timeouts of the workers to allow Cloud Run to handle instance scaling.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
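
As a sketch only: a container built from this Dockerfile could be exercised locally before deploying. The image tag and IDs below are placeholders, and application default credentials would still need to be available for the storage and logging clients in main.py to initialize.

docker build -t swift-alerts-to-storage .
docker run --rm -p 8080:8080 \
  -e PORT=8080 -e GCP_PROJECT=my-project -e SURVEY=swift -e TESTID=mytest \
  swift-alerts-to-storage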
29 changes: 29 additions & 0 deletions broker/cloud_run/swift/ps_to_storage/cloudbuild.yaml
@@ -0,0 +1,29 @@
# https://cloud.google.com/build/docs/deploying-builds/deploy-cloud-run
# containerize the module and deploy it to Cloud Run
steps:
# Build the image
- name: 'gcr.io/cloud-builders/docker'
args: ['build', '-t', '${_REGION}-docker.pkg.dev/${PROJECT_ID}/${_REPOSITORY}/${_MODULE_IMAGE_NAME}', '.']
# Push the image to Artifact Registry
- name: 'gcr.io/cloud-builders/docker'
args: ['push', '${_REGION}-docker.pkg.dev/${PROJECT_ID}/${_REPOSITORY}/${_MODULE_IMAGE_NAME}']
# Deploy image to Cloud Run
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
entrypoint: gcloud
args: ['run', 'deploy', '${_MODULE_NAME}', '--image', '${_REGION}-docker.pkg.dev/${PROJECT_ID}/${_REPOSITORY}/${_MODULE_IMAGE_NAME}', '--region', '${_REGION}', '--set-env-vars', '${_ENV_VARS}']
images:
- '${_REGION}-docker.pkg.dev/${PROJECT_ID}/${_REPOSITORY}/${_MODULE_IMAGE_NAME}'
substitutions:
_SURVEY: 'swift'
_TESTID: 'testid'
_MODULE_NAME: '${_SURVEY}-alerts-to-storage-${_TESTID}'
_MODULE_IMAGE_NAME: 'gcr.io/${PROJECT_ID}/${_REPOSITORY}/${_MODULE_NAME}'
_REPOSITORY: 'cloud-run-services'
# cloud functions automatically sets the projectid env var using the name "GCP_PROJECT"
# use the same name here for consistency
# [TODO] PROJECT_ID is set in setup.sh. this is confusing and we should revisit the decision.
# i (Raen) think i didn't make it a substitution because i didn't want to set a default for it.
_ENV_VARS: 'GCP_PROJECT=${PROJECT_ID},SURVEY=${_SURVEY},TESTID=${_TESTID}'
_REGION: 'us-central1'
options:
dynamic_substitutions: true
95 changes: 95 additions & 0 deletions broker/cloud_run/swift/ps_to_storage/deploy.sh
Collaborator:
There are a bunch of variables in here with names that include "avro" -- I'm sure this is a holdover from the ZTF and LSST modules. For reusability, we should change those names because not all surveys publish avro alerts. For Swift in particular, assuming they publish in json and not avro, having "avro" here is confusing.

@@ -0,0 +1,95 @@
#! /bin/bash
# Deploys or deletes broker Cloud Run service
# This script will not delete Cloud Run services that are in production

# "False" uses production resources
# any other string will be appended to the names of all resources
testid="${1:-test}"
# "True" tearsdown/deletes resources, else setup
teardown="${2:-False}"
# name of the survey this broker instance will ingest
survey="${3:-swift}"
region="${4:-us-central1}"
# get the environment variable
PROJECT_ID=$GOOGLE_CLOUD_PROJECT

MODULE_NAME="alerts-to-storage" # lower case required by cloud run
ROUTE_RUN="/" # url route that will trigger main.run()

define_GCP_resources() {
local base_name="$1"
local testid_suffix=""

if [ "$testid" != "False" ]; then
testid_suffix="-${testid}"
fi
echo "${base_name}${testid_suffix}"
}

#--- GCP resources used in this script
artifact_registry_repo=$(define_GCP_resources "${survey}-cloud-run-services")
cr_module_name=$(define_GCP_resources "${survey}-${MODULE_NAME}") # lower case required by cloud run
gcs_avro_bucket=$(define_GCP_resources "${PROJECT_ID}-${survey}_alerts")
ps_input_subscrip=$(define_GCP_resources "${survey}-alerts_raw") # pub/sub subscription used to trigger cloud run module
ps_subscription_avro=$(define_GCP_resources "${survey}-alert_avros-counter")
Collaborator:
Are you actually using these counter subscriptions? If not, get rid of this. I think ZTF is the only broker pipeline that uses them.

FWIW, ZTF uses them with an ancillary module that helps us track broker performance. The module is very useful, but the way I wrote it is complicated and makes it expensive, which is why we haven't added it to any other broker pipeline. We should rewrite it using bigquery subscriptions (which didn't exist when I wrote the original) -- #172.
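
For illustration, a sketch of what that rewrite might look like with a BigQuery subscription attached directly to the topic; the dataset/table name here is hypothetical and the table schema would need to match what Pub/Sub writes.

gcloud pubsub subscriptions create "${ps_subscription_avro}" \
  --topic="${ps_topic_avro}" \
  --bigquery-table="${PROJECT_ID}:${survey}_alerts.alert_avros" \
  --write-metadata  # also record messageId/publishTime so throughput can be computed in SQL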

ps_topic_avro=$(define_GCP_resources "projects/${PROJECT_ID}/topics/${survey}-alert_avros")
Collaborator:
I know this is the original name of the topic, but using "avros" is not generically good for the aforementioned reasons, and if Swift publishes json then we definitely shouldn't use it here. Let's think of something better. <survey>-alert_in_bucket? Just off the top of my head.

ps_trigger_topic=$(define_GCP_resources "${survey}-alerts_raw")
runinvoker_svcact="cloud-run-invoker@${PROJECT_ID}.iam.gserviceaccount.com"

if [ "${teardown}" = "True" ]; then
# ensure that we do not teardown production resources
if [ "${testid}" != "False" ]; then
echo
echo "Deleting resources for ${MODULE_NAME} module..."
gsutil rm -r "gs://${gcs_avro_bucket}"
gcloud pubsub topics delete "${ps_topic_avro}"
gcloud pubsub subscriptions delete "${ps_subscription_avro}"
gcloud pubsub subscriptions delete "${ps_input_subscrip}"
gcloud run services delete "${cr_module_name}" --region "${region}"
fi
else
echo
echo "Creating avro_bucket..."
Collaborator:
Let's revisit which scripts create and delete which resources. I know we had a whole discussion about it when we first created these deploy.sh scripts for individual modules and decided to put this bucket creation here, but this is the only public resource that we handle this way and I've been confused by it more than once. Every other resource gets created by setup_broker.sh. At a minimum, our handling of this bucket and the bigquery dataset/table should be consistent.

if ! gsutil ls -b "gs://${gcs_avro_bucket}" >/dev/null 2>&1; then
#--- Create the bucket that will store the alerts
gsutil mb -l "${region}" "gs://${gcs_avro_bucket}"
gsutil uniformbucketlevelaccess set on "gs://${gcs_avro_bucket}"
gsutil requesterpays set on "gs://${gcs_avro_bucket}"
gcloud storage buckets add-iam-policy-binding "gs://${gcs_avro_bucket}" \
--member="allUsers" \
--role="roles/storage.objectViewer"
Collaborator:
Similar to the bigquery policy, we should incorporate the permissions being granted here into our custom userPublic role so that we consolidate all the permissions we intend to grant to the public and we can apply the same role/policy to every resource. Also, we should really only give public access to production resources.

Collaborator Author:
I'd like to standardize permissions across broker instances for all surveys in a separate PR
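
One possible shape of that consolidation, as a rough sketch only: the permission list is an assumption (not the project's actual userPublic definition), and the binding would presumably be applied only when testid is "False".

gcloud iam roles create userPublic --project="${PROJECT_ID}" \
  --title="Public data access" \
  --permissions="storage.objects.get,storage.objects.list"
gcloud storage buckets add-iam-policy-binding "gs://${gcs_avro_bucket}" \
  --member="allUsers" \
  --role="projects/${PROJECT_ID}/roles/userPublic"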

else
echo "${gcs_avro_bucket} already exists."
fi

#--- Setup the Pub/Sub notifications on the Avro storage bucket
echo
echo "Configuring Pub/Sub notifications on GCS bucket..."
trigger_event=OBJECT_FINALIZE
format=json # json or none; if json, file metadata sent in message body
gsutil notification create \
-t "$ps_topic_avro" \
-e "$trigger_event" \
-f "$format" \
"gs://${gcs_avro_bucket}"
gcloud pubsub subscriptions create "${ps_subscription_avro}" --topic="${ps_topic_avro}"

#--- Deploy the Cloud Run service
echo
echo "Creating container image for ${MODULE_NAME} module and deploying to Cloud Run..."
moduledir="." # assumes deploying what's in our current directory
config="${moduledir}/cloudbuild.yaml"
url=$(gcloud builds submit --config="${config}" \
--substitutions="_SURVEY=${survey},_TESTID=${testid},_MODULE_NAME=${cr_module_name},_REPOSITORY=${artifact_registry_repo}" \
"${moduledir}" | sed -n 's/^Step #2: Service URL: \(.*\)$/\1/p')
echo
echo "Creating trigger subscription for ${MODULE_NAME} Cloud Run service..."
# WARNING: This is set to retry failed deliveries. If there is a bug in main.py this will
# retry indefinitely, until the message is deleted manually.
gcloud pubsub subscriptions create "${ps_input_subscrip}" \
--topic "${ps_trigger_topic}" \
--topic-project "${PROJECT_ID}" \
--ack-deadline=600 \
--push-endpoint="${url}${ROUTE_RUN}" \
--push-auth-service-account="${runinvoker_svcact}"
Collaborator:
Did you figure out how to limit the number of retries for cloud run services? If so, implement it here so we can drop this disaster waiting to happen 🥴.

Collaborator Author:
@troyraen the only solution I can think of is the following:

# add to line 43
ps_deadletter_topic_input_subscrip=$(define_GCP_resources "${survey}-upsilon-deadletter") 

# add to line 51
gcloud pubsub topics delete "${ps_deadletter_topic_input_subscrip}"

# add to line 61
gcloud pubsub topics create "${ps_deadletter_topic_input_subscrip}"

gcloud pubsub subscriptions create "${ps_input_subscrip}" \
  --topic="${ps_trigger_topic}" \
  --topic-project="${PROJECT_ID}" \
  --ack-deadline=600 \
  --push-endpoint="${url}${ROUTE_RUN}" \
  --push-auth-service-account="${runinvoker_svcact}" \
  --dead-letter-topic="${ps_deadletter_topic_input_subscrip}" \
  --max-delivery-attempts=5

Rather than retrying indefinitely, the message will be published to a dead letter topic after 5 delivery attempts

Collaborator:
Ok. What do you think about having only one deadletter topic that we use for every module? I'm resistant to having to manage a different one for each module but I also don't want to create a mess there if/when we actually need to use it. I don't have a clear sense of if/how we will actually want to work with messages that get delivered there. Do you? If we only have one deadletter topic and multiple modules start failing and dumping messages there, will we actually have a need to dig through them and sort out which ones came from which module? I think I would be more inclined to look at the bigquery tables to figure out which messages did/didn't make it through a given module. What do you think?
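
For illustration, one way a single shared dead-letter topic could still be sorted by module: Pub/Sub attaches attributes such as CloudPubSubDeadLetterSourceSubscription to forwarded messages, so a filtered subscription (or a one-off pull that inspects that attribute) could isolate one module's failures without per-module topics. The names below are hypothetical.

gcloud pubsub topics create "${survey}-deadletter"
gcloud pubsub subscriptions create "${survey}-deadletter-alerts-to-storage" \
  --topic="${survey}-deadletter" \
  --message-filter="attributes.CloudPubSubDeadLetterSourceSubscription = \"${ps_input_subscrip}\""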

fi
107 changes: 107 additions & 0 deletions broker/cloud_run/swift/ps_to_storage/main.py
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

"""This module stores Swift/BAT-GUANO alert data as a JSON file in Cloud Storage."""

import os
import flask
import pittgoogle
from google.cloud import logging, storage
from google.cloud.exceptions import PreconditionFailed

# [FIXME] Make this helpful or else delete it.
# Connect the python logger to the google cloud logger.
# By default, this captures INFO level and above.
# pittgoogle uses the python logger.
# We don't currently use the python logger directly in this script, but we could.
logging.Client().setup_logging()

PROJECT_ID = os.getenv("GCP_PROJECT")
TESTID = os.getenv("TESTID")
SURVEY = os.getenv("SURVEY")

# Variables for incoming data
# A url route is used in deploy.sh when the trigger subscription is created.
# It is possible to define multiple routes in a single module and trigger them using different subscriptions.
ROUTE_RUN = "/" # HTTP route that will trigger run(). Must match deploy.sh

# Variables for outgoing data
HTTP_204 = 204 # HTTP code: Success
HTTP_400 = 400 # HTTP code: Bad Request

# GCP resources used in this module
TOPIC_ALERTS_JSON = pittgoogle.Topic.from_cloud(
"alerts-json", survey=SURVEY, testid=TESTID, projectid=PROJECT_ID
)
Comment on lines 34 to 36
Collaborator:
Do these alerts come from Swift as json or avro? If json, I think the name of this topic can just be swift-alerts since we're not changing the serialization. Also, if this stays as swift-alerts-json does that mean we don't publish any topic that's just called swift-alerts? I think from a usability/consistency standpoint we should always publish a stream called <survey>-alerts that is just a pass through of the survey's full (but deduplicated) alert stream from Kafka (or whatever) into Pub/Sub.

If that seems confusing in comparison with our LSST streams where lsst-alerts is avro and lsst-alerts-json is json, maybe we consider changing those names so that lsst-alerts is the json version and we make the avro one called lsst-alerts-avro? Benefit of the current naming is that lsst-alerts is byte-for-byte the same as what Rubin publishes. That was my original intention for all of our <survey>-alerts streams, and in that sense using swift-alerts is consistent (assuming Swift really does publish these as json -- otherwise, sorry for this irrelevant tangent). But since all of our topics downstream of <survey>-alerts use json exclusively, I can see an argument for making all <survey>-alerts streams json as well.

Collaborator Author:
@troyraen the alerts from Swift are JSON serialized. I named the Pub/Sub resource that way because at the time it seemed more appropriate and descriptive, but I agree that having a <survey>-alerts topic that is just a pass through of the survey's full (deduplicated) alert stream is the convention this module should adopt

bucket_name = f"{PROJECT_ID}-{SURVEY}_alerts"
if TESTID != "False":
bucket_name = f"{bucket_name}-{TESTID}"

client = storage.Client()
bucket = client.get_bucket(client.bucket(bucket_name, user_project=PROJECT_ID))

app = flask.Flask(__name__)


@app.route(ROUTE_RUN, methods=["POST"])
def run():
Collaborator:
Suggested change
def run():
def run() -> tuple[str, int]:

"""Uploads alert data to a GCS bucket. Publishes a de-duplicated JSON-serialized "alerts" stream
(${survey}-alerts-json) containing the original alert bytes. A BigQuery subscription is used to write alert data to
the appropriate BigQuery table.
This module is intended to be deployed as a Cloud Run service. It will operate as an HTTP endpoint
triggered by Pub/Sub messages. This function will be called once for every message sent to this route.
It should accept the incoming HTTP request and return a response.
Returns
-------
response : tuple(str, int)
Tuple containing the response body (string) and HTTP status code (int). Flask will convert the
tuple into a proper HTTP response. Note that the response is a status message for the web server.
"""
# extract the envelope from the request that triggered the endpoint
# this contains a single Pub/Sub message with the alert to be processed
envelope = flask.request.get_json()
try:
alert = pittgoogle.Alert.from_cloud_run(envelope, schema_name="default")
except pittgoogle.exceptions.BadRequest as exc:
return str(exc), HTTP_400

blob = bucket.blob(_name_in_bucket(alert))
blob.metadata = _create_file_metadata(alert, event_id=envelope["message"]["messageId"])

# raise a PreconditionFailed exception if filename already exists in the bucket using "if_generation_match=0"
# let it raise. the message will be dropped.
Collaborator:
Suggested change
# let it raise. the message will be dropped.

This comment is a holdover from my original code but no longer makes sense because we've now implemented the try/except right here.

try:
blob.upload_from_string(alert.msg.data, if_generation_match=0)
except PreconditionFailed:
# this alert is a duplicate. drop it.
return "", HTTP_204

# publish the same alert as JSON
TOPIC_ALERTS_JSON.publish(alert)

return "", HTTP_204


def _create_file_metadata(alert: pittgoogle.Alert, event_id: str) -> dict:
"""Return key/value pairs to be attached to the file as metadata."""
# https://github.com/nasa-gcn/gcn-schema/blob/main/gcn/notices/swift/bat/Guano.example.json
metadata = {"file_origin_message_id": event_id}
metadata["_".join("alert_datetime")] = alert.dict["alert_datetime"]
metadata["_".join("alert_type")] = alert.dict["alert_type"]
metadata["_".join("classification")] = alert.dict["classification"]
metadata["_".join("id")] = alert.dict["id"]

return metadata


def _name_in_bucket(alert: pittgoogle.Alert) -> str:
"""Return the name of the file in the bucket."""
# not easily able to extract schema version, see:
# https://github.com/nasa-gcn/gcn-schema/blob/main/gcn/notices/swift/bat/Guano.example.json
_date = alert.dict["alert_datetime"][0:10]
_alert_type = alert.dict["alert_type"]
_id = alert.dict["id"][0]

return f"{_date}/{_alert_type}/{_id}.json"
Collaborator:
Let's figure out how to add the schema version. Otherwise it will be difficult to do things like figure out how the bucket organization maps to the dataset/table organization. The pipeline must know the version for things like naming the bigquery table, so perhaps the easiest solution is to add it as an env var to this module.
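
A minimal sketch of the env-var approach; the variable name and version string are assumptions and would need to track the GCN notice schema release. In deploy.sh (or via the cloudbuild substitutions) it could be set on the service, e.g.:

gcloud run services update "${cr_module_name}" --region="${region}" \
  --update-env-vars="SCHEMA_VERSION=1.0.0"

main.py could then read os.getenv("SCHEMA_VERSION") and include it in the path returned by _name_in_bucket, mirroring whatever versioned name the BigQuery table gets.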

15 changes: 15 additions & 0 deletions broker/cloud_run/swift/ps_to_storage/requirements.txt
@@ -0,0 +1,15 @@
# As explained here
# https://cloud.google.com/functions/docs/writing/specifying-dependencies-python
# dependencies for a Cloud Function must be specified in a `requirements.txt`
# file (or packaged with the function) in the same directory as `main.py`

google-cloud-logging
google-cloud-storage
pittgoogle-client>=0.3.15

# for Cloud Run
# https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service
# pinned following quickstart example. [TODO] consider un-pinning
Flask==3.0.3
gunicorn==23.0.0
Werkzeug==3.0.6
Collaborator:
Suggested change
# for Cloud Run
# https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service
# pinned following quickstart example. [TODO] consider un-pinning
Flask==3.0.3
gunicorn==23.0.0
Werkzeug==3.0.6
# for Cloud Run
# https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service
Flask
gunicorn
Werkzeug

6 changes: 3 additions & 3 deletions broker/consumer/swift/vm_install.sh
@@ -1,6 +1,6 @@
#! /bin/bash
# Installs the software required to run the Kafka Consumer.
# Assumes a Debian 10 OS.
# Assumes a Debian 12 OS.

#--- Get metadata attributes
baseurl="http://metadata.google.internal/computeMetadata/v1"
@@ -33,7 +33,7 @@ snap install core
snap install yq

#--- Install Java and the dev kit
# see https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-on-debian-10
# see https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-on-debian-11
apt update
echo "Installing Java..."
apt install -y default-jre
@@ -61,7 +61,7 @@ echo "Done installing Confluent Platform."
echo "Installing the Kafka -> Pub/Sub connector"
(
plugindir=/usr/local/share/kafka/plugins
CONNECTOR_RELEASE="1.1.0"
CONNECTOR_RELEASE="1.3.2"
mkdir -p ${plugindir}
#- install the connector
cd ${plugindir}
2 changes: 1 addition & 1 deletion broker/consumer/swift/vm_startup.sh
@@ -23,7 +23,7 @@ fi

#--- GCP resources used in this script
broker_bucket="${PROJECT_ID}-${survey}-broker_files"
PS_TOPIC_DEFAULT="${survey}-alerts"
PS_TOPIC_DEFAULT="${survey}-alerts_raw"
# use test resources, if requested
if [ "$testid" != "False" ]; then
broker_bucket="${broker_bucket}-${testid}"
45 changes: 25 additions & 20 deletions broker/setup_broker/swift/create_vm.sh
@@ -2,15 +2,17 @@
# Creates or deletes the GCP VM instances needed by the broker.
# This script will not delete VMs that are in production


broker_bucket=$1 # name of GCS bucket where broker files are staged
# name of GCS bucket where broker files are staged
gcs_broker_bucket=$1
# "False" uses production resources
# any other string will be appended to the names of all resources
testid="${2:-test}"
# "False" uses production resources
# any other string will be appended to the names of all resources
teardown="${3:-False}" # "True" tearsdown/deletes resources, else setup
survey="${4:-swift}"
# "True" tearsdown/deletes resources, else setup
teardown="${3:-False}"
# name of the survey this broker instance will ingest
survey="${4:-swift}"
zone="${5:-us-central1-a}"
project_id="${6:-PROJECT_ID}"
Collaborator:
No need to define yet another name for this variable. Just use $PROJECT_ID directly where applicable. (GCP makes this variable confusing by using different names for it in different contexts -- PROJECT_ID and GOOGLE_CLOUD_PROJECT being most common, but there's at least one more. In our scripts we try to standardize on PROJECT_ID.)


#--- GCP resources used in this script
consumerVM="${survey}-consumer"
@@ -25,19 +27,22 @@ if [ "$teardown" = "True" ]; then
if [ "$testid" != "False" ]; then
gcloud compute instances delete "$consumerVM" --zone="$zone"
fi

#--- Create resources
#--- Setup resources if they do not exist
else
#--- Consumer VM
# create VM
machinetype=e2-custom-1-5632
# metadata
googlelogging="google-logging-enabled=true"
startupscript="startup-script-url=gs://${broker_bucket}/${survey}/vm_install.sh"
shutdownscript="shutdown-script-url=gs://${broker_bucket}/${survey}/vm_shutdown.sh"
gcloud compute instances create "$consumerVM" \
--zone="$zone" \
--machine-type="$machinetype" \
--scopes=cloud-platform \
--metadata="${googlelogging},${startupscript},${shutdownscript}"
if ! gcloud compute instances describe "${consumerVM}" --zone="${zone}" --project="${project_id}" >/dev/null 2>&1; then
machinetype=e2-custom-1-5632
Collaborator:
Do the installs actually succeed with a machine type this small? I recall needing a bigger machine for the set up and then making the machine type smaller for normal operations.

Collaborator Author (hernandezc1, Jul 8, 2025):
Do the installs actually succeed with a machine type this small?

It worked when I tested it!
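
If a larger machine does turn out to be needed for the installs, one option (a sketch, not what this PR does) is to create the VM at a bigger size and shrink it afterwards; the instance must be stopped to resize, and the target type below just reuses the current value from this script.

gcloud compute instances stop "${consumerVM}" --zone="${zone}"
gcloud compute instances set-machine-type "${consumerVM}" \
  --zone="${zone}" --machine-type=e2-custom-1-5632
gcloud compute instances start "${consumerVM}" --zone="${zone}"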

# metadata
googlelogging="google-logging-enabled=true"
startupscript="startup-script-url=gs://${gcs_broker_bucket}/${survey}/vm_install.sh"
shutdownscript="shutdown-script-url=gs://${gcs_broker_bucket}/${survey}/vm_shutdown.sh"
#--- Create VM
gcloud compute instances create "$consumerVM" \
--zone="$zone" \
--machine-type="$machinetype" \
--scopes=cloud-platform \
--metadata="${googlelogging},${startupscript},${shutdownscript}"
else
echo
echo "VM instance ${consumerVM} already exists in zone ${zone}."
fi
fi