Serverless Video Transcode Batch Pipeline on GCP with FFMPEG & Go (caches on GCP Storage Bucket for streaming videos via HLS protocol)


video-transcode-gcp

Context

  • How do services like YouTube adjust video quality automatically to match your network? Converting a video from one resolution to another (ex: 1080p → 480p) is a very CPU-heavy job, so the variants are cached (preprocessed) beforehand; this process is called transcoding

  • Then what happens next? A protocol like HLS (an application-layer streaming protocol) adjusts the video bitrate to your network rate, which is called Adaptive Bitrate streaming (tools like VLC and OBS support this easily)

  • Services like YouTube and Twitch have run similar pipelines (though by now they likely use in-house mechanisms)
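For a concrete picture: adaptive bitrate streaming hangs off a master playlist that lists the available renditions, and the player switches between them based on measured bandwidth. An illustrative HLS master playlist for three such renditions might look like this (the bandwidth numbers are made-up examples, not this repo's actual output):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,FRAME-RATE=60
1080p60/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720,FRAME-RATE=60
720p60/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=854x480,FRAME-RATE=30
480p30/playlist.m3u8
```

Each `#EXT-X-STREAM-INF` line advertises one variant stream; the relative paths point at the per-rendition segment playlists produced by transcoding.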

Note

So the goal is simply to run these transcoding jobs as serverless batch jobs: the workload is bursty/random in nature, so serverless jobs are a better fit than a continuously running VM


What the service does overall:

Transcodes a video (with FFMPEG) into three variants of HLS segment playlists:

  • 1080p60 HLS
  • 720p60 HLS
  • 480p30 HLS

with video codec H.264/AVC
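As a sketch of how such an FFMPEG invocation could be assembled for the three renditions (the flag choices, segment length, and bitrates here are illustrative assumptions, not this repo's exact command):

```go
package main

import (
	"fmt"
	"strings"
)

// Rendition describes one HLS output variant. The bitrates below are
// illustrative guesses, not this repo's actual settings.
type Rendition struct {
	Name          string // e.g. "1080p60"
	Width, Height int
	FPS           int
	Bitrate       string // target video bitrate, e.g. "5000k"
}

// hlsArgs assembles an ffmpeg argument list that transcodes input into
// one H.264 HLS rendition under outDir/<name>/.
func hlsArgs(input, outDir string, r Rendition) []string {
	prefix := outDir + "/" + r.Name
	return []string{
		"-i", input,
		"-vf", fmt.Sprintf("scale=%d:%d,fps=%d", r.Width, r.Height, r.FPS),
		"-c:v", "libx264", "-b:v", r.Bitrate,
		"-c:a", "aac",
		"-hls_time", "6", // segment duration in seconds
		"-hls_playlist_type", "vod",
		"-hls_segment_filename", prefix + "/seg_%03d.ts",
		prefix + "/playlist.m3u8",
	}
}

func main() {
	for _, r := range []Rendition{
		{"1080p60", 1920, 1080, 60, "5000k"},
		{"720p60", 1280, 720, 60, "2800k"},
		{"480p30", 854, 480, 30, "1200k"},
	} {
		// In the real job these args would go to exec.Command("ffmpeg", args...).
		fmt.Println("ffmpeg", strings.Join(hlsArgs("input.mp4", "out", r), " "))
	}
}
```

One ffmpeg run per rendition keeps the jobs independent, so each batch job can be retried or parallelized on its own.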


References (Overview for media/video transcoding):



High Level Overview

(Architecture diagram: screenshot_2025-08-17-044405)

All the services here run on GCP and are provisioned with Terraform.


A rundown on what each does:
  • The primary task is handled by containers with FFMPEG, run as Cloud Run jobs -> the transcoding batch jobs. Each container just needs an input stream and an output stream, and transcodes the video.

  • There are two GCS buckets, input & output; their roles should be obvious from the names.

  • There are 3 Cloud Run services:

    • auth: handles getting signed URLs for the GCS buckets (no public access)
    • input trigger: triggered by a GCS upload event.
    • consumer: consumes from the queue and triggers a batch job
  • The Cloud Tasks queue rate-limits how many jobs run concurrently in a time period, set to allow <=100 jobs


Provisioning Steps (on GCP with terraform):

Prerequisites:


1. Getting the env variables for terraform:
TF_VAR_project_id=
TF_VAR_project_number=
TF_VAR_region=
TF_VAR_region_queue=

TF_VAR_region_bucket=
TF_VAR_hls_bucketname=

TF_VAR_job_name=

TF_VAR_artifact_reponame=
TF_VAR_artifact_packagename_signbucket=
TF_VAR_artifact_packagename_processjob=
TF_VAR_artifact_packagename_jobrunner=

TF_VAR_default_SA=
TF_VAR_user_SA=

These are to be set in .env in the project root dir (shown in .env.example).
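One common way to get these from .env into the shell before running terraform is auto-export (this assumes plain KEY=value lines with no quoting edge cases; the sample values below are just for demonstration):

```shell
# Demonstration: write a sample .env, then auto-export everything in it.
printf 'TF_VAR_project_id=my-project\nTF_VAR_region=us-central1\n' > .env

set -a          # every assignment from here on is exported automatically
source .env
set +a

echo "$TF_VAR_project_id"   # → my-project
```

Terraform picks up any exported `TF_VAR_*` variables automatically, so no `-var` flags are needed afterwards.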

Getting the project number and id:

gcloud projects describe $(gcloud config get-value project) --format="value(projectId,projectNumber)"

Get list of regions (pick any one closer to you):

gcloud compute regions list

Get the list of regions that support the task queue & bucket (since not all regions from the compute list may be supported) - again, pick one closer to you:

gcloud tasks locations list
gcloud storage location list

Service accounts:

#Get default Service Account:
gcloud iam service-accounts list --filter="displayName:Compute Engine default service account" --format="value(email)"


#Create User Service account:
gcloud iam service-accounts create my-sa --display-name="My Service Account"

gcloud iam service-accounts keys create my-sa-key.json \
  --iam-account=my-sa@$(gcloud config get-value project).iam.gserviceaccount.com

For the remaining variables, the names are your choice.


Then provision with Terraform:

task tf:init
task tf:plan
task tf:apply  #enter "yes" on prompt

2. Run the Interface:

task src:run-interface

The signed URL received at the end can be used to stream the video from sources like:


If you have any trouble running the service, mail me at [email protected] or open an issue.
