An overview of all the components in this repo, and related components needed to run this service.
`downloader`
: Takes `download` items from the queue, downloads an issue's details from the GitHub API, stores them on disk, and then adds an `index` item to the queue (see the queue-flow sketch after this list).

`indexer`
: Takes `index` items from the queue and imports the JSON files from the disk into ElasticSearch.

`elasticsearch`
: Indexed storage containing all issue information; provides APIs to query and consume the data.

`kibana`
: Web UI for accessing ElasticSearch data, allowing quick and easy querying and building of data visualizations.

`queue-cli`
: Small helper tool that allows manual modifications to the queue, like manually queuing issues for download or checking the current queue state.

`redis`
: Small key-value store, responsible for holding queue items.

`reverse-proxy`
: nginx image with a pre-defined config, responsible for making sure that the right requests end up at the right components. For example, serving Kibana at `/kibana/`, or exposing the `webhook-receiver` on `/webhooks/`.

`webhook-receiver`
: HTTP server that receives GitHub Webhook notifications. When a webhook is received, the receiver adds a `download` queue item (see the webhook sketch after this list).
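
To make the `download` → `index` handoff described above more concrete, here is a minimal sketch of the queue flow. It is illustrative only: Python and the `redis-py` client are used for readability, and the queue keys (`queue:download`, `queue:index`), the JSON item shape, and the on-disk path are assumptions, not this repo's actual implementation.

```python
# Illustrative sketch only -- queue keys, item layout, and paths are
# assumptions; the real downloader/indexer may differ.
import json
from pathlib import Path

import redis  # redis-py client

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

DOWNLOAD_QUEUE = "queue:download"   # hypothetical key for `download` items
INDEX_QUEUE = "queue:index"         # hypothetical key for `index` items
DATA_DIR = Path("/data/issues")     # hypothetical on-disk location


def enqueue_download(issue_number: int) -> None:
    """What queue-cli or the webhook-receiver would do: queue an issue."""
    r.rpush(DOWNLOAD_QUEUE, json.dumps({"issue": issue_number}))


def downloader_step() -> None:
    """One iteration of a downloader-style worker."""
    _, raw = r.blpop(DOWNLOAD_QUEUE)          # blocking pop of one item
    issue_number = json.loads(raw)["issue"]

    details = {"number": issue_number}        # placeholder for the GitHub API response
    path = DATA_DIR / f"{issue_number}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(details))      # store the issue on disk ...

    r.rpush(INDEX_QUEUE, json.dumps({"path": str(path)}))  # ... then hand off to the indexer
```

The `indexer` plays the matching role on the other side: it pops `queue:index` items and imports the referenced JSON files into ElasticSearch.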
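
In the same illustrative spirit, here is a rough sketch of what the `webhook-receiver` does conceptually: accept a GitHub webhook delivery over HTTP and turn it into a `download` queue item. The port, queue key, and item shape are assumptions, and webhook signature verification (which a real receiver should do) is omitted for brevity.

```python
# Illustrative sketch only -- the real webhook-receiver's routes and
# payload handling may differ; signature verification is omitted.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

import redis  # redis-py client

r = redis.Redis(decode_responses=True)
DOWNLOAD_QUEUE = "queue:download"  # same hypothetical key as in the sketch above


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")

        # GitHub "issues" events carry the issue under "issue"; the queue
        # item shape below is an assumption.
        issue_number = payload.get("issue", {}).get("number")
        if issue_number is not None:
            r.rpush(DOWNLOAD_QUEUE, json.dumps({"issue": issue_number}))

        self.send_response(204)  # acknowledge the delivery
        self.end_headers()


if __name__ == "__main__":
    # The reverse-proxy component would expose this server under /webhooks/.
    HTTPServer(("127.0.0.1", 9000), WebhookHandler).serve_forever()
```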
- External reverse proxy/Load Balancer: The docker-compose stack does not listen on public ports; it only listens on `127.0.0.1:8080` with an HTTP-only connection. You are responsible for making the application available from the internet. An example configuration for nginx can be found in `/_docs/additional-files/external-nginx.conf`; adapt it as needed.
- Cronjobs/systemd-timer: This repository contains two scripts at `/src/scripts/`: `make-snapshot` and `remove-old-snapshots`. These scripts create daily snapshots of web-bug data and delete snapshots older than one week. It is your responsibility to make sure these scripts are run once per day (see the example after this list).
- Backups: You absolutely should back up the data generated by this app to external backup storage. `/_docs/backups.md` contains additional details and hints.
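
For the cronjob/systemd-timer requirement, a plain crontab is enough. The entries below are only an example: the schedule is arbitrary and the absolute paths are placeholders for wherever you checked out this repo.

```
# Example crontab entries -- adjust paths and times to your setup.
0 3 * * * /opt/this-repo/src/scripts/make-snapshot
0 4 * * * /opt/this-repo/src/scripts/remove-old-snapshots
```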