Handles inserting dimensions into the database once an input file becomes available, then creates an event by sending a message to a dimension-imported Kafka topic so that further processing of the input file can take place.
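The flow described above (insert the dimensions, then publish an event) can be sketched roughly as follows. This is only an illustration and not the service's actual code: the `DimensionStore` and `Producer` interfaces, the in-memory stand-ins, and the `handleExtracted` function are all hypothetical names introduced here, and the event payload is reduced to an instance ID for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// DimensionStore is a hypothetical abstraction over the database.
type DimensionStore interface {
	InsertDimensions(instanceID string, dimensions []string) error
}

// Producer is a hypothetical abstraction over a Kafka producer.
type Producer interface {
	Send(topic, message string) error
}

// memoryStore is an in-memory stand-in for the database, used only in this sketch.
type memoryStore struct{ inserted map[string][]string }

func (s *memoryStore) InsertDimensions(instanceID string, dimensions []string) error {
	if instanceID == "" {
		return errors.New("instance ID must not be empty")
	}
	s.inserted[instanceID] = append(s.inserted[instanceID], dimensions...)
	return nil
}

// memoryProducer is an in-memory stand-in for Kafka, used only in this sketch.
type memoryProducer struct{ sent map[string][]string }

func (p *memoryProducer) Send(topic, message string) error {
	p.sent[topic] = append(p.sent[topic], message)
	return nil
}

// handleExtracted inserts the dimensions first and publishes the output
// event only if the insert succeeded.
func handleExtracted(store DimensionStore, producer Producer, outputTopic, instanceID string, dimensions []string) error {
	if err := store.InsertDimensions(instanceID, dimensions); err != nil {
		return fmt.Errorf("inserting dimensions for instance %q: %w", instanceID, err)
	}
	return producer.Send(outputTopic, instanceID)
}

func main() {
	store := &memoryStore{inserted: map[string][]string{}}
	producer := &memoryProducer{sent: map[string][]string{}}

	err := handleExtracted(store, producer, "dimensions-inserted", "instance-1", []string{"time", "geography"})
	fmt.Println(err, producer.sent["dimensions-inserted"])
}
```

Publishing only after a successful insert means downstream consumers never see an event for dimensions that are not yet stored.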
In order to run the service locally you will need the following:
- Clone the repo
go get github.com/ONSdigital/dp-dimension-importer
- Run kafka and zookeeper
- Run local S3 store
- Run the dataset API, see documentation here
- Run api auth stub, see documentation here
- Run the application with
make debug
Scripts for updating and debugging Kafka can be found in dp-data-tools.
Environment variable | Default | Description |
---|---|---|
BIND_ADDR | :23000 | The host and port to bind to |
SERVICE_AUTH_TOKEN | 4424A9F2-B903-40F4-85F1-240107D1AFAF | The service authorization token |
KAFKA_ADDR | localhost:9092 | The list of kafka hosts |
BATCH_SIZE | 1 | Number of kafka messages that will be batched |
KAFKA_NUM_WORKERS | 1 | The maximum number of kafka messages consumed concurrently |
KAFKA_VERSION | "1.0.2" | The kafka version that this service expects to connect to |
KAFKA_OFFSET_OLDEST | true | Sets the kafka offset to be oldest if true |
KAFKA_SEC_PROTO | unset | If set to TLS, kafka connections will use TLS [1] |
KAFKA_SEC_CLIENT_KEY | unset | PEM for the client key [1] |
KAFKA_SEC_CLIENT_CERT | unset | PEM for the client certificate [1] |
KAFKA_SEC_CA_CERTS | unset | CA cert chain for the server cert [1] |
KAFKA_SEC_SKIP_VERIFY | false | Ignores server certificate issues if true [1] |
DATASET_API_ADDR | http://localhost:21800 | The address of the dataset API |
DIMENSIONS_EXTRACTED_TOPIC | dimensions-extracted | The topic to consume messages from when dimensions are extracted |
DIMENSIONS_EXTRACTED_CONSUMER_GROUP | dp-dimension-importer | The consumer group to consume messages from when dimensions are extracted |
DIMENSIONS_INSERTED_TOPIC | dimensions-inserted | The topic to write output messages when dimensions are inserted |
EVENT_REPORTER_TOPIC | report-events | The topic to write output messages when any errors occur during processing an instance |
GRACEFUL_SHUTDOWN_TIMEOUT | 5s | The graceful shutdown timeout (time.Duration) |
HEALTHCHECK_INTERVAL | 30s | The period of time between health checks (time.Duration) |
HEALTHCHECK_CRITICAL_TIMEOUT | 90s | The period of time after which failing checks will result in critical global check (time.Duration) |
ENABLE_PATCH_NODE_ID | true | If true, the NodeID value for a dimension option stored in Neptune will be sent to dataset API |
Notes:
Environment variable | Default | Description |
---|---|---|
GRAPH_DRIVER_TYPE | "" | string identifier for the implementation to be used (e.g. 'neptune' or 'mock') |
GRAPH_ADDR | "" | address of the database matching the chosen driver type (web socket) |
NEPTUNE_TLS_SKIP_VERIFY | false | flag to skip TLS certificate verification, should only be true when run locally |
When running locally, set NEPTUNE_TLS_SKIP_VERIFY to true. See our Neptune guide for more details.
The /healthcheck endpoint returns the current status of the service. Dependent services are health checked on an interval defined by the HEALTHCHECK_INTERVAL environment variable.
On a development machine a request to the health check endpoint can be made by:
curl localhost:23000/healthcheck
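
The same request can be made from Go, which may be handy in a smoke test. This is a minimal sketch using only the standard library; `healthcheckURL` is a hypothetical helper, and treating a `BIND_ADDR` value that starts with a colon as `localhost` is an assumption for local development. The response body is printed as-is, with no assumptions made about its format.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// healthcheckURL builds the health check URL from a BIND_ADDR-style value
// such as ":23000" or "example.internal:23000".
func healthcheckURL(bindAddr string) string {
	if strings.HasPrefix(bindAddr, ":") {
		// Assumption: a port-only address is reachable on localhost.
		bindAddr = "localhost" + bindAddr
	}
	return "http://" + bindAddr + "/healthcheck"
}

func main() {
	bindAddr := os.Getenv("BIND_ADDR")
	if bindAddr == "" {
		bindAddr = ":23000" // documented default
	}

	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(healthcheckURL(bindAddr))
	if err != nil {
		fmt.Println("health check request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading health check response failed:", err)
		return
	}
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```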
See CONTRIBUTING for details.
Copyright © 2016-2021, Office for National Statistics (https://www.ons.gov.uk)
Released under MIT license, see LICENSE for details.