Architecture
LogHarbour runs on servers and stores a repository of all logs generated by a business application. LogHarbour does not have any UI or human users.
LogHarbour's software is in four parts:
- a client library which links with business application code and implements a Kafka producer. Application code uses this library to write log entries into its logs; other functions in the library implement a query interface to extract logs from the LogHarbour database.
- a service which implements a Kafka consumer, reads log entries from the Kafka stream, and writes them to an ElasticSearch database. This service is referred to as the "LogHarbour writer daemon".
- an ElasticSearch cluster hosting one or more indexes (databases)
- some administrative utilities, which are command-line programs run on demand
Multiple business applications running on multiple servers can all pump log entries into their respective LogHarbour repositories, but these repositories may be hosted on a common cluster of servers and managed with a single writer daemon.
The ElasticSearch database runs on a cluster of three or more servers and organises data in multiple indexes. A dedicated index is created for each realm. (For details about realms, see the page on access controls.)
Every time a new realm is created:
- a new ElasticSearch index is created,
- replication is set up for this index across the cluster of servers so that at least three copies of each record are stored.
- Two roles are created on the index, one with full read-write access and one with read-only access.
- The read-write role's access credentials are appended to the realm ID and other data to create a string. This string is encrypted with a shared key known only to LogHarbour (this shared key is referred to as the token encryption key), and the encrypted string is base64-encoded. The final output is called the write token for the realm.
- The read-only role's access credentials are appended to the realm ID, index URI and other data to create a string, which is then base64-encoded. This base64-encoded output is called the query token for the realm. (Both token constructions are sketched below.)
- The realm-creation tools write the write token and query token for the realm to an output file, so that they may be shared with the business application which intends to use this new realm in LogHarbour.
- The two tokens, plus data about the index and servers, are all written to a special internal ElasticSearch index which holds meta-data for LogHarbour's internal use only. (For details of this data, see Master data for authorisation.)
At this point, the LogHarbour index for the new realm is ready for use.
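To make the token construction concrete, here is a minimal Go sketch. The '|'-separated field layout, the use of AES-GCM for the write token's encryption, and the helper names (makeWriteToken, makeQueryToken) are all assumptions made for illustration; only the overall shape (encrypt-then-base64 for the write token, plain base64 for the query token) comes from the description above.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"log"
	"strings"
)

// makeWriteToken joins the realm ID, index URI, index name and the
// read-write credentials into one string, encrypts it with the shared
// token encryption key (AES-GCM assumed), and base64-encodes the result.
func makeWriteToken(key []byte, realmID, indexURI, indexName, user, password string) (string, error) {
	plain := strings.Join([]string{realmID, indexURI, indexName, user, password}, "|")

	block, err := aes.NewCipher(key) // key must be 16, 24 or 32 bytes
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(plain), nil) // nonce is prefixed to the ciphertext
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// makeQueryToken joins the realm ID, index URI, index name and the
// read-only credentials and base64-encodes them; no encryption is applied.
func makeQueryToken(realmID, indexURI, indexName, user, password string) string {
	plain := strings.Join([]string{realmID, indexURI, indexName, user, password}, "|")
	return base64.StdEncoding.EncodeToString([]byte(plain))
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // 32-byte shared key, example only

	writeToken, err := makeWriteToken(key, "realm42", "https://es1.example.com:9200", "logs-realm42", "rw_user", "rw_secret")
	if err != nil {
		log.Fatal(err)
	}
	queryToken := makeQueryToken("realm42", "https://es1.example.com:9200", "logs-realm42", "ro_user", "ro_secret")

	fmt.Println("write token:", writeToken)
	fmt.Println("query token:", queryToken)
}
```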
The client library has functions which (a) queue log entries for eventual insertion into the ElasticSearch index, and (b) query the index to extract logs for display or analysis.
All functions which queue log entries for insertion require the write token for the realm, and all functions which query the log repository require the query token.
When a client library function needs to queue a message for eventual insertion, it writes a record in the local Kafka stream where the first field is the write token.
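As a rough illustration of this queueing step, here is a sketch using the segmentio/kafka-go producer. The topic name, the LogEntry fields, and the use of a "\x01" separator between the write token and the JSON-encoded entry are assumptions; the description above only states that the write token is the first field of the record.

```go
package main

import (
	"context"
	"encoding/json"
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

// LogEntry is an illustrative shape for one log record; the real
// client library defines its own richer structure.
type LogEntry struct {
	App      string    `json:"app"`
	Module   string    `json:"module"`
	Priority string    `json:"pri"`
	When     time.Time `json:"when"`
	Message  string    `json:"msg"`
}

// queueLogEntry writes one record to the local Kafka stream. The write
// token comes first, followed by the JSON-encoded entry; the "\x01"
// separator is an assumption made for this sketch.
func queueLogEntry(ctx context.Context, w *kafka.Writer, writeToken string, e LogEntry) error {
	body, err := json.Marshal(e)
	if err != nil {
		return err
	}
	value := append([]byte(writeToken+"\x01"), body...)
	return w.WriteMessages(ctx, kafka.Message{Value: value})
}

func main() {
	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"), // local Kafka broker
		Topic: "logharbour",                // topic name is an assumption
	}
	defer w.Close()

	err := queueLogEntry(context.Background(), w, "BASE64-WRITE-TOKEN", LogEntry{
		App: "orders", Module: "billing", Priority: "Info",
		When: time.Now().UTC(), Message: "invoice generated",
	})
	if err != nil {
		log.Fatal(err)
	}
}
```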
When a function needs to query the repository, it base64-decodes the query token and parses the resultant string. From the parsed data, it extracts:
- the URI needed to connect to one of the servers hosting its realm's index
- the name of the index belonging to its realm
- the access credentials for read-only access to the index
Using this information, it connects to the index, fires the query and pulls out the result needed.
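A sketch of this query path, assuming the official go-elasticsearch client and the same '|'-separated token layout used in the earlier sketch (the layout, query body and token contents are illustrative, not a documented format):

```go
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"strings"

	"github.com/elastic/go-elasticsearch/v8"
)

// parseQueryToken base64-decodes the query token and splits out its
// fields. The '|'-separated layout mirrors the earlier sketch and is
// an assumption.
func parseQueryToken(token string) (realmID, uri, index, user, password string, err error) {
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return
	}
	parts := strings.SplitN(string(raw), "|", 5)
	if len(parts) != 5 {
		err = fmt.Errorf("malformed query token")
		return
	}
	return parts[0], parts[1], parts[2], parts[3], parts[4], nil
}

func main() {
	_, uri, index, user, password, err := parseQueryToken("BASE64-QUERY-TOKEN")
	if err != nil {
		log.Fatal(err)
	}

	// Connect with the read-only credentials carried in the token.
	es, err := elasticsearch.NewClient(elasticsearch.Config{
		Addresses: []string{uri},
		Username:  user,
		Password:  password,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Run a simple query against the realm's index.
	res, err := es.Search(
		es.Search.WithIndex(index),
		es.Search.WithBody(strings.NewReader(`{"query":{"match":{"module":"billing"}}}`)),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	fmt.Println(res.Status())
}
```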
The writer daemon runs on one server (or two servers for redundancy) and reads messages coming in on the Kafka stream. For each message, it first separates out the write token, base64-decodes it, decrypts it, and then extracts from it details such as:
- the URI of the ElasticSearch index for this message's realm
- the index name and path
- the user credentials needed to insert records into this index
Using this information, it inserts the log record into ElasticSearch.
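The following sketch puts the daemon's steps together, under the same assumptions as the earlier examples (kafka-go consumer, AES-GCM token encryption, '|'-separated token fields, "\x01" record separator). A production daemon would also cache one ElasticSearch client per realm rather than reconnecting for every message.

```go
package main

import (
	"context"
	"crypto/aes"
	"crypto/cipher"
	"encoding/base64"
	"fmt"
	"log"
	"strings"

	"github.com/elastic/go-elasticsearch/v8"
	"github.com/segmentio/kafka-go"
)

// decryptWriteToken reverses the construction sketched earlier:
// base64-decode, AES-GCM-decrypt with the shared token encryption key,
// then split the '|'-separated fields (the layout is an assumption).
func decryptWriteToken(key []byte, token string) (uri, index, user, password string, err error) {
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return
	}
	if len(raw) < gcm.NonceSize() {
		err = fmt.Errorf("write token too short")
		return
	}
	plain, err := gcm.Open(nil, raw[:gcm.NonceSize()], raw[gcm.NonceSize():], nil)
	if err != nil {
		return
	}
	parts := strings.SplitN(string(plain), "|", 5) // realmID|uri|index|user|password
	if len(parts) != 5 {
		err = fmt.Errorf("malformed write token")
		return
	}
	return parts[1], parts[2], parts[3], parts[4], nil
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // shared token encryption key, example only

	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "logharbour-writer",
		Topic:   "logharbour", // topic name is an assumption
	})
	defer r.Close()

	for {
		msg, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Fatal(err)
		}

		// The write token is the first field of the record; the rest is the log entry JSON.
		token, entry, ok := strings.Cut(string(msg.Value), "\x01")
		if !ok {
			log.Println("skipping malformed record")
			continue
		}
		uri, index, user, password, err := decryptWriteToken(key, token)
		if err != nil {
			log.Println("bad write token:", err)
			continue
		}

		// Connect with the read-write credentials carried in the token.
		// (A production daemon would cache one client per realm.)
		es, err := elasticsearch.NewClient(elasticsearch.Config{
			Addresses: []string{uri}, Username: user, Password: password,
		})
		if err != nil {
			log.Println(err)
			continue
		}
		res, err := es.Index(index, strings.NewReader(entry))
		if err != nil {
			log.Println("insert failed:", err)
			continue
		}
		res.Body.Close()
	}
}
```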
The administrative utilities perform three tasks:
- when a new realm is created, they create a new ElasticSearch index, set up replication for it across the cluster, create two access roles on that index (one for writing and one for querying), generate the two tokens from these roles as described earlier, and write all this information into a special private ElasticSearch index used only for LogHarbour metadata.
- when an additional write token or query token is required for a realm, they create a new role on the realm's ElasticSearch index, generate a token from that role's credentials, add the token to the set of tokens recorded against the realm's metadata in LogHarbour, and write out the new token to a file so that it may be shared with the business application.
- when a token needs to be deactivated, they remove the corresponding credentials from the ElasticSearch index so that those credentials cease to work, and then remove the token from the LogHarbour metadata for the realm.
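For the deactivation task, the sketch below revokes the credentials by deleting the ElasticSearch user behind the token, via the standard security API (DELETE /_security/user/&lt;name&gt;) over plain HTTP. Whether LogHarbour models credentials as users or roles, the names shown, and the admin credentials are all placeholders, and the update to LogHarbour's metadata index is omitted.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// deactivateToken revokes a token by deleting the ElasticSearch user whose
// credentials the token carries, using the standard security API.
func deactivateToken(esURL, adminUser, adminPassword, tokenUser string) error {
	req, err := http.NewRequest(http.MethodDelete,
		fmt.Sprintf("%s/_security/user/%s", esURL, tokenUser), nil)
	if err != nil {
		return err
	}
	req.SetBasicAuth(adminUser, adminPassword)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	// All names and credentials here are placeholders.
	if err := deactivateToken("https://es1.example.com:9200",
		"lh_admin", "admin-secret", "realm42_ro_2"); err != nil {
		log.Fatal(err)
	}
}
```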