CDRIVER-3775 mongoc_structured_log #1795

Open
wants to merge 82 commits into
base: master
@mdbmes mdbmes commented Nov 19, 2024

This is a revival of an old pull request by @alcaeus to add structured logging to the C driver. (#684)

One question from that PR remains open: should we integrate structured logging more closely with unstructured logging? So far I've kept the previous PR's approach of fully separate subsystems, but this is still up for debate.

The previous PR defined a global log mutex. I've continued that design, but now is the time to consider carefully whether we want a big logging lock. (It's likely not a big deal for perf, but any contention on it would be a problem for multi-threaded apps, so the design is worth getting right before it ships.)

The public API is still similar to the one @alcaeus designed, with minor modifications to support efficient log level filtering and to let the callback own and retain any copies of message data it needs. This can avoid additional copies when apps want to store log messages, e.g. in our own unified test runner or in any application that handles log messages asynchronously.

The internal API has been redesigned. The previous PR required each new type of log message to have functions and structures associated with that specific message format. In this redesign, I've split the single callback into a variable-length list of callbacks, and added a set of macros to make it straightforward to build these function tables. This has some nice properties:

  • Submitting a log entry still doesn't require any deep copies, JSON serialization, or memory allocation, only a stack-allocated table of references.
  • Now the mongoc_structured_log() call site explicitly names the keys that are included in the resulting document.
  • Now it's easy to define ad-hoc log entries or add new values to existing log entries.
  • The list-based approach also makes it easy to define reusable building blocks. This is immediately useful for command redaction and for server descriptions.

Here's a sample invocation:

mongoc_structured_log (
   MONGOC_STRUCTURED_LOG_LEVEL_DEBUG,
   MONGOC_STRUCTURED_LOG_COMPONENT_COMMAND,
   "Command started",
   int32 ("requestId", cluster->request_id),
   server_description (server_stream->sd, SERVER_HOST, SERVER_PORT, SERVER_CONNECTION_ID, SERVICE_ID),
   utf8 ("databaseName", db),
   utf8 ("commandName", "killCursors"),
   int64 ("operationId", operation_id),
   bson_as_json ("command", &doc));

The macros are explained by doc comments in mongoc-structured-log-private.h.
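To make the list-of-callbacks idea concrete, here is a minimal, self-contained sketch of the general technique. It is not the code in this PR: every name prefixed with sketch_ or SKETCH_ is hypothetical, the output is a naive JSON-like string with no escaping, and the real macros in mongoc-structured-log-private.h may differ in every detail. It only illustrates how a call site can build a stack-allocated table of (callback, reference) entries that is rendered later, if at all.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size output buffer, so the sketch never allocates. */
typedef struct {
   char data[256];
   size_t len;
} sketch_buf_t;

/* Append formatted text, truncating silently when the buffer is full. */
static void
sketch_appendf (sketch_buf_t *buf, const char *fmt, ...)
{
   assert (buf->len < sizeof buf->data);
   size_t room = sizeof buf->data - buf->len;
   va_list args;
   va_start (args, fmt);
   int n = vsnprintf (buf->data + buf->len, room, fmt, args);
   va_end (args);
   if (n > 0) {
      buf->len += (size_t) n < room ? (size_t) n : room - 1;
   }
}

/* One table entry: a callback plus borrowed references to its inputs. */
typedef struct sketch_item sketch_item_t;
typedef void (*sketch_item_fn) (const sketch_item_t *item, sketch_buf_t *buf);
struct sketch_item {
   sketch_item_fn fn; /* NULL terminates the table */
   const char *key;
   const char *str;
   int64_t num;
};

static void
sketch_write_utf8 (const sketch_item_t *item, sketch_buf_t *buf)
{
   sketch_appendf (buf, "\"%s\":\"%s\",", item->key, item->str);
}

static void
sketch_write_int64 (const sketch_item_t *item, sketch_buf_t *buf)
{
   sketch_appendf (buf, "\"%s\":%" PRId64 ",", item->key, item->num);
}

/* Call-site macros: each expands to one table entry naming its key. */
#define SKETCH_UTF8(k, v) {sketch_write_utf8, (k), (v), 0}
#define SKETCH_INT64(k, v) {sketch_write_int64, (k), NULL, (v)}
#define SKETCH_END {NULL, NULL, NULL, 0}

/* Walk the table and let each entry serialize itself. */
static void
sketch_render (const sketch_item_t *items, sketch_buf_t *buf)
{
   buf->len = 0;
   buf->data[0] = '\0';
   sketch_appendf (buf, "{");
   for (; items->fn; items++) {
      items->fn (items, buf);
   }
   if (buf->len > 1) {
      buf->data[--buf->len] = '\0'; /* drop the trailing comma */
   }
   sketch_appendf (buf, "}");
}

/* Build the table on the stack; requires at least one item. */
#define SKETCH_LOG(buf, ...)                                   \
   do {                                                        \
      const sketch_item_t table_[] = {__VA_ARGS__, SKETCH_END}; \
      sketch_render (table_, (buf));                           \
   } while (0)

/* Example call site. */
static const char *
sketch_demo (sketch_buf_t *buf)
{
   SKETCH_LOG (buf,
               SKETCH_UTF8 ("message", "Command started"),
               SKETCH_INT64 ("requestId", 42));
   return buf->data;
}
```

In this sketch, sketch_demo() yields {"message":"Command started","requestId":42}. Adding a new kind of value means adding one callback and one macro, and a reusable building block is just a callback that appends several keys, which is roughly the property the list above describes.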

This PR includes updated unified tests from the command logging and monitoring spec, which now pass. This required several other changes to the unified test runner.

Contents:

  • Implement and document a structured logging facility
  • Structured logging items for command and reply redaction
  • Unified test runner support for: observeLogMessages, waitForEvent, $$matchAsDocument, $$matchAsRoot
  • Add serverDescriptionChangedEvent to the unified test runner and unify its two event serialization systems
  • Sync unified tests from the command-logging-and-monitoring spec
  • Enough command logging to pass the CLAM tests
  • bson-dsl support for oid() values
  • Private utilities for dealing with zeroed oids
  • Minor drive-by cleanup

…alization

The command-logging-and-monitoring tests use the serverDescriptionChanged event. In implementing a new event type, it made sense to refactor the implementation for event storage and serialization. The serialization from observeEvents and storeEventsAsEntities had unnecessarily diverged into a late-serialization and an early-serialization style. This commit settles on early serialization: it builds a canonical BSON representation as soon as possible and extracts fields from it later as necessary for comparisons. This leaves fewer opportunities for optimization in some cases, but here the extra simplicity is a good match for the test runner environment.
These aren't long-lived strings; they might be part of a temporary BSON document, as in createEntities.
Use a separate atomic flag instead of trying to make an atomic func pointer, to avoid non-portable hacks.
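As a rough illustration of the "separate atomic flag" approach, the sketch below keeps the handler pointer as an ordinary variable guarded by a mutex and uses a separate atomic boolean for the fast "is logging enabled?" check. It is only a sketch under stated assumptions: it uses C11 <stdatomic.h> and POSIX threads as stand-ins for whatever atomics and locks the driver actually uses, and all sketch_ names are hypothetical rather than taken from this PR.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*sketch_log_fn) (const char *message, void *user_data);

static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;
static sketch_log_fn sketch_handler; /* guarded by sketch_lock */
static void *sketch_user_data;       /* guarded by sketch_lock */
static atomic_bool sketch_enabled;   /* zero-initialized: starts false */

/* Install or clear the handler; the flag is updated under the same lock. */
void
sketch_set_handler (sketch_log_fn fn, void *user_data)
{
   pthread_mutex_lock (&sketch_lock);
   sketch_handler = fn;
   sketch_user_data = user_data;
   atomic_store (&sketch_enabled, fn != NULL);
   pthread_mutex_unlock (&sketch_lock);
}

/* Returns true if the message reached a handler. */
bool
sketch_log (const char *message)
{
   assert (message != NULL);
   /* Fast path: when logging is disabled, return without taking the lock. */
   if (!atomic_load (&sketch_enabled)) {
      return false;
   }
   bool delivered = false;
   pthread_mutex_lock (&sketch_lock);
   /* Re-check under the lock; the handler may have been cleared meanwhile. */
   if (sketch_handler) {
      sketch_handler (message, sketch_user_data);
      delivered = true;
   }
   pthread_mutex_unlock (&sketch_lock);
   return delivered;
}

/* Example handler: counts messages through user_data. */
static void
sketch_count_handler (const char *message, void *user_data)
{
   (void) message;
   ++*(int *) user_data;
}
```

The function pointer itself is never read or written atomically, so no casts between function pointers and atomic integer or pointer types are needed. The handler here runs while the lock is held, which is the "big logging lock" trade-off raised in the description: simple to reason about, but concurrent log calls serialize whenever a handler is installed.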