logging: add clock-jump recovery and tighten Alloy service ordering #1677
Merged
vunnyso reviewed Jan 13, 2026
ffc19de to c244e4a
vunnyso reviewed Jan 14, 2026
- Add ghaf.logging.recovery options and shared clock-jump watcher + recover oneshot.
- Ensure alloy.service is ordered after/requires systemd-journald on client and server.
- Server pipeline: route journald through loki.process, drop entries older than 168h, and align WAL max_segment_age.
Signed-off-by: Everton de Matos <[email protected]>
c244e4a to 3ea543f
vunnyso approved these changes Jan 14, 2026
brianmcgillion approved these changes Jan 14, 2026
Description of Changes
This PR introduces a clock-jump recovery mechanism for Ghaf logging, designed to handle manual or abrupt realtime clock changes that may otherwise disrupt journald ordering and Alloy log shipping. This PR aims to resolve the bug described at https://jira.tii.ae/browse/SSRCSP-7772. Summary of modifications:
- Add ghaf.logging.recovery options and clock-jump watcher + recover oneshot services.
- Ensure alloy.service is ordered after/requires systemd-journald on client and server.
- Implemented in modules/common/logging/common.nix and reusable across all VMs.
- Enabled on admin-vm, as it aggregates and forwards the system logs. It can be enabled for other VMs with different parameters (e.g., thresholdSeconds, intervalSeconds, etc.).
- Server pipeline: route journald through loki.process, drop entries older than 168h, and align WAL max_segment_age. WAL retention and the log-dropping policy are aligned (older_than = 168h), which also matches the Grafana 7-day (168h) default policy.
Performance Evaluation
The ghaf-clock-jump-watcher.service was monitored in two 30-minute window situations:
The following table summarizes the CPU and memory consumption results for both scenarios:
Graph for scenario (i):

Graph for scenario (ii):

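For readers unfamiliar with how a clock-jump watcher of the kind measured above can work, here is a minimal Python sketch of one common approach: sample the realtime and monotonic clocks at a fixed interval and flag a jump when they drift apart by more than a threshold. This is not the implementation from this PR. Only the parameter names thresholdSeconds and intervalSeconds come from the description; ClockSample, clock_jump_seconds, is_clock_jump, watch, and the on_jump callback are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class ClockSample:
    realtime: float   # wall-clock seconds; changes when the clock is set
    monotonic: float  # monotonic seconds; unaffected by clock changes


def take_sample() -> ClockSample:
    return ClockSample(realtime=time.time(), monotonic=time.monotonic())


def clock_jump_seconds(prev: ClockSample, cur: ClockSample) -> float:
    """Signed amount by which wall-clock time moved beyond elapsed monotonic time."""
    return (cur.realtime - prev.realtime) - (cur.monotonic - prev.monotonic)


def is_clock_jump(prev: ClockSample, cur: ClockSample, threshold_seconds: float) -> bool:
    # Forward and backward jumps both count, hence abs().
    return abs(clock_jump_seconds(prev, cur)) >= threshold_seconds


def watch(threshold_seconds, interval_seconds, on_jump, max_iterations=None):
    """Poll the clocks and call on_jump(delta) whenever a jump is detected.

    on_jump is a hypothetical hook; in a real deployment it might trigger a
    recovery unit. max_iterations exists only to make the loop testable.
    """
    prev = take_sample()
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        time.sleep(interval_seconds)
        cur = take_sample()
        if is_clock_jump(prev, cur, threshold_seconds):
            on_jump(clock_jump_seconds(prev, cur))
        prev = cur
        iterations += 1
```

In this sketch a longer interval means fewer wakeups at the cost of slower detection, and the threshold should be large enough that ordinary NTP slewing does not trigger a recovery.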
Type of Change
Related Issues / Tickets
https://jira.tii.ae/browse/SSRCSP-7772
Checklist
make-checks and it passes
Testing Instructions
Applicable Targets
aarch64
aarch64
x86_64
x86_64
x86_64
Installation Method
nixos-rebuild ... switch
Test Steps To Verify:
You can perform the exact same steps as described at https://jira.tii.ae/browse/SSRCSP-7772:
9.1. This step is not mandatory, as the system does it by default when back online