Alarm cleared/not-cleared tracking is not always reliable #14

@jktjkt

In the listing below, the first rousette.service line is shown in red, and the alarm is marked as "not cleared" (the Status column says "active").

roadm-c1 ~ # velia-list-alarms 
   Resource                    Severity  Detail                                                    Timestamp                            Status
⏸   rousette.service            cleared   systemd unit state: (active, running)                     2025-06-12T09:04:39.969290690+00:00  active
                               critical  systemd unit state: (failed, failed)                      2025-06-12T09:16:17.842602430+00:00  
                               critical  systemd unit state: (failed, failed)                      2025-06-12T09:04:38.616300571+00:00  
⏸   cla-sdn-roadm-line.service  critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-11T15:30:12.138506057+00:00  cleared
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-12T10:15:09.152393179+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-12T10:15:08.899430823+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-12T10:15:08.845296504+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-12T07:09:35.168704438+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-12T07:09:35.125342190+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-12T07:09:35.058616208+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-12T07:09:26.297133664+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-12T07:09:26.174499822+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-12T07:09:26.110860892+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-12T06:16:23.651356762+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-12T06:16:23.553184229+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-12T06:16:23.496047639+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-11T23:39:25.518846180+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-11T23:39:25.435106670+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-11T23:39:25.377808645+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-11T16:36:33.865288501+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-11T16:36:33.811046686+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-11T16:36:33.754711641+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-11T15:30:12.325154947+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-11T15:30:12.203141124+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-11T12:21:41.886795795+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-11T12:21:41.783434066+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-11T12:21:41.730486477+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-11T11:27:07.903159419+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-11T11:27:07.793043345+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-11T11:27:07.736087763+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-10T20:13:32.918521510+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-10T20:13:32.667942587+00:00  
                               critical  systemd unit state: (failed, failed-before-auto-restart)  2025-06-10T20:13:32.615481038+00:00  
                               cleared   systemd unit state: (activating, auto-restart-queued)     2025-06-10T16:45:28.661430353+00:00  
                               critical  systemd unit state: (activating, auto-restart)            2025-06-10T16:45:28.576524581+00:00  
⏶   ne:psu1:voltage-in          warning   Sensor value crossed high threshold (245250 > 245000).    2025-06-09T13:50:34.155457827+00:00  cleared
                               cleared   Sensor value is within normal parameters.                 2025-06-09T19:35:57.285562288+00:00  
                               warning   Sensor value crossed high threshold (245250 > 245000).    2025-06-09T19:21:04.209862140+00:00  
                               cleared   Sensor value is within normal parameters.                 2025-06-09T13:52:42.346230651+00:00

However, rousette.service has been up and running for hours:

roadm-c1 ~ # systemctl status rousette --no-pager
● rousette.service - RESTCONFish server
     Loaded: loaded (/usr/lib/systemd/system/rousette.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/rousette.service.d
             └─reset-sysrepo.conf
     Active: active (running) since Thu 2025-06-12 09:16:17 UTC; 5h 54min ago
 Invocation: 779616e50c734f6c95e669457e2aeed7
   Main PID: 23278 (rousette)
        CPU: 10min 13.237s
     CGroup: /system.slice/rousette.service
             └─23278 /usr/bin/rousette

Jun 12 15:10:18 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:19 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:19 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:20 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:21 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:21 roadm-c1 rousette[23278]: change: 57328 bytes
Jun 12 15:10:22 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:23 roadm-c1 rousette[23278]: change: 57328 bytes
Jun 12 15:10:23 roadm-c1 rousette[23278]: change: 57326 bytes
Jun 12 15:10:24 roadm-c1 rousette[23278]: change: 57327 bytes
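
To make the mismatch easier to see, here is a small Python sketch that picks the newest recorded entry for rousette.service from the listing above and derives the status one would expect from it. This is not velia code. The `Event` type, the `expected_status` rule (an alarm counts as cleared only when its newest entry has severity "cleared"), and treating every row of the listing as one event are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One row of the alarm listing (hypothetical type, not a velia structure)."""
    severity: str
    timestamp: str


def latest_event(events):
    # All timestamps in the listing use the same fixed-width format and the
    # same +00:00 offset, so comparing them as strings gives chronological
    # order without losing the nanosecond digits.
    return max(events, key=lambda e: e.timestamp)


def expected_status(events):
    # Assumed rule: the alarm is cleared only if its newest entry is "cleared".
    return "cleared" if latest_event(events).severity == "cleared" else "active"


# The three rousette.service rows copied from the listing above.
rousette = [
    Event("cleared", "2025-06-12T09:04:39.969290690+00:00"),
    Event("critical", "2025-06-12T09:16:17.842602430+00:00"),
    Event("critical", "2025-06-12T09:04:38.616300571+00:00"),
]

if __name__ == "__main__":
    newest = latest_event(rousette)
    print(newest.severity, newest.timestamp, expected_status(rousette))
```

Under these assumptions, the newest recorded entry is the critical one at 09:16:17, the same second that systemctl reports as the start of the current active (running) state, and no later "cleared" entry appears in the listing. The sketch only restates what the listing contains; it does not establish why a clearing entry is missing.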
