Skip to content

drivers/dstate: improve ALARM status safeguards, documentation and handling #2931

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

desertwitch
Copy link
Contributor

fix #2928

work in progress

@AppVeyorBot
Copy link

@jimklimov
Copy link
Member

CI complains about FAIL driver_methods_utest

Copy link
Member

@jimklimov jimklimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks

@desertwitch
Copy link
Contributor Author

CI will fail this again, but I think we need to talk in which direction to go here before further work.
Right now there are some open questions regarding both the safeguard logic and the test cases (which fail).

The safeguard logic assumes that a driver is using status_init() but not alarm_init() (and its related functions).
This is generally logic enough, although not sure how realistic that someone uses modern status_init() & not alarm_init()?

In any case, we can separate this into legacy logic and modern logic quite well (see last commit) - so that's so far so good...
BUT -- Perhaps the safeguard logic should cover a rudimentary driver only using dstate_setinfo but neither functions?
I think that scenario is more realistic than one modern function being used, but not the other?

The test logic (and resulting failures) makes no sense to me, it assumes odd combinations of both modern/legacy code.
This makes it hard to separate into what's legacy code, to activate the safeguards, and what's modern/current standard code.

The segmentation fault results from the safeguard thinking it's a cleared alarm state, nulling the buffer and then the test function subsequently querying a no longer existing ups.alarm (being already deleted) and dereferencing the received null pointer.

alarm_init();
status_set("OL BOOST");
alarm_set("[Test alarm 2]");
status_set("ALARM");
status_set("OB");
alarm_commit();

/* test case #9+#10 from scratch */
status_init();
alarm_init();
status_set("OL BOOST");
status_set("ALARM");
status_set("OB");
alarm_commit();
status_commit();
valueStr = dstate_getinfo("ups.status");
report_0_means_pass(strcmp(valueStr, "ALARM OL BOOST OB"));
printf(" test for ups.status with explicit ALARM set via status_set() and no extra alarm_set(): '%s'; any duplicate ALARM?\n", NUT_STRARG(valueStr));
valueStr = dstate_getinfo("ups.alarm");
report_0_means_pass(strcmp(NUT_STRARG(valueStr), "[N/A]"));
printf(" test for ups.alarm with explicit ALARM set via status_set() and no extra alarm_set(): '%s'; got 1 (injected) alarm\n", NUT_STRARG(valueStr));
/* test case #11+#12 from scratch */
/* flush ups.alarm report */
alarm_init();
alarm_commit();
status_init();
alarm_init();
status_set("OL BOOST");
status_set("ALARM");
status_set("OB");
status_set("LB"); /* Should be honoured */
status_commit();
valueStr = dstate_getinfo("ups.status");
report_0_means_pass(strcmp(valueStr, "ALARM OL BOOST OB LB"));
printf(" test for ups.status with explicit ALARM set via status_set() and no extra alarm_set() nor alarm_commit(): '%s'; is ALARM reported?\n", NUT_STRARG(valueStr));
valueStr = dstate_getinfo("ups.alarm");
/*
report_0_means_pass(valueStr != NULL); // pass if valueStr is NULL
printf(" test for ups.alarm with explicit ALARM set via status_set() and no extra alarm_set() nor alarm_commit(): '%s'; got no alarms spelled out\n", NUT_STRARG(valueStr));
*/
report_0_means_pass(strcmp(NUT_STRARG(valueStr), "[N/A]"));
printf(" test for ups.alarm with explicit ALARM set via status_set() and no extra alarm_set() nor alarm_commit(): '%s'; got 1 (injected) alarm\n", NUT_STRARG(valueStr));

The mixture in these test cases makes no sense, either it should be legacy enough status_set("ALARM") but without any alarm_ functions or it should be the full batch of alarm_ functions. A mixture of both, I think, is a really unrealistic scenario.

I see three credible scenarios:

  • A legacy driver just setting alarm via ups.status without either status_ or alarm_ functions.
  • A legacy driver setting alarm via ups.status with status_ functions but no alarm_ functions.
  • A modern driver setting alarm via and alarm_ functions only, in which case it should persist status_ changes.

So we see that legacy drivers could couple alarm states with the ups.status and should safeguard that accordingly, while also keeping the alarm state internally coupled to the status (as done in my last commit). In contrast, a modern driver should be able to call status_ functions (including status_init()) without resetting appropriately published (and still active) alarm_ states.

@desertwitch
Copy link
Contributor Author

desertwitch commented Apr 30, 2025

P.S. Maybe the cleanest solution would be to fix up the - on first glance +/- 3 remaining drivers - and completely remove this fallback logic in favour of appropriately using both the status_ and alarm_ functions (separately). Dragging along such legacy logic adds unnecessary complexity and coupling of alarm states to the UPS status... something the alarm_ functions were probably intended to decouple in the first place?

We can theoretically still show a manually set ALARM state in the status for visibility reasons, but just not make any assumptions and subsequently not trigger any further modern functions and injections from it?

Basically this would entail only removing these sections and a manually set ALARM state would be treated like any other textual status of no special significance? I think even as a textual status it would still raise the upsmon notifications? The only thing we'd miss out on is the injection of a placeholder ups.alarm with N/A which seems kind of pointless in this contrast?

nut/drivers/dstate.c

Lines 1797 to 1809 in dfe6062

if (!strcasecmp(token, "ALARM")) {
/* Drivers really should not raise alarms this way,
* but for the sake of third-party forks, we handle
* the possibility...
*/
upsdebugx(2, "%s: (almost) ignoring ALARM set as a status", __func__);
if (!alarm_status && !alarm_active && strlen(alarm_buf) == 0) {
alarm_init(); /* no-op currently, but better be proper about it */
alarm_set("[N/A]");
}
alarm_status++;
return 0;
}

nut/drivers/dstate.c

Lines 1872 to 1875 in dfe6062

if (!alarm_active && alarm_status && !strcmp(alarm_buf, "[N/A]")) {
upsdebugx(2, "%s: Assume sloppy driver coding that ignored alarm methods and used status_set(\"ALARM\") instead: commit the injected N/A ups.alarm value", __func__);
alarm_commit();
}

Isn't the whole point of the alarm_ functions to let alarms persist during status resets of the drivers? If the driver is not using that in the first place and couples it with the UPS status against better practice, why even bother to migrate it into a decoupled state that we then make dependent on the UPS status again (for removal, effectively coupling it all over)?

... or is this for upsmon to have a ups.alarm value for its notifications?
Maybe a simple check to detect a non-existing ups.alarm in upsmon instead and not announce that?
If we want to keep this here no matter what, see post above for legacy safeguard strategies that we could do. 😉

@jimklimov
Copy link
Member

Thinking (slowly and in the background though, got some work to catch up with)...
It may well be that I did not see the forest behind one tree I was chopping at here (for #2708 related work).

@desertwitch
Copy link
Contributor Author

Thinking (slowly and in the background though, got some work to catch up with)... It may well be that I did not see the forest behind one tree I was chopping at here (for #2708 related work).

No worries, just wanted to drain my thoughts to "paper" (sorry it ended up in a wall of text), mostly so I don't forget. I'll keep this as is for now, but will get started on a second alternative PR on how it would look without legacy code and simply handling textual alarms vs. modern alarms in upsmon instead (not needing to inject variables that don't exist in dstate). For later comparison. 😎

@AppVeyorBot
Copy link

desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
desertwitch added a commit to desertwitch/nut that referenced this pull request Apr 30, 2025
@desertwitch
Copy link
Contributor Author

superseded by #2934

@desertwitch desertwitch closed this May 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dstate/dummy-ups: ALARM status does not clear after dirty entrance into state
3 participants