Startup loop #1870

Closed
mstormi opened this issue May 28, 2024 · 7 comments · Fixed by #1897
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@mstormi
Contributor

mstormi commented May 28, 2024

At times, after updating the openHAB package, startup gets stuck in a loop, spitting out lots of Java component errors.
I've been wondering whether we can supervise startup to detect and break these loops.
Unsure how, though.
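
One way to picture "detecting" such a loop is to watch openhab.log for a burst of ERROR lines. The following is only a minimal sketch, not anything openHABian ships: the function name, the threshold of 50 errors, and the 60-second window are made up for illustration, and the line shape (`2024-08-10 12:00:00.123 [ERROR] [component] - message`) is an assumption about the log format.

```python
import re
from collections import deque
from datetime import datetime, timedelta

# Assumed line shape: "2024-08-10 12:00:00.123 [ERROR] [some.component] - message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{3} \[(\w+)\s*\]")


def looks_like_startup_loop(lines, max_errors=50, window=timedelta(seconds=60)):
    """Return True if more than max_errors ERROR lines fall inside one sliding window.

    Lines are expected in chronological order, as they appear in the log file.
    """
    recent = deque()  # timestamps of ERROR lines still inside the window
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group(2) != "ERROR":
            continue  # stack-trace continuation lines and other levels are ignored
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        recent.append(ts)
        while ts - recent[0] > window:
            recent.popleft()  # drop errors that are older than the window
        if len(recent) > max_errors:
            return True
    return False
```

A caller could feed it the tail of the log, for example `looks_like_startup_loop(open(path, encoding="utf-8"))`, and only then decide whether any action is justified. Sensible thresholds would have to be tuned against real logs.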

Maybe, if a loop is detected, run pkill -9 java, and perhaps add that as an option to the update script?
Add openhab-cli clean-cache, too? (I think it's done on package install, so it may not help to do it again, but I'm not 100% sure.)
Or maybe add a systemd timer to do a regular health check? Note that this can be a double-edged sword should it mistakenly detect OH to be down. It should only be active for, say, some hours after the openhab service startup time.
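
The guard described above (act only for a few hours after service start, and not on a single bad reading) could look roughly like this. It is a sketch under assumptions: `should_intervene`, the two-hour watch period, and the three-failure minimum are hypothetical, and the service start time is assumed to come from something like the `ActiveEnterTimestamp` property that `systemctl show` reports.

```python
from datetime import datetime, timedelta


def should_intervene(service_started, now, failed_checks,
                     watch_period=timedelta(hours=2), min_failures=3):
    """Decide whether a health check is allowed to act on a failing openHAB.

    service_started: when the service last became active (assumed to be known)
    failed_checks:   number of consecutive failed health checks so far
    """
    age = now - service_started
    if age < timedelta(0) or age > watch_period:
        # Outside the watch period: never touch a long-running instance.
        return False
    # Require several consecutive failures to reduce false positives.
    return failed_checks >= min_failures
```

Requiring consecutive failures is one way to limit the risk of mistakenly treating a healthy instance as down. What the health check itself tests is left open here.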

mstormi added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels on May 28, 2024
@BClark09
Member

BClark09 commented May 29, 2024

I've not had this myself, but I have seen many people talk about this issue. Is the openHAB process (i.e. Karaf) restarting, or is the loop happening within openHAB's functions? If the former, then we can do a lot if an exit code is issued.

> Add openhab-cli clean-cache, too? (I think it's done on package install, so it may not help to do it again, but I'm not 100% sure.)

It is done on package install or on update.

> Or maybe add a systemd timer to do a regular health check?

There are things we can test by polling the activeness of core bundles: openhab/openhab-linuxpkg#138
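
As a rough illustration of what "polling the activeness of core bundles" might involve, the sketch below parses text in the table layout that Karaf's `bundle:list` command prints and reports required bundles that are not in the Active state. This is not taken from the linked issue: the function name, the `openHAB Core` name prefix, and the exact column layout (ID, State, Lvl, Version, Name) are assumptions for the example.

```python
import re

# Columns are separated either by the box-drawing bar or by a plain pipe.
_SEP = re.compile(r"\s*[│|]\s*")


def inactive_bundles(bundle_list_output, required_prefixes=("openHAB Core",)):
    """Return the names of required bundles whose state is not 'Active'.

    bundle_list_output: text assumed to look like the output of `bundle:list`
    required_prefixes:  hypothetical name prefixes identifying the core bundles
    """
    prefixes = tuple(required_prefixes)
    problems = []
    for line in bundle_list_output.splitlines():
        # maxsplit=4 keeps the bundle name intact as the fifth column.
        cols = _SEP.split(line.strip(), maxsplit=4)
        if len(cols) < 5 or not cols[0].isdigit():
            continue  # header, ruler, or unrelated line
        state, name = cols[1], cols[4]
        if name.startswith(prefixes) and state != "Active":
            problems.append(name)
    return problems
```

An empty result would mean every matching bundle reported Active at the time of the poll. How the console output is obtained is left out of the sketch.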

> I've been wondering whether we can supervise startup to detect and break these loops.

I've been wondering if it's related to the amount of time that openHAB takes to shut down or restart. If it takes more than 120 seconds, then the openHAB service is killed with a SIGTERM, and I wonder if this has any effect on the quality of the update.

@mstormi
Contributor Author

mstormi commented May 29, 2024

> Is the openHAB process (i.e. Karaf) restarting, or is the loop happening within openHAB's functions? If the former, then we can do a lot if an exit code is issued.

No, it's within openHAB. Lots of Java exceptions.

@ecdye
Contributor

ecdye commented May 29, 2024

So are we just trying to add a temporary fix on the openHABian side until the underlying bug can be corrected with openHAB and its Java process?

@mstormi
Contributor Author

mstormi commented May 29, 2024

Yeah, although AFAIK there's no such bug filed with openHAB.

@miloit

miloit commented May 30, 2024

I have never seen this, but if you have, @mstormi, then file an issue so that the girls and boys are aware of it.

@mstormi
Contributor Author

mstormi commented Jul 27, 2024

> I've been wondering if it's related to the amount of time that openHAB takes to shut down or restart. If it takes more than 120 seconds, then the openHAB service is killed with a SIGTERM, and I wonder if this has any effect on the quality of the update.

As said, the issue here is Java exceptions, so from systemd's perspective the service (= the Java process) is running, and I don't think it's related to your 120-second timeout.

@mstormi
Contributor Author

mstormi commented Aug 10, 2024

Attached is the openhab.log for such an occurrence (I haven't analyzed it).
It happened on the first start after upgrading the openhab packages from 4.2.0-1 to 4.2.1-1.

continuous-restart.log
