Just testing controller HA (using drbd-reactor) and simulating failures.
drbd-reactor is working nicely, but the satellite nodes aren't recovering automatically.
Here is what I see after a failure:
However, if I restart the satellite daemon on both nodes, here is what I see:
I could work out a process to restart all satellite nodes after a controller failure, but I would expect there to be a way for the satellite service to "self-heal" without a restart.
Thanks
Daniel
Yes, once the satellite is up again, the controller should automatically reconnect to it and bring the resource back to UpToDate. Can you verify that the satellite 1) did start and 2) is shown as Online in linstor node list?
If that looks fine, please check for ErrorReports (via err list) to see whether something happened that could help us understand why LINSTOR did not properly restore the state.
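For anyone who wants to collect those three pieces of information in one go, here is a rough sketch in Python. It only shells out to the commands mentioned above and prints whatever they return. Two things in it are assumptions rather than anything stated in this thread: the systemd unit name linstor-satellite, and spelling err list out as error-reports list. The systemctl check is only meaningful when run on the satellite node itself, while the linstor commands need a client that can reach the active controller.

```python
import subprocess

# Checks suggested in the reply above.
# ASSUMPTION: the satellite runs as the systemd unit "linstor-satellite".
# ASSUMPTION: "error-reports list" is the long form of "err list".
CHECKS = [
    ("satellite service state", ["systemctl", "is-active", "linstor-satellite"]),
    ("node list", ["linstor", "node", "list"]),
    ("error reports", ["linstor", "error-reports", "list"]),
]


def run_checks(run=subprocess.run):
    """Run each check and return (label, exit code, stdout) tuples.

    `run` is injectable so the function can be exercised without LINSTOR.
    """
    results = []
    for label, cmd in CHECKS:
        try:
            proc = run(cmd, capture_output=True, text=True, check=False)
            results.append((label, proc.returncode, proc.stdout.strip()))
        except FileNotFoundError:
            # The binary is not installed on this machine.
            results.append((label, None, "command not found: " + cmd[0]))
    return results


if __name__ == "__main__":
    for label, code, output in run_checks():
        print(f"== {label} (exit code: {code}) ==")
        print(output)
```

Running this right after the simulated controller failure, and again after the manual satellite restart, gives two outputs that can be pasted into the issue side by side.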