Fix start up exception in TabletServer after the introduction of the FATE AccumuloStore in the Manager #4166

@cshannon

Description

The Manager was updated in #4133 to support multiple Fate instance types and to use an AccumuloStore for user tables. That change introduced an exception that now appears in the tablet server log file on start up.

Currently the AccumuloStore uses the AgeOff store (just like ZooKeeper), so on initialization the list of existing transactions is loaded. This means that while the Manager is still starting up and initializing the store, it scans the Fate table, which in turn causes the tablet server that accepts the scan to log an exception. The exception occurs because the tablet server loads a tablet and tries to send a message back to the Manager over the Manager RPC channel saying that the tablet was loaded. Because the Manager hasn't finished starting yet, the tablet server can't connect back over the RPC channel, so an error is logged.

Eventually a retry happens and the message is passed back successfully, so this is not a huge issue: the scan completes and the message eventually gets sent. It does, however, cause an ugly error message on start up.
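To illustrate the fail-then-retry behavior described above, here is a minimal, self-contained sketch. It is not Accumulo code: `StartupRetrySketch`, `FakeManager`, `tabletLoaded`, and `notifyWithRetry` are hypothetical names invented for this example, and the backoff values are arbitrary. It only assumes that the receiver rejects the first few calls while it is still starting, and that the sender reports each failure and tries again.

```java
import java.io.IOException;

// Hypothetical sketch; none of these names come from the Accumulo code base.
class StartupRetrySketch {

  /** Stand-in for a Manager whose RPC endpoint is not ready for the first N calls. */
  static class FakeManager {
    private final int failuresBeforeReady;
    private int calls = 0;

    FakeManager(int failuresBeforeReady) {
      this.failuresBeforeReady = failuresBeforeReady;
    }

    void tabletLoaded(String tablet) throws IOException {
      calls++;
      if (calls <= failuresBeforeReady) {
        throw new IOException("manager not ready");
      }
    }
  }

  /**
   * Tries to deliver the "tablet loaded" message, doubling the wait after each
   * failure. Returns the attempt number that succeeded, or -1 if every attempt
   * failed or the thread was interrupted while waiting.
   */
  static int notifyWithRetry(FakeManager manager, String tablet, int maxAttempts,
      long initialBackoffMs) {
    long backoffMs = initialBackoffMs;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        manager.tabletLoaded(tablet);
        return attempt;
      } catch (IOException e) {
        // This is the point where the real system logs the start-up error.
        System.out.println("attempt " + attempt + " failed: " + e.getMessage());
        try {
          Thread.sleep(backoffMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return -1;
        }
        backoffMs = Math.min(backoffMs * 2, 1000);
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // The fake Manager rejects two calls, so the third attempt succeeds.
    int attempt = notifyWithRetry(new FakeManager(2), "fate-tablet", 5, 10);
    System.out.println("delivered on attempt " + attempt);
  }
}
```

In this sketch the failed attempts are the noisy part. The message still arrives once the receiver is ready, which matches the observation that the scan and the notification both eventually succeed.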

This may be resolved by #4130 and #3964, but I am adding this issue to track it separately.

Metadata

Assignees: No one assigned

Labels: bug (This issue has been verified to be a bug.)

Status: ✅ Done