This repository was archived by the owner on Jun 6, 2024. It is now read-only.

Updating Nimbus SessionKey is ignored. Client continues proposing new blocks with the old key on runtime upgrade #75

@Garandor

Description


I'm seeing that after calling setKeys to update the node's Nimbus key, the node continues proposing with the old SessionKey.

Our implementation of AccountLookup uses pallet_session::key_owner to map NimbusId -> AccountId, so the Nimbus key must match what's set in pallet_session::queuedKeys for the node to produce valid blocks.

Because of this, our polkadot-launch --chain parachain-local nodes cannot start block production after a runtime upgrade that enables Nimbus on a chain that didn't use it before.

Expected Results

The node queries the current Nimbus key from the runtime and uses that key to author blocks.

Analysis

The function the client uses to build the digest is nimbus_consensus::seal_header (in nimbus-consensus/src/lib.rs), which calls SyncCryptoStore::sign_with.

The Rustdocs of sign_with state:

Given a list of public keys, find the first supported key and sign the provided message with that key.

The problem is that neither of the two ways to change your node's session keys removes the old keys. author_rotateKeys (which internally calls SessionKeys::generate_session_keys) and author_insertKey both add the new keys to the crypto store, but leave the existing ones in place.

So after running one of the above, there will be multiple keys in the store matching the nmbs session key type, and Nimbus consensus implicitly picks the first one to propose blocks. Those blocks are then rejected by the runtime because the signing key doesn't match the key set in pallet_session::queuedKeys, which prevents the node from producing blocks.
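The selection problem can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyKeystore`, `first_key`, and `key_matching` are hypothetical names invented for illustration and are not part of the Substrate keystore or Nimbus APIs. It shows how "pick the first key of this type" keeps returning the old key once a new one has been added, and how selecting by the key the runtime expects would avoid that.

```rust
// Toy model only: none of these names exist in Substrate or Nimbus.
type KeyTypeId = [u8; 4];

/// The `nmbs` key type used for Nimbus session keys.
const NIMBUS: KeyTypeId = *b"nmbs";

/// Keys are kept in insertion order as (key type, public key) pairs.
struct ToyKeystore {
    keys: Vec<(KeyTypeId, Vec<u8>)>,
}

impl ToyKeystore {
    fn new() -> Self {
        Self { keys: Vec::new() }
    }

    /// Adding a key never removes an existing one, mirroring the
    /// behaviour described for author_rotateKeys / author_insertKey.
    fn insert(&mut self, key_type: KeyTypeId, public: &[u8]) {
        self.keys.push((key_type, public.to_vec()));
    }

    /// "First match" selection: returns the first stored key of the type.
    fn first_key(&self, key_type: KeyTypeId) -> Option<&[u8]> {
        self.keys
            .iter()
            .find(|(t, _)| *t == key_type)
            .map(|(_, p)| p.as_slice())
    }

    /// Selection by the key the runtime expects (hypothetical fix):
    /// returns that key only if the store actually holds it.
    fn key_matching(&self, key_type: KeyTypeId, expected: &[u8]) -> Option<&[u8]> {
        self.keys
            .iter()
            .find(|(t, p)| *t == key_type && p.as_slice() == expected)
            .map(|(_, p)| p.as_slice())
    }
}
```

With an old key inserted before a new one, `first_key(NIMBUS)` keeps returning the old key, while `key_matching(NIMBUS, new_key)` returns the new one. How the real client would learn the expected key from the runtime is left open here.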

Workaround

  1. Stop the node
  2. Find the keystore folder and manually delete all old Nimbus keys that don't match the key you provided in the most recent call to setKeys. Nimbus key file names start with 6e6d6273 (nmbs in hex), followed by the public key.
  3. Restart the node

This is especially problematic on polkadot-launch --chain something-local deployments that run nodes with --alice etc., because their well-known keys are not held in the keystore on the filesystem and thus cannot be deleted.
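For nodes whose keys do live on disk, step 2 of the workaround can be sketched as a small helper that lists the stale Nimbus key files. This is a sketch, not a tested tool: the function names are hypothetical, the keystore path must be supplied by the caller, and it assumes file names have the form described above (the `6e6d6273` prefix followed by the hex public key). It only reports candidates and deletes nothing.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hex encoding of the `nmbs` key type, as given in the workaround.
const NIMBUS_PREFIX: &str = "6e6d6273";

/// Lowercase hex encoding of a byte slice.
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Returns the file names that look like Nimbus keys but do not belong to
/// `current_public_hex`, the public key last passed to setKeys
/// (hex, with or without a leading "0x").
fn stale_nimbus_key_files(names: &[String], current_public_hex: &str) -> Vec<String> {
    let public = current_public_hex
        .strip_prefix("0x")
        .unwrap_or(current_public_hex)
        .to_lowercase();
    let keep = format!("{}{}", NIMBUS_PREFIX, public);
    names
        .iter()
        .filter(|name| {
            let lower = name.to_lowercase();
            lower.starts_with(NIMBUS_PREFIX) && lower != keep
        })
        .cloned()
        .collect()
}

/// Lists the names of regular files in a keystore directory.
fn keystore_file_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            names.push(entry.file_name().to_string_lossy().into_owned());
        }
    }
    Ok(names)
}
```

Feeding the output of `keystore_file_names` into `stale_nimbus_key_files` yields the files to review by hand while the node is stopped. Keys of other types are never reported.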
