Failed to configure Operator Mode in Helm Chart #1023

@dopic

Description

What version were you using?

NATS Server Version: 2.11.5
Helm Chart Version: 1.3.8 (running a 3 replica cluster)

What environment was the server running in?

EKS with Auto Mode (Kubernetes version 1.33)

Is this defect reproducible?

  • Create the NATS configuration:
nsc init
nsc add operator <OPERATOR_NAME> --sys
nsc env -o <OPERATOR_NAME>
nsc generate config --nats-resolver
# Generated file
# Operator named tenant_dev12
operator: eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiJZT1ZEQ05ZQU9PNjdDMkNSS0taSDVGVExXVEpXRU1RSkVUVVpKWjZEVlpINkM3Qk8yS1JRIiwiaWF0IjoxNzUxODA0MDY3LCJpc3MiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hbWUiOiJ0ZW5hbnRfZGV2ZWxvcG1lbnQiLCJzdWIiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hdHMiOnsic3lzdGVtX2FjY291bnQiOiJBRFBVQUpWUjJCT1RBWjY1UTVXT1o0WVVSR0JWWlNVTUNMRjU3U0lMUDYzNzJBU1UySEtJVDNCQSIsInR5cGUiOiJvcGVyYXRvciIsInZlcnNpb24iOjJ9fQ.gW3SODy32S7g0YFFKW0EyZ-2MtD4U0rdjPYNtQZ5_VhG8krpXOolvHwLLPXb2d8cpHZ6At5X4Ke7w-N5ikJfAg
# System Account named SYS
system_account: ADPUAJVR2BOTAZ65Q5WOZ4YURGBVZSUMCLF57SILP6372ASU2HKIT3BA

# configuration of the nats based resolver
resolver {
    type: full
    # Directory in which the account jwt will be stored
    dir: './jwt'
    # In order to support jwt deletion, set to true
    # If the resolver type is full delete will rename the jwt.
    # This is to allow manual restoration in case of inadvertent deletion.
    # To restore a jwt, remove the added suffix .delete and restart or send a reload signal.
    # To free up storage you must manually delete files with the suffix .delete.
    allow_delete: false
    # Interval at which a nats-server with a nats based account resolver will compare
    # it's state with one random nats based account resolver in the cluster and if needed, 
    # exchange jwt and converge on the same set of jwt.
    interval: "2m"
    # Timeout for lookup requests in case an account does not exist locally.
    timeout: "1.9s"
}


# Preload the nats based resolver with the system account jwt.
# This is not necessary but avoids a bootstrapping system account. 
# This only applies to the system account. Therefore other account jwt are not included here.
# To populate the resolver:
# 1) make sure that your operator has the account server URL pointing at your nats servers.
#    The url must start with: "nats://" 
#    nsc edit operator --account-jwt-server-url nats://localhost:4222
# 2) push your accounts using: nsc push --all
#    The argument to push -u is optional if your account server url is set as described.
# 3) to prune accounts use: nsc push --prune 
#    In order to enable prune you must set above allow_delete to true
# Later changes to the system account take precedence over the system account jwt listed here.
resolver_preload: {
	ADPUAJVR2BOTAZ65Q5WOZ4YURGBVZSUMCLF57SILP6372ASU2HKIT3BA: eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiJCRVdFRjdUVkhGWUlCVkpJWldEMlJRWUs1TEdDNE9IMlVLUUQyS0pPWkZGN0RaNVpLR01RIiwiaWF0IjoxNzUxODA0MDY3LCJpc3MiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hbWUiOiJTWVMiLCJzdWIiOiJBRFBVQUpWUjJCT1RBWjY1UTVXT1o0WVVSR0JWWlNVTUNMRjU3U0lMUDYzNzJBU1UySEtJVDNCQSIsIm5hdHMiOnsiZXhwb3J0cyI6W3sibmFtZSI6ImFjY291bnQtbW9uaXRvcmluZy1zdHJlYW1zIiwic3ViamVjdCI6IiRTWVMuQUNDT1VOVC4qLlx1MDAzZSIsInR5cGUiOiJzdHJlYW0iLCJhY2NvdW50X3Rva2VuX3Bvc2l0aW9uIjozLCJkZXNjcmlwdGlvbiI6IkFjY291bnQgc3BlY2lmaWMgbW9uaXRvcmluZyBzdHJlYW0iLCJpbmZvX3VybCI6Imh0dHBzOi8vZG9jcy5uYXRzLmlvL25hdHMtc2VydmVyL2NvbmZpZ3VyYXRpb24vc3lzX2FjY291bnRzIn0seyJuYW1lIjoiYWNjb3VudC1tb25pdG9yaW5nLXNlcnZpY2VzIiwic3ViamVjdCI6IiRTWVMuUkVRLkFDQ09VTlQuKi4qIiwidHlwZSI6InNlcnZpY2UiLCJyZXNwb25zZV90eXBlIjoiU3RyZWFtIiwiYWNjb3VudF90b2tlbl9wb3NpdGlvbiI6NCwiZGVzY3JpcHRpb24iOiJSZXF1ZXN0IGFjY291bnQgc3BlY2lmaWMgbW9uaXRvcmluZyBzZXJ2aWNlcyBmb3I6IFNVQlNaLCBDT05OWiwgTEVBRlosIEpTWiBhbmQgSU5GTyIsImluZm9fdXJsIjoiaHR0cHM6Ly9kb2NzLm5hdHMuaW8vbmF0cy1zZXJ2ZXIvY29uZmlndXJhdGlvbi9zeXNfYWNjb3VudHMifV0sImxpbWl0cyI6eyJzdWJzIjotMSwiZGF0YSI6LTEsInBheWxvYWQiOi0xLCJpbXBvcnRzIjotMSwiZXhwb3J0cyI6LTEsIndpbGRjYXJkcyI6dHJ1ZSwiY29ubiI6LTEsImxlYWYiOi0xfSwic2lnbmluZ19rZXlzIjpbIkFERk4zWEZQSUo0MlpUVUdBN0xSUEpVUjdQT0M1MklSRDNCMlpVQTUyRUNITlUyQjI2T0NaWUFMIl0sImRlZmF1bHRfcGVybWlzc2lvbnMiOnsicHViIjp7fSwic3ViIjp7fX0sImF1dGhvcml6YXRpb24iOnt9LCJ0eXBlIjoiYWNjb3VudCIsInZlcnNpb24iOjJ9fQ.uCRsVKprN33p45DYRxYjrSQw6YxIPL5A8DE0q-_u5RPZrgwyAtldhRsWBRsuJMm2tQeAK6rnk3aT5ry0pP00Cg,
}


  • Configure the Helm chart with the OPERATOR_JWT, SYSTEM_ACCOUNT_ID, and SYSTEM_ACCOUNT_JWT values taken from the configuration file generated in the previous step.
# example configuration:
merge:
    operator: eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiJZT1ZEQ05ZQU9PNjdDMkNSS0taSDVGVExXVEpXRU1RSkVUVVpKWjZEVlpINkM3Qk8yS1JRIiwiaWF0IjoxNzUxODA0MDY3LCJpc3MiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hbWUiOiJ0ZW5hbnRfZGV2ZWxvcG1lbnQiLCJzdWIiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hdHMiOnsic3lzdGVtX2FjY291bnQiOiJBRFBVQUpWUjJCT1RBWjY1UTVXT1o0WVVSR0JWWlNVTUNMRjU3U0lMUDYzNzJBU1UySEtJVDNCQSIsInR5cGUiOiJvcGVyYXRvciIsInZlcnNpb24iOjJ9fQ.gW3SODy32S7g0YFFKW0EyZ-2MtD4U0rdjPYNtQZ5_VhG8krpXOolvHwLLPXb2d8cpHZ6At5X4Ke7w-N5ikJfAg
    resolver_preload:
      ADPUAJVR2BOTAZ65Q5WOZ4YURGBVZSUMCLF57SILP6372ASU2HKIT3BA: eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiJCRVdFRjdUVkhGWUlCVkpJWldEMlJRWUs1TEdDNE9IMlVLUUQyS0pPWkZGN0RaNVpLR01RIiwiaWF0IjoxNzUxODA0MDY3LCJpc3MiOiJPRE5aSE5DNjZNRjdQQU9aWVpTV0JGRjdFWkNRMk5TRUJXN0QzWjVGQVBaSzRWUTJVNjJDSVpLVCIsIm5hbWUiOiJTWVMiLCJzdWIiOiJBRFBVQUpWUjJCT1RBWjY1UTVXT1o0WVVSR0JWWlNVTUNMRjU3U0lMUDYzNzJBU1UySEtJVDNCQSIsIm5hdHMiOnsiZXhwb3J0cyI6W3sibmFtZSI6ImFjY291bnQtbW9uaXRvcmluZy1zdHJlYW1zIiwic3ViamVjdCI6IiRTWVMuQUNDT1VOVC4qLlx1MDAzZSIsInR5cGUiOiJzdHJlYW0iLCJhY2NvdW50X3Rva2VuX3Bvc2l0aW9uIjozLCJkZXNjcmlwdGlvbiI6IkFjY291bnQgc3BlY2lmaWMgbW9uaXRvcmluZyBzdHJlYW0iLCJpbmZvX3VybCI6Imh0dHBzOi8vZG9jcy5uYXRzLmlvL25hdHMtc2VydmVyL2NvbmZpZ3VyYXRpb24vc3lzX2FjY291bnRzIn0seyJuYW1lIjoiYWNjb3VudC1tb25pdG9yaW5nLXNlcnZpY2VzIiwic3ViamVjdCI6IiRTWVMuUkVRLkFDQ09VTlQuKi4qIiwidHlwZSI6InNlcnZpY2UiLCJyZXNwb25zZV90eXBlIjoiU3RyZWFtIiwiYWNjb3VudF90b2tlbl9wb3NpdGlvbiI6NCwiZGVzY3JpcHRpb24iOiJSZXF1ZXN0IGFjY291bnQgc3BlY2lmaWMgbW9uaXRvcmluZyBzZXJ2aWNlcyBmb3I6IFNVQlNaLCBDT05OWiwgTEVBRlosIEpTWiBhbmQgSU5GTyIsImluZm9fdXJsIjoiaHR0cHM6Ly9kb2NzLm5hdHMuaW8vbmF0cy1zZXJ2ZXIvY29uZmlndXJhdGlvbi9zeXNfYWNjb3VudHMifV0sImxpbWl0cyI6eyJzdWJzIjotMSwiZGF0YSI6LTEsInBheWxvYWQiOi0xLCJpbXBvcnRzIjotMSwiZXhwb3J0cyI6LTEsIndpbGRjYXJkcyI6dHJ1ZSwiY29ubiI6LTEsImxlYWYiOi0xfSwic2lnbmluZ19rZXlzIjpbIkFERk4zWEZQSUo0MlpUVUdBN0xSUEpVUjdQT0M1MklSRDNCMlpVQTUyRUNITlUyQjI2T0NaWUFMIl0sImRlZmF1bHRfcGVybWlzc2lvbnMiOnsicHViIjp7fSwic3ViIjp7fX0sImF1dGhvcml6YXRpb24iOnt9LCJ0eXBlIjoiYWNjb3VudCIsInZlcnNpb24iOjJ9fQ.uCRsVKprN33p45DYRxYjrSQw6YxIPL5A8DE0q-_u5RPZrgwyAtldhRsWBRsuJMm2tQeAK6rnk3aT5ry0pP00Cg
    system_account: ADPUAJVR2BOTAZ65Q5WOZ4YURGBVZSUMCLF57SILP6372ASU2HKIT3BA
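
The three values copied in the steps above can be cross-checked before deploying. The following is a minimal sketch, not part of nsc or the Helm chart: `decode_jwt_claims` and `check_operator_mode` are hypothetical helper names, and the checks only read the JWT claims (`sub`, `iss`, `nats.system_account`) without verifying signatures. It assumes the account JWT was signed by the operator identity key; an operator signing key would also be a legitimate issuer and would be reported here as a mismatch.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims (middle segment) of a JWT without verifying its signature."""
    # Tolerate the trailing comma that appears after the resolver_preload entry.
    parts = token.strip().rstrip(",").split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def check_operator_mode(operator_jwt: str, system_account: str, preload: dict) -> list:
    """Return a list of inconsistencies between the three values (empty if none)."""
    problems = []
    operator = decode_jwt_claims(operator_jwt)
    if operator.get("nats", {}).get("system_account") != system_account:
        problems.append("operator JWT names a different system account")
    if system_account not in preload:
        problems.append("system account has no resolver_preload entry")
    for account_id, account_jwt in preload.items():
        claims = decode_jwt_claims(account_jwt)
        if claims.get("sub") != account_id:
            problems.append(f"preload key {account_id} does not match the JWT subject")
        if claims.get("iss") != operator.get("sub"):
            # Assumption: only the operator identity key is accepted as issuer.
            problems.append(f"preload JWT for {account_id} was not issued by the operator identity key")
    return problems
```

Calling `check_operator_mode(operator, system_account, {account_id: account_jwt})` with the values from the generated file returns an empty list when the three values agree with each other.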

Given the capability you are leveraging, describe your expectation?

The errors occur between the replicas themselves; no other client is connected.
I expect the replicas to authenticate with each other successfully.

Maybe I'm missing something.
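
One way to look for something missing is to compare the top-level keys of the generated file with the keys placed under `merge`. The sketch below does that with two hypothetical helpers, `top_level_conf_keys` and `yaml_child_keys`, which rely only on brace depth and indentation rather than real parsers, and which assume `#` never appears inside a value. For the two snippets in this report, the difference is the `resolver` block; whether that explains the errors is not established here, and the chart may configure the resolver through other values not shown.

```python
import re


def top_level_conf_keys(conf_text: str) -> set:
    """Collect top-level keys from a NATS config file by tracking brace depth."""
    keys, depth = set(), 0
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if depth == 0:
            match = re.match(r"[A-Za-z_]\w*", line)
            if match:
                keys.add(match.group(0))
        depth += line.count("{") - line.count("}")
    return keys


def yaml_child_keys(values_text: str, parent: str) -> set:
    """Collect the direct child keys of `parent:` using indentation only."""
    keys, parent_indent, child_indent = set(), None, None
    for raw in values_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(raw) - len(raw.lstrip())
        if parent_indent is None:
            if stripped == f"{parent}:":
                parent_indent = indent
            continue
        if indent <= parent_indent:
            break  # left the parent block
        if child_indent is None:
            child_indent = indent
        if indent == child_indent:
            keys.add(stripped.split(":", 1)[0])
    return keys


def missing_from_values(conf_text: str, values_text: str, parent: str = "merge") -> set:
    """Keys present in the generated config but absent under `parent` in the values."""
    return top_level_conf_keys(conf_text) - yaml_child_keys(values_text, parent)
```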

Given the expectation, what is the defect you are observing?

When the replicas initialize, I observe two things:

  1. A warning in the logs that I think is related:
[WRN] Account fetch failed: will only fetch valid account keys
  2. The NATS replicas keep retrying authentication, and every attempt ends in an error:
[7] 2025/07/06 13:50:59.375205 [ERR] [2600:1f18:404c:6403:a89c::4]:55343 - cid:305 - authentication error
[7] 2025/07/06 13:51:07.386996 [ERR] [2600:1f18:404c:6403:a89c::4]:50061 - cid:306 - authentication error
[7] 2025/07/06 13:51:07.387316 [ERR] [2600:1f18:404c:6403:a89c::4]:43599 - cid:307 - authentication error
[7] 2025/07/06 13:51:19.403365 [ERR] [2600:1f18:404c:6403:a89c::4]:57109 - cid:308 - authentication error
[7] 2025/07/06 13:51:23.408362 [ERR] [2600:1f18:404c:6403:a89c::4]:37729 - cid:309 - authentication error
[7] 2025/07/06 13:51:27.414778 [ERR] [2600:1f18:404c:6403:a89c::4]:44101 - cid:310 - authentication error
[7] 2025/07/06 13:51:39.431060 [ERR] [2600:1f18:404c:6403:a89c::4]:51315 - cid:311 - authentication error
[7] 2025/07/06 13:51:43.437221 [ERR] [2600:1f18:404c:6403:a89c::4]:37825 - cid:312 - authentication error
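
To see where the retries come from and how often they occur, the error lines above can be grouped by remote address. This is a minimal sketch that assumes the exact line layout shown in this report; `summarize_auth_errors` is a hypothetical helper, not a NATS tool.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# [7] 2025/07/06 13:50:59.375205 [ERR] [addr]:port - cid:305 - authentication error
LINE_RE = re.compile(
    r"^\[\d+\] (?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\[ERR\] (?P<addr>\S+):(?P<port>\d+) - cid:(?P<cid>\d+) - authentication error$"
)


def summarize_auth_errors(log_text: str):
    """Return (errors per remote address, seconds between consecutive errors)."""
    per_addr = Counter()
    stamps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore lines that are not authentication errors
        per_addr[match["addr"]] += 1
        stamps.append(datetime.strptime(match["ts"], "%Y/%m/%d %H:%M:%S.%f"))
    gaps = [round((b - a).total_seconds(), 1) for a, b in zip(stamps, stamps[1:])]
    return per_addr, gaps
```

Every error line in this report carries the same remote address with a different source port, so the counter ends up with a single key. Matching that address against the pod IPs shows which pod is being rejected.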

Labels: defect (suspected defect such as a bug or regression)