This repository was archived by the owner on Jan 18, 2024. It is now read-only.

Error when upgrading from 0.13.1 to latest version #554

Open

@throrin19

Description

What happened?
After upgrading, all my nodes return this error:

2023-01-20 10:15:50,010 ERROR: create_config_service failed
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 890, in _create_config_service
    if not self._api.create_namespaced_service(self._namespace, body):
  File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 468, in wrapper
    return getattr(self._core_v1_api, func)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 404, in wrapper
    return self._api_client.call_api(method, path, headers, body, **kwargs)
  File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 373, in call_api
    return self._handle_server_response(response, _preload_content)
  File "/usr/lib/python3/dist-packages/patroni/dcs/kubernetes.py", line 203, in _handle_server_response
    raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'e47d519f-f244-47a4-ad9f-201cbe928c4a', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f182df09-64d8-40dd-8faa-445be223320d', 'Date': 'Fri, 20 Jan 2023 10:15:50 GMT', 'Content-Length': '300'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services is forbidden: User \\"system:serviceaccount:opennms:timescaledb\\" cannot create resource \\"services\\" in API group \\"\\" in the namespace \\"stage\\"","reason":"Forbidden","details":{"kind":"services"},"code":403}\n'
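
The 403 indicates that the pods' service account is missing RBAC permission to create Service objects in the release namespace. A quick way to confirm the gap (a sketch; the service account and namespace are taken verbatim from the error message above):

    kubectl auth can-i create services \
      --as=system:serviceaccount:opennms:timescaledb \
      -n stage
    # prints "no" while this error is occurring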

Did you expect to see something different?

Yes, I expect no errors on my timescaledb nodes.

How to reproduce it (as minimally and precisely as possible):

  1. Use this timescaledb-ha image: pg12.13-ts2.9.1-latest
  2. Install the timescaledb-single chart at version 0.13.0
  3. Try to upgrade to latest or any other 0.1X.X version (see the sketch after this list)
  4. Et voilà! You get the same error on the nodes
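
A minimal sketch of steps 2–3, assuming the chart is installed from the Timescale Helm repository and the release is named stage (the repo alias and release name are assumptions):

    helm repo add timescale https://charts.timescale.com
    helm repo update
    # install the old chart version first
    helm install stage timescale/timescaledb-single --version 0.13.0 -n stage -f values.yaml
    # then upgrade to a newer chart version, which triggers the 403 on the nodes
    helm upgrade stage timescale/timescaledb-single --version 0.30.0 -n stage -f values.yaml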

Environment

  • Which helm chart and what version are you using?

timescaledb-single, at 0.13.1 initially, and I am trying to upgrade to 0.30.0

  • What is in your values.yaml?
affinity: {}
backup:
  enabled: true
  env: null
  envFrom: null
  jobs:
    - name: full-weekly
      schedule: 12 02 * * 0
      type: full
    - name: incremental-daily
      schedule: 12 02 * * 1-6
      type: incr
  pgBackRest:
    compress-type: lz4
    process-max: 4
    repo1-cipher-type: none
    repo1-retention-diff: 2
    repo1-retention-full: 2
    repo1-s3-endpoint: s3.amazonaws.com
    repo1-s3-region: eu-west-1
    repo1-type: s3
    start-fast: 'y'
  pgBackRest:archive-get: {}
  pgBackRest:archive-push: {}
  resources: {}
bootstrapFromBackup:
  enabled: false
  repo1-path: null
  secretName: pgbackrest-bootstrap
callbacks:
  configMap: null
clusterName: stage
debug:
  execStartPre: null
env:
  - name: TIMESCALEDB_TELEMETRY
    value: 'off'
envFrom: null
fullnameOverride: '{{ .Release.Name }}'
image:
  pullPolicy: Always
  repository: timescale/timescaledb-ha
  tag: pg12.13-ts2.9.1-latest
networkPolicy:
  enabled: false
  ingress: null
  prometheusApp: prometheus
nodeSelector: {}
patroni:
  bootstrap:
    dcs:
      loop_wait: 10
      maximum_lag_on_failover: 33554432
      postgresql:
        parameters:
          archive_command: /etc/timescaledb/scripts/pgbackrest_archive.sh %p
          archive_mode: 'on'
          archive_timeout: 1800s
          autovacuum_analyze_scale_factor: 0.02
          autovacuum_max_workers: 10
          autovacuum_naptime: 5s
          autovacuum_vacuum_cost_limit: 500
          autovacuum_vacuum_scale_factor: 0.05
          hot_standby: 'on'
          log_autovacuum_min_duration: 1min
          log_checkpoints: 'on'
          log_connections: 'on'
          log_disconnections: 'on'
          log_line_prefix: '%t [%p]: [%c-%l] %u@%d,app=%a [%e] '
          log_lock_waits: 'on'
          log_min_duration_statement: 1s
          log_statement: ddl
          max_connections: 100
          max_prepared_transactions: 150
          shared_preload_libraries: timescaledb,pg_stat_statements
          ssl: 'on'
          ssl_cert_file: /etc/certificate/tls.crt
          ssl_key_file: /etc/certificate/tls.key
          tcp_keepalives_idle: 900
          tcp_keepalives_interval: 100
          temp_file_limit: 1GB
          timescaledb.passfile: ../.pgpass
          unix_socket_directories: /var/run/postgresql
          unix_socket_permissions: '0750'
          wal_level: hot_standby
          wal_log_hints: 'on'
          max_locks_per_transaction: 2200
          max_parallel_workers: 14
          max_worker_processes: 32
          timescaledb.max_background_workers: 16
        use_pg_rewind: true
        use_slots: true
      retry_timeout: 10
      ttl: 30
    method: restore_or_initdb
    post_init: /etc/timescaledb/scripts/post_init.sh
    restore_or_initdb:
      command: >
        /etc/timescaledb/scripts/restore_or_initdb.sh --encoding=UTF8
        --locale=C.UTF-8
      keep_existing_recovery_conf: true
  kubernetes:
    role_label: role
    scope_label: cluster-name
    use_endpoints: true
  log:
    level: WARNING
  postgresql:
    authentication:
      replication:
        username: standby
      superuser:
        username: postgres
    basebackup:
      - waldir: /var/lib/postgresql/wal/pg_wal
    callbacks:
      on_reload: /etc/timescaledb/scripts/patroni_callback.sh
      on_restart: /etc/timescaledb/scripts/patroni_callback.sh
      on_role_change: /etc/timescaledb/scripts/patroni_callback.sh
      on_start: /etc/timescaledb/scripts/patroni_callback.sh
      on_stop: /etc/timescaledb/scripts/patroni_callback.sh
    create_replica_methods:
      - pgbackrest
      - basebackup
    listen: 0.0.0.0:5432
    pg_hba:
      - local     all             postgres                              peer
      - local     all             all                                   md5
      - hostssl   all             all                127.0.0.1/32       md5
      - hostssl   all             all                ::1/128            md5
      - hostssl   replication     standby            all                md5
      - hostssl   all             all                all                md5
      - host      all             all                all                md5
    pgbackrest:
      command: /etc/timescaledb/scripts/pgbackrest_restore.sh
      keep_data: true
      no_master: true
      no_params: true
    recovery_conf:
      restore_command: /etc/timescaledb/scripts/pgbackrest_archive_get.sh %f "%p"
    use_unix_socket: true
  restapi:
    listen: 0.0.0.0:8008
persistentVolumes:
  data:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    enabled: true
    mountPath: /var/lib/postgresql
    size: 25Gi
    subPath: ''
  wal:
    accessModes:
      - ReadWriteOnce
    annotations: {}
    enabled: true
    mountPath: /var/lib/postgresql/wal
    size: 5Gi
    storageClass: null
    subPath: ''
pgBouncer:
  config:
    default_pool_size: 12
    max_client_conn: 500
    pool_mode: transaction
    server_reset_query: DISCARD ALL
    client_tls_sslmode: prefer
    ignore_startup_parameters: extra_float_digits
  enabled: true
  pg_hba:
    - local     all postgres                   peer
    - host      all postgres,standby 0.0.0.0/0 reject
    - host      all postgres,standby ::0/0     reject
    - hostssl   all all              0.0.0.0/0 md5
    - hostssl   all all              ::0/0     md5
    - host      all all              0.0.0.0/0 md5
    - host      all all              ::0/0     md5
  port: 6432
  userListSecretName: null
podAnnotations: {}
podLabels: {}
podManagementPolicy: OrderedReady
podMonitor:
  enabled: false
  interval: 10s
  path: /metrics
postInit:
  - configMap:
      name: custom-init-scripts
      optional: true
  - secret:
      name: custom-secret-scripts
      optional: true
prometheus:
  args: []
  enabled: false
  env: null
  image:
    pullPolicy: Always
    repository: quay.io/prometheuscommunity/postgres-exporter
    tag: v0.11.1
  volumeMounts: null
  volumes: null
rbac:
  create: true
readinessProbe:
  enabled: true
  failureThreshold: 6
  initialDelaySeconds: 5
  periodSeconds: 30
  successThreshold: 1
  timeoutSeconds: 5
replicaCount: 3
resources: {}
secrets:
  certificate:
    tls.crt: ''
    tls.key: ''
  certificateSecretName: certificate
  credentials:
    PATRONI_REPLICATION_PASSWORD: ''
    PATRONI_SUPERUSER_PASSWORD: ''
    PATRONI_admin_PASSWORD: ''
  credentialsSecretName: credentials
  pgbackrest:
    PGBACKREST_REPO1_S3_BUCKET: ''
    PGBACKREST_REPO1_S3_ENDPOINT: s3.amazonaws.com
    PGBACKREST_REPO1_S3_KEY: ''
    PGBACKREST_REPO1_S3_KEY_SECRET: ''
    PGBACKREST_REPO1_S3_REGION: ''
  pgbackrestSecretName: pgbackrest
service:
  primary:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: '4000'
      service.beta.kubernetes.io/aws-load-balancer-internal: 'true'
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
    labels: {}
    nodePort: null
    port: 5432
    spec: {}
    type: LoadBalancer
  replica:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: '4000'
      service.beta.kubernetes.io/aws-load-balancer-internal: 'true'
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
    labels: {}
    nodePort: null
    port: 5432
    spec: {}
    type: LoadBalancer
serviceAccount:
  annotations: {}
  create: true
  name: null
sharedMemory:
  useMount: true
timescaledbTune:
  args: {}
  enabled: true
tolerations: []
topologySpreadConstraints: []
version: null
global:
  cattle:
    systemProjectId: p-rg64l
loadBalancer:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 'true'
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  enabled: false
replicaLoadBalancer:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 'true'
    service.beta.kubernetes.io/aws-load-balancer-type: nlb

  • Kubernetes version information:

    kubectl version

    Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.8", GitCommit:"4a3b558c52eb6995b3c5c1db5e54111bd0645a64", GitTreeState:"clean", BuildDate:"2021-12-15T14:52:11Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.15-eks-fb459a0", GitCommit:"be82fa628e60d024275efaa239bfe53a9119c2d9", GitTreeState:"clean", BuildDate:"2022-10-24T20:33:23Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
    
  • Kubernetes cluster kind:

Rancher

Anything else we need to know?: I saw the same error reported in #405, but that thread is marked as resolved even though no real solution was proposed there.

I saw that the problem lies with Patroni, but I don't know whether the timescaledb-ha images that still use PostgreSQL 12 are patched.
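
As a temporary workaround attempt, granting the service account from the error message the missing permission might unblock the upgrade; I have not confirmed this is the intended fix, and the role/binding name is hypothetical:

    # allow the service account to create Services in the stage namespace
    kubectl create role patroni-create-services --verb=create --resource=services -n stage
    kubectl create rolebinding patroni-create-services \
      --role=patroni-create-services \
      --serviceaccount=opennms:timescaledb \
      -n stage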
