[2.7.0] clusterdomain detection fail, so pgbackrest got wrong hostnames #1321

@pasztorl

Report

I see that the operator tries to detect the cluster domain here.
We use a custom cluster domain, and for it this function returns just "kubernetes".
If I run these commands from a postgres pod, I get this:

cat /etc/resolv.conf 
search example-db.svc.k8s.test1.example.com svc.k8s.test1.example.com k8s.test1.example.com
nameserver 10.15.85.18
options ndots:5

host kubernetes.default.svc
kubernetes.default.svc.k8s.test1.example.com has address 10.15.85.234
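
The search line above already contains the real cluster domain, since the first entry has the form `<namespace>.svc.<cluster domain>`. As a minimal sketch of a fallback (not the operator's code), the domain could be read from there. `domainFromResolvConf` is a hypothetical helper, and `example-db` is assumed to be the pod's namespace:

```go
package main

import (
	"fmt"
	"strings"
)

// domainFromResolvConf is a hypothetical helper, not part of the operator.
// It scans resolv.conf content for a search entry of the form
// "<namespace>.svc.<domain>" and returns <domain>.
func domainFromResolvConf(conf, namespace string) (string, bool) {
	prefix := namespace + ".svc."
	for _, line := range strings.Split(conf, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != "search" {
			continue
		}
		for _, entry := range fields[1:] {
			if !strings.HasPrefix(entry, prefix) {
				continue
			}
			domain := strings.TrimSuffix(strings.TrimPrefix(entry, prefix), ".")
			if domain != "" {
				return domain, true
			}
		}
	}
	return "", false
}

func main() {
	// Same content as the /etc/resolv.conf shown above.
	conf := "search example-db.svc.k8s.test1.example.com svc.k8s.test1.example.com k8s.test1.example.com\n" +
		"nameserver 10.15.85.18\n" +
		"options ndots:5\n"

	// "example-db" is assumed to be the namespace of the pod.
	domain, ok := domainFromResolvConf(conf, "example-db")
	fmt.Println(domain, ok)
}
```

With the resolv.conf from this pod, the helper yields `k8s.test1.example.com`.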

More about the problem

The result is that the pgbackrest config looks like this:

...
pg1-host = example-pg-example-pg-25pk-0.example-pg-pods.example-db.svc.kubernetes
...

This name (domain) does not exist in the cluster, so the backup fails:

time="2025-10-13T20:02:44Z" level=info msg="[pgbackrest:stdout] 2025-10-13 20:02:44.231 P00   WARN: unable to check pg1: [HostConnectError] unable to get address for 'example-pg-example-pg-25pk-0.example-pg-example-db.svc.kubernetes': [-2] Name or service not known"

Steps to reproduce

  1. I created a small Go application that uses the same code to check this:
package main

import (
        "context"
        "fmt"
        "net"
        "os"
        "strings"
        "time"
)

func main() {
        if len(os.Args) < 2 {
                fmt.Fprintf(os.Stderr, "usage: %s <nameserver-ip[:port]>\n", os.Args[0])
                os.Exit(2)
        }
        ns := os.Args[1]
        if !strings.Contains(ns, ":") {
                ns += ":53"
        }

        // Use stdlib resolver, pointed at the provided nameserver.
        resolver := &net.Resolver{
                PreferGo: true,
                Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
                        var d net.Dialer
                        return d.DialContext(ctx, "udp", ns)
                },
        }

        ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
        defer cancel()

        api := "kubernetes.default.svc"
        cname, err := resolver.LookupCNAME(ctx, api)
        if err == nil {
                fmt.Println(strings.TrimSuffix(strings.TrimPrefix(cname, api+"."), "."))
                fmt.Println(cname, api)
                return
        }
        fmt.Println("cluster.local")
}
  2. This code returns:
kdc 10.15.85.18
kubernetes
kubernetes kubernetes.default.svc
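
The output above shows that the returned CNAME does not start with `kubernetes.default.svc.`, so `strings.TrimPrefix` changes nothing and the raw name is used as the domain. A minimal sketch of a stricter check, assuming the detection should reject such answers instead of trusting them (`domainFromCNAME` is a hypothetical helper, not the operator's function):

```go
package main

import (
	"fmt"
	"strings"
)

// domainFromCNAME is a hypothetical helper, not part of the operator.
// It accepts the lookup result only when it really has the form
// "<api>.<domain>" (with an optional trailing dot) and reports false
// otherwise, so the caller can fall back to another source.
func domainFromCNAME(cname, api string) (string, bool) {
	prefix := api + "."
	if !strings.HasPrefix(cname, prefix) {
		return "", false
	}
	domain := strings.TrimSuffix(strings.TrimPrefix(cname, prefix), ".")
	if domain == "" {
		return "", false
	}
	return domain, true
}

func main() {
	api := "kubernetes.default.svc"
	candidates := []string{
		"kubernetes.default.svc.k8s.test1.example.com.", // fully qualified answer
		"kubernetes", // the answer seen in this report
	}
	for _, cname := range candidates {
		domain, ok := domainFromCNAME(cname, api)
		fmt.Printf("%q -> %q %v\n", cname, domain, ok)
	}
}
```

With this check, the `kubernetes` answer is reported as a failed detection and never ends up in the pgbackrest hostnames.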

Versions

  1. Kubernetes 1.32.2
  2. Operator 2.7.0

Anything else?

I would prefer adding clusterDomain as a Helm value. I also like the auto-detection, but in this case it did not work.
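
A rough sketch of the precedence I have in mind, assuming the Helm value reaches the operator as an environment variable. The name `CLUSTER_DOMAIN` and the helper `resolveClusterDomain` are made up for illustration and do not exist in the operator:

```go
package main

import (
	"fmt"
	"os"
)

// resolveClusterDomain is a hypothetical helper, not part of the operator.
// An explicit setting wins, then a successful detection, then the
// Kubernetes default "cluster.local".
func resolveClusterDomain(explicit string, detect func() (string, bool)) string {
	if explicit != "" {
		return explicit
	}
	if domain, ok := detect(); ok {
		return domain
	}
	return "cluster.local"
}

func main() {
	// CLUSTER_DOMAIN is an assumed variable name, e.g. set from a Helm value.
	explicit := os.Getenv("CLUSTER_DOMAIN")

	// Stand-in for the auto-detection; here it always fails.
	detect := func() (string, bool) { return "", false }

	fmt.Println(resolveClusterDomain(explicit, detect))
}
```

This keeps the auto-detection for default setups and lets clusters with a custom domain set the value explicitly.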
