etcd snapshot cannot run successfully #44

Open
@damuji8

Description

I am using the Milvus Helm chart 4.1.9, where the etcd image is 3.5.5-r2.
In this image, /opt/bitnami/scripts/etcd/snapshot.sh relies on etcdctl_get_endpoints from /opt/bitnami/scripts/libetcd.sh:

etcdctl_get_endpoints() {
    echo "$ETCD_INITIAL_CLUSTER" | sed 's/^[^=]\+=http/http/g' | sed 's/,[^=]\+=/,/g'
}

So I need to add the env variable ETCD_INITIAL_CLUSTER to the snapshot CronJob.

Without this variable, the snapshot fails with the error "all etcd endpoints are unhealthy!".
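To illustrate why the variable matters, here is a minimal sketch of what the helper does with and without ETCD_INITIAL_CLUSTER. It is written with `sed -E` so that `+` is an unambiguous quantifier, and the member names and host names are made up for the example, not taken from the chart.

```shell
#!/usr/bin/env bash
# Sketch of the older helper: strip the "name=" prefix from every member
# entry in ETCD_INITIAL_CLUSTER so that only the URLs remain.
etcdctl_get_endpoints() {
    echo "$ETCD_INITIAL_CLUSTER" | sed -E 's/^[^=]+=http/http/g' | sed -E 's/,[^=]+=/,/g'
}

# Hypothetical two-member cluster (illustrative names only).
ETCD_INITIAL_CLUSTER="etcd-0=http://etcd-0.etcd-headless:2380,etcd-1=http://etcd-1.etcd-headless:2380"
etcdctl_get_endpoints

# With the variable unset, the helper yields an empty endpoint list.
unset ETCD_INITIAL_CLUSTER
endpoints="$(etcdctl_get_endpoints)"
if [ -z "$endpoints" ]; then
    echo "no endpoints found"
fi
```

With an empty list there is no endpoint to probe, which would be consistent with the "all etcd endpoints are unhealthy!" error.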

In the etcd image etcd:3.5.5-debian-11-r23, /opt/bitnami/scripts/libetcd.sh defines the function differently:
etcdctl_get_endpoints() {
    local only_others=${1:-false}
    local -a endpoints=()
    local host domain port

    ip_has_valid_hostname() {
        local ip="${1:?ip is required}"
        local parent_domain="${1:?parent_domain is required}"

        # 'getent hosts $ip' can return hostnames in 2 different formats:
        #     POD_NAME.HEADLESS_SVC_DOMAIN.NAMESPACE.svc.cluster.local (using headless service domain)
        #     10-237-136-79.SVC_DOMAIN.NAMESPACE.svc.cluster.local (using POD's IP and service domain)
        # We need to discard the latter to avoid issues when TLS verification is enabled.
        [[ "$(getent hosts "$ip")" = *"$parent_domain"* ]] && return 0
        return 1
    }

    hostname_has_ips() {
        local hostname="${1:?hostname is required}"
        [[ "$(getent ahosts "$hostname")" != "" ]] && return 0
        return 1
    }

    # This piece of code assumes this code is executed on a K8s environment
    # where etcd members are part of a statefulset that uses a headless service
    # to create a unique FQDN per member. Under these circumstances, the
    # ETCD_ADVERTISE_CLIENT_URLS env. variable is created as follows:
    #   SCHEME://POD_NAME.HEADLESS_SVC_DOMAIN:CLIENT_PORT,SCHEME://SVC_DOMAIN:SVC_CLIENT_PORT
    #
    # Assuming this, we can extract the HEADLESS_SVC_DOMAIN and obtain
    # every available endpoint
    read -r -a advertised_array <<<"$(tr ',;' ' ' <<<"$ETCD_ADVERTISE_CLIENT_URLS")"
    host="$(parse_uri "${advertised_array[0]}" "host")"
    port="$(parse_uri "${advertised_array[0]}" "port")"
    domain="${host#"${ETCD_NAME}."}"
    # When ETCD_CLUSTER_DOMAIN is set, we use that value instead of extracting
    # it from ETCD_ADVERTISE_CLIENT_URLS
    ! is_empty_value "$ETCD_CLUSTER_DOMAIN" && domain="$ETCD_CLUSTER_DOMAIN"
    # Depending on the K8s distro & the DNS plugin, it might need
    # a few seconds to associate the POD(s) IP(s) to the headless svc domain
    if retry_while "hostname_has_ips $domain"; then
        local -r ahosts="$(getent ahosts "$domain" | awk '{print $1}' | uniq | wc -l)"
        for i in $(seq 0 $((ahosts - 1))); do
            # We use the StatefulSet name stored in MY_STS_NAME to get the peer names based on the number of IPs registered in the headless service
            pod_name="${MY_STS_NAME}-${i}"
            if ! { [[ $only_others = true ]] && [[ "$pod_name" = "$MY_POD_NAME" ]]; }; then
                endpoints+=("${pod_name}.${ETCD_CLUSTER_DOMAIN}:${port:-2380}")
            fi
        done
    fi
    echo "${endpoints[*]}" | tr ' ' ','
}

The Bitnami Helm template sets the env variables ETCD_CLUSTER_DOMAIN and MY_STS_NAME, so the snapshot runs successfully there.
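To show why those two variables are enough for the newer function, here is a simplified sketch of its endpoint loop. The `build_endpoints` name is hypothetical, the member count is passed in as an argument instead of being resolved with `getent`, and the StatefulSet and domain values are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Simplified sketch of the newer endpoint loop. member_count stands in for
# the number of IPs that getent would report for the headless service.
build_endpoints() {
    local member_count="${1:?member_count is required}"
    local port="${2:-2380}"
    local -a endpoints=()
    local i
    for ((i = 0; i < member_count; i++)); do
        # Pod names follow the StatefulSet pattern <sts-name>-<ordinal>.
        endpoints+=("${MY_STS_NAME}-${i}.${ETCD_CLUSTER_DOMAIN}:${port}")
    done
    local IFS=','
    echo "${endpoints[*]}"
}

# Illustrative values, similar in shape to what a chart would inject.
MY_STS_NAME="my-etcd"
ETCD_CLUSTER_DOMAIN="my-etcd-headless.default.svc.cluster.local"
build_endpoints 2 2379

# Without the two variables the generated host names are malformed.
unset MY_STS_NAME ETCD_CLUSTER_DOMAIN
build_endpoints 2 2379
```

The second call shows that this variant also depends on its environment: if a CronJob omits MY_STS_NAME and ETCD_CLUSTER_DOMAIN, the endpoints it builds cannot resolve.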

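I think this is the problem: the snapshot CronJob does not set the env variable that the etcd image in the Milvus chart expects.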
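As a possible workaround until the chart sets it, the value for ETCD_INITIAL_CLUSTER can be assembled from the StatefulSet name and replica count. This is only a sketch: `build_initial_cluster`, the StatefulSet name, the headless service domain, and the CronJob name placeholder are all assumptions, not values from the chart.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an ETCD_INITIAL_CLUSTER value of the form
#   name-0=http://name-0.domain:port,name-1=http://name-1.domain:port,...
build_initial_cluster() {
    local sts_name="${1:?sts_name is required}"
    local domain="${2:?domain is required}"
    local replicas="${3:?replicas is required}"
    local peer_port="${4:-2380}"
    local -a members=()
    local i
    for ((i = 0; i < replicas; i++)); do
        members+=("${sts_name}-${i}=http://${sts_name}-${i}.${domain}:${peer_port}")
    done
    local IFS=','
    echo "${members[*]}"
}

# Illustrative names only; substitute the ones from your release.
value="$(build_initial_cluster my-etcd my-etcd-headless 3)"
echo "$value"

# The value could then be applied to the CronJob, for example:
#   kubectl set env cronjob/<snapshot-cronjob-name> "ETCD_INITIAL_CLUSTER=${value}"
```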
