ETCD docker-compose runtime configuration #17011

MatthewComtois · 2023-11-23T16:38:18Z

Hi, I’m pretty new to etcd and I’m having difficulty finding information about this kind of configuration in Docker. I’m not even certain that what I want to do is possible. I’m using docker-compose to start a three-node etcd cluster. My nodes are being attacked by a service similar to a chaos monkey. The initial configuration seems to be working fine, and I can interact with it using dotnet-etcd. The problem is that when one of the nodes restarts, it only starts etcd in the background and does not rejoin the cluster. The only thing that kind of helped is the documentation about runtime reconfiguration (https://etcd.io/docs/v3.3/op-guide/runtime-configuration/), but it mainly involves manual rather than automatic reconfiguration.

My questions are:

Is there a way to automatically delete a node that has become inactive?

Also, do I always need to change my environment variables to rejoin the cluster, or can I just keep the same environment variables from the initialization of the cluster?

Thank you

Answered by MatthewComtois

Nov 26, 2023

I think I resolved my problem. I created an image from the etcd Docker image that runs a check: it validates whether the podName is in the cluster. If it finds it, it sets --initial-cluster-state=existing; if not, it sets it to new.

jmhbnz · 2023-11-24T08:36:54Z

Maintainer

Hey @MatthewComtois - Thanks for your question. Are you using docker volumes to persist the data directory of your etcd members?

If not, this could potentially explain why they are not coming back up with the expected configuration after being disrupted by your chaos monkey.

For an example of a docker configuration for etcd that uses volumes refer to: #16138 (comment).

Just to be clear on why this is important: once an etcd member is restarted, it is expected to rejoin the cluster it is already a member of, provided it is started with the same data directory it previously used and the right parameters.

Feel free to post your docker-compose file if you need more help.

0 replies

MatthewComtois · 2023-11-24T21:44:25Z

Author

Hi,
Thank you for your answer.

I did configure a volume, but the node doesn't rejoin the cluster until the whole cluster is down. It loops with an error saying the member has already been bootstrapped.

Also, I'm using the arm64 image because I'm running it in Docker with Apple Silicon, but I don't know if it's necessary.

  etcd1:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd1
    hostname: etcd1
    restart: always
    command:
      - etcd
      - --name=etcd1
      - --data-dir=data.etcd
      - --advertise-client-urls=http://etcd1:2379
      - --listen-client-urls=http://0.0.0.0:2379
      - --initial-advertise-peer-urls=http://etcd1:2380
      - --listen-peer-urls=http://0.0.0.0:2380
      - --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - --initial-cluster-state=new
      - --initial-cluster-token=etcd-cluster-1
    ports:
      - "2379:2379"
    volumes:
      - ./store/etcd1/data:/etcd_data
  etcd2:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd2
    hostname: etcd2
    restart: always
    command:
      - etcd
      - --name=etcd2
      - --data-dir=data.etcd
      - --advertise-client-urls=http://etcd2:2379
      - --listen-client-urls=http://0.0.0.0:2379
      - --initial-advertise-peer-urls=http://etcd2:2380
      - --listen-peer-urls=http://0.0.0.0:2380
      - --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - --initial-cluster-state=new
      - --initial-cluster-token=etcd-cluster-1
    ports:
      - "22379:2379"
    volumes:
      - ./store/etcd2/data:/etcd_data
  etcd3:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd3
    hostname: etcd3
    restart: always
    command:
      - etcd
      - --name=etcd3
      - --data-dir=data.etcd
      - --advertise-client-urls=http://etcd3:2379
      - --listen-client-urls=http://0.0.0.0:2379
      - --initial-advertise-peer-urls=http://etcd3:2380
      - --listen-peer-urls=http://0.0.0.0:2380
      - --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      - --initial-cluster-state=new
      - --initial-cluster-token=etcd-cluster-1
    ports:
      - "32379:2379"
    volumes:
      - ./store/etcd3/data:/etcd_data
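
Following up on the earlier question about volumes, one thing that may be worth checking in a file like this is whether the path given to --data-dir actually sits under the container-side mount point. Here --data-dir is the relative path data.etcd while the volume is mounted at /etcd_data, so unless the container's working directory happens to be /etcd_data, the data would not land on the volume. The sketch below is only an illustration: `data_dir_on_volume` is a hypothetical helper, and it treats any relative path as not covered because it cannot know the working directory.

```shell
#!/bin/sh
# Hypothetical check: is an etcd --data-dir located under a mounted volume path?
# $1 = data dir as passed to etcd, $2 = container-side mount point of the volume
data_dir_on_volume() {
  case "$1" in
    "$2"|"$2"/*) echo yes ;;   # the mount point itself, or a path below it
    *) echo no ;;              # anything else, including relative paths
  esac
}

# Values taken from the compose file in this thread
data_dir_on_volume data.etcd /etcd_data
# An absolute path below the mount point, for comparison
data_dir_on_volume /etcd_data/data.etcd /etcd_data
```

The first call prints no and the second prints yes. If the check says no, a removed and recreated container would start with an empty data directory even though a volume is declared.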

10 replies

MatthewComtois Nov 25, 2023
Author

I ran:
docker-compose up -d ---- everything works fine
docker rm --force etcd1 ---- everything works fine with etcd2 and etcd3
docker compose up etcd1 -d ---- everything works fine for etcd2 and etcd3, but etcd1 goes into a restart loop trying to enter the cluster
and I get those logs on etcd1

jmhbnz Nov 25, 2023
Maintainer

OK, so below is the exact script I ran on an arm64 machine, just to rule out anything CPU-architecture related. With it I cannot reproduce the "member <id> has already been bootstrapped" error you are seeing.

If anything in my script doesn't match what you are doing, let me know. Otherwise it must be something specific to your setup, sorry, as I cannot reproduce it.

# Write the compose file
cat << EOF > docker-compose.yaml
---
services:
  etcd1:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd1
    hostname: etcd1
    restart: always
    command: |
      etcd
      --name=etcd1
      --data-dir=data.etcd
      --advertise-client-urls=http://etcd1:2379
      --listen-client-urls=http://0.0.0.0:2379
      --initial-advertise-peer-urls=http://etcd1:2380
      --listen-peer-urls=http://0.0.0.0:2380
      --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      --initial-cluster-token=etcd-cluster-1
    ports:
      - "2379:2379"
    volumes:
      - ./store/etcd1/data:/etcd_data
  etcd2:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd2
    hostname: etcd2
    restart: always
    command: |
      etcd
      --name=etcd2
      --data-dir=data.etcd
      --advertise-client-urls=http://etcd2:2379
      --listen-client-urls=http://0.0.0.0:2379
      --initial-advertise-peer-urls=http://etcd2:2380
      --listen-peer-urls=http://0.0.0.0:2380
      --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      --initial-cluster-token=etcd-cluster-1
    ports:
      - "22379:2379"
    volumes:
      - ./store/etcd2/data:/etcd_data
  etcd3:
    image: quay.io/coreos/etcd:v3.5.10-arm64
    container_name: etcd3
    hostname: etcd3
    restart: always
    command: |
      etcd
      --name=etcd3
      --data-dir=data.etcd
      --advertise-client-urls=http://etcd3:2379
      --listen-client-urls=http://0.0.0.0:2379
      --initial-advertise-peer-urls=http://etcd3:2380
      --listen-peer-urls=http://0.0.0.0:2380
      --initial-cluster=etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
      --initial-cluster-token=etcd-cluster-1
    ports:
      - "32379:2379"
    volumes:
      - ./store/etcd3/data:/etcd_data
EOF

# Start the cluster
docker-compose up -d && sleep 5s

# Check containers
docker ps
docker exec -it etcd1 etcdctl endpoint status --endpoints etcd1:2379,etcd2:2379,etcd3:2379 -w table && sleep 5s

# Restart the etcd1 container
docker rm --force etcd1 && sleep 5s

# Bring etcd1 back only
docker-compose down
docker-compose up etcd1

MatthewComtois Nov 26, 2023
Author

Hi, I'm really sorry. Like I said, I'm new to etcd and I don't know if I'm mistaken, but as you wrote at the bottom, you are shutting everything else down and only bringing back etcd1. So the cluster is not operational; you only have the etcd1 node alive.

I'm looking for a way to bring etcd1 back into the cluster after the command docker rm --force etcd1 && sleep 5s has been executed. That is when I get the error that I sent above. I don't really want to have to shut down the whole cluster every time one node is deleted.
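
For reference, the manual route from the runtime reconfiguration guide linked in the original question is to remove the stale member from the running cluster and add it again before restarting it. A rough sketch of that sequence follows. `member_id_for` and `ENDPOINTS` are hypothetical names, and the parsing assumes the default comma-separated output of `etcdctl member list`, where the ID is the first field and the name is the third.

```shell
#!/bin/sh
# Hypothetical helper: print the ID of the member named $1.
# stdin = output of `etcdctl member list` (default comma-separated format)
member_id_for() {
  awk -F', ' -v n="$1" '$3 == n { print $1 }'
}

# Example wiring, left commented out so nothing is changed by accident.
# ENDPOINTS is an assumed variable pointing at the members that are still healthy.
# id=$(etcdctl --endpoints="$ENDPOINTS" member list | member_id_for etcd1)
# etcdctl --endpoints="$ENDPOINTS" member remove "$id"
# etcdctl --endpoints="$ENDPOINTS" member add etcd1 --peer-urls=http://etcd1:2380
```

After the member has been added again, etcd1 would be started with an empty data directory and --initial-cluster-state=existing.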

MatthewComtois Nov 26, 2023
Author

I think I resolved my problem. I created an image from the etcd Docker image that runs a check: it validates whether the podName is in the cluster. If it finds it, it sets --initial-cluster-state=existing; if not, it sets it to new.
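
A check like the one described above could look roughly like the following wrapper. This is only a sketch and not the actual image from this thread: `pick_cluster_state`, `ETCD_NAME` and `PEER_ENDPOINTS` are hypothetical names, and it assumes the default comma-separated output of `etcdctl member list` with the member name in the third field.

```shell
#!/bin/sh
# Hypothetical helper: choose the value for --initial-cluster-state.
# $1 = this member's name, stdin = output of `etcdctl member list`
# Prints "existing" if the name is already listed, otherwise "new".
pick_cluster_state() {
  if awk -F', ' -v n="$1" '$3 == n { found = 1 } END { exit !found }'; then
    echo existing
  else
    echo new
  fi
}

# Example wiring for an entrypoint, left commented out.
# ETCD_NAME and PEER_ENDPOINTS are assumed environment variables.
# state=$(etcdctl --endpoints="$PEER_ENDPOINTS" member list 2>/dev/null \
#   | pick_cluster_state "$ETCD_NAME")
# exec etcd --name="$ETCD_NAME" --initial-cluster-state="$state" "$@"
```

The wiring lines are commented out so the function can be tried on its own, for example by piping a saved member list into it. Note that an unreachable cluster also yields new, which matches the fallback described above but may not be what you want when the other members are only temporarily unreachable.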

Answer selected by jmhbnz
