This repository has been archived by the owner on Dec 12, 2023. It is now read-only.

one-click-tangle broken #85

Open
IoTAdri opened this issue Mar 23, 2023 · 13 comments
Labels
bug Something isn't working

Comments

@IoTAdri

IoTAdri commented Mar 23, 2023

Bug description

When running private-tangle.sh, several error messages occur:
WARN[0000] network tangle: network.external.name is deprecated. Please set network.name with external: true

AND

Waiting coordinator bootstrap to stop gracefully...
Error response from daemon: No such container: 2385239fcb05553225d6e5238fb94b540dc23b88ea1adabeac7ef83ef36296f1

nodes are not deployed

Docker and docker-compose version

Docker version 23.0.1, build a5ee5b1
Docker Compose version v2.16.0
Portainer 2.17.1

Hardware specification

VPS on Ubuntu 22.04

Steps To reproduce the bug


  1. just run the one-click-install

Expected behaviour

private tangle with 4 containers should be deployed

Actual behaviour

The nodes are not deployed in Docker. When I edit docker-compose.yaml and fix the first warning with:

networks:
  tangle:
    name: private-tangle
    external: true

it seems to go wrong in bootstrapCoordinator (), as if the coordinator bootstrap container is not deployed.
It generates an error, and the other nodes (node, spammer, etc.) are not deployed.

Errors

Waiting coordinator bootstrap to stop gracefully...
Error response from daemon: No such container: 2385239fcb05553225d6e5238fb94b540dc23b88ea1adabeac7ef83ef36296f1
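
One possible reading of the "No such container" error is that the bootstrap container had already exited and been removed by the time the script tried to stop it. That is an assumption, not something confirmed in this thread. A minimal sketch of a guard, where stop_if_exists is a hypothetical helper that is not part of private-tangle.sh:

```shell
# Hypothetical helper, not part of private-tangle.sh.
# Stops a container by ID or name only if Docker still knows about it.
stop_if_exists() {
  cid="$1"
  # `docker container inspect` exits non-zero when the container is gone.
  if docker container inspect "$cid" >/dev/null 2>&1; then
    docker stop "$cid" >/dev/null
    echo "stopped"
  else
    echo "already gone"
  fi
}
```

With a guard like this, a container that has already disappeared is reported as "already gone" instead of aborting the rest of the install.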

IoTAdri added the bug label Mar 23, 2023
@IoTAdri
Author

IoTAdri commented Mar 24, 2023

addendum:

Since I could not solve the above problem, I used a workaround and bootstrapped the coo outside the bash script to get it running.
Now, about 3 out of 5 times when I run the stop procedure (./private-tangle.sh stop),
the coo does not stop, and when I force it after 5 minutes it deletes the coordinator.state.
With no coordinator.state the start procedure does not work and the coo exits.
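
Until the root cause is found, one precaution is to copy the state file aside before each stop. The sketch below is only an illustration: backup_coo_state is a hypothetical helper, and the default paths (./coo-state/coordinator.state on the host and ./coo-state-backup) are assumptions about the folder layout.

```shell
# Hypothetical helper: copies the coordinator state file to a timestamped backup
# and prints the path of the copy.
# Default paths are assumptions about the host folder layout.
backup_coo_state() {
  state_file="${1:-./coo-state/coordinator.state}"
  backup_dir="${2:-./coo-state-backup}"
  if [ ! -f "$state_file" ]; then
    echo "no state file at $state_file" >&2
    return 1
  fi
  mkdir -p "$backup_dir" || return 1
  target="$backup_dir/coordinator.state.$(date +%Y%m%d%H%M%S)"
  # -p keeps the original timestamps and permissions on the copy.
  cp -p "$state_file" "$target" && echo "$target"
}
```

It could be called as `backup_coo_state && ./private-tangle.sh stop`. Whether the coordinator accepts a restored copy depends on the copy matching the database, which is not confirmed here.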

I tried to use coo-fix-state:
docker-compose run --rm coo tool coo-fix-state --databasePath /app/db --stateFilePath /app/coo-state
(with the paths mentioned in the container)
but it crashes with:
image

this is not a stable situation... HELP!

@jmcanterafonseca-iota
Collaborator

@IoTAdri could you please run ./private-tangle.sh install from scratch (in the hornet-private-net folder) and attach here all the output you get in the console?

@IoTAdri
Author

IoTAdri commented Mar 28, 2023

Hi Jorge,
Thank you for responding to me.
My first errors (related to Docker) are fixed (I had used a version from 10/3, which you have already corrected), but my second error still exists: the coo does not stop, which corrupts the coordinator.state.

Here is the screenshot when I do the whole install as per your request:
ptanglestandardokay
Everything went smoothly, up and running very fast.
But after a few hours, and after starting and stopping a few times (as will happen over a number of months during deployment), the coo keeps running when asked to stop and does not exit gracefully (I can send you the logs as well). It keeps trying to exit for a few minutes until the timeout occurs, and when the timer reaches zero it is killed, deleting the coordinator.state in the process.
privatetangleError

here a shot from the logs:
Schermafbeelding 2023-03-27 223550

When you then try to start the network again, the coo fails.
I tried to use coo-fix-state but got the error mentioned above (see addendum).

@jmcanterafonseca-iota
Collaborator

jmcanterafonseca-iota commented Mar 28, 2023

can you try to edit your config-coo.json (located under folder config) and under the entry described below, change the stateFilePath field to point to an absolute path?

"coordinator": {
        "stateFilePath": "<full_path_to your_folder>/coo-state/coordinator.state",

@IoTAdri

@IoTAdri
Author

IoTAdri commented Mar 28, 2023

@jmcanterafonseca-iota okay, I will change the path, but first I have to reinstall because the current error is unrecoverable (I cannot use coo-fix-state).

but.. is that not the path "inside" the docker container?

image
and when I first remove \db (just to be sure) and do an install I get:
image

@jmcanterafonseca-iota
Collaborator

you are right, then it should be /app/coo-state/coordinator.state
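
To confirm which value the config currently holds, a small check along these lines may help. It is a sketch under assumptions: coo_state_path is a hypothetical helper, and the default location ./config/config-coo.json assumes the command is run from the folder that contains config.

```shell
# Hypothetical check: prints the stateFilePath value found in a coordinator config.
# The default config location is an assumption about the working directory.
coo_state_path() {
  cfg="${1:-./config/config-coo.json}"
  # Extract the first "stateFilePath": "..." value without needing jq.
  sed -n 's/.*"stateFilePath"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$cfg" | head -n 1
}
```

If the printed value is not /app/coo-state/coordinator.state, the entry still needs to be changed.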

@jmcanterafonseca-iota
Collaborator

@IoTAdri was it fixed?

@IoTAdri
Author

IoTAdri commented Mar 29, 2023

This seems a bit more stable, but after stopping and starting the private tangle a number of times it gets into trouble again:
image
and I do not know how to recover from this...

@IoTAdri
Author

IoTAdri commented Mar 29, 2023

@jmcanterafonseca-iota after a few minutes the "graceful exit" times out and the coo container is closed, but the coordinator.state is gone...
image
We used to be able to correct this with coo-fix-state, but I cannot get it to work (see "addendum" above).

[I tried renaming the coordinator.state_old to coordinator.state and restart the private tangle but... nope, no luck!]

@jmcanterafonseca-iota
Collaborator

can you try stopping the Coo manually i.e. through

docker kill --signal="SIGTERM" coo

@IoTAdri
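
A sketch of how the manual stop could be wrapped so that the container is never force-killed while it is still shutting down: send SIGTERM, then poll until the container reports that it is no longer running. term_and_wait is a hypothetical helper, and the container name coo and the 300 second default are assumptions based on this thread.

```shell
# Hypothetical helper: sends SIGTERM to a container and polls until it stops
# or the timeout (in seconds) runs out. It never sends SIGKILL.
term_and_wait() {
  name="${1:-coo}"
  timeout="${2:-300}"
  docker kill --signal=SIGTERM "$name" >/dev/null || return 1
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # Prints "true" while the container runs; empty if it was already removed.
    running="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
    if [ "$running" != "true" ]; then
      echo "stopped after ${waited}s"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "still running after ${timeout}s" >&2
  return 1
}
```

A non-zero return means the coo is still running, so the caller can decide whether to keep waiting rather than kill it.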

@IoTAdri
Author

IoTAdri commented Mar 30, 2023

can you try stopping the Coo manually i.e. through

docker kill --signal="SIGTERM" coo

@IoTAdri

@jmcanterafonseca-iota
I probably can but..

The problem is not that the coo keeps on running and cannot be stopped but that when it is killed (or timed out by itself in 5 minutes) it deletes the coordinator.state and I cannot reconstruct it because coo-fix-state does not work...

@jmcanterafonseca-iota
Collaborator

maybe @muXxer can comment on the above

@Nilesh0711

WARN[0000] network tangle: network.external.name is deprecated in favor of network.name
Waiting for 10 seconds ... ⏳
2023-05-31T11:54:34Z INFO Coordinator milestone issued (1): ba93f3575086e127994707bb0030e7c12213c9aed1414f9614e78bbbed1ea199
Coordinator bootstrapped!
d7203e741b014b46400b3b565780cdc9c2423766920cf59c018efcc557918c44
Waiting coordinator bootstrap to stop gracefully...
Error: No such container: d7203e741b014b46400b3b565780cdc9c2423766920cf59c018efcc557918c44

any better solution?


3 participants