Replies: 3 comments 6 replies
/cc @theblobinthesky Maybe you have an idea where to start debugging, @Hialus or @bensofficial?
In our own setup we have a config like this.
Core Node Config:
So the main difference is the
Thanks everyone for looking into the issue. Since the planned alternative setup with Redis (#11449) was a lot easier to configure, I do not think we will consider Hazelcast any further.
I have a question regarding Hazelcast in a multi-instance setup: How can I debug if the Hazelcast nodes do not connect to each other?
Artemis setup
I want to try out the LocalCI setup with the build-agent running in a separate VM. We deploy our Artemis as a container instead of directly via the WAR file. Since Artemis is running in a container, it cannot bind directly to the public address of the machine, but Hazelcast allows for different bind-interfaces and public addresses with the addition of a configuration parameter in Artemis [1].
[1] develop...feature/general/hazelcast-public-address, built container image ghcr.io/uni-passau-artemis/artemis:8.2.2-hazelcast
Artemis main instance:
hazelcast.interface: 0.0.0.0
hazelcast.publicAddress: 172.16.1.10:5701
Artemis build agent:
hazelcast.interface: 0.0.0.0
hazelcast.publicAddress: 172.16.1.20:5701
In the log and in the Eureka web interface I can now see that both instances correctly connect to the registry:
However, for example in the log of artemis-container I then get
and every two minutes the same (on the build agent the same with swapped IPs). So it looks like the Hazelcast cluster does not form, but there is no error log explaining what might go wrong. It looks like it correctly uses the public addresses as defined in the config to try to connect.
Using nmap, I verified that port 5701 is open on both machines when checking from the respective other one (i.e. there should not be a firewall that blocks the communication). I also get an error log in Artemis that the Hazelcast REST API is not enabled when I try to curl http://172.16.0.{10,20}:5701/hazelcast/rest/, so it seems the public address/port is successfully routed to Artemis itself and Artemis listens on it.
Artemis config
Sanity-check using plain Hazelcast container
To check if this is an issue with Hazelcast itself, or the way it is used in Artemis, I used the plain Hazelcast containers on both hosts to check if they can connect using the public/private IP mapping.
hazelcast.yml identical on both hosts
Then I start the containers on the two hosts:
In the log I can see that the two instances successfully connect:
Network traffic
Using the Hazelcast containers, I can see network traffic to the destination port 5702 on the other host as expected.
When I tcpdump the traffic to 5701 to check for potential connections Artemis (tries) to open, there is nothing.
Also using strace -f -t -e network -p $ARTEMIS_PID, I can see nothing at the time when Artemis logs "Adding Hazelcast cluster member".
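For anyone who wants to repeat the nmap reachability check from inside the containers, where nmap may not be installed, here is a minimal Python sketch. It only tests whether a TCP connection to each member address can be opened. It does not speak the Hazelcast protocol, so a successful result does not prove that the cluster can form. The function names `can_connect` and `check_members` are made up for this example and are not part of Artemis or Hazelcast.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False


def check_members(members, timeout: float = 3.0) -> dict:
    """Map each "host:port" string to whether it accepted a TCP connection.

    `members` is a list of (host, port) tuples, e.g. the publicAddress
    values from the configuration above.
    """
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in members
    }
```

Calling `check_members([("172.16.1.10", 5701), ("172.16.1.20", 5701)])` from each host (or from inside each container) gives a quick view of which public addresses are reachable from that network namespace. Running it inside the Artemis container rather than on the host matters here, because the tcpdump and strace observations concern connections that Artemis itself would open.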