LXD node IP wrongly resolving to 127.0.1.1 due to /etc/hosts rules, causing test programs to bind to localhost #615

@Wyb1406272043

Description

While building test nodes with LXD and running the etcd test program from the tutorial, I found that the nodes could not communicate with each other:

2025-03-14 10:33:49.711760 W | rafthttp: health check for peer 9b116f88cab4dc9 could not connect: dial tcp 10.148.173.139:2380: getsockopt: connection refused
2025-03-14 10:33:49.716165 W | rafthttp: health check for peer 5aa594b5d9b66c42 could not connect: dial tcp 10.148.173.249:2380: getsockopt: connection refused
2025-03-14 10:33:49.722938 W | rafthttp: health check for peer 7f6143cbd22aca00 could not connect: dial tcp 10.148.173.95:2380: getsockopt: connection refused
2025-03-14 10:33:49.723323 W | rafthttp: health check for peer f82e563e5c75137e could not connect: dial tcp 10.148.173.142:2380: getsockopt: connection refused

Root Cause

After debugging, I found that n1 incorrectly resolved its own hostname to 127.0.1.1. As a result, the node used 127.0.1.1 as its bind address, so connections from other nodes were refused. The cause is the 127.0.1.1 entry in /etc/hosts. LXD writes this entry when the container is created; some other container runtimes, such as Docker, do not add it.

2025-03-14 03:16:43.405882 W | etcdmain:no data-dir provided,using default data-dir./n1.etcd
2025-03-14 03:16:43.405906 W | embed:expected Ip in URL for binding(http://n1:2380)
2025-03-14 03:16:43.405925 W | embed:expected Ip in URL for binding(http://n1:2379)
2025-03-14 03:16:43.406274 I | embed:listening for peers on http://n1:2380
2025-03-14 03:16:43.406384 I | embed:listening for client requests on n1:2379
2025-03-14 03:16:43.439387 I | pkg/netutil:resolving n1:2380 to 127.0.1.1:2380
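
To confirm the symptom on a node, it can help to check what the node's own hostname resolves to. The following is a minimal Python sketch, not part of the etcd test program. The helper names `is_loopback` and `check_hostname` are hypothetical, and it assumes the system resolver consults /etc/hosts in the same way the test program does.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the address lies in a loopback range such as 127.0.0.0/8."""
    return ipaddress.ip_address(address).is_loopback


def check_hostname(hostname: str) -> str:
    """Resolve a hostname through the system resolver and describe the result."""
    address = socket.gethostbyname(hostname)
    if is_loopback(address):
        return f"{hostname} -> {address} (loopback, peers cannot reach this address)"
    return f"{hostname} -> {address} (ok)"


if __name__ == "__main__":
    # Check the name this node would use for its own bind address.
    print(check_hostname(socket.gethostname()))
```

If this reports a loopback address on n1, the node is affected by the /etc/hosts entry described above.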

The lxd-imagebuilder documentation and source confirm that LXD writes this entry (the documentation says 127.0.0.1, but the code actually writes 127.0.1.1):

https://github.com/canonical/lxd-imagebuilder/blob/7574f6883f23d88716937e5951de7f4d5301ca93/doc/reference/lxd-imagebuilder/generators.md?plain=1#L90-L95
https://github.com/canonical/lxd-imagebuilder/blob/7574f6883f23d88716937e5951de7f4d5301ca93/generators/hosts.go#L37-L38

A possible fix

Delete the "127.0.1.1 hostname" entry from /etc/hosts; the nodes can then resolve their own names correctly through DNS. Alternatively, with more effort, set the explicit addresses manually in the configuration.
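
The first option can be sketched in Python as a filter over the contents of /etc/hosts. This is only an illustration under assumptions: `strip_loopback_alias` is a hypothetical helper, and it drops every line whose first field is 127.0.1.1 while leaving all other lines, including comments, untouched.

```python
def strip_loopback_alias(hosts_text: str, address: str = "127.0.1.1") -> str:
    """Return hosts_text without the lines whose first field is `address`."""
    kept = []
    for line in hosts_text.splitlines():
        # Only the part before '#' is an entry; the rest of the line is a comment.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == address:
            continue  # skip the entry that maps the hostname to loopback
        kept.append(line)
    return "".join(entry + "\n" for entry in kept)


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n127.0.1.1 n1\n"
    print(strip_loopback_alias(sample), end="")
```

Applying it to the real file means reading /etc/hosts, writing the result back as root, and keeping a backup copy first.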

Other related DNS issues

By the way, when configuring the search domain: if "sudo resolvectl status lxdbr0" displays the correct search domain but "ping n1" still fails to resolve the way "ping n1.lxd" does, this can be fixed by adding a Domains setting, such as "Domains=lxd", to "/etc/systemd/resolved.conf".
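
A rough Python sketch of that change follows, assuming the setting belongs under the [Resolve] section of resolved.conf. The helper `ensure_search_domain` is hypothetical, and it overwrites any active Domains= line rather than appending to it.

```python
def ensure_search_domain(conf_text: str, domain: str = "lxd") -> str:
    """Return resolved.conf text with Domains=<domain> set under [Resolve]."""
    lines = conf_text.splitlines()
    setting = f"Domains={domain}"
    in_resolve = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            in_resolve = stripped == "[Resolve]"
        elif in_resolve and stripped.startswith("Domains="):
            # An active setting already exists: replace it in place.
            lines[i] = setting
            return "".join(entry + "\n" for entry in lines)
    headers = [i for i, entry in enumerate(lines) if entry.strip() == "[Resolve]"]
    if headers:
        # Section exists without an active Domains= line: add one below the header.
        lines.insert(headers[0] + 1, setting)
    else:
        # No [Resolve] section at all: append one.
        lines += ["[Resolve]", setting]
    return "".join(entry + "\n" for entry in lines)


if __name__ == "__main__":
    print(ensure_search_domain("[Resolve]\n#Domains=\n"), end="")
```

After the file is changed, systemd-resolved has to be restarted for the new setting to take effect.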
