Skip to content

Conversation

skilau
Copy link

@skilau skilau commented Oct 10, 2025

This PR fixes an issue we have on our platform, where time_t is a int64_t instead of the assumed uint32_t.

We ran into a problem where when stopping the daemon, and restarting it would fail to properly load the neighbor cache, and would cause the cached nodes to be rejected until their 30 minute timeout expired on their side, and eventually gave up, and reconnected.

This issue was quite puzzling at first, but adding debugging to the loading of the neighbor cache files showed that the cached neighbors were all being loaded, and then promptly being rejected as being too old, and flushed/deleted away.

When digging into a neighbor cache file, we discovered the issue, here is an example of the issue before the fix:

ipv6[0] = fd00:5425:5c08:3e2:435c:8d05:ac0d:5fa8
lifetime[0] = 2220
# Fri Oct 10 04:01:16 2025 UTC
expiration[0] = 3118
ipv6[1] = fe80::8a0f:62ff:fe10:aa93
lifetime[1] = 600
# Fri Oct 10 03:34:07 2025 UTC
expiration[1] = 3118

If we look at the values for the expiration[], they are far too small to be realistic, even though the comment above them shows the proper timestamp in text format, so we knew that the time_t value was correct and not corrupt at that time.

Looking at the code that generates that line, we found the problem.
It was doing a printf() with "%lu", instead of the "%llu" that it needed to be, for platforms that have a time_t that is set to a uint64_t. (Or int64_t, etc)

Changing it to "%llu" (or "%ju) gets us the correct value, which would be the standard unix timestamp that we all expect:

ipv6[0] = fd00:5425:5c08:3e2:435c:8d05:ac0d:5fa8
lifetime[0] = 2220
# Fri Oct 10 05:49:00 2025 UTC
expiration[0] = 1760075340
ipv6[1] = fe80::8a0f:62ff:fe10:aa93
lifetime[1] = 2220
# Fri Oct 10 05:49:00 2025 UTC
expiration[1] = 1760075340

This fixes our existing problem.

We did notice that another "expiration_s" value was called out as a uint32_t, in one of the headers, so I have made the change to time_t there, and included the change in this PR.

This seems to also work fine, and we haven't noticed any problems with these changes so far.

The tools platform we are on, which is what is providing the 64bit time_t change, is:

# ldd
musl libc (arm)
Version 1.2.5
Dynamic Program Loader
Usage: ldd [options] [--] pathname

The typedef is here:
https://git.musl-libc.org/cgit/musl/tree/include/alltypes.h.in#n12

MathisMARION and others added 15 commits September 3, 2025 17:11
In case several frames were queued since a long time, the following
recursive calls could happen:

    ws_llc_mac_confirm_cb
    > ws_llc_eapol_confirm
      > ws_llc_mpx_eapol_send
        > ws_llc_mac_confirm_cb

In terms of traces, this typically manifests as:

    1755596506.427657: hif rx: CNF_DATA_TX a8 03 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 13 00 01 b4 0a 00
    1755596506.427740: tx-abort: tx failure status=NOACK dst=38:6e:21:ff:fe:59:c3:74
    1755596506.427825: tx-15.4 eapol     dst:38:6e:21:ff:fe:59:c0:b7 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.427854: tx-abort: tx failure status=TIMEDOUT dst=38:6e:21:ff:fe:59:c0:b7
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.427881: tx-15.4 eapol     dst:94:b2:16:ff:fe:05:98:90 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.427901: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:05:98:90
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.427925: tx-15.4 eapol     dst:94:b2:16:ff:fe:05:75:48 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.427945: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:05:75:48
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.427967: tx-15.4 eapol     dst:94:b2:16:ff:fe:00:c7:7e panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.427987: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:00:c7:7e
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.428010: tx-15.4 eapol     dst:38:6e:21:ff:fe:59:c2:05 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.428029: tx-abort: tx failure status=TIMEDOUT dst=38:6e:21:ff:fe:59:c2:05
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.428052: tx-15.4 eapol     dst:38:6e:21:ff:fe:59:bc:fd panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.428071: tx-abort: tx failure status=TIMEDOUT dst=38:6e:21:ff:fe:59:bc:fd
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.428094: tx-15.4 eapol     dst:94:b2:16:ff:fe:05:91:41 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.428114: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:05:91:41
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.428136: tx-15.4 eapol     dst:94:b2:16:ff:fe:05:9a:a4 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.428156: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:05:9a:a4
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"
    1755596506.428179: tx-15.4 eapol     dst:94:b2:16:ff:fe:05:80:37 panid:156
    warning: wsbr_data_req_ext: neighbor timeout before packet send
    1755596506.428198: tx-abort: tx failure status=TIMEDOUT dst=94:b2:16:ff:fe:05:80:37
    ws_llc_mac_confirm_cb():457: warning: "!ws_neigh"

WI_SUN-4654
Fixes: a89ca7f ("dbus: expose EAPoL target in Nodes property")
Origin: SiliconLabs#29
Fixes: 13c26eb ("ws: update SG PHY definitions with spec 2v03").

WI_SUN-4641
There are no such chan_plan_id supported in chan_params_table.

WI_SUN-4641
There are no such chan_plan_id supported in chan_params_table.

WI_SUN-4641
MbedTLS version 3.6.3 [1] introduced a breaking change, now requiring
explicit hostname configuration, otherwise causeing authentication
failure:

    In TLS clients, if mbedtls_ssl_set_hostname() has not been called,
    mbedtls_ssl_handshake() now fails with
    MBEDTLS_ERR_SSL_CERTIFICATE_VERIFICATION_WITHOUT_HOSTNAME
    if certificate-based authentication of the server is attempted.
    This is because authenticating a server without knowing what name
    to expect is usually insecure. To restore the old behavior, either
    call mbedtls_ssl_set_hostname() with NULL as the hostname, or
    enable the new compile-time option
    MBEDTLS_SSL_CLI_ALLOW_WEAK_CERTIFICATE_VERIFICATION_WITHOUT_HOSTNAME.

[1]: https://github.com/Mbed-TLS/mbedtls/releases/tag/mbedtls-3.6.3
gcc -fsanitize=leak reported:

    Direct leak of 7 byte(s) in 1 object(s) allocated from:
        #0 0x62c51c in __interceptor_calloc (/usr/local/bin/wsbrd+0x15051c) (BuildId: 118a136d933787dcd018225c9fbf39ff524a9995)
        SiliconLabs#1 0x76de2570 in zalloc sl_cpc.c
        SiliconLabs#2 0x76ded368 in cpc_get_secondary_app_version (/usr/local/lib/libcpc.so.3+0xc368) (BuildId: 2ccfb015683fef6696474d811e1962d05724f1cd)
        SiliconLabs#3 0x9964e0 in cpc_secondary_app_version common/bus_cpc.c:66:11
        SiliconLabs#4 0x7961f4 in rcp_init common/rcp_api.c:696:28
        SiliconLabs#5 0x678a60 in wsbr_main app_wsbrd/app/wsbrd.c:594:5
        SiliconLabs#6 0x671c58 in main app_wsbrd/app/main.c:18:12
        SiliconLabs#7 0x76c223ec in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

Note that cpc_get_secondary_app_version() documents:

    On success, an owned copy of the string is returned.
    This copy can be freed using cpc_free_secondary_app_version(copy).

However, cpc_free_secondary_app_version() takes a non-const char * as
argument, which forces to cast away the const char * returned by
cpc_get_secondary_app_version().
The vanilla glibc headers produce a warning when _FORTIFY_SOURCE is used
without optimizations [1]. However Debian removes that warning when
packaging glibc [2].

Hardening features should be manually enabled by users.

[1]: https://sourceware.org/git/?p=glibc.git;f=include/features.h;h=glibc-2.42#l435
[2]: https://salsa.debian.org/glibc-team/glibc/-/blob/sid/debian/patches/any/local-revert-bz13979.diff

Origin: SiliconLabs#30
@skilau skilau changed the title Fix issue with time_t being a uint64_t on some platforms Fix issue with time_t being a int64_t on some platforms Oct 10, 2025
@MathisMARION
Copy link
Collaborator

MathisMARION commented Oct 13, 2025

Hello, and thank you for the detailed investigation and bug fix. We have experienced this kind of issue in the past with the width of time_t. This is a combination of the native integer sizes, and the type definition for time_t. We have tested:

  • time_t as long and sizeof(long) == 4.
  • time_t as long and sizeof(long) == 8.

It seems that your platform has time_t as a long long with sizeof(long) == 4 and sizeof(long long) == 8.

I believe you should have gotten a compilation warning of this kind, is it not the case?

warning: format ‘%luexpects argument of typelong unsigned int’, but argument 3 has typelong long int’ [-Wformat=]

As for the other edit inside struct ws_neigh, the field appears to be unused, we will remove it.

@skilau
Copy link
Author

skilau commented Oct 13, 2025

Yes, we actually went with this for now, in our code:

fprintf(nvm->file, "expiration[%d] = %" PRId64 "\n", i, ts);

as getting the printf() right when it could be a uint32_t, uint64_t or int64_t is kind of painful, I am not sure what the best printf format would be to be compatible with all the platforms wsbrd supports.

@MathisMARION
Copy link
Collaborator

You should have received an edited version of your patch by email. We tend to do our internal review by email, but don't worry about it unless you have extra comments to make. I actually did not mean to add you as a cc but here we are 😅 Your patch will be applied to our internal repo and hopefully released as part of v2.7 (supposedly this week).

MathisMARION pushed a commit that referenced this pull request Oct 16, 2025
There are platforms where time_t is a long long and
sizeof(long long) != sizeof(long).

Origin: #34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants