Skip to content

Conversation

@joshk
Copy link
Contributor

@joshk joshk commented May 4, 2025

This new extension enables devices to send log output to NervesHub for processing and storage.

This is untested on a device.

Hub PR companion: nerves-hub/nerves_hub_web#2080

@lawik
Copy link
Contributor

lawik commented May 4, 2025

I really like the feature :)

All the extensions so far are designed to essentially only report on request from the server, to avoid uncontrollable transmission of data, spamming, saturating connections, tripping rate limits and what not.

Logging can produce a ton of messages, especially if something is wrong, but even when it isn't if the device is just an enthusiastic logger. This makes me a bit hesitant to set it to be on by default.

I could see the NervesHub requesting logs on an interval and you essentially get them as batches by grabbing whatever is in RingLogger at the time and sending that as a batch (which would probably be nicer on the Postgres side as well, batch inserts are fast).

I could also see a mode where you visit the logging tab and that activates the logging. Pulling the most recent backlog from RingLogger on the device and then streaming while you watch it. This wouldn't produce a great history unless we also fetch at some interval.

The safety design of extensions kind of hinge on them not sending information without NervesHub requesting it. Which is counter-intuitive from a performance perspective but is reassuring from a fleet management one. While we can disable an extension and NervesHubLink should detach it.

I could see a setting that you can switch on for your product that does stream all the logs but I wouldn't necessarily use it as a default. Especially as logging as default can suddenly run up an LTE bill because you updated nerves_hub_link and defaults changed.

Happy to twist and turn on this because the feature will be a lot nicer than needing to console in to get some logs.

@joshk
Copy link
Contributor Author

joshk commented May 4, 2025

Thanks for the great feedback and super relevant points!

I've removed the extension from the defaults, and marked it as Early Release. This will allow us to test it with a select number of devices.

I think getting the Hub PR merged in first is most important, as this PR can stay as a branch while we getting everything working well.

Copy link
Collaborator

@jjcarstens jjcarstens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some initial thoughts

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this file could be combined with NervesHubLink.Extensions.Logging - Not critical, but on devices more modules == slower boot up time. Worth keeping in mind at times 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmm, I hear you. I'll think about it. I like files for clarity and searchability.

end

defp send_logs(log_payload, state) do
case push("send", log_payload) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some sort of rate limiting on the client side as well. Otherwise we're going to bloat the connection which also controls updates

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe would be worthwhile to batch logs for ~500ms or something. Means we get natural batches on the insert side as well. Not exactly rate limiting but does improve the rate potentially.

More rate-limity: If a device is spewing logs it would make sense to shed some of them (it is probably looping/spewing the same thing continuously) and just include a log line about having shed logs?

joshk added a commit to nerves-hub/nerves_hub_web that referenced this pull request May 9, 2025
Companion to nerves-hub/nerves_hub_link#303

New log lines are streamed to the UI, with only 25 max shown.

This has been implemented using
**[ClickHouse](https://clickhouse.com/)**.

I've also used [hammer](https://hexdocs.pm/hammer) for rate limiting log
messages (3 per second, with a burst capacity of 10).

**Pros:**
- built-in row truncation (defaults to 3 days)
- isolate analytics from core platform functionality
- the analytics system isn't required, and if turned off, the logging
extension can't be attached

**Cons:**
- we don't get transaction isolation in tests
- we don't get nice association helpers as these are separate ecto repos

**Future possibilities:**
- "Show more"
- Filtering on level 
- Graph of number of log lines over X period
- Search log messages (not sure how well this would work with
ClickHouse)

**Things to note:**
- using `make iex-server` will try to start the analytics subsystem
- to skip the analytics subsystem, use `ANALYTICS_ENABLED=false make
iex-server`
- the docker compose file has been updated to include ClickHouse, its
recommended to use this for starting supplementary systems like Postgres
and ClickHouse.

---

**Simple UI** (to start with)


https://github.com/user-attachments/assets/b4ccd5e8-2411-4423-809e-8b48d19ebbff

---

**To finish:**

- [x] `device_log_lines` truncation (customizable time period)
- [x] tests

---------

Co-authored-by: Nate Shoemaker <[email protected]>
@joshk joshk force-pushed the add-logging-extension branch from 6ed80c5 to 7e9f4bd Compare August 17, 2025 23:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants