
v1.5-Beta-1

Pre-release
@rchac rchac released this 25 Jun 19:57
· 656 commits to develop since this release
81677b1

LibreQoS V1.5-BETA-1

After 6 months of development, LibreQoS is proud to announce the first beta of LibreQoS version 1.5. Our v1.4 (released January 2024) is now in use by over 150 ISPs across 22 US states and 42 countries, serving over a million subscribers - reducing network latency via CAKE and providing extensive diagnostics and statistics to ISP support staff. We hope that v1.5 will cover the world!

LibreQoS would like to thank our subscribers, donors, and Equinix Metal, NLnet, and Zulip for their support during this process. We wouldn't be able to keep fixing the internet without you! Also a huge thanks to Ubuntu for such a solid OS, and the libbpf team for all their improvements to the underlying network code we leverage.

Get v1.5-BETA-1 here: https://libreqos.io/#download

LibreQoS v1.5 Features Overview:

  • 30%+ Performance Improvement. Large improvements in code efficiency. LibreQoS can now push 70 Gbit/s on a $1500 AMD Ryzen 9 7950X box with a Mellanox MCX516A card.
  • Intelligent Binpacking. Dynamically redistribute load (based on long-term statistics) to get the most out of all of your CPU cores.
  • Flow-Based Analysis. Flows are tracked individually, for TCP retransmits, round-trip time, and performance. Completed flows can be analyzed by ASN, geolocated, or exported via Netflow.
  • Unified Configuration System and GUI. No more separate ispConfig.py and lqos.conf files - all configuration is managed in one place. The web user interface now lets you manage the whole configuration, including devices and network lists.
  • Support for Newer Linux Enhancements. LibreQoS can take advantage of eBPF improvements in the 6.x kernel tree to further improve your performance - but remains compatible with later 5.x kernels.
  • Improved CLI Tools. A rewritten lqtop, a new support tool, and more.

Changelog Since 1.4

Unified Configuration System

  • Replace ispConfig.py with a single /etc/lqos.conf
  • Automatically migrate previous configuration
  • In-memory cache system (load the config once and share it, detect changes)
  • Shared configuration between Python and Rust
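The "load the config once and share it, detect changes" idea above can be sketched in a few lines. This is a hypothetical illustration, not LibreQoS's implementation: `ConfigCache` and its `parse` callback are made-up names, and change detection here uses a content hash rather than whatever mechanism lqosd actually uses.

```python
import hashlib


class ConfigCache:
    """Load a config file once and share the parsed result; re-parse
    only when the file's contents change on disk.
    (Illustrative sketch - not the real lqosd cache.)"""

    def __init__(self, path, parse):
        self.path = path
        self.parse = parse          # e.g. tomllib.loads for a TOML lqos.conf
        self._digest = None
        self._config = None

    def get(self):
        with open(self.path, "rb") as f:
            raw = f.read()
        digest = hashlib.sha256(raw).hexdigest()
        if digest != self._digest:  # file changed (or first load) -> re-parse
            self._config = self.parse(raw.decode())
            self._digest = digest
        return self._config         # same shared object until the file changes
```

Callers in both Python and Rust-facing code paths can then hit the cache rather than re-reading the file on every request.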

Dynamic Binpacking

  • In tree mode, the CPU-queue assignment commonly allocated too many resources
    to a single CPU.
  • Each circuit is assigned a "weight". With Long-Term Stats (LTS), the weight is calculated from
    past usage AND the assigned speed plan; without LTS, it is just the assigned speed plan.
  • The total weight - at this time of day - is then calculated for each top-level entry.
  • A "binpacking" algorithm then attempts to equalize load between CPUs.
  • Significant performance improvement on many networks.
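The binpacking step above can be approximated with a classic greedy heuristic: sort circuits by weight descending, then always place the next circuit on the least-loaded CPU. This is a sketch of the general technique, not LibreQoS's actual algorithm; the function name and data shapes are assumptions.

```python
import heapq


def binpack(weights, num_cpus):
    """Greedy least-loaded-first binpacking (longest-processing-time style).

    weights:  dict of circuit id -> weight (e.g. derived from LTS usage
              and/or speed plan)
    returns:  dict of circuit id -> CPU index
    """
    # Min-heap of (total assigned weight, cpu index)
    heap = [(0.0, cpu) for cpu in range(num_cpus)]
    heapq.heapify(heap)

    assignment = {}
    # Heaviest circuits first gives the heuristic its balancing power
    for circuit, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        load, cpu = heapq.heappop(heap)     # least-loaded CPU so far
        assignment[circuit] = cpu
        heapq.heappush(heap, (load + weight, cpu))
    return assignment
```

Because weights reflect load "at this time of day", re-running the packing periodically lets the assignment follow the network's daily usage pattern.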

Per-Flow Tracking System

  • "Flows" are detected as TCP connections, UDP connections that reuse a source/destination and ICMP between a source/destination.
  • Rather than just track per-host statistics, statistics are attached to a flow.
  • Flows maintain a rate estimation at all times, in-kernel.
  • Flows calculate Round-Trip Time (RTT) continuously.
  • Flows spot timestamp duplications indicating a TCP retransmit (or duplicate packet).
  • Much of the kernel code moved from the TC part of eBPF to the XDP part, giving a modest speed-up and improvement in
    overall throughput.
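To make the "flows maintain a rate estimation at all times" idea concrete, here is a userspace stand-in: a 5-tuple flow key plus an exponentially-weighted rate estimator. The real estimator lives in eBPF and certainly differs in detail; the names and the smoothing factor below are assumptions for illustration only.

```python
from collections import namedtuple

# A flow is identified by its 5-tuple
FlowKey = namedtuple("FlowKey", "proto src dst sport dport")


class FlowRate:
    """Exponentially-weighted moving-average rate estimator.
    (Userspace sketch of per-flow rate tracking; not the kernel code.)"""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # smoothing factor: 1.0 = instantaneous only
        self.rate_bps = 0.0
        self.last = None        # timestamp of the previous update

    def update(self, nbytes, now):
        # The very first packet only establishes a timestamp baseline
        if self.last is not None and now > self.last:
            inst = nbytes * 8 / (now - self.last)   # instantaneous bits/sec
            self.rate_bps = self.alpha * inst + (1 - self.alpha) * self.rate_bps
        self.last = now
```

Keeping the estimate incremental means each packet costs O(1) work, which matters when the real version runs per-packet inside XDP.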

Per-Flow Kernel-Userland Reporting System

  • Rather than reporting RTT via a giant data structure, individual reports are fed from the kernel to userspace
    through a callback system.
  • Flows "closing" (clean closure) results in a kernel-userspace notify.
  • Flows also expire on a periodic tick if no data has arrived in a given time period.
  • This decreased kernel side overhead significantly (eBPF kernel to userspace is non-blocking send).
  • This increased userspace CPU usage very slightly, but removed the processing overhead from the packet-flow execution path.

Per-Flow Reporting System

  • RTT is compiled per-flow into a ring buffer. Results from very low-traffic (mostly idle) flows are ignored. RTT is calculated
    as the median of the last hundred reports - a significant accuracy improvement.
  • Per-flow TCP retries are recorded.
  • When flows "close", they are submitted for additional analysis.
  • Simple protocol naming system maps ethertype/port to known protocols.
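The ring-buffer-plus-median scheme above is easy to sketch. This is an illustrative model only: the `MIN_SAMPLES` threshold for "mostly idle" flows is made up, and the real version filters on traffic volume in ways not described here.

```python
from collections import deque
from statistics import median


class RttTracker:
    """Per-flow RTT ring buffer reporting the median of recent samples.
    (Sketch of the scheme described above; thresholds are assumptions.)"""

    MIN_SAMPLES = 5   # hypothetical floor below which a flow is "too idle"

    def __init__(self, capacity=100):
        # deque with maxlen acts as the ring buffer: old samples fall off
        self.samples = deque(maxlen=capacity)

    def record(self, rtt_ms):
        self.samples.append(rtt_ms)

    def rtt(self):
        if len(self.samples) < self.MIN_SAMPLES:
            return None              # not enough data to report
        return median(self.samples)  # robust against outlier samples
```

The median is the key design choice: a single retransmit-inflated sample barely moves it, whereas it would drag a mean upward.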

Export Flow Data in NetFlow Versions 5 and 9
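For context, a classic NetFlow v5 flow record is a fixed 48-byte big-endian structure. The sketch below packs a single record per that well-known layout; it is not LibreQoS's exporter - the UDP transport, the 24-byte v5 header, and sequence numbering are omitted, and the zeroed fields are placeholders.

```python
import socket
import struct


def netflow_v5_record(src_ip, dst_ip, sport, dport, proto,
                      packets, octets, first_ms, last_ms):
    """Pack one 48-byte NetFlow v5 flow record (big-endian)."""
    return struct.pack(
        "!4s4s4sHHIIIIHHBBBBHHBBH",
        socket.inet_aton(src_ip),      # source IPv4 address
        socket.inet_aton(dst_ip),      # destination IPv4 address
        socket.inet_aton("0.0.0.0"),   # next hop (unused in this sketch)
        0, 0,                          # input / output ifIndex
        packets,                       # packets in the flow
        octets,                        # bytes in the flow
        first_ms,                      # sysuptime at first packet
        last_ms,                       # sysuptime at last packet
        sport, dport,                  # transport ports
        0,                             # pad1
        0,                             # cumulative TCP flags
        proto,                         # IP protocol, e.g. 6 = TCP
        0,                             # type of service
        0, 0,                          # source / destination AS
        0, 0,                          # source / destination prefix masks
        0,                             # pad2
    )
```

NetFlow v9 (the template-based format from which IPFIX was later derived) replaces this fixed layout with self-describing templates, which is what allows richer per-flow fields.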

Closed Flow Reporting System

  • Created "geo.bin", a compiled list of by-ASN and by IP geolocations.
  • lqosd will download a refreshed geo.bin periodically.
  • Closed flows are mapped to an ASN, giving per-ASN performance reports.
  • Closed flows are mapped to a geolocation, giving geographic performance reports.
  • Closed flows are mapped to ethertype and protocol.
  • User interface expanded in lqos_node_manager to display all of this.
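Mapping a closed flow to an ASN amounts to a range lookup over a compiled table - the kind of query a database like geo.bin enables. Here is a generic binary-search sketch (the class name, data layout, and sample ranges are illustrative, not geo.bin's actual format):

```python
import bisect
from ipaddress import ip_address


class AsnTable:
    """Map an IP address to an ASN via binary search over sorted,
    non-overlapping (first_ip, last_ip, asn) ranges."""

    def __init__(self, ranges):
        self.ranges = sorted(
            (int(ip_address(first)), int(ip_address(last)), asn)
            for first, last, asn in ranges
        )
        self.starts = [r[0] for r in self.ranges]   # keys for bisect

    def lookup(self, ip):
        n = int(ip_address(ip))
        # Candidate range: the last one whose start is <= n
        i = bisect.bisect_right(self.starts, n) - 1
        if i >= 0 and self.ranges[i][0] <= n <= self.ranges[i][1]:
            return self.ranges[i][2]
        return None   # IP not covered by any known range
```

With the table sorted once at load time, each closed-flow lookup is O(log n), cheap enough to run on every flow closure.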

Preflight checks for lqosd

  • Prior to startup, common configuration and hardware support issues are checked.
  • Single-queue NICs now get a proper error message.
  • If the user tries to run both a Linux bridge and an XDP bridge on the same interface pair,
    the XDP bridge is disabled and a warning emitted.
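A preflight pass like the one described above boils down to a list of checks that either block startup or downgrade a feature with a warning. The sketch below is illustrative only - the function name, inputs, and exact rules are assumptions, not lqosd's real checks:

```python
def preflight(nic_queues, linux_bridge_ifaces, xdp_bridge_ifaces):
    """Run startup sanity checks.

    nic_queues:          dict of interface name -> number of RX/TX queues
    *_bridge_ifaces:     interfaces claimed by each bridge type
    returns:             (fatal errors, non-fatal warnings)
    """
    errors, warnings = [], []

    # Hard failure: XDP-based shaping needs multi-queue NICs
    for nic, queues in nic_queues.items():
        if queues < 2:
            errors.append(
                f"{nic}: single-queue NIC detected; LibreQoS requires "
                f"multiple RX/TX queues"
            )

    # Soft failure: both bridge types on the same pair -> disable XDP bridge
    overlap = set(linux_bridge_ifaces) & set(xdp_bridge_ifaces)
    if overlap:
        warnings.append(
            "Linux bridge and XDP bridge both configured on "
            + ", ".join(sorted(overlap))
            + "; disabling the XDP bridge"
        )

    return errors, warnings
```

Running these checks before touching the NICs means a misconfigured box fails fast with a readable message instead of a kernel-level error.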

XDP "Hot Cache"

  • Much CPU time was spent running a longest-prefix match check on every ISP-facing IP address.
  • Added a least-recently-used cache that matches IP addresses to circuits with a much less
    expensive fast lookup.
  • Added a "negative cache" entry to speed up "this IP still isn't mapped" answers.
  • Added cache-invalidation code to handle the IP mappings changing.
  • This resulted in a 20-30% CPU usage reduction under heavy load.
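The three pieces above - LRU eviction, negative entries, and invalidation - fit together as follows. This is a userspace model of the idea, not the eBPF implementation; `HotCache` and its parameters are made-up names.

```python
from collections import OrderedDict

NEGATIVE = object()   # sentinel: "we already know this IP is unmapped"


class HotCache:
    """LRU cache from IP -> circuit id in front of an expensive
    longest-prefix-match lookup. (Sketch of the XDP hot-cache idea.)"""

    def __init__(self, capacity, lpm_lookup):
        self.capacity = capacity
        self.lpm_lookup = lpm_lookup   # the slow longest-prefix match
        self.cache = OrderedDict()     # insertion order tracks recency

    def get(self, ip):
        if ip in self.cache:
            self.cache.move_to_end(ip)            # refresh LRU position
            hit = self.cache[ip]
            return None if hit is NEGATIVE else hit
        circuit = self.lpm_lookup(ip)             # slow path, cache result
        self.cache[ip] = circuit if circuit is not None else NEGATIVE
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)        # evict least-recently-used
        return circuit

    def invalidate(self):
        self.cache.clear()   # IP-to-circuit mappings changed; start over
```

Caching the negative result is what makes unmapped traffic cheap: without it, every packet from an unknown IP would pay the full longest-prefix-match cost again.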

Config UI

  • lqos_node_manager is now aware of the entire configuration system.
  • All configuration items may be edited.
  • ShapedDevices.csv can be edited from the web UI.
  • network.json can be edited from the web UI.
  • Heavy validation, ensuring that devices have matching network.json entries, IPs aren't duplicated, etc.
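Two of the validation rules named above - every device's parent node must exist in network.json, and no IP may be assigned twice - can be sketched as a single pass over ShapedDevices. The field names (`circuit_id`, `parent_node`, `ips`) are assumptions for illustration, not the exact CSV columns:

```python
def validate_shaped_devices(devices, network_nodes):
    """Check devices against the network tree and each other.

    devices:        list of dicts with circuit_id, parent_node, ips
    network_nodes:  set of node names present in network.json
    returns:        list of human-readable problem descriptions
    """
    problems = []
    seen_ips = {}   # ip -> circuit that first claimed it

    for dev in devices:
        # Rule 1: the parent node must exist in network.json
        if dev["parent_node"] not in network_nodes:
            problems.append(
                f"{dev['circuit_id']}: parent node "
                f"'{dev['parent_node']}' not found in network.json"
            )
        # Rule 2: no IP may appear on more than one circuit
        for ip in dev["ips"]:
            if ip in seen_ips:
                problems.append(
                    f"duplicate IP {ip} on {dev['circuit_id']} "
                    f"and {seen_ips[ip]}"
                )
            else:
                seen_ips[ip] = dev["circuit_id"]

    return problems
```

Surfacing every problem in one pass (rather than failing on the first) is what makes a web-UI editing workflow practical.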

LQTop

  • New lqtop CLI tool with much prettier text UI and support for flows.

UISP Integration 2

  • An all-new, massively faster UISP Integration system.
  • Includes much better network map traversal.

Support Tools

  • CLI tool for running a "sanity check" on common issues.
  • Gather configuration into a bundle for sending.
  • View the bundle.
  • Submit the bundle to LibreQoS for analysis.
  • A web UI (lqos_node_manager) version of the same thing, using shared code.

Misc

  • Improvements and fixes to all integrations, especially Splynx.
  • Update back-end code to latest versions.

Our lines-of-code counts are approximately:

| Language | Lines of code |
| -------- | ------------- |
| Rust     | 18,251        |
| Python   | 5,859         |

Herbert adds: if you're running the 6.x kernel line, XDP metadata kicks in and you get another decent performance boost - but LibreQoS still works with older kernels (just without the boost). And with all of these improvements stacked, we've cracked the 10 Gbit/s single-flow barrier on Payne!

Beta-2 (or 3) will contain an updated user interface (UI2). The final release of v1.5 is presently targeted for August 2024.