feat(bbr3) Implemented bbr3's latest version based on the RFC #2481
Conversation
Thanks for working on this! I'm excited to have this feature, and those performance numbers look promising. It's hard to read the vertical scales, but I guess the second chart is the new path? The pacing rate units don't look quite right. What conditions was this tested in? It'd be interesting to contrast e.g. loopback, LANs, wifi, cellular, long fat networks, satellite networks, etc. We don't need all of that to merge this, but it would help inform what our guidance to users on its use should be, and might motivate future changes in default configuration.

Is there a strong reason for us to keep the old BBR code around? It hasn't received any attention in years, so unless it serves a very strong purpose I think it would be better for this change to replace it outright, perhaps by just deleting it in a final commit.

As this is a large body of algorithmically nuanced code, review will be difficult. To start with, can you reorganize the PR as a series of incremental, linear commits? For example, please remove merge commits and squash together WIP, incomplete, or later-revised changes, and separate changes to existing Quinn interfaces from the introduction of new logic.
This was tested locally with simulated network latency (between a client and a server from the examples provided in quinn): 100 ms ping with 20 ms jitter. I'll edit the history here soon, change the units for pacing rate, and post more graphs under different conditions :)
Alright, I have cleaned up the commit history and adjusted the units for pacing_rate. I'll get to doing some more tests under different circumstances now.
I'm not sure of the best way to share the graphs with you; I tried to get an average sample with no outliers (positive or negative) for Cubic vs BBR3.
Ralith left a comment:
Thanks for putting this together! The commit organization is great.
Because this is marked as experimental and replaces obsolete and equally experimental code, I'm not inclined to block this on an in-depth review of the actual BBR logic, which would otherwise take us much longer. That said, some documentation around it would dramatically improve our ability to notice implementation errors, and make it much easier for yourself or others to update and debug it in the future. Appropriate documentation might heavily cite or even excerpt the IETF draft.
quinn-proto/src/connection/mod.rs (outdated):

```rust
        }
        self.path
            .congestion
            .on_packet_acked(now, info.time_sent, info.size, pn, &self.path.rtt);
```
It looks like we're calling this at almost exactly the same times as on_ack. Does it need to be a separate function?
So for this, I originally used on_ack, but it was definitely missing some of the packets that were registered on send within BBR. I believe that's because we check for ack_eliciting packets before calling on_ack on the congestion controller. I didn't want to break or modify the implementations present within the other congestion algorithms, so I decided to add a separate event here.
Are you certain that BBR should be tracking non-ACK-eliciting packets at all? I suspect it shouldn't. There's a reason the existing controllers ignore them: the peer is, as the name suggests, not obliged to respond to such packets on any particular schedule, so we can't reliably infer anything from the timing of their acknowledgements.
The real issue here might be that we're incorrectly calling on_sent on non-ACK-eliciting packets. If that makes sense, then I'd rather correct that than change or substantially duplicate the on_ack behavior.
If(!) there is a clear need for non-ACK-eliciting packets to be tracked, then I'd prefer to add an ack_eliciting: bool argument to both on_sent and on_ack rather than adding more methods, but I suspect simply ignoring such packets for congestion control purposes is most correct, and simpler for controllers besides.
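For illustration, a minimal sketch of what that suggestion might look like (hypothetical signatures, not quinn-proto's actual `Controller` trait; the `RttEstimator` import path is assumed):

```rust
use std::time::Instant;

use quinn_proto::RttEstimator; // assumed export; the actual path may differ

/// Hypothetical congestion controller hooks with an `ack_eliciting` flag
/// threaded through, instead of a separate `on_packet_acked` event.
pub trait Controller {
    /// Called when a packet is sent. A controller that only cares about
    /// ACK-eliciting packets can ignore calls where `ack_eliciting` is false.
    fn on_sent(&mut self, now: Instant, bytes: u64, ack_eliciting: bool);

    /// Called when a packet is acknowledged. The timing of acknowledgements
    /// for non-ACK-eliciting packets carries no reliable signal, so most
    /// controllers would simply skip those samples.
    fn on_ack(
        &mut self,
        now: Instant,
        sent: Instant,
        bytes: u64,
        ack_eliciting: bool,
        rtt: &RttEstimator,
    );
}
```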
I just noticed some inconsistencies in the value of the pacing gain for the draining state in the IETF draft; I'll update its value soon with additional justification.
Please rebase and squash all your commits on your next push. |
(don't literally squash them all, just assemble into a logical progression like before) |
I introduced a bug with the latest changes, so I will first find it and fix it, and then clean up the commits.

Fixed the bug. There were a couple of errors in the calculations, but the main issue was with how inflight was set: ack_eliciting is a number of packets, not an amount of bytes, which resulted in underestimating inflight by a lot.
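For illustration, a toy sketch of the accounting bug described above (`SentPacket` and its fields are illustrative, not quinn-proto's actual types):

```rust
/// Illustrative stand-in for per-packet bookkeeping.
struct SentPacket {
    size: u64,           // bytes on the wire
    ack_eliciting: bool, // whether the packet elicits an ACK
}

fn bytes_in_flight(sent: &[SentPacket]) -> u64 {
    // Buggy version: counts ack-eliciting *packets*, badly underestimating
    // inflight whenever packets are larger than one byte:
    //     sent.iter().filter(|p| p.ack_eliciting).count() as u64
    //
    // Fixed version: sums the *bytes* of ack-eliciting packets.
    sent.iter().filter(|p| p.ack_eliciting).map(|p| p.size).sum()
}
```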
Alright cleaned up the commits :) |
(force-pushed from c716f82 to ae39099)
```rust
    /// by definition, and it samples the most recent one. So we restart fresh on
    /// every new max and overwrites 2nd & 3rd choices. The same property
    /// holds for 2nd & 3rd best.
    ///
```
nit: stray empty line
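For context, the doc comment describes the windowed max filter from Kathleen Nichols' min/max algorithm, which the BBR drafts reference. A condensed sketch of the property it states (names are illustrative, not the PR's implementation; expiry and promotion of stale estimates are omitted):

```rust
#[derive(Clone, Copy)]
struct Sample {
    value: u64, // e.g. a delivery-rate sample
    time: u64,  // e.g. a round-trip counter
}

/// Tracks the best, 2nd-best, and 3rd-best samples seen within a window.
struct MaxFilter {
    window: u64,
    estimates: [Sample; 3],
}

impl MaxFilter {
    fn update(&mut self, value: u64, time: u64) {
        let sample = Sample { value, time };
        if value >= self.estimates[0].value
            || time.saturating_sub(self.estimates[2].time) > self.window
        {
            // A new overall max (or a fully expired window) supersedes
            // everything: restart fresh, overwriting the 2nd & 3rd choices.
            self.estimates = [sample; 3];
        } else if value >= self.estimates[1].value {
            // The same property holds for 2nd best: it supersedes 3rd.
            self.estimates[1] = sample;
            self.estimates[2] = sample;
        } else if value >= self.estimates[2].value {
            self.estimates[2] = sample;
        }
    }

    fn get(&self) -> u64 {
        self.estimates[0].value
    }
}
```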
Commits:
…ry for some delay based algorithms such as BBRv3
… adjust the burst size and the overall pacing rate of the connection
…rom BBR3 struct, fixed data type for cycle_count




Solves #1254.
Based on https://www.ietf.org/archive/id/draft-ietf-ccwg-bbr-04.txt
Tested with `sudo tc qdisc add dev lo root netem delay 50ms 10ms distribution normal` and a big file over a single connection: BBR3 vs Cubic (the default) for quinn.
1206 KiB/s for BBR3 vs 1049 KiB/s for Cubic, approximately 15% faster transfer (this is with a big file, so over a long period of time).
Let me know if you need any additional information or if you want me to change the implementation in some way.
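For anyone who wants to try this out: selecting a non-default congestion controller in quinn goes through `TransportConfig::congestion_controller_factory`. A sketch using the existing `BbrConfig` (the config type added by this PR may be named differently, e.g. a hypothetical `Bbr3Config`):

```rust
use std::sync::Arc;

use quinn::congestion::BbrConfig;
use quinn::TransportConfig;

fn transport_with_bbr() -> TransportConfig {
    let mut config = TransportConfig::default();
    // Swap Cubic (the default) for BBR; a BBR3 config from this PR would
    // presumably slot in the same way.
    config.congestion_controller_factory(Arc::new(BbrConfig::default()));
    config
}
```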