Skip to content

Conversation

@afrind
Copy link
Collaborator

@afrind afrind commented May 8, 2025

Fixes: #548
Fixes: #549

@afrind afrind added Wire Format Related to how messages are serialized and parsed Design Issues or PRs that change how MoQ works including the wire format. labels May 8, 2025

### Variable-Length Integers

RFC9000 variable-length integers use a two-bit length prefix in the first byte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why you put this text here but it seems superfluous to talk about something that is not suitable for MoQT. Consider simply stating MoQT's requirements or abilities and then state how the encoding supports thst on the wire.

@LPardue
Copy link
Contributor

LPardue commented May 8, 2025

For a blanket change, I think you'll need to sanity check all uses. For instance, in SUBSCRIBE_DONE the Stream Count field would allow expressing a larger stream count than the underlying transport can support . That's easily fixable by stating the logical limit of the field and what to do if you receive a bogus value, but it's an example of a new interop issue this change introduces.

afrind and others added 3 commits May 9, 2025 10:24
Since it's only peer uni streams, the max count is 2^60
Co-authored-by: Suhas Nandakumar <[email protected]>

This document redfines the following RFC9000 syntax:

x (i):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change (i) to avoid confusion with QUIC?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's a real issue to be confused with QUIC, since we define the encoding in this document, and have this comment pointing to the definition.

However, I'm open to suggestions. x (v) for varint? I like x (ï)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also fine with x(i) as it is redefinition. If not, x(V) seems fine to me

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted somewhere already but I can't find it, so I want to repeat it here. I hate the idea of redefinition, it provides no value and only a source of eternal confusion. Please just pick a different identifier like mi or whatever.

In isolation you think you have no issues. When I want to e.g. draw a diagram that renders how a MoQ control message is sent/received on an HTTP/3 stream I'm going to want to draw something like

STREAM Frame {
  Type (i) = 0x08..0x0f,
  Stream ID (i),
  [Offset (i)],
  [Length (i)],
  GOAWAY Message {
    Type (i) = 0x10,
    Length (16),
    New Session URI Length (i),
    New Session URI (..),
  }

see the ambiguity?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on trying to keep the syntax so it does not clash with quic

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm sold by @LPardue's example.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 ... how about we define x(V64) to be very explicit about the intent of the type ?

@LPardue @afrind @fluffy ??

|--------------|----------------|-------------|---------------|
| 1110 | 8 | 60 | 0-2^60-1 |
|--------------|----------------|-------------|---------------|
| 1111XXXX | 9 | 64 | 0-2^64-1 |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what this actually means. What are the XXXX?

Do we expect to revisit this scheme later? If the goal is to make "the range of 1 byte values is as large as possible", it would make more sense to just allocate 0x00~0xf8 values for those, but then we basically don't have two-byte varints.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what this actually means. What are the XXXX?

The 9 byte encoding uses 4 bits for length with 4 reserved bits.

Do we expect to revisit this scheme later?

No. If we make the switch let's go from QUIC to MoQ varints once and never look back. Now is the time to propose new schemes.

allocate 0x00~0xf8 values for those, but then we basically don't have two-byte varints.

To clarify, you if the first byte is > 0xf8 the length is value - 0xf8 + 1? Technically you have 2 byte varints but they only encode 7 values that can't be encoded in 1 byte. These things are all about tradeoffs and I'm agnostic if the group prefers 248 1-byte values, at the expense of more bytes for larger values. It might be bad for shortish payload lengths (eg: 300-1200 bytes, though datagrams don't have a length field).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "1111XXXX" should be "11110000" and then we should have
"111110001 to 11111111" as RESERVED

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fluffy sounds good to me. Would you have it be an error to receive 11110001-11111111? I don't want to check those bits in my decoder.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care if the receiver never checks them, I just care that senders set them all zero and that future extensions can negotiate using them.

That said, it really does not seem a big deal to check and it's going to be easier to debug if you get a great reason phrase explaining what is going on when they are not valid.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled up short of making 11110001-11111111 an error and backtracked on reserved, just saying ignored/SHOULD. Since it's a 64 bit value, there's no use for these bits. My thinking was that I didn't want any way processing the first byte could fail - but then the coffee kicked in a bit more and I realized decoding can always fail on some subsequent byte (running out of input) so I don't know that this is a huge advantage.

I'm open to doing whatever here, but want a broader signal from the wg. We should also discuss if we like Victor's idea better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still want the change to 1111XXXX to 11110000 and leave the other values as reserved for future extensions.

Copy link
Collaborator

@ianswett ianswett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LG, a few comments. I think reusing 'i' from QUIC as a varint could be very confusing.


This document redfines the following RFC9000 syntax:

x (i):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's a real issue to be confused with QUIC, since we define the encoding in this document, and have this comment pointing to the definition.

However, I'm open to suggestions. x (v) for varint? I like x (ï)

|--------------|----------------|-------------|---------------|
| 1110 | 8 | 60 | 0-2^60-1 |
|--------------|----------------|-------------|---------------|
| 1111XXXX | 9 | 64 | 0-2^64-1 |
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what this actually means. What are the XXXX?

The 9 byte encoding uses 4 bits for length with 4 reserved bits.

Do we expect to revisit this scheme later?

No. If we make the switch let's go from QUIC to MoQ varints once and never look back. Now is the time to propose new schemes.

allocate 0x00~0xf8 values for those, but then we basically don't have two-byte varints.

To clarify, you if the first byte is > 0xf8 the length is value - 0xf8 + 1? Technically you have 2 byte varints but they only encode 7 values that can't be encoded in 1 byte. These things are all about tradeoffs and I'm agnostic if the group prefers 248 1-byte values, at the expense of more bytes for larger values. It might be bad for shortish payload lengths (eg: 300-1200 bytes, though datagrams don't have a length field).

Copy link
Collaborator

@ianswett ianswett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Comment Deleted, Lucas convinced me that we should not use (i) *

Copy link
Collaborator

@fluffy fluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments worth considering but the actual core part that defined what we are doing here looks great.


This document redfines the following RFC9000 syntax:

x (i):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on trying to keep the syntax so it does not clash with quic

|--------------|----------------|-------------|---------------|
| 1110 | 8 | 60 | 0-2^60-1 |
|--------------|----------------|-------------|---------------|
| 1111XXXX | 9 | 64 | 0-2^64-1 |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "1111XXXX" should be "11110000" and then we should have
"111110001 to 11111111" as RESERVED

--- back

# Sample Variable-Length Integer Encoding and Decoding {#sample-varint}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm against adding example code like this to the draft - it just makes the draft longer and tends to accumulate errors over time. It gets endless review comments on better ways to write it. People can figure out how to write this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mirroring RFC 9000 here, but I don't have a strong opinion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to think it's somewhat useful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in an appendix, correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this in an appendix

@afrind afrind added the Needs Discussion Tags for issues which need discussion, ideally at an interim or IETF meeting. label May 31, 2025
Copy link
Collaborator

@fluffy fluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this still needs some small changes but agree with big picture

The WriteVarint function takes two parameters, a 64 bit integer and an
output buffer with an append operation.

~~~pseudocode
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im very against putting code like this in this draft. I'm fine with another draft about implementation advice but we need to keep this draft as short as possible.

|--------------|----------------|-------------|---------------|
| 1110 | 8 | 60 | 0-2^60-1 |
|--------------|----------------|-------------|---------------|
| 1111XXXX | 9 | 64 | 0-2^64-1 |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still want the change to 1111XXXX to 11110000 and leave the other values as reserved for future extensions.

@fluffy
Copy link
Collaborator

fluffy commented Sep 9, 2025

any update on this one ?

@afrind
Copy link
Collaborator Author

afrind commented Sep 9, 2025

any update on this one ?

  1. We can't land this change until ALPN version negotiation (Always use ALPN for version negotiation #499) lands, otherwise this would break version negotiation. That in turn is blocked on Chrome support for WT-AvailableProtocols, which has just reached Chrome canary. I expect we can resolve this blocker by Montreal. Hopefully FireFox and Safari will quickly follow, but I don't think we should wait for them.

  2. There are still some issues to discuss here. First, we've not had a discussion of Victor's alternate design mentioned above or a similar one from MT on the linked issue. Second, there's the other outstanding feedback here - what to do about the trailing 4 bits when using 64-bit varints, and whether we should include pseudocode. We can add time to resolve these in Toronto.

@afrind
Copy link
Collaborator Author

afrind commented Sep 25, 2025

Discussed in Toronto:

  • No one felt strongly one way or the other about having pseudocode in the doc. I'm happy to remove it for now.
  • We should require the tailing bits of 1111XXXX to be 0000. This will allow future extensibility.
  • Not a clear signal about whether we should pursue an alternate encoding. There wasn't a specific counterproposal that was easy to digest, and we also lack production data about histograms of the values that matter.
  • At least one person felt we shouldn't be inventing a new encoding, but should use some existing encoding, although to our knowledge there is not an existing varint encoding with the tradeoffs and properties we want.

@afrind
Copy link
Collaborator Author

afrind commented Sep 25, 2025

I believe the counterproposal is:

250 1 byte values, 256 2 byte values, 2^16 3 byte values, etc.

I think this is a worse tradeoff than this PR:

128 1 byte values, 2^14 2 byte values.

For, say, datagram Object IDs in a group where the group is 30s long and there are 50 objects/second, we will more values between 250 and 2^14 than between between 128 and 256.

@afrind afrind removed the Needs Discussion Tags for issues which need discussion, ideally at an interim or IETF meeting. label Oct 1, 2025
@ianswett ianswett added the Merge Ready Has the requisite number of approvals; all comments addressed. label Nov 8, 2025
@afrind afrind removed the Merge Ready Has the requisite number of approvals; all comments addressed. label Nov 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Design Issues or PRs that change how MoQ works including the wire format. Wire Format Related to how messages are serialized and parsed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Better compression of VarInts Support 64 bit Var Int values

8 participants