69 changes: 46 additions & 23 deletions draft-ietf-moq-transport.md
@@ -2203,6 +2203,7 @@ REQUEST_ERROR Message {
Length (16),
Request ID (i),
Error Code (i),
Retry Interval (i),
Error Reason (Reason Phrase),
}
~~~
@@ -2212,66 +2213,81 @@ REQUEST_ERROR Message {

* Error Code: Identifies an integer error code for request failure.

* Retry Interval: The minimum time (in seconds) before the request SHOULD be
sent again, plus one. If the value is 0, the request SHOULD NOT be retried.
Collaborator

Just as something to think about ... if you use seconds instead of ms, it makes it hard for the server to pace multiple clients, but I can see how either would work.


* Error Reason: Provides a text description of the request error. See
{{reason-phrase}}.
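
The "plus one" encoding of Retry Interval is easy to get off by one, so a small sketch may help. This assumes the semantics proposed in this diff: 0 means do not retry, 1 means retry immediately, and any other value N means wait at least N - 1 seconds. The function names are hypothetical and not part of the draft.

```python
from typing import Optional


def retry_delay_seconds(retry_interval: int) -> Optional[int]:
    """Decode the wire value into a minimum delay in seconds.

    Returns None when the request should not be retried (wire value 0).
    Hypothetical helper; the name is not defined by the draft.
    """
    if retry_interval < 0:
        raise ValueError("Retry Interval is an unsigned varint")
    if retry_interval == 0:
        return None  # 0 means "SHOULD NOT be retried"
    return retry_interval - 1  # wire value is the delay plus one


def encode_retry_interval(delay_seconds: Optional[int]) -> int:
    """Encode a minimum delay (or None for 'do not retry') as the wire value."""
    if delay_seconds is None:
        return 0
    if delay_seconds < 0:
        raise ValueError("delay must be non-negative")
    return delay_seconds + 1
```

For example, a wire value of 31 decodes to a 30 second minimum delay, and a wire value of 1 decodes to a delay of 0 seconds.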

The application SHOULD use a relevant error code in REQUEST_ERROR,
as defined below. Most codepoints have identical meanings for various request
types, but some have request-specific meanings.

INTERNAL_ERROR (0x0):
: An implementation specific or generic error occurred.

UNAUTHORIZED (0x1):
as defined below and assigned in {{iana-request-error}}. Most codepoints have
identical meanings for various request types, but some have request-specific
meanings.

If a request is retryable with the same parameters at a later time, the sender of
REQUEST_ERROR includes a non-zero Retry Interval in the message. If it is
sending more than one such message within a second or so across one or more
sessions, it SHOULD apply randomization to each retry interval so that retries
are spread out over time, minimizing the risk of synchronized retry storms. A
Retry Interval value of 1 indicates the request can be retried immediately.
Collaborator

I think it needs to do randomization even if it is not doing multiple requests. A common problem is Start of Service attacks. With IP phones we regularly see cities of over 10 million people lose power, then all restart at the same time, often all with a set-top box of some sort from the same vendor. This forms a Start of Service attack, where the normal mitigation is for the server to be able to respond with retry after, and it is more robust if the resulting load can be spread out.

My suggestion would be to make this very specific: the minimum time before the retry is a number between the value returned and 125% of the value returned, chosen with a uniform random distribution over that range. Sure, some clients won't bother, but the ones that cause SoS attacks will.
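
To make the two randomization options in this thread concrete, here is a rough sketch of both. It assumes the "delay plus one" wire encoding from this diff and the 100% to 125% uniform window suggested in the comment above. The function names, the `spread` parameter, and the `rng` parameter are hypothetical, and neither behavior is specified by the draft as written.

```python
import random
from typing import Optional


def jittered_retry_interval(base_delay_s: int, spread: float = 0.25, rng=random) -> int:
    """Server side: choose a wire Retry Interval for one REQUEST_ERROR.

    The delay is drawn uniformly from [base, base * (1 + spread)] whole
    seconds so that many rejected clients do not all retry together.
    """
    upper = int(base_delay_s * (1 + spread))
    delay = rng.randint(base_delay_s, upper)  # inclusive on both ends
    return delay + 1  # wire value is the delay plus one


def client_wait_seconds(retry_interval: int, rng=random) -> Optional[float]:
    """Client side: wait between 100% and 125% of the signalled minimum.

    Returns None when the wire value is 0 (do not retry).
    """
    if retry_interval == 0:
        return None
    minimum = retry_interval - 1
    return rng.uniform(minimum, minimum * 1.25)
```

With whole-second granularity the server-side spread is coarse for small delays (a 2 second base has no room to vary at 25%), which is one way to read the earlier seconds-versus-milliseconds comment.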


INTERNAL_ERROR:
: An implementation specific or generic error occurred. This might be retryable
or not, depending on the implementation conditions that caused the error.

UNAUTHORIZED:
: The subscriber is not authorized to perform the requested action on the given
track.
track. This might be retryable if the authorization token is not yet valid.

TIMEOUT (0x2):
TIMEOUT:
: The subscription could not be completed before an implementation specific
timeout. For example, a relay could not establish an upstream subscription
within the timeout.

NOT_SUPPORTED (0x3):
NOT_SUPPORTED:
: The endpoint does not support the type of request.

MALFORMED_AUTH_TOKEN (0x4):
MALFORMED_AUTH_TOKEN:
: Invalid Auth Token serialization during registration (see
{{authorization-token}}).

EXPIRED_AUTH_TOKEN (0x5):
EXPIRED_AUTH_TOKEN:
: Authorization token has expired ({{authorization-token}}).

Below are errors for use by the publisher. They can appear in response to
SUBSCRIBE, FETCH, TRACK_STATUS, and SUBSCRIBE_NAMESPACE, unless otherwise noted.

DOES_NOT_EXIST (0x10):
: The track or namespace is not available at the publisher.
DOES_NOT_EXIST:
: The track or namespace is not available at the publisher. This might be
retryable or not, if the target might exist later.

INVALID_RANGE (0x11):
INVALID_RANGE:
: In response to SUBSCRIBE or FETCH, specified Filter or range of Locations
cannot be satisfied.
cannot be satisfied. This might be retryable if the range is expected to have
objects in the future.

MALFORMED_TRACK (0x12):
: In response to a FETCH, a relay publisher detected that the track was
MALFORMED_TRACK:
: In response to a FETCH, a relay publisher detected the track was
malformed (see {{malformed-tracks}}).

The following are errors for use by the subscriber. They can appear in response
to PUBLISH or PUBLISH_NAMESPACE, unless otherwise noted.

UNINTERESTED (0x20):
: The subscriber is not interested in the track or namespace.
UNINTERESTED:
: The subscriber is not interested in the track or namespace. This might be
retryable if it expects to be interested later.

Errors below can only be used in response to one message type.

PREFIX_OVERLAP (0x30):
PREFIX_OVERLAP:
: In response to SUBSCRIBE_NAMESPACE, the namespace prefix overlaps with another
SUBSCRIBE_NAMESPACE in the same session.

INVALID_JOINING_REQUEST_ID(0x32):
INVALID_JOINING_REQUEST_ID:
: In response to a Joining FETCH, the referenced Request ID is not an
`Established` Subscription.

UNKNOWN_STATUS_IN_RANGE(0x33):
UNKNOWN_STATUS_IN_RANGE:
: In response to a FETCH, the requested range contains an object with unknown
status.

@@ -3924,6 +3940,13 @@ TODO: register the URI scheme and the ALPN and grease the Extension types
| INVALID_JOINING_REQUEST_ID | 0x32 | {{message-request-error}} |
| UNKNOWN_STATUS_IN_RANGE | 0x33 | {{message-request-error}} |

The range of error codes including 0x100 to 0xffff is reserved for
Collaborator

I'm not keen on implementation-specific error codes. I think that contributes to non-interoperable implementations. I'd like to have some discussion on this.

implementation-specific codes.

The range of error codes starting with 0x10000 is reserved for
Collaborator

Can we write this up in the normal way we would for IANA? There should be a way to get IANA to pre-allocate a code.

Collaborator

I don't see any reason we need to invent new IANA processes beyond what is in RFC 8126.

provisional error codes that are under consideration for a permanent
code point by the IETF.
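
As an illustration of how a receiver might sort error codes under the ranges proposed here, a minimal sketch follows. It assumes the codepoint values shown elsewhere in this diff and treats anything below 0x100 that is not listed as unassigned. The function name and the returned labels are hypothetical, and the range split is still under discussion in this review.

```python
# Codepoints as they appear in this diff; later revisions may differ.
REGISTERED_REQUEST_ERRORS = {
    0x0: "INTERNAL_ERROR",
    0x1: "UNAUTHORIZED",
    0x2: "TIMEOUT",
    0x3: "NOT_SUPPORTED",
    0x4: "MALFORMED_AUTH_TOKEN",
    0x5: "EXPIRED_AUTH_TOKEN",
    0x10: "DOES_NOT_EXIST",
    0x11: "INVALID_RANGE",
    0x12: "MALFORMED_TRACK",
    0x20: "UNINTERESTED",
    0x30: "PREFIX_OVERLAP",
    0x32: "INVALID_JOINING_REQUEST_ID",
    0x33: "UNKNOWN_STATUS_IN_RANGE",
}


def classify_error_code(code: int) -> str:
    """Return a registered name or the range label for a REQUEST_ERROR code."""
    if code < 0:
        raise ValueError("error codes are unsigned varints")
    if code >= 0x10000:
        return "provisional"  # under consideration for a permanent code point
    if code >= 0x100:
        return "implementation-specific"  # 0x100 through 0xffff inclusive
    return REGISTERED_REQUEST_ERRORS.get(code, "unassigned")
```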

### PUBLISH_DONE Codes {#iana-publish-done}

| Name | Code | Specification |