Collaborator

fluffy commented Nov 4, 2025

@afrind - have a look at this and rewrite it if you want. I think some of it could just be deleted, as it is more motivational than needed in a spec, but I added it with the view that we can delete it while reviewing the PR.

Thanks

Collaborator Author

fluffy commented Nov 4, 2025

Fixes #1234

Collaborator

afrind left a comment

This seems directionally right. We can probably bikeshed on . and - but I could live with them I think.

between them followed by the track name with a minus between the last
namespace and track name.

* Bytes in the range a-z, A-Z, and 0-9 are output as is while bytes

Collaborator

This seems overly restrictive, as there are plenty of printable ASCII chars. I noticed @suhasHere uses full track names like this (/ is inserted as a tuple separator in my log):

000001/app=01/conf=000003/media=C1[h264,width=1920,height=1080,fps=30,br=2000]/endpoint=0016/

Is there a reason to forbid symbols like ',', '=', '[', and ']'? The less escaping, the easier the logs will be on the eyes.

I do think it's imperative we escape the tuple and namespace/name separators that appear in the names.
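
A minimal sketch of that approach, assuming the '.' tuple separator and '-' name separator from the PR text and %xx escaping. The function names are hypothetical, and treating space as a byte to escape is an assumption rather than something settled here:

```python
TUPLE_SEP = ord(".")  # between namespace tuples (from the PR text)
NAME_SEP = ord("-")   # between the last tuple and the track name (from the PR text)
ESCAPE = ord("%")     # the escape character itself must also be escaped


def escape_field(field: bytes) -> str:
    """Escape only separators, '%', and bytes outside printable ASCII."""
    out = []
    for b in field:
        printable = 0x21 <= b <= 0x7E  # printable ASCII, space excluded (assumption)
        if printable and b not in (TUPLE_SEP, NAME_SEP, ESCAPE):
            out.append(chr(b))
        else:
            out.append(f"%{b:02x}")
    return "".join(out)


def format_full_track_name(namespace: list[bytes], name: bytes) -> str:
    """Join escaped tuples with '.', then append '-' and the escaped name."""
    return ".".join(escape_field(t) for t in namespace) + "-" + escape_field(name)
```

With this, symbols such as ',', '=', '[', and ']' pass through untouched, while a '.' or '-' inside a field becomes %2e or %2d.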

Collaborator

We can change to whatever the draft tells us to :-) That might be a good reason for this PR.

Collaborator

AI says:
"The ASCII characters considered safe to print to logs are the printable graphic characters, which are decimal codes 32 through 126. These include:
Space (decimal 32)
Alphanumeric characters (a-z, A-Z, 0-9)
Punctuation and symbols (!, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, @, [, \, ], ^, _, `, {, |, }, ~)"
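
The range quoted above can be checked with a few lines of Python. This is only a sanity check of codes 32 through 126, not a proposal for the allowed set:

```python
import string

# Codes 32..126 inclusive: space plus the 94 visible ASCII characters.
printable = bytes(range(32, 127)).decode("ascii")

# Everything in that range that is neither alphanumeric nor space.
punctuation = [c for c in printable if not c.isalnum() and c != " "]

print(len(printable))    # number of characters in the range
print(len(punctuation))  # number of punctuation and symbol characters
print(set(punctuation) == set(string.punctuation))
```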

Collaborator

Let's escape whitespace.

Collaborator Author

Keep in mind that if we want to align with the files draft, which I think is a good idea, then we need to restrict ourselves to the file-safe set. This uses '.', which is not always file safe, but it should not produce a name that starts with '.'.

Collaborator

I probably need to go read the files draft, and think about how it uses full-track names encoded as strings. Note that if your first namespace tuple is empty (0 length, which we don't prohibit, I think?), you will get one starting with a dot.
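
To illustrate, a small sketch assuming the '.' and '-' separators from the PR text and fields that need no escaping; join_full_track_name is a hypothetical helper:

```python
def join_full_track_name(namespace: list[bytes], name: bytes) -> str:
    """Join already-safe fields; escaping is left out to keep the example short."""
    tuples = ".".join(t.decode("ascii") for t in namespace)
    return tuples + "-" + name.decode("ascii")


# A zero-length first tuple contributes nothing before the first separator.
leading = join_full_track_name([b"", b"conf", b"media"], b"video")
print(leading)
```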

afrind added the "Editorial & Minor Design" label (for PRs that are primarily editorial with a small, non-wire breaking design change) Nov 5, 2025
ianswett changed the title from "Start text on formmating names for logs" to "Start text on formatting names for logs" Nov 5, 2025

Collaborator

suhasHere left a comment

Looks fine to me.

Collaborator

afrind commented Nov 8, 2025

I was thinking maybe we should try to align with a scheme that already exists, so we don't need to write another encoder. The proposed scheme is close to URL-safe (percent) encoding, which leaves only 0-9, A-Z, a-z and these four characters unescaped:

  • - (hyphen) (ASCII 45)
  • . (dot) (ASCII 46)
  • _ (underscore) (ASCII 95)
  • ~ (tilde) (ASCII 126)

But to use that, we can't select . and - as tuple and name separators. So if we chose, say / and +, you could do:

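import urllib.parse

# safe='' is needed because quote() leaves '/' unescaped by default
result = '/'.join(urllib.parse.quote(t, safe='') for t in namespace) + '+' + urllib.parse.quote(name, safe='')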

Examples:

namespace = ['simple', 'path', 'no-specials']
name = 'track'

simple/path/no-specials+track

namespace = ['path', 'with spaces', 'and special chars!']
name = 'hello world'

path/with%20spaces/and%20special%20chars%21+hello%20world
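
A runnable sketch of this scheme with a decoder, to check that the original bytes come back. It assumes the '/' and '+' separators suggested above; encode and decode are hypothetical names, and safe="" is passed because quote() leaves '/' unescaped by default:

```python
from urllib.parse import quote, unquote_to_bytes


def encode(namespace: list[bytes], name: bytes) -> str:
    """Percent-encode each field, join tuples with '/', put '+' before the name."""
    ns = "/".join(quote(t, safe="") for t in namespace)
    return ns + "+" + quote(name, safe="")


def decode(text: str) -> tuple[list[bytes], bytes]:
    """Inverse of encode(). Literal '/' and '+' in fields were escaped, so any
    that remain are separators. An empty namespace list and a single empty
    tuple both encode to '' before the '+'; this sketch does not tell them apart."""
    ns_part, _, name_part = text.rpartition("+")
    namespace = [unquote_to_bytes(t) for t in ns_part.split("/")]
    return namespace, unquote_to_bytes(name_part)


encoded = encode([b"path", b"with spaces", b"a/b"], b"c+d\xff")
print(encoded)
print(decode(encoded))
```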

Collaborator Author

fluffy commented Nov 10, 2025

I think the starting point is: do we want URL safe or file name safe?

If we are going to align this with file names, I think most people do not want / in them. If we want them to look like URLs, then / might be a good choice.

The more I thought about this, the more I think it is good to be able to tell if the last one is a track name or just part of the namespace with no track name. So I think we should have different separators between tuples in the namespace from what separates the track name.

Collaborator

afrind left a comment

What this encoding is for was not clear from the issue or this PR, both of which talk about what to put in logs, not filenames or URLs. I suggested URL encoding not because I want to put this in a URL, but because it a) has a very restricted unencoded charset that is close to what the PR proposed, b) uses %xx escaping, as proposed here, and c) already has encoders.

If the requirement is really to come up with a way for putting track names in file names or URLs, that will generate different tradeoffs (see this MSF/WARP issue moq-wg/msf#60). For logs, I just need an encoding that will give me back the original full track name bytes unambiguously.

"The more I thought about this, the more I think it is good to be able to tell if the last one is a track name or just part of the namespace with no track name. So I think we should have different separators between tuples in the namespace from what separates the track name."

100% agree there should be two different separators.
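
A small sketch of the ambiguity, assuming the '/' and '+' separators from the earlier example; both function names are hypothetical:

```python
from typing import Optional
from urllib.parse import quote


def one_separator(namespace: list[str], name: Optional[str] = None) -> str:
    """Same separator everywhere: the track name looks like one more tuple."""
    parts = list(namespace) + ([name] if name is not None else [])
    return "/".join(quote(p, safe="") for p in parts)


def two_separators(namespace: list[str], name: Optional[str] = None) -> str:
    """'/' between tuples, '+' only before a track name."""
    text = "/".join(quote(p, safe="") for p in namespace)
    return text if name is None else text + "+" + quote(name, safe="")


# Namespace ['a', 'b'] with no track name vs. namespace ['a'] with track 'b'.
print(one_separator(["a", "b"]), one_separator(["a"], "b"))
print(two_separators(["a", "b"]), two_separators(["a"], "b"))
```

With a single separator the two cases print the same string; with two separators they stay distinct.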

Collaborator Author

fluffy commented Nov 10, 2025

I tried to explain it in the text of the PR, which contains

"The goal of this format is to have a format that is both filename and URL safe."

This type of thing is much harder to do in issues and PRs than in a discussion on a call or at a meeting where we can walk through why the design is like this. I'm not married to filename safe or anything, but I definitely do not find the argument that writing an encoder for this is too much work very motivating.

Collaborator

afrind commented Nov 10, 2025

"The goal of this format is to have a format that is both filename and URL safe."

Sorry, I missed that.

"I'm not married to filename safe or anything, but I definitely do not find the argument that writing an encoder for this is too much work very motivating."

Me neither. I can live with what's in this PR. I did want to ask the question about reusing something that's already out there, since I think we're already going to get some sideways looks about defining our own varints (which I think are good).

Collaborator

afrind commented Nov 18, 2025

Discussed in 11/17 interim:

The wg was ok with expanding the scope of this PR to use this format anywhere a track name needs to be escaped. We should probably make that recommendation more prominent. There wasn't any specific feedback on character sets or re-use though.

Collaborator

vasilvv commented Nov 21, 2025

My feedback:

  • Filename-safe and URL-safe is a good goal.
  • ASCII Alphanumerics definitely should be unescaped.
  • Underscore probably should be unescaped.
  • If you're aiming for URL-safe, you probably don't want to use percent?
  • Characters that are not URL-escaped are ! * ( ) _ - .
  • Of those, * is not allowed in filenames, and ! is a bit questionable.
  • This leaves us with ! ( ) _ - .

This is a bit annoying. Some potential approaches:

  • ns1-ns2-ns_three(track.20name) -- I think there is already precedent for using dots for escaping?
  • ns1!ns2!ns_three-track.20name -- similar to what's currently in the text, for everyone nostalgic for the good old UUCP days.
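
A rough sketch of the second form, assuming '!' between namespace tuples, '-' before the track name, and '.' followed by two lowercase hex digits as the escape. The hex case, the unescaped set (ASCII alphanumerics and underscore), and the function names are assumptions for illustration:

```python
def dot_escape(field: bytes) -> str:
    """Keep ASCII alphanumerics and '_'; write every other byte as '.' + two hex digits."""
    out = []
    for b in field:
        c = chr(b)
        # b < 0x80 guards against Latin-1 letters, which str.isalnum() would accept.
        if b < 0x80 and (c.isalnum() or c == "_"):
            out.append(c)
        else:
            out.append(f".{b:02x}")
    return "".join(out)


def format_bang(namespace: list[bytes], name: bytes) -> str:
    """Join escaped tuples with '!', then '-' and the escaped track name."""
    return "!".join(dot_escape(t) for t in namespace) + "-" + dot_escape(name)


print(format_bang([b"ns1", b"ns2", b"ns_three"], b"track name"))
```

Because a literal '.', '!' or '-' inside a field is always escaped, any of those characters left in the output is a separator or an escape introducer.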

afrind added the "Needs Discussion" label (for issues which need discussion, ideally at an interim or IETF meeting) Nov 21, 2025