-
Notifications
You must be signed in to change notification settings - Fork 41
Start text on formatting names for logs #1355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Fixes #1234 |
afrind
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems directionally right. We can probably bikeshed on . and - but I could live with them I think.
| between them followed by the track name with a minus between the last | ||
| namespace and track name. | ||
|
|
||
| * Bytes in the range a-z, A-Z, and 0-9 are are output as is while bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems overly restrictive, as there are plenty of printable ASCII chars. I noticed @suhasHere uses full track names like this (/ is inserted as a tuple separator in my log):
000001/app=01/conf=000003/media=C1[h264,width=1920,height=1080,fps=30
,br=2000]/endpoint=0016/
Is there a reason to forbid symbols like ,, =, [, and ]. The less escaping the easier reading logs will be on the eyes.
I do think it's imperative we escape the tuple and namespace/name separators that appear in the names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can change the way the draft says us to :-) Might be a good reason for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AI says:
"The ASCII characters considered safe to print to logs are the printable graphic characters, which are decimal codes 32 through 126. These include:
Space (decimal 32)
Alphanumeric characters (a-z, A-Z, 0-9)
Punctuation and symbols (!, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, @, [, , ], ^, _, `, {, |, }, ~) "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's escape whitespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep in mind that if we want to align with files draft which I think is good idea, then we need to restrict to the file safe set. This use . which is not always file safe but it should not have a name that ends up starting with .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably need to go read the files draft, and think about how it uses full-track names encoded as strings. Note that if your first namespace tuple is empty (0 length, which we don't prohibit, I think?), you will get one starting with a dot.
suhasHere
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me. c
|
I was thinking maybe we should try to align with a scheme that already exists, so we don't need to write another encoder. The proposed scheme is close to URL Safe encoding, which allows only 0-9, A-Z, a-z and these four characters:
But to use that, we can't select
Examples:
|
|
I think the starting point is do we want URL safe or or file name safe. If we are going to align this with file names, I think most people do now want / in them. If we want them to look like URL then / might be a good choice. The more I thought about this, the more, the more I think it is good to be able to tell if the last one is a track name or just part of the name space with no track name. So I think we should have different separators between tuples in the namespace from what separates the track name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The requirements of what this encoding is for was not clear from the issue or this PR, both of which are talking about what to put in logs, not filenames or URLs. I suggested URL encoding not because I want to put this in a URL, but because it a) it has a very restricted unencoded charset that is close what the PR proposed b) uses %xx escaping, as proposed here, and c) the encoders already exist.
If the requirement is really to come up with a way for putting track names in file names or URLs, that will generate different tradeoffs (see this MSF/WARP issue moq-wg/msf#60). For logs, I just need an encoding that will give me back the original full track name bytes unambiguously.
The more I thought about this, the more, the more I think it is good to be able to tell if the last one is a track name or just part of the name space with no track name. So I think we should have different separators between tuples in the namespace from what separates the track name.
100% agree there should be two different separators.
| between them followed by the track name with a minus between the last | ||
| namespace and track name. | ||
|
|
||
| * Bytes in the range a-z, A-Z, and 0-9 are are output as is while bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably need to go read the files draft, and think about how it uses full-track names encoded as strings. Note that if your first namespace tuple is empty (0 length, which we don't prohibit, I think?), you will get one starting with a dot.
|
I tried to explain it it the text of the PR that contains "The goal of this format is to have a format that is both filename and URL safe." This type of stuff is much harder to do with issues than PR having a discussion on a call or meeting where we can walk through why the design is like this. I'm not married to filename safe or anything but I definitely do not find the argument writing an encode for this is too much work very motivating. |
Sorry, I missed that.
Me neither. I can live with what's in this PR. I did want to ask the question about reusing something that's already out there, since I think we're already going to get some sideways looks about defining our own varints (which I think are good). |
|
Discussed in 11/17 interim: The wg was ok with expanding the scope of this PR to use this format anywhere a track name needs to be escaped. We should probably make that recommendation more prominent. There wasn't any specific feedback on character sets or re-use though. |
|
My feedback:
This is a bit annoying. Some potential approaches:
|
@afrind - have a look at this and rewrite if you want. I think some of it could just be deleted as it more motivational than needed in a spec but added it with the view we can delete it while reviewing the PR.
Thanks