Commit

Exclude hex above max Unicode Scalar Value (#456)
* Exclude hex above max Unicode Scalar Value

simplify surrogate regex to use ranges

* allow leading 0s, but still limit max length to 6

* Add explicit regex-set rules to hex unicode

document {1,3} ranges

* add space-separators between sets

* Make test fail *only* for length limits

Previously it failed due to specifying a codepoint past max *as well*, obscuring the intended fail condition.

---------

Co-authored-by: Tab Atkins Jr.
eugenesvk and tabatkins authored Jan 21, 2025
1 parent e9e6a84 commit d76063e
Showing 3 changed files with 9 additions and 4 deletions.
11 changes: 7 additions & 4 deletions draft-marchan-kdl2.md
@@ -983,10 +983,13 @@ string-character :=
 [^\\"] - disallowed-literal-code-points
 ws-escape := '\\' (unicode-space | newline)+
 hex-digit := [0-9a-fA-F]
-hex-unicode := hex-digit{1, 6} - surrogates
-surrogates := [dD][8-9a-fA-F]hex-digit{2}
-// U+D800-DFFF: D 8 00
-// D F FF
+hex-unicode := hex-digit{1, 6} - surrogate - above-max-scalar // Unicode Scalar Value in hex₁₆, leading 0s allowed within length ≤ 6
+surrogate := [0]{0, 2} [dD] [8-9a-fA-F] hex-digit{2}
+// U+D800-DFFF: D 8 00
+// D F FF
+above-max-scalar = [2-9a-fA-F] hex-digit{5} | [1] [1-9a-fA-F] hex-digit{4}
+// >U+10FFFF: >1 _____ 1 >0 ____
 raw-string := '#' raw-string-quotes '#' | '#' raw-string '#'
 raw-string-quotes :=
1 change: 1 addition & 0 deletions tests/test_cases/input/unicode_escaped_above_max_fail.kdl
@@ -0,0 +1 @@
+no "Higher than max Unicode Scalar Value \u{10FFFF} \u{11FFFF}"
@@ -0,0 +1 @@
+no "Even with leading 0s Unicode Scalar Value escapes must ≤6: \u{0012345}"
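
The hex-unicode, surrogate, and above-max-scalar rules in the grammar change can be sketched as regular expressions to see why the two new test inputs are expected to fail. This is only an illustration in Python, not part of the commit or of any KDL implementation; the helper name is_valid_hex_unicode and the constant names are hypothetical, and the whitespace inside the grammar's {1, 6} and {0, 2} ranges is assumed to be purely cosmetic.

```python
import re

# Assumed regex translations of the grammar rules (names are hypothetical).
HEX_UNICODE = re.compile(r"[0-9a-fA-F]{1,6}")             # hex-digit{1, 6}
SURROGATE = re.compile(r"0{0,2}[dD][89a-fA-F][0-9a-fA-F]{2}")  # U+D800-DFFF
ABOVE_MAX_SCALAR = re.compile(
    r"[2-9a-fA-F][0-9a-fA-F]{5}"      # 200000 and up
    r"|1[1-9a-fA-F][0-9a-fA-F]{4}"    # 110000 to 1FFFFF
)


def is_valid_hex_unicode(digits: str) -> bool:
    """Return True if the digits inside the escape braces match hex-unicode."""
    # Length limit: 1 to 6 hex digits, leading zeros included in the count.
    if HEX_UNICODE.fullmatch(digits) is None:
        return False
    # Set difference in the grammar: exclude surrogates and values past max.
    if SURROGATE.fullmatch(digits) or ABOVE_MAX_SCALAR.fullmatch(digits):
        return False
    return True


if __name__ == "__main__":
    # Escapes taken from the two test inputs, plus a padded surrogate.
    for digits in ["10FFFF", "11FFFF", "0012345", "00D800"]:
        print(digits, is_valid_hex_unicode(digits))
```

Under these assumptions, 10FFFF is accepted, 11FFFF is rejected by the above-max-scalar rule, 0012345 is rejected only because it is 7 digits long, and 00D800 is rejected as a surrogate with leading zeros.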
