
Commit d76063e

eugenesvk and tabatkins authored
Exclude hex above max Unicode Scalar Value (#456)
* Exclude hex above max Unicode Scalar Value
  simplify surrogate regex to use ranges
* allow leading 0s, but still limit max length to 6
* Add explicit regex-set rules to hex unicode
  document {1,3} ranges
* add space-separators between sets
* Make test fail *only* for length limits
  Previously it failed due to specifying a codepoint past max *as well*, obscuring the intended fail condition.

---------

Co-authored-by: Tab Atkins Jr. <[email protected]>
1 parent e9e6a84 commit d76063e

3 files changed: 9 additions, 4 deletions

draft-marchan-kdl2.md

Lines changed: 7 additions & 4 deletions
@@ -983,10 +983,13 @@ string-character :=
     [^\\"] - disallowed-literal-code-points
 ws-escape := '\\' (unicode-space | newline)+
 hex-digit := [0-9a-fA-F]
-hex-unicode := hex-digit{1, 6} - surrogates
-surrogates := [dD][8-9a-fA-F]hex-digit{2}
-// U+D800-DFFF: D 8 00
-// D F FF
+hex-unicode := hex-digit{1, 6} - surrogate - above-max-scalar // Unicode Scalar Value in hex₁₆, leading 0s allowed within length ≤ 6
+surrogate := [0]{0, 2} [dD] [8-9a-fA-F] hex-digit{2}
+// U+D800-DFFF: D 8 00
+// D F FF
+above-max-scalar = [2-9a-fA-F] hex-digit{5} | [1] [1-9a-fA-F] hex-digit{4}
+// >U+10FFFF: >1 _____ 1 >0 ____
+
 
 raw-string := '#' raw-string-quotes '#' | '#' raw-string '#'
 raw-string-quotes :=
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+no "Higher than max Unicode Scalar Value \u{10FFFF} \u{11FFFF}"
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+no "Even with leading 0s Unicode Scalar Value escapes must ≤6: \u{0012345}"
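
A minimal sketch, in Python, of how a parser might apply the revised `hex-unicode` rule to the digits inside `\u{...}`. It transcribes the three grammar productions from this commit into regular expressions and treats the grammar's `-` as set subtraction. The function name `is_valid_hex_unicode` and the pattern constants are hypothetical, chosen for illustration; they are not part of the KDL spec or of any KDL library.

```python
import re

# hex-unicode := hex-digit{1, 6} - surrogate - above-max-scalar
HEX_DIGITS = re.compile(r"[0-9a-fA-F]{1,6}")

# surrogate := [0]{0, 2} [dD] [8-9a-fA-F] hex-digit{2}
# Covers U+D800-DFFF, with up to two leading zeros (total length <= 6).
SURROGATE = re.compile(r"0{0,2}[dD][8-9a-fA-F][0-9a-fA-F]{2}")

# above-max-scalar: [2-9a-fA-F] hex-digit{5} | [1] [1-9a-fA-F] hex-digit{4}
# Covers every 6-digit value greater than U+10FFFF.
ABOVE_MAX_SCALAR = re.compile(
    r"[2-9a-fA-F][0-9a-fA-F]{5}|1[1-9a-fA-F][0-9a-fA-F]{4}"
)


def is_valid_hex_unicode(digits: str) -> bool:
    """Return True if `digits` (the text between the braces of \\u{...})
    satisfies the hex-unicode production as revised in this commit."""
    if not HEX_DIGITS.fullmatch(digits):
        return False  # empty, non-hex, or longer than 6 digits
    if SURROGATE.fullmatch(digits):
        return False  # U+D800-DFFF is not a Unicode Scalar Value
    if ABOVE_MAX_SCALAR.fullmatch(digits):
        return False  # greater than U+10FFFF
    return True


if __name__ == "__main__":
    # The escapes used in the two new test files, plus a surrogate.
    for sample in ("10FFFF", "11FFFF", "0012345", "00D800"):
        print(sample, is_valid_hex_unicode(sample))
```

Under this reading, `10FFFF` is accepted, `11FFFF` is rejected for exceeding the maximum scalar value, `0012345` is rejected purely for having seven digits, and `00D800` is rejected as a zero-padded surrogate.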
