Code points, scalar values, and validity #778

- A character is defined as a ‘Unicode code point’. This means (unpaired) surrogates are allowed in input and, by implication, in output. If this is not intended (which is what I glean from the answer to #614), the definition should be changed to ‘Unicode scalar value’. Changing ‘invalid Unicode code points’ to ‘invalid Unicode scalar values’ would also resolve #614.

- It is not explicitly stated that every possible sequence of Unicode scalar values (or code points?) is a valid CommonMark input text for which some HTML output must be produced, although I believe this is the intention. If so, it should be made explicit that a processor which rejects any such input, rather than producing output for it, is non-conforming.
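
To make the distinction in the first point concrete, here is a minimal Python sketch. The helper names (`is_code_point`, `is_scalar_value`, `surrogate_positions`, `is_utf8_encodable`) are made up for illustration and are not taken from the spec or from any CommonMark implementation. It relies on the fact that a Python `str` is a sequence of code points and can therefore hold an unpaired surrogate, which is exactly the kind of input the current wording seems to permit.

```python
# Ranges as defined by the Unicode Standard.
MAX_CODE_POINT = 0x10FFFF
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def is_code_point(n: int) -> bool:
    """True for any integer in the Unicode code space, surrogates included."""
    return 0 <= n <= MAX_CODE_POINT


def is_scalar_value(n: int) -> bool:
    """True for any code point that is not a surrogate."""
    return is_code_point(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)


def surrogate_positions(text: str) -> list[int]:
    """Indices of characters in `text` that are not Unicode scalar values."""
    return [i for i, ch in enumerate(text) if not is_scalar_value(ord(ch))]


def is_utf8_encodable(text: str) -> bool:
    """True if `text` can be serialised as well-formed UTF-8."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical input: a lone high surrogate between two ASCII letters.
sample = "a\ud800b"
print(surrogate_positions(sample))
print(is_utf8_encodable(sample))
```

With the ‘code point’ definition, `sample` is a legal sequence of characters, yet it cannot be written out as well-formed UTF-8. Under a ‘scalar value’ definition, that case would simply fall outside the set of valid inputs.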


