[RFC] Specifying how terminals process Unicode text #8533

Hi all,

Following on from the text-sizing work in #8226, I have decided to specify the exact algorithm terminals should use to split Unicode text into cells, and to implement it in kitty. It is based on the Unicode specification's grapheme segmentation rules, and additionally specifies terminal-specific behavior that the Unicode spec does not cover. It fixes various long-standing issues such as #3810 (emoji with ZWJ) and #8433 (Korean text).

The specification is [here](https://sw.kovidgoyal.net/kitty/text-sizing-protocol/#the-algorithm-for-splitting-text-into-cells). Feel free to read it and comment. There might well be differing opinions on the parts not covered by the Unicode spec. I am open to suggestions for modification.

From kitty users, I would appreciate it if some of you could run the [nightly](https://sw.kovidgoyal.net/kitty/binary/#customizing-the-installation) build and report any issues. You may run into problems if you use ZWJ-based emoji in your workflows: the width kitty assigns to these has changed, and terminal programs may still assume a different width than the correct one.
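To illustrate why widths for ZWJ-based emoji can disagree, here is a minimal Python sketch. It is not the algorithm from the spec: the cluster splitting is a deliberately crude stand-in for real grapheme segmentation, and the rule that a cluster takes the width of its first code point is an assumption made for illustration. All function names are hypothetical.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def is_zero_width(ch):
    # Simplification: treat non-spacing marks, enclosing marks and
    # format characters (which includes ZWJ) as occupying no cells.
    return unicodedata.category(ch) in ("Mn", "Me", "Cf")


def char_width(ch):
    if is_zero_width(ch):
        return 0
    # East Asian Wide / Fullwidth code points take two cells.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def naive_width(text):
    # What a program gets if it sums widths code point by code point.
    return sum(char_width(ch) for ch in text)


def split_clusters(text):
    # Crude stand-in for grapheme segmentation (NOT the full Unicode rules):
    # a code point joins the previous cluster if that cluster ends in ZWJ
    # or if the code point itself is zero width.
    clusters = []
    for ch in text:
        if clusters and (clusters[-1].endswith(ZWJ) or is_zero_width(ch)):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def cluster_width(text):
    # Assumption for illustration: a cluster is as wide as its first code point.
    return sum(char_width(cluster[0]) for cluster in split_clusters(text))


# MAN, ZWJ, WOMAN, ZWJ, GIRL: a single family emoji when rendered.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
print(naive_width(family), cluster_width(family))  # prints: 6 2
```

A program that assumes six cells and a terminal that draws the sequence in two will disagree about where the cursor is after printing it, which is the kind of mismatch described above.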

In master, there is also a kitten that can be run easily to test a terminal's compliance with the spec. It uses grapheme test data from the Unicode consortium. Run it as:

```
kitten __width_test__
```

Here are results of running it on various terminals.

| Terminal name         | Number of tests failed |
| --------------------- | ---------------------- |
| kitty (master)        | 0                      |
| kitty 0.41.1          | 45                     |
| wezterm 5046fc22      | 179                    |
| foot 1.21.0           | 186                    |
| konsole 24.12.3       | 280                    |
| iTerm2 3.5.13         | 289                    |
| gnome-term 3.56.0     | 317                    |
| kitty-master+tmux-3.5 | 347                    |
| xterm 397             | 371                    |
| Apple terminal 2.14   | 479                    |

And finally, there is ghostty 1.1.3, on which the test kitten failed to run because ghostty returned far more cursor position reports than expected, so something is badly broken there. I did look at its code, since it claims to do grapheme segmentation, and it [doesn't implement the segmentation algorithm correctly anyway](https://github.com/ghostty-org/ghostty/discussions/6931).
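For readers unfamiliar with cursor position reports: a program sends `ESC [ 6 n` and the terminal replies with `ESC [ row ; col R`. A width test can presumably compare the column before and after printing a string. The Python sketch below only parses such replies from a byte buffer. It is not the kitten's actual implementation, and the function names are hypothetical.

```python
import re

# A cursor position report has the form ESC [ <row> ; <col> R.
CPR_PATTERN = re.compile(rb"\x1b\[(\d+);(\d+)R")


def parse_cursor_reports(data):
    # Return every (row, col) pair found in the bytes read from the terminal.
    return [(int(row), int(col)) for row, col in CPR_PATTERN.findall(data)]


def measured_width(before, after):
    # Cells the cursor advanced, assuming the text stayed on one row.
    if before[0] != after[0]:
        raise ValueError("cursor changed rows; the text wrapped or scrolled")
    return after[1] - before[1]


# Example reply bytes for two queries, one before and one after printing.
reply = b"\x1b[5;1R\x1b[5;3R"
reports = parse_cursor_reports(reply)
print(reports, measured_width(reports[0], reports[1]))
```

Counting the parsed reports against the number of queries sent is one simple way to detect a terminal that replies more often than it was asked.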
