feat: Adds Latin1 (filtered CP1252) support. Optimizes ASCII/UTF8/16 support #2317

Merged
kamronbatman merged 1 commit into main from kbatman/latin1_support on Jan 22, 2026

kamronbatman commented Jan 22, 2026

Summary

  • Adds proper Latin1 encoding support, replacing CP1252 usage throughout the codebase
  • Adds specialized, optimized string decoding methods with safe string filtering for each encoding type
  • Filters invalid Unicode characters (C0/C1 control codes, non-characters) by removal rather than replacement since
    the UO client renders nothing for these characters
  • Fixes UTF-16 null terminator position handling to correctly advance by 2 bytes
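
For context on the first bullet, the following Python sketch (an illustration only, not the project's C# code) shows where Latin1 and CP1252 disagree. The helper name `differing_bytes` is hypothetical. CP1252 bytes that Python's codec leaves undefined are decoded with `errors="replace"` so that the loop does not raise.

```python
def differing_bytes() -> list[int]:
    """Return every byte value that Latin1 and CP1252 decode differently."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        latin1 = raw.decode("latin-1")
        # A few CP1252 slots (e.g. 0x81) are undefined in Python's codec;
        # errors="replace" maps them to U+FFFD instead of raising.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diffs.append(b)
    return diffs


diffs = differing_bytes()
print(hex(diffs[0]), hex(diffs[-1]))  # 0x80 0x9f
```

The two encodings agree everywhere except 0x80-0x9F, where Latin1 maps to C1 control codes and CP1252 assigns printable characters such as the euro sign.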

Changes

TextEncoding.cs

  • Added SearchValues-based invalid byte/char detection for efficient filtering
  • Added encoding-specific GetString methods: GetStringAscii, GetStringLatin1, GetStringUtf8, GetStringBigUni,
    GetStringLittleUni
  • Each method supports a safeString parameter for filtering invalid characters
  • Little-endian UTF-16 uses direct memory cast for zero-copy decoding on LE systems
  • Invalid characters are removed (not replaced with U+FFFD) since the client renders nothing for them
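
The removal-versus-replacement behavior can be sketched in Python as follows. This is a minimal illustration under assumptions: `get_string_latin1` and `INVALID_LATIN1_BYTES` are hypothetical names, and `bytes.translate` stands in for the SearchValues-based detection, which is not reproduced here.

```python
# Bytes dropped in safe mode, per the ranges listed in this PR:
# C0 controls (0x00-0x1F), DEL (0x7F), and C1 controls (0x80-0x9F).
INVALID_LATIN1_BYTES = bytes(range(0x00, 0x20)) + b"\x7f" + bytes(range(0x80, 0xA0))


def get_string_latin1(data: bytes, safe_string: bool = False) -> str:
    """Hypothetical analogue of a Latin1 decode with optional filtering."""
    if safe_string:
        # Delete invalid bytes outright; nothing (such as U+FFFD) is
        # substituted in their place.
        data = data.translate(None, INVALID_LATIN1_BYTES)
    # Latin1 maps each remaining byte directly to the same code point.
    return data.decode("latin-1")


print(get_string_latin1(b"Caf\xe9\x07\x85!", safe_string=True))  # Café!
```

With `safe_string=False` the same input keeps the BEL (0x07) and C1 (0x85) characters in the returned string.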

SpanReader.cs

  • Added ReadLatin1() and ReadLatin1Safe() methods
  • Rewrote encoding-specific read methods to use optimized TextEncoding.GetString* methods
  • Fixed UTF-16 null terminator handling: position now correctly advances by byteLength (2) instead of 1
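
To show why the terminator width matters, here is a Python sketch of reading a null-terminated big-endian UTF-16 string. It is an assumption-laden illustration: `read_big_uni` is a hypothetical function, and only the name `byte_length` comes from the description above.

```python
def read_big_uni(buffer: bytes, position: int) -> tuple[str, int]:
    """Read a null-terminated big-endian UTF-16 string.

    Returns the decoded text and the position just past the terminator.
    """
    byte_length = 2  # width of one UTF-16 code unit and of its terminator
    end = position
    # Step one code unit at a time so that a zero byte inside a character
    # (for example the high byte of 'H' = 00 48) is not taken as the end.
    while end + 1 < len(buffer) and buffer[end:end + 2] != b"\x00\x00":
        end += byte_length
    text = buffer[position:end].decode("utf-16-be", errors="replace")
    found_terminator = end + 1 < len(buffer)
    # Skip the whole 2-byte terminator. Advancing by 1 would leave the
    # reader on the terminator's second zero byte.
    new_position = end + byte_length if found_terminator else len(buffer)
    return text, new_position


data = "Hi".encode("utf-16-be") + b"\x00\x00" + b"\x2a"
print(read_big_uni(data, 0))  # ('Hi', 6)
```

In this example the byte at index 6 (0x2A) is the first byte of the next field, which is where the reader should land.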

SpanWriter.cs

  • Added WriteLatin1 and WriteLatin1Null methods
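
As a rough picture of what these two writers do, the Python sketch below appends Latin1 bytes to a buffer, with and without a trailing null. The function names mirror the C# methods but are hypothetical here, and the `?` fallback for characters above U+00FF is an assumption of this sketch, not something the PR states.

```python
def write_latin1(buffer: bytearray, value: str) -> None:
    """Append value to buffer as Latin1, one byte per character."""
    # Assumption: characters outside Latin1 (above U+00FF) become '?'.
    buffer.extend(value.encode("latin-1", errors="replace"))


def write_latin1_null(buffer: bytearray, value: str) -> None:
    """Append value as Latin1 followed by a single 0x00 terminator."""
    write_latin1(buffer, value)
    buffer.append(0)


buf = bytearray()
write_latin1_null(buf, "Café")
print(bytes(buf))  # b'Caf\xe9\x00'
```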

Packet Updates

  • Updated all packet code to use Latin1 encoding instead of CP1252
  • Affected: account packets, equipment packets, menu packets, message packets, mobile packets, player packets, secure
    trade packets, vendor packets, gump packets, book packets, mahjong packets

Filtering Behavior

Invalid characters filtered in safe mode:

┌───────────────┬────────────────────────┐
│     Range     │      Description       │
├───────────────┼────────────────────────┤
│ 0x00-0x1F     │ C0 control codes       │
├───────────────┼────────────────────────┤
│ 0x7F          │ DEL                    │
├───────────────┼────────────────────────┤
│ 0x80-0x9F     │ C1 control codes       │
├───────────────┼────────────────────────┤
│ 0xFFFE-0xFFFF │ Unicode non-characters │
└───────────────┴────────────────────────┘

Note: Surrogate code units (0xD800-0xDFFF) are not filtered because validating them requires context to distinguish
paired from unpaired surrogates. The UO client renders nothing for unpaired surrogates anyway.
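
The table and note above can be expressed as a predicate over UTF-16 code units. The Python sketch below is illustrative only; `is_filtered` and `filter_code_units` are hypothetical names, and code units are modeled as plain integers.

```python
def is_filtered(code_unit: int) -> bool:
    """True if safe mode would drop this UTF-16 code unit."""
    return (
        code_unit <= 0x1F                    # C0 control codes
        or code_unit == 0x7F                 # DEL
        or 0x80 <= code_unit <= 0x9F         # C1 control codes
        or 0xFFFE <= code_unit <= 0xFFFF     # Unicode non-characters
    )


def filter_code_units(units: list[int]) -> list[int]:
    """Remove filtered code units; surrogates (0xD800-0xDFFF) pass through."""
    return [u for u in units if not is_filtered(u)]


sample = [0x48, 0x00, 0x9F, 0xD800, 0xFFFE, 0xE9]
print([hex(u) for u in filter_code_units(sample)])  # ['0x48', '0xd800', '0xe9']
```

The unpaired surrogate 0xD800 survives, matching the note that surrogates are left unfiltered.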

Test Plan

  • All 631 Server.Tests pass
  • Verified client rendering behavior using TestUnicodeGump command (pages 1-5)
  • Confirmed U+FFFD, unpaired surrogates, and non-characters all render as blank in client
  • Verified Latin1 characters (0xA0-0xFF) display correctly
  • Verified C1 control codes (0x80-0x9F) are filtered and don't display

Copilot AI review requested due to automatic review settings January 22, 2026 21:41

Copilot AI left a comment

Pull request overview

Adds Latin1 (with filtering of C0/C1 control codes) as a supported encoding across multiple UO network packet read/write paths, and introduces optimized encoding/decoding helpers.

Changes:

  • Switches many “ASCII” packet fields to Latin1 equivalents (ReadLatin1*/WriteLatin1*), including gumps, menus, vendors, and speech.
  • Refactors SpanReader to use encoding-specific read paths and adds Latin1 read helpers.
  • Extends TextEncoding with optimized, filtered decoding methods for ASCII/Latin1/UTF-8/UTF-16.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 5 comments.

Summary per file:

  • Projects/UOContent/Network/Packets/IncomingPlayerPackets.cs: Reads prompt response text as Latin1-safe.
  • Projects/UOContent/Network/Packets/IncomingMobilePackets.cs: Reads rename request text as Latin1-safe.
  • Projects/UOContent/Network/Packets/IncomingMessagePackets.cs: Reads ASCII speech text as Latin1-safe.
  • Projects/UOContent/Network/Packets/IncomingAccountPackets.cs: Reads character/account/game login fields as Latin1.
  • Projects/UOContent/Items/Games/Mahjong/MahjongPackets.cs: Writes Mahjong player names as Latin1.
  • Projects/UOContent/Items/Books/BookPackets.cs: Reads legacy book title/author as Latin1-safe.
  • Projects/UOContent/Gumps/Base/RelayInfo.cs: Uses TextEncoding.GetStringBigUni for gump text entries.
  • Projects/UOContent/Gumps/Base/OutgoingGumpPackets.cs: Writes sign gump strings as Latin1 null-terminated.
  • Projects/UOContent/Gumps/Base/Legacy/GumpTooltip.cs: Writes tooltip layout using Latin1 when args are present.
  • Projects/UOContent/Gumps/Base/Legacy/GumpHtmlLocalized.cs: Writes localized HTML token args using Latin1.
  • Projects/UOContent/Gumps/Base/GumpLayoutBuilder.cs: Encodes layout values using Latin1 bytes.
  • Projects/Server/Text/TextEncoding.cs: Adds optimized ASCII/Latin1/UTF-8/UTF-16 decoding with filtering + Latin1 helpers.
  • Projects/Server/Server.csproj: Bumps System.IO.Hashing package version.
  • Projects/Server/Network/Packets/OutgoingVendorSellPackets.cs: Writes vendor sell item names as Latin1.
  • Projects/Server/Network/Packets/OutgoingVendorBuyPackets.cs: Writes vendor buy descriptions as Latin1 null-terminated.
  • Projects/Server/Network/Packets/OutgoingSecureTradePackets.cs: Writes secure trade name as Latin1 fixed-length.
  • Projects/Server/Network/Packets/OutgoingPlayerPackets.cs: Writes profile header/paperdoll title/scroll text as Latin1.
  • Projects/Server/Network/Packets/OutgoingMobilePackets.cs: Writes mobile names/status names as Latin1 fixed-length.
  • Projects/Server/Network/Packets/OutgoingMessagePackets.cs: Writes message “ascii” branch fields using Latin1.
  • Projects/Server/Network/Packets/OutgoingMenuPackets.cs: Writes menu questions/answers as Latin1.
  • Projects/Server/Network/Packets/OutgoingEquipmentPackets.cs: Writes crafter name as Latin1.
  • Projects/Server/Network/Packets/OutgoingAccountPackets.cs: Writes character/server list strings as Latin1.
  • Projects/Server/Main.cs: Registers code page provider at setup time.
  • Projects/Server/Buffers/SpanWriter.cs: Adds Latin1/CP1252 write helpers and Latin1 null-terminated writes.
  • Projects/Server/Buffers/SpanReader.cs: Adds Latin1 read helpers and uses encoding-specific read methods.

kamronbatman force-pushed the kbatman/latin1_support branch from 2c2d412 to e518a22 on January 22, 2026 23:20
kamronbatman changed the title from "feat: Adds Latin1 (filtered CP1252) support" to "feat: Adds Latin1 (filtered CP1252) support. Optimizes ASCII/UTF8/16 support" on Jan 22, 2026
kamronbatman merged commit 0404251 into main on Jan 22, 2026
9 checks passed
kamronbatman deleted the kbatman/latin1_support branch on January 22, 2026 23:51