From 9b02a8ac0c19da6fd90e6fa7313932b4939b39a8 Mon Sep 17 00:00:00 2001
From: "Steven R. Loomis"
Date: Tue, 29 Oct 2024 10:36:32 -0500
Subject: [PATCH] CLDR-18065 site: typo fixes

- tables were broken in the transliteration guidelines
- broken (escaped) NCRs
- http to https
---
 .../european-ordering-rules-issues.md             |  4 ++--
 .../index/cldr-spec/transliteration-guidelines.md | 14 ++++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md b/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md
index 11ed1935ae5..364c014f309 100644
--- a/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md
+++ b/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md
@@ -22,8 +22,8 @@ Issues with current EOR rules:
 1. The ignoring rules for currency etc. should be filtered out in the CLDR context. ( Mark, John, Åke)
 2. The rule for U+029F SMALL CAPITAL L is missing (typo in standard). ( Åke )
 3. There are relevant comments by Kent Karlsson in ticket #[763](http://unicode.org/cldr/trac/ticket/763) (2010-10-27), with a modified proposal
-    1. --- \⃩(\⃩ = [U+20E9](http://unicode.org/cldr/utility/character.jsp?a=20E9) ( ⃩ ) COMBINING WIDE BRIDGE ABOVE) is the (currently) weightiest, at level 2, non-letter general purpose combining mark
-    2. --- \⃩ is used in the proposal to make all "variants" come after all single-accented versions of letters
+    1. --- ⃩(⃩ = [U+20E9](http://unicode.org/cldr/utility/character.jsp?a=20E9) ( ⃩ ) COMBINING WIDE BRIDGE ABOVE) is the (currently) weightiest, at level 2, non-letter general purpose combining mark
+    2. --- ⃩ is used in the proposal to make all "variants" come after all single-accented versions of letters
     3. --- resetting to just A, B, etc. would make variant versions come before accented versions
     4. ( Åke ) The current reset rules work fine with MimerSQL, but I think you must check the ICU behaviour. Kent might have a vital point here.
     5. (Kent) (digraphs) ----tertiary difference in DUCET; keep it that way
diff --git a/docs/site/index/cldr-spec/transliteration-guidelines.md b/docs/site/index/cldr-spec/transliteration-guidelines.md
index 42f7685bfce..e4cf2ded4de 100644
--- a/docs/site/index/cldr-spec/transliteration-guidelines.md
+++ b/docs/site/index/cldr-spec/transliteration-guidelines.md
@@ -13,6 +13,7 @@ Transliteration is the general process of converting characters from one script
 Transliteration is *not* translation. Rather, transliteration is the conversion of letters from one script to another without translating the underlying words. The following shows a sample of transliteration systems:
 
 Sample Transliteration Systems
+
 | Source | Translation | Transliteration | System |
 |:---:|:---:|:---:|:---:|
 | Αλφαβητικός | Alphabetic | Alphabētikós | Classic |
@@ -32,6 +33,7 @@ While an English speaker may not recognize that the Japanese word kyanpasu is eq
 - When a service engineer is sent a program dump that is filled with characters from foreign scripts, it is much easier to diagnose the problem when the text is transliterated and the service engineer can recognize the characters.
 
 Sample Transliterations
+
 | Source | Transliteration |
 |---|---|
 | 김, 국삼 | Gim, Gugsam |
@@ -322,7 +324,7 @@ If you are interested in providing transliterations for one or more scripts, fil
 
 For submission to CLDR, the data needs to supplied in the correct XML format or in the ICU format, and should follow an accepted standard (like UNGEGN, BGN, or others).
 
-- The format for rules is specified in [Transform\_Rules](http://www.unicode.org/reports/tr35/#Transform_Rules). It is best if the results are tested using the [ICU Transform Demo](https://icu4c-demos.unicode.org/icu-bin/translit) first, since if the data doesn't validate it would not be accepted into CLDR.
+- The format for rules is specified in [Transform\_Rules](https://www.unicode.org/reports/tr35/#Transform_Rules). It is best if the results are tested using the [ICU Transform Demo](https://icu4c-demos.unicode.org/icu-bin/translit) first, since if the data doesn't validate it would not be accepted into CLDR.
 - As mentioned above, even if a transliteration is only used in certain countries or contexts CLDR can provide for them with different variant tags.
 - For comparison, you can see what is currently in CLDR in the [transforms]() folder online. For example, see [Hebrew\-Latin.xml]().
 - Script transliterators should cover every character in the exemplar sets for the CLDR locales using that script.
@@ -331,10 +333,10 @@ For submission to CLDR, the data needs to supplied in the correct XML format or
 
 | Shavian | Relation | Latin | Comments |
 |:---:|:---:|:---:|---|
-| \𐑐 | ↔ | p | Map all uppercase to lowercase first |
-| \𐑚 | ↔ | b | |
-| \𐑑 | ↔ | t | |
-| \𐑒\𐑕 | ← | x | fallback |
+| 𐑐 | ↔ | p | Map all uppercase to lowercase first |
+| 𐑚 | ↔ | b | |
+| 𐑑 | ↔ | t | |
+| 𐑒𐑕 | ← | x | fallback |
 | ... | | | |
 
 ## More Information
@@ -349,5 +351,5 @@ For more information, see:
 - [ISO\-15915 (Gujarati)](http://transliteration.eki.ee/pdf/Gujarati.pdf)
 - [ISO\-15915 (Kannada)](http://transliteration.eki.ee/pdf/Kannada.pdf)
 - [ISCII\-91](http://www.cdacindia.com/html/gist/down/iscii_d.asp)
-- [UTS \#35: Locale Data Markup Language (LDML)](http://www.unicode.org/reports/tr35/)
+- [UTS \#35: Locale Data Markup Language (LDML)](https://www.unicode.org/reports/tr35/)