Conversation
e8d75d1 to
4b66286
Compare
laurmaedje
left a comment
There was a problem hiding this comment.
Looks very nice and clean!
As an automated sanity check, it would be nice to validate once that the new algorithms give the exact same output as Typst's current ones for all numbers from 0..100_000 or something like that.
It would also be nice to have better test coverage. While the underlying systems are all tested in principle, we don't have tests for very large numbers and also not for all the named systems. So, it would be easy to introduce a regression when editing that code.
Since it's quite annoying to test those all manually, I could imagine to just have a hash-based safety net. Basically, after testing against Typst's currently output, we at least have reasonable confidence that it's not worse than before. We could then, for each system, compute a hash of how it formats the numbers from 0 to some big N and commit that hash into a test. (I'm kind of on a hash-based testing hype train... ^^)
PS: As you already noticed, I extracted and complemented the unrelated changes into #150.
This is a rewrite of #116.
In order to move forward with this, I suggest we only focus on the architecture for now; we can discuss the names and shorthands and add missing numeral systems in a later PR. I also think we should wait to decide precisely which numeral systems to include before adding more extensive tests. That way, we can simply test the individual numeral systems, which is what we care about.I added all the numeral systems from Typst with no modification other than removing the zero case when it felt appropriate. Adding new numeral systems can be left for future PR.I would like some resource to link for Hiragana, Katakana, Korean, and Bengali letters (specifically their order).
The algorithms are the same as the ones we ended up with in #116. The main change is that I wanted to have a way to represent the fact that some numeral systems are simply unable to represent some values. In this case, we now return an error instead of falling back to Arabic numerals. Then, the client (i.e., Typst) can decide what to do (raise an error, or fallback to Arabic numerals).
Given that this plans to move some things out of typst/typst, I'll cc. @laurmaedje.