Description
Describe the bug
Zeroes preceding a non-zero digit are ignored, either at the start of the input or following a pause.
The problem is partly related to the unpredictability of pauses when a number sequence is read aloud:
"0 1 4 6 0 6" is correctly interpreted as [0.0, 1.0, 4.0, 6.0, 0.0, 6.0]
but "01 46 06" incorrectly becomes [1.0, 46.0, 6.0]
To Reproduce
Steps to reproduce the behavior:
- install lingua_franca
- open python3
>>> from lingua_franca import load_languages, set_default_lang, parse
>>> from lingua_franca.parse import extract_numbers
>>> load_languages(['en'])
>>> extract_numbers("010 101")
[10.0, 101.0]
>>> extract_numbers("01 010 101")
[1.0, 10.0, 101.0]
>>> extract_numbers("51 21 05")
[51.0, 21.0, 5.0]
>>> extract_numbers("01 46 06")
[1.0, 46.0, 6.0]
Expected behavior
Zeros should be added to the output as separate numbers.
I think zeros preceding a single non-zero digit should be treated as a separate number, either by default or as an option,
e.g.
"0 1" (zero one) -> [0, 1]
"01 46 06" (zero one four six zero six) -> [0, 1, 46, 0, 6]
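The expected behavior above can be sketched as a small pre-processing step. This is only an illustration of the requested output, not lingua_franca's implementation: `split_leading_zeros` is a hypothetical helper, it only looks at digit tokens (not spelled-out words), and it splits off every leading zero rather than only those before a single digit. It also ignores decimal points, so "0.01" would not be handled sensibly.

```python
import re


def split_leading_zeros(text):
    """Hypothetical helper: extract integers from digit tokens,
    emitting each leading zero as its own number."""
    numbers = []
    for token in re.findall(r"\d+", text):
        stripped = token.lstrip("0")
        # Each zero stripped from the front becomes a separate 0.
        numbers.extend([0] * (len(token) - len(stripped)))
        # Whatever remains (if anything) is the number itself.
        if stripped:
            numbers.append(int(stripped))
    return numbers


print(split_leading_zeros("0 1"))        # [0, 1]
print(split_leading_zeros("01 46 06"))   # [0, 1, 46, 0, 6]
```

Whether this should be the default or an opt-in flag is left open, as in the request above.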
Additional context
This is problematic when reading out code numbers, e.g. TOTP codes,
which can have a zero in any position and can be read in multiple ways,
e.g. 0 1 4 6 0 6 (zero one four six zero six)
34 45 65 (three four four five six five, or thirty four forty five sixty five)
234 567 (two hundred and thirty four, five hundred and sixty seven)
One aspect I'm not sure of is whether 46 read as "four six" should be interpreted as [46] or [4, 6] when it precedes a decimal point (or when there is no decimal point). After a decimal point the case is different, since the "normal" reading is digit by digit, e.g. 0.01475 (zero point zero one four seven five).
However, "46" (forty six) can always be converted to "4 6" afterwards, whereas missing zeroes cannot be recovered.
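To make the last point concrete, here is a minimal sketch of why dropped zeros matter for codes. Both helpers are hypothetical and use only the standard library: `code_digits` rebuilds a code from the raw text, while `join_numbers` rebuilds it from an already-parsed list of floats such as the one shown in the reproduction steps.

```python
import re


def code_digits(text):
    """Hypothetical helper: concatenate every digit in the text,
    so the grouping used when reading the code does not matter."""
    return "".join(re.findall(r"\d", text))


def join_numbers(numbers):
    """Hypothetical helper: rebuild a code from parsed numbers.
    Any zero dropped during parsing is lost for good."""
    return "".join(str(int(n)) for n in numbers)


# The same code, grouped differently, yields the same digits.
print(code_digits("0 1 4 6 0 6"))  # 014606
print(code_digits("01 46 06"))     # 014606

# Rebuilding from the currently returned list loses two digits.
print(join_numbers([1.0, 46.0, 6.0]))  # 1466
```

Splitting 46 into 4 and 6 is always possible after the fact, but nothing in [1.0, 46.0, 6.0] indicates where the zeros were.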