Rule: large number without underscore separators (PEP 515) #18221
Summary
This adds a rule (code RUF062) that automatically formats large numeric literals with underscore separators to make them more readable, as described in PEP 515 and discussed in #12787, which I opened a while back.
E.g.:
- `123456` becomes `123_456`
- `.12345` becomes `.123_45`
- `0xDEADBEEF` becomes `0xDEAD_BEEF`

(see test snapshot for more examples)
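As a quick sanity check that these rewrites are purely cosmetic, the snippet below relies only on PEP 515 semantics (Python 3.6+): underscores in numeric literals are ignored when the value is computed, so each pair compares equal. It is an illustration only, not part of this PR's test suite.

```python
# PEP 515: underscores in numeric literals do not change the value.
assert 123_456 == 123456
assert .123_45 == .12345
assert 0xDEAD_BEEF == 0xDEADBEEF

# int() and float() accept the same grouping when parsing strings.
# Base 0 tells int() to interpret the string like a source literal.
assert int("123_456") == 123456
assert float(".123_45") == 0.12345
assert int("0xDEAD_BEEF", 0) == 0xDEADBEEF
```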
This rule works for:
- Decimal integers: small values (`9999` or less) are left untouched since they are already readable enough.
- Hexadecimal integers (`0xABCD`): add underscores to form groups of 4 digits by default (configurable).
- Octal integers (`0o1234`): add underscores to form groups of 4 digits by default (configurable).
- Binary integers (`0b1010101`): add underscores to form groups of 8 bits (octets) by default (configurable).
- Floats with an exponent (`123e10`): the leading part is formatted with the same rules as integers. The exponent part is untouched, as it should never be more than 3 characters anyway.
- Signs: a leading `+` or `-` sign is not part of `Expr::NumberLiteral` instances once parsed by ruff, so this rule does not modify signs in any way; they just stay in place.

Support for Indian-style number formatting:
According to https://randombits.dev/articles/number-localization/formatting , most of the world groups decimal digits 3 by 3, except for India, which uses groups of 2 after the first group of 3 (so thousands, hundreds of thousands, hundreds of hundreds of thousands, etc.). A configuration option enables this kind of grouping.
I am, however, not sure what the practice is for formatting the fractional part in India. I implemented a "reversed" logic, with separators at thousandths, then hundredths of thousandths, hundredths of hundredths of thousandths, etc. (not sure anyone ever needed such float precision ^^'). This may need to be adjusted.
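To make the grouping schemes above concrete, here is a minimal Python sketch. It is not the Rust implementation in this PR: the helper names `group_int_digits` and `group_frac_digits` and their `first`/`rest` parameters are hypothetical, and the "leave `9999` or less untouched" threshold is deliberately not modelled.

```python
from typing import Optional


def group_int_digits(digits: str, first: int = 3, rest: Optional[int] = None) -> str:
    """Group digits from the right: one group of `first`, then groups of `rest`.

    Hypothetical helper for illustration; `rest` defaults to `first`.
    """
    rest = first if rest is None else rest
    if len(digits) <= first:
        return digits
    groups = [digits[-first:]]
    head = digits[:-first]
    while head:
        groups.append(head[-rest:])
        head = head[:-rest]
    return "_".join(reversed(groups))


def group_frac_digits(digits: str, first: int = 3, rest: Optional[int] = None) -> str:
    """Mirror image for the fractional part: group from the left."""
    return group_int_digits(digits[::-1], first, rest)[::-1]


print(group_int_digits("123456"))                     # 123_456
print(group_int_digits("DEADBEEF", first=4))          # DEAD_BEEF
print(group_int_digits("12345678", first=3, rest=2))  # 1_23_45_678 (Indian style)
print(group_frac_digits("12345"))                     # 123_45
print(group_frac_digits("1234567", first=3, rest=2))  # 123_45_67 ("reversed" Indian style)
```

Treating the fractional part as the mirror image of the integer part keeps a single grouping routine for both sides of the decimal point, which is one way to read the "reversed" logic described above.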
Test Plan
A new test file `RUF062.py` is part of the PR and is executed on `cargo test`.
TODO / to discuss