Skip to content

Rule: large number without underscore separators (PEP 515) #18221

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

guillp
Copy link

@guillp guillp commented May 20, 2025

Summary

This adds a rule (with code RUF062) that automatically formats large numbers with underscore separators to make them more readable. This is as described in PEP515, and discussed in #12787 which I opened a while back.

E.g:
123456 becomes 123_456
.12345 becomes .123_45
0xDEADBEEF becomes 0xDEAD_BEEF
(see test snapshot for more examples)

This rule works for:

  • integers: if they are 5 digits or more (configurable), the rule will require underscores as thousands, millions, billions, etc. separators. This threshold avoid formatting relatively small integers (9999 or less) since they are already readable enough.
  • floats: same rules as integers, and the float part is also formatted with those same rules, but the groups are formed left to right (thousandths, millionths, etc.)
  • hexadecimal notation (0xABCD): add underscores to form groups of 4 digits by default (configurable)
  • octal notation (0o1234): add underscores to form groups of 4 digits by default (configurable)
  • binary notation (0b1010101): add underscores to from groups of 8 bits (octets) by default (configurable)
  • scientific notation (123e10): the leading part is formatted with the same rules as integers. The exponent part is untouched as it should never be more than 3 chars anyway.
  • positive or negative literals: the leading + or - sign is not part of Expr::NumberLiteral instances once parsed by ruff, so this rule does not modify them in any way, they just stay in place.
  • any kind of number already containing separators but in the wrong places: number will be reformatted with the defined configuration.

Support for indian-style number formatting:

According to https://randombits.dev/articles/number-localization/formatting , most of the world groups decimal digits 3 by 3, excepted for India who uses groups of 2 after the first group of 3 (so thousands, hundred of thousands, hundreds of hundreds of thousands, etc.). A configuration option allows enabling this kind of grouping.
I am however not sure about what is the practice for formatting the float part in India. I implemented a "reversed" logic, with separators on thousandth, then hundredth of thousandth, hundredth of hundredth of thousandth, etc. (not sure if anyone ever needed such a float precision ^^'). This may need to be adjusted.

Test Plan

A new test file RUF062.py is part of the PR and is executed on cargo test.

TODO / to discuss

  • rule code: I took RUF062 as it is the next available in the RUFF group, and I selected this group because I did not see such a rule implemented in any other formatter/linter.
  • naming of configuration options should be reviewed and probably improved
  • default config values to be discussed
  • more tests?

@guillp guillp changed the title Rule/large number without separators Rule: large number without underscore separators (PEP 515) May 20, 2025
@MichaReiser MichaReiser added rule Implementing or modifying a lint rule needs-decision Awaiting a decision from a maintainer labels May 20, 2025
@guillp
Copy link
Author

guillp commented Jul 1, 2025

I rebased on current main and changed the code to RUF062 since 061 is now taken by another rule.

@ntBre
Copy link
Contributor

ntBre commented Jul 1, 2025

I think something went wrong with the rebase, as GitHub is now showing more than 500 commits and over 100,000 lines changed, which would make it pretty difficult to review!

@guillp guillp force-pushed the rule/large_number_without_separators branch from d005005 to c30fbe2 Compare July 3, 2025 07:02
@guillp
Copy link
Author

guillp commented Jul 3, 2025

Not sure what I did wrong. But I made some clean-up so there is a single commit on top of current main with all the changes now.
And also I added configuration options for digit group sizes, and also added support for indian-style number formatting.
Edited the PR description to reflect this. Thanks for reviewing this.

@ntBre
Copy link
Contributor

ntBre commented Jul 4, 2025

Thanks for tidying up and for your work on this! I think we'll still want to resolve the needs-decision label on the issue before giving this a full review.

@MichaReiser
Copy link
Member

I agree that we still need to answer the question if we want to have such an opinionated rule that enforces specific grouping of numbers.

I do like clippy's rule that are less opinionated but enforce good practice:

The first two seem very useful to me. It's less clear to me if we want to add any more opinionated rules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants