pre-commit hook + ruff usage #2013
b890199 to 9aba4cf
May be of interest to you @s-boardman - towards consistency between repositories.
@gavinevans - are you happy for us to proceed with this? I can chase up a second review from the technical team. I just wasn't sure whether science would be happy with leaving this as a purely technical-team concern.
I have updated the code style guide to provide more context.
A few more points, I have been testing improver/nbhood/nbhood.py;
- There appears to be an issue with the copyright_check, it is failing when I don't think it should. Can you add a fix option to copyright_check so users don't have to figure out what the correct header is? (this will also tell me if it is working correctly)
- bin/improver_tests still contains references to the old tools black/isort/flake8 (is this used? Can it be removed, or does it need updating?)
- This change appears to reduce the line length from 100 to 88 (it is not totally clear to me why) - note, I am okay with it going to 88. My best guess for the cause is the flake8 configuration in setup.cfg (which is still there and presumably could be removed?)
- There is a formatting change that I think is actually quite bad for readability; can we do anything about this? Specifically, spacing in explicit arrays, e.g.
- [[[ 0.75, 0.75, 0.5 , 0.5 , 0.5 , 0.75, 0.75],
- [ 0.75, 0.55, 0.55, 0.5 , 0.55, 0.55, 0.55],
- [ 0.55, 0.55, 0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
- [ 0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
- [ 0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.55, 0.55],
- [ 0.55, 0.55, 0.55, 0.5 , 0.55, 0.55, 0.75],
- [ 0.75, 0.75, 0.5 , 0.5 , 0.5 , 0.75, 0.75]]],
+ (
+ [
+ [
+ [0.75, 0.75, 0.5, 0.5, 0.5, 0.75, 0.75],
+ [0.75, 0.55, 0.55, 0.5, 0.55, 0.55, 0.55],
+ [0.55, 0.55, 0.5, 0.5, 0.5, 0.5, 0.5],
+ [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
+ [0.5, 0.5, 0.5, 0.5, 0.5, 0.55, 0.55],
+ [0.55, 0.55, 0.55, 0.5, 0.55, 0.55, 0.75],
+ [0.75, 0.75, 0.5, 0.5, 0.5, 0.75, 0.75],
+ ]
+ ],
+ )
Can you give an example of this case that failed when you think it shouldn't have?
The copyright check will already add a copyright header automatically if it is missing. When a header appears to be present but isn't correct, it's very difficult to fix automatically or to present a diff in the general case. Essentially, checking whether a file 'appears' to have a copyright defined is achieved by searching for the term 'copyright' within a comment line ('#'). Beyond that, comments can include arbitrary information, so there is no sure way to know where the incorrect copyright finishes, only that it doesn't match. EDIT: I suppose we could derive a diff based on the number of lines of the correct copyright notice we expect to be there. We could do this, but wouldn't it be simpler to delete the copyright header and have it automatically populated for you (rather than modifying it, I mean)? EDIT2: Since the
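As a rough illustration of the approach described above (not the project's actual script), the sketch below detects an apparent copyright by searching comment lines for the word 'copyright', and derives a diff from the first N lines, where N is the length of the expected notice. `EXPECTED_HEADER`, `appears_to_have_copyright` and `check_header` are hypothetical names, and the header text is a placeholder rather than the real notice.

```python
import difflib

# Placeholder header: the project's real notice is not shown in this thread.
EXPECTED_HEADER = [
    "# (C) Copyright Example Holder.",
    "# Licensed under an example licence.",
]


def appears_to_have_copyright(lines):
    """Return True if any comment line mentions 'copyright' (case-insensitive)."""
    return any(
        line.lstrip().startswith("#") and "copyright" in line.lower()
        for line in lines
    )


def check_header(lines, expected=EXPECTED_HEADER):
    """Classify a file's header as 'ok', 'missing' or 'mismatch'.

    For 'mismatch', also return a unified diff comparing the first
    len(expected) lines of the file with the expected notice.
    """
    head = lines[: len(expected)]
    if head == expected:
        return "ok", []
    if not appears_to_have_copyright(lines):
        # Nothing that looks like a copyright: safe to insert one automatically.
        return "missing", []
    diff = list(
        difflib.unified_diff(
            head, expected, fromfile="found", tofile="expected", lineterm=""
        )
    )
    return "mismatch", diff
```

In the 'mismatch' case the diff is only a hint: because a comment block can contain arbitrary text, the first N lines may not be where the incorrect notice actually ends.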
I'll grep for references to flake8 and black 👍
Good spotting, yes I didn't think to look at setup.cfg for the flake8 config.
Yes, I'm not sure. Will look into it. I think ideally this would be done with an in-line rule exclusion for such things.
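One possible form of in-line exclusion, assuming the formatter honours the `# fmt: off` / `# fmt: on` pragmas (black does, and ruff's formatter documents the same comments), is to wrap the hand-aligned array. The name `EXPECTED` is illustrative only, and a plain nested list stands in for the array here.

```python
# fmt: off
# The formatter leaves everything between the pragmas untouched,
# so the column alignment below survives a formatting run.
EXPECTED = [
    [0.75, 0.75, 0.5 , 0.5 , 0.5 , 0.75, 0.75],
    [0.75, 0.55, 0.55, 0.5 , 0.55, 0.55, 0.55],
    [0.55, 0.55, 0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
    [0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
    [0.5 , 0.5 , 0.5 , 0.5 , 0.5 , 0.55, 0.55],
    [0.55, 0.55, 0.55, 0.5 , 0.55, 0.55, 0.75],
    [0.75, 0.75, 0.5 , 0.5 , 0.5 , 0.75, 0.75],
]
# fmt: on
```

The trade-off is that each such block has to be marked by hand, and the padding before commas is the kind of spacing that lint rule E203 reports, so this relies on E203 staying in the ignore list.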
Pulled back into draft to address some points raised by @SamGriffithsMO (thanks for taking a close look at this).
I just ran it on nbhood/nbhood.py, it already has a header that, to me, looks like it should pass. Let me know if you can't reproduce it
👍
Copyright check fails on that nbhood.py because first line has |
line_length = 88
[tool.ruff.lint]
extend-select = ["E", "F", "W", "I"] # add C90 later
ignore = ["E203", "E731", "E501", "E741"] # remove "E501", "E741" later
Wanted to draw attention to this: Ideally we should work towards removing any exclusions. These exist here now so as not to attempt to cause too many changes (raising standards) whilst first adopting ruff here.
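For readers unfamiliar with the rule codes, the short sketch below shows the kind of code each currently ignored rule would otherwise flag. The variable names are made up for illustration and are not taken from the repository.

```python
# E731: a lambda assigned to a name, where a `def` is preferred.
square = lambda x: x * x

# E741: ambiguous single-character names such as `l`, `O` or `I`.
l = [1, 2, 3]

# E203: whitespace before punctuation. Often ignored because formatters
# deliberately put a space before ':' in slices with complex bounds.
values = list(range(10))
offset = 2
tail = values[offset + 1 :]

# E501: line too long. Ignoring it means lines the formatter cannot wrap
# (long strings, URLs in comments) are not reported.
```

Removing an entry from `ignore` later would surface every existing occurrence at once, which is presumably why the comment in the config defers "E501" and "E741".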
👍
That was a lot of files!
Will you remake the improver_production environment to exclude the now-redundant packages (black etc), or does the environment need to be left unchanged until post-PS46 to leave things genuinely unchanged?
One tiny typo highlighted. I've not looked at every file, but I get the gist of it (thanks to Sam for being more rigorous). Happy with this change, thanks.
As an aside, I note that the copyright check is by far the slowest of the tests to run, so we might optimise that in future (not required here) to save ourselves whole seconds.
Co-authored-by: bayliffe <[email protected]>
09ac849
Keep our existing environments unchanged 👍
Done. Thanks.
I have switched from parsing with Python to using grep within the copyright header check, achieving an order of magnitude improvement in speed (with the minor caveat that the pattern matching is now less strict). Crude benchmark:
xychart-beta
title "copyright header check benchmark (1 sample only)"
x-axis "Number of files" [1, 2, 5, 10, 20, 30, 50, 100, 200, 300, 600]
y-axis "Elapsed time (s)" 0.5 --> 1.3
line [0.60, 0.62, 0.69, 0.66, 0.61, 0.67, 0.72, 0.79, 0.82, 1.01, 1.23]
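To read the benchmark numerically, a crude two-point estimate from the first and last samples separates the fixed start-up cost from the per-file cost. This is only a sketch over the single-sample figures quoted above, so the result is indicative at best.

```python
# Figures copied from the benchmark above (one sample per point).
n_files = [1, 2, 5, 10, 20, 30, 50, 100, 200, 300, 600]
elapsed_s = [0.60, 0.62, 0.69, 0.66, 0.61, 0.67, 0.72, 0.79, 0.82, 1.01, 1.23]

# Slope between the first and last samples: marginal cost of one more file.
per_file_s = (elapsed_s[-1] - elapsed_s[0]) / (n_files[-1] - n_files[0])

# Extrapolate back to zero files: cost paid regardless of file count.
fixed_overhead_s = elapsed_s[0] - per_file_s * n_files[0]

print(f"~{per_file_s * 1000:.2f} ms per file, ~{fixed_overhead_s:.2f} s fixed overhead")
```

On these numbers the run time is dominated by roughly 0.6 s of fixed overhead, with each additional file costing about a millisecond.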
Thanks @cpelley, the copyright check is noticeably faster.
README.md
Ensure that you have python available on the path, then install the pre-commit hook by running `pre-commit install` from within your working copy.
pre-commit checks will run against modified files when you commit from then on.

These pre-commit hooks will run as part of continuous integration to maintain standards in the project.
Which standards? Should this say "code quality standards"?
🤷♂️ I don't see what might be ambiguous myself.
Since I'm making the below suggested change, I'll include this one here too anyway.
Some small cosmetic changes suggested
bb07197
- `improver_tests/test_source_code.py` (removed in this PR).
- `__init__.py` file checks handled via `init_check` script. Auto-fix supported (creates missing `__init__.py` files).
- `ci.yml` formatting while I was at it to increase readability (line spacing between steps).

Note that you can run pre-commit manually (you needn't wait for a commit):
All files:
Specific files:
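The commands that followed the two labels above are not preserved in this copy of the description. As a hedged sketch, assuming the standard pre-commit CLI flags `--all-files` and `--files`, the two invocations can be built and run from Python as follows. `pre_commit_command` and `run_pre_commit` are hypothetical helper names, not part of this PR.

```python
import subprocess


def pre_commit_command(files=None):
    """Build the argv for a manual pre-commit run.

    With no files, every tracked file is checked; otherwise only those given.
    """
    if files:
        return ["pre-commit", "run", "--files", *files]
    return ["pre-commit", "run", "--all-files"]


def run_pre_commit(files=None):
    """Run pre-commit and return its exit code.

    A non-zero code means at least one hook failed or modified a file.
    """
    return subprocess.run(pre_commit_command(files), check=False).returncode
```

For example, `run_pre_commit(["improver/nbhood/nbhood.py"])` would check only that module, mirroring what the hook does for files staged in a commit.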
Issues
Why ruff?
This PR switches our use from black, isort, and flake8 to ruff for code formatting and linting. Ruff is gaining popularity due to its performance, efficiency and flexibility. It combines the functionality of black, isort, and flake8 into a single tool (and more if we enable additional rulesets), reducing the complexity of our tooling setup. Ruff offers excellent performance, faster execution times, and the ability to auto-fix issues, ensuring a consistent codebase. Adopting ruff aligns with industry trends and provides a unified approach to code quality with our other repositories.
Note
Some differences between black and ruff:
https://docs.astral.sh/ruff/formatter/#black-compatibility
(CI now only runs pre-commit on files changed so changes will be made to files as people touch them)