Skip to content

GH-46528: [CI][Dev] Remove "archery lint" #46686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kou
Copy link
Member

@kou kou commented Jun 3, 2025

Rationale for this change

We can use pre-commit instead of "archery lint" for all linters/formatters.

What changes are included in this PR?

  • Remove "archery lint" CI job
  • Remove "archery lint" related codes including IWYU and clang-tidy because IWYU and clang-tidy aren't used for now
  • Remove "archery lint" related docs including IWYU and clang-tidy related docs

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

It's only affected to developers.

We can use pre-commit instead of "archery lint".
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll revisit IWYU in the future.
It's tracked by #46543 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll revisit clang-tidy in the future.
It's tracked by #46542 .

Comment on lines -116 to -124
In order to account for variations in the behavior of ``clang-format`` between
major versions of LLVM, we pin the version of ``clang-format`` used. You can
confirm the current pinned version by finding the ``CLANG_TOOLS`` variable
value in `.env <https://github.com/apache/arrow/blob/main/.env>`_. Note that
the version must match exactly; a newer version (even a patch release) will
not work. LLVM can be installed through a system package manager or a package
manager like Conda or Homebrew, though note they may not offer the exact
version needed. Alternatively, binaries can be directly downloaded from the
`LLVM website <https://releases.llvm.org/>`_.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit installs pined clang-format automatically.

The instructions on how to set up and use Archery
can be found in the Coding Style section of the
:ref:`python-development`.
:ref:`python-coding-style`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ID is added in this PR.

# note the file that should not be styled
styler::style_pkg(exclude_files = c("data-raw/codegen.R"))
```
You can automatically change the formatting of the code in the package using the [styler](https://styler.r-lib.org/) package.
Copy link
Member Author

@kou kou Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll use Air in the future.
It's tracked by #46592 .

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Jun 3, 2025
Copy link
Member

@raulcd raulcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove these bits from the LICENSE and NOTICE files about cpplint code?

arrow/LICENSE.txt

Lines 784 to 793 in acbad29

--------------------------------------------------------------------------------
This project includes code from the Google styleguide.
* cpp/build-support/cpplint.py is based on the scripts from the Google styleguide.
Copyright: 2009 Google Inc. All rights reserved.
Homepage: https://github.com/google/styleguide
License: 3-clause BSD

arrow/NOTICE.txt

Lines 20 to 21 in acbad29

This product includes software from the google-lint project
* Copyright (c) 2009 Google Inc. All rights reserved.

@kou
Copy link
Member Author

kou commented Jun 3, 2025

Good catch! We should remove them. I'll do it.

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jun 3, 2025
Copy link
Member

@raulcd raulcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me, I haven't used archery lint lately since we've been adding more linters to pre-commit. @pitrou any concern here?
I've double checked numpydoc have it's own separate archery command.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Jun 4, 2025
@pitrou
Copy link
Member

pitrou commented Jun 4, 2025

That's fine with me. @kszucs Any comments?

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels Jun 5, 2025

To run the linter locally, install the `{lintr}` package (note, we currently use a fork that includes fixes not yet accepted upstream, see how lintr is being installed in the file [`ci/docker/linux-apt-lint.dockerfile`](https://github.com/apache/arrow/blob/main/ci/docker/linux-apt-lint.dockerfile) for the current status) and then run
Copy link
Member

@thisisnic thisisnic Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this all a lot more concise!
We should leave in information about how to run it other than with pre-commit. Many R users are more comfortable with running lintr directly as it's much simpler (I've personally had issues getting pre-commit set up so won't be the only one), so it'd be great to give them both options.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.
Is it OK that running r/tools/lint.R for non pre-commit option? R developers need to install lintr manually with this approach.
(Can we remove ci/docker/linux-apt-lint.dockerfile? I want to remove ci/docker/linux-apt-lint.dockerfile.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is running lintr::lint_package("arrow/r") directly better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running lintr::lint_package("arrow/r") is better for a non-pre-commit option, yes. arrow/r/lint.R could possibly be removed once it's unhooked from CI. The average R contributor will probably run lintr::lint_package to lint and the core dev team generally runs arrow/r/lint.sh to lint R and C++ at the same time.

I'm not sure we can remove linux-apt-lint.dockerfile just yet. IIRC the way that's set up, it will catch lints in the R packages internal C++ code but the current r.yml workflow just lints the R code so removing linux-apt-lint.dockerfile could cause us to merge R C++ code with lint issues. I think we could modify the R workflow to keep the functionality though. Happy to work on that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Let's use lintr::lint_package("arrow/r").

the core dev team generally runs arrow/r/lint.sh to lint R and C++ at the same time.

Hmm... This PR removes r/lint.sh...

Could you share pre-commit related issues you got? Can we fix them instead of avoiding pre-commit?

I'm not sure we can remove linux-apt-lint.dockerfile just yet. IIRC the way that's set up, it will catch lints in the R packages internal C++ code but the current r.yml workflow just lints the R code so removing linux-apt-lint.dockerfile could cause us to merge R C++ code with lint issues. I think we could modify the R workflow to keep the functionality though.

Oh, I missed that the r.yml workflow also runs lintr:

- name: Run lintr
if: ${{ matrix.config.rversion == 'release' }}
env:
NOT_CRAN: "true"
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
shell: Rscript {0}
working-directory: r
run: |
Sys.setenv(
RWINLIB_LOCAL = file.path(Sys.getenv("GITHUB_WORKSPACE"), "r", "windows", "libarrow.zip"),
MAKEFLAGS = paste0("-j", parallel::detectCores()),
ARROW_R_DEV = TRUE,
"_R_CHECK_FORCE_SUGGESTS_" = FALSE
)
# we use pak for package installation since it is faster, safer and more convenient
pak::local_install()
pak::pak("lintr")
lintr::expect_lint_free()

(BTW, I didn't know that lintr::expect_lint_free() exists. I changed the pre-commit configuration to use lintr::expect_lint_free() and removes r/tools/lint.R.)

I want to remove the "Run lintr" step too because it's already done by pre-commit

- name: Run pre-commit
run: |
pre-commit run --all-files --color=always --show-diff-on-failure
and archery lint
- name: Execute Docker Build
env:
ARCHERY_DOCKER_USER: ${{ secrets.DOCKERHUB_USER }}
ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
UBUNTU: 22.04
run: |
source ci/scripts/util_enable_core_dumps.sh
archery docker run -e GITHUB_ACTIONS=true ubuntu-lint
.

And pre-commit and archery lint check not only R code but also R C++ code. So we will not merge R C++ code with lint issues.

@kszucs
Copy link
Member

kszucs commented Jun 5, 2025

We have duplicated most of the linters in both archery and pre-commit hooks also causing version and configuration mismatches for certain tools, so I strongly support this change.

@github-actions github-actions bot added awaiting change review Awaiting change review awaiting changes Awaiting changes and removed awaiting changes Awaiting changes awaiting change review Awaiting change review labels Jun 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants