-
-
Notifications
You must be signed in to change notification settings - Fork 160
TST: Ensure 100% type-completeness (according to Pyright), check it in CI #1628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
MarcoGorelli
wants to merge
10
commits into
pandas-dev:main
Choose a base branch
from
MarcoGorelli:type-completeness-in-ci
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
5c340ee
wip
MarcoGorelli a7cab15
get to, and enforce, 100% type completeness#
MarcoGorelli 1d6d28d
update
MarcoGorelli dcc45bf
try adding to ci#
MarcoGorelli 0ff0ef6
migrate to poe
MarcoGorelli 61cefd5
dont run on whole matrix
MarcoGorelli 90eb04f
Merge remote-tracking branch 'upstream/main' into type-completeness-i…
MarcoGorelli 6cc3dee
Merge remote-tracking branch 'upstream/main' into type-completeness-i…
MarcoGorelli 36bf549
update to get back to 100%
MarcoGorelli 0ab3299
Merge remote-tracking branch 'upstream/main' into type-completeness-i…
MarcoGorelli File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -58,3 +58,20 @@ jobs: | |
| - uses: actions/checkout@v5 | ||
|
|
||
| - uses: pre-commit/[email protected] | ||
|
|
||
| type_completeness: | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 5 | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v5 | ||
|
|
||
| - name: Install project dependencies | ||
| uses: ./.github/setup | ||
| with: | ||
| # This is quite slow (2-3 minutes) so we don't run it for all OSs / Python versions. | ||
| os: ubuntu-latest | ||
| python-version: 3.14 | ||
|
|
||
| - name: Verify type completeness using Pyright | ||
| run: poetry run poe type_completeness | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,166 @@ | ||
| """Ensure that pandas' public API is type-complete, using Pyright. | ||
|
|
||
| We run Pyright's `--verifytypes` to ensure that type-completeness is at 100%. | ||
|
|
||
| Rather than running the command as-is, we need to make some adjustments: | ||
|
|
||
| - Use `--ignoreexternal` to ignore untyped symbols in dependent libraries: | ||
| https://github.com/microsoft/pyright/discussions/9911#discussioncomment-12192388. | ||
| - We exclude symbols which are technically public (accordinging to Pyright) but which | ||
| aren't in pandas' documented API and not considered public by pandas. There is no | ||
| CLI flag for this in Pyright, but we can parse the output json and exclude paths ourselves: | ||
| https://github.com/microsoft/pyright/discussions/10614#discussioncomment-13543475. | ||
| - We create a temporary virtual environment with pandas installed in it, as Pyright | ||
| needs that to run its `--verifytypes` command. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from fnmatch import fnmatch | ||
| import json | ||
| import os | ||
| from pathlib import Path | ||
| import shutil | ||
| import subprocess | ||
| import sys | ||
| import tempfile | ||
| from typing import Any | ||
|
|
||
| EXCLUDE = [ | ||
| # pandas distributes (untyped) tests with the package | ||
| "*.tests.*", | ||
| "*.conftest.*", | ||
| # pandas.core is technically private, and anything considered public | ||
| # is re-exported in other places. For example, `DataFrameGroupBy` is | ||
| # re-exported in `pandas.api.typing`. The re-exports are available | ||
| # under `'alternateNames'`, which we consider when excluding symbols. | ||
| "pandas.core.*", | ||
| # Not considered public | ||
| # https://github.com/pandas-dev/pandas/blob/e87248e1a5d6d78a138039f2856a3aec6b9fef54/doc/source/reference/index.rst#L34 | ||
| "pandas.compat.*", | ||
| # The only parts of `pandas.io` which appears in the API reference are: | ||
| # - `pandas.io.json` | ||
| # - `pandas.io.formats.style` | ||
| # https://github.com/pandas-dev/pandas/blob/b8371f5e6f329bfe1b5f1e099e221c8219fc6bbd/doc/source/reference/io.rst | ||
| # See also: https://github.com/pandas-dev/pandas/issues/27522#issuecomment-516360201 | ||
| "pandas.io.common.*", | ||
| "pandas.io.parsers.*", | ||
| "pandas.io.excel.*", | ||
| "pandas.io.formats.csvs.*", | ||
| "pandas.io.formats.excel.*", | ||
| "pandas.io.formats.html.*", | ||
| "pandas.io.formats.info.*", | ||
| "pandas.io.formats.printing.*", | ||
| "pandas.io.formats.string.*", | ||
| "pandas.io.formats.xml.*", | ||
| # Not documented, not really part of public API | ||
| "pandas.api.executors.BaseExecutionEngine", | ||
| ] | ||
| THRESHOLD = 1 | ||
|
|
||
|
|
||
| def venv_site_packages(venv_python: str) -> Path: | ||
| """Return the site-packages directory for a given venv Python executable.""" | ||
| cmd = [ | ||
| venv_python, | ||
| "-c", | ||
| "import sysconfig, json; print(sysconfig.get_paths()['purelib'])", | ||
| ] | ||
| out = subprocess.check_output(cmd, text=True).strip() | ||
| return Path(out) | ||
|
|
||
|
|
||
| def run_pyright(venv_path: str) -> dict[str, Any]: | ||
| env = os.environ.copy() | ||
| venv = Path(venv_path) | ||
| bin_dir = venv / ("Scripts" if sys.platform == "win32" else "bin") | ||
| env["PATH"] = f"{bin_dir}{os.pathsep}{env['PATH']}" | ||
| out = subprocess.run( | ||
| [ # noqa: S607 | ||
| "pyright", | ||
| "--verifytypes", | ||
| "pandas", | ||
| "--ignoreexternal", | ||
| "--outputjson", | ||
| ], | ||
| check=False, | ||
| env=env, | ||
| text=True, | ||
| capture_output=True, | ||
| ).stdout | ||
| return json.loads(out) | ||
|
|
||
|
|
||
| def parse_pyright_json(data: dict[str, Any]) -> float: | ||
| symbols = data["typeCompleteness"]["symbols"] | ||
| matched_symbols = [ | ||
| x | ||
| for x in symbols | ||
| if x["isExported"] | ||
| # Keep symbols where there's any name which doesn't match any excluded patterns. | ||
| and any( | ||
| all(not fnmatch(name, pattern) for pattern in EXCLUDE) | ||
| for name in [x["name"], *x.get("alternateNames", [])] | ||
| ) | ||
| ] | ||
| return sum(x["isTypeKnown"] for x in matched_symbols) / len(matched_symbols) | ||
|
|
||
|
|
||
| def main() -> int: | ||
| tmpdir = Path(tempfile.mkdtemp(prefix="pandas-stubs-venv-")) | ||
| venv_dir = tmpdir / "venv" | ||
| try: | ||
| subprocess.run([sys.executable, "-m", "venv", venv_dir], check=True) | ||
|
|
||
| if sys.platform == "win32": | ||
| venv_python = (venv_dir / "Scripts") / "python.exe" | ||
| else: | ||
| venv_python = (venv_dir / "bin") / "python" | ||
|
|
||
| subprocess.check_call([venv_python, "-m", "pip", "install", "-U", "pip"]) | ||
| subprocess.check_call( | ||
| [venv_python, "-m", "pip", "install", "-U", "pyright", "pandas"] | ||
| ) | ||
|
|
||
| site_packages = venv_site_packages(str(venv_python)) | ||
|
|
||
| # Copy stubs into site-packages/pandas. | ||
| dest = site_packages / "pandas" | ||
| pandas_dir = Path(site_packages / "pandas").parent | ||
| tracked_files = subprocess.run( | ||
| ["git", "ls-files"], # noqa: S607 | ||
| check=False, | ||
| capture_output=True, | ||
| text=True, | ||
| ).stdout.splitlines() | ||
| for item in tracked_files: | ||
| if not item.startswith("pandas-stubs"): | ||
| continue | ||
| s = item | ||
| d = pandas_dir / item.replace("pandas-stubs", "pandas") | ||
| d.parent.mkdir(parents=True, exist_ok=True) | ||
| shutil.copy2(s, d) | ||
|
|
||
| # Pyright requires `py.typed` to exist. | ||
| (dest / "py.typed").write_text("\n") | ||
|
|
||
| sys.stdout.write("Running pyright --verifytypes (may take a while)...\n") | ||
| out = run_pyright(str(venv_dir)) | ||
|
|
||
| completeness = parse_pyright_json(out) | ||
|
|
||
| sys.stdout.write("--- Results ---\n") | ||
| sys.stdout.write(f"Completeness: {completeness:.4%}\n") | ||
|
|
||
| if completeness < 1: | ||
| sys.stdout.write(f"Completeness {completeness:.1%} below threshold 100%\n") | ||
| return 1 | ||
| sys.stdout.write("Completeness is at 100% threshold\n") | ||
| return 0 | ||
|
|
||
| finally: | ||
| shutil.rmtree(tmpdir) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| raise SystemExit(main()) |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does that step take? It could be worth adding to the
test_allstep that one would run locally when developing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it's 2 minutes 26 seconds 😩
I know that pyrefly also wants to add this functionality, and maybe we can get ty to add it too, so hopefully there'll be a faster way to check this is in the future
do you still want it in
test_allor is this too slow?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 min is a bit long i would say, I think you would agree too, the mypy step is quite time consuming too so maybe that would add too much friction. Otherwise we could add it in the same style as the nightly steps, so only when we merge, open to options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to be in
test_all, but one should be able to run it locally. Not sure if that is already in there.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it would be already there, I think with the time it takes it is better to just run it on GitHub CI and not in test_all to reduce friction (it is fine if CI takes a bit of time)>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup,
poetry run poe type_completenessruns