TemporaryAutoInstallResolver: Use uv if available
Until now TemporaryAutoInstallResolver has used venv.create() to create
the temporary virtualenv and then run `pip install` to install
dependencies into this venv. This works, but is slow.

So slow, in fact, that our tests (test_resolver in particular) have had
to implement caching of the virtualenv simply to reduce the test runtime
from ~60s to ~15s (runtimes vary wildly since this is a hypothesis test,
but these are averages from multiple runs).

When `uv` is available we can use it to both create the temporary
virtualenv, as well as install packages into it. Due to `uv` itself and
the local cache of packages that it maintains, this is now so fast that
we no longer have to cache the virtualenv in test runs (test_resolver
now takes ~7s without caching, and ~6s with caching).

For now, we don't make this configurable: If `uv` is found in $PATH, we
will use it, otherwise we fall back to venv + pip.
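
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `install_argv` and its injectable `which` parameter are hypothetical, introduced only to make the fallback easy to exercise without depending on what is actually installed.

```python
import shutil
import sys
from typing import Callable, List, Optional


def install_argv(
    python_exe: str,
    packages: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build an install command, preferring uv when it is found in $PATH.

    Hypothetical helper for illustration; `which` is injectable so the
    fallback can be tested without touching the real $PATH.
    """
    uv_exe = which("uv")  # None means uv was not found in $PATH
    if uv_exe is None:
        # Fall back to the pip that lives inside the target environment
        prefix = [python_exe, "-m", "pip", "install"]
    else:
        # Ask uv to install into the environment that owns python_exe
        prefix = [uv_exe, "pip", "install", f"--python={python_exe}"]
    return prefix + list(packages)


if __name__ == "__main__":
    # Show which command would be used on this machine
    print(install_argv(sys.executable, ["requests"]))
```

Because the choice is made purely from the `$PATH` lookup, the same call site works on machines with and without `uv`; only the returned command line differs.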
jherland committed Jun 12, 2024
1 parent 9c7db43 commit 591ea1b
Showing 3 changed files with 101 additions and 47 deletions.
19 changes: 12 additions & 7 deletions README.md
@@ -232,23 +232,28 @@ fallback strategy.

#### Mapping by temporarily installing packages

Your local Python environements might not always have all your project's
Your local Python environments might not always have all your project's
dependencies installed. Assuming that you don’t want to go through the
bother of installing packages manually, and you also don't want to rely on
the inaccurate identity mapping as your fallback strategy, you can use the
`--install-deps` option. This will `pip install`
missing dependencies (from [PyPI](https://pypi.org/), by default) into a
_temporary virtualenv_, and allow FawltyDeps to use this to come up with the
correct mapping.
`--install-deps` option. This will automatically install missing dependencies
(from [PyPI](https://pypi.org/), by default) into a _temporary virtualenv_,
and allow FawltyDeps to use this to come up with the correct mapping.

Since this is a potentially expensive strategy (e.g. downloading packages from
PyPI), we have chosen to hide it behind the `--install-deps` command-line
option. If you want to always enable this option, you can set the corresponding
`install_deps` configuration variable to `true` in the `[tool.fawltydeps]`
section of your `pyproject.toml`.

To customize how this auto-installation happens (e.g. use a different package index),
you can use [pip’s environment variables](https://pip.pypa.io/en/stable/topics/configuration/).
FawltyDeps will use [`uv`](https://github.com/astral-sh/uv) by default to
temporarily install missing dependencies. If `uv` is not available, `pip` will be
used instead.

To further customize how this automatic installation is done (e.g. if you need
to use a different package index), you can use environment variables to alter
[`uv`'s](https://github.com/astral-sh/uv?tab=readme-ov-file#environment-variables)
or [`pip`’s ](https://pip.pypa.io/en/stable/topics/configuration/) behavior.

Note that we’re never guaranteed to be able to resolve _all_ dependencies with
this method: For example, there could be a typo in your `requirements.txt` that
109 changes: 76 additions & 33 deletions fawltydeps/packages.py
@@ -1,6 +1,7 @@
"""Encapsulate the lookup of packages and their provided import names."""

import logging
import shutil
import subprocess
import sys
import tempfile
@@ -411,52 +412,92 @@ class TemporaryAutoInstallResolver(BasePackageResolver):
cached_venv: Optional[Path] = None

@staticmethod
def _venv_create(venv_dir: Path, uv_exe: Optional[str] = None) -> None:
"""Create a new virtualenv at the given venv_dir."""
if uv_exe is None: # use venv module
venv.create(venv_dir, clear=True, with_pip=True)
else:
subprocess.run(
[uv_exe, "venv", "--python", sys.executable, str(venv_dir)], # noqa: S603
check=True,
)

@staticmethod
def _venv_install_cmd(venv_dir: Path, uv_exe: Optional[str] = None) -> List[str]:
"""Return argv prefix for installing packages into the given venv.
Construct the initial part of the command line (argv) for installing one
or more packages into the given venv_dir. The caller will append one or
more packages to the returned list, and run it via subprocess.run().
"""
if sys.platform.startswith("win"): # Windows
python_exe = venv_dir / "Scripts" / "python.exe"
else: # Assume POSIX
python_exe = venv_dir / "bin" / "python"

if uv_exe is None: # use `$python_exe -m pip install`
return [
f"{python_exe}",
"-m",
"pip",
"install",
"--no-deps",
"--quiet",
"--disable-pip-version-check",
]
# else use `uv pip install`
return [
uv_exe,
"pip",
"install",
f"--python={python_exe}",
"--no-deps",
"--quiet",
]

@classmethod
@contextmanager
def installed_requirements(
venv_dir: Path, requirements: List[str]
cls, venv_dir: Path, requirements: List[str]
) -> Iterator[Path]:
"""Install the given requirements into venv_dir with `pip install`.
"""Install the given requirements into venv_dir.
We try to install as many of the given requirements as possible. Failed
requirements will be logged with warning messages, but no matter how
many failures we get, we will still enter the caller's context. It is
up to the caller to handle any requirements that we failed to install.
"""
uv_exe = shutil.which("uv") # None -> fall back to venv/pip

marker_file = venv_dir / ".installed"
if not marker_file.is_file():
venv.create(venv_dir, clear=True, with_pip=True)
# Capture output from `pip install` to prevent polluting our own stdout
pip_install_runner = partial(
subprocess.run,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
check=False,
)
if sys.platform.startswith("win"): # Windows
pip_path = venv_dir / "Scripts" / "pip.exe"
else: # Assume POSIX
pip_path = venv_dir / "bin" / "pip"
cls._venv_create(venv_dir, uv_exe)

def install_helper(*packages: str) -> int:
"""Install the given package(s) into venv_dir.
Return the subprocess exit code from the install process.
"""
argv = cls._venv_install_cmd(venv_dir, uv_exe) + list(packages)
proc = subprocess.run(
argv, # noqa: S603
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
check=False,
)
if proc.returncode: # log warnings on failure
logger.warning("Command failed (%i): %s", proc.returncode, argv)
if proc.stdout.strip():
logger.warning("Output:\n%s", proc.stdout)
return proc.returncode

argv = [
f"{pip_path}",
"install",
"--no-deps",
"--quiet",
"--disable-pip-version-check",
]
proc = pip_install_runner(argv + requirements)
if proc.returncode: # pip install failed
logger.warning("Command failed: %s", argv + requirements)
if proc.stdout.strip():
logger.warning("Output:\n%s", proc.stdout)
if install_helper(*requirements): # install failed
logger.info("Retrying each requirement individually...")
for req in requirements:
proc = pip_install_runner([*argv, req])
if proc.returncode: # pip install failed
if install_helper(req):
logger.warning("Failed to install %s", repr(req))
if proc.stdout.strip():
logger.warning("Output:\n%s", proc.stdout)

marker_file.touch()
yield venv_dir

@@ -485,11 +526,13 @@ def lookup_packages(self, package_names: Set[str]) -> Dict[str, Package]:
to provide the Package objects that correspond to the package names.
"""
if self.cached_venv is None:
# Use .temp_installed_requirements() to create a new virtualenv for
# installing these packages (and then automatically remove it).
installed = self.temp_installed_requirements
logger.info("Installing dependencies into a temporary Python environment.")
# If self.cached_venv has been set, then use that path instead of creating
# a temporary venv for package installation.
else:
# self.cached_venv has been set, so pass that path directly to
# .installed_requirements() instead of creating a temporary dir.
installed = partial(self.installed_requirements, self.cached_venv)
logger.info(f"Installing dependencies into {self.cached_venv}.")
with installed(sorted(package_names)) as venv_dir:
20 changes: 13 additions & 7 deletions tests/test_resolver.py
@@ -1,5 +1,6 @@
"""Verify behavior of packages resolver."""

import shutil
import sys
import time
from pathlib import Path
@@ -179,15 +180,20 @@ def test_resolve_dependencies__generates_expected_mappings(

isolate_default_resolver(installed_deps)

# Tell TemporaryAutoInstallResolver to reuse our cached venv, instead of
# potentially creating a new venv for every test case.
cached_venv = Path(
request.config.cache.mkdir(
f"fawltydeps_reused_venv_{sys.version_info.major}.{sys.version_info.minor}"
# If we're using `venv.create()` to create virtualenvs and `pip install` to
# populate them, we will waste a lot of time recreating and repopulating
# virtualenvs in these tests. This is not the case for `uv` which caches
# packages locally, and is simply much faster than venv/pip. Therefore, tell
# TemporaryAutoInstallResolver to reuse our cached venv, instead of creating
# a new venv for every test case, but only when we're not using uv...
if shutil.which("uv") is None:
cached_venv = Path(
request.config.cache.mkdir(
f"fawltydeps_reused_venv_{sys.version_info.major}.{sys.version_info.minor}"
)
)
)
try:
TemporaryAutoInstallResolver.cached_venv = cached_venv
try:
actual = resolve_dependencies(
dep_names,
setup_resolvers(