[conda] Support Python environments embedded inside Conda environments? #451

Open
jherland opened this issue Sep 4, 2024 · 4 comments
Labels
integration Integrating FawltyDeps with other tools needs-real-projects-test This issue is more easily tackled once we have a project in `real_project` that illustrate the issue research-needed type: feature request

Comments

@jherland

jherland commented Sep 4, 2024

(found while exploring potential Conda support for FawltyDeps, see e.g. #447 for more context)

I'm following the documentation at https://www.activestate.com/resources/quick-reads/how-to-manage-python-dependencies-with-conda/ to see what a Conda environment looks like, and what would be needed from FawltyDeps to use it for deducing the package-name -> import-name mapping.

Specifically, after running the following commands:

conda create --name my_conda_project python=3.8
conda activate my_conda_project
conda install requests

there are no files/directories created inside the current directory (like you would expect with e.g. Poetry or a similar (virtual) environment manager). Instead, there is a my_conda_project/ subdirectory created under ~/.conda/envs/, and inside this directory we find:

❯ ls -al ~/.conda/envs/my_conda_project/
total 44
drwxr-xr-x 11 jherland users 4096 Aug 30 10:39 .
drwxr-xr-x  4 jherland users 4096 Sep  4 16:53 ..
drwxr-xr-x  2 jherland users 4096 Aug 30 10:39 bin
drwxr-xr-x  2 jherland users 4096 Aug 30 10:36 compiler_compat
drwxr-xr-x  2 jherland users 4096 Aug 30 10:39 conda-meta
drwxr-xr-x  8 jherland users 4096 Aug 30 10:36 include
drwxr-xr-x 16 jherland users 4096 Aug 30 10:36 lib
drwxr-xr-x  9 jherland users 4096 Aug 30 10:36 share
drwxr-xr-x  3 jherland users 4096 Aug 30 10:36 ssl
drwxr-xr-x  3 jherland users 4096 Aug 30 10:36 x86_64-conda_cos7-linux-gnu
drwxr-xr-x  3 jherland users 4096 Aug 30 10:36 x86_64-conda-linux-gnu
❯ ls -al ~/.conda/envs/my_conda_project/bin/
total 20812
[...]
-rwxrwxr-x  4 jherland users   976264 Jun 14 19:13 openssl
-rwxr-xr-x  1 jherland users      263 Aug 30 10:36 pip
-rwxr-xr-x  1 jherland users      263 Aug 30 10:36 pip3
lrwxrwxrwx  1 jherland users        8 Aug 30 10:36 pydoc -> pydoc3.8
lrwxrwxrwx  1 jherland users        8 Aug 30 10:36 pydoc3 -> pydoc3.8
-rwxr-xr-x  1 jherland users      117 Aug 30 10:36 pydoc3.8
lrwxrwxrwx  1 jherland users        9 Aug 30 10:36 python -> python3.8
lrwxrwxrwx  1 jherland users        9 Aug 30 10:36 python3 -> python3.8
-rwxr-xr-x  1 jherland users 15176400 Aug 30 10:36 python3.8
-rwxr-xr-x  1 jherland users     3542 Aug 30 10:36 python3.8-config
lrwxrwxrwx  1 jherland users       16 Aug 30 10:36 python3-config -> python3.8-config
lrwxrwxrwx  1 jherland users        4 Aug 30 10:36 reset -> tset
-rwxrwxr-x  4 jherland users  1777144 Apr 30 16:45 sqlite3
-rwxrwxr-x  4 jherland users    30392 May  3 23:10 sqlite3_analyzer
[...]
❯ ls -al ~/.conda/envs/my_conda_project/conda-meta/
total 3048
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 .
drwxr-xr-x 11 jherland users   4096 Aug 30 10:39 ..
-rw-r--r--  1 jherland users   6060 Aug 30 10:39 brotli-python-1.0.9-py38h6a678d5_8.json
-rw-r--r--  1 jherland users   1661 Aug 30 10:36 ca-certificates-2024.7.2-h06a4308_0.json
-rw-r--r--  1 jherland users   7658 Aug 30 10:39 certifi-2024.7.4-py38h06a4308_0.json
-rw-r--r--  1 jherland users  13045 Aug 30 10:39 charset-normalizer-3.3.2-pyhd3eb1b0_0.json
-rw-r--r--  1 jherland users   1534 Aug 30 10:39 history
-rw-r--r--  1 jherland users  10696 Aug 30 10:39 idna-3.7-py38h06a4308_0.json
-rw-r--r--  1 jherland users   2254 Aug 30 10:36 ld_impl_linux-64-2.38-h1181459_1.json
-rw-r--r--  1 jherland users   5884 Aug 30 10:36 libffi-3.4.4-h6a678d5_1.json
-rw-r--r--  1 jherland users    980 Aug 30 10:36 _libgcc_mutex-0.1-main.json
-rw-r--r--  1 jherland users   8544 Aug 30 10:36 libgcc-ng-11.2.0-h1234567_1.json
-rw-r--r--  1 jherland users   2059 Aug 30 10:36 libgomp-11.2.0-h1234567_1.json
-rw-r--r--  1 jherland users   2311 Aug 30 10:36 libstdcxx-ng-11.2.0-h1234567_1.json
-rw-r--r--  1 jherland users 960217 Aug 30 10:37 ncurses-6.4-h6a678d5_0.json
-rw-r--r--  1 jherland users   1327 Aug 30 10:36 _openmp_mutex-5.1-1_gnu.json
-rw-r--r--  1 jherland users  56083 Aug 30 10:37 openssl-3.0.14-h5eee18b_0.json
-rw-r--r--  1 jherland users 371112 Aug 30 10:37 pip-24.2-py38h06a4308_0.json
-rw-r--r--  1 jherland users   5980 Aug 30 10:39 pysocks-1.7.1-py38h06a4308_0.json
-rw-r--r--  1 jherland users 774383 Aug 30 10:37 python-3.8.19-h955ad1f_0.json
-rw-r--r--  1 jherland users  10156 Aug 30 10:37 readline-8.2-h5eee18b_0.json
-rw-r--r--  1 jherland users  19382 Aug 30 10:39 requests-2.32.3-py38h06a4308_0.json
-rw-r--r--  1 jherland users 455715 Aug 30 10:37 setuptools-72.1.0-py38h06a4308_0.json
-rw-r--r--  1 jherland users   3976 Aug 30 10:37 sqlite-3.45.3-h5eee18b_0.json
-rw-r--r--  1 jherland users 179604 Aug 30 10:37 tk-8.6.14-h39e8969_0.json
-rw-r--r--  1 jherland users  33611 Aug 30 10:39 urllib3-2.2.2-py38h06a4308_0.json
-rw-r--r--  1 jherland users  27183 Aug 30 10:37 wheel-0.43.0-py38h06a4308_0.json
-rw-r--r--  1 jherland users  93976 Aug 30 10:37 xz-5.4.6-h5eee18b_1.json
-rw-r--r--  1 jherland users   3443 Aug 30 10:37 zlib-1.2.13-h5eee18b_1.json
❯ ls -al ~/.conda/envs/my_conda_project/lib/python3.8/site-packages/
total 920
drwxr-xr-x 23 jherland users   4096 Aug 30 10:39 .
drwxr-xr-x 35 jherland users  12288 Aug 30 10:36 ..
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 Brotli-1.0.9.dist-info
-rwxr-xr-x  3 jherland users 788808 Apr 30 15:22 _brotli.cpython-38-x86_64-linux-gnu.so
-rw-r--r--  3 jherland users   1857 Apr 30 15:22 brotli.py
drwxr-xr-x  3 jherland users   4096 Aug 30 10:39 certifi
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 certifi-2024.7.4.dist-info
drwxr-xr-x  4 jherland users   4096 Aug 30 10:39 charset_normalizer
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 charset_normalizer-3.3.2.dist-info
drwxr-xr-x  3 jherland users   4096 Aug 30 10:36 _distutils_hack
-rw-r--r--  3 jherland users    151 Aug  5 14:15 distutils-precedence.pth
drwxr-xr-x  3 jherland users   4096 Aug 30 10:39 idna
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 idna-3.7.dist-info
drwxr-xr-x  5 jherland users   4096 Aug 30 10:36 pip
drwxr-xr-x  2 jherland users   4096 Aug 30 10:36 pip-24.2.dist-info
drwxr-xr-x  4 jherland users   4096 Aug 30 10:36 pkg_resources
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 __pycache__
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 PySocks-1.7.1.dist-info
-rw-r--r--  3 jherland users    119 Mar 20 21:07 README.txt
drwxr-xr-x  3 jherland users   4096 Aug 30 10:39 requests
drwxr-xr-x  2 jherland users   4096 Aug 30 10:39 requests-2.32.3.dist-info
drwxr-xr-x  9 jherland users   4096 Aug 30 10:36 setuptools
drwxr-xr-x  2 jherland users   4096 Aug 30 10:36 setuptools-72.1.0-py3.8.egg-info
-rw-r--r--  3 jherland users   3966 Nov 13  2020 sockshandler.py
-rw-r--r--  3 jherland users  31086 Nov 13  2020 socks.py
drwxr-xr-x  5 jherland users   4096 Aug 30 10:39 urllib3
drwxr-xr-x  3 jherland users   4096 Aug 30 10:39 urllib3-2.2.2.dist-info
drwxr-xr-x  5 jherland users   4096 Aug 30 10:36 wheel
drwxr-xr-x  2 jherland users   4096 Aug 30 10:36 wheel-0.43.0.dist-info

This seems to be roughly a superset of a virtual environment: it contains more than just Python packages (which Conda is known to support), but it also contains what we'd expect to find in a virtualenv (except, possibly, a pyvenv.cfg at the root of the env):

  • bin/python
  • lib/pythonX.Y/site-packages/...

Fortunately, for the purposes of matching package names to import names in Python, we can hopefully get away with ignoring the rest of the Conda environment, and focusing only on the subset that it has in common with virtualenvs.
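To make the "common subset" concrete, here is a minimal sketch of locating the site-packages directories under an environment prefix. The function name `find_site_packages` is hypothetical and this is not FawltyDeps's actual implementation. The `Lib/site-packages` fallback is an assumption about Windows layouts that has not been verified in this thread.

```python
from pathlib import Path
from typing import List


def find_site_packages(env_prefix: Path) -> List[Path]:
    """Return the site-packages directories found under an environment prefix."""
    # Layout seen in the listing above: lib/pythonX.Y/site-packages
    candidates = list(env_prefix.glob("lib/python*/site-packages"))
    # Assumption (unverified here): Windows envs use Lib/site-packages instead.
    candidates.append(env_prefix / "Lib" / "site-packages")

    found: List[Path] = []
    for candidate in candidates:
        if candidate.is_dir():
            # Resolve symlinks so that aliased directories are reported once.
            resolved = candidate.resolve()
            if resolved not in found:
                found.append(resolved)
    return sorted(found)


if __name__ == "__main__":
    prefix = Path.home() / ".conda" / "envs" / "my_conda_project"
    for directory in find_site_packages(prefix):
        print(directory)
```

Resolving each candidate before de-duplicating means that a symlinked `lib/pythonX.Y` alias pointing at the real directory does not produce a second entry.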

And this seems to already work, in the sense that passing --list-sources --pyenv ~/.conda/envs/my_conda_project does indeed find ~/.conda/envs/my_conda_project/lib/python3.8/site-packages as a valid Python environment, and FawltyDeps is also able to resolve package names to import names from that directory.

Thus, this preliminary investigation indicates that this issue might already be solved, but I propose keeping it open until we have built some more confidence that we indeed support this properly.
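For building that confidence, a rough cross-check of the package-name -> import-name mapping can be done with the standard library alone. The sketch below uses `importlib.metadata.distributions()` on a given site-packages directory. The helper name `package_to_imports` is hypothetical, and the logic is a simplification (it ignores top-level extension modules, for instance), not a description of how FawltyDeps resolves packages.

```python
from importlib.metadata import distributions
from typing import Dict, List


def package_to_imports(site_packages: str) -> Dict[str, List[str]]:
    """Map distribution names to top-level import names (simplified)."""
    mapping: Dict[str, List[str]] = {}
    for dist in distributions(path=[site_packages]):
        top_level = dist.read_text("top_level.txt")
        if top_level is not None:
            # Some installers record the import names explicitly.
            names = set(top_level.split())
        else:
            # Otherwise, derive them from the list of installed files.
            names = set()
            for file in dist.files or []:
                top = file.parts[0]
                if top in ("..", "__pycache__"):
                    continue
                if top.endswith((".dist-info", ".egg-info")):
                    continue
                if len(file.parts) == 1:
                    if file.suffix != ".py":
                        continue  # simplification: skip non-.py top-level files
                    top = file.stem
                names.add(top)
        mapping[dist.metadata["Name"]] = sorted(names)
    return mapping


if __name__ == "__main__":
    import os

    env = os.path.expanduser("~/.conda/envs/my_conda_project")
    for name, imports in package_to_imports(
        os.path.join(env, "lib", "python3.8", "site-packages")
    ).items():
        print(f"{name} -> {imports}")
```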

@jherland

jherland commented Sep 9, 2024

I've started looking at Pixi (as part of #453 and eventually #454), and it appears to construct environments that are very similar to Conda environments (as laid out above). These are the only differences I can see after a cursory inspection:

  • Pixi stores environments under the project directory (under ./.pixi/envs/...) whereas Conda stores environments in a central location (under ~/.conda/envs).
  • Under conda-meta/... the same kind of .json files are available. Where Conda uses the history file as a log of package installation actions, Pixi does not use this file, and instead has two more files: pixi and pixi_env_prefix with some extra metadata.
  • As with Conda, the Pixi environment's Python executable is made available under bin/, and Python packages are installed under lib/pythonX.Y/site-packages/. However, I notice that in addition to the python3.12 installed by Pixi into my example environment, it also created a couple of python3.1 -> python3.12 symlinks under bin/ and lib/. I suspect this is some kind of ugly workaround for a version-numbering issue in either Conda or Pixi. (My Conda environment used Python 3.8 and had no such symlinks.)
  • Packages installed under lib/pythonX.Y/site-packages/ in these environments look like normal Python packages AFAICS. There are associated lib/pythonX.Y/site-packages/$PACKAGE-$VERSION.dist-info/ directories, with INSTALLER files documenting who installed the package. (Pixi stores uv-pixi in this file and Conda stores conda. For completeness: other installers store strings like uv, pip, and Poetry $VERSION.)
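The INSTALLER observation in the last bullet is easy to reproduce. Here is a minimal sketch (the function name `installers` is hypothetical) that reads every `*.dist-info/INSTALLER` file in a site-packages directory and reports which tool claims to have installed each package:

```python
from pathlib import Path
from typing import Dict


def installers(site_packages: Path) -> Dict[str, str]:
    """Map '<package>-<version>' to the contents of its INSTALLER file."""
    result: Dict[str, str] = {}
    for installer_file in sorted(site_packages.glob("*.dist-info/INSTALLER")):
        # Directory names look like 'requests-2.32.3.dist-info'.
        dist_name = installer_file.parent.name[: -len(".dist-info")]
        result[dist_name] = installer_file.read_text().strip()
    return result


if __name__ == "__main__":
    env = Path.home() / ".conda" / "envs" / "my_conda_project"
    for dist, tool in installers(env / "lib" / "python3.8" / "site-packages").items():
        print(f"{dist}: {tool}")
```

Packages that only ship an `.egg-info` directory (like setuptools in the listing above) have no INSTALLER file and are therefore not reported by this sketch.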

My initial impression is that FawltyDeps (as of v0.17.0) is already able to find and use Python packages in a Pixi environment when e.g. --pyenv .pixi is passed on the command line. In the same way, a Conda environment is supported when passing e.g. --pyenv ~/.conda/envs/my_conda_project.

@jherland

jherland commented Sep 9, 2024

FWIW, when FawltyDeps itself is installed into a Pixi environment (with pixi add --pypi fawltydeps), and run inside it (either with pixi run fawltydeps or from inside a pixi shell), the default SysPathPackageResolver in FawltyDeps is able to automatically find Python packages installed inside the Pixi environment.

I suspect something similar applies when FawltyDeps is installed inside a Conda environment, but I haven't yet tried installing PyPI packages inside a Conda environment (and FawltyDeps is not distributed as a Conda package).

As an interesting corner case, if the Pixi project is using pyproject.toml instead of pixi.toml (pixi init --format pyproject $NAME), AND it is limited to only PyPI dependencies (pixi add --pypi $PACKAGE), then there is really no difference between this Pixi project and any other PEP621-compliant project, and FawltyDeps (when installed as a dependency) will currently work without any changes!

@jherland

A reminder to myself that we also need to look at Pixi/Conda environments on Windows, and how they differ from corresponding Linux environments.

(I assume macOS Conda environments are ~identical to the corresponding Linux environments, just as Python virtualenvs are.)

@jherland

Reading up on Mamba (another modern Conda alternative), it seems that environments in Mamba (Conda too?) can be "stacked" on top of a base environment, where packages installed in the base environment are automatically also available in any environment based on top of that. Also, it appears environments can be stacked when you activate them:

When activating an environment from another, you can choose to stack or not upon the currently activated env.
Stacking will result in a new intermediate prefix (https://mamba.readthedocs.io/en/latest/user_guide/concepts.html#prefix): system prefix < base < env1 < env2.

How this should be handled by FawltyDeps is not entirely clear yet. Worst case, the user should always be able to simply pass multiple --pyenv options to include all environments in the desired stack.
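As a sketch of that worst-case workaround, the helper below builds a FawltyDeps command line with one --pyenv option per environment in the stack. The function name `fawltydeps_args` and the base environment path are hypothetical. Only the --pyenv and --list-sources options mentioned earlier in this thread are used, and the sketch says nothing about how FawltyDeps prioritizes environments that provide the same package.

```python
import os
from typing import List, Sequence


def fawltydeps_args(env_stack: Sequence[str]) -> List[str]:
    """Build an argv list that passes every env in the stack via --pyenv."""
    args = ["fawltydeps", "--list-sources"]
    for prefix in env_stack:
        args += ["--pyenv", prefix]
    return args


if __name__ == "__main__":
    stack = [
        # Hypothetical location of a base environment; adjust as needed.
        os.path.expanduser("~/.conda"),
        os.path.expanduser("~/.conda/envs/my_conda_project"),
    ]
    print(" ".join(fawltydeps_args(stack)))
```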
