some python packages include unintended directories in site-packages/ #33397

smoser opened this issue Nov 5, 2024 · 3 comments

smoser commented Nov 5, 2024

While trying to build a multi-version py3-proto-plus, I found that it could not be installed alongside py3.12-google-resumable-media, because both packages ship usr/lib/python3.12/site-packages/testing/constraints-3.10.txt.

That didn't seem right.

I used https://gist.github.com/smoser/0a11e2643b884960c1e5349d4dc0b8c7#file-get-archive-info to get .flist files (tar -tvf output) for all the packages in Wolfi, to see what other packages had that problem.

#!/usr/bin/gawk -f
# Print the top-level entries in usr/lib/python3.XX/site-packages
# that do not match dist-info (python module info) or __pycache__.
# BEGINFILE/ENDFILE are gawk extensions, so this needs gawk.
BEGINFILE { delete tld }
$6 ~ /usr.lib.python3[.][0-9]+.site-packages\/[^/]*$/ &&
   $6 !~ /dist-info/ &&
   $6 !~ "__pycache__" {
    split($6, r, /\//)
    tld[r[5]]=1
}
ENDFILE { 
  if (length(tld)) {
      buf = ""
      for (key in tld)  {
          buf = buf " " key
      }
      printf("%02d %s%s\n", length(tld), FILENAME, buf)
  }
}

Using the above awk, here are some interesting bits:

$ awk -f my.awk py3.*.flist  | grep testing
03 py3.12-google-resumable-media-2.7.2-r1.flist google build testing
04 py3.12-proto-plus-1.25.0-r0.flist docs build testing proto

$ awk -f my.awk py3*.flist  | grep docs
04 py3.12-google-auth-oauthlib-1.2.1-r1.flist scripts docs build google_auth_oauthlib
04 py3.12-pipenv-2024.3.1-r0.flist docs build examples pipenv
04 py3.12-proto-plus-1.25.0-r0.flist docs build testing proto

$ awk -f my.awk py3*.flist  | grep -w build
01 py3.12-build-1.2.2-r1.flist build
02 py3.12-codespell-2.3.0-r1.flist build codespell_lib
02 py3.12-google-auth-2.36.0-r0.flist google build
04 py3.12-google-auth-oauthlib-1.2.1-r1.flist scripts docs build google_auth_oauthlib
03 py3.12-google-resumable-media-2.7.2-r1.flist google build testing
02 py3.12-googleapis-common-protos-1.65.0-r1.flist google build
04 py3.12-pipenv-2024.3.1-r0.flist docs build examples pipenv
04 py3.12-proto-plus-1.25.0-r0.flist docs build testing proto
02 py3.12-python-gitlab-5.0.0-r0.flist build gitlab
01 py3.12-scikit-build-0.18.1-r1.flist skbuild
01 py3.12-scikit-build-core-0.10.7-r1.flist scikit_build_core
02 py3.12-tqdm-4.66.6-r0.flist build tqdm

So several packages are shipping top-level Python modules that they surely did not intend to ship.
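To see which of these names actually collide between packages, the per-package lines can be folded into a per-name view. Below is a minimal shell sketch, assuming its input is the "NN file.flist name ..." output of the awk script above. shared_top_levels is a hypothetical helper name, not something from the gist. Namespace packages such as google are legitimately shared between packages, so they will show up as well and have to be ignored by eye.

```shell
# shared_top_levels (hypothetical helper): read lines of the form
#   "NN file.flist name name ..."
# on stdin and print every top-level name that is listed by more than
# one package, followed by the .flist files that list it.
shared_top_levels() {
    awk '
    {
        # Fields 3..NF are the top-level names; field 2 is the .flist file.
        for (i = 3; i <= NF; i++) {
            count[$i]++
            owners[$i] = owners[$i] " " $2
        }
    }
    END {
        for (name in count)
            if (count[name] > 1)
                printf("%s:%s\n", name, owners[name])
    }' | sort
}
```

It would be used as: awk -f my.awk py3.*.flist | shared_top_levels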

I opened googleapis/proto-plus-python#503, but haven't gotten to the root cause of why those directories are getting pulled into the installed package. The 'build' directory is the output of a previous 'pip build' run and can be cleaned out with an 'rm' in the pipeline, but the other directories should not be getting in either.

smoser commented Nov 5, 2024

I'm attaching a modified version of the script I attached to googleapis/proto-plus-python#503: do-build.sh.txt.

I build with:

git clone https://github.com/googleapis/proto-plus-python.git
cd proto-plus-python
rm -Rf *
git checkout .

sh ../do-build.sh.txt 

If I build in a clean repo (rm -Rf *; git checkout .) and include setuptools-scm in the packages list, I get the docs/ and testing/ dirs in the .whl. If I exclude it, I do not.
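One way to check a built wheel for this is to compare its top-level entries against the names you expect to ship. Below is a sketch, assuming the wheel's file list arrives on stdin one path per line (for example from unzip -Z1 on the .whl). unexpected_top_levels is a hypothetical helper, and the expected names are whatever you pass as arguments.

```shell
# unexpected_top_levels (hypothetical helper): read archive paths on
# stdin, one per line, and print each top-level entry that is neither
# a *.dist-info / *.data directory nor one of the expected names given
# as arguments.
unexpected_top_levels() {
    expected=" $* "
    awk -F/ '{ print $1 }' | sort -u | while IFS= read -r name; do
        case $name in
            ''|*.dist-info|*.data) continue ;;
        esac
        case $expected in
            *" $name "*) ;;                 # expected, stay quiet
            *) printf '%s\n' "$name" ;;     # anything else is suspicious
        esac
    done
}
```

For proto-plus this might be run as: unzip -Z1 path/to/the.whl | unexpected_top_levels proto
Any output means something other than the intended package landed in the wheel.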

smoser commented Nov 5, 2024

Maybe related to pypa/setuptools-scm#561

smoser commented Nov 6, 2024

OK. @pnasrat I'm interested in your thoughts on what we can do here.
It does seem like setuptools_scm is changing the behavior, resulting in the 'docs/' and 'testing/' directories getting added as packages.

The 'build/' directory is easy enough: we can explicitly clean it out in py/pip-build-install, and I've verified that works. The non-'build' directories will have to be handled another way.
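The 'build/' cleanup could look roughly like the sketch below. This is not the actual py/pip-build-install change, just an illustration of the idea. clean_stale_build is a hypothetical name, and the __init__.py guard is my own assumption: it avoids deleting the directory in a source tree whose real importable package happens to be called build.

```shell
# clean_stale_build (hypothetical helper): remove a leftover build/
# directory from a source tree before building the wheel.
# Assumption: a build/ directory containing __init__.py is a real
# Python package and is left alone.
clean_stale_build() {
    src_dir=${1:-.}
    if [ -d "$src_dir/build" ] && [ ! -e "$src_dir/build/__init__.py" ]; then
        rm -rf -- "$src_dir/build"
    fi
}
```

It would be called from the source directory (clean_stale_build .) before the wheel is built.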
