-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: update release workflow to follow pypa and pypi guidelines #73
base: main
Are you sure you want to change the base?
Conversation
README.md
Outdated
@@ -44,6 +44,22 @@ To use this template: | |||
as your source. You can read more about generating your project | |||
in the [copier documentation](https://copier.readthedocs.io/en/stable/generating/). | |||
## How to run the test suite locally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Friends, bare with me on the instructions. I'm still getting to know copier and am wrapping my head around the test suite works. it seems great as it caught some spacing issues in my workflow! I just don't fully understand it yet so if any of this isn't quite right please say the word!
i'd love review on this (but also see a merge conflict now that i just merged the other pr!! a few points of potential contention
id love your feedback here all @sneakers-the-rat @blink1073 @Midnighter i'd llike to get our template nailed down for spring workshops and then scipy if we get accepted! |
environment: | ||
name: pypi | ||
# Modify the url to be the name of your package | ||
url: https://pypi.org/p/yourPackage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to template this line, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right! I forgot about that. I still need to learn more about how this works with copier. i'll update the PR this week.
template/{% if use_git and dev_platform == 'GitHub' %}.github{% endif %}/workflows/release.yml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested a few minor things. With what you intend to do, it makes sense to turn the release pipeline into a templated file and use copier values to fill in the information. That's the bigger change.
jobs: | ||
prerequisites: | ||
uses: ./.github/workflows/test.yml | ||
|
||
release: | ||
needs: [prerequisites] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part was responsible for running the test suite before publishing. And running the release job only when the test suite is successful. It sounds like you want that kind of behaviour (and I agree it's a good idea), just add these lines again to the jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm torn here. i LIKE this approach as it is robust. I like it best from a learning standpoint because it adds complexity to the CI job - we are calling a CI job within a CI job (reusable workflows), which is another concept for people to learn who may just be learning about tests and CI in general. I'm curious what @agriyakhetarpal and thinks about this.
weighing complexity for those new to packaging vs. a robust release process.
also pinging @ucodery @willingc on this one as it's part education, part packaging complexity but beyond packaging it's CI complexity. cognative load x 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i've been convinced that this approach reduces cognative load as it allows a user to just rerun tests on release. i'll fix my change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it best from a learning standpoint because it adds complexity to the CI job - we are calling a CI job within a CI job (reusable workflows), which is another concept for people to learn who may just be learning about tests and CI in general. I'm curious what @agriyakhetarpal and thinks about this.
weighing complexity for those new to packaging vs. a robust release process.
I would be okay with such a mechanism. However, running the tests (from a built wheel, that is) is something that I've seen more commonly used with compiled packages, and can be generally overkill with pure Python packages, especially when intended for a tutorial for beginners. I'll leave it up to you to remove or keep it. :D
name: >- | ||
Publish Python 🐍 distribution 📦 to PyPI | ||
# Modify the repo name below to be your project's repo name. | ||
if: github.repository_owner == 'pyopensci' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks to me like we need to enable templating for this workflow and insert the organization/repo owner here.
template/{% if use_git and dev_platform == 'GitHub' %}.github{% endif %}/workflows/release.yml
Outdated
Show resolved
Hide resolved
path: dist/ | ||
- name: Publish package to PyPI | ||
# Only publish to real PyPI on release | ||
if: github.event_name == 'release' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this line really needed since the whole workflow only triggers on release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest including it, as (in my opinion) CD/release-related workflows like this are also supposed to run on a periodic basis using a CRON job (with a schedule:
trigger) to see if the build part does not have any issues. This applies much more for packages with compiled codebases where a host of errors from the entirety of their toolchains and supported platforms are the norm, but is also applicable for almost any release workflow, i.e., even for a pure Python package – its dependencies can break, build failures can occur, and so on.
If at all, I would suggest making it more stringent, like this:
if: github.event_name == 'release' | |
if: github.event_name == 'release' && github.event.action == 'published' |
in the hope of making a release-oriented developer more careful (similar to my point above) to check things twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is a triple check. i'm going to call in @webknjaz - Sviatoslav this is a template for pyOpenSci that people can use to quickly create a package structure. I Updated the GitHub action part to reflect what PyPA suggests in terms of security. Here we are having a small friendly debate around how many event checks should go into the build. I'm curious what your take on this is?
More can't hurt, I would presume. it just adds a bit of complexity to the build to explain to people. So i want to make sure we keep things simple yet safe! thank you!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @lwasser! This looks great; as promised, here's a review. I've tried to express my thought process with my suggestions, but please feel free to push back on them as you feel fit, especially for a template that's oriented towards a tutorial. Most of these comments are security-focused.
# This ensures that the publish action only runs in the main repository | ||
# rather than forks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that an if:
condition was probably removed from here, and the comment is a remnant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oooh yes likely that was the case!! i put the publish step into it's own independent step for security!
- name: Checkout | ||
uses: actions/checkout@v4 | ||
with: | ||
# This fetch element is only important if you are use SCM based | ||
# versioning (that looks at git tags to gather the version) | ||
fetch-depth: 100 | ||
# Need the tags so that setuptools-scm can form a valid version number | ||
- name: Fetch git tags | ||
run: git fetch origin 'refs/tags/*:refs/tags/*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified with the fetch-tags:
input, which fetches all tags without needing to fetch up to a higher depth.
- name: Checkout | |
uses: actions/checkout@v4 | |
with: | |
# This fetch element is only important if you are use SCM based | |
# versioning (that looks at git tags to gather the version) | |
fetch-depth: 100 | |
# Need the tags so that setuptools-scm can form a valid version number | |
- name: Fetch git tags | |
run: git fetch origin 'refs/tags/*:refs/tags/*' | |
- name: Checkout | |
uses: actions/checkout@v4 | |
with: | |
# This fetch element is only important if you are use SCM based | |
# versioning (that looks at git tags to gather the version). | |
# setuptools-scm needs tags to form a valid version number | |
fetch-tags: true |
I have to note that this isn't working with ubuntu-24.04
images, though (actions/checkout#2041), but it looks like it will be resolved soon, well before ubuntu-latest
starts pointing to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhhh ok perhaps we hold off if this is bleeding edge and not vetted across platforms. As we do want our users to have a happy experience creating their first package! It's more important that this template is easy to use and understand!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it's not an entirely new feature (it calls a variant of git fetch
underneath), actually. The fix will be rolled out soon enough, reducing one step here, so I'll suggest we keep these, but I'm fine with waiting until we know it's working everywhere!
|
||
- name: Build package | ||
run: hatch build | ||
# Security recommends we should pin deps. Should we pin the workflow version? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please. I'll recommend that all GitHub Actions—whether by GitHub as an actor or external ones available on the marketplace—should be pinned to their commit hashes, at least for a workflow concerning a release. There is a case to be made about how it makes things a bit more challenging for new contributors/packagers, but I think supply-chain security shouldn't be at the end of a compromise.
There are tools such as gha-update
which can replace them with pins, that we can recommend elsewhere (not in this PR, though, of course). Once this is done, I think it's just a matter of letting Dependabot (if configured) take over, as it can handle both the upgrade and the inline version comment in its automated PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great. So, looking at the action here
Ofek suggests that too - we'd use this - this hash is almost the most recent but the most recent is just a readme update.
-
name: Install Hatch
uses: pypa/hatch@a3c83abBut now we'd have to update that hash manually? How often should that update happen?
We should do that as part of this repo's maintenance, but also, we'll want to tell users to update this periodically, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great. So, looking at the action here
Ofek suggests that too - we'd use this - this hash is almost the most recent but the most recent is just a readme update.
- name: Install Hatch
uses: pypa/hatch@a3c83ab
But now we'd have to update that hash manually? How often should that update happen?We should do that as part of this repo's maintenance, but also, we'll want to tell users to update this periodically, right?
Updating it weekly for our maintenance could be a good cadence; what do you think? It would depend on the amount of activity we have on this template.
If we were to ask our users to enable Dependabot in the GitHub settings and inherit the same settings, they should also get the same updates. However, I'm not sure whether it's too much to ask for beginners (😕).
As long as it is pinned for a start, we can proceed with a small guide on how to keep this workflow maintained, where we can describe this more (or link to an external guide if that's alright and vetted by us?).
# Modify the repo name below to be your project's repo name. | ||
if: github.repository_owner == 'pyopensci' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Modify the repo name below to be your project's repo name. | |
if: github.repository_owner == 'pyopensci' | |
# Modify the repo name below to be your project's repo name. | |
if: github.repository == 'pyopensci/pyos-package-template' |
How would you feel about a more stringent check here, like this? The idea is to make users more careful (and it's possible that a fork of the repository within one's organisation can trigger this workflow otherwise).
- name: Download dists | ||
uses: actions/download-artifact@v4 | ||
with: | ||
name: python-package-distributions | ||
path: dist/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To place the source distribution and the wheel next to each other, this change would be required:
- name: Download dists | |
uses: actions/download-artifact@v4 | |
with: | |
name: python-package-distributions | |
path: dist/ | |
- name: Download dists | |
uses: actions/download-artifact@v4 | |
with: | |
name: python-package-distributions | |
path: dist/ | |
merge-multiple: true |
Relevant docs: https://github.com/actions/download-artifact/tree/v4/?tab=readme-ov-file#inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, @agriyakhetarpal. Please tell me more about placing them next to each other. i see in the docs that GitHub created directories for each archive. BUT, it seems to be ok for without.
I may be missing something!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that the download-artifact@v4
action (differing in behaviour from v3
) will download the artifact under a folder dist/python-package-distributions/
, when where we actually want them to be placed is under dist/
itself (i.e., dist/mypackage-version.tar.gz
and dist/mypackage-version-<...>.whl
). Setting merge-multiple: true
takes care of this here.
path: dist/ | ||
- name: Publish package to PyPI | ||
# Only publish to real PyPI on release | ||
if: github.event_name == 'release' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest including it, as (in my opinion) CD/release-related workflows like this are also supposed to run on a periodic basis using a CRON job (with a schedule:
trigger) to see if the build part does not have any issues. This applies much more for packages with compiled codebases where a host of errors from the entirety of their toolchains and supported platforms are the norm, but is also applicable for almost any release workflow, i.e., even for a pure Python package – its dependencies can break, build failures can occur, and so on.
If at all, I would suggest making it more stringent, like this:
if: github.event_name == 'release' | |
if: github.event_name == 'release' && github.event.action == 'published' |
in the hope of making a release-oriented developer more careful (similar to my point above) to check things twice.
# Environment is encouraged for increased security | ||
environment: build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new one for me, so I'm curious. Could you please share how it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! @agriyakhetarpal environments allow you to setup a trusted connection with pypi. This is the build environment, so we don't need to worry too much about security here in terms of PYPI. so anyone can run the build step in theory. but by having a publish environment, we can control who can trigger that step of the build. (if that is what you're asking) .
here is a bit more about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I'm aware of trusted publishing and environments, thanks! My question was more around the need for this build
environment, as it seems to be redundant. Most release workflows can get by with just one pypi
(or publish
, or any other similar name) environment, omitting the need for an environment for the build step. Does including two environments, one for building and one for publishing, have security provisions? Moreover, I imagine that the users would also have questions around why they need (and we recommend) to set up two environments in their repository's settings, when it is only the publishing one that connects to PyPI's OIDC publishing functionalities.
template/{% if use_git and dev_platform == 'GitHub' %}.github{% endif %}/workflows/release.yml
Outdated
Show resolved
Hide resolved
template/{% if use_git and dev_platform == 'GitHub' %}.github{% endif %}/workflows/release.yml
Outdated
Show resolved
Hide resolved
template/{% if use_git and dev_platform == 'GitHub' %}.github{% endif %}/workflows/release.yml
Outdated
Show resolved
Hide resolved
release: | ||
# Setup build separate from publish for added security | ||
# See https://github.com/pypa/gh-action-pypi-publish/issues/217#issuecomment-1965727093 | ||
build: | ||
needs: [prerequisites] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Midnighter i added the test run back -- here before we build the package.
@@ -77,7 +77,7 @@ documentation: | |||
type: str | |||
help: "Do you want to include documentation for your project and which framework do you want to use?" | |||
choices: | |||
"Sphinx (https://www.pyopensci.org/pyos-sphinx-theme)": sphinx | |||
"Sphinx (https://pydata-sphinx-theme.readthedocs.io/en/stable/index.html)": sphinx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a tiny change - pyos-sphinx-theme is specific to our guidebooks. so let's go with the community lead pydata-sphinx-theme instead 🚀 :)
Ok i've made changes and seemingly have broken CI. I wondered if someone can help me understand how to test this locally. i tried:
|
@all-contributors please add @agriyakhetarpal for review |
I've put up a pull request to add @agriyakhetarpal! 🎉 |
Copier by default applies the template from the last tag rather than the local version. A design choice I'm not super happy with either. You will need to use the -r option with your branch name so it uses your last commit instead. |
…% endif %}/workflows/release.yml
…% endif %}/workflows/release.yml
This makes sense. Thank you, @Midnighter !!! |
well it turns out #77 fixes this pr too. I am not sure why CI was passing before 🤷🏻♀️ but the issue was pydoclint can't run without knowing where to run. |
name: python-package-distributions | ||
path: dist/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name: python-package-distributions | |
path: dist/ | |
name: python-package-distributions | |
path: dist/ | |
if-no-files-found: error |
Introducing this option here will make the action fail in the upload step itself, rather than silently passing even if no artifact was found in the specified path. It will avoid proceeding to the publish:
action, which will fail to find an artifact to download otherwise and cause users trouble.
this closes #48 We've been working with the pypa and pypi security folks on our blog post and this pr should address the suggested approaches to publishing using a release based workflow.
Because we are using a release, this assumes that tests are running on the main branch before a release is made. It also allows a dynamic maintainer team to make releases without the command line which is really nice.
NOTE: i haven't yet tested this specific file for issues. i'm not quite sure how - maybe via test pypi?