-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run release_commit job sequentially to avoid race #8508
base: main
Are you sure you want to change the base?
Conversation
Is this our only option? While there might be race conditions in that one job pushes before or after the other one, the rebase should avoid a failure. The other files are obviously changed, did you have any success figuring out if symlinks don't work well during rebases? |
I think it's our best option. I didn't find an ways to fix the rebase with symlinks. Even if it the rebase could be fixed, I don't think the complexity is worth it. Also, we'd still have race conditions since we'd still be doing I/O in parallel without any locking mechanism. |
The jobs are running on different VMs, so we wouldn't have parallel I/O. I think what I did wrong is:
Other race conditions are protected by git itself. Feel free to refactor the link change into a function if you like. diff --git a/.github/workflows/shippable_builds.yml b/.github/workflows/shippable_builds.yml
index 2c953ad53..aee72beb0 100644
--- a/.github/workflows/shippable_builds.yml
+++ b/.github/workflows/shippable_builds.yml
@@ -347,20 +347,23 @@ jobs:
git config --global user.name "GitHub Actions Bot"
git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
+ # Pull before we make changes, otherwise changes to the symlink might not be detected
+ git pull
+
# We need the metadata to point to the right application for the release commit
rm metadata
ln -sf app-metadata/${APPLICATION_ID} metadata
+ git add metadata
+ ls -l metadata
- # Add changelogs, build version changes and metadata symlink
+ # Add changelogs, build version changes
git add ./app-metadata/${APPLICATION_ID}/en-US/changelogs/*
git add ./app-${APP_NAME}/src/main/res/raw/changelog_master.xml
git add ./app-${APP_NAME}/build.gradle.kts
- git add metadata
- # Ready to commit. Make sure to pull again to reduce likelihood of race conditions
+ # Ready to commit
git status
git commit -m "Release: ${APPLICATION_LABEL} ${FULL_VERSION_NAME}"
- git pull --rebase -X ours
git log -n 5
set +e
@@ -371,7 +374,17 @@ jobs:
if [ $GIT_RESULT -gt 0 ]; then
echo "Push rejected, trying again once in 5 seconds"
sleep 5
- git pull --rebase -X ours
+
+ # theirs/ours is flipped on rebases, prefer what we did
+ git pull --rebase -X theirs
+
+ # Do the symlink again in case there is no change before the rebase
+ rm metadata
+ ln -sf app-metadata/${APPLICATION_ID} metadata
+ git add metadata
+ git commit --amend --no-edit
+ ls -l metadata
+
git log -n 5
git push
fi
|
There is parallel I/O in the sense that we have 2 VMs running in parallel that are reading and writing to the same repository. There is still a race condition in your proposal between the 'git pull --rebase -X theirs' and 'git push'. |
There is a chance for a race condition, but it would simply cause the job to fail and we can re-run it. This would only happen in case something external to the release process pushes in the wrong moment. Under normal circumstances, I can't come up with a scenario where in parallel:
Either way, we wouldn't have a situation where the job succeeds but the SHA we use going forward contains the wrong thing. However, if or if not there is a conflict is a red herring. f-droid had updated the com.fsck.k9 listing with "Thunderbird Beta", which means that the symlink would need to point to net.thunderbird.android.beta. I originally pointed you to the release build, apologies. We need to look at https://github.com/thunderbird/thunderbird-android/actions/runs/11576907642 What ultimately went wrong in the build case was:
What would happen with the patch above is: Scenario A:
Scenario B:
Are there any cases I missed? |
On 11/5/24 4:45 PM, Philipp Kewisch wrote:
There is a chance for a race condition, but it would simply cause the
job to fail and we can re-run it. This would only happen in case
something external to the release process pushes in the wrong moment.
Under normal circumstances, I can't come up with a scenario where in
parallel:
1. the first push fails because the other job has pushed between the
first pull/push cycle (this can happen)
2. the second pull --rebase is triggered
3. another job (which one?) pushes to create a conflict
4. the second push fails because of a conflict.
I don't know if we've experienced it yet but there is a race where:
1. first push occurs after the 'git pull --rebase -X theirs'
2. the second push is attempted and fails
So it would fail, yes, and we can recover and re-run the actions. But do
we want to deal with potential failures? Running that matrix job
sequentially should remove any race concerns, but does have a cost of
about 6 extra minutes, in my testing.
Either way, we wouldn't have a situation where the job succeeds but
the SHA we use going forward contains the wrong thing.
However, if or if not there is a conflict is a red herring. f-droid
had updated the com.fsck.k9 listing with "Thunderbird Beta", which
means that the symlink would need to point to
net.thunderbird.android.beta. I originally pointed you to the release
build, apologies. We need to look at
https://github.com/thunderbird/thunderbird-android/actions/runs/11576907642
I agree. I just think the code could be simplified and I was asked to
ensure this issue doesn't happen again. Simplifying the code and running
sequentially gets rid of a lot of extra code including the automated
merge, which is where the issue occurred.
What ultimately went wrong in the build case was:
1. TfA bumps first in 5e5b1f2
<https://github.com/thunderbird/thunderbird-android/blob/5e5b1f2eeabb82a7e14a4baf08854d919c41d8ec/metadata>
and changes to net.thunderbird.android.beta
2. K-9 goes second and was on
e2f84a6
<https://github.com/thunderbird/thunderbird-android/blob/e2f84a6802f27d8d021e50d8e34e101cc59ade45/metadata>
prior, where it is already com.fsck.k9
3. K-9 removes and re-adds the symlink, but no change occurs, so
metadata is not part of the intermediate commit b58ffe50f
4. K-9 rebases, which pulls in
5e5b1f2
<https://github.com/thunderbird/thunderbird-android/blob/5e5b1f2eeabb82a7e14a4baf08854d919c41d8ec/metadata>,
which updates the symlink to net.thunderbird.android.beta, and
b58ffe50f becomes 017307e
<https://github.com/thunderbird/thunderbird-android/blob/017307eed74edc59da35eaa0c2e396e1b7f94f7a/metadata>.
5. No further changes to the symlink occur.
What would happen with the patch above is:
Scenario A:
1. TfA bumps first and changes to net.thunderbird.android.beta
2. K-9 goes second, pulls the first time before it changes the
symlink, changes the symlink from n.t.a.b to c.f.k9, pushes
Scenario B:
1. TfA bumps first and changes to net.thunderbird.android.beta
2. K-9 has already pulled prior, the symlink is not part of the
commit because it remains c.f.k9. git causes the push to fail
because it isn't up to date.
3. K-9 does git pull --rebase -X theirs (which could simply be git
pull --rebase tbh), which makes the symlink n.t.a.b, THEN changes
the symlink to c.f.k9, amends its latest commit to include the
changed metadata, then pushes again.
Are there any cases I missed?
It seems those are typical sequence of events that we would have seen so
far. I don't know if that will always be the sequence of events.
I think you feel strongly on your approach, and I do on mine. Do you
want to submit a patch for your approach? I can review it, and if it
works, I'll approve and we can move forward.
|
No description provided.