Skip to content

Conversation

@chromy
Copy link
Contributor

@chromy chromy commented Oct 29, 2025

This resolves EME-473

Both find_base_sha and git_repo_base_ref were not correct for non-Github workflow contexts.

find_base_ref returned previously returned a SHA where the intention of find_base_ref
was to return the nice name of the default merge branch of the remote. So in the common case:

---o---U---o---o---o---H

Where:

  • H is some-branch and
  • U is origin/main and
  • refs/remotes/origin/HEAD is refs/remotes/origin/main

git_repo_base_ref should return main

find_base_sha did not handle the not Github case when it should return the equivalent of:
git merge-base HEAD origin/HEAD. For find_base_sha we want the merge base of
of U and H. In the above example this is the SHA of U. However they can be different:
e.g.

    o---o---o---origin/main
   /
---1---o---o---o---foo

find_base_ref should still return main
but find_base_sha should be the SHA of 1.

@chromy chromy requested review from a team and szokeasaurusrex as code owners October 29, 2025 13:33
@linear
Copy link

linear bot commented Oct 29, 2025

cursor[bot]

This comment was marked as outdated.

@chromy chromy force-pushed the chromy/2025-09-29-fix-base branch 6 times, most recently from 8ca51a4 to a5447c9 Compare October 29, 2025 16:09
cursor[bot]

This comment was marked as outdated.

@chromy chromy force-pushed the chromy/2025-09-29-fix-base branch from a5447c9 to 905715d Compare October 29, 2025 16:14
Copy link
Contributor

@runningcode runningcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find and fix!!

Copy link
Member

@szokeasaurusrex szokeasaurusrex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @chromy, I left a couple comments, but I am finding it difficult to review this change because it appears this PR is combining a bug fix (fixing git_repo_base_ref) and a new feature (adding support for non-GitHub environments to find_base_sha, which previously, was clearly only designed to handle GitHub environments), plus a minor refactor (folding find_merge_base_ref into git_repo_base_ref), all into one.

I would appreciate if you could split this change into two PRs (one for the bug fix, one for the new feature), or possibly three (by splitting the find_merge_base_ref refactor further into its own PR) so I can have a better idea of which changes go together 🙏

I am also confused because the PR description is mentioning a function named find_base_ref, which does not exist. I assume you meant git_repo_base_ref, but would appreciate if you can confirm this and update the PR description accordingly when making the split PRs.

.resolve()?
.shorthand()
.ok_or(anyhow::anyhow!(
"Could not find remote tracking branch for {remote_name}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shorthand() returns None if the shorthand is not valid UTF-8.

Suggested change
"Could not find remote tracking branch for {remote_name}"
"Remote branch name is not valid UTF-8"

Comment on lines 612 to 626
/// Extracts the PR base SHA from GitHub Actions event payload JSON.
/// Returns Ok(None) if not a PR event or if SHA cannot be extracted.
/// Returns an error if we cannot parse the JSON.
fn extract_pr_base_sha_from_event(json_content: &str) -> Result<Option<String>> {
let v: Value = serde_json::from_str(json_content)
.context("Failed to parse GitHub event payload as JSON")?;
/// Returns None if not a PR event or if SHA cannot be extracted.
fn extract_pr_base_sha_from_event(json_content: &str) -> Option<String> {
let v: Value = match serde_json::from_str(json_content) {
Ok(v) => v,
Err(_) => {
debug!("Failed to parse GitHub event payload as JSON");
return None;
}
};

Ok(v.pointer("/pull_request/base/sha")
v.pointer("/pull_request/base/sha")
.and_then(|s| s.as_str())
.map(|s| s.to_owned()))
.map(|s| s.to_owned())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no reason for this function to change. If we want to debug-log the error message rather than error, the calling code should change how it handles the error returned from this function.

Suggested change
/// Extracts the PR base SHA from GitHub Actions event payload JSON.
/// Returns Ok(None) if not a PR event or if SHA cannot be extracted.
/// Returns an error if we cannot parse the JSON.
fn extract_pr_base_sha_from_event(json_content: &str) -> Result<Option<String>> {
let v: Value = serde_json::from_str(json_content)
.context("Failed to parse GitHub event payload as JSON")?;
/// Returns None if not a PR event or if SHA cannot be extracted.
fn extract_pr_base_sha_from_event(json_content: &str) -> Option<String> {
let v: Value = match serde_json::from_str(json_content) {
Ok(v) => v,
Err(_) => {
debug!("Failed to parse GitHub event payload as JSON");
return None;
}
};
Ok(v.pointer("/pull_request/base/sha")
v.pointer("/pull_request/base/sha")
.and_then(|s| s.as_str())
.map(|s| s.to_owned()))
.map(|s| s.to_owned())
}
/// Extracts the PR base SHA from GitHub Actions event payload JSON.
/// Returns Ok(None) if not a PR event or if SHA cannot be extracted.
/// Returns an error if we cannot parse the JSON.
fn extract_pr_base_sha_from_event(json_content: &str) -> Result<Option<String>> {
let v: Value = serde_json::from_str(json_content)
.context("Failed to parse GitHub event payload as JSON")?;
Ok(v.pointer("/pull_request/base/sha")
.and_then(|s| s.as_str())
.map(|s| s.to_owned()))
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, there are two functions:

  1. extract_pr_head_sha_from_event
  2. extract_pr_base_sha_from_event

They ought to match exactly except for which SHA they extract.
Previously they didn't have the same implementation or API Result<Option<String>> vs. Option<String>. It does not make sense for one to handle invalid json and not the other.

Since invalid json from the Github API is (hopefully) rare I don't think it makes sense to push handling the case on to callers (making all calling code more complicated)*. Better options would be:

  • crashing
  • returning None and logging

I would generally lean towards crashing but since extract_pr_head_sha_from_event already had that behavior I matched that.

*There is actually only one real caller so I was tempted to inline these and simplify the code (c.f. http://number-none.com/blow/john_carmack_on_inlined_code.html) but since it has some nice tests I left it.

I have split this into it's own PR here: #2893

@chromy chromy marked this pull request as draft October 30, 2025 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants