Skip to content

Improve App Hosting compute service account flow for source deploys #8785

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 1, 2025

Conversation

blidd-google
Copy link
Contributor

In the firebase init apphosting flow, we preemptively enable ensure the App Hosting compute service accounts exists and add the required IAM roles. However, we don't actually need the compute service account for any of the init steps. So instead of failing the init and deploy flows when a service account is still being provisioned, we optimistically proceed with the remaining steps.

@blidd-google blidd-google changed the title improve App Hosting compute service account flow for source deploys Improve App Hosting compute service account flow for source deploys Jun 25, 2025
);
spinner.succeed("App Hosting compute Service account is ready");
await ensureAppHostingComputeServiceAccount(projectId, /* serviceAccount= */ "");
spinner.succeed("App Hosting compute service account is ready");
} catch (err) {
if ((err as FirebaseError).status === 400) {
Copy link
Contributor

@annajowang annajowang Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know for sure that a 400 error always means that the account is still being provisioned or could it be another error ? I don't know where this 400 is coming from

Copy link
Contributor

@annajowang annajowang Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, if i'm reading the internal chat thread correctly, 400 is from the chunk of code you've deleted in 281 - 302 ?

it seems like we'd want to do either of the following:

  1. keep the IAM role setting, but ignore any 400 error from it and just log a warning saying:
    a) the SA is being provisioned in the background and
    b) IAM roles needed for other actions may not be ready yet.
    and I assume, at some later point when the user does need the iam roles, the CLI will attempt to set them again for the SA? in which case i feel like (b) is optional
  2. OR get rid of IAM role setting (as you've done in this PR). raise any errors that come from provisioning the SA. or if no errors, we just log that the SA is being provisioned in the background.

Copy link
Contributor Author

@blidd-google blidd-google Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it seems the IAM API errors are somewhat non-deterministic; the addServiceAccountToRoles API call will sometimes return the 400 error when the service account doesn't exist yet, causing the init flow to error out. This is what Kiana was running into.

If it is just an error due to propagation delay, we would prefer not to interrupt the init flow, so I added the logic to ignore the 400 error. However, I think it is also valid to fail out on any error in case the issue isn't related to propagation delay.

Copy link
Contributor

@annajowang annajowang Jun 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok got it addServiceAccountToRoles is still being called in provisionDefaultComputeServiceAccount.

If we know that this 400 is coming from addServiceAccountToRoles, can we do this try-catch for the 400 in provisionDefaultComputeServiceAccount.

ideally, the errors should be handled as close as possible to the source. putting it here gives the reader no context, makes maintenance difficult, and could potentially try-catch errors we should not be catching.

@blidd-google blidd-google requested a review from annajowang July 1, 2025 18:22
Copy link
Contributor

@annajowang annajowang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a final comment on error handling refactoring

thanks for fixing this!

@@ -267,39 +266,14 @@ export async function ensureAppHostingComputeServiceAccount(
} catch (err: unknown) {
if (!(err instanceof FirebaseError)) {
throw err;
}
if (err.status === 404) {
await provisionDefaultComputeServiceAccount(projectId);
} else if (err.status === 403) {
throw new FirebaseError(
`Failed to create backend due to missing delegation permissions for ${sa}. Make sure you have the iam.serviceAccounts.actAs permission.`,
{ original: err },
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we got rid of that final throw err in apphosting.ts, i believe we need to add throw err here for any error that passed the first two if statements?

And then with that, I'm wondering if the error handling in prepare.ts is no longer needed and can be removed?

@blidd-google blidd-google merged commit 740da34 into master Jul 1, 2025
61 of 64 checks passed
@blidd-google blidd-google deleted the bl-fah-source-deploy-compute-sa branch July 1, 2025 20:43
@github-project-automation github-project-automation bot moved this from Approved [PR] to Done in [Cloud] Extensions + Functions Jul 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants