Skip to content

Conversation

@kekcleader
Copy link
Contributor

@kekcleader kekcleader commented Oct 9, 2025

#24

Finalizer: Remove AccountInfo finalizer only when no other finalizers remain and no child Account CRs exist.

Backoff: Use existing rate limiter to requeue with exponential backoff for up to 1 minute; after that, return a non-sentry retryable error (“finalizer not removed yet”).

Tests: Added

  • TestFinalize_Blocks_WhenChildAccountsExist
  • TestFinalize_ReturnsError_AfterOneMinute_WhenChildAccountsExist

Summary by CodeRabbit

  • New Features
    • None
  • Bug Fixes
    • More reliable account deletion and finalization, avoiding premature cleanup when child accounts exist.
    • Reduced unnecessary processing during reconciliation with consistent requeue timing and fewer transient errors.
  • Refactor
    • Simplified authorization cleanup by handling it during finalization and resetting rate-limiter state for smoother retries.
  • Tests
    • Added comprehensive tests covering finalizers, no-op processing, requeue behavior, and cleanup scenarios.
  • Chores
    • Upgraded Go toolchain and builder image for improved stability and compatibility.

@github-actions github-actions bot added the chore label Oct 9, 2025
@kekcleader kekcleader marked this pull request as ready for review October 9, 2025 14:22
@kekcleader kekcleader requested a review from a team as a code owner October 9, 2025 14:22
@kekcleader kekcleader self-assigned this Oct 9, 2025
@platform-mesh platform-mesh deleted a comment from coderabbitai bot Oct 9, 2025
@kekcleader
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Oct 9, 2025

Walkthrough

Updates Go versions in Dockerfile and go.mod. Refactors FGASubroutine to shift work from Process to a new Finalize method with requeue logic based on child accounts and limiter management. Enhances AccountInfo Finalize with deletion-timestamp-aware requeue handling and cross-cluster account listing. Adds unit tests for FGASubroutine finalizers and finalize behavior.

Changes

Cohort / File(s) Summary
Build and version bump
`Dockerfile`, `go.mod`
Updated builder base image to golang:1.24.8; bumped Go version in go.mod to 1.24.8. No other changes.
FGASubroutine implementation
`pkg/subroutines/fga.go`
Introduced dedicated limiter setup; moved initialization/no-op to Process; added Finalize to handle cleanup: obtains cluster/client, lists child accounts, requeues if children exist, performs delete writes via OpenFGA client, and forgets limiter state. Removed prior account-store wiring and tuple construction logic.
AccountInfo finalization adjustments
`pkg/subroutines/accountinfo/accountinfo.go`
Finalize now respects deletion timestamp to gate requeue; lists Accounts in target cluster by owner label; requeues or errors when multiple finalizers and accounts exist; clears limiter when no accounts remain; refined control flow and errors.
FGASubroutine tests
`pkg/subroutines/fga_test.go`
Added tests covering finalizer presence, no-op processing, Finalize requeue when child accounts exist, and tuple deletions via OpenFGA client. Includes mocks for Manager, Cluster, Client, and FGA service and operator-configured contexts.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–75 minutes

Suggested reviewers

  • aaronschweig

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly captures the primary feature change by indicating that finalizer removal is contingent on absence of child accounts, matching the core modifications in the pull request.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/delete-account-info-only-after-accounts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai
Copy link

coderabbitai bot commented Oct 9, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
pkg/subroutines/accountinfo.go (2)

52-59: Magic number: extract the 1-minute threshold as a constant.

The hardcoded 1-minute duration at line 55 should be extracted as a named constant for better maintainability and documentation of this business rule.

Apply this diff to extract the constant:

 const (
 	AccountInfoSubroutineName = "AccountInfoSubroutine"
 	DefaultAccountInfoName    = "account"
+	FinalizerGracePeriod      = 1 * time.Minute
 )

Then update the usage:

 func (r *AccountInfoSubroutine) Finalize(ctx context.Context, ro runtimeobject.RuntimeObject) (ctrl.Result, errors.OperatorError) {
 	cn := MustGetClusteredName(ctx, ro)
 
 	// Determine whether we should keep requeuing or surface an error based on deletion age.
 	requeue := true
 	if ts := ro.GetDeletionTimestamp(); ts != nil {
-		oneMinAgo := v1.Now().Add(-1 * time.Minute)
+		oneMinAgo := v1.Now().Add(-FinalizerGracePeriod)
 		if ts.Time.Before(oneMinAgo) {
 			requeue = false
 		}
 	}

63-68: Duplicate requeue logic: extract to a helper function.

The requeue-or-error decision at lines 63-68 and 75-79 is identical. Extract this pattern into a helper function to improve maintainability.

Add a helper method:

func (r *AccountInfoSubroutine) requeueOrError(cn ClusteredName, requeue bool, reason string) (ctrl.Result, errors.OperatorError) {
	if requeue {
		delay := r.limiter.When(cn)
		return ctrl.Result{RequeueAfter: delay}, nil
	}
	return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("finalizer not removed yet: %s", reason), true, false)
}

Then refactor the Finalize method:

 	// The account info object is relevant input for other finalizers, removing the accountinfo finalizer at last
 	if len(ro.GetFinalizers()) > 1 {
-		if requeue {
-			delay := r.limiter.When(cn)
-			return ctrl.Result{RequeueAfter: delay}, nil
-		}
-		return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("finalizer not removed yet"), true, false)
+		return r.requeueOrError(cn, requeue, "other finalizers still present")
 	}
 
 	accountList := &v1alpha1.AccountList{}
 	if err := r.client.List(ctx, accountList); err != nil {
 		return ctrl.Result{}, errors.NewOperatorError(err, true, true)
 	}
 	if len(accountList.Items) > 0 {
-		if requeue {
-			delay := r.limiter.When(cn)
-			return ctrl.Result{RequeueAfter: delay}, nil
-		}
-		return ctrl.Result{}, errors.NewOperatorError(fmt.Errorf("finalizer not removed yet"), true, false)
+		return r.requeueOrError(cn, requeue, "child accounts still exist")
 	}

Also applies to: 75-79

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c80cf2 and 8059850.

📒 Files selected for processing (6)
  • .gitignore (1 hunks)
  • Dockerfile (1 hunks)
  • go.mod (1 hunks)
  • pkg/subroutines/accountinfo.go (1 hunks)
  • pkg/subroutines/accountinfo_test.go (1 hunks)
  • pkg/subroutines/fga_test.go (0 hunks)
💤 Files with no reviewable changes (1)
  • pkg/subroutines/fga_test.go
🔇 Additional comments (1)
pkg/subroutines/accountinfo.go (1)

70-80: Critical: List operation lacks proper scoping and may fetch accounts from all clusters.

Line 71 calls r.client.List(ctx, accountList) without any list options to filter by workspace, namespace, or parent account. This will likely list ALL accounts across ALL clusters/workspaces, not just the child accounts of the resource being finalized.

Based on the PR objectives ("only remove finalizer in case no child accounts exist"), the intent is to check for child accounts of the current AccountInfo's workspace, not all accounts system-wide.

Run the following script to verify how List is used elsewhere in the codebase and check if there's a pattern for scoping lists to specific workspaces:

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c80cf2 and 8059850.

📒 Files selected for processing (6)
  • .gitignore (1 hunks)
  • Dockerfile (1 hunks)
  • go.mod (1 hunks)
  • pkg/subroutines/accountinfo.go (1 hunks)
  • pkg/subroutines/accountinfo_test.go (1 hunks)
  • pkg/subroutines/fga_test.go (0 hunks)
💤 Files with no reviewable changes (1)
  • pkg/subroutines/fga_test.go
🔇 Additional comments (8)
go.mod (1)

3-3: Version bump looks good.

Updating the module’s Go toolchain to 1.24.8 keeps it aligned with the Docker builder image. No issues spotted.

Dockerfile (1)

2-2: Double-check the Debian base change.

Switching from golang:1.24.x-bullseye to the plain golang:1.24.8 tag also moves the underlying distro from Bullseye to the default (currently Bookworm). Please confirm we’re not depending on Bullseye-specific packages or behavior before merging.

.gitignore (1)

35-36: Nice addition.

Ignoring .gocache keeps local Go build artifacts out of the repo. Looks good.

pkg/subroutines/accountinfo.go (3)

52-59: LGTM! Deletion-age aware requeue logic is correctly implemented.

The requeue flag logic properly checks if the deletion timestamp is older than one minute and disables requeue accordingly. This aligns with the PR objective of using exponential backoff for up to 1 minute before returning a non-Sentry error.


62-68: LGTM! Multiple finalizers handling is correctly implemented.

The logic properly applies the requeue-or-error behavior based on the deletion age: requeues with exponential backoff before 1 minute, then returns a non-Sentry retryable error after 1 minute. This matches the PR objectives.


82-84: LGTM! Early exit path is correctly implemented.

The finalizer removal logic properly cleans up the rate limiter state and returns with no requeue when all conditions are met (assuming the child accounts check is fixed).

pkg/subroutines/accountinfo_test.go (2)

488-514: LGTM! Test correctly verifies requeue behavior when child accounts exist.

The test properly validates that Finalize returns no error and requeues with exponential backoff when child accounts exist and deletion is recent (less than 1 minute old).


516-545: LGTM! Test correctly verifies error behavior after the 1-minute threshold.

The test properly validates that Finalize returns a retryable, non-Sentry error with zero requeue when deletion is older than 1 minute and child accounts still exist. The assertions match the expected behavior per the PR objectives.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 374485b and 06c0925.

📒 Files selected for processing (4)
  • go.mod (1 hunks)
  • pkg/subroutines/accountinfo/accountinfo.go (1 hunks)
  • pkg/subroutines/fga.go (4 hunks)
  • pkg/subroutines/fga_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/subroutines/fga.go (2)
pkg/clusteredname/clusteredname.go (2)
  • ClusteredName (12-15)
  • MustGetClusteredName (31-36)
api/v1alpha1/account_types.go (3)
  • Account (81-87)
  • AccountList (92-96)
  • NamespaceAccountOwnerLabel (29-29)
pkg/subroutines/fga_test.go (8)
pkg/subroutines/fga.go (1)
  • NewFGASubroutine (35-45)
api/v1alpha1/account_types.go (4)
  • Account (81-87)
  • AccountList (92-96)
  • AccountSpec (34-53)
  • AccountTypeAccount (28-28)
pkg/subroutines/mocks/mock_MCManager.go (1)
  • NewManager (25-35)
pkg/subroutines/mocks/mock_K8Cluster.go (1)
  • NewCluster (22-32)
pkg/subroutines/mocks/mock_K8Service.go (1)
  • NewClient (505-515)
pkg/subroutines/mocks/mock_OpenFGAServiceClient.go (1)
  • NewOpenFGAServiceClient (17-27)
pkg/subroutines/accountinfo/accountinfo.go (1)
  • DefaultAccountInfoName (31-31)
api/v1alpha1/account_info_types.go (5)
  • AccountInfo (65-71)
  • AccountInfoSpec (24-30)
  • AccountLocation (36-46)
  • FGAInfo (48-50)
  • StoreInfo (52-54)
pkg/subroutines/accountinfo/accountinfo.go (1)
api/v1alpha1/account_types.go (2)
  • AccountList (92-96)
  • NamespaceAccountOwnerLabel (29-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: pipe / dockerBuild / docker
  • GitHub Check: pipe / testSource / test
  • GitHub Check: pipe / lint / lint
🔇 Additional comments (10)
pkg/subroutines/accountinfo/accountinfo.go (3)

57-63: LGTM! Deletion timestamp logic is correct.

The requeue flag correctly disables requeue when the deletion timestamp is older than one minute, implementing the backoff behavior described in the PR objectives.


65-70: LGTM! Multiple finalizers handling is correct.

The code properly implements the backoff behavior: requeue with exponential backoff when allowed, then return a non-Sentry retryable error after one minute. The error parameters (retryable=true, sentry=false) are appropriate for this expected blocking scenario.


72-91: LGTM! Child account checking logic is correct.

The code properly implements the requirement to remove the finalizer only when no child accounts exist. The logic correctly:

  1. Retrieves the cluster client using the clustered name
  2. Lists child accounts filtered by the owner label
  3. Requeues if child accounts exist (with the same backoff behavior)
  4. Proceeds with finalization and clears the limiter when no child accounts exist

The flow ensures this check only runs when this is the last finalizer, which is the correct behavior.

pkg/subroutines/fga.go (4)

47-51: LGTM! Process method correctly converted to no-op.

The architectural shift to handle FGA setup via workspace initializer (with cleanup moved to Finalize) is clear and well-documented. The debug message provides useful context for developers.


102-133: LGTM! Creator relation deletion and FGA write loop are correct.

The code properly:

  1. Constructs delete tuples for both assignee-to-role and role-to-account relations
  2. Uses formatUser to handle service account naming conventions
  3. Writes deletes individually with appropriate error handling
  4. Gracefully handles duplicate write errors (expected for idempotent deletes)

The error handling strategy is sound for cleanup operations.


136-136: LGTM! Proper limiter cleanup.

Forgetting the limiter state after successful finalization prevents stale rate-limit data from affecting future reconciliations.


36-36: Align maximum backoff delay with PR objectives.

All instances of NewTypedItemExponentialFailureRateLimiter in pkg/subroutines/{fga.go:36, workspace/workspace.go:41, accountinfo/accountinfo.go:42} are set to a 120 s max delay, but the PR calls for up to 1 minute. Adjust to 60 s or confirm that 120 s is intentional.

pkg/subroutines/fga_test.go (3)

26-29: LGTM! Finalizer test is clear and correct.

The test properly verifies that the Finalizers method returns the expected FGA finalizer.


31-36: LGTM! Process no-op test is correct.

The test validates that Process returns an empty result with no error, confirming the expected no-op behavior after the refactoring.


38-68: LGTM! Child account requeue test is comprehensive.

The test properly verifies the blocking behavior when child accounts exist:

  1. Mocks are set up correctly with appropriate expectations
  2. Context includes cluster information required by the code
  3. Verification confirms requeue is scheduled with a positive delay

Good test coverage for the PR's key requirement.

)

fgaMock := mocks.NewOpenFGAServiceClient(t)
fgaMock.EXPECT().Write(mock.Anything, mock.Anything).Return(&openfgav1.WriteResponse{}, nil).Times(2)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix the expected Write call count.

The test expects Write to be called twice (Times(2)), but the code in fga.go constructs three delete tuples when a creator is present:

  1. Parent relation tuple (lines 96-100)
  2. Assignee-to-role tuple (lines 104-108)
  3. Role-to-account tuple (lines 110-114)

Each tuple is written individually in the loop (lines 117-133), resulting in three Write calls.

Apply this diff to fix the test:

-	fgaMock.EXPECT().Write(mock.Anything, mock.Anything).Return(&openfgav1.WriteResponse{}, nil).Times(2)
+	fgaMock.EXPECT().Write(mock.Anything, mock.Anything).Return(&openfgav1.WriteResponse{}, nil).Times(3)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fgaMock.EXPECT().Write(mock.Anything, mock.Anything).Return(&openfgav1.WriteResponse{}, nil).Times(2)
fgaMock.EXPECT().Write(mock.Anything, mock.Anything).Return(&openfgav1.WriteResponse{}, nil).Times(3)
🤖 Prompt for AI Agents
In pkg/subroutines/fga_test.go around line 104 the mock expectation currently
uses Times(2) but fga.go creates and writes three delete tuples when a creator
is present, so update the expectation to Times(3) to match the three individual
Write calls; change the call from Times(2) to Times(3) so the test reflects the
actual number of Write invocations.

Comment on lines 94 to +100
deletes := []*openfgav1.TupleKeyWithoutCondition{}
if account.Spec.Type != v1alpha1.AccountTypeOrg {
parentAccountName := accountInfo.Spec.Account.Name

deletes = append(deletes, &openfgav1.TupleKeyWithoutCondition{
User: fmt.Sprintf("%s:%s/%s", e.objectType, accountInfo.Spec.ParentAccount.OriginClusterId, parentAccountName),
Relation: e.parentRelation,
Object: fmt.Sprintf("%s:%s/%s", e.objectType, accountInfo.Spec.Account.OriginClusterId, account.GetName()),
})
}
parentAccountName := accountInfo.Spec.Account.Name
deletes = append(deletes, &openfgav1.TupleKeyWithoutCondition{
User: fmt.Sprintf("%s:%s/%s", e.objectType, accountInfo.Spec.ParentAccount.OriginClusterId, parentAccountName),
Relation: e.parentRelation,
Object: fmt.Sprintf("%s:%s/%s", e.objectType, accountInfo.Spec.Account.OriginClusterId, account.GetName()),
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify parent account name in delete tuple.

Line 95 assigns accountInfo.Spec.Account.Name to parentAccountName, but line 97 uses it as the parent in the User field. This appears incorrect—the parent's name should come from accountInfo.Spec.ParentAccount.Name instead.

Apply this diff if the parent account name should be sourced from ParentAccount:

-		parentAccountName := accountInfo.Spec.Account.Name
+		parentAccountName := accountInfo.Spec.ParentAccount.Name
 		deletes = append(deletes, &openfgav1.TupleKeyWithoutCondition{
 			User:     fmt.Sprintf("%s:%s/%s", e.objectType, accountInfo.Spec.ParentAccount.OriginClusterId, parentAccountName),
 			Relation: e.parentRelation,

Verify the intended tuple structure by checking the corresponding write logic that created this tuple, or review the FGA model definition.


🏁 Script executed:

#!/bin/bash
sed -n '80,140p' pkg/subroutines/fga.go

Length of output: 2356


🏁 Script executed:

#!/bin/bash
rg -n 'type AccountInfo' -n .

Length of output: 361


🏁 Script executed:

#!/bin/bash
rg -n 'ParentAccount' -C3 api/v1alpha1/account_info_types.go

Length of output: 318


🏁 Script executed:

#!/bin/bash
rg -n 'type AccountLocation' -C3 api/v1alpha1/account_info_types.go

Length of output: 267


Use the parent’s Name when building the delete tuple
parentAccountName is incorrectly sourced from Spec.Account.Name; it must use Spec.ParentAccount.Name so the User field truly references the parent.

-    parentAccountName := accountInfo.Spec.Account.Name
+    parentAccountName := accountInfo.Spec.ParentAccount.Name
🤖 Prompt for AI Agents
In pkg/subroutines/fga.go around lines 94 to 100, the code builds a delete tuple
for the parent User but sources parentAccountName from Spec.Account.Name; change
it to use accountInfo.Spec.ParentAccount.Name (or assign parentAccountName :=
accountInfo.Spec.ParentAccount.Name) so the User field correctly references the
parent account, then use that variable when formatting the User string.

@kekcleader kekcleader marked this pull request as draft October 24, 2025 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants