Conversation

@Demch1k Demch1k commented Feb 28, 2025

CHANGE DESCRIPTION

https://perconadev.atlassian.net/browse/K8SPSMDB-1387


Problem:
We enabled --enable-certificate-owner-ref for cert-manager, and since then the MongoDB operator cannot start any MongoDB clusters.

Cause:
The MongoDB operator returns an error when it can't update the owner references on certificate resources. But with --enable-certificate-owner-ref, cert-manager sets them itself.

Solution:
Catch the error raised when an owner reference already exists and just print it out.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported MongoDB version?
  • Does the change support oldest and newest supported Kubernetes version?

CLAassistant commented Feb 28, 2025

CLA assistant check
All committers have signed the CLA.

@Demch1k Demch1k force-pushed the fix-certmanager-owner-ref branch from e219161 to 227c0fe Compare February 28, 2025 12:06
@gkech gkech added the community label Mar 4, 2025
Contributor

@egegunes egegunes left a comment

few comments.

also I wonder if we need to set this flag while deploying cert-manager in our tests

return "", errors.Wrap(err, "set controller reference")
switch errors.Cause(err).(type) {
case *controllerutil.AlreadyOwnedError:
fmt.Sprintf("%s", err)

should we return error here?

return errors.Wrap(err, "set controller reference")
switch errors.Cause(err).(type) {
case *controllerutil.AlreadyOwnedError:
fmt.Sprintf("%s", err)

should we return error here?

}
if err = controllerutil.SetControllerReference(cr, secret, c.scheme); err != nil {
return errors.Wrap(err, "set controller reference")
switch errors.Cause(err).(type) {

@gkech wdyt of this errors.Cause maybe we should check with errors.Is?

Contributor

@gkech gkech Mar 10, 2025

yes it is better @egegunes

@Demch1k let's use errors.Is and also drop the switch, since it is not needed; use the following in all cases.

if err = controllerutil.SetControllerReference(cr, secret, c.scheme); err != nil {
	if errors.Is(err, &controllerutil.AlreadyOwnedError{}) {
		return errors.Wrap(err, "set owner reference")
	}
	return errors.Wrap(err, "set controller reference")
}

@Demch1k any updates on this one?

@github-actions github-actions bot added the stale label Apr 10, 2025
@hors hors added this to the v1.21.0 milestone Apr 14, 2025
@hors hors removed the stale label Apr 14, 2025
@egegunes
Contributor

seems like we will need to take this over, i'm moving this to next release

@egegunes egegunes modified the milestones: v1.21.0, v1.22.0 May 19, 2025
@gkech gkech changed the title from "Fix for certmanager owner ref" to "K8SPSMDB-1387 certmanager --enable-certificate-owner-ref option causes no startup of any mongodb clusters" May 19, 2025
@gkech gkech requested a review from egegunes August 20, 2025 13:15
To fix the issue, we only need to modify the `WaitForCert` method by
adding a check to see if the secret has a controller reference to a
certificate
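
A rough sketch of what such a check could look like, for readers following along. This is not the code from the commit: `OwnerReference` is a trimmed local stand-in for `metav1.OwnerReference`, and `controllerOf` and `controlledByCertificate` are hypothetical helper names; the only facts relied on are that cert-manager's resource kind is `Certificate` in the `cert-manager.io` API group.

```go
package main

import (
	"fmt"
	"strings"
)

// OwnerReference is a trimmed local stand-in for metav1.OwnerReference.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	Controller *bool
}

// controllerOf returns the owner reference marked as controller, or nil if
// the object has no controller.
func controllerOf(refs []OwnerReference) *OwnerReference {
	for i := range refs {
		if refs[i].Controller != nil && *refs[i].Controller {
			return &refs[i]
		}
	}
	return nil
}

// controlledByCertificate reports whether the controller of the object is a
// cert-manager Certificate (hypothetical helper for illustration).
func controlledByCertificate(refs []OwnerReference) bool {
	ref := controllerOf(refs)
	return ref != nil &&
		ref.Kind == "Certificate" &&
		strings.HasPrefix(ref.APIVersion, "cert-manager.io/")
}

func main() {
	isController := true
	secretRefs := []OwnerReference{{
		APIVersion: "cert-manager.io/v1",
		Kind:       "Certificate",
		Name:       "my-cluster-ssl", // made-up name
		Controller: &isController,
	}}
	if controlledByCertificate(secretRefs) {
		fmt.Println("secret is controlled by a Certificate; skip setting the operator as controller")
	}
}
```

Checking up front like this avoids relying on the error returned by `SetControllerReference` at all.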
@pull-request-size pull-request-size bot added size/L 100-499 lines and removed size/S 10-29 lines labels Sep 15, 2025
@Demch1k
Author

Demch1k commented Nov 6, 2025

@egegunes Could you check the changes made by @pooknull? I think it's ready for review.

@Demch1k
Author

Demch1k commented Nov 7, 2025

@pooknull Could you approve this PR?

@JNKPercona
Collaborator

| Test Name | Result | Time |
|---|---|---|
| arbiter | passed | 00:11:25 |
| balancer | passed | 00:17:16 |
| cross-site-sharded | passed | 00:17:53 |
| custom-replset-name | passed | 00:09:42 |
| custom-tls | passed | 00:13:45 |
| custom-users-roles | passed | 00:10:11 |
| custom-users-roles-sharded | passed | 00:11:10 |
| data-at-rest-encryption | passed | 00:12:07 |
| data-sharded | passed | 00:22:11 |
| demand-backup | passed | 00:15:22 |
| demand-backup-eks-credentials-irsa | passed | 00:00:06 |
| demand-backup-fs | passed | 00:22:35 |
| demand-backup-if-unhealthy | passed | 00:07:42 |
| demand-backup-incremental | passed | 00:45:15 |
| demand-backup-incremental-sharded | passed | 00:59:31 |
| demand-backup-physical-parallel | passed | 00:07:58 |
| demand-backup-physical-aws | passed | 00:11:43 |
| demand-backup-physical-azure | passed | 00:11:40 |
| demand-backup-physical-gcp-s3 | passed | 00:12:00 |
| demand-backup-physical-gcp-native | failure | 00:58:16 |
| demand-backup-physical-minio | passed | 00:19:37 |
| demand-backup-physical-sharded-parallel | passed | 00:09:48 |
| demand-backup-physical-sharded-aws | passed | 00:17:50 |
| demand-backup-physical-sharded-azure | passed | 00:17:30 |
| demand-backup-physical-sharded-gcp-native | failure | 01:02:30 |
| demand-backup-physical-sharded-minio | passed | 00:16:12 |
| demand-backup-sharded | passed | 00:23:17 |
| expose-sharded | passed | 00:32:35 |
| finalizer | passed | 00:09:43 |
| ignore-labels-annotations | passed | 00:07:12 |
| init-deploy | passed | 00:12:21 |
| ldap | passed | 00:08:47 |
| ldap-tls | passed | 00:12:49 |
| limits | passed | 00:05:59 |
| liveness | passed | 00:08:00 |
| mongod-major-upgrade | passed | 00:12:18 |
| mongod-major-upgrade-sharded | passed | 00:20:42 |
| monitoring-2-0 | passed | 00:24:35 |
| monitoring-pmm3 | passed | 00:28:19 |
| multi-cluster-service | passed | 00:13:29 |
| multi-storage | passed | 00:18:16 |
| non-voting-and-hidden | passed | 00:16:10 |
| one-pod | passed | 00:07:39 |
| operator-self-healing-chaos | passed | 00:12:50 |
| pitr | passed | 00:31:19 |
| pitr-physical | passed | 01:00:56 |
| pitr-sharded | passed | 00:20:14 |
| pitr-to-new-cluster | failure | 00:27:40 |
| pitr-physical-backup-source | passed | 00:53:12 |
| preinit-updates | passed | 00:04:52 |
| pvc-resize | passed | 00:12:55 |
| recover-no-primary | passed | 00:25:47 |
| replset-overrides | passed | 00:16:12 |
| rs-shard-migration | passed | 00:13:42 |
| scaling | passed | 00:11:05 |
| scheduled-backup | passed | 00:17:06 |
| security-context | passed | 00:06:57 |
| self-healing-chaos | passed | 00:14:55 |
| service-per-pod | passed | 00:18:04 |
| serviceless-external-nodes | passed | 00:07:27 |
| smart-update | passed | 00:08:42 |
| split-horizon | passed | 00:07:35 |
| stable-resource-version | passed | 00:04:40 |
| storage | passed | 00:07:35 |
| tls-issue-cert-manager | passed | 00:29:31 |
| upgrade | passed | 00:09:47 |
| upgrade-consistency | passed | 00:06:06 |
| upgrade-consistency-sharded-tls | passed | 00:51:59 |
| upgrade-sharded | passed | 00:19:09 |
| upgrade-partial-backup | passed | 00:16:01 |
| users | passed | 00:16:47 |
| version-service | passed | 00:25:39 |

| Summary | Value |
|---|---|
| Tests Run | 72/72 |
| Job Duration | 03:26:21 |
| Total Test Time | 22:22:44 |

commit: 4eb33c1
image: perconalab/percona-server-mongodb-operator:PR-1850-4eb33c10

@egegunes
Contributor

The demand-backup-physical-gcp-native, demand-backup-physical-sharded-gcp-native and pitr-to-new-cluster test failures need to be investigated. @Demch1k if you aren't able to understand what's wrong with them, we'll help you when we start working on v1.22.0.

Labels

community size/L 100-499 lines

7 participants