Skip to content

Revert "Add draining for cc_uploader (#529)" #544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 8, 2025

Conversation

Samze
Copy link
Contributor

@Samze Samze commented May 7, 2025

This reverts commit 45d2a21.

This original commit introduces the requirement that cc_uploader is co-located on the same VM as the cloud_controller_ng job. While this is the case in cf-deployment, this is not the case for all deployments of CF (like the VMware one).

As this commit has broken our deployment, and its not easy for us simply to relocate the job, we propose we revert this and discuss options going forwards.

The cc_uploader originally was designed an independent process, typically processes that require to be co-located are within the same job. E.g. nginx, cloud_controller_ng and cloud_controller_local_worker.

Some options:

  • Investigate a way to generically drain uploads from any source in nginx rather than specifically in cc_uploader.
  • Find a way to sync draining between jobs without requiring co-location.
  • Keep the co-location requirement for draining but behind a capi-property.

We can discuss options async or at the next office hours (14th May)


Thanks for contributing to the capi_release. To speed up the process of reviewing your pull request please provide us with:

  • A short explanation of the proposed change:

  • An explanation of the use cases your change solves

  • Links to any other associated PRs

  • I have viewed signed and have submitted the Contributor License Agreement

  • I have made this pull request to the develop branch

  • I have run CF Acceptance Tests on bosh lite

This reverts commit 45d2a21.

This original commit introduces the requirement that cc_uploader is
co-located on the same VM as the cloud_controller_ng job. While this is
the case in cf-deployment, this is not the case for all deployments of
CF (like the VMware one).

As this commit has broken our deployment, and its not easy for us simply
to relocate the job, we propose we revert this and discuss options going forwards.

The cc_uploader originally was designed an independent process, typically processes
that require to be co-located are within the same job.
E.g. nginx, cloud_controller_ng and cloud_controller_local_worker.

Some options:
* Investigate a way to generically drain uploads from any source in
  nginx rather than specifically in cc_uploader.
    * Current timeout is [10 seconds](https://github.com/cloudfoundry/cloud_controller_ng/blob/5c4dac049bd28979284aeab1efa48c7075676131/lib/cloud_controller/drain.rb#L9)
* Find a way to sync draining between jobs without requiring
  co-location.
* Keep the co-location requirement for draining but behind a capi-property.
@@ -6,8 +6,6 @@ $LOAD_PATH.unshift('/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/l
require 'cloud_controller/drain'

@drain = VCAP::CloudController::Drain.new('/var/vcap/sys/log/cloud_controller_ng')

@drain.shutdown_cc_uploader('/var/vcap/sys/run/bpm/cc_uploader/cc_uploader.pid')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change orphans some code in CC https://github.com/cloudfoundry/cloud_controller_ng/blob/b922faf661a4203a0089a2a1c80c212c47e75c9c/lib/cloud_controller/drain.rb#L61

I'm fine with this for the short term, but we probably want a PR reverting the CC changes also.

@kathap
Copy link
Contributor

kathap commented May 8, 2025

We can revert the change and see how we can implement this without breaking your deployment, for us a draining of cc-uploader would be valuable. Is there a possibility to have your set up in a test scenario in the capi pipeline so that we have a chance to see that it won't work?

@Samze Samze merged commit a635825 into develop May 8, 2025
2 checks passed
@moleske moleske deleted the revert_cc_uploader_colocation branch May 8, 2025 15:34
@Samze
Copy link
Contributor Author

Samze commented May 8, 2025

We can revert the change and see how we can implement this without breaking your deployment, for us a draining of cc-uploader would be valuable. Is there a possibility to have your set up in a test scenario in the capi pipeline so that we have a chance to see that it won't work?

It would take a lot to replicate how we deploy CF in OSS pipelines, its using custom tooling to build the overall manifest and maintaining it would also be a lot of engineering effort. From our side I think we need to better at reviewing PRs and releases and identifying these possible issues as early as possible.

@Gerg
Copy link
Member

Gerg commented May 8, 2025

@kathap could you please open an issue where we can discuss implementation options? Thanks!

@kathap
Copy link
Contributor

kathap commented May 9, 2025

@kathap could you please open an issue where we can discuss implementation options? Thanks!
I created
cloudfoundry/cloud_controller_ng#4351 for discussions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants