Target plugin for IBMCloud (VSI) Instance Group #1056

Open
wants to merge 3 commits into
base: main
from

Conversation

chickenandpork

@chickenandpork chickenandpork commented Mar 13, 2025

This PR brings in a target plugin to scale an IBMCloud "instance group". Consider the following:

(Terraform)

resource "ibm_is_instance_group" "autoscale_group" {
  name = "autoscale-group"

  instance_count    = 1
  instance_template = <some template ID>
  subnets           = [ <subnet.id>, <subnet.id> ]
}

NOTE: this will create an instance group with an ID such as r006-faceb00c-feed-d00d-beef-123412341234

(nomad-autoscaler config ${NOMAD_TASK_DIR}/config.hcl)

nomad {
  address = "http://{{env "attr.unique.network.ip-address" }}:4646"
}

apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://{{ range service "prometheus" }}{{ .Address }}:{{ .Port }}{{ end }}"
  }
}

strategy "target-value" {
  driver = "target-value"
}

target "ibmcloud-ig" {
  driver = "ibmcloud-ig"
  config = {
    api_key = "${ibm_api_key}"  # an IBM API Key: I'm merging mine at deployment
  }
}

(nomad-autoscaler config ${NOMAD_TASK_DIR}/policies/autoscale-group-policy.hcl)

scaling "cluster_policy" {
  enabled = true
  min     = 0
  max     = 3

  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "utilization" {
      source       = "prometheus"
      query        = "<a Prometheus query yielding a scalar matching the unit of 'strategy.target' below>"
      query_window = "2m"

      strategy "target-value" {
        target = 80
      }
    }

    target "ibmcloud-ig" {
      dry-run             = false
      instance_group_id   = "r006-faceb00c-feed-d00d-beef-123412341234" # instance group ID from above
      node_drain_deadline = "5m"
    }
  }
}

I've tested this using a label as a metric and watched the autoscaler exercise the target: it tried to scale an Instance Group up to 20, was limited to the max of 3 that I had set, and then scaled back down to 0.
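The capping behaviour described above (a desired count of 20 held to the policy max of 3, then a drop to 0) can be pictured with a small sketch. This is only an illustration: `clampCount` and its parameter names are hypothetical and are not taken from nomad-autoscaler or from this plugin.

```go
package main

import "fmt"

// clampCount bounds a desired instance count to the policy's min and max.
// Hypothetical helper for illustration only; not code from nomad-autoscaler.
func clampCount(desired, minCount, maxCount int64) int64 {
	if desired < minCount {
		return minCount
	}
	if desired > maxCount {
		return maxCount
	}
	return desired
}

func main() {
	// Policy bounds from the example policy: min = 0, max = 3.
	fmt.Println(clampCount(20, 0, 3)) // the strategy asks for 20, the policy allows 3
	fmt.Println(clampCount(0, 0, 3))  // scaling back to 0 is within bounds
}
```

With the example policy, a request for 20 instances comes out as 3, which matches what the test run showed.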

This is currently in test at my employer, but I'm contributing it to HashiCorp with my supervisor's permission. I'm willing to follow this with a contribution to nomad-autoscaler-demos. I can document the process in an article as well.

@chickenandpork chickenandpork requested review from a team as code owners March 13, 2025 22:14

hashicorp-cla-app bot commented Mar 13, 2025

CLA assistant check
All committers have signed the CLA.

@chickenandpork chickenandpork changed the title from "Feat/provide target plugin for IBMCloud (VSI) Instance Group" to "feat: Provide target plugin for IBMCloud (VSI) Instance Group" Mar 13, 2025
Member

@schmichael schmichael left a comment

Thanks for the contribution! The error handling doesn't need newlines and could use some rewording in places. Hopefully the comments I left guide you in the right direction.

Comment on lines 59 to 64
instance_group_id, ok := n.getConfig(configKeyInstanceGroupID, config)
if !ok {
	return fmt.Errorf("required config param %s not found", configKeyInstanceGroupID)
}

api_key, ok := n.getConfig(configKeyAPIKey, config)
Member

instanceGroupID and apiKey would be more idiomatic Go names, but no big deal. Not a blocker!

Author

fixed in ceb5459

// hit that client handle to query information about the victim Instance Group
ig, _, err := n.vpc.GetInstanceGroup(&vpcv1.GetInstanceGroupOptions{ID: &instance_group_id})
if err != nil {
fmt.Errorf("IG: %s: Exception in GetInstanceGroup: %v\n", instance_group_id, err)
Member

Suggested change
- fmt.Errorf("IG: %s: Exception in GetInstanceGroup: %v\n", instance_group_id, err)
+ return fmt.Errorf("failed to GetInstanceGroup for %s: %w", instance_group_id, err)

Author

fixed in ceb5459

instanceGroupPatchModel.MembershipCount = core.Int64Ptr(int64(action.Count))
instanceGroupPatch, err := instanceGroupPatchModel.AsPatch()
if err != nil {
return fmt.Errorf("IG: %s: Exception in instanceGroupPatchModel: %v\n", instance_group_id, err)
Member

Suggested change
- return fmt.Errorf("IG: %s: Exception in instanceGroupPatchModel: %v\n", instance_group_id, err)
+ return fmt.Errorf("error creating patch for instance group %s: %w", instance_group_id, err)

Author

fixed in ceb5459

@chickenandpork
Author

chickenandpork commented Mar 14, 2025

Thanks @schmichael! My really bad Go code here is a bit humbling -- "how did this ever build?" -- thanks for the suggestions. I changed the identifiers (I was thinking too much like the Terraform code) and tried to convert the exception messages to match your examples.

Down the road I would like to dedupe some of the code (most of the instanceID, apiKey -> NewVpcV1(...) stuff) but I wanted to share this sooner.

@chickenandpork chickenandpork changed the title from "feat: Provide target plugin for IBMCloud (VSI) Instance Group" to "Target plugin for IBMCloud (VSI) Instance Group" Mar 14, 2025
@chickenandpork
Author

@schmichael changes are made to address all your suggestions -- mostly wholesale accepting your suggestions (thanks) -- is there anything else we need to merge this?

@chickenandpork
Author

chickenandpork commented Mar 26, 2025

@schmichael while this awaits permission to begin workflows, I'm curious whether you prefer that I rebase locally and force-push, or merge main on top of this PR. Can you advise on your preferred workflow?

@jeffmccollum

jeffmccollum commented Apr 7, 2025

Hello @chickenandpork
I've been testing this out and found the config on the plugin should be api_key not ibm_api_key in your example above.
Otherwise I didn't have any issues scaling up and down in IBM Cloud.
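To see why the key name matters, here is a rough sketch of a required-key lookup that fails when the config block uses `ibm_api_key` instead of `api_key`. It assumes the constant `configKeyAPIKey` from the review snippet holds the string "api_key"; `requireConfig` is a hypothetical helper and not the plugin's actual `getConfig`.

```go
package main

import "fmt"

// configKeyAPIKey is assumed to hold the key name discussed in this thread.
const configKeyAPIKey = "api_key"

// requireConfig is a hypothetical helper: it returns the value for key,
// or an error when the key is missing or empty.
func requireConfig(config map[string]string, key string) (string, error) {
	v, ok := config[key]
	if !ok || v == "" {
		return "", fmt.Errorf("required config param %s not found", key)
	}
	return v, nil
}

func main() {
	// A config block written with the wrong key name is rejected.
	wrong := map[string]string{"ibm_api_key": "example-key"}
	if _, err := requireConfig(wrong, configKeyAPIKey); err != nil {
		fmt.Println("error:", err)
	}

	// The same value under the expected key is accepted.
	right := map[string]string{"api_key": "example-key"}
	if v, err := requireConfig(right, configKeyAPIKey); err == nil {
		fmt.Println("api key found, length:", len(v))
	}
}
```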

@chickenandpork
Author

chickenandpork commented Apr 7, 2025

Thanks, @jeffmccollum. I think I originally had ibm_api_key in my version in prod. Nice catch!

I'll fix the description so that it's accurate for the next. [edit: updated description]

Do you know if I should just keep merging updates into this PR until it's merged, or whether HashiCorp prefers rebase/squash/force-push? I'd like to keep it as easy to merge as possible when the PR is approved.

@jeffmccollum

This comment was marked as resolved.

@tgross
Member

tgross commented Apr 9, 2025

> Do you know if I should just keep merging updates to this PR until merged, or whether HashiCorp prefers rebase/squash/force-push? I'd like to keep it as easy to merge as possible when the PR is approved.

We'll squash-merge when we're done anyways. GitHub's review tools are a little easier to use if you don't squash while the review is ongoing.

@jeffmccollum

@chickenandpork

Looks like plugins/builtin/target/ibmcloud-ig/main.go is missing the copywrite header

// Copyright (c) HashiCorp, Inc.
// SPDX-License-Identifier: MPL-2.0

Also could support for trusted profiles be added as an option?

@chickenandpork
Author

Hi @jeffmccollum

> Looks like plugins/builtin/target/ibmcloud-ig/main.go is missing the copywrite header
>
> // Copyright (c) HashiCorp, Inc.
> // SPDX-License-Identifier: MPL-2.0

Strange -- I thought that was checked before, and it was green. Taking a look.

> Also could support for trusted profiles be added as an option?

Likely, but I'd want to follow this PR with that enhancement. My personal goal is to get this small effort into main so that others can see and use it, and then extend its capabilities: ship the minimum first, add features after.

@chickenandpork
Author

@jeffmccollum added, but I think the CI needs approval before checking.

@chickenandpork chickenandpork force-pushed the feat/provide-target-plugin-for-ibmcloud branch 2 times, most recently from d03baab to 9545da2 Compare April 15, 2025 17:07
@chickenandpork
Author

chickenandpork commented Apr 15, 2025

@jeffmccollum @schmichael rebased for the 10 commits between my PR and current builds.

fixes https://github.com/hashicorp/nomad-autoscaler/actions/runs/14088268351/job/40258769936

The CI gives any outstanding PR a recurring cost if we want to keep it current: a CI run that used to pass can fail when retried after a dependency update, especially because CI enforces tidiness, which updates can break while the PR is delayed.

Would you folks be interested in some effort to accelerate the build process?

@chickenandpork chickenandpork force-pushed the feat/provide-target-plugin-for-ibmcloud branch from 9545da2 to 14ff184 Compare May 8, 2025 17:01
@chickenandpork
Author

chickenandpork commented May 8, 2025

It was easier to rebase on main from the CLI and push. The churn comes from this PR being blocked on review while #1078 merged.

@chickenandpork chickenandpork force-pushed the feat/provide-target-plugin-for-ibmcloud branch from 14ff184 to ad8b588 Compare May 14, 2025 18:39
@chickenandpork
Author

Rebased on CLI to ad8b588 due to churn caused by #1080 merging while this PR is waiting

@chickenandpork
Author

pushed a change to drop the specific dependency updated in #1042 and trimmed from dependencies in #1069

@chickenandpork chickenandpork force-pushed the feat/provide-target-plugin-for-ibmcloud branch from d3cbcc3 to ce83e63 Compare May 21, 2025 07:06
@chickenandpork
Author

Rebased on CLI to ce83e63, to ensure an easy merge on approval, due to churn caused by #1083, #1084, and #1085 merging while this PR is waiting.

4 participants