Skip to content

Updating Azure Plugin to Support Flexible VMSS #1106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 38 commits into
base: main
Choose a base branch
from

Conversation

benjamin-lykins
Copy link

@benjamin-lykins benjamin-lykins commented Jun 9, 2025

Fixes: #1096

  • Reviewed contributor guide.
    I will note, this might not be a small first PR to put in. Ended up being a little bit more noisy since I did migrate off the legacy Azure SDK.

Changes

  1. Updated from deprecated to current Azure Compute SDK.

"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2020-06-01/compute"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute"

  1. Updated status checks

Supports both flexible and uniform scale sets. Queries the VMSS to get the type and then conditionally runs the respective process status/scale in and out processes.

  • Flexible scale sets are a shift from uniform and were unable to query the instance view of the instance. Essentially, grab the list of instances from the VMSS, then use that against the VM (not VMSS) API to get instance status.

  • When getting instance states, since Flexible VMSS can have up to 2000 instances, I used a goroutine to iterate through them more effectively. Hardcoded at 5 concurrent, might be worth allowing the end user to provide a limit on this, but from my testing with 5 concurrent against 100 instances, it took ~1 second. When I was sequentially pulling instance state, it took ~30 seconds for the same count. I did test limiting to 10 concurrent, but it took it from ~1 second to a little less than a second, so not much gained.

  1. The only existing function specifically with Uniform I updated was processInstanceView.

I can double check the why, but I believe one of the attributes was not available in the current api.

  1. Testing

Admittedly Test_processInstanceView used copilot to refactor for the new api.

Testing

Flexible VMSS

  • Scale In
  • Scale Out

Uniform VMSS

  • Scale In
  • Scale Out

- Updated Azure SDK imports to use the new `azidentity` and `armcompute` packages.
- Replaced deprecated methods for creating Azure clients with new client secret credential methods.
- Modified `setupAzureClient` to handle client creation and error handling more effectively.
- Enhanced `scaleOut` and `scaleIn` methods to support both Uniform and Flexible orchestration modes.
- Implemented concurrent processing for VM instance views in Flexible mode to improve performance.
- Updated tests to reflect changes in the Azure SDK and added new test cases for instance view processing.
- Simplified instance view processing logic to accommodate new SDK structure.
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work @benjamin-lykins. I've left a bunch of comments -- some on small details and some on design. For the smaller comments like contents of debug logging, try to take a pass over the whole PR with those in mind. (Edit: it looks like GitHub is unhelpfully doing the "hidden conversations" thing so make sure you expand that as that includes some of the major design items)

We've added a lot of new code here and it doesn't seem like there's a lot of test coverage. Unfortunately we don't have integration tests so that's challenging, but maybe as you break up some of those large methods I pointed out there'd be a way to extract some of the logic into testable chunks.

benjamin-lykins and others added 23 commits June 9, 2025 20:39
removing comments and cleanup

Co-authored-by: Tim Gross <[email protected]>
consolidating scale in function and generalizing. 

tested both uniform and flexible without issues.
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making good progress!

@benjamin-lykins
Copy link
Author

benjamin-lykins commented Jun 23, 2025

@tgross Replied back to your latest reviews and put in a series of commits to address the recommendations.

I'm working to clean up the functions which get both status and IDs for Flexible VMSS. I may try to consolidate those two functions, or generalize some of it that I can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The azure-vmss target plugin fails when used with Azure scale sets in flexible orchestration mode
2 participants