Skip to content

Conversation

@olegbbtr
Copy link
Contributor

@olegbbtr olegbbtr commented Jun 25, 2025

New implementation has production-ready performance.

For more details, see https://www.notion.so/neondatabase/RFC-robust-IPAM-1fcf189e0047800e96e0c73b388eebad?source=copy_link

Per-commit review can be done.

@olegbbtr olegbbtr force-pushed the oleg/20-ipam-manager branch from 3903317 to 19bfa32 Compare June 30, 2025 14:19
@olegbbtr olegbbtr force-pushed the oleg/20-ipam-manager branch from 3049112 to ce8e8ef Compare July 9, 2025 10:28
@olegbbtr olegbbtr changed the base branch from main to oleg/22-makefile July 9, 2025 10:29
@olegbbtr olegbbtr force-pushed the oleg/20-ipam-manager branch 7 times, most recently from 5f8d32f to 72207e6 Compare July 14, 2025 11:50
@olegbbtr olegbbtr marked this pull request as ready for review July 14, 2025 11:52
Base automatically changed from oleg/22-makefile to oleg/21-fluent-bit July 16, 2025 14:45
@stradig stradig assigned olegbbtr and petuhovskiy and unassigned olegbbtr Jul 24, 2025
@olegbbtr olegbbtr force-pushed the oleg/21-fluent-bit branch from 3a10913 to bed5841 Compare July 25, 2025 12:25
Base automatically changed from oleg/21-fluent-bit to main July 31, 2025 14:50
We don't need them - we have retries on the level of the reconcille
queue.

Signed-off-by: Oleg Vasilev <[email protected]>
We changed all environments to the cannonical style.

Signed-off-by: Oleg Vasilev <[email protected]>
More details and tests to follow.
Combines GaugeFunc and GaugeVec into GaugeVecFunc
@olegbbtr olegbbtr force-pushed the oleg/20-ipam-manager branch from 72207e6 to 66d1e1a Compare August 1, 2025 12:25
@olegbbtr olegbbtr requested a review from Copilot August 18, 2025 10:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR reimplements the IPAM (IP Address Management) system via a new Manager architecture to achieve production-ready performance. The implementation moves from a whereabouts-based approach to a custom Manager that provides better concurrency, metrics, and resource management.

Key changes include:

  • Complete IPAM architecture redesign: Replaced whereabouts-based allocation with a custom Manager system that handles IP pools, cooldown periods, and automatic rebalancing
  • Enhanced metrics and monitoring: Added comprehensive metrics for tracking IP states (allocated, free, unknown, cooldown) and pool statistics
  • Improved API design: Changed from NamespacedName-based identification to UID-based VM identification for better consistency

Reviewed Changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/neonvm/ipam/manager.go Core Manager implementation handling IP allocation, release, and pool management
pkg/neonvm/ipam/pool.go PoolClient for managing IPPool CRD interactions
pkg/neonvm/ipam/metrics.go Enhanced metrics collection with new GaugeVecFunc support
pkg/neonvm/ipam/config.go Configuration parsing and IP range management utilities
pkg/neonvm/ipam/ipam.go Updated IPAM interface to use new Manager architecture
pkg/neonvm/controllers/vm_controller.go Updated VM controller to use UID-based IPAM calls
pkg/util/metricfunc/metricfunc.go New utility for function-based Prometheus metrics
Comments suppressed due to low confidence (2)

pkg/neonvm/ipam/manager_test.go:254

  • The goroutine does not capture the loop variable manager. This will cause all goroutines to operate on the last value of manager. Use go func(m *Manager) { m.Run(ctx); wg.Done() }(manager) instead.
	}

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

func (c *PoolClient) PoolName() string {
// for IP range 10.11.22.0/24 poll name will be
// "10.11.22.0-24" if no network name in ipam spec, or
// "samplenet-10.11.22.0-24" if nametwork name is `samplenet`
Copy link

Copilot AI Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo in the comment: 'nametwork' should be 'network'.

Suggested change
// "samplenet-10.11.22.0-24" if nametwork name is `samplenet`
// "samplenet-10.11.22.0-24" if network name is `samplenet`

Copilot uses AI. Check for mistakes.
}

if c.LowIPCount > c.TargetIPCount {
return fmt.Errorf("low IP count must be less than target IP count")
Copy link

Copilot AI Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should be 'low IP count must be less than or equal to target IP count' since the validation allows equality (LowIPCount > TargetIPCount is the failing condition).

Suggested change
return fmt.Errorf("low IP count must be less than target IP count")
return fmt.Errorf("low IP count must be less than or equal to target IP count")

Copilot uses AI. Check for mistakes.
}

if c.HighIPCount < c.TargetIPCount {
return fmt.Errorf("high IP count must be greater than target IP count")
Copy link

Copilot AI Aug 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should be 'high IP count must be greater than or equal to target IP count' since the validation allows equality (HighIPCount < TargetIPCount is the failing condition).

Suggested change
return fmt.Errorf("high IP count must be greater than target IP count")
return fmt.Errorf("high IP count must be greater than or equal to target IP count")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants