Description
Describe the bug
When a Queue is converted from the core version to the v1beta1 version, the Affinity field is lost. For example, I added an affinity field to the root queue in the scheduler, but after I submitted a vcjob, the affinity field was lost. After debugging, I found that this happens because the scheduler updates the spec of the root queue when closing a session: https://github.com/volcano-sh/volcano/blob/4068723dca90fdfdc48d548ba27c847758a65e5c/pkg/scheduler/framework/session.go#L305-L343
In updateRootQueueResources, the scheduler instantiates a v1beta1 queue struct and then converts the core version of the queue to the v1beta1 version:
https://github.com/volcano-sh/volcano/blob/4068723dca90fdfdc48d548ba27c847758a65e5c/pkg/scheduler/framework/session.go#L318
However, in the apis dependency, the conversion function Convert_scheduling_QueueSpec_To_v1beta1_QueueSpec is currently written by hand, and it does not include the deep-copy logic for the affinity field.
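To illustrate the failure mode, here is a minimal sketch using simplified, hypothetical stand-in types (CoreQueueSpec, V1beta1QueueSpec, Affinity, convertLossy and convertWithAffinity are all made up for this example and are not the actual Volcano types or functions). It only shows how a hand-written conversion that skips a pointer field leaves it nil, and how a deep copy would preserve it:

```go
package main

import "fmt"

// Affinity is a hypothetical, trimmed-down stand-in for the real affinity type.
type Affinity struct {
	Required []string
}

// CoreQueueSpec and V1beta1QueueSpec are hypothetical stand-ins for the
// core and v1beta1 queue specs; the real types have many more fields.
type CoreQueueSpec struct {
	Weight   int32
	Affinity *Affinity
}

type V1beta1QueueSpec struct {
	Weight   int32
	Affinity *Affinity
}

// convertLossy mimics a hand-written conversion that forgets Affinity.
func convertLossy(in *CoreQueueSpec, out *V1beta1QueueSpec) {
	out.Weight = in.Weight
	// Affinity is never assigned, so out.Affinity keeps its zero value (nil).
}

// convertWithAffinity also deep-copies Affinity so the output does not
// share slices with the input.
func convertWithAffinity(in *CoreQueueSpec, out *V1beta1QueueSpec) {
	out.Weight = in.Weight
	if in.Affinity == nil {
		out.Affinity = nil
		return
	}
	out.Affinity = &Affinity{
		Required: append([]string(nil), in.Affinity.Required...),
	}
}

func main() {
	in := &CoreQueueSpec{
		Weight:   1,
		Affinity: &Affinity{Required: []string{"groupname1", "groupname2"}},
	}

	var lossy, fixed V1beta1QueueSpec
	convertLossy(in, &lossy)
	convertWithAffinity(in, &fixed)

	fmt.Println(lossy.Affinity == nil)   // true: the field was dropped
	fmt.Println(fixed.Affinity.Required) // [groupname1 groupname2]
}
```

If the converted object is then written back as the queue's spec, the nil Affinity from the lossy conversion would overwrite the value the user configured, which matches the behavior described above.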
To Reproduce
Steps to reproduce the behavior:
- Add an affinity field to the root queue
- Submit a vcjob in a sub-queue
- Notice that the affinity field of the root queue is lost
Expected behavior
The affinity field of the root queue should remain unchanged after a vcjob is submitted. The scheduler should not modify the affinity field of the root queue.
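The expected behavior can be phrased as a simple invariant: the affinity before and after the scheduler's update must be deeply equal. A rough sketch of such a check follows; queueSpec and affinityPreserved are hypothetical names for illustration, and the map-based Affinity field is a simplification, not the Volcano type:

```go
package main

import (
	"fmt"
	"reflect"
)

// queueSpec is a hypothetical, trimmed-down spec used only for this check.
type queueSpec struct {
	Weight   int32
	Affinity map[string][]string // e.g. "required" -> node group names
}

// affinityPreserved reports whether an update left the affinity untouched.
func affinityPreserved(before, after queueSpec) bool {
	return reflect.DeepEqual(before.Affinity, after.Affinity)
}

func main() {
	before := queueSpec{
		Weight:   1,
		Affinity: map[string][]string{"required": {"groupname1"}},
	}
	// Simulates the buggy update: affinity dropped to nil.
	afterBug := queueSpec{Weight: 1}
	// Simulates the expected update: affinity carried over.
	afterOK := queueSpec{
		Weight:   1,
		Affinity: map[string][]string{"required": {"groupname1"}},
	}

	fmt.Println(affinityPreserved(before, afterBug)) // false
	fmt.Println(affinityPreserved(before, afterOK))  // true
}
```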
Screenshots
I added two log lines (at L313 and L320). They show that before the conversion (core version of the queue) the affinity field of the root queue exists, but after the conversion (v1beta1 version of the queue) the affinity field is nil:

I0715 07:00:15.974414 1 session.go:313] The root queue spec affinity is &scheduling.Affinity{NodeGroupAffinity:(*scheduling.NodeGroupAffinity)(0xc000944120), NodeGroupAntiAffinity:(*scheduling.NodeGroupAntiAffinity)(0xc000944150)} [Before Convert], affinity &scheduling.NodeGroupAffinity{RequiredDuringSchedulingIgnoredDuringExecution:[]string{"groupname1", "groupname2"}, PreferredDuringSchedulingIgnoredDuringExecution:[]string{"groupname1"}}/anti-affinity &scheduling.NodeGroupAntiAffinity{RequiredDuringSchedulingIgnoredDuringExecution:[]string{"groupname3", "gropuname4"}, PreferredDuringSchedulingIgnoredDuringExecution:[]string{"groupname3"}}
I0715 07:00:15.974433 1 session.go:321] The root queue spec is v1beta1.QueueSpec{Weight:1, Capability:v1.ResourceList{"cpu":resource.Quantity{i:resource.int64Amount{value:20, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"20", Format:"DecimalSI"}, "ephemeral-storage":resource.Quantity{i:resource.int64Amount{value:844165283840, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"844165283840", Format:"DecimalSI"}, "hugepages-1Gi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "hugepages-2Mi":resource.Quantity{i:resource.int64Amount{value:0, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"0", Format:"DecimalSI"}, "memory":resource.Quantity{i:resource.int64Amount{value:83003781120, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"81058380Ki", Format:"BinarySI"}, "nvidia.com/gpu":resource.Quantity{i:resource.int64Amount{value:8, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"8", Format:"DecimalSI"}, "pods":resource.Quantity{i:resource.int64Amount{value:550, scale:0}, d:resource.infDecAmount{Dec:(*inf.Dec)(nil)}, s:"550", Format:"DecimalSI"}}, Reclaimable:(*bool)(0xc0008a22f6), ExtendClusters:[]v1beta1.Cluster(nil), Guarantee:v1beta1.Guarantee{Resource:v1.ResourceList(nil)}, Affinity:(*v1beta1.Affinity)(nil), Type:"", Parent:"", Deserved:v1.ResourceList(nil), Priority:0} [After Convert]
Desktop (please complete the following information):
- Volcano v1.12.1