Commit 666c6c4

zachdworkin authored and Sean Hefty committed
src/hmem_ze: Move creation of ze command queues out of initialization and into the copy function.

With ULLS (ultra low latency submission), creating a command queue spawns a thread that repeatedly submits 0-byte copies to the queue so that it never gets swapped out. If ze is initialized but never used, that thread imposes a performance cost for no benefit. Move command queue creation into the copy function so that the overhead is only incurred when a copy is actually performed.

Signed-off-by: zdworkin <[email protected]>
1 parent 0978837 commit 666c6c4

1 file changed (+11, -9 lines)

src/hmem_ze.c

Lines changed: 11 additions & 9 deletions
@@ -754,15 +754,6 @@ int ze_hmem_init(void)
 		if (ze_ret)
 			goto err;
 
-		cq_desc.ordinal = ordinals[num_devices];
-		cq_desc.index = indices[num_devices];
-		ze_ret = ofi_zeCommandQueueCreate(context,
-						  devices[num_devices],
-						  &cq_desc,
-						  &cmd_queue[num_devices]);
-		if (ze_ret)
-			goto err;
-
 		for (i = 0; i < count; i++) {
 			if (ofi_zeDeviceCanAccessPeer(devices[num_devices],
 					devices[i], &access) || !access)
@@ -793,6 +784,17 @@ int ze_hmem_copy(uint64_t device, void *dst, const void *src, size_t size)
 		return 0;
 	}
 
+	if (!cmd_queue[device]) {
+		cq_desc.ordinal = ordinals[device];
+		cq_desc.index = indices[device];
+		ze_ret = ofi_zeCommandQueueCreate(context,
+						  devices[device],
+						  &cq_desc,
+						  &cmd_queue[device]);
+		if (ze_ret)
+			goto err;
+	}
+
 	cl_desc.commandQueueGroupOrdinal = ordinals[dev_id];
 	ze_ret = ofi_zeCommandListCreate(context, devices[dev_id], &cl_desc,
 					 &cmd_list);
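
The change above follows a lazy-initialization pattern: initialization leaves the per-device queue slot empty, and the first copy that targets a device creates the queue. The following is a minimal, driver-free sketch of that pattern, not the libfabric implementation. Every name in it (`fake_queue`, `fake_queue_create`, `lazy_init`, `lazy_copy`, `queues_created`, `MAX_DEVICES`) is hypothetical, and a plain `memcpy` stands in for the actual device copy.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 4	/* hypothetical limit, for illustration only */

/* Hypothetical stand-in for a driver queue handle; not a Level Zero type. */
struct fake_queue {
	int device;
};

static struct fake_queue queue_storage[MAX_DEVICES];
static struct fake_queue *queues[MAX_DEVICES];	/* NULL until first copy */
static int queues_created;			/* counts expensive creations */

/* Hypothetical stand-in for the expensive queue creation call. */
static int fake_queue_create(int device, struct fake_queue **queue)
{
	queue_storage[device].device = device;
	*queue = &queue_storage[device];
	queues_created++;
	return 0;
}

/* Initialization only resets state; it does not create any queue. */
static int lazy_init(void)
{
	memset(queues, 0, sizeof(queues));
	queues_created = 0;
	return 0;
}

/* The queue is created by the first copy that targets a given device. */
static int lazy_copy(int device, void *dst, const void *src, size_t size)
{
	int ret;

	if (device < 0 || device >= MAX_DEVICES)
		return -1;

	if (!queues[device]) {
		ret = fake_queue_create(device, &queues[device]);
		if (ret)
			return ret;
	}

	/* A real implementation would submit the copy to the queue here. */
	memcpy(dst, src, size);
	return 0;
}
```

Calling `lazy_init()` and then `lazy_copy()` twice for the same device creates the queue only once, and a process that never copies never pays for a queue. One trade-off of this pattern in general is that the check-then-create step is not atomic, so concurrent first copies to the same device would need external synchronization.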
