I'd appreciate clarification on the implications of the changes in 44df8eb and 9f2dd09 (triggered by the discussion in #321).
As part of that change, it seems like other parts of the spec should also be updated, such as the following statement, which I think now doesn't make sense any more:
> The command queue index provides a mechanism for an application to indicate which command queues can execute concurrently (different indices).
Because now that all queue indices must be 0, there cannot be different indices, so I guess the statement should just be removed as well.
The same reasoning applies to more statements in the command queue section, I think basically everything that talks about command queue indices (as they're now all equal to 0 by definition). It seems the `index` attribute now just exists for backwards compatibility reasons and could otherwise be removed entirely.
Regarding the further implications of the removal of explicit queues: to my understanding, explicit queues were the mechanism that Level Zero provided to allow fine-grained use of multiple physical engines in a queue group. Let me give an example:
- Let's assume my Level Zero device has a DMA unit with 4 channels, i.e., it can handle 4 parallel transfers independently.
- So far, I could expose this DMA unit as a queue group `queueGroup` with only `COPY` capability and 4 physical engines, i.e., `queueGroup.numQueues == 4`.
- If I wanted a transfer to happen on a specific channel, I could create an explicit queue with the appropriate index (0 for the first channel, 1 for the second channel, etc.) and then copy commands submitted to that queue would only execute on the appropriate channel.
- If I didn't care about the channel, I could create a non-explicit queue and let the implementation pick one (likely just one that is free).
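To make the distinction concrete, here is a toy model in C of the behaviour I have in mind. It is not Level Zero API code: `dma_group`, `pick_channel`, `NUM_CHANNELS` and `ANY_CHANNEL` are names I made up for illustration, and "first free channel" is just my assumption about what an implementation might do for a non-explicit queue.

```c
/* Toy model only, NOT the Level Zero API: every name here is hypothetical. */
#define NUM_CHANNELS 4    /* plays the role of queueGroup.numQueues */
#define ANY_CHANNEL  (-1) /* stands in for a non-explicit queue */

typedef struct {
    int busy[NUM_CHANNELS]; /* 1 while a transfer occupies the channel */
} dma_group;

/* Choose the channel for one submission.
 *   requested >= 0           : explicit queue, pinned to that channel index
 *   requested == ANY_CHANNEL : the implementation picks the first free channel
 * Returns the channel index, or -1 if the request cannot be satisfied now. */
int pick_channel(dma_group *g, int requested)
{
    if (requested >= NUM_CHANNELS || requested < ANY_CHANNEL)
        return -1; /* index out of range */

    if (requested >= 0) {
        /* Explicit: only the requested channel is acceptable. */
        if (g->busy[requested])
            return -1;
        g->busy[requested] = 1;
        return requested;
    }

    /* Non-explicit: any free channel will do. */
    for (int i = 0; i < NUM_CHANNELS; ++i) {
        if (!g->busy[i]) {
            g->busy[i] = 1;
            return i;
        }
    }
    return -1;
}
```

With every index forced to 0, only the `ANY_CHANNEL` path (or a single pinned channel) remains expressible in this model, which is what prompts the questions below.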
So now that the concept of explicit command queues was removed, what purpose does the `numQueues` attribute of a queue group have? It seems like it's entirely useless and can be removed as well, or must at least be forced to `1` to keep the attribute itself around. Or am I mistaken?
And going further, once `numQueues` is gone or always forced to `1`, isn't the whole concept of a queue group moot as well? Because it seems you cannot really do a lot with them now that explicit queues are gone.
As a side remark: while playing around with Intel's Level Zero implementation for GPUs on my machine, I noticed that it makes no use of those concepts at all and just exposes one queue group with capability `COMPUTE | COPY` and 1 physical engine, which is the degenerate case where the whole queue group concept wouldn't be needed. It would be interesting to know if there's another implementation that makes actual use of these concepts or whether you planned to remove them from the spec anyway.