Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

terminator_CreateDisplayPlaneSurfaceKHR() with multiple ICDs #1541

Open
jjulianoatnv opened this issue Aug 21, 2024 · 8 comments
Open

terminator_CreateDisplayPlaneSurfaceKHR() with multiple ICDs #1541

jjulianoatnv opened this issue Aug 21, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@jjulianoatnv
Copy link
Contributor

jjulianoatnv commented Aug 21, 2024

By code inspection of terminator_CreateDisplayPlaneSurfaceKHR(), it appears that this function may not function as intended when multiple ICDs that implement icd_term->dispatch.CreateDisplayPlaneSurfaceKHR.

Around here:

// Loop through each ICD and determine if they need to create a surface

is code that loops over all ICDs. It dispatches vkCreateDisplayPlaneSurfaceKHR to every ICD that implements icd_term->dispatch.CreateDisplayPlaneSurfaceKHR. If any of the ICDs returns a VkResult other than VK_SUCCESS, then the function early-exits with that VkResult (with that error).

The displayMode member of the VkDisplaySurfaceCreateInfoKHR structure is a VkDisplayModeKHR.
A VkDisplayModeKHR is a non-dispatchable handle with parent type VkDisplayKHR.
A VkDisplayKHR is a non-dispatchable handle with parent type VkPhysicalDevice.
VkPhysicalDevice identifies one ICD.

I had expected the loader to dispatch vkCreateDisplayPlaneSurfaceKHR to the one ICD that exposed the indirectly-referenced VkPhysicalDevice , and not to dispatch it to any other ICDs. That is not how the loader behaves.
As written, when the loader vkCreateDisplayPlaneSurfaceKHR to an ICD that did not expose the indirectly-referenced VkPhysicalDevice, then I expect that other ICD to return a VkResult that is not VK_SUCCESS. This will cause terminator_CreateDisplayPlaneSurfaceKHR() to early exit and to return that same error VkResult.

If the loader does not know the relationship VkDisplayModeKHR -> VkDisplayKHR -> VkPhysicalDevice due to the loader not tracking the objects VkDisplayModeKHR and VkDisplayKHR, then perhaps a reasonable fix is the following:

  • Deprecate vkCreateDisplayPlaneSurfaceKHR
  • Create a new vkCreateDisplayPlaneSurface2KHR
    • This command is like the previous command, but it also includes a parameter of type VkPhysicalDeviceKHR.
    • Have the loader unpack the physical device to obtain the ICD that owns the physical device.
    • Have the loader dispatch this new command to only that one ICD.
@jjulianoatnv jjulianoatnv added the bug Something isn't working label Aug 21, 2024
@jjulianoatnv
Copy link
Contributor Author

@cubanismo , I'd like your opinion on this.

@jjulianoatnv
Copy link
Contributor Author

After thinking about this more, I've come to the conclusion that vkCreateDisplayPlaneSurfaceKHR is probably simply broken. All other command that operate on VkDisplayKHR and on VkDisplayModeKHR also have a vkPhysicalDeviceKHR parameter. This one does not. I think it's a bug that this one does not have a physical device parameter.

@cubanismo
Copy link
Contributor

After thinking about this more, I've come to the conclusion that vkCreateDisplayPlaneSurfaceKHR is probably simply broken. All other command that operate on VkDisplayKHR and on VkDisplayModeKHR also have a vkPhysicalDeviceKHR parameter. This one does not. I think it's a bug that this one does not have a physical device parameter.

It's not the function that's broken, it's the general implementation of surfaces in the loader. Surface creation isn't supposed to dispatch to any drivers. It's just supposed to create a container struct for the parameters that is then passed to drivers in subsequent commands that do always dispatch properly based on other objects. Surfaces are driver-agnostic. My opinion is that any driver that's relying on this behavior is broken, and the loader dispatching of surface creation/destruction to drivers should be ripped out.

@jjulianoatnv
Copy link
Contributor Author

OK, yes, that seems like another viable solution path.

It looks like every command that consumes a surface takes a physical device parameter. These commands dispatch to one ICD.

The various surface factories each have their own surface createInfo. I do not see any createInfo that contains a structure member which needs to be unpacked by the loader. If any of these createInfo do have a structure member that needs to be unpacked by the loader, then the system is broken and your suggested fix will not succeed for that surface type.

@jjulianoatnv
Copy link
Contributor Author

Wait...on second thought, if the surface factories stop dispatching into ICDs, then the loader/ICD interface will have to be changed. Every command that does dispatch to an ICD and that takes a surface parameter will have to be modified to pass into the ICD the createInfo struct that was used to create that surface.

This does seem viable. But it's not an easy change as it breaks backwards compatibility of the current loader/ICD interface.

@pdaniell-nv
Copy link
Contributor

A long time ago the loader handled the VkSurfaceKHR objects itself, but that proved to be weird for ICDs to deal with as it was unusual and it wasn't possible to include driver internals with those objects. ICD interface version 3 (https://github.com/KhronosGroup/Vulkan-Loader/blob/main/docs/LoaderDriverInterface.md#loader-and-driver-interface-version-3-requirements) added support to allow ICD to implement their own VkSurfaceKHR handling. I believe the loader still supports doing its own VkSurfaceKHR if the ICD doesn't export vkCreateWin32SurfaceKHR and the others.

@cubanismo
Copy link
Contributor

and it wasn't possible to include driver internals with those objects.

Right, that's the point. They were never supposed to.

If any of these createInfo do have a structure member that needs to be unpacked by the loader, then the system is broken and your suggested fix will not succeed for that surface type.

I don't see why that follows. VkSurface is an instance-level object. The loader should parse the createInfo and its pNext chain and populate some internal representation of a VkSurface object using that data. Then when dispatching commands to ICDs that include a VkSurface, it's supposed to pass that internal structure as part of the ABI. That was the intended design for VkSurface when it was introduced as a "typesafe" version of the original APIs that just cast a surface-type-dependent non-opaque struct to void* or simklar as a parameter to vkCreateSwapchainKHR().

@jjulianoatnv
Copy link
Contributor Author

The loader should parse the createInfo and its pNext chain and populate some internal representation of a VkSurface object using that data. Then when dispatching commands to ICDs that include a VkSurface, it's supposed to pass that internal structure as part of the ABI.

OK, I think I agree that this suggestion is viable. I also agree that what you describe was the original loader design, and that implementation and interface changed in a later revision. I was not involved in discussion that lead to these changes, and I do not know why they were made or if these changes were the "best" option at the time.

Scanning other terminators in the loader, I see that terminator_CreateDevice() behaves similar to what we are describing. Below this comment:

// Before we continue, If KHX_device_group is the list of enabled and viable extensions, then we then need to look for the
// corresponding VkDeviceGroupDeviceCreateInfo struct in the device list and replace all the physical device values (which

// are really loader physical device terminator values) with the ICD versions.

is a loop that scans the VkDeviceCreateInfo pNext chain and replaces loader physical device terminator values with ICD physical device handles. The code is here:

// Before calling down, replace the incoming physical device values (which are really loader terminator

If surfaces are modified to behave as you suggest, then during any command that takes a surface as input parameter, the loader will have to do similar substitution to replace all loader-wrapped terminator values in the *SurfaceCreateInfo structs and their pNext chains with corresponding ICD handles.

What we have instead is that the loader dispatches surface factory commands to ICDs, and the ICDs create ICD-internal surface objects. Later, the loader skips doing handle replacement during execution of commands that take a surface as input parameter. But the loader's current implementation is broken because it fails when more than one ICD implements a function such as vkCreateDisplayPlaneSurfaceKHR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants