@fu5ha fu5ha commented May 22, 2025

Fixes two issues surfaced by newer NVIDIA drivers:

  1. It is now common for the driver to return swapchain images out of order, which trips an assert and crashes. Simply removing the assert would be incorrect given this behavior, since it would result in overlapping use of the rendering_complete semaphores. Instead, I moved the acquire semaphore into the frame data, so that there are always enough of them if the number of frames is increased in the future (the number of frames in flight is the bound on acquire semaphores). I then renamed the swapchain's rendering_complete semaphores to ready_for_present, which I think describes their purpose more precisely, and now index them by the presentation index returned from the driver. This guarantees that a ready_for_present semaphore is not still in use when we wait on it before presentation.

  2. The prefix_scan regime always accesses out to SEGMENT_SIZE * SEGMENT_SIZE (currently 1024 * 1024) elements in the input buffer. Previously, we were passing it a buffer with only MAX_ENTRIES elements (1024 * 64), so we were accessing out of bounds by a factor of 16. A better solution would probably be to make it adapt to the input buffer size dynamically and/or reduce the static size to match MAX_ENTRIES, but I haven't gone through the algorithm closely enough to adapt it yet.
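For reviewers, here is a rough sketch of both points as a plain Rust model with no Vulkan calls. It is not the code in this PR: `FRAMES_IN_FLIGHT` (and its value of 2), `FrameSync`, `sync_for_frame`, `PREFIX_SCAN_REACH`, and `out_of_bounds_elements` are hypothetical names made up for illustration. Only the indexing scheme and the values of SEGMENT_SIZE and MAX_ENTRIES come from the description above.

```rust
// Hypothetical model of the two fixes; names and FRAMES_IN_FLIGHT are assumptions.

/// Assumed number of frames in flight (not stated in the PR description).
const FRAMES_IN_FLIGHT: usize = 2;

/// Which semaphore slots a given frame uses.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FrameSync {
    /// Index into per-frame data: one acquire semaphore per frame in flight.
    acquire_slot: usize,
    /// Index into the swapchain's ready_for_present semaphores:
    /// one per swapchain image, selected by the image index the driver returned.
    ready_for_present_slot: usize,
}

/// Pick semaphore slots for a frame.
/// The acquire semaphore follows the frame counter; the ready_for_present
/// semaphore follows the acquired image, so images returned out of order
/// never share a ready_for_present semaphore.
fn sync_for_frame(frame_counter: usize, acquired_image_index: usize) -> FrameSync {
    FrameSync {
        acquire_slot: frame_counter % FRAMES_IN_FLIGHT,
        ready_for_present_slot: acquired_image_index,
    }
}

/// Values quoted in the description above.
const SEGMENT_SIZE: usize = 1024;
const MAX_ENTRIES: usize = 1024 * 64;

/// Number of elements the prefix scan reaches, regardless of the input size.
const PREFIX_SCAN_REACH: usize = SEGMENT_SIZE * SEGMENT_SIZE;

/// How many elements past the end of a buffer of `buffer_len` elements
/// the prefix scan would touch (0 if the buffer is large enough).
fn out_of_bounds_elements(buffer_len: usize) -> usize {
    PREFIX_SCAN_REACH.saturating_sub(buffer_len)
}
```

In this model, if the driver hands back image 2 on the first frame, `sync_for_frame(0, 2)` pairs acquire slot 0 with ready_for_present slot 2, and `out_of_bounds_elements(MAX_ENTRIES)` shows how far the old buffer fell short of what the scan touches.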

@fu5ha fu5ha requested a review from h3r2tic as a code owner May 22, 2025 21:05