Conversation
While checking against loose thresholds would in some cases produce more conservative output, it also significantly regressed known coverage and didn't guarantee conservative results anyway. As a baseline we now compare against 0.5; more samples will be necessary to produce more conservative masks.
When rasterizing 4-state OMMs, we now take more samples along each edge according to the edge length. These samples are used to confirm the known state: a state only remains known if all samples agree with the corner-based guess. This naturally makes rasterization slower, but that can be mitigated by adjusting the mip size for a quality-speed tradeoff.
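The agreement rule can be sketched as follows; the state encoding and function names here are illustrative assumptions, not the actual meshoptimizer internals:

```cpp
#include <cassert>

// A micro-triangle's state stays "known" only if every edge sample agrees
// with the corner/center-based guess; otherwise it is demoted to the
// matching unknown state of the 4-state OMM encoding.
enum OmmState { Transparent, Opaque, UnknownTransparent, UnknownOpaque };

OmmState confirmState(OmmState cornerGuess, const bool* edgeSamples, int sampleCount) {
    bool guessOpaque = (cornerGuess == Opaque);
    for (int i = 0; i < sampleCount; ++i)
        if (edgeSamples[i] != guessOpaque)
            return guessOpaque ? UnknownOpaque : UnknownTransparent;
    return cornerGuess; // all edge samples agree: keep the known state
}
```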
When we are using edge samples, we might as well use them to improve the quality of coverage estimation; this makes binary coverage much closer to the ground truth (from ~13% deviation to ~7%). For simplicity, we store the popcount in the same integer instead of recomputing it. As part of this, we need the extra samples for 2-state OMMs too; the additional cost can be mitigated by using preferred mips, which has minimal downsides for 2-state output. Instead of treating each edge mask as a 0..1 value, we count the bits and normalize by the maximum bit count. This provides a more precise coverage estimate, reducing the error by a further 1-2%.
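The normalization step might look like this minimal sketch (the helper is hypothetical, not the library's actual code):

```cpp
#include <cassert>
#include <cstdint>

// Instead of reading an edge mask as a 0/1 value, count the set bits and
// normalize by the mask width to get a fractional coverage contribution.
float edgeCoverage(uint32_t mask, int bits) {
    int count = 0;
    for (int i = 0; i < bits; ++i)
        count += (mask >> i) & 1u;
    return (float)count / (float)bits;
}
```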
Instead of targeting a specific ratio during rasterization, we now allow customizing it with a quality parameter. Every increment is a half-step in log space, so increasing quality by 2 decreases the mip number by 1. This provides a gradual tradeoff between quality and speed, as rasterization ends up taking more samples at higher quality.
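The quality-to-mip relationship described above can be written out as a small sketch; the function names are hypothetical, and only the half-step-in-log-space relationship is taken from the text:

```cpp
#include <cassert>
#include <cmath>

// Each quality increment is a half-step in log2 space: sample density scales
// by sqrt(2) per step, so quality +2 doubles density, which is equivalent to
// dropping the mip number by 1.
double sampleScale(int quality) {
    return std::pow(2.0, quality * 0.5);
}

int mipAdjustment(int quality) {
    // full mip steps contributed by the quality setting (negative = finer mip)
    return -(quality / 2);
}
```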
When edgeres > 0, we sample edge data that is duplicated between adjacent triangles; as the level grows, we end up doing almost 2x the number of edge samples compared to the optimum. To combat this, we now share edge samples within the bottom level-1 recursion hierarchy. This reduces the edge samples from 12 to 9 (a 25% decrease), which improves rasterization performance by 15% (as there's usually a mix of subdivision levels, and corner/center sampling doesn't change). It's possible to extend this further, but doing it at level-2 is incrementally less effective (10% gains) and more complicated, so we'll settle for this for now.
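The 12-to-9 reduction follows from counting unique edges in a level-1 subdivision; the vertex indexing below is illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>

// A level-1 subdivision splits a triangle into 4 micro-triangles
// (corners 0..2, edge midpoints 3..5). Sampling 3 edges per micro-triangle
// performs 12 edge passes, but the 3 interior edges are each shared by two
// micro-triangles, so only 9 unique edges need sampling.
int uniqueLevel1Edges() {
    const int tris[4][3] = {{0, 3, 5}, {3, 1, 4}, {5, 4, 2}, {3, 4, 5}};
    std::set<std::pair<int, int>> edges;
    for (const auto& t : tris)
        for (int e = 0; e < 3; ++e) {
            int a = t[e], b = t[(e + 1) % 3];
            edges.insert({std::min(a, b), std::max(a, b)});
        }
    return (int)edges.size();
}
```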
uvedge = sqrt(area); this is a little misleading because it scales with the length of the UV edge, but isn't equal to it - and, naturally, triangles are not usually equilateral so this whole computation is a little off. When triangles are equilateral, side ~= 1.5 * uvedge. If we require the subdivision to produce segments with length < 2 pixels, then sample count should be ceil(side/2 - 1) ~= floor(side/2) ~= floor(uvedge*0.75).
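Working the estimate above into code gives a small helper (hypothetical, not the library's actual function):

```cpp
#include <cassert>
#include <cmath>

// uvedge = sqrt(area); for a roughly equilateral triangle, side ~= 1.5 * uvedge,
// so requiring segments shorter than 2 pixels gives
// samples ~= ceil(side/2 - 1) ~= floor(side / 2) ~= floor(uvedge * 0.75).
int edgeSampleCount(double uvArea) {
    double uvedge = std::sqrt(uvArea);
    return (int)std::floor(uvedge * 0.75);
}
```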
With this, quality 0 matches our previous calculation, where the level is rounded down to provide more information in the source texture and allow the extended edge sampling to produce a more conservative 4-state mask. Before this change, quality 0 was a poor default; now it provides a good balance between rasterization cost and output quality, and a negative quality level can be specified for even faster rasterization at reduced precision.
While this configuration is somewhat redundant, it's correct, and there might be cases where it's desirable to simply produce a per-triangle index; using measure with max_level=0 and a subsequent compact guarantees that no OMM data is generated and everything is encoded via special indices.
This change reworks rasterization code to generally produce mostly-conservative known states and higher quality coverage.
Instead of just using corner & center samples, we now adaptively sample the edges between micro-triangles. The samples are used to establish the known mask (if they are consistent with the center/corner data), and also to improve the coverage estimate. The previous version used alpha cutoffs that were too loose; the presence of edge data allows us to fix that and use strict cutoffs, resulting in a near-optimal ratio of known vs unknown states.
To avoid too many redundant samples, the edge data is shared within level-1 triangle subdivision. It's possible to go further - we still have redundant sampling on large subdivisions - but the code for this is a little gnarly so for now just the simpler version is added here, which saves ~15% time when rasterizing.
For coverage estimation, the corner, center and edge samples are combined into a coverage value that's thresholded vs 0.5. This computation uses a tiny "MLP" (technically an SLP?), effectively a dot product, that was trained to match the ground truth coverage and has ~5% error vs ground truth. In the future we might use slightly larger MLPs for this - I've experimented with them but ended up settling on the minimum viable version with no hidden layers.
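Structurally, the no-hidden-layer model amounts to a dot product plus bias; the sketch below uses placeholder weights, not the trained values from this change:

```cpp
#include <cassert>

// Combine sample-derived features into a coverage value via a single dot
// product plus bias (a linear model with no hidden layers), then threshold
// the result vs 0.5 for the binary coverage decision.
float estimateCoverage(const float* features, const float* weights, int n, float bias) {
    float acc = bias;
    for (int i = 0; i < n; ++i)
        acc += features[i] * weights[i];
    return acc;
}

bool coveredBinary(float coverage) {
    return coverage > 0.5f; // thresholded vs 0.5
}
```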
Because we now sample edge data adaptively, the time to rasterize depends on the resolution of the resulting triangles. To balance quality vs time, texture mip data can be used instead of mip 0; meshopt_opacityMapPreferredMip now takes a separate quality argument, which can be increased (>0) or decreased (<0) to adjust the quality vs time tradeoff. This data is for single-threaded generation for the entire Bistro level. It's also possible to just use a hardcoded mip level instead of per-triangle adaptive selection.
This contribution is sponsored by Valve.