Commit 9d2db88
Improve frustum culling of skinned meshes through per-joint bounds (#21837)
## Objective
Mostly fix #4971 by adding a new option for updating skinned mesh `Aabb`
components from joint transforms.
https://github.com/user-attachments/assets/c25b31fa-142d-462b-9a1d-012ea928f839
This fixes cases where vertex positions are only modified through
skinning. It doesn't fix other cases like morph targets and vertex
shaders.
The PR kind of upstreams
[`bevy_mod_skinned_aabb`](https://github.com/greeble-dev/bevy_mod_skinned_aabb),
but with some changes to make it simpler and more reliable.
### Dependencies
- (MERGED) #21732 (or something similar) is desirable to make the new
  option work with `RenderAssetUsages::RENDER_WORLD`-only meshes.
  - This PR is authored as if #21732 has landed. But if that doesn't happen
    then I can adjust this PR to note the limitation.
- (Optional) #21845 adds an option related to skinned mesh bounds.
  - Either PR can land first - the second will need to be updated.
## Background
If a main world entity has a `Mesh3d` component then it's automatically
assigned an `Aabb` component. This is done by `bevy_camera` or
`bevy_gltf`. The `Aabb` is used by `bevy_camera` for frustum culling. It
can also be used by `bevy_picking` as an optimization, and by third
party crates.
But there's a problem - the `Aabb` can be wrong if something changes the
mesh's vertex positions after the `Aabb` is calculated. This can be done
by vertex shaders - notably skinning and morph targets - or by mutating
the `Mesh` asset (#4294).
For the skinning case, the most common solution has been to disable
frustum culling via the `NoFrustumCulling` component. This is simple,
and might even be the most efficient approach for apps where meshes tend
to stay on-screen. But it's annoying to implement, bad for apps where
meshes are often off-screen, and it only fixes frustum culling - it
doesn't help other systems that use the `Aabb`.
## Solution
This PR adds a reliable and reasonably efficient method of updating the
`Aabb` of a skinned mesh from its animated joint transforms. See the
"How Does It Work?" section for more detail.
The glTF loader can add skinned bounds automatically if a new
`GltfSkinnedMeshBoundsPolicy` option is enabled in `GltfPlugin` or
`GltfLoaderSettings`:
```rust
app.add_plugins(DefaultPlugins.set(GltfPlugin {
    skinned_mesh_bounds_policy: GltfSkinnedMeshBoundsPolicy::Dynamic,
    ..default()
}))
```
_The new glTF loader option is enabled by default_. I think this is the
right choice for several reasons:
- Bugs caused by skinned mesh culling have been a regular pain for both
new and experienced users. Now the most common case Just Works(tm).
- The CPU cost is modest (see the "Performance" section), and sophisticated users
can opt out.
- GPU limited apps might see a performance increase if the user was
previously disabling culling.
Non-glTF cases require some manual steps. The user must ask `Mesh` to
generate the skinned bounds, and then add the `DynamicSkinnedMeshBounds`
marker component to their mesh entity.
```rust
mesh.generate_skinned_mesh_bounds()?;
let mesh_asset = mesh_assets.add(mesh);
entity.insert((Mesh3d(mesh_asset), DynamicSkinnedMeshBounds));
```
See the `custom_skinned_mesh` example for real code.
## Bonus Features
### `GltfSkinnedMeshBoundsPolicy::NoFrustumCulling`
This is a convenience for users who prefer the `NoFrustumCulling`
workaround, but want to avoid the hassle of adding it after a glTF scene
has been spawned.
```rust
app.add_plugins(DefaultPlugins.set(GltfPlugin {
    skinned_mesh_bounds_policy: GltfSkinnedMeshBoundsPolicy::NoFrustumCulling,
    ..default()
}))
```
PR #21845 is also adding an option related to skinned mesh bounds. I'm
fine if that PR lands first - I'll update this PR to include the option.
### Gizmos
`bevy_gizmos::SkinnedMeshBoundsGizmoPlugin` can draw the per-joint
AABBs.
```rust
fn toggle_skinned_mesh_bounds(mut config: ResMut<GizmoConfigStore>) {
    config.config_mut::<SkinnedMeshBoundsGizmoConfigGroup>().1.draw_all ^= true;
}
```
The name is debatable. It's not technically drawing the bounds of the
skinned mesh - it's drawing the per-joint bounds that contribute to the
bounds of the skinned mesh.
## Testing
```sh
cargo run --example test_skinned_mesh_bounds
# Press `B` to show mesh bounds, `J` to show joint bounds.
cargo run --example scene_viewer --features "free_camera" -- "assets/models/animated/Fox.glb"
cargo run --example scene_viewer --features "free_camera" -- "assets/models/SimpleSkin/SimpleSkin.gltf"
# More complicated mesh downloaded from https://github.com/KhronosGroup/glTF-Sample-Assets/tree/main/Models/RecursiveSkeletons
cargo run --example scene_viewer --features "free_camera" -- "RecursiveSkeletons.glb"
cargo run --example custom_skinned_mesh
```
I also hacked `custom_skinned_mesh` to simulate awkward cases like
rotated and off-screen entities.
## How Does It Work?
<details><summary>Click to expand</summary>
### Summary
`Mesh::generate_skinned_mesh_bounds` calculates an AABB for each joint
in the mesh - the AABB encloses all the vertices skinned to that joint.
Then every frame, `bevy_camera::update_skinned_mesh_bounds` uses the
current joint transforms to calculate an `Aabb` that encloses all the
joint AABBs.
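
The two steps above can be sketched in plain Rust. This is a minimal std-only illustration, not the PR's code: the `Aabb` and `Affine` types, the function names, the "non-zero weight" test and the corner-based transform are all assumptions made for the example.

```rust
// Hypothetical stand-ins for the real math types; not Bevy's API.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

impl Aabb {
    // An "inverted" box: adding any point makes it valid.
    const EMPTY: Aabb = Aabb { min: [f32::INFINITY; 3], max: [f32::NEG_INFINITY; 3] };

    fn is_empty(&self) -> bool {
        self.min[0] > self.max[0]
    }

    fn add_point(&mut self, p: [f32; 3]) {
        for i in 0..3 {
            self.min[i] = self.min[i].min(p[i]);
            self.max[i] = self.max[i].max(p[i]);
        }
    }
}

// Affine transform: row-major 3x3 linear part plus a translation.
#[derive(Clone, Copy)]
struct Affine {
    m: [[f32; 3]; 3],
    t: [f32; 3],
}

impl Affine {
    fn transform_point(&self, p: [f32; 3]) -> [f32; 3] {
        let mut out = self.t;
        for r in 0..3 {
            for c in 0..3 {
                out[r] += self.m[r][c] * p[c];
            }
        }
        out
    }
}

// Build step (once per mesh): one AABB per joint, enclosing every vertex
// that has a non-zero weight on that joint. Panics on out-of-range indices.
fn build_joint_aabbs(
    positions: &[[f32; 3]],
    joint_indices: &[[u16; 4]],
    joint_weights: &[[f32; 4]],
    joint_count: usize,
) -> Vec<Aabb> {
    let mut out = vec![Aabb::EMPTY; joint_count];
    for ((p, js), ws) in positions.iter().zip(joint_indices).zip(joint_weights) {
        for (&j, &w) in js.iter().zip(ws) {
            if w > 0.0 {
                out[j as usize].add_point(*p);
            }
        }
    }
    out
}

// Per-frame step: transform the eight corners of each joint AABB by that
// joint's current transform and take the box around all of them.
fn skinned_aabb(joint_aabbs: &[Aabb], joint_transforms: &[Affine]) -> Aabb {
    let mut out = Aabb::EMPTY;
    for (aabb, xf) in joint_aabbs.iter().zip(joint_transforms) {
        if aabb.is_empty() {
            continue; // No vertices are skinned to this joint.
        }
        for corner in 0..8 {
            let p = [
                if corner & 1 == 0 { aabb.min[0] } else { aabb.max[0] },
                if corner & 2 == 0 { aabb.min[1] } else { aabb.max[1] },
                if corner & 4 == 0 { aabb.min[2] } else { aabb.max[2] },
            ];
            out.add_point(xf.transform_point(p));
        }
    }
    out
}
```

The build step only needs vertex data, which is why it can be done once per mesh. The per-frame step never touches vertices, so its cost scales with joint count rather than vertex count.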
This approach is reliable, in that the final `Aabb` will always enclose
the skinned vertices. But it can be larger than necessary. In practice
it's tight enough to be useful, and rarely more than 50% bigger.
This approach works even with non-rigid transforms and soft skinning. If
there's any doubt then I can add more detail.
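
A short sketch of the argument, assuming standard linear blend skinning with non-negative weights that sum to one (the PR doesn't spell this out, so treat the notation as illustrative). Here $M_j$ is the affine transform applied for joint $j$, $B_j$ is that joint's AABB in the bind pose, and $A$ is the final `Aabb`:

```latex
% Skinned position of a bind-pose vertex v (assumption: linear blend skinning).
v' = \sum_j w_j \, M_j v, \qquad w_j \ge 0, \quad \sum_j w_j = 1

% B_j contains v whenever w_j > 0, so every term with non-zero weight lies in A.
M_j v \in M_j B_j \subseteq \operatorname{AABB}(M_j B_j) \subseteq A,
\qquad A = \operatorname{AABB}\Bigl(\bigcup_j M_j B_j\Bigr)

% A is convex, and v' is a convex combination of points in A.
v' \in A
```

Nothing here requires $M_j$ to be rigid. An affine map sends a box to a parallelepiped whose extremes along each axis are reached at the images of the box corners, so enclosing the transformed box (for instance by boxing its eight transformed corners) stays conservative under scale and shear as well.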
### Awkward Bits
The solution is not as simple and efficient as it could be.
#### Problem 1: Joint transforms are world-space, `Aabb` is entity-space.
- Ideally we'd use the world-space joint transforms to calculate a
  world-space `Aabb`, but that's not possible.
- The obvious solution is to transform the joints to entity-space, so
  the `Aabb` is directly calculated in entity-space.
  - But that means an extra matrix multiply per joint.
- This PR calculates the `Aabb` in world-space and then transforms it to
  entity-space.
  - That avoids a matrix multiply per-joint, but can increase the size of
    the `Aabb`.
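
To illustrate why the final world-to-entity step can grow the box, here is a minimal std-only sketch. It assumes a centre plus half-extents representation and uses the common "absolute value of the matrix" trick for re-boxing. The type and function names are hypothetical and only the linear part of the transform is modelled.

```rust
// Hypothetical centre/half-extents box; not Bevy's actual type.
#[derive(Clone, Copy, Debug)]
struct Aabb {
    center: [f32; 3],
    half_extents: [f32; 3],
}

// Row-major rotation about the Z axis.
fn rotation_z(angle: f32) -> [[f32; 3]; 3] {
    let (s, c) = angle.sin_cos();
    [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
}

// Smallest axis-aligned box that encloses `aabb` after applying the
// linear map `m`: the centre is transformed normally, while the
// half-extents are multiplied by the element-wise absolute value of `m`.
fn transform_aabb(m: &[[f32; 3]; 3], aabb: &Aabb) -> Aabb {
    let mut center = [0.0; 3];
    let mut half_extents = [0.0; 3];
    for r in 0..3 {
        for c in 0..3 {
            center[r] += m[r][c] * aabb.center[c];
            half_extents[r] += m[r][c].abs() * aabb.half_extents[c];
        }
    }
    Aabb { center, half_extents }
}
```

With an entity rotated 45 degrees about Z, a world-space box with half-extents `(1, 1, 1)` becomes an entity-space box with half-extents of about `(1.414, 1.414, 1)`. The extra volume is the price of skipping the per-joint multiply.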
#### Problem 2: Joint AABBs are in a surprising(?) space.
- When creating joint AABBs from a mesh, the intuitive solution would be
  to calculate them in joint-space.
  - Then the update just has to transform them by the world-space joint
    transform.
  - But to calculate them in joint-space we need both the bind pose vertex
    positions and the bind pose joint transforms.
  - These two parts are in separate assets - `Mesh` and
    `SkinnedMeshInverseBindposes` - and those assets can be mixed and
    matched.
  - So we'd need to calculate a `SkinnedMeshBoundsAsset` for each
    combination of `Mesh` and `SkinnedMeshInverseBindposes`.
  - (`bevy_mod_skinned_aabb` uses this approach - it's slow and fragile.)
- This PR calculates joint AABBs in *mesh-space* (or more strictly
  speaking: bind pose space).
  - That can be done with just the `Mesh` asset.
  - One downside is that the update needs an extra matrix multiply so we
    can go from mesh-space to world-space.
    - However, this might become a performance advantage if frustum culling
      changes - see the "Future Options" section.
  - Another minor downside is that mesh-space AABBs (red in the screenshot
    below) tend to be bigger than joint-space AABBs (green), since joints
    with one long axis might be at an awkward angle in mesh-space.
<img width="1085" height="759" alt="image"
src="https://github.com/user-attachments/assets/a02a28c3-8882-412c-9be1-64109b767da7"
/>
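
The "extra matrix multiply" for mesh-space AABBs can be sketched as follows. This assumes the usual skinning convention where the inverse bind pose maps mesh-space into joint-space; the `Affine` type and function names are hypothetical, not the PR's code.

```rust
// Hypothetical affine transform: row-major 3x3 linear part plus translation.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Affine {
    m: [[f32; 3]; 3],
    t: [f32; 3],
}

impl Affine {
    fn from_translation(t: [f32; 3]) -> Self {
        Affine { m: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], t }
    }

    // Returns the transform that applies `rhs` first, then `self`.
    fn compose(&self, rhs: &Affine) -> Affine {
        let mut m = [[0.0f32; 3]; 3];
        let mut t = self.t;
        for r in 0..3 {
            for c in 0..3 {
                for k in 0..3 {
                    m[r][c] += self.m[r][k] * rhs.m[k][c];
                }
                t[r] += self.m[r][c] * rhs.t[c];
            }
        }
        Affine { m, t }
    }

    fn transform_point(&self, p: [f32; 3]) -> [f32; 3] {
        let mut out = self.t;
        for r in 0..3 {
            for c in 0..3 {
                out[r] += self.m[r][c] * p[c];
            }
        }
        out
    }
}

// The extra per-joint multiply: mesh (bind pose) space -> joint space -> world space.
fn world_from_mesh(world_from_joint: &Affine, inverse_bindpose: &Affine) -> Affine {
    world_from_joint.compose(inverse_bindpose)
}
```

In the bind pose the two transforms cancel, so a mesh-space AABB maps to itself. Once the joint animates, the composed transform carries the mesh-space AABB to wherever the joint has moved, without needing a per-combination joint-space asset.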
### Future Options
For frustum culling there's a cheeky way to optimize and simplify
skinned bounds - put frustum culling in the renderer and calculate a
world-space AABB during `extract_skins`. The joint transform will be
already loaded and in the right space, so we can avoid an entity lookup
and matrix multiply. I estimate this would make skinned bounds 3x
faster.
Another option is to change main world frustum culling to use a
world-space AABB. So there would be a new `GlobalAabb` component that
gets updated each frame from `Aabb` and the entity transform (which is
basically the same as transform propagation and the relationship between
`Transform` and `GlobalTransform`). This has some advantages and
disadvantages but I won't get into them here - I think putting frustum
culling into the renderer is a better option.
(Note that putting frustum culling into the renderer doesn't mean
removing the current main world visibility system - it just means the
main world system would be a separate opt-in system.)
</details>
## Performance
<details><summary>Click to expand</summary>
### Initialization
Creating the skinned bounds asset for `Fox.glb` (576 verts, 22 skinned
joints) takes **0.03ms**. Loading the whole glTF takes 8.7ms, so this is
a **<1% increase**.
### Per-Frame
The `many_foxes` example has 1000 skinned meshes, each with 22 skinned
joints. Updating the skinned bounds takes **0.086ms**. This is a
throughput of roughly 250,000 joints per millisecond, using two threads.
<img width="2404" height="861" alt="image"
src="https://github.com/user-attachments/assets/c27165ae-dc6c-4f6b-bbfb-4e211ab0263c"
/>
The whole animation update takes 3.67ms (where "animation update" =
advancing players + graph evaluation + transform propagation). So we can
kinda sorta claim that this PR increases the cost of skinned animation
by roughly **2-3%** (0.086ms on top of 3.67ms is about 2.3%). But that's very hand-wavey and situation dependent.
This was tested on an AMD Ryzen 7900 but with
`TaskPoolOptions::with_num_threads(6)` to simulate a lower spec CPU.
Comparing against a few other threading options:
- Non-parallel: **0.141ms**.
- 6 threads (2 compute threads): **0.086ms**.
- 24 threads (15 compute threads): **0.051ms**.
So the parallel iterator is better but quickly hits diminishing returns
as the number of threads increases.
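
For reference, the throughput and scaling figures work out as below. This is only arithmetic on the numbers quoted in this section, treating the non-parallel run as the baseline:

```latex
\text{throughput} = \frac{1000 \times 22 \ \text{joints}}{0.086\ \text{ms}}
  \approx 2.56 \times 10^{5}\ \text{joints/ms}

\text{speedup (2 compute threads)} = \frac{0.141\ \text{ms}}{0.086\ \text{ms}} \approx 1.64

\text{speedup (15 compute threads)} = \frac{0.141\ \text{ms}}{0.051\ \text{ms}} \approx 2.76
```

So going from 2 to 15 compute threads buys less than a further 1.7x, which is the diminishing return mentioned above.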
### Future Options
The "How Does It Work" section mentions moving skinned mesh bounds into
the renderer's skin extraction. Based on some microbenchmarks, I
estimate this would reduce non-parallel `many_foxes` from 0.141ms to
0.049ms, so roughly 3x faster. Requiring AVX2 (to enable broadcast
loads) or pre-splatting (to fake broadcast loads for SSE) would knock
off another 25%. And fancier SIMD approaches could do better again.
There are also approaches that trade reliability for performance. For
character rigs, an effective optimization is to fold face and finger
joints into a single bound on the head and hand joints. This can reduce
the number of joints required by 50-80%.
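
A rough sketch of what folding could look like, assuming a hypothetical `fold_target` table that maps each joint to the joint whose bound should absorb it. None of this is in the PR; it only illustrates the idea.

```rust
// Hypothetical min/max box; not Bevy's actual type.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

impl Aabb {
    fn union(&self, other: &Aabb) -> Aabb {
        let mut out = *self;
        for i in 0..3 {
            out.min[i] = out.min[i].min(other.min[i]);
            out.max[i] = out.max[i].max(other.max[i]);
        }
        out
    }
}

// `fold_target[j]` is the joint whose bound absorbs joint `j`'s AABB.
// `fold_target[j] == j` keeps the joint separate. Joints that were folded
// away end up as `None` and can be skipped by the per-frame update.
fn fold_joint_aabbs(aabbs: &[Aabb], fold_target: &[usize]) -> Vec<Option<Aabb>> {
    let mut out: Vec<Option<Aabb>> = vec![None; aabbs.len()];
    for (j, aabb) in aabbs.iter().enumerate() {
        let target = fold_target[j];
        out[target] = Some(match out[target] {
            Some(existing) => existing.union(aabb),
            None => *aabb,
        });
    }
    out
}
```

The folded bound is only exact in the bind pose. Once a finger moves relative to the hand, its vertices can leave the hand's box, which is the reliability being traded away. A real implementation would probably want some padding.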
</details>
## FAQ
<details><summary>Click to expand</summary>
#### Why can't it be automatically added to any mesh? Then the glTF importer and custom mesh generators wouldn't need special logic.
`bevy_mod_skinned_aabb` took the automatic approach, and I don't think
the outcome was good. It needs some surprisingly fiddly and fragile
logic to decide when an entity has the right combination of assets in
the right loaded state. And it can never work with
`RenderAssetUsages::RENDER_WORLD`.
So this PR takes a more modest and manual approach. I think there's
plenty of scope to generalise and automate as the asset pipeline
matures. If the glTF importer becomes a purer glTF -> BSN transform,
then adding skinned bounds could be a general scene/asset transform
that's shared with other importers and custom mesh generators.
#### Why is the data in `Mesh`? Shouldn't it go in `SkinnedMesh` or `SkinnedMeshInverseBindposes`?
That might seem intuitive, but it wouldn't work in practice - the data
is derived from `Mesh` alone. `SkinnedMesh` doesn't work because it's
per mesh instance, so the data would be duplicated.
`SkinnedMeshInverseBindposes` doesn't work because it can be shared
between multiple meshes.
The names are a bit misleading - `Mesh` does contain some skinning data,
while `SkinnedMesh` and `SkinnedMeshInverseBindposes` are more like
joint bindings one step removed from the vertex data.
#### Why not put the bounds on the joint entities?
This is surprisingly tricky in practice because multiple meshes can be
bound to the same joint entity. So there would need to be logic that
tracks the bindings and updates the bounds as meshes are added and
removed.
#### Why is the `DynamicSkinnedMeshBounds` component required?
It's an optimisation for users who want to opt out. It might also be
useful for future expansion, like adding options to approximate the
bounds with an AABB attached to a single joint.
#### Why are the update system and `DynamicSkinnedMeshBounds` component in `bevy_camera`? Shouldn't they be in `bevy_mesh`?
`bevy_camera` is the owner and main user of `Aabb`, and already has some
mesh related logic (`calculate_bounds` automatically adds an `Aabb` to
mesh entities). So putting it in `bevy_camera` is consistent with the
current structure. I'd agree that it's a little awkward though and could
change in future.
</details>
## What Do Other Engines Do?
<details><summary>Click to expand</summary>
- **Unreal**: Automatically uses [collision
shapes](https://dev.epicgames.com/documentation/en-us/unreal-engine/physics-asset-editor-in-unreal-engine)
attached to joints, which is similar to this PR in practice but fragile
and inefficient. Also supports various fixed bounds options.
- **Unity**: Fixed bounds attached to the root bone. Automatically
calculated from animation poses or specified manually
([documentation](https://docs.unity3d.com/6000.4/Documentation/Manual/troubleshooting-skinned-mesh-renderer-visibility.html)).
- **Godot**: Appears to use roughly the same method as this PR, although
I didn't 100% confirm. See
[`MeshStorage::mesh_get_aabb`](https://github.com/godotengine/godot/blob/fafc07335bdecacd96b548c4119fbe1f47ee5866/servers/rendering/renderer_rd/storage_rd/mesh_storage.cpp#L650)
and
[`RendererSceneCull::_update_instance_aabb`](https://github.com/godotengine/godot/blob/235a32ad11f40ecba26d6d9ceea8ab245c13adb0/servers/rendering/renderer_scene_cull.cpp#L1991).
- **O3DE**: Fixed bounds attached to root bone, plus option to
approximate the AABB from joint origins and a fudge factor.
- **Northlight** (Remedy, Alan Wake 2): Specifically for vegetation,
calculates bounds from joint extents on GPU
([source](https://gdcvault.com/play/1034310/Large-Scale-GPU-Based-Skinning),
slide 48).
An approach that's been proposed several times for Bevy is copying
Unity's "fixed AABB from animation poses". I think this is more
complicated and less reliable than many people expect. More complicated
because linking animations to meshes can often be difficult. Less
reliable because it doesn't account for ragdolls and procedural
animation. But it could still be viable for simple cases like a
single self-contained glTF with basic animation.
</details>
---------
Co-authored-by: Alice Cecile <[email protected]>

1 parent 304265b commit 9d2db88
File tree

15 files changed (+1399, -10 lines)
- crates
  - bevy_camera/src/visibility
  - bevy_gizmos
    - src
  - bevy_gltf/src
    - loader
  - bevy_internal
  - bevy_mesh
    - src
- examples
  - animation
  - tools/scene_viewer
- release-content/release-notes
- tests/3d