-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
Retain asset without data for RENDER_WORLD-only assets #21732
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I wonder if it would be better having an enum like this just to have some more information enum RenderAssetData {
Available(Vec<u8>),
Unavailable,
SentToRenderWorld
} |
|
maybe ... i tried to avoid changing the api of the main-world types (particularly |
crates/bevy_mesh/src/mesh.rs
Outdated
| #[inline] | ||
| pub fn indices(&self) -> Option<&Indices> { | ||
| self.indices.as_ref() | ||
| self.indices.as_ref().expect("Mesh has been extracted to RenderWorld. To access vertex attributes, the mesh must have RenderAssetUsages::MAIN_WORLD").as_ref() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we swap these methods to return Result and avoid the expects here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could … it is a usage error though, rather than “something that can happen and you should deal with”, like result usually implies.
I don’t feel strongly, will change if you prefer it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the intent is for a user to "deal with it", but rather "let's not crash the program because one asset has the wrong flags set". So I am in favour of making these Results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making these return results will add literally hundreds of unwraps to the code base, e.g. there are 104 from primitive mesh builder functions calling with_inserted_attribute which are all guaranteed to succeed since they operate on meshes they've just constructed themselves.
perhaps i can add try_xxx variants, so that if we find situations where there are panics that we don't want we have a way to manage them in future.
crates/bevy_mesh/src/mesh.rs
Outdated
| #[inline] | ||
| pub fn indices(&self) -> Option<&Indices> { | ||
| self.indices.as_ref() | ||
| self.indices.as_ref().expect("Mesh has been extracted to RenderWorld. To access vertex attributes, the mesh must have RenderAssetUsages::MAIN_WORLD").as_ref() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the intent is for a user to "deal with it", but rather "let's not crash the program because one asset has the wrong flags set". So I am in favour of making these Results.
crates/bevy_mesh/src/mesh.rs
Outdated
| } | ||
| } | ||
|
|
||
| pub fn take_gpu_data(&mut self) -> Option<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc comment please! Also linking to RenderAsset::take_gpu_data would be good.
| fn take_gpu_data( | ||
| source: &mut Self::SourceAsset, | ||
| _previous_gpu_asset: Option<&Self>, | ||
| ) -> Option<Self::SourceAsset> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we even provide a default impl here? Cloning generally seems like the wrong behaviour to me, since I'd expect pretty much every impl to want to transfer the data rather than doing a possibly expensive clone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm. i'm not sure the defaul impl gets called anywhere currently (i think the remaining RenderAsset types don't allow the user to set the usages), but then, what should we do here? panic in the trait impl or implement it explicitly for all the RenderAsset types with a panic or a clone, and require users to implement it explicitly as well even if they won't use it?
that doesn't seem helpful to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i changed it so the default returns AssetExtractionError::NoImplementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I much prefer making users implement this explicitly - pretty much every render asset should want to handle this correctly I think, especially since the default is just breaking.
|
|
||
| /// Retrieves the vertex `indices` of the mesh mutably. | ||
| #[inline] | ||
| pub fn indices_mut(&mut self) -> Option<&mut Indices> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these mutation methods essentially stop functioning when the asset is extracted. What about runtime generated (and mutated) assets? Is the intent that users will like fully replace the Mesh with like *mesh_mut = Mesh::new()?
If so, this probably needs a migration guide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the pattern would be either to fully initialize in one step, or to initialize with RenderAssetUsages::MAIN_WORLD, make any modifications, then when it's finished change the usage to RenderAssetUsages::RENDER_WORLD. after that no more modifications are possible since the data is only on GPU and we can't get it back.
if users want to make continuous runtime modifications, then they should use MAIN_WORLD || RENDER_WORLD, or fully replacing will also work.
this PR doesn't change anything in those scenarios, modifying an asset with RENDER_WORLD and not MAIN_WORLD is not possible currently either since the asset gets removed from the Assets collection. i can write something if you like, but the usage doesn't change.
| ) -> Option<Self::SourceAsset> { | ||
| let data = source.data.take(); | ||
|
|
||
| let valid_upload = data.is_some() || previous_gpu_asset.is_none_or(|prev| !prev.had_data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just reupload here? We're losing a capability here I think, since previously hot-reloading an image would just send the data to the GPU, which would unconditionally upload. Am I missing something here?
Could you also test that hot reloading still works? I believe the alter_sprite example is probably sufficient with the file_watcher feature (and then messing with the loaded sprite).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the purpose of RenderAssetUsages::RENDER_WORLD is to reduce CPU memory by not keeping data on the CPU that is only required on the GPU. we clearly can't re-upload data from the CPU that we have chosen to purge.
as i described in the top comment, this check is specifically to catch an error which was not previously possible: where a user modifies some metadata on a RENDER_WORLD-only image after it has been transferred. that's not possible currently because currently the image gets removed from the CPU-side Assets collection so cannot be accidentally modified.
keeping access to it CPU side has the benefits i listed right at the top, but introduces this new way to break things, so i added this check to help anyone who does so accidentally. without this check, the image would be treated as an uninitialized texture -- for images with data: None we create an uninitialized GPU-texture which is much faster -- and the GPU pixel data would be zeros, which will never be what the user wants if the image had data when first transferred.
we're not losing any capabilities. default is MAIN_WORLD | RENDER_WORLD, in which case you can modify and (re)upload as much as you like because we keep the pixel data CPU side.
hot reloading still works for MAIN_WORLD | RENDER_WORLD and for RENDER_WORLD-only as reloading an asset reloads the data too.
i still need to do a pass through to fix up docs for all the copy/pasted functions, but the logic should be complete now i think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code seems reasonable - just got a couple of requests to avoid some corner cases. I also tested a variety of examples, and hacked in a few extra tests (including glTFs with RENDER_WORLD only, and making some examples deliberately trigger errors).
I don't like how the Mesh interface has ended up, but I'm not sure that's a problem with this PR. My hot take is that it's a problem with Mesh trying to be three things at once:
- A finished, immutable asset that can be efficiently shared and rendered.
- A way to incrementally build a mesh by adding/modifying attributes and suchlike.
- A way to animate meshes by mutating an asset in-place and re-extracting.
I think the solution could be:
- Add a new
MeshBuilderstruct and move all theMeshmutation methods toMeshBuilder.- This struct wouldn't have to deal with
MeshAccesserrors since it's entirely CPU side, so a load oftry_xxxmethods disappear. - A
finishfunction would consume the builder and return aMesh.
- This struct wouldn't have to deal with
- Change
Meshto be immutable, except bytake_gpu_data.- Animation that wants to mutate the mesh now has to create a fresh
Meshand overwrite the entire asset.
- Animation that wants to mutate the mesh now has to create a fresh
That gives as clear separation between 1) and 2), but makes 3) more awkward and inefficient. I'd argue that's ok in the short-term since it still basically works and mutating the asset is always going to be somewhat inefficient. In the long-term there could be a separate way of efficiently sending fine-grained mesh updates to the GPU.
The Mesh/MeshBuilder separation might also make compressed and derived data (like AABBs) more robust and efficient. So MeshBuilder::finish would automatically calculate AABBs and optionally apply compression.
EDIT: I should have said that I don't think this PR needs to explore stuff like MeshBuilder. What the PR does right now - make the Mesh interface more awkward in the short-term - seems like a reasonable trade-off for solving RENDER_WORLD issues now.
crates/bevy_mesh/src/mesh.rs
Outdated
| }) = attributes.as_ref()?.get(&Self::ATTRIBUTE_POSITION.id) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| }) = attributes.as_ref()?.get(&Self::ATTRIBUTE_POSITION.id) | |
| { | |
| }) = attributes.as_ref()?.get(&Self::ATTRIBUTE_POSITION.id) | |
| && !position_values.is_empty() | |
| { |
Not 100% sure it's actually possible to make and extract a mesh with an empty position attribute, but seems reasonable to avoid a panic here just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll change this to not bubble the error here, just fail to calc the aabb. then the only "?" error will be AlreadyExtracted. that should solve this and the below issue.
it's legitimate to extract a mesh with no positions, i guess, if you are doing some custom processing that doesn't use positions? maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry i misunderstood - i see now why this is required and added it
| fn take_gpu_data( | ||
| source: &mut Self::SourceAsset, | ||
| _previous_gpu_asset: Option<&Self>, | ||
| ) -> Result<Self::SourceAsset, AssetExtractionError> { | ||
| source | ||
| .take_gpu_data() | ||
| .map_err(|_| AssetExtractionError::AlreadyExtracted) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Mesh::take_gpu_data returns MeshAccessError::NotFound (e.g. because the mesh has no attributes), then this logic will wrongly report AlreadyExtracted. I repro'd this by changing the generate_custom_mesh example to make a mesh with no attributes.
I'm not sure what the solution is - maybe AssetExtractionError should also have a discriminant for missing data.
|
it would be nice to have a MeshBuilder with infallible access methods. then all the Mesh accessors could be result-based without scattering 100s of unwraps about. i don't think i want to do it as part of this PR though. |
|
Ah, my bad, I should have been clearer that I think this PR is fine as it is. It would have been simpler if |
…hat bevyengine#21732 or similar will land, so we don't lose the bounds if the mesh doesn't have `RenderAssetUsages::MAIN_WORLD`.
Co-authored-by: Greeble <166992735+greeble-dev@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm clicking approve because the PR solves an immediate problem that I've seen a few people trip over. It will also unblock #21837. The Mesh interface ends up a bit awkward right now, but there's room to improve it in future (see #21986). I think the PR still needs a migration guide though?
EDIT: Minor correction - maybe this PR can skip the migration guide, assuming #21986 lands in the same release and has a migration that covers both? Would keep things simpler.
I totally agree with @greeble-dev. So copying vertex and indices to MeshAllocator is fast and no longer needs repacking data. This opens a door to direct constructing Mesh and mutating Mesh in-place if you have the raw vertex and indices data and info and want to avoid repacking. I think the current approach of extracting vertex attributes is suboptimal. And Edit: Oh, think again, I can see the reason to store attributes on the Mesh since |
|
Regarding the Mesh interface, I‘d prefer to remove these |
since the change to results was at request of @andriyDev (and @alice-i-cecile) i'll defer to them. i agree fwiw |
| attributes: BTreeMap<MeshVertexAttributeId, MeshAttributeData>, | ||
| indices: Option<Indices>, | ||
| attributes: MeshExtractableData<BTreeMap<MeshVertexAttributeId, MeshAttributeData>>, | ||
| indices: MeshExtractableData<Indices>, | ||
| #[cfg(feature = "morph")] | ||
| morph_targets: Option<Handle<Image>>, | ||
| morph_targets: MeshExtractableData<Handle<Image>>, | ||
| #[cfg(feature = "morph")] | ||
| morph_target_names: Option<Vec<String>>, | ||
| morph_target_names: MeshExtractableData<Vec<String>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These extractable data can be consolidated. This ensures you always extract them together and that they are either all with data or without data simultaneously. Like this:
struct MeshExtractableData {
attributes: BTreeMap<MeshVertexAttributeId, MeshAttributeData>,
indices: Option<Indices>,
#[cfg(feature = "morph")]
morph_targets: Option<Handle<Image>>,
#[cfg(feature = "morph")]
morph_target_names: Option<Vec<String>>,
}
struct Mesh {
...
extractable_data: Option<MeshExtractableData>,
...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be. i don't think it improves anything but i can change it if others disagree.
IceSentry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new _option function feel a bit weird but I don't really have an alternative solution. I mean it could either always return the option or always map the option to a result by it's not a blocker.
LGTM
|
And yeah, I prefer keeping the |
| } | ||
|
|
||
| impl<T> MeshExtractableData<T> { | ||
| // get a reference to internal data. returns error if data has been extracted, or if no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make these all doc comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we typically use doc comments for private methods on private structs? i will change if so but i didn't think it was standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically do, as we have documentation built for private items on dev-docs.bevy.org.
| fn take_gpu_data( | ||
| source: &mut Self::SourceAsset, | ||
| _previous_gpu_asset: Option<&Self>, | ||
| ) -> Option<Self::SourceAsset> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I much prefer making users implement this explicitly - pretty much every render asset should want to handle this correctly I think, especially since the default is just breaking.
| if let Some(asset) = assets.remove(id) { | ||
| extracted_assets.push((id, asset)); | ||
| added.insert(id); | ||
| if let Some(asset) = assets.get_mut_untracked(id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried that we're silently mutating the asset. Is this cause for concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue it's kinda part of the contract of using the render world asset usage but I can see why some people wouldn't like it. I'm not sure what the alternative is unless we can communicate more details about why an asset changed. If a user could react to this change right now there would be no way to know if it was a user event or the render world extraction. I guess if they look at the asset and see missing data they would know that's what hapenned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i used get_mut initially, but it ended up in a broken feedback loop since re-extraction is triggered by asset modifications (which is required behaviour for intentional user modifications), including this one.
the alternative would be to use a get_mut variant that fires a different event i guess. but the behaviour of extraction is predictable enough -- the data will be removed in the extract stage -- that i don't think it's warranted.
there are several |
My concern is that if we add more extractable resources (like RawMesh, etc.), we will also follow a similar pattern of providing |
you've seen the |
|
Another possible approach I consider is to consolidate the extractable data and access it through something similar to entry API, thereby avoiding multiple unwrap operations. |
That is unrelated. I mean if more extractable assets are added in the future, or if crate authors want to add their extractable assets, should the interfaces also follow the |
## Objective Fix a panic in mesh picking if a mesh is `RenderAssetUsages::RENDER_WORLD` only. The panic was reported in #22206, although that issue is framed as a broader concern and so is not resolved by this PR. The problem was introduced in #21732. ## Solution Changed mesh accesses to use the non-panicking `try_` variants. This means extracted meshes are silently ignored as before. I also did a quick scan to see if could spot any other engine crates that might have the same issue - didn't see any but could easily have missed one. ## Testing Tested by modifying the `mesh_picking` example: ```diff + let mut p = Mesh::from(Cuboid::default()); + p.asset_usage = bevy_asset::RenderAssetUsages::RENDER_WORLD; let shapes = [ - meshes.add(Cuboid::default()), + meshes.add(p), ```
…ly meshes (#22407) This is targeting `release-0.18.0`, NOT `main`! # Objective - Fixes #22206 ## Solution - Adds a migration guide for the changes in #21732. As someone not too familiar with mesh stuff, I’d appreciate a review from someone more familiar with the changes and mesh things! FYI @robtfm @beicause
## Objective Fix a panic in mesh picking if a mesh is `RenderAssetUsages::RENDER_WORLD` only. The panic was reported in #22206, although that issue is framed as a broader concern and so is not resolved by this PR. The problem was introduced in #21732. ## Solution Changed mesh accesses to use the non-panicking `try_` variants. This means extracted meshes are silently ignored as before. I also did a quick scan to see if could spot any other engine crates that might have the same issue - didn't see any but could easily have missed one. ## Testing Tested by modifying the `mesh_picking` example: ```diff + let mut p = Mesh::from(Cuboid::default()); + p.asset_usage = bevy_asset::RenderAssetUsages::RENDER_WORLD; let shapes = [ - meshes.add(Cuboid::default()), + meshes.add(p), ```
…21837) ## Objective Mostly fix #4971 by adding a new option for updating skinned mesh `Aabb` components from joint transforms. https://github.com/user-attachments/assets/c25b31fa-142d-462b-9a1d-012ea928f839 This fixes cases where vertex positions are only modified through skinning. It doesn't fix other cases like morph targets and vertex shaders. The PR kind of upstreams [`bevy_mod_skinned_aabb`](https://github.com/greeble-dev/bevy_mod_skinned_aabb), but with some changes to make it simpler and more reliable. ### Dependencies - (MERGED) #21732 (or something similar) is desirable to make the new option work with `RenderAssetUsages::RENDER_WORLD`-only meshes. - This PR is authored as if 21732 has landed. But if that doesn't happen then I can adjust this PR to note the limitation. - (Optional) #21845 adds an option related to skinned mesh bounds. - Either PR can land first - the second will need to be updated. ## Background If a main world entity has a `Mesh3d` component then it's automatically assigned an `Aabb` component. This is done by `bevy_camera` or `bevy_gltf`. The `Aabb` is used by `bevy_camera` for frustum culling. It can also be used by `bevy_picking` as an optimization, and by third party crates. But there's a problem - the `Aabb` can be wrong if something changes the mesh's vertex positions after the `Aabb` is calculated. This can be done by vertex shaders - notably skinning and morph targets - or by mutating the `Mesh` asset (#4294). For the skinning case, the most common solution has been to disable frustum culling via the `NoFrustumCulling` component. This is simple, and might even be the most efficient approach for apps where meshes tend to stay on-screen. But it's annoying to implement, bad for apps where meshes are often off-screen, and it only fixes frustum culling - it doesn't help other systems that use the `Aabb`. ## Solution This PR adds a reliable and reasonably efficient method of updating the `Aabb` of a skinned mesh from its animated joint transforms. See the "How does it work" section for more detail. The glTF loader can add skinned bounds automatically if a new `GltfSkinnedMeshBoundsPolicy` option is enabled in `GltfPlugin` or `GltfLoaderSettings`: ```rust app.add_plugins(DefaultPlugins.set(GltfPlugin { skinned_mesh_bounds_policy: GltfSkinnedMeshBoundsPolicy::Dynamic, ..default() })) ``` _The new glTF loader option is enabled by default_. I think this is the right choice for several reasons: - Bugs caused by skinned mesh culling have been a regular pain for both new and experienced users. Now the most common case Just Works(tm). - The CPU cost is modest (see later section), and sophisticated users can opt-out. - GPU limited apps might see a performance increase if the user was previously disabling culling. Non-glTF cases require some manual steps. The user must ask `Mesh` to generate the skinned bounds, and then add the `DynamicSkinnedMeshBounds` marker component to their mesh entity. ```rust mesh.generate_skinned_mesh_bounds()?; let mesh_asset = mesh_assets.add(mesh); entity.insert((Mesh3d(mesh_asset), DynamicSkinnedMeshBounds)); ``` See the `custom_skinned_mesh` example for real code. ## Bonus Features ### `GltfSkinnedMeshBoundsPolicy::NoFrustumCulling` This is a convenience for users who prefer the `NoFrustumCulling` workaround, but want to avoid the hassle of adding it after a glTF scene has been spawned. ```rust app.add_plugins(DefaultPlugins.set(GltfPlugin { skinned_mesh_bounds_policy: GltfSkinnedMeshBoundsPolicy::NoFrustumCulling, ..default() })) ``` PR #21845 is also adding an option related to skinned mesh bounds. I'm fine if that PR lands first - I'll update this PR to include the option. ### Gizmos `bevy_gizmos::SkinnedMeshBoundsGizmoPlugin` can draw the per-joint AABBs. ```rust fn toggle_skinned_mesh_bounds(mut config: ResMut<GizmoConfigStore>) { config.config_mut::<SkinnedMeshBoundsGizmoConfigGroup>().1.draw_all ^= true; } ``` The name is debatable. It's not technically drawing the bounds of the skinned mesh - it's drawing the per-joint bounds that contribute to the bounds of the skinned mesh. ## Testing ```sh cargo run --example test_skinned_mesh_bounds # Press `B` to show mesh bounds, 'J' to show joint bounds. cargo run --example scene_viewer --features "free_camera" -- "assets/models/animated/Fox.glb" cargo run --example scene_viewer --features "free_camera" -- "assets/models/SimpleSkin/SimpleSkin.gltf" # More complicated mesh downloaded from https://github.com/KhronosGroup/glTF-Sample-Assets/tree/main/Models/RecursiveSkeletons cargo run --example scene_viewer --features "free_camera" -- "RecursiveSkeletons.glb" cargo run --example custom_skinned_mesh ``` I also hacked `custom_skinned_mesh` to simulate awkward cases like rotated and off-screen entities. ## How Does It Work? <details><summary>Click to expand</summary> ### Summary `Mesh::generated_skinned_mesh_bounds` calculates an AABB for each joint in the mesh - the AABB encloses all the vertices skinned to that joint. Then every frame, `bevy_camera::update_skinned_mesh_bounds` uses the current joint transforms to calculate an `Aabb` that encloses all the joint AABBs. This approach is reliable, in that the final `Aabb` will always enclose the skinned vertices. But it can be larger than necessary. In practice it's tight enough to be useful, and rarely more than 50% bigger. This approach works even with non-rigid transforms and soft skinning. If there's any doubt then I can add more detail. ### Awkward Bits The solution is not as simple and efficient as it could be. #### Problem 1: Joint transforms are world-space, `Aabb` is entity-space. - Ideally we'd use the world-space joint transforms to calculate a world-space `Aabb`, but that's not possible. - The obvious solution is to transform the joints to entity-space, so the `Aabb` is directly calculated in entity-space. - But that means an extra matrix multiply per joint. - This PR calculates the `Aabb` in world-space and then transforms it to entity-space. - That avoids a matrix multiply per-joint, but can increase the size of the `Aabb`. #### Problem 2: Joint AABBs are in a surprising(?) space. - When creating joint AABBs from a mesh, the intuitive solution would be to calculate them in joint-space. - Then the update just has to transform them by the world-space joint transform. - But to calculate them in joint-space we need both the bind pose vertex positions and the bind pose joint transforms. - These two parts are in separate assets - `Mesh` and `SkinnedMeshInverseBindposes` - and those assets can be mixed and matched. - So we'd need to calculate a `SkinnedMeshBoundsAsset` for each combination of `Mesh` and `SkinnedMeshInverseBindposes`. - (`bevy_mod_skinned_aabb` uses this approach - it's slow and fragile.) - This PR calculates joint AABBs in *mesh-space* (or more strictly speaking: bind pose space). - That can be done with just the `Mesh` asset. - One downside is that the update needs an extra matrix multiply so we can go from mesh-space to world-space. - However, this might become a performance advantage if frustum culling changes - see the "Future Options" section. - Another minor downside is that mesh-space AABBs (red in the screenshot below) tend to be bigger than joint-space AABBs (green), since joints with one long axis might be at an awkward angle in mesh-space. <img width="1085" height="759" alt="image" src="https://github.com/user-attachments/assets/a02a28c3-8882-412c-9be1-64109b767da7" /> ### Future Options For frustum culling there's a cheeky way to optimize and simplify skinned bounds - put frustum culling in the renderer and calculate a world-space AABB during `extract_skins`. The joint transform will be already loaded and in the right space, so we can avoid an entity lookup and matrix multiply. I estimate this would make skinned bounds 3x faster. Another option is to change main world frustum culling to use a world-space AABB. So there would be a new `GlobalAabb` component that gets updated each frame from `Aabb` and the entity transform (which is basically the same as transform propagation and the relationship between `Transform` and `GlobalTransform`). This has some advantages and disadvantages but I won't get into them here - I think putting frustum culling into the renderer is a better option. (Note that putting frustum culling into the renderer doesn't mean removing the current main world visibility system - it just means the main world system would be separate opt-in system) </details> ## Performance <details><summary>Click to expand</summary> ### Initialization Creating the skinned bounds asset for `Fox.glb` (576 verts, 22 skinned joints) takes **0.03ms**. Loading the whole glTF takes 8.7ms, so this is a **<1% increase**. ### Per-Frame The `many_foxes` example has 1000 skinned meshes, each with 22 skinned joints. Updating the skinned bounds takes **0.086ms**. This is a throughput of roughly 250,000 joints per millisecond, using two threads. <img width="2404" height="861" alt="image" src="https://github.com/user-attachments/assets/c27165ae-dc6c-4f6b-bbfb-4e211ab0263c" /> The whole animation update takes 3.67ms (where "animation update" = advancing players + graph evaluation + transform propagation). So we can kinda sorta claim that this PR increases the cost of skinned animation by roughly **3%**. But that's very hand-wavey and situation dependent. This was tested on an AMD Ryzen 7900 but with `TaskPoolOptions::with_num_threads(6)` to simulate a lower spec CPU. Comparing against a few other threading options: - Non-parallel: **0.141ms**. - 6 threads (2 compute threads): **0.086ms**. - 24 threads (15 compute threads): **0.051ms**. So the parallel iterator is better but quickly hits diminishing returns as the number of threads increases. ### Future Options The "How Does It Work" section mentions moving skinned mesh bounds into the renderer's skin extraction. Based on some microbenchmarks, I estimate this would reduce non-parallel `many_foxes` from 0.141ms to 0.049ms, so roughly 3x faster. Requiring AVX2 (to enable broadcast loads) or pre-splatting (to fake broadcast loads for SSE) would knock off another 25%. And fancier SIMD approaches could do better again. There's also approaches that trade reliability for performance. For character rigs, an effective optimization is to fold face and finger joints into a single bound on the head and hand joints. This can reduce the number of joints required by 50-80%. </details> ## FAQ <details><summary>Click to expand</summary> #### Why can't it be automatically added to any mesh? Then the glTF importer and custom mesh generators wouldn't need special logic. `bevy_mod_skinned_aabb` took the automatic approach, and I don't think the outcome was good. It needs some surprisingly fiddly and fragile logic to decide when an entity has the right combination of assets in the right loaded state. And it can never work with `RenderAssetUsages::RENDER_WORLD`. So this PR takes a more modest and manual approach. I think there's plenty of scope to generalise and automate as the asset pipeline matures. If the glTF importer becomes a purer glTF -> BSN transform, then adding skinned bounds could be a general scene/asset transform that's shared with other importers and custom mesh generators. #### Why is the data in `Mesh`? Shouldn't it go in `SkinnedMesh` or `SkinnedMeshInverseBindposes`? That might seem intuitive, but it wouldn't work in practice - the data is derived from `Mesh` alone. `SkinnedMesh` doesn't work because it's per mesh instance, so the data would be duplicated. `SkinnedMeshInverseBindposes` doesn't work because it can be shared between multiple meshes. The names are a bit misleading - `Mesh` does contain some skinning data, while `SkinnedMesh` and `SkinnedMeshInverseBindposes` are more like joint bindings one step removed from the vertex data. #### Why not put the bounds on the joint entities? This is surprisingly tricky in practice because multiple meshes can be bound to the same joint entity. So there would need to be logic that tracks the bindings and updates the bounds as meshes are added and removed. #### Why is the `DynamicSkinnedMeshBounds` component required? It's an optimisation for users who want to opt out. It might also be useful for future expansion, like adding options to approximate the bounds with an AABB attached to a single joint. #### Why are the update system and `DynamicSkinnedMeshBounds` component in `bevy_camera`? Shouldn't they be in `bevy_mesh`? `bevy_camera` is the owner and main user of `Aabb`, and already has some mesh related logic (`calculate_bounds` automatically adds an `Aabb` to mesh entities). So putting it in `bevy_camera` is consistent with the current structure. I'd agree that it's a little awkward though and could change in future. </details> ## What Do Other Engines Do? <details><summary>Click to expand</summary> - **Unreal**: Automatically uses [collision shapes](https://dev.epicgames.com/documentation/en-us/unreal-engine/physics-asset-editor-in-unreal-engine) attached to joints, which is similar to this PR in practice but fragile and inefficient. Also supports various fixed bounds options. - **Unity**: Fixed bounds attached to the root bone. Automatically calculated from animation poses or specified manually ([documentation](https://docs.unity3d.com/6000.4/Documentation/Manual/troubleshooting-skinned-mesh-renderer-visibility.html)). - **Godot**: Appears to use roughly the same method as this PR, although I didn't 100% confirm. See [`MeshStorage::mesh_get_aabb`](https://github.com/godotengine/godot/blob/fafc07335bdecacd96b548c4119fbe1f47ee5866/servers/rendering/renderer_rd/storage_rd/mesh_storage.cpp#L650) and [`RendererSceneCull::_update_instance_aabb`](https://github.com/godotengine/godot/blob/235a32ad11f40ecba26d6d9ceea8ab245c13adb0/servers/rendering/renderer_scene_cull.cpp#L1991). - **O3DE**: Fixed bounds attached to root bone, plus option to approximate the AABB from joint origins and a fudge factor. - **Northlight** (Remedy, Alan Wake 2): Specifically for vegetation, calculates bounds from joint extents on GPU ([source](https://gdcvault.com/play/1034310/Large-Scale-GPU-Based-Skinning), slide 48) An approach that's been proposed several times for Bevy is copying Unity's "fixed AABB from animation poses". I think this is more complicated and less reliable than many people expect. More complicated because linking animations to meshes can often be difficult. Less reliable because it doesn't account for ragdolls and procedural animation. But it could still be viable for for simple cases like a single self-contained glTF with basic animation. </details> --------- Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Objective
when
RenderAssetswithRenderAssetUsages::RENDER_WORLDand withoutRenderAssetUsages::MAIN_WORLDare extracted, the asset is removed from the assets collection. this causes some issues:AssetServer::get_handleSolution
extraction:
take_gpu_datato theRenderAssettrait. use it to pull the data out of the asset for transfer, and leave the empty asset in the collection. default implementation justclones the asset.panic. this follows from modifying an asset after extraction, which is always a code error, so i think panic here makes senselog an errorMesh/RenderMesh:
Mesh::attributesandMesh::indicesoptionsexpectoperations which access or modify the vertex data or indices if it has been extracted. accessing the vertex data after extraction is always a code error. fixes Somehow prevent confusion caused by Assets being removed due to not having RenderAssetUsages::MAIN_WORLD #19737 by resulting in the errorMesh has been extracted to RenderWorld. To access vertex attributes, the mesh must have RenderAssetUsages::MAIN_WORLDtry_xxxoperations which allow users to handle the access error gracefully if required (no usages as part of this pr, but provided for future)Aabbwhen gpu data is taken and store the result. this allows extracted meshes to still use frustum culling (otherwise using multiple copies of an extracted mesh now panics ascompute_aabbrelied on vertex positions). there's a bit of a tradeoff here: users may not need the Aabb and we needlessly compute it. but i think users almost always do want them, and computing once (for extracted meshes) is cheaper than the alternative, keeping position data and computing a freshAabbevery time the mesh is used on a new entity.Image/GpuImage:
images are a little more complex because the data can be deliberately
Nonefor render-targets / GPU-written textures where we only want an uninitialized gpu-side texture.Image::dataon extractionGpuImagewhether any data was found initiallycorner case / issue: when used with
RenderAssetBytesPerFrameLimiterthere may be no previous gpu asset if it is still queued pending upload due to the bandwidth limit. this can result in a modified image with initial data skipping thehad_datacheck, resulting in a blank texture. i think this is sufficiently rare that it's not a real problem, users would still hit the panic if the asset is transferred in time and the problem/solution should be clear when they do hit it.ShaderStorageBuffer/GpuShaderStorageBuffer
follows the same pattern as Image/GpuImage:
ShaderStorageBuffer::dataon extractionGpuShaderStorageBufferwhether any data was found initiallywe don't have the queue issue here because
GpuShaderStorageBufferdoesn't implementbyte_lenso we can't end up queueing them.other RenderAssets
i didn't modify the other
RenderAssettypes (GpuAutoExposureCompensationCurve,GpuLineGizmo,RenderWireframeMaterial,PreparedMaterial,PreparedMaterial2d,PreparedUiMaterial) on the assumption thatcloning these is cheap enough anywaythe asset usages are not exposed so we should never calltake_gpu_data. the default implementation panics with a message directing users to implement the method if requiredTesting
only really tested within my work project. i can add some explicit tests if required.