FAQ
Low-dimensional matrices and vectors (i.e. <= 4 dimensions) are so commonly used in multimedia applications that they deserve to be promoted to core types, and are expected to be as fast as the hardware allows, at all times.
It is safe to assume that they are used a LOT, especially when they're processed en masse.
Positions, directions, curves, shapes, colors, texture coordinates, physics systems, animation systems, and many others: all of them need and process such types all the time. When they are fast, a lot of dependent systems greatly benefit from it.
It is quite a commitment to choose a vector/matrix library for a project - any game, engine or library that uses it is implicitly tied to its decisions regarding performance and functionality. When none of these libraries is suitable for the task, one ends up being written in-house, with varying quality.
Today, especially with Rust's power, it is possible to unconditionally have genericity, ergonomics, functionality and performance all at once.
With `vek`, I hope to provide these core types as generics, while generating high-quality assembly output on specific hardware, when appropriate types are used (and you're on Nightly with the `repr_simd` feature enabled).
Let's have our cake and eat it too!
Yes! The `repr_simd` feature simply does nothing in this case.
Use the excellent `rustc_version` crate in a custom build script. You may want to look at how `vek` itself does this.
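As a rough sketch (not `vek`'s exact build script), a `build.rs` using `rustc_version` to detect the Nightly channel could look like this, assuming `rustc_version` is listed under `[build-dependencies]`:

```rust
// build.rs — emit a custom cfg flag only when compiling on Nightly,
// so the crate can gate repr_simd code behind #[cfg(nightly)].
extern crate rustc_version;
use rustc_version::{version_meta, Channel};

fn main() {
    if let Channel::Nightly = version_meta().unwrap().channel {
        println!("cargo:rustc-cfg=nightly");
    }
}
```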
`vec` always sounded cool in my head. With a `k` at the end, it's even better!
Perhaps it's also because it reminds me of `vk`, the prefix used by the Vulkan API.
Yeah, it's quite a pain. There are ways to make it noticeably less slow, though:
disable default features, then selectively enable the ones you need.
It's unlikely that you need absolutely all of `vek`'s types, so pick the few that you'll actually use.
In particular, disabling the `repr_simd` feature should divide build times by approximately two.
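In `Cargo.toml`, that amounts to something like the following (the version number is a placeholder; check `vek`'s own `Cargo.toml` for the actual feature names to re-enable):

```toml
[dependencies.vek]
version = "..."           # use the current release
default-features = false  # then re-enable only the features you need
```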
Borrowing members of `packed` structs is `unsafe` (see #46043).
One of the numerous implications of this is that packed structs cannot contain members that implement `Drop`.
So, this would mean goodbye to `Vec4<BigInt>`.
Arguably, this is weird, because vectors are essentially arrays, and getting a reference to an array's element is safe.
So, why aren't vectors implemented as arrays? Well, because IMO, the following is outright ridiculous:
```rust
let mut v = Vec4 { elements: [0, 1, 2, 3] };
let x = v.x();
*v.x_mut() = 42;
```
Compared to:
```rust
let mut v = Vec4 { x: 0, y: 1, z: 2, w: 3 };
let x = v.x;
v.x = 42;
```
Also, no, I don't find `v[0]` or `v[1]` to be acceptable as the only way to access elements. It's a `Vec4`; I want proper `x`, `y`, `z` and `w` members.
The choice I've made was to remove `repr(packed)`, but still have tests that demonstrate that members are actually packed (at least for basic types). And it makes sense! If it's legal to make an array of any type `T`, then a `#[repr(C)]` struct containing only `T` members has every reason to be laid out as an array.
In any case, don't forget you may safely convert a vector to an array with `into_array()`.
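The layout argument can be checked with a stand-in struct (this `Vec4` is a hypothetical sketch, not `vek`'s actual definition):

```rust
use std::mem::{align_of, size_of};

// A hypothetical #[repr(C)] vector of four f32 members.
#[repr(C)]
#[derive(Copy, Clone)]
struct Vec4 {
    x: f32,
    y: f32,
    z: f32,
    w: f32,
}

fn main() {
    // If the members are laid out like [f32; 4], the struct has the same
    // size and alignment as the array: no padding, no reordering.
    assert_eq!(size_of::<Vec4>(), size_of::<[f32; 4]>());
    assert_eq!(align_of::<Vec4>(), align_of::<[f32; 4]>());
}
```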
With the ergonomic, idiomatic way to shuffle any struct in Rust!
You want destructuring:

```rust
let Vec4 { x, y, z, w } = xyzw;
let wzyx = Vec4 { x: w, y: z, z: y, w: x };
let xyxx = Vec4 { x, y, z: x, w: x };
```

But don't take my word for it - let the (release mode) assembly speak for itself!
On x86 with SSE, it lowers to `shufps` as wanted (provided you used `repr_simd::Vec4`).
If you're only interested in a single element, you can use `broadcast` (or even `from`):
```rust
let Vec4 { x, .. } = xyzw;
let xxxx = Vec4::broadcast(x);
let xxxx = Vec4::from(x);
```
`Vec4`s also provide their own shuffle API based on existing x86 intrinsics. See `Vec4::shuffled(...)` for instance.
Because the actual meaning changes depending on the matrix's storage layout.
What you describe will probably do what you expect on row-major matrices, but how should it behave on column-major matrices? `vek`'s mantra is to stay true to reality, so if it had to do this, it would just index the public member `cols` in this case.
But here's the thing - if you decided to switch the layout, no `m[i][j]` access would behave as expected anymore, and there's no way the compiler could catch this.
For these reasons, in `vek`, it is almost always required that you explicitly use the matrix's public member instead, which conveys the intent and enforces correctness.
Here's how you index matrices in `vek` (assuming, for instance, that `i=1` and `j=3`):

- Row-major, static indexing: `(m.rows.y).w`;
- Row-major, dynamic indexing: `m.rows[i][j]`;
- Column-major, static indexing: `(m.cols.w).y`;
- Column-major, dynamic indexing: `m.cols[j][i]`;
- Any layout, dynamic indexing: `m[(i, j)]`.
(Static indexing with `x`, `y`, `z` and `w` is not pretty-looking, but I wanted to reuse the vector types because of their representation, alignment requirements, etc. Using tuples and/or creating tuple structs for this sole purpose isn't a good idea.)
In the same way, if you want to get a pointer to the matrix's data, for e.g. transferring it to a graphics API, get the address of the public member explicitly instead, which makes clear whether you're sending an array of rows or an array of columns.
If you're using OpenGL, check out the `gl_should_transpose()` method!
I believe the names match Rust's general terseness level.
Dynamic growable arrays are named `Vec`, not `Vector`, because they're so widely used that typing more than three characters becomes annoying. It's the same reasoning behind the names of shell commands such as `ls`.
Also, people accustomed to GLSL are familiar with names such as `mat4` and `vec4`.
Finally, in Rust, we might tend to forget that we have renaming imports:

```rust
use vek::Vec2 as Vector2;
use vek::Mat4 as Matrix4x4;

let v = Vector2 { x: 13, y: 42 };
// ....
```
This crate misses one or more optimized operations for my target platform (or, some parts could be improved)!
I care about this. File an issue and let's see what we can do!
TL;DR:
First, the historical problem with generics in C++ is the increase in build times, and possibly insanely long and confusing error messages, but neither of these applies to Rust (within reason, of course!).
Second, we can be generic and still generate efficient code. The compiler has enough information, and Rust is backed by LLVM.
I hear that guy in the back saying "THAT doesn't guarantee anything!" with a smug face, to which I would reply "well, if you're not happy with the assembly, AND if it is indeed a noticeable bottleneck, then that's where you should drop everything and use intrinsics".
The actual reason
As much as 32-bit floating-point happens to be the most common case, the algorithms are universal! There's no reason we couldn't suddenly switch to, say, fixed-point numbers or bignums.
Fixed-point numbers do provide some goodies:
- Consistency of results across platforms and compiler options, useful for lockstep networking models;
- They are the only option for representing real numbers on some esoteric platforms such as the Nintendo DS, so I want to be able to switch to using them painlessly.

Bignums also come up as an interesting target, even though we might wonder in which cases we need more than 64 bits' worth of integer values.
On the other hand, one thing that does plague generics is that the code is written once but over-generalized, such that the compiler can't always "see" what the actual optimal code for the target hardware is.
`#[repr(simd)]` is good, but not a magic wand. It will lower basic vector operations into shuffles, packed arithmetic operations, etc., but it won't be able to "guess" that what you just wrote is actually the `unpcklps` instruction (and there, the generated assembly is an awful bunch of `shufps` instead).
It happens that sometimes we do want to use the intrinsics directly, but we still want to be generic!
That's why, in the future, I would like to provide specialized functions that lower to relevant intrinsics (for instance, `Mat4<f32>` would provide `transposed_sse()` on SSE-enabled x86 CPUs).
In any case, there's still the option to use intrinsics yourself.
Because most hardware-specific intrinsics have semantics that bypass some of Rust's assumptions, some of which are:

- Alignment requirements (most load/store instructions);
- Precision of floating-point operations (e.g. `_mm_rsqrt_ps()`);
- Handling of integer overflow (e.g. `_mm_add_epi32()`);
- Expectations from the user (e.g. `_mm_cmpeq_ps()` uses `0xffffffff` in the output vector as the value for `true`).

The point is, hardware-specific intrinsics are, well, hardware-specific, which is why it's up to you to opt in explicitly.
The generic implementations may not be as efficient, but won't backstab you either.
You don't actually care in this situation; it's release builds you're after.
Also keep in mind that Rust checks for integer overflow in debug builds, so e.g. your pretty `Vec4<i32>` addition won't be lowered to `paddd` on SSE2-enabled CPUs.
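To see why debug builds get in the way, note that plain integer arithmetic panics on overflow there, while the `wrapping_*` methods behave identically in both profiles (a minimal illustration, unrelated to `vek`'s API):

```rust
fn main() {
    let a: i32 = i32::MAX;
    // `a + 1` would panic in a debug build, so a lane-wise `+` on a
    // Vec4<i32> can't be a bare `paddd` there. Wrapping addition has
    // the same behavior in debug and release: it wraps around.
    assert_eq!(a.wrapping_add(1), i32::MIN);
}
```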
As some have discussed, the perceived "quality" of a vector/matrix library is often a matter of personal preference.
I think it is normal and healthy that there are so many choices available! It's as healthy as having so many different game engines, operating systems, Web browsers, countries, and cultures.
So instead of trying to convince you why, for some reason, you shouldn't use one of the well-written libraries out there such as `cgmath`, `nalgebra` and `vecmath`, I'll try my best at explaining the problems I want to solve with `vek`.
(Yeah, I knew that XKCD, and not only because of the `mint` crate :))
Totally.
In all seriousness, it's my first "public crate", and I think it's safe to say that it made me improve in Rust A LOT.
This wasn't really one of my goals, but it's a really handy consequence.
Also, it made me learn more than I wanted to know about the underlying maths.
I understand quaternions better, know of a way to picture the result of a cross product, and in general, have gone through confusions that I won't ever have to deal with again, because I've experienced them.
All in all, NIH is pretty great! 10/10
1. I don't want to worry anymore about my vectors and matrices being less efficient than they ought to be.
It's common to assume that the compiler can optimize everything (and often, it does), but that's a huge oversight for libraries that provide core types.
As a user, you might not realize that the "matrix * vector" products you use everywhere take twice as many instructions as they should.
Yes, you won't see the difference in "most cases", but in my (limited) experience, "most cases" means "moderately ambitious games running on x86-64 CPUs", which is why there's no noticeable slowdown (these CPUs are very forgiving compared to those of previous-generation consoles), but that shouldn't get in the way of "potentially ambitious games running on PC, consoles and mobile devices".
SSE and SSE2 have been around since 1999 and 2001 respectively. All x86-64 CPUs have them, and nearly every PC today is powered by such CPUs. According to the Steam Hardware Survey, 100% of PCs have SSE2.
So obviously, on such targets and in release builds, if my `Vec4<f32>` addition doesn't lower to the `addps` instruction, I'll get quite upset.
2. I want to be able to choose freely between row-major and column-major (and get rid of the confusion between them, while I'm at it).
Row-major matrices have their uses just as well as column-major ones. One should be allowed to pick the correct one for the job at any time and place.
It seems to be widely accepted that libraries only offer one of these two layouts (either always assumed, or one at a time via `#define`s (e.g. GLM)).
It happens that column-major matrices are good at multiplying themselves by a column vector, which is the most common case in computer graphics (because it's how most people transform vertices), but this doesn't mean it is somehow the One True Layout.
Row-major matrices are good at being the right-hand side of a product with a row vector. Also, one might just prefer them because of the indexing order.
This all boils down to giving more control to the user. Who am I to decide on your behalf?
Back when I was using SFML, I would write stuff such as `window.size.x`, but something about it feels odd.
The Vulkan API was wise enough to define `VkExtent2D` and `VkExtent3D` types for representing spatial extents, so I want these types too. What are they? Plain old vectors. But their members are named such that it is clear that we're dealing with widths, heights and depths.
The others that come back all too often are `Rgb` and `Rgba`. I need these all the time, either as `Rgba<u8>` or `Rgba<f32>`. They are used everywhere there are images and GUIs, and even more than that (that is, in pretty much every application or game I've ever known).
"Reality of the hardware" rather than "pretty pink pony mathematical reality".
I'm looking at libraries that abuse the type system to shoehorn in mathematical properties, and result in a mess that gamedevs don't actually know or care about.
I don't need more abstraction; I want compression of information.
I don't need a pretty-looking mathematical model; I want access to the actual building blocks, which don't actively try to hide what's actually happening.
There's no such type as `OrthogonalMat4`, `AffineMat4` or the like. It's a damn `Mat4`, because that's what the hardware deals with.
i.e. I don't want to be stuck with floating-point numbers.
This could be fixed by providing a `prelude`, but I don't want that.
Fundamentally, if I'm given an `Rgba` type, I don't want to have to import some `ColorVector` trait (or some prelude) to be able to call `red()` on it.
The same goes for dot products, identity matrices, and whatever.
However, of course it's practical to make types implement relevant traits.
However, of course it's practical to make types implement relevant traits.
I don't want to pollute
my mental cache by asking myself if I have to use fixed-size arrays, tuples,
structs, or tuple structs for my vectors or matrices.
I want them readily available for use and never have to go back to, or question, their implementation.
I know how much I'll need these types for the foreseeable future.