minor fixes, apply Jed's paper changes
sarah-quinones committed Feb 2, 2025
1 parent 39e72ec commit 56028f4
Showing 6 changed files with 65 additions and 72 deletions.
2 changes: 1 addition & 1 deletion .github/FUNDING.yml
@@ -1 +1 @@
-github: sarah-ek
+github: sarah-quinones
1 change: 1 addition & 0 deletions faer-traits/src/lib.rs
@@ -885,6 +885,7 @@ impl Index for u32 {
type FixedWidth = u32;
type Signed = i32;
}
+#[cfg(any(target_pointer_width = "64"))]
impl Index for u64 {
type FixedWidth = u64;
type Signed = i64;
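The added `#[cfg]` attribute restricts the `u64` impl to targets whose pointers are 64 bits wide, so a `u64` index always fits in a `usize`. A self-contained sketch of the same gating pattern, using a stand-in `Index` trait rather than faer's real one:

```rust
// Stand-in for faer's `Index` trait, kept minimal for illustration.
trait Index: Copy {
    type Signed;
    fn truncate(n: usize) -> Self;
}

impl Index for u32 {
    type Signed = i32;
    fn truncate(n: usize) -> Self {
        n as u32 // may discard high bits; fine for indices known to fit
    }
}

// Only compiled when `usize` is 64 bits wide, mirroring the diff above:
// on 32-bit targets, `u64` indices simply don't implement the trait.
#[cfg(target_pointer_width = "64")]
impl Index for u64 {
    type Signed = i64;
    fn truncate(n: usize) -> Self {
        n as u64
    }
}

fn main() {
    assert_eq!(u32::truncate(300), 300_u32);
    #[cfg(target_pointer_width = "64")]
    assert_eq!(u64::truncate(1usize << 40), 1_u64 << 40);
}
```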
53 changes: 26 additions & 27 deletions faer/src/sparse/csc/mod.rs
@@ -542,17 +542,17 @@ impl<'a, Rows: Shape, Cols: Shape, I: Index> SymbolicSparseColMatRef<'a, I, Rows
}
}

-	/// Returns a view over the symbolic structure of `self`.
-	#[inline]
-	pub fn as_ref(self) -> SymbolicSparseColMatRef<'a, I, Rows, Cols> {
-		SymbolicSparseColMatRef {
-			nrows: self.nrows,
-			ncols: self.ncols,
-			col_ptr: self.col_ptr,
-			col_nnz: self.col_nnz,
-			row_idx: self.row_idx,
-		}
-	}
+	/// Returns a view over the symbolic structure of `self`.
+	#[inline]
+	pub fn as_ref(self) -> SymbolicSparseColMatRef<'a, I, Rows, Cols> {
+		SymbolicSparseColMatRef {
+			nrows: self.nrows,
+			ncols: self.ncols,
+			col_ptr: self.col_ptr,
+			col_nnz: self.col_nnz,
+			row_idx: self.row_idx,
+		}
+	}
}

impl<Rows: Shape, Cols: Shape, I: Index> SymbolicSparseColMat<I, Rows, Cols> {
@@ -781,17 +781,17 @@ impl<Rows: Shape, Cols: Shape, I: Index> SymbolicSparseColMat<I, Rows, Cols> {
}
}

-	/// Returns a view over the symbolic structure of `self`.
-	#[inline]
-	pub fn as_ref(&self) -> SymbolicSparseColMatRef<'_, I, Rows, Cols> {
-		SymbolicSparseColMatRef {
-			nrows: self.nrows,
-			ncols: self.ncols,
-			col_ptr: &self.col_ptr,
-			col_nnz: self.col_nnz.as_deref(),
-			row_idx: &self.row_idx,
-		}
-	}
+	/// Returns a view over the symbolic structure of `self`.
+	#[inline]
+	pub fn as_ref(&self) -> SymbolicSparseColMatRef<'_, I, Rows, Cols> {
+		SymbolicSparseColMatRef {
+			nrows: self.nrows,
+			ncols: self.ncols,
+			col_ptr: &self.col_ptr,
+			col_nnz: self.col_nnz.as_deref(),
+			row_idx: &self.row_idx,
+		}
+	}

#[inline]
pub(crate) fn try_new_from_indices_impl(
@@ -909,11 +909,10 @@ impl<Rows: Shape, Cols: Shape, I: Index> SymbolicSparseColMat<I, Rows, Cols> {
col_ptr[j.unbound() + 1] = col_ptr[j.unbound()] + I::truncate(n_unique);
}

-		Ok((unsafe { Self::new_unchecked(nrows, ncols, col_ptr, None, row_idx) }, Argsort {
-			idx: argsort,
-			all_nnz,
-			nnz,
-		}))
+		Ok((
+			unsafe { Self::new_unchecked(nrows, ncols, col_ptr, None, row_idx) },
+			Argsort { idx: argsort, all_nnz, nnz },
+		))
}

/// create a new symbolic structure, and the corresponding order for the numerical values
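The line `col_ptr[j + 1] = col_ptr[j] + n_unique` visible in the hunk above builds the CSC column-pointer array as an exclusive prefix sum over per-column entry counts. A minimal sketch of that construction, with illustrative names rather than faer's internals:

```rust
// Given the number of (deduplicated) entries in each column, the CSC
// `col_ptr` array is the exclusive prefix sum of those counts, so that
// column j's entries live in row_idx[col_ptr[j]..col_ptr[j + 1]].
fn build_col_ptr(nnz_per_col: &[usize]) -> Vec<usize> {
    let mut col_ptr = vec![0usize; nnz_per_col.len() + 1];
    for (j, &n) in nnz_per_col.iter().enumerate() {
        // Mirrors `col_ptr[j + 1] = col_ptr[j] + n_unique` in the diff.
        col_ptr[j + 1] = col_ptr[j] + n;
    }
    col_ptr
}

fn main() {
    // 3 columns holding 2, 0 and 3 entries respectively.
    let col_ptr = build_col_ptr(&[2, 0, 3]);
    assert_eq!(col_ptr, vec![0, 2, 2, 5]);
}
```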
44 changes: 22 additions & 22 deletions faer/src/sparse/csr/mod.rs
@@ -508,17 +508,17 @@ impl<'a, Rows: Shape, Cols: Shape, I: Index> SymbolicSparseRowMatRef<'a, I, Rows
}
}

-	/// Returns a view over the symbolic structure of `self`.
-	#[inline]
-	pub fn as_ref(self) -> SymbolicSparseRowMatRef<'a, I, Rows, Cols> {
-		SymbolicSparseRowMatRef {
-			nrows: self.nrows,
-			ncols: self.ncols,
-			row_ptr: self.row_ptr,
-			row_nnz: self.row_nnz,
-			col_idx: self.col_idx,
-		}
-	}
+	/// Returns a view over the symbolic structure of `self`.
+	#[inline]
+	pub fn as_ref(self) -> SymbolicSparseRowMatRef<'a, I, Rows, Cols> {
+		SymbolicSparseRowMatRef {
+			nrows: self.nrows,
+			ncols: self.ncols,
+			row_ptr: self.row_ptr,
+			row_nnz: self.row_nnz,
+			col_idx: self.col_idx,
+		}
+	}
}

impl<Rows: Shape, Cols: Shape, I: Index> SymbolicSparseRowMat<I, Rows, Cols> {
@@ -738,17 +738,17 @@ impl<Rows: Shape, Cols: Shape, I: Index> SymbolicSparseRowMat<I, Rows, Cols> {
}
}

-	#[inline]
-	/// Returns a view over the symbolic structure of `self`.
-	pub fn as_ref(&self) -> SymbolicSparseRowMatRef<'_, I, Rows, Cols> {
-		SymbolicSparseRowMatRef {
-			nrows: self.nrows,
-			ncols: self.ncols,
-			row_ptr: &self.row_ptr,
-			row_nnz: self.row_nnz.as_deref(),
-			col_idx: &self.col_idx,
-		}
-	}
+	#[inline]
+	/// Returns a view over the symbolic structure of `self`.
+	pub fn as_ref(&self) -> SymbolicSparseRowMatRef<'_, I, Rows, Cols> {
+		SymbolicSparseRowMatRef {
+			nrows: self.nrows,
+			ncols: self.ncols,
+			row_ptr: &self.row_ptr,
+			row_nnz: self.row_nnz.as_deref(),
+			col_idx: &self.col_idx,
+		}
+	}

#[inline]
/// create a new symbolic structure, and the corresponding order for the numerical values
4 changes: 2 additions & 2 deletions paper.bib
@@ -37,7 +37,7 @@ @BOOK{lapack99
PUBLISHER = {Society for Industrial and Applied Mathematics},
YEAR = {1999},
ADDRESS = {Philadelphia, PA},
-  ISBN = {0-89871-447-8 (paperback)}
+  ISBN = {0-89871-447-8}
}
@MISC{eigenweb,
author = {Ga\"{e}l Guennebaud and Beno\^{i}t Jacob and others},
@@ -93,7 +93,7 @@ @book{chandra2001parallel
title={Parallel programming in OpenMP},
author={Chandra, Rohit and Dagum, Leo and Kohr, David and Menon, Ramesh and Maydan, Dror and McDonald, Jeff},
year={2001},
-  publisher={Morgan kaufmann}
+  publisher={Morgan Kaufmann}
}
@article{tbb,
author = {Pheatt, Chuck},
33 changes: 13 additions & 20 deletions paper.md
@@ -11,7 +11,7 @@ authors:
affiliations:
- name: Independent Researcher, France
index: 1
-date: 5 October 2023
+date: 01 October 2023
bibliography: paper.bib
---

@@ -26,7 +26,7 @@ and multithreading settings.
Supported platforms include the ones supported by Rust.
Explicit SIMD instructions are currently used for x86-64 and Aarch64 (NEON),
with plans for SVE/SME and RVV optimizations once intrinsics for those are stabilized in Rust,
-possibly earlier than that if we allow usage of a JIT backend[^1].
+possibly earlier than that if we allow usage of a Just-In-Time (JIT) backend[^1].

The library provides a `Mat` type, allowing for quick and simple construction
and manipulation of matrices, as well as lightweight view types `MatRef` and
@@ -46,7 +46,7 @@ complex numbers using the aforementioned types as the base element, dual/hyper-d


[^1]: Inline assembly is not entirely appropriate for our use case since it's hard to make it generic enough for all the operations and types that we wish to support.
-[^2]: IEEE 754-2008, with no implicit `fusedMultiplyAdd` contractions and with slight differences around NaN handling. See the [float semantics](https://github.com/rust-lang/rfcs/pull/3514) RFC for more information.
+[^2]: IEEE 754-2008, with no implicit `fusedMultiplyAdd` contractions and with slight differences around NaN handling. See the [float semantics](https://github.com/rust-lang/rust/issues/128288) tracking issue for more information.
[^3]: These are supported at least for the simpler matrix decompositions (Cholesky, LU, QR). It's not clear yet how to handle iterative algorithms like the SVD and eigendecomposition.
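The owner/view split described above (`Mat` plus the lightweight `MatRef`/`MatMut` views) can be sketched with stand-in types; faer's real API differs in detail, and the types here are only illustrative:

```rust
// An owning matrix type with column-major storage, plus a cheap
// borrowed view, analogous in spirit to `Mat` and `MatRef`.
struct Mat {
    nrows: usize,
    ncols: usize,
    data: Vec<f64>, // column-major: element (i, j) at data[i + j * nrows]
}

#[derive(Clone, Copy)]
struct MatRef<'a> {
    nrows: usize,
    ncols: usize,
    data: &'a [f64],
}

impl Mat {
    fn zeros(nrows: usize, ncols: usize) -> Self {
        Mat { nrows, ncols, data: vec![0.0; nrows * ncols] }
    }
    // Borrowing a view is free: no data is copied.
    fn as_ref(&self) -> MatRef<'_> {
        MatRef { nrows: self.nrows, ncols: self.ncols, data: &self.data }
    }
}

impl<'a> MatRef<'a> {
    fn get(&self, i: usize, j: usize) -> f64 {
        assert!(i < self.nrows && j < self.ncols);
        self.data[i + j * self.nrows]
    }
}

fn main() {
    let mut m = Mat::zeros(2, 3);
    m.data[1 + 2 * 2] = 5.0; // set element (1, 2)
    let v = m.as_ref();
    assert_eq!(v.get(1, 2), 5.0);
}
```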

# Statement of need
@@ -60,7 +60,7 @@ such as data races and fatal use-after-free errors.

Rust also allows compatibility with the C ABI, allowing for simple interoperability
with C, and most other languages by extension. Once a design has been properly fleshed out,
-we plan to expose a C API, along with bindings to other languages (Currently planned are C, C++, Python and Julia bindings).
+we plan to expose a C API, along with bindings to other languages (C, C++, Python and Julia bindings are currently planned).
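Such a C API does not exist yet; a hedged sketch of what a C-ABI entry point could look like in Rust follows, where the name `faer_dot` and its signature are entirely hypothetical:

```rust
// A hypothetical C-ABI export: `#[no_mangle]` keeps the symbol name
// stable and `extern "C"` selects the C calling convention, so C (and
// anything that can call C) can link against it.
#[no_mangle]
pub extern "C" fn faer_dot(a: *const f64, b: *const f64, n: usize) -> f64 {
    // SAFETY: the caller must pass valid pointers to `n` readable
    // elements each — the usual contract of a C API.
    let (a, b) = unsafe {
        (std::slice::from_raw_parts(a, n), std::slice::from_raw_parts(b, n))
    };
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    assert_eq!(faer_dot(a.as_ptr(), b.as_ptr(), 3), 32.0);
}
```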

Aside from `faer`, the Rust ecosystem lacks high performance matrix factorization
libraries that aren't C library wrappers, which presents a distribution
@@ -70,25 +70,18 @@ challenge and can impede generic programming.

# Features

-`faer` exposes a central `Entity` trait that allows users to describe how their
-data should be laid out in memory. For example, native floating point types are
-laid out contiguously in memory to make use of SIMD instructions that prefer this layout,
-while complex types have the option of either being laid out contiguously or in a split format.
-The latter is also called a zomplex data type in CHOLMOD (@cholmod).
-An example of a type that benefits immensely from this is the double-double type, which is
-composed of two `f64` components, stored in separate containers. This separate
-storage scheme allows us to load each chunk individually to a SIMD register,
-opening new avenues for generic vectorization.
+`faer` supports vectorization of arbitrary SIMD user types by deinterleaving Array of Structures (AoS) data to Structure of Arrays (SoA) in registers on the fly.
+This helps keep the API simple in that it doesn't need to accommodate SoA storage in memory, and is often also beneficial from a memory locality point of view.
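The AoS-to-SoA deinterleaving above can be illustrated with complex numbers: values stored interleaved in memory (re, im, re, im, …) are split into separate real and imaginary lanes before a vectorized kernel runs. In faer this happens in registers; this sketch uses plain arrays only to show the data movement:

```rust
// Split interleaved complex data (AoS) into separate real/imaginary
// lanes (SoA), each of which could then be loaded directly into a
// SIMD register.
fn deinterleave(aos: &[[f64; 2]]) -> (Vec<f64>, Vec<f64>) {
    let re = aos.iter().map(|z| z[0]).collect();
    let im = aos.iter().map(|z| z[1]).collect();
    (re, im)
}

fn main() {
    // Two complex numbers, 1 + 2i and 3 + 4i, stored AoS.
    let aos = [[1.0, 2.0], [3.0, 4.0]];
    let (re, im) = deinterleave(&aos);
    assert_eq!(re, vec![1.0, 3.0]);
    assert_eq!(im, vec![2.0, 4.0]);
}
```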

The library generically implements algorithms for matrix multiplication, based
on the approach of @BLIS1. For native types, `faer` uses explicit SIMD
depending on the detected CPU features, that dispatch to several precompiled
variants for operations that can make use of these features.
-An interesting alternative would be to compile the code Just-in-Time, which could improve compilation times and reduce binary size.
+An interesting alternative would be to compile the code JIT, which could improve compilation times and reduce binary size.
But there are also possible downsides that have to be weighed against these advantages,
such as increasing the startup time to optimize and assemble the code,
as well as the gap in maturity between ahead-of-time compilation (currently backed by LLVM),
-and just-in-time compilation, for which the Rust ecosystem is still developing.
+and JIT compilation, for which the Rust ecosystem is still developing.
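The runtime-dispatch scheme described above (detect CPU features once, then call a precompiled kernel variant) can be sketched with a function pointer; the kernels here are trivial stand-ins, not faer's real kernels or detection logic:

```rust
// Baseline kernel; a real library would have SIMD-specialized siblings.
fn sum_scalar(x: &[f64]) -> f64 {
    x.iter().sum()
}

// Choose a kernel variant based on detected CPU features and return it
// as a function pointer, so the hot path pays no detection cost.
fn pick_kernel() -> fn(&[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    fn has_avx2() -> bool {
        is_x86_feature_detected!("avx2")
    }
    #[cfg(not(target_arch = "x86_64"))]
    fn has_avx2() -> bool {
        false
    }

    if has_avx2() {
        // A real library would return an AVX2-specialized variant here.
        sum_scalar
    } else {
        sum_scalar
    }
}

fn main() {
    let kernel = pick_kernel();
    assert_eq!(kernel(&[1.0, 2.0, 3.0]), 6.0);
}
```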
The library then uses matrix multiplication as a building block to implement commonly used matrix
decompositions, based on state of the art algorithms in order to guarantee
numerical robustness:
@@ -106,13 +99,13 @@ eigenvalue decomposition, as described by @10.1145/2382585.2382587.

State of the art algorithms are used for each decomposition, allowing performance
that matches or even surpasses other low level libraries such as OpenBLAS
-(@10.1145/2503210.2503219), LAPACK (@lapack99), and Eigen (@eigenweb).
+(@10.1145/2503210.2503219), LAPACK [@lapack99], and Eigen [@eigenweb].

-To achieve high performance parallelism, `faer` uses the Rayon library (@rayon) as a
-backend, and has shown to be competitive with other frameworks such as OpenMP (@chandra2001parallel)
-and Intel Thread Building Blocks (@tbb).
+To achieve high performance parallelism, `faer` uses the Rayon library [@rayon] as a
+backend, and has shown to be competitive with other frameworks such as OpenMP [@chandra2001parallel]
+and Intel Thread Building Blocks [@tbb].

-[^5]: For example, computing $A x$ and $A.T y$ with a single pass over $A$, rather than two.
+[^5]: For example, computing $A x$ and $A^\top y$ with a single pass over $A$, rather than two.
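The fused pass from the footnote above can be sketched directly: each entry of a column-major matrix is read once while both $A x$ and $A^\top y$ accumulate. Names and layout here are illustrative:

```rust
// Compute A·x and Aᵀ·y in one sweep over a column-major matrix `a`
// (element (i, j) at a[i + j * nrows]), reading each entry once.
fn fused_gemv(
    a: &[f64],
    nrows: usize,
    ncols: usize,
    x: &[f64],
    y: &[f64],
) -> (Vec<f64>, Vec<f64>) {
    let mut ax = vec![0.0; nrows];
    let mut aty = vec![0.0; ncols];
    for j in 0..ncols {
        let col = &a[j * nrows..(j + 1) * nrows];
        let mut dot = 0.0;
        for i in 0..nrows {
            ax[i] += col[i] * x[j]; // accumulate (A x)_i
            dot += col[i] * y[i]; // accumulate (Aᵀ y)_j
        }
        aty[j] = dot;
    }
    (ax, aty)
}

fn main() {
    // A = [[1, 3], [2, 4]] stored column-major.
    let a = [1.0, 2.0, 3.0, 4.0];
    let (ax, aty) = fused_gemv(&a, 2, 2, &[1.0, 1.0], &[1.0, 1.0]);
    assert_eq!(ax, vec![4.0, 6.0]); // A·[1,1]
    assert_eq!(aty, vec![3.0, 7.0]); // Aᵀ·[1,1]
}
```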

# Performance

