Conversation
ivarflakstad
left a comment
This review was performed with a fever. The author takes no responsibility for its quality nor relevance.
candle-core/src/cpu_backend/mod.rs
Outdated
```rust
// Conditionally use polyfill Vec when pinned-memory feature is enabled
#[cfg(not(feature = "pinned-memory"))]
use std::vec::Vec;

#[cfg(feature = "pinned-memory")]
use allocator_api2::vec::Vec;
```
Could we perhaps move this to a candle_core/src/vec module, where based on the feature flag we import either the std or allocator_api2 impl?
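A sketch of how such a module could look (file location and the `main`/assert scaffolding are illustrative; the real module would live in the crate and be consumed via `use crate::vec::Vec`):

```rust
// Hypothetical candle-core/src/vec.rs: one feature-gated re-export so call
// sites never need their own #[cfg] pairs.

// Without the pinned-memory feature, re-export the std Vec.
#[cfg(not(feature = "pinned-memory"))]
pub use std::vec::Vec;

// With the feature, re-export the allocator_api2 polyfill Vec instead.
#[cfg(feature = "pinned-memory")]
pub use allocator_api2::vec::Vec;

fn main() {
    // Either way, call sites stay identical:
    let v: Vec<f32> = Vec::with_capacity(4);
    assert!(v.capacity() >= 4);
    assert!(v.is_empty());
}
```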
candle-core/src/cpu_backend/mod.rs
Outdated
```rust
macro_rules! storage_vec {
    ($val:expr; $len:expr) => {{
        let mut v = Vec::with_capacity($len);
        v.resize($len, $val);
        v
    }};
}
```
Kind of a nit but I'd like for this to support the same branches as the original macro.
So something like
```rust
macro_rules! storage_vec {
    ($val:expr; $len:expr) => {{
        let mut v = Vec::with_capacity($len);
        v.resize($len, $val);
        v
    }};
}

#[cfg(feature = "pinned-memory")]
pub use allocator_api2::vec as storage_vec;
#[cfg(not(feature = "pinned-memory"))]
pub use std::vec as storage_vec;
```
```rust
//! The `pinned-memory` feature automatically enables `cuda` (see Cargo.toml), so checking
//! for `pinned-memory` is sufficient.

#[cfg(feature = "pinned-memory")]
```
I don't think we need all these as the entire file is behind the feature flag
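To illustrate the point (feature and module names here are placeholders, not the PR's actual ones): when a module is gated once at its declaration site, the items inside need no per-item `#[cfg]` attributes.

```rust
// Gating the whole module once at the declaration site means nothing
// inside it needs its own #[cfg].
#[cfg(feature = "some-feature")]
mod gated {
    // No per-item #[cfg] needed here.
    pub fn only_with_feature() {}
}

// A fallback module for when the feature is off, just so this sketch runs.
#[cfg(not(feature = "some-feature"))]
mod gated {
    pub fn fallback() -> u32 {
        42
    }
}

fn main() {
    // Compiled without the feature, the fallback module is the one present.
    assert_eq!(gated::fallback(), 42);
}
```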
candle-core/src/quantized/cuda.rs
Outdated
```rust
    .device
    .memcpy_dtov(&self.data.inner.slice(..self.data.len))?;
let mut out = vec![0.0; elem_count];
let mut out_std: std::vec::Vec<f32> = (0..elem_count).map(|_| 0.0).collect();
```
If you're specifically using `std::vec::Vec` then I don't see why you can't simply use good ol' `vec!`
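That is, the `map`/`collect` round-trip could collapse to a single `vec!` call (the `elem_count` value below is a placeholder for the one computed in the surrounding function):

```rust
fn main() {
    let elem_count = 8; // placeholder; taken from the surrounding function in the PR

    // vec! already produces a std::vec::Vec, so no map/collect is needed.
    let out_std: std::vec::Vec<f32> = vec![0.0; elem_count];

    assert_eq!(out_std.len(), elem_count);
    assert!(out_std.iter().all(|&x| x == 0.0));
}
```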
candle-core/Cargo.toml
Outdated
```toml
# pinned-memory requires cuda because it uses CUDA APIs for pinned host memory allocation
pinned-memory = ["cuda", "dep:allocator-api2"]
```
Clearly documented here, but when running across the feature flag in the wild it could be confusing, as pinning memory is not a CUDA-specific concept.
Could we rename it to `cuda-pinned-memory`? Perhaps `cuda-pin-mem` if you think that's getting tedious to read/write.
We may find uses for pinned memory in other contexts in the future, in which case I think we're approaching a structure like

```toml
pinned-memory = ["dep:allocator-api2"]
cuda-pin-mem = ["cuda", "pinned-memory"]
xxx-pin-mem = ["xxx", "pinned-memory"]
```

where `pinned-memory` would surface a generic API and `*-pin-mem` would enable the backend-specific implementations taking advantage of it.
Officially alive again. Yay. I think that covering every usage of
This PR adds support for pinned-memory-backed tensors. Since the custom allocator trait is not stable in Rust yet (and I don't see that happening soon), this PR uses polyfill libraries to stay on the stable release. If there's a better approach, I'm all ears. :)

All of this is gated behind the `pinned-memory` feature, which also enables `cuda` due to its nature.
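For readers unfamiliar with the polyfill approach: the sketch below shows the general shape of plugging a custom allocator behind a trait, using only std so it runs anywhere. The trait, struct, and method names here are illustrative stand-ins; the actual PR implements `allocator_api2`'s allocator trait, and its CUDA path would use pinned host allocation instead of the plain heap calls shown.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical trait mirroring the shape of an allocator API; the real PR
// uses the trait provided by the allocator_api2 polyfill crate.
trait RawAllocator {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8;
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout);
}

// Stand-in allocator: uses the plain heap so this sketch is runnable.
// A pinned-memory allocator would call CUDA's pinned host allocation here.
struct HostAllocator;

impl RawAllocator for HostAllocator {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8 {
        alloc(layout)
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        dealloc(ptr, layout)
    }
}

fn main() {
    let a = HostAllocator;
    let layout = Layout::array::<f32>(16).unwrap();
    unsafe {
        let p = a.allocate(layout);
        assert!(!p.is_null());
        a.deallocate(p, layout);
    }
}
```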