Conversation
ivarflakstad
left a comment
This review was performed with a fever. The author takes no responsibility for its quality nor relevance.
candle-core/src/cpu_backend/mod.rs
Outdated
```rust
// Conditionally use polyfill Vec when pinned-memory feature is enabled
#[cfg(not(feature = "pinned-memory"))]
use std::vec::Vec;

#[cfg(feature = "pinned-memory")]
use allocator_api2::vec::Vec;
```
Could we perhaps move this to a candle_core/src/vec module, where based on the feature flag we import either the std or allocator_api2 impl?
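A sketch of how such a module could look (file location and the `main`/assert scaffolding are illustrative; the real module would live in the crate and be consumed via `use crate::vec::Vec`):

```rust
// Hypothetical candle-core/src/vec.rs: one feature-gated re-export so call
// sites never need their own #[cfg] pairs.

// Without the pinned-memory feature, re-export the std Vec.
#[cfg(not(feature = "pinned-memory"))]
pub use std::vec::Vec;

// With the feature, re-export the allocator_api2 polyfill Vec instead.
#[cfg(feature = "pinned-memory")]
pub use allocator_api2::vec::Vec;

fn main() {
    // Either way, call sites stay identical:
    let v: Vec<f32> = Vec::with_capacity(4);
    assert!(v.capacity() >= 4);
    assert!(v.is_empty());
}
```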
candle-core/src/cpu_backend/mod.rs
Outdated
```rust
macro_rules! storage_vec {
    ($val:expr; $len:expr) => {{
        let mut v = Vec::with_capacity($len);
        v.resize($len, $val);
        v
    }};
}
```
Kind of a nit but I'd like for this to support the same branches as the original macro.
So something like
```rust
macro_rules! storage_vec {
    ($val:expr; $len:expr) => {{
        let mut v = Vec::with_capacity($len);
        v.resize($len, $val);
        v
    }};
}

#[cfg(feature = "pinned-memory")]
pub use allocator_api2::vec as storage_vec;
#[cfg(not(feature = "pinned-memory"))]
pub use std::vec as storage_vec;
```
```rust
//! The `pinned-memory` feature automatically enables `cuda` (see Cargo.toml), so checking
//! for `pinned-memory` is sufficient.

#[cfg(feature = "pinned-memory")]
```
I don't think we need all these as the entire file is behind the feature flag
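To illustrate the point (feature and module names here are placeholders, not the PR's actual ones): when a module is gated once at its declaration site, the items inside need no per-item `#[cfg]` attributes.

```rust
// Gating the whole module once at the declaration site means nothing
// inside it needs its own #[cfg].
#[cfg(feature = "some-feature")]
mod gated {
    // No per-item #[cfg] needed here.
    pub fn only_with_feature() {}
}

// A fallback module for when the feature is off, just so this sketch runs.
#[cfg(not(feature = "some-feature"))]
mod gated {
    pub fn fallback() -> u32 {
        42
    }
}

fn main() {
    // Compiled without the feature, the fallback module is the one present.
    assert_eq!(gated::fallback(), 42);
}
```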
candle-core/src/quantized/cuda.rs
Outdated
```rust
    .device
    .memcpy_dtov(&self.data.inner.slice(..self.data.len))?;
let mut out = vec![0.0; elem_count];
let mut out_std: std::vec::Vec<f32> = (0..elem_count).map(|_| 0.0).collect();
```
If you're specifically using `std::vec::Vec` then I don't see why you can't simply use good ol' `vec!`
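That is, the `map`/`collect` round-trip could collapse to a single `vec!` call (the `elem_count` value below is a placeholder for the one computed in the surrounding function):

```rust
fn main() {
    let elem_count = 8; // placeholder; taken from the surrounding function in the PR

    // vec! already produces a std::vec::Vec, so no map/collect is needed.
    let out_std: std::vec::Vec<f32> = vec![0.0; elem_count];

    assert_eq!(out_std.len(), elem_count);
    assert!(out_std.iter().all(|&x| x == 0.0));
}
```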
candle-core/Cargo.toml
Outdated
```toml
# pinned-memory requires cuda because it uses CUDA APIs for pinned host memory allocation
pinned-memory = ["cuda", "dep:allocator-api2"]
```
Clearly documented here, but when running across the feature flag in the wild it could be confusing, as pinning memory is not a CUDA-specific concept.
Could we rename it to `cuda-pinned-memory`? Perhaps `cuda-pin-mem` if you think that's getting tedious to read/write.
We may find uses for pinned memory in other contexts in the future, in which case I think we're approaching a structure like

```toml
pinned-memory = ["dep:allocator-api2"]
cuda-pin-mem = ["cuda", "pinned-memory"]
xxx-pin-mem = ["xxx", "pinned-memory"]
```

where `pinned-memory` would surface a generic API and `*-pin-mem` would enable the backend-specific implementations taking advantage of it.
Officially alive again. Yay. I think that covering every usage of
This PR adds support for pinned-memory-backed tensors. Since the custom allocator trait is not stable in Rust yet (and I don't see that happening soon), this PR uses polyfill libraries to stay on the stable release. If there's a better approach, I'm all ears. :)

All of this is gated behind the `pinned-memory` feature, which also enables `cuda` due to its nature.
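For readers unfamiliar with the polyfill approach: the sketch below shows the general shape of plugging a custom allocator behind a trait, using only std so it runs anywhere. The trait, struct, and method names here are illustrative stand-ins; the actual PR implements `allocator_api2`'s allocator trait, and its CUDA path would use pinned host allocation instead of the plain heap calls shown.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical trait mirroring the shape of an allocator API; the real PR
// uses the trait provided by the allocator_api2 polyfill crate.
trait RawAllocator {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8;
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout);
}

// Stand-in allocator: uses the plain heap so this sketch is runnable.
// A pinned-memory allocator would call CUDA's pinned host allocation here.
struct HostAllocator;

impl RawAllocator for HostAllocator {
    unsafe fn allocate(&self, layout: Layout) -> *mut u8 {
        alloc(layout)
    }
    unsafe fn deallocate(&self, ptr: *mut u8, layout: Layout) {
        dealloc(ptr, layout)
    }
}

fn main() {
    let a = HostAllocator;
    let layout = Layout::array::<f32>(16).unwrap();
    unsafe {
        let p = a.allocate(layout);
        assert!(!p.is_null());
        a.deallocate(p, layout);
    }
}
```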