
modeld: skip redundant cast, reshape, and flatten #35735


Open
Quantizr wants to merge 2 commits into master

Conversation

@Quantizr (Contributor) commented Jul 16, 2025

Removing the redundant operations reduces modeld.py CPU usage from 32% to 24% and cuts execution time by ~2%.

For dmonitoringmodeld.py, it reduces CPU usage from 17% to 12% and cuts execution time by ~4%.


In tinygrad, Tensor.numpy() is essentially defined as

self.cast(self.dtype.base).contiguous().to("CPU").realize().uop.base.buffer.numpy().reshape(self.shape)

In compile3.py, we already cast to float32:

  run_onnx_jit = TinyJit(lambda **kwargs:
                         next(iter(run_onnx({k:v.to(Device.DEFAULT) for k,v in kwargs.items()}).values())).cast('float32'), prune=True)

In modeld.py and dmonitoringmodeld.py, flatten() is called after .numpy(), so the reshape is redundant as well.

There is also an extra .to("CPU"), which I think currently causes tinygrad to do a useless copy from CPU to NPY.
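To make the change concrete, here is a before/after sketch for the policy output in modeld.py (it mirrors the diff further down; the extraction chain follows the Tensor.numpy() definition quoted above):

  # before: Tensor.numpy() would cast, copy to CPU, and reshape to the output shape,
  # only for .flatten() to undo that reshape again
  self.policy_output = self.policy_run(**self.policy_inputs).numpy().flatten()

  # after: the jitted output is already float32 and the raw buffer reads back flat,
  # so the cast, the extra CPU copy, the reshape, and the flatten can all be skipped
  self.policy_output = self.policy_run(**self.policy_inputs).contiguous().realize().uop.base.buffer.numpy()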

Gotta afford the "hey comma" CPU usage somehow.

@commaci-public (Contributor) commented Jul 16, 2025

@adeebshihadeh adeebshihadeh requested a review from haraschax July 16, 2025 17:28
@sshane sshane requested a review from Copilot July 17, 2025 03:50
@Copilot (Copilot AI) left a comment

Pull Request Overview

This PR removes redundant tensor-to-NumPy operations (cast, reshape, flatten, and extra .to("CPU")) in model inference to lower CPU usage and slightly speed up execution.

  • Replace .numpy().flatten() with direct buffer extraction via .contiguous().realize().uop.base.buffer.numpy()
  • Apply the same change for both vision and policy outputs in modeld.py and for the monitoring model in dmonitoringmodeld.py

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File / Description
selfdrive/modeld/modeld.py: Drop .flatten() and use direct buffer NumPy extraction
selfdrive/modeld/dmonitoringmodeld.py: Remove the redundant .flatten() in the monitoring model output processing

Comments suppressed due to low confidence (3)

selfdrive/modeld/modeld.py:166

  • Removing .flatten() changes vision_output from a 1D array to its original multi-dimensional shape, which may break slice_outputs. To preserve existing behavior, reapply a flatten step (e.g., .reshape(-1) or .flatten()) after the NumPy conversion.
    self.vision_output = self.vision_run(**self.vision_inputs).contiguous().realize().uop.base.buffer.numpy()

selfdrive/modeld/modeld.py:173

  • The .flatten() call was removed here as well, altering policy_output shape from a flat vector to multi-dimensional. Consider adding .reshape(-1) or .flatten() to maintain the expected 1D output.
    self.policy_output = self.policy_run(**self.policy_inputs).contiguous().realize().uop.base.buffer.numpy()

selfdrive/modeld/dmonitoringmodeld.py:96

  • By removing .flatten(), output will retain its original multi-dimensional shape. If downstream logic expects a flat array, reapply a flatten (e.g., .reshape(-1) or .flatten()) after conversion.
    output = self.model_run(**self.tensor_inputs).contiguous().realize().uop.base.buffer.numpy()
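If downstream code really does need a 1-D array, the fix these comments suggest amounts to something like the following (a sketch of the reviewer's suggestion, not a change made in this PR):

  # reapply a flatten after the direct buffer read, per the review comments above
  output = self.model_run(**self.tensor_inputs).contiguous().realize().uop.base.buffer.numpy().reshape(-1)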

Comment on lines +166 to +173
self.vision_output = self.vision_run(**self.vision_inputs).contiguous().realize().uop.base.buffer.numpy()
vision_outputs_dict = self.parser.parse_vision_outputs(self.slice_outputs(self.vision_output, self.vision_output_slices))

self.full_features_buffer[0,:-1] = self.full_features_buffer[0,1:]
self.full_features_buffer[0,-1] = vision_outputs_dict['hidden_state'][0, :]
self.numpy_inputs['features_buffer'][:] = self.full_features_buffer[0, self.temporal_idxs]

- self.policy_output = self.policy_run(**self.policy_inputs).numpy().flatten()
+ self.policy_output = self.policy_run(**self.policy_inputs).contiguous().realize().uop.base.buffer.numpy()
Copilot AI commented Jul 17, 2025

[nitpick] The long chain .contiguous().realize().uop.base.buffer.numpy() is repeated multiple times. Consider extracting this pattern into a helper function (e.g., to_cpu_array(tensor)) to reduce duplication and improve readability.
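A minimal sketch of that helper, assuming the to_cpu_array name from the comment (it is not something this PR or tinygrad actually defines):

  def to_cpu_array(tensor):
    # hypothetical helper: realize the tensor and read its underlying buffer as a
    # flat NumPy array, skipping Tensor.numpy()'s extra cast, CPU copy, and reshape
    return tensor.contiguous().realize().uop.base.buffer.numpy()

  # usage (sketch):
  # self.vision_output = to_cpu_array(self.vision_run(**self.vision_inputs))
  # self.policy_output = to_cpu_array(self.policy_run(**self.policy_inputs))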

