predict_action function performs PyTorch conversion using GPU #37


Open
wants to merge 1 commit into
base: feat/improve-predict

Conversation

demobo-com

What this does

(⚡️ Performance)
Optimizes the PyTorch conversion of observations in predict_action by performing it on the CUDA GPU

How to checkout & try? (for the reviewer)

Run the same ACT policy evaluation in these 4 different scenarios:

  1. original predict_action without other CPU-intensive applications running (e.g. Zoom)
  2. original predict_action while other CPU-intensive applications are running (e.g. Zoom): inference fps drops sharply
  3. new predict_action without other CPU-intensive applications running (e.g. Zoom): expect a modest performance improvement
  4. new predict_action while other CPU-intensive applications are running (e.g. Zoom): expect ~10x better performance

Examples:

python lerobot/scripts/control_robot.py --control.type=record --control.policy.path=someACT_model --some.option=true

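A rough way to compare the four scenarios is to time the inference loop and report fps. A minimal sketch, assuming a hypothetical `predict` callable standing in for the real predict_action call (the iteration count and stand-in workload are illustrative, not part of the PR):

```python
import time

def measure_fps(predict, n_iters=100):
    """Time repeated inference calls and return frames per second."""
    start = time.perf_counter()
    for _ in range(n_iters):
        predict()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Stand-in workload; replace with the actual policy inference call.
fps = measure_fps(lambda: sum(i * i for i in range(1000)))
print(f"{fps:.1f} fps")
```

Running this once per scenario (with and without the CPU-intensive background load) gives comparable numbers for the before/after claim.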
@shantanuparab-tr shantanuparab-tr requested a review from Copilot on July 7, 2025 at 14:19

@Copilot Copilot AI left a comment


Pull Request Overview

This PR optimizes predict_action by moving the tensor device transfer to before data transformations so that normalization and reshaping occur on the GPU, improving inference performance under CPU load.

  • Moves observation[name].to(device) to the start of the conversion loop and removes the redundant transfer at the end
  • Ensures all type casting, normalization, permute, and unsqueeze operations happen on the CUDA device
Comments suppressed due to low confidence (1)

lerobot/common/robot_devices/control_utils.py:112

  • Add a unit test to verify that after predict_action, all observation tensors are on the specified device and have the correct shape and dtype.
            observation[name] = observation[name].to(device)

Comment on lines +112 to 116
observation[name] = observation[name].to(device)
if "image" in name:
    observation[name] = observation[name].type(torch.float32) / 255
    observation[name] = observation[name].permute(2, 0, 1).contiguous()
observation[name] = observation[name].unsqueeze(0)

Copilot AI Jul 7, 2025


You can combine device transfer, dtype conversion, and normalization into a single chained call to reduce intermediate allocations, e.g.: observation[name] = observation[name].to(device=device, dtype=torch.float32).div(255).permute(2,0,1).unsqueeze(0).

Suggested change
  - observation[name] = observation[name].to(device)
  - if "image" in name:
  -     observation[name] = observation[name].type(torch.float32) / 255
  -     observation[name] = observation[name].permute(2, 0, 1).contiguous()
  - observation[name] = observation[name].unsqueeze(0)
  + if "image" in name:
  +     observation[name] = observation[name].to(device=device, dtype=torch.float32).div(255).permute(2, 0, 1).unsqueeze(0)
  + else:
  +     observation[name] = observation[name].to(device).unsqueeze(0)

Copilot uses AI. Check for mistakes.
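The chained form can be checked against the step-by-step version for equivalence. A small sketch, with the device fixed to "cpu" so it runs without a GPU (the frame size is an arbitrary example):

```python
import torch

device = "cpu"  # use "cuda" when a GPU is available

# A fake HWC uint8 camera frame, as a robot observation dict would hold.
img = torch.randint(0, 256, (480, 640, 3), dtype=torch.uint8)

# Step-by-step conversion, as written in the PR.
a = img.to(device)
a = a.type(torch.float32) / 255
a = a.permute(2, 0, 1).contiguous()
a = a.unsqueeze(0)

# Chained conversion suggested in the review.
b = img.to(device=device, dtype=torch.float32).div(255).permute(2, 0, 1).unsqueeze(0)

# Both yield a batched, channel-first float32 tensor with identical values.
assert a.shape == b.shape == (1, 3, 480, 640)
assert torch.equal(a, b)
```

The chained version avoids materializing the intermediate uint8 copy on the device before the float conversion, which is where the reduced allocation comes from.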

@@ -109,11 +109,11 @@ def predict_action(observation, policy, device, use_amp):
    ):
        # Convert to pytorch format: channel first and float32 in [0,1] with batch dimension
        for name in observation:
            observation[name] = observation[name].to(device)
            if "image" in name:
                observation[name] = observation[name].type(torch.float32) / 255

Copilot AI Jul 7, 2025


[nitpick] Consider using the .float() alias instead of .type(torch.float32) for readability and consistency with common PyTorch code style.

Suggested change
  - observation[name] = observation[name].type(torch.float32) / 255
  + observation[name] = observation[name].float() / 255

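A quick check that the alias is behavior-identical, on a small uint8 tensor:

```python
import torch

x = torch.randint(0, 256, (4,), dtype=torch.uint8)

# .float() is shorthand for .type(torch.float32); both produce the same values.
assert torch.equal(x.type(torch.float32) / 255, x.float() / 255)
assert x.float().dtype == torch.float32
```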

@demobo-com demobo-com changed the title from "predicti_action function performs pyTorch conversion using GPU" to "predict_action function performs pyTorch conversion using GPU" on Jul 16, 2025
Labels: None yet
Projects: None yet
1 participant