Implement batched GPU transforms #232

@ziw-liu

> This might be because MONAI transforms are not batched (executed in a loop), and CPU/GPU sync could be taking much longer than the actual compute.

Benchmark of 3D random affine in 6a88ec4 (10 runs, milliseconds):

| Device | MONAI (sequential) | Kornia (batched) | Relative |
| --- | ---: | ---: | ---: |
| Zen 2 CPU (1 thread) | 9160 | 3800 | 2.4 |
| Zen 2 CPU (16 threads) | 7320 | 556 | 13.2 |
| A40 GPU | 2620 | 210 | 12.5 |
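
For reference, the "Relative" column appears to be the MONAI time divided by the Kornia time, rounded to one decimal. That reading is an inference from the numbers, not something stated in the benchmark. A small check, with the timings copied from the benchmark above and a hypothetical helper name:

```python
# Timings in milliseconds, copied from the benchmark above:
# device -> (MONAI sequential, Kornia batched)
timings_ms = {
    "Zen 2 CPU (1 thread)": (9160, 3800),
    "Zen 2 CPU (16 threads)": (7320, 556),
    "A40 GPU": (2620, 210),
}


def relative_speedup(sequential_ms: float, batched_ms: float) -> float:
    """Assumed definition of "Relative": sequential time over batched time."""
    return sequential_ms / batched_ms


# Round to one decimal place to match the precision used in the benchmark.
speedups = {
    device: round(relative_speedup(seq, bat), 1)
    for device, (seq, bat) in timings_ms.items()
}

for device, ratio in speedups.items():
    print(f"{device}: {ratio}x")
```

Under this definition the computed ratios match the reported 2.4, 13.2, and 12.5.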

Originally posted by @ziw-liu in #218 (comment)
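
The sequential-versus-batched distinction in the quoted comment can be illustrated with a minimal sketch in plain PyTorch. This is neither MONAI's nor Kornia's implementation. The function names, the translation-only affine, and the tensor sizes are all assumptions made for illustration. Both paths compute the same result, but the looped version issues one set of operations per sample, while the batched version issues one set for the whole batch.

```python
import torch
import torch.nn.functional as F


def random_affine_matrices(batch_size: int, max_shift: float = 0.1) -> torch.Tensor:
    """Hypothetical helper: (B, 3, 4) identity affines with a random translation."""
    theta = torch.eye(3, 4).repeat(batch_size, 1, 1)
    # Translation in normalized [-1, 1] grid coordinates.
    theta[:, :, 3] = (torch.rand(batch_size, 3) * 2 - 1) * max_shift
    return theta


def affine_sequential(volumes: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Apply one affine per sample in a Python loop."""
    outputs = []
    for volume, matrix in zip(volumes, theta):
        sample = volume[None]  # (1, C, D, H, W)
        grid = F.affine_grid(matrix[None], sample.shape, align_corners=False)
        outputs.append(F.grid_sample(sample, grid, align_corners=False)[0])
    return torch.stack(outputs)


def affine_batched(volumes: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Apply all affines in a single call over the batch dimension."""
    grid = F.affine_grid(theta, volumes.shape, align_corners=False)
    return F.grid_sample(volumes, grid, align_corners=False)


torch.manual_seed(0)
volumes = torch.rand(4, 1, 8, 16, 16)  # (B, C, D, H, W), small for illustration
theta = random_affine_matrices(volumes.shape[0])

looped = affine_sequential(volumes, theta)
batched = affine_batched(volumes, theta)
print(torch.allclose(looped, batched, atol=1e-5))
```

On a GPU, timing either path would need `torch.cuda.synchronize()` before reading the clock, since kernel launches are asynchronous.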


Labels: enhancement (New feature or request), representation (Representation learning (SSL)), translation (Image translation (VS))
