The only two things I think it would be good to have at some point are:
- Moving the function that applies the unitary, and all the functions it depends on, to the `clifford_operations` file. This would allow each engine to customize them, e.g. jitting with Numba or writing CUDA kernels with CuPy.
- Leveraging the packed-bits representation already used by the rest of the backend, which provides a significant speedup.
Neither of these needs to be addressed in this PR, but we should probably open an issue to keep track of them.
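For the second point, here is a minimal sketch of the packed-bits idea using plain NumPy. The helpers `pack` and `unpack` are hypothetical names for illustration only, not the backend's actual functions, and the array shapes are assumptions:

```python
import numpy as np


def pack(bits):
    """Pack a boolean array into uint8 words along the last axis (8 bits per byte)."""
    return np.packbits(bits, axis=-1)


def unpack(packed, nbits):
    """Recover the first `nbits` boolean entries along the last axis."""
    return np.unpackbits(packed, axis=-1, count=nbits).astype(bool)


rng = np.random.default_rng(0)
nbits = 20  # assumed row length, chosen only for the example
a = rng.integers(0, 2, size=(4, nbits)).astype(bool)
b = rng.integers(0, 2, size=(4, nbits)).astype(bool)

# A single XOR on the packed arrays handles 8 binary entries per byte,
# instead of one entry per array element as in the unpacked form.
packed_xor = pack(a) ^ pack(b)

# The packed result matches the element-wise XOR on the unpacked arrays.
assert np.array_equal(unpack(packed_xor, nbits), a ^ b)
```

The padding bits added by `np.packbits` are zero in both operands, so they stay zero under XOR and are dropped again by `count=nbits` when unpacking.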
Originally posted by @BrunoLiegiBastonLiegi in #1700 (review)