Atomic operations should compare values bitwise using `memcpy` rather than a user-defined operator (see https://github.com/NVIDIA/cccl/issues/989). The current [pair implementation](https://github.com/owensgroup/BGHT/blob/main/include/detail/pair.cuh#L35) uses a custom `==` operator.
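
A minimal sketch of what a `memcpy`-based comparison could look like. `example_pair` and `bitwise_equal` are hypothetical names used only for illustration; they are not BGHT's actual types or the fix proposed in the linked issue. The sketch assumes a trivially copyable 8-byte pair with no padding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a key-value pair (assumption, not BGHT's type).
struct example_pair {
  std::uint32_t first;
  std::uint32_t second;
};

static_assert(std::is_trivially_copyable<example_pair>::value,
              "memcpy requires a trivially copyable type");
static_assert(sizeof(example_pair) == sizeof(std::uint64_t),
              "this sketch assumes an 8-byte pair");

// Compare two pairs by their object representation instead of operator==.
// Both values are copied into 64-bit integers, which are then compared.
inline bool bitwise_equal(const example_pair& lhs, const example_pair& rhs) {
  std::uint64_t lhs_bits;
  std::uint64_t rhs_bits;
  std::memcpy(&lhs_bits, &lhs, sizeof(lhs_bits));
  std::memcpy(&rhs_bits, &rhs, sizeof(rhs_bits));
  return lhs_bits == rhs_bits;
}
```

Comparing the raw bytes this way matches how a compare-and-swap decides success, so the result of the comparison and the result of the atomic operation cannot disagree the way they can with a custom `==`.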