I am using 10 GPUs, but the program reports no devices at the very point where address generation is supposed to start. Can anyone suggest how to get this program to run properly?