The function iter_DIEs is very slow: traversing 1,200,000 DIEs takes 45 seconds. I want to use multiple threads, with each thread processing one CU, but the threads report an error inside iter_DIEs. Is there a way to make this work, or is there any other way to speed up the traversal? My goal is to traverse all DIEs in 3 seconds.
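
For reference, here is a minimal sketch of the one-CU-per-worker idea, written with processes instead of threads. It rests on assumptions: the library is pyelftools, the thread errors probably come from every thread seeking on the same shared file stream, and each worker therefore opens its own file handle. The helper names (split_round_robin, list_cu_offsets, count_dies) and the worker count are made up for illustration. I have not measured whether this gets anywhere near 3 seconds.

```python
import sys
from concurrent.futures import ProcessPoolExecutor


def split_round_robin(items, n_parts):
    """Deal items into n_parts lists so each worker gets a similar share."""
    return [items[i::n_parts] for i in range(n_parts)]


def list_cu_offsets(path):
    """Return the offset of every CU in the file (hypothetical helper)."""
    # Imported lazily so the module loads even where pyelftools is missing.
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        return [cu.cu_offset for cu in dwarf.iter_CUs()]


def count_dies(job):
    """Worker: open a private file handle and walk only the assigned CUs."""
    from elftools.elf.elffile import ELFFile

    path, offsets = job
    wanted = set(offsets)
    total = 0
    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        for cu in dwarf.iter_CUs():
            if cu.cu_offset in wanted:
                # Replace the counting with the real per-DIE processing.
                total += sum(1 for _ in cu.iter_DIEs())
    return total


if __name__ == "__main__":
    path = sys.argv[1]
    workers = 4  # assumption: tune to the number of CPU cores
    parts = [p for p in split_round_robin(list_cu_offsets(path), workers) if p]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(count_dies, [(path, p) for p in parts]))
    print(total)
```

Processes are used here because DIE parsing is pure Python, so threads would still be serialized by the interpreter lock even if the shared-stream problem were solved. Anything returned from a worker has to be picklable, so results should be plain data (counts, names, offsets) rather than DIE objects.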