I've been trying to run this code on Accel-Sim, but I am running into some problems. First, the output of the simulation is different from the output when running on HW. For some reason, the simulation shows the ciphertext and plaintext as 0, while running on HW has a different outcome. I have attached the output file of the code for AES-128 in counter mode, which shows the ciphertext and plaintext as 0 on line 733. Do you know what might be causing this to happen?
Another problem I've been facing (which might be related to the first problem) is that the simulated results are not very close to the HW results. Profiling the application with both Nsight and nvprof produces similar results of around 5B cycles, but the simulation reports only 137k cycles. I've tried using the Tuner to get more accurate simulations, but the results were the same. Do you know what might be causing this behavior? Here are some of the results: 256-ctr, 128-ctr. Although the cycle results are way off, they seem to be off by a constant factor. Some other stats are also very different from the HW results (included in the additional info).
Some additional info:
I've been running the simulation in PTX mode. I tried using SASS, but tracing the application generates a huge file (I had to stop the tracer after a couple of minutes and the file was already over 7GB). Do you know what might be causing this?
I am using a 16GB Tesla V100. I tried using the Tuner to get more accurate results and even tried different parameters (warp scheduling, memory scheduling), but this did not change the overall result.
I ran the Rodinia benchmark, and it produced somewhat accurate results. They were a lot closer to HW than the results for the app I am trying to run, whose error is orders of magnitude larger.
Do you have any suggestions or ideas on how to solve these issues?
Thank you in advance!
This is interesting: if I comment out the printfs inside the kernels, the workload's (128-ES) execution time on HW goes from 3.319 seconds to 42.785 µs, which is much closer to the reported simulation time. I don't know what shenanigans happen inside the device-side printf, but maybe we are not accounting for that in PTX execution mode. Will run some more tests.
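To put the numbers quoted in this thread side by side, here is a rough back-of-the-envelope sketch in Python. It assumes that the ~5B-cycle profile and the 3.319 s timing describe the same printf-heavy run, which the thread does not confirm, and it uses the clock rate implied by those two figures rather than any official V100 specification. All constant and function names are made up for illustration, and the inputs are approximate, so treat the results as order-of-magnitude estimates only.

```python
# Figures quoted in this thread. They are approximate and may come from
# different runs or workloads, so the results are only rough estimates.
HW_CYCLES_WITH_PRINTF = 5e9      # "around 5B cycles" from Nsight/nvprof
SIM_CYCLES = 137e3               # "137k" cycles reported by the simulator
HW_TIME_WITH_PRINTF_S = 3.319    # HW run with device-side printf
HW_TIME_NO_PRINTF_S = 42.785e-6  # HW run with the printfs commented out


def implied_clock_hz(cycles, seconds):
    """Clock rate implied by a cycle count and a wall-clock time."""
    return cycles / seconds


def cycles_at(seconds, clock_hz):
    """Estimated cycle count for a run of `seconds` at `clock_hz`."""
    return seconds * clock_hz


# ASSUMPTION: the 5B-cycle profile and the 3.319 s timing are the same run.
clock_hz = implied_clock_hz(HW_CYCLES_WITH_PRINTF, HW_TIME_WITH_PRINTF_S)

# Estimated HW cycles once the printfs are removed, at that implied clock.
hw_cycles_no_printf = cycles_at(HW_TIME_NO_PRINTF_S, clock_hz)

# How far HW and simulation are apart in each case.
gap_with_printf = HW_CYCLES_WITH_PRINTF / SIM_CYCLES
gap_no_printf = SIM_CYCLES / hw_cycles_no_printf

if __name__ == "__main__":
    print(f"implied clock:            {clock_hz / 1e9:.2f} GHz")
    print(f"HW cycles without printf: {hw_cycles_no_printf:,.0f} (estimate)")
    print(f"HW/sim gap with printf:   {gap_with_printf:,.0f}x")
    print(f"sim/HW gap without printf: {gap_no_printf:.1f}x")
```

Under these assumptions, the gap between HW and simulation shrinks from tens of thousands of times to roughly a factor of two once the printfs are removed, with the simulator then reporting more cycles than the printf-free HW run rather than fewer.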