Deadlock running sass trace #341

Open
Evane5cence opened this issue Oct 7, 2024 · 8 comments


@Evane5cence

Hi,

I slightly modified the kernel to implement a simple function and run GEMM, but I encountered a deadlock. This only occurs when the matrix size is large (4096x4096). May I ask if there is any indication of why this might happen? Is there a common reason for this?

Thank you!!!

GPGPU-Sim uArch: ERROR ** deadlock detected: last writeback core 40 @ gpu_sim_cycle 28773 (+ gpu_tot_sim_cycle 4294867296) (71227 cycles ago)
GPGPU-Sim uArch: DEADLOCK  shader cores no longer committing instructions [core(# threads)]:
GPGPU-Sim uArch: DEADLOCK  0(128) 1(128) 2(128) 3(128) 4(128) 5(128) 6(128) 7(128)  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ... 
GPGPU-Sim uArch DEADLOCK:  memory partition 0 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 1 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 2 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 3 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 4 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 5 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 6 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 7 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 8 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 9 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 10 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 11 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 12 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 13 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 14 busy
GPGPU-Sim uArch DEADLOCK:  memory partition 15 busy
GPGPU-Sim uArch DEADLOCK:  iterconnect contains traffic
GPGPU-Sim uArch: ICNT:Display State: Under implementation

Re-run the simulator in gdb and use debug routines in .gdbinit to debug this
@JRPan
Collaborator

JRPan commented Oct 7, 2024

I need more info. What did you change?
Did you change the trace directly?

@Evane5cence
Author

Thanks for the reply! I did not change the trace. I changed the address of mf using a certain formula.

@JRPan
Collaborator

JRPan commented Oct 8, 2024

At what stage? At mf allocation?

Which function did you change?

@Evane5cence
Author

At the stage when the interconnect passes the mf to the L2 cache, I modify the address of the mf. When the L2 cache passes the mf back to the interconnect, I restore the original address.
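For readers following along, here is a minimal sketch of what a "modify on the way in, restore on the way out" scheme needs in order to round-trip. The formula actually used is not shown in this thread, so `remap`, `restore`, the 128-byte line size, and the chosen bit positions are all hypothetical. The sketch only illustrates that the remap has to be invertible and should leave the line-offset bits alone.

```cpp
#include <cstdint>

// Hypothetical address type; the simulator has its own address typedef.
using addr_t = std::uint64_t;

// Assumed layout, for illustration only.
constexpr unsigned kLineBits  = 7;   // assumed 128-byte cache lines
constexpr unsigned kFoldShift = 20;  // assumed: fold bits [20..23] ...
constexpr addr_t   kFoldMask  = 0xF; // ... into bits [7..10]

// XOR four upper bits into the bits just above the line offset.
// The source bits [20..23] are not modified, so applying the same
// function twice gives back the original address.
addr_t remap(addr_t a) {
    addr_t fold = (a >> kFoldShift) & kFoldMask;
    return a ^ (fold << kLineBits);
}

// Inverse of remap(); identical because the XOR fold is self-inverse.
addr_t restore(addr_t a) { return remap(a); }
```

A formula that is not a bijection, or that touches bits used to route the request, cannot be undone reliably by a later restore step.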

@JRPan
Collaborator

JRPan commented Oct 8, 2024

This sounds fine. Without seeing your code, I cannot really help much.

But my guess is that the mf is being directed somewhere else.
Or the mf gets merged at L2 but then fails to notify L1 on writeback.

My recommendation would be just to modify the address at mf allocation and keep it like that. No need to restore it back.
If you just want to redirect the mf to different L2 banks, you can just change the memory subpartition hash function, without changing the address.
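As a rough illustration of the second suggestion, the sketch below picks a bank index from an address without rewriting the address itself. It is not GPGPU-Sim's hash function. `pick_sub_partition`, the 128-byte line size, and the shift amounts are assumptions made up for this example, and the simulator's real mapping is more involved.

```cpp
#include <cstdint>

// Hypothetical address type; the simulator has its own address typedef.
using addr_t = std::uint64_t;

// Choose a sub-partition from the line address by XOR-folding higher
// bits into lower ones. The address carried by the request is untouched;
// only the routing decision changes.
unsigned pick_sub_partition(addr_t addr, unsigned n_sub_partitions) {
    addr_t line = addr >> 7;                       // assumed 128-byte lines
    addr_t h = line ^ (line >> 5) ^ (line >> 11);  // assumed fold distances
    return static_cast<unsigned>(h % n_sub_partitions);
}
```

Because the address never changes, every structure that matches requests by address sees the same value in both directions and there is nothing to restore.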

@Evane5cence
Author

Thanks!
May I ask what the base_addr of L1/DRAM is? How can I retrieve it?

@Evane5cence
Author

I only see the shmem base_addr and the local mem base_addr.

@JRPan
Collaborator

JRPan commented Oct 11, 2024

base_addr is for local and shmem only. The actual address is base_addr + offset. You can think of each as an allocated array.
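The "allocated array" picture can be sketched as below. `MemWindow` and its helper functions are hypothetical names for illustration, not types from the simulator.

```cpp
#include <cstdint>

// Hypothetical address type; the simulator has its own address typedef.
using addr_t = std::uint64_t;

// Hypothetical description of a local or shared memory window.
struct MemWindow {
    addr_t base_addr;  // where the window starts
    addr_t size;       // window length in bytes
};

// Actual address = base_addr + offset.
addr_t to_address(const MemWindow& w, addr_t offset) {
    return w.base_addr + offset;
}

// Recover the offset from an address known to lie inside the window.
addr_t to_offset(const MemWindow& w, addr_t addr) {
    return addr - w.base_addr;
}

// True if addr falls inside [base_addr, base_addr + size).
bool contains(const MemWindow& w, addr_t addr) {
    return addr >= w.base_addr && addr - w.base_addr < w.size;
}
```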

L1/DRAM hold global memory, which uses global addresses, so there is no base_addr for them. The address for each instruction can be found in the mem_access_t object.
