WIP: lane debugging support #35
Open
palves wants to merge 55 commits into users/palves/amd-staging-without-lane-support from users/palves/lane-debugging
Conversation
Add some convenience wrappers to rocm-tdep.h. These will be used in the following patches. Change-Id: I8d57404855cc3fa759245ed10e1a2c11bdfa0bdb
gdbarch_active_lanes_mask returns the active SIMD lanes mask of a given thread. Also introduce simd_lanes_mask_t, a typedef for uint64_t. 64 bits are sufficient for all currently known architectures.
Change-Id: Ifbb7761431fb7460268765651cdca361b58bc12d
gdbarch_supported_lanes_count returns the number of lanes supported by a thread. For example, on AMDGPU, if the wavefront size is 32 lanes then this returns 32. If the wavefront size is 64 lanes then this returns 64. Change-Id: I91f9fc5d955a41467d1ae8b1a0a7b1f2105d1e58
gdbarch_used_lanes_count returns the number of lanes in a thread that are actually used by the kernel; this matters because of partial work-groups. For example, on AMDGPU, if the wavefront size is 64 lanes then the supported lanes count is 64. However, if the program allocates 90 work items all in one dimension, then the first wave will have all its lanes used, and the second wave will only use 90-64=26 lanes. It's a bit more tricky than that because work items are allocated in a three-dimensional space, but the gist of it is that there are hardware lanes which aren't used at all by the program. This hook lets core GDB be aware of them so that it can hide them.
Change-Id: I976cfa790416060fdf0008b2d8bfb33b0316c31d
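To make the relationship between these three hooks concrete, here is a minimal sketch of how a consumer might combine them; the parameter lists below are assumptions for illustration, not the declarations added by the patches:

/* Illustration only -- not the actual gdbarch declarations.  */
typedef uint64_t simd_lanes_mask_t;  /* 64 bits covers all currently known targets.  */

static bool
lane_is_active (struct gdbarch *gdbarch, thread_info *tp, int lane)
{
  simd_lanes_mask_t mask = gdbarch_active_lanes_mask (gdbarch, tp);
  return (mask & ((simd_lanes_mask_t) 1 << lane)) != 0;
}

static bool
lane_is_used (struct gdbarch *gdbarch, thread_info *tp, int lane)
{
  /* Lanes at or beyond the used count exist in hardware (up to
     gdbarch_supported_lanes_count) but carry no work item.  */
  return lane < gdbarch_used_lanes_count (gdbarch, tp);
}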
target_lane_to_str is meant to be used similarly to how target_pid_to_str is used to convert a ptid_t to a string, except it converts a lane instead of a thread. E.g., on AMDGPU, we get:
target_pid_to_str => "AMDGPU Thread 1:2:1:3 (0,0,0)/2"
target_lane_to_str => "AMDGPU Lane 1:2:1:3/6 (0,0,0)[4,1,3]"
An interesting thing here is the extracting of work item ids/coordinates from flat ids. See lane_workgroup_pos_string. I made sure that is correct by comparing what GDB prints with the coordinates printed by a HIP program. lane_workgroup_pos_string is a separate function because it will be used in another context in a following patch.
Change-Id: I86e87830038f4001c25781605759666ca403ebaa
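The flat-id-to-coordinates conversion mentioned above boils down to the usual row-major decomposition. The sketch below is only illustrative (the function and parameter names are made up; the real logic lives in lane_workgroup_pos_string), assuming the x dimension varies fastest:

struct workitem_pos { unsigned x, y, z; };

static workitem_pos
flat_id_to_pos (unsigned flat_id, unsigned size_x, unsigned size_y)
{
  workitem_pos pos;
  pos.x = flat_id % size_x;              /* fastest-varying dimension */
  pos.y = (flat_id / size_x) % size_y;
  pos.z = flat_id / (size_x * size_y);
  return pos;
}

For example, with a hypothetical [64,2,4] work-group, flat id 6 maps to [6,0,0] and flat id 70 maps to [6,1,0].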
When this knob is off, it:
- Disables DW_AT_LLVM_lane_pc support, in case the compiler emits
  bogus DW_AT_LLVM_lane_pc info.
- Makes GDB not try to step over divergent code regions.
- Makes GDB not ignore breakpoints that trigger when the execution
  mask is 0.
Change-Id: I8214f738d3541d9115fba102a5501565bd31019d
(These bits were originally written by Intel, though there are
considerable local changes and additions.)
- Teaches GDB about keeping track of each thread's selected lane.
thread_info gains a few fields and methods related to lanes.
- Adds scoped_restore_current_simd_lane, to save/restore a thread's
selected lane.
- Teaches GDB to record and display which lanes were active when a
breakpoint was hit:
Thread 5 "dw2-lane-pc" hit Breakpoint 1, with SIMD lanes [0-63], lane_pc_test (....) at dw2-lane-pc.cc:130
^^^^^^
- That "[0-63]" range shown above is built with a new routine, called
make_ranges_from_sorted_vector.
(I think we could probably make this routine work with a much
cheaper simd_lanes_mask_t bit mask instead of a std::vector<int>,
but I never got around to try that.)
- Teaches GDB to also say which lane is current when saying which
thread is current:
[Switching to thread 5, lane 0 (AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0])]
- Makes GDB evaluate the breakpoint condition for all SIMD lanes which
were active. This is important because breakpoint conditions may
involve symbols, references to lane-specific memory (lds in AMDGPU),
or lane-specific convenience variables (e.g., $_lane, added in a
following patch), for example.
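Here is a rough sketch of what make_ranges_from_sorted_vector does, i.e. collapsing a sorted vector of lane numbers into a compact range string. This is not the actual implementation and the exact formatting may differ:

#include <string>
#include <vector>

static std::string
ranges_from_sorted (const std::vector<int> &v)
{
  std::string out;
  for (size_t i = 0; i < v.size (); )
    {
      /* Extend J while the values stay consecutive.  */
      size_t j = i;
      while (j + 1 < v.size () && v[j + 1] == v[j] + 1)
        j++;
      if (!out.empty ())
        out += ",";
      out += std::to_string (v[i]);
      if (j > i)
        out += "-" + std::to_string (v[j]);
      i = j + 1;
    }
  return out;
}

For the sorted vector {0, 1, ..., 63} this yields "0-63", matching the breakpoint-hit message above.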
There's no way to manually change the selected lane in this patch yet.
That will be added in a follow up patch.
The gdb.rocm/ testsuite is adjusted to match the GDB output changes.
gdb.base/annota1.exp had to be tweaked because the patch has the side
effect of changing the order of a frames-invalid and a
breakpoints-invalid annotation.
Limitations:
- There's no way to set a breakpoint that triggers even if all lanes
are masked out / inactive. You have to disable lane divergence
support with "maint set lane-divergence-support off" if you need
that.
- No MI.
Co-Authored-By: Intel
Change-Id: Ie4ff78fc4d6733de972cfa76c1384b08d35103b4
This adds "info lanes" & "lane" commands, the basic commands that let
you list lanes, and switch between lanes.
Switching to an active lane:
(gdb) lane 0
[Switching to thread 374, lane 0 (AMDGPU Lane 1:1:1:370/0 (92,0,0)[64,0,0])]
#0 bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
46 C_d[i] = __bitextract_u32(A_d[i], 8, 4) + 2;
Switching to an inactive lane (note the warning):
(gdb) lane 1
[Switching to thread 374, lane 1 (AMDGPU Lane 1:1:1:370/1 (92,0,0)[65,0,0])]
warning: Current lane is inactive.
#0 bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
46 C_d[i] = __bitextract_u32(A_d[i], 8, 4) + 2;
(gdb)
Listing lanes:
(gdb) help info lanes
Display currently known lanes.
Usage: info lanes [OPTION]... [ID]...
Options:
-all
All lanes (active, inactive and unused).
-active
Only active lanes.
-inactive
Only inactive lanes.
If ID is given, it is a space-separated list of IDs of lanes to display.
Otherwise, all lanes are displayed.
(gdb)
Unused lanes are lanes that end up not associated with any work-item,
because we ended up with a partial work-group. By default, "info
lanes" hides them. For example, if you launch a kernel with 1 block
and 3 threads per block all in the x axis, then you see only 3 lanes,
even if the hardware has 32 or 64 lanes per wave:
(gdb) info lanes
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
1 A AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
-all shows even unused lanes:
(gdb) info lanes -all
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
1 A AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
3 U AMDGPU Lane 1:1:1:1/3 (0,0,0)[0,0,1] (unused)
....
63 U AMDGPU Lane 1:1:1:1/63 (0,0,0)[0,0,21] (unused)
(gdb)
Inactive lanes show like this:
(gdb) info lanes
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
1 I AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] (inactive)
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
If the wave is running, we show this:
(gdb) info lanes
Id State Target Id Frame
* 0 R AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] (running)
1 R AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] (running)
2 R AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] (running)
The "State" column can thus show 4 different states as seen in the
shown above:
A - active lane
I - inactive lane
U - unused lane
R - wave is running
Change-Id: Ifbae436b47f5fa78d58dcaba2a991c5f30373c8c
This commit adds a "lane apply" command, analogous to "thread apply",
but iterates over all lanes of the current wave.
E.g.:
lane apply 1 2 7 4 backtrace Apply 'backtrace' cmd to lanes 1,2,7,4
lane apply 2-7 9 p foo Apply 'p foo' cmd to lanes 2->7 & 9
lane apply all x/i $lane_pc Apply 'x/i $lane_pc' cmd to all lanes.
Note you can use:
(gdb) taa lane apply all CMD
to apply CMD to all lanes of all threads.
Online help shows:
(gdb) help lane apply
Apply a command to a list of lanes.
Usage: lane apply ID... [OPTION]... COMMAND
ID is a space-separated list of IDs of lanes to apply COMMAND on.
Prints lane number and target system's lane id
followed by COMMAND output.
By default, an error raised during the execution of COMMAND
aborts "lane apply".
Options:
-q
Disables printing the thread or lane information.
-c
Print any error raised by COMMAND and continue.
-s
Silently ignore any errors or empty output produced by COMMAND.
-all
All lanes (active, inactive and unused).
-active
Only active lanes.
-inactive
Only inactive lanes.
The -q, -c, -s options are shared with "thread apply". The
-all,-active,-inactive options are shared with "info lanes".
Change-Id: Ifd7d79be4a09daa8d268faf1733bed5cbefacc1c
(gdb) info threads
  Id   Target Id                                           Frame
  1    Thread 0x7ffff66fb880 (LWP 4151298) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
  2    Thread 0x7ffff66fa700 (LWP 4151306) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
  4    Thread 0x7ffff5c7f700 (LWP 4151308) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
* 5    AMDGPU Thread 1:1:1:1 (0,0,0)/0 "bit_extract"       bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) info lane
  Id   State Target Id                              Frame
* 0    A     AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
  1    A     AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
  2    A     AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) thread find \[0,0,0\]
Thread 5, lane 0 has target id 'AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]'
(gdb) thread find \[1,0,0\]
Thread 5, lane 1 has target id 'AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]'
(gdb)
Change-Id: I38cd34702edab47478732bb5e52ff536383adfca
E.g.:
(gdb) lane 0
[Switching to thread 5, lane 0 (AMDGPU Lane 1:2:1:2/0 (0,0,0)[12,1,1])]
...
(gdb) p $_lane
$1 = 0
(gdb) lane 1
[Switching to thread 5, lane 1 (AMDGPU Lane 1:2:1:2/1 (0,0,0)[0,2,1])]
...
(gdb) p $_lane
$2 = 1
$_lane_count shows the number of lanes the current thread uses, taking
partial work-groups into account. E.g., on an AMDGPU with wavefront
size 64 lanes, if all lanes are used:
(gdb) p $_lane_count
$3 = 64
When focused on a CPU thread, we get:
(gdb) p $_lane
$6 = 0
(gdb) p $_lane_count
$7 = 0
$_lane is already described in the manual. $_lane_count is mentioned
there, but in a commented out block.
Change-Id: Iead080cc4872e0be090b1cbcbf3622a15c7ef030
E.g.:
(gdb) info thread 1924
  Id    Target Id                                             Frame
* 1924  AMDGPU Thread 1:1:1:1920 (479,0,0)/3 "bit_extract"    bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_thread_workgroup_pos
$1 = "3"
(gdb) info lane 0
  Id   State Target Id                                     Frame
* 0    A     AMDGPU Lane 1:1:1:1920/0 (479,0,0)[192,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_lane_workgroup_pos
$2 = "[192,0,0]"
After attaching to a process (when we can't read the dispatch info):
(gdb) p $_thread_workgroup_pos
$1 = "?"
(gdb) p $_lane_workgroup_pos
$2 = "[?,?,?]"
And when focused on a CPU thread:
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff66fb880 (LWP 102291))]
#0  0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
78      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p $_lane_workgroup_pos
$3 = ""
(gdb) p $_thread_workgroup_pos
$4 = ""
These are already described in the manual.
Change-Id: I81984ecae92ca76a04c86afca91f0196b2459768
This teaches GDB to automatically step over divergent regions. E.g.,
with this, when you step over an if/then/else, you'll observe the lane
stepping across only one of the "then" or "else" branches, not both.
Conceptually, it's very simple:
- when the thread is stepping, if the current lane becomes inactive,
  then continue stepping until it becomes active, at which point
  resume the normal stepping algorithm (check whether we reached a
  different line, etc.).
Single-stepping the whole divergent range can be slow, as there may be
many instructions to step, e.g., because the inactive code calls
functions. Once we have DW_AT_LLVM_lane_pc support, we'll be able to
speed that up, by setting a breakpoint at the logical lane PC, and
running to it.
As an escape hatch in case this goes wrong, you can disable the
automatic stepping over divergent regions with the
"maint set lane-divergence-support off" command.
Change-Id: Idfc205cc366f5b4d645fd946b577fc303e6f7a10
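As a rough illustration of the check described in the commit above (names such as lane_is_active and current_simd_lane are placeholders, not the patch's actual infrun changes):

/* Illustration only.  Decide whether to keep silently single-stepping
   because the selected lane is currently masked out (divergent).  */
static bool
stepping_through_divergence (struct gdbarch *gdbarch, thread_info *tp)
{
  int lane = current_simd_lane (tp);  /* placeholder for the thread's selected lane */
  /* While the lane is inactive, the wave is executing the other side
     of a branch on behalf of other lanes: don't report a stop, keep
     stepping until the lane becomes active again, then fall back to
     the normal "did we reach a new line?" checks.  */
  return !lane_is_active (gdbarch, tp, lane);
}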
Co-Authored-By: Zoran Zaric <[email protected]> Change-Id: I7aca8548c29b6f07b2233affd03449c0deeb122d
E.g.:
(gdb) info thread 1924
  Id    Target Id                                             Frame
* 1924  AMDGPU Thread 1:1:1:1920 (479,0,0)/3 "bit_extract"    bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_dispatch_pos
$1 = "(479,0,0)/3"
After attaching to a process (when we can't read the dispatch info):
(gdb) p $_dispatch_pos
$1 = "(?,?,?)/?"
And when focused on a CPU thread:
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff66fb880 (LWP 102291))]
#0  0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
78      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p $_dispatch_pos
$3 = ""
This is already described in the manual.
Change-Id: I87da4cb6aac27d1176b82dc1948599a27d722202
This adds two testcases:
- gdb.rocm/lane-execution.exp - exercises execution-related scenarios;
- gdb.rocm/lane-info.exp - exercises lane info & search use cases.
Overall, these:
- Test that GDB automatically steps over code regions where the
  current lane is divergent.
- Test breakpoint hits with some lanes inactive. Also test that GDB
  evaluates the breakpoint condition on each lane.
- Test that GDB warns when you select an inactive lane.
- Test $_lane and the "lane" command.
- Test invalid arguments to lane-related commands.
- Test the $_lane_count convenience variable, making sure it takes
  into account unused lanes.
- Test "info lanes", in combination with -all, -active, -inactive,
  and also making sure GDB hides unused lanes by default.
- Test "lane apply", similarly, in combination with -all, -active,
  -inactive, and also making sure GDB hides unused lanes by default.
Bug: https://ontrack-internal.amd.com/browse/SWDEV-302019
Change-Id: I61fcbfe7ef0cad2bcf35342b3577d6a6318edff2
This teaches the MI command parser about a new global --lane switch, similar to the global --thread switch, but for lanes. Change-Id: I6031e3b10420b7b1578fd5741ea95d46596b2591
Change-Id: I45b2c197e787ed0e81f9e13fef666b654c49a8d2
This adds two new attributes to MI *stopped. "lane-id" and "hit-lanes".
"lane-id" indicates which lane GDB switched to. This is
analogous/paired to/with the existing "thread-id" attribute, thus it
is positioned next to it.
"hit-lanes" is printed for breakpoint-hit stops. It indicates which
lanes among the set of active lanes evaluated the breakpoint condition
as true, if there was a condition, or all active lanes if there was
none.
E.g.:
~"[Switching to thread 3, lane 0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]\n"
~"\n"
~"Thread 3 \"mi-lanes\" hit Temporary breakpoint 1, with lanes [0-31], lane_pc_test () at lane-execution.cpp:79\n"
~"79\t if (gid % 2)\t\t\t\t\t/* if_0_cond */\n"
*stopped,reason="breakpoint-hit",disp="del",bkptno="1",hit-lanes="0-31",frame={...},thread-id="3",lane-id="0",stopped-threads="all"
(gdb)
Above you can see that hit-lanes="0-31" is the same as "with lanes
[0-31]" in the CLI text.
Change-Id: I227a09c05ca32054704f118b35b6cfd7a35939d5
When the tracked expression uses locals, GDB saves the thread and
frame with the varobj, so that it can later on switch to the correct
context to re-evaluate the varobj expression. This patch extends that
logic to also save and switch to the right lane.
In addition, makes varobj printing show the varobj's lane, if there's
one, with a new "lane-id" attribute. E.g.:
-var-create var * gid
^done,name="var",numchild="0",value="5",type="unsigned int",thread-id="3",lane-id="5",has_more="0"
(gdb)
Change-Id: I44b307153bc3d2a60875d69ee99ce70f91d76e3a
The previous patch made GDB remember the lane that was current when a
varobj is created, so that GDB can restore that lane context when it
needs to re-evaluate the varobj expression. However, even with that
in place, GDB is still showing the wrong value for the wrong lane.
An example follows. When stopped at a function that looks like this:
__device__ void
lane_pc_test (unsigned gid, const int *in, struct test_struct *out)
{
The "gid" parameter's location is in private_lane memory:
p &gid
&"p &gid\n"
~"$1 = (unsigned int *) private_lane#0x50\n"
^done
Assuming GDB thread 3 is the GPU thread, let's switch to its lane 5:
-thread-select -l 5 3
^done,new-thread-id="3",lane-id="5",frame={...}
(gdb)
And create a varobj called "var", inspecting "gid":
-var-create var * gid
^done,name="var",numchild="0",value="5",type="unsigned int",thread-id="3",lane-id="5",has_more="0"
(gdb)
Note above we see value=5.
Now let's switch to a different lane, which has a different value for
"gid". Here, lane 0:
-thread-select -l 0 3
^done,new-thread-id="3",lane-id="0",frame={...}
(gdb)
Let's confirm "gid"'s value with -data-evaluate-expression, i.e.,
without the varobj machinery:
-data-evaluate-expression gid
^done,value="0"
(gdb)
Now let's update the varobj. GDB internally switches to lane 5, the
varobj's lane, reevaluates the expression, and saves the resulting
value within the varobj. The value should still be 5.
-var-update var
^done,changelist=[{name="var",in_scope="true",type_changed="false",has_more="0"}]
(gdb)
However, we see that GDB gives back an incorrect value:
-var-evaluate-expression var
^done,value="0"
(gdb)
It is printing the value of "gid" for the current lane, lane 0,
instead of the value for lane 5.
What happens is that when -var-update evaluates the expression, even
though value_of_root_1 switches to the thread/lane of the varobj, in
this case lane 5, the resulting value is a lazy memory lval value,
with address private_lane#0x50, thus an address that depends on the
current lane. After creating the lazy value, value_of_root_1 restores
back the current lane to lane 0.
varobj_update continues, and calls install_new_value. Since the value
is lazy, install_new_value fetches it. But since we are now focused
on lane 0, and the value's address is from the "private_lane" address
space, we end up with the result for that lane, instead of for lane 5.
The fix is to make install_new_value switch to the right thread/lane
context before fetching the lazy value.
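A minimal sketch of the shape of that fix, assuming a hypothetical switch_to_varobj_thread_and_lane helper (scoped_restore_current_simd_lane comes from earlier in this series):

/* Sketch only: fetch a lazy varobj value in the varobj's own
   thread/lane context, so a lane-relative address such as
   private_lane#0x50 is read against the varobj's lane (5 here), not
   the currently selected lane (0).  */
static void
fetch_lazy_in_varobj_context (struct varobj *var, struct value *value)
{
  scoped_restore_current_thread restore_thread;
  scoped_restore_current_simd_lane restore_lane;
  switch_to_varobj_thread_and_lane (var);   /* hypothetical helper */
  value_fetch_lazy (value);
}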
Change-Id: I916f9075145ac5ecdf17b120b5672c4de597f1df
This adds a new "-l LANE" option to -thread-select. This lets the frontend switch the current lane. Change-Id: I9345260ecf27afcabf5db62ca80040598741c6e2
Before:
-thread-select -l 2 3
^done,new-thread-id="3",frame={level="0",addr=...}
After:
-thread-select -l 2 3
^done,new-thread-id="3",lane-id="2",frame={level="0",addr=...}
-thread-select 3
^done,new-thread-id="3",lane-id="2",frame={level="0",addr=...}
Note the else branch can be simplified because there we know we're
handling a non-MI uiout.
Change-Id: I45bcf9c0a4782392634134baf5dbf08a736b30af
Both the "thread" and "lane" CLI commands result in a =thread-selected
MI async event emitted. Augment that event to include the selected
lane in a new "lane-id" attribute. E.g.:
lane 1
&"lane 1\n"
~"[Switching to thread 3, lane 1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0])]\n"
~"#0 lane_pc_test (gid=1, in=0x7ffff5801000, out=0x7ffff5800000) at lane-execution.cpp:79\n"
~"79\t if (gid % 2)\t\t\t\t\t/* if_0_cond */\n"
=thread-selected,id="3",lane-id="1",frame={level="0",addr="0x00007ffff580b8b4",....}
^done
(gdb)
Change-Id: Ic607380ddf0ca8228c7b0bf7ba265617d25f83da
This adds a new "-lane-info" command, MI equivalent of the "info
lanes" command, inspired by the existing "-thread-info" MI command,
but for lanes. Unlike the CLI's "info lanes", "-lane-info" doesn't
take any option other than the IDs of the lanes to list (like
-thread-info), thus it always lists all lanes of all states.
E.g. (simplified):
-lane-info
^done,lanes=[
{id="0",state="A",target-id="AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]",frame={level="0",addr=...}},
{id="1",state="A",target-id="AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]",frame={level="0",addr=...}},
...
{id="63",state="A",target-id="AMDGPU Lane 1:1:1:1/63 (0,0,0)[63,0,0]",frame={level="0",addr=...}}
]
(gdb)
Note this prints the same frame for each lane, but that will change in
the future when we add support for lane divergence.
The patch is actually quite small, because the lane printing code had
already been written, from the get-go, to print to uiout, assuming
it'd be reused by MI.
print_lane_row is tweaked to no longer print a "state"
running|stopped attribute, because that (thread-wide) state can be
inferred from the lane's state instead. E.g., the CLI shows this when
the wave is stopped:
(gdb) info lanes 0
Id State Target Id Frame
* 0 A AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0] kernel () at kernel.cpp:79
And this when the wave is running:
(gdb) info lanes 0
Id State Target Id Frame
* 0 R AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0] (running)
The "(running)" part is redandant with the "R" state column. Getting
rid of it simplifies things, as this way we only have one MI "state"
attribute. Otherwise, we'd have to come up with different attribute
names for both state "kinds".
Change-Id: Ia1b3b3e211b12c57486b4a8df279bfad18d218ea
This adds testcases exercising the new MI lane debugging features.
. -lane-info
. -thread-select -l LANE
. =thread-selected lane-id attribute
. "lane-id" and "hit-lanes" attributes in *stopped
. - global --lane option
. lane-awareness in varobjs
Change-Id: Ic122f5e34718a7b9d9db46fcc4acfc6acbdae011
This adds documentation for the new MI lane debugging features.
. -lane-info
. -thread-select -l LANE
. =thread-selected lane-id attribute
. "lane-id" and "hit-lanes" attributes in *stopped
. - global --lane option
. lane-awareness in varobjs
Change-Id: I880e083c3351fc01ca93251ddf833945ea37fa85
GDB's dcache is flushed whenever we switch threads. However, this is not safe enough for all address spaces considering that some of them might be local to a lane. This patch makes GDB flush the dcache if the lane changes as well. Change-Id: Iaeded6a0b04ee200ac8bab9b6c9ddea83b8248d1
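A minimal sketch of the idea, with the bookkeeping names invented for illustration (dcache_invalidate is GDB's existing flush entry point):

/* Flush the dcache whenever the (thread, lane) focus changes, since
   some address spaces are local to a lane.  Illustration only.  */
static ptid_t last_flushed_ptid;
static int last_flushed_lane = -1;

static void
maybe_flush_dcache_for_lane (DCACHE *dcache, ptid_t ptid, int lane)
{
  if (ptid != last_flushed_ptid || lane != last_flushed_lane)
    {
      dcache_invalidate (dcache);
      last_flushed_ptid = ptid;
      last_flushed_lane = lane;
    }
}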
Not meant for upstream along the base lane support. It will go with the whole "finish" support as a single feature. Change-Id: Idc56239c10d2f7d4db98c45aadec9172f947b5f5
Not meant for upstream along the base lane support. It will go with the whole watchpoints support as a single feature, which depends on address spaces support. Change-Id: I4f3a6e9f525aea31c3be0fb0ef06829791ad8a76
467f8f3 to 60b44af
Change-Id: I21f4be049e0815c05918cf5b54f40ab460a49584
60b44af to 552d471
The breakpoints code passes around 'int inferior, int thread, int
task', as arguments and as structure fields all over the place. This
makes it difficult to add another kind of object breakpoints can be
specific to. This patch helps with that by adding a new struct
bp_specificity, which aggregates inferior, thread, task, and passing
that around instead. To add another kind of object breakpoints can be
specific to in the future, we save touching many function signatures
by instead adding a new field to bp_specificity.
Change-Id: I30a56c70f5ba5e76e2e8bcda5bb210e105be4a57
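The aggregate might look roughly like the following; the field types and defaults are assumptions for illustration, not copied from the patch:

struct bp_specificity
{
  /* Breakpoint is specific to this inferior/thread/task, or -1 for
     "not specific".  */
  int inferior = -1;
  int thread = -1;
  int task = -1;
  /* A future kind of specificity (e.g. a lane) becomes one more field
     here instead of another parameter threaded through many
     signatures.  */
};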
describe_other_breakpoints is extern, but nothing uses describe_other_breakpoints outside breakpoint.c. Make it static. Change-Id: I47713f72a952a46c6e74c999ea49802d0be659f4
Currently:
(gdb) b func
Breakpoint 1 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoint 1 also set at pc ADDR.
Breakpoint 2 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (all threads) and 2 (all threads) also set at pc ADDR.
Breakpoint 3 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1, 2 and 3 (thread 1) also set at pc ADDR.
Breakpoint 4 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1, 2, 3 (thread 1) and 4 also set at pc ADDR.
Breakpoint 5 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (all threads), 2 (all threads), 3 (thread 1), 4 (all threads) and 5 (all threads) also set at pc ADDR.
Breakpoint 6 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoints 1, 2, 3 (thread 1), 4, 5 and 6 (thread 1) also set at pc ADDR.
Breakpoint 7 at ADDR: file main.cpp, line 10.
Observations on the above:
- Breakpoints 4 and 5 in the last breakpoint set above don't have a
"(inferior N)" note.
- We say "(all threads)", even if the breakpoint is inferior specific.
- The "(all threads)" note only appears when the new breakpoint is
thread specific. That makes sense. However, if we fix this to
handle inferior-specific breakpoints, we get to wonder what to
print in this scenario:
(gdb) b func inferior 1
Note: breakpoint 1 (all threads) also set at ...
^^^^^^^^^^^
A patch later in the series will add support for lane-specific
breakpoints, so should you get something like this? :
(gdb) b func lane 1
Note: breakpoint 1 (all threads) also set at ...
^^^^^^^^^^^
I think it's better to say "not specific" instead, so that the
string doesn't depend on the specificity kind of the new
breakpoint.
So this patch changes the above example to output this:
(gdb) b func
Breakpoint 1 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoint 1 also set at pc ADDR.
Breakpoint 2 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (not specific) and 2 (not specific) also set at pc ADDR.
Breakpoint 3 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1 (not specific), 2 (not specific) and 3 (thread 1) also set at pc ADDR.
Breakpoint 4 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1 (not specific), 2 (not specific), 3 (thread 1) and 4 (inferior 1) also set at pc ADDR.
Breakpoint 5 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (not specific), 2 (not specific), 3 (thread 1), 4 (inferior 1) and 5 (inferior 1) also set at pc ADDR.
Breakpoint 6 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoints 1, 2, 3 (thread 1), 4 (inferior 1), 5 (inferior 1) and 6 (thread 1) also set at pc ADDR.
Breakpoint 7 at ADDR: file main.cpp, line 10.
(gdb)
Change-Id: I09a2c70f8ca7c3ed579a2b56d38dd46c41ec010f
Change-Id: Ia156850dae3c73f67f6f4b666380fa444aa57cc7
get_positive_number_trailer can also return 0. Rename it to get_non_negative_number_trailer so that it's clearer. Change-Id: Icd4df329e7e4d359743e0e6a701f8b200dcef928
So that we can distinguish return value of 0 from junk return. Needed later in the series, when parsing lane numbers, which can be 0. Currently, calling get_number_trailer on "123xyz" returns 0, which all callers assume means error. Change-Id: If0708d47f8554ee178217348516b89b39a31abdc
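One way to make that distinction (purely illustrative; this is not the signature the patch actually uses) is to report success separately from the parsed value:

#include <cstdlib>
#include <optional>

/* Parse a non-negative number ending at TRAILER or at end of string.
   Returns the value on success (0 is a valid result, e.g. lane 0),
   and std::nullopt on junk such as "123xyz".  */
static std::optional<int>
parse_number_trailer (const char **pp, char trailer)
{
  char *end;
  long val = std::strtol (*pp, &end, 10);
  if (end == *pp || (*end != trailer && *end != '\0') || val < 0)
    return std::nullopt;
  *pp = end;
  return (int) val;
}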
WIP, quickly whipped up. Need to handle lane ranges too, like
"lane apply 1.2.0-32". Testcases. Documentation. Etc.
Bug: SWDEV-485608
Change-Id: I3f71157dbab8e0b9980d462221cb44f6e8b26eff
This commit adds lane-specific breakpoints. WIP. A bit rough around
the edges but sufficient for basic testing of the UI.
(gdb) thread
[Current thread is 6, lane 0 (AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0])]
(gdb) b bit_extract_kernel lane 3        # lane 3 of current thread
Breakpoint 6 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 50.4     # explicit thread of current inferior
Note: breakpoint 6 (thread 6, lane 3) also set at pc 0x7ffff6085514.
Breakpoint 7 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 1.50.5   # explicit thread, explicit inferior
Note: breakpoints 6 (thread 6, lane 3) and 7 (thread 50, lane 4) also set at pc 0x7ffff6085514.
Breakpoint 8 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 2.50.5   # explicit thread, explicit (non-existent) inferior
No inferior number '2'
(gdb) b bit_extract_kernel lane 1.5      # explicit lane of a CPU thread?
Note: breakpoints 6 (thread 6, lane 3), 7 (thread 50, lane 4) and 8 (thread 50, lane 5) also set at pc 0x7ffff6085514.
Breakpoint 9 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 10000.5  # non-existent thread.
Unknown thread 10000.
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
6       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 6, lane 3
7       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 50, lane 4
8       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 50, lane 5
9       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 1, lane 5
(gdb)
Missing testcases, documentation, fix hacks, etc.
Bug: SWDEV-485608
Change-Id: Id8f27a4dc89133565dfc60d27e67329e23fa977b
…d lane IDs
Note: this is a WIP prototype:
- IWBN to rewrite the TID parser using the same mechanism of this
new lane IDs parser.
~~~~
Currently in ROCgdb, both "info lanes" and "lane apply" iterate over
the lanes of the current thread. This commit makes them iterate over
all the lanes of all threads, like "info threads" and "thread apply"
iterate over all threads of all inferiors, instead, as a more natural
fit.
E.g., currently in ROCgdb:
- "info lanes" list all lanes of the current thread
- "info lanes 4-6" list lanes 4-6 of the current thread
- thread apply all \
info lanes list all lanes of all threads of all inferiors
- thread apply 5 \
info lanes list all lanes of thread 5 of the current inferior
This commit changes to this:
- "info lanes" list all lanes of all threads of all inferiors
- "info lanes 4-6" list lanes 4-6 of the current thread
- "info lanes *" list all lanes of the current thread
- "info lanes 5.*" list all lanes of thread 5 of the current inferior
- "info lanes 2.5.*" list all lanes of inferior 2, thread 5
This also works:
- "info lanes *.*" list all lanes of all threads of the current inferior
- "info lanes 1-6.2-4" list lanes 2 to 4 of threads 1 to 6 of the current inferior
- "info lanes *.2-4" list lanes 2 to 4 of all threads of the current inferior
Same syntax is used for "lane apply LANE_ID_LIST".
"lane apply all" now walks all lanes of all threads.
Change-Id: I038c8a64f9c5c1e01130e0d89068447c4a58c9e6
Change-Id: I2114ff267cc124255ff496ab054f2d404437db68
This implements the changes discussed with Intel.
Running to breakpoint:
Starting program: /home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffeac00640 (LWP 1653959)]
[New Thread 0x7fffea200640 (LWP 1653960)]
[Thread 0x7fffea200640 (LWP 1653960) exited]
[New Thread 0x7fffe8a00640 (LWP 1654059)]
[Thread 0x7fffe8a00640 (LWP 1654059) exited]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
Thread 6 "meeting" hit Breakpoint 1.10, with lanes [0-1], kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
(Should the above print a "Switching to thread" line too?)
Querying current state:
(gdb) inferior
[Current inferior is 1 [process 1653956] (/home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting)]
(gdb) thread
[Current thread is 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
(gdb) lane
[Current lane is 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
Changing state:
(gdb) inferior 1
[Switching to inferior 1 [process 1653956] (/home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting)]
[Switching to thread 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
GDB was already printing the "Switching to thread" part, so I made it
print all the finer switching (thread->lane) too.
(gdb) thread 6
[Switching to thread 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
Here I followed the "inferior" command's logic. Note: "Switching to
lane" only appears if the thread has lanes.
(gdb) lane 0
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
(gdb)
Change-Id: I3e8783a687d4ec4ca940066a0280ae9d6e3184cc
…ption
This adds "info lanes -unused" for completeness. And also, makes it
possible to combine different state filters, like e.g.:
(gdb) info lanes -active -unused
This prints both active and unused lanes. Currently it would print no
lanes. Likewise "lane apply".
Change-Id: If7215a243532aaa15790bc838e55f0172df7fdeb
(gdb) info threads
  Id   Lane  Target Id                                       Frame
  1          Thread 0x7ffff63d4180 (LWP 2627482) "meeting"   0x00007fffeb26a1b1 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
  2          Thread 0x7fffeac00640 (LWP 2627485) "meeting"   __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  5          Thread 0x7ffff629f640 (LWP 2627488) "meeting"   __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6    1     AMDGPU Wave 1:2:1:1 (0,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7    2     AMDGPU Wave 1:2:1:2 (1,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8    0     AMDGPU Wave 1:2:1:3 (2,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9    0     AMDGPU Wave 1:2:1:4 (3,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
(gdb)
Change-Id: I0a659858c2033b2ad8de802aee2b951eb40c4e17
(gdb) info threads
  Id   Target Id                                       Lane  Frame
  1    Thread 0x7ffff63d4180 (LWP 2625761) "meeting"         0x00007fffeb26a1c7 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
  2    Thread 0x7fffeac00640 (LWP 2625764) "meeting"         __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  5    Thread 0x7ffff629f640 (LWP 2625786) "meeting"         __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6    AMDGPU Wave 1:2:1:1 (0,0,0)/0 "meeting"         1     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7    AMDGPU Wave 1:2:1:2 (1,0,0)/0 "meeting"         2     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8    AMDGPU Wave 1:2:1:3 (2,0,0)/0 "meeting"         0     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9    AMDGPU Wave 1:2:1:4 (3,0,0)/0 "meeting"         0     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Change-Id: I2c89fbea729a9e6b0a49ce1496cbd747561606c2
Change-Id: If73cee485c37c8b7d8363631cb4f749def11680a
This replaces rocgdb's warning when switching to inactive lanes with
making gdb show " = <lane inactive>".
(gdb) lane apply 0-1 frame
Lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]):
24 }
Lane 6.1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]):
24 }
(gdb) p arg1
lane inactive # an actual error. making this be smarter and show:
# $1 = <lane inactive>
# would be interesting.
(gdb) lane apply 0-1 info args
Lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]):
arg1 = <lane inactive>
arg2 = <lane inactive>
Lane 6.1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]):
arg1 = 1
arg2 = 2
Notes:
- Throwing from push_lane isn't sufficient, since that isn't used
for local variables in lane memory. I had to make the dwarf
evaluator throw an error when dereferencing the lane address space
too. rocgdb downstream already has an ensure_have_simd_lane
function called wherever the DWARF evaluator needs a lane, so I
added the !inactive check there.
- The predicate I am thinking of using to suppress the error will be
if we have a lane PC expression for the current frame. I can't
think of a case where this would go wrong, but admittedly I didn't think
all that much about it. I didn't try combining this work with the
lane divergence support code, as it would require rebasing the
lane divergence branch on top of this, which I think is more than
I can chew atm.
Change-Id: I783abed49841cbd38851bccfb4296fb6b7ff31d6
Change-Id: Ie4d358e66b9730342644c8db246858a1422030e5
This reverts:
From fc59b01 Mon Sep 17 00:00:00 2001
From: Pedro Alves <[email protected]>
Date: Sat, 17 Oct 2020 00:54:53 +0100
Subject: [PATCH] Make "thread find" hit lane target IDs as well
Change-Id: I1dffdf80d94b823f0f49df1a0dfe7cea05f34ffe
As a counterpart to "thread find: (gdb) thread find 0,0,0 Thread 6 has target id 'AMDGPU Wave 1:2:1:1 (0,0,0)/0' This commit adds a new "lane find" command: (gdb) lane find (0,0,0) Lane 6.0 has target id 'AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]' Lane 6.1 has target id 'AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]' Lane 6.2 has target id 'AMDGPU Lane 1:2:1:1/2 (0,0,0)[2,0,0]' Lane 6.3 has target id 'AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]' (gdb) lane find \[0,0,0\] Lane 6.0 has target id 'AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]' Lane 7.0 has target id 'AMDGPU Lane 1:2:1:2/0 (1,0,0)[0,0,0]' Lane 8.0 has target id 'AMDGPU Lane 1:2:1:3/0 (2,0,0)[0,0,0]' Lane 9.0 has target id 'AMDGPU Lane 1:2:1:4/0 (3,0,0)[0,0,0]' TODO: - support options like "lane find -inactive/-active/-all" ? - docs, testcases, etc. Change-Id: I5eca44bdcc9ae0e358355619f3dd1d1b2c2d68d1
The following patches will introduce filtering options like:
info threads -if EXPRESSION
info lanes -if EXPRESSION
thread apply ID_LIST|all -if EXPRESSION
lane apply ID_LIST|all -if EXPRESSION
EXPRESSION is a string, which raises the question of -- how can GDB
tell where the expression ends, and where the next option starts?
E.g., with:
info threads -if foo -bar
is that a "foo minus bar" subtraction, or a "-bar" option?
We currently support string options. If we make EXPRESSION be a
string option, then the user resolved the ambiguity by quoting
EXPRESSION.
However, I think it is nicer for the user to not quote. Particularly,
the expression may use quoted strings, for example, which then either
requires escaping, or playing with single vs double quotes:
info threads -if "foo(\"str\")"
info threads -if 'foo("str")'
Instead, I propose we introduce a new var_expression option type, and
let the user resolve the ambiguity by wrapping the expression in
parentheses. E.g.:
info threads -if (foo > 2)
If the expression does not have spaces, you can do:
info threads -if foo
I actually left code in place to support quoting the expression too,
in which case escapes are processed:
(gdb) info lanes -if "$_lane == 3"
Id State Target Id Frame
6.3 A AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
7.3 A AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
8.3 A AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
9.3 A AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
I'm still thinking whether to keep the quoted support.
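For illustration, delimiting an unquoted parenthesized expression is essentially a scan to the matching close paren. The sketch below is not the actual var_expression parsing code and omits error handling for unbalanced input:

#include <string>

/* Given ARGS pointing at the opening '(', return the expression text
   without the outer parentheses and advance ARGS past the closing ')'.  */
static std::string
extract_paren_expression (const char *&args)
{
  const char *start = args;
  int depth = 0;
  do
    {
      if (*args == '(')
        depth++;
      else if (*args == ')')
        depth--;
      args++;
    }
  while (depth > 0 && *args != '\0');
  return std::string (start + 1, args - 1);
}

E.g., for "($_lane >= 2 && $_lane < 5) -active" this returns "$_lane >= 2 && $_lane < 5" and leaves ARGS pointing at " -active".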
I made TAB completion work, which is quite nice.
TODO:
- Take a look at how "maint test-options -expression EXPR" should behave.
- Docs, test, etc.
Change-Id: I7d3fdd9e82f2bfe34ee342f803728027be037b28
If the expression does not have spaces, you can do:
(gdb) info lanes -if $_lane -active
  Id    State Target Id                              Frame
* 6.1   A     AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.1   A     AMDGPU Lane 1:2:1:2/1 (1,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.1   A     AMDGPU Lane 1:2:1:3/1 (2,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.1   A     AMDGPU Lane 1:2:1:4/1 (3,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Otherwise, if the expression does have spaces, you can use parens to
delimit it:
(gdb) info lanes -if ($_lane >= 2 && $_lane < 5) -active
  Id    State Target Id                              Frame
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Another example, using frame arguments:
(gdb) info lanes -if (x==1)
  Id    State Target Id                              Frame
* 6.1   A     AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  7.1   A     AMDGPU Lane 1:2:1:2/1 (1,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  8.1   A     AMDGPU Lane 1:2:1:3/1 (2,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  9.1   A     AMDGPU Lane 1:2:1:4/1 (3,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
Lanes for which the filter failed to evaluate are skipped.
Alternatively, you can also quote the expression, in which case
escapes are processed:
(gdb) info lanes -if "$_lane == 3"
  Id    State Target Id                              Frame
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
I'm still thinking whether to keep the quoted support.
"lane apply" accepts the same option.
TODO:
- Take a look at how "maint test-options -expression EXPR" should behave.
- Docs, test, etc.
Change-Id: I7d3fdd9e82f2bfe34ee342f803728027be037b28
After the previous patch, doing the same to the threads commands is easy. Change-Id: If1d8aba6b2b7db19654102f31aa57748fbaae30b
552d471 to
1140b0b
Compare
rahulc-gh pushed a commit that referenced this pull request on Jun 25, 2025
…rget_breakpoint::check_status
ROCgdb handles target events very slowly when running a test case like
this, where a breakpoint is preset on HipTest::vectorADD:
for (int i=0; i < numDevices; ++i) {
HIPCHECK(hipSetDevice(i));
hipLaunchKernelGGL(HipTest::vectorADD, dim3(blocks), dim3(threadsPerBlock), 0, stream[i],
static_cast<const int*>(A_d[i]), static_cast<const int*>(B_d[i]), C_d[i], N);
}
What happens is:
- A kernel is launched
- The internal runtime breakpoint is hit during the second
hipLaunchKernelGGL call, which causes
amd_dbgapi_target_breakpoint::check_status to be called
- Meanwhile, all waves of the kernel hit the breakpoint on vectorADD
- amd_dbgapi_target_breakpoint::check_status calls process_event_queue,
which pulls the thousands of breakpoint hit events from the kernel
- As part of handling the breakpoint hit events, we write the PC of the
waves that stopped to decrement it. Because the forward progress
requirement is not disabled, this causes a suspend/resume of the
queue each time, which is time-consuming.
The stack trace where this all happens is:
#32 0x00007ffff6b9abda in amd_dbgapi_write_register (wave_id=..., register_id=..., offset=0, value_size=8, value=0x7fffea9fdcc0) at /home/smarchi/src/amd-dbgapi/src/register.cpp:587
#33 0x00005555588c0bed in amd_dbgapi_target::store_registers (this=0x55555c7b1d20 <the_amd_dbgapi_target>, regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2504
#34 0x000055555a5186a1 in target_store_registers (regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/target.c:3973
#35 0x0000555559fab831 in regcache::raw_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:890
#36 0x0000555559fabd2b in regcache::cooked_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:915
#37 0x0000555559fc3ca5 in regcache::cooked_write<unsigned long, void> (this=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:850
#38 0x0000555559fab09a in regcache_cooked_write_unsigned (regcache=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:858
#39 0x0000555559fb0678 in regcache_write_pc (regcache=0x507000002240, pc=0x7ffff62bd900) at /home/smarchi/src/wt/amd/gdb/regcache.c:1460
#40 0x00005555588bb37d in process_one_event (event_id=..., event_kind=AMD_DBGAPI_EVENT_KIND_WAVE_STOP) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1873
#41 0x00005555588bbf7b in process_event_queue (process_id=..., until_event_kind=AMD_DBGAPI_EVENT_KIND_BREAKPOINT_RESUME) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2006
#42 0x00005555588b1aca in amd_dbgapi_target_breakpoint::check_status (this=0x511000140900, bs=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:890
#43 0x0000555558c50080 in bpstat_stop_status (aspace=0x5070000061b0, bp_addr=0x7fffed0b9ab0, thread=0x518000026c80, ws=..., stop_chain=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/breakpoint.c:6126
#44 0x000055555984f4ff in handle_signal_stop (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:7169
#45 0x000055555984b889 in handle_inferior_event (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:6621
#46 0x000055555983eab6 in fetch_inferior_event () at /home/smarchi/src/wt/amd/gdb/infrun.c:4750
#47 0x00005555597caa5f in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/wt/amd/gdb/inf-loop.c:42
#48 0x00005555588b838e in handle_target_event (client_data=0x0) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1513
Fix that performance problem by disabling the forward progress
requirement in amd_dbgapi_target_breakpoint::check_status, before
calling process_event_queue, so that we can process all events
efficiently.
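Sketch of the shape of the fix, under the assumption of a scoped wrapper around the existing require_forward_progress mechanism; both the wrapper and the bool signature shown here are guesses for illustration:

/* Hypothetical RAII helper: drop the forward-progress requirement for
   the duration of a scope, so the whole batch of wave-stop events
   (and their PC adjustments) is handled under a single queue
   suspend/resume instead of one per event.  */
struct scoped_no_forward_progress
{
  scoped_no_forward_progress () { require_forward_progress (false); }
  ~scoped_no_forward_progress () { require_forward_progress (true); }
};

check_status would then instantiate this before calling process_event_queue.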
Since the same performance problem could theoretically happen any time
process_event_queue is called with forward progress requirement enabled,
add an assert to ensure that forward progress requirement is disabled
when process_event_queue is invoked. This makes it necessary to add a
require_forward_progress call to amd_dbgapi_finalize_core_attach. It
looks a bit strange, since core files don't have execution, but it
doesn't hurt.
Add a test that replicates this scenario. The test launches a kernel
that hits a breakpoint (with an always false condition) repeatedly.
Meanwhile, the host process loads and unloads a code object, causing
check_status to be called.
Bug: SWDEV-482511
Change-Id: Ida86340d679e6bd8462712953458c07ba3fd49ec
Approved-by: Lancelot Six <[email protected]>
rocm-ci pushed a commit that referenced this pull request on Jul 9, 2025
…rget_breakpoint::check_status
(cherry picked from commit bb7c679)
rahulc-gh pushed a commit that referenced this pull request on Oct 7, 2025
PR gdb/33512 reports an assertion failure in test-case
gdb.ada/access_to_packed_array.exp on i386-linux:
...
(gdb) maint print symbols
gdb/frame.c:3400: internal-error: reinflate: \
Assertion `m_cached_level >= -1' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: $exp: \
maint print symbols (GDB internal error)
...
I haven't been able to reproduce the failure by running the test-case on
x86_64-linux with target board unix/-m32, but I'm able to reproduce on
x86_64-linux by using the exec attached to the PR:
...
$ cat gdb.in
file foo
maint expand-symtabs
maint print symbols
$ gdb -q -batch -ex "set trace-commands on" -x gdb.in
...
c_to: array (gdb/frame.c:3395: internal-error: reinflate: \
Assertion `m_cached_level >= -1' failed.
...
The problem happens when trying to print variable c_to:
...
<4><f227>: Abbrev Number: 3 (DW_TAG_variable)
<f228> DW_AT_name : c_to
<f230> DW_AT_type : <0xf214>
...
with type:
...
<4><f214>: Abbrev Number: 7 (DW_TAG_array_type)
<f215> DW_AT_type : <0x9f39>
<5><f21d>: Abbrev Number: 12 (DW_TAG_subrange_type)
<f21e> DW_AT_type : <0x9d6c>
<f222> DW_AT_upper_bound : <0xf209>
...
with upper bound:
...
<4><f209>: Abbrev Number: 89 (DW_TAG_variable)
<f20a> DW_AT_name : system__os_lib__copy_file__copy_to__TTc_toSP1___U
<f20e> DW_AT_type : <0x9d6c>
<f212> DW_AT_artificial : 1
<f212> DW_AT_location : 1 byte block: 57 (DW_OP_reg7 (edi))
...
The backtrace at the point of the assertion failure is:
...
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff62a8e7f in __pthread_kill_internal (signo=6,
threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff6257842 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/posix/raise.c:26
#3 0x00007ffff623f5cf in __GI_abort () at abort.c:79
#4 0x00000000010e7ac6 in dump_core () at gdb/utils.c:223
#5 0x00000000010e81b8 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (
problem=0x2ceb0c0 <internal_error_problem>,
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.", ap=0x7fffffffc3c0)
at gdb/utils.c:475
#6 0x00000000010e82ac in internal_verror (
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.", ap=0x7fffffffc3c0)
at gdb/utils.c:501
#7 0x00000000019be79f in internal_error_loc (
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.")
at gdbsupport/errors.cc:57
#8 0x00000000009b5c16 in frame_info_ptr::reinflate (this=0x7fffffffc878)
at gdb/frame.c:3395
#9 0x00000000009b66f9 in frame_info_ptr::operator-> (this=0x7fffffffc878)
at gdb/frame.h:290
#10 0x00000000009b4bd5 in get_frame_arch (this_frame=...)
at gdb/frame.c:3075
#11 0x000000000081dd89 in dwarf_expr_context::fetch_result (
this=0x7fffffffc810, type=0x410d600, subobj_type=0x410d600,
subobj_offset=0, as_lval=true)
at gdb/dwarf2/expr.c:1006
#12 0x000000000081e2ef in dwarf_expr_context::evaluate (this=0x7fffffffc810,
addr=0x7ffff459ce6b "W\aF\003", len=1, as_lval=true,
per_cu=0x7fffd00053f0, frame=..., addr_info=0x7fffffffcc30, type=0x0,
subobj_type=0x0, subobj_offset=0)
at gdb/dwarf2/expr.c:1136
#13 0x0000000000877c14 in dwarf2_locexpr_baton_eval (dlbaton=0x3e99c18,
frame=..., addr_stack=0x7fffffffcc30, valp=0x7fffffffcab0,
push_values=..., is_reference=0x7fffffffc9b0)
at gdb/dwarf2/loc.c:1604
#14 0x0000000000877f71 in dwarf2_evaluate_property (prop=0x3e99ce0,
initial_frame=..., addr_stack=0x7fffffffcc30, value=0x7fffffffcab0,
push_values=...) at gdb/dwarf2/loc.c:1668
#15 0x00000000009def76 in resolve_dynamic_range (dyn_range_type=0x3e99c50,
addr_stack=0x7fffffffcc30, frame=..., rank=0, resolve_p=true)
at gdb/gdbtypes.c:2198
#16 0x00000000009e0ded in resolve_dynamic_type_internal (type=0x3e99c50,
addr_stack=0x7fffffffcc30, frame=..., top_level=true)
at gdb/gdbtypes.c:2934
#17 0x00000000009e1079 in resolve_dynamic_type (type=0x3e99c50, valaddr=...,
addr=0, in_frame=0x0) at gdb/gdbtypes.c:2989
#18 0x0000000000488ebc in ada_discrete_type_low_bound (type=0x3e99c50)
at gdb/ada-lang.c:710
#19 0x00000000004eb734 in print_range (type=0x3e99c50, stream=0x30157b0,
bounds_preferred_p=0) at gdb/ada-typeprint.c:156
#20 0x00000000004ebffe in print_array_type (type=0x3e99d10, stream=0x30157b0,
show=1, level=9, flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-typeprint.c:381
#21 0x00000000004eda3c in ada_print_type (type0=0x3e99d10,
varstring=0x401f710 "c_to", stream=0x30157b0, show=1, level=9,
flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-typeprint.c:1015
#22 0x00000000004b4627 in ada_language::print_type (
this=0x2f949b0 <ada_language_defn>, type=0x3e99d10,
varstring=0x401f710 "c_to", stream=0x30157b0, show=1, level=9,
flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-lang.c:13681
#23 0x0000000000f74646 in print_symbol (gdbarch=0x3256270, symbol=0x3e99db0,
depth=9, outfile=0x30157b0) at gdb/symmisc.c:545
#24 0x0000000000f737e6 in dump_symtab_1 (symtab=0x3ddd7e0, outfile=0x30157b0)
at gdb/symmisc.c:313
#25 0x0000000000f73a69 in dump_symtab (symtab=0x3ddd7e0, outfile=0x30157b0)
at gdb/symmisc.c:370
#26 0x0000000000f7420f in maintenance_print_symbols (args=0x0, from_tty=0)
at gdb/symmisc.c:481
#27 0x00000000006c7fde in do_simple_func (args=0x0, from_tty=0, c=0x321e270)
at gdb/cli/cli-decode.c:94
#28 0x00000000006ce65a in cmd_func (cmd=0x321e270, args=0x0, from_tty=0)
at gdb/cli/cli-decode.c:2826
#29 0x0000000001005b78 in execute_command (p=0x3f48fe3 "", from_tty=0)
at gdb/top.c:564
#30 0x0000000000966095 in command_handler (
command=0x3f48fd0 "maint print symbols")
at gdb/event-top.c:613
#31 0x0000000001005141 in read_command_file (stream=0x3011a40)
at gdb/top.c:333
#32 0x00000000006e2a64 in script_from_file (stream=0x3011a40,
file=0x7fffffffe21f "gdb.in")
at gdb/cli/cli-script.c:1705
#33 0x00000000006bb88c in source_script_from_stream (stream=0x3011a40,
file=0x7fffffffe21f "gdb.in", file_to_open=0x7fffffffd760 "gdb.in")
at gdb/cli/cli-cmds.c:706
#34 0x00000000006bba12 in source_script_with_search (
file=0x7fffffffe21f "gdb.in", from_tty=0, search_path=0)
at gdb/cli/cli-cmds.c:751
#35 0x00000000006bbab2 in source_script (file=0x7fffffffe21f "gdb.in",
from_tty=0) at gdb/cli/cli-cmds.c:760
#36 0x0000000000b835cb in catch_command_errors (
command=0x6bba7e <source_script(char const*, int)>,
arg=0x7fffffffe21f "gdb.in", from_tty=0, do_bp_actions=false)
at gdb/main.c:510
#37 0x0000000000b83803 in execute_cmdargs (cmdarg_vec=0x7fffffffd980,
file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd8c8)
at gdb/main.c:606
#38 0x0000000000b84d79 in captured_main_1 (context=0x7fffffffdb90)
at gdb/main.c:1349
#39 0x0000000000b84fe4 in captured_main (context=0x7fffffffdb90)
at gdb/main.c:1372
#40 0x0000000000b85092 in gdb_main (args=0x7fffffffdb90)
at gdb/main.c:1401
#41 0x000000000041a382 in main (argc=9, argv=0x7fffffffdcc8)
at gdb/gdb.c:38
(gdb)
...
The immediate problem is in dwarf_expr_context::fetch_result where we're
calling get_frame_arch:
...
switch (this->m_location)
{
case DWARF_VALUE_REGISTER:
{
gdbarch *f_arch = get_frame_arch (this->m_frame);
...
with a null frame:
...
(gdb) p this->m_frame.is_null ()
$1 = true
(gdb)
...
Fix this by calling ensure_have_frame in dwarf_expr_context::execute_stack_op for
DW_OP_reg<n> and DW_OP_regx, which gets us instead:
...
c_to: array (<>) of character; computed at runtime
...
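For illustration, a minimal, self-contained C++ sketch of the guard pattern follows. It is not the actual GDB change (the real fix calls dwarf_expr_context::ensure_have_frame for DW_OP_reg<n> and DW_OP_regx in execute_stack_op); it only shows the idea that checking for the required frame up front turns the later null-frame assertion failure into a clean, catchable error, which presumably is what lets the bound be reported as "computed at runtime".
```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

/* Stand-in for the expression evaluation context.  m_frame may
   legitimately be null when no frame is available, e.g. when dumping
   symbols with "maint print symbols".  */
struct expr_context
{
  const void *m_frame = nullptr;	/* stand-in for frame_info_ptr */

  /* Mirrors the idea of dwarf_expr_context::ensure_have_frame: reject
     frame-dependent operations early, with a clear error, instead of
     crashing later in fetch_result/get_frame_arch.  */
  void
  ensure_have_frame (const char *op_name) const
  {
    if (m_frame == nullptr)
      throw std::runtime_error (std::string (op_name)
				+ " evaluation requires a frame.");
  }

  /* Stand-in for the DW_OP_reg<n> / DW_OP_regx cases of
     execute_stack_op.  */
  void
  execute_reg_op (int regnum)
  {
    ensure_have_frame ("DW_OP_reg");
    /* ... record a DWARF_VALUE_REGISTER location for REGNUM ...  */
    (void) regnum;
  }
};

int
main ()
{
  expr_context ctx;
  try
    {
      ctx.execute_reg_op (7);	/* like DW_OP_reg7 (edi) with no frame */
    }
  catch (const std::runtime_error &e)
    {
      std::printf ("%s\n", e.what ());
    }
  return 0;
}
```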
Tested on x86_64-linux.
Approved-By: Tom Tromey <[email protected]>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33512
New lane debugging support from scratch, on top of "amd-staging with existing lane support stripped out" branch.
The users/palves/amd-staging-without-lane-support branch is amd-staging with rocgdb's current lane debugging support stripped out.
The users/palves/lane-debugging branch then reapplies all the lane debugging patches, up to the point where a diff against amd-staging is (almost) empty. On top of that, it adds patches adjusting to the user interface changes that we will want to submit upstream.
Eventually, the new patches will be squashed into the older ones so that we don't carry the intermediate behavior in between. It should then be easier to rebase it all on upstream master. Meanwhile, having this based on amd-staging helps with prototyping and discussion, because upstream does not yet support the needed DWARF extensions and other important bits required for accessing variables, unwinding, etc.