WIP: lane debugging support #35
Open
palves wants to merge 55 commits into users/palves/amd-staging-without-lane-support from users/palves/lane-debugging
Conversation
Add some convenience wrappers to rocm-tdep.h. These will be used in the following patches. Change-Id: I8d57404855cc3fa759245ed10e1a2c11bdfa0bdb
gdbarch_active_lanes_mask returns the active SIMD lanes mask of a given thread. Also introduce simd_lanes_mask_t, a typedef for uint64_t. 64 bits are sufficient for all currently known architectures.
Change-Id: Ifbb7761431fb7460268765651cdca361b58bc12d
gdbarch_supported_lanes_count returns the number of lanes supported by a thread. For example, on AMDGPU, if the wavefront size is 32 lanes then this returns 32. If the wavefront size is 64 lanes then this returns 64. Change-Id: I91f9fc5d955a41467d1ae8b1a0a7b1f2105d1e58
gdbarch_used_lanes_count returns the number of lanes in a thread that are actually used by the kernel; this matters because of partial work-groups. For example, on AMDGPU, if the wavefront size is 64 lanes then the supported lanes count is 64. However, if the program allocates 90 work items all in one dimension, then the first wave will have all its lanes used, and the second wave will only use 90-64=26 lanes. It's a bit more tricky than that because work items are allocated in a three-dimensional space, but the gist of it is that there are hardware lanes which aren't used at all by the program. This hook lets core GDB be aware of them so that it can hide them.
Change-Id: I976cfa790416060fdf0008b2d8bfb33b0316c31d
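To make the relationship between these three hooks concrete, here is a minimal sketch of how a consumer might combine them; the parameter lists below are assumptions for illustration, not the declarations added by the patches:

/* Illustration only -- not the actual gdbarch declarations.  */
typedef uint64_t simd_lanes_mask_t;  /* 64 bits covers all currently known targets.  */

static bool
lane_is_active (struct gdbarch *gdbarch, thread_info *tp, int lane)
{
  simd_lanes_mask_t mask = gdbarch_active_lanes_mask (gdbarch, tp);
  return (mask & ((simd_lanes_mask_t) 1 << lane)) != 0;
}

static bool
lane_is_used (struct gdbarch *gdbarch, thread_info *tp, int lane)
{
  /* Lanes at or beyond the used count exist in hardware (up to
     gdbarch_supported_lanes_count) but carry no work item.  */
  return lane < gdbarch_used_lanes_count (gdbarch, tp);
}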
target_lane_to_str is meant to be used similarly to how target_pid_to_str is used to convert a ptid_t to a string, except it converts a lane instead of a thread. E.g., on AMDGPU, we get:
target_pid_to_str => "AMDGPU Thread 1:2:1:3 (0,0,0)/2"
target_lane_to_str => "AMDGPU Lane 1:2:1:3/6 (0,0,0)[4,1,3]"
An interesting thing here is the extracting of work item ids/coordinates from flat ids. See lane_workgroup_pos_string. I made sure that is correct by comparing what GDB prints with the coordinates printed by a HIP program. lane_workgroup_pos_string is a separate function because it will be used in another context in a following patch.
Change-Id: I86e87830038f4001c25781605759666ca403ebaa
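The flat-id-to-coordinates conversion mentioned above boils down to the usual row-major decomposition. The sketch below is only illustrative (the function and parameter names are made up; the real logic lives in lane_workgroup_pos_string), assuming the x dimension varies fastest:

struct workitem_pos { unsigned x, y, z; };

static workitem_pos
flat_id_to_pos (unsigned flat_id, unsigned size_x, unsigned size_y)
{
  workitem_pos pos;
  pos.x = flat_id % size_x;              /* fastest-varying dimension */
  pos.y = (flat_id / size_x) % size_y;
  pos.z = flat_id / (size_x * size_y);
  return pos;
}

For example, with a hypothetical [64,2,4] work-group, flat id 6 maps to [6,0,0] and flat id 70 maps to [6,1,0].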
When this knob is off, it:
- Disables DW_AT_LLVM_lane_pc support, in case the compiler emits
  bogus DW_AT_LLVM_lane_pc info.
- Makes GDB not try to step over divergent code regions.
- Makes GDB not ignore breakpoints that trigger when the execution
  mask is 0.
Change-Id: I8214f738d3541d9115fba102a5501565bd31019d
(These bits were originally written by Intel, though there are
considerable local changes and additions.)
- Teaches GDB about keeping track of each thread's selected lane.
thread_info gains a few fields and methods related to lanes.
- Adds scoped_restore_current_simd_lane, to save/restore a thread's
selected lane.
- Teaches GDB to record and display which lanes were active when a
breakpoint was hit:
Thread 5 "dw2-lane-pc" hit Breakpoint 1, with SIMD lanes [0-63], lane_pc_test (....) at dw2-lane-pc.cc:130
^^^^^^
- That "[0-63]" range shown above is built with a new routine, called
make_ranges_from_sorted_vector.
(I think we could probably make this routine work with a much
cheaper simd_lanes_mask_t bit mask instead of a std::vector<int>,
but I never got around to try that.)
- Teaches GDB to also say which lane is current when saying which
thread is current:
[Switching to thread 5, lane 0 (AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0])]
- Makes GDB evaluate the breakpoint condition for all SIMD lanes which
were active. This is important because breakpoint conditions may
involve symbols, references to lane-specific memory (lds in AMDGPU),
or lane-specific convenience variables (e.g., $_lane, added in a
following patch), for example.
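Here is a rough sketch of what make_ranges_from_sorted_vector does, i.e. collapsing a sorted vector of lane numbers into a compact range string. This is not the actual implementation and the exact formatting may differ:

#include <string>
#include <vector>

static std::string
ranges_from_sorted (const std::vector<int> &v)
{
  std::string out;
  for (size_t i = 0; i < v.size (); )
    {
      /* Extend J while the values stay consecutive.  */
      size_t j = i;
      while (j + 1 < v.size () && v[j + 1] == v[j] + 1)
        j++;
      if (!out.empty ())
        out += ",";
      out += std::to_string (v[i]);
      if (j > i)
        out += "-" + std::to_string (v[j]);
      i = j + 1;
    }
  return out;
}

For the sorted vector {0, 1, ..., 63} this yields "0-63", matching the breakpoint-hit message above.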
There's no way to manually change the selected lane in this patch yet.
That will be added in a follow up patch.
The gdb.rocm/ testsuite is adjusted to match the GDB output changes.
gdb.base/annota1.exp had to be tweaked because the patch has the side
effect of changing the order of a frames-invalid and a
breakpoints-invalid annotation.
Limitations:
- There's no way to set a breakpoint that triggers even if all lanes
are masked out / inactive. You have to disable lane divergence
support with "maint set lane-divergence-support off" if you need
that.
- No MI.
Co-Authored-By: Intel
Change-Id: Ie4ff78fc4d6733de972cfa76c1384b08d35103b4
This adds "info lanes" & "lane" commands, the basic commands that let
you list lanes, and switch between lanes.
Switching to an active lane:
(gdb) lane 0
[Switching to thread 374, lane 0 (AMDGPU Lane 1:1:1:370/0 (92,0,0)[64,0,0])]
#0 bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
46 C_d[i] = __bitextract_u32(A_d[i], 8, 4) + 2;
Switching to an inactive lane (note the warning):
(gdb) lane 1
[Switching to thread 374, lane 1 (AMDGPU Lane 1:1:1:370/1 (92,0,0)[65,0,0])]
warning: Current lane is inactive.
#0 bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
46 C_d[i] = __bitextract_u32(A_d[i], 8, 4) + 2;
(gdb)
Listing lanes:
(gdb) help info lanes
Display currently known lanes.
Usage: info lanes [OPTION]... [ID]...
Options:
-all
All lanes (active, inactive and unused).
-active
Only active lanes.
-inactive
Only inactive lanes.
If ID is given, it is a space-separated list of IDs of lanes to display.
Otherwise, all lanes are displayed.
(gdb)
Unused lanes are lanes that end up not associated with any work-item,
because we ended up with a partial work-group. By default, "info
lanes" hides them. For example, if you launch a kernel with 1 block
and 3 threads per block all in the x axis, then you see only 3 lanes,
even if the hardware has 32 or 64 lanes per wave:
(gdb) info lanes
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
1 A AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
-all shows even unused lanes:
(gdb) info lanes -all
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
1 A AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
3 U AMDGPU Lane 1:1:1:1/3 (0,0,0)[0,0,1] (unused)
....
63 U AMDGPU Lane 1:1:1:1/63 (0,0,0)[0,0,21] (unused)
(gdb)
Inactive lanes show like this:
(gdb) info lanes
Id State Target Id Frame
* 0 A AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
1 I AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] (inactive)
2 A AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:46
If the wave is running, we show this:
(gdb) info lanes
Id State Target Id Frame
* 0 R AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0] (running)
1 R AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0] (running)
2 R AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0] (running)
The "State" column can thus show 4 different states as seen in the
shown above:
A - active lane
I - inactive lane
U - unused lane
R - wave is running
Change-Id: Ifbae436b47f5fa78d58dcaba2a991c5f30373c8c
This commit adds a "lane apply" command, analogous to "thread apply",
but iterates over all lanes of the current wave.
E.g.:
lane apply 1 2 7 4 backtrace Apply 'backtrace' cmd to lanes 1,2,7,4
lane apply 2-7 9 p foo Apply 'p foo' cmd to lanes 2->7 & 9
lane apply all x/i $lane_pc Apply 'x/i $lane_pc' cmd to all lanes.
Note you can use:
(gdb) taa lane apply all CMD
to apply CMD to all lanes of all threads.
Online help shows:
(gdb) help lane apply
Apply a command to a list of lanes.
Usage: lane apply ID... [OPTION]... COMMAND
ID is a space-separated list of IDs of lanes to apply COMMAND on.
Prints lane number and target system's lane id
followed by COMMAND output.
By default, an error raised during the execution of COMMAND
aborts "lane apply".
Options:
-q
Disables printing the thread or lane information.
-c
Print any error raised by COMMAND and continue.
-s
Silently ignore any errors or empty output produced by COMMAND.
-all
All lanes (active, inactive and unused).
-active
Only active lanes.
-inactive
Only inactive lanes.
The -q, -c, -s options are shared with "thread apply". The
-all,-active,-inactive options are shared with "info lanes".
Change-Id: Ifd7d79be4a09daa8d268faf1733bed5cbefacc1c
(gdb) info threads
  Id   Target Id                                           Frame
  1    Thread 0x7ffff66fb880 (LWP 4151298) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
  2    Thread 0x7ffff66fa700 (LWP 4151306) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
  4    Thread 0x7ffff5c7f700 (LWP 4151308) "bit_extract"   0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
* 5    AMDGPU Thread 1:1:1:1 (0,0,0)/0 "bit_extract"       bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) info lane
  Id   State Target Id                              Frame
* 0    A     AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
  1    A     AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
  2    A     AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) thread find \[0,0,0\]
Thread 5, lane 0 has target id 'AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]'
(gdb) thread find \[1,0,0\]
Thread 5, lane 1 has target id 'AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]'
(gdb)
Change-Id: I38cd34702edab47478732bb5e52ff536383adfca
E.g.:
(gdb) lane 0
[Switching to thread 5, lane 0 (AMDGPU Lane 1:2:1:2/0 (0,0,0)[12,1,1])]
...
(gdb) p $_lane
$1 = 0
(gdb) lane 1
[Switching to thread 5, lane 1 (AMDGPU Lane 1:2:1:2/1 (0,0,0)[0,2,1])]
...
(gdb) p $_lane
$2 = 1
$_lane_count shows the number of lanes the current thread uses, taking
partial work-groups into account. E.g., on an AMDGPU with wavefront
size 64 lanes, if all lanes are used:
(gdb) p $_lane_count
$3 = 64
When focused on a CPU thread, we get:
(gdb) p $_lane
$6 = 0
(gdb) p $_lane_count
$7 = 0
$_lane is already described in the manual. $_lane_count is mentioned
there, but in a commented out block.
Change-Id: Iead080cc4872e0be090b1cbcbf3622a15c7ef030
E.g.:
(gdb) info thread 1924
  Id    Target Id                                             Frame
* 1924  AMDGPU Thread 1:1:1:1920 (479,0,0)/3 "bit_extract"    bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_thread_workgroup_pos
$1 = "3"
(gdb) info lane 0
  Id   State Target Id                                     Frame
* 0    A     AMDGPU Lane 1:1:1:1920/0 (479,0,0)[192,0,0]   bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_lane_workgroup_pos
$2 = "[192,0,0]"
After attaching to a process (when we can't read the dispatch info):
(gdb) p $_thread_workgroup_pos
$1 = "?"
(gdb) p $_lane_workgroup_pos
$2 = "[?,?,?]"
And when focused on a CPU thread:
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff66fb880 (LWP 102291))]
#0  0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
78      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p $_lane_workgroup_pos
$3 = ""
(gdb) p $_thread_workgroup_pos
$4 = ""
These are already described in the manual.
Change-Id: I81984ecae92ca76a04c86afca91f0196b2459768
This teaches GDB to automatically step over divergent regions. E.g.,
with this, when you step over an if/then/else, you'll observe the lane
stepping across only one of the "then" or "else" branches, not both.
Conceptually, it's very simple:
- when the thread is stepping, if the current lane becomes inactive,
  then continue stepping until it becomes active, at which point
  resume the normal stepping algorithm (check whether we reached a
  different line, etc.).
Single-stepping the whole divergent range can be slow, as there may be
many instructions to step, e.g., because the inactive code calls
functions. Once we have DW_AT_LLVM_lane_pc support, we'll be able to
speed that up, by setting a breakpoint at the logical lane PC, and
running to it.
As an escape hatch in case this goes wrong, you can disable the
automatic stepping over divergent regions with the
"maint set lane-divergence-support off" command.
Change-Id: Idfc205cc366f5b4d645fd946b577fc303e6f7a10
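As a rough illustration of the check described in the commit above (names such as lane_is_active and current_simd_lane are placeholders, not the patch's actual infrun changes):

/* Illustration only.  Decide whether to keep silently single-stepping
   because the selected lane is currently masked out (divergent).  */
static bool
stepping_through_divergence (struct gdbarch *gdbarch, thread_info *tp)
{
  int lane = current_simd_lane (tp);  /* placeholder for the thread's selected lane */
  /* While the lane is inactive, the wave is executing the other side
     of a branch on behalf of other lanes: don't report a stop, keep
     stepping until the lane becomes active again, then fall back to
     the normal "did we reach a new line?" checks.  */
  return !lane_is_active (gdbarch, tp, lane);
}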
Co-Authored-By: Zoran Zaric <[email protected]> Change-Id: I7aca8548c29b6f07b2233affd03449c0deeb122d
E.g.:
(gdb) info thread 1924
  Id    Target Id                                             Frame
* 1924  AMDGPU Thread 1:1:1:1920 (479,0,0)/3 "bit_extract"    bit_extract_kernel (C_d=<optimized out>, A_d=<optimized out>, N=<optimized out>) at bit_extract.cpp:38
(gdb) p $_dispatch_pos
$1 = "(479,0,0)/3"
After attaching to a process (when we can't read the dispatch info):
(gdb) p $_dispatch_pos
$1 = "(?,?,?)/?"
And when focused on a CPU thread:
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff66fb880 (LWP 102291))]
#0  0x00007ffff6d3c50b in ioctl () at ../sysdeps/unix/syscall-template.S:78
78      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) p $_dispatch_pos
$3 = ""
This is already described in the manual.
Change-Id: I87da4cb6aac27d1176b82dc1948599a27d722202
This adds two testcases:
- gdb.rocm/lane-execution.exp - exercises execution-related scenarios;
- gdb.rocm/lane-info.exp - exercises lane info & search use cases.
Overall, these:
- Test that GDB automatically steps over code regions where the
  current lane is divergent.
- Test breakpoint hits with some lanes inactive. Also test that GDB
  evaluates the breakpoint condition on each lane.
- Test that GDB warns when you select an inactive lane.
- Test $_lane and the "lane" command.
- Test invalid arguments to lane-related commands.
- Test the $_lane_count convenience variable, making sure it takes
  into account unused lanes.
- Test "info lanes", in combination with -all, -active, -inactive,
  and also making sure GDB hides unused lanes by default.
- Test "lane apply", similarly, in combination with -all, -active,
  -inactive, and also making sure GDB hides unused lanes by default.
Bug: https://ontrack-internal.amd.com/browse/SWDEV-302019
Change-Id: I61fcbfe7ef0cad2bcf35342b3577d6a6318edff2
This teaches the MI command parser about a new global --lane switch, similar to the global --thread switch, but for lanes. Change-Id: I6031e3b10420b7b1578fd5741ea95d46596b2591
Change-Id: I45b2c197e787ed0e81f9e13fef666b654c49a8d2
This adds two new attributes to MI *stopped. "lane-id" and "hit-lanes".
"lane-id" indicates which lane GDB switched to. This is
analogous/paired to/with the existing "thread-id" attribute, thus it
is positioned next to it.
"hit-lanes" is printed for breakpoint-hit stops. It indicates which
lanes among the set of active lanes evaluated the breakpoint condition
as true, if there was a condition, or all active lanes if there was
none.
E.g.:
~"[Switching to thread 3, lane 0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]\n"
~"\n"
~"Thread 3 \"mi-lanes\" hit Temporary breakpoint 1, with lanes [0-31], lane_pc_test () at lane-execution.cpp:79\n"
~"79\t if (gid % 2)\t\t\t\t\t/* if_0_cond */\n"
*stopped,reason="breakpoint-hit",disp="del",bkptno="1",hit-lanes="0-31",frame={...},thread-id="3",lane-id="0",stopped-threads="all"
(gdb)
Above you can see that hit-lanes="0-31" is the same as "with lanes
[0-31]" in the CLI text.
Change-Id: I227a09c05ca32054704f118b35b6cfd7a35939d5
When the tracked expression uses locals, GDB saves the thread and
frame with the varobj, so that it can later on switch to the correct
context to re-evaluate the varobj expression. This patch extends that
logic to also save and switch to the right lane.
In addition, makes varobj printing show the varobj's lane, if there's
one, with a new "lane-id" attribute. E.g.:
-var-create var * gid
^done,name="var",numchild="0",value="5",type="unsigned int",thread-id="3",lane-id="5",has_more="0"
(gdb)
Change-Id: I44b307153bc3d2a60875d69ee99ce70f91d76e3a
The previous patch made GDB remember the lane that was current when a
varobj is created, so that GDB can restore that lane context when it
needs to re-evaluate the varobj expression. However, even with that
in place, GDB is still showing the wrong value for the wrong lane.
An example follows. When stopped at a function that looks like this:
__device__ void
lane_pc_test (unsigned gid, const int *in, struct test_struct *out)
{
The "gid" parameter's location is in private_lane memory:
p &gid
&"p &gid\n"
~"$1 = (unsigned int *) private_lane#0x50\n"
^done
Assuming GDB thread 3 is the GPU thread, let's switch to its lane 5:
-thread-select -l 5 3
^done,new-thread-id="3",lane-id="5",frame={...}
(gdb)
And create a varobj called "var", inspecting "gid":
-var-create var * gid
^done,name="var",numchild="0",value="5",type="unsigned int",thread-id="3",lane-id="5",has_more="0"
(gdb)
Note above we see value=5.
Now let's switch to a different lane, which has a different value for
"gid". Here, lane 0:
-thread-select -l 0 3
^done,new-thread-id="3",lane-id="0",frame={...}
(gdb)
Let's confirm "gid"'s value with -data-evaluate-expression, i.e.,
without the varobj machinery:
-data-evaluate-expression gid
^done,value="0"
(gdb)
Now let's update the varobj. GDB internally switches to lane 5, the
varobj's lane, reevaluates the expression, and saves the resulting
value within the varobj. The value should still be 5.
-var-update var
^done,changelist=[{name="var",in_scope="true",type_changed="false",has_more="0"}]
(gdb)
However, we see that GDB gives back an incorrect value:
-var-evaluate-expression var
^done,value="0"
(gdb)
It is printing the value of "gid" for the current lane, lane 0,
instead of the value for lane 5.
What happens is that when -var-update evaluates the expression, even
though value_of_root_1 switches to the thread/lane of the varobj, in
this case lane 5, the resulting value is a lazy memory lval value,
with address private_lane#0x50, thus an address that depends on the
current lane. After creating the lazy value, value_of_root_1 restores
back the current lane to lane 0.
varobj_update continues, and calls install_new_value. Since the value
is lazy, install_new_value fetches it. But since we are now focused
on lane 0, and the value's address is from the "private_lane" address
space, we end up with the result for that lane, instead of for lane 5.
The fix is to make install_new_value switch to the right thread/lane
context before fetching the lazy value.
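A minimal sketch of the shape of that fix, assuming a hypothetical switch_to_varobj_thread_and_lane helper (scoped_restore_current_simd_lane comes from earlier in this series):

/* Sketch only: fetch a lazy varobj value in the varobj's own
   thread/lane context, so a lane-relative address such as
   private_lane#0x50 is read against the varobj's lane (5 here), not
   the currently selected lane (0).  */
static void
fetch_lazy_in_varobj_context (struct varobj *var, struct value *value)
{
  scoped_restore_current_thread restore_thread;
  scoped_restore_current_simd_lane restore_lane;
  switch_to_varobj_thread_and_lane (var);   /* hypothetical helper */
  value_fetch_lazy (value);
}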
Change-Id: I916f9075145ac5ecdf17b120b5672c4de597f1df
This adds a new "-l LANE" option to -thread-select. This lets the frontend switch the current lane. Change-Id: I9345260ecf27afcabf5db62ca80040598741c6e2
Before:
-thread-select -l 2 3
^done,new-thread-id="3",frame={level="0",addr=...}
After:
-thread-select -l 2 3
^done,new-thread-id="3",lane-id="2",frame={level="0",addr=...}
-thread-select 3
^done,new-thread-id="3",lane-id="2",frame={level="0",addr=...}
Note the else branch can be simplified because there we know we're
handling a non-MI uiout.
Change-Id: I45bcf9c0a4782392634134baf5dbf08a736b30af
Both the "thread" and "lane" CLI commands result in a =thread-selected
MI async event emitted. Augment that event to include the selected
lane in a new "lane-id" attribute. E.g.:
lane 1
&"lane 1\n"
~"[Switching to thread 3, lane 1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0])]\n"
~"#0 lane_pc_test (gid=1, in=0x7ffff5801000, out=0x7ffff5800000) at lane-execution.cpp:79\n"
~"79\t if (gid % 2)\t\t\t\t\t/* if_0_cond */\n"
=thread-selected,id="3",lane-id="1",frame={level="0",addr="0x00007ffff580b8b4",....}
^done
(gdb)
Change-Id: Ic607380ddf0ca8228c7b0bf7ba265617d25f83da
This adds a new "-lane-info" command, MI equivalent of the "info
lanes" command, inspired by the existing "-thread-info" MI command,
but for lanes. Unlike the CLI's "info lanes", "-lane-info" doesn't
take any option other than the IDs of the lanes to list (like
-thread-info), thus it always lists all lanes of all states.
E.g. (simplified):
-lane-info
^done,lanes=[
{id="0",state="A",target-id="AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0]",frame={level="0",addr=...}},
{id="1",state="A",target-id="AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0]",frame={level="0",addr=...}},
...
{id="63",state="A",target-id="AMDGPU Lane 1:1:1:1/63 (0,0,0)[63,0,0]",frame={level="0",addr=...}}
]
(gdb)
Note this prints the same frame for each lane, but that will change in
the future when we add support for lane divergence.
The patch is actually quite small, because the lane printing code had
already been written, from the get-go, to print to uiout, assuming
it'd be reused by MI.
print_lane_row is tweaked to no longer print a "state"
running|stopped attribute, because that (thread-wide) state can be
inferred from the lane's state instead. E.g., the CLI shows this when
the wave is stopped:
(gdb) info lanes 0
Id State Target Id Frame
* 0 A AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0] kernel () at kernel.cpp:79
And this when the wave is running:
(gdb) info lanes 0
Id State Target Id Frame
* 0 R AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0] (running)
The "(running)" part is redandant with the "R" state column. Getting
rid of it simplifies things, as this way we only have one MI "state"
attribute. Otherwise, we'd have to come up with different attribute
names for both state "kinds".
Change-Id: Ia1b3b3e211b12c57486b4a8df279bfad18d218ea
This adds testcases exercising the new MI lane debugging features.
. -lane-info
. -thread-select -l LANE
. =thread-selected lane-id attribute
. "lane-id" and "hit-lanes" attributes in *stopped
. - global --lane option
. lane-awareness in varobjs
Change-Id: Ic122f5e34718a7b9d9db46fcc4acfc6acbdae011
This adds documentation for the new MI lane debugging features.
. -lane-info
. -thread-select -l LANE
. =thread-selected lane-id attribute
. "lane-id" and "hit-lanes" attributes in *stopped
. - global --lane option
. lane-awareness in varobjs
Change-Id: I880e083c3351fc01ca93251ddf833945ea37fa85
GDB's dcache is flushed whenever we switch threads. However, this is not safe enough for all address spaces considering that some of them might be local to a lane. This patch makes GDB flush the dcache if the lane changes as well. Change-Id: Iaeded6a0b04ee200ac8bab9b6c9ddea83b8248d1
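A minimal sketch of the idea, with the bookkeeping names invented for illustration (dcache_invalidate is GDB's existing flush entry point):

/* Flush the dcache whenever the (thread, lane) focus changes, since
   some address spaces are local to a lane.  Illustration only.  */
static ptid_t last_flushed_ptid;
static int last_flushed_lane = -1;

static void
maybe_flush_dcache_for_lane (DCACHE *dcache, ptid_t ptid, int lane)
{
  if (ptid != last_flushed_ptid || lane != last_flushed_lane)
    {
      dcache_invalidate (dcache);
      last_flushed_ptid = ptid;
      last_flushed_lane = lane;
    }
}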
Not meant for upstream along the base lane support. It will go with the whole "finish" support as a single feature. Change-Id: Idc56239c10d2f7d4db98c45aadec9172f947b5f5
Not meant for upstream along the base lane support. It will go with the whole watchpoints support as a single feature, which depends on address spaces support. Change-Id: I4f3a6e9f525aea31c3be0fb0ef06829791ad8a76
467f8f3 to 60b44af
Change-Id: I21f4be049e0815c05918cf5b54f40ab460a49584
60b44af to 552d471
The breakpoints code passes around 'int inferior, int thread, int
task', as arguments and as structure fields all over the place. This
makes it difficult to add another kind of object breakpoints can be
specific to. This patch helps with that by adding a new struct
bp_specificity, which aggregates inferior, thread, task, and passing
that around instead. To add another kind of object breakpoints can be
specific to in the future, we save touching many function signatures
by instead adding a new field to bp_specificity.
Change-Id: I30a56c70f5ba5e76e2e8bcda5bb210e105be4a57
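The aggregate might look roughly like the following; the field types and defaults are assumptions for illustration, not copied from the patch:

struct bp_specificity
{
  /* Breakpoint is specific to this inferior/thread/task, or -1 for
     "not specific".  */
  int inferior = -1;
  int thread = -1;
  int task = -1;
  /* A future kind of specificity (e.g. a lane) becomes one more field
     here instead of another parameter threaded through many
     signatures.  */
};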
describe_other_breakpoints is extern, but nothing uses describe_other_breakpoints outside breakpoint.c. Make it static. Change-Id: I47713f72a952a46c6e74c999ea49802d0be659f4
Currently:
(gdb) b func
Breakpoint 1 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoint 1 also set at pc ADDR.
Breakpoint 2 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (all threads) and 2 (all threads) also set at pc ADDR.
Breakpoint 3 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1, 2 and 3 (thread 1) also set at pc ADDR.
Breakpoint 4 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1, 2, 3 (thread 1) and 4 also set at pc ADDR.
Breakpoint 5 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (all threads), 2 (all threads), 3 (thread 1), 4 (all threads) and 5 (all threads) also set at pc ADDR.
Breakpoint 6 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoints 1, 2, 3 (thread 1), 4, 5 and 6 (thread 1) also set at pc ADDR.
Breakpoint 7 at ADDR: file main.cpp, line 10.
Observations on the above:
- Breakpoints 4 and 5 in the last breakpoint set above don't have a
"(inferior N)" note.
- We say "(all threads)", even if the breakpoint is inferior specific.
- The "(all threads)" note only appears when the new breakpoint is
thread specific. That makes sense. However, if we fix this to
handle inferior-specific breakpoints, we get to wonder what to
print in this scenario:
(gdb) b func inferior 1
Note: breakpoint 1 (all threads) also set at ...
^^^^^^^^^^^
A patch later in the series will add support for lane-specific
breakpoints, so should you get something like this? :
(gdb) b func lane 1
Note: breakpoint 1 (all threads) also set at ...
^^^^^^^^^^^
I think it's better to say "not specific" instead, so that the
string doesn't depend on the specificity kind of the new
breakpoint.
So this patch changes the above example to output this:
(gdb) b func
Breakpoint 1 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoint 1 also set at pc ADDR.
Breakpoint 2 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (not specific) and 2 (not specific) also set at pc ADDR.
Breakpoint 3 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1 (not specific), 2 (not specific) and 3 (thread 1) also set at pc ADDR.
Breakpoint 4 at ADDR: file main.cpp, line 10.
(gdb) b func inferior 1
Note: breakpoints 1 (not specific), 2 (not specific), 3 (thread 1) and 4 (inferior 1) also set at pc ADDR.
Breakpoint 5 at ADDR: file main.cpp, line 10.
(gdb) b func thread 1
Note: breakpoints 1 (not specific), 2 (not specific), 3 (thread 1), 4 (inferior 1) and 5 (inferior 1) also set at pc ADDR.
Breakpoint 6 at ADDR: file main.cpp, line 10.
(gdb) b func
Note: breakpoints 1, 2, 3 (thread 1), 4 (inferior 1), 5 (inferior 1) and 6 (thread 1) also set at pc ADDR.
Breakpoint 7 at ADDR: file main.cpp, line 10.
(gdb)
Change-Id: I09a2c70f8ca7c3ed579a2b56d38dd46c41ec010f
Change-Id: Ia156850dae3c73f67f6f4b666380fa444aa57cc7
get_positive_number_trailer can also return 0. Rename it to get_non_negative_number_trailer so that it's clearer. Change-Id: Icd4df329e7e4d359743e0e6a701f8b200dcef928
So that we can distinguish return value of 0 from junk return. Needed later in the series, when parsing lane numbers, which can be 0. Currently, calling get_number_trailer on "123xyz" returns 0, which all callers assume means error. Change-Id: If0708d47f8554ee178217348516b89b39a31abdc
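One way to make that distinction (purely illustrative; this is not the signature the patch actually uses) is to report success separately from the parsed value:

#include <cstdlib>
#include <optional>

/* Parse a non-negative number ending at TRAILER or at end of string.
   Returns the value on success (0 is a valid result, e.g. lane 0),
   and std::nullopt on junk such as "123xyz".  */
static std::optional<int>
parse_number_trailer (const char **pp, char trailer)
{
  char *end;
  long val = std::strtol (*pp, &end, 10);
  if (end == *pp || (*end != trailer && *end != '\0') || val < 0)
    return std::nullopt;
  *pp = end;
  return (int) val;
}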
WIP, quickly whipped up. Need to handle lane ranges too, like
"lane apply 1.2.0-32". Testcases. Documentation. Etc.
Bug: SWDEV-485608
Change-Id: I3f71157dbab8e0b9980d462221cb44f6e8b26eff
This commit adds lane-specific breakpoints. WIP. A bit rough around
the edges but sufficient for basic testing of the UI.
(gdb) thread
[Current thread is 6, lane 0 (AMDGPU Lane 1:1:1:1/0 (0,0,0)[0,0,0])]
(gdb) b bit_extract_kernel lane 3        # lane 3 of current thread
Breakpoint 6 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 50.4     # explicit thread of current inferior
Note: breakpoint 6 (thread 6, lane 3) also set at pc 0x7ffff6085514.
Breakpoint 7 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 1.50.5   # explicit thread, explicit inferior
Note: breakpoints 6 (thread 6, lane 3) and 7 (thread 50, lane 4) also set at pc 0x7ffff6085514.
Breakpoint 8 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 2.50.5   # explicit thread, explicit (non-existent) inferior
No inferior number '2'
(gdb) b bit_extract_kernel lane 1.5      # explicit lane of a CPU thread?
Note: breakpoints 6 (thread 6, lane 3), 7 (thread 50, lane 4) and 8 (thread 50, lane 5) also set at pc 0x7ffff6085514.
Breakpoint 9 at 0x7ffff6085514: file /home/pedro/rocm/bit_extract/bit_extract.cpp, line 62.
(gdb) b bit_extract_kernel lane 10000.5  # non-existent thread.
Unknown thread 10000.
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
6       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 6, lane 3
7       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 50, lane 4
8       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 50, lane 5
9       breakpoint     keep y   0x00007ffff6085514 in bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long) at /home/pedro/rocm/bit_extract/bit_extract.cpp:62
        stop only in thread 1, lane 5
(gdb)
Missing testcases, documentation, fix hacks, etc.
Bug: SWDEV-485608
Change-Id: Id8f27a4dc89133565dfc60d27e67329e23fa977b
…d lane IDs
Note: this is a WIP prototype:
- IWBN to rewrite the TID parser using the same mechanism of this
new lane IDs parser.
~~~~
Currently in ROCgdb, both "info lanes" and "lane apply" iterate over
the lanes of the current thread. This commit makes them iterate over
all the lanes of all threads, like "info threads" and "thread apply"
iterate over all threads of all inferiors, instead, as a more natural
fit.
E.g., currently in ROCgdb:
- "info lanes" list all lanes of the current thread
- "info lanes 4-6" list lanes 4-6 of the current thread
- thread apply all \
info lanes list all lanes of all threads of all inferiors
- thread apply 5 \
info lanes list all lanes of thread 5 of the current inferior
This commit changes to this:
- "info lanes" list all lanes of all threads of all inferiors
- "info lanes 4-6" list lanes 4-6 of the current thread
- "info lanes *" list all lanes of the current thread
- "info lanes 5.*" list all lanes of thread 5 of the current inferior
- "info lanes 2.5.*" list all lanes of inferior 2, thread 5
This also works:
- "info lanes *.*" list all lanes of all threads of the current inferior
- "info lanes 1-6.2-4" list lanes 2 to 4 of threads 1 to 6 of the current inferior
- "info lanes *.2-4" list lanes 2 to 4 of all threads of the current inferior
Same syntax is used for "lane apply LANE_ID_LIST".
"lane apply all" now walks all lanes of all threads.
Change-Id: I038c8a64f9c5c1e01130e0d89068447c4a58c9e6
Change-Id: I2114ff267cc124255ff496ab054f2d404437db68
This implements the changes discussed with Intel.
Running to breakpoint:
Starting program: /home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffeac00640 (LWP 1653959)]
[New Thread 0x7fffea200640 (LWP 1653960)]
[Thread 0x7fffea200640 (LWP 1653960) exited]
[New Thread 0x7fffe8a00640 (LWP 1654059)]
[Thread 0x7fffe8a00640 (LWP 1654059) exited]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
Thread 6 "meeting" hit Breakpoint 1.10, with lanes [0-1], kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
(Should the above print a "Switching to thread" line too?)
Querying current state:
(gdb) inferior
[Current inferior is 1 [process 1653956] (/home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting)]
(gdb) thread
[Current thread is 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
(gdb) lane
[Current lane is 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
Changing state:
(gdb) inferior 1
[Switching to inferior 1 [process 1653956] (/home/pedro/rocm/gdb/build/gdb/testsuite/outputs/gdb.rocm/meeting/meeting)]
[Switching to thread 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
GDB was already printing the "Switching to thread" part, so I made it
print all the finer switching (thread->lane) too.
(gdb) thread 6
[Switching to thread 6 (AMDGPU Wave 1:2:1:1 (0,0,0)/0)]
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
Here I followed the "inferior" command's logic. Note: "Switching to
lane" only appears if the thread has lanes.
(gdb) lane 0
[Switching to lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0])]
#0  kernel () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:34
34          while (1)
(gdb)
Change-Id: I3e8783a687d4ec4ca940066a0280ae9d6e3184cc
…ption
This adds "info lanes -unused" for completeness. And also, makes it
possible to combine different state filters, like e.g.:
(gdb) info lanes -active -unused
This prints both active and unused lanes. Currently it would print no
lanes. Likewise "lane apply".
Change-Id: If7215a243532aaa15790bc838e55f0172df7fdeb
(gdb) info threads
  Id   Lane  Target Id                                       Frame
  1          Thread 0x7ffff63d4180 (LWP 2627482) "meeting"   0x00007fffeb26a1b1 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
  2          Thread 0x7fffeac00640 (LWP 2627485) "meeting"   __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  5          Thread 0x7ffff629f640 (LWP 2627488) "meeting"   __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6    1     AMDGPU Wave 1:2:1:1 (0,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7    2     AMDGPU Wave 1:2:1:2 (1,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8    0     AMDGPU Wave 1:2:1:3 (2,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9    0     AMDGPU Wave 1:2:1:4 (3,0,0)/0 "meeting"         foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
(gdb)
Change-Id: I0a659858c2033b2ad8de802aee2b951eb40c4e17
(gdb) info threads
  Id   Target Id                                       Lane  Frame
  1    Thread 0x7ffff63d4180 (LWP 2625761) "meeting"         0x00007fffeb26a1c7 in ?? () from /opt/rocm/lib/libhsa-runtime64.so.1
  2    Thread 0x7fffeac00640 (LWP 2625764) "meeting"         __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
  5    Thread 0x7ffff629f640 (LWP 2625786) "meeting"         __GI___ioctl (fd=3, request=3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6    AMDGPU Wave 1:2:1:1 (0,0,0)/0 "meeting"         1     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7    AMDGPU Wave 1:2:1:2 (1,0,0)/0 "meeting"         2     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8    AMDGPU Wave 1:2:1:3 (2,0,0)/0 "meeting"         0     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9    AMDGPU Wave 1:2:1:4 (3,0,0)/0 "meeting"         0     foo () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Change-Id: I2c89fbea729a9e6b0a49ce1496cbd747561606c2
Change-Id: If73cee485c37c8b7d8363631cb4f749def11680a
This replaces rocgdb's warning when switching to inactive lanes with
making gdb show " = <lane inactive>".
(gdb) lane apply 0-1 frame
Lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]):
24 }
Lane 6.1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]):
24 }
(gdb) p arg1
lane inactive # an actual error. making this be smarter and show:
# $1 = <lane inactive>
# would be interesting.
(gdb) lane apply 0-1 info args
Lane 6.0 (AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]):
arg1 = <lane inactive>
arg2 = <lane inactive>
Lane 6.1 (AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]):
arg1 = 1
arg2 = 2
Notes:
- Throwing from push_lane isn't sufficient, since that isn't used
for local variables in lane memory. I had to make the dwarf
evaluator throw an error when dereferencing the lane address space
too. rocgdb downstream already has an ensure_have_simd_lane
function called wherever the DWARF evaluator needs a lane, so I
added the !inactive check there.
- The predicate I am thinking of using to suppress the error will be
if we have a lane PC expression for the current frame. I can't
think of a case where this would go wrong, but admittedly I didn't think
all that much about it. I didn't try combining this work with the
lane divergence support code, as it would require rebasing the
lane divergence branch on top of this, which I think is more than
I can chew atm.
Change-Id: I783abed49841cbd38851bccfb4296fb6b7ff31d6
Change-Id: Ie4d358e66b9730342644c8db246858a1422030e5
This reverts:
From fc59b01 Mon Sep 17 00:00:00 2001
From: Pedro Alves <[email protected]>
Date: Sat, 17 Oct 2020 00:54:53 +0100
Subject: [PATCH] Make "thread find" hit lane target IDs as well
Change-Id: I1dffdf80d94b823f0f49df1a0dfe7cea05f34ffe
As a counterpart to "thread find: (gdb) thread find 0,0,0 Thread 6 has target id 'AMDGPU Wave 1:2:1:1 (0,0,0)/0' This commit adds a new "lane find" command: (gdb) lane find (0,0,0) Lane 6.0 has target id 'AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]' Lane 6.1 has target id 'AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]' Lane 6.2 has target id 'AMDGPU Lane 1:2:1:1/2 (0,0,0)[2,0,0]' Lane 6.3 has target id 'AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]' (gdb) lane find \[0,0,0\] Lane 6.0 has target id 'AMDGPU Lane 1:2:1:1/0 (0,0,0)[0,0,0]' Lane 7.0 has target id 'AMDGPU Lane 1:2:1:2/0 (1,0,0)[0,0,0]' Lane 8.0 has target id 'AMDGPU Lane 1:2:1:3/0 (2,0,0)[0,0,0]' Lane 9.0 has target id 'AMDGPU Lane 1:2:1:4/0 (3,0,0)[0,0,0]' TODO: - support options like "lane find -inactive/-active/-all" ? - docs, testcases, etc. Change-Id: I5eca44bdcc9ae0e358355619f3dd1d1b2c2d68d1
The following patches will introduce filtering options like:
info threads -if EXPRESSION
info lanes -if EXPRESSION
thread apply ID_LIST|all -if EXPRESSION
lane apply ID_LIST|all -if EXPRESSION
EXPRESSION is a string, which raises the question of -- how can GDB
tell where the expression ends, and where the next option starts?
E.g., with:
info threads -if foo -bar
is that a "foo minus bar" subtraction, or a "-bar" option?
We currently support string options. If we make EXPRESSION be a
string option, then the user resolved the ambiguity by quoting
EXPRESSION.
However, I think it is nicer for the user to not quote. Particularly,
the expression may use quoted strings, for example, which then either
requires escaping, or playing with single vs double quotes:
info threads -if "foo(\"str\")"
info threads -if 'foo("str")'
Instead, I propose we introduce a new var_expression option type, and
let the user resolve the ambiguity by wrapping the expression in
parentheses. E.g.:
info threads -if (foo > 2)
If the expression does not have spaces, you can do:
info threads -if foo
I actually left code in place to support quoting the expression too,
in which case escapes are processed:
(gdb) info lanes -if "$_lane == 3"
Id State Target Id Frame
6.3 A AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
7.3 A AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
8.3 A AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
9.3 A AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0] foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
I'm still thinking whether to keep the quoted support.
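For illustration, delimiting an unquoted parenthesized expression is essentially a scan to the matching close paren. The sketch below is not the actual var_expression parsing code and omits error handling for unbalanced input:

#include <string>

/* Given ARGS pointing at the opening '(', return the expression text
   without the outer parentheses and advance ARGS past the closing ')'.  */
static std::string
extract_paren_expression (const char *&args)
{
  const char *start = args;
  int depth = 0;
  do
    {
      if (*args == '(')
        depth++;
      else if (*args == ')')
        depth--;
      args++;
    }
  while (depth > 0 && *args != '\0');
  return std::string (start + 1, args - 1);
}

E.g., for "($_lane >= 2 && $_lane < 5) -active" this returns "$_lane >= 2 && $_lane < 5" and leaves ARGS pointing at " -active".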
I made TAB completion work, which is quite nice.
TODO:
- Take a look at how "maint test-options -expression EXPR" should behave.
- Docs, test, etc.
Change-Id: I7d3fdd9e82f2bfe34ee342f803728027be037b28
If the expression does not have spaces, you can do:
(gdb) info lanes -if $_lane -active
  Id    State Target Id                              Frame
* 6.1   A     AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.1   A     AMDGPU Lane 1:2:1:2/1 (1,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.1   A     AMDGPU Lane 1:2:1:3/1 (2,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.1   A     AMDGPU Lane 1:2:1:4/1 (3,0,0)[1,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Otherwise, if the expression does have spaces, you can use parens to
delimit it:
(gdb) info lanes -if ($_lane >= 2 && $_lane < 5) -active
  Id    State Target Id                              Frame
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
Another example, using frame arguments:
(gdb) info lanes -if (x==1)
  Id    State Target Id                              Frame
* 6.1   A     AMDGPU Lane 1:2:1:1/1 (0,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  7.1   A     AMDGPU Lane 1:2:1:2/1 (1,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  8.1   A     AMDGPU Lane 1:2:1:3/1 (2,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
  9.1   A     AMDGPU Lane 1:2:1:4/1 (3,0,0)[1,0,0]   foo (arg1=1, arg2=2, x=1) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:33
Lanes for which the filter failed to evaluate are skipped.
Alternatively, you can also quote the expression, in which case
escapes are processed:
(gdb) info lanes -if "$_lane == 3"
  Id    State Target Id                              Frame
  6.3   A     AMDGPU Lane 1:2:1:1/3 (0,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  7.3   A     AMDGPU Lane 1:2:1:2/3 (1,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  8.3   A     AMDGPU Lane 1:2:1:3/3 (2,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
  9.3   A     AMDGPU Lane 1:2:1:4/3 (3,0,0)[3,0,0]   foo (arg1=1, arg2=2) at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.rocm/meeting.cpp:24
I'm still thinking whether to keep the quoted support.
"lane apply" accepts the same option.
TODO:
- Take a look at how "maint test-options -expression EXPR" should behave.
- Docs, test, etc.
Change-Id: I7d3fdd9e82f2bfe34ee342f803728027be037b28
After the previous patch, doing the same to the threads commands is easy. Change-Id: If1d8aba6b2b7db19654102f31aa57748fbaae30b
552d471 to
1140b0b
Compare
rahulc-gh pushed a commit that referenced this pull request on Jun 25, 2025
…rget_breakpoint::check_status
ROCgdb handles target events very slowly when running a test case like
this, where a breakpoint is preset on HipTest::vectorADD:
for (int i=0; i < numDevices; ++i) {
HIPCHECK(hipSetDevice(i));
hipLaunchKernelGGL(HipTest::vectorADD, dim3(blocks), dim3(threadsPerBlock), 0, stream[i],
static_cast<const int*>(A_d[i]), static_cast<const int*>(B_d[i]), C_d[i], N);
}
What happens is:
- A kernel is launched
- The internal runtime breakpoint is hit during the second
hipLaunchKernelGGL call, which causes
amd_dbgapi_target_breakpoint::check_status to be called
- Meanwhile, all waves of the kernel hit the breakpoint on vectorADD
- amd_dbgapi_target_breakpoint::check_status calls process_event_queue,
which pulls the thousands of breakpoint hit events from the kernel
- As part of handling the breakpoint hit events, we write the PC of the
waves that stopped to decrement it. Because the forward progress
requirement is not disabled, this causes a suspend/resume of the
queue each time, which is time-consuming.
The stack trace where this all happens is:
#32 0x00007ffff6b9abda in amd_dbgapi_write_register (wave_id=..., register_id=..., offset=0, value_size=8, value=0x7fffea9fdcc0) at /home/smarchi/src/amd-dbgapi/src/register.cpp:587
#33 0x00005555588c0bed in amd_dbgapi_target::store_registers (this=0x55555c7b1d20 <the_amd_dbgapi_target>, regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2504
#34 0x000055555a5186a1 in target_store_registers (regcache=0x507000002240, regno=470) at /home/smarchi/src/wt/amd/gdb/target.c:3973
#35 0x0000555559fab831 in regcache::raw_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:890
#36 0x0000555559fabd2b in regcache::cooked_write (this=0x507000002240, regnum=470, src=...) at /home/smarchi/src/wt/amd/gdb/regcache.c:915
#37 0x0000555559fc3ca5 in regcache::cooked_write<unsigned long, void> (this=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:850
#38 0x0000555559fab09a in regcache_cooked_write_unsigned (regcache=0x507000002240, regnum=470, val=140737323456768) at /home/smarchi/src/wt/amd/gdb/regcache.c:858
#39 0x0000555559fb0678 in regcache_write_pc (regcache=0x507000002240, pc=0x7ffff62bd900) at /home/smarchi/src/wt/amd/gdb/regcache.c:1460
#40 0x00005555588bb37d in process_one_event (event_id=..., event_kind=AMD_DBGAPI_EVENT_KIND_WAVE_STOP) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1873
#41 0x00005555588bbf7b in process_event_queue (process_id=..., until_event_kind=AMD_DBGAPI_EVENT_KIND_BREAKPOINT_RESUME) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:2006
#42 0x00005555588b1aca in amd_dbgapi_target_breakpoint::check_status (this=0x511000140900, bs=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:890
#43 0x0000555558c50080 in bpstat_stop_status (aspace=0x5070000061b0, bp_addr=0x7fffed0b9ab0, thread=0x518000026c80, ws=..., stop_chain=0x50600014ed00) at /home/smarchi/src/wt/amd/gdb/breakpoint.c:6126
#44 0x000055555984f4ff in handle_signal_stop (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:7169
#45 0x000055555984b889 in handle_inferior_event (ecs=0x7fffeaa40ef0) at /home/smarchi/src/wt/amd/gdb/infrun.c:6621
#46 0x000055555983eab6 in fetch_inferior_event () at /home/smarchi/src/wt/amd/gdb/infrun.c:4750
#47 0x00005555597caa5f in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/wt/amd/gdb/inf-loop.c:42
#48 0x00005555588b838e in handle_target_event (client_data=0x0) at /home/smarchi/src/wt/amd/gdb/amd-dbgapi-target.c:1513
Fix that performance problem by disabling the forward progress
requirement in amd_dbgapi_target_breakpoint::check_status, before
calling process_event_queue, so that we can process all events
efficiently.
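Sketch of the shape of the fix, under the assumption of a scoped wrapper around the existing require_forward_progress mechanism; both the wrapper and the bool signature shown here are guesses for illustration:

/* Hypothetical RAII helper: drop the forward-progress requirement for
   the duration of a scope, so the whole batch of wave-stop events
   (and their PC adjustments) is handled under a single queue
   suspend/resume instead of one per event.  */
struct scoped_no_forward_progress
{
  scoped_no_forward_progress () { require_forward_progress (false); }
  ~scoped_no_forward_progress () { require_forward_progress (true); }
};

check_status would then instantiate this before calling process_event_queue.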
Since the same performance problem could theoretically happen any time
process_event_queue is called with forward progress requirement enabled,
add an assert to ensure that forward progress requirement is disabled
when process_event_queue is invoked. This makes it necessary to add a
require_forward_progress call to amd_dbgapi_finalize_core_attach. It
looks a bit strange, since core files don't have execution, but it
doesn't hurt.
Add a test that replicates this scenario. The test launches a kernel
that hits a breakpoint (with an always false condition) repeatedly.
Meanwhile, the host process loads and unloads a code object, causing
check_status to be called.
Bug: SWDEV-482511
Change-Id: Ida86340d679e6bd8462712953458c07ba3fd49ec
Approved-by: Lancelot Six <[email protected]>
rocm-ci pushed a commit that referenced this pull request on Jul 9, 2025
…rget_breakpoint::check_status
(cherry picked from commit bb7c679)
rahulc-gh pushed a commit that referenced this pull request on Oct 7, 2025
PR gdb/33512 reports an assertion failure in test-case
gdb.ada/access_to_packed_array.exp on i386-linux:
...
(gdb) maint print symbols
gdb/frame.c:3400: internal-error: reinflate: \
Assertion `m_cached_level >= -1' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) FAIL: $exp: \
maint print symbols (GDB internal error)
...
I haven't been able to reproduce the failure by running the test-case on
x86_64-linux with target board unix/-m32, but I'm able to reproduce on
x86_64-linux by using the exec attached to the PR:
...
$ cat gdb.in
file foo
maint expand-symtabs
maint print symbols
$ gdb -q -batch -ex "set trace-commands on" -x gdb.in
...
c_to: array (gdb/frame.c:3395: internal-error: reinflate: \
Assertion `m_cached_level >= -1' failed.
...
The problem happens when trying to print variable c_to:
...
<4><f227>: Abbrev Number: 3 (DW_TAG_variable)
<f228> DW_AT_name : c_to
<f230> DW_AT_type : <0xf214>
...
with type:
...
<4><f214>: Abbrev Number: 7 (DW_TAG_array_type)
<f215> DW_AT_type : <0x9f39>
<5><f21d>: Abbrev Number: 12 (DW_TAG_subrange_type)
<f21e> DW_AT_type : <0x9d6c>
<f222> DW_AT_upper_bound : <0xf209>
...
with upper bound:
...
<4><f209>: Abbrev Number: 89 (DW_TAG_variable)
<f20a> DW_AT_name : system__os_lib__copy_file__copy_to__TTc_toSP1___U
<f20e> DW_AT_type : <0x9d6c>
<f212> DW_AT_artificial : 1
<f212> DW_AT_location : 1 byte block: 57 (DW_OP_reg7 (edi))
...
The backtrace at the point of the assertion failure is:
...
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff62a8e7f in __pthread_kill_internal (signo=6,
threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff6257842 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/posix/raise.c:26
#3 0x00007ffff623f5cf in __GI_abort () at abort.c:79
#4 0x00000000010e7ac6 in dump_core () at gdb/utils.c:223
#5 0x00000000010e81b8 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (
problem=0x2ceb0c0 <internal_error_problem>,
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.", ap=0x7fffffffc3c0)
at gdb/utils.c:475
#6 0x00000000010e82ac in internal_verror (
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.", ap=0x7fffffffc3c0)
at gdb/utils.c:501
#7 0x00000000019be79f in internal_error_loc (
file=0x1ad5a90 "gdb/frame.c", line=3395,
fmt=0x1ad5a08 "%s: Assertion `%s' failed.")
at gdbsupport/errors.cc:57
#8 0x00000000009b5c16 in frame_info_ptr::reinflate (this=0x7fffffffc878)
at gdb/frame.c:3395
#9 0x00000000009b66f9 in frame_info_ptr::operator-> (this=0x7fffffffc878)
at gdb/frame.h:290
#10 0x00000000009b4bd5 in get_frame_arch (this_frame=...)
at gdb/frame.c:3075
#11 0x000000000081dd89 in dwarf_expr_context::fetch_result (
this=0x7fffffffc810, type=0x410d600, subobj_type=0x410d600,
subobj_offset=0, as_lval=true)
at gdb/dwarf2/expr.c:1006
#12 0x000000000081e2ef in dwarf_expr_context::evaluate (this=0x7fffffffc810,
addr=0x7ffff459ce6b "W\aF\003", len=1, as_lval=true,
per_cu=0x7fffd00053f0, frame=..., addr_info=0x7fffffffcc30, type=0x0,
subobj_type=0x0, subobj_offset=0)
at gdb/dwarf2/expr.c:1136
#13 0x0000000000877c14 in dwarf2_locexpr_baton_eval (dlbaton=0x3e99c18,
frame=..., addr_stack=0x7fffffffcc30, valp=0x7fffffffcab0,
push_values=..., is_reference=0x7fffffffc9b0)
at gdb/dwarf2/loc.c:1604
#14 0x0000000000877f71 in dwarf2_evaluate_property (prop=0x3e99ce0,
initial_frame=..., addr_stack=0x7fffffffcc30, value=0x7fffffffcab0,
push_values=...) at gdb/dwarf2/loc.c:1668
#15 0x00000000009def76 in resolve_dynamic_range (dyn_range_type=0x3e99c50,
addr_stack=0x7fffffffcc30, frame=..., rank=0, resolve_p=true)
at gdb/gdbtypes.c:2198
#16 0x00000000009e0ded in resolve_dynamic_type_internal (type=0x3e99c50,
addr_stack=0x7fffffffcc30, frame=..., top_level=true)
at gdb/gdbtypes.c:2934
#17 0x00000000009e1079 in resolve_dynamic_type (type=0x3e99c50, valaddr=...,
addr=0, in_frame=0x0) at gdb/gdbtypes.c:2989
#18 0x0000000000488ebc in ada_discrete_type_low_bound (type=0x3e99c50)
at gdb/ada-lang.c:710
#19 0x00000000004eb734 in print_range (type=0x3e99c50, stream=0x30157b0,
bounds_preferred_p=0) at gdb/ada-typeprint.c:156
#20 0x00000000004ebffe in print_array_type (type=0x3e99d10, stream=0x30157b0,
show=1, level=9, flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-typeprint.c:381
#21 0x00000000004eda3c in ada_print_type (type0=0x3e99d10,
varstring=0x401f710 "c_to", stream=0x30157b0, show=1, level=9,
flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-typeprint.c:1015
#22 0x00000000004b4627 in ada_language::print_type (
this=0x2f949b0 <ada_language_defn>, type=0x3e99d10,
varstring=0x401f710 "c_to", stream=0x30157b0, show=1, level=9,
flags=0x1bdcf20 <type_print_raw_options>)
at gdb/ada-lang.c:13681
#23 0x0000000000f74646 in print_symbol (gdbarch=0x3256270, symbol=0x3e99db0,
depth=9, outfile=0x30157b0) at gdb/symmisc.c:545
#24 0x0000000000f737e6 in dump_symtab_1 (symtab=0x3ddd7e0, outfile=0x30157b0)
at gdb/symmisc.c:313
#25 0x0000000000f73a69 in dump_symtab (symtab=0x3ddd7e0, outfile=0x30157b0)
at gdb/symmisc.c:370
#26 0x0000000000f7420f in maintenance_print_symbols (args=0x0, from_tty=0)
at gdb/symmisc.c:481
#27 0x00000000006c7fde in do_simple_func (args=0x0, from_tty=0, c=0x321e270)
at gdb/cli/cli-decode.c:94
#28 0x00000000006ce65a in cmd_func (cmd=0x321e270, args=0x0, from_tty=0)
at gdb/cli/cli-decode.c:2826
#29 0x0000000001005b78 in execute_command (p=0x3f48fe3 "", from_tty=0)
at gdb/top.c:564
#30 0x0000000000966095 in command_handler (
command=0x3f48fd0 "maint print symbols")
at gdb/event-top.c:613
#31 0x0000000001005141 in read_command_file (stream=0x3011a40)
at gdb/top.c:333
#32 0x00000000006e2a64 in script_from_file (stream=0x3011a40,
file=0x7fffffffe21f "gdb.in")
at gdb/cli/cli-script.c:1705
#33 0x00000000006bb88c in source_script_from_stream (stream=0x3011a40,
file=0x7fffffffe21f "gdb.in", file_to_open=0x7fffffffd760 "gdb.in")
at gdb/cli/cli-cmds.c:706
#34 0x00000000006bba12 in source_script_with_search (
file=0x7fffffffe21f "gdb.in", from_tty=0, search_path=0)
at gdb/cli/cli-cmds.c:751
#35 0x00000000006bbab2 in source_script (file=0x7fffffffe21f "gdb.in",
from_tty=0) at gdb/cli/cli-cmds.c:760
#36 0x0000000000b835cb in catch_command_errors (
command=0x6bba7e <source_script(char const*, int)>,
arg=0x7fffffffe21f "gdb.in", from_tty=0, do_bp_actions=false)
at gdb/main.c:510
#37 0x0000000000b83803 in execute_cmdargs (cmdarg_vec=0x7fffffffd980,
file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd8c8)
at gdb/main.c:606
#38 0x0000000000b84d79 in captured_main_1 (context=0x7fffffffdb90)
at gdb/main.c:1349
#39 0x0000000000b84fe4 in captured_main (context=0x7fffffffdb90)
at gdb/main.c:1372
#40 0x0000000000b85092 in gdb_main (args=0x7fffffffdb90)
at gdb/main.c:1401
#41 0x000000000041a382 in main (argc=9, argv=0x7fffffffdcc8)
at gdb/gdb.c:38
(gdb)
...
The immediate problem is in dwarf_expr_context::fetch_result where we're
calling get_frame_arch:
...
switch (this->m_location)
{
case DWARF_VALUE_REGISTER:
{
gdbarch *f_arch = get_frame_arch (this->m_frame);
...
with a null frame:
...
(gdb) p this->m_frame.is_null ()
$1 = true
(gdb)
...
Fix this by calling ensure_have_frame in dwarf_expr_context::execute_stack_op for
DW_OP_reg<n> and DW_OP_regx, which gets us instead:
...
c_to: array (<>) of character; computed at runtime
...
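For illustration, a minimal, self-contained C++ sketch of the guard pattern follows. It is not the actual GDB change (the real fix calls dwarf_expr_context::ensure_have_frame for DW_OP_reg<n> and DW_OP_regx in execute_stack_op); it only shows the idea that checking for the required frame up front turns the later null-frame assertion failure into a clean, catchable error, which presumably is what lets the bound be reported as "computed at runtime".
```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

/* Stand-in for the expression evaluation context.  m_frame may
   legitimately be null when no frame is available, e.g. when dumping
   symbols with "maint print symbols".  */
struct expr_context
{
  const void *m_frame = nullptr;	/* stand-in for frame_info_ptr */

  /* Mirrors the idea of dwarf_expr_context::ensure_have_frame: reject
     frame-dependent operations early, with a clear error, instead of
     crashing later in fetch_result/get_frame_arch.  */
  void
  ensure_have_frame (const char *op_name) const
  {
    if (m_frame == nullptr)
      throw std::runtime_error (std::string (op_name)
				+ " evaluation requires a frame.");
  }

  /* Stand-in for the DW_OP_reg<n> / DW_OP_regx cases of
     execute_stack_op.  */
  void
  execute_reg_op (int regnum)
  {
    ensure_have_frame ("DW_OP_reg");
    /* ... record a DWARF_VALUE_REGISTER location for REGNUM ...  */
    (void) regnum;
  }
};

int
main ()
{
  expr_context ctx;
  try
    {
      ctx.execute_reg_op (7);	/* like DW_OP_reg7 (edi) with no frame */
    }
  catch (const std::runtime_error &e)
    {
      std::printf ("%s\n", e.what ());
    }
  return 0;
}
```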
Tested on x86_64-linux.
Approved-By: Tom Tromey <[email protected]>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33512
New lane debugging support from scratch, on top of "amd-staging with existing lane support stripped out" branch.
The users/palves/amd-staging-without-lane-support branch is amd-staging with rocgdb's current lane debugging support stripped out.
The users/palves/lane-debugging branch then reapplies all the lane debugging patches, up to the point where a diff against amd-staging is (almost) empty. On top of that, it adds patches adjusting to the user interface changes that we will want to submit upstream.
Eventually, the new patches will be squashed into the older ones so that we don't carry the intermediate behavior in between. It should then be easier to rebase it all on upstream master. Meanwhile, having this based on amd-staging helps with prototyping and discussion, because upstream does not yet support the needed DWARF extensions and other important bits required for accessing variables, unwinding, etc.