Replace `USERDEFINED` pcodeop with user-defined string in decompilation?
#8447
Replies: 8 comments
-
How would you want the decompiler to treat the arguments of the `USERDEFINED` pcode? The existing function call representation tracks input parameters as live and any output value as newly generated, then calculates variable lifetime and dead code segments based on that 'heritage'. It might be easier - and more readable - to leave the function call as is and patch the decompiler to insert a user-defined string in a comment block.
-
I haven’t looked much at the internal representation of the `USERDEFINED`
pcode, but I assume you could keep this all the same and still allow all
the analyses to work. I’m just asking if there is a way to change the
string representation of the pcode operation when it gets displayed.
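The comment-block alternative mentioned above could look roughly like the following. This is only a minimal sketch in Python, not decompiler code: the `annotate_call` helper, the `COMMENT_TEMPLATES` table, and the userop name `TestStrictEq` with its template are all illustrative assumptions. The call itself is printed unchanged, so nothing about how inputs and outputs are tracked would need to change; only a trailing comment is added.

```python
# Hypothetical per-userop comment templates; none of this is Ghidra's API.
# {0}, {1}, ... stand for the already-rendered input arguments.
COMMENT_TEMPLATES = {
    "TestStrictEq": "{0} === {1}",
}


def annotate_call(name, args, output=None):
    """Print the userop as an ordinary call and append a readable comment."""
    call = f"{name}({', '.join(args)})"
    stmt = f"{output} = {call};" if output is not None else f"{call};"
    template = COMMENT_TEMPLATES.get(name)
    if template is None:
        # No template registered: leave the statement exactly as it was.
        return stmt
    return f"{stmt} /* {template.format(*args)} */"


print(annotate_call("TestStrictEq", ["a", "b"], output="result"))
print(annotate_call("UnknownOp", ["x"]))
```

Because the statement stays a syntactically ordinary call, output produced this way would remain C-like, which is the main attraction of this variant.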
-
The string representation of USERDEFINED is set in the SLEIGH files for a processor. You can alter those files so that the decompiler string for a given instruction is whatever you want for display. It's also common to define multiple fixed strings for a single base instruction, showing different strings for different arguments to that instruction. I don't believe you can change those names in the decompiler window the way you can change variable names.
Maybe you can give an example of what kind of change and where you want to make that change?
-
For example, this processor module lifts NodeJS bytecode:
https://github.com/PositiveTechnologies/ghidra_nodejs/blob/main/data/languages/v8.slaspec
Strict equality is modeled using the userop `TestStrictEq(a, b)`. However,
it would be nice to display this in the decompilation window as `a === b`.
Similarly, Ghidra’s JVM processor spec defines `aThrowOp(obj)` to model
exception throwing. But it would be nice to display this as `throw obj`.
-
You likely would need to patch the decompiler's `PrintC::opCallother` method to recognize userops like `TestStrictEq(a, b)`, then add some custom code to print as you wish. The example you cite would likely turn into `result = a === b;`, following your slaspec definition of the output of `TestStrictEq(a, b)`.
-
Oh cool, thanks for the pointer! I didn’t realize it was in the C++ code.
I’ll have a closer look.
Do you think it’s useful to have this as a generic feature? Either
something you could specify in a CSPEC `callotherfixup` or that you could
override in an implementation of `InjectPayload`.
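To make the idea concrete, the print-time substitution discussed above can be sketched as a lookup from userop name to a display template. This is a standalone Python illustration under stated assumptions, not the decompiler's C++ code: `render_userop` and `DISPLAY_TEMPLATES` are hypothetical names, and the real `PrintC::opCallother` method works on the decompiler's own data structures rather than on strings.

```python
# Hypothetical display templates keyed by userop name; {0}, {1}, ... are the
# rendered input arguments. Nothing here is part of Ghidra's API.
DISPLAY_TEMPLATES = {
    "TestStrictEq": "{0} === {1}",
    "aThrowOp": "throw {0}",
}


def render_userop(name, args, output=None):
    """Render a userop as a statement, using a template when one exists."""
    template = DISPLAY_TEMPLATES.get(name)
    if template is None:
        # Default described in the thread: display the userop as a call.
        expr = f"{name}({', '.join(args)})"
    else:
        expr = template.format(*args)
    if output is not None:
        return f"{output} = {expr};"
    return f"{expr};"


print(render_userop("TestStrictEq", ["a", "b"], output="result"))
print(render_userop("aThrowOp", ["obj"]))
print(render_userop("SomeOtherOp", ["x"]))
```

Only the final string changes here; the operation's inputs and output are still passed in exactly as for a call. A generic version of the feature would presumably need some way to supply such a table per processor, which is the open question raised in this reply.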
-
I can't guess at the utility of that generic feature, especially if you factor in any added maintenance burden of the C++ decompiler code. The existing decompiler includes some incomplete code to support user-specific or processor-specific plugins, and to transform one or more pcodeops into more readable `builtin_memcpy` invocations. You might explore that path to develop a local feature first along the general lines the developers were taking.
If you want a user-grade example showing how to transform selected CALLOTHER pcodes into more readable C function invocations, you may want to look at this RISC-V vector plugin <https://github.com/thixotropist/ghidra_decompiler_plugins>. That *experimental* project adds user plugin support to a released Ghidra decompiler, then shows how to create a plugin that can transform sequences of instructions into something more readable. The complexity goes off the rails pretty fast when you try for readability transforms that alter the control flow.
-
There is definitely added complexity and I’m not sure about the general use
case (it seems pretty niche, which is why I was asking the question).
Once again, thanks for the pointers. I will take a look.
-
Hi there!
`USERDEFINED` pcode operations appear in the decompilation as function calls. Is it possible to replace the function call with a user-defined string?
I understand that this will likely break the "Export to C" functionality (because you may not end up with C anymore), but it would be helpful for readability (particularly when lifting bytecode instructions) where the source language isn't C.