maintain separate Ghidra state for each Python shared interpreter #69

Merged
merged 7 commits on Aug 18, 2023

@mike-hunhoff mike-hunhoff commented Aug 14, 2023

Switching to Python shared interpreters introduced a significant problem: maintaining independent Ghidra state at the builtins scope for each Python shared interpreter running in the same Java process.

This problem stems from the fact that, by design, Python shared interpreters share all imported modules, including builtins, so any modification to builtins affects every shared interpreter running in the same Java process. For example, we store currentProgram in builtins so that all modules can access it without an explicit import. Unfortunately, if shared interpreter A sets currentProgram and shared interpreter B then sets it as well, shared interpreter A's next access of currentProgram sees B's value, which could be invalid for A. This issue extends to all GhidraScript state variables and FlatProgramAPI functions that we store in builtins. Additionally, we modify the sys module to redirect stdout and stderr to the correct Ghidra window, e.g. the Ghidra console window. Because the sys module is also shared across all shared interpreters, these modifications are affected in the same way.
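The clobbering can be reproduced in plain Python, outside Ghidra. The sketch below is only an illustration: the three functions stand in for shared interpreters A and B, and the strings stand in for Program objects. None of these helper names come from Ghidrathon.

```python
import builtins


def interpreter_a_start():
    # Interpreter A publishes its program in the shared builtins module.
    builtins.currentProgram = "program_a"


def interpreter_b_start():
    # Interpreter B does the same and silently overwrites A's value.
    builtins.currentProgram = "program_b"


def interpreter_a_read():
    # A reads the name without an import; the lookup falls through to builtins.
    return currentProgram


interpreter_a_start()
interpreter_b_start()
seen_by_a = interpreter_a_read()
print(seen_by_a)

# Clean up the illustrative attribute so builtins is left untouched.
del builtins.currentProgram
```

After B runs, A no longer sees the value it stored, which is the failure mode described above.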

This PR proposes to fix the issues described above by using decorated Python methods that retrieve the correct GhidraScript state variable or FlatProgramAPI function when accessed from builtins. The independent state is maintained using the thread ID of the Java thread used to run the shared Python interpreter. For example, suppose shared interpreter A is executed from Java thread 10 and shared interpreter B is executed from Java thread 11. All Ghidra-related builtins accesses made by shared interpreters A and B are redirected to the cached objects that belong to each interpreter's Java thread ID, 10 or 11.

This PR contains breaking changes. GhidraScript state variables:

  • monitor
  • currentProgram
  • currentAddress
  • currentLocation
  • currentSelection
  • currentHighlight

must now be accessed via a function call, e.g. monitor(), currentProgram(), etc. This diverges from the behavior of previous Ghidrathon versions, where these GhidraScript state variables could be accessed directly by name, e.g. monitor. Any Python scripts written for previous Ghidrathon versions must be updated accordingly.
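For a typical script the update amounts to adding parentheses. The sketch below runs outside Ghidra, so `_FakeProgram` and the locally defined `currentProgram` are stand-ins for what Ghidrathon provides. `getName()` is used as an example of a method call on the returned program.

```python
class _FakeProgram:
    """Stand-in for a Ghidra Program so the example runs outside Ghidra."""

    def __init__(self, name):
        self._name = name

    def getName(self):
        return self._name


def currentProgram():
    # Stand-in for the builtin that Ghidrathon installs after this change.
    return _FakeProgram("example.exe")


# Previous Ghidrathon versions (direct access by name):
#     name = currentProgram.getName()
# After this PR (access via a function call):
name = currentProgram().getName()
print(name)
```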

closes #68
closes #67
closes #65
closes #71

To test:

  • Python 3.8
  • Python 3.9
  • Python 3.10
  • Python 3.11
  • Linux
  • Windows

To do:

  • update unit tests
  • update documentation to remove Python 3.7 support, because it is deprecated
  • document breaking changes
    • GhidraScript variables are now function calls
    • Python modules are only imported once per process. This affects Python modules that store state in the global scope, and especially Ghidrathon scripts that store references to GhidraScript variables in the global scope, e.g. currentProgram = currentProgram() in the global scope can result in outdated object references
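The second breaking change can be illustrated with a small sketch. It assumes a stand-in `currentProgram()` backed by a dictionary, and the other names are hypothetical. A module-level assignment is evaluated once, at first import, so it keeps pointing at the old object when a later run in the same process uses a different program.

```python
# Stand-in for the per-run Ghidra state; Ghidrathon manages the real state.
_active = {"program": "first.exe"}


def currentProgram():
    return _active["program"]


# Risky: evaluated once, when the module is first imported.
CACHED_PROGRAM = currentProgram()


def name_from_cache():
    return CACHED_PROGRAM


def name_from_call():
    # Safer: ask for the current program each time it is needed.
    return currentProgram()


# Simulate a later run against a different program in the same process.
# The module is not re-imported, so CACHED_PROGRAM is not refreshed.
_active["program"] = "second.exe"

stale = name_from_cache()
fresh = name_from_call()
```

Calling the accessor inside the function that needs it avoids holding a reference across runs.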

@mike-hunhoff mike-hunhoff marked this pull request as draft August 14, 2023 23:42
@mike-hunhoff
Collaborator Author

@williballenthin I could use a second pair of eyes if you have the cycles.

@mike-hunhoff mike-hunhoff marked this pull request as ready for review August 18, 2023 18:20
@mike-hunhoff mike-hunhoff merged commit fdb9d0c into main Aug 18, 2023
3 checks passed