ghidra: Updates capa_explorer.py to enable users to select if namespaces, comments and bookmarks are added. Closes #1977 #2652
base: master
…ate functions for comment and namespace creation.
Hello @Shajal-Kumar, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request enhances the `capa_explorer.py` script in Ghidra, providing users with more control over the annotation process. Specifically, it introduces a dialog box that allows users to select whether to add labels/namespaces, comments, or bookmarks during the analysis. The `label_matches()` function has been split into `create_capa_namespaces()` and `create_capa_comments()` for better organization. The `parse_json()` function now accepts new arguments to facilitate the use of guard clauses in the new functions. Additionally, the `CHANGELOG.md` file has been updated to reflect these changes.
Highlights
- User-Selectable Annotations: Introduces a dialog box using Ghidra's `askChoices` API, allowing users to choose whether to add labels/namespaces, comments, or bookmarks during analysis.
- Code Reorganization: Splits the `label_matches()` function into `create_capa_namespaces()` and `create_capa_comments()` for improved code structure and readability.
- Parameterization of `parse_json()`: Updates the `parse_json()` function to accept arguments that control the creation of namespaces, comments, and bookmarks, enabling more granular control over the analysis process.
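As a rough illustration of the first highlight, the sketch below maps a multi-select dialog result to three boolean flags. The option labels, the `selection_to_flags` helper, and the dialog title are assumptions made for this example and are not taken from the PR; in a real Ghidra script the selection would come from `askChoices`, which is only available inside Ghidra, so the call is shown in a comment and the user's selection is simulated.

```python
# Hypothetical option labels; the wording used by the actual script may differ.
LABELS = "Add namespaces/labels"
COMMENTS = "Add comments"
BOOKMARKS = "Add bookmarks"
ALL_OPTIONS = [LABELS, COMMENTS, BOOKMARKS]


def selection_to_flags(selected):
    """Translate the list of ticked options into three booleans."""
    chosen = set(selected or [])
    return {
        "do_labels": LABELS in chosen,
        "do_comments": COMMENTS in chosen,
        "do_bookmarks": BOOKMARKS in chosen,
    }


# Inside Ghidra, the selection would be obtained with something like:
#   selected = askChoices("capa explorer", "Select annotations to add", ALL_OPTIONS)
# Outside Ghidra we simulate a user who ticked two of the three boxes.
flags = selection_to_flags([LABELS, BOOKMARKS])
print(flags)
```

Keeping the mapping in a small pure function means the flag logic can be checked without launching Ghidra.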
Changelog
- CHANGELOG.md
  - Adds user-options for adding comments, bookmarks, and namespaces via a dialog box in `capa_explorer.py`.
  - Implements granular control over annotations with `create_capa_namespace` and `create_capa_comments`.
- capa/ghidra/capa_explorer.py
  - Adds `do_labels`, `do_comments`, and `do_bookmarks` parameters to the `CapaMatchData` class to control annotation creation.
  - Splits `label_matches()` into `create_capa_namespace()` and `create_capa_comments()` to handle namespace creation/labeling and commenting, respectively.
  - Modifies `parse_json()` to accept `do_labels`, `do_comments`, and `do_bookmarks` parameters and pass them to `CapaMatchData`.
  - Adds a dialog box to allow users to select annotation options before analysis begins (lines 398-402).
- rules
  - Updates the subproject commit to 66975132455e3e22520a84dca14cca1d3afd292a
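The guard-clause pattern described in the changelog can be sketched as follows. This is a heavily simplified stand-in, not the PR's code: only the names `CapaMatchData`, `parse_json`, `create_capa_namespace`, `create_capa_comments`, and the three `do_*` flags come from the summary above. The constructor signature, the `bookmark_functions` method name, the `log` attribute, and the list-of-rule-names input to `parse_json` are assumptions for illustration.

```python
class CapaMatchData:
    """Simplified stand-in that only models the three annotation flags."""

    def __init__(self, namespace, do_labels=True, do_comments=True, do_bookmarks=True):
        self.namespace = namespace
        self.do_labels = do_labels
        self.do_comments = do_comments
        self.do_bookmarks = do_bookmarks
        self.log = []  # records which annotation steps ran (illustration only)

    def create_capa_namespace(self):
        if not self.do_labels:
            return  # guard clause: the user opted out of labels/namespaces
        self.log.append("namespace:" + self.namespace)

    def create_capa_comments(self):
        if not self.do_comments:
            return  # guard clause: the user opted out of comments
        self.log.append("comments:" + self.namespace)

    def bookmark_functions(self):  # hypothetical method name
        if not self.do_bookmarks:
            return  # guard clause: the user opted out of bookmarks
        self.log.append("bookmarks:" + self.namespace)


def parse_json(rule_names, do_labels, do_comments, do_bookmarks):
    """Forward the flags to every match object (input format is assumed)."""
    return [CapaMatchData(name, do_labels, do_comments, do_bookmarks) for name in rule_names]


matches = parse_json(["create process"], do_labels=False, do_comments=True, do_bookmarks=False)
for match in matches:
    # Every step is always called; each one decides for itself whether to run.
    match.create_capa_namespace()
    match.create_capa_comments()
    match.bookmark_functions()
print(matches[0].log)  # ['comments:create process']
```

With the check at the top of each method, the calling loop stays free of conditionals regardless of which options the user picked.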
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A script's enhanced,
Choices now in user's hands,
Annotations bloom,
Banishing all gloom,
Analysis understands.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request enhances the `capa_explorer.py` script by adding user-selectable options for annotations in Ghidra, which is a valuable improvement. The code is generally well-structured, but there are a few areas that could be improved for clarity and efficiency.
Summary of Findings
- Duplicated Code: There is duplicated code in `label_matches` and `create_capa_comments` for resolving the encompassing function and handling subscope matches. This duplication can be reduced by creating a helper function.
- Conditional Logic: The nested conditional logic in `label_matches` and `create_capa_comments` can be simplified to improve readability.
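One possible shape for the helper suggested in these findings is sketched below. It is an illustration only: the helper name `classify_subscope`, its return values, and the dictionary-backed lookup are assumptions, and the lookup is passed in as a parameter so the example runs without Ghidra. In the real script the lookup would be Ghidra's `getFunctionContaining`, which appears in the diff excerpts.

```python
def classify_subscope(func, sub_addr, get_function_containing):
    """Decide how a subscope match should be annotated.

    Returns a (kind, entry) tuple:
      ("in_resolved_function", None)   main match already resolved a function
      ("in_subscope_function", entry)  function found via the subscope address
      ("outside_function", None)       address lies outside any function, e.g. .data
    """
    if func is not None:
        return "in_resolved_function", None
    sub_func = get_function_containing(sub_addr)
    if sub_func is not None:
        return "in_subscope_function", sub_func
    return "outside_function", None


# Stand-in for Ghidra's function lookup: maps an entry point to its address range.
FUNCTIONS = {0x401000: range(0x401000, 0x401100)}


def fake_get_function_containing(addr):
    for entry, body in FUNCTIONS.items():
        if addr in body:
            return entry
    return None


print(classify_subscope("main", 0x401010, fake_get_function_containing))
print(classify_subscope(None, 0x401010, fake_get_function_containing))
print(classify_subscope(None, 0x500000, fake_get_function_containing))
```

Both the namespace and the comment routines could then branch on the returned `kind` with flat `if`/`elif` checks instead of repeating the nested resolution logic.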
Merge Readiness
The pull request introduces useful functionality and is generally well-implemented. However, addressing the duplicated code and simplifying the conditional logic would improve the code's maintainability and readability. I recommend addressing these issues before merging. I am unable to approve this pull request, and users should have others review and approve this code before merging.
capa/ghidra/capa_explorer.py
for sub_match in self.matches.get(addr):
for loc, node in sub_match.items():
sub_ghidra_addr = toAddr(hex(loc)) # type: ignore [name-defined] # noqa: F821

if node != {}:
if func is not None:
# basic block/ insn scope under resolved function
if func is not None:
# basic block/ insn scope under resolved function
# this would be a global/file scoped main match
# try to resolve the encompassing function via the subscope match, instead
# Ex. "run as service" rule
sub_func = getFunctionContaining(sub_ghidra_addr) # type: ignore [name-defined] # noqa: F821
if sub_func is not None:
sub_func_addr = sub_func.getEntryPoint()
# place function in capa namespace & create the subscope match label in Ghidra's global namespace
create_label(sub_func_addr, sub_func.getName(), capa_namespace)
else:
# addr is in some other file section like .data
# represent this location with a label symbol under the capa namespace
# Ex. See "Reference Base64 String" rule
# in many cases, these will be ghidra-labeled data, so just add the existing
# label symbol to the capa namespace
for sym in symbol_table.getSymbols(sub_ghidra_addr):
if sym.getSymbolType() == SymbolType.LABEL:
sym.setNamespace(capa_namespace)
if node != {}:
if func is not None:
# basic block / insn scope under resolved function
for sub_type, description in parse_node(node):
self.set_pre_comment(sub_ghidra_addr, sub_type, description)
else:
# this would be a global/file scoped main match
# try to resolve the encompassing function via the subscope match, instead
# Ex. "run as service" rule
sub_func = getFunctionContaining(sub_ghidra_addr) # type: ignore [name-defined] # noqa: F821
if sub_func is not None:
sub_func_addr = sub_func.getEntryPoint()
self.set_plate_comment(sub_func_addr)
for sub_type, description in parse_node(node):
self.set_pre_comment(sub_ghidra_addr, sub_type, description)
else:
# this would be a global/file scoped main match
# try to resolve the encompassing function via the subscope match, instead
# Ex. "run as service" rule
sub_func = getFunctionContaining(sub_ghidra_addr) # type: ignore [name-defined] # noqa: F821
if sub_func is not None:
sub_func_addr = sub_func.getEntryPoint()
# place function in capa namespace & create the subscope match label in Ghidra's global namespace
create_label(sub_func_addr, sub_func.getName(), capa_namespace)
self.set_plate_comment(sub_func_addr)
for sub_type, description in parse_node(node):
self.set_pre_comment(sub_ghidra_addr, sub_type, description)
else:
# addr is in some other file section like .data
# represent this location with a label symbol under the capa namespace
# Ex. See "Reference Base64 String" rule
for sub_type, description in parse_node(node):
# in many cases, these will be ghidra-labeled data, so just add the existing
# label symbol to the capa namespace
for sym in symbol_table.getSymbols(sub_ghidra_addr):
if sym.getSymbolType() == SymbolType.LABEL:
sym.setNamespace(capa_namespace)
self.set_pre_comment(sub_ghidra_addr, sub_type, description)
# addr is in some other file section like .data
# represent this location with a label symbol under the capa namespace
# Ex. See "Reference Base64 String" rule
# in many cases, these will be ghidra-labeled data, so just add the existing
# label symbol to the capa namespace
for sub_type, description in parse_node(node):
self.set_pre_comment(sub_ghidra_addr, sub_type, description)
closes #1977
- Enhances the `capa_explorer.py` script by adding user-selectable options for annotations in Ghidra. Users can now choose whether to add labels/namespaces, comments, or bookmarks during analysis.
- The selection dialog uses Ghidra's `askChoices` API.
- `label_matches()` has been split into `create_capa_namespaces()` and `create_capa_comments()`.
- `parse_json()` has new arguments, which are passed on to the `CapaMatchData` class to facilitate the use of guard clauses in `create_capa_namespaces()` and `create_capa_comments()`.
- `CHANGELOG.md` updated
Checklist