Commit 2db2a4a

Merge pull request #1 from BinaryAnalysisPlatform/new-python-interface
introducing new python interface
2 parents d3a14cc + f4b2eb1 commit 2db2a4a

7 files changed: 993 additions, 384 deletions

setup.py

Lines changed: 3 additions & 1 deletion
@@ -7,5 +7,7 @@
     version = '1.0.0~alpha',
     package_dir = {'bap' : 'src'},
     packages = ['bap'],
-    install_requires = ['requests']
+    extras_require = {
+        'rpc' : ['requests']
+    }
 )
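
With ``requests`` moved from ``install_requires`` into the ``rpc`` extra, a plain install no longer pulls it in, so any code that needs it has to tolerate its absence. A minimal sketch of such a guard follows. The helper ``optional_import`` and the missing module name are hypothetical and are not part of this commit.

```python
import importlib


def optional_import(name):
    """Return the named module, or None when it is not installed.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# 'json' ships with Python, so this lookup always succeeds.
json_mod = optional_import('json')

# Made-up module name standing in for a dependency that was not installed.
missing = optional_import('hypothetical_rpc_backend_that_is_not_installed')

# A feature flag that callers can check before using the optional code path.
HAVE_RPC = missing is not None
```

Catching only ``ImportError`` keeps unrelated failures inside the optional module visible instead of silently hiding them.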

src/__init__.py

Lines changed: 46 additions & 4 deletions
@@ -1,5 +1,43 @@
 r"""Python inteface to BAP.

+
+Porcelain Interace
+==================
+
+The high level interface allows to run ``bap`` and get back the information
+that we were able to infer from the file. It consists only from one function,
+``bap.run``, that will drive ``bap`` for you. It is quite versatile, so read the
+documentation for the further information.
+
+
+Example
+-------
+
+>>> import bap
+>>> proj = bap.run('/bin/true', ['--symbolizer=ida'])
+>>> text = proj.sections['.text']
+>>> main = proj.program.subs.find('main')
+>>> entry = main.blks[0]
+>>> next = main.blks.find(entry.jmps[0].target.arg)
+
+It is recommended to explore the interface using ipython or similiar
+interactive toplevels.
+
+We use ADT syntax to communicate with python. It is a syntactical
+subset of Python grammar, so in fact, bap just returns a valid Python
+program, that is then evaluated. The ADT stands for Algebraic Data
+Type, and is described in ``adt`` module. For non-trivial tasks one
+should consider using ``adt.Visitor`` class.
+
+
+
+Plumbing interface [rpc]
+========================
+
+The low level interface provides an access to internal services. It
+uses ``bap-server``, and talks with bap using RPC protocol. It is in
+extras section and must be installed explicitly with ``[rpc]`` tag.
+
 In a few keystrokes:

 >>> import bap
@@ -23,7 +61,7 @@
 #. ``image`` loads given file

 Disassembling things
-====================
+--------------------

 ``disasm`` is a swiss knife for disassembling things. It takes either a
 string object, or something returned by an ``image`` function, e.g.,
@@ -56,7 +94,7 @@
 that is instance of one of this kind, it will stop.

 Reading files
-=============
+-------------

 To read and analyze file one should load it with ``image``
 function. This function returns an instance of class ``Image`` that
@@ -99,6 +137,10 @@

 Where data is actual string of bytes.
 """
-__all__ = ['disasm', 'image', 'adt', 'asm', 'arm', 'bil']

-from .bap import disasm, image
+from .bap import run
+
+try :
+    from .rpc import disasm, image
+except ImportError:
+    pass
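
The docstring above says that bap emits ADT text that is a syntactic subset of Python and is then evaluated. A rough sketch of that idea, assuming a simplified ``ADT`` base class and a hypothetical ``loads`` helper (neither is the code from this commit), could look like this:

```python
class ADT(object):
    """Simplified stand-in for an ADT node: a constructor name plus arguments."""

    def __init__(self, *args):
        self.constr = type(self).__name__
        # A one-tuple is untupled, as the adt docstring describes.
        self.arg = args[0] if len(args) == 1 else args

    def __repr__(self):
        args = self.arg if isinstance(self.arg, tuple) else (self.arg,)
        return '{0}({1})'.format(self.constr, ', '.join(repr(a) for a in args))


# Illustrative constructors only; the real set lives in the bap package.
class Int(ADT): pass
class Var(ADT): pass
class PLUS(ADT): pass


def loads(text):
    """Evaluate ADT text with only the known constructors in scope.

    Hypothetical helper. eval is not a sandbox, so this is only
    reasonable for output of a tool you trust.
    """
    env = {'__builtins__': {}, 'Int': Int, 'Var': Var, 'PLUS': PLUS}
    return eval(text, env)


tree = loads('PLUS(Var("R0"), Int(4))')
```

Because the printed form is itself valid constructor syntax, ``loads(repr(tree))`` rebuilds an equivalent tree in this sketch.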

src/adt.py

Lines changed: 8 additions & 8 deletions
@@ -3,11 +3,11 @@
 Algebraic Data Types (ADT) is used to represent two kinds of things:

 1. A discrimintated union of types, called sum
-2. A combination of some types, called product.
+2. A combination of ADT types, called product.

 # Sum types

-Sum types represents a concept of generalizing. For example,
+Sum types represent a concept of generalizing. For example,
 on ARM R0 and R1 are all general purpose registers (GPR). Also on ARM
 we have Condition Code registers (CCR) :

@@ -48,10 +48,10 @@ class ADT(object):
 """ Algebraic Data Type.

 This is a base class for all ADTs. ADT represented by a tuple of arguments,
-stored in a val field. Arguments should be instances of ADT class, or numbers,
+stored in a `arg` field. Arguments should be instances of ADT class, or numbers,
 or strings. Empty set of arguments is permitted.
 A one-tuple is automatically untupled, i.e., `Int(12)` has value `12`, not `(12,)`.
-For convenience, a name of the constructor is provided in `name` field.
+A name of the constructor is stored in the `constr` field

 A structural comparison is provided.

@@ -115,7 +115,7 @@ def run(self, adt):

 Otherwise, for an ADT of type C the method `visit_C` is looked up in the
 visitors methods dictionary. If it doesn't exist, then `visit_B` is
-looked up, where `D` is the base class of `C`. The process continues,
+looked up, where `B` is the base class of `C`. The process continues,
 until the method is found. This is guaranteed to terminate,
 since visit_ADT method is defined.

@@ -124,8 +124,8 @@ def run(self, adt):
 Once the method is found it is called. It is the method's responsiblity
 to recurse into sub-elements, e.g., call run method.

-For example, suppose that we want to count negative values in a given
-BIL expression:
+For example, suppose that we want to count negative values in
+some BIL expression:

 class CountNegatives(Visitor):
     def __init__(self):
@@ -148,7 +148,7 @@ def visit_NEG(self, op):
 visit_Int for Int constructor and visit_NEG for counting unary minuses.
 (Actually we should count for bitwise NOT operation also, since it will
 change the sign bit also, but lets forget about it for the matter of the
-excercise (and it can be easily fixed just by matching visit_UnOp)).
+exercise (and it can be easily fixed just by matching visit_UnOp)).

 When we hit visit_NEG we toggle current sign, storing its previous value
 and recurse into the operand. After we return from the recursion, we restore
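
The dispatch rule and the sign-toggling visitor described in these docstrings can be sketched in a self-contained form. Everything below is a simplified reconstruction under stated assumptions, not the code from ``src/adt.py``: the class hierarchy is illustrative, ``Int`` takes a single value, and the lookup walks Python's method resolution order.

```python
class ADT(object):
    """Simplified stand-in for the base class described above."""

    def __init__(self, *args):
        self.constr = type(self).__name__
        # A one-tuple is untupled: Int(12).arg is 12, not (12,).
        self.arg = args[0] if len(args) == 1 else args

    def __eq__(self, other):
        # Structural comparison: same constructor, same arguments.
        return (isinstance(other, ADT)
                and self.constr == other.constr
                and self.arg == other.arg)

    def __ne__(self, other):
        return not self == other


# Illustrative hierarchy (assumption): NEG is a UnOp, PLUS is a BinOp.
class Exp(ADT): pass
class Int(Exp): pass
class UnOp(Exp): pass
class NEG(UnOp): pass
class BinOp(Exp): pass
class PLUS(BinOp): pass


class Visitor(object):
    def visit_ADT(self, adt):
        # Fallback: recurse into every child that is itself an ADT.
        children = adt.arg if isinstance(adt.arg, tuple) else (adt.arg,)
        for child in children:
            if isinstance(child, ADT):
                self.run(child)

    def run(self, adt):
        # Try visit_C, then visit_B for each base class B, ending at visit_ADT.
        for cls in type(adt).__mro__:
            method = getattr(self, 'visit_' + cls.__name__, None)
            if method is not None:
                return method(adt)


class CountNegatives(Visitor):
    def __init__(self):
        self.neg = False   # True while inside an odd number of NEGs
        self.count = 0

    def visit_Int(self, node):
        value = -node.arg if self.neg else node.arg
        if value < 0:
            self.count += 1

    def visit_NEG(self, op):
        was = self.neg          # remember the sign seen so far
        self.neg = not was      # toggle for the operand
        self.run(op.arg)
        self.neg = was          # restore after the recursion


expr = PLUS(NEG(Int(3)), NEG(NEG(Int(-2))))
counter = CountNegatives()
counter.run(expr)
```

``PLUS`` has no dedicated method here, so the lookup falls through ``visit_BinOp`` and ``visit_Exp`` to ``visit_ADT``, which is what guarantees termination.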
