Skip to content

Commit

Permalink
WASM Support MVP (#1495)
Browse files Browse the repository at this point in the history
The WASM branch is ready to be included in the rest of Manticore. It doesn't include several advanced features, including a symbolic memory model, symbolic floating point support, or a standard library implementation (WASI/emscripten), but it does have full symbolic semantics for all of the integer instructions. 

### Commit log:
* Empty WASM Module

Will run on a .wasm file and do nothing forever

* Read and parse sections

/shrug

* Can partially allocate modules

Just pushing for the weekend

* Blacken

* Can allocate with exports now

* Static initialization

Doesn't evaluate any of the starting expressions, but that's okay

* Make type hints a little more pycharm-friendly

* First part of invocation

* Blacken...

* Barebones execution poc

Can execute the following snippet:
```
main(){
return 42;
}
```

* blacken

* Add some loads and stores

* Add global operations

* Move float methods further down

* Support internal function calls

```
18:20:42 (mc) ehennenfent@nessie:~/wasm ➜ cat loop_check.c
#include <stdio.h>
#include <stdbool.h>

bool check(int arg){
  if (((arg << 2) ^ 16) == 36) {
    return true;
  }
  else{
    return false;
  }
}

int main(){
    printf("Hello world!\n");

    int g = getchar();

    printf("Got: %d\n", g);

    int y = 0;
    for (int i = 0; i < 20; i++){
      y++;

      if (y > 18){
        return check(g) ? 0 : -1;
      }
    }

    return -1;
}
18:20:54 (mc) ehennenfent@nessie:~/wasm ➜ manticore loop_check.wasm
65: i32.const (16)
11: end
65: i32.const (32)
11: end
Initialization Complete
65: i32.const (0)
65: i32.const (0)
40: i32.load (Offset 4)
65: i32.const (16)
107: i32.sub
34: tee_local (Local 1)
54: i32.store (Offset 4)
65: i32.const (32)
16: call (Func Idx 2)
Called stub function with args: (32,)
HostFunc returned [13]
26: drop
32: get_local (Local 1)
16: call (Func Idx 0)
Called stub function with args: ()
HostFunc returned [13]
34: tee_local (Local 0)
54: i32.store (Offset 0)
65: i32.const (16)
32: get_local (Local 1)
16: call (Func Idx 1)
Called stub function with args: (16, 16)
HostFunc returned [13]
26: drop
65: i32.const (0)
32: get_local (Local 1)
65: i32.const (16)
106: i32.add
54: i32.store (Offset 4)
65: i32.const (-1)
65: i32.const (0)
32: get_local (Local 0)
65: i32.const (1073741823)
113: i32.and
65: i32.const (13)
71: i32.ne
27: select
11: end
[0]
2019-07-25 18:20:58,138: [13842] m.c.manticore:INFO: Generated testcase No. 0 - test
2019-07-25 18:20:58,143: [13842] m.c.manticore:INFO: Results in /home/ehennenfent/wasm/mcore_1evptpip
```

* Improve pickle-friendliness

* Convert wasm module instructions into internal type s

* Fix class naming conflict

Thanks Pickle

* Support bitvecs as an integer type

* Refactor execution to handle single instructions at a time

* Add Atomic Stacks

It's a very inefficient implementation, but we'll make it better

* Handle Concretize Exceptions

Needs a lot of fine tuning but it's a start

* Propagate symbols all the way to return

* Add test generation infrastructure

Still very basic but it's a start

* Attempt to make Travis happy

* Fix missing generation step

* Ignore stack exhaustsion and invalid modules

* Add basic shift instructions

* Add lots of arithmetic instructions

* Fix global initialization

* Add optimization for fixed size left shift

* Make Travis tests less verbose

* Blacken

* Properly handle modulo in shifting

* Add several control flow instructions

* Blacken

* Partially Fix return instruction

* Fix missing return value from look_forward

* Update look_forward to handle nested blocks

* Fix type sigs and executor copy/paste errors, deduplicate test names

* Fix an instruction exit issue

* Fix local ordering

* Blacken

* test_address now passes for wasm
Fixed issue with json2mc.py issue with testname uniqueness
Started some work on floating point

* i hate git

* i hate git

* Adding linenumber to testcase function name
minor float progress

* f32_cmp passed for wasm

* f64_cmp passes for wasm

* f32_bitwise passes except for the problem with assertEqual not working with nan

* f64_bitwise passes except for the problem with assertEqual not working with nan

* Fixed the tests to handle [nan] comparisons

* better handling of nans and infinites

* added integer backing value for f64

* Blacken

* Blacken

* f64.nearest

* Fix argument ordering and implement N-bit memory stores

* Use subtests

* Switch print to logger

* Modify control flow instructions to match spec

"Continuation" != the thing that goes inside the Label

* Teporarily disable call instructions to prevent infinite loop

Still need to figure out control flow for nested function calls

* Blacken

* Fixed infinite loop in _call via block depth tracking

The behavior of `end` needs to depend on how many labels and activations are on the stack. Every time we push or pop a label, we increment or decrement the counter for the current call frame.

* Blacken

* Initial `br` implementation

Probably still bugged

* Support if blocks without an else instruction

* Add call_indirect

* Add grow_memory

* i32_wrap_i64

* Add several arithmetic operators

* Copy i32.ctz impl from x86 TZCNT

* Fix off-by-one in br

I think this is the correct fix (it makes all the _br tests pass) but I'm leaving the TODO there just in case.

* Blacken...

* Fix lingering issue with if block splitting

* Fix bad args to sextend in i64_extend_s_i32

* Fix indirect call null check and Float binop return types

* Improve br_table indexing

See comment

* Remove vestigial argument popping in invoke

* Fix looking forward in _return and _block

* Blacken

* Generate trap and action tests

* Fix extraneous call to exec_for_test

* Fix missing traps in load instructions

* Fix c1=0 case in ctz/clz

Hopefully this also fixes the fact that the original CTZ implementations seems to have been intended for little-endian architectures

* Handle traps in-stride during test execution

* Treat zero division as trap

* Use ctypes to convert integers

* Blacken

* Apparently I was wrong about the CTZ endianness thing

* Better range checks for offsets

This really hammers home why checking whether ea + size is _greater than_ len(mem.data) is strange if mem.data is 0-indexed, but idk, that's what the spec says.

* Make stub functions return the correct number of results

* ctypes can't handle floats

* Add option to run the start function

* Correctly handle functions with unacceptable names

* Blacken

* Treat NotImplemented as a Skip

Also counts individual subtests directly instead of screwing around with trying to grep for them in the log.

* Fix unhandled negative indices

* Fixed some integer instruction bugs

i64/32:
div_s/u
shr_s/u
shl
clz

* Fix global and memory imports

* CC

* Make default loglevel debug

* Rework conversion from int

* Make float_store work

* _actually_ fix float_store

* Restore old float creation method

* Blacken

* Add range checks to float load/store

* Force cast to unsigned in _u instructions

* Fix yet another I32 -> I64 copy paste error

How many times do we have to teach you this lesson, ~old~ young man?

* Skip problematic float tests for now

* Handle floating point inaccuracies when dividing large ints

* Skip conversion and endianness tests

* Rotl, rotl, popcnt

* Blacken

* Add some conversion stubs

* Maybe fix conversions

* Improve reinterpret and float demotion

* Re-enable endianness tests

* Include WASM logging in verbosity controls

* Propagate constraints to executor

* Blacken

* Add prototype symbolic tests

* Add api for invocation to Manticore object

* Temporarily disable native/evm tests

There's no reason to tie up Travis for 40 minutes every time I push a commit.

* Fix wasmworld import

* Concretize br_if

* Pass arg generator to invoke

* Download updated WABT

It's fixed now!

* Generate Symbolic tests (partial)

* Follow m.run format

* Blacken

* Reinitialize Manticore every 50 tests

This should cut down on the performance issues we've been seeing

* Revert "Temporarily disable native/evm tests"

This reverts commit 7abd6c1.

* I32 and I64 pass Symbolic

* Create new MC on reinit

* Blacken

* Fix return handling in test generation

* Concretize operand in `if_`

* Concretize memory addresses before access

In the long run, we should have a symbolic memory model

* Better memory access concretization (and br_table)

* Concretize float converions and indirect call operands

* Run start method

* Handle size errors in wrap+i64 and select

* Add timeout (and skip extra-problematic tests)

* Blacken

* Aggressively concretize floats

* Recast ints that should have been floats

* Better trap handling

* Unstash float concretization

* Fix log-lived 'trap' attr

* Blacken

* Concretize F32 when promoting to F64

_shouldn't_ strictly be necesssary, but the lazy conversion means that one of the tests fails. Until we get symbolic floating point, this will have to do.

* Docstrings for platforms/wasm.py

* Docstrings for ManticoreWASM

* Add docs for the module structure

* Add WASM to RTD

* missing docstring for Module itself

* Fix sphinx errors

* Bump RTD python version

I'm sorry and I swear I'll fix it later

* Reformat docstrings for sphinx

* Pacify RTD by adding newlines before params

* Missed a few

* Document the types

* Document executor

* A few more type docstrings

* Partial runtime_structure docs

* Finish documenting runtime structure

* Fix trailing whitespace

* Add collatz conjecture example

* Include undocumented members in RTD

* Add a bunch of events

* Appropriately connect event publishers

* Generate useful testcases

* Blacken

* Attempt at 3.6 backporting

* Optimize AtomicStack

* Sort solve_n output in test_solve_n

Very unclear why this is happening, but it's breaking a travis build
https://travis-ci.com/trailofbits/manticore/jobs/246451527

* Roll back to trusty?

Unsure if this is what's causing the EVM issues

* Create custom trap types

* Lazily evaluate overflows

May need to revert this if it slows down the Travis tests. It's 0.7% slower in the example I'm looking at, but that may not carry over.

* Blacken

* Address Dominik's feedback, remove stale TODOs

* Expand & Test Examples

* Fix relative path

* Fix relative path?

* capitalize Path

* Fix relative path for real

* Blacken...

* Test finalize

* Add supplemental tests for coverage

* Use dedicated helper for store access

* Handle quoted strings in imported module names

* Allow WASMWorld to hold multiple modules

Necessary for getting imports/exports to work, but still needs some changes before it passes the tests.

* Actually use export maps

* Fix most of the control flow issues with import tests

* Use available cores for test generation

* Escape module name

* Don't reinit modules that have already been initialized

* Use addresses for imports instead of copying

* Fix removed timeout argument

* Interleave registration and aliases

* Fix broken "None" check for address

* Blacken

* Make manual exports work with address syntax

* Import all registered modules

Only necessary for elem, but it's tricky to make changes just for one test, so *shrug emoji*

* Remove stringified annotations

Turns out those only work in the same file...

* Fix easy type issues in all but runtime structure

* Fix easy type issues in runtime_structure

* Fix* everything but imports

* Blacken

* Blacken 2

* Typo

* Merge module structure and runtime structure

* Fix instance vs type confusion

* Fix mypy and concrete tests

Still an issue with the symbolic tests

* Still generate symbolic tests though

I keep forgetting not to commit that file...

* Attempt to fix env for symbolic tests

* Blacken/mypy

* Fix missing supplemental environment variables arg

* Fix docstrings for new import style

* CC

* Bump timeout duration to 70 minutes

I don't like it, but adding more jobs to Travis isn't going to make our builds any faster

* Fix pycharm type, import, and docstring errors

* Explain type: ignore

* Missed two type:ignore's

* Fix CLI so it doesn't throw up on arbitrary modules
  • Loading branch information
Eric Hennenfent authored Nov 8, 2019
1 parent 561c3ee commit 1208759
Show file tree
Hide file tree
Showing 33 changed files with 5,632 additions and 12 deletions.
5 changes: 3 additions & 2 deletions .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,13 +15,14 @@ stages:
env:
global:
- CC_TEST_REPORTER_ID=db72f1ed59628c16eb0c00cbcd629c4c71f68aa1892ef42d18c7c2b8326f460a
- JOB_COUNT=4 # Four jobs generate test coverage: ethereum, ethereum_vm, native, and other
- JOB_COUNT=5 # Five jobs generate test coverage: ethereum, ethereum_vm, native, wasm, and other
- PYTHONWARNINGS="default::ResourceWarning" # Enable ResourceWarnings
matrix:
- TEST_TYPE=examples
- TEST_TYPE=ethereum
- TEST_TYPE=ethereum_vm
- TEST_TYPE=native
- TEST_TYPE=wasm
- TEST_TYPE=other

branches:
Expand Down Expand Up @@ -61,7 +62,7 @@ install:
- scripts/travis_install.sh $TEST_TYPE

script:
- travis_wait 60 scripts/travis_test.sh $TEST_TYPE
- travis_wait 70 scripts/travis_test.sh $TEST_TYPE

after_success:
- ./cc-test-reporter format-coverage -t coverage.py -o "coverage/codeclimate.$TEST_TYPE.json"
Expand Down
1 change: 1 addition & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ Manticore is a symbolic execution tool for analysis of binaries and smart contra
states
evm
native
wasm
plugins
gotchas

Expand Down
42 changes: 42 additions & 0 deletions docs/wasm.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
Web Assembly
------------


ManticoreWASM
^^^^^^^^^^^^^

.. automodule:: manticore.wasm.manticore
:members:
:undoc-members:


WASM World
^^^^^^^^^^

.. automodule:: manticore.platforms.wasm
:members:
:undoc-members:


Executor
^^^^^^^^

.. automodule:: manticore.wasm.executor
:members:
:undoc-members:


Module Structure
^^^^^^^^^^^^^^^^^

.. automodule:: manticore.wasm.structure
:members:
:undoc-members:


Types
^^^^^

.. automodule:: manticore.wasm.types
:members:
:undoc-members:
15 changes: 15 additions & 0 deletions examples/wasm/collatz/collatz.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
int collatz(int x){
if (x <= 1){
return 0;
}
if (x % 2 == 0){
return collatz(x / 2) + 1;
}
else{
return collatz(3 * x + 1) + 1;
}
}

int main() {
return collatz(getchar(">"));
}
145 changes: 145 additions & 0 deletions examples/wasm/collatz/collatz.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
"""
Three different ways of evaluating the Collatz conjecture.
Example 1: Pass a symbolic argument to the `collatz` function
Example 2: Pass a symbolic version of `getchar` as an imported function
Example 3: Concretely evaluate `collatz(1337)` and count the instructions executed
See: examples/collatz.c
"""

from manticore.wasm import ManticoreWASM
from manticore.wasm.types import I32
from manticore.core.plugin import Plugin
from manticore.utils.log import set_verbosity


print(
"""
============ Example 1 ============
"""
)

m = ManticoreWASM("collatz.wasm")
set_verbosity(2)


def arg_gen(state):
# Generate a symbolic argument to pass to the collatz function.
# Possible values: 4, 6, 8
arg = state.new_symbolic_value(32, "collatz_arg")
state.constrain(arg > 3)
state.constrain(arg < 9)
state.constrain(arg % 2 == 0)
return [arg]


# Set up Manticore to run the collatz function with the given argument generator.
# We use an argument generator function instead of a list of arguments because Manticore
# might have multiple states waiting to begin execution, and we can conveniently map a
# generator function over all the ready states and get access to their respective
# constraint sets.
m.invoke("collatz", arg_gen)

# Run the collatz function
m.run()

# Manually collect return values
for idx, val_list in enumerate(m.collect_returns()):
print("State", idx, "::", val_list[0])

# Finalize the run (dump testcases)
m.finalize()


print(
"""
============ Example 2 ============
"""
)


def getchar(constraints, _addr):
"""
Stub implementation of the getchar function. All WASM cares about is that it accepts the right
number of arguments and returns the correct type. All _we_ care about is that it returns a symbolic
value, for which Manticore will produce all possible outputs.
:param constraints: The current constraint set
:param _addr: Memory index of the string that gets printed by getchar
:return: A symbolic value of the interval [1, 7]
"""
res = constraints.new_bitvec(32, "getchar_res")
constraints.add(res > 0)
constraints.add(res < 8)
return [res]


# Pass our symbolic implementation of the `getchar` function into the WASM environment
# as an import.
m = ManticoreWASM("collatz.wasm", env={"getchar": getchar})

# Invoke the main function, which will call getchar
m.invoke("main")

# Run the example
m.run()

# Manually collect return values
for idx, val_list in enumerate(m.collect_returns()):
print("State", idx, "::", val_list[0])

# Finalize the run (dump testcases)
m.finalize()


print(
"""
============ Example 3 ============
"""
)


class CounterPlugin(Plugin):
"""
A plugin that counts the number of times each instruction occurs
"""

def did_execute_instruction_callback(self, state, instruction):
with self.locked_context("counter", dict) as ctx:
val = ctx.setdefault(instruction.mnemonic, 0)
ctx[instruction.mnemonic] = val + 1

def did_terminate_state_callback(self, state, *args):
insn_sum = 0
with self.locked_context("counter") as ctx:
for mnemonic, count in sorted(ctx.items(), key=lambda x: x[1], reverse=True):
print("{: <10} {: >4}".format(mnemonic, count))
insn_sum += count
print(insn_sum, "instructions executed")


def arg_gen(_state):
return [I32(1337)]


m = ManticoreWASM("collatz.wasm")

# Registering the plugin connects its callbacks to the correct events
m.register_plugin(CounterPlugin())

# Invoke `collatz(1337)`
m.invoke("collatz", arg_gen)

# Run the collatz function
m.run()

# Manually collect return values
for idx, val_list in enumerate(m.collect_returns()):
print("State", idx, "::", val_list[0])

# Finalize the run (dump testcases)
m.finalize()
Binary file added examples/wasm/collatz/collatz.wasm
Binary file not shown.
116 changes: 116 additions & 0 deletions examples/wasm/collatz/collatz.wat
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
(module
(type $FUNCSIG$i (func (result i32)))
(type $FUNCSIG$ii (func (param i32) (result i32)))
(import "env" "getchar" (func $getchar (param i32) (result i32)))
(table 0 anyfunc)
(memory $0 1)
(data (i32.const 16) ">\00")
(export "memory" (memory $0))
(export "collatz" (func $collatz))
(export "main" (func $main))
(func $collatz (; 1 ;) (param $0 i32) (result i32)
(local $1 i32)
(set_local $1
(i32.const 0)
)
(block $label$0
(br_if $label$0
(i32.lt_s
(get_local $0)
(i32.const 2)
)
)
(set_local $1
(i32.const 0)
)
(loop $label$1
(set_local $1
(i32.add
(get_local $1)
(i32.const 1)
)
)
(br_if $label$1
(i32.gt_s
(tee_local $0
(select
(i32.add
(i32.mul
(get_local $0)
(i32.const 3)
)
(i32.const 1)
)
(i32.shr_u
(get_local $0)
(i32.const 1)
)
(i32.and
(get_local $0)
(i32.const 1)
)
)
)
(i32.const 1)
)
)
)
)
(get_local $1)
)
(func $main (; 2 ;) (result i32)
(local $0 i32)
(local $1 i32)
(set_local $1
(i32.const 0)
)
(block $label$0
(br_if $label$0
(i32.lt_s
(tee_local $0
(call $getchar
(i32.const 16)
)
)
(i32.const 2)
)
)
(set_local $1
(i32.const 0)
)
(loop $label$1
(set_local $1
(i32.add
(get_local $1)
(i32.const 1)
)
)
(br_if $label$1
(i32.gt_s
(tee_local $0
(select
(i32.add
(i32.mul
(get_local $0)
(i32.const 3)
)
(i32.const 1)
)
(i32.shr_u
(get_local $0)
(i32.const 1)
)
(i32.and
(get_local $0)
(i32.const 1)
)
)
)
(i32.const 1)
)
)
)
)
(get_local $1)
)
)
3 changes: 3 additions & 0 deletions manticore/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
from crytic_compile import is_supported, cryticparser
from .core.manticore import ManticoreBase, set_verbosity
from .ethereum.cli import ethereum_main
from .wasm.cli import wasm_main
from .utils import config, log, install_helper

consts = config.get_group("main")
Expand Down Expand Up @@ -39,6 +40,8 @@ def main():

if args.argv[0].endswith(".sol") or is_supported(args.argv[0]):
ethereum_main(args, logger)
elif args.argv[0].endswith(".wasm") or args.argv[0].endswith(".wat"):
wasm_main(args, logger)
else:
install_helper.ensure_native_deps()
native_main(args, logger)
Expand Down
2 changes: 1 addition & 1 deletion manticore/core/manticore.py
Original file line number Diff line number Diff line change
Expand Up @@ -327,7 +327,7 @@ def _fork(self, state, expression, policy="ALL", setstate=None):
to the ready list.
"""
assert isinstance(expression, Expression)
assert isinstance(expression, Expression), f"{type(expression)} is not an Expression"

if setstate is None:

Expand Down
2 changes: 2 additions & 0 deletions manticore/core/smtlib/operators.py
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,8 @@ def SREM(a, b):
return a.srem(b)
elif isinstance(b, BitVec):
return b.rsrem(a)
elif isinstance(a, int) and isinstance(b, int):
return a - int(a / b) * b
return a % b


Expand Down
Loading

0 comments on commit 1208759

Please sign in to comment.