#VMAttack **IDA** PRO Plugin
**IDA Pro plugin for static and dynamic analysis and deobfuscation of virtualization-based packed binaries.**
**VMAttack was awarded the _second place_ at the annual [IDA Pro Plug-in Contest in 2016](https://www.hex-rays.com/contests/2016/index.shtml)!**
##Introduction
VMAttack is an **IDA PRO** plug-in that gives the reverse engineer additional analysis features designed to counter _virtualization-based obfuscation_. For now the focus is on **stack based virtual machines**, but support will be broadened to more architectures in the future. The plugin offers static and dynamic analysis capabilities which use **IDA API** features in conjunction with the plugin's own analysis capabilities to provide automatic, semi-automatic and manual analysis functionality.
The main goal of this plugin is to assist the reverse engineer in undoing the _virtualization-based obfuscation_ and to automate the reversing process where possible.
##Installation
###Prerequisites
- IDA Pro >= 6.6
- Python 2.7.10/.11
- Tested with Windows 7 and Windows 10.
###Guided Install
To install the plugin simply run the setup.py:
```python
python setup.py install
```
You will be prompted for the full path to your **IDA PRO** installation; aside from that, no user interaction should be required.
###Alternative manual Install:
Should the guided install fail for any reason, a manual installation is also possible.
The only required Python dependencies are `distorm3` and `idacute`, which can be installed via pip:
```python
pip install distorm3 idacute
```
Next, set the Windows environment variable `VMAttack` to the plugin path:
```
setx VMAttack X:full\path\to\plugin
```
Finally, copy `VMAttack_plugin_stub.py` into your **IDA PRO** plugins directory. That's it, now you're good to go!
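If the plugin does not load after a manual install, one thing worth checking is whether the environment variable is visible to Python. The helper below is a minimal sketch for that check only; the function name `plugin_path_from_env` is hypothetical and not part of VMAttack.

```python
import os


def plugin_path_from_env(env=None, var_name="VMAttack"):
    """Return the plugin path stored in the environment variable, or None.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    path = env.get(var_name)
    return path if path else None


# Hypothetical usage: None means the variable is not set in this process.
configured_path = plugin_path_from_env()
```

Note that a variable set with `setx` is only visible to processes started afterwards, so IDA (or the shell) has to be restarted before the check is meaningful.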
##Quick start guide
The Example folder contains the obfuscated binary and source binary of an add function. The obfuscated **addvmp** contains the VM function which we will analyze now.
![alt text](screenshots/overview.png "Problem Statement")
After a quick glance over the binary we see the simple structure: two arguments, `0AFFE1` and `0BABE5`, are pushed onto the stack and then a stub is called.
![alt text](screenshots/stub.png "Problem Statement")
The stub starts the virtual machine function with a reference to the start of the VM byte code pushed onto the stack.
![alt text](screenshots/stub2.png "Problem Statement")
Following the address we see the virtual machine function which is basically an interpreter for the received byte code.
![alt text](screenshots/switch.png "Interpreter")
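To make the idea of a byte code interpreter more concrete, here is a toy stack-based VM in Python. It is only a sketch: the opcodes (`PUSH_ARG`, `ADD`, `RET`) and their encoding are invented for illustration and do not correspond to the byte code of the packer used for addvmp.

```python
def run_stack_vm(bytecode, args):
    """Interpret a toy byte code on a stack machine and return the result."""
    # Hypothetical opcodes, chosen only for this example.
    PUSH_ARG, ADD, RET = 0x01, 0x02, 0x03
    stack = []
    pc = 0  # virtual instruction pointer into the byte code
    while pc < len(bytecode):
        opcode = bytecode[pc]
        pc += 1
        if opcode == PUSH_ARG:
            index = bytecode[pc]  # one operand byte: which argument to push
            pc += 1
            stack.append(args[index])
        elif opcode == ADD:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append((lhs + rhs) & 0xFFFFFFFF)  # 32-bit wrap-around
        elif opcode == RET:
            return stack.pop()
        else:
            raise ValueError("unknown opcode 0x%02x at offset %d" % (opcode, pc - 1))
    raise ValueError("byte code ended without RET")


# push arg 0, push arg 1, add, return
program = [0x01, 0, 0x01, 1, 0x02, 0x03]
result = run_stack_vm(program, [0xAFFE1, 0xBABE5])  # 0x16ABC6
```

A real VM handler loop has the same shape (fetch, dispatch, execute), only with many more handlers and usually with obfuscated handler code.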
One solution to this obfuscation would be for the reverse engineer to reverse the interpreter and then interpret the byte code by hand. Because this is time-consuming, we will instead try to reverse the binary with the VMAttack plugin.
VMAttack's static analysis functionality is enabled by default. The dynamic analysis capabilities, however, require an extra step. Since we want to use both the static and the dynamic capabilities for this demo, we first need to enable the dynamic functionality of VMAttack. This is done by either generating an instruction trace dynamically or loading an instruction trace from file. Trace generation is automatic and upon completion it will produce a success notification in **IDA**'s _output window_. Traversed paths will be colored in a shade of blue, where a darker shade represents a higher number of traversals. A trace loaded from file, in contrast, only produces the success notification in **IDA**'s _output window_.
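The coloring scheme can be pictured as a mapping from traversal counts to shades of blue. The following is a minimal sketch under assumptions: colors are encoded as `0xBBGGRR` integers, the scale is linear, and the function names are hypothetical. The plugin's actual color scale may differ.

```python
from collections import Counter


def shade_for_hits(hits, max_hits):
    """Map a traversal count to a 0xBBGGRR blue shade; more hits give a darker shade."""
    if hits <= 0 or max_hits <= 0:
        return 0xFFFFFF  # never traversed: plain white
    ratio = min(hits, max_hits) / float(max_hits)
    # Red and green shrink as the ratio grows; the blue channel stays at 0xFF.
    light = int(round(0xE0 * (1.0 - ratio)))
    return (0xFF << 16) | (light << 8) | light


def color_map(trace_addresses):
    """Return {address: color} for every address that occurs in the trace."""
    counts = Counter(trace_addresses)
    max_hits = max(counts.values()) if counts else 0
    return dict((ea, shade_for_hits(n, max_hits)) for ea, n in counts.items())


# Toy trace: 0x401000 is executed twice, 0x401005 once.
colors = color_map([0x401000, 0x401000, 0x401005])
```

In this toy trace the address executed twice receives the darkest shade and the address executed once a lighter one.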
With the newly generated/loaded trace we now have dynamic and static capabilities enabled and can start the _grading system analysis_. Starting with the _grading analysis_ is usually a good fit, since it is automated and takes several analysis capabilities into account. This enables a **cumulative** result which can even tolerate analysis errors to some extent and still produce good results. At the end of the grading analysis the now graded trace will be presented in the **grading viewer**. The trace can now be filtered either by double clicking a grade or via the context menu, where the user will be prompted to input the grade threshold to display.
In the case of addvmp it is enough to select the highest grade to be presented with the deobfuscated function (since the original function is quite simple in this case). It becomes obvious that the two values passed over the stack are added together. Additionally, should the result not be satisfactory, the user can change the importance of an analysis function (**see settings**) or even disable it (by setting the importance to 0) to produce better results. Simply change the importance and re-run the grading analysis.
![alt text](screenshots/grading4.png "Grading Success")
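The cumulative nature of the grading can be sketched as a weighted vote: each analysis marks the trace lines it considers relevant, and each mark adds the importance of that analysis to the line's grade. The code below is an illustrative sketch, not the plugin's implementation; the analysis names, the toy trace and both function names are assumptions.

```python
def grade_trace(trace, analyses, importance):
    """Return one cumulative grade per trace line.

    analyses:   {name: function(trace) -> iterable of relevant line indexes}
    importance: {name: weight}; a weight of 0 disables the analysis.
    """
    grades = [0] * len(trace)
    for name, analysis in analyses.items():
        weight = importance.get(name, 1)
        if weight == 0:
            continue  # disabled via importance 0
        for index in analysis(trace):
            grades[index] += weight
    return grades


def filter_by_grade(trace, grades, threshold):
    """Keep only the trace lines whose grade reaches the threshold."""
    return [line for line, grade in zip(trace, grades) if grade >= threshold]


# Toy trace and stand-in analyses (hypothetical results).
trace = ["push ebp", "mov eax, [ebp+8]", "add eax, [ebp+0xC]", "jmp dispatcher"]
analyses = {
    "input_output": lambda t: [1, 2],
    "clustering": lambda t: [2],
    "optimization": lambda t: [0, 2],
}
importance = {"input_output": 2, "clustering": 1, "optimization": 1}

grades = grade_trace(trace, analyses, importance)        # [1, 2, 4, 0]
best = filter_by_grade(trace, grades, max(grades))       # ["add eax, [ebp+0xC]"]
```

Because the grade is a sum, a single analysis that misjudges a line shifts its grade only by that analysis's weight, which is why a cumulative result can tolerate some analysis errors.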
Let's assume we have a more complicated function and the _grading analysis_ did not lead us to the relevant instructions.
One of the _semi-automated analysis_ capabilities could present a viable alternative or even show us which analysis function caused the grading system to fail.
The _input/output analysis_ could provide leads as to how the input arguments of the VM function are used and whether there is a connection between function input and function output. By checking the two input values `AFFE1` and `BABE5` and additionally the output value `16ABC6` it becomes evident which register contains the important instructions for our obfuscated function and how the `eax` return value came to be `16ABC6` (=AFFE1+BABE5).
![alt text](screenshots/InputOutput3.png "Input/Output Success")
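The arithmetic behind this observation is easy to verify, and the idea of following input and output values through a trace can be sketched in a few lines. The trace format below, `(disassembly, written register, new value)`, and the function name `lines_touching` are assumptions made for the example, not the plugin's data model.

```python
ARG_A = 0xAFFE1
ARG_B = 0xBABE5
EXPECTED = (ARG_A + ARG_B) & 0xFFFFFFFF  # 0x16ABC6, the value returned in eax


def lines_touching(trace, values):
    """Return indexes of trace lines that write one of the watched values."""
    wanted = set(values)
    return [i for i, (_, _, value) in enumerate(trace) if value in wanted]


# Toy trace: (disassembly, register written, value after execution)
trace = [
    ("mov eax, [ebp+8]", "eax", 0xAFFE1),
    ("xor edx, edx", "edx", 0x0),
    ("mov ecx, [ebp+0xC]", "ecx", 0xBABE5),
    ("add eax, ecx", "eax", 0x16ABC6),
]

relevant = lines_touching(trace, [ARG_A, ARG_B, EXPECTED])  # [0, 2, 3]
```

Lines that never touch an input or the output (here the `xor`) drop out, which is the kind of lead the input/output view provides.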
Another powerful functionality is the _clustering analysis_. It enables the reverse engineer to quickly distinguish repeating instructions from unique ones. The _clustering analysis view_ additionally allows quick removal of unnecessary clusters (or instructions), which speeds up the work of the reverse engineer. Should a mistake be made, it can be undone, or alternatively the original trace can be restored. Making sense of the clustering result usually requires an extensive analysis of the trace and can require repeating the clustering analysis with a different cluster heuristic value set via **settings**.
![alt text](screenshots/clustering1.png "Input/Output Viewer")
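One simple way to separate repeating from unique instructions is to group trace lines by address and treat every address seen at least a given number of times as a cluster. This is a sketch of that idea only; the plugin's real heuristic is not described here, and `min_repeats` merely stands in for the cluster heuristic value from the settings.

```python
from collections import OrderedDict


def cluster_trace(trace, min_repeats=2):
    """Split a trace into repeated-address clusters and unique lines.

    trace: list of (address, disassembly) tuples in execution order.
    Returns (clusters, unique) where clusters maps address -> list of
    occurrences and unique is a list of (address, disassembly).
    """
    by_address = OrderedDict()
    for address, disasm in trace:
        by_address.setdefault(address, []).append(disasm)
    clusters = OrderedDict(
        (addr, lines) for addr, lines in by_address.items() if len(lines) >= min_repeats
    )
    unique = [
        (addr, lines[0]) for addr, lines in by_address.items() if len(lines) < min_repeats
    ]
    return clusters, unique


# Toy trace: the dispatcher at 0x1000/0x1003 repeats, the add runs once.
trace = [
    (0x1000, "movzx eax, byte ptr [esi]"),
    (0x1003, "jmp dword ptr [table+eax*4]"),
    (0x2000, "add eax, ecx"),
    (0x1000, "movzx eax, byte ptr [esi]"),
    (0x1003, "jmp dword ptr [table+eax*4]"),
]
clusters, unique = cluster_trace(trace)
```

In this toy trace the dispatcher lines end up in clusters while the single `add` remains as a unique instruction, mirroring how VM boilerplate tends to repeat and the payload does not.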
Of the semi-automatic analyses, the _optimization analysis_ requires the most user interaction. In return it enables:
- Optimizations which make the trace easier to read or even filter out instructions recognized as unnecessary.
- Filtering capabilities to remove instructions recognized as unnecessary, or even whole registers, from the trace.
- Undoing actions if you made a mistake.
- Restoring the initial trace if you hit a wall.
![alt text](screenshots/optimizations_success.png "Input/Output Viewer")
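The filter, undo and restore workflow listed above can be modeled with a history stack. The class below is a minimal sketch with a hypothetical name (`TraceEditor`) and a trace reduced to plain strings; it illustrates the interaction pattern rather than the plugin's viewer.

```python
class TraceEditor(object):
    """Apply filters to a trace with undo and restore support."""

    def __init__(self, trace):
        self._original = list(trace)  # kept untouched for restore()
        self._current = list(trace)
        self._history = []            # previous states, newest last

    @property
    def lines(self):
        return list(self._current)

    def apply(self, keep):
        """Keep only the lines for which keep(line) is true."""
        self._history.append(self._current)
        self._current = [line for line in self._current if keep(line)]

    def undo(self):
        """Revert the most recent filter, if there is one."""
        if self._history:
            self._current = self._history.pop()

    def restore(self):
        """Go back to the initial trace and forget the history."""
        self._history = []
        self._current = list(self._original)


editor = TraceEditor(["push ebp", "nop", "add eax, ecx", "nop"])
editor.apply(lambda line: line != "nop")    # drop instructions seen as unnecessary
editor.apply(lambda line: "eax" in line)    # keep only lines that mention eax
```

Calling `editor.undo()` afterwards brings back the state before the `eax` filter, and `editor.restore()` returns to the four original lines.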
The _static analysis_ in this case would enable us to analyze the byte code and optionally view the analysis as an _abstract VM graph_ of the byte code. The static deobfuscation of the byte code produces comments next to the relevant bytes, describing the operation each byte performs. The commented instructions are quite intuitive and should be easy to read.
The abstract VM graph in turn will produce a control flow graph (in the case of addvmp just one basic block) filled with those abstract instructions from the byte code. This is also a good example of the accuracy of the static analysis, which, without executing anything, delivered an accurate representation of the original, unobfuscated function. After the static analysis we can clearly see that two arguments were passed to the function _(AOS = access out of known space; indicates for example arguments passed via stack)_ and that they were eventually added together and then returned.
![alt text](screenshots/ab_vm_graph.png "Abstract VM Graph")
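Static deobfuscation of byte code amounts to decoding it without running it. The sketch below decodes a toy byte code into commented abstract instructions; the opcode table and the mnemonics (`vPush`, `vAdd`, `vRet`) are invented for this example and are not the abstract instruction set VMAttack emits.

```python
def disassemble(bytecode):
    """Decode toy byte code into (offset, comment) pairs without executing it."""
    # Hypothetical opcode table: opcode -> (comment template, operand bytes)
    opcodes = {
        0x01: ("vPush AOS[%d]", 1),
        0x02: ("vAdd", 0),
        0x03: ("vRet", 0),
    }
    listing = []
    pc = 0
    while pc < len(bytecode):
        opcode = bytecode[pc]
        if opcode not in opcodes:
            raise ValueError("unknown opcode 0x%02x at offset %d" % (opcode, pc))
        template, operand_count = opcodes[opcode]
        operands = tuple(bytecode[pc + 1:pc + 1 + operand_count])
        if len(operands) != operand_count:
            raise ValueError("truncated instruction at offset %d" % pc)
        comment = template % operands if operand_count else template
        listing.append((pc, comment))
        pc += 1 + operand_count
    return listing


listing = disassemble([0x01, 0, 0x01, 1, 0x02, 0x03])
# [(0, "vPush AOS[0]"), (2, "vPush AOS[1]"), (4, "vAdd"), (5, "vRet")]
```

Since this toy program has no branches, the whole listing would form a single basic block in a control flow graph, matching the one-block graph described for addvmp.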
##Analysis Capabilities
The following subsection describes the analysis functions and offers additional information about the plugin's inner workings. A quick start guide can be found in the previous section.