x86 differently: VINE and LLVM-klee
The power of intermediate languages

That's a Pentium I from 1993... and things have only grown more complex since.
x86 for RCE isn't that kewl: you have to cope with CISC, only about 7 usable general-purpose registers, complex memory access operands, flags, condition codes, model-specific registers... Even if you know your stuff, it's not desirable to spend a hell of a lot of time reading and hacking x86 while other people have relationships and party hard.
The goal of the following approaches is therefore to let even you party again and to free you from the chains of complexity that make you spend weeks on reversing. Reverse differently: with new approaches that translate x86 into intermediate languages or even real languages. Manipulating the result afterwards looks promising.
And I'm not really talking about firing up IDA and HexRays to get the stuff done. I'm talking about a way to gain specific insight and to perform manipulations. Surely some commercial software out there lets you do all of that. But it neither contributes to the whole, nor is it really affordable or as flexible as it could be if it were open. If this is half as flexible as I think it is, it's like Vesakh, Mani Rimdu and birthday all at once ;). If you know what I mean.
VEX and VINE
VEX is an intermediate language, RISC-like, and for that reason alone much easier to understand. It comes from the Valgrind project, and for a couple of days now there has been VINE, from the BitBlaze project. In fact VINE translates x86 to VINE IL and lets you query for specific information. One particular highlight deserves to be quoted:
Translate our IL to C, and then compile back down to an executable.
At 5 there's an example walk-through.
VINE alone wouldn't be too exciting if you weren't able to automate these steps for your analysis. If you get VINE to turn your binary back into a C program, maybe one can use KLEE to run specific high-level coverage tests and get run-time traces instead of fuzzing?
My time is kind of limited again at the moment, but I'll dive into it next week after my exams.
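To make the "RISC-like" point a bit more concrete, here is a hand-written sketch (not actual VINE or VEX output) of what a single CISC instruction, add [ebx+ecx*4+8], eax, looks like once every step and every flag update is spelled out on its own line. It is written as C, in the spirit of the IL-to-C idea. The cpu_t struct, the toy memory and all function names are made up for illustration, and only three of the flags are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical machine state for this sketch; not a VINE data structure. */
typedef struct {
    uint32_t eax, ebx, ecx;
    uint8_t  cf, zf, sf;   /* carry, zero, sign (other flags left out) */
    uint8_t  mem[64];      /* toy flat memory */
} cpu_t;

/* Little-endian 32-bit load from the toy memory. */
uint32_t load32(const cpu_t *c, uint32_t a) {
    assert(a <= sizeof c->mem - 4);
    return (uint32_t)c->mem[a]
         | (uint32_t)c->mem[a + 1] << 8
         | (uint32_t)c->mem[a + 2] << 16
         | (uint32_t)c->mem[a + 3] << 24;
}

/* Little-endian 32-bit store to the toy memory. */
void store32(cpu_t *c, uint32_t a, uint32_t v) {
    assert(a <= sizeof c->mem - 4);
    for (int i = 0; i < 4; i++)
        c->mem[a + i] = (uint8_t)(v >> (8 * i));
}

/* One x86 instruction, add [ebx+ecx*4+8], eax,
 * written as one simple operation per line. */
void add_mem_eax(cpu_t *c) {
    uint32_t t1   = c->ecx * 4;        /* scaled index           */
    uint32_t t2   = c->ebx + t1;       /* base + index           */
    uint32_t addr = t2 + 8;            /* + displacement         */
    uint32_t t3   = load32(c, addr);   /* read memory operand    */
    uint32_t t4   = t3 + c->eax;       /* the actual addition    */
    store32(c, addr, t4);              /* write result back      */
    c->cf = (uint8_t)(t4 < t3);        /* unsigned wrap => carry */
    c->zf = (uint8_t)(t4 == 0);
    c->sf = (uint8_t)(t4 >> 31);
}
```

Calling add_mem_eax on a cpu_t with ebx = 4, ecx = 2 and eax = 1 touches the 32-bit word at offset 20. The hidden side effects of the original instruction (address arithmetic, memory access, flags) all become ordinary statements you can grep, rewrite or hand to another tool.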
It's not specifically in the docs, but the subdirectory utils contains irtrans, whose --help output includes:
-to-debug-c   Output program in C syntax with extra debugging statements
-to-c         Output program in C syntax
I do not expect VINE to "decompile" any kind of x86 binary into something directly compilable with GCC. But for some tests, if that works, this could be awesome. KLEE would definitely reach the code paths, and I could mark input variables as symbolic within functions. So it's not just fuzzing with inputs as complex as possible. It's deeper. You'd get a function trace.
I didn't study anything related to design. This is good. Believe it!
In any case you gain much deeper insight. Querying STP will save time, and decompiling from IL to C can help with finding vulnerabilities. And I'll get rid of the abandoned valgrind-catchconv project, which needs an outdated libc.
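Marking an input as symbolic could look roughly like the following sketch. The klee_make_symbolic call is KLEE's real intrinsic from klee/klee.h; everything else is an assumption for illustration: check_serial is a hand-written stand-in for a function that came out of an x86 -> IL -> C translation (it is not VINE output), and USE_KLEE is a hypothetical build flag so that the same file also compiles without KLEE installed.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_KLEE              /* hypothetical flag, e.g. passed as -DUSE_KLEE */
#include <klee/klee.h>
#endif

/* Stand-in for a lifted function. Random fuzzing would rarely get past
 * all three checks; a symbolic executor solves them as constraints. */
int check_serial(const unsigned char *buf, size_t len) {
    assert(buf != NULL);                         /* precondition */
    if (len != 4) return 0;
    if (buf[0] != 'V') return 0;
    if ((buf[1] ^ 0x5a) != 0x13) return 0;
    if (buf[2] + buf[3] != 0x93) return 0;
    return 1;                                    /* the interesting path */
}

#ifdef USE_KLEE
/* Harness: only the 4 input bytes are symbolic, nothing else. */
int main(void) {
    unsigned char input[4];
    klee_make_symbolic(input, sizeof input, "input");
    return check_serial(input, sizeof input);
}
#endif
```

Compiled to LLVM bitcode with the flag set, KLEE should explore each branch of check_serial and emit one test case per path, including one whose bytes reach the final return 1. Without the flag the file is plain C, and the function can be called with concrete input such as "VINE".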
Other stuff
- I recently got a tip to look at llvm-qemu for dynamic binary translation. Sounds like a very interesting approach. However, with the focus on KLEE, marking variables as symbolic in LLVM bytecode is of course ... not nice.
- Manipulations and checks directly within VINE IL or LLVM IR will surely be done.
- Automation... I recently read a sweet-as dissertation on automating exploit development with specific algorithms. What if I can find a vulnerability with this x86 -> VINE IL -> C -> LLVM-GCC -> KLEE approach and even automate the exploit development ;). Okay... nerd fantasy. But hey... we need visionaries. Don't we?
Yes... I'm realistic. Still.
Update
I found this very interesting paper about the llvm-qemu approach for "Selective Symbolic Execution" with LLVM-klee.
Have fun,
wishi
