Reverse Engineering Software

This is my first blog post, so I thought I’d talk about reverse engineering software. I’ll use the abbreviation “RE” from now on to save space.

Why is it useful?

There are a variety of reasons you might want to RE a piece of proprietary software:

  • To achieve interoperability with a proprietary library, and be able to use it in your own programs
  • To modify its behaviour (to mod a game, for example)
  • To break its copy protection or DRM
  • For fun!

How do I start?

You’ll need to know the assembly language of the platform your executable targets, the platform’s ABI (Application Binary Interface), and preferably the language you think the executable was written in.

The ABI defines low-level details like how arguments are passed to functions, or where execution starts. It depends on your OS and CPU architecture. For example, i386/amd64 Linux follows the System V ABI, and by convention the entry point of an executable there is a symbol called _start.
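To make "where execution starts" a bit more concrete: on Linux the entry address is recorded in the executable's ELF header, in a field called e_entry, and _start is normally the symbol that sits at that address. Here is a minimal sketch that pulls the field out of a file image already loaded into memory. The function name read_elf_entry is my own invention, and the code assumes a little-endian ELF file.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads e_entry (the address where execution starts) from an ELF header.
// Returns false if the image is not a little-endian ELF file.
// read_elf_entry is a hypothetical helper, not part of any library.
bool read_elf_entry(const std::vector<std::uint8_t>& image, std::uint64_t& entry) {
    // e_ident starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if (image.size() < 6 || image[0] != 0x7f || image[1] != 'E' ||
        image[2] != 'L' || image[3] != 'F') {
        return false;
    }
    // EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    if (image[4] != 1 && image[4] != 2) {
        return false;
    }
    // EI_DATA: 1 = little-endian (the only case handled here).
    if (image[5] != 1) {
        return false;
    }
    const std::size_t width = (image[4] == 2) ? 8 : 4;
    // e_entry follows e_ident (16), e_type (2), e_machine (2), e_version (4).
    const std::size_t offset = 24;
    if (image.size() < offset + width) {
        return false;
    }
    entry = 0;
    for (std::size_t i = 0; i < width; ++i) {
        entry |= static_cast<std::uint64_t>(image[offset + i]) << (8 * i);
    }
    return true;
}
```

A disassembler does this for you, but it is handy to know where the number comes from when a binary has been stripped of its symbols.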

Telling a C executable from a C++ one is usually easy: the C++ executable will probably have its symbols mangled to support features like function overloading, so the names carry extra type information. How they are mangled depends on the compiler’s ABI, which is why linkers generally can’t link C++ libraries built with an incompatible compiler. For example, with g++ 3.0 or later, a function declared as void foo(int) becomes “_Z3fooi” (main itself is not mangled). A C executable has no need to mangle its names. Other languages mangle names as well, so check whether any C++ standard library symbols are left in the binary, and look for information about the program online just to be sure.
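If you run into mangled names and want to read them, GCC and Clang expose their demangler through abi::__cxa_demangle in <cxxabi.h>. The wrapper below is a small sketch around it. The name demangle is my own, and none of this applies to MSVC, which uses a different mangling scheme.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang)
#include <cstdlib>   // std::free
#include <string>

// Returns the readable form of an Itanium-ABI mangled name,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments the result is malloc'd and must be freed.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

Calling demangle("_Z3fooi") gives back "foo(int)": the 3 is the length of the name and the trailing i encodes the int parameter.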

If you’re willing to pay the money and learn how to use them, there are quite a few professional RE suites out there. IDA Pro is the most popular one, costs up to $1000, and has the following features:

  • Extensibility and scripting through Python (anything from automating tasks to writing full-blown loaders for custom binary types)
  • A type system which allows you to define structs and enums, complete with a (C only) header file parser which basically lets you #include them
  • Different views which make the flow of the code easier to follow
  • And many more

There is also an outdated freeware version available. It’s Windows-only but should work under Wine. If you don’t want that and don’t have the cash for the full version (and don’t want to pirate it like I did), a simple disassembler and a text editor are all you need.

OK, what now?

You have to learn how different high-level constructs are implemented in assembly, and be able to recognise them.

With C, there isn’t much to it. Variables typically live in registers or stack space, though the compiler is free to move them around when necessary. Structs are just a collection of variables grouped together, usually padded to account for memory alignment. As mentioned earlier, how function arguments are passed depends on the ABI.
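Padding is worth seeing once, because it explains the unused gaps between field offsets you will notice in disassembly. The struct below is a made-up example (Record and its fields are my own names), and the exact numbers depend on your compiler and platform.

```cpp
#include <cstddef>  // offsetof, std::size_t
#include <cstdint>

// A made-up struct with members of different sizes.
struct Record {
    char          tag;    // 1 byte
    std::uint32_t value;  // must start on an address suitable for uint32_t
    char          flag;   // 1 byte, usually followed by tail padding
};

// Size the struct would have if the members were packed back to back.
constexpr std::size_t kPackedSize =
    sizeof(char) + sizeof(std::uint32_t) + sizeof(char);

// Number of bytes the compiler inserted for alignment.
constexpr std::size_t padding_bytes() {
    return sizeof(Record) - kPackedSize;
}

// Byte offset at which `value` actually lives inside the struct.
constexpr std::size_t value_offset() {
    return offsetof(Record, value);
}
```

On a typical amd64 Linux build I would expect value to sit at offset 4 and the whole struct to take 12 bytes, but treat that as an expectation to check rather than a guarantee.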

With C++, it gets a lot more complicated because the language is so large. With OOP, objects are generally just structs holding the member variables, and a pointer to the object is passed as a hidden first argument (the “this” parameter) to its member functions, which operate on the object through it. A constructor call also receives a this pointer, which tells it where the object should be created.
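Here is a small sketch of that idea, with a member function next to the plain function it roughly compiles down to. Counter, Counter_add and Counter_construct are names I made up for illustration, and a real compiler’s output will differ in the details.

```cpp
#include <new>  // placement new

struct Counter {
    int count;

    explicit Counter(int start) : count(start) {}

    // Written as a member function in the source.
    int add(int amount) {
        count += amount;
        return count;
    }
};

// Roughly what Counter::add looks like after compilation: an ordinary
// function that takes the object pointer as its first argument.
int Counter_add(Counter* self, int amount) {
    self->count += amount;
    return self->count;
}

// The constructor is likewise handed the address where the object should
// be built. Placement new makes that address explicit.
Counter* Counter_construct(void* storage, int start) {
    return new (storage) Counter(start);
}
```

When you see a function that immediately dereferences its first argument at fixed offsets, it is a good hint that you are looking at a member function and that the argument is the object.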

Once you think you understand these kinds of things, all that’s left is to look at the disassembly and try to make sense of it. If you just want to modify a single piece of behaviour, you don’t have to dig through the entire code base: you can find the function you need by hooking an API it is likely to call (for example, a rendering function will probably call a draw function) and seeing where the calls come from. I like to write out the code as it would have looked in the original language, since that makes it much clearer, but you’ll probably want to annotate the disassembly as well so it’s easier to read in the future.
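The hooking idea can be sketched in ordinary code. Everything below is a toy stand-in: real_draw plays the API, g_draw plays the import table slot the program calls through, and render_frame plays the function you are hunting for. In a real target you would patch the actual import or set a breakpoint, then look at the return address to see who called.

```cpp
#include <vector>

// Hypothetical stand-in for an API function the target calls.
using DrawFn = int (*)(int sprite_id);

int real_draw(int sprite_id) { return sprite_id * 2; }

// Stand-in for the import table slot the program calls through.
DrawFn g_draw = real_draw;

DrawFn g_original = nullptr;  // saved pointer to the real function
std::vector<int> g_seen;      // arguments observed by the hook

// The hook records the call, then forwards it so behaviour is unchanged.
int hooked_draw(int sprite_id) {
    g_seen.push_back(sprite_id);
    return g_original(sprite_id);
}

// Swaps the slot to point at the hook. Safe to call more than once.
void install_hook() {
    if (g_draw != hooked_draw) {
        g_original = g_draw;
        g_draw = hooked_draw;
    }
}

// Stand-in for the unknown function we want to locate.
int render_frame() { return g_draw(7); }
```

After install_hook(), every call to render_frame() still returns what it did before, but g_seen now tells you the API was reached and with which arguments.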

I hope I helped, and if you have any questions or corrections for me, feel free to comment 🙂