HSCTechnicalWiki



Blackhat Thinking Your Software



Abstract

Understanding how a program written in a high-level language such as C, C++ or Java is actually 'seen' at the machine code level is often very useful: it builds an internal understanding of how the program really works, what to optimize, and potential interaction issues with the hosting OS that a source-level debugging session may not reveal. This article discusses how a user with malicious intent may defeat your efforts to protect your own code, without having access to the source.

READ-THIS: Please note that reverse-engineering 3rd party products may be illegal, depending on their licensing terms. To learn how to protect your own software, it is advised that you write your own source code, compile it and then try to break it, so that you know how to make your own code stronger.

Fundamentals of Copy-Protection

Copy-Protection is the art of securing your binaries so that users cannot alter your executable in ways you did not intend. As an example, you may have attached a time limit after which your trial expires, or you may require the user to purchase a license key before the program can be used.

Two Tenets of Licensing

Whenever you are protecting your code via licensing, there are two critical aspects to consider:

  • The algorithm of the licensing must be sufficiently complex
  • The invocation of the licensing routine must be sufficiently obfuscated

These days, 3rd party licensing software is fairly robust, but there are still many programmers who write their own. They usually spend a lot of time on complex licensing schemes but forget that the weakest link is the way in which the program uses the licensing result to decide whether to continue or not.

Example source code

Consider this source code:

    #include <stdio.h>

    int checklicense()
    {
        // some very complicated licensing code here
        return 0;
    }

    int main(void)
    {
        int result;

        result = checklicense();
        if (!result) {
            printf("Too bad, your software is not licensed\n");
        } else
            printf("Welcome to the rest of the program.\n");
        return 0;
    }

Analysis

In this source code, the programmer has invented a fantastic new checklicense() routine that really does a good job of checking the program license, but has implemented a really primitive mechanism for invoking it and deciding whether to continue. The very complex licensing code performs all its checks and returns 0 if the license is not validated, or 1 if it is validated.

The Output

Let us assume that the programmer has distributed only the executable, as protect.exe. When a user runs this program, this is what he sees:

    [arc@arc:~/fiddle] $ ./protect.exe
    Too bad, your software is not licensed
    [arc@arc:~/fiddle] $

Perfect. This is what was intended.

The Attack

Let us now discuss how easy it is to break this protection. As before, only the executable, protect.exe, is available. (For this example, I am using a cygwin environment running on XP, hence the added '.exe'.)

The malicious user first fires up gdb on the executable:

    [arc@arc:~/fiddle] $ gdb protect.exe
    GNU gdb 6.3.50_2004-12-28-cvs (cygwin-special)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i686-pc-cygwin"...(no debugging symbols found)

    (gdb)

Next, the malicious user disassembles the main() function to see what is going on. Note that I left the symbol table intact in the generated .exe, just for better explanation. If you do not have a symbol table, you can always disassemble starting from the address offset of main(). Alternatively, you can start execution in step mode and, when you see the license error, do a backtrace to see the calling function.

    (gdb) disass main
    Dump of assembler code for function main:
    0x0040105a <main+0>:    push   %ebp
    0x0040105b <main+1>:    mov    %esp,%ebp
    0x0040105d <main+3>:    sub    $0x18,%esp
    0x00401060 <main+6>:    and    $0xfffffff0,%esp
    0x00401063 <main+9>:    mov    $0x0,%eax
    0x00401068 <main+14>:   add    $0xf,%eax
    0x0040106b <main+17>:   add    $0xf,%eax
    0x0040106e <main+20>:   shr    $0x4,%eax
    0x00401071 <main+23>:   shl    $0x4,%eax
    0x00401074 <main+26>:   mov    %eax,0xfffffff8(%ebp)
    0x00401077 <main+29>:   mov    0xfffffff8(%ebp),%eax
    0x0040107a <main+32>:   call   0x4010c0 <_alloca>
    0x0040107f <main+37>:   call   0x401150 <__main>
    0x00401084 <main+42>:   call   0x401050 <checklicense>
    0x00401089 <main+47>:   mov    %eax,0xfffffffc(%ebp)
    0x0040108c <main+50>:   cmpl   $0x0,0xfffffffc(%ebp)
    0x00401090 <main+54>:   jne    0x4010a0 <main+70>
    0x00401092 <main+56>:   movl   $0x402000,(%esp)
    0x00401099 <main+63>:   call   0x401160 <printf>
    0x0040109e <main+68>:   jmp    0x4010ac <main+82>
    0x004010a0 <main+70>:   movl   $0x402028,(%esp)

The interesting part is these lines:

    0x00401084 <main+42>:   call   0x401050 <checklicense>
    0x0040108c <main+50>:   cmpl   $0x0,0xfffffffc(%ebp)
    0x00401090 <main+54>:   jne    0x4010a0 <main+70>

These assembly lines, together with the mov at <main+47> that stores the return value, represent our original source code as follows:

    result = checklicense()  ==>  call 0x401050 <checklicense>
                                  mov  %eax,0xfffffffc(%ebp)

    if (!result)             ==>  cmpl $0x0,0xfffffffc(%ebp)
                                  jne  0x4010a0 <main+70>
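To make the mapping concrete, the same control flow can be written in C with explicit jumps. This is only an illustrative sketch: license_message() is a hypothetical helper that is not part of the original program, and the assumption that 0x402000 and 0x402028 are the addresses of the two message strings is inferred from the disassembly rather than confirmed.

```c
/* Hypothetical helper (not in the original program): returns the message
   that main() would print for a given checklicense() result, following the
   same branch structure as the disassembly. */
const char *license_message(int result)
{
    /* cmpl $0x0,0xfffffffc(%ebp) ; jne 0x4010a0 <main+70> */
    if (result != 0)
        goto licensed;

    /* Fall-through path at <main+56>: the body of if (!result).
       Assumed to load the "not licensed" string from 0x402000. */
    return "Too bad, your software is not licensed\n";

licensed:
    /* Jump target <main+70>: the else branch.
       Assumed to load the "welcome" string from 0x402028. */
    return "Welcome to the rest of the program.\n";
}
```

With result equal to 0, the jump is not taken and the failure message is selected. The whole licensing decision hangs on that one conditional jump.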

Looking at the program disassembly, it is clear what happens when the licensing code fails verification: checklicense() returns 0, the comparison with 0 succeeds, so the "jne 0x4010a0" is not taken and execution falls through to the printf of the error message. Instead of trying to decipher the licensing code, how about just negating the comparison?

That is, change jne 0x4010a0 to je 0x4010a0. Worth a try, eh?

The first thing we need to do is figure out which machine code bytes map to the 'jne 0x4010a0' instruction. To do that, we can use the gdb "x" (examine) command:

    (gdb) x/6xb 0x401090
    0x401090 <main+54>:     0x75    0x0e    0xc7    0x04    0x24    0x00

Here, we have told gdb to display 6 bytes in hexadecimal starting at 0x401090, which is the location of the instruction we are trying to change. We got this address from the left-hand column of the disass output, which shows the address of each instruction.

Note that depending on the platform you are using, gdb may allow you to change the byte directly in memory by doing something like set {char}0x401090 = 0x74. However, in my case, the code segment was protected and did not allow user-level read/write even through gdb, so I had to resort to changing the bytes with a hex editor.

So anyway, I now have the hex codes that represent the instruction: 0x75 0x0e 0xc7 0x04 0x24 0x00. The first two bytes (0x75 0x0e) are the jne itself, an opcode followed by a one-byte displacement; the remaining four are the start of the movl that follows it. If you were to view protect.exe using any binary file editor/viewer (hexedit, bview, od etc.), somewhere in the exe you would find these exact bytes.
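As a sanity check that 0x75 0x0e really is the 'jne 0x4010a0' shown by gdb, the jump target can be recomputed from the bytes. On x86, a short conditional jump is two bytes long and its second byte is a signed displacement relative to the address of the next instruction. The sketch below assumes exactly that encoding; short_jump_target() is a hypothetical name chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: computes the target of a 2-byte short jump located
   at insn_addr whose second byte is disp. The displacement is a signed
   8-bit value added to the address of the following instruction. */
uint32_t short_jump_target(uint32_t insn_addr, uint8_t disp)
{
    return insn_addr + 2u + (uint32_t)(int32_t)(int8_t)disp;
}
```

For the instruction at 0x401090 with displacement 0x0e, this gives 0x401092 + 0x0e = 0x4010a0, which matches the address in the disassembly.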

The next step is to change it. But before that, we need to know the opcode for "je". We saw that the opcode for jne is 0x75 (the first byte). The opcode for je is 0x74. How do I know? Well, as an example, you can write a small instruction like 'je 0000', do a 'nasm -a' on it and od the resulting binary to see the opcode.
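The je/jne pair is not a coincidence. In the x86 encoding, the short conditional jumps occupy opcodes 0x70 to 0x7f, and each condition sits next to its negation, so flipping the lowest bit of the opcode inverts the test. A minimal sketch of that rule follows; invert_short_jcc() is a hypothetical helper, and it covers only the one-byte short-jump opcodes.

```c
#include <stdint.h>

/* Hypothetical helper: if opcode is a short conditional jump (0x70..0x7f),
   stores the opcode of the negated condition in *out and returns 1.
   Otherwise returns 0 and leaves *out untouched. */
int invert_short_jcc(uint8_t opcode, uint8_t *out)
{
    if (opcode < 0x70 || opcode > 0x7f)
        return 0;
    *out = (uint8_t)(opcode ^ 0x01);  /* low bit selects the negated condition */
    return 1;
}
```

Applied to 0x75 (jne) this yields 0x74 (je), and vice versa. The displacement byte that follows the opcode stays the same.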

I now load up my favourite binary editor, search for the hex string 0x75 0x0e 0xc7 0x04 0x24 0x00 and replace 75 with 74. This is exactly where a larger search string helps: it ensures that we do not change some other part of the code, because a shorter string may occur in multiple places.
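The point about search string length can be shown with a small sketch that works on an in-memory copy of the bytes instead of a hex editor. The function names find_unique() and patch_first_byte() are hypothetical, and the code assumes you are experimenting on a binary you built yourself.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of pat in buf if it occurs exactly once,
   -1 if it does not occur, and -2 if it occurs more than once. */
long find_unique(const unsigned char *buf, size_t len,
                 const unsigned char *pat, size_t plen)
{
    long found = -1;

    if (plen == 0 || plen > len)
        return -1;
    for (size_t i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, pat, plen) == 0) {
            if (found >= 0)
                return -2;  /* ambiguous: a longer pattern is needed */
            found = (long)i;
        }
    }
    return found;
}

/* Overwrites the first byte of the pattern, but only when the pattern
   matches at exactly one place. Returns 0 on success, -1 otherwise. */
int patch_first_byte(unsigned char *buf, size_t len,
                     const unsigned char *pat, size_t plen,
                     unsigned char new_byte)
{
    long off = find_unique(buf, len, pat, plen);

    if (off < 0)
        return -1;
    buf[off] = new_byte;
    return 0;
}
```

Searching for the single byte 0x75 would typically be rejected as ambiguous, whereas the full six-byte string is far more likely to match in only one place.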

Finally, I run the program again.

    [arc@arc:~/fiddle] $ ./protect.exe
    Welcome to the rest of the program.
    [arc@arc:~/fiddle] $

License code broken.

Conclusion

Securing a program should be an exercise in which you go beyond the source level and think like the intruder you are trying to protect against. Not having the source is a deterrent, but those proficient with debuggers and assembly can achieve a lot without it.

This was a very simple example of negating badly coded decision logic. However, I hope it helps the reader see the benefit of understanding code at a level deeper than the high-level source.

maintained by:arjun@hsc.com

Categories: Security Software


Page last modified on September 17, 2009, at 07:40 AM