Abstract
Before you read this article, consider reading Blackhat Thinking, the background on which this article is based. Here you will find some techniques for protecting your software. More will be covered in part II of this article.
First Things First
Symbol? No Symbol
Shipping all of your symbol information in the binary is an invitation for others to hack your software. Symbols give a hacker plenty of clues about where to dig in. Always build your code with -s (i.e. gcc -s) or run strip on your binary.
[naresh@SUSE:~/naresh] $ gcc -s your_source
or
[naresh@SUSE:~/naresh] $ strip your_binary
Encoding free-form text
If your code has any string literals (for user notification or for logging), the compiler will keep them intact in ASCII format, placing them in the constant data section of the binary image. Many reverse engineering tools display the ASCII interpretation of the raw hex data, so visually scanning the binary file and finding the address of each string is a simple matter. Once you know the addresses, you can search the executable code to see where they are used and draw some conclusions about what each section of code does.
One protection technique is to encode text strings so that they don't show up as readable ASCII in the binary image. Of course, the encoding logic should live outside your application software, so that only the encoded representations appear in the binary image. A character-by-character encoding scheme can be used. Each character must then be decoded before it is sent to an output device. Unfortunately, printf() has no inherent decoding capability, so we have to write our own version of printf() that knows how to turn an encoded string into legitimate ASCII data at run time. And that is not really rocket science!
So the overall idea is to encode your ASCII strings in such a way that they do not appear as-is in your binary image, which makes the hacker's work more difficult.
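As a minimal sketch of this idea, the code below stores a message XORed with a fixed one-byte key and decodes it just before printing. The XOR scheme, the key 0x5A, and the names XOR_KEY, MSG_HELLO, decode_string() and encoded_printf() are assumptions made for illustration only; the encoded bytes are assumed to have been produced offline by a separate tool, so the plain text never appears in the binary.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-byte key; XOR is only an illustrative encoding. */
#define XOR_KEY 0x5A

/* "Hello\n" with every byte XORed with 0x5A, generated outside the program. */
const unsigned char MSG_HELLO[] = { 0x12, 0x3F, 0x36, 0x36, 0x35, 0x50 };

/* Decode len bytes of src into dst and NUL-terminate it.
 * dst must have room for len + 1 bytes. */
void decode_string(const unsigned char *src, size_t len, char *dst)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = (char)(src[i] ^ XOR_KEY);
    dst[len] = '\0';
}

/* printf()-like wrapper: the format string is stored encoded and is
 * decoded into a temporary buffer only for the duration of the call.
 * Returns the number of characters printed, or -1 if fmt is too long. */
int encoded_printf(const unsigned char *fmt, size_t len, ...)
{
    char plain[256];
    va_list ap;
    int written;

    if (len >= sizeof plain)
        return -1;

    decode_string(fmt, len, plain);
    va_start(ap, len);
    written = vprintf(plain, ap);
    va_end(ap);

    /* Best effort: do not leave the decoded text lying on the stack. */
    memset(plain, 0, sizeof plain);
    return written;
}
```

Calling encoded_printf(MSG_HELLO, sizeof MSG_HELLO) from main() prints Hello followed by a newline, while running strings on the binary no longer reveals the message. A single-byte XOR is easy to undo once noticed, so treat it as a speed bump rather than real protection.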
Make ptrace() your friend
Why should you let others reverse-engineer your software? Try to put in as many obstacles as possible to make debugging tough. ptrace is one of them. ptrace is a system call used for tracing a process. Many debugging tools are ptrace based (of course, this excludes more powerful debuggers such as SoftICE). The following code prevents your application from being debugged in 'general' debuggers.
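#include <stdio.h>
#include <sys/ptrace.h>

int main(void)
{
    /* PTRACE_TRACEME fails if a tracer (e.g. gdb) is already attached. */
    if (ptrace(PTRACE_TRACEME, 0, 1, 0) < 0)
    {
        printf("\n Debugging......");
        return 0;
    }
    printf("\n Hello World..");
    return 1;
}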
With this call the process asks to be traced by its parent. If the program is already being debugged by gdb, the call to ptrace() fails, because a process can have only one tracer. Though this is only a preliminary technique for keeping your code from being debugged, combining it with gcc -s definitely makes the intruder's work tougher.
Jump in the Middle method
Darwin's theory of evolution applies very well to software hacking. No matter how securely you write your program, some set of hackers will always be capable of breaking it. So let's change our goal: your software should be written in such a way that it confuses the hacker until he loses his patience. One such obfuscation technique is the jump in the middle.
Let's see how it works. For better understanding, we use the same source code as in Blackhat Thinking. Let's call it hack.c.
Example source code:
#include <stdio.h>

int checklicense()
{
    // some very complicated licensing code here
    return 0;
}

int main(void)
{
    int result;
    result = checklicense();
    if (!result)
    {
        printf ("Too bad, your software is not licensed\n");
    }
    else
        printf ("Welcome to the rest of the program.\n");
    return 0;
}
We know how this code can be broken by replacing jne with je. But what if we obfuscate it a bit to thwart static analysis?
To implement the jump in the middle trick, one needs to play with assembly language. We compile hack.c to generate the assembly output. Again, -s (for stripping) is not used here, for better understanding.
[naresh@SUSE:~/naresh] $ gcc -S hack.c
[naresh@SUSE:~/naresh] $ cat hack.s
.file "hack.c"
.section .rodata
.align 4
.LC0:
.string "\n Check license is callled ..\n"
.text
.globl checklicense
.type checklicense, @function
checklicense:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
subl $12, %esp
pushl $.LC0
call printf
addl $16, %esp
movl $0, %eax
leave
ret
.size checklicense, .-checklicense
.section .rodata
.align 4
.LC1:
.string "Too bad, your software is not licensed\n"
.align 4
.LC2:
.string "Welcome to the rest of the program.\n"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
movl $0, %eax
addl $15, %eax
addl $15, %eax
shrl $4, %eax
sall $4, %eax
subl %eax, %esp
call checklicense
movl %eax, -4(%ebp)
cmpl $0, -4(%ebp)
jne .L3
subl $12, %esp
pushl $.LC1
call printf
addl $16, %esp
jmp .L4
.L3:
subl $12, %esp
pushl $.LC2
call printf
addl $16, %esp
.L4:
movl $0, %eax
leave
ret
.size main, .-main
.section .note.GNU-stack,"",@progbits
.ident "GCC: (GNU) 3.3.3 (SuSE Linux)"
You can assemble and link the above code with the following command.
[naresh@SUSE:~/naresh] $ gcc -o hack hack.s
Let's implement the trick. See the following assembly code from hack.s (only the relevant part of the code is shown).
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
movl $0, %eax
addl $15, %eax
addl $15, %eax
shrl $4, %eax
sall $4, %eax
subl %eax, %esp
jmp .ME+1
.ME:
.byte 0xE9
call checklicense
movl %eax, -4(%ebp)
cmpl $0, -4(%ebp)
jne .L3
subl $12, %esp
pushl $.LC1
call printf
addl $16, %esp
jmp .L4
Let's understand the newly added code. Examine the following lines of hack.s carefully.
jmp .ME+1
.ME:
.byte 0xE9
call checklicense
Before the call to checklicense() (i.e. call checklicense), we added jmp .ME+1, and at label .ME we added an extra byte, 0xE9. As you may remember, 0xE9 is the opcode for a jmp instruction. So what we ultimately did is jump to .ME+1, skipping over the byte at label .ME (0xE9) so that it is never executed. Now assemble and link hack.s with the following command.
[naresh@SUSE:~/naresh] $ gcc -o hack hack.s
Run gdb to inspect the disassembly of hack.
[naresh@SUSE:~/naresh] $ gdb hack
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) disassemble main
Dump of assembler code for function main:
0x080483a9 <main+0>: push %ebp
0x080483aa <main+1>: mov %esp,%ebp
0x080483ac <main+3>: sub $0x8,%esp
0x080483af <main+6>: and $0xfffffff0,%esp
0x080483b2 <main+9>: mov $0x0,%eax
0x080483b7 <main+14>: add $0xf,%eax
0x080483ba <main+17>: add $0xf,%eax
0x080483bd <main+20>: shr $0x4,%eax
0x080483c0 <main+23>: shl $0x4,%eax
0x080483c3 <main+26>: sub %eax,%esp
0x080483c5 <main+28>: jmp 0x80483c8 <.ME+1>
End of assembler dump.
You can see that the disassembler shows main() only up to label .ME. To see what it decodes at .ME, execute the following command.
(gdb) disassemble 0x80483c8
Dump of assembler code for function .ME:
0x080483c7 <.ME+0>: jmp 0x80443b4
0x080483cc <.ME+5>: decl 0x7d83fc45(%ecx)
0x080483d2 <.ME+11>: cld
0x080483d3 <.ME+12>: add %dh,0x12(%ebp)
0x080483d6 <.ME+15>: sub $0xc,%esp
0x080483d9 <.ME+18>: push $0x8048538
0x080483de <.ME+23>: call 0x80482c0 <_init+56>
0x080483e3 <.ME+28>: add $0x10,%esp
0x080483e6 <.ME+31>: jmp 0x80483f8 <.ME+49>
0x080483e8 <.ME+33>: sub $0xc,%esp
0x080483eb <.ME+36>: push $0x8048560
0x080483f0 <.ME+41>: call 0x80482c0 <_init+56>
0x080483f5 <.ME+46>: add $0x10,%esp
0x080483f8 <.ME+49>: mov $0x0,%eax
0x080483fd <.ME+54>: leave
0x080483fe <.ME+55>: ret
0x080483ff <.ME+56>: nop
End of assembler dump.
What! Where is our call to checklicense() (i.e. call checklicense)? It is not there. But check hack.s: at label .ME+1 we have the instruction call checklicense. So even though the static disassembly does not show the correct instructions, the program still works. How this happens is explained later.
This is often a useful way to hide invocations. Note, however, that more capable tools (such as IDA Pro, which simulates multiple runs of the program to cope with morphing code) will not be fooled by it. As mentioned before, the art of protection and anti-protection is really an evolutionary process. No solution is 100% effective.
Going back to our example, will objdump be fooled by this trick? Let's check.
[naresh@SUSE:~/naresh] $ objdump -d hack | less
Look at main. You will see the following output.
080483a9 <main>:
80483a9: 55                   push %ebp
80483aa: 89 e5                mov %esp,%ebp
80483ac: 83 ec 08             sub $0x8,%esp
80483af: 83 e4 f0             and $0xfffffff0,%esp
80483b2: b8 00 00 00 00       mov $0x0,%eax
80483b7: 83 c0 0f             add $0xf,%eax
80483ba: 83 c0 0f             add $0xf,%eax
80483bd: c1 e8 04             shr $0x4,%eax
80483c0: c1 e0 04             shl $0x4,%eax
80483c3: 29 c4                sub %eax,%esp
80483c5: eb 01                jmp 80483c8 <.ME+0x1>
080483c7 <.ME>:
80483c7: e9 e8 bf ff ff       jmp 80443b4 <_init-0x3ed4>
80483cc: ff 89 45 fc 83 7d    decl 0x7d83fc45(%ecx)
80483d2: fc                   cld
80483d3: 00 75 12             add %dh,0x12(%ebp)
80483d6: 83 ec 0c             sub $0xc,%esp
80483d9: 68 38 85 04 08       push $0x8048538
80483de: e8 dd fe ff ff       call 80482c0 <_init+0x38>
80483e3: 83 c4 10             add $0x10,%esp
80483e6: eb 10                jmp 80483f8 <.ME+0x31>
80483e8: 83 ec 0c             sub $0xc,%esp
80483eb: 68 60 85 04 08       push $0x8048560
80483f0: e8 cb fe ff ff       call 80482c0 <_init+0x38>
80483f5: 83 c4 10             add $0x10,%esp
80483f8: b8 00 00 00 00       mov $0x0,%eax
80483fd: c9                   leave
80483fe: c3                   ret
80483ff: 90                   nop
Yes, it was fooled too, since it also performs a purely static analysis.
How it works
Because we added the byte 0xE9 at label .ME, the disassembler treats it as the start of a jump instruction. A jump instruction must have a destination, so the disassembler takes the next four bytes as the destination of the jump. Those four bytes are really the opcode of our call instruction and the first three bytes of its operand. The disassembler then resumes decoding right after those four bytes, in the middle of the real instructions, so the instruction boundaries are out of step and the disassembly comes out garbled.
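The arithmetic can be checked by hand with a small sketch. It takes the bytes that objdump printed starting at 0x080483c7 and computes the target of a five-byte rel32 instruction twice: once starting at the 0xE9 byte, as the static disassembler does, and once starting one byte later, as the CPU does after jmp .ME+1. The names CODE, CODE_BASE and rel32_target() are hypothetical helpers introduced only for this illustration.

```c
#include <stdint.h>

/* Address of label .ME and the first bytes found there, copied from the
 * objdump listing: the inserted 0xE9, then the real call (e8 bf ff ff ff)
 * and the start of the following mov (89 45 fc). */
#define CODE_BASE 0x080483c7u
const uint8_t CODE[] = { 0xe9, 0xe8, 0xbf, 0xff, 0xff, 0xff, 0x89, 0x45, 0xfc };

/* Target of a 5-byte "jmp rel32" (0xE9) or "call rel32" (0xE8) whose
 * opcode byte is located at address addr. The displacement is stored
 * little-endian and is relative to the address of the next instruction. */
uint32_t rel32_target(uint32_t addr, const uint8_t *insn)
{
    uint32_t rel = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;

    /* Unsigned wraparound takes care of negative displacements. */
    return addr + 5u + rel;
}
```

Decoding from .ME, rel32_target(CODE_BASE, CODE) gives 0x080443b4, the bogus jmp target shown by gdb and objdump. Decoding from .ME+1, rel32_target(CODE_BASE + 1, CODE + 1) gives 0x0804838c, which lies 29 bytes before main and is presumably the start of checklicense, assuming the two functions are laid out back to back as in the listing.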
Summary
With this trick we have filtered out some initial break-in attempts. Of course, there are many who can still break in. Next, we will discuss how to break this jump in the middle trick. Part II will discuss other tricks for protecting your software, and then how to break those tricks as well (remember Darwin's law!).
maintained by:naresh.prajapati@hsc.com