Protection Techniques
How to protect your C programs
20 January 1999
by +puark
Courtesy of Reverser's page of reverse engineering
slightly edited
by reverser+
Was about time!

Revenge of the protectors!

After so many years I was beginning to despair... tons of emails from lots of programmers: "thanks, reverser+... great site you have there... I have learned a lot, really" and almost never a contribution, a small essay, a little constructive feedback...
Boys, the real sense (and power) of the web is GIVING. Giving is moving. If you just hoard knowledge you'll just remain where you are.
Thanks, +puark... awaiting your next essays! (Let's hope other protectors will follow your great example.)
See, the crackers' scene changed radically after the first tutorial sites came out (does anyone still remember how things were before +ORC? I do :-)
Why shouldn't the PROTECTOR's scene benefit from a similar approach?
There is a crack, a crack in everything. That's how the light gets in.
Rating
( )Beginner (x)Intermediate ( )Advanced ( )Expert

The How to Protect Better lab offers some very useful bullet points. I would however like to pick out a few of the most important points and develop them further to show how to apply the techniques to C and C++ programs.
Protection schemes for C and C++ programs
Adding protection to your C and C++ programs after they are compiled.
Written by +puark


Introduction

Good protection schemes should be written in assembler. Applications should be written in a high level language. These two statements are incompatible if you truly wish to incorporate protection throughout your application. I have however developed a technique that allows protection to be added to a C or C++ program after the program has been compiled and linked. This has the advantage that the inserted code can be generated by another program and can thus be different for every program or version of a program that you issue.

Tools required
C Compiler
SoftIce
IDA
And most importantly, an enquiring mind


Target's URL/FTP
Not applicable.

History

Reverser+ and others have stated that to be a good protectionist you first need to be a good cracker. I admit I started out as a protectionist, and with hindsight I was not a particularly good one. I then learned and used the crackers' techniques, most of them from Reverser+'s wonderful web site. (Thanks Reverser+). These lessons in cracking gave me some wonderful insights into how I might in turn protect my own programs from crackers. I now want to give something back to the community, hence this article.

Essay
Introduction

I don't want to get into the argument between the use of high level languages and assembler. I agree that assembler language can be much more efficient than a high level language both in speed of operation and in memory footprint. However I feel that a high level language (such as C or C++) is generally the only tool of choice for many applications. Unfortunately any protection scheme that is written in a high level language is significantly weaker than one in assembler.

I would like to demonstrate a technique that allows assembler code to be added to an executable after it has been compiled and linked.

To demonstrate the technique I will implement two of Mark's famous 14 protector's commandments.

6. Patch your own software. Change your code to call different validation routines each time.
B. Flood the cracker with bogus calls and hard coded strings.

I have techniques that implement many more of the commandments and I will describe them in later articles which will build on the techniques described here.

Patch your own software

My technique is probably not quite what Mark had in mind. I assume Mark meant that each time the application was run it would modify its own file copy. The idea I have in mind however is that every release of a program will be different. Many releases of a product are just bug-fixes or minor changes to the program. The comparison of two releases will show that they are substantially the same with only a small percentage change. Thus if one version has been cracked then cracking later versions is very easy.

To patch an executable we need to know two things.

1. Where the patch is to be applied
2. What goes into the patch

Where the patch is to be applied

We can determine where to apply a patch by putting a signature in the program which can be recognised by the patch program. This signature should be one that would not normally appear in the output of the compiler/linker. For patches to the code segment I define the following macro.

/* 10-byte signature for the patch program to find (MSVC inline assembler) */
#define PATCH10 \
	__asm { _emit 0x72}; \
	__asm { _emit 0x01}; \
	__asm { _emit 0x72}; \
	__asm { _emit 0xf9}; \
	__asm { _emit 0xe9}; \
	__asm { _emit 0x01}; \
	__asm { _emit 0x00}; \
	__asm { _emit 0x00}; \
	__asm { _emit 0x00}; \
	__asm { nop };
The code generated by this macro has the advantage that it could never be generated normally by a compiler. It is also code that cannot be correctly handled by any dis-assembler (including IDA), but more on this later.

The best dis-assembly of this code is:-

	72 01            jb +1
	72 f9            jb -7
	e9 01 00 00 00   jmp +1
	90               nop
The reason this cannot be generated normally is clear in the first jb instruction (op-code 72 is jb, jump if the carry flag is set). It jumps into the middle of the second instruction, specifically to the byte f9. Taken on its own, that byte is the op-code for the assembler instruction 'stc'. It is clear that this sequence would not normally be generated by any compiler!
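To make the two execution routes concrete, here is a minimal sketch in portable C that walks the 10 signature bytes the way a processor would, once with the carry flag clear and once with it set. The names patch10 and trace_route are my own and purely illustrative, and the tiny decoder only understands the four op-codes that occur in the signature (0x72, 0xE9, 0xF9 and 0x90).

```c
#include <assert.h>
#include <stddef.h>

/* The 10 bytes emitted by the PATCH10 macro. */
const unsigned char patch10[10] = {
    0x72, 0x01, 0x72, 0xF9, 0xE9, 0x01, 0x00, 0x00, 0x00, 0x90
};

/*
 * Follow the code for a given initial carry flag (0 or 1).
 * The start offset of every executed instruction is stored in starts[].
 * Returns the number of instructions executed, or -1 if an op-code is not
 * one of the four known ones or starts[] is too small.
 * *end receives the offset at which execution leaves the buffer.
 */
int trace_route(const unsigned char *code, size_t len, int carry,
                size_t *starts, size_t max_starts, size_t *end)
{
    size_t ip = 0;
    int n = 0;

    assert(code != NULL && starts != NULL && end != NULL);
    while (ip < len) {
        if ((size_t)n == max_starts)
            return -1;
        starts[n++] = ip;

        switch (code[ip]) {
        case 0x72: {                      /* jb rel8: taken if carry is set */
            long rel;
            if (ip + 2 > len) return -1;
            rel = code[ip + 1];
            if (rel >= 128) rel -= 256;   /* sign-extend the displacement */
            ip += 2;
            if (carry) ip = (size_t)((long)ip + rel);
            break;
        }
        case 0xE9: {                      /* jmp rel32, little-endian */
            unsigned long u;
            long long rel;
            if (ip + 5 > len) return -1;
            u = (unsigned long)code[ip + 1]
              | (unsigned long)code[ip + 2] << 8
              | (unsigned long)code[ip + 3] << 16
              | (unsigned long)code[ip + 4] << 24;
            rel = (u >= 0x80000000UL) ? (long long)u - 4294967296LL
                                      : (long long)u;
            ip = (size_t)((long long)ip + 5 + rel);
            break;
        }
        case 0xF9:                        /* stc */
            carry = 1;
            ip += 1;
            break;
        case 0x90:                        /* nop */
            ip += 1;
            break;
        default:
            return -1;
        }
    }
    *end = ip;
    return n;
}
```

With the carry flag clear the instructions start at offsets 0, 2 and 4; with it set they start at offsets 0, 3 and 4. Byte 3 (f9) is therefore a displacement in one route and an op-code in the other, and both routes leave the hole at offset 10.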

This pattern of 10 bytes can be searched for by the patch program and replaced by alternative assembler code of length 10 bytes. Of course the macro can be modified to produce different lengths of 'holes' in the executable. For example.

#define PATCH20 \
	__asm { _emit 0x72}; \
	__asm { _emit 0x01}; \
	__asm { _emit 0x72}; \
	__asm { _emit 0xf9}; \
	__asm { _emit 0xe9}; \
	__asm { _emit 0x0b}; \
	__asm { _emit 0x00}; \
	__asm { _emit 0x00}; \
	__asm { _emit 0x00}; \
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };\
	__asm { nop };
This produces a 'hole' of 20 bytes. The same principle can be extended to provide holes of any size. I use holes of up to 100000 bytes in my programs.
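The search side of the patch program might look something like the sketch below. It is not my actual patcher: find_hole, patch_hole and HOLE_HEADER are illustrative names, and reading the hole size from the jmp displacement (9 header bytes plus the number of bytes skipped) is an assumption that happens to hold for the two macros shown, where the padding is all nops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed prefix shared by the PATCH macros: 72 01 / 72 f9 / e9 */
const unsigned char hole_prefix[5] = { 0x72, 0x01, 0x72, 0xF9, 0xE9 };

#define HOLE_HEADER 9   /* 5 prefix bytes + 4 bytes of jmp displacement */

/*
 * Search image[from..size) for the next hole signature.
 * On success returns the offset of the hole and stores its total length
 * (header plus nop padding) in *hole_len.
 * Returns (size_t)-1 if no further hole is found.
 */
size_t find_hole(const unsigned char *image, size_t size, size_t from,
                 size_t *hole_len)
{
    size_t i, k;

    assert(image != NULL && hole_len != NULL);
    for (i = from; i + HOLE_HEADER <= size; i++) {
        unsigned long pad;

        if (memcmp(image + i, hole_prefix, sizeof hole_prefix) != 0)
            continue;

        /* The little-endian jmp displacement is the padding length. */
        pad = (unsigned long)image[i + 5]
            | (unsigned long)image[i + 6] << 8
            | (unsigned long)image[i + 7] << 16
            | (unsigned long)image[i + 8] << 24;
        if (pad > size - i - HOLE_HEADER)
            continue;                     /* would run off the image */

        /* Insist on nop padding to cut down on false positives. */
        for (k = 0; k < pad; k++)
            if (image[i + HOLE_HEADER + k] != 0x90)
                break;
        if (k != pad)
            continue;

        *hole_len = HOLE_HEADER + pad;
        return i;
    }
    return (size_t)-1;
}

/*
 * Overwrite a hole with replacement code of exactly the same length.
 * Returns 0 on success, -1 if the lengths differ.
 */
int patch_hole(unsigned char *image, size_t offset, size_t hole_len,
               const unsigned char *code, size_t code_len)
{
    assert(image != NULL && code != NULL);
    if (code_len != hole_len)
        return -1;
    memcpy(image + offset, code, code_len);
    return 0;
}
```

A patcher built on this would read the executable into memory, call find_hole repeatedly (restarting the search just past each hole it has filled) and write the image back out.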

Using these macros is very easy. The following source code shows the principle:

#include <stdio.h>
#include <stdlib.h>

/* PATCH10 and PATCH20 are the macros defined above */
int main(void) {                        PATCH10
    printf("Hello world\n");            PATCH20
    exit(1);                            PATCH10
}
This code will result in 'holes' of 10 or 20 bytes between each line of compiler generated code. The next question is what do we want to put in these 'holes'.

What can be patched in

Now 10 or 20 bytes may not seem much. Consider, however, that a compiler will generate only 20 or 30 bytes of code on average for each line of source code (this varies from only four or five bytes up to 100 or so). Percentage-wise the holes can easily add up to a substantial fraction of our code, and they have the advantage of being dispersed throughout it.

I admit that we can't do much functional code in 10 or 20 bytes. We can however (finally) get to Mark's Commandment B. Flood the cracker with bogus code and hard coded strings.

Flood the cracker with bogus code.

The Microsnot C++ compilers can produce either optimised or non-optimised code. I noted that in the non-optimised code every line of source code is self-contained, and registers EAX, EBX, ECX and EDX are not used to carry values between the code generated for different source code lines. In optimised mode, however, these registers can be used to hold local variables.

To insert assembler between lines of C source code I compile my program with optimisation turned off. I can then generate bogus code using the following types of instructions.

mov toReg, frReg
mov toReg, [ebp+xx]
inc/dec toReg
mov toReg, XXXXXXXX
mov toReg, dword_xxxxxxxx
cmp toReg, dword_xxxxxxxx
Where toReg is one of EAX, EBX, ECX, EDX and frReg is one of EAX, ECX, EDX, EBX, ESP, EBP, ESI or EDI. The dword_xxxxxxxx is a random address in the data segment. This one is especially useful since it results in many bogus references being produced by IDA. One of the powers of IDA is being able to easily identify any reference to a data value. With hundreds (or thousands) of bogus references this makes the cracker's work so much harder.

This list of bogus code instructions can easily be extended. The trick is to look at the output of the compiler (using IDA) and see what typical instructions it generates. We don't want to generate code that will affect the program execution so the following instructions would not be allowed

mov frReg, toReg
mov dword_xxxxxxxx, toReg
push toReg
The source code to generate this code is simply a large switch statement, one case for each type of generated code. The case is selected at random, and the registers used in each case are also chosen at random. The switch statement is enclosed in a loop which adds instructions to a 'hole' until the hole is filled with random garbage. Take care, however, to include a few one-byte op-codes so that the holes can be filled completely (e.g. stc, clc).
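A cut-down sketch of such a generator, in portable C, is given below. It is not my production generator: fill_hole, put32 and the data_base/data_size parameters are illustrative names, the encodings are the standard 32-bit x86 ones for the instruction types listed above, and rand() stands in for whatever random source you prefer. It assumes that the flags changed by inc, dec, cmp, stc and clc are not carried between source lines any more than the registers are, and that the absolute data addresses stay valid (i.e. the image loads at its preferred base).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Store a 32-bit value in little-endian order. */
static void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/*
 * Fill hole[0..len) with bogus 32-bit x86 instructions.
 * data_base/data_size describe the data segment that the fake memory
 * references should point into (hypothetical values, supplied by caller).
 * Returns the number of bytes written, which is always len.
 */
size_t fill_hole(unsigned char *hole, size_t len,
                 uint32_t data_base, uint32_t data_size)
{
    /* Encoded length of each instruction kind handled by the switch. */
    static const size_t need[7] = { 2, 3, 1, 5, 6, 6, 1 };
    size_t pos = 0;

    assert(hole != NULL && data_size >= 4);
    while (pos < len) {
        unsigned to = (unsigned)rand() % 4;  /* 0..3 = EAX, ECX, EDX, EBX */
        unsigned fr = (unsigned)rand() % 8;  /* any general register */
        uint32_t addr = data_base
                      + 4u * ((uint32_t)rand() % (data_size / 4u));
        int kind = rand() % 7;

        if (need[kind] > len - pos)
            kind = 6;                        /* only a one-byte op-code fits */

        switch (kind) {
        case 0:                              /* mov toReg, frReg */
            hole[pos++] = 0x8B;
            hole[pos++] = (unsigned char)(0xC0 | (to << 3) | fr);
            break;
        case 1:                              /* mov toReg, [ebp-xx] */
            hole[pos++] = 0x8B;
            hole[pos++] = (unsigned char)(0x45 | (to << 3));
            hole[pos++] = (unsigned char)(0x100 - 4 * (1 + rand() % 8));
            break;
        case 2:                              /* inc toReg / dec toReg */
            hole[pos++] = (unsigned char)(((rand() & 1) ? 0x40 : 0x48) + to);
            break;
        case 3:                              /* mov toReg, XXXXXXXX */
            hole[pos++] = (unsigned char)(0xB8 + to);
            put32(hole + pos, ((uint32_t)rand() << 16) ^ (uint32_t)rand());
            pos += 4;
            break;
        case 4:                              /* mov toReg, dword_xxxxxxxx */
            hole[pos++] = 0x8B;
            hole[pos++] = (unsigned char)(0x05 | (to << 3));
            put32(hole + pos, addr);
            pos += 4;
            break;
        case 5:                              /* cmp toReg, dword_xxxxxxxx */
            hole[pos++] = 0x3B;
            hole[pos++] = (unsigned char)(0x05 | (to << 3));
            put32(hole + pos, addr);
            pos += 4;
            break;
        default:                             /* stc / clc */
            hole[pos++] = (rand() & 1) ? 0xF9 : 0xF8;
            break;
        }
    }
    return pos;
}
```

Seeding rand() differently for each release gives different filler for each release, which is the whole point of the exercise.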

Confusing IDA and SoftIce

I would also like to demonstrate some techniques that will confuse IDA and SoftIce disassembly. Remember the code generated by the PATCH macros? The code had two execution routes: in one route a byte is used as the first byte of an op-code, while in the other route the same byte is used as a data byte of a different op-code. IDA is unable to cope with this since it can't show the same byte being used in two different assembler instructions.

This technique can be used to hide functional code. Take for example the following code


430502  81 C7 33 C0 F7 F0  add edi, 0F0F7C033h
430508  0F 82 F6 FF FF FF  jb loc_430502+2

At first glance this looks fairly innocuous. But note that the conditional jump is to a location in the middle of a multi-byte op-code. If we dis-assemble from address 430504 we get the following code.

430504  33 C0              xor eax,eax
430506  F7 F0              div eax
430508  0F 82 F6 FF FF FF  jb loc_430504

Now this is certainly not innocuous. It will generate a divide by zero exception and this was hidden from view in the original IDA output.
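The jump target in the listing can be checked by hand: a near conditional jump of the form 0F 8x takes a 32-bit displacement measured from the end of the 6-byte instruction. The small helper below (jcc_rel32_target is just an illustrative name) does the arithmetic, assuming 32-bit flat addressing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Target of a near conditional jump encoded as 0F 8x rel32 (6 bytes)
 * located at address `at`.
 */
uint32_t jcc_rel32_target(uint32_t at, const unsigned char insn[6])
{
    uint32_t rel;

    assert(insn[0] == 0x0F && (insn[1] & 0xF0) == 0x80);

    /* The displacement is stored little-endian after the two op-code bytes. */
    rel = (uint32_t)insn[2]
        | (uint32_t)insn[3] << 8
        | (uint32_t)insn[4] << 16
        | (uint32_t)insn[5] << 24;

    /* Unsigned wrap-around gives the same result as signed addition. */
    return at + 6u + rel;
}
```

For the bytes 0F 82 F6 FF FF FF at 430508 this gives 430508 + 6 - 0A = 430504, which is two bytes into the add instruction at 430502 and exactly where the xor eax,eax begins.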

Now it is certainly more difficult to write code in this way, but for certain key routines (like tests for the presence of SoftIce) it can be an invaluable technique. Personally I have combined this technique with the technique of patching code into holes to put multiple checks for SoftIce into my own code. Each SoftIce check was generated using random registers and random code sections so that each SoftIce check is different. Having found one such routine, the cracker cannot then use a simple byte search to find the other routines.

Final Notes

In future articles I will describe further techniques that can be used in C and C++ programs. In particular:

How to use encrypted strings and how to avoid having to decrypt them into memory
How to encrypt parts of your C and C++ program.
How to calculate Cyclic Redundancy Checks on part of your program
How to detect bpx and bpmb type breakpoints in your code
How to stop a cracker from getting any useful information from your program resulting from putting a bpmb for read or write on a key global data variable
How to use system API calls (such as MessageBoxA) in such a way that breakpoints on them never break and API Spy programs don't report their usage

Ob Duh

Not applicable
