A Journey within Steganos
How to find the Reverser's Advanced Steganography Page
'LIGHT' (pre-advanced) version ~ 'reduced' version
by Jean Flynn
Courtesy of Reverser's page of reverse engineering
A great essay. It seems that the steganography idea begins to carry some IMPORTANT first results. Clearly all this can be considered only the beginning of a real work on steganographical reversing. I decided, seen the importance of Jean Flynn's approach, to publish TWO versions of his essay. A LIGHT one (on this side of the strainer) and a COMPLETE one (on the advanced steganography page). People that are stuck with the reversing work will be able to tackle it from Jean Flynn's perspective. People that have already solved the task will have the possibility to read it in extenso. I like a lot the trial and error part of this essay... in fact I believe that all essays should have the methodological clarity of this one. What really helps other reversers is NOT to have a ready made solution, it is to gather ideas and to see different approaches. That's what internet was made for (before our enemies transformed it in a silly Disneyland of useless frill) that's the whole point of the thing: team work, building on the shoulders of others, bringing humanity cleverness together, just because it is great fun to udnerstand things.
And this is Jean Flynn's (great) contribution. Enjoy!

This light version will show you MANY approaches to the reversing of the strainer... see you 'on the other side'!
There is a crack, a crack in everything That's how the light gets in
( )Beginner (x)Intermediate ( )Advanced ( )Expert

I rated this essay "Intermediate" level, not that I think that it is difficult to understand, but it covers many different faces of Reverse Engineering. To follow and master everything need some knowledge. So, the reader may find here a view of several techniques needed to defeat Steganography, password cracking, reverse engineering, along with comments of mine on the subject.
A Journey within Steganos
How to find the Reverser's Advanced Steganography Page
Written by Jean Flynn (the_flynn@hotmail.com)

This essay will show how I found the Reverser's advanced steganography challenge.  

Tools required
Compare : my own favorite binary comparator/hex editor included here with its source code
SoftICE 3.0
C or C++ compiler
Whatever Site downloading utility, like BlackWidow, or any other.
Steganos by Fabian

Target's URL/FTP
red Steganos by Fabian Hansmann @ http://www.steganography.com

Program History


- Act I Scene I - The Challenge

When I discovered Reverser+'s steganography page, I found and solved the 'Andromeda' riddle pretty fast, the way caprine did.
I was teased, but I didn't write any essay about it, because I thought this was just an 'appetizer', and the essay worth writing was the one about finding the 'advanced page' of steganography.
I decided to attack that challenge ASAP, wrongly thinking it would be peanuts to achieve, since I thought all the people listed in Reverser+'s page did find that page and already wrote an essay about it.
I have been very surprised to realize that I prove wrong, and that these people wrote about the 'easy' riddle, and not the 'hard' one. (Well, you did all right guys, it was not so 'easy' after all)

So I downloaded redSteganos, and the two images from Reverser+, redt_tamra7.zip file and redcopyoft_.zip file, and started to have a look at them.

I discarded the GIF/s-tools4 side, since GIF are compressed and BMP usually not. (BMP might be compressed with RLE, but not very often. As I am kind of lazy, I thought it would make one less step to do. Beside, the idea of reversing IDEA was not teasing. I learned that Steganos uses a RC4 encryption scheme, and I like RC4 much better, since the key computation is very fast with this algorithm.

Comparing the files.
I'll use my own graphical comparison utility. One word about it. I used to have such a hexademimal compare utility on my (do _not_ laugh) old Apple II. I've been very frustrated _not_ to see a similar one on my Amiga, neither on my PC, or wherever. After a few decades, I wrote my own, and here it is. Have a look at it, if you don't like it, just change it, I included the source code. (Yet Another Boring Tool :-)

- Act I Scene II - The Extraction

Run "compare copyoft_.bmp t_tamra7.bmp', press F1 to go to the first difference, and hit the left and right keys to see alternatively both files. It tells us that lots of bytes starting from 0x440 have their LSB changed. We just need to find how.
I played a little bit with Steganos, and, as Reverser+ suggested used all its features.

One good feature is the Help File. Fabian is a great guy, he explains everything. So now I know from the help file that he stores information in the LSB of each byte, and that the file is crypted using "Alleged RC4".

What the heck is Alleged RC4? Well I didn't know at that time, but if you can search the Web, that kind of question won't stop you very long.

The next thing you learn about Steganos, is that it is able to crypt the file without hiding the data in a bitmap or a sound. Now this is a nice feature :-D

I prepared a file named adva.txt with "http://fravia.org/abcdefghijklmn.txt" in it, just 36 bytes. I hid it in a copy of copyoft_.bmp, and also in a file "adva.sef", crypted but not hidden.

Let's dump that file. We see a header "SteganosEncryptedFile", followed by some binary data, which is probably the encrypted data.

Let's look again at the bmp file, and analyse the bytes at offset 0x440 ... it won't take you long before you find that every LSB of an 8 bytes group from the bmp file represents one byte of the encrypted data found in the sef file.

So, let's write a little program to extract the data from the BMP file, and reconstruct the SEF file.

#include <stdio.h> #include <stdlib.h> void main(int argc, char *argv[]) { if (argc!=3) exit(1) ; FILE *fin = fopen(argv[1], "rb") ; // the BMP file if (fin==NULL) exit(0) ; FILE *fout = fopen(argv[2], "wb") ; // the SEF file unsigned char rb, rbc, cb ; int nb ; fseek(fin, 0x440, SEEK_SET) ; fwrite("SteganosEncryptedFile\0\1\0", 24, 1, fout) ; // write the header rbc = cb = 0 ; nb = 0 ; while(1) { if (fread(&rb, 1, 1, fin)<=0) break ; // pick up each byte cb >>= 1 ; if (rb & 1) cb |= 0x80 ; rbc++ ; if (rbc>=8) { rbc = 0 ; fwrite(&cb, 1, 1, fout) ; cb = 0 ; nb++ ; if (nb>=60) break ; // see the text for this line, not here in first version of the prog } } fclose(fout) ; fclose(fin) ; } The first version of this program (without the commented line) was used to process the whole BMP file. After a few tries, I found that just the first bytes were needed, so I cut the file there. Obviously Fabian fills up the BMP file with noise.
I found that this length doesn't have to be exact, since Steganos doesn't check it, as long as it has enough data to process. So I let a couple of more byte to go in the file, just in case the compression was worse for the Reverser's adva.txt file than for mine.

Anyway, it's little matter for the moment, since we'll use only a few bytes of the beginning of the crypted data to find the key(but I didn't know for sure at that time)


Now that you have your nice extraction tool, VALIDATE it by processing a few different data files (of the same size, since I hardcoded it), hiding them in the BMP file, using the extraction tool to generate the SEF file, and checking that Steganos can retreive the data from that generated SEF file. You must be confident in your data before you can go further.

Another point: this extractor won't work for other data file size, and BMP files, since it does not care about the end of lines in the BMP. End of lines are tricky in images, and Fabian probably skips them. For our purpose here, we don't care, since we just need a few bytes, much less than a pixel size of a line.

Another small point: I didn't really understand why Reverser+ gave us the original file.
I thought it might be needed, but it is not. Hide another random text file in the BMP, and you get a good candidate for an 'original' BMP file

- Act II Scene I - The RC4 Discovering

So what now? We have a nice RC4-encrypted text file, and Reverser+ is just requesting to reverse it. Let's find some information about RC4. I'll let you read Reverser's search lessons, although it is not really necesssary since huge amount of information is available about RC4 on the net. The main article you'll find is from the Cypherpunks, where they disclose the algorithm with humour. Let's see: --------------------------------------------------------------- I am shocked, shocked, I tell you, shocked, to discover that the cypherpunks have illegaly and criminally revealed a crucial RSA trade secret and harmed the security of America by reverse engineering the RC4 algorithm and publishing it to the world. On Saturday morning an anonymous cypherpunk wrote: SUBJECT: RC4 Source Code I've tested this. It is compatible with the RC4 object module that comes in the various RSA toolkits. /* rc4.h */ typedef struct rc4_key { unsigned char state[256]; unsigned char x; unsigned char y; } rc4_key; void prepare_key(unsigned char *key_data_ptr,int key_data_len, rc4_key *key); void rc4(unsigned char *buffer_ptr,int buffer_len,rc4_key * key); /*rc4.c */ #include "rc4.h" static void swap_byte(unsigned char *a, unsigned char *b); void prepare_key(unsigned char *key_data_ptr, int key_data_len, rc4_key *key) { unsigned char swapByte; unsigned char index1; unsigned char index2; unsigned char* state; short counter; state = &key->state[0]; for(counter = 0; counter < 256; counter++) state[counter] = counter; key->x = 0; key->y = 0; index1 = 0; index2 = 0; for(counter = 0; counter < 256; counter++) { index2 = (key_data_ptr[index1] + state[counter] + index2) % 256; swap_byte(&state[counter], &state[index2]); index1 = (index1 + 1) % key_data_len; } } void rc4(unsigned char *buffer_ptr, int buffer_len, rc4_key *key) { unsigned char x; unsigned char y; unsigned char* state; unsigned char xorIndex; short counter; x = key->x; y = key->y; state = &key->state[0]; for(counter = 0; counter < buffer_len; counter ++) { x = (x + 1) % 256; y = (state[x] + y) % 256; swap_byte(&state[x], &state[y]); xorIndex = (state[x] + state[y]) % 256; buffer_ptr[counter] ^= state[xorIndex]; } key->x = x; key->y = y; } static void swap_byte(unsigned char *a, unsigned char *b) { unsigned char swapByte; swapByte = *a; *a = *b; *b = swapByte; } --------------------------------------------------------------- (At least, writing that part of this essay was easy, just cut and paste...) All right. Good. It seems we have it all !

- Act II Scene II - The Steganos Reverse Engineering

Let's check if Fabian did RC4 the way the Cypherpunks described it. For this we'll use SoftICE. I'm not going to give you lots of explanations about this debugging session, since lots of essays are available here to do this. The experienced reader you are will easely find his way with this :-) (flattering the reader shouldn't hurt ;-)

I prepared an encrypted file (not hidden) from a 36-bytes text file named adva.txt, and launched Steganos under SoftICE. Ran it up to the password entering page, and set a breakpoint with BPX GetWindowTextA.

Press "Next" in Steganos, the program stops in GetWindowText.
Exit the routine, and look at the stack to find the adress of the string (ds:1211E0C)

Set a breakpoint at this address with BPMW [addr] RW, and run, you'll get right where Fabian uses the key. The first time is right in the middle of a strlen(), not very interesting.
Hit G two more times, and then you get in the middle of a code where Fabian copies the key 32 times to generate a 256 characters key.

Set a breakpoint to the adress of this 256 key buffer (BPMW ds:44D9D8) and G again, and you land at cs:40D97E right in the middle of the key computation. Go on with BPX cs:40D9E5, and use the P command to exit all routines up to cs:404420, where a call gives the result 0x61, first letter of the hidden filename.

Notice: Look at the state part of the computed key after its generated (it is at ds:44D8D8), and compare it with what the Cypherpunk's algorithm gives with the same ascii key. You'll see that it's the same, so we know that the key computation is identical to the Cypherpunk's version.

Now reverse the routine 404420, and you'll see that it computes the RC4 key in a slightly different way that the Cypherpunk's routine, as you will se at cs:40D837. Here it is now, functionnally equivalent, ready to replace the same routine from the Cypherpunk package.

void rc4(unsigned char *buffer_ptr, int buffer_len, rc4_key *key) { unsigned char x; unsigned char y; unsigned char* state; unsigned char xorIndex; short counter; x = key->x; y = key->y; state = &key->state[0]; for(counter = 0; counter < buffer_len; counter ++) { x = (x + 1) & 255; y = (state[x] + y) & 255; swap_byte(&state[x], &state[y]); xorIndex = (state[x] + state[y]) & 255; key->cte += 13 ; // difference with Cypherpunk key->cte &= 0xFF ; // difference with Cypherpunk buffer_ptr[counter] ^= state[xorIndex] ^ key->cte ; } key->x = x; key->y = y; } That's it, you have enough to make yourself a key generator. You'll find mine in the "Final Approach" chapter, with a fast version of prepare_key I found on the 'Net somewhere (Bad me. I should reward the original author of this, but sorry, I didn't keep track of this nice guy's name)

- Act III Scene I -
The Failures
and the 'Too Slow' practices

Knowing what we know so far, the problem of finding the hidden text file gets to to decrypt the RC4 encrypted file, that is, reverse RC4. This is just not possible, a search on the 'Net will show that the only things we can find about RC4 is a flaw in a certain class of keys, that would jeopardize these keys by a small factor (a percent or so).

So the first approach is Brute Forcing. Now that we have the bits and bytes of the coding, we can make a bruteforce program, able to compute about 20,000 keys per second. Scanning all keys of a 8 letters password will take 26^8 tries, that is, take about 120 days to find it. (I won't give you this program, it's easy for you to write it if you want with the above elements) It works, but it's too slow.

The accelerator

The probability of appearence of letters in English is as this : esiarntolcdugpmhbyfvkwzxjq (at least it is like this in the dictionary file I dowloaded)

Instead of trying all combinations of letters starting with a, b, c, etc, using e, s, i, etc... saves time for this search by a factor of about 13, finding the solution in about 9 days.
Works too, but it's again too slow, and the time saved is unsure, as it depends on the key.

The Dictionary approach

I downloaded wordlists from ftp.cdrom.com (especially English and Finnish, wonder why :-), and used a program to test all the words as passwords. This is incredibly fast, but it failed.

Gathering the target's environment

I used a Site Download (BlackWidow) utility to get all the Reverser's pages. By the way, I didn't find a good tool for this task. Some of them are "Full Bug Inside", and blow all the time, some of them download the site ten times just to make the page list (wonder why), some of them are limited. Do any of you know of a good clever one?

Anyway, once you have all your target's pages, you get some kind of "picture of his world", the way he interact with the language he uses. I used the following programs to make a word list for the pages (these tools are available in Unix environment, but not under Windows. So I wrote them.)

// program WORDS to extract all words from a file #include <stdio.h> #include <ctype.h> char w[256] ; main() { int ch; int col; col = 0; while ((ch = getchar()) != EOF) { if ((isalpha(ch))&&(col<255)) { w[col++] = ch ; } else { if (col >= 3) { w[col]=0 ; puts(w); } col = 0; } } } // Program XDUPL to suppress duplicates #include <stdio.h> #include <string.h> #include <ctype.h> char w[256] ; char wc[256] ; int main(int argc, char *argv[]) { *wc=0 ; while(gets(w)) { int l = strlen(w) ; if (l<=0) continue; else if (!isalpha(w[l-1])) w[l-1]=0 ; for(int i=0; i<l; ++i) w[i]=tolower(w[i]) ; if (strcmp(w, wc)) { puts(w) ; strcpy(wc, w) ; } } } use with : for %i in (*.htm) do words < %i >>list.txt sort <list.txt >lists.txt xdupl <lists.txt >fravia.txt I used the obtained file as a dictionary, but it failed again.

The Filtering methods
All the following methods are based on the same principle : removing all low probability solutions.

Filtering is a method that generates 'probable keys'. They are dangerous methods, in that they remove arbitrarily complete ranges of keys from our scan. You really need to know that the removed keys have less probabilities to appear unless the method is just a waste of time.

Another way of considering filtering method, is not to suppress low probability keys, but reorder the tests of these keys so the highest probable keys (from your criteria) will be scanned first. This method can statistically accelerate the findings.

Siko's method

You'll need to discover the Reverser's advanced java page to learn about this method. I didn't try it, (because I wasn't aware of it by that time) but I take the opportunity to comment it a little bit here: As it is not symmetrical regarding the letter ordering, it might fail badly, and the routine contains at first a restrictive condition :

if (pass[pos] == pass[pos+1]) { return 0; } contradicting some of the letters choices : case 'f': if (strchr("aefilortu", pass[pos+1]) == NULL) return 0; a word like "afford" wouldn't pass, as an example. These are 'tunings' you need to be aware of.

The limited number of vowel and consonant method

Build all words containg not more than 1 consecutive vowel and 2 consecutive consonants. Works, but it's dangerous and slow.

The 'Triplet' method
I built a table of all used 'triplet' of letters from the english dictionary.
There are only a part of all the 17,576 possible triplets used.
Then, I made a program building all possible combinaisons of words made from these triplets.

This generates 'probable' words. It works, but it's too slow.

The crazy methods

These methods use other approachs, like trying 'obvious' passwords ('steganos', etc...)
In fact, I read that zeezee, (and others) did find the solution (at least I thought they did). In the pages BlackWidow dowloaded for me from Reverser's site, I found a page named 'zee__4.htm'. Soooo, maybe one of the solutions can be found in 'zee__3.htm' ? or 'zee__5.htm' ? It didn't work, but it was worth a try. I named this page originally 'flynn__1.htm', but see how it's named now, I bet you Reverser+ changed it before he published it. Wonder why ?

- Act III Scene II - The Final Approach

I was travelling in a tramway when I though that Reverser was really liking 'reverse' engineering.
(see the advanced version, where a long and quite interesting c 'real reversing engineering' program follows)

... This program doesn't use the supposed file name 'adva.txt', but prints whatever ascii filename it could, so I could be able to detect if Reverser changed the file name. Useless actually. This program found the key within minutes, and guess in which file ... "Reverser.txt", the dictionary of terms imported from Reverser's site !
It sounds obvious afterwards, but this enlighten the fact that getting the "target's context" is essential in such a search. His password is probably a mixture of it's usual words.

Final Notes
We could find that password only because Reverser+ limited the number of letters in it, and told us about it. Would have he used a more complex one, we would need centuries of computing time to find it, or a lot of mathematical analysis to find a RC4 flaw, that probably doesn't exist.

I wish I could use this 'final notes' area to comment a small point from Fabian himself.
One can remark that most of the complicated work above has been done to 'crack' a RC4 password; not to extract the information from the BMP, task somehow easy.

Fabian said that steganography can be revealed because the bmp containing the hidden file compress less than the original one. This is only true because of Fabian's way of doing things, filling up the BMP unused area with noise.

I agree on compressing and crypting the file before hiding it. But I disagree on how the hiding is done, if room is left. Instead of filling up the file with noise, you can spread the information onto the whole file, using as a storage for one bit a n-bits cell (n being the total size available divided by the size of information to be hidden). Taking care to only switch bits in these cells whenever possible, the bandwidth will look about the same, and it will be difficult to know that a file is hidden. Pkzip as an exemple would compress the file with the same efficiency.

The significant bit of every n-bit cell can be chosen using the crypt function itself, making it real hard to reconstruct.

Jean "Flynn".

Ob Duh
I guess the usual Fravis's Ob Duh section doesn't apply here. We just reversed a small part of Fabian's Steganos in order to crack Reverser+'s password and we didn't crack any protection whatsoever.
And I think Reverser+ won't mind ... will he ? :-)

You are deep inside reverser's page of reverse engineering, choose your way out:

Back to Reverser's
Anonymity Academy
Back to Stego's
'normal' page
Disabled in the
'light' version

redhomepage redlinks redsearch_forms red+ORC redstudents' essays redacademy database
redreality cracking redhow to search redjavascript wars
redtools redanonymity academy redcocktails redantismut CGI-scripts redmail_reverser
redIs reverse engineering legal?