(It's only reverser's copy: the real page can be found (with many other goodies) on Mammon_'s own VERY GOOD site)

Mammon_'s Tales to his Grandson
Breathing Life into Dead Listings

Hed * Hiew * IDA * Sourcer * W32Dasm


In reverse engineering, there are two approaches for examining the target: "live" or active examination, in which a program is run under the careful scrutiny of a) a debugger, b) a system monitor, or c) a capture utility that filters disk/file/memory access, API calls, messages, etc.; and "dead" or passive examination, in which a program is opened in a hex editor or disassembler, with the end result being a .lst or .asm file containing a close approximation of the original source code of the program. Most passive utilities will produce an "assembly-language rendering" of the target, which then must be reviewed and corrected by the engineer and finally--if at all--translated into C/C++, Visual Basic, Pascal, Fortran, Java, or whichever language the target is presumed to have been written in. What follows is an introduction to a number of "dead-listing" tools which, once learned by the engineer, will prove invaluable in retrieving an accurate assembly language rendering from the target binary file.


HED

Synopsis: HED v.1.78 is a 370K (installed) "hacking tool disguised as a hex editor", running in a non-resizeable DOS box. Its features include traditional hex/ASCII or disassembly mode, multiple-file editing, excellent search and replace capabilities, macro recording, Win32 API function name resolution, integral expression calculator and ASCII table, branch following, 10 bookmarks, and imports/exports/internal references tables. All in all, a pretty sophisticated hex editor at 193K with a 176K imports.dat file.

Usage: General hex/text editor with Win32 import/export information.

Shortcuts
Esc                 Activate Menu                   Alt-O               Open File
Alt-P               Previous File                   Alt-N               Next File
Alt-Q               Close File                      Alt-X               Exit Program
F2                  Save File                       Alt-F2              Save As
F9                  DOS Shell                       Alt-F9              Execute a program
Ctrl-F4             Calculator                      F11                 ASCII Table
Ctrl-M              Record Macro                    Alt-F               Text Filter
Ctrl-Ins            Copy to HED clipboard           Shift-Ins           Paste from HED clipboard
Ctrl-G              Goto Offset                     Ctrl-B              Goto Previous Position
Alt-Shift-(0...9)   Save Position (Bookmark) 0-9    Alt-(0...9)         Goto Position (Bookmark) 0-9
F5                  Find Number                     F6                  Find Text String
F7                  Find Hex Data                   Ctrl-F6             Find in ASM text
Ctrl-F8             Find reference                  Alt-F8              SuperFind & Replace
Shift-F7            Find Again                      Alt-V               Toggle Hex/ASM view

Notes: HED is a freeware (or, rather, "emailware") hex editor written for OS/2 by Dimitris Kotsonis, and ported to Win32 by Malakoudis S. Panagiotis. It is written in Visual C++ (GNU C for OS/2 version) and takes advantage of the Win32 DOS console interface...the result being that the code of the .exe is interesting to scroll through, but the program itself can slow down in certain functions. The "file open" dialog box, as it does not alphabetize the file names, is tedious to use in large directories such as C:\WINDOWS...creating a PIF file that takes a filename parameter or simply drag'n'dropping the target on the HED icon is recommended. HED will also open multiple files, so that typing C:\WINDOWS\*.* in the "file open" box will open every file in the Windows directory...


HIEW

Synopsis: HIEW v.5.66 is a 177K (installed) hex editor that runs in a DOS box and takes multiple file names as its startup parameters. Features include MZ/PE file header parsing, multiple file editing, hex/ASCII/disassembly views, file search and replace, saved jump table, reference calling, bookmarks, Win32 API function name resolution, built-in 80386 assembler, and cryptographic/XOR functions. HIEW has three different viewing modes (Asm, Hex, and Text, or A, H, and T); from A and H modes the user can enter the "Edit" mode (E). The PE file header summary is particularly effective, allowing the user to jump to locations in the file (such as the .text or .rsrc sections) referenced by the PE Header, its Directory Table, or its Object Table.

Usage: General hex/text editor with PE file header and assembler capabilities.
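
For readers curious about what the PE header parsing amounts to, here is a minimal Python sketch of walking a PE file's section table (the "Object Table" in HIEW's terms). The function name pe_sections and the tuple it returns are hypothetical choices made for illustration and have nothing to do with HIEW's own code; the field offsets follow the published PE/COFF layout.

```python
import struct

def pe_sections(image: bytes):
    """Return (name, virtual_address, raw_size, raw_offset) for each section."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the "PE\0\0" signature, stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", image, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", image, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        off = table + i * 40  # each section header is 40 bytes
        name = image[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", image, off + 8)
        sections.append((name, vaddr, raw_size, raw_ptr))
    return sections
```

Called on the bytes of a real executable, this should list entries such as .text and .rsrc together with the file offsets that a "jump to section" command would land on.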

Crypting Operations: The HIEW manual gives the following explanation for its cryptographic functions:

   Crypt operations are using for crypting/decrypting the code/data. Crypt
algorithm is very simple. Code/data will be crypted by the bytes/words (to
change the size ot the unit, press F2). Crypting routine must be terminated
with "LOOP numberLine" operator.

Available commands:
        Reg mode    : neg,mul,div
        Reg-Reg mode: mov,xor,add,sub,rol,ror,xchg
        Reg-Imm mode: mov,xor,add,sub,rol,ror
        Imm mode    : loop
All 8/16 bit registers are available, except AL/AX that will be filled with (de)crypted byte/word.
The differences from standard assembler:
        there are no jumps;
        'loop' means 'jmp/stop'
        the operands of 'rol/ror' commands must have the same size, i.e.
        ROL AX,CL not allowed.
Example:
     a. XOR byte with 0AAh:
        1. XOR  al,0aah
        2. LOOP 1
     b. XOR word with mask increment
        1. MOV  dx,0
        2. XOR  ax,dx
        3. ADD  dx,1
        4. LOOP 2
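
The two manual examples can be mimicked outside HIEW. The Python sketch below is only an approximation under stated assumptions: that the routine is re-entered at the LOOP target once per byte/word with AL/AX preloaded, that other registers (DX here) keep their value between units, and that words are stored little-endian. The function names are hypothetical.

```python
def xor_bytes(data: bytes, key: int = 0xAA) -> bytes:
    """Example a: XOR every byte with a fixed key (0AAh in the manual)."""
    return bytes(b ^ key for b in data)

def xor_words_incrementing(data: bytes, start: int = 0) -> bytes:
    """Example b: XOR each 16-bit word with a mask that grows by 1 per word."""
    if len(data) % 2:
        raise ValueError("data length must be a multiple of 2 for word mode")
    out = bytearray()
    mask = start                      # MOV dx,0
    for i in range(0, len(data), 2):
        word = int.from_bytes(data[i:i + 2], "little")
        out += (word ^ mask).to_bytes(2, "little")   # XOR ax,dx
        mask = (mask + 1) & 0xFFFF    # ADD dx,1 (16-bit wraparound)
    return bytes(out)
```

Since XOR is its own inverse, running either function twice with the same parameters gives back the original data, which is why one routine serves for both crypting and decrypting.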
Shortcuts
Enter           Toggle View Mode                      Alt-H                 Help
F1              File Info                             F2                    Wrap/Unwrap (T), Assemble (E)
F3              Edit (A,H), Undo (E)                  F4                    Mode
F5              Goto (A,H)                            F6                    Linefeed (T), Find reference on current position (A)
F7              Search (A,H,T), Crypt (E)             F8                    Header (A,H), XLAT (T), XOR (E)
F9              Open Files (A,H,T), Update File (E)   F10                   Exit (A,H,T), Truncate File (E)
Alt-P           Save screen to file                   Alt-R                 Reload file
Ctrl-F3         Search and Replace                    Ctrl-F7 | Ctrl-Enter  Search Next
Ctrl-F8         Previous File                         Ctrl-F9               Next File
+               Bookmark                              Alt-(1...8)           Goto Bookmark
Alt- -          Clear current bookmark                Alt-0                 Clear all bookmarks
1...9 | A...Y   Jump to target/save jump              0 | Z                 Return from jump


IDA (Interactive DisAssembler)

Synopsis: IDA v. 3.7 is a 15.9 MB interactive disassembler (hence its name), similar in a way to the old Bubble Chamber disassembler: a file is loaded, disassembled by the program, then the user is given a chance to modify code and data interpretations before saving the final output file. IDA takes this method to the extreme, modifying the code after the user makes changes to re-interpret the program...basically saving the user a lot of work. Features include multiple file editing, integral byte patcher (creates an .exe file for DOS files, or a .dif difference file for other formats), integral calculator, extensive macro language, integral text editor/viewer, full navigational and code interpretation facilities.

Usage: IDA is different from other disassemblers in that the user is intended to modify the disassembled file "interactively" with the program until an adequate approximation of the original source code is produced. Obtaining a full disassembled listing therefore requires that the user take part in three distinct processes:

  1. Giving IDA the correct loading information for the file at startup
  2. Modifying code and data when misinterpreted by IDA
  3. Commenting the disassembled file extensively

The first and the third processes are pretty simple: the "Load File Of New Format" window provides plenty of options for the user to configure (be sure to set the DLL directory to c:\windows\system and not c:\windows; also uncheck "Rename DLLs" and check "Load Resources" and "Make Imports Section"), and typing ":" allows the user to enter comments that stand out in bright white (and are therefore easily distinguishable from the brown IDA-generated comments).

The second process is the hardest, the most time consuming, and the one that requires the most technical knowledge. The user can use the C command to change data into code, and the D command to do the opposite--note that each of these commands will cause changes throughout the file, for all relevant bytes beneath the changed line will be converted to data or code as well. This means basically that the user must have very intimate knowledge of the program itself and the structure of the file format they are working on in order to get full use out of IDA.

Not all files require this much work to disassemble, however; with Windows files in particular, IDA does a good job on its own and usually provides the user with a more than adequate disassembly that only needs a little commenting and data modification. For cases like this, IDA provides excellent navigational commands (summarized in the Shortcuts section below) as well as the ability to change the data representation on the current line to hexadecimal (Q), ASCII (R), octal, binary (B), or decimal (H). The user can also rename (N) functions or variables defined by IDA, and can even patch the file from within the IDA environment.
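
To make the representation toggles concrete, here is a small Python sketch that renders one operand value in each of the forms mentioned above. The h/b/o suffixes and the leading zero on hex values are assumed to follow the usual MASM-style conventions rather than being copied from IDA output, and the function name operand_views is hypothetical.

```python
def operand_views(value: int) -> dict:
    """Render one operand in the notations the Q, H, B and R keys switch between."""
    hex_digits = f"{value:X}"
    if hex_digits[0].isalpha():
        hex_digits = "0" + hex_digits   # assemblers expect a leading digit
    printable = 0x20 <= value < 0x7F
    return {
        "hex": hex_digits + "h",
        "decimal": str(value),
        "binary": f"{value:b}b",
        "octal": f"{value:o}o",
        "char": repr(chr(value)) if printable else None,
    }

for name, text in operand_views(0x41).items():
    print(f"{name:8}{text}")
```

The same byte can read as a number, a bit mask, or a character, and picking the right one is most of what "data modification" means in an otherwise clean listing.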

 

Tutorial
This brief example will make use of Rundll32.exe, found in every Windows directory. This program is 8K and thus is the perfect size for an introduction; its purpose is to manually load .DLL files into memory, as if they were executables. Run IDA by loading IDAW.EXE, then select c:\windows\rundll32.exe for the target file. IDA Pro will present you with a dialog box of loading parameters:
Load as...
* Portable executable
_ MSDOS .exe
_ Binary file

Loading segment: 0x1000 (Exe & Bin) (paragraph where file will be loaded...only for exe/bin)
Loading offset: 0x0 (bin) (binary only...offset of first byte from start of first segment)

* Create Segments (bin)
* Load Resources
_ Rename DLL entries (if unchecked, makes repeatable comments for entries imported by ordinal...else renames 2nd occurrence)
_ Manual Load (NE, LE, LX...IDA will ask for loading addresses/selectors for each object in file)
_ Fill Segment Gaps (NE)
* Make Imports Section (PE) (convert .idata section to extrn directives)
_ Don't align segments (OMF)
_ IBM Object Table (OMF)
DLL directory: c:\windows\system
First thing: save the database by going to File->SaveDatabase (or pressing Ctrl-W); this will allow you to come back to your work later simply by loading the .IDB file instead of an executable when IDA starts up.

Next, scroll through the code to get the lay of the land...this is a relatively small file. Note that at offset 00401416 there starts a continuous sequence of add [eax], al repeating over and over. Toggling to hex mode via F4 or just examining the bytes after the offset will show that this is just a continuous block of 00's, terminating at 4015FE with the end of the .text segment--meaning that these 00's are padding to fit the File Alignment "magic number"; the code segment therefore really ends at offset 00401416.
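
The byte pair 00 00 decodes as add [eax], al in 32-bit code, which is why a run of zero padding shows up as that instruction repeated. Finding where the real code stops can be sketched in a few lines of Python; the function name and the sample bytes below are hypothetical, and the sketch assumes the section's raw bytes have already been read from the file.

```python
def trailing_padding_start(section: bytes) -> int:
    """Return the offset (within the section) where the trailing run of 00 bytes begins."""
    end = len(section)
    while end > 0 and section[end - 1] == 0:
        end -= 1
    return end

# Made-up sample: four bytes of code followed by twelve bytes of zero padding.
sample = bytes([0x55, 0x8B, 0xEC, 0xC3]) + bytes(12)
start = trailing_padding_start(sample)
print(f"code ends at +{start:#x}, padding is {len(sample) - start} bytes")
```

Adding the returned offset to the section's start address gives the address at which the meaningful code ends. A routine that legitimately ends in 00 bytes would be cut slightly short by this scan, so the result is a hint to check in the hex view rather than a guarantee.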

IDA has produced one anomaly in this block of padding: at offset 00401464 it has generated the comment CODE XREF: .text:004013F5^j, meaning that this address is referenced by a jump at 4013F5. Press ENTER while the cursor is over the cross-reference "jump-to" address and IDA will switch to this line of code: jnz short near ptr loc_401464+1. The location at 00401464 is always going to be zero, so the value 401464+1 would be simply 1, or the first line of code...which happens to be a subroutine.

Okay, on to work. Just what does this program do? Go to the View menu and choose names; this will show the imports used by the program and give you a brief overview: 28 names, all standard functions such as lstrcpyA, wsprintf, MessageBoxA, and LoadIconA, plus library functions like LoadLibraryA, FreeLibrary, and GetProcAddress that one would expect due to the nature of this program.

The .text section in this small program is only 416 lines...easy enough to track through manually using IDA:

Go to the program entry point by pressing Ctrl-E; you will start off at address 401028 which, as is standard for the start of a 
program or function, will prepare a stack frame. From here you can create a "skeleton" outline of the code by noting the "flow 
of execution", taking down relevant jumps and calls and any imports from the Windows API:

Start:
401028 Start of Program
40102F API: GetCommandLine, store pointer in esi
401075 API: GetStartupInfo, store STARTUPINFO structure in ebp+var_44
401090 API: GetModuleHandle...either 0Ah or ebp+var_14 (address of module to return handle for)
401097 Call 401322
40109F API: ExitProcess

Type G 401322 or double-click/press enter on the address 401322 in line 401097:

Main:
401334 API: SetErrorMode mask:8001h
401344 Call 4010AC
401352 Call 40124F (RegisterClassA_CreateWindowExA function)
401373 Call J_SHELL32_122 
40137A Call 402010 (Bad Call: .data segment)
401380 Call 4012F8 (DestroyWindow_FreeLibrary)
40138S RET (end subroutine)

Using the same method, investigate each of the called subroutines:

Call from Main #1:
...to 4010AC...	*****Function 4010AC*****
4010DD Call 401000 (CharNextA function, parameters 20h, esi)
4010EF Call 401000 (CharNextA function, parameters 2Fh, esi)
4010F8 Jcc 401101
4010FC JMP 40120B (RET)

...to 401000...	*****Function 401000*****
40101A API: CharNextA
401025 RET

...to 401101...
401106 API: LoadLibrary
401113 Jcc 4011C7
40111A Call Kernel32_35
401128 Jcc 401182
40112C Call Kernel32_37
401139 Jcc 401161
401149 Call 40138D (LoadString_wsprintfA_MessageBox function)
401154 Call Kernel32_36
40115C JMP 40120B (RET)
...to 4011C7... 
4011CE API: GetProcAddress
4011DB Jcc 40116B
4011F6 API: FreeLibrary
4011FE JMP 40120B (RET)
...to 401182...
401192 API: GetLastError
4011A0 API: FormatMessageA
4011BE Call 40138D (LoadString_wsprintfA_MessageBox function)
4011C5 JMP 40120B (RET)
...to 401161...
40116D Jcc 401200 (RET)
401177 API: lstrcpy
40117D JMP 401206 (RET)
...to 401208...
401211 RET

...to 40138D...	*****Function 40138D*****
4013AB API: LoadStringA
4013C9 API: wsprintfA
4013E1 API: MessageBox
4013EA RET


Call From Main #2:
...to 40124F...	*****Function 40124F*****
401277 API: Call LoadIconA
401286 API: Call LoadCursorA
401290 API: Call GetStockObject
4012A7 API: Call RegisterClassA
4012DF API: Call CreateWindowExA
4012F5 RET

Call From Main #3:
...to 4013EE...	*****Function J_SHELL32_122*****
4013EE API: Shell32.122 (Unknown, poss ExtractAssociatedIconExW)

Call From Main #4:
...to 402010...
.data segment
402010 db 00 00 00 00

Call From Main #5:
...to 4012F8...	*****Function 4012F8*****
4012FE  API: DestroyWindow
401313  API: Kernel32.36 (unknown)
40131B API: FreeLibrary
401321 RET

Comparing the above abstract with the list of internal routines in View-> Functions shows that all 8 of Rundll32.exe's routines have been accounted for. While this source code still has a few mysteries that could be cleaned up, its functionality is relatively clear: this is simply a "loader" function that takes the name of a .DLL file as its startup parameter, then loads that .DLL using the GetProcAddress/LoadLibrary combo that is used in many applications for loading their own .DLLs. Not very mysterious at all...more like a patch than a utility.
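
The "all 8 routines" tally can be double-checked by treating the outline above as a call graph and walking it from the entry point. The Python sketch below hard-codes the internal calls noted in the outline; the dictionary layout and function name are hypothetical, and imported API calls are left out on purpose since they are not routines of the program itself.

```python
TEXT_START, TEXT_END = 0x401000, 0x4015FE   # .text bounds taken from the tutorial

# caller -> internal call targets, copied from the outline
CALLS = {
    0x401028: [0x401322],                                          # Start
    0x401322: [0x4010AC, 0x40124F, 0x4013EE, 0x402010, 0x4012F8],  # Main
    0x4010AC: [0x401000, 0x40138D],
}

def reachable_functions(entry: int, calls: dict) -> list:
    """Depth-first walk that skips targets outside .text (such as 402010 in .data)."""
    seen, stack = [], [entry]
    while stack:
        addr = stack.pop()
        if addr in seen or not (TEXT_START <= addr <= TEXT_END):
            continue
        seen.append(addr)
        stack.extend(reversed(calls.get(addr, [])))
    return seen

functions = reachable_functions(0x401028, CALLS)
print(len(functions), [f"{a:X}" for a in functions])
```

The walk visits eight addresses, matching the count reported by View->Functions, and drops 402010 because it lies in the .data segment.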


Configuration
IDA, like Soft-Ice, has a configuration file (IDA.CFG) which the user can customize to suit his needs. Useful functions to add keyboard commands for are ViewFile (F9), EditFile (Alt-F9), ViewFunctions (Alt-F), and ViewNames (Alt-N). In addition to defining keyboard commands and #defining a lot of parameters, IDA.CFG contains analysis and display parameters that can be configured as follows (or to taste):

//-------------------------------------------------------------------------
//
//	Analysis parameters
//
//-------------------------------------------------------------------------

ENABLE_ANALYSIS		= YES	// Background analysis is enabled

SHOW_INDICATOR		= YES	// Show background analysis indicator

#define AF_FIXUP	0x0001	// Create offsets and segments using fixup info
#define AF_MARKCODE	0x0002	// Mark typical code sequences as code
#define AF_UNK		0x0004	// Delete instructions with no xrefs
#define AF_CODE		0x0008	// Trace execution flow
#define AF_PROC		0x0010	// Create functions if call is present
#define AF_USED		0x0020	// Analyse and create all xrefs
#define AF_FLIRT	0x0040	// Use flirt signatures
#define AF_PROCPTR	0x0080	// Create function if data xref data->code32 exists
#define AF_JFUNC	0x0100	// Rename jump functions as j_...
#define AF_NULLSUB	0x0200	// Rename empty functions as nullsub_...
#define AF_LVAR		0x0400	// Create stack variables
#define AF_TRACE	0x0800	// Trace stack pointer
#define AF_ASCII	0x1000	// Create ascii string if data xref exists
#define AF_IMMOFF	0x2000	// Convert 32bit instruction operand to offset
#define AF_DREFOFF	0x4000	// Create offset if data xref to seg32 exists
#define AF_FINAL	0x8000	// Final pass of analysis
				// See also ANALYSIS2, bit AF2_DODATA

ANALYSIS	= 0xFFFF	// This value is combination of the defined
				// above bits.

#define AF2_JUMPTBL	0x0001	// Locate and create jump tables
#define AF2_DODATA	0x0002	// Coagulate data segs in the final pass

ANALYSIS2	= 0x0001

//-------------------------------------------------------------------------
//
//	Text representation
//
//-------------------------------------------------------------------------

OPCODE_BYTES       	= 6		// Display up to 6 bytes of instruction/data
INDENTION		= 0		// Indention of instructions
COMMENTS_INDENTION	= 30		// Indention for on-line comments
MAX_TAIL		= 16		// Tail depth
MAX_XREF_LENGTH		= 80		// Maximal length of line with cross-references
MAX_DATALINE_LENGTH	= 70		// Data directives (db,dw, etc):
					//   max length of argument string
SHOW_AUTOCOMMENTS	= YES		// Show automatic comments
SHOW_BAD_INSTRUCTIONS	= NO		// Don't bother about instruction lengthes
SHOW_BORDERS		= YES		// Borders between data/code
SHOW_EMPTYLINES		= NO		// Generate empty line to make
					// text more readable
SHOW_LINEPREFIXES	= YES		// Show line prefixes (1000:0000)
SHOW_SEGMENTS		= YES		// Show segments in addresses
USE_SEGMENT_NAMES	= YES		// Show segment names instead of numbers
SHOW_REPEATABLE_COMMENTS = YES		// Of course, use repeatable comments
					// Disabling this increases IDA speed.
SHOW_VOIDS		= NO		// Don't display <void> marks
SHOW_XREFS		= 100		// Show up to 100 cross-references
SHOW_XREF_VALUES	= YES		// If not, xrefs are displayed
					// as "..."
SHOW_SEGXREFS		= YES		// Show segment part of addresses
					// in cross-references
SHOW_SOURCE_LINNUM	= YES		// Show source line numbers
					// (used in .obj files and java)
SHOW_ASSUMES		= YES		// Generate 'assume' directives
SHOW_ORIGINS		= YES		// Generate 'org' directives
USE_TABULATION		= YES		// Use '\t' in output file
//-------------------------------------------------------------------------
//	Processor specific parameters
//-------------------------------------------------------------------------
#ifdef __PC__				// INTEL 80x86 PROCESSORS
USE_FPP			= YES		
					// Floating Point Processor
					// instructions are enabled

WINDIR			= "c:\\windows\\system"	// Default directory to look up for
					// DLL files
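
The ANALYSIS and ANALYSIS2 values in the listing above are plain bit masks built from the #define lines. As a rough aid for working out a custom value, here is a Python sketch that decodes and encodes such masks; the flag names and bit values are copied from the listing, while the helper names decode and encode are hypothetical.

```python
ANALYSIS_FLAGS = {
    0x0001: "AF_FIXUP",   0x0002: "AF_MARKCODE", 0x0004: "AF_UNK",
    0x0008: "AF_CODE",    0x0010: "AF_PROC",     0x0020: "AF_USED",
    0x0040: "AF_FLIRT",   0x0080: "AF_PROCPTR",  0x0100: "AF_JFUNC",
    0x0200: "AF_NULLSUB", 0x0400: "AF_LVAR",     0x0800: "AF_TRACE",
    0x1000: "AF_ASCII",   0x2000: "AF_IMMOFF",   0x4000: "AF_DREFOFF",
    0x8000: "AF_FINAL",
}
ANALYSIS2_FLAGS = {0x0001: "AF2_JUMPTBL", 0x0002: "AF2_DODATA"}

def decode(mask: int, table: dict) -> list:
    """List the flag names whose bits are set in mask."""
    return [name for bit, name in sorted(table.items()) if mask & bit]

def encode(names: list, table: dict) -> int:
    """OR together the bits for the given flag names."""
    lookup = {name: bit for bit, name in table.items()}
    value = 0
    for name in names:
        value |= lookup[name]
    return value

print(decode(0x0001, ANALYSIS2_FLAGS))
print(hex(encode(["AF_FIXUP", "AF_CODE"], ANALYSIS_FLAGS)))
```

Decoding 0xFFFF returns all sixteen AF_ names, so ANALYSIS = 0xFFFF switches on every option, while ANALYSIS2 = 0x0001 enables jump table detection only.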


Shortcuts
Alt-Z       Save Database placeholder


Sourcer

Synopsis: Sourcer v.7.0 is a DOS mode disassembler that uses a Windows pre-processor (essentially a script that calls resdump, dumppe, impdump, dumplx, and dumpne, then formats their output for use by Sr.exe); together the whole package is 1.79 MB. Output is a .lst file containing the asm source code for the original file; the goal of Sourcer is to provide source code that is re-compilable for the target assembler.

Usage: Sourcer is non-interactive; the user sets options for disassembly, then runs Sourcer--when it has finished, they can peruse the .lst or .asm file at their leisure in a standard text editor. Windows programs are first run through the winp.exe preprocessor, which produces a .r and .wdf file as input for sr.exe (the main Sourcer executable).

Windows Preprocessor:

Sourcer:


W32DASM

Synopsis: W32DASM v.8.9 is a combined disassembler/debugger that totals up to 2.13MB. The disassembler allows viewing of one file at a time; starting a debug process allows the disassembled file to be run and patched in memory (debug-mode commands are marked with D, below). Features include import and export function tables, reference tables for strings, menus, and dialog boxes, hex dumps of data and code segments, and jump/call branching. The debugger is standard fare with the added features of in-memory code patching and Windows API call "detailing"--a valuable feature that gives the parameters and returns of any API call made by the program.

Usage:

Debugger:

Shortcuts
Ctrl-L        Load Process                Ctrl-T              Terminate Process (D)
F5            Auto Step Into (D)          F6                  Auto Step Over (D)
F7            Step Into (D)               F8                  Step Over (D)
F9            Run Process (D)             Space               Pause Process (D)
F2            Breakpoint Toggle (D)       Ctrl-C              Copy Selection
Ctrl-S,F      Find Text                   F3                  Find Next
Ctrl-S        Goto Code Start             F10                 Goto Entry Point
F11           Goto Page                   F12                 Goto Code Location
Left Arrow    Execute Jump                Ctrl-Right Arrow    Return From Jump
Left Arrow    Execute Call                Right Arrow         Return From Call

