Header information parser for PE, ELF, DEX, MachO, ZIP (JAR, DocX).
Parses header information of a binary (executable) file.
PE, ELF, DEX, MachO, ZIP (JAR, DocX) are parsed in depth.
Java.class, ART, .NET, NE, MS-DOS are recognized.
The focus was on PE and ELF.
The other types are handled less carefully but may be extended in the future.
PE and ELF parsing is not complete either and still has to be extended.
POSIX compliant.
Compiles and runs under
1.15.15
Last changed: 23.09.2024
script
$ ./linuxBuild.sh [-t app] [-m Release|Debug] [-h]
manual
$ mkdir build
$ gcc -o build/headerParser -Wl,-z,relro,-z,now -D_FILE_OFFSET_BITS=64 -Ofast src/headerParser.c src/pe/PEHeader.c src/pe/PEHeaderOffsets.c
Use clang instead of gcc in Termux on Android.
$ winBuild.bat [/exe] [/m <Release|Debug>] [/b <32|64>] [/rtl] [/pdb] [/bt <path>] [/pts <PlatformToolset>] [/h]
This will run in a normal cmd.
The correct path to your build tools may be passed with the /bt parameter or changed in the script winBuild.bat itself.
The PlatformToolset defaults to “v142”, but may be changed with the /pts option.
“v142” is used for VS 2019, “v143” would be used in VS 2022.
In a developer cmd you can also type:
$devcmd> msbuild HeaderParser.vcxproj /p:Configuration=<Release|Debug> /p:Platform=<x64|x86> [/p:PlatformToolset=<v142|v143|WindowsApplicationForDrivers10.0>]
Warnings
MSBuild issues some serious warnings:
headerParser\src\headerDataHandler.h(54): warning C6001: Using uninitialized memory
headerParser\src\dex\DexHeaderParser.h(371): warning C6386: Buffer overrun
So far I could not figure out how to fix them or, put another way, what the underlying problem is.
If someone knows, feel free to drop me a line.
If a “VCRUNTIMExxx.dll not found” error occurs on the target system, statically linking the runtime libraries is a solution.
This is done with the /p:RunTimeLib=Debug|Release (msbuild) or /rtl (winBuild) flag.
$ ./headerParser a/file/name [options]
$ ./headerParser [options] a/file/name
Options:
It may be convenient to add HeaderParser to the context menu so that you can right-click a file and parse its header.
In this scenario, you may use:
$ addHeaderParserToShellCtxtMenu.bat /p "c:\HeaderParser.exe" [/l "Open in HeaderParser"]
$ ./headerParser a/file/name [-i 1]
HeaderData:
coderegions:
(1) .text: ( 0x0000000000000400 - 0x000000000000fc00 )
(2) .init ...
(3) ...
headertype: PE|ELF|... (32|64)
bitness: 64-bit|32-bit|x-bit
endian: little|big
CPU_arch: Intel|Arm|...
Machine: ...
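For scripts that consume this output, the code region lines can be picked apart with a regular expression. The following is only a minimal sketch that assumes the line layout shown in the sample above; `parse_code_region` and `REGION_RE` are hypothetical names and not part of HeaderParser.

```python
import re

# Matches lines such as "(1) .text: ( 0x0000000000000400 - 0x000000000000fc00 )".
# The layout is assumed from the sample output; it is not a documented format.
REGION_RE = re.compile(
    r"\((\d+)\)\s+(\S+):\s+\(\s*(0x[0-9a-fA-F]+)\s*-\s*(0x[0-9a-fA-F]+)\s*\)"
)


def parse_code_region(line):
    """Return (index, name, start, end) for a code region line, or None."""
    match = REGION_RE.search(line)
    if match is None:
        return None
    index, name, start, end = match.groups()
    return int(index), name, int(start, 16), int(end, 16)


if __name__ == "__main__":
    sample = "(1) .text: ( 0x0000000000000400 - 0x000000000000fc00 )"
    print(parse_code_region(sample))
```

Lines that do not describe a code region, such as the headertype line, yield None and can be skipped by the caller.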
There is a difference between the header bitness (displayed in brackets following the headertype) and the bitness of the executable (program code).
The header bitness is 32 or 64 bit for ELF, MACH-O and PE.
The bitness of the executable (program code) may be different though.
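To illustrate where the header bitness comes from, the sketch below reads it directly from the file format fields: the EI_CLASS byte of an ELF header and the optional header magic of a PE image. It is an independent illustration, not HeaderParser's own implementation; `header_bitness` is a hypothetical helper, and Mach-O is left out for brevity.

```python
import struct


def header_bitness(data):
    """Return the header bitness (32 or 64) of an ELF or PE image, else None."""
    # ELF: e_ident[EI_CLASS] at offset 4 is 1 (ELFCLASS32) or 2 (ELFCLASS64).
    if data[:4] == b"\x7fELF" and len(data) > 4:
        return {1: 32, 2: 64}.get(data[4])
    # PE: e_lfanew at offset 0x3c points to the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        # The optional header magic follows the signature (4 bytes)
        # and the COFF file header (20 bytes).
        magic_offset = pe_offset + 4 + 20
        if (data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
                and len(data) >= magic_offset + 2):
            (magic,) = struct.unpack_from("<H", data, magic_offset)
            # 0x10b marks PE32, 0x20b marks PE32+ (64-bit header).
            return {0x10B: 32, 0x20B: 64}.get(magic)
    return None
```

This value only describes the header layout; the bitness of the program code has to be derived from other fields, such as the machine type.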
Setting “-i 2” prints an extended output that covers the basic headers.
$ ./headerParser a/file.exe -i 2
PE Image Dos Header:
...
Coff File Header:
...
Optional Header::
...
Section Header:
1 / x
...
2 / x
$ ./headerParser an/elf/file -i 2
ELF File header:
...
Program Header Table:
1 / x
...
2 / x
...
Section Header Table:
1 / y
...
2 / y
...
A more fine-grained and/or extended printout is available with the PE-only or ELF-only options.
If you think the header starts somewhere inside the file, you may pass its offset using the “-s” option.
If you think it is a PE file but the MZ or PE00 magic values are broken, try the “-f pe” option.
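When the start of an embedded header is unknown, candidate offsets can be found by scanning the file for well-known magic values. The sketch below is one simple way to do that and is not part of HeaderParser; `find_magic_offsets` is a hypothetical helper, and the default magic list (MZ and ELF only) is an assumption chosen for illustration.

```python
def find_magic_offsets(data, magics=(b"MZ", b"\x7fELF")):
    """Return a sorted list of (offset, magic) pairs found in data."""
    hits = []
    for magic in magics:
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, magic))
            # Continue searching after the current hit.
            pos = data.find(magic, pos + 1)
    return sorted(hits)


if __name__ == "__main__":
    blob = b"\x00" * 16 + b"MZ" + b"\x00" * 14 + b"\x7fELF"
    for offset, magic in find_magic_offsets(blob):
        print(hex(offset), magic)
```

Short magics such as MZ produce many false positives in arbitrary data, so each hit is only a candidate; an offset found this way can then be tried with the “-s” option.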
HeaderParser may also be built as a shared or static library.
Linux
$ ./linuxBuild.sh -t sh [-m Release|Debug] [-h]
or
$ ./linuxBuild.sh -t st [-m Release|Debug] [-h]
or plain:
shared
$ mkdir build
$ gcc -fPIC -Wl,-z,relro,-z,now -shared -Ofast -D_FILE_OFFSET_BITS=64 -Wall -o build/libheaderparser.so src/headerParserLib.c src/pe/PEHeader.c src/pe/PEHeaderOffsets.c
static
$ mkdir build
$ gcc -fPIC -Wl,-z,relro,-z,now -Ofast -D_FILE_OFFSET_BITS=64 -c -Wall -o build/headerParserLib.o src/headerParserLib.c
$ gcc -fPIC -Wl,-z,relro,-z,now -Ofast -D_FILE_OFFSET_BITS=64 -c -Wall -o build/PEHeader.o src/pe/PEHeader.c
$ gcc -fPIC -Wl,-z,relro,-z,now -Ofast -D_FILE_OFFSET_BITS=64 -c -Wall -o build/PEHeaderOffsets.o src/pe/PEHeaderOffsets.c
$ ar rcs build/headerParser.a build/*.o
Windows
$ winBuild.bat /dll [/m Release|Debug] [/b 32|64]
// or
$ winBuild.bat /lib [/m Release|Debug] [/b 32|64]
In addition to the included header files, src\exp.h is needed in the same directory.
This may be removed soon.
// link library when compiling
// include
#include "src/HeaderData.h"
#include "src/headerParserLib.h"
...
// use library
size_t offset = 0;
uint8_t force = FORCE_NONE; // or FORCE_PE
HeaderData* data = getBasicHeaderParserInfo("a/file.path", offset, force);
if ( data )
{
// do stuff handling data
// ...
}
// clean up
freeHeaderData(data);
For PE files there is an extended parser available.
This one includes the basic data info.
// include
#include "src/PEHeaderData.h"
#include "src/headerParserLibPE.h"
...
// use library
size_t offset = 0;
PEHeaderData* data = getPEHeaderData("a/file.path", offset);
if ( data )
{
// do stuff handling data
// ...
}
// clean up
freePEHeaderData(data);
Using the library is the preferred way to use HeaderParser from Python.
On the Python side, use header_parser.py.
from src import header_parser
# initialization
header_parser.init("src/of/libheaderparser.so")
# default usage
data = header_parser.get_basic_info('a/file.src')
# passing a start offset
data = header_parser.get_basic_info('a/file.src', 10)
# passing a start offset and forcing PE parsing
data = header_parser.get_basic_info('a/file.src', 10, header_parser.FORCE_PE)
# convert cpu id and header type id into strings
cpu = header_parser.lib_header_parser.getHeaderDataArchitecture(data['cpu'])
header_type = header_parser.lib_header_parser.getHeaderDataHeaderType(data['headertype'])
Published under GNU GENERAL PUBLIC LICENSE.
common_codeio.h, Icon.ico