Create your own disassembler in Python (pefile & capstone)
Skipping the introductions: if you want to disassemble an exe file using Python, you will need two libraries, pefile and capstone. Both are published on PyPI, and on Windows I installed them with pip (typically `pip install pefile` and `pip install capstone`).
If those two commands don’t work for you, please consider visiting the official pages of both packages.
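Before going further, it can help to confirm that both packages are visible to the interpreter you plan to use. The snippet below is only a small sketch using the standard library; the helper name `missing_packages` is made up for this post.

```python
from importlib import util


def missing_packages(names=("pefile", "capstone")):
    """Return the package names that the current interpreter cannot find."""
    return [name for name in names if util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("pefile and capstone are both available")
```

If something is reported as missing, the usual cause is that pip installed into a different Python installation than the one running the script.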
To write the disassembler script, you will need to follow these two steps:
- Locate the code section using pefile
- Parse the code section and disassemble it with Capstone
Let’s dig into the details!
1. LOCATING CODE SECTION
An exe file is made up of several sections: code, data, debug information, resources (rsrc) and so on. When parsing a section, its name doesn’t matter; in fact, Windows does not take section names into consideration at all. Every section is described by an entry in a table called the section table. To keep things simple, we will focus only on the main code section (an exe file can have more than one code section).
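To see the section table for yourself, here is a minimal sketch that lists each entry with pefile. Since names are not reliable, it also looks at the `Characteristics` flags of each section, which is where the PE format marks a section as containing code or being executable. The helper names (`section_name`, `is_executable`, `list_sections`) are my own, and pefile is imported inside `list_sections` so the two small helpers can be tried without it.

```python
# Flag values from the PE/COFF section header definition.
IMAGE_SCN_CNT_CODE = 0x00000020      # section contains executable code
IMAGE_SCN_MEM_EXECUTE = 0x20000000   # section can be executed


def section_name(raw_name):
    """Decode the 8-byte, null-padded section name stored in the table."""
    return raw_name.rstrip(b"\x00").decode("ascii", errors="replace")


def is_executable(characteristics):
    """True if the flags mark the section as code or as executable."""
    return bool(characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE))


def list_sections(path):
    """Return (name, virtual address, virtual size, executable?) per section."""
    import pefile  # imported here so the helpers above work without pefile

    pe = pefile.PE(path)
    return [
        (
            section_name(s.Name),
            s.VirtualAddress,
            s.Misc_VirtualSize,
            is_executable(s.Characteristics),
        )
        for s in pe.sections
    ]
```

Calling `list_sections` on an exe of your choice should return one tuple per entry in the section table.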
In the header of each exe file there is a field (AddressOfEntryPoint, in the optional header) that points to the first instruction of the program. That instruction is located in the main code section. Knowing that, we will parse the exe file, get the address ranges of all existing sections, and then find the section that contains the address of the program’s first instruction.
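To make that header field less abstract, here is a rough sketch that reads it by hand with nothing but `struct`. pefile does all of this for you, so treat it purely as an illustration; the function name `read_entry_point_rva` is made up, and the offsets follow the PE/COFF layout (the PE header offset is stored at 0x3C, and the entry point sits 16 bytes into the optional header).

```python
import struct


def read_entry_point_rva(data):
    """Read AddressOfEntryPoint directly from the raw bytes of an exe file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    # AddressOfEntryPoint is at offset 16 of the optional header.
    (entry_rva,) = struct.unpack_from("<I", data, optional_header + 16)
    return entry_rva
```

The value returned is a relative virtual address (RVA), meaning an offset from the image base, which is why it can be compared directly with the virtual addresses in the section table.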
Some code should make things clearer. First, we need to import the two libraries, pefile and capstone.
Next, let’s define the function that locates the main code section:
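A minimal sketch of such a function could look like this. It is one possible reading of the steps described above, not necessarily the exact code originally shown here; the names `find_entry_section` and `locate_main_code_section` are my own. The search itself only needs objects with the attributes pefile exposes on a section (`VirtualAddress`, `Misc_VirtualSize`, `SizeOfRawData`).

```python
def find_entry_section(sections, entry_rva):
    """Return the section whose address range contains entry_rva, or None."""
    for section in sections:
        start = section.VirtualAddress
        # Fall back to the raw size when the virtual size is recorded as 0.
        size = section.Misc_VirtualSize or section.SizeOfRawData
        if start <= entry_rva < start + size:
            return section
    return None


def locate_main_code_section(path):
    """Parse the exe at `path` and return (pe, section holding the entry point)."""
    import pefile  # imported here so find_entry_section works without pefile

    pe = pefile.PE(path)
    entry_rva = pe.OPTIONAL_HEADER.AddressOfEntryPoint
    return pe, find_entry_section(pe.sections, entry_rva)
```

pefile also offers `pe.get_section_by_rva(rva)`, which performs a similar lookup; the loop is written out here only to show what the search does.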
2. DISASSEMBLE WITH CAPSTONE
Next, we use Capstone to walk through the code section and disassemble it. We need to determine where the code begins and where it ends, then decode the bytes in between; that’s all there is to it. The code below contains all the steps needed to do that.
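Here is a sketch of those steps under a few assumptions: the file is an x86 or x64 executable, `pe` and `section` come from pefile, and the function names are invented for this post. The start address is the image base plus the section’s virtual address, and the end is the start plus the section’s size.

```python
def format_instruction(address, mnemonic, op_str):
    """Render one instruction as 'address: mnemonic operands'."""
    return f"0x{address:x}:\t{mnemonic}\t{op_str}".rstrip()


def disassemble(code, start_address, is_64bit=False):
    """Disassemble raw bytes, numbering instructions from start_address."""
    # Imported here so format_instruction stays usable without capstone.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32, CS_MODE_64

    md = Cs(CS_ARCH_X86, CS_MODE_64 if is_64bit else CS_MODE_32)
    md.skipdata = True  # step over bytes that do not decode instead of stopping
    return [
        format_instruction(insn.address, insn.mnemonic, insn.op_str)
        for insn in md.disasm(code, start_address)
    ]


def disassemble_section(pe, section):
    """Disassemble one pefile section from its beginning to its end."""
    start = pe.OPTIONAL_HEADER.ImageBase + section.VirtualAddress
    size = section.Misc_VirtualSize or section.SizeOfRawData
    code = section.get_data()[:size]  # bytes between the start and end of code
    # An optional header magic of 0x20B marks a 64-bit (PE32+) file.
    return disassemble(code, start, is_64bit=(pe.OPTIONAL_HEADER.Magic == 0x20B))
```

Keep in mind that this is a plain linear sweep: data embedded in the code section will be decoded as if it were instructions.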
Finally, let’s put everything together:
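As a compact sketch of the whole script, the version below combines both steps in a single generator. It assumes an x86 or x64 exe, uses pefile’s own `get_section_by_rva` for the lookup, and the names `is_64bit` and `disassemble_file` are mine rather than part of either library.

```python
PE32_PLUS_MAGIC = 0x20B  # optional header magic of a 64-bit (PE32+) file


def is_64bit(optional_header_magic):
    """True when the optional header magic identifies a PE32+ file."""
    return optional_header_magic == PE32_PLUS_MAGIC


def disassemble_file(path):
    """Yield one text line per instruction of the main code section."""
    import pefile
    from capstone import Cs, CS_ARCH_X86, CS_MODE_32, CS_MODE_64

    # Step 1: locate the section that contains the entry point.
    pe = pefile.PE(path)
    entry_rva = pe.OPTIONAL_HEADER.AddressOfEntryPoint
    section = pe.get_section_by_rva(entry_rva)
    if section is None:
        raise ValueError("entry point is not inside any section")

    # Step 2: work out where the code starts and ends, then disassemble it.
    start = pe.OPTIONAL_HEADER.ImageBase + section.VirtualAddress
    size = section.Misc_VirtualSize or section.SizeOfRawData
    code = section.get_data()[:size]
    mode = CS_MODE_64 if is_64bit(pe.OPTIONAL_HEADER.Magic) else CS_MODE_32
    md = Cs(CS_ARCH_X86, mode)
    for insn in md.disasm(code, start):
        yield f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}"
```

To use it, loop over `disassemble_file("program.exe")` and print each line, where `program.exe` is just a placeholder for the file you want to inspect.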
The end!!