GENESIS
DESCENDING THROUGH THE CIRCLES OF KNOWLEDGE
The Doctrine
Welcome to the Club
Before I start sharing resources, I want to say something. I've been into reverse engineering for a while now, and the one thing I learned the hard way is that resources alone don't teach you anything. You can watch 100 hours of tutorials, bookmark every blog post, save every cheatsheet, and still not know how to pop a shell when it's in front of you.
The thing that actually made me better was suffering. Sitting with a challenge I had no idea how to solve, analyzing malware whose behavior I had zero understanding of. Spending 5 hours on something that someone else solved in 30 minutes. Feeling stupid. Googling the same thing 15 different ways. Failing over and over until something finally clicks.
That's the process. It's not fun and nobody talks about it but that's literally how it should go. Not from courses. Not from resources. From pain.
So when I start sharing materials and challenges here, don't just save them. Actually sit down and struggle with them. If you're not confused and frustrated, you're not learning. I still remember that in the past (when AI was not as useful as it is now), I would suffer over almost everything.
To give an example, one of the important things in malware analysis with VMs is avoiding VM detection. Yes, you can patch the malware itself, but when it is highly obfuscated that takes a lot of time (there are still other ways). So I had to patch QEMU's source code to hide the QEMU keywords, and I manually went through the whole hypervisor phantom patch for a day. (It is a longer story btw, and I know I could have applied the patch directly.) Yeah, it took a lot of my time, but in the end I learned a lot.
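To make the "QEMU keywords" part concrete, here is a minimal sketch of the kind of string check that hiding those keywords is meant to defeat. The helper name, the keyword list and the "patched" values are my own illustrative assumptions, not taken from any real sample or from the patch itself.

```python
# Hypothetical helper: the function name and keyword list are illustrative assumptions.
VM_KEYWORDS = ("qemu", "kvm", "virtualbox", "vmware", "bochs")


def find_vm_markers(hardware_strings, keywords=VM_KEYWORDS):
    """Return (string, keyword) pairs where a hardware string contains a VM keyword."""
    hits = []
    for text in hardware_strings:
        lowered = text.lower()
        for keyword in keywords:
            if keyword in lowered:
                hits.append((text, keyword))
    return hits


# Strings of the kind a stock QEMU guest commonly exposes (disk model, DMI vendor).
stock_guest = ["QEMU HARDDISK", "QEMU", "Standard PC (Q35 + ICH9, 2009)"]
# The same fields after the identifying strings were changed (made-up values).
patched_guest = ["Samsung SSD 870 EVO", "ASUSTeK COMPUTER INC.", "PRIME B550-PLUS"]

print(find_vm_markers(stock_guest))    # two hits on "qemu"
print(find_vm_markers(patched_guest))  # no hits
```

Real samples use many more checks than plain strings, so treat this only as a picture of why those keywords matter.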
Don't wait for someone to push you to success. A lot of people just ask me "How can I learn XXX?", "What do you suggest me to do for future?" etc.
Thing is... I am not your guide or anything. I am not God or a motivator... You can choose your own path, and if you REALLY need my advice to keep improving, you are not actually into cybersec. Just do it and see the results. Failing doesn't mean the end.
The Ancient Tongue
Learning Assembly
Since I can't really guess how well everyone knows reversing, I will start from the basics and explain how I learned things. (I will try to write every day if possible.)
You can learn reversing for many purposes (solving challenges, debugging apps, exploiting apps, analyzing malware, etc.), but one of the fundamental skills is being able to READ (not write) assembly. Yeah, you can become a professional assembly coder or something, but you will gain literally nothing from it lol.
I'm a book/documentation guy myself, which makes it easier to find resources. Yeah, you might get basic knowledge about assembly through videos (but I think it is so boring to watch someone instead of doing it yourself, just like with games). Also, you will lack research ability if you rely on videos.
Resources
1. Intel Manual — this will be your God. Every time you have a question, just ask it and you will get your answer. Volume 2 has all instruction references.
2. Programming from the Ground Up (GAS syntax) — to learn how to write in assembly, it is not essential to read this entire book. You might find better videos or better resources but the main thing is, you have to know "Why?" and "How?"
3. NASM Documentation — if you want to write in NASM, you have to know its syntax etc. I personally used NASM to practice writing in assembly so it was important for me.
NOTE: TO MAKE IT CLEAR, YOU DON'T ACTUALLY HAVE TO GO DEEP INTO THESE. I WOULD SAY JUST STOP READING WHEN YOU REALIZE YOU UNDERSTAND HOW THINGS WORK AND WHAT ASSEMBLY ACTUALLY IS. PRACTICE PROGRAMMING IN ASSEMBLY BUT DON'T BE HARSH ON YOURSELF. The main thing is, when you reverse and don't know an instruction, you can just look it up in the Intel manual (or just use AI, but I think you will not gain the ability to research that way; the choice is yours).
The Dissection
Binary Analysis & The Toolbox
Last time, I talked about how I learned assembly, but being able to read instructions is actually the easiest part. Compared with compiled binaries, interpreted languages are much easier to analyze: you don't need deep technical knowledge, and a bit of logic is usually enough to understand them (the only time-consuming part is deobfuscating the script).
The thing with binaries is that they can be compiled from many different languages, each with its own unique functions and behaviors. Some binaries are obfuscated in a way that makes them lowkey impossible to reverse. You'll run into C++, Rust, Go, or Delphi, and all of them have extra "stuff" that requires specialized research. You can't learn everything at once — it's impossible. I just wanted to mention that before I forget.
Now, another crucial step is understanding how the system actually works: the CPU, RAM, etc. I honestly learned these randomly over time rather than sitting down to study all day. The best way to learn is to just start analyzing binaries, but to make it simpler, you can read Chapters 1 and 3 of the Practical Reverse Engineering book.
Linux Resources
- The Linux Programming Interface — Don't read it all; use it as a reference
- Syscall Table (Reference)
- ELF Format — You can actually read this one; it's not too long
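To show why the ELF format is worth reading, here is a minimal sketch that decodes the fixed-size ELF file header with nothing but the standard library. The function name and the synthetic sample bytes are my own assumptions for illustration. The field names follow the ELF specification.

```python
import struct

FIELD_NAMES = (
    "e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
    "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
    "e_shnum", "e_shstrndx",
)


def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF file header from the first bytes of a file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file (bad magic)")
    ei_class, ei_data = data[4], data[5]  # 1/2 = 32/64-bit, 1/2 = little/big endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unknown EI_CLASS or EI_DATA value")
    endian = "<" if ei_data == 1 else ">"
    # Address-sized fields are 4 bytes in ELF32 and 8 bytes in ELF64.
    addr = "I" if ei_class == 1 else "Q"
    fmt = f"{endian}HHI{addr}{addr}{addr}IHHHHHH"
    # The 16-byte e_ident array comes first, so the other fields start at offset 16.
    header = dict(zip(FIELD_NAMES, struct.unpack_from(fmt, data, 16)))
    header["bits"] = 32 if ei_class == 1 else 64
    header["little_endian"] = ei_data == 1
    return header


# Synthetic 64-bit little-endian header (made-up values): ET_EXEC, EM_X86_64.
sample = (
    b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    + struct.pack("<HHIQQQIHHHHHH", 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)
)
header = parse_elf_header(sample)
print(hex(header["e_entry"]), header["e_machine"], header["bits"])
```

On a real Linux box you could feed it the first 64 bytes of any binary instead of the sample, for example `parse_elf_header(open("/bin/ls", "rb").read(64))`, and compare the result with what `readelf -h` prints.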
Windows Resources
- Windows Internals Part 1 — Use for reference
- Windows Internals Part 2 — Use for reference
Debuggers & Decompilers
For debugging, I use pwndbg (and sometimes r2) on Linux, and x64dbg on Windows. You can use whatever you prefer; there are plenty of others like edb, IDA, etc.
For decompiling (changing from binary back to code), I use Ghidra, dnSpy, or jadx depending on the file type. IDA also supports decompiling if you prefer using that.
- GDB Documentation
- Pwndbg Cheatsheet
- Radare2 Book
- x64dbg Docs — not great but helps
- Ghidra
- IDA User Guide
The Convergence
Concrete & Symbolic Execution
I would say going directly for dynamic instrumentation and symbolic execution is a hard job. You need to actually know all the basics, or else you are just going to suffer. You also need to be really good at reading and understanding docs, existing example code, etc. First, let me explain a little bit about what concrete and symbolic execution are.
Concrete execution is basically executing a program the normal way, with specific values for its variables. Symbolic execution means you test the program using symbolic variables instead. It is used for exploring execution paths, bug detection, etc. You might wonder how the symbolic variables are calculated, for example how these tools find a "path". I want you to check out z3; it will be easier to understand that way than from me explaining it. z3 is the most famous tool to "check the satisfiability of logical formulas over one or more theories", while angr uses claripy, its own solver layer (which uses z3 as a backend). I don't actually love angr, because it is a "Pure Symbolic Execution" tool, which means it simulates code instead of actually running it. What does that mean? It means there can be issues with complex APIs...
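To make the difference less abstract, here is a toy sketch. It is not how angr or z3 work internally, and all names in it are my own assumptions. The "program" is a small tree of branches, concrete execution follows one path for one input, and the symbolic-style explorer forks at every branch and collects path constraints. A brute-force search over a small range stands in for a real solver like z3.

```python
# A toy "program": a leaf is a result string, a branch is
# (description, condition, node_if_true, node_if_false).
PROGRAM = (
    "x * 3 + 7 == 52", lambda x: x * 3 + 7 == 52,
    ("x % 2 == 1", lambda x: x % 2 == 1, "win", "wrong parity"),
    "fail",
)


def run_concrete(node, x):
    """Concrete execution: one specific input follows exactly one path."""
    while not isinstance(node, str):
        _, cond, if_true, if_false = node
        node = if_true if cond(x) else if_false
    return node


def explore_symbolic(node, constraints=()):
    """Fork at every branch and record path constraints instead of picking a side."""
    if isinstance(node, str):
        yield node, constraints
        return
    desc, cond, if_true, if_false = node
    yield from explore_symbolic(if_true, constraints + ((desc, cond, True),))
    yield from explore_symbolic(if_false, constraints + ((desc, cond, False),))


def solve(constraints, domain=range(256)):
    """Brute-force stand-in for an SMT solver: find any x meeting all constraints."""
    for x in domain:
        if all(cond(x) == wanted for _, cond, wanted in constraints):
            return x
    return None  # no solution inside the searched domain


print(run_concrete(PROGRAM, 10))  # one input, one path
solutions = {}
for result, constraints in explore_symbolic(PROGRAM):
    path = " and ".join(d if w else f"not ({d})" for d, _, w in constraints)
    solutions[result] = solve(constraints)
    print(f"{result!r}: {path} -> x = {solutions[result]}")
```

The explorer finds x = 15 for "win" without anyone guessing it, and it reports no solution for "wrong parity", because that path cannot be reached by any input in the searched range. A real solver does the same job without brute force and over much bigger input spaces.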
Instead of angr, I love to use Triton, which does "Dynamic Symbolic Execution": it actually runs the code and tracks it symbolically. It's amazing for me, but it comes down to which one you like. Both will work anyway; it depends on what you want to do.
NOTE: These tools have an amazing number of capabilities, which you need to discover yourself.
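The "run it for real and track it symbolically" idea can be sketched with a toy loop. This is not Triton's API, since Triton works on machine instructions. The `Tracer` class, the `target` function and the brute-force `solve` are all hypothetical stand-ins I made up to show the loop: run with a concrete input, record the branch decisions, flip one decision, solve for a new input, repeat.

```python
class Tracer:
    """Carries one concrete input and records every branch decision made on it."""

    def __init__(self, x):
        self.x = x
        self.path = []  # (description, condition, outcome) in execution order

    def branch(self, desc, cond):
        outcome = cond(self.x)  # evaluated concretely, recorded symbolically
        self.path.append((desc, cond, outcome))
        return outcome


def target(t):
    """Toy program under test (hypothetical)."""
    if t.branch("x > 100", lambda x: x > 100):
        if t.branch("x % 7 == 3", lambda x: x % 7 == 3):
            return "deep"
        return "shallow"
    return "reject"


def solve(constraints, domain=range(256)):
    """Brute-force stand-in for an SMT solver."""
    for x in domain:
        if all(cond(x) == wanted for _, cond, wanted in constraints):
            return x
    return None


def concolic(program, seed, max_paths=20):
    """Explore paths by running concretely and negating recorded branch decisions."""
    worklist, seen, found = [seed], set(), {}
    while worklist and len(seen) < max_paths:
        x = worklist.pop()
        tracer = Tracer(x)
        result = program(tracer)
        signature = tuple(outcome for _, _, outcome in tracer.path)
        if signature in seen:
            continue  # this path was already explored
        seen.add(signature)
        found[result] = x
        # Keep the prefix of the path and flip one decision at a time.
        for i, (desc, cond, outcome) in enumerate(tracer.path):
            new_x = solve(tracer.path[:i] + [(desc, cond, not outcome)])
            if new_x is not None:
                worklist.append(new_x)
    return found


print(concolic(target, seed=0))
```

Starting from the boring seed 0, the loop ends up with an input for each of the three results. Every input it reports was actually executed, which is the practical difference from only simulating the code.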
Now I want to talk about my favourite framework, a Dynamic Binary Instrumentation framework: DynamoRIO. It is literally the answer to everything. Observe, modify, and analyze the program while it runs, and write your own tools with the framework. Literally a lot can be accomplished. Some people also use Frida, but it is just a high-level library, mostly used for tracing mobile apps, tracing network traffic, etc. (I haven't used it once in my life with a real purpose. Yeah, I tested it, but just for testing; I have done nothing meaningful with it.) But yeah, it can be useful too, idk.
IMPORTANT: I will write a whole guide about DynamoRIO, Triton, z3 and maybe Frida. The main goal is to cover every capability of each in one post per tool. It will probably be long work, but that is the plan for the future.
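If "observe while running" sounds abstract, here is a very loose analogy in plain Python. It has nothing to do with DynamoRIO's actual API, which instruments native code. It only uses the interpreter's `sys.settrace` hook to watch a function while it really executes. The helper name `count_calls` is my own assumption.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func for real while counting every Python-level call it triggers."""
    counts = Counter()

    def tracer(frame, event, arg):
        if event == "call":
            counts[frame.f_code.co_name] += 1
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever hook was there before
    return result, counts


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, counts = count_calls(fib, 10)
print(result, counts["fib"])
```

The program under observation is not modified at all, yet the tool sees every call it makes. A DBI framework gives you that kind of view (and the ability to modify things) on compiled binaries instead of Python functions.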