Control Flow DeFlattening

August 13, 2025
Fuad Aliyev
Reverse Engineering
DeObfuscation
CFF

What is CFF

Control flow flattening is a classic control-flow transformation that removes structured control flow.

CFF Story

Control Flow Flattening (CFF) is a code obfuscation technique that emerged from the need to protect software from reverse engineering and analysis. Here's the story behind it:

Origins and Purpose

CFF was developed as part of the broader field of code obfuscation - techniques designed to make programs harder to understand while preserving their functionality. The primary goals include:

  • Anti-reverse engineering: Making it difficult for attackers to understand program logic
  • IP protection: Hiding proprietary algorithms and business logic
  • Anti-tampering: Preventing modification of critical code sections
  • Malware evasion: Unfortunately, also used by malware to evade detection

How CFF Works

Traditional programs have structured control flow with clear if-else statements, loops, and function calls that create a natural hierarchy. CFF breaks this structure by:

  1. Flattening the control flow graph into a single large switch statement
  2. Using state variables to track which "basic block" should execute next
  3. Dispatching execution through a central dispatcher loop
  4. Obscuring the original program structure so the logical flow becomes opaque
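As a rough illustration of these four steps, here is a hand-written Python sketch (not Tigress output; the function names and state numbers are made up for this example) showing a small counting loop next to a flattened equivalent:

```python
def count_digits(text):
    """Original, structured version: one loop, one branch."""
    digits = 0
    i = 0
    while i < len(text):
        if text[i].isdigit():
            digits += 1
        i += 1
    return digits


def count_digits_flat(text):
    """Flattened version: every basic block becomes a case in one dispatcher loop."""
    state = 0              # state variable: which block runs next
    digits = 0
    i = 0
    while True:            # central dispatcher loop
        if state == 0:     # loop condition
            state = 1 if i < len(text) else 4
        elif state == 1:   # if condition
            state = 2 if text[i].isdigit() else 3
        elif state == 2:   # then-branch
            digits += 1
            state = 3
        elif state == 3:   # loop increment, then back to the condition
            i += 1
            state = 0
        elif state == 4:   # exit block
            return digits


print(count_digits("Hello World 123"), count_digits_flat("Hello World 123"))
```

Both functions return the same result, but in the flattened one the loop and the branch are only visible by following the values written to state.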

Tigress

To test this obfuscation technique I will use Tigress, which you can get from tigress.wtf. To get a reasonably long program that uses constructs such as if/for/while, I asked an AI to write some C code for me so I wouldn't waste time on it.

...

// Function with while loop and complex logic
int string_analyzer(const char *str) {
    int length = strlen(str);
    int vowels = 0;
    int consonants = 0;
    int digits = 0;
    int i = 0;

    while (i < length) {
        char c = str[i];
        if (c >= '0' && c <= '9') {
            digits++;
        } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
            if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u' ||
                c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U') {
                vowels++;
            } else {
                consonants++;
            }
        }
        i++;
    }

    printf("Analysis of '%s':\n", str);
    printf("Length: %d\n", length);
    printf("Vowels: %d\n", vowels);
    printf("Consonants: %d\n", consonants);
    printf("Digits: %d\n", digits);

    return vowels + consonants + digits;
}

// Main function with multiple function calls and control structures
int main() {
    printf("=== Control Flow Flattening Demo ===\n\n");

    // Test fibonacci function
    printf("Fibonacci sequence (first 10 numbers):\n");
    for (int i = 0; i < 10; i++) {
        printf("%d ", fibonacci(i));
    }
    printf("\n\n");

    // Test grade calculator
    printf("Grade calculations:\n");
    int scores[] = {95, 87, 76, 65, 45};
    for (int i = 0; i < 5; i++) {
        printf("Score %d gets grade: %c\n", scores[i], grade_calculator(scores[i]));
    }
    printf("\n");

    // Test array processor
    printf("Array processing:\n");
    int test_array[] = {10, 25, 3, 47, 8, 91, 2, 36};
    int array_size = sizeof(test_array) / sizeof(test_array[0]);
    array_processor(test_array, array_size);
    printf("\n");

    // Test string analyzer
    printf("String analysis:\n");
    string_analyzer("Hello World 123");
    printf("\n");

    // Test menu handler
    printf("Menu simulation:\n");
    menu_handler(1);
    menu_handler(3);
    menu_handler(99);

    printf("\nProgram completed successfully!\n");
    return 0;
}

It gave me a much longer example, but I cut it here because I will only show how to deflatten the main function. I also checked the other functions, and the same approach works for them; it is just a matter of finding the key points.

DeFlattening

The key point about deobfuscating binaries is that the details can, and most likely will, change between builds and between obfuscation tools. So we can't write a single script that deflattens every possible binary.

After flattening my code, I open Ghidra to find the key points and write a script around them.

NOTE: I will provide links to the binaries, scripts, and source code, so you can work on it yourself.

First of all, we can easily see which variable it uses to jump between cases:

[Image: Ghidra listing of the dispatcher]

See the pattern? It uses local_18 to jump between cases through a jump table. The jump table is at 0x402308 and contains addresses like this:

[Image: jump table contents at 0x402308]

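You can also see that it shifts the value of local_18 in RAX left by 0x3, which means RAX * 2^3, i.e. the case number times the 8-byte size of one table entry, and then jumps to the address loaded into RAX. We already have 2 key points:

  1. Execution starts from case 0xb.
  2. The switch table is at 0x402308.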
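The index arithmetic and the table layout can be sketched in plain Python, outside Ghidra. This is only an illustration: the helper names are made up, and the sample bytes are built by hand from two case addresses of this binary (case 0 and case 1) followed by a value that is clearly not a code pointer.

```python
import struct

SWITCH_TABLE_ADDR = 0x402308   # jump table base found in Ghidra
ENTRY_SIZE = 8                 # one 64-bit pointer per case


def entry_address(state):
    """Address of the table slot the dispatcher reads for a given state value."""
    return SWITCH_TABLE_ADDR + (state << 3)   # state * 2^3


def parse_jump_table(raw, code_start, code_end):
    """Decode little-endian 8-byte pointers until one falls outside the code range."""
    targets = []
    for offset in range(0, len(raw) - ENTRY_SIZE + 1, ENTRY_SIZE):
        (target,) = struct.unpack_from("<Q", raw, offset)
        if not code_start <= target <= code_end:
            break
        targets.append(target)
    return targets


# Hand-built sample: case 0, case 1, then a non-pointer value that ends the table.
sample = struct.pack("<QQQ", 0x401901, 0x401838, 0x6C6C6548)
print(hex(entry_address(0xB)))
print([hex(t) for t in parse_jump_table(sample, 0x401000, 0x402000)])
```

Stopping at the first value outside the code range is a heuristic: it works here because the data after the table does not look like an address in .text, which may not hold for other binaries.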

Now let's check out case_0xb and see what it has.

[Image: Ghidra listing of case 0xb]

In every case, the last 2 instructions first set the next case to be processed and then jump back to the dispatcher, which looks up that case's address and jumps to it. So we can also note these 2 patterns:

  1. DISPATCHER_JMP = "JMP 0x00401958"

  2. STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"

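NOTE: I wrote RBP + -0x10 rather than -0x18 because that is how the operand appears in the instruction text the script sees when it runs under analyzeHeadless. Ghidra's local_18 name counts the offset from the stack pointer at function entry, and after PUSH RBP the frame pointer sits 8 bytes below that, so both refer to the same slot.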
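To make the two patterns concrete, here is a small standalone Python sketch that pulls the next case number out of instruction text. It is an illustration only: the regular expression and helper names are my own, and it assumes the instruction strings look exactly like the ones Ghidra prints for this binary.

```python
import re

# Matches e.g. "MOV qword ptr [RBP + -0x10],0x9" and captures the immediate.
STATE_WRITE = re.compile(r"^MOV qword ptr \[RBP \+ -0x10\],(0x[0-9a-fA-F]+|\d+)$")


def next_state(instruction_text):
    """Return the case number an instruction stores into the state variable, else None."""
    match = STATE_WRITE.match(instruction_text.strip())
    if match is None:
        return None
    return int(match.group(1), 0)


def is_dispatcher_jump(instruction_text, dispatcher="JMP 0x00401958"):
    """True for the jump that hands control back to the dispatcher."""
    return instruction_text.strip() == dispatcher


# Tail of one case: ordinary instruction, state write, jump back to the dispatcher.
block = [
    "ADD dword ptr [RBP + -0x4],0x1",
    "MOV qword ptr [RBP + -0x10],0x0",
    "JMP 0x00401958",
]
print([next_state(line) for line in block])
```

Matching on the operand size (qword) and the exact stack offset keeps ordinary stores to other locals, such as the loop counter at RBP + -0x4, from being mistaken for state writes.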

There are also if/while/for constructs, which look like this:

[Image: Ghidra listing of a case with a conditional branch]

It compares a value and picks the next case accordingly. We then also have to check where those cases lead: in this example, if case 9 jumps back to case_0, then it is a while/for loop, and in Tigress flattening the other target is most of the time just the end of the loop. This is an additional thing to remember.
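One way to express this check, as a sketch rather than what the script does: build a map from each case to the cases it can set next, then ask whether any target of a conditional case leads back to it. The edges below are the first few taken from the script output shown later in this post; the function names are hypothetical.

```python
# Successors of the first cases of main(); the graph is deliberately truncated.
SUCCESSORS = {
    11: [0],
    0: [9, 1],
    9: [0],
    1: [3],
}


def reaches(start, goal, successors):
    """True if goal can be reached from start by following successor edges."""
    seen = set()
    stack = [start]
    while stack:
        case = stack.pop()
        if case == goal:
            return True
        if case in seen:
            continue
        seen.add(case)
        stack.extend(successors.get(case, []))
    return False


def classify(case, successors):
    """Label a case as 'loop', 'branch' or 'linear' from its outgoing edges."""
    targets = successors.get(case, [])
    if len(targets) < 2:
        return "linear"
    # A conditional case with a target that leads back to it is a loop header.
    if any(reaches(t, case, successors) for t in targets):
        return "loop"
    return "branch"


def split_loop(case, successors):
    """Split a loop header's targets into (loop body, loop exit)."""
    targets = successors.get(case, [])
    body = [t for t in targets if reaches(t, case, successors)]
    exits = [t for t in targets if t not in body]
    return body, exits


print(classify(0, SUCCESSORS), split_loop(0, SUCCESSORS))
```

For case 0 this reports a loop whose body is case 9 and whose exit is case 1. The result is only as good as the edge map: a loop header whose body has not been collected yet would be labelled a plain branch.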

The last thing is finding which case returns. Hardcoding it is not really necessary: we could detect it in several ways, for example by checking whether the case jumps to leave/ret, returns directly, or lacks the usual MOV/JMP pair at the end. In my case, though, just hardcoding the case was easier.

This is the Ghidra script I wrote:
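As a sketch of one of those alternatives, a case that never writes the state variable can be treated as an exit. The listings for cases 0 and 9 below are taken from this binary, but the contents of case 5 are an assumption made up for the example, and find_exit_cases is a hypothetical helper, not part of the script.

```python
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"


def find_exit_cases(blocks):
    """Return case numbers whose instructions never assign the state variable."""
    return sorted(
        case
        for case, instructions in blocks.items()
        if not any(STATE_VAR_PATTERN in text for text in instructions)
    )


blocks = {
    0: [
        "CMP dword ptr [RBP + -0x4],0x9",
        "JG 0x00401911",
        "MOV qword ptr [RBP + -0x10],0x9",
        "MOV qword ptr [RBP + -0x10],0x1",
    ],
    9: [
        "ADD dword ptr [RBP + -0x4],0x1",
        "MOV qword ptr [RBP + -0x10],0x0",
    ],
    # Assumed contents: set the return value, then jump to the function epilogue.
    5: [
        "MOV EAX,0x0",
        "JMP 0x0040195d",
    ],
}
print(find_exit_cases(blocks))
```

This only looks at the text of each case, so a case that sets the state through a register or in some other form would be misreported as an exit.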

# Ghidra Control Flow Deflattening Script with Symbol Resolution
import datetime

SWITCH_TABLE_ADDR = 0x402308
CODE_SECTION_START = 0x401000
CODE_SECTION_END = 0x402000
FUNCTION_END_ADDR = 0x40195d
DISPATCHER_JMP = "JMP 0x00401958"
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"
EXIT_CASE = 5
START_CASE = 0xb

output_lines = []


def write_output(text):
    print(text)
    output_lines.append(text)


def resolve_address(addr_value):
    """Resolve address to function name, string, or symbol"""
    try:
        addr = toAddr(addr_value)
        func = getFunctionAt(addr)
        if func is not None:
            return func.getName()
        symbol = getSymbolAt(addr)
        if symbol is not None:
            return symbol.getName()
        data = getDataAt(addr)
        if data is not None and data.hasStringValue():
            string_val = str(data.getValue())
            return '"{}"'.format(string_val[:27] + "..." if len(string_val) > 30 else string_val)
        string_data = getString(addr)
        if string_data is not None:
            string_str = str(string_data)
            return '"{}"'.format(string_str[:27] + "..." if len(string_str) > 30 else string_str)
        return None
    except:
        return None


def enhance_instruction(inst):
    """Enhance instruction with resolved symbols"""
    inst_str = str(inst)
    for i in range(inst.getNumOperands()):
        try:
            operand = inst.getOpObjects(i)[0]
            if hasattr(operand, 'getOffset'):
                addr_val = operand.getOffset()
                resolved = resolve_address(addr_val)
                if resolved is not None:
                    hex_addr = "0x{:x}".format(addr_val)
                    if hex_addr in inst_str:
                        inst_str = inst_str.replace(hex_addr, "{} ; {}".format(hex_addr, resolved))
        except:
            continue
    return inst_str


# Header
write_output("; Control Flow Deobfuscation Analysis")
write_output("; Date: {} | Binary: a.out".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
write_output(";" + "=" * 70)

# Extract case addresses
case_addresses = []
current_addr = toAddr(SWITCH_TABLE_ADDR)
while True:
    addr_bytes = getBytes(current_addr, 8)
    addr_value = sum((addr_bytes[i] & 0xFF) << (i * 8) for i in range(8))
    case_addr = toAddr(addr_value)
    if case_addr.getOffset() < CODE_SECTION_START or case_addr.getOffset() > CODE_SECTION_END:
        break
    case_addresses.append(case_addr)
    current_addr = current_addr.add(8)

sorted_addresses = sorted(set([addr.getOffset() for addr in case_addresses]))


def process_case(case_num, indent_level=0, from_case=None):
    case_addr = case_addresses[case_num]
    case_start = case_addr.getOffset()
    indent = " " * indent_level
    if from_case is not None:
        write_output("{} v".format(indent))
    write_output("{} +==== CASE {} (0x{:x}) ====+".format(indent, case_num, case_start))

    case_end = FUNCTION_END_ADDR
    for addr_val in sorted_addresses:
        if addr_val > case_start:
            case_end = addr_val
            break

    current_addr = case_addr
    next_cases = []
    has_conditional = False
    while current_addr.getOffset() < case_end:
        inst = getInstructionAt(current_addr)
        if inst is None:
            break
        if DISPATCHER_JMP in str(inst):
            current_addr = inst.getMaxAddress().add(1)
            continue
        inst_str = enhance_instruction(inst)
        addr_hex = "0x{:x}".format(current_addr.getOffset())
        flow_indicator = ""
        if inst.getMnemonicString() in ["JG", "JL", "JE", "JNE", "JA", "JB", "JGE", "JLE"]:
            has_conditional = True
            flow_indicator = " [BRANCH]"
        elif inst.getMnemonicString() == "CALL":
            try:
                target = inst.getOpObjects(0)[0]
                if hasattr(target, 'getOffset'):
                    func_name = resolve_address(target.getOffset())
                    flow_indicator = " --> {}".format(func_name if func_name else "CALL")
            except:
                flow_indicator = " --> CALL"
        elif inst.getMnemonicString() == "JMP" and "0x0040195d" in str(inst):
            flow_indicator = " --> EXIT"
        if STATE_VAR_PATTERN in str(inst):
            parts = str(inst).split(',')
            if len(parts) > 1:
                next_case = int(parts[1].strip(), 0)
                next_cases.append(next_case)
                flow_indicator = " =====> CASE {}".format(next_case)
        write_output("{} | {:>10} | {:<60}{}".format(indent, addr_hex, inst_str, flow_indicator))
        current_addr = inst.getMaxAddress().add(1)

    write_output("{} +{}+".format(indent, "=" * 70))
    if len(next_cases) > 1 and has_conditional:
        write_output("{} +======> FALSE =====> CASE {}".format(indent, next_cases[0]))
        write_output("{} +======> TRUE =====> CASE {}".format(indent, next_cases[1]))
    elif next_cases:
        write_output("{} v".format(indent))
    return next_cases


visited_cases = set()
case_queue = [(START_CASE, 0, None)]
write_output("\n ENTRY POINT\n v")

while case_queue:
    current_case, indent_level, from_case = case_queue.pop(0)
    if current_case == EXIT_CASE:
        indent_str = " " * indent_level
        write_output("{} +==== EXIT CASE {} ====+".format(indent_str, EXIT_CASE))
        case_addr = case_addresses[EXIT_CASE]
        current_addr = case_addr
        for i in range(2):
            inst = getInstructionAt(current_addr)
            if inst is None:
                break
            addr_hex = "0x{:x}".format(current_addr.getOffset())
            enhanced_inst = enhance_instruction(inst)
            flow_indicator = " --> EXIT" if inst.getMnemonicString() == "JMP" else ""
            write_output("{} | {:>10} | {:<60}{}".format(indent_str, addr_hex, enhanced_inst, flow_indicator))
            current_addr = inst.getMaxAddress().add(1)
        write_output("{} +{}+".format(indent_str, "=" * 70))
        continue

    visited_cases.add(current_case)
    next_cases = process_case(current_case, indent_level, from_case)
    for next_case in next_cases:
        if next_case is not None:
            if next_case in visited_cases:
                # This is a true loop - show it inline
                indent_str = " " * indent_level
                write_output("{} v".format(indent_str))
                write_output("{} CASE {} (LOOP BACK)".format(indent_str, next_case))
            else:
                new_indent = indent_level + (1 if len(next_cases) > 1 else 0)
                case_queue.append((next_case, new_indent, current_case))

# Save file
output_file = "control_flow_analysis.txt"
try:
    with open(output_file, 'w') as f:
        f.write('\n'.join(output_lines))
    print("File saved: {}".format(output_file))
except:
    print("Failed to save file")

This script does not overwrite any bytes; it just saves the deflattened code to a file for you to inspect. With a few improvements you could also make it patch the binary, but then you would need to trace addresses, fix up jump targets, and add a lot more checks, so I kept the code simple.

NOTE: I feel that reading the script is easier than having me explain everything step by step.

Its output looks like this:

; Control Flow Deobfuscation Analysis
; Date: 2025-08-13 14:17:50 | Binary: a.out
;======================================================================

 ENTRY POINT
 v
 +==== CASE 11 (0x4018a3) ====+
 | 0x4018a3 | MOV EDI,0x402298
 | 0x4018a8 | CALL 0x00401040                   --> puts
 | 0x4018ad | MOV EDI,0x4022c0
 | 0x4018b2 | CALL 0x00401040                   --> puts
 | 0x4018b7 | MOV dword ptr [RBP + -0x4],0x0
 | 0x4018be | MOV qword ptr [RBP + -0x10],0x0   =====> CASE 0
 +======================================================================+
 v
 v
 +==== CASE 0 (0x401901) ====+
 | 0x401901 | CMP dword ptr [RBP + -0x4],0x9
 | 0x401905 | JG 0x00401911                     [BRANCH]
 | 0x401907 | MOV qword ptr [RBP + -0x10],0x9   =====> CASE 9
 | 0x401911 | MOV qword ptr [RBP + -0x10],0x1   =====> CASE 1
 +======================================================================+
 +======> FALSE =====> CASE 9
 +======> TRUE =====> CASE 1
 v
 +==== CASE 9 (0x4018cb) ====+
 | 0x4018cb | MOV EAX,dword ptr [RBP + -0x4]
 | 0x4018ce | MOV EDI,EAX
 | 0x4018d0 | CALL 0x0040130f                   --> fibonacci
 | 0x4018d5 | MOV dword ptr [RBP + -0x18],EAX
 | 0x4018d8 | MOV EAX,dword ptr [RBP + -0x18]
 | 0x4018db | MOV ESI,EAX
 | 0x4018dd | MOV EDI,0x4022e7
 | 0x4018e2 | MOV EAX,0x0
 | 0x4018e7 | CALL 0x00401060                   --> printf
 | 0x4018ec | ADD dword ptr [RBP + -0x4],0x1
 | 0x4018f0 | MOV qword ptr [RBP + -0x10],0x0   =====> CASE 0
 +======================================================================+
 v
 v
 CASE 0 (LOOP BACK)
 v
 +==== CASE 1 (0x401838) ====+
 | 0x401838 | MOV EDI,0x402281
 | 0x40183d | CALL 0x00401040                   --> puts
 | 0x401842 | MOV EDI,0x402283
 | 0x401847 | CALL 0x00401040                   --> puts
 | 0x40184c | MOV dword ptr [RBP + -0x40],0x5f
 | 0x401853 | MOV dword ptr [RBP + -0x3c],0x57
 | 0x40185a | MOV dword ptr [RBP + -0x38],0x4c
 | 0x401861 | MOV dword ptr [RBP + -0x34],0x41
 | 0x401868 | MOV dword ptr [RBP + -0x30],0x2d
 | 0x40186f | MOV dword ptr [RBP + -0x8],0x0
 | 0x401876 | MOV qword ptr [RBP + -0x10],0x3   =====> CASE 3
 +======================================================================+
 v
 v
 +==== CASE 3 (0x401883) ====+
 | 0x401883 | CMP dword ptr [RBP + -0x8],0x4
 | 0x401887 | JG 0x00401896                     [BRANCH]
 | 0x401889 | MOV qword ptr [RBP + -0x10],0x2   =====> CASE 2
 | 0x401896 | MOV qword ptr [RBP + -0x10],0x8   =====> CASE 8
 +======================================================================+
 +======> FALSE =====> CASE 2
 +======> TRUE =====> CASE 8
 v
...
...
...

To execute the script:

/nix/store/vbvp4d290sc9zjj45g08a1fqlxj0mmi9-ghidra-11.3.2/lib/ghidra/support/analyzeHeadless <projects location> <project name> -process a.out -postScript deflatten_ghidra.py

(I use NixOS, so this is where analyzeHeadless lives for me; the path differs from OS to OS.)

Files

original.c

flattened.c

deflatten_ghidra.py

a.out