What is CFF

Control flow flattening is a classic control-flow transformation that removes structured control flow.

CFF Story

Control Flow Flattening (CFF) is a code obfuscation technique that emerged from the need to protect software from reverse engineering and analysis. Here's the story behind it:

Origins and Purpose

CFF was developed as part of the broader field of code obfuscation - techniques designed to make programs harder to understand while preserving their functionality. The primary goals include:

Anti-reverse engineering: Making it difficult for attackers to understand program logic
IP protection: Hiding proprietary algorithms and business logic
Anti-tampering: Preventing modification of critical code sections
Malware evasion: Unfortunately, also used by malware to evade detection

How CFF Works

Traditional programs have structured control flow with clear if-else statements, loops, and function calls that create a natural hierarchy. CFF breaks this structure by:

Flattening the control flow graph into a single large switch statement
Using state variables to track which "basic block" should execute next
Dispatching execution through a central dispatcher loop
Obscuring the original program structure so the logical flow becomes opaque
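
The steps above can be illustrated with a tiny example. This is only a hand-written sketch of the idea, not what Tigress emits: the function names and the state numbering are made up, and Python is used just to keep it short. Both functions compute the same sum, but in the second one the loop is hidden behind a state variable and a dispatcher.

```python
def sum_to_n(n):
    """Structured version: an ordinary loop."""
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total


def sum_to_n_flattened(n):
    """Flattened version: every basic block is a case behind one dispatcher."""
    total = 0
    i = 0
    state = 0  # state variable: which block runs next (numbering is arbitrary)
    while True:  # central dispatcher loop
        if state == 0:    # block 0: the loop condition
            state = 1 if i < n else 2
        elif state == 1:  # block 1: the loop body
            total += i
            i += 1
            state = 0     # the back edge is now just an assignment
        elif state == 2:  # block 2: the exit block
            return total


print(sum_to_n(5), sum_to_n_flattened(5))  # prints "10 10"
```

Reading the flattened version, the while loop is no longer visible as a loop; you have to follow the values written to state to recover it, which is exactly what deflattening does.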

Tigress

To test this obfuscation technique I will use Tigress, which you can get from tigress.wtf. To get a reasonably long program that uses if/for/while, I asked an AI to write some C code for me so I would not waste time on it.

...
// Function with while loop and complex logic
int string_analyzer(const char *str) {
    int length = strlen(str);
    int vowels = 0;
    int consonants = 0;
    int digits = 0;
    int i = 0;

    while (i < length) {
        char c = str[i];

        if (c >= '0' && c <= '9') {
            digits++;
        } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
            if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u' ||
                c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U') {
                vowels++;
            } else {
                consonants++;
            }
        }
        i++;
    }

    printf("Analysis of '%s':\n", str);
    printf("Length: %d\n", length);
    printf("Vowels: %d\n", vowels);
    printf("Consonants: %d\n", consonants);
    printf("Digits: %d\n", digits);

    return vowels + consonants + digits;
}

// Main function with multiple function calls and control structures
int main() {
    printf("=== Control Flow Flattening Demo ===\n\n");

    // Test fibonacci function
    printf("Fibonacci sequence (first 10 numbers):\n");
    for (int i = 0; i < 10; i++) {
        printf("%d ", fibonacci(i));
    }
    printf("\n\n");

    // Test grade calculator
    printf("Grade calculations:\n");
    int scores[] = {95, 87, 76, 65, 45};
    for (int i = 0; i < 5; i++) {
        printf("Score %d gets grade: %c\n", scores[i], grade_calculator(scores[i]));
    }
    printf("\n");

    // Test array processor
    printf("Array processing:\n");
    int test_array[] = {10, 25, 3, 47, 8, 91, 2, 36};
    int array_size = sizeof(test_array) / sizeof(test_array[0]);
    array_processor(test_array, array_size);
    printf("\n");

    // Test string analyzer
    printf("String analysis:\n");
    string_analyzer("Hello World 123");
    printf("\n");

    // Test menu handler
    printf("Menu simulation:\n");
    menu_handler(1);
    menu_handler(3);
    menu_handler(99);

    printf("\nProgram completed successfully!\n");
    return 0;
}


It gave me a much longer example, but I cut it here because I will only show how to deflatten the main function. I also checked the other functions and the same approach works for them too. It is just a matter of finding the key points.

DeFlattening

The key point about deobfuscating binaries is that the details can, and most likely will, change between compilations and obfuscation tools. So we can't just write one script that deflattens every possible binary.

After flattening my code, I opened the binary in Ghidra to find the key points and wrote a script accordingly.

NOTE: I will provide links to the binaries, scripts, and code so you can work on it yourself.

First of all we can easily see which variable it uses to jump between cases:

See the pattern? It uses local_18 to jump between cases through a jump table. As you can see, the jump table is at 0x402308 and it contains addresses like this:

You can also see that it shifts the value of local_18 in RAX left by 0x3, which means RAX * 2^3, i.e. the byte offset of an 8-byte entry in the table, and then jumps to the case address it gets through RAX. We already have 2 key points:

Execution starts from case 0xb.
The switch table is at 0x402308.
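
The table lookup can be sketched outside Ghidra like this. It is a minimal sketch, assuming 8-byte little-endian entries (which is what the shift by 3 suggests on x86-64). The helper names entry_address and read_entries are made up for illustration, and the two sample entries are the case 0 and case 1 addresses that the script prints for this binary.

```python
import struct

TABLE_ADDR = 0x402308  # switch table address found in Ghidra
ENTRY_SIZE = 1 << 3    # a shift left by 3 multiplies the case index by 8


def entry_address(case_index):
    """Address of the table slot that holds the handler for case_index."""
    return TABLE_ADDR + (case_index << 3)


def read_entries(raw, count):
    """Decode `count` little-endian 64-bit handler addresses from raw bytes."""
    return list(struct.unpack("<{}Q".format(count), raw[:count * ENTRY_SIZE]))


# Stand-in for the first two table slots (cases 0 and 1 of this binary).
raw_table = struct.pack("<2Q", 0x401901, 0x401838)

print(hex(entry_address(0xb)))  # slot used by the start case: 0x402360
print([hex(a) for a in read_entries(raw_table, 2)])
```

In the real script the bytes come from Ghidra's getBytes instead of a hand-built buffer, but the arithmetic is the same.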

Now let's check out case_0xb and see what it has.

In every case, the last 2 instructions first set the next case to be processed and then jump back to the dispatcher, which looks up that case's address and jumps to it. So we can also note down these 2:

DISPATCHER_JMP = "JMP 0x00401958"
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"

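NOTE: I wrote RBP + -0x10 instead of -0x18 because that is the raw operand the script sees when it runs under analyzeHeadless. Ghidra's local_18 name is an offset from the stack pointer at function entry, which corresponds to RBP - 0x10 once RBP has been pushed.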
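
Matching that pattern boils down to simple string handling. Here is a minimal sketch, assuming the instruction text looks like the lines in Ghidra's listing; next_state is a made-up helper name, not something from the Ghidra API.

```python
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"


def next_state(inst_text):
    """Return the case number stored into the state variable, or None."""
    if not inst_text.startswith(STATE_VAR_PATTERN):
        return None
    # Everything after the first comma is the source operand.
    _, _, source = inst_text.partition(",")
    try:
        return int(source.strip(), 0)  # base 0 accepts the 0x prefix
    except ValueError:
        return None  # source is a register, not an immediate


print(next_state("MOV qword ptr [RBP + -0x10],0x9"))  # 9
print(next_state("MOV dword ptr [RBP + -0x4],0x0"))   # None
```

The try/except is there because a state write is not guaranteed to use an immediate; other binaries may compute the next case in a register first, and then this simple approach is not enough.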

If statements and while/for loops are written like this:

It compares and sets the next case accordingly. We also have to check whether a branch target leads back to the comparing case: in this example, case 9 jumps back to case_0, so it is a while/for loop, and in Tigress flattening the other target is most of the time just the end of the loop. This is an additional thing we have to remember.
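
The loop check can be expressed as a small graph search. This is a sketch of the idea rather than what my script does: reaches and classify_branch are made-up names, and the transition table only contains the transitions of this binary that are fully visible in the script output (11 -> 0, 0 -> 9 or 1, 9 -> 0, 1 -> 3).

```python
def reaches(graph, start, goal):
    """True if `goal` can be reached from `start` by following transitions."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


def classify_branch(graph, case):
    """Classify a two-way case as ("loop", body, exits) or ("if", targets)."""
    successors = graph.get(case, [])
    if len(successors) != 2:
        return None  # not a conditional case
    body = [s for s in successors if reaches(graph, s, case)]
    if not body:
        return ("if", successors)
    exits = [s for s in successors if s not in body]
    return ("loop", body, exits)


# Case 3 is left out because its successors are cut off in the output.
transitions = {11: [0], 0: [9, 1], 9: [0], 1: [3]}

print(classify_branch(transitions, 0))   # ('loop', [9], [1])
print(classify_branch(transitions, 11))  # None
```

With a complete transition table the same check would also separate nested loops from plain if/else branches, but on a partial table a loop whose body is missing would be reported as an if.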

The last thing is finding which case returns. Hardcoding it is not strictly necessary: we could detect it in several ways, for example by checking whether the case jumps to leave/ret, returns directly, or lacks the usual MOV/JMP pair at the end. In my case, just hardcoding the case was easier.
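
If you would rather detect the returning case, one of the options above (a case without the usual state write) could look roughly like this. It is a sketch under assumptions: find_exit_cases is a made-up helper, the instructions for cases 11 and 0 are taken from this binary's listing, and the two instructions shown for case 5 are invented purely for illustration.

```python
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"


def find_exit_cases(cases):
    """Return the case numbers whose instructions never write the state variable."""
    return sorted(
        num for num, insts in cases.items()
        if not any(text.startswith(STATE_VAR_PATTERN) for text in insts)
    )


cases = {
    # Real instructions from case 11 (trimmed to the last two).
    11: ["MOV dword ptr [RBP + -0x4],0x0",
         "MOV qword ptr [RBP + -0x10],0x0"],
    # Real instructions from case 0.
    0: ["CMP dword ptr [RBP + -0x4],0x9",
        "JG 0x00401911",
        "MOV qword ptr [RBP + -0x10],0x9",
        "MOV qword ptr [RBP + -0x10],0x1"],
    # Hypothetical exit block: these two lines are made up for the example.
    5: ["MOV EAX,0x0",
        "JMP 0x0040195d"],
}

print(find_exit_cases(cases))  # [5]
```

This heuristic would also flag any case that sets the state through a register, so it needs a sanity check against the disassembly before you trust it.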

This is the Ghidra script I wrote:

# Ghidra Control Flow Deflattening Script with Symbol Resolution
SWITCH_TABLE_ADDR = 0x402308
CODE_SECTION_START = 0x401000
CODE_SECTION_END = 0x402000
FUNCTION_END_ADDR = 0x40195d
DISPATCHER_JMP = "JMP 0x00401958"
STATE_VAR_PATTERN = "MOV qword ptr [RBP + -0x10]"
EXIT_CASE = 5
START_CASE = 0xb

import datetime

output_lines = []

def write_output(text):
    print(text)
    output_lines.append(text)

def resolve_address(addr_value):
    """Resolve address to function name, string, or symbol"""
    try:
        addr = toAddr(addr_value)

        func = getFunctionAt(addr)
        if func is not None:
            return func.getName()

        symbol = getSymbolAt(addr)
        if symbol is not None:
            return symbol.getName()

        data = getDataAt(addr)
        if data is not None and data.hasStringValue():
            string_val = str(data.getValue())
            return '"{}"'.format(string_val[:27] + "..." if len(string_val) > 30 else string_val)

        string_data = getString(addr)
        if string_data is not None:
            string_str = str(string_data)
            return '"{}"'.format(string_str[:27] + "..." if len(string_str) > 30 else string_str)

        return None
    except:
        return None

def enhance_instruction(inst):
    """Enhance instruction with resolved symbols"""
    inst_str = str(inst)

    for i in range(inst.getNumOperands()):
        try:
            operand = inst.getOpObjects(i)[0]
            if hasattr(operand, 'getOffset'):
                addr_val = operand.getOffset()
                resolved = resolve_address(addr_val)
                if resolved is not None:
                    hex_addr = "0x{:x}".format(addr_val)
                    if hex_addr in inst_str:
                        inst_str = inst_str.replace(hex_addr, "{} ; {}".format(hex_addr, resolved))
        except:
            continue

    return inst_str

# Header
write_output("; Control Flow Deobfuscation Analysis")
write_output("; Date: {} | Binary: a.out".format(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")))
write_output(";" + "=" * 70)

# Extract case addresses
case_addresses = []
current_addr = toAddr(SWITCH_TABLE_ADDR)

while True:
    addr_bytes = getBytes(current_addr, 8)
    addr_value = sum((addr_bytes[i] & 0xFF) << (i * 8) for i in range(8))

    case_addr = toAddr(addr_value)
    if case_addr.getOffset() < CODE_SECTION_START or case_addr.getOffset() > CODE_SECTION_END:
        break

    case_addresses.append(case_addr)
    current_addr = current_addr.add(8)

sorted_addresses = sorted(set([addr.getOffset() for addr in case_addresses]))

def process_case(case_num, indent_level=0, from_case=None):
    case_addr = case_addresses[case_num]
    case_start = case_addr.getOffset()
    indent = "  " * indent_level

    if from_case is not None:
        write_output("{}        v".format(indent))

    write_output("{}    +==== CASE {} (0x{:x}) ====+".format(indent, case_num, case_start))

    case_end = FUNCTION_END_ADDR
    for addr_val in sorted_addresses:
        if addr_val > case_start:
            case_end = addr_val
            break

    current_addr = case_addr
    next_cases = []
    has_conditional = False

    while current_addr.getOffset() < case_end:
        inst = getInstructionAt(current_addr)
        if inst is None:
            break

        if DISPATCHER_JMP in str(inst):
            current_addr = inst.getMaxAddress().add(1)
            continue

        inst_str = enhance_instruction(inst)
        addr_hex = "0x{:x}".format(current_addr.getOffset())
        flow_indicator = ""

        if inst.getMnemonicString() in ["JG", "JL", "JE", "JNE", "JA", "JB", "JGE", "JLE"]:
            has_conditional = True
            flow_indicator = " [BRANCH]"
        elif inst.getMnemonicString() == "CALL":
            try:
                target = inst.getOpObjects(0)[0]
                if hasattr(target, 'getOffset'):
                    func_name = resolve_address(target.getOffset())
                    flow_indicator = " --> {}".format(func_name if func_name else "CALL")
            except:
                flow_indicator = " --> CALL"
        elif inst.getMnemonicString() == "JMP" and "0x{:08x}".format(FUNCTION_END_ADDR) in str(inst):
            flow_indicator = " --> EXIT"

        if STATE_VAR_PATTERN in str(inst):
            parts = str(inst).split(',')
            if len(parts) > 1:
                next_case = int(parts[1].strip(), 0)
                next_cases.append(next_case)
                flow_indicator = " =====> CASE {}".format(next_case)

        write_output("{}    | {:>10} | {:<60}{}".format(indent, addr_hex, inst_str, flow_indicator))
        current_addr = inst.getMaxAddress().add(1)

    write_output("{}    +{}+".format(indent, "=" * 70))

    if len(next_cases) > 1 and has_conditional:
        write_output("{}        +======> FALSE =====> CASE {}".format(indent, next_cases[0]))
        write_output("{}        +======> TRUE  =====> CASE {}".format(indent, next_cases[1]))
    elif next_cases:
        write_output("{}        v".format(indent))

    return next_cases

visited_cases = set()
case_queue = [(START_CASE, 0, None)]

write_output("\n  ENTRY POINT\n      v")

while case_queue:
    current_case, indent_level, from_case = case_queue.pop(0)

    if current_case == EXIT_CASE:
        indent_str = "  " * indent_level
        write_output("{}    +==== EXIT CASE {} ====+".format(indent_str, EXIT_CASE))
        case_addr = case_addresses[EXIT_CASE]
        current_addr = case_addr

        for i in range(2):
            inst = getInstructionAt(current_addr)
            if inst is None:
                break
            addr_hex = "0x{:x}".format(current_addr.getOffset())
            enhanced_inst = enhance_instruction(inst)
            flow_indicator = " --> EXIT" if inst.getMnemonicString() == "JMP" else ""
            write_output("{}    | {:>10} | {:<60}{}".format(indent_str, addr_hex, enhanced_inst, flow_indicator))
            current_addr = inst.getMaxAddress().add(1)

        write_output("{}    +{}+".format(indent_str, "=" * 70))
        continue

    visited_cases.add(current_case)
    next_cases = process_case(current_case, indent_level, from_case)

    for next_case in next_cases:
        if next_case is not None:
            if next_case in visited_cases:
                # This is a true loop - show it inline
                indent_str = "  " * indent_level
                write_output("{}        v".format(indent_str))
                write_output("{}     CASE {} (LOOP BACK)".format(indent_str, next_case))
            else:
                new_indent = indent_level + (1 if len(next_cases) > 1 else 0)
                case_queue.append((next_case, new_indent, current_case))

# Save file
output_file = "control_flow_analysis.txt"
try:
    with open(output_file, 'w') as f:
        f.write('\n'.join(output_lines))
    print("File saved: {}".format(output_file))
except:
    print("Failed to save file")

This script does not overwrite any bytes; it just saves the deflattened code to a file so you can check it out. With a few improvements you could also patch the bytes, but then you would need to trace addresses, fix up jump targets accordingly, and add a lot more checks, so I kept the script simple. Its output looks like this:

NOTE: I feel that reading the script is easier to understand than me explaining everything one by one.

; Control Flow Deobfuscation Analysis
; Date: 2025-08-13 14:17:50 | Binary: a.out
;======================================================================

  ENTRY POINT
      v
    +==== CASE 11 (0x4018a3) ====+
    |   0x4018a3 | MOV EDI,0x402298                                            
    |   0x4018a8 | CALL 0x00401040                                              --> puts
    |   0x4018ad | MOV EDI,0x4022c0                                            
    |   0x4018b2 | CALL 0x00401040                                              --> puts
    |   0x4018b7 | MOV dword ptr [RBP + -0x4],0x0                              
    |   0x4018be | MOV qword ptr [RBP + -0x10],0x0                              =====> CASE 0
    +======================================================================+
        v
        v
    +==== CASE 0 (0x401901) ====+
    |   0x401901 | CMP dword ptr [RBP + -0x4],0x9                              
    |   0x401905 | JG 0x00401911                                                [BRANCH]
    |   0x401907 | MOV qword ptr [RBP + -0x10],0x9                              =====> CASE 9
    |   0x401911 | MOV qword ptr [RBP + -0x10],0x1                              =====> CASE 1
    +======================================================================+
        +======> FALSE =====> CASE 9
        +======> TRUE  =====> CASE 1
          v
      +==== CASE 9 (0x4018cb) ====+
      |   0x4018cb | MOV EAX,dword ptr [RBP + -0x4]                              
      |   0x4018ce | MOV EDI,EAX                                                 
      |   0x4018d0 | CALL 0x0040130f                                              --> fibonacci
      |   0x4018d5 | MOV dword ptr [RBP + -0x18],EAX                             
      |   0x4018d8 | MOV EAX,dword ptr [RBP + -0x18]                             
      |   0x4018db | MOV ESI,EAX                                                 
      |   0x4018dd | MOV EDI,0x4022e7                                            
      |   0x4018e2 | MOV EAX,0x0                                                 
      |   0x4018e7 | CALL 0x00401060                                              --> printf
      |   0x4018ec | ADD dword ptr [RBP + -0x4],0x1                              
      |   0x4018f0 | MOV qword ptr [RBP + -0x10],0x0                              =====> CASE 0
      +======================================================================+
          v
          v
       CASE 0 (LOOP BACK)
          v
      +==== CASE 1 (0x401838) ====+
      |   0x401838 | MOV EDI,0x402281                                            
      |   0x40183d | CALL 0x00401040                                              --> puts
      |   0x401842 | MOV EDI,0x402283                                            
      |   0x401847 | CALL 0x00401040                                              --> puts
      |   0x40184c | MOV dword ptr [RBP + -0x40],0x5f                            
      |   0x401853 | MOV dword ptr [RBP + -0x3c],0x57                            
      |   0x40185a | MOV dword ptr [RBP + -0x38],0x4c                            
      |   0x401861 | MOV dword ptr [RBP + -0x34],0x41                            
      |   0x401868 | MOV dword ptr [RBP + -0x30],0x2d                            
      |   0x40186f | MOV dword ptr [RBP + -0x8],0x0                              
      |   0x401876 | MOV qword ptr [RBP + -0x10],0x3                              =====> CASE 3
      +======================================================================+
          v
          v
      +==== CASE 3 (0x401883) ====+
      |   0x401883 | CMP dword ptr [RBP + -0x8],0x4                              
      |   0x401887 | JG 0x00401896                                                [BRANCH]
      |   0x401889 | MOV qword ptr [RBP + -0x10],0x2                              =====> CASE 2
      |   0x401896 | MOV qword ptr [RBP + -0x10],0x8                              =====> CASE 8
      +======================================================================+
          +======> FALSE =====> CASE 2
          +======> TRUE  =====> CASE 8
            v
            ...
            ...
            ...

To run the script:

/nix/store/vbvp4d290sc9zjj45g08a1fqlxj0mmi9-ghidra-11.3.2/lib/ghidra/support/analyzeHeadless <projects location> <project name> -process a.out -postScript deflatten_ghidra.py

(I use NixOS, so this is the analyzeHeadless path for me; it differs from OS to OS.)

Files