Firmware reverse engineering

This section is about understanding how a particular device works and finding possible vulnerabilities in it. These devices may have a web interface or an Android application that is itself vulnerable, so knowledge of those fields is valuable. To understand the internal workings of any device, you need a basic understanding of reverse engineering.

What is firmware reverse engineering and why should you do it?

Firmware reversing means breaking a device down to understand its internal workings. It ranges from simple analysis, such as examining the device's file system and interfaces, to digging deep into the firmware itself and uncovering its internal details and algorithms. Reversing firmware can reveal critical information about the device, such as hardcoded data, security flaws in critical algorithms, and even login credentials.

Dealing with various hardware architectures

When dealing with embedded devices you need to understand various internal mechanisms of the device, and one of the most important is its architecture. Most embedded systems use a RISC (Reduced Instruction Set Computer) architecture: it uses small, optimized instructions that each take few clock cycles, in contrast to CISC (Complex Instruction Set Computer) processors, where a single instruction can perform a complex multi-step task that takes many cycles. A clear understanding of a device's architecture is essential to understanding its internal workings.

Now let's take a look at a few important hardware architectures.

MIPS

MIPS (Microprocessor without Interlocked Pipelined Stages) is a RISC instruction set architecture used mainly in embedded systems such as gateways and routers. MIPS has 32 general-purpose registers:

 $zero     - A register that always holds the value zero
 $at       - (Assembler Temporary) This register is reserved by the assembler
 $v0-$v1   - (Values) Hold values from expression evaluation and function results
 $a0-$a3   - (Arguments) Hold parameters for the called function
 $t0-$t7   - (Temporaries) Registers for holding temporary values
 $s0-$s7   - (Saved values) Registers whose values are preserved across function calls
 $t8-$t9   - (Temporaries) Registers for holding temporary values (in addition to $t0-$t7)
 $k0-$k1   - Registers reserved for the kernel (interrupt handling)
 $gp       - (Global Pointer) Points to the global area (static data segment)
 $sp       - (Stack Pointer) Points to the top of the stack
 $fp       - (Frame Pointer) Points to the start of the stack frame
 $ra       - (Return Address) Register which stores the return address 
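
To see how these register names show up in machine code, here is a minimal Python sketch that decodes one R-type instruction word. It assumes the standard MIPS32 R-type layout (6-bit opcode, three 5-bit register fields, shift amount, 6-bit function code) and covers only four function codes; the names REGS, FUNCTS and decode_rtype are made up for this illustration.

```python
# Register names in encoding order (0-31), following the table above.
REGS = (
    ["$zero", "$at", "$v0", "$v1"]
    + [f"$a{i}" for i in range(4)]
    + [f"$t{i}" for i in range(8)]
    + [f"$s{i}" for i in range(8)]
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)

# A few R-type function codes (opcode field == 0); deliberately incomplete.
FUNCTS = {0x20: "add", 0x21: "addu", 0x22: "sub", 0x23: "subu"}


def decode_rtype(word):
    """Decode a 32-bit R-type instruction word into assembly text."""
    opcode = (word >> 26) & 0x3F
    if opcode != 0:
        raise ValueError("not an R-type instruction")
    rs = (word >> 21) & 0x1F  # first source register
    rt = (word >> 16) & 0x1F  # second source register
    rd = (word >> 11) & 0x1F  # destination register
    funct = word & 0x3F
    if funct not in FUNCTS:
        raise ValueError(f"function code {funct:#x} not handled by this sketch")
    return f"{FUNCTS[funct]} {REGS[rd]}, {REGS[rs]}, {REGS[rt]}"


print(decode_rtype(0x012A4020))  # add $t0, $t1, $t2
```

A real disassembler handles every instruction format; the point here is only that each register in the table corresponds to a 5-bit number inside the instruction.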

The main instructions are:

  • General arithmetic operations: add,addu,sub,subu,mult,div

  • Load and store instructions: la,lb,lw,sw,sb

  • Branch and conditional branches: b,beq,bne,ble,bge, etc.

  • Jumps and function calls: j,jal,jr

Now, let's have a look at a sample MIPS assembly program:

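    .data
 out_string:    .asciiz   "\nHello, World!\n"
    .text
 main:
    li $v0, 4               # syscall 4: print string (SPIM/MARS convention)
    la $a0, out_string      # $a0 = address of the string to print
    syscall
    li $v0, 10              # syscall 10: exit
    syscall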

ARM

ARM (Advanced RISC Machine), as its name suggests, is a family of RISC architectures widely used in embedded devices. The ARM architecture covers everything from low-end microcontrollers to high-end smartphones. The exact set of registers depends on the ARM version; the common registers are:

R0-R10      General Purpose
R11 (FP)    Frame Pointer
R12 (IP)    Intra-Procedure-Call Scratch Register
R13 (SP)    Stack Pointer
R14 (LR)    Link Register
R15 (PC)    Program Counter
CPSR        Current Program Status Register (flags)

The common instructions are:

  • Arithmetic and logical operations: ADD,SUB,MUL,LSL,ROR, etc.

  • Load Store operations: LDR,LDM,STR,STM

  • Branch operations: B,BL,BX,BLX

  • Branches, like most ARM instructions, can be executed conditionally by adding a condition suffix (for example BEQ or BNE)
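
Conditional execution is visible directly in the encoding. The sketch below assumes the classic 32-bit ARM (A32) encoding, where the top four bits of every instruction word hold the condition; Thumb encodings differ. The names CONDITIONS and condition_of are illustrative only.

```python
# Condition mnemonics indexed by the 4-bit condition field (bits 31-28).
# Index 15 ("NV") is not a normal condition; later ARM versions reuse it
# for unconditional instruction encodings.
CONDITIONS = [
    "EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
    "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV",
]


def condition_of(word):
    """Return the condition mnemonic of a 32-bit A32 instruction word."""
    return CONDITIONS[(word >> 28) & 0xF]


print(condition_of(0xE3A00001))  # mov r0, #1 -> AL (always)
print(condition_of(0x0A000000))  # beq        -> EQ
```

This is why most instruction words in an ARM firmware dump start with the hex digit E: they carry the AL (always) condition.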

Now, let's have a look at a sample ARM assembly program:

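.data
hello:
    .ascii "Hello World\n"   @ .ascii, so len does not count a trailing NUL
len = .-hello
.text
.global _start
_start:
    mov r0, #1       @ file descriptor 1 (stdout)
    ldr r1, =hello   @ address of the buffer
    ldr r2, =len     @ number of bytes to write
    mov r7, #4       @ syscall 4: write
    swi 0
    mov r0, #0       @ exit status 0
    mov r7, #1       @ syscall 1: exit
    swi 0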

AVR

AVR is an 8-bit RISC architecture used mainly in automotive applications and in security and entertainment systems. AVR is also the architecture used by many Arduino development boards. It has 32 registers:

r0-r25  - general purpose 8-bit registers
r26     - X register's lower byte
r27     - X register's higher byte
r28     - Y register's lower byte
r29     - Y register's higher byte
r30     - Z register's lower byte
r31     - Z register's higher byte

The common instructions are:

  • Arithmetic operations: add,adc,sub,sbc
  • Input/Output: in,out
  • Load Store operations: ld,ldi,lds,st,sts

RISC-V

RISC-V is a free and open ISA (instruction set architecture) based on established RISC (Reduced Instruction Set Computer) principles. The RISC-V ISA uses a load-store architecture. It has 32 integer registers:

x0      - hardwired zero    
x1      - return address 
x2      - stack pointer 
x3      - global pointer    
x4      - thread pointer 
x5-7    - temporary registers
x8      - saved register / frame pointer 
x9      - saved register    
x10-11  - function arguments / return values 
x12-17  - function arguments    
x18-27  - saved registers 
x28-31  - temporary registers   
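
Disassemblers usually print the ABI names (ra, sp, a0, ...) rather than the raw x numbers, so it helps to be able to translate between the two. The following is a small sketch of that mapping based on the table above and the standard RISC-V calling convention names; abi_name is a hypothetical helper, not part of any tool.

```python
def abi_name(n):
    """Map integer register number n (the N in xN) to its ABI name."""
    if not 0 <= n <= 31:
        raise ValueError("RISC-V has integer registers x0 to x31")
    fixed = {0: "zero", 1: "ra", 2: "sp", 3: "gp", 4: "tp", 8: "s0", 9: "s1"}
    if n in fixed:
        return fixed[n]          # x8 is s0, also used as the frame pointer
    if 5 <= n <= 7:
        return f"t{n - 5}"       # x5-x7   -> t0-t2
    if 10 <= n <= 17:
        return f"a{n - 10}"      # x10-x17 -> a0-a7
    if 18 <= n <= 27:
        return f"s{n - 16}"      # x18-x27 -> s2-s11
    return f"t{n - 25}"          # x28-x31 -> t3-t6


print([abi_name(n) for n in (1, 2, 10, 18, 31)])
# ['ra', 'sp', 'a0', 's2', 't6']
```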

The common instructions are:

  • Arithmetic operations: ADD,ADDI,SUB,LUI
  • Branch operations: BEQ,BNE,BLT,BGE
  • Load Store Operations: LB,LH,LW,LBU,LHU,SB,SH,SW

There are many more hardware architectures, and it's almost impossible to learn everything about them. Try to learn the core parts of an architecture, such as its basic instructions and its program flow. A good way to familiarize yourself with various architectures is to write simple high-level code and compare it with the corresponding assembly.
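
When it is not obvious which architecture a binary targets, the ELF header records it. The sketch below reads the class, byte order and machine fields from the first 20 bytes of an ELF file. The field offsets and machine numbers come from the ELF specification; the table is deliberately limited to a handful of values, and elf_arch and the sample header are illustrative assumptions.

```python
import struct

# A few e_machine values from the ELF specification (not exhaustive).
MACHINES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64",
            83: "AVR", 183: "AArch64", 243: "RISC-V"}


def elf_arch(header):
    """Return (machine, bits, endianness) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])             # EI_CLASS
    endian = {1: "little", 2: "big"}.get(header[5])  # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid ELF class or data encoding")
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine
    return MACHINES.get(machine, f"unknown ({machine})"), bits, endian


# Synthetic header standing in for a 32-bit big-endian MIPS executable.
sample = b"\x7fELF\x01\x02\x01" + bytes(9) + struct.pack(">HH", 2, 8)
print(elf_arch(sample))  # ('MIPS', 32, 'big')
```

With a real target, you would pass the first 20 bytes of a binary taken from the device's filesystem instead of the synthetic sample.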

Firmware emulation

What is firmware emulation and why should we do it?

Firmware emulation means producing a virtual version of a device by recreating its functionality in a different environment. The primary objective is to be able to test the device: emulation lets us try out exploits under various simulated conditions and run tests, such as fuzzing, that are impractical on a physical device.

Basics of emulating firmware

Before we get into emulating a particular device, we need to know various details about it. First and foremost are, of course, the hardware specifications of the device. Firmware is built for a particular architecture and expects its own specific hardware to run properly. So, how can you make this firmware run on your machine, which is totally different from what it requires? This is where hardware virtualization comes into play. QEMU is an open-source emulator and hardware virtualizer that can emulate one machine's processor on another machine.

The next important part of the device is its root filesystem. Like any other Linux system, a Linux-based embedded device has a root filesystem, and the emulated system needs it in order to run the device's programs.

Remember, when we tried to analyze a firmware, we used binwalk to see what's inside it.

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             TRX firmware header, little endian, image size: 1691648 bytes, CRC32: 0x50C5FAF8, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8A260, rootfs offset: 0x0
28            0x1C            gzip compressed data, maximum compression, has original file name: "piggy", from Unix, last modified: 2007-02-14 19:21:37
565856        0x8A260         CramFS filesystem, little endian, size: 1122304, version 2, sorted_dirs, CRC 0xD6DE1CB8, edition 0, 797 blocks, 212 files

We can see that there is a CramFS filesystem. We need to extract this filesystem from the firmware file for emulation.
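
binwalk can do the extraction itself, but the offset and size it reports are enough to carve the filesystem out by hand. Here is a minimal Python sketch of that idea. The magic value is the little-endian CramFS magic number (0x28CD3D45); carve and carve_cramfs are hypothetical helper names, and the demo runs on a small synthetic image rather than real firmware.

```python
CRAMFS_MAGIC_LE = bytes.fromhex("453dcd28")  # 0x28CD3D45 stored little endian


def carve(image, offset, size):
    """Return size bytes of image starting at offset, or raise if truncated."""
    chunk = image[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("image is shorter than offset + size")
    return chunk


def carve_cramfs(image, offset, size):
    """Carve a little-endian CramFS image and sanity-check its magic number."""
    chunk = carve(image, offset, size)
    if chunk[:4] != CRAMFS_MAGIC_LE:
        raise ValueError("no CramFS magic at this offset")
    return chunk


# Synthetic stand-in for a firmware file: padding, filesystem, padding.
image = bytes(16) + CRAMFS_MAGIC_LE + b"filesystem data" + bytes(8)
rootfs = carve_cramfs(image, offset=16, size=19)
print(rootfs[4:])  # b'filesystem data'
```

For the firmware analyzed above, the same call would use offset 0x8A260 and size 1122304 on the bytes of the firmware file, and the result would be written to a new file for mounting or unpacking.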

Next, we need the kernel used by the embedded device. Going back to the binwalk output above, there is gzip compressed data at offset 0x1C in the firmware. Decompressing that data and running binwalk on the result gives:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
1044640       0xFF0A0         Linux kernel version 2.4.20
1063452       0x103A1C        Unix path: /usr/lib/libc.so.1
1110975       0x10F3BF        Copyright string: "Copyright 1995-1998 Mark Adler"
1303420       0x13E37C        CRC32 polynomial table, little endian
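
The gzip data that the first binwalk listing reports at offset 0x1C (28) can be unpacked with Python's standard zlib module, which stops at the end of the gzip stream and ignores whatever follows it in the image. This is a sketch on synthetic data; gunzip_at is a hypothetical helper name.

```python
import gzip
import zlib


def gunzip_at(image, offset):
    """Decompress the gzip stream starting at offset, ignoring trailing data."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    data = decompressor.decompress(image[offset:])
    if not decompressor.eof:
        raise ValueError("gzip stream is truncated")
    return data


# Synthetic stand-in: 28-byte header, compressed payload, then other data.
payload = b"Linux kernel version 2.4.20"
image = bytes(28) + gzip.compress(payload) + b"filesystem follows"
print(gunzip_at(image, 28))  # b'Linux kernel version 2.4.20'
```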

Running binwalk again on this compressed data shows that it contains a Linux kernel, so we can extract the kernel image from the firmware. If that is not possible, we can also compile a kernel ourselves, using a version as close as possible to the one used by the device. Buildroot offers a wide range of features for exactly that, including support for all the popular CPU architectures. Once we have these files, we can begin a simple emulation by starting up QEMU.

$ qemu-system-mips -M malta \
-m 512 -hda hda.img \
-kernel kernel_image \
-initrd initrd.img-4.9.0-8-4kc-malta \
-append "root=/dev/sda1 console=ttyS0 nokaslr" \
-nographic \
-nic user,hostfwd=tcp::5555-:22

The -nic user,hostfwd=tcp::5555-:22 option forwards port 5555 on the host to port 22 in the guest, so we can use SSH to transfer the filesystem we extracted from the firmware into the virtual machine. Once the filesystem is transferred, we go into its root directory with cd firmware-fs and bind-mount /proc, /dev and /sys into it. Then we use the chroot command to change the root directory to the extracted filesystem and run /bin/sh.

$ mount -o bind /proc proc/
$ mount -o bind /dev dev/
$ mount -o bind /sys sys/
$ chroot . /bin/sh

With that we have a root shell on the device's filesystem. This is by no means a full emulation of an embedded device, as there are many more aspects such as NVRAM (non-volatile RAM), web interfaces and so on. Full system emulation can be difficult, and setting up and integrating the various parts of a device manually is a lot of work. That's where firmware emulation platforms can help: they can give you a full system emulation so that you don't have to set everything up yourself.

Firmware emulation platforms

  • Firmadyne : An automated emulation and analysis platform for Linux-based embedded firmware
  • ARM-X : A QEMU based Firmware Emulation Framework for various ARM IoT devices