Firmware reverse engineering¶
This section is about understanding how a particular device works and what vulnerabilities it may have. These devices may expose a web interface or an Android application that is itself vulnerable, so knowledge of those fields is valuable. To understand the internal workings of any device, you need a basic understanding of reverse engineering.
What is firmware reverse engineering and why should you do it?¶
Firmware reversing involves breaking a device down and understanding its internal workings. It can range from simple analysis, such as examining the device's file system and interfaces, to digging deep into the firmware itself to uncover its internal details and algorithms. Reversing firmware can reveal critical information about the device, such as hardcoded data, security flaws in critical algorithms, and even login credentials.
Dealing with various hardware architectures¶
When dealing with embedded devices, you will have to understand various internal mechanisms of the device. One important aspect is its architecture. Most embedded systems use a RISC (Reduced Instruction Set Computer) architecture, which relies on small, optimized instructions that usually take few clock cycles each. This contrasts with CISC (Complex Instruction Set Computer) processors, where a single instruction can perform a complex task but may take many clock cycles. A clear understanding of a device's architecture is important for understanding its internal workings.
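A quick way to find out which architecture a firmware binary targets is to read its ELF header. The following is a minimal sketch, assuming the binary is an ELF file; the `identify_elf` helper and the sample header are illustrative, and the table covers only a handful of the `e_machine` values defined by the ELF specification.

```python
import struct

# Subset of e_machine values from the ELF specification.
ELF_MACHINES = {
    3: "x86",
    8: "MIPS",
    40: "ARM",
    62: "x86-64",
    83: "AVR",
    183: "AArch64",
    243: "RISC-V",
}


def identify_elf(header: bytes) -> dict:
    """Return architecture, word size and endianness from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])               # EI_CLASS
    endian = {1: "little", 2: "big"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid ELF identification bytes")
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)   # e_machine field
    return {
        "arch": ELF_MACHINES.get(machine, f"unknown ({machine})"),
        "bits": bits,
        "endian": endian,
    }


# Hand-built header of a 32-bit big-endian MIPS executable (illustrative data).
sample = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) + struct.pack(">HH", 2, 8)
info = identify_elf(sample)
print(info)  # {'arch': 'MIPS', 'bits': 32, 'endian': 'big'}
```

For a real file, passing the first 20 bytes is enough, for example the bytes read from a binary inside an extracted root filesystem.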
Now let's take a look at a few important hardware architectures.
MIPS (Microprocessor without Interlocked Pipelined Stages) is a RISC instruction set architecture mainly used in embedded systems such as gateways and routers. MIPS has 32 general-purpose registers:
- $zero - A register that always holds the value zero
- $at - (Assembler Temporary) Reserved for use by the assembler
- $v0-$v1 - (Values) Hold values from expression evaluation and function results
- $a0-$a3 - (Arguments) Hold parameters for the called function
- $t0-$t7 - (Temporaries) Hold temporary values
- $s0-$s7 - (Saved values) Hold values that are preserved across function calls for the calling function
- $t8-$t9 - (Temporaries) Hold temporary values (in addition to $t0-$t7)
- $k0-$k1 - Reserved for interrupt handling
- $gp - (Global Pointer) Points to the global area (static data segment)
- $sp - (Stack Pointer) Points to the top of the stack
- $fp - (Frame Pointer) Points to the start of the stack frame
- $ra - (Return Address) Stores the return address
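When reading raw MIPS machine code, these names correspond to register numbers 0 to 31 in the order listed above. The sketch below assumes the standard 32-bit R-type layout (opcode, rs, rt, rd, shamt, funct); the `decode_mips_rtype` helper is illustrative, not part of any tool.

```python
# Conventional MIPS register names, indexed by register number (0-31).
MIPS_REGS = (
    ["$zero", "$at", "$v0", "$v1"]
    + [f"$a{i}" for i in range(4)]
    + [f"$t{i}" for i in range(8)]
    + [f"$s{i}" for i in range(8)]
    + ["$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"]
)


def decode_mips_rtype(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": MIPS_REGS[(word >> 21) & 0x1F],
        "rt": MIPS_REGS[(word >> 16) & 0x1F],
        "rd": MIPS_REGS[(word >> 11) & 0x1F],
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# 0x012A4020 encodes `add $t0, $t1, $t2` (opcode 0, funct 0x20).
mips_fields = decode_mips_rtype(0x012A4020)
print(mips_fields)
```

The decoded fields show the destination ($t0) in `rd` and the two source operands in `rs` and `rt`, which is the reverse of the order used in the assembly text.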
The main instructions are:
General arithmetic operations:
Load and store instructions:
Branch and conditional branches:
Jumps and function calls:
Now, let's have a look at a sample MIPS assembly program:
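            .data
    out_string: .asciiz "\nHello, World!\n"

            .text
    main:
            li $v0, 4            # syscall 4: print string (SPIM/MARS convention)
            la $a0, out_string   # address of the string to print
            syscall
            li $v0, 10           # syscall 10: exit
            syscall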
ARM (Advanced RISC Machine), as its name suggests, is a family of RISC architectures widely used in embedded devices. The ARM architecture supports a range of devices, from low-end microcontrollers to high-end smartphones. The number of registers listed below depends on the ARM version. The common registers are:
- R0-R10 - General purpose
- R11 (FP) - Frame Pointer
- R12 - Intra-procedural call scratch register
- R13 (SP) - Stack Pointer
- R14 (LR) - Link Register
- R15 (PC) - Program Counter
- CPSR - Current Program Status Register (flags)
The common instructions are:
Arithmetic and logical operations:
Load Store operations:
Branch operation can be used with conditional execution
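Conditional execution is visible directly in the encoding: in the classic 32-bit ARM (A32) instruction set, the top four bits of every instruction hold a condition code. A minimal sketch of reading that field follows; it assumes A32 encoding (not Thumb), and the `arm_condition` helper is illustrative.

```python
# Condition mnemonics for values 0-14 of the A32 condition field.
ARM_CONDITIONS = [
    "EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
    "HI", "LS", "GE", "LT", "GT", "LE", "AL",
]


def arm_condition(word: int) -> str:
    """Return the condition mnemonic stored in bits 31-28 of an A32 word."""
    cond = (word >> 28) & 0xF
    if cond == 0xF:
        # 0b1111 is not an ordinary condition; it marks a separate
        # group of unconditional instructions.
        return "unconditional"
    return ARM_CONDITIONS[cond]


print(arm_condition(0xE3A00001))  # mov r0, #1  -> AL (always)
print(arm_condition(0x0A000000))  # beq <label> -> EQ
print(arm_condition(0x1A000000))  # bne <label> -> NE
```

Most instructions in a disassembly carry the AL (always) condition, which is why their mnemonics appear without a suffix.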
Now, let's have a look at a sample ARM assembly program:
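            .data
    hello:  .asciz "Hello World\n"
    len = .-hello

            .text
            .global _start
    _start:
            mov r0, #1      @ file descriptor 1 (stdout)
            ldr r1, =hello  @ address of the string
            ldr r2, =len    @ number of bytes to write
            mov r7, #4      @ Linux syscall 4: write
            swi 0
            mov r7, #1      @ Linux syscall 1: exit
            swi 0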
AVR is an 8-bit RISC architecture which is mainly used in automotive applications, security and entertainment systems. AVR is also the architecture used by Arduino development boards.
- r0-r25 - general purpose 8-bit registers
- r26 - X register's lower byte
- r27 - X register's higher byte
- r28 - Y register's lower byte
- r29 - Y register's higher byte
- r30 - Z register's lower byte
- r31 - Z register's higher byte
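The X, Y and Z registers are 16-bit pointers, each built from two of the 8-bit registers above. As a small sketch of how the low and high bytes combine (the register contents are made-up values for illustration):

```python
def pair_to_pointer(low: int, high: int) -> int:
    """Combine two 8-bit register values into one 16-bit pointer."""
    if not (0 <= low <= 0xFF and 0 <= high <= 0xFF):
        raise ValueError("register values must fit in 8 bits")
    return (high << 8) | low


# Hypothetical register contents: r26 = 0x34 (low byte), r27 = 0x12 (high byte).
x_pointer = pair_to_pointer(0x34, 0x12)
print(hex(x_pointer))  # 0x1234
```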
The common instructions are:
* Arithmetic operations:
* Load Store Operations:
RISC-V is a free and open ISA (instruction set architecture) based on established RISC (Reduced Instruction Set Computer) principles. The RISC-V ISA uses a load-store architecture.
- x0 - hardwired zero
- x1 - return address
- x2 - stack pointer
- x3 - global pointer
- x4 - thread pointer
- x5-7 - temporary registers
- x8 - saved register / frame pointer
- x9 - saved register
- x10-11 - function arguments / return values
- x12-17 - function arguments
- x18-27 - saved registers
- x28-31 - temporary registers
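Disassemblers usually print these registers by their ABI names (zero, ra, sp, a0, and so on) rather than as x0-x31. The sketch below maps register numbers to ABI names and splits an R-type instruction into its fields, assuming the base 32-bit encoding; the `decode_riscv_rtype` helper is illustrative.

```python
# RISC-V ABI register names, indexed by register number (x0-x31).
RISCV_ABI = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]
    + [f"a{i}" for i in range(8)]       # x10-x17
    + [f"s{i}" for i in range(2, 12)]   # x18-x27
    + [f"t{i}" for i in range(3, 7)]    # x28-x31
)


def decode_riscv_rtype(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": RISCV_ABI[(word >> 7) & 0x1F],
        "funct3": (word >> 12) & 0x7,
        "rs1": RISCV_ABI[(word >> 15) & 0x1F],
        "rs2": RISCV_ABI[(word >> 20) & 0x1F],
        "funct7": (word >> 25) & 0x7F,
    }


# 0x00C58533 encodes `add a0, a1, a2` (opcode 0x33, funct3 0, funct7 0).
riscv_fields = decode_riscv_rtype(0x00C58533)
print(riscv_fields)
```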
The common instructions are:
- Arithmetic operations:
- Branch operations:
- Load Store Operations:
There are many more hardware architectures, and it's almost impossible to learn everything about them. Try to learn the core parts of an architecture, such as its basic instructions and its program flow. A good way to familiarize yourself with various architectures is to write simple high-level code and compare it with the corresponding assembly.
- DVRF: a project that simulates a real-world environment to help you learn about various CPU architectures.
- Setting up DVRF: a set-up guide for DVRF
- Breaking Encryption in an Embedded Firmware using Ghidra
What is firmware emulation and why should we do it?¶
Firmware emulation is about producing a virtual version of a device by recreating its functionality in a different environment. The primary objective is to run tests against the device. Emulation lets us try out exploits under various simulated conditions and perform tests, such as fuzzing, that we normally can't do with a physical device.
Basics of emulating firmware¶
Before we get into emulating a particular device, we need to know various details about it. First and foremost are, of course, the hardware specifications of the device. A firmware image is built for a particular architecture and requires its own specific hardware to run properly. So, how can you make this firmware run on your machine, which is totally different from what is required? This is where hardware virtualization comes into play. QEMU is an open-source emulator and hardware virtualizer that can emulate one machine's processor on another machine.
The next important part of the device is its root filesystem. An embedded device also has a root filesystem and it is necessary for emulation.
Remember that when we analyzed a firmware image, we used binwalk to see what's inside:
    DECIMAL       HEXADECIMAL     DESCRIPTION
    --------------------------------------------------------------------------------
    0             0x0             TRX firmware header, little endian, image size: 1691648 bytes, CRC32: 0x50C5FAF8, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0x8A260, rootfs offset: 0x0
    28            0x1C            gzip compressed data, maximum compression, has original file name: "piggy", from Unix, last modified: 2007-02-14 19:21:37
    565856        0x8A260         CramFS filesystem, little endian, size: 1122304, version 2, sorted_dirs, CRC 0xD6DE1CB8, edition 0, 797 blocks, 212 files
We can see that there is a CramFS filesystem at offset 0x8A260 (565856). We will need to extract this filesystem from the firmware file for emulation.
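The TRX header that binwalk reports can also be read by hand, which helps when you want to carve a region out of the image yourself. The sketch below assumes the common 28-byte little-endian TRX layout (magic `HDR0`, image length, CRC32, a combined flags/version word, and three offsets); the `parse_trx` and `carve` helpers and the synthetic image are illustrative.

```python
import struct
from typing import NamedTuple


class TrxHeader(NamedTuple):
    image_size: int
    crc32: int
    flags: int
    version: int
    offsets: tuple


# Assumed layout: magic, length, CRC32, flags/version word, three offsets.
TRX_STRUCT = struct.Struct("<4sIII3I")  # 28 bytes, little endian


def parse_trx(image: bytes) -> TrxHeader:
    """Parse the TRX header at the start of a firmware image."""
    if len(image) < TRX_STRUCT.size:
        raise ValueError("image too short for a TRX header")
    magic, size, crc, flag_version, *offsets = TRX_STRUCT.unpack_from(image, 0)
    if magic != b"HDR0":
        raise ValueError("missing TRX magic")
    # Low 16 bits hold the flags, high 16 bits the version.
    return TrxHeader(size, crc, flag_version & 0xFFFF, flag_version >> 16,
                     tuple(offsets))


def carve(image: bytes, offset: int, length=None) -> bytes:
    """Return `length` bytes starting at `offset` (to the end if omitted)."""
    end = len(image) if length is None else offset + length
    return image[offset:end]


# Synthetic 64-byte image: header, 20 "kernel" bytes, 16 "filesystem" bytes.
header = TRX_STRUCT.pack(b"HDR0", 64, 0x50C5FAF8, 1 << 16, 0x1C, 0x30, 0)
image = header + b"K" * 20 + b"F" * 16
trx = parse_trx(image)
print(trx)
filesystem = carve(image, trx.offsets[1])
print(len(filesystem))  # 16
```

For the image analyzed above, the equivalent call would be `carve(image, 565856, 1122304)`, using the CramFS offset and size from the binwalk output.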
Next, we need the kernel used by the embedded device. Going back to the binwalk output for the firmware, you can see gzip-compressed data at offset 0x1C. Extracting that data and running binwalk on it gives:
    DECIMAL       HEXADECIMAL     DESCRIPTION
    --------------------------------------------------------------------------------
    1044640       0xFF0A0         Linux kernel version 2.4.20
    1063452       0x103A1C        Unix path: /usr/lib/libc.so.1
    1110975       0x10F3BF        Copyright string: "Copyright 1995-1998 Mark Adler"
    1303420       0x13E37C        CRC32 polynomial table, little endian
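The unpacking step behind this output can be sketched in Python: decompress the gzip member found at a given offset, then search the result for the kernel's version banner. This assumes the kernel embeds a string of the form "Linux version X.Y.Z", which is typical but not guaranteed; the helper names and the synthetic image are illustrative.

```python
import gzip
import re
import zlib


def gunzip_at(image: bytes, offset: int) -> bytes:
    """Decompress the gzip member starting at `offset`.

    Bytes after the end of the gzip stream (for example a filesystem
    that follows the kernel) are ignored.
    """
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing
    return decompressor.decompress(image[offset:])


def find_kernel_version(data: bytes):
    """Return the version from a 'Linux version X.Y.Z' banner, or None."""
    match = re.search(rb"Linux version (\d+\.\d+\.\d+)", data)
    return match.group(1).decode() if match else None


# Synthetic image: 28 header bytes, a gzip member, then trailing data.
payload = bytes(32) + b"Linux version 2.4.20 (illustrative banner)" + bytes(32)
image = b"H" * 28 + gzip.compress(payload) + b"trailing filesystem bytes"

kernel = gunzip_at(image, 28)
print(find_kernel_version(kernel))  # 2.4.20
```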
Running binwalk on the extracted gzip data shows that it contains a Linux kernel, so we can extract the kernel image from the firmware. If that is not possible, we can also compile the kernel ourselves, using a version as close as possible to the kernel version used by the device. Buildroot offers a wide range of features for just that, including CPU support for all the popular architectures. Once we have these files, we can begin a simple emulation by starting QEMU:
    $ qemu-system-mips -M malta \
        -m 512 -hda hda.img \
        -kernel kernel_image \
        -initrd initrd.img-4.9.0-8-4kc-malta \
        -append "root=/dev/sda1 console=ttyS0 nokaslr" \
        -nographic \
        -nic user,hostfwd=tcp::5555-:22
We use the -nic user,hostfwd=tcp::5555-:22 flag to forward port 5555 on the host to port 22 in the guest. This enables SSH access, which can be used to transfer the filesystem we extracted from the firmware into the virtual machine. Once we transfer the filesystem, we can go into its root directory using cd firmware-fs and bind-mount the /proc, /dev and /sys folders. Then, use the chroot command to change the root folder to the extracted filesystem and run a shell:
    $ mount -o bind /proc proc/
    $ mount -o bind /dev dev/
    $ mount -o bind /sys sys/
    $ chroot . /bin/sh
With that, we will have a root shell inside the device's filesystem. This is not a full emulation of an embedded device by any means, as there are many more aspects such as the NVRAM (non-volatile RAM), web interfaces and so on. Full system emulation can be difficult, and setting up and integrating the various parts of a device manually is a lot of work. That's where firmware emulation platforms can help. These platforms can give you a full system emulation so that you don't have to struggle with setting up everything.
Firmware emulation platforms¶
- Firmadyne: An automated emulation and analysis platform for Linux-based embedded firmware
- ARM-X : A QEMU based Firmware Emulation Framework for various ARM IoT devices