Format String Vulnerability
A format string vulnerability arises from incorrect use of the printf() family of functions in C, typically when user-controlled input is passed as the format string.
Syntax of printf() in C:
int printf(const char *format, ...);
The first argument, the format string, specifies how the remaining arguments are displayed. printf() also infers how each argument was passed from the format specifiers in the format string. The table below lists common format specifiers used with printf() and how the corresponding argument is interpreted.
Specifier | Output | Passed as |
---|---|---|
%d | Decimal | Value |
%u | Unsigned Decimal | Value |
%c | Character | Value |
%s | String | Reference |
%x | Hexadecimal | Value |
%p | Pointer (address), typically shown in hexadecimal with a '0x' prefix | Value |
%n | Prints nothing; stores the number of characters written so far into the int that the argument points to | Reference |
From here on, this section assumes basic knowledge of C and x86 assembly, and a clear understanding of how the stack works.
In a 32-bit environment, the arguments to printf() are pushed onto the stack in reverse order: first the variables, then the pointer to the format string.
As the table shows, the values pushed onto the stack are printed according to the format specifiers in the format string.
#include <stdio.h>

int main(void)
{
    int a = 80;                          /* value passed to printf() */
    puts("Hello World");
    printf("The decimal is %d \n", a);   /* %d is replaced with 80 */
    return 0;
}
Output:
Hello World
The decimal is 80
This is what the code looks like in x86 assembly. Note that the listing contains only the call to printf(); it shows no call to puts().
Dump of assembler code for function main:
0x0804840b <+0>: lea ecx,[esp+0x4]
0x0804840f <+4>: and esp,0xfffffff0
0x08048412 <+7>: push DWORD PTR [ecx-0x4]
0x08048415 <+10>: push ebp
0x08048416 <+11>: mov ebp,esp
0x08048418 <+13>: push ecx
0x08048419 <+14>: sub esp,0x14
0x0804841c <+17>: mov DWORD PTR [ebp-0xc],0x50
0x08048423 <+24>: sub esp,0x8
0x08048426 <+27>: push DWORD PTR [ebp-0xc]
0x08048429 <+30>: push 0x80484d0
0x0804842e <+35>: call 0x80482e0 <printf@plt>
0x08048433 <+40>: add esp,0x10
0x08048436 <+43>: mov eax,0x0
0x0804843b <+48>: mov ecx,DWORD PTR [ebp-0x4]
0x0804843e <+51>: leave
0x0804843f <+52>: lea esp,[ecx-0x4]
0x08048442 <+55>: ret
End of assembler dump.
Notice the two push instructions just before the call to printf(). They push the two arguments onto the stack: first the value of the variable a (80, or 0x50, loaded from [ebp-0xc]), then the pointer to the format string "The decimal is %d \n" (at address 0x80484d0).
When printf() is called, it assumes that its arguments are already on the stack and proceeds. It reads the format string through the pointer on top of its arguments, prints it, and replaces '%d' with the value 80, which sits right next to the format string pointer on the stack. printf() has no way to check that an argument was actually pushed for each specifier.