Getting to the vulnerability:
Let us consider a case where we forget to give the second argument to printf().
#include <stdio.h>

int main()
{
    int a = 80;

    puts("Hello World");
    printf("The decimal is %d\n");  /* second argument deliberately left out */
    return 0;
}
Dump of assembler code for function main:
0x0804843b <+0>: lea ecx,[esp+0x4]
0x0804843f <+4>: and esp,0xfffffff0
0x08048442 <+7>: push DWORD PTR [ecx-0x4]
0x08048445 <+10>: push ebp
0x08048446 <+11>: mov ebp,esp
0x08048448 <+13>: push ecx
0x08048449 <+14>: sub esp,0x14
0x0804844c <+17>: mov DWORD PTR [ebp-0xc],0x50
0x08048453 <+24>: sub esp,0xc
0x08048456 <+27>: push 0x8048500
0x0804845b <+32>: call 0x8048310 <puts@plt>
0x08048460 <+37>: add esp,0x10
0x08048463 <+40>: sub esp,0xc
0x08048466 <+43>: push 0x804850c
0x0804846b <+48>: call 0x8048300 <printf@plt>
0x08048470 <+53>: add esp,0x10
0x08048473 <+56>: mov eax,0x0
0x08048478 <+61>: mov ecx,DWORD PTR [ebp-0x4]
0x0804847b <+64>: leave
0x0804847c <+65>: lea esp,[ecx-0x4]
0x0804847f <+68>: ret
End of assembler dump.
Notice that there is only one push instruction before calling printf(). No variable is being pushed on to the stack.
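The missing push matters because printf() is a variadic function: it learns how many arguments to fetch only by reading the format string. Below is a minimal sketch of that mechanism. `mini_format` is a hypothetical helper written for this illustration (it is not how libc implements printf()) and it understands only %d.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a libc function): a tiny formatter that
 * understands only %d, to show how a variadic function decides how
 * many arguments to fetch. */
int mini_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    if (cap == 0)
        return -1;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < cap; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            /* The format string says "an int comes next", so one is
             * fetched. Nothing checks that the caller passed it. */
            int value = va_arg(ap, int);
            int n = snprintf(out + used, cap - used, "%d", value);
            if (n < 0 || (size_t)n >= cap - used)
                break;            /* stop on error or truncation */
            used += (size_t)n;
            p++;                  /* skip the 'd' */
        } else {
            out[used++] = *p;     /* ordinary character: copy it */
        }
    }
    va_end(ap);

    out[used] = '\0';
    return (int)used;
}
```

Calling `mini_format(buf, sizeof buf, "The decimal is %d\n", 80)` fills `buf` with the expected text. Calling it without the 80 would still run `va_arg` once, and what it fetches is simply whatever occupies the slot where the argument should have been.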
Q. The output shows ‘7’ in place of %d. How does ‘7’ get there when we never gave printf() any variable or value to print?
To answer the above question, we shall look at a snapshot of the stack just before the printf() function is called.
Have a look at the stack: at the top we have the pointer to the format string. When printf() is called, ‘%d’ gets replaced with whatever sits on the stack right next to that pointer. This is because printf() assumes that whatever is to be printed has already been pushed onto the stack; it has no way of checking how many arguments were actually passed.

How can this error be a vulnerability? What if the programmer decides to display a user-controlled string using printf(), with C code like this:
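#include <stdio.h>

int main()
{
    char *str;   /* never pointed at a buffer; kept as in the original program */
    char *secret = "You don't get to see this";

    puts("I will repeat whatever you say");
    scanf("%s", str);
    printf(str); /* user input is used as the format string */
    return 0;
}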
Here printf() is called with a single argument, the string ‘str’, which we control. What if we apply the previous example to this case: can we not leak data that is on the stack at the moment printf() is called? Let us look at the stack right before printf() is called.
Here I gave the input “Hello”, so the first argument is a pointer to “Hello” and the output is “Hello” itself. Have a look at the stack. The next value on the stack is the same pointer (this is because that pointer was itself pushed as an argument for the scanf() function). Let’s add ‘%d’ after “Hello” and see what our output is. The input is "Hello%d".

Now there is a format specifier in our input. The next value on the stack is the pointer 0xffffd11c, shown here in hexadecimal.
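Because %d treats the four bytes it finds as a signed int on this 32-bit target, a stack address with its top bit set comes out as a negative number. The conversion can be checked by hand with a small sketch; `as_signed32` is a hypothetical helper introduced only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit pattern the way %d does on
 * a 32-bit target, i.e. as a two's-complement signed integer. */
int64_t as_signed32(uint32_t bits)
{
    if (bits & 0x80000000u)                  /* top bit set: negative */
        return (int64_t)bits - 4294967296;   /* subtract 2^32 */
    return (int64_t)bits;
}
```

Passing the pointer value from the stack snapshot, `as_signed32(0xffffd11cu)`, gives the number that %d will print for it.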
The output we get is: Hello-12004

-12004 is the signed decimal value of 0xffffd11c. If we give more than one %d, we get correspondingly more values from the stack. When the input is “%p%p%p” (we want to print the values on the stack in hexadecimal), the output is:

0xffffd11c0xf7e29a500x804853b

Now let’s get to the fun part. The programmer clearly doesn’t want the user to know what the “secret” string is, and no part of the code prints it out.
A pointer to the “secret” string is stored on the stack, so we can leak that pointer.

input: “%p%p%p%p%p%p%p”
Output:
0xffffd11c0xf7e29a500x804853b0x10xffffd1140xffffd11c 0x8048570

The last address in the output, 0x8048570, is the pointer to “secret”. To view what the pointer holds we need to dereference it and print the result. We can use the ‘%s’ format specifier to do this. Let’s replace all the %p specifiers with %s.

Oops, we will get a segmentation fault if we do that. %s dereferences whatever value it finds on the stack, and printf() doesn’t check whether that value is a valid address, so the program crashes as soon as %s hits a junk value such as 0x1. Instead, we keep %p for the first six values and use %s only for the seventh.

input: “%p%p%p%p%p%p%s”
Output:
0xffffd11c0xf7e29a500x804853b0x10xffffd1140xffffd11c You don't get to see this

We have successfully leaked information that a user should not know.
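The input used for the leak follows a simple pattern: one %p for every stack slot we want to step over, then a single %s at the slot holding the pointer. A minimal sketch of building such an input is shown here. `build_leak_input` is a hypothetical helper, and the offset of 7 applies only to the binary examined in this post; other programs will need a different value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build (offset - 1) copies of "%p" followed by
 * one "%s". Returns 0 on success, -1 if offset is 0 or the buffer is
 * too small. */
int build_leak_input(char *out, size_t cap, unsigned offset)
{
    size_t need;

    if (offset == 0)
        return -1;
    need = 2u * (size_t)offset + 1;   /* two chars per specifier + NUL */
    if (cap < need)
        return -1;

    out[0] = '\0';
    for (unsigned i = 1; i < offset; i++)
        strcat(out, "%p");            /* step over one stack slot */
    strcat(out, "%s");                /* dereference the final slot */
    return 0;
}
```

With an offset of 7 this produces exactly the input typed by hand above.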
Hacking is not all about doing something we aren’t supposed to. It is equally important to analyse the vulnerability or the mistake the programmer made that led to the compromise of data security.
In the previous example we gave %p%p%p%p%p%p%s as the input to get the leak. There is a shorter way of reaching the same value: %7$s. This makes printf() go straight to the 7th argument position, which here is the 7th value on the stack counting from the one right after the format string pointer, and print it as a string. This will come in handy in the next section.
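If you are using a 64-bit binary, the offset changes: on x86-64 Linux the first six integer or pointer arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9), so printf() takes the format string and the next five values from registers before it starts reading arguments from the stack.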
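The `%7$s` form is the positional-argument syntax that POSIX adds to printf() (it is not part of ISO C). The sketch below shows it in a legitimate call, plus a way to build the one-specifier input for a given offset. Both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Legitimate use of POSIX positional arguments: "%2$s" means "the
 * second argument, as a string", wherever it appears in the format. */
int greet_swapped(char *out, size_t cap, const char *first, const char *second)
{
    return snprintf(out, cap, "%2$s %1$s", first, second);
}

/* Hypothetical helper: build the one-specifier input "%<offset>$s".
 * Returns 0 on success, -1 on error or truncation. */
int build_direct_input(char *out, size_t cap, unsigned offset)
{
    int n = snprintf(out, cap, "%%%u$s", offset);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

In `greet_swapped` the arguments really exist, so the numbers just pick among them. In the vulnerable program no such arguments were passed, so the number picks a stack slot instead.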
How could the programmer fix this issue?
- printf() should always be given a format string of the programmer’s own. With %s as the first argument and the user-controlled string as the second, the input will do no harm.
- Alternatively, the puts() function can be used to display the string on the screen.
- Avoid leaving sensitive data lying around where it is not needed. Data abstraction is important.
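The first two fixes can be sketched as follows. The function names are hypothetical; what matters is that the user's text is only ever passed as data and never as the format string. Note that puts() appends a newline on its own, while fputs() does not, which is why the second variant adds one explicitly.

```c
#include <stddef.h>
#include <stdio.h>

/* Safe: the format string is a constant, and the user's text is only
 * ever data for %s. */
void repeat_with_printf(const char *user)
{
    printf("%s\n", user);
}

/* Also safe: fputs() does no format processing at all. */
void repeat_with_fputs(const char *user)
{
    fputs(user, stdout);
    fputc('\n', stdout);
}

/* Same idea, but writing into a buffer so the result can be inspected. */
int repeat_into(char *out, size_t cap, const char *user)
{
    return snprintf(out, cap, "%s", user);
}
```

Given the input `%p%p%p%p%p%p%s`, each of these functions repeats those characters literally instead of reading anything from the stack.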