How security cookies work

A buffer overrun is a type of vulnerability where a function that has a fixed-size buffer (array) on the stack writes user data past the end of that buffer, thereby corrupting memory.

In this post I want to write about how security cookies are implemented and how they help detect buffer overruns.

What is the issue anyway?

As you probably know, when the CPU executes code, it keeps track of a special data structure in memory called the program stack (or simply the stack). When a function starts executing, its return address and the data for its local variables are placed on the stack, and when the function finishes those values are removed. To keep track of where to put new data on the stack, the CPU has a special register, the stack pointer (on x86 it is the (E/R)SP register).

Every time a function (the callee) is called by the caller, the stack pointer is first decreased, and the address of the instruction following the CALL is then stored at the location the stack pointer now points to. Having that address on the stack is useful, because later the callee uses the ret instruction to return and continue executing the caller's code.

The buffers that the callee allocates live on the same stack: first the stack pointer is decreased by the size of the buffer, and the resulting stack pointer value is saved as a pointer to the buffer. When the function needs to write data to that buffer, it accesses it through that saved pointer. Before the function returns, the stack memory allocated for the buffer and other local variables is freed by increasing the stack pointer back up to the point where the return address was pushed.

All this means that the return address of the function is located somewhere above the address of any of its buffers. If a buffer overrun occurs, it is likely that the return address gets overwritten. An attacker could use this to replace the return address with a pointer to another user-controlled buffer, leading to an Arbitrary Code Execution (ACE) attack.

Here’s an example of simple vulnerable code, where a function reads a username into a buffer and maybe does something with it, perhaps a check.

#include <stdio.h>

void login() {
    char username[256];
    // Vulnerable on purpose: %s has no width, so scanf does not limit the input
    scanf("%s", username);
    // Do something else
}

Note that in this case scanf does not check how long the buffer is, which is part of the reason it’s not recommended for production software. That aside, the problem is that if the user inputs more than 255 characters (the buffer also has to hold the terminating zero byte), scanf writes past the end of the buffer, and a long enough input can overwrite the return address.

So how do we prevent that?

This is where compilers help: they insert code into your functions that detects buffer overruns. The idea is simple: a specific value is stored on the stack right below the return address, and at the end of the function that value is checked again. If the value is unchanged, then an overrun coming from a buffer below it did not reach the return address, so it’s unlikely that the return address was overwritten. If the value was overwritten, a buffer overrun did occur and it’s likely that the return address got overwritten as well.

This is the idea, but the implementation does a bit more than that. Let’s compile our code and look at the disassembly.

$ clang main.c -fstack-protector -c
$ dumpbin /disasm main.o

Relevant output:

I have removed the parts of the output that aren’t important and color-coded the sections of assembly. The grey part corresponds to the actual code we have written, which calls the scanf function.

The blue and red parts are the function prologue and epilogue respectively. The prologue saves the pointer to the return address into the rbp register and allocates stack space for all the local variables and buffers. The epilogue frees that space (restores the stack pointer) and returns from the function. As you can see from the first instruction, this function allocates 0x128 (296) bytes of stack space.

The green part is what puts that special value on the stack. Note the following part:

mov   qword ptr[rsp+120h], rax

This stores the value of the rax register on the stack at offset 0x120 (288) from rsp. That is 8 bytes below the return address, i.e. right below it.

And finally, the yellow part is what retrieves the value at rsp+120h and checks whether it’s unchanged. To perform the check, the code calls the __security_check_cookie function, which is part of the CRT.

The implementation of security cookies

If you think that the special value that’s being pushed to stack is the security cookie you will be half right, half wrong. In fact there’s still a bit more to it than just that.

The security cookie is a value that’s declared by the CRT library. At the start of the program the function __security_init_cookie() initializes the value of __security_cookie with a randomly generated value. This value is unchanging throughout the duration of the program execution.

After the function allocates stack space, the security cookie is XOR’d with the value in rbp, which the prologue set to point at the return address, and the result is the special value that gets stored on the stack. With this we can make sense of the three assembly lines after the prologue:

mov  rax, [__security_cookie] ; rax is now the cookie
xor  rax, rbp                 ; rax is now (cookie ^ rbp)
mov  [rsp+120h], rax          ; store rax 8 bytes below the return address

The XOR operation has one interesting property that is important to us: XOR-ing with the same operand twice gives back the original value. If we XOR the cookie with some value x and store the result, XOR-ing the stored result with x again returns the cookie (similarly, XOR-ing it with the cookie returns x).

This is exactly what the code does before calling __security_check_cookie:

mov  rcx, [rsp+120h]         ; load the stored value
xor  rcx, rbp                ; rcx is now supposed to be the cookie
call __security_check_cookie ; check whether rcx is the cookie

The function __security_check_cookie has one parameter, which is the retrieved cookie. If the retrieved value is equal to __security_cookie then the check passes and no buffer overflow happened. Otherwise the check fails and that function proceeds to handle the situation.

On MSVC the function does the following. If the check fails (the value got changed), the int 0x29 instruction is executed. This is the behavior of the __fastfail intrinsic, which terminates the program as quickly as possible with minimum overhead. If you’re running under a debugger, you will see exception 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN, which can be roughly translated to English as “Fatal program exit requested”) pop up.

The CRT code for security cookie for Windows on x86-64 roughly looks like this:

#include <stdint.h>
#include <stdlib.h>

// Placeholder for whatever random source the CRT actually uses
uint64_t rand64(void);

uint64_t __security_cookie;

void __security_init_cookie(void) {
    // maybe use a cooler random number generator
    __security_cookie = rand64();
}

void __security_check_cookie(uint64_t retrieved) {
    if (__security_cookie != retrieved) {
        // maybe use other means of exiting the program
        abort();
    }
}

Note that the function __security_init_cookie is called by mainCRTStartup or another entry point.

In short, the mechanism can be roughly described as:

  • On initialization we generate a random cookie

  • On each function call we take the cookie, flip certain bits of it (XOR it with the value in rbp) and save the result to the stack

  • Before the function returns we load the value from the stack, flip the same bits again and check whether the result is equal to the cookie

  • If the check fails (they were not equal), then the program terminates and maybe some information is printed to the error stream.

The cookie is random on every run. The reason is that, even if the chance is slim, an overrun can overwrite the stored value with exactly the value that was already there. In that case the check passes and there are no observable consequences despite the presence of a bug. With a random cookie, even if that happens once, running the program again will very likely detect the error.

Controlling security checks

The burden of implementing the security check function lies on the runtime, while the burden of generating the checks lies on the compiler. Both MSVC and clang have options that control the generation of security checks. With clang the checks may be off by default, depending on the target, and you can enable them with the -fstack-protector flag. On MSVC the checks are enabled by default and you can disable them with the /GS- (disable buffer security checks) flag.

On MSVC you can disable the security check on a per-function basis using __declspec(safebuffers).

Conclusion

If you’re writing an application without CRT dependencies, you can implement your own security cookie handler instead of disabling the security checks.

Sometimes you might want to disable the security cookies on functions that are executed often to gain some performance improvement in release builds.

Knowing how security cookies work can help you read the assembly code of functions without getting stuck on the unimportant first and last few lines.

Simply put, there’s a lot that this knowledge gives you, as does any other knowledge about how systems are implemented and how they work.