Reporting Crashes in IMVU: C++ Call Stacks
Last time, we talked about including contextual information to help us actually fix crashes that happen in the field. Minidumps are a great way to easily save a snapshot of the most important parts of a running (or crashed) process, but it's often useful to understand the low-level mechanics of a C++ call stack (on x86). Given some basic principles about function calls, we will derive the implementation of code to walk a call stack.
C++ function call stack entries are stored on the x86 stack, which grows downward in memory. That is, pushing on the stack subtracts from the stack pointer. The ESP register points to the most-recently-written item on the stack; thus, push eax is equivalent to:
    sub esp, 4
    mov [esp], eax
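If assembly isn't your thing, the same idea can be modeled in plain C++. This is only a sketch: SimulatedStack and its members are hypothetical names, the "stack" is just a byte buffer, and esp is an offset into it rather than a real register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of a 32-bit stack: a byte buffer plus an offset that
// plays the role of ESP. This is not real machine state.
struct SimulatedStack {
    std::vector<std::uint8_t> memory;
    std::uint32_t esp;  // offset of the most recently written item

    // The stack starts empty, with esp one past the highest address.
    explicit SimulatedStack(std::uint32_t size) : memory(size), esp(size) {}

    // push value  ==  sub esp, 4 ; mov [esp], value
    void push(std::uint32_t value) {
        assert(esp >= 4 && "simulated stack overflow");
        esp -= 4;
        std::memcpy(&memory[esp], &value, sizeof value);
    }

    // pop  ==  mov result, [esp] ; add esp, 4
    std::uint32_t pop() {
        assert(esp + 4 <= memory.size() && "simulated stack underflow");
        std::uint32_t value;
        std::memcpy(&value, &memory[esp], sizeof value);
        esp += 4;
        return value;
    }
};
```

Pushing two values onto a 16-byte SimulatedStack moves esp from 16 to 12 and then to 8, which is the "grows downward" behavior described above.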
Let's say we're calling a function:
int __stdcall foo(int x, int y)
The __stdcall calling convention pushes arguments onto the stack from right to left and returns the result in the EAX register, so calling foo(1, 2) generates this code:
    push 2
    push 1
    call foo
    ; result in eax
If you aren't familiar with assembly, I know this is a lot to absorb, but bear with me; we're almost there. We haven't seen the call instruction before. It pushes the return address (the address of the instruction immediately following the call, which is the value EIP holds at that point) onto the stack and then jumps to the target function. If we didn't store the instruction pointer, the called function would not know where to return when it was done.
The final piece of information we need to construct a C++ call stack is that functions live in memory, functions have names, and thus sections of memory have names. If we can get access to a mapping of memory addresses to function names (say, with the /MAP linker option), and we can read instruction pointers up the call stack, we can generate a symbolic stack trace.
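Here's a rough sketch of the lookup half of that idea. It assumes you have already parsed the MAP file (not shown) into a table of function start addresses; SymbolTable and symbolize are hypothetical names, not part of any Microsoft tool.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol table: function start address -> function name,
// for example as parsed from a linker MAP file. Parsing is not shown.
using SymbolTable = std::map<std::uint32_t, std::string>;

// Returns the name of the function with the greatest start address that
// does not exceed `address`, or "<unknown>" if the address precedes every
// symbol. Function sizes are not tracked, so an address past the end of
// the last function is still attributed to that function.
std::string symbolize(const SymbolTable& symbols, std::uint32_t address) {
    auto it = symbols.upper_bound(address);  // first symbol starting after address
    if (it == symbols.begin()) {
        return "<unknown>";
    }
    --it;  // last symbol starting at or before address
    return it->second;
}
```

A return address almost never equals a function's start address, since it points into the middle of the caller. That is why the lookup takes the nearest symbol at or below the address instead of requiring an exact match.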
How do we read the instruction pointers up the call stack? Unfortunately, just knowing the return address from the current function is not enough. How do you know the location of the caller's caller? Without extra information, you don't. Fortunately, most functions have that information in the form of a function prologue:
    push ebp
    mov ebp, esp
and epilogue:
    mov esp, ebp
    pop ebp
These bits of code appear at the beginning and end of most functions (compilers can leave them out, for example with MSVC's /Oy frame-pointer omission), allowing you to use the EBP register as the "current stack frame". In a function that has them, arguments are accessed at positive offsets from EBP, and locals at negative offsets:
    ; int foo(int x, int y)
    ; ...
    [EBP+12] = y argument
    [EBP+8]  = x argument
    [EBP+4]  = return address (set by call instruction)
    [EBP]    = previous stack frame
    [EBP-4]  = local variable 1
    [EBP-8]  = local variable 2
    ; ...
Look! For any stack frame EBP, the caller's address is at [EBP+4] and the previous stack frame is at [EBP]. By dereferencing EBP, we can walk the call stack, all the way to the top!
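    #include <windows.h>
    #include <vector>

    // Layout of a frame as set up by the standard prologue.
    struct stack_frame {
        stack_frame* previous;         // [EBP]   = caller's stack frame
        unsigned long return_address;  // [EBP+4] = return address
    };

    // 32-bit x86 with MSVC only: relies on inline assembly and EBP frames.
    std::vector<unsigned long> get_call_stack() {
        std::vector<unsigned long> call_stack;

        stack_frame* current_frame;
        __asm mov current_frame, ebp

        // Follow the chain of saved EBP values until a frame is unreadable.
        while (!IsBadReadPtr(current_frame, sizeof(stack_frame))) {
            call_stack.push_back(current_frame->return_address);
            current_frame = current_frame->previous;
        }
        return call_stack;
    }

    // Convert the array of addresses to names with the aforementioned MAP file.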
Yay, now we know how to grab a stack trace from any location in the code. This implementation is not robust, but the concepts are correct: functions have names, functions live in memory, and we can determine which memory addresses are on the call stack. Now that you know how to manually grab a call stack, let Microsoft do the heavy lifting with the StackWalk64 function.
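To give a feel for what "not robust" means here, this sketch runs the same walk over simulated memory so it can be tried on any machine. Everything in it is an assumption made for illustration: walk_frames is a hypothetical name, the "stack" is a vector with one element per 4-byte slot, and the two sanity checks (frames must move toward higher addresses, and the walk is capped at max_depth) are just examples of the guards a real walker would want when the frame chain is corrupt.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for 32-bit stack memory: one element per 4-byte
// slot. A frame at slot i stores the caller's frame slot at stack[i]
// (like [EBP]) and the return address at stack[i + 1] (like [EBP+4]).
std::vector<std::uint32_t> walk_frames(const std::vector<std::uint32_t>& stack,
                                       std::size_t frame,
                                       std::size_t max_depth = 64) {
    std::vector<std::uint32_t> return_addresses;

    // The bounds check plays the role of the readability test; the depth
    // cap keeps a corrupt chain from producing an endless trace.
    while (return_addresses.size() < max_depth &&
           frame < stack.size() && stack.size() - frame >= 2) {
        return_addresses.push_back(stack[frame + 1]);

        const std::size_t previous = stack[frame];
        if (previous <= frame) {
            // Callers live at higher addresses because the stack grows
            // downward, so anything else means the end of the chain or a
            // corrupt (possibly cyclic) chain.
            break;
        }
        frame = previous;
    }
    return return_addresses;
}
```

With frames at slots 0, 3, and 6 of an eight-slot vector, walking from slot 0 collects the three stored return addresses in order and stops when the last frame's "previous" slot no longer points upward.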
Next time, we'll talk about setting up your very own Microsoft Symbol Server so you can grab accurate function names from every version of your software.