Reporting Crashes in IMVU: Call Stacks and Minidumps

So far, we’ve implemented reporting for Python exceptions that bubble
out of the main loop, C++ exceptions that bubble into Python (and then
out of the main loop), and structured exceptions that bubble into
Python (and then out of the main loop). This is a fairly
comprehensive set of failure conditions, but there’s still a big piece
missing from our reporting.

Imagine that you implement this error reporting and have customers try
the new version of your software. You’ll soon have a collection of
crash reports, and one thing will stand out clearly. Without the
context in which crashes happened (call stacks, variable values,
perhaps log files), it’s very hard to determine their cause(s). And
without determining their cause(s), it’s very hard to fix them.

Reporting log files is easy enough. Just attach them to the error
report. You may need to deal with privacy concerns or limit the size
of the log files that get uploaded, but those are straightforward
problems.

Because Python has batteries included, grabbing the call stack from a
Python exception is trivial. Just take a quick look at the traceback
module.

Structured exceptions are a little harder. The structure of a call
stack on x86 is machine- and sometimes compiler-dependent.
Fortunately, Microsoft provides an API to dump the relevant process
state to a file that can be opened in Visual Studio or WinDbg,
which will let you view the stack trace and select other data. These
files are called minidumps, and they’re pretty small. Just call
MiniDumpWriteDump with the context of the exception and submit the
generated file with your crash report.
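
As a rough illustration of the MiniDumpWriteDump call described above, here is a minimal sketch. The function name writeMiniDumpSketch, the flag kHaveMiniDump, the output path parameter, and the choice of MiniDumpNormal as the dump type are all assumptions made for this example, not details from the IMVU client. On compilers other than MSVC the function is a stub that returns false so the sketch still compiles.

```cpp
#ifdef _MSC_VER
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")
constexpr bool kHaveMiniDump = true;
#else
struct EXCEPTION_POINTERS;  // placeholder type so the sketch compiles elsewhere
constexpr bool kHaveMiniDump = false;
#endif

// Hypothetical helper: writes a minidump of the current process to `path`.
// `ep` may be null, in which case no exception context is recorded.
bool writeMiniDumpSketch(EXCEPTION_POINTERS* ep, const char* path) {
#ifdef _MSC_VER
    HANDLE file = CreateFileA(path, GENERIC_WRITE, 0, NULL,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) {
        return false;
    }

    // Tell the dump writer which thread faulted and where.
    MINIDUMP_EXCEPTION_INFORMATION info;
    info.ThreadId = GetCurrentThreadId();
    info.ExceptionPointers = ep;
    info.ClientPointers = FALSE;

    BOOL ok = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(),
                                file, MiniDumpNormal,
                                ep ? &info : NULL, NULL, NULL);
    CloseHandle(file);
    return ok != FALSE;
#else
    (void)ep;
    (void)path;
    return false;  // minidumps are a Windows facility
#endif
}
```

In a structured exception filter, the first argument would typically be the result of GetExceptionInformation(), and the resulting file is what you would attach to the crash report.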

Grabbing a call stack from C++ exceptions is even harder, and maybe
not desired. If you regularly use C++ exceptions for communicating
errors from C++ to Python, it’s probably too expensive to grab a call
stack or write a minidump every single time. However, if you want to
do it anyway, here’s one way.

C++ exceptions are implemented on top of the Windows kernel’s
structured exception machinery. Using the try and
catch statements in your C++ code causes the compiler to
generate SEH code behind the scenes. However, by the time your C++
catch clauses run, the stack has already been unwound. Remember
that SEH has three passes: first it runs filter expressions until it
finds one that can handle the exception; then it unwinds the stack
(destroying any objects allocated on the stack); finally it runs the
actual exception handler. Your C++ exception handler runs as the last stage,
which means the stack has already been unwound, which means you can’t
get an accurate call stack from the exception handler. However, we
can use SEH to grab a call stack at the point where the exception was
thrown, before we handle it…
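
The ordering problem can be seen in portable C++ without any SEH at all. The sketch below (all names are made up for this example) records events in a list: the object in the throwing frame is destroyed before the catch clause runs, so by then the frame we would want to inspect is gone. It only shows the unwind and handler stages; the filter pass has no equivalent in standard C++.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records the order in which things happen so it can be inspected afterwards.
static std::vector<std::string> g_events;

struct StackObject {
    ~StackObject() { g_events.push_back("destructor"); }
};

void throwFromNestedFrame() {
    StackObject local;  // lives in the frame that throws
    g_events.push_back("throw");
    throw std::runtime_error("hi");
}

// Runs the experiment and returns the recorded event order.
std::vector<std::string> runUnwindDemo() {
    g_events.clear();
    try {
        throwFromNestedFrame();
    } catch (const std::exception&) {
        // The throwing frame has already been unwound at this point.
        g_events.push_back("handler");
    }
    return g_events;
}
```

Calling runUnwindDemo() yields the events in the order throw, destructor, handler.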

First, let’s determine the SEH exception code of C++ exceptions
(WARNING, this code is compiler-dependent):

int main() {
    DWORD code;
    __try {
        throw std::exception();
    }
    // The filter records the exception code, then asks to run the handler.
    __except (code = GetExceptionCode(), EXCEPTION_EXECUTE_HANDLER) {
        printf("%X\n", code);
    }
}

Once we have that, we can write our exception-catching function like
this:

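// CPP_EXCEPTION_CODE is the value printed by the program above
// (0xE06D7363 with MSVC).

void throw_cpp_exception() {
    throw std::runtime_error("hi");
}

bool writeMiniDump(const EXCEPTION_POINTERS* ep) {
    // ...
    return true;
}

void catch_seh_exception() {
    __try {
        throw_cpp_exception();
    }
    __except (
        // Write the dump during the filter pass, before the stack unwinds,
        // then decline the exception so the C++ handler in main() runs.
        (CPP_EXCEPTION_CODE == GetExceptionCode()) && writeMiniDump(GetExceptionInformation()),
        EXCEPTION_CONTINUE_SEARCH
    ) {
    }
}

int main() {
    try {
        catch_seh_exception();
    }
    catch (const std::exception& e) {
        printf("%s\n", e.what());
    }
}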

Now we’ve got call stacks and program state for C++, SEH, and Python
exceptions, which makes fixing reported crashes dramatically easier.

Next time I’ll go into more detail about how C++ stack traces work,
and we’ll see if we can grab them more efficiently.

Reporting Crashes in IMVU: Structured Exceptions

Previously, we discussed the implementation of automated reporting of
unhandled C++ exceptions. However, if you’ve ever programmed in C++,
you know that C++ exceptions are not the only way your code can fail.
In fact, the most common failures probably aren’t C++ exceptions at
all. You know what I’m referring to: the dreaded access violation
(sometimes called segmentation fault).

Access Violation

How do we detect and report access violations? First, let’s talk
about what an access violation actually is.

Your processor has a mechanism for detecting loads and stores from
invalid memory addresses. When this happens, it raises an interrupt,
which Windows exposes to the program via Structured Exception Handling
(SEH). Matt Pietrek has written an excellent article on how SEH
works, including a description of C++ exceptions implemented
on top of SEH. The gist is that there is a linked list of stack
frames that can possibly handle the exception. When an exception
occurs, that list is walked, and if an entry claims it can handle it,
it does. Otherwise, if no entry can handle the exception, the program
is halted and the familiar crash dialog box is displayed to the user.
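
The walk over the handler list can be modeled in a few lines of portable C++. This is only a toy model of the idea, not how Windows implements it: ModelFrame, walkHandlerChain, and the use of an empty string to stand for "nobody handled it, show the crash dialog" are all invented for this example.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one entry in the handler list: a label plus a
// predicate that says whether this frame wants the given exception code.
struct ModelFrame {
    std::string name;
    std::function<bool(unsigned long)> canHandle;
};

// Walks the chain from the innermost frame (index 0) outward and returns
// the name of the first frame that claims the exception. An empty string
// models the unhandled case, where the process would be halted.
std::string walkHandlerChain(const std::vector<ModelFrame>& chain,
                             unsigned long code) {
    for (const ModelFrame& frame : chain) {
        if (frame.canHandle(code)) {
            return frame.name;
        }
    }
    return "";
}
```

With an inner frame that only accepts division by zero and an outer frame that only accepts access violations, an access violation skips the inner frame and is claimed by the outer one, while a stack overflow falls off the end of the list.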

OK, so access violations can be detected with SEH. In fact, with the
same mechanism, we can detect all other types of structured
exceptions, including division by zero and stack overflow. What does
the code look like? It’s approximately:

bool handle_exception_impl_seh(function f) {
    __try {
        // This is the previously-described C++ exception handler.
        // For various reasons, they need to be in different functions.
        // C++ exceptions are implemented in terms of SEH, so the C++
        // exception handling must be deeper in the call stack than
        // the structured exception handling.
        return handle_exception_impl_cpp(f);
    }
    // catch all structured exceptions here
    __except (EXCEPTION_EXECUTE_HANDLER) {
        PyErr_SetString(PyExc_RuntimeError, "Structured exception in C++ function");
        return true; // an error occurred
    }
}

Note the __try and __except keywords. This means we’re using
structured exception handling, not C++ exception handling. The filter
expression in the __except statement evaluates to
EXCEPTION_EXECUTE_HANDLER, indicating that we always want to handle
structured exceptions. From the filter expression, you can optionally
use the GetExceptionCode and GetExceptionInformation intrinsics to
access information about the actual error.
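
One use for the value returned by GetExceptionCode is to put a readable name in the error report. The helper below is a sketch with an invented name, describeExceptionCode, and it covers only the failure types mentioned in this series. The numeric values are the standard Windows codes for those exceptions, plus the code MSVC uses for C++ exceptions; they are written as literals so the function compiles without windows.h.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps a structured exception code to a short label.
std::string describeExceptionCode(std::uint32_t code) {
    switch (code) {
        case 0xC0000005u: return "access violation";
        case 0xC0000094u: return "integer division by zero";
        case 0xC00000FDu: return "stack overflow";
        case 0xE06D7363u: return "C++ exception (MSVC)";
        default:          return "unknown structured exception";
    }
}
```

A filter expression could call describeExceptionCode(GetExceptionCode()) and stash the result for the handler to include in the Python exception message.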

Now, if you write some code like:

Object* o = 0;
o->method(); // oops!

the error will be converted to a Python exception and reported
with our existing mechanism. Good enough for now! However, there are
real problems with this approach. Can you think of them?

Soon, I’ll show the full implementation of the structured
exception handler.