Reporting Crashes in IMVU: Who threw that C++ exception?
It's not often that I get to write about recent work. Most of the techniques in this series were implemented at IMVU years ago. A few weeks ago, however, a common C++ exception (tr1::bad_weak_ptr) started intermittently causing crashes in the wild. This exception can be thrown in a variety of circumstances, so we had no clue which code was problematic.
We could have modified tr1::bad_weak_ptr so its constructor fetched a CallStack and returned it from tr1::bad_weak_ptr::what(), but fetching a CallStack is not terribly cheap, especially in such a frequently-thrown-and-caught exception. Ideally, we'd only grab a stack after we've determined it's a crash (in the top-level crash handler).
Allow me to illustrate:
void main_function(/*arguments*/) {
    try {
        try {
            // We don't want to grab the call stack here, because
            // we'll catch the exception soon.
            this_could_fail(/*arguments*/);
        }
        catch (const std::exception& e) {
            // Yup, exception is fine. Just swallow and
            // do something else.
            fallback_algorithm(/*arguments*/);
        }
    }
    catch (const std::exception& e) {
        // Oh no! fallback_algorithm() failed.
        // Grab a stack trace now.
        report_crash(CallStack::here());
    }
}
Almost! Unfortunately, the call stack generated in the catch clause doesn't contain fallback_algorithm. It starts with main_function, because the stack has already been unwound by the time the catch clause runs.
Remember the structure of the stack:
We can use the ebp register, which points to the current stack frame, to walk and record the current call stack. [ebp+4] is the caller's address, [[ebp]+4] is the caller's caller, [[[ebp]]+4] is the caller's caller's caller, and so on.
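To make the walk concrete, here is a minimal sketch that follows the same chain over a toy model of stack memory instead of the real registers. Everything in it is an assumption for illustration: the names Address and walkFrames are hypothetical, memory is modeled as one slot per 32-bit word (so [ebp+4] becomes slot ebp + 1), and a saved ebp of 0 marks the outermost frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (an assumption, not real stack access): one slot per 32-bit word.
// Slot f holds the caller's saved ebp; slot f + 1 holds the return address.
using Address = std::size_t;

// Follows the saved-ebp chain starting at `ebp` and collects return addresses,
// innermost caller first. A saved ebp of 0 ends the chain.
std::vector<Address> walkFrames(const std::vector<Address>& memory, Address ebp) {
    assert(ebp < memory.size());
    std::vector<Address> returnAddresses;
    while (ebp != 0 && ebp < memory.size() && ebp + 1 < memory.size()) {
        returnAddresses.push_back(memory[ebp + 1]); // [ebp+4] on real x86
        const Address callerFrame = memory[ebp];    // [ebp] on real x86
        if (callerFrame != 0 && callerFrame <= ebp) {
            break; // callers live at higher addresses; anything else is corrupt
        }
        ebp = callerFrame;
    }
    return returnAddresses;
}
```

The check that each caller frame sits at a strictly higher address is what keeps a corrupt chain from looping forever; a real stack walker would want a similar guard.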
What can we do with this information? Slava Oks at Microsoft gives the clues we need. When you type throw MyException(), a temporary MyException object is constructed at the bottom of the stack and passed into the catch clause by reference or by value (as a copy deeper on the stack).
Before the catch clause runs, objects on the stack between the thrower and the catcher are destructed, and ebp is pointed at the catcher's stack frame (so the catch clause can access parameters and local variables).
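The ordering described here is easy to observe from portable C++. The sketch below is only an illustration with hypothetical names (Tracer, thrower, unwindOrder): it records when a local object in the throwing function is destroyed relative to when the catch body starts.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Records its own destruction so the unwinding order becomes observable.
struct Tracer {
    std::vector<std::string>& events;
    explicit Tracer(std::vector<std::string>& e) : events(e) {}
    ~Tracer() { events.push_back("destroy local in thrower"); }
};

void thrower(std::vector<std::string>& events) {
    Tracer local(events);      // lives in the thrower's stack frame
    events.push_back("throw");
    throw std::runtime_error("boom");
}

// Returns the events in the order they happened.
std::vector<std::string> unwindOrder() {
    std::vector<std::string> events;
    try {
        thrower(events);
    } catch (const std::exception&) {
        events.push_back("enter catch");
    }
    assert(events.size() == 3); // sanity check: one entry per step
    return events;
}
```

The recorded order is "throw", "destroy local in thrower", "enter catch": by the time the handler body runs, the thrower's locals are already gone, which is why a stack captured there no longer lists the thrower.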
From within the outer catch block, here is the stack, ebp, and esp:
Notice that every time an exception is caught, the linked list of stack frames is truncated: ebp is reset to the stack frame of the catcher, destroying our link to the thrower's stack.
But there's useful information between ebp and esp! We just need to search for it. We can find who threw the exception with this simple algorithm:
For every possible pointer between ebp and esp, find the deepest pointer p, where p might be a frame pointer. (That is, where walking p eventually leads to ebp.)
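As a rough sketch of that search (not IMVU's actual code), the version below runs over a toy model of stack memory rather than the real stack. The assumptions: memory is one slot per 32-bit word, a frame's slot holds its caller's saved ebp, and "deepest" means the lowest address, since the x86 stack grows downward. The names Address, leadsTo, and findDeepestFrameSketch are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (an assumption): one slot per 32-bit word; slot f holds the
// saved ebp of the frame's caller.
using Address = std::size_t;

// True if following saved-ebp links from `p` eventually lands exactly on `target`.
bool leadsTo(const std::vector<Address>& memory, Address p, Address target) {
    while (p < target) {
        if (p >= memory.size()) return false;
        const Address next = memory[p];
        if (next <= p) return false; // links must move strictly toward older frames
        p = next;
    }
    return p == target;
}

// Scans every slot between esp and ebp, lowest address first, and returns the
// first one whose chain reaches ebp. Falls back to ebp if none qualifies.
Address findDeepestFrameSketch(const std::vector<Address>& memory,
                               Address ebp, Address esp) {
    assert(esp <= ebp && ebp < memory.size());
    for (Address p = esp; p < ebp; ++p) {
        if (leadsTo(memory, p, ebp)) return p;
    }
    return ebp;
}
```

Note the "might be" in the algorithm: any stale word that happens to chain up to ebp will be accepted, so the result is a heuristic. For crash reports that trade-off seems acceptable, since a slightly wrong stack is still more useful than a truncated one.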
Or you can just use our implementation.
With this in mind, let's rewrite our example's error handling:
void main_function(/*arguments*/) {
    try {
        try {
            this_could_fail(/*arguments*/);
        }
        catch (const std::exception& e) {
            // that's okay, just swallow and
            // do something else.
            fallback_algorithm(/*arguments*/);
        }
    }
    catch (const std::exception& e) {
        // oh no! fallback_algorithm() failed.
        // grab a stack trace - including thrower!
        Context ctx;
        getCurrentContext(ctx);
        ctx.ebp = findDeepestFrame(ctx.ebp, ctx.esp);
        report_crash(CallStack(ctx));
    }
}
Bingo, fallback_algorithm appears in the stack:
main_function
fallback_algorithm
__CxxThrowException@8
_KiUserExceptionDispatcher@8
ExecuteHandler@20
ExecuteHandler2@20
___CxxFrameHandler
___InternalCxxFrameHandler
___CxxExceptionFilter
___CxxExceptionFilter
?_is_exception_typeof@@YAHABVtype_info@@PAU_EXCEPTION_POINTERS@@@Z
?_CallCatchBlock2@@YAPAXPAUEHRegistrationNode@@PBU_s_FuncInfo@@PAXHK@Z
Now we'll have no problems finding the source of C++ exceptions!
See discussion on Hacker News (2009) and /r/programming.
Nice article, but are CallStack and Context Microsoft-specific classes? I've never run across them in the standard library.
word to the wise, a good c++ style guide (googles)
Exceptions
- We do not use C++ exceptions.
Hi twir,
CallStack and Context are part of IMVU's open source crash-reporting code. You can download them at http://imvu.svn.sourceforge.net/viewvc/imvu/imvuopensource/CallStack/
Let me know if you have questions about them!
sulfide: The problem with that convention is that it prevents you from using the standard C++ library or boost, both of which save a great deal of time when developing a project and also use exceptions occasionally to signal failure.
Lol, what a kludge :D Once upon a time there was the proposal that exceptions would not unwind the stack by default, but instead leave that decision to the exception handler (like in Common Lisp and a few other sane languages). Fortunately for all, God Almighty Bjarne Stroustrup and his fellows on the C++ committee decided that such a feature would be too hard to use, and after all, who needs it? Lol again.
This is certainly a neat way of solving a pretty tricky problem. It just reinforces why I never, ever, want to use C++ for anything again. I want to focus on solving the kind of problems my clients need solved, not kludging around the behaviour of a brain-dead language.
We agree, which is why most of IMVU is written in Python and JavaScript. :) That said, there are problems better left to C++, such as 3D graphics, some high-performance I/O, and integration with other libraries (unless you trust ctypes a lot more than I do...) I feel like C++ is rarely the right language in which to start a project, but it really is the right tool for a bunch of problems...