Visualizing Python Import Dependencies

by Chad Austin
May 2 09

In a large Python program such as IMVU, startup time is dominated by Python module imports. Take these warm timings:

$ time python -c 'None'

real    0m0.096s
user    0m0.077s
sys     0m0.061s

$ time python -c 'import urllib2'

real    0m0.314s
user    0m0.155s
sys     0m0.186s

That’s 300ms for a single basic dependency. Importing the entire IMVU client takes 1.5s warm and 20s cold on a typical user’s machine.

Loading

The IMVU client’s loading progress bar imports modules bottom-up; that is, leaf modules are imported before their parents. The root module is imported last.

Implementing a bottom-up import sequence requires generating a graph of dependencies between modules:

def get_dependencies(module_name):
    """\
    Takes a module name as input (e.g. 'xml.dom') and returns a set of
    (lhs, rhs) tuples where lhs and rhs are module names and lhs
    imports rhs.
    """
    
    # module_from_key is a dict from a module key, an arbitrary
    # object, to a module object.  While importing, we discover
    # dependencies before we have access to the actual module objects.
    
    # import_dependencies is a list of (lhs, rhs) tuples where lhs and
    # rhs are module keys, and module_from_key[lhs] imported
    # module_from_key[rhs].

    root_key = object()
    module_from_key = {root_key: __main__}
    import_dependencies = []
    stack = [root_key]

    def import_in_stack(key, name, globals, locals, fromlist, level):
        stack.append(key)
        try:
            return original_import(name, globals, locals, fromlist, level)
        finally:
            stack.pop()

    import __builtin__
    original_import = __builtin__.__import__

    def my_import(name, globals=globals(), locals=locals(), fromlist=[], level=-1):
        # fromlist is a whore.  Most of the complexity in this
        # function stems from fromlist's semantics.  See
        # http://docs.python.org/library/functions.html#__import__
        
        # If a module imports 'xml.dom', then the module depends on
        # both 'xml' and 'xml.dom' modules.
        dotted = name.split('.')
        for i in range(1, len(dotted)):
            my_import('.'.join(dotted[0:i]), globals, locals, [], level)

        module_key = object()
        parent_key = stack[-1]

        def add_dependency_from_parent(key, m):
            module_from_key[key] = m
            import_dependencies.append((parent_key, key))

        submodule = import_in_stack(module_key, name, globals, locals, ['__name__'], level)
        add_dependency_from_parent(module_key, submodule)

        for f in (fromlist or []):
            from_key = object()
            module = import_in_stack(from_key, name, globals, locals, [f], level)
            if f == '*':
                continue
            submodule = getattr(module, f)
            if isinstance(submodule, types.ModuleType):
                add_dependency_from_parent(from_key, submodule)

        return original_import(name, globals, locals, fromlist, level)

    # Import module_name with import hook enabled.
    original_modules = sys.modules.copy()
    __builtin__.__import__ = my_import
    try:
        my_import(module_name)
    finally:
        __builtin__.__import__ = original_import
        sys.modules.clear()
        sys.modules.update(original_modules)

    assert stack == [root_key], stack

    return sorted(set(
        (module_from_key[lhs].__name__, module_from_key[rhs].__name__)
        for lhs, rhs in import_dependencies
    ))

(You can view all of the code at SourceForge).

First, we install an __import__ hook that discovers import dependencies between modules, and convert those relationships into a directed graph:

xml.dom.minidom

Then, we merge cycles. If module A imports B, B imports C, and C imports A, then it doesn’t matter which you import first. Importing A is equivalent to importing B or C. After this step, we have a DAG:

xml.dom.minidom DAG

Finally, we can postorder traverse the DAG to determine a good import sequence and cost (approximated as the number of modules in the cycle) per import:

1 xml
3 xml.dom
1 copy_reg
1 types
1 copy
1 xml.dom.NodeFilter
1 xml.dom.xmlbuilder
1 xml.dom.minidom
1 __main__

Now let’s look at some less-trivial examples. urllib2:

urllib2

SimpleXMLRPCServer:

SimpleXMLRPCServer

imvu.task:

imvu.task

Final notes: Other people have solved this problem with bytecode scanning, but we wanted to know the actual modules imported for an accurate progress bar. A simpler __import__ hook could have calculate the correct import sequence, but I find a visual representation of module dependencies to have additional benefits.

Flushing the Windows Disk Cache

by Chad Austin
Apr 22 09

Occasionally, I want to test the performance of a program after a cold boot, or maybe after the computer has been idle for hours and the program has been paged out. For example, the IMVU client starts relatively quickly when the disk cache is warm, but at system boot, it can take quite a while for the login dialog to even appear. Iterating in these situations is a pain in the butt because you have to reboot or leave your computer idle for hours.

I’m sure there exists a program which will flush the disk caches and force programs out of memory and into the page file, but I can’t find it. So I wrote one.

First, a caveat: programs these days rarely handle out-of-memory situations, so running flushmem.exe might cause open applications to explode like popcorn. Buyer beware, etc.

After running flushmem.exe, you should find that your computer becomes painfully slow as applications are paged back into memory and the disk cache is refilled. Perfect. Now I can realistically simulate the experiences of our users.

You can download the program here or on the FlushMem page.

Implementation details: in Windows, each process has a 2 GB user mode address space limit by default. If physical memory + page file size is greater than 2 GB, flushmem spawns multiple processes. Each process allocates memory in 64 KiB chunks until it can’t anymore, and then writes to each page, forcing older pages out to the page file.

Reporting Crashes in IMVU: Who threw that C++ exception?

by Chad Austin
Apr 19 09

It’s not often that I get to write about recent work. Most of the techniques in this series were implemented at IMVU years ago. A few weeks ago, however, a common C++ exception (tr1::bad_weak_ptr) starting intermittently causing crashes in the wild. This exception can be thrown in a variety of circumstances, so we had no clue which code was problematic.

We could have modified tr1::bad_weak_ptr so its constructor fetched a CallStack and returned it from tr1::bad_weak_ptr::what(), but fetching a CallStack is not terribly cheap, especially in such a frequently-thrown-and-caught exception. Ideally, we’d only grab a stack after we’ve determined it’s a crash (in the top-level crash handler).

Allow me to illustrate:

void main_function(/*arguments*/) {
    try {
        try {
            // We don't want to grab the call stack here, because
            // we'll catch the exception soon.
            this_could_fail(/*arguments*/);
        }
        catch (const std::exception& e) {
            // Yup, exception is fine.  Just swallow and
            // do something else.
            fallback_algorithm(/*arguments*/);
        }
    }
    catch (const std::exception& e) {
        // Oh no! fallback_algorithm() failed.
        // Grab a stack trace now.
        report_crash(CallStack::here());
    }
}

Almost! Unfortunately, the call stack generated in the catch clause doesn’t contain fallback_algorithm. It starts with main_function, because the stack has already been unwound by the time the catch clause runs.

Remember the structure of the stack:

Example Stack

We can use the ebp register, which points to the current stack frame, to walk and record the current call stack. [ebp+4] is the caller’s address, [[ebp]+4] is the caller’s caller, [[[ebp]]+4] is the caller’s caller’s caller, and so on.

What can we do with this information? Slava Oks at Microsoft gives the clues we need. When you type throw MyException(), a temporary MyException object is constructed at the bottom of the stack and passed into the catch clause by reference or by value (as a copy deeper on the stack).

Before the catch clause runs, objects on the stack between the thrower and the catcher are destructed, and ebp is pointed at the catcher’s stack frame (so the catch clause can access parameters and local variables).

From within the outer catch block, here is the stack, ebp, and esp:

Stack From Catch Clause

Notice that, every time an exception is caught the linked list of stack frames is truncated. When an exception is caught, ebp is reset to the stack frame of the catcher, destroying our link to the thrower’s stack.

But there’s useful information between ebp and esp! We just need to search for it. We can find who threw the exception with this simple algorithm:

	For every possible pointer between ebp and esp,
	find the deepest pointer p,
	where p might be a frame pointer.
	(That is, where walking p eventually leads to ebp.)

Or you can just use our implementation.

With this in mind, let’s rewrite our example’s error handling:

void main_function(/*arguments*/) {
    try {
        try {
            this_could_fail(/*arguments*/);
        }
        catch (const std::exception& e) {
            // that's okay, just swallow and
            // do something else.
            fallback_algorithm(/*arguments*/);
        }
    }
    catch (const std::exception& e) {
        // oh no! fallback_algorithm() failed.
        // grab a stack trace - including thrower!
        Context ctx;
        getCurrentContext(ctx);
        ctx.ebp = findDeepestFrame(ctx.ebp, ctx.esp);
        report_crash(CallStack(ctx));
    }
}

Bingo, fallback_algorithm appears in the stack:

main_function
fallback_algorithm
__CxxThrowException@8
_KiUserExceptionDispatcher@8
ExecuteHandler@20
ExecuteHandler2@20
___CxxFrameHandler
___InternalCxxFrameHandler
___CxxExceptionFilter
___CxxExceptionFilter
?_is_exception_typeof@@YAHABVtype_info@@PAU_EXCEPTION_POINTERS@@@Z
?_CallCatchBlock2@@YAPAXPAUEHRegistrationNode@@PBU_s_FuncInfo@@PAXHK@Z

Now we’ll have no problems finding the source of C++ exceptions!

IMVU’s CallStack API Now Open Source!

by Chad Austin
Apr 15 09

I’m proud to announce that IMVU has open-sourced its C++ CallStack API! It’s available under the MIT license at our SourceForge project. You can view the code here.

CallStack is a simple API for recording and displaying C++ call stacks on 32-bit Windows. To display the call stack at the current location:

printf("%s\n", CallStack::here().asString().c_str());

To grab a CallStack from an arbitrary thread:

HANDLE other_thread_handle = ...;
CallStack other_thread(other_thread_handle);

From a structured exception:

Context ctx;
CallStack cs;
__try {
	// broken code
}
__except (
	ctx = *(GetExceptionInformation())->ContextRecord),
	cs.getFromContext(ctx),
	EXCEPTION_EXECUTE_HANDLER
) {
	// display cs.asString()
}

At first, the format of CallStack.asString() is a bit confusing, but with your symbol server it contains everything necessary to generate a symbolic call stack, including file names and line numbers.

Here is an example CallStack.asString() result:

PYTHON25.DLL#b57f5c3ff1b64eda861d97643831ce701!000266dc
boost_python.dll#507f2f0a5fd34e65af25e728d0be9ebb1!0000d4bf
_avatarwindow.pyd#5289bbd0ff9c4ceab5198308f99ef9271!0002f76a

The lines are formatted module_name#module_hash!offset. module_name is the name of the DLL or EXE in which the function lives. module_hash is a unique hash that identifies a build of a particular module. offset is the offset of the line of code in bytes from the start of the module. With this information, you can look up a function name and line number for each entry in a call stack.

Fortunately, we have a tool that automates this process: symbol_dump.py! Running it with the previous call stack on the clipboard produces this output:

PYTHON25.DLL#b57f5c3ff1b64eda861d97643831ce701!000266dc
	...t\python-2.5.1-src\objects\abstract.c (1860): PyObject_Call
boost_python.dll#507f2f0a5fd34e65af25e728d0be9ebb1!0000d4bf
	...0\libs\python\src\object\function.cpp ( 614): function_call
_avatarwindow.pyd#5289bbd0ff9c4ceab5198308f99ef9271!0002f76a
	...\boost\function\function_template.hpp ( 132): boost::detail::function::function_obj_invoker2<boost::_bi::bind_t<bool,boost::python::detail::translate_exception<IMVUError,void (__cdecl*)(IMVUError const &)>,boost::_bi::list3<boost::arg<1>,boost::arg<2>,boost::_bi::value<void (__cdecl*)(IMVUError const

That last function name is pretty epic (as are most Boost or C++ function names), but notice that the call stack has accurate file names and line numbers.

The astute reader might ask “Don’t minidumps contain stack traces too?” The answer is yes, but minidumps are often inconvenient. Consider the common case:

  1. Open crash report
  2. Download mini.dmp to the desktop
  3. Open mini.dmp in Visual Studio
  4. Press F11
  5. Open the call stack debug window if it’s not open

With CallStack, we can shorten that to

  1. Open crash report
  2. Copy the call stack
  3. Run symbol_dump.py

Also, for reasons I don’t understand, sometimes Visual Studio fails to produce an informative stack when CallStack succeeds.

CallStack is a handy tool for debugging crashes from the wild, and I’m happy that we were able to make it available.

IMVU Crash Reporting: Plugging the VC++ Runtime’s Escape Hatches

by Chad Austin
Apr 5 09

Modern software is built on top of piles of third-party components. The holy grail of reusable code has manifested in the form of an open source ecosystem. That said, in a startup, you rarely have time to audit the third-party components you use. These third-party components might be ridiculous enough to call abort() on error. It may sound scary, but with a fixed amount of work, you can turn these calls into reported structured exceptions.

Unfortunately, the Visual C++ Runtime provides several functions that abnormally terminate the process without running our crash handlers. As a bonus, they usually include user-incomprehensible dialog boxes. Let’s see:

  • abort()
  • “pure virtual” method calls
  • throwing an exception from a destructor during unwind
  • stack buffer overflow (with /GS enabled)
  • standard C++ library index/iterator error with checked iterators enabled

Since you can never prove that you’ve implemented crash reporting for every way a third-party component can bypass your crash reporting, I’m just going to cover the ones we’ve implemented:

abort()

abort()

Result of calling abort()

Turning abort() into a structured exception is pretty straightforward. A quick read of the CRT source shows that abort() runs SIGABRT’s installed signal handler. It’s easy enough to install a custom handler that raises a structured exception:

void __cdecl onSignalAbort(int code) {
    // It's possible that this signal handler will get called twice
    // in a single execution of the application.  (On multiple threads,
    // for example.)  Since raise() resets the signal handler, put it back.
    signal(SIGABRT, onSignalAbort);

    RaiseException(EXC_SIGNAL_ABORT, 0, 0, 0);
}

...

// at program start:
signal(SIGABRT, onSignalAbort);

“Pure Virtual” Method Calls

Pure virtual function call

Pure virtual function call

Ever see a program fail with that useless “pure virtual function call” error message? This happens when a base class’s constructor tries to call a pure virtual method implemented by a derived class. Since base class constructors run before derived class constructors, the compiler fills the vtable for the derived class with references to _purecall, a function normally defined by the CRT. _purecall() aborts the process, sidestepping our crash reporting. Code might better elucidate this situation:


struct Base;
void foo(Base* b);

struct Base {
    Base() {
        foo(this);
    }
    virtual void pure() = 0;
};
struct Derived : public Base {
    void pure() { }
};

void foo(Base* b) {
    b->pure();
}

Derived d; // boom

The fix is simple: just define a _purecall that shadows the CRT implementation:

int __cdecl _purecall() {
    RaiseException(EXC_PURE_CALL, 0, 0, 0);
    return 0;
}

Throwing an Exception from a Destructor During Unwind

C++ is aggressive about making sure you don’t throw an exception while another exception is in the air. If you do, its default behavior is to terminate your process. From MSDN: If a matching handler is still not found, or if an exception occurs while unwinding, but before the handler gets control, the predefined run-time function terminate is called. If an exception occurs after throwing the exception, but before the unwind begins, terminate is called.

To convert calls to terminate() (and unexpected(), for completeness) into structured exceptions, override the terminate handler with set_terminate (and set_unexpected):

void onTerminate() {
    RaiseException(EXC_TERMINATE, 0, 0, 0);
}
void onUnexpected() {
    RaiseException(EXC_UNEXPECTED, 0, 0, 0);
}

// at program start:
set_terminate(onTerminate);
set_unexpected(onUnexpected);

Standard C++ Library Index/Iterator Error with Checked Iterators Enabled

The IMVU client is compiled with _SECURE_SCL enabled. Increased reliability from knowing exactly where failures occur is more important than the very minor performance hit of validating all iterator accesses.

There are two ways to convert invalid iterator uses into reported exceptions. The easiest is compiling with _SECURE_SCL_THROWS=1. Otherwise, just install your own invalid_parameter handler with _set_invalid_parameter_handler.

Stack Buffer Overflow (with /GS enabled)

By default, Visual C++ generates code that detects and reports stack buffer overruns. This prevents a common class of application security holes. Unfortunately, the stock implementation of this feature does not allow you to install your own handler, which means you can’t report any buffer overruns.

Again, we can shadow a CRT function to handle these failures. From C:\Program Files\Microsoft Visual Studio 8\VC\crt\src\gs_report.c, copy the __report_gsfailure function into your application. (You did install the CRT source, didn’t you?) Instead of calling UnhandledExceptionFilter at the bottom of __report_gsfailure, call your own last-chance handler or write a minidump.

Testing Crash Reporting

Writing tests for the above reporting mechanisms is super fun. Take everything you’re told not to do and implement it. I recommend adding these crashes to your UI somewhere so you can directly observe what happens when they occur in your application. Here is our crash menu:

Crash Menu

Crash Menu

These types of failures are rare, but when we hit them, I’m glad we implemented this reporting.

GDC Notes: Comparison of XBLA, PSN, iPhone

by Chad Austin
Apr 4 09

On Wednesday of GDC I attended Braving the Stormy Waters of Xbox Live Arcade and PSN: Smaller is NOT Easier. Panel title aside, it was really a comparison of developing for XBox Live Arcade, PlayStation Network, and iPhone. First, a terminology note. I’m tempted to call these games “indie games”, but even companies as large as Capcom are creating games for these platforms. Thus, I’ll call them “small games”. Here are the lessons I took away:

It’s hard to market small games. Use twitter, facebook, or youtube to build a community and spread your game via word of mouth. Go to conferences to spread the word if you must. Viral is the way to go.

It’s okay to ship incomplete as long as people buy into your vision and you continue to release updates. iPhone community is good about rerating updated apps. They gave an example of a game that went from 500th rank to 1 after key updates, but I forgot what it was.

ON THE OTHER HAND: on XBox/PSN, updating is too expensive, because it initiates another round of compliance checks. iPhone looks to be the easiest platform with a 97% app acceptance rate.

XBox 360 allows simultaneous global release of a title. PSN doesn’t. On the other hand, PSN is starving for content, so they’ll be very helpful. But they’re less organized than Microsoft.

Don’t depend on a launch window: you can’t control it. If a high-profile AAA game (Fable 2, say) ships the same week as you, you won’t get any traffic.

Along those lines, digital distribution (DD) is very weak in the traditional retail game season: fall/christmas. Launch your DD games in spring. Maybe someone will figure out how to take advantage of DD at Christmas.

Several people commented that small games are hard. You need to produce a new title every few months, which is very tiring. No time for long, thoughtful showers. You’ll quickly learn which programmers are actually pulling their weight. I got the impression small projects require more talented team members.

No mention of WiiWare, Flash, web, or desktop games.

You Won’t Learn This in School: Disabling Kernel Functions in Your Process

by Chad Austin
Mar 30 09

Detecting and reporting unhandled exceptions with SetUnhandledExceptionFilter seemed logical, and, in fact, it worked… for a while. Eventually, we started to notice failures that should have been reported as a last-chance exception but weren’t. After much investigation, we discovered that both Direct3D and Flash were installing their own unhandled exception filters! Worse, they were fighting over it, installing their handlers several times per second! In practice, this meant our last-chance crash reports were rarely generated, convincing us our crash metrics were better than they were. (Bad, bad libraries!)

It’s pretty ridiculous that we had to solve this problem, but, as Avery Lee says, “Just because it is not your fault does not mean it is not your problem.”

The obvious solution is to join the fray, calling SetUnhandledExceptionFilter every frame, right? How about we try something a bit more reliable… I hate implementing solutions that have obvious flaws. Thus, we chose to disable (with code modification) the SetUnhandledExceptionFilter function immediately after installing our own handler. When Direct3D and Flash try to call it, their requests will be ignored, leaving our exception handler installed.

Code modification… isn’t that scary? With a bit of knowledge and defensive programming, it’s not that bad. In fact, I’ll show you the code up front:

// If this doesn't make sense, skip the code and come back!

void lockUnhandledExceptionFilter() {
    HMODULE kernel32 = LoadLibraryA("kernel32.dll");
    Assert(kernel32);

    if (FARPROC gpaSetUnhandledExceptionFilter = GetProcAddress(kernel32, "SetUnhandledExceptionFilter")) {
        unsigned char expected_code[] = {
            0x8B, 0xFF, // mov edi,edi
            0x55,       // push ebp
            0x8B, 0xEC, // mov ebp,esp
        };

        // only replace code we expect
        if (memcmp(expected_code, gpaSetUnhandledExceptionFilter, sizeof(expected_code)) == 0) {
            unsigned char new_code[] = {
                0x33, 0xC0,       // xor eax,eax
                0xC2, 0x04, 0x00, // ret 4
            };

            BOOST_STATIC_ASSERT(sizeof(expected_code) == sizeof(new_code));

            DWORD old_protect;
            if (VirtualProtect(gpaSetUnhandledExceptionFilter, sizeof(new_code), PAGE_EXECUTE_READWRITE, &old_protect)) {
                CopyMemory(gpaSetUnhandledExceptionFilter, new_code, sizeof(new_code));

                DWORD dummy;
                VirtualProtect(gpaSetUnhandledExceptionFilter, sizeof(new_code), old_protect, &dummy);

                FlushInstructionCache(GetCurrentProcess(), gpaSetUnhandledExceptionFilter, sizeof(new_code));
            }
        }
    }
    FreeLibrary(kernel32);
}

If that’s obvious to you, then great: We’re hiring!

Otherwise, here is an overview:

Use GetProcAddress to grab the real address of SetUnhandledExceptionFilter. (If you just type &SetUnhandledExceptionFilter you’ll get the relocatable import thunk, not the actual SetUnhandledExceptionFilter function.)

Most Windows functions begin with five bytes of prologue:

mov edi, edi ; 2 bytes for hotpatching support
push ebp     ; stack frame
mov ebp, esp ; stack frame (con't)

We want to replace those five bytes with return 0;. Remember that __stdcall functions return values in the eax register. We want to replace the above code with:

xor eax, eax ; eax = 0
ret 4        ; pops 4 bytes (arg) and returns

Also five bytes! How convenient! Before we replace the prologue, we verify that the first five bytes match our expectations. (If not, we can’t feel comfortable about the effects of the code replacement.) The VirtualProtect and FlushInstructionCache calls are standard fare for code modification.

After implementing this, it’s worth stepping through the assembly in a debugger to verify that SetUnhandledExceptionFilter no longer has any effect. (If you really enjoy writing unit tests, it’s definitely possible to unit test the desired behavior. I’ll leave that as an exercise for the reader.)

Finally, our last-chance exception reporting actually works!

Reporting Crashes in IMVU: Last-Chance Exceptions

by Chad Austin
Mar 29 09
So far, our crash reporting is looking pretty comprehensive. But what if there is a crash in the crash reporting itself? Or perhaps a crash on another thread (outside of any __try ... __except blocks)? Or, heaven forbid, we somehow cause Python itself to crash? In all of these situations, we can’t count on our Python crash reporting code to handle the error. There are a couple ways to report these failures, and IMVU chose to implement a last-chance exception handler with SetUnhandledExceptionFilter. This handler runs whenever a structured exception bubbles out of any thread in the process. Unfortunately, you can’t run arbitrary code in the handler – maybe your process’s heap is corrupted and further attempts to allocate will cause another access violation. By the way, the last-chance handler is your opportunity to run code just before the Windows “this program has performed an illegal operation” dialog. Access Violation In IMVU’s last-chance handler, we do what we can to save the state of the failure to disk. The next time the client starts, if failure data exists, the client submits it to the server. (Assuming the customer tries to run the client again.) Here is is the last-chance handler’s implementation:
LONG WINAPI handleLastChanceException(
	EXCEPTION_POINTERS* ExceptionInfo
) {
	DWORD error = writeMiniDump(
		ExceptionInfo);
	if (error) {
		CRASHLOG("Failed to write minidump");
	}

	// Try to record some additional system
	// info if we can.

	// Show the application error dialog.
	return EXCEPTION_CONTINUE_SEARCH;
}
Pretty easy! Again, try to restrict yourself to system calls in your last-chance handler. When we first implemented this, we found out that the IMVU client was crashing a ton and we didn’t even know! Were I to start over from scratch, I’d implement the broadest crash detection possible first, and then implement narrower, more-detailed detection as necessary. Next time I’ll talk about an unexpected shortcoming of this implementation. (Can you guess what it is?)

Reporting Crashes in IMVU: Creating Your Very Own Symbol Server

by Chad Austin
Mar 28 09

With minidumps or a bit of hand-rolled code, it’s pretty easy to report symbolic C++ stack traces whenever your application crashes. But reporting is just one side of the coin. Once you begin collecting crash reports, you’ll need a way to read them. As I mentioned before, you can generate some MAP files and look up your functions manually, but Microsoft provides some lesser-known tools that take care of this for you.

Why should I run my own symbol server?

  • Once you create a symbol server, you can easily debug a release build of your program from any computer with network access to the symbol server.
  • Existing tools such as WinDbg and Visual Studio play nicely with symbol servers.
  • Creating a symbol server is easy: it requires a single download and a directory on a network share.
  • Symbol servers can be cascaded. Microsoft runs a symbol server for their operating system DLLs, so entries from both your code and system DLLs will be displayed in a single call stack.

An example of a call stack containing IMVU code and Windows code.

What is a symbol server?

If you compile your program with the correct options, the compiler and linker will generate symbol files (PDBs). PDBs contain information required to parse stack traces and locate identifiers in your program. A PDB also contains the signature of the DLL or EXE associated with it. A symbol server is a directory of PDBs, organized by signature. Thus, given a DLL or EXE, you can find its PDB signature (the PdbSig70 and PdbAge fields in the corresponding debug PE header) and look up the PDB in the symbol server directory.

Creating the symbol server

Download and install the Debugging Tools for Windows. Make sure symstore.exe is in your path.

Create a directory on an accessible network share. Note the full path; you’ll need it later. Ours is \\hydra\devel\SymbolServer.

Every time you release your software, add the generated PDBs to the symbol server. Our release script runs this command:

symstore.exe add /s \\hydra\devel\SymbolServer /compress /r /f *.pdb /t IMVUClient

This command takes a while to run, but it simply searches for PDBs and indexes them in the SymbolServer directory by their signature so Visual Studio and WinDbg can find them later.

Using the symbol server

There are generally two ways to coerce your debugger to resolve call stacks with your symbol server. You can either set the _NT_SYMBOL_PATH environment variable or configure the debugger with the paths directly.

The syntax for _NT_SYMBOL_PATH is a bit wonky, but it should look something like this:

srv*c:\localsymbols*\\hydra\devel\SymbolServer*http://msdl.microsoft.com/download/symbols

_NT_SYMBOL_PATH behaves just like any other PATH: paths are searched in order until the appropriate PDB is found. To make future lookups faster, if a PDB is found on the network, it will be copied to the local path.

The last entry in the list is Microsoft’s public symbol server. This allows the debugger to find function names for Microsoft operating system DLLs.

Each debugger generally has its own mechanism for configuring symbol servers. Here’s Visual Studio 2005’s symbol path editor:

MSDev Symbol Path Configuration

To save time, you may prefer using _NT_SYMBOL_PATH over per-debugger configuration, since it will work with all debuggers.

I hope this information was useful. I certainly wish we’d discovered it years ago. Happy debugging!

Handling exceptions from XULRunner callbacks

by Chad Austin
Mar 22 09

(I was composing an e-mail to IMVU’s engineering team when I realized this information was generally applicable to anyone embedding XULRunner into their application. Hope it’s useful.)

XULRunner is written in a subset of C++ that we’ll call XPCOM. An embedded XULRunner window communicates back to the host application through XPCOM interfaces that the host implements. In IMVU, we generally use C++ exceptions to signal failure. On the other hand, XPCOM uses nsresult error codes. Specifically, XULRunner is not written to support C++ exceptions, nor is it compiled with them enabled. (Note that compiling with exceptions enabled is not sufficient to guarantee defined behavior when they’re thrown. You must use techniques like RAII to properly unwind and restore state if an exception is thrown.)

If XULRunner is calling into our code, and our code uses exceptions to signal failure, and throwing an exception through a XULRunner call stack results in undefined behavior, what do we do? This is the strategy I took:

In every method of every IMVU implementation of an XPCOM interface, I bracketed the function body with IMVU_BEGIN_DISALLOW_EXCEPTIONS_XPCOM and IMVU_END_DISALLOW_EXCEPTIONS_XPCOM. For example:

nsresult xpcom_method_runtime_error_with_error_info() {
    IMVU_BEGIN_DISALLOW_EXCEPTIONS_XPCOM {
        boost::throw_exception(std::runtime_error("error"));
    } IMVU_END_DISALLOW_EXCEPTIONS_XPCOM;
}

These two macros generate a try ... catch clause that handles every C++ exception thrown from the body, returning NS_ERROR_UNEXPECTED to the caller.

If the exception thrown is a Python error (boost::python::error_already_set), then the Python exception is pulled (PyErr_Fetch) and scheduled to be reraised (PyErr_Restore) in the next iteration through the IMVU client’s message loop.

If the exception thrown is a C++ exception, we’d like to take the same approach. However, C++0x has not shipped, so there’s no built-in mechanism for transferring exceptions across contexts. Thus, we take advantage of the boost::exception framework to copy and rethrow the exception from the main message loop. Unfortunately, you can’t just “throw X()”. You have to use boost::throw_exception, which enables the machinery for current_exception() and rethrow_exception(). To enforce this requirement, I have modified our C++ exception hierarchy so that you must use X::throw_(arguments) instead of throw X(arguments).

If the exception thrown is a C++ exception but not a subclass of std::exception, then we catch it with catch (...) or std::uncaught_exception() in a sentry object’s destructor, and raise a structured exception to at least indicate that this is occurring in the field.

For reference, here is the implementation:

void handlePythonError();
void handleStandardException(const std::exception& e);

#define IMVU_BEGIN_DISALLOW_EXCEPTIONS         \
    DisallowExceptionsSentry PP_UNIQUE_NAME(); \
    try

#define IMVU_END_DISALLOW_EXCEPTIONS(block)                             \
    catch (const boost::python::error_already_set&) {               \
        handlePythonError();                                            \
        block ;                                                         \
    }                                                                   \
    catch (const std::exception& e) {                               \
        handleStandardException(e);                                     \
        block ;                                                         \
    }

#define IMVU_BEGIN_DISALLOW_EXCEPTIONS_XPCOM IMVU_BEGIN_DISALLOW_EXCEPTIONS
#define IMVU_END_DISALLOW_EXCEPTIONS_XPCOM IMVU_END_DISALLOW_EXCEPTIONS({ return NS_ERROR_UNEXPECTED; })

#define IMVU_DISALLOW_EXCEPTIONS_XPCOM(block)                           \
    IMVU_BEGIN_DISALLOW_EXCEPTIONS_XPCOM {                              \
        block ;                                                         \
    } IMVU_END_DISALLOW_EXCEPTIONS_XPCOM

And the source file:

DisallowExceptionsSentry::~DisallowExceptionsSentry() {
    if (std::uncaught_exception()) {
        RaiseException(EXC_EXCEPTIONS_NOT_ALLOWED, 0, 0, 0);
    }
}

/*
 * On error handling and the GIL.  Gecko's event handlers run during
 * the message loop, which means the GIL is not held.  Calls back into
 * Python require that the GIL be reacquired.  If the Python call
 * fails, the GIL is released (while error_already_set is unwinding
 * the stack).  The GIL must be reacquired to grab the exception
 * information and marshal it to the main thread.
 *
 * However, PumpWaitingMessages releases the GIL too!  Thus, reraising
 * the error on the main thread requires GIL reacquisition.
 *
 * If another thread acquires the GIL and blocks on the main thread's
 * message pump, a deadlock will occur.  Thus, secondary threads
 * should never block on the main thread's message pump.
 */

void reraisePythonError(PyObject* type, PyObject* value, PyObject* traceback) {
    HoldPythonGIL PP_UNIQUE_NAME();
    PyErr_Restore(type, value, traceback);
    boost::python::throw_error_already_set();
}

void error_but_no_error() {
    throw std::runtime_error("error_already_set but no Python error?");
}

void handlePythonError() {
    PyObject* type;
    PyObject* value;
    PyObject* traceback;
    {
        HoldPythonGIL PP_UNIQUE_NAME();
        PyErr_Fetch(&type, &value, &traceback);
    }
    if (type) {
        boost::function<void()> fn(boost::bind(reraisePythonError, type, value, traceback));
        queueMainThreadJob(fn);
    } else {
        queueMainThreadJob(error_but_no_error);
    }
}

void rethrowStandardException(const std::string& s) {
    std::string prefix("Unknown std::exception: ");
    throw std::runtime_error(prefix + s);
}

void handleStandardException(const std::exception& e) {
    if (boost::exception_detail::get_boost_exception(&e)) {
        queueMainThreadJob(boost::bind(
            boost::rethrow_exception,
            boost::current_exception()));
    } else {
        queueMainThreadJob(boost::bind(rethrowStandardException, std::string(e.what())));
    }
}