In a large Python program such as IMVU, startup time is dominated by Python module imports. Take these warm timings:

$ time python -c 'None'

real    0m0.096s
user    0m0.077s
sys     0m0.061s

$ time python -c 'import urllib2'

real    0m0.314s
user    0m0.155s
sys     0m0.186s

That's 300ms for a single basic dependency. Importing the entire IMVU client takes 1.5s warm and 20s cold on a typical user's machine.


The IMVU client's loading progress bar imports modules bottom-up; that is, leaf modules are imported before their parents. The root module is imported last.

Implementing a bottom-up import sequence requires generating a graph of dependencies between modules:

def get_dependencies(module_name):
    Takes a module name as input (e.g. 'xml.dom') and returns a set of
    (lhs, rhs) tuples where lhs and rhs are module names and lhs
    imports rhs.
    # module_from_key is a dict from a module key, an arbitrary
    # object, to a module object.  While importing, we discover
    # dependencies before we have access to the actual module objects.
    # import_dependencies is a list of (lhs, rhs) tuples where lhs and
    # rhs are module keys, and module_from_key[lhs] imported
    # module_from_key[rhs].

    root_key = object()
    module_from_key = {root_key: __main__}
    import_dependencies = []
    stack = [root_key]

    def import_in_stack(key, name, globals, locals, fromlist, level):
            return original_import(name, globals, locals, fromlist, level)

    import __builtin__
    original_import = __builtin__.__import__

    def my_import(name, globals=globals(), locals=locals(), fromlist=[], level=-1):
        # fromlist is a whore.  Most of the complexity in this
        # function stems from fromlist's semantics.  See
        # If a module imports 'xml.dom', then the module depends on
        # both 'xml' and 'xml.dom' modules.
        dotted = name.split('.')
        for i in range(1, len(dotted)):
            my_import('.'.join(dotted[0:i]), globals, locals, [], level)

        module_key = object()
        parent_key = stack[-1]

        def add_dependency_from_parent(key, m):
            module_from_key[key] = m
            import_dependencies.append((parent_key, key))

        submodule = import_in_stack(module_key, name, globals, locals, ['__name__'], level)
        add_dependency_from_parent(module_key, submodule)

        for f in (fromlist or []):
            from_key = object()
            module = import_in_stack(from_key, name, globals, locals, [f], level)
            if f == '*':
            submodule = getattr(module, f)
            if isinstance(submodule, types.ModuleType):
                add_dependency_from_parent(from_key, submodule)

        return original_import(name, globals, locals, fromlist, level)

    # Import module_name with import hook enabled.
    original_modules = sys.modules.copy()
    __builtin__.__import__ = my_import
        __builtin__.__import__ = original_import

    assert stack == [root_key], stack

    return sorted(set(
        (module_from_key[lhs].__name__, module_from_key[rhs].__name__)
        for lhs, rhs in import_dependencies

(You can view all of the code at SourceForge).

First, we install an __import__ hook that discovers import dependencies between modules, and convert those relationships into a directed graph:


Then, we merge cycles. If module A imports B, B imports C, and C imports A, then it doesn't matter which you import first. Importing A is equivalent to importing B or C. After this step, we have a DAG:

xml.dom.minidom DAG

Finally, we can postorder traverse the DAG to determine a good import sequence and cost (approximated as the number of modules in the cycle) per import:

1 xml
3 xml.dom
1 copy_reg
1 types
1 copy
1 xml.dom.NodeFilter
1 xml.dom.xmlbuilder
1 xml.dom.minidom
1 __main__

Now let's look at some less-trivial examples. urllib2:






Final notes: Other people have solved this problem with bytecode scanning, but we wanted to know the actual modules imported for an accurate progress bar. A simpler __import__ hook could have calculate the correct import sequence, but I find a visual representation of module dependencies to have additional benefits.