Your Version Control and Build Systems Don’t Scale
Let’s spend a minute and talk about the performance of a tool you use every day: your build system. For this purpose, they’re all the same, so let’s assume you use GNU Make. For a no-op build, what is Make’s running time? You know, that O(N) stuff.
If you said
O(Files), you’re right. Every time you type
make and press enter, it scans every file in the project, looking for out-of-date sources. That doesn’t sound so bad. Linear algorithms are good, right?
O(Files) = O(Engineers * Time)
Assuming you’re successful, your team is probably growing too:
O(Engineers) = O(Time)
Applying some substitutions:
O(Make) = O(Files) = O(Engineers * Time) = O(Time^2)
A no-op make is quadratic in the duration of the project?!
Exhibit A, Mozilla:
Chad@laptop /c/xulrunner/mozilla-central/obj-xulrunner $ time make rm -f -rf ./dist/sdk [tons of output snipped] make: Leaving directory `/c/xulrunner/mozilla-central/obj-xulrunner' real 21m31.526s user 4m13.292s sys 9m15.066s
21.5 MINUTES! This is a no-op build! I changed nothing from the last build!
Mozilla’s build is an especially egregious example, but it’s not uncommon for nontrivial Make, SCons, or Jam no-op builds to exceed 15 seconds. Every time you run SCons, for example, it:
- loads Python,
- imports a bunch of Python code,
- creates an entire dependency graph in memory,
- scans all files for changes,
- determines what to build,
- builds it,
- and throws all of those intermediate results away.
No-op builds should be
O(1) and instantaneous, and most other builds should be
O(WhateverChanged). What if I told you this is possible?
Introducing IBB: I/O-Bound Build
Dusty, one of the best engineers I know, once told me that C compilers are now fast enough that they’re I/O-bound, not CPU-bound as you’d expect. This was inspiration for a
build system dependency engine architecture:
- A long-running server process that contains a complete dependency graph between files, including the commands required to update targets from sources.
- An ultra-tiny C program that communicates build requests to the build server.
- A background thread that watches for filesystem updates (via ReadDirectoryChangesW or libfam) and updates the dependency graph in memory.
I have built a prototype of ibb’s architecture, and its code is available on my github.
In ibb, a no-op build takes constant time, no matter the size of the project. I tested with Noel Llopis’s build system stress test and no-op builds are merely 50 milliseconds. Take that, everyone else!
A Faster Grep
Let’s say you wanted to run a regular expression across your entire codebase. On Windows, you’ll spend all of your time in kernel calls, even if the files are in cache. However, remember that modern memory bandwidths are measured in GB/s…
Minimal Test Runner
Carl Masak’s inspirational addictive TDD harness is also possible with ibb. If you can determine which unit tests depend on which source files (via an __import__ hook in Python, say), you can run appropriate tests immediately after saving a source file.
I’ve watched both git and Subversion fall apart as a codebase grows. git is so faaast… until you import a 20 GB repository. Then it’s
O(Files) just like everything else.
$ time git status # On branch master nothing to commit (working directory clean) real 0m9.603s user 0m0.858s sys 0m7.378s
10 seconds to see what files I’ve modified. Ouch.
Imagine if ibb could keep track of which files have changed…
What This Means For You
Running tests, grepping code, and building binaries from source are equivalent problems. ibb’s architecture supports lightning-fast implementations of all of the above.
By converting an
O(Time^2) build, test runner, or version control system to
O(1), we eliminate hours of costly engineer time and encourage good habits like committing often and frequently running tests, no matter the age of the project. Frankly, instant feedback feels great.
It’s crazy to me that this architecture isn’t standard practice across development tools.
Take ibb’s concepts or code, integrate it with your engineering processes, and improve the quality of your team’s life today.