At Least Interview (or: How I Ended Up at IMVU)

by Chad Austin
Aug 17 10

Recent conversations have pointed out my career philosophy isn’t as obvious as I thought. Thus, I’d like to share the story of how I joined IMVU and what it means to me and to those I interview.

Why Won’t You Interview?

You wouldn’t believe the number of times I’ve tried and failed to get somebody to take a weekend and fly out to IMVU for an interview. I don’t understand: we’ll happily pay you and your significant other to spend a vacation in San Francisco for the small price of a day’s interview.

There are three possible outcomes:

  1. We make you an offer and you accept.
  2. We make you an offer and you decline.
  3. We don’t make you an offer.

What’s the worst that could happen? Maybe you’ll be forced to actually decide whether IMVU is the right home for you. Maybe IMVU won’t be a fit and you’ll feel a little worse for it.

Either way you’ll have a better sense of yourself and maybe you’ll have stumbled upon a more fulfilling life. Plus you’ll have a free vacation!

I Could Have Joined IMVU 9 Months Earlier

I was halfway through my graduate degree at Iowa State, implementing a functional GPU language. I figured I was headed towards a job working on concurrent languages at Microsoft Research or something. Indeed, that would have been fine! I’m still glad concurrent programming languages aren’t a solved a problem – I can still fantasize about someday contributing to the field.

On July 2nd, 2004 (my birthday!), a guy named Eric Ries e-mails me out of the blue “Are you the same Chad Austin from the boost and cal3d mailing lists? Interested in some contract work?” He was working on some wack AOL Instant Messenger add-on that used BitTorrent as its installer and had a hideous website, so I wasn’t terribly interested. He persisted, and by GDC 2005, he convinced me to come interview.

Once I met the founding team, I came to a few conclusions:

  1. IMVU’s founders were smart. I’d be silly not to work with them.
  2. Coming from graduate school, I didn’t expect much of a salary, so I could take a bunch of stock in exchange.
  3. If IMVU succeeded, win!
  4. If IMVU failed, at least I’d learn a lot.

I wasn’t super excited about the product at first, but IMVU’s founders convinced me to give them a shot, and it was definitely the right decision.

How I “Sell” to Candidates

When I interview candidates, I truly believe that IMVU is a great opportunity. If the candidate is hesitant about committing to such a huge life change, I understand. Moving across the country and taking a new job is a gigantic personal decision, and I can’t make that decision for them.

I never aggressively push IMVU, but I do my best to provide the data necessary to make the right decision. “I’ve been here a while. What information do you need to know whether IMVU is right for you?” I like to believe honesty is as effective as aggressive salesmanship. :)

What This Means

I heartily endorse the philosophy espoused by NetFlix: periodically reconsider your place in the world. I’d be a hypocrite if I said otherwise.

That said, I think our culture overvalues salary. Money is but one (uncorrelated?) component of our motivation. Since humans are notoriously bad at predicting what makes us happy, it’s critical that we weigh facets such as personal freedom, your colleagues, social context, future opportunities, and how your work fits into your personal narrative.

We once tried to hire a frighteningly smart man away from Google. He interviewed but declined our generous offer, saying that his entire social life was tied into Google. In hindsight, the sacrifice we asked of him is clear, and I respect his decision.

In short, stay open-minded, but consciously consider what makes you happy.

How do you balance Immediate and Meta?

by Chad Austin
Jul 30 10

I like to split work into two types, Immediate and Meta, which I define as follows:

  • Immediate: work that moves us closer to an immediate goal. Examples: bug fixes, development of a feature in a software application
  • Meta: work that improves or changes how we perform Immediate work. Examples: planning, refactoring, employee training, development of programming languages, and communication between teams

Clearly you cannot focus entirely on one or the other. If we focused entirely on immediate work, we would all be programming in assembly language with no communication between engineers. On the other hand, if we spent all of our time planning or writing better programming languages, we’d never sell any product.

Customers directly pay for the results of Immediate work and indirectly how fast you can deliver it.

Meta is tooling and process. ROI determines your investment in Meta. At the conception of a startup, you don’t know your business model or market, so investing in tooling is probably foolhardy. You can’t afford new programming languages or side innovations, so get ‘er done.

However, successfully raising funds changes that equation. With a longer horizon, investing in your iteration loop and tools begins to make sense. It seems Meta ROI is, roughly, ExpectedImmediateTime * ImprovementFactor / CostOfImprovement.

Additionally, Meta work applies to itself. Complex systems are like a dark cave — every step down a path illuminates further steps. You can also think of this effect as “leveling up”, like in a video game. If your team begins Test-Driven Development or Continuous Deployment early, it will begin to see further improvements, compounding the benefits. Process improvement has exponential return (as any Singularitarian is quick to mention).

So how can you balance these types of work? Let’s discuss several approaches:

Start 100% Immediate and Asymptotically Approach 100% Meta

Upon learning to program, students invariably have the attitude “How do I quickly solve the problem at hand?” After witnessing the limits of that attitude as their non-trivial programs collapse under their own complexity, their attitude shifts to “How can I make programming easier?” From this you see such rarely-fruitful projects as new operating systems and programming languages.

(This is the story of my life… I haven’t finished writing a single game since I learned a language powerful enough to metaprogram.)

Always Balance 50% Immediate and 50% Meta

It seems like there should be an optimal balance of process improvement and immediate value creation. Maybe that’s true in the long run, but, in my experience, we punctuate work with process improvement. Intel has its Tick Tock approach. IMVU periodically makes comparitively large investments in build processes, separated by periods of smooth execution. Students go to school for part of the year — process improvement — and work in the summers or co-ops. (However, I’d argue that applying tick tock more directly to education would be beneficial: a year of school and a year of work. That’s another blog post.)

Nintendo’s “Spiral” Analogy

If Nintendo focused entirely on immediate results, they would end up in a death spiral: financial pressure reduces available time, limited time reduces quality, poor quality causes lower game sales, and low game sales increases financial pressure.

If you have the latitude to pay attention to process improvement, you will instead follow an “Upward Spiral”. Quality and innovation drive sales, giving you more time to spend on quality and innovation.

I Don’t Have the Answer

How do you balance Immediate and Meta? Please comment – I’d love to hear your thoughts.

Extracting Color and Transparency from Flash

by Chad Austin
Jul 29 10

The original source of this post is at the IMVU engineering blog. Subscribe now!

For clarity, I slightly oversimplified my previous discussion on efficiently rendering Flash in a 3D scene. The sticky bit is extracting transparency information from the Flash framebuffer so we can composite the overlay into the scene.

Flash does not give you direct access to its framebuffer. It does, with IViewObject::Draw, allow you to composite the Flash framebuffer onto a DIB section of your choice.

Remembering your Porter-Duff, composition of source over dest is:

Color = SourceColor * SourceAlpha + DestColor * (1 - SourceAlpha)

If the source color is premultiplied, you get:

Color = SourceColor + DestColor * (1 - SourceAlpha)

Assuming we want premultiplied color and alpha from Flash for efficient rendering in the 3D scene, applying the above requires solving for FlashAlpha and FlashColor:

RenderedColor = FlashColor * FlashAlpha + DestColor * (1 - FlashAlpha)

RenderedColor = FlashColor * FlashAlpha + DestColor - DestColor * FlashAlpha

RenderedColor - DestColor = FlashColor * FlashAlpha - DestColor * FlashAlpha

RenderedColor - DestColor = FlashAlpha * (FlashColor - DestColor)

FlashAlpha = (RenderedColor - DestColor) / (FlashColor - DestColor)

If FlashColor and DestColor are equal, then FlashAlpha is undefined. Intuitively, this makes sense. If you render a translucent black SWF on a black background, you can’t know the transparency data because all of the pixels are still black. This doesn’t matter, as I’ll show in a moment.

FlashColor is trivial:

RenderedColor = FlashColor * FlashAlpha + DestColor * (1 - FlashAlpha)

RenderedColor - DestColor * (1 - FlashAlpha) = FlashColor * FlashAlpha

FlashColor = (RenderedColor - DestColor * (1 - FlashAlpha)) / FlashAlpha

FlashColor is undefined if FlashAlpha is 0. Transparency has no color.

What do these equations give us? We know RenderedColor, since it’s the result of calling IViewObject::Draw. We have control over DestColor, since we configure the DIB Flash is drawn atop. What happens if we set DestColor to black (0)?

FlashColor = (RenderedColorOnBlack) / FlashAlpha

What happens if we set it to white (1)?

FlashColor = (RenderedColorOnWhite - (1 - FlashAlpha)) / FlashAlpha

Now we’re getting somewhere! Since FlashColor and FlashAlpha are constant, we can define a relationship between FlashAlpha and RenderedColorOnBlack and RenderedColorOnWhite:

(RenderedColorOnBlack) / FlashAlpha = (RenderedColorOnWhite - (1 - FlashAlpha)) / FlashAlpha

RenderedColorOnBlack = RenderedColorOnWhite - 1 + FlashAlpha

FlashAlpha = RenderedColorOnBlack - RenderedColorOnWhite + 1

FlashAlpha = RenderedColorOnWhite - RenderedColorOnBlack

So all we have to do is render the SWF on a white background and a black background and subtract the two to extract the alpha channel.

Now what about color? Just plug the calculated FlashAlpha into the following when rendering on black.

FlashColor = (RenderedColor - DestColor * (1 - FlashAlpha)) / FlashAlpha

FlashColor = RenderedColorOnBlack / FlashAlpha

Since we want premultiplied alpha:

FlashColor = RenderedColorOnBlack

Now that we know FlashColor and FlashAlpha for the overlay, we can copy it into a texture and render the scene!

Efficiently Rendering Flash in a 3D Scene

by Chad Austin
Jul 29 10

The original source of this post is at the IMVU engineering blog. Subscribe now!

Last time, I talked about how to embed Flash into your desktop application, for UI flexibility and development speed. This time, I’ll discuss efficient rendering into a 3D scene.

Rendering Flash as a 3D Overlay (The Naive Way)

At first blush, rendering Flash on top of a 3D scene sounds easy. Every frame:

  1. Create a DIB section the size of your 3D viewport
  2. Render Flash into the DIB section with IViewObject::Draw
  3. Copy the DIB section into an IDirect3DTexture9
  4. Render the texture on the top of the scene

Naive Flash Rendering

Ta da! But your frame rate dropped to 2 frames per second? Ouch. It turns out this implementation is horribly slow. There are a couple reasons.

First, asking the Adobe flash player to render into a DIB isn’t a cheap operation. In our measurements, drawing even a simple SWF takes on the order of 10 milliseconds. Since most UI doesn’t animate every frame, we should be able to cache the captured framebuffer.

Second, main memory and graphics memory are on different components in your computer. You want to avoid wasting time and bus traffic by unnecessarily copying data from the CPU to the GPU every frame. If only the lower-right corner of a SWF changes, we should limit our memory copies to that region.

Third, modern GPUs are fast, but not everyone has them. Let’s say you have a giant mostly-empty SWF and want to render it on top of your 3D scene. On slower GPUs, it would be ideal if you could limit your texture draws to the region of the screen that are non-transparent.

Rendering Flash as a 3D Overlay (The Fast Way)

Disclaimer: I can’t take credit for these algorithms. They were jointly developed over years by many smart engineers at IMVU.

First, let’s reduce an embedded Flash player to its principles:

  • Flash exposes an IShockwaveFlash [link] interface through which you can load and play movies.
  • Flash maintains its own frame buffer. You can read these pixels with IViewObject::Draw.
  • When a SWF updates regions of the frame buffer, it notifies you through IOleInPlaceSiteWindowless::InvalidateRect.

In addition, we’d like the Flash overlay system to fit within these performance constraints:

  • Each SWF is rendered over the entire window. For example, implementing a ball that bounces around the screen or a draggable UI component should not require any special IMVU APIs.
  • If a SWF is not animating, we do not copy its pixels to the GPU every frame.
  • We do not render the overlay in transparent regions. That is, if no Flash content is visible, rendering is free.
  • Memory consumption (ignoring memory used by individual SWFs) for the overlay usage is O(framebuffer), not O(framebuffer * SWFs). That is, loading three SWFs should not require allocation of three screen-sized textures.
  • If Flash notifies of multiple changed regions per frame, only call IViewObject::Draw once.

Without further ado, let’s look at the fast algorithm:

Fast Flash Rendering

Flash notifies us of visual changes via IOleInPlaceSiteWindowless::InvalidateRect. We take any updated rectangles and add them to a per-frame dirty region. When it’s time to render a frame, there are four possibilities:

  • The dirty region is empty and the opaque region is empty. This case is basically free, because nothing need be drawn.
  • The dirty region is empty and the opaque region is nonempty. In this case, we just need to render our cached textures for the non-opaque regions of the screen. This case is the most common. Since a video memory blit is fast, there’s not much we could do to further speed it up.
  • The dirty region is nonempty. We must IViewObject::Draw into our Overlay DIB, with one tricky bit. Since we’re only storing one overlay texture, we need to render each loaded Flash overlay SWF into the DIB, not just the one that changed. Imagine an animating SWF underneath another translucent SWF. The top SWF must be composited with the bottom SWF’s updates. After rendering each SWF, we scan the updated DIB for a minimalish opaque region. Why not just render the dirty region? Imagine a SWF with a bouncing ball. If we naively rendered every dirty rectangle, eventually we’d be rendering the entire screen. Scanning for minimal opaque regions enables recalculation of what’s actually visible.
  • The dirty region is nonempty, but the updated pixels are all transparent. If this occurs, we no longer need to render anything at all until Flash content reappears.

This algorithm has proven efficient. It supports multiple overlapping SWFs while minimizing memory consumption and CPU/GPU draw calls per frame. Until recently, we used Flash for several of our UI components, giving us a standard toolchain and a great deal of flexibility. Flash was the bridge that took us from the dark ages of C++ UI code to UI on which we could actually iterate.

How to Embed Flash Into Your 3D Application

by Chad Austin
Jul 29 10

The original source of this post is at the IMVU engineering blog. Subscribe now!

[I wrote this post last year when IMVU still used Flash for a significant portion of our UI. Even though we now embed Gecko, I believe embedding Flash is still valuable.]

Writing user interfaces is hard. Writing usable interfaces is harder. Yet, the design of your interface is your product.

Products are living entities. They always want to grow, adapting to their users as users adapt to them. In that light, why build your user interface in a static technology like C++ or Java? It won’t be perfect the first time you build it, so prepare for change.

IMVU employs two technologies for rapidly iterating on and refining our client UIs: Flash and Gecko/HTML. Sure, integrating these technologies has a sizable up-front cost, but the iteration speed they provide easily pays for them. Rapid iteration has some obvious benefits:

  1. reduces development cost
  2. reduces time to market

and some less-obvious benefits:

  1. better product/market fit: when you can change your UI, you will.
  2. improved product quality: little details distinguish mediocre products from great products. make changing details cheap and your Pinto will become a Cadillac.
  3. improved morale: both engineers and designers love watching their creations appear on the screen right before them. it’s why so many programmers create games!

I will show you how integrating Flash into a 3D application is easier than it sounds.

Should I use Adobe Flash or Scaleform GFx?

The two most common Flash implementations are Adobe’s ActiveX control (which has a 97% installed base!) and Scaleform GFx.

Adobe’s control has perfect compatibility with their tool chain (go figure!) but is closed-source and good luck getting help from Adobe.

Scaleform GFx is an alternate implementation of Flash designed to be embedded in 3D applications, but, last I checked, is not efficient on machines without GPUs. (Disclaimer: this information is two years old, so I encourage you to make your own evaluation.)

IMVU chose to embed Adobe’s player.

Deploying the Flash Runtime

Assuming you’re using Adobe’s Flash player, how will you deploy their runtime? Well, given Flash’s install base, you can get away with loading the Flash player already installed on the user’s computer. If they don’t have Flash, just require that they install it from your download page. Simple and easy.

Down the road, when Flash version incompatibilities and that last 5% of your possible market becomes important, you can request permission from Adobe to deploy the Flash player with your application.

Displaying SWFs

IMVU displays Flash in two contexts: traditional HWND windows and 2D overlays atop the 3D scene.

IMVU Flash Window

IMVU Flash Overlay

If you want to have something up and running in a day, buy f_in_box. Besides its awesome name, it’s cheap, comes with source code, and the support forums are fantastic. It’s a perfect way to bootstrap. After a weekend of playing with f_in_box, Dusty and I had a YouTube video playing in a texture on top of our 3D scene.

Once you run into f_in_box’s limitations, you can use the IShockwaveFlash and IOleInPlaceObjectWindowless COM interfaces directly. See Igor Makarav’s excellent tutorial and CFlashWnd class.

Rendering Flash as an HWND

For top-level UI elements use f_in_box or CFlashWnd directly. They’re perfectly suited for this. Seriously, it’s just a few lines of code. Look at their samples and go.

Rendering Flash as a 3D Overlay

Rendering Flash to a 3D window gets a bit tricky… Wait for Part 2 of this post!

Book Review: E-Myth Revisited

by Chad Austin
Jul 25 10

E-Myth Revisited was written in 1995, but its concepts date to 1977, when Michael Gerber founded E-Myth Worldwide. I found the book somewhat dated in light of agile development’s popularity and the rise of the Internet. To ensure that I understand Gerber’s arguments, I will summarize them here. To be clear, these are my interpretations of the book’s arguments. My understanding may not match Gerber’s. For concision’s sake, I will write this post as if they are my opinions, even though I may not agree with what I’m writing.

Part 1. The E-Myth and the American Small Business

Chapter 1 - The Entrepreneurial Myth
Chapter 2 - The Entrepreneur, the Manager, and the Technician
Chapter 3 - Infancy: The Technician's Phase
Chapter 4 - Adolescence: Getting Some Help
Chapter 5 - Beyond the Comfort Zone
Chapter 6 - Maturity and the Entrepreneurial Perspective

Gerber’s E-Myth refers to the way we envision and glorify the creation of a successful business: a person with a great idea and inhuman talents works alone against all odds, gaining significant wealth.

(Before we get much further, I want to clarify one point that confused me. The term E-Myth has nothing to do with electronics or the internet; the E in E-Myth means Entrepreneur.)

Gerber’s small business research uncovered an important trend: most small business are created because a technically-proficient employee decides they can do a better job alone, making more money with more freedom. However, creating a business requires three personalities: Entrepreneur, Technician, and Manager. I find it helpful to think instead of dominant desires: Vision, Implementation, and Structure. Thus, small business entrepreneurs focus too much on Implementation and too little on Vision and Structure.

Further, assuming the business takes off, it will require a sound structure to grow. Businesses must be able to absorb new employees and continue to function in the case that the founder sells the business or moves on. If a Technician founds a business and focuses overly on Implementation, she will never have the time to consider how to hire effective employees that share her vision. She also won’t consider the reason she’s in business to begin with – why would customers choose her business over another? What does she want to do with her life?

This leads into Part 2.

Part 2. The Turn-Key Revolution: A New View of Business

Chapter 7 - The Turn-Key Revolution
Chapter 8 - The Franchise Prototype
Chapter 9 - Working on Your Business, Not in It

An entrepreneur’s business is a product. Unless she wants to work on the same company for the rest of her life, she will eventually want to sell (privately or publicly) or franchise. The success of franchises in America demonstrates a model by which a business can be created and scaled without enormous capital or risk. To found a business, consider your talents and desires. Then ask how you can enact those desires by teaching your talents to potentially-unskilled employees. For example, franchising your business requires documenting every tiny detail: dress codes, cleanliness standards, and the words used to greet customers. You set the vision and your documentation conveys it. Without explicit documentation, your vision will not survive implementation by others. (Again, I’d like to mention that I may not necessarily agree with Gerber.)

By explicitly working on your business instead of in it, you are forced to consider these questions:

  • How do customers interact with my business?
  • How do my customers interact with my products?
  • How do I hire employees to carry out my vision?
  • What happens if I sell my business?
  • How does starting a business further my life goals?

Part 3: Building a Small Business That Works

Chapter 10 - The Business Development Process
Chapter 11 - Your Business Development Program
Chapter 12 - Your Primary Aim
Chapter 13 - Your Strategic Objective
Chapter 14 - Your Organizational Strategy
Chapter 15 - Your Management Strategy
Chapter 16 - Your People Strategy
Chapter 17 - Your Marketing Strategy
Chapter 18 - Your Systems Strategy
Chapter 19 - A Letter to Sarah

Before you begin, why are you in business? What do you hope to achieve? Document these desires in a Life Plan so you know what you’re working towards and can measure overall progress.

How much money will I need? How much time? What do I hope to learn? How do I want people to think about me? How much money will I make? Creating a business requires vision, but you require vision for yourself, too.

Before you hire anybody, write an organizational chart for your company as envisioned years down the road. Even though there’s only one employee at the outset, the chart may contain dozens of roles. However, the chart gives you a framework into which to hire new employees and clarify said roles’ responsibilities and success metrics.

Don’t predicate your business on hiring the best employees possible. You won’t be able to find or afford them. Instead, build a System for incorporating and training employees into your vision.

Know that customers are irrational. Know that they’re not buying your product – they’re buying the experience your business offers. Know that nothing escapes your customers’ senses. Aspects as minor as colors, font sizes, the shapes of logos, wording, and timing can affect your customers’ purchasing decisions. Customer needs may be real, but customers purchase based on perceived needs.

Measure and understand your customers’ demographics and psychographics. This data will help you make good decisions.

Marketing is the most important function of your business. Marketing is the reason businesses exist. Success requires meeting the perceived needs of a market, which means the market must know you exist. How will you obtain customers? How will you satisfy them? How will you convince them to return?

Chad’s Perspective

E-Myth Revisited’s biggest takeaways are:
  1. Creating a business around your technical proficiencies is a mistake. Instead consider your customers.
  2. Think of a business as a product that can sustain itself and provide you with the income or freedom you desire. You are your first customer.
  3. Your comfort zone is too small to start a business – don’t let that stop you!

However, the book’s age shows. E-Myth Revisited fails entirely to address the possibility that your business is in the wrong market and thus cannot survive. In today’s environment, I think it’s more important to rapidly build a business that fails quickly or reacts to customer desires and market pressures.

I don’t empathize with the notion that you can’t hire the best employees. Gerber too often uses the example of McDonald’s. While McDonald’s is wildly successful in its market, small internet-based business can scale with just a few highly-skilled individuals. That said, it’s worth asking up front what level of talent you require to scale your business. I wish the book had focused more on non-food-service industries.

If you have any startup experience at all, I would avoid E-Myth Revisited. If you’re inexperienced and you’re considering creating a business, it’s a fine introduction to the issues you’ll face.

More Thoughts on ibb

by Chad Austin
Jun 12 10

There were some great comments on my introduction to ibb and they’re worthy of discussion.

On the Performance of ReadDirectoryChangesW

Brandon Ehle is absolutely right when he says ReadDirectoryChangesW is not free. In fact, many modern applications monitor directory trees for changes, so, on a fast disk such as my Intel SSD, “svn up” might quickly create and destroy thousands of zero-byte lockfiles. When I have Visual Studio and ibb_server running, I have witnessed them peg a core each to handle the filesystem events.

While filesystem monitors consumes a great deal of CPU in aggregate, the important metric to consider is latency. That is, it doesn’t actually matter that my CPU is pegged while I “svn up”: the factor limiting my efficiency is the sheer number of system calls to the OS. Cores are cheap and getting cheaper, but latency in computing systems isn’t improving. Put another way, each ReadDirectoryChangesW adds a small constant amount of work to what is already an O(files) process: the root problem is that svn up needs to create and destroy thousands of lockfiles. (I assume a small constant set of tools running at any given time. Assigning one core per tool seems realistic enough.)

Since common-case latency is the primary constraint we must optimize, converting O(n) to O(1) is a huge win. O(n) to O(k*n) for some small constant k isn’t as important.

I might also argue that O(n) use cases should be eliminated if possible. “svn up” already knows the changeset from the server – if it knew the changeset on the client without locking and scanning every directory, it would become O(changeset + localchanges).

What if you stop ibb_server and rebuild?

Then the next build is O(n) like SCons or Make, but there’s no reason it needs to be slower. Build signatures should definitely be cached in a local database so subsequent builds merely require scanning all files in the DAG. Recompiling from scratch upon every reboot would be a dealbreaker.

Scalable Build Systems: An Analysis of Tup

by Chad Austin
Jun 10 10

I previously argued that any tool whose running time is proportional with the number of files in a project scales quadratically with time. Bluem00 on Hacker News pointed me towards Tup, a scalable build system with goals similar to ibb.

Mike Shal, Tup’s author, wrote Build System Rules and Algorithms, formalizing the algorithmic deficiencies with existing build systems and describing Tup’s implementation, a significant improvement over the status quo. I would like to document my analysis of Tup and whether I think it replaces ibb.

Before we get started, I’d like to thank Mike Shal for being receptive to my comments. I sent him a draft of my analysis and his responses were thoughtful and complete. With his permission, I have folded his thoughts into the discussion below.

Is Tup suitable as a general-purpose build system? Will it replace SCons or Jam or Make anytime soon? Should I continue working on ibb?

Remember our criteria for a scalable build system, one that enables test-driven development at arbitrary project sizes:

  1. O(1) no-op builds
  2. O(changes) incremental builds
  3. Accessible dependency DAG atop which a variety of tools can be built

Without further ado, my thoughts on Tup follow:

Syntax

Tup defines its own declarative syntax, similar to Make or Jam. At first glance, the Tup syntax looks semantically equivalent to Make. From the examples:

: hello.c |> gcc hello.c -o hello |> hello

Read the dependency graph from left to right: hello.c is compiled by gcc into a hello executable. Tup supports variable substitution and limited flow control.

Build systems are inherently declarative, but I think Tup’s syntax has two flaws:

  1. Inventing a new syntax unnecessarily slows adoption: by implementing GNU Make’s syntax, Tup would be a huge drop-in improvement to existing build systems.
  2. Even though specifying dependency graphs is naturally declarative, I think a declarative syntax is a mistake. Build systems are a first-class component of your software and your team’s workflow. You should be able to develop them in a well-known, high-level language such as Python or Ruby, especially since those languages come with rich libraries. As an example, SCons gets this right: it’s trivial for me to write CPU autodetection logic for parallel builds in a build script if that makes sense. Or I can extend SCons’s Node system to download source files from the web.

Implementation Language

Tup is 15,000 lines of C. There’s no inherent problem with C, but I do think a community-supported project is more likely to thrive in a faster and safer language, such as Python or Ruby. Having worked with teams of engineers, it’s clear that most engineers can safely work in Python with hardly any spin-up time. I can’t say the same of C.

Git is an interesting case study: The core performance-sensitive data structures and algorithms are written in C, but many of its interesting features are written in Perl or sh, including git-stash, git-svn, and git-bisect. Unlike Git, I claim Python and Ruby are plenty efficient for the entirety of a scalable build system. Worst case, the dependency graph could live in C and everything else could stay in Python.

Scanning Implicit Dependencies

The Tup paper mentions offhand that it’s trivial to monitor a compiler’s file accesses and thus determine its true dependencies for generating a particular set of outputs. The existing implementation uses a LD_PRELOAD shim to monitor all file accesses attempted by, say, gcc, and treats those as canonical input files. Clever!

This is a great example of lateral, scrappy thinking. It has a couple huge advantages:

  1. No implicit dependencies (such as C++ header file includes) need be specified — if all dependencies come from the command line or a file, Tup will know them all.
  2. It’s easy to implement. Tup’s ldpreload.c is a mere 500 lines.

And a few disadvantages:

  1. Any realistic build system must treat Windows as a first-class citizen. Perhaps, on Windows, Tup could use something like Detours. I’ll have to investigate that.
  2. Intercepting system calls is reliable when the set of system calls is known and finite. However, there’s nothing stopping the OS vendor from adding new system calls that modify files.
  3. Incremental linking / external PDB files: these Visual C++ features both read and write the same file in one compile command. SCons calls this a SideEffect: commands that share a SideEffect cannot parallelize. A build system that does not support incremental linking or external symbols would face resistance among Visual C++ users.

And some open questions:

  1. I haven’t completely thought this through, but it may be important to support user-defined dependency scanners that run before command execution, enabling tools such as graph debugging.
  2. I don’t have a realistic example, but imagine a compiler that reads spurious dependency changes from run to run; say, a compiler that only checks its license file on every other day.

Stepping back, I think the core build system should not be responsible for dependency scanning. By focusing on dependency graph semantics and leaving dependency scanning up to individual tools (which may or may not use LD_PRELOAD or similar techniques), a build system can generalize to uses beyond compiling software, as I mentioned in my previous blog post.

Dependency Graph

Tup’s dependency DAG contains two types of nodes: Commands and Files. Files depend on Commands and Commands depend on other Files. I prefer Tup’s design over SCons’s DAG-edges-are-commands design for two reasons:

  1. It simplifies the representation of multiple-input multiple-output commands.
  2. Some commands, such as “run-test foo” or “search-regex some.*regex” depend on source files but produce no files as output. Since they fit naturally into the build DAG, commands are a first-class concept.

Build Reliability

Tup, like SCons, places a huge emphasis on build reliability. This is key and I couldn’t agree more. In the half-decade I’ve used SCons, I can count the number of broken builds on one hand. Sadly, many software developers are used to typing “make clean” or clicking “full rebuild” when something is weird. What a huge source of waste! Developers should trust the build system as much as their compiler, and the build system should go out of its way to help engineers specify complete and accurate dependencies.

Reliable builds imply:

  1. Changes are tracked by file contents, not timestamps.
  2. The dependency graph, including implicit dependencies such as header files and build commands, is complete and accurate by default.
  3. Compiler command lines are included in the DAG. Put another way: if the command used to build a file changes, the file must be rebuilt.

Tup takes a strict functional approach and formalizes build state as a set of files and their contents. (I would argue build state also includes file metadata such as file names and timestamps, at least if the compiler uses such information.) If the build state does not change between invocations, then no work must be done.

Tup even takes build reliability one step further than SCons: If you rename a target file in the build script, Tup actually deletes the old built target before rebuilding the new one. Thus, you will never have stale target executables lying around in your build tree.

Nonetheless, there are situations where a project may choose to sacrifice absolute reliability for significant improvements in build speed, such as incremental linking discussed above.

Core vs. Community

A build system is a critical component of any software team’s development process. Since every team is different, it’s essential that a build system is flexible and extensible. SCons, for example, correctly chose to implement build scripts in a high-level language (Python) with a declarative API for specifying nodes and edges in the dependency graph.

However, I think SCons did not succeed at separating its core engine from its community. SCons tightly couples the underlying dependency graph with support for tools like Visual C++, gcc, and version control. The frozen and documented SCons API is fairly high-level while the (interesting) internals are treated as private APIs. It should be the opposite: a dependency graph is a narrow, stable, and general API. By simplifying and documenting the DAG API, SCons could enable broader uses, such as unit test execution.

Configuration

Like Tup’s author, I agree that build autoconfiguration (such as autoconf or SCons’s Configure support) should not live in the core build system. Autoconfiguration is simply an argument that build scripts should be specified in a general programming language and that the community should develop competing autoconfiguration systems. If a particular autoconfiguration system succeeds in the marketplace, then, by all means, ship it with your build tool. Either way, it shouldn’t have access to any internal APIs. Configuration mechanisms are highly environment-sensitive and are best maintained by the community anyway.

DAG post-process optimizations

Another argument for defining a build tool in a general-purpose language is to allow user-defined DAG optimizations and sort orders. I can think of two such use cases:

  1. Visual C++ greatly improves compile times when multiple C++ files are specified on one command line. In fact, the benefit of batched builds can exceed the benefit of PCH. A DAG optimizer would search for a set of C++ source files that produce object files in the same directory and rewrite the individual command lines into one.
  2. When rapidly iterating, it would be valuable for a build system or test runner to sort such that the most-recently-failed compile or test runs first. However, when hunting test interdependencies as part of a nightly build, you may want to shuffle test runs. On machines with many cores but slow disks, you want to schedule expensive links as soon as possible to mitigate the risk that multiple will execute concurrently and thrash against your disk.

Conclusion

Tup is a significant improvement over the status quo, and I have personally confirmed its performance — it’s lightning fast and it scales to arbitrary project sizes.

However, without out-of-the-box Windows support, a mainstream general-purpose language, and a model for community contribution, I don’t see Tup rapidly gaining traction. With the changes I suggest, it could certainly replace Make and perhaps change the way we iterate on software entirely.

Next, I intend to analyze prebake.

How to Make a DVD with the Flip UltraHD

by Chad Austin
Jun 7 10

Now that my wife and I have a child, we make frequent use of our Flip UltraHD video camera. Our intention is to film precious moments of our lives, burn physical DVDs, and mail them to our families strewn across the east coast and midwest. Sounds old-fashioned, but it’s convenient for our audience.

I will explain DVD creation for Mac users and for Windows users.

Flip -> DVD on a Mac

For those of you blessed with a recent Mac, creating a DVD from the Flip is straightforward:

  1. Copy MP4 files from Flip to your computer (not strictly necessary, though iDVD is much snappier if the videos are on your hard drive)
  2. Open iDVD, select Magic iDVD
  3. Rename your movies to reflect their contents (optional, but makes the DVD a little nicer)
  4. Drag movies into the iDVD window
  5. Create project
  6. Tweak title, button fonts, text
  7. Burn to disc image
  8. Use Disk Utility to burn as many copies as you want!

Flip -> DVD on Windows Vista/7

  1. First of all, Windows DVD Maker doesn’t support Flip’s MP4 files directly. It will crash or hang if you try.
  2. Download the Adobe Flash CS4 trial
  3. Use Adobe Media Encoder to convert the Flip videos to AVI or some such.
  4. Add to Windows DVD Maker.
  5. Configure title and menus.
  6. Wait an evening for Windows DVD Maker to casually burn your DVD to disc.
  7. Frown because the audio and video are no longer synchronized. Also the video is corrupted.
  8. Buy a Mac and use iDVD after all.

I wrote this post months ago but I was waiting to figure out how to create DVDs on Windows… At some point, it’s worth simply buying a Mac Mini and using that instead.

Your Version Control and Build Systems Don’t Scale

by Chad Austin
Mar 4 10

Let’s spend a minute and talk about the performance of a tool you use every day: your build system. For this purpose, they’re all the same, so let’s assume you use GNU Make. For a no-op build, what is Make’s running time? You know, that O(N) stuff.

If you said O(Files), you’re right. Every time you type make and press enter, it scans every file in the project, looking for out-of-date sources. That doesn’t sound so bad. Linear algorithms are good, right?

Oh, but projects grow over time, typically in proportion with the number of engineers (even if some are net negative). Since projects grow:

O(Files) = O(Engineers * Time)

Assuming you’re successful, your team is probably growing too:

O(Engineers) = O(Time)

Applying some substitutions:

O(Make) = O(Files) = O(Engineers * Time) = O(Time^2)

A no-op make is quadratic in the duration of the project?!

Exhibit A, Mozilla:

Chad@laptop /c/xulrunner/mozilla-central/obj-xulrunner
$ time make
rm -f -rf ./dist/sdk
[tons of output snipped]
make[1]: Leaving directory `/c/xulrunner/mozilla-central/obj-xulrunner'

real    21m31.526s
user    4m13.292s
sys     9m15.066s

21.5 MINUTES! This is a no-op build! I changed nothing from the last build!

Mozilla’s build is an especially egregious example, but it’s not uncommon for nontrivial Make, SCons, or Jam no-op builds to exceed 15 seconds. Every time you run SCons, for example, it:

  • loads Python,
  • imports a bunch of Python code,
  • creates an entire dependency graph in memory,
  • scans all files for changes,
  • determines what to build,
  • builds it,
  • and throws all of those intermediate results away.

Am I the only person who thinks this situation is insane? We’re spending all of our time improving constant factors rather than addressing fundamentally quadratic build algorithms.

No-op builds should be O(1) and instantaneous, and most other builds should be O(WhateverChanged). What if I told you this is possible?

Introducing IBB: I/O-Bound Build

Dusty, one of the best engineers I know, once told me that C compilers are now fast enough that they’re I/O-bound, not CPU-bound as you’d expect. This was inspiration for a build system dependency engine architecture:

  • A long-running server process that contains a complete dependency graph between files, including the commands required to update targets from sources.
  • An ultra-tiny C program that communicates build requests to the build server.
  • A background thread that watches for filesystem updates (via ReadDirectoryChangesW or libfam) and updates the dependency graph in memory.

I have built a prototype of ibb’s architecture, and its code is available on my github.

In ibb, a no-op build takes constant time, no matter the size of the project. I tested with Noel Llopis’s build system stress test and no-op builds are merely 50 milliseconds. Take that, everyone else!

A Faster Grep

Let’s say you wanted to run a regular expression across your entire codebase. On Windows, you’ll spend all of your time in kernel calls, even if the files are in cache. However, remember that modern memory bandwidths are measured in GB/s…

ibb’s file search implementation can run a regular expression across 200 MB of PHP and JavaScript in 100 milliseconds. It copies file data into memory on the first run and rapidly scans across that memory on future runs. It could just as easily mmap and use OS’s disk cache to avoid kernel calls.

Minimal Test Runner

Carl Masak’s inspirational addictive TDD harness is also possible with ibb. If you can determine which unit tests depend on which source files (via an __import__ hook in Python, say), you can run appropriate tests immediately after saving a source file.

Version Control

I’ve watched both git and Subversion fall apart as a codebase grows. git is so faaast… until you import a 20 GB repository. Then it’s O(Files) just like everything else.

$ time git status
# On branch master
nothing to commit (working directory clean)

real    0m9.603s
user    0m0.858s
sys     0m7.378s

10 seconds to see what files I’ve modified. Ouch.

Imagine if ibb could keep track of which files have changed…

What This Means For You

Running tests, grepping code, and building binaries from source are equivalent problems. ibb’s architecture supports lightning-fast implementations of all of the above.

By converting an O(Time^2) build, test runner, or version control system to O(1), we eliminate hours of costly engineer time and encourage good habits like committing often and frequently running tests, no matter the age of the project. Frankly, instant feedback feels great.

It’s crazy to me that this architecture isn’t standard practice across development tools.

Take ibb’s concepts or code, integrate it with your engineering processes, and improve the quality of your team’s life today.