<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Your Version Control and Build Systems Don&#8217;t Scale</title>
	<atom:link href="http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/feed/" rel="self" type="application/rss+xml" />
	<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/</link>
	<description></description>
	<lastBuildDate>Tue, 06 Mar 2012 21:20:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: anon</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-64285</link>
		<dc:creator>anon</dc:creator>
		<pubDate>Tue, 22 Feb 2011 14:26:42 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-64285</guid>
		<description>Your ibb saved not only I/O ops, but also CPU cycles, so the name is misleading.</description>
		<content:encoded><![CDATA[<p>Your ibb saved not only I/O ops, but also CPU cycles, so the name is misleading.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chad Austin</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-31786</link>
		<dc:creator>Chad Austin</dc:creator>
		<pubDate>Tue, 20 Jul 2010 02:46:05 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-31786</guid>
		<description>Allan: I never said anything about SSD vs. HDD.  SSDs don&#039;t change the fundamental algorithmic complexity of building software: you&#039;re still going to do O(files) stats, each stat going through a kernel call.

FWIW, I run SSDs on all of my systems, and I still have the problems I described in the article.</description>
		<content:encoded><![CDATA[<p>Allan: I never said anything about SSD vs. HDD.  SSDs don&#8217;t change the fundamental algorithmic complexity of building software: you&#8217;re still going to do O(files) stats, each stat going through a kernel call.</p>
<p>FWIW, I run SSDs on all of my systems, and I still have the problems I described in the article.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Allan Stokes</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-31758</link>
		<dc:creator>Allan Stokes</dc:creator>
		<pubDate>Mon, 19 Jul 2010 21:04:39 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-31758</guid>
		<description>Are you expecting anyone to take your O(N^2) analysis seriously?  

The general rule of thumb is that productivity is sub-linear in the growth of code base complexity.   You could argue that as complexity rises, it pressures the project toward smaller source files on average, but I haven&#039;t witnessed this, except in projects that were out of control.  

More accurately, the total number of file tests (in a conventional build system) is O(N^2) over the life of the project, where N is the final project size.  

Even this makes a radical assumption that a large project isn&#039;t partitioned into 100 shared libraries, where each shared library is built independently during active development.  

You seem to have a master build fetish.  In a distributed project, this is often the first build each new participant performs, which seems to cause heart palpitations in a certain type of engineer who likes to perform wild extrapolations on the basis step using strangely fabricated denominators.  

You also neglected to take into account the negative exponential term due to rising system performance.  True, this is linked to the nearly constant physical size of the drive read head over the past two decades.  However, if you spin around 180 degrees and examine the future, we&#039;re on the precipice of the largest negative step function in file IO since the Rubik&#039;s cube was invented: the transition to SSD.  What kind of astrobucks does it take to develop a source code tree that won&#039;t comfortably fit on a cheap SSD drive three years from now?   At which point we&#039;re back to negative exponential scaling in file stat times.  

This will become even more clear as file systems further adapt to exploit SSD (or miracle RAM) performance scaling.  

There are many applications that could benefit from a derived authority with different performance characteristics than the authoritative backing store.  Amortized O(1) is often possible with space overhead linear in the number of objects tracked (which generally implies a RAM based data structure).   

On the flip side, you really have to put  a lot of faith in the derived authority layer, it adds complexity, and it&#039;s a bad, bad day when you discover your derived authority hasn&#039;t been telling the truth since the last patch Monday.  

Seems like a strange point in history to mount an architectural campaign on the observation that HDD head mechanisms don&#039;t scale.</description>
		<content:encoded><![CDATA[<p>Are you expecting anyone to take your O(N^2) analysis seriously?  </p>
<p>The general rule of thumb is that productivity is sub-linear in the growth of code base complexity.   You could argue that as complexity rises, it pressures the project toward smaller source files on average, but I haven&#8217;t witnessed this, except in projects that were out of control.  </p>
<p>More accurately, the total number of file tests (in a conventional build system) is O(N^2) over the life of the project, where N is the final project size.  </p>
<p>Even this makes a radical assumption that a large project isn&#8217;t partitioned into 100 shared libraries, where each shared library is built independently during active development.  </p>
<p>You seem to have a master build fetish.  In a distributed project, this is often the first build each new participant performs, which seems to cause heart palpitations in a certain type of engineer who likes to perform wild extrapolations on the basis step using strangely fabricated denominators.  </p>
<p>You also neglected to take into account the negative exponential term due to rising system performance.  True, this is linked to the nearly constant physical size of the drive read head over the past two decades.  However, if you spin around 180 degrees and examine the future, we&#8217;re on the precipice of the largest negative step function in file IO since the Rubik&#8217;s cube was invented: the transition to SSD.  What kind of astrobucks does it take to develop a source code tree that won&#8217;t comfortably fit on a cheap SSD drive three years from now?   At which point we&#8217;re back to negative exponential scaling in file stat times.  </p>
<p>This will become even more clear as file systems further adapt to exploit SSD (or miracle RAM) performance scaling.  </p>
<p>There are many applications that could benefit from a derived authority with different performance characteristics than the authoritative backing store.  Amortized O(1) is often possible with space overhead linear in the number of objects tracked (which generally implies a RAM based data structure).   </p>
<p>On the flip side, you really have to put  a lot of faith in the derived authority layer, it adds complexity, and it&#8217;s a bad, bad day when you discover your derived authority hasn&#8217;t been telling the truth since the last patch Monday.  </p>
<p>Seems like a strange point in history to mount an architectural campaign on the observation that HDD head mechanisms don&#8217;t scale.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Chad Austin</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-28628</link>
		<dc:creator>Chad Austin</dc:creator>
		<pubDate>Fri, 11 Jun 2010 06:59:54 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-28628</guid>
		<description>William:  Absolutely.  I intend to spike a continuous-autobuild feature to ibb that will compile and run tests every time you save.  I think it will be amazing.

Brandon: You&#039;re absolutely right.  I will discuss that in my next post.

Mike Samuel: Thanks for the heads up!  I will take a deeper look at prebake.

Sorry, everyone, about the comments.  I wasn&#039;t getting moderation e-mails so I didn&#039;t notice them until the other day.  &gt;_&lt;</description>
		<content:encoded><![CDATA[<p>William:  Absolutely.  I intend to spike a continuous-autobuild feature to ibb that will compile and run tests every time you save.  I think it will be amazing.</p>
<p>Brandon: You&#8217;re absolutely right.  I will discuss that in my next post.</p>
<p>Mike Samuel: Thanks for the heads up!  I will take a deeper look at prebake.</p>
<p>Sorry, everyone, about the comments.  I wasn&#8217;t getting moderation e-mails so I didn&#8217;t notice them until the other day.  &gt;_&lt;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Samuel</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-26206</link>
		<dc:creator>Mike Samuel</dc:creator>
		<pubDate>Thu, 06 May 2010 15:51:09 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-26206</guid>
		<description>Prebake is a build system that does this as well.  It&#039;s alpha software.

From http://code.google.com/p/prebake/wiki/Goals

Build Time is Independent of Source Repository Size

When a build system has to stat every file in the repo, build is necessarily O(&#124;project&#124;). By hooking into the filesystem to get updates on files that change, prebake avoids this O(&#124;project&#124;) cost; if you change one file, your build time depends only on the number of files that depend on that file and the time required to rebuild them.</description>
		<content:encoded><![CDATA[<p>Prebake is a build system that does this as well.  It&#8217;s alpha software.</p>
<p>From <a href="http://code.google.com/p/prebake/wiki/Goals" rel="nofollow">http://code.google.com/p/prebake/wiki/Goals</a></p>
<p>Build Time is Independent of Source Repository Size</p>
<p>When a build system has to stat every file in the repo, build is necessarily O(|project|). By hooking into the filesystem to get updates on files that change, prebake avoids this O(|project|) cost; if you change one file, your build time depends only on the number of files that depend on that file and the time required to rebuild them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-21553</link>
		<dc:creator>Eric</dc:creator>
		<pubDate>Tue, 09 Mar 2010 02:45:37 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-21553</guid>
		<description>Welcome to clearcase.</description>
		<content:encoded><![CDATA[<p>Welcome to clearcase.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brandon Ehle</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-21408</link>
		<dc:creator>Brandon Ehle</dc:creator>
		<pubDate>Sat, 06 Mar 2010 19:49:20 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-21408</guid>
		<description>The ReadDirectoryChangesW / libfam is not nearly as &quot;free&quot; as it seems.  If you do lots of global operations on the directory tree, you will pay for the dependency checking overhead each and every time you modify files instead of just when you run git status, svn up, make, scons, etc...  Depending on what your workflow is, this could still be a significant win, but it could also be a huge loss.

When svn and make are watching the same directory for update notifications, anytime svn needs to update a large portion of the tree, make is going to be slowing down subversion to handle the incoming file / directory changed requests.  The same operation will happen when a compile is writing intermediate files all over the tree (unless you compile to an out of tree location).  Thus, on the flip side, you are trading an O(1) cost to update a file for an O(Tools) cost, because every tool watching the tree has to handle each update.

Another downside is that this only works when the daemon is able to stay running from build to build.  If you need to reboot between tests, have security policies that log you out of the machine, or are tight on RAM and need to kill the listener daemon, you will quickly be back to the original slow build times.

There are other ways to solve this that are not quite O(1) on the no-op build and VCS status side and also do not have O(Tools) running in the background consuming memory and CPU with each update notification.  One example would be for file modification times to trickle up the directory chain so that individual file stat operations could be culled.</description>
		<content:encoded><![CDATA[<p>The ReadDirectoryChangesW / libfam is not nearly as &#8220;free&#8221; as it seems.  If you do lots of global operations on the directory tree, you will pay for the dependency checking overhead each and every time you modify files instead of just when you run git status, svn up, make, scons, etc&#8230;  Depending on what your workflow is, this could still be a significant win, but it could also be a huge loss.</p>
<p>When svn and make are watching the same directory for update notifications, anytime svn needs to update a large portion of the tree, make is going to be slowing down subversion to handle the incoming file / directory changed requests.  The same operation will happen when a compile is writing intermediate files all over the tree (unless you compile to an out of tree location).  Thus, on the flip side, you are trading an O(1) cost to update a file for an O(Tools) cost, because every tool watching the tree has to handle each update.</p>
<p>Another downside is that this only works when the daemon is able to stay running from build to build.  If you need to reboot between tests, have security policies that log you out of the machine, or are tight on RAM and need to kill the listener daemon, you will quickly be back to the original slow build times.</p>
<p>There are other ways to solve this that are not quite O(1) on the no-op build and VCS status side and also do not have O(Tools) running in the background consuming memory and CPU with each update notification.  One example would be for file modification times to trickle up the directory chain so that individual file stat operations could be culled.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Evan Jones</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-21405</link>
		<dc:creator>Evan Jones</dc:creator>
		<pubDate>Sat, 06 Mar 2010 19:04:50 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-21405</guid>
		<description>I think we are due for a better build system. This seems like a great proof of concept. Keeping the compiler around between builds also makes sense (see the GCC compile server discussion, which never went anywhere as far as I can tell: http://tromey.com/blog/?p=407)

Build system wish list:
* Higher level than Make (I want to say &quot;build this library&quot; or &quot;build this executable&quot; and have it do dependency analysis for me).

* Some story for dependencies. I want to be able to import third-party libraries into my project, so I can tell people something like &quot;check out this single tree and hit make.&quot; At the same time, I want to be able to easily send my patches &quot;upstream&quot; to the original project. This requires some build system integration to handle the dependency analysis and rebuilds correctly. I&#039;m imagining maven, but &quot;better&quot;?

The problem, as I see it, is that there is little commercial value in build systems, so there is little incentive for others to really do a good job of this. Everyone hacks their own system to satisfy their own needs, then moves on.

Evan</description>
		<content:encoded><![CDATA[<p>I think we are due for a better build system. This seems like a great proof of concept. Keeping the compiler around between builds also makes sense (see the GCC compile server discussion, which never went anywhere as far as I can tell: <a href="http://tromey.com/blog/?p=407" rel="nofollow">http://tromey.com/blog/?p=407</a>)</p>
<p>Build system wish list:<br />
* Higher level than Make (I want to say &#8220;build this library&#8221; or &#8220;build this executable&#8221; and have it do dependency analysis for me).</p>
<p>* Some story for dependencies. I want to be able to import third-party libraries into my project, so I can tell people something like &#8220;check out this single tree and hit make.&#8221; At the same time, I want to be able to easily send my patches &#8220;upstream&#8221; to the original project. This requires some build system integration to handle the dependency analysis and rebuilds correctly. I&#8217;m imagining maven, but &#8220;better&#8221;?</p>
<p>The problem, as I see it, is that there is little commercial value in build systems, so there is little incentive for others to really do a good job of this. Everyone hacks their own system to satisfy their own needs, then moves on.</p>
<p>Evan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-21348</link>
		<dc:creator>Mark</dc:creator>
		<pubDate>Sat, 06 Mar 2010 08:49:14 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-21348</guid>
		<description>Mozilla probably still uses the recursive makefile build system.

That&#039;s the problem. 

It&#039;s just as bad as calculating Fibonacci numbers using the naive recursive algorithm.</description>
		<content:encoded><![CDATA[<p>Mozilla probably still uses the recursive makefile build system.</p>
<p>That&#8217;s the problem. </p>
<p>It&#8217;s just as bad as calculating Fibonacci numbers using the naive recursive algorithm.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: William Pietri</title>
		<link>http://chadaustin.me/2010/03/your-version-control-and-build-systems-dont-scale-introducing-ibb/comment-page-1/#comment-21268</link>
		<dc:creator>William Pietri</dc:creator>
		<pubDate>Fri, 05 Mar 2010 16:25:21 +0000</pubDate>
		<guid isPermaLink="false">http://chadaustin.me/?p=1508#comment-21268</guid>
		<description>Yes, this makes a lot of sense. I am entirely in favor of it.

Can it be taken farther? Eclipse auto-builds on save, and JUnit Max auto-runs the tests on save. But in a few years we&#039;ll have twice as many cores at our disposal, and then twice again. What can we do to take advantage of that?

Instead of every save, can we kick off the chain every time the code is plausibly compilable? Can we speculate on what the next few edits will be and have the build running before the coding is done? Can we go beyond automated unit testing to automated load testing, search for refactoring opportunities,  impact analysis, profiling, or even UI evaluation?

Make was invented in 1977. Build tools created today will be in their prime when make hits 40. The amount of computing power it is economical to give a developer is absurdly greater now than it was then. Your analysis here makes me think that ibb is just the start of what we could get if we take a fresh look at our common tools.</description>
		<content:encoded><![CDATA[<p>Yes, this makes a lot of sense. I am entirely in favor of it.</p>
<p>Can it be taken farther? Eclipse auto-builds on save, and JUnit Max auto-runs the tests on save. But in a few years we&#8217;ll have twice as many cores at our disposal, and then twice again. What can we do to take advantage of that?</p>
<p>Instead of every save, can we kick off the chain every time the code is plausibly compilable? Can we speculate on what the next few edits will be and have the build running before the coding is done? Can we go beyond automated unit testing to automated load testing, search for refactoring opportunities,  impact analysis, profiling, or even UI evaluation?</p>
<p>Make was invented in 1977. Build tools created today will be in their prime when make hits 40. The amount of computing power it is economical to give a developer is absurdly greater now than it was then. Your analysis here makes me think that ibb is just the start of what we could get if we take a fresh look at our common tools.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

