Logic vs. Array Processing
I've always been amused by the Java vs. C++ performance arguments:
- "Java's faster than C++!"
- "No it's not!"
- "Yeah it is, look at this benchmark!"
- "Well look how much longer the Java version of program takes to start!"
Back and forth and back and forth. The fact is, they're both right, and here's why. I mentally separate code into one of two categories, logic or array processing:
- 3D rasterization is obviously array processing.
- Video playback is also array processing.
- Calculating your tax refund is logic.
- Loading a PDF is definitely logic.
Often the line is blurry, but array processing involves running a relatively small set of rules over a lot of homogeneous data. Computers are very, very good at this kind of computation, and specialized hardware such as a GPU can increase performance by orders of magnitude. Ignoring memory bandwidth, a desktop CPU can multiply billions of floating-point numbers per second, and a fast GPU can multiply trillions.
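To make that concrete, here's roughly what I mean by an array-processing kernel. This is a made-up C++ example (the function and names are mine, not from any real codebase): one tiny rule applied uniformly to a big block of homogeneous data.

```cpp
#include <cstddef>
#include <vector>

// One small rule (scale and accumulate) applied uniformly to a large block of
// homogeneous data: no branches in the hot path, sequential memory access,
// and a loop shape the compiler can auto-vectorize without help.
void scale_and_accumulate(const float* in, float* out, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] += k * in[i];
    }
}

int main() {
    std::vector<float> input(1'000'000, 1.5f);
    std::vector<float> output(1'000'000, 0.0f);
    scale_and_accumulate(input.data(), output.data(), input.size(), 2.0f);
}
```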
At the other extreme, logic code tends to be full of branches, function calls, dependent memory accesses, and often it executes code that hasn't been run in minutes. Just think about the set of operations that happen when you open a file in Word. Computers aren't so good at these types of operations, and as Moore's Law continues, they tend not to improve as rapidly as array computation does.
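Contrast that with a sliver of hypothetical logic code, again in C++ with invented names. Every step branches, makes a virtual call, and chases a pointer before it knows what to do next, so there's no long run of uniform work for the CPU or the compiler's vectorizer to exploit:

```cpp
#include <iostream>
#include <memory>
#include <string>

// "Logic" code in miniature: branches, virtual dispatch, and dependent
// pointer chasing, where each step decides what happens next.
struct Handler {
    virtual ~Handler() = default;
    virtual bool matches(const std::string& key) const = 0;
    std::unique_ptr<Handler> next;  // dependent load: the next step waits on this one
};

struct NamedHandler : Handler {
    explicit NamedHandler(std::string n) : name(std::move(n)) {}
    bool matches(const std::string& key) const override { return key == name; }
    std::string name;
};

const Handler* find(const Handler* h, const std::string& key) {
    while (h) {
        if (h->matches(key)) {   // data-dependent branch + virtual call
            return h;
        }
        h = h->next.get();       // pointer chase through scattered allocations
    }
    return nullptr;
}

int main() {
    auto chain = std::make_unique<NamedHandler>("open");
    chain->next = std::make_unique<NamedHandler>("save");
    std::cout << (find(chain.get(), "save") ? "found\n" : "missing\n");
}
```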
Back to Java vs. C++. The synthetic benchmarks that compare Java and C++ performance tend to be tight loops, simply because accurate measurement requires them. This gives the JVM time to prime its JIT/prediction engines/what have you, so I'd expect a good result. Heck, I'd expect a good result from the modern JavaScript tracing engines.*
The lesson here is that, for array processing, it takes very little work to make full use of the hardware at hand. Because the amount of code is limited (and the amount of data is large), time spent optimizing has high leverage.
On the other hand, logic code is messy and spread out, often written by entire teams of people. Its performance is dominated by your programming language and the team's vocabulary of idioms. Truly optimizing this kind of code is hard, sometimes impossible. When it can be done, you often have to retrain your team to make sure the benefits stick.
This is one reason the choice of programming language(s) and libraries has such a big effect on the responsiveness of a desktop application, and why people can "feel" the programming language in which a project was written. Typical desktop application usage patterns are dominated by random, temporally sparse actions, so code size, "directness", and working set are primary performance factors. (Anecdote: Andy's rewriting the IMVU client's windowing framework so it's a bajillion times simpler, and when he had the client running again, he exclaimed "Hey, resizing the 3D window is twice as responsive!")
Perhaps there's an argument here for the creation of more project-specific programming languages (GOAL, TreeHydra, DSLs), so that performance improvements can be applied universally across the codebase.
With disk and memory speeds improving so much more slowly than CPU speeds, the difference between a snappy desktop application and a sluggish one is a handful of page faults. When choosing a technology platform for a project, it's worth considering the impact on overall responsiveness down the road. And I'm pretty sure I just recommended writing your entire application in C++, which sounds insane, even to me. I'll leave it at that.
* By the way, I'm not picking on Java or promoting C++ in particular. You could make these same arguments between any "native" language and "managed" language. The blocking and tackling of loading applications, calling functions, and keeping memory footprint low are important.