UPDATE. After I posted these numbers, Alon Zakai, Emscripten's author, pointed out options for generating optimized JavaScript. I reran my benchmarks; check out the updated table below and the script used to generate the new results.

At the beginning of the year, I tried to justify my claim that JavaScript has a long way to go before it can compete with the performance of native code.

Well, 10 months have passed. WebGL is catching on, Native Client has been launched, Unreal Engine 3 targets Flash 11, and Crytek has announced they might target Flash 11 too. Exciting times!

On the GPU front, we're in a good place. With WebGL, iOS, and Flash 11 all roughly exposing shader model 2.0, it's not a ton of work to target all of the above. Even on the desktop you can't assume higher than shader model 2.0: the Intel GMA 950 is still at the top.

However, shader model 2.0 isn't general enough to offload all of your compute-intensive workloads to the GPU. With 16 vertex attributes and no vertex texture fetch, you simply can't get enough data into your vertex shaders to do everything you need, e.g. blending morph targets.

Thus, for the foreseeable future, we'll need to write fast CPU code that can run on the web, mobile devices, and the desktop. Today, that means at least JavaScript and a native language like C++. And, because Microsoft has not implemented WebGL, the Firefox and Chrome WebGL blacklists are so strict, and no major browsers fall back on software, you probably care about targeting Flash 11 too. (It does have a software fallback!) If you care about Flash 11, then your code had better target ActionScript 3 / AVM2 too.

How can we target native platforms, the web, and Flash at the same time?

Native platforms are easy: C++ is well-supported on Windows, Mac, iOS, and Android. SSE2 is ubiquitous on x86, ARM NEON is widely available, and both have high-quality intrinsics-based implementations.

As for Flash... I'm just counting on Adobe Alchemy to ship.

On the web, you have two choices: write your code in C++ and cross-compile it to JavaScript with Emscripten, or write it in JavaScript and run it directly on the browser's JavaScript engine. Ideally, C++ cross-compiled to JS via Emscripten would be as fast as hand-written JavaScript. If it is, then targeting all platforms is easy: just write C++, and the browsers will run it as well as they would run hand-written JavaScript.

Over the last two evenings, while weathering a dust storm, I set about updating my skeletal animation benchmark results: for math-heavy code, how does JavaScript compare to C++ today? And how does Emscripten compare to hand-written JavaScript?

If you'd like, take a look at the raw results.

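| Language   | Compiler    | Variant      | Vertex Rate | Slowdown |
|------------|-------------|--------------|-------------|----------|
| C++        | clang 2.9   | SSE          | 101580000   | 1        |
| C++        | gcc 4.2     | SSE          | 96420454    | 1.05     |
| C++        | gcc 4.2     | scalar       | 63355501    | 1.6      |
| C++        | clang 2.9   | scalar       | 62928175    | 1.61     |
| JavaScript | Chrome 15   | untyped      | 10210000    | 9.95     |
| JavaScript | Firefox 7   | typed arrays | 8401598     | 12.1     |
| JavaScript | Chrome 15   | typed arrays | 5790000     | 17.5     |
| Emscripten | Chrome 15   | scalar       | 5184815     | 19.6     |
| JavaScript | Firefox 7   | untyped      | 5104895     | 19.9     |
| JavaScript | Firefox 9a2 | untyped      | 2005988     | 50.6     |
| JavaScript | Firefox 9a2 | typed arrays | 1932271     | 52.6     |
| Emscripten | Firefox 9a2 | scalar       | 734126      | 138      |
| Emscripten | Firefox 7   | scalar       | 729270      | 139      |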
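
For anyone checking my arithmetic: the Slowdown column appears to be the fastest vertex rate divided by each row's vertex rate, rounded to three significant figures. Here is a minimal TypeScript sketch of that calculation using a few rows from the table. The `Result` type and the `slowdown` helper are names made up for this illustration; they are not part of the benchmark script.

```typescript
// Hypothetical record type for one table row (illustration only).
interface Result {
  language: string;
  compiler: string;
  variant: string;
  vertexRate: number; // "Vertex Rate" column, as reported in the table
}

// A few rows copied from the results table.
const results: Result[] = [
  { language: "C++", compiler: "clang 2.9", variant: "SSE", vertexRate: 101580000 },
  { language: "JavaScript", compiler: "Chrome 15", variant: "untyped", vertexRate: 10210000 },
  { language: "Emscripten", compiler: "Chrome 15", variant: "scalar", vertexRate: 5184815 },
  { language: "Emscripten", compiler: "Firefox 7", variant: "scalar", vertexRate: 729270 },
];

// Ratio of a baseline rate to another rate, rounded to 3 significant figures.
function slowdown(baselineRate: number, rate: number): number {
  return Number((baselineRate / rate).toPrecision(3));
}

// The baseline is the fastest entry (C++ / clang 2.9 / SSE).
const baseline = Math.max(...results.map((r) => r.vertexRate));

for (const r of results) {
  console.log(`${r.language} / ${r.compiler} / ${r.variant}: ${slowdown(baseline, r.vertexRate)}x`);
}
```

The same helper can compare any two rows directly, for example `slowdown(10210000, 5184815)` for hand-written JavaScript versus Emscripten on Chrome 15.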

Conclusions?

  • JavaScript is still a factor of 10-20 away from well-written native code. Adding SIMD support to JavaScript will help, but obviously that's not the whole story...
  • It's bizarre that Chrome and Firefox disagree on whether typed arrays are faster than untyped arrays.
  • Firefox 9 clearly has performance issues that need to be worked out. (I included it because I wanted to benchmark its type inference capabilities.)
  • (My conclusion before the update, superseded by the next two points:) Emscripten... ouch :( I wish it were even comparable to hand-written JavaScript, but it's another factor of 10-20 slower...
  • Emscripten on Chrome 15 is within a factor of two of hand-written JavaScript. I think that means you can target all platforms with C++, because hand-written JavaScript won't be that much faster than cross-compiled C++.
  • Emscripten on Firefox 7 and 9 still has issues, but Alon Zakai informs me that the trunk version of SpiderMonkey is much faster.
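
To make the "typed arrays" versus "untyped" distinction concrete, here is a rough TypeScript sketch of the same inner loop run against both kinds of storage. It is not the benchmark's code: `transformPositions`, the `NumberBuffer` interface, and the 3x4 row-major matrix layout are all assumptions made for the illustration. The only difference between the two variants is whether the data lives in plain arrays or in `Float32Array`s.

```typescript
// Hypothetical writable buffer type that both number[] and Float32Array satisfy.
interface NumberBuffer {
  length: number;
  [index: number]: number;
}

// Apply one 3x4 matrix (row-major, 12 numbers) to packed xyz positions.
// Illustrative kernel only; the real benchmark does more work per vertex.
function transformPositions(bone: NumberBuffer, src: NumberBuffer, dst: NumberBuffer): void {
  for (let i = 0; i + 2 < src.length; i += 3) {
    const x = src[i], y = src[i + 1], z = src[i + 2];
    dst[i]     = bone[0] * x + bone[1] * y + bone[2]  * z + bone[3];
    dst[i + 1] = bone[4] * x + bone[5] * y + bone[6]  * z + bone[7];
    dst[i + 2] = bone[8] * x + bone[9] * y + bone[10] * z + bone[11];
  }
}

// A pure translation by (10, 20, 30), and two sample positions.
const translate = [1, 0, 0, 10, 0, 1, 0, 20, 0, 0, 1, 30];
const positions = [1, 2, 3, 4, 5, 6];

// "untyped" variant: ordinary JavaScript arrays of doubles.
const untypedDst: number[] = new Array<number>(positions.length).fill(0);
transformPositions(translate, positions, untypedDst);

// "typed arrays" variant: the same data stored as 32-bit floats.
const typedDst = new Float32Array(positions.length);
transformPositions(new Float32Array(translate), new Float32Array(positions), typedDst);

console.log(untypedDst, Array.from(typedDst));
```

Note that `Float32Array` stores single-precision values, so for inputs that aren't exactly representable the two variants can differ in the low bits. Which variant is faster depends on the engine, which is exactly the disagreement in the table.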

In the future, I'd love to run the same test on Flash 11 / Alchemy and Native Client, but the former hasn't shipped and the latter remains a small market.

One final note: it's very possible my test methodology is screwed up, my benchmarks are wrong, or I suck at copy/pasting numbers. Science should be reproducible: please try to reproduce these results yourself!