Lately I've been doing some work with Emscripten. As predicted, the quality of Emscripten's generated code is improving and JITs are learning to understand its generated code. I have high hopes for asm.js, a formalization of high-performance, low-level JavaScript. I now believe it's conceivable that Emscripten could approach the same level of performance as PNaCl, though whether that happens remains to be seen.

However, having a rough understanding of how today's JavaScript JITs work, I've always wondered whether Emscripten-generated code would be especially penalized relative to native code on an in-order core like Intel Atom. Having recently built an Intel Atom home server, I figured I'd update my recent Emscripten skinning benchmark results and find out.

First I'll describe the hardware. The CPU is an Atom D2700 (two hyperthreaded cores) on the Intel D2700DC board, with 1066 MHz DDR3 memory. The machine runs Ubuntu 12.04 Server. The Firefox and Chromium packages are stock. Node.js and clang 3.1 are x64 Linux binaries downloaded from their respective websites. Emscripten is commit 26250471b46a68204711f037f33790bfb4ba37c7 in the master branch.

Now the results. Remember there are three JavaScript implementations: hand-written JS with untyped arrays and objects ("untyped"), hand-written JS with typed arrays ("typed arrays"), and Emscripten-compiled C++ ("scalar"). Emscripten's compiler was invoked with -O1, because I saw significant performance drop-offs with -O2 and -O3.

Language    Compiler        Variant       Vertex Rate  Slowdown
C++         gcc 4.6.3 -O3   SSE              24040000  1
C++         clang 3.1 -O3   SSE              22530000  1.07
C++         gcc 4.6.3 -O3   scalar           18730000  1.28
C++         clang 3.1 -O3   scalar           13040000  1.84
JavaScript  Chromium 20.0   untyped           3150000  7.63
JavaScript  Firefox 17      typed arrays      2437562  9.86
JavaScript  Firefox 17      untyped           1084577  22.2
Emscripten  Firefox 17      scalar             944333  25.5
JavaScript  Chromium 20.0   typed arrays       807577  29.8
Emscripten  node 0.8.14     scalar             679802  35.4
Emscripten  Chromium 20.0   scalar             677966  35.5
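
The Slowdown column is consistent with dividing the fastest vertex rate (gcc with SSE) by each row's vertex rate. Here is a minimal Python sketch of that arithmetic, assuming that is how the column was derived. The names `RESULTS` and `slowdowns` are mine, for illustration only, and are not part of the benchmark code.

```python
# Vertex rates copied from the table above: (language, compiler, variant, rate).
RESULTS = [
    ("C++", "gcc 4.6.3 -O3", "SSE", 24040000),
    ("C++", "clang 3.1 -O3", "SSE", 22530000),
    ("C++", "gcc 4.6.3 -O3", "scalar", 18730000),
    ("C++", "clang 3.1 -O3", "scalar", 13040000),
    ("JavaScript", "Chromium 20.0", "untyped", 3150000),
    ("JavaScript", "Firefox 17", "typed arrays", 2437562),
    ("JavaScript", "Firefox 17", "untyped", 1084577),
    ("Emscripten", "Firefox 17", "scalar", 944333),
    ("JavaScript", "Chromium 20.0", "typed arrays", 807577),
    ("Emscripten", "node 0.8.14", "scalar", 679802),
    ("Emscripten", "Chromium 20.0", "scalar", 677966),
]


def slowdowns(results):
    """Return (language, compiler, variant, slowdown) relative to the fastest row.

    Assumption: slowdown = fastest vertex rate / this row's vertex rate.
    """
    fastest = max(rate for _, _, _, rate in results)
    return [
        (language, compiler, variant, fastest / rate)
        for language, compiler, variant, rate in results
    ]


if __name__ == "__main__":
    for language, compiler, variant, factor in slowdowns(RESULTS):
        print(f"{language:<11} {compiler:<14} {variant:<13} {factor:6.2f}")
```

Changing the baseline is a one-line edit: replace `fastest` with the rate of whichever row you want to compare against, for example clang's scalar build when judging the Emscripten rows.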

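Based on the previous benchmark results and my recent experience with Emscripten, it appears that JITted JavaScript code does indeed pay a penalty relative to native code on in-order cores, or at least on the Atom D2700. Here the fastest JavaScript result is 7.63 times slower than the fastest native build, and the Emscripten builds are between 25.5 and 35.5 times slower.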

Next time I hope to update these benchmarks on a high-end desktop CPU.

As always, if you'd like to reproduce these results or question them, the code is available on my GitHub.