JavaScript, Emscripten, and the Atom D2700
Lately I've been doing some work with Emscripten. As predicted, the quality of Emscripten's generated code is improving and JITs are learning to understand its generated code. I have high hopes for asm.js, a formalization of high-performance, low-level JavaScript. I now believe it's conceivable that Emscripten could approach the same level of performance as PNaCl, though whether that happens remains to be seen.
However, having a rough understanding of how today's JavaScript JITs work, I've always wondered whether Emscripten-generated code would be especially penalized relative to native code on an in-order core like Intel Atom. Having recently built an Intel Atom home server, I figured I'd update my recent Emscripten skinning benchmark results and find out.
First I'll describe the hardware. The CPU is an Atom D2700 on the Intel D2700DC board, with 1066 MHz DDR3 memory. It has two cores, hyperthreaded, and runs Ubuntu 12.04 Server. The Firefox and Chromium packages are stock. Node.js and clang 3.1 are x64 Linux binaries downloaded from their respective websites. Emscripten is commit 26250471b46a68204711f037f33790bfb4ba37c7 in the master branch.
Now the results. Remember there are three JavaScript implementations: hand-written JS with untyped arrays and objects ("untyped"), hand-written JS with typed arrays ("typed arrays"), and Emscripten-compiled C++ ("scalar"). Emscripten's compiler was invoked with -O1; I saw significant performance drop-offs with -O2 and -O3.
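To make the difference between the "untyped" and "typed arrays" variants concrete, here is a minimal sketch of the two data layouts. It is not the benchmark's skinning code: the function names and the simple translate operation are hypothetical stand-ins chosen only to show how the same per-vertex loop looks over plain objects versus a flat Float32Array.

```javascript
// Hypothetical illustration only; not the benchmark's actual skinning code.

// "untyped": each vertex is a plain object stored in an ordinary Array.
function translateUntyped(vertices, dx, dy, dz) {
  const out = [];
  for (let i = 0; i < vertices.length; i++) {
    const v = vertices[i];
    out.push({ x: v.x + dx, y: v.y + dy, z: v.z + dz });
  }
  return out;
}

// "typed arrays": vertices are packed as x, y, z triples in a Float32Array.
// The caller supplies an output array of the same length.
function translateTyped(input, output, dx, dy, dz) {
  for (let i = 0; i < input.length; i += 3) {
    output[i] = input[i] + dx;
    output[i + 1] = input[i + 1] + dy;
    output[i + 2] = input[i + 2] + dz;
  }
  return output;
}
```

The typed version avoids per-vertex object allocation and gives the JIT a known element type, which is the usual argument for it. Whether that actually wins depends on the engine.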
Language | Compiler | Variant | Vertex Rate | Slowdown |
---|---|---|---|---|
C++ | gcc 4.6.3 -O3 | SSE | 24040000 | 1 |
C++ | clang 3.1 -O3 | SSE | 22530000 | 1.07 |
C++ | gcc 4.6.3 -O3 | scalar | 18730000 | 1.28 |
C++ | clang 3.1 -O3 | scalar | 13040000 | 1.84 |
JavaScript | Chromium 20.0 | untyped | 3150000 | 7.63 |
JavaScript | Firefox 17 | typed arrays | 2437562 | 9.86 |
JavaScript | Firefox 17 | untyped | 1084577 | 22.2 |
Emscripten | Firefox 17 | scalar | 944333 | 25.5 |
JavaScript | Chromium 20.0 | typed arrays | 807577 | 29.8 |
Emscripten | node 0.8.14 | scalar | 679802 | 35.4 |
Emscripten | Chromium 20.0 | scalar | 677966 | 35.5 |
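The Slowdown column is consistent with dividing the fastest vertex rate (gcc with SSE) by each row's rate and rounding to three significant figures. A small sketch of that arithmetic, where the `slowdown` helper is a name introduced here for illustration:

```javascript
// Fastest result in the table: C++, gcc 4.6.3 -O3, SSE.
const BASELINE_RATE = 24040000;

// Slowdown relative to the fastest result (hypothetical helper name).
function slowdown(vertexRate) {
  return BASELINE_RATE / vertexRate;
}

// A few rows from the table, formatted to three significant figures.
console.log(slowdown(22530000).toPrecision(3)); // clang SSE
console.log(slowdown(3150000).toPrecision(3));  // Chromium, untyped
console.log(slowdown(944333).toPrecision(3));   // Emscripten, Firefox
console.log(slowdown(677966).toPrecision(3));   // Emscripten, Chromium
```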
Based on the previous benchmark results and my recent experience with Emscripten, it appears that JITted JavaScript code does indeed carry a penalty relative to native code on in-order cores, or at least on the Atom D2700.
Next time I hope to update these benchmarks on a high-end desktop CPU.
As always, if you'd like to reproduce these results or question them, the code is available on my GitHub.
The optimized_emscripten.sh file has outdated optimization settings for Emscripten. It manually specifies some settings that used to make sense but today would be done better, and automatically, by emcc. In particular, the JS optimizer is not run at all.
Also, that uses typed arrays 1 and not 2. Only 2 allows full LLVM optimizations.
Oops, I'll kill optimized_emscripten.sh. I was using emcc -O1 for the benchmarks. -O2 was half the speed. I'll automate the build and test scripts shortly so you can reproduce my results.
-O2 was half the speed of -O1? That is really bizarre. Will investigate when you upload those scripts.
Can web browsers handle multiple platforms like asm.js and PNaCl?
They encourage diversity of code while also ensuring there's no fragmentation on the web. They are fast!
They are both the founding technologies of the next generation of the internet.
But how many of these sandboxed universal platforms can a browser handle without causing slowdowns?