Benchmarking JSON Parsing: Emscripten vs. Native
This post concludes my obsession with JSON parsing. In fact, the entire reason I wrote a JSON parser was to show these graphs. I wanted to see whether I could write a JSON parser that is faster than any other when run in Emscripten. Because vjson is typically faster, I did not succeed, unless I redefine my goal as writing the fastest-in-Emscripten JSON parser with a useful parse tree.
This benchmark's code is on GitHub. I encourage you to reproduce my results and search for flaws.
All benchmarks were run on a 2010 MacBook Pro, 2.53 GHz Core i5, 8 GB 1067 MHz DDR3.
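For anyone reproducing these numbers, here is a minimal sketch of how a parse rate could be measured. It is not the harness from the GitHub repository: it is written in Python, uses the standard library's json.loads as a stand-in parser, and the best-of-N timing, the function name parse_rate_mb_per_s, and the sample document are all assumptions made for illustration.

```python
import json
import time


def parse_rate_mb_per_s(parse, data: bytes, iterations: int = 10) -> float:
    """Return the best observed parse rate in MB/s over several runs.

    `parse` is any callable that takes the raw document bytes.
    Taking the fastest run is an assumption of this sketch; it reduces
    noise from warm-up and other processes.
    """
    best = float("inf")
    for _ in range(iterations):
        start = time.perf_counter()
        parse(data)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading from a coarse timer on tiny inputs.
    best = max(best, 1e-9)
    return len(data) / (1024 * 1024) / best


if __name__ == "__main__":
    # Hypothetical stand-in document; the real benchmark parses JSON files.
    sample = json.dumps(
        [{"id": i, "name": "item"} for i in range(10000)]
    ).encode()
    rate = parse_rate_mb_per_s(json.loads, sample)
    print(f"{rate:.1f} MB/s")
```

In a browser, the same loop would need enough warm-up iterations for the JIT to settle before any timing is trusted.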
Native vs. Emscripten
First, let's compare native JSON parsing performance (clang 3.1, -O2) with both stable and nightly versions of Firefox and Chrome.
Two things are immediately clear from this graph. First, native code is still 5-10x faster than Emscripten/JavaScript. Second, yajl and jansson are dramatically slower than rapidjson, sajson, and vjson. Native yajl and jansson are even slower than Emscripten'd sajson. Henceforth I will not include them.
A close look at the browser results makes a third conclusion clear: Firefox runs Emscripten'd code much faster than Chrome.
Finally, sajson consistently performs better than rapidjson in my Emscripten tests. And vjson always parses the fastest. I believe this is because Emscripten and browser JITs punish larger code sizes.
The previous graph only shows parse rates by browser and parser for a single file. Next let's look at parse rates by file.
Yep, native code consistently wins. At this point I want to dig into differences between the browsers, so I will show the same graph but without native code.
Firefox vs. Chrome
Not only is Firefox consistently faster than Chrome, it's faster by a factor of 2-5!
Finally, here is the same graph but normalized against Firefox 18.
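Normalizing here just means dividing every parse rate by the Firefox 18 rate for the same parser and file, so Firefox 18 sits at 1.0 and everything else becomes a fraction of it. A minimal sketch of that arithmetic follows; the function name normalize, the "Chrome (stable)" label, and the numbers are placeholders invented for illustration, not measurements from this benchmark.

```python
from typing import Dict


def normalize(rates: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Divide every rate by the baseline's rate, so the baseline becomes 1.0."""
    base = rates[baseline]
    if base <= 0:
        raise ValueError("baseline rate must be positive")
    return {name: rate / base for name, rate in rates.items()}


if __name__ == "__main__":
    # Placeholder parse rates in MB/s; NOT results from the post.
    rates = {"Firefox 18": 40.0, "Chrome (stable)": 10.0}
    for browser, relative in normalize(rates, "Firefox 18").items():
        print(f"{browser}: {relative:.2f}")
```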
If I were a Firefox JS developer, I'd be feeling pretty proud right now, assuming this experiment holds up to scrutiny. Even so, these results match my experience with Bullet/Emscripten in Chrome: Chrome takes a very long time to stabilize its JIT, and I've even seen it fail to stabilize, constantly deopting and then reoptimizing code. In contrast, Firefox may take longer to JIT up front, but performance is smooth afterwards.
Further work is necessary to test the hypothesis that code size is the biggest predictor of Emscripten performance.
Preemptive answers to predicted questions follow:
Well, duh. You shouldn't use an Emscripten'd JSON parser. You should use the browser's built-in JSON.parse function.
This isn't about parsing JSON. This is about seeing whether the web can compete with native code under common workloads. After all, Mozillians have been claiming JavaScript is or will be fast enough to replace native apps. If parsing JSON through a blessed browser API is the only way to do it quickly, then developers are disempowered. What if a new format comes along? Must developers wait on the browser vendors? The web has to be better than that.
Shouldn't you have compiled the Emscripten code with -fno-exceptions?
Yes. Oops. That said, after generating the graphs, I did recompile with -fno-exceptions and it made no difference to the benchmark results in either Chrome or Firefox.
I wouldn't say Firefox is consistently faster than Chrome on Emscripten benchmarks. Chrome does better on some of them. But there are several pain points for Chrome, like
http://code.google.com/p/v8/issues/detail?id=2223
Btw, for comparing native to JS, might be interesting to compare asm.js. I gave a talk about it,
http://kripken.github.com/mlocemscriptentalk/#/
the numbers there indicate that it could get JS from the 5-10x shown here (similar to the other real-world benchmarks in my graphs there) to the 2x range.
I may have time to run asm.js benchmarks when it lands in Aurora. Until then, maybe you could add it to the asm.js benchmarks that you're using to measure asm.js progress?