In Defense of Language Democracy (Or: Why the Browser Needs a Virtual Machine)

Years ago, Mark Hammond did a bunch of work to get Python running inside Mozilla’s script tags. Parts of Mozilla are ostensibly designed to be language-independent, even. Unfortunately, even if Mozilla had succeeded at shipping multiple language implementations, it’s unlikely other browser vendors would have followed suit. It’s just not logistically feasible to have all browsers gate and care for the set of interesting languages on the client.

I can hear you asking “Why do I care about Python in the browser? Or C++? Or OCaml? JavaScript is a great language.” I agree! JavaScript is a great language. Given the extremely short timeframe and immense political pressure, I’m thrilled we ended up with something as capable as JavaScript.

Nonetheless, fair competition benefits everyone. Take a look at what’s happened in the web server space in the last few years: Ruby on Rails. Django. Node.js. nginx. Tornado. Twisted. AppEngine. MochiWeb. HipHop-PHP. ASP.NET MVC. A proliferation of interesting datastores: memcache, redis, riak, etc. That’s an incredible amount of innovation in a short period of time.

Now let’s go through the same exercise, but on the client. jQuery, YUI, fast JavaScript JITs, CSS3, CoffeeScript, proliferation of standards-compliant browsers, some amount of HTML5… Maybe ubiquitous usage of Flash video? These advancements are significant, but it’s clear the front-end stack is changing much more slowly than the back-end.

Why is the back-end evolving faster than the front-end?

When building an application backend, even atop a virtualized hosting provider such as EC2, you are given approximately raw access to a machine: x86 instruction set, sockets, virtual memory, operating system APIs, and all. Any software that runs on that machine competes at the same level. You can use Python or Ruby or C++ or some combination thereof. If Redis wants to innovate with new memory management schemes, nothing is stopping it. This ecosystem democratized – nay, meritocratized – innovation.

On the front-end, the problem boils down to this: JavaScript is built atop the underlying hardware but does not expose its capabilities, so browsers and JavaScript implementations are inherently more capable than anything built atop them.

Of course, any client-side technology is going to rev slower simply because it’s hard to get people to update their bits. Also, users decide which client bits they like best, whether they be Internet Explorer, Chrome, or Firefox. Now the technology-takes-time-to-gain-ubiquity problem has a new dimension: each browser vendor must also decide to implement this technology in a compatible way. It took years for even JavaScript to standardize across browsers.

However, if we could instead standardize the web on a performant and safe VM such as CLR, JVM, or LLVM, including explicit memory layout and allocation and access to extra cores and the GPU, JavaScript becomes a choice rather than a mandate.

This point of view depends on my prediction that JavaScript will not become competitive with native code, but not everyone agrees. If JavaScript does eventually match native code, then I’d expect the browser itself to be written in it. It’s impossible for me to claim that JavaScript will never match native code, but the sustained success of C++ in systems programming, games, and high-performance computing is a testament to the value of systems languages.

Native Client, however, gives web developers the opportunity to write code within 5-10% of native code performance, in whatever language they want, without losing the safety and convenience of the web. You can write web applications that leverage multiple cores, and with WebGL, you can harness dedicated graphics hardware as well. Native Client does restrict access to operating system APIs, but I expect APIs to evolve reasonably quickly.

Let’s take a particular example: the HTML5 video tag. Native Client could have sidestepped the entire which-video-codec-should-we-standardize spat between Mozilla, Google, Apple, and Microsoft by allowing each site to choose the codec it prefers. YouTube could safely deploy whatever codecs it wanted, and even evolve them over time.

With Native Client, we could share Python code between the front-end and the back-end. We could use languages that support weak references. We could implement IMVU’s asynchronous task system. We could embed new JavaScript interpreters in old browsers.
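As one illustration of the weak-reference item, here is a minimal sketch in Python using the standard `weakref` module. The `Texture` class and the `demo` function are hypothetical names made up for this example; the point is only that a cache built on weak references does not keep its entries alive on its own.

```python
import gc
import weakref


class Texture:
    """Hypothetical stand-in for an expensive resource (illustration only)."""

    def __init__(self, name):
        self.name = name


def demo():
    # The cache holds weak references: an entry disappears once nothing
    # else in the program refers to the cached object.
    cache = weakref.WeakValueDictionary()

    texture = Texture("grass")
    cache["grass"] = texture

    # While a strong reference exists, the cache can hand the object back.
    alive_before = cache.get("grass") is texture

    # Drop the only strong reference, then ask the collector to run so the
    # result does not depend on reference-counting details.
    del texture
    gc.collect()

    alive_after = cache.get("grass") is not None
    return alive_before, alive_after
```

Calling `demo()` reports whether the entry was reachable before and after the last strong reference went away. A language without weak references would need the application to evict such entries by hand.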

Native Client is not the only option here. The JVM and CLR are other portable and performant VMs that have seen considerable language innovation while approximating native code performance.

A standardized, performant, and safe VM in the browser would increase the strength of the open web versus native app stores and their arbitrary technology limitations.

Finally, I’d like to thank Alon Zakai (author of Emscripten), Mike Shaver, and Chris Saari for engaging in open, honest discussion. I hope this public discourse leads to a better web. Either way, I hope this is my last post on this topic. :)

8 thoughts on “In Defense of Language Democracy (Or: Why the Browser Needs a Virtual Machine)”

  1. > Why is the back-end evolving faster than the front-end?

    My guess: Because, no matter who you are, your service is small, and the installed base is large.

    That being said — you can treat JavaScript as an execution layer. You can generate JavaScript from C code, even. With a tracing compiler approach, you can get useful performance (but not “native” performance).

    The question is: How resistant are users /really/ to installing a native plug-in, if it’s for something they really want? If Native Client ended up just making pop-over ads more annoying, then was it worth it?

  2. I have been arguing the same for years! I prefer Python to JavaScript… others Haskell…

    I just came up to your blog via the native client article and added you instantly to Google Reader!

  3. Chad,

    Your arguments crystallise something I had a gut instinct for.

    PNaCl is not getting much exposure, but NaCl is. So when coders see just NaCl they smell a lock-in rat (as per history). Hence, I believe it’s great that you are writing these blog posts, as it’s a service to all for the future.

    As far as portability goes, however, I see a few issues that Google, LLVM, and CPU instruction set builders need to address:
    a. Future portability.
    – For example, MIPS looks to be getting a good rebirth via the Chinese Godson CPU. See recent news announcements about this.
    Of course, going the virtual instruction set route solves this, but that’s the same as the Java and C# approach.

    b. Bloated instruction set. It’s 30% more bloated on PNaCl. This is a big penalty to pay currently. Still a lot better than JavaScript, but that’s not the point really.

    c. Security.
    You are still limited to the hardware you can access, but if they can pre-rationalise what the code is trying to access and tell the user, then we have a good solution. A bit like how Android and Chrome extensions work.

  4. Hello Chad! I’m intrigued by your post. I’m wondering if you could move the HTML parser itself into the VM as just another module among others, including the javascript engine, CSS interpreter etc.
    You would have a completely modular and portable browser that just needs an OS specific VM (say, an LLVM interpreter with access to native widgets/toolkits of the operating system, for native UI elements), and with full access to all the competing engines like Webkit vs Gecko. In fact, I suppose this VM wouldn’t need to be limited at all to just web browsing purposes, though I can’t imagine what it would be used for otherwise.

    Man, just thinking about writing something like this makes my mouth drool :P I’m not an experienced software developer yet though, so I guess I shouldn’t open my mouth about these things when I don’t fully understand them! Is this a feasible idea at all? Did I even use the correct terminology? :P

  5. Re-reading that comment some of it sounds weird, I don’t mean that YOU should do these things, just wondering if ONE could do these things. Sorry for any confusion :P

  6. I’d be interested in your take on Fabric Engine, as we use LLVM – we’re going after the problem in a different way that doesn’t throw the baby out with the bath water. We’ve brought native, multi-threaded performance to web applications (on client and server) without requiring the developer to transition to compiled languages. There’s a bunch of FE client demos here:

    Server-wise, we’re integrated with node and have been getting great performance – the benefit of sticking to dynamic languages on the server shouldn’t be underestimated.

  7. Paul: I don’t understand how running dynamic languages on the server has anything to do with Native Client, which enables close-to-native performance on the client without a plug-in?

  8. Hi Chad – Apologies for the slow response, I don’t get notifications of replies.

    I mentioned ‘dynamic languages on the server’ in response to your comment about JavaScript never being competitive with native code. With some changes/extensions to JavaScript, we have comparable performance to multi-threaded C. All accessible to a web developer.

    I recently watched Ryan Dahl’s talk on why he built node. Interestingly, he wrote it in C first and nobody was interested. Once it was in js, everyone went crazy. Web developers want to use web languages – NaCl doesn’t do anything for those guys. Present a web developer with a way to use JavaScript and have it be as performant as native code – that’s the approach we took. How many web developers can (or want to) write good C/C++? How many can write solid multi-threaded code?

    That said – we might use NaCl in the future so that Fabric can run without the plug-in. Unfortunately, we don’t own our own browser, so we have to fit in with everyone else’s ;)
