JSON… replacement for XML?

JSON (JavaScript Object Notation) is a structured data-interchange format much like XML. Apparently it's much more efficient for things like XML-RPC-style calls from a browser, because you don't need to parse or generate any XML from JavaScript in the browser. (The browser already has a built-in JSON parser in the form of a JavaScript interpreter.) Since it's a direct representation of JavaScript data, you can easily generate and parse it from within your scripts. To give you some context, here's what some JSON looks like alongside XML:

{"glossary": {
    "title": "example glossary",
    "GlossDiv": {
        "title": "S",
        "GlossList": [{
            "ID": "SGML",
            "SortAs": "SGML",
            "GlossTerm": "Standard Generalized Markup Language",
            "Acronym": "SGML",
            "Abbrev": "ISO 8879:1986",
            "GlossDef": "A meta-markup language, used to create markup languages such as DocBook.",
            "GlossSeeAlso": ["GML", "XML", "markup"]
        }]
    }
}}

Here's the XML:

<glossary><title>example glossary</title>
 <GlossDiv><title>S</title>
  <GlossList>
   <GlossEntry ID="SGML" SortAs="SGML">
    <GlossTerm>Standard Generalized Markup Language</GlossTerm>
    <Acronym>SGML</Acronym>
    <Abbrev>ISO 8879:1986</Abbrev>
    <GlossDef>
     <para>A meta-markup language, used to create markup languages such as DocBook.</para>
     <GlossSeeAlso OtherTerm="GML"/>
     <GlossSeeAlso OtherTerm="XML"/>
    </GlossDef>
    <GlossSee OtherTerm="markup"/>
   </GlossEntry>
  </GlossList>
 </GlossDiv>
</glossary>

From the human readability standpoint it doesn't really have much over XML. Maybe it's a little bit terser and easier to type. The quotes are annoying, though. And I can see escaping and multiline strings being an irritation. But here's the kicker:

JSON is directly compatible with Python too! Parsing, writing, and processing are simple, simple, simple! In Python, eval() reads; str(), repr(), and pprint.pprint() write; and transforms? Just a little script that processes the data the way you do every day! You don't need any crazy XSL or XSLT or DOM; the JSON object model uses the native datatypes of the scripting language you're using! (Assuming it's Python or JavaScript, at least. You need a parser and DOM in other languages. But even so, I'd still much rather write my transformations in Python than XSLT.)
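Here's a minimal sketch of that read, transform, write loop. It assumes a Python that ships the standard json module, which is my substitution for the eval() and repr() pair: eval() trips over JSON's true, false, and null (they aren't Python names), and repr() writes single-quoted strings that strict JSON parsers reject. The sample text is a trimmed-down, hypothetical version of the glossary, not the full document.

```python
import json

# Hypothetical sample: a cut-down version of the glossary example.
text = '{"glossary": {"title": "example glossary", "GlossSeeAlso": ["GML", "XML", "markup"]}}'

# Read: JSON text becomes plain dicts, lists, and strings.
data = json.loads(text)

# Transform: ordinary Python, no XSLT or DOM involved.
terms = data["glossary"]["GlossSeeAlso"]
data["glossary"]["GlossSeeAlso"] = sorted(term.lower() for term in terms)

# Write: back to JSON text, indented for readability.
output = json.dumps(data, indent=4)
print(output)
```

The transform step is just a sorted() call over a list, which is the whole point: the data is already in native types.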

Kevin Gadd brought up some concerns about the security of having your web browser eval() data that comes across the wire in the form of JSON. Let's say that, instead of valid JSON data, you get the string "destroyYourComputer()" and your browser is dumb enough to expose that as a public function. Yes, it could be dangerous, but I don't see how it's any more dangerous than executing JavaScript contained in some HTML. Thoughts? If need be, you could just have a sandboxed evalLiteralExpression() function to do the parsing.
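On the Python side, the idea of a literal-only evaluator can be sketched with the standard library. The function names below are hypothetical (eval_literal_expression just mirrors the evalLiteralExpression() name above); the two calls they wrap, ast.literal_eval() and json.loads(), are real, and both refuse to run a function call. This is an illustration of the idea, not a claim about how any browser does it.

```python
import ast
import json

hostile = "destroyYourComputer()"
friendly = '{"title": "S", "tags": ["GML", "XML"]}'


def eval_literal_expression(text):
    """Hypothetical helper: accept Python literals only, never calls or names."""
    return ast.literal_eval(text)


def parse_untrusted_json(text):
    """Hypothetical helper: return parsed JSON, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# The hostile string is rejected by both approaches instead of being executed.
try:
    eval_literal_expression(hostile)
    literal_eval_rejected = False
except (ValueError, SyntaxError):
    literal_eval_rejected = True

print(literal_eval_rejected)
print(parse_untrusted_json(hostile))
print(parse_untrusted_json(friendly))
```

Neither helper ever hands the input to eval(), so a string like "destroyYourComputer()" is just a parse error.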

I don't see JSON replacing XML for everything anytime soon, but to me it looks like a better choice in situations where people just want to store, retrieve, and maintain some structured data.

Imported Comments

kisai on Jan 26, 2005

Keenspace's framework uses a trick like this in how it accesses the Keenspot cookies. All it does is get a JavaScript variable, array, or whatever from the Keenspot server. This gets around the cross-site scripting cookie theft... which is what I find odd.

(Yes, most of the browser bugs related to x-site scripting are from ads hijacking websites.)

But this also introduces the possibility of code injection, unlike XML, which is not directly executable data.

thespeedbump on Jan 27, 2005

Assuming that eval() isn't used to parse the data, JSOL wins hands down. I don't even mind the quotes.

Also, I wouldn't understate the terseness: the JSOL version is 491 bytes, whereas the XML version is 554. 12% of 554 bytes isn't a big deal, but I wouldn't be shocked if there were folks using XML documents in excess of 50MB, in which case the difference would be more like 6MB. Of course, most compression algorithms probably devour XML's redundancy, but that only proves the point that it is redundant both to people and computers.