XML vs. JSON: What’s the Difference for Developers?

Programming is not just about writing code. If you have a few years of programming experience under your belt, you’ll have no doubt encountered the issue of moving data between processes or computers, or creating it for use in a program. For years, people used simple text files (or their more sophisticated cousin, the comma-separated values (.csv) file) to send this data.

Depending on your ultimate goal, that data may need to come in human-readable form. If it’s just binary data, you need to write a utility program or use a hex editor in order to view it. To that end, XML and JSON are both human-viewable text formats. How do they compare?

XML (Extensible Markup Language) has been around since the mid-1990s but is itself a simplified subset of a previous standard called SGML (Standard Generalized Markup Language) from the mid-1980s.

Back before HTML5, it looked as if the future of HTML might be XHTML (i.e., HTML based on XML rather than SGML). But WHATWG, a working group separate from the World Wide Web Consortium (W3C), the main international organization for the development of Web standards, pushed HTML5 and it won. Had it not, all HTML would have been case-sensitive, and we might still need browser plugins. (Mobile devices also work better with HTML5 than XHTML.)

XML and Java both came into being at about the same time, and you’ll find XML used in many different guises. For example, I once built some software in Java that was called from a .NET client, and it passed structured data between the two. The data was defined first using XSD (XML Schema Definition), basically XML specifying what fields to pass; then the interface classes were generated using JAXB (Java Architecture for XML Binding) for the server and xsd.exe for the client classes on .NET.

The generated code was pretty unsightly, but it worked. One of the generated files was over 300,000 lines long! It had lots of code to convert objects to XML, and vice versa.

If you have a WinForms application on .NET that includes settings such as config strings, the compiler puts them in an XML file. The example below shows the connection string used to connect the application to a SQL Server instance.

<?xml version="1.0"?>
<configuration>
  <connectionStrings>
    <add name="Sirius.Properties.Settings.mydogshowsConnectionString" connectionString="Data Source=PCNEW\SQLEXPRESS;Initial Catalog=mydogshows;Integrated Security=True" providerName="System.Data.SqlClient"/>
  </connectionStrings>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
  </startup>
</configuration>
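As a rough illustration of how a program might read such a file, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The config is embedded as a string so the example is self-contained, and the helper name read_connection_strings is invented for this sketch; a .NET application would normally go through its own configuration classes instead.

```python
import xml.etree.ElementTree as ET

# The config shown above, embedded as a raw string so the backslash is kept as-is.
CONFIG_XML = r"""<?xml version="1.0"?>
<configuration>
  <connectionStrings>
    <add name="Sirius.Properties.Settings.mydogshowsConnectionString"
         connectionString="Data Source=PCNEW\SQLEXPRESS;Initial Catalog=mydogshows;Integrated Security=True"
         providerName="System.Data.SqlClient"/>
  </connectionStrings>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
  </startup>
</configuration>"""


def read_connection_strings(xml_text):
    """Hypothetical helper: map each <add> name to its connectionString."""
    root = ET.fromstring(xml_text)
    return {
        add.get("name"): add.get("connectionString")
        for add in root.findall("./connectionStrings/add")
    }


connection_strings = read_connection_strings(CONFIG_XML)
for name, value in connection_strings.items():
    print(name, "->", value)
```

The values live in attributes rather than element text, which is why the sketch calls get() on each element.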

Criticism of XML

The main criticism is that XML files can be very large, typically five to six times the size of the same data stored in a plain text file. Every item of data has an opening tag and, apart from empty elements, a closing tag. As a result, XML files are slower to transmit and parse. In a trading system I worked on, it was not unusual for a trade’s data to be 30KB or larger when sent to the server in XML.
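To get a feel for where the extra size comes from, the following sketch serializes one small record three ways and prints the length of each. The trade record and its field names are invented for illustration, and the exact ratio will depend heavily on how long the tag names are relative to the values.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical trade record; field names and values are invented for illustration.
trade = {"id": "T1001", "symbol": "ABC", "side": "BUY", "quantity": "500", "price": "12.34"}

# Plain delimited text: values only, no field names.
csv_line = ",".join(trade.values())

# XML: one child element per field, so every field name appears twice.
root = ET.Element("trade")
for field, value in trade.items():
    ET.SubElement(root, field).text = value
xml_text = ET.tostring(root, encoding="unicode")

# JSON: every field name appears once.
json_text = json.dumps(trade, separators=(",", ":"))

for label, text in (("csv", csv_line), ("json", json_text), ("xml", xml_text)):
    print(f"{label:4} {len(text):4d} characters")
```

Because each element name is written in both the opening and closing tag, the XML form grows faster than the JSON form as field names get longer.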

To add further confusion to XML, there’s XSL, which is a way of styling XML documents. You can transform XML into other forms using XSLT (XSL Transformations). You can also navigate through XML documents using XPath, addressing elements either absolutely or relative to others, and there’s a query language called XQuery.
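A small sketch of XPath-style navigation follows, using Python's xml.etree.ElementTree, which implements only a limited subset of XPath (full XPath, XSLT, and XQuery need other tools). The document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
DOC = """<shows>
  <show city="London"><dog breed="Collie">Rex</dog><dog breed="Pug">Mo</dog></show>
  <show city="Leeds"><dog breed="Collie">Fly</dog></show>
</shows>"""

root = ET.fromstring(DOC)

# Path from the root element: every <dog> directly under a <show>.
all_dogs = [dog.text for dog in root.findall("./show/dog")]

# Attribute predicate, searched at any depth (".//").
collies = [dog.text for dog in root.findall(".//dog[@breed='Collie']")]

# Relative navigation: pick one element, then search beneath it.
leeds = root.find("./show[@city='Leeds']")
leeds_dogs = [dog.text for dog in leeds.findall("dog")]

print(all_dogs)
print(collies)
print(leeds_dogs)
```

The first two queries start from the document root, while the last one shows a path evaluated relative to an element that was found earlier.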

JSON

Dating back to 2001, JSON (JavaScript Object Notation) has emerged as a rival to XML. Although it is derived from JavaScript’s object syntax, JSON is language-independent. Compared to XML, it’s much shorter and easier to read. There’s also a simpler JSON-like format called Hjson, short for Human JSON.

JSON is very simple, and parsing is fast compared to XML (in some cases up to 100x faster). Conceptually, a JSON file is made up of two types of structure: a collection of name/value pairs (an object), and an ordered list of values (an array). You can read and understand how to parse it in under five minutes on the JSON home page.
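The two structures can be seen directly when parsing with Python's standard json module, as in this minimal sketch. The sample document is invented for illustration.

```python
import json

# Hypothetical document: name/value pairs on the outside, an ordered list inside.
text = '{"name": "Rex", "breed": "Collie", "wins": [2019, 2021, 2022]}'

data = json.loads(text)

print(type(data).__name__)          # the name/value structure becomes a dict
print(type(data["wins"]).__name__)  # the ordered list becomes a list
print(data["wins"][0])              # list order is preserved
```

Most languages map these two structures onto their native dictionary and list types in much the same way.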

Because it’s been around so long, most browsers will render XML, but JSON is not as well-supported by the browser ecosystem. You can’t include comments, and there aren’t many data types supported (string, number, boolean, null, object, and array). Here’s a simple example:

{
  "firstname" : "David",
  "lastname" : "Bolton"
}
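The sketch below parses a document like this one with Python's standard json module, then shows two of the limits mentioned above: a comment makes the text invalid, and the set of value types is small. Besides strings, numbers, objects, and arrays, the JSON specification also defines true, false, and null. The commented input and the sample keys are invented for illustration.

```python
import json

person = json.loads('{"firstname": "David", "lastname": "Bolton"}')
print(person["firstname"], person["lastname"])

# A comment is not valid JSON, so the standard parser rejects it.
with_comment = '{"firstname": "David" /* given name */}'
try:
    json.loads(with_comment)
    comment_accepted = True
except json.JSONDecodeError as err:
    comment_accepted = False
    print("rejected:", err.msg)

# One value of each JSON type, and the Python type it maps to.
sample = json.loads('{"s": "text", "n": 1.5, "b": true, "z": null, "o": {}, "a": []}')
for key, value in sample.items():
    print(key, type(value).__name__)
```

Formats such as Hjson exist partly to relax the no-comments rule, at the cost of needing a non-standard parser.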

There’s also a binary version of JSON, which seems a bit counterintuitive. It’s used in PostgreSQL (as the jsonb type) and one or two other systems, and speeds up JSON operations such as sorting, slicing, splicing, and indexed lookups.

Conclusion

There’s no disputing that XML is bulkier and slower to parse than JSON, and needs more resources to do so. So if efficiency is your aim, JSON is undoubtedly better.

But XML, with its accompanying alphabet soup (XSD, XSL, XSLT, XPath, XQuery), has some different use cases than JSON. For example, the RSS format for notifying changes to websites is based on XML, as is XAML, the declarative GUI description language used in WPF (Windows Presentation Foundation) and in Xamarin.Forms for cross-platform mobile development. JSON would not suit those.

XML is a bit old-school and feels better suited to the desktop and server computing world, while JSON is more appropriate for Web and mobile development. Certainly the use of the XML-based SOAP has declined in favour of REST, which can carry either XML or JSON.

If you’d prefer something a bit different from XML, take a look at 5 XML Alternatives to Consider.

4 Responses to “XML vs. JSON: What’s the Difference for Developers?”

  1. Kaihusrav

    Are you dull? Is that how you compare methods/standards? XML is safer, much more intuitive, easy for automated reading, and supported everywhere. JSON is just for some rare cases... Slower? Did you run any lab tests?

  2. This comparison is really too terse to be conclusive. Choosing the best data transfer scheme really reverts to the old rule “the best tool for the job”. Since CSV/TSV is more efficient than either XML or JSON, it will still win out in certain applications. CSV/TSV suffer from brittleness, and flatness. XML and JSON both support tree structures, and with flexible ordering and easy omission of optional items.

    When comparing XML and JSON, how you choose to model the structure will dictate the total byte count. Good models follow the natural structure of the data itself, but compactness can be achieved by restructuring into non-logical models, if you really need to count bytes. You can also shorten the names of elements/nodes to non-friendly tags. Here is a classic example of a people record, given in both XML and JSON:

    {"person":{"attrs":{"sex":"male","name":"joe","age":30,"eyes":"brown","height":70.5,"weight":180},"person":{"attrs":{"sex":"female","name":"jane"}}}}

    In this example, the JSON is many bytes longer than the XML! XML natively supports the concept of attributes, whereas in the JSON example it is simulated as a known named (“attrs”) child node. Both examples provide a tree structure, where there is a parent level and a child level, with Joe the person having a single daughter, another person.

    JSON is great when the receiver of the data simply wants to parse the whole thing into an object, which can then be easily referenced with code like: if (person.attrs.name == "joe"). This implies trust of the data structure, and may also be size-prohibitive in terms of how much memory is needed to instantiate the object.

    XML is much more versatile for how the receiver wishes to handle the data, with a wide variety of common libraries available in most popular languages. You don’t have to parse the whole thing into an in-memory object if you don’t want to, and can search for elements in the tree if that better suits your application. You can also run a validator on the data, to ensure it conforms to your expectations.

    I suggest you take the time to understand your application, both at the data producer and at the data receiver, to choose the best encoding for your needs, rather than blindly picking based on age of technology or perceived trends.

    Also consider that the producer side of the application may not need to use any libraries to create the output data. CSV, TSV, XML, and JSON are all very easy to generate with standard string functions like sprintf. This can be a major deal if your producer is a resource constrained embedded device. However, this simple technique can become harder to manage as the complexity of the output increases.

    • Dang! The XML example got swallowed! Here it is, with all opening brackets changed to stars:

      *person sex="male" name="joe" age="30" eyes="brown" height="70.5" weight="180">*person sex="female" name="jane"/>*/person>

  3. shaliniboudh010@yahoo.com

    Why You Should Use JSON vs. XML: Extensible Markup Language (XML) used to be the only option for open data interchange. However, the Open Data Sharing developments have introduced more options for developers, each of which has its own set of benefits.