XML vs JSON

JSON replaced XML for most new APIs and web services, but XML still dominates enterprise integration, document formats, and legacy systems. Here's what each is actually good at, and the gotchas that trip up most XML↔JSON conversions.

The same data, two formats

The same data: XML is about 1.8× larger (380 vs. 215 characters)

XML
<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="1">
    <name>Alice Chen</name>
    <email>alice@example.com</email>
    <role>admin</role>
    <tags>
      <tag>developer</tag>
      <tag>billing</tag>
    </tags>
  </user>
  <user id="2">
    <name>Bob Kumar</name>
    <email>bob@example.com</email>
    <role>viewer</role>
    <tags>
      <tag>support</tag>
    </tags>
  </user>
</users>
<!-- 380 characters -->
JSON
[
  {
    "id": 1,
    "name": "Alice Chen",
    "email": "alice@example.com",
    "role": "admin",
    "tags": ["developer", "billing"]
  },
  {
    "id": 2,
    "name": "Bob Kumar",
    "email": "bob@example.com",
    "role": "viewer",
    "tags": ["support"]
  }
]
// 215 characters
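The exact ratio depends on indentation and element naming, so it is worth measuring on your own data. A minimal sketch, assuming the two records above and using only the Python standard library; the helper name to_xml is hypothetical, and both formats are serialized without indentation so the comparison is like for like:

```python
import json
import xml.etree.ElementTree as ET

users = [
    {"id": 1, "name": "Alice Chen", "email": "alice@example.com",
     "role": "admin", "tags": ["developer", "billing"]},
    {"id": 2, "name": "Bob Kumar", "email": "bob@example.com",
     "role": "viewer", "tags": ["support"]},
]

def to_xml(records):
    """Build the <users> document shown above, without indentation."""
    root = ET.Element("users")
    for rec in records:
        user = ET.SubElement(root, "user", id=str(rec["id"]))
        for field in ("name", "email", "role"):
            ET.SubElement(user, field).text = rec[field]
        tags = ET.SubElement(user, "tags")
        for tag in rec["tags"]:
            ET.SubElement(tags, "tag").text = tag
    return ET.tostring(root, encoding="unicode")

xml_text = to_xml(users)
json_text = json.dumps(users, separators=(",", ":"))

# Print both sizes and the XML/JSON ratio
print(len(xml_text), len(json_text), round(len(xml_text) / len(json_text), 2))
```

Compression narrows the gap further, because repeated tag names compress well.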

Feature comparison

Feature | XML | JSON
Syntax | Tags, attributes, elements; verbose | Braces, brackets, strings; concise
Readability | Readable but noisy due to closing tags | Clean for most data; gets messy when deeply nested
Comments | Supported (<!-- comment -->) | Not supported
Attributes | Yes: elements can have attributes | No: all data is key-value pairs
Namespaces | Built-in namespace support (xmlns:) | No namespace concept
Schema validation | XSD (XML Schema Definition); very mature | JSON Schema (Draft-07, 2020-12); modern
Querying | XPath, XQuery; powerful but complex | JSONPath, jq; simpler syntax
Transformation | XSLT, a Turing-complete transform language | No standard equivalent; use code
Mixed content | Supports text + elements in the same node | Cannot represent mixed content natively
Binary data | Base64-encode inside elements | Base64-encode as a string value
Parsing speed | Slower: DOM/SAX parsers are more complex | Faster: simpler grammar
Payload size | Larger: every element name is repeated in its closing tag | Smaller: each key is written once per object, with no closing tags
Native JS support | DOMParser (built into browsers), then DOM traversal | JSON.parse() built in; returns plain objects
Primary use today | Enterprise systems, SOAP, SVG, XHTML, RSS | REST APIs, config, web apps, data exchange
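As a rough illustration of the querying row, here is a sketch in Python. It assumes a trimmed version of the sample data above and uses the limited XPath subset in the standard library's ElementTree; full XPath or JSONPath would need a third-party package.

```python
import json
import xml.etree.ElementTree as ET

xml_doc = """<users>
  <user id="1"><name>Alice Chen</name><role>admin</role></user>
  <user id="2"><name>Bob Kumar</name><role>viewer</role></user>
</users>"""

json_doc = """[
  {"id": 1, "name": "Alice Chen", "role": "admin"},
  {"id": 2, "name": "Bob Kumar", "role": "viewer"}
]"""

# XML: ElementTree's XPath subset can filter on a child element's text.
root = ET.fromstring(xml_doc)
xml_admins = [u.findtext("name") for u in root.findall("./user[role='admin']")]

# JSON: after parsing, the data is plain lists and dicts.
json_admins = [u["name"] for u in json.loads(json_doc) if u["role"] == "admin"]

print(xml_admins, json_admins)
```

Both queries select the same record; the difference is that the XML side needs a query language, while the JSON side is ordinary list and dict code.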

XML — use it when

Enterprise and legacy system integration

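SOAP web services, SAP integrations, healthcare standards (HL7 v3 and CDA; FHIR supports both XML and JSON), and financial messaging (FIXML, ISO 20022 messages on SWIFT) use XML. Classic EDI formats such as X12 and EDIFACT are not XML themselves, but they are frequently mapped to XML inside integration middleware. If you're integrating with a system built before 2010, you're probably dealing with XML.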

Documents with mixed content

XML models documents, not just data. A node can contain both text and child elements: <p>This is <strong>important</strong> text</p>. JSON has no equivalent. XHTML, DocBook, DITA, and EPUB content documents are all XML-based because they need this.
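To see why mixed content resists a JSON mapping, consider how a parser exposes it. A minimal sketch using Python's standard library ElementTree; the list-based JSON encoding at the end is one possible convention chosen for illustration, not a standard:

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>This is <strong>important</strong> text</p>")
strong = p.find("strong")

# ElementTree stores the text before the first child on the parent (.text)
# and the text that follows a child on that child (.tail).
print(repr(p.text))       # 'This is '
print(repr(strong.text))  # 'important'
print(repr(strong.tail))  # ' text'

# One possible JSON-friendly encoding (an assumption): an ordered list
# that interleaves plain strings and element objects.
as_json_like = [p.text, {"strong": strong.text}, strong.tail]

# Reading the document back as plain text loses the markup entirely.
plain = "".join(p.itertext())
print(plain)              # This is important text
```

A key-value object cannot hold this, because the order of text runs and elements is part of the content.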

XSLT transformation pipelines

XSLT is a Turing-complete language for transforming XML into HTML, other XML, or plain text. JSON has no standardized equivalent, so publishing pipelines, report generation, and document workflows that need complex structural transformations are typically built on XML.

Namespace-aware data

When multiple vocabularies need to coexist in one document without key collision — like XHTML + MathML + SVG — XML's namespace system handles this cleanly. JSON has no equivalent mechanism.
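As an illustration, the sketch below mixes XHTML and SVG in one document, each with its own title element, and retrieves them separately. It assumes Python's standard library ElementTree; the prefixes h and s in the lookup table are local to the query and are arbitrary choices.

```python
import xml.etree.ElementTree as ET

doc = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg">
  <head><title>Page title</title></head>
  <body>
    <svg:svg><svg:title>Icon title</svg:title></svg:svg>
  </body>
</html>"""

# Query-side prefixes map to namespace URIs; they need not match the
# prefixes used inside the document.
ns = {
    "h": "http://www.w3.org/1999/xhtml",
    "s": "http://www.w3.org/2000/svg",
}
root = ET.fromstring(doc)

html_title = root.find(".//h:title", ns).text
svg_title = root.find(".//s:title", ns).text
print(html_title, "|", svg_title)
```

In JSON, two "title" keys at different levels would be told apart only by position or by an ad hoc naming convention.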

JSON — use it when

REST APIs and web services

JSON became the default for REST APIs around 2010 and has been dominant since. Virtually every mainstream language has a JSON parser in its standard library or in a de facto standard package. JSON is typically smaller on the wire, faster to parse, and requires less code to work with.

JavaScript and browser applications

JSON is native to JavaScript: JSON.parse() and JSON.stringify() are built in. Working with XML in the browser means parsing with DOMParser and then walking the DOM or running XPath queries. For any web application exchanging data with a backend, JSON is the obvious choice.

Configuration files

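package.json, tsconfig.json, and .eslintrc in the JavaScript ecosystem, and composer.json in PHP, all use JSON for config. Where YAML is the norm (GitHub Actions workflows, for example), JSON-style syntax is usually still accepted inside the YAML file, because YAML 1.2 is designed as a superset of JSON.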

Modern data pipelines

pandas, BigQuery, Elasticsearch, MongoDB, and virtually every modern data tool has native JSON support. XML support, where it exists, is an afterthought. If your data enters a modern pipeline, JSON moves through it more easily.

XML → JSON conversion gotchas

These problems appear in nearly every XML-to-JSON conversion. Know them before you start.

XML attributes have no JSON equivalent

XML elements can have attributes (<user id="1">) and element content simultaneously. When converting to JSON, converters must choose a convention for attributes — usually a prefix like "@id" or a nested "$" key. This convention differs between libraries (xmltodict, xml2js, fast-xml-parser), so round-trips aren't always lossless.

<!-- XML with attributes -->
<payment id="pi_3Mtw" status="succeeded">
  <amount currency="usd">2000</amount>
</payment>

# xmltodict output (Python):
{
  "payment": {
    "@id": "pi_3Mtw",
    "@status": "succeeded",
    "amount": {
      "@currency": "usd",
      "#text": "2000"
    }
  }
}
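The convention itself is simple enough to sketch with the standard library. The function below is a hypothetical helper, not xmltodict's implementation; it assumes the "@" and "#text" conventions shown above and ignores mixed content (text that follows a child element).

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Convert an element using the "@attr" / "#text" convention."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: promote the existing value to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        # Leaf without attributes: return the bare text (or None if empty).
        return text or None
    if text:
        node["#text"] = text
    return node

xml_doc = """<payment id="pi_3Mtw" status="succeeded">
  <amount currency="usd">2000</amount>
</payment>"""

root = ET.fromstring(xml_doc)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```

Note that every value comes out as a string; converting "2000" to a number is a separate decision the converter cannot make without a schema.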

Single-element arrays vs objects

In XML, a single child element and a list of one child element look identical. Most XML-to-JSON converters produce a dict for one child and a list for multiple children — meaning the same field has different types depending on the data. Always force list for array fields.

# xmltodict: force_list ensures consistent types
import xmltodict

xml = """
<orders>
  <order><id>1</id></order>
</orders>
"""

# Without force_list: order is a dict (single element)
data = xmltodict.parse(xml)
print(type(data["orders"]["order"]))  # <class 'dict'>

# With force_list: order is always a list
data = xmltodict.parse(xml, force_list=("order",))
print(type(data["orders"]["order"]))  # <class 'list'>
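When the converter offers no equivalent of force_list, or the data has already been parsed, the same consistency can be restored afterwards. A minimal sketch; ensure_list is a hypothetical helper, and the three dictionaries stand in for parser output with one, several, and zero order elements:

```python
def ensure_list(value):
    """Normalize a converted field: None -> [], list -> unchanged, anything else -> [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]

# Shapes a converter typically produces for 1, 2, and 0 <order> children.
one = {"orders": {"order": {"id": "1"}}}
many = {"orders": {"order": [{"id": "1"}, {"id": "2"}]}}
empty = {"orders": None}

for data in (one, many, empty):
    orders = ensure_list((data["orders"] or {}).get("order"))
    print([o["id"] for o in orders])
```

Downstream code can then always iterate, regardless of how many elements the source document contained.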

Namespaces complicate key names

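Namespace-aware parsers expand prefixed names into long keys. Python's ElementTree, for example, produces {http://www.w3.org/2001/XMLSchema-instance}type, and xmltodict with process_namespaces=True produces similar URI-prefixed keys. This is technically correct but makes JSON output awkward to use without post-processing. When you don't need to tell vocabularies apart, strip namespaces before parsing for cleaner output.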

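import re
import xmltodict

def strip_namespaces(xml_string: str) -> str:
    """Remove namespace declarations and element prefixes (regex-based, best effort)."""
    # Remove namespace declarations (double-quoted values only)
    xml_string = re.sub(r' xmlns[^"]*"[^"]*"', "", xml_string)
    # Remove element prefixes (ns0:element → element).
    # The replacement must be a raw string; a plain "<\1" is "<" plus the control character \x01.
    xml_string = re.sub(r"<(/?)\w+:", r"<\1", xml_string)
    return xml_string

# Sample input for illustration; prefixed attributes such as xsi:type are not handled
raw_xml = '<ns0:order xmlns:ns0="http://example.com/orders"><ns0:id>42</ns0:id></ns0:order>'
clean_xml = strip_namespaces(raw_xml)
data = xmltodict.parse(clean_xml)  # {'order': {'id': '42'}}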
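Regex rewriting can misfire on text or CDATA that happens to contain "<prefix:". A parser-based alternative is sketched below, assuming Python's standard library ElementTree instead of xmltodict; strip_clark_namespaces is a hypothetical helper that removes the {uri} part from tag and attribute names after parsing.

```python
import xml.etree.ElementTree as ET

def strip_clark_namespaces(root):
    """Rewrite '{uri}local' tag and attribute names to 'local', in place."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
        # Iterate over a copy of the names because attrib is modified in the loop.
        for name in list(elem.attrib):
            if name.startswith("{"):
                elem.attrib[name.split("}", 1)[1]] = elem.attrib.pop(name)
    return root

sample_xml = """<ns0:order xmlns:ns0="http://example.com/orders"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns0:Retail">
  <ns0:id>42</ns0:id>
</ns0:order>"""

order = strip_clark_namespaces(ET.fromstring(sample_xml))
print(order.tag, order.attrib, order.find("id").text)
```

Attribute values are left untouched, so a prefixed value such as "ns0:Retail" survives as-is. If two namespaces define the same local name, stripping makes them collide, so this is only safe when the vocabularies do not overlap.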

Frequently asked questions

Why did REST APIs switch from XML to JSON?

Several reasons converged around 2008–2012: JSON is native to JavaScript (no parsing library needed in the browser), it's more compact (lower bandwidth), it's easier to read and write by hand, and the rise of Node.js meant server-side JavaScript developers wanted to use the same format on both ends. SOAP/XML never had these advantages.

Is XML still used in 2025?

Yes, heavily, just not for new REST APIs. XML remains standard in SVG graphics, XHTML, RSS/Atom feeds, Microsoft Office files (.docx and .xlsx are zipped XML), Android layouts, Java configuration (Spring, Maven, Hibernate), and enterprise integration middleware (MuleSoft, IBM MQ, SAP). It's not dying; it's just not where most new development happens.

Can XML represent everything JSON can?

Yes — any JSON structure can be represented in XML, though the translation requires conventions for arrays (repeated elements) and type information (XML has no native boolean or null). The reverse isn't entirely true: XML's mixed content (text + child elements in the same node) and attributes have no direct JSON equivalent.
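A sketch of the JSON-to-XML direction makes those conventions concrete. Everything convention-like here is an assumption made for illustration: the helper name to_element, the "type" attribute used to carry null, boolean, number, and array information, and the "item" element used for array members. It also assumes every JSON key is a valid XML element name.

```python
import xml.etree.ElementTree as ET

def to_element(tag, value):
    """Convert a parsed JSON value into an element, recording types in a 'type' attribute."""
    elem = ET.Element(tag)
    if value is None:
        elem.set("type", "null")
    elif isinstance(value, bool):  # check bool before int: bool is a subclass of int
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = str(value)
    elif isinstance(value, str):
        elem.text = value
    elif isinstance(value, list):
        elem.set("type", "array")
        for item in value:
            elem.append(to_element("item", item))
    elif isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_element(key, child))
    else:
        raise TypeError(f"unsupported JSON value: {value!r}")
    return elem

doc = {"name": "Alice Chen", "active": True, "manager": None,
       "tags": ["developer", "billing"]}
user = to_element("user", doc)
print(ET.tostring(user, encoding="unicode"))
```

Without the type hints, "true", "1", and null would all read back as text or empty elements, which is exactly the information loss the answer above describes.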

Which has better schema validation — XML or JSON?

XSD (XML Schema Definition) is older and more mature — it has been around since 2001 and supports complex validation including cross-element constraints. JSON Schema is more modern and readable but still catching up in some areas. For new systems, JSON Schema (via AJV) is perfectly capable. For regulated industries (healthcare, finance) with existing XSD schemas, XML remains the standard.

How do I convert XML to JSON in Python?

Use xmltodict: pip install xmltodict, then import xmltodict and run with open("file.xml", "rb") as f: data = xmltodict.parse(f). For large files, use xmltodict's streaming mode (the item_depth and item_callback arguments) so the whole document is never held in memory. Watch out for the single-element array problem: always pass force_list for fields that should always be arrays.
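For files too large to load at once, the streaming idea can also be sketched with the standard library alone. The example below swaps in ElementTree.iterparse in place of xmltodict so that it has no third-party dependency; iter_records is a hypothetical helper, and it assumes flat records whose children carry only text.

```python
import io
import xml.etree.ElementTree as ET

def iter_records(source, tag):
    """Yield one dict per <tag> element as soon as its closing tag is read."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            elem.clear()  # release the element's children once it has been converted

# An in-memory stand-in for a large file; a path or open binary file works the same way.
sample = io.BytesIO(
    b"<orders>"
    b"<order><id>1</id><total>20.00</total></order>"
    b"<order><id>2</id><total>5.50</total></order>"
    b"</orders>"
)
records = list(iter_records(sample, "order"))
print(records)
```

Because records are yielded one at a time, they can be written out as JSON Lines or sent to a database without building the full structure in memory.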
