Format Comparison
XML vs JSON
JSON replaced XML for most new APIs and web services. But XML still dominates enterprise integration, document formats, and legacy systems. Here's what each is actually good at — and the gotchas that trip up every XML↔JSON conversion.
The same data, two formats
The same data: by the character counts shown, the XML version is about 1.8× larger
<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="1">
    <name>Alice Chen</name>
    <email>alice@example.com</email>
    <role>admin</role>
    <tags>
      <tag>developer</tag>
      <tag>billing</tag>
    </tags>
  </user>
  <user id="2">
    <name>Bob Kumar</name>
    <email>bob@example.com</email>
    <role>viewer</role>
    <tags>
      <tag>support</tag>
    </tags>
  </user>
</users>
<!-- 380 characters -->

[
  {
    "id": 1,
    "name": "Alice Chen",
    "email": "alice@example.com",
    "role": "admin",
    "tags": ["developer", "billing"]
  },
  {
    "id": 2,
    "name": "Bob Kumar",
    "email": "bob@example.com",
    "role": "viewer",
    "tags": ["support"]
  }
]
// 215 characters

Feature comparison
| Feature | XML | JSON |
|---|---|---|
| Syntax | Tags, attributes, elements — verbose | Braces, brackets, strings — concise |
| Readability | Readable but noisy due to closing tags | Clean for most data; gets messy when deeply nested |
| Comments | Supported (<!-- comment -->) | Not supported |
| Attributes | Yes — elements can have attributes | No — all data is key-value pairs |
| Namespaces | Built-in namespace support (xmlns:) | No namespace concept |
| Schema validation | XSD (XML Schema Definition) — very mature | JSON Schema (Draft-07, 2020-12) — modern |
| Querying | XPath, XQuery — powerful but complex | JSONPath, jq — simpler syntax |
| Transformation | XSLT — Turing-complete transform language | No equivalent — use code |
| Mixed content | Supports text + elements in the same node | Cannot represent mixed content natively |
| Binary data | Base64-encode inside elements | Base64-encode as a string value |
| Parsing speed | Slower — DOM/SAX parsers more complex | Faster — simpler grammar |
| Payload size | Larger — tag names repeated for every element | Smaller — keys appear once per object |
| Native JS support | DOMParser required | JSON.parse() built in |
| Primary use today | Enterprise systems, SOAP, SVG, XHTML, RSS | REST APIs, config, web apps, data exchange |
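The payload-size row is easy to check for your own data. The sketch below uses only the Python standard library and rebuilds the two sample users from the example above; the `to_xml` helper and its element layout are illustrative assumptions, and the exact ratio depends on tag names, whitespace, and the data itself.

```python
import json
import xml.etree.ElementTree as ET

# Sample records mirroring the example above (illustrative data).
users = [
    {"id": 1, "name": "Alice Chen", "email": "alice@example.com",
     "role": "admin", "tags": ["developer", "billing"]},
    {"id": 2, "name": "Bob Kumar", "email": "bob@example.com",
     "role": "viewer", "tags": ["support"]},
]

def to_xml(records):
    """Serialize records with the same element layout as the XML example."""
    root = ET.Element("users")
    for rec in records:
        user = ET.SubElement(root, "user", id=str(rec["id"]))
        for field in ("name", "email", "role"):
            ET.SubElement(user, field).text = rec[field]
        tags = ET.SubElement(user, "tags")
        for tag in rec["tags"]:
            ET.SubElement(tags, "tag").text = tag
    return ET.tostring(root, encoding="utf-8")

# Compare compact serializations of the same data (no indentation in either).
xml_bytes = to_xml(users)
json_bytes = json.dumps(users, separators=(",", ":")).encode("utf-8")

print(f"XML:   {len(xml_bytes)} bytes")
print(f"JSON:  {len(json_bytes)} bytes")
print(f"ratio: {len(xml_bytes) / len(json_bytes):.2f}x")
```

Both formats compress well with gzip, so the gap on the wire is often smaller than the raw byte counts suggest.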
XML — use it when
Enterprise and legacy system integration
SOAP web services, XML-based EDI (Electronic Data Interchange), SAP integrations, and many healthcare (HL7 CDA, FHIR's XML representation) and finance (FIXML, ISO 20022 messages on SWIFT) systems use XML. If you're integrating with an enterprise system built before 2010, you're probably dealing with XML.
Documents with mixed content
XML models documents, not just data. A node can contain both text and child elements: <p>This is <strong>important</strong> text</p>. JSON has no native equivalent. XHTML, DocBook, DITA, and EPUB are XML-based because they need this.
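A short sketch of what mixed content looks like to a parser, using Python's standard `xml.etree.ElementTree`. The `mixed_to_list` helper is a hypothetical encoding invented for this illustration (an ordered list of text runs and element objects); it is one possible workaround, not a standard or any library's convention.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>This is <strong>important</strong> text</p>")
strong = p[0]

# ElementTree splits mixed content into .text (before the first child)
# and .tail (the text that follows each child).
print(repr(p.text))       # 'This is '
print(repr(strong.text))  # 'important'
print(repr(strong.tail))  # ' text'

def mixed_to_list(elem):
    """Encode mixed content as an ordered list (illustrative convention)."""
    parts = []
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        parts.append({child.tag: mixed_to_list(child)})
        if child.tail:
            parts.append(child.tail)
    return parts

print(mixed_to_list(p))
# ['This is ', {'strong': ['important']}, ' text']
```

The ordering has to be carried by a list because a plain JSON object cannot express "text, then element, then more text".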
XSLT transformation pipelines
XSLT is a Turing-complete language for transforming XML into HTML, other XML, or plain text. For publishing pipelines, report generation, and document workflows that require complex structural transformations, JSON has no standardized equivalent; the transformation has to be written in application code.
Namespace-aware data
When multiple vocabularies need to coexist in one document without key collision — like XHTML + MathML + SVG — XML's namespace system handles this cleanly. JSON has no equivalent mechanism.
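As a minimal illustration, the sketch below parses a small made-up document in which XHTML and SVG both define a `title` element. The namespace URIs are the real XHTML and SVG ones; the document itself and the `h`/`s` prefixes used for lookup are assumptions for the example.

```python
import xml.etree.ElementTree as ET

doc = """
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg">
  <head><title>Page title</title></head>
  <body><svg:svg><svg:title>Icon title</svg:title></svg:svg></body>
</html>
"""

root = ET.fromstring(doc)

# Lookup prefixes are local to this code; only the URIs have to match.
ns = {"h": "http://www.w3.org/1999/xhtml", "s": "http://www.w3.org/2000/svg"}

page_title = root.find(".//h:title", ns)
icon_title = root.find(".//s:title", ns)

print(page_title.text)  # Page title
print(icon_title.text)  # Icon title
print(icon_title.tag)   # {http://www.w3.org/2000/svg}title
```

Both elements share the local name `title`, but their fully qualified names differ, so neither lookup can match the other vocabulary's element.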
JSON — use it when
REST APIs and web services
JSON became the default for REST APIs around 2010 and has been dominant since. Nearly every mainstream language ships a JSON parser in its standard library or has a de facto standard one. JSON is usually smaller on the wire, faster to parse, and requires less code to work with.
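The "less code" point can be seen in a small side-by-side sketch using only Python's standard library. The two payloads are made-up single-user records in the style of the earlier example; the field-by-field XML mapping is an assumption about how one might choose to read that layout.

```python
import json
import xml.etree.ElementTree as ET

json_text = '{"id": 2, "name": "Bob Kumar", "tags": ["support"]}'
xml_text = (
    '<user id="2"><name>Bob Kumar</name>'
    "<tags><tag>support</tag></tags></user>"
)

# JSON: one call, and values arrive as int / str / list.
user_from_json = json.loads(json_text)

# XML: walk the tree and convert types by hand, since everything is text.
elem = ET.fromstring(xml_text)
user_from_xml = {
    "id": int(elem.get("id")),
    "name": elem.findtext("name"),
    "tags": [t.text for t in elem.iterfind("tags/tag")],
}

print(user_from_json == user_from_xml)  # True
```

The XML version is not hard to write, but every field needs an explicit decision about where it lives (attribute or element) and what type it should become.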
JavaScript and browser applications
JSON is native to JavaScript — JSON.parse() and JSON.stringify() are built in. Working with XML in the browser requires DOMParser and XPath queries. For any web application exchanging data with a backend, JSON is the obvious choice.
Configuration files
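package.json, tsconfig.json, .eslintrc, composer.json: the JavaScript ecosystem (along with tools like PHP's Composer) standardized on JSON for config. Even where YAML is the expected format (GitHub Actions), JSON-style syntax is generally valid inside the file, because YAML 1.2 was designed as a superset of JSON.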
Modern data pipelines
pandas, BigQuery, Elasticsearch, MongoDB, and virtually every modern data tool has native JSON support. XML support, where it exists, is an afterthought. If your data enters a modern pipeline, JSON moves through it more easily.
XML → JSON conversion gotchas
These problems appear in nearly every XML-to-JSON conversion. Know them before you start.
XML attributes have no JSON equivalent
XML elements can have attributes (<user id="1">) and element content simultaneously. When converting to JSON, converters must choose a convention for attributes — usually a prefix like "@id" or a nested "$" key. This convention differs between libraries (xmltodict, xml2js, fast-xml-parser), so round-trips aren't always lossless.
<!-- XML with attributes -->
<payment id="pi_3Mtw" status="succeeded">
  <amount currency="usd">2000</amount>
</payment>

# xmltodict output (Python):
{
  "payment": {
    "@id": "pi_3Mtw",
    "@status": "succeeded",
    "amount": {
      "@currency": "usd",
      "#text": "2000"
    }
  }
}

Single-element arrays vs objects
In XML, a single child element and a list of one child element look identical. Most XML-to-JSON converters produce a dict for one child and a list for multiple children — meaning the same field has different types depending on the data. Always force list for array fields.
# xmltodict: force_list ensures consistent types
import xmltodict

xml = """
<orders>
  <order><id>1</id></order>
</orders>
"""

# Without force_list: order is a dict (single element)
data = xmltodict.parse(xml)
print(type(data["orders"]["order"]))  # <class 'dict'>

# With force_list: order is always a list
data = xmltodict.parse(xml, force_list=("order",))
print(type(data["orders"]["order"]))  # <class 'list'>

Namespaces complicate key names
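Namespaced names end up in JSON keys in parser-specific forms: xmltodict keeps the prefix by default (xsi:type), while parsers that expand namespaces produce keys like {http://www.w3.org/2001/XMLSchema-instance}type (ElementTree's notation). Either form is technically correct, but the JSON output is awkward to consume without post-processing. If the document's vocabularies don't share local names, strip namespaces before parsing for cleaner output.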
import re
import xmltodict

def strip_namespaces(xml_string: str) -> str:
    """Remove namespace declarations and element prefixes from XML."""
    # Remove namespace declarations (xmlns="..." and xmlns:prefix="...")
    xml_string = re.sub(r' xmlns[^"]*"[^"]*"', "", xml_string)
    # Remove element prefixes (ns0:element → element).
    # The replacement must be a raw string so \1 is a backreference.
    xml_string = re.sub(r"<(/?)\w+:", r"<\1", xml_string)
    return xml_string

# raw_xml holds the namespaced XML document as a string
clean_xml = strip_namespaces(raw_xml)
data = xmltodict.parse(clean_xml)

Frequently asked questions
Why did REST APIs switch from XML to JSON?
Several reasons converged around 2008–2012: JSON is native to JavaScript (no parsing library needed in the browser), it's more compact (lower bandwidth), it's easier to read and write by hand, and the rise of Node.js meant server-side JavaScript developers wanted to use the same format on both ends. SOAP/XML never had these advantages.
Is XML still used in 2025?
Yes, heavily, just not for new REST APIs. XML remains standard in SVG graphics, XHTML (the XML serialization of HTML), RSS/Atom feeds, Microsoft Office files (.docx and .xlsx are XML inside a zip), Android layouts, Java configuration (Spring, Maven, Hibernate), and enterprise integration middleware (MuleSoft, IBM MQ, SAP). It's not dying; it's just not where most new development happens.
Can XML represent everything JSON can?
Yes — any JSON structure can be represented in XML, though the translation requires conventions for arrays (repeated elements) and type information (XML has no native boolean or null). The reverse isn't entirely true: XML's mixed content (text + child elements in the same node) and attributes have no direct JSON equivalent.
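To make "requires conventions" concrete, here is a minimal standard-library sketch of a JSON-to-XML mapping. The conventions are assumptions chosen for illustration, not a standard: arrays become repeated `<item>` children, and a `type` attribute records the JSON type so that booleans, numbers, and null survive. It also assumes every JSON key is a valid XML element name, which real data does not guarantee.

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(tag, value):
    """Convert a parsed JSON value to an Element (illustrative conventions)."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        elem.set("type", "object")
        for key, child in value.items():
            elem.append(json_to_xml(key, child))
    elif isinstance(value, list):
        elem.set("type", "array")
        for child in value:
            elem.append(json_to_xml("item", child))
    elif value is None:
        elem.set("type", "null")
    elif isinstance(value, bool):  # check bool before int: bool subclasses int
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = json.dumps(value)
    else:
        elem.set("type", "string")
        elem.text = value
    return elem

data = json.loads(
    '{"name": "Alice", "active": true, "tags": ["a"], "manager": null}'
)
xml_text = ET.tostring(json_to_xml("user", data), encoding="unicode")
print(xml_text)
```

Without the `type` attribute, a reader of the XML could not tell the boolean `true` from the string "true", or an empty element from null.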
Which has better schema validation — XML or JSON?
XSD (XML Schema Definition) is older and more mature — it has been around since 2001 and supports complex validation including cross-element constraints. JSON Schema is more modern and readable but still catching up in some areas. For new systems, JSON Schema (via AJV) is perfectly capable. For regulated industries (healthcare, finance) with existing XSD schemas, XML remains the standard.
How do I convert XML to JSON in Python?
Use xmltodict: pip install xmltodict, then import xmltodict; data = xmltodict.parse(open("file.xml").read()). For large files, use the streaming parser. Watch out for the single-element array problem — always pass force_list for fields that should always be arrays.
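If you would rather avoid a dependency for large files, the standard library's `ElementTree.iterparse` offers a comparable streaming approach. The sketch below is that swapped-in alternative, not xmltodict's own streaming mode; `iter_records` is a hypothetical helper, and the flat dict it builds ignores attributes and repeated child tags.

```python
import io
import xml.etree.ElementTree as ET

def iter_records(source, tag):
    """Yield one dict per <tag> element without loading the whole file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            elem.clear()  # drop the element's children once processed

# In-memory stand-in for a large file on disk (illustrative data).
sample = io.BytesIO(
    b"<orders><order><id>1</id></order><order><id>2</id></order></orders>"
)

for record in iter_records(sample, "order"):
    print(record)
# {'id': '1'}
# {'id': '2'}
```

In real use, pass a file path or an open binary file instead of the `BytesIO` object.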