How to Flatten Nested JSON: A Complete Guide
Learn how to flatten deeply nested JSON into flat key-value pairs. Includes Python and JavaScript examples, plus when to use dot notation vs bracket notation for arrays.
- What Does "Flattening" Mean?
- Why You Need to Flatten JSON
- Flattening JSON in Python
- Option 1: A Manual Recursive Function
- Option 2: pandas json_normalize
- Flattening JSON in JavaScript
- Recursive Flatten in Node.js
- Handling Arrays in Nested JSON
- Choosing a Separator
- The Easy Way: Use a Tool
- Common Gotchas
- Frequently Asked Questions
- What is the difference between flattening and normalizing JSON?
- Does flattening JSON lose any data?
- Can I reverse a flattened JSON back to nested?
How to Flatten Nested JSON: A Complete Guide
Here's the problem. You're pulling payment data from the Stripe API and the response looks like this:
{
"id": "ch_3OqK2L2eZvKYlo2C1234abcd",
"amount": 4999,
"currency": "usd",
"customer": {
"id": "cus_Pq8m9vXyZ1234",
"email": "alex.morgan@example.com",
"metadata": {
"plan": "pro",
"signup_source": "organic"
}
},
"payment_method": {
"type": "card",
"card": {
"brand": "visa",
"last4": "4242",
"exp_month": 11,
"exp_year": 2026
}
},
"created": 1708523400
}
You want to load this into a spreadsheet, a SQL table, or a CSV file. The problem is that every tool expecting tabular data needs a flat structure — one level, no nesting. The customer.metadata.plan value doesn't have a natural column to live in. And if you have thousands of these records, you need a systematic way to flatten them all consistently.
That's what this guide covers.
What Does "Flattening" Mean?
Flattening takes a nested JSON object and produces a single-level dictionary where each key represents the full path to a value. Using dot notation, the nested Stripe payment above becomes:
{
"id": "ch_3OqK2L2eZvKYlo2C1234abcd",
"amount": 4999,
"currency": "usd",
"customer.id": "cus_Pq8m9vXyZ1234",
"customer.email": "alex.morgan@example.com",
"customer.metadata.plan": "pro",
"customer.metadata.signup_source": "organic",
"payment_method.type": "card",
"payment_method.card.brand": "visa",
"payment_method.card.last4": "4242",
"payment_method.card.exp_month": 11,
"payment_method.card.exp_year": 2026,
"created": 1708523400
}
Each nested key is now a flat key with the full path encoded in the name. customer.metadata.plan tells you exactly where in the original structure the value came from. This flat structure maps cleanly to a CSV row or a database table column.
Why You Need to Flatten JSON
Flattening comes up constantly in data engineering work:
Exporting to CSV or Excel. CSV has no concept of nested data. Before converting any nested JSON to CSV, you need to flatten it. Otherwise you end up with [object Object] filling your columns instead of real values. Our JSON to CSV converter handles this automatically, but understanding what's happening under the hood helps when things go wrong.
Loading into SQL databases. Relational tables are flat by definition. A payments table needs discrete columns: customer_email, card_brand, card_last4. You can't store a nested object in a single column and query it efficiently (without JSON column types, which not every situation warrants).
Training machine learning models. Feature engineering for ML requires numeric or categorical columns. A nested object can't be passed directly to scikit-learn. You need each potential feature to be its own column before you can normalize, encode, or select it.
Logging and analytics pipelines. Tools like Elasticsearch, BigQuery, and Segment work best with flat event schemas. Deeply nested events are harder to query and harder to visualize in dashboards. Flattening at ingest time makes downstream queries much simpler.
Flattening JSON in Python
Option 1: A Manual Recursive Function
This approach gives you full control over how keys are joined and how arrays are handled:
def flatten_json(obj, parent_key="", sep="."):
"""
Recursively flatten a nested dictionary.
Arrays of primitives are joined with a pipe character.
"""
items = {}
for key, value in obj.items():
new_key = f"{parent_key}{sep}{key}" if parent_key else key
if isinstance(value, dict):
# Recurse into nested objects
items.update(flatten_json(value, new_key, sep=sep))
elif isinstance(value, list):
# For arrays: join primitives, index into objects
if all(isinstance(v, (str, int, float, bool, type(None))) for v in value):
items[new_key] = "|".join(str(v) for v in value if v is not None)
else:
for i, v in enumerate(value):
if isinstance(v, dict):
items.update(flatten_json(v, f"{new_key}[{i}]", sep=sep))
else:
items[f"{new_key}[{i}]"] = v
else:
items[new_key] = value
return items
# Example usage
payment = {
"id": "ch_3OqK2L2eZvKYlo2C1234abcd",
"amount": 4999,
"customer": {
"email": "alex.morgan@example.com",
"metadata": {"plan": "pro"}
}
}
flat = flatten_json(payment)
print(flat)
# {
# 'id': 'ch_3OqK2L2eZvKYlo2C1234abcd',
# 'amount': 4999,
# 'customer.email': 'alex.morgan@example.com',
# 'customer.metadata.plan': 'pro'
# }
Option 2: pandas json_normalize
If you're already using pandas, json_normalize handles most cases in a single line:
import pandas as pd
import json
with open("payments.json") as f:
data = json.load(f)
# data is a list of payment objects
df = pd.json_normalize(data, sep=".")
print(df.columns.tolist())
# ['id', 'amount', 'currency', 'customer.id', 'customer.email',
# 'customer.metadata.plan', 'payment_method.card.brand', ...]
df.to_csv("payments_flat.csv", index=False)
json_normalize is concise and handles most real-world nesting well. The manual approach is better when you need custom array handling or specific separator logic.
Flattening JSON in JavaScript
Recursive Flatten in Node.js
function flattenJSON(obj, parentKey = "", sep = ".") {
const result = {};
for (const [key, value] of Object.entries(obj)) {
const newKey = parentKey ? `${parentKey}${sep}${key}` : key;
if (value !== null && typeof value === "object" && !Array.isArray(value)) {
// Recurse into nested objects
Object.assign(result, flattenJSON(value, newKey, sep));
} else if (Array.isArray(value)) {
const allPrimitive = value.every(
(v) => v === null || typeof v !== "object"
);
if (allPrimitive) {
result[newKey] = value.join("|");
} else {
value.forEach((item, index) => {
if (item !== null && typeof item === "object") {
Object.assign(result, flattenJSON(item, `${newKey}[${index}]`, sep));
} else {
result[`${newKey}[${index}]`] = item;
}
});
}
} else {
result[newKey] = value;
}
}
return result;
}
// Example usage
const payment = {
id: "ch_3OqK2L2eZvKYlo2C1234abcd",
amount: 4999,
customer: {
email: "alex.morgan@example.com",
metadata: { plan: "pro" },
},
};
console.log(flattenJSON(payment));
// {
// id: 'ch_3OqK2L2eZvKYlo2C1234abcd',
// amount: 4999,
// 'customer.email': 'alex.morgan@example.com',
// 'customer.metadata.plan': 'pro'
// }
In a browser context, this same function works identically — paste it into a <script> tag or an ES module. To convert to CSV in the browser, pair it with a library like Papa Parse:
import Papa from "papaparse";
const payments = await fetch("/api/payments").then((r) => r.json());
const flatPayments = payments.map((p) => flattenJSON(p));
const csv = Papa.unparse(flatPayments);
// Trigger download
const blob = new Blob([csv], { type: "text/csv" });
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = "payments.csv";
a.click();
Handling Arrays in Nested JSON
Arrays are where flattening gets genuinely tricky. There are three main strategies, and the right one depends on your data:
Index notation — orders[0].id, orders[1].id. Each array element gets its own set of columns. This works if all records have arrays of the same length, but breaks down quickly when array sizes vary (you end up with sparse columns and hundreds of orders[4].sku columns that are empty for most rows).
Inline — Join array elements into a single string: "admin|editor" or "admin; editor". This works well for arrays of primitives (tags, roles, categories) where you just need to see all the values and don't need to query individual elements. The delimiter choice matters — avoid commas since you're likely outputting to CSV.
Expand to multiple rows — Each array element becomes its own row, with the parent object's fields duplicated. If an order has 3 line items, you get 3 rows, each containing the order ID, customer info, and one line item. This is the right approach when each array element is a first-class record that you need to query and filter independently.
Use inline for simple tag arrays. Use row expansion for order line items, transaction histories, or anything where each array element has business meaning on its own.
Choosing a Separator
The separator character in flattened key names matters more than it seems:
Dot notation (.) is the most readable: customer.address.city. It mirrors how you'd access the field in code, making flattened column names feel natural. The downside: SQL column names can't contain dots without quoting, so customer.address.city needs to be "customer.address.city" in a SQL query.
Underscore (_) produces SQL-safe column names: customer_address_city. More practical for database imports. The risk is key collisions — if your JSON has both customer_address (a flat field) and customer.address (a nested object), flattening with underscores will produce duplicate keys.
Slash (/) produces path-like keys: customer/address/city. Useful when the flattened structure is meant to represent file-path-like hierarchies, common in some logging and telemetry schemas.
For most use cases: dot notation for human readability, underscore for database/CSV column names.
The Easy Way: Use a Tool
If you're not writing a pipeline and just need to flatten a JSON file quickly, our flatten JSON tool handles this in seconds. Paste or upload your JSON, choose your separator (dot, underscore, or custom), configure how arrays should be handled (index, inline, or expand to rows), and download the flattened result as JSON or CSV.
The tool is particularly useful when dealing with inconsistent real-world JSON where some records have nested objects and others have the same field as a primitive — a common inconsistency in API responses that trips up most scripts.
Common Gotchas
Circular references. JSON itself can't contain circular references (the format doesn't allow it), but JavaScript objects can. If you're flattening in-memory JavaScript objects rather than parsed JSON strings, a circular reference will cause infinite recursion. Add a visited Set to guard against it.
Very deep nesting and performance. Most real-world JSON is 3-5 levels deep, which is fine. But some schemas go 10+ levels deep, and recursive flattening on large arrays of deeply nested objects can be slow. Profile before deploying a recursive flatten in a hot path.
Key collisions. If your JSON has a field customer_email at the top level and also customer.email as a nested field, flattening with underscores produces two keys that map to customer_email. The second one silently overwrites the first. Always check for collisions, especially when flattening data from external APIs where you don't control the schema.
Frequently Asked Questions
What is the difference between flattening and normalizing JSON?
Flattening collapses nested objects into a single level using compound key names. Normalization (in the database sense) means decomposing data into related tables to eliminate redundancy. When using pd.json_normalize, "normalization" refers to flattening into a tabular structure — it's closer to flattening than to database normalization.
Does flattening JSON lose any data?
It depends on how arrays are handled. If you use inline joining for arrays of objects (not just primitives), you will lose structure. For example, joining [{"sku": "A"}, {"sku": "B"}] as a string collapses the internal structure. Index notation and row expansion preserve all data; inline joining is lossy for complex arrays.
Can I reverse a flattened JSON back to nested?
Yes — if you used a consistent separator and there are no key collisions, you can reconstruct the nested structure by splitting keys on the separator and building the hierarchy back up. This is sometimes called "unflattening" or "expanding" a flat object. The flatten JSON tool supports both directions.