DataShift

How to Convert Nested JSON to CSV: A Complete Guide

Converting nested JSON to CSV requires a flattening strategy. This guide explains how to handle nested objects, arrays of objects, and mixed structures—with code examples.

Flat JSON converts to CSV easily. An array of objects where every value is a string or number? One line in pandas. Done.

This guide focuses specifically on nested structures. If you're looking for the full Python workflow — standard library, pandas, real API examples, and error handling — see the complete guide to converting JSON to CSV in Python.

Nested JSON is where things get complicated. Real-world APIs — Salesforce, HubSpot, Shopify, Stripe — return data with nested objects, arrays of objects, and mixed structures that don't fit neatly into rows and columns. When a naive converter coerces each nested value straight to a string and drops it into a CSV row, you get [object Object] in half your columns and no actual data.

This guide walks through three concrete strategies for converting nested JSON to CSV, with working code examples and a clear decision framework for choosing between them.

Why Nested JSON Breaks Simple CSV Converters

Consider a HubSpot contact export. Each contact looks something like this:

{
  "id": "103940",
  "properties": {
    "firstname": "Sarah",
    "lastname": "Chen",
    "email": "sarah.chen@techcorp.io",
    "company": "TechCorp",
    "lifecyclestage": "customer"
  },
  "associations": {
    "deals": {
      "results": [
        {"id": "deal_8821", "type": "contact_to_deal"},
        {"id": "deal_8975", "type": "contact_to_deal"}
      ]
    }
  },
  "createdAt": "2024-01-10T14:22:00.000Z",
  "updatedAt": "2024-09-03T08:44:00.000Z"
}

A naive converter might just call Object.values() and join with commas, producing a row like:

103940,[object Object],[object Object],2024-01-10T14:22:00.000Z,2024-09-03T08:44:00.000Z

The properties object and associations object have become [object Object] — completely useless. The nested structure needs to be resolved before CSV conversion is possible.

Strategy 1: Dot-Notation Flattening

The most common approach is to recursively flatten the nested object into a single level, using dot-separated keys to encode the path to each value.

The HubSpot contact above flattens to:

{
  "id": "103940",
  "properties.firstname": "Sarah",
  "properties.lastname": "Chen",
  "properties.email": "sarah.chen@techcorp.io",
  "properties.company": "TechCorp",
  "properties.lifecyclestage": "customer",
  "associations.deals.results[0].id": "deal_8821",
  "associations.deals.results[0].type": "contact_to_deal",
  "associations.deals.results[1].id": "deal_8975",
  "associations.deals.results[1].type": "contact_to_deal",
  "createdAt": "2024-01-10T14:22:00.000Z",
  "updatedAt": "2024-09-03T08:44:00.000Z"
}

This works well when the nested structure is consistent across records and the arrays are short. It becomes unwieldy when arrays are long or variable-length, producing sparse columns with hundreds of [N] indexed entries.
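
A minimal sketch of such a flattener (the name flatten is ours, not a library function; it handles nested objects, indexed arrays, and null leaves, but a production version would also want depth and cycle guards):

```python
def flatten(obj, prefix=""):
    """Recursively flatten dicts and lists into dot/bracket-notation keys."""
    result = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{prefix}.{key}" if prefix else key
            result.update(flatten(value, new_key))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            result.update(flatten(value, f"{prefix}[{i}]"))
    else:
        # Scalars (including None) become leaf values under the built-up key
        result[prefix] = obj
    return result
```

Run against the HubSpot contact above, this produces exactly the dot-and-bracket keys shown.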

See our complete guide to flattening nested JSON for detailed implementation code.

Strategy 2: Expand Arrays to Multiple Rows

When your JSON has arrays of objects where each element is a meaningful record, the right approach is to expand — each array element becomes its own CSV row, with the parent object's fields repeated on every row.

For a Shopify order with line items:

{
  "order_id": "5001",
  "customer_email": "buyer@example.com",
  "order_date": "2024-11-20",
  "line_items": [
    {"sku": "SHOE-BLK-42", "quantity": 1, "unit_price": 89.99},
    {"sku": "SOCK-GRY-M", "quantity": 3, "unit_price": 7.99}
  ]
}

Expanding line items gives you:

order_id,customer_email,order_date,sku,quantity,unit_price
5001,buyer@example.com,2024-11-20,SHOE-BLK-42,1,89.99
5001,buyer@example.com,2024-11-20,SOCK-GRY-M,3,7.99

Here's the Python implementation:

import csv
import json

def expand_to_rows(record, array_field, parent_fields=None):
    """
    Expand a nested array of objects into multiple flat rows.
    parent_fields: list of top-level fields to include in every row.
                   If None, all non-array fields are used.
    """
    if parent_fields is None:
        parent_fields = [k for k, v in record.items() if not isinstance(v, list)]

    parent_data = {k: record[k] for k in parent_fields if k in record}
    items = record.get(array_field, [])

    if not items:
        # Still include the parent record even if array is empty
        return [parent_data]

    rows = []
    for item in items:
        row = {**parent_data}
        if isinstance(item, dict):
            row.update(item)
        else:
            row[array_field] = item
        rows.append(row)

    return rows


# Usage
orders = [
    {
        "order_id": "5001",
        "customer_email": "buyer@example.com",
        "order_date": "2024-11-20",
        "line_items": [
            {"sku": "SHOE-BLK-42", "quantity": 1, "unit_price": 89.99},
            {"sku": "SOCK-GRY-M", "quantity": 3, "unit_price": 7.99},
        ],
    }
]

all_rows = []
for order in orders:
    all_rows.extend(expand_to_rows(order, "line_items"))

# Write to CSV — use the union of keys across all rows so rows with
# differing fields (e.g. from empty arrays) don't break DictWriter
if all_rows:
    fieldnames = list(dict.fromkeys(k for row in all_rows for k in row))
    with open("order_items.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(all_rows)

Strategy 3: Inline Primitive Arrays

When an array contains only strings, numbers, or booleans — not objects — the simplest approach is to join them into a single delimited string.

{
  "user_id": "u_4421",
  "email": "dev@example.com",
  "roles": ["admin", "editor", "billing"],
  "tag_ids": [14, 22, 31]
}

Inline joining produces:

user_id,email,roles,tag_ids
u_4421,dev@example.com,admin|editor|billing,14|22|31

import json

def inline_arrays(record, sep="|"):
    """Join primitive arrays into delimited strings."""
    result = {}
    for key, value in record.items():
        if isinstance(value, list):
            if all(isinstance(v, (str, int, float, bool, type(None))) for v in value):
                result[key] = sep.join(str(v) for v in value if v is not None)
            else:
                # Don't inline arrays of objects — use a different strategy
                result[key] = json.dumps(value)
        else:
            result[key] = value
    return result

Note: Avoid using commas as the inline separator. Use a pipe (|), semicolon (;), or another character that won't appear in your values. Using a comma inside a CSV cell creates ambiguity that requires quoting.
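
To see why: Python's csv module will quote any cell that contains the delimiter, which round-trips correctly but is harder to scan and harder to split downstream than a pipe-separated value:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)  # default QUOTE_MINIMAL
writer.writerow(["u_4421", "admin,editor,billing"])  # comma inside the cell
writer.writerow(["u_4422", "admin|editor|billing"])  # pipe-separated
print(buf.getvalue())
# u_4421,"admin,editor,billing"
# u_4422,admin|editor|billing
```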

Deciding Which Strategy to Use

Here's a straightforward decision process:

  1. Is the nested value an array of objects? Use row expansion. Each object in the array likely has independent business meaning (line items, transactions, comments).

  2. Is the nested value an array of primitives (strings, numbers)? Use inline joining. Tags, categories, and IDs are best read as a single delimited value.

  3. Is the nested value an object (not an array)? Use dot-notation flattening. Address objects, metadata objects, and config blobs flatten cleanly.

  4. Mixed structure? Combine strategies: flatten the nested objects, inline the primitive arrays, expand the object arrays.
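
As a sketch, one pass over a record can apply these rules and set object arrays aside for row expansion (normalize_record is an illustrative name; the dict flattening here is one level deep for brevity):

```python
def normalize_record(record, sep="|"):
    """Apply the decision rules to one record: flatten nested objects,
    inline primitive arrays, and collect object arrays for row expansion."""
    flat, to_expand = {}, {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Rule 3: dot-notation flattening for nested objects
            for k, v in value.items():
                flat[f"{key}.{k}"] = v
        elif isinstance(value, list):
            if all(not isinstance(v, dict) for v in value):
                # Rule 2: inline primitive arrays
                flat[key] = sep.join(str(v) for v in value)
            else:
                # Rule 1: object arrays are set aside for row expansion
                to_expand[key] = value
        else:
            flat[key] = value
    return flat, to_expand
```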

Python: Converting Nested JSON to CSV with pandas

For nested JSON with objects (not arrays), pandas.json_normalize is the cleanest solution:

import pandas as pd
import json

with open("hubspot_contacts.json") as f:
    contacts = json.load(f)

# Flatten nested objects (like 'properties') using dot notation.
# Arrays of objects are handled separately with record_path — see below.
df = pd.json_normalize(
    contacts,
    sep=".",
    max_level=3  # Limit depth if needed
)

# Some columns may still contain objects or arrays — inspect them
object_cols = [col for col in df.columns if df[col].apply(lambda x: isinstance(x, (dict, list))).any()]
print(f"Columns still containing nested data: {object_cols}")

# For primitive array columns, inline them (tolerate NaN and scalar cells,
# which would otherwise make the all-lists check fail)
for col in object_cols:
    is_safe = df[col].apply(
        lambda x: not isinstance(x, (dict, list))
        or all(not isinstance(i, dict) for i in x)
    )
    if is_safe.all():
        df[col] = df[col].apply(lambda x: "|".join(str(i) for i in x) if isinstance(x, list) else x)

df.to_csv("contacts_flat.csv", index=False)
print(f"Exported {len(df)} rows, {len(df.columns)} columns")

For arrays of objects within each record, use record_path to expand them:

# Expand order line items using record_path
df = pd.json_normalize(
    orders,
    record_path=["line_items"],
    meta=["order_id", "customer_email", "order_date"],
    sep=".",
    errors="ignore"
)

df.to_csv("order_line_items.csv", index=False)

JavaScript: Converting Nested JSON to CSV

Here's a complete Node.js implementation that combines flattening and row expansion:

import { createWriteStream } from "fs";
import { stringify } from "csv-stringify";

function flattenObject(obj, prefix = "", sep = ".") {
  const result = {};
  for (const [key, value] of Object.entries(obj)) {
    const newKey = prefix ? `${prefix}${sep}${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(result, flattenObject(value, newKey, sep));
    } else if (Array.isArray(value)) {
      const allPrimitive = value.every((v) => typeof v !== "object" || v === null);
      if (allPrimitive) {
        result[newKey] = value.join("|");
      } else {
        // Store as JSON string — handle separately with expand logic
        result[newKey] = JSON.stringify(value);
      }
    } else {
      result[newKey] = value;
    }
  }
  return result;
}

function expandAndFlatten(records, arrayField) {
  const rows = [];
  for (const record of records) {
    const { [arrayField]: items, ...rest } = record;
    const flatParent = flattenObject(rest);

    if (!items || items.length === 0) {
      rows.push(flatParent);
      continue;
    }

    for (const item of items) {
      const flatItem = flattenObject(item);
      rows.push({ ...flatParent, ...flatItem });
    }
  }
  return rows;
}

// Usage
import { readFileSync } from "fs";

const ordersData = JSON.parse(readFileSync("./orders.json", "utf8"));

const rows = expandAndFlatten(ordersData, "line_items");

// Collect all unique headers
const headers = [...new Set(rows.flatMap(Object.keys))];

const output = createWriteStream("orders_flat.csv");
const stringifier = stringify({ header: true, columns: headers });
stringifier.pipe(output);

for (const row of rows) {
  stringifier.write(row);
}
stringifier.end();

output.on("finish", () => console.log(`Wrote ${rows.length} rows`));

Using an Online Converter

If you're dealing with a one-off conversion and don't need a script, our JSON to CSV converter handles nested JSON with configurable flattening options. You can choose dot-notation flattening for nested objects, row expansion for arrays, and inline joining for primitive arrays — all without writing a line of code. For pre-flattening complex structures before CSV conversion, the flatten JSON tool gives you fine-grained control over every option.

Edge Cases to Watch Out For

Null values in nested paths. If contact.address is null for some records, trying to access contact.address.city throws an error. Guard against this in recursive functions with null checks: if (value === null) { result[newKey] = null; continue; }.

Mixed types in arrays. Some APIs return arrays where some elements are objects and others are primitives. ["active", {"since": "2024-01-01"}, true] breaks most flattening logic that assumes uniform types. Add type checks and handle each element based on its actual type.
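
One defensive sketch, assuming you want a single inlined cell: JSON-encode the object elements and stringify everything else (inline_mixed is an illustrative helper name):

```python
import json

def inline_mixed(values, sep="|"):
    """Join a mixed-type array, JSON-encoding any object elements."""
    parts = []
    for v in values:
        if isinstance(v, dict):
            parts.append(json.dumps(v, sort_keys=True))
        else:
            parts.append(str(v))
    return sep.join(parts)
```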

Very deep nesting (5+ levels). Most recursive flatteners handle this fine, but the resulting column names become unwieldy: response.data.attributes.metadata.tracking.source.campaign. Consider using max_level in json_normalize to truncate at a reasonable depth and handle deeper levels separately.

Very large arrays. Row expansion multiplies your record count by the average array length. A 10,000-record JSON file where each record has 50 line items becomes a 500,000-row CSV. Make sure your tooling can handle the output size before running the expansion.
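
A pre-flight estimate is cheap to compute before committing to the expansion (an empty array still emits one parent row, matching the expand_to_rows behavior above):

```python
def estimate_expanded_rows(records, array_field):
    """Estimate how many CSV rows expansion will produce.
    Records with an empty or missing array still emit one parent row."""
    return sum(max(len(r.get(array_field, [])), 1) for r in records)
```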

Frequently Asked Questions

How do I convert a JSON array to CSV in Python?

If the JSON is a flat array of objects: pd.read_json("data.json").to_csv("data.csv", index=False). If nested, use pd.json_normalize() with the sep parameter. For arrays of objects within each record, use the record_path and meta parameters to expand and include parent fields.

Why does my converter output [object Object] in the CSV?

This happens when a CSV converter receives a field value that is a JavaScript object (or Python dict) and calls toString() or str() on it directly instead of recursively processing it. The converter doesn't know how to serialize the nested structure. You need to either flatten the JSON before conversion, or use a converter that handles nesting — like our JSON to CSV converter.

What happens to null values when converting nested JSON to CSV?

Null values in JSON become empty strings in CSV. Most CSV parsers treat an empty cell as null when loading into a database, so the round-trip is usually lossless. Watch out for the difference between a missing field (key not present in JSON) and an explicit null ("field": null) — both become empty CSV cells, which means you lose that distinction on round-trip.
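
The collapse is easy to demonstrate with Python's csv module (restval supplies the value for missing keys):

```python
import csv
import io

rows = [
    {"id": "1", "phone": None},  # explicit null
    {"id": "2"},                 # key missing entirely
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "phone"], restval="")
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
# id,phone
# 1,
# 2,
```

Both rows end up with an identical empty phone cell.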

How do I handle JSON where different records have different keys?

Use pd.json_normalize() with errors="ignore" — it fills missing fields with NaN (which becomes an empty CSV cell). In JavaScript, collect all keys across all records first, then write each row filling in empty strings for missing keys. Our JSON to CSV converter handles this automatically using a union of all keys across the dataset.