csvjson

Common JSON to CSV Errors in Python (and How to Fix Them)

A complete reference for every error you'll hit converting JSON to CSV in Python — UnicodeDecodeError, JSONDecodeError, object Object in cells, KeyError, MemoryError, and more. Includes working fixes for each.

Converting JSON to CSV in Python fails in predictable ways. The same handful of errors appear across every project, every API, every team. This guide covers each one: what causes it, what the traceback looks like, and the exact fix.

This guide pairs with the complete guide to converting JSON to CSV in Python. If you're just getting started, read that first — it covers the three main approaches. Come back here when something breaks.

1. [object Object] in CSV Cells

Symptom: Your CSV opens in Excel and half the cells contain [object Object] or {'key': 'value'} as a raw string.

What happened: Your script serialized a Python dict or list directly into a CSV cell. When Python's csv.writer calls str() on a dict, you get {'name': 'Alice', 'age': 30}. JavaScript's equivalent gives [object Object].

Example that produces this:

import csv, json

data = [{"id": 1, "address": {"city": "Austin", "state": "TX"}}]

with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(data[0].keys())   # writes: id, address
    writer.writerow(data[0].values()) # writes: 1, {'city': 'Austin', 'state': 'TX'}

Fix: Flatten the nested object before writing. Use pandas.json_normalize for the simplest path:

import pandas as pd, json

with open("data.json") as f:
    data = json.load(f)

df = pd.json_normalize(data, sep=".")
df.to_csv("output.csv", index=False)
# Columns: id, address.city, address.state

Or write a recursive flatten if you can't use pandas:

import json

def flatten(obj, prefix="", sep="."):
    result = {}
    for key, value in obj.items():
        new_key = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            result.update(flatten(value, new_key, sep))
        elif isinstance(value, list):
            # For primitive arrays, join with pipe
            if all(not isinstance(v, (dict, list)) for v in value):
                result[new_key] = "|".join(str(v) for v in value)
            else:
                result[new_key] = json.dumps(value)
        else:
            result[new_key] = value
    return result

See the nested JSON to CSV guide for a complete breakdown of every nesting strategy.


2. json.decoder.JSONDecodeError

Symptom:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 3 column 3 (char 12)

or

json.decoder.JSONDecodeError: Expecting value: line 8 column 1 (char 201)

What happened: The file isn't valid JSON. Common causes:

  • Single-quoted strings ('name': 'Alice') instead of double-quoted ("name": "Alice")
  • Trailing commas after the last item in an array or object
  • Comments (// this is a comment) — standard JSON doesn't support comments
  • Unquoted keys ({name: "Alice"})
  • Truncated files — the download or write operation was interrupted
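Each of these reproduces in one line. Single quotes, for example:

```python
import json

bad = "{'name': 'Alice'}"   # single quotes — invalid JSON

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print(e.msg)  # Expecting property name enclosed in double quotes
```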

Fix 1 — Validate and locate the error:

Paste the content into our JSON validator. It shows the exact line and character where the problem occurs, which is far more useful than the json.decoder traceback alone.

Fix 2 — Use json5 for lenient parsing:

# pip install json5
import json5

with open("data.json") as f:
    data = json5.load(f)  # Accepts trailing commas, comments, single quotes

json5 parses a superset of JSON that allows trailing commas, comments, and single-quoted strings. It's slower than the standard library, but useful for developer-produced JSON that bends the spec.

Fix 3 — Strip a common problem: BOM characters at file start:

import json

with open("data.json", encoding="utf-8-sig") as f:  # -sig strips BOM
    data = json.load(f)

A BOM (byte order mark) at the start of a UTF-8 file causes JSONDecodeError because json.load sees a non-JSON character before the { or [.
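A self-contained demo of the failure and the fix (no file needed — the BOM bytes are constructed inline):

```python
import json

raw = b"\xef\xbb\xbf" + b'{"name": "Alice"}'   # UTF-8 BOM, then valid JSON

try:
    json.loads(raw.decode("utf-8"))   # BOM survives as \ufeff — parse fails
except json.JSONDecodeError as e:
    print(e.msg)                      # Expecting value

data = json.loads(raw.decode("utf-8-sig"))   # -sig strips the BOM
print(data["name"])                          # Alice
```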

Fix 4 — Check for truncation:

import json

try:
    with open("data.json") as f:
        data = json.load(f)
except json.JSONDecodeError as e:
    print(f"Error at line {e.lineno}, col {e.colno}: {e.msg}")
    # Read a window around the error location
    with open("data.json") as f:
        lines = f.readlines()
    start = max(0, e.lineno - 3)
    end = min(len(lines), e.lineno + 2)
    for i, line in enumerate(lines[start:end], start=start + 1):
        marker = ">>>" if i == e.lineno else "   "
        print(f"{marker} {i}: {line}", end="")

3. UnicodeDecodeError When Reading JSON

Symptom:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

or

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 142: character maps to <undefined>

What happened: The file was saved in a non-UTF-8 encoding — Latin-1 (ISO-8859-1), Windows-1252 (cp1252), or UTF-16 are the most common. Python 3 defaults to the system locale encoding when encoding is not specified, which on Windows is often cp1252.

Fix 1 — Specify UTF-8 explicitly:

with open("data.json", encoding="utf-8") as f:
    data = json.load(f)

Fix 2 — Detect the encoding automatically:

# pip install chardet
import chardet, json

with open("data.json", "rb") as f:
    raw = f.read()

detected = chardet.detect(raw)
encoding = detected["encoding"] or "utf-8"  # chardet returns None when unsure
confidence = detected["confidence"]
print(f"Detected: {encoding} ({confidence:.0%} confidence)")

data = json.loads(raw.decode(encoding))

Fix 3 — Handle files with a BOM:

UTF-8 files from Windows tools sometimes have a UTF-8 BOM (bytes EF BB BF at the start). Use encoding="utf-8-sig" to strip it automatically:

with open("data.json", encoding="utf-8-sig") as f:
    data = json.load(f)

Fix 4 — Force UTF-8 and replace undecodable bytes:

If you know the file is mostly UTF-8 but has a few bad bytes, use errors="replace" as a last resort:

with open("data.json", encoding="utf-8", errors="replace") as f:
    data = json.load(f)

This replaces undecodable bytes with \ufffd (the Unicode replacement character). It's lossy, but it gets the conversion done when you can't fix the source file.


4. KeyError on Inconsistent Records

Symptom:

KeyError: 'email'

This usually happens inside a loop or when accessing data[0].keys() to get headers, then finding that record 47 doesn't have an email field.

What happened: Real-world JSON records are often inconsistent. Optional fields, partial API responses, or records created before a field was added can all produce missing keys.

Fix 1 — Collect the union of all keys first:

import json, csv

with open("contacts.json") as f:
    data = json.load(f)

# Pass 1: collect all keys
all_keys = []
seen = set()
for record in data:
    for key in record.keys():
        if key not in seen:
            all_keys.append(key)
            seen.add(key)

# Pass 2: write with DictWriter (restval="" fills missing fields)
with open("contacts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=all_keys, restval="")
    writer.writeheader()
    writer.writerows(data)

The restval="" parameter fills any missing keys with an empty string instead of raising KeyError.

Fix 2 — Use pandas (handles this automatically):

import pandas as pd, json

with open("contacts.json") as f:
    data = json.load(f)

# NaN fills missing fields; fillna("") converts to empty strings
df = pd.json_normalize(data, sep=".")
df.fillna("").to_csv("contacts.csv", index=False)

json_normalize takes the union of keys across all records and fills missing fields with NaN, which fillna("") converts to empty strings in the CSV. (The errors parameter of json_normalize only applies to meta paths, so you don't need it here.)


5. Double Line Breaks on Windows (\r\r\n)

Symptom: Your CSV has blank rows between every data row when opened in Excel or a text editor. The file looks like:

id,name,email
1,Alice,alice@example.com

2,Bob,bob@example.com

What happened: Python's csv module adds its own \r\n line endings. If you open the file in text mode (the default), Python also translates \n to \r\n on Windows, resulting in \r\r\n per line — which most tools render as two line breaks.

Fix: Always open CSV files with newline="":

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "email"])
    writer.writeheader()
    writer.writerows(data)

The newline="" tells Python not to do any newline translation, leaving csv.writer in full control of line endings.
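To confirm the fix, read the file back in binary and inspect the line terminator. A minimal self-contained check (the two-column data here is just an illustration):

```python
import csv

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name"])
    writer.writeheader()
    writer.writerow({"id": 1, "name": "Alice"})

with open("output.csv", "rb") as f:
    header = f.readline()

print(header)  # b'id,name\r\n' — exactly one \r\n, no doubled \r
```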


6. MemoryError on Large JSON Files

Symptom:

MemoryError

Or the process is killed by the OS before completing. The file is several GB and json.load() tries to load the entire thing into RAM.

Fix 1 — Stream with ijson:

ijson parses JSON incrementally — one record at a time — without loading the entire file into memory.

# pip install ijson
import ijson, csv

with open("large_data.json", "rb") as f_in, \
     open("output.csv", "w", newline="", encoding="utf-8") as f_out:

    writer = None
    for record in ijson.items(f_in, "item"):
        # "item" means: each element of the top-level array
        if writer is None:
            # Headers come from the first record; restval="" fills keys a
            # later record lacks. Add extrasaction="ignore" if later records
            # may carry extra keys, which would otherwise raise ValueError.
            writer = csv.DictWriter(f_out, fieldnames=record.keys(), restval="")
            writer.writeheader()
        writer.writerow(record)

print("Done")

ijson.items(f, "item") yields one dict per array element. Memory usage stays constant regardless of file size.

Fix 2 — Read in chunks with pandas:

If the file is newline-delimited JSON (each line is a JSON object), pandas can read it in chunks:

import pandas as pd

chunk_size = 10_000
first_chunk = True

for chunk in pd.read_json("large_data.ndjson", lines=True, chunksize=chunk_size):
    chunk.to_csv(
        "output.csv",
        mode="w" if first_chunk else "a",
        header=first_chunk,
        index=False,
    )
    first_chunk = False
    print(f"Wrote {len(chunk)} rows")

Fix 3 — Use our online converter for files up to ~50MB:

For moderately large files, our JSON to CSV converter runs entirely in your browser with no file upload. It processes up to around 50MB comfortably. Above that, use ijson.


7. Pandas ValueError: arrays must all be same length

Symptom:

ValueError: arrays must all be same length

This appears when calling pd.DataFrame(data) on a dict whose values are lists of different lengths — typically column-oriented JSON like {"id": [1, 2, 3], "tags": ["a", "b"]}.

What happened: pd.DataFrame treats a dict of lists as columns, and every column must be the same length. (A list of dicts never triggers this error — pandas fills missing fields with NaN — so check whether your JSON root is column-oriented.)

Fix: Wrap each list in a pd.Series so pandas pads the shorter columns with NaN:

# Don't do this:
df = pd.DataFrame(data)  # ValueError if the lists have different lengths

# Do this instead — shorter columns are padded with NaN:
df = pd.DataFrame({key: pd.Series(values) for key, values in data.items()})

For record-oriented data (a list of dicts) containing primitive arrays you want to inline, convert them to strings before normalizing:

for record in data:
    for key, value in record.items():
        if isinstance(value, list) and all(not isinstance(v, dict) for v in value):
            record[key] = "|".join(str(v) for v in value)

df = pd.json_normalize(data, sep=".")

8. Empty CSV Output (Zero Rows)

Symptom: The script runs without errors, but the output CSV is empty or contains only headers.

Cause 1 — The JSON root is an object, not an array:

Many APIs wrap their data: {"data": [...], "meta": {...}}. If you pass the root object to csv.DictWriter directly, you're writing one row representing the whole envelope, not the items inside.

with open("api_response.json") as f:
    response = json.load(f)

# Wrong:
data = response  # The whole envelope object

# Correct:
data = response["data"]  # The actual array of records

Cause 2 — Generator exhausted before writing:

If you compute headers from a generator, the generator is exhausted before writerows ever runs:

# Risky — building headers iterates the generator and exhausts it
rows = (flatten(r) for r in data)
headers = {key for row in rows for key in row}  # consumes every row
writer.writerows(rows)  # writes nothing: the generator is already empty
Use list() to materialize the data when you need to iterate it more than once.

Cause 3 — Wrong file mode overwrites on each iteration:

Opening with "w" inside a loop replaces the file on every iteration. Use "a" (append) for subsequent chunks, or open the file once outside the loop.
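A sketch of the safe pattern, with hypothetical in-memory chunks standing in for whatever your loop produces:

```python
import csv

chunks = [[{"id": 1}], [{"id": 2}], [{"id": 3}]]   # e.g. pages from an API

# Open once, outside the loop — every chunk lands in the same file.
# Opening with "w" inside the loop would keep only the last chunk.
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id"])
    writer.writeheader()
    for chunk in chunks:
        writer.writerows(chunk)
```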


9. CSV Headers Out of Order

Symptom: The CSV columns appear in a random order that changes each run, making downstream pipelines brittle.

What happened: In Python versions before 3.7, dict didn't preserve insertion order. Even in modern Python, collecting keys with a set loses order entirely, and different records may list their keys in different orders — so headers derived from the data can shift between runs.

Fix — Specify an explicit column order:

import json, csv

with open("data.json") as f:
    data = json.load(f)

# Define the columns you want, in the order you want them
COLUMNS = ["id", "name", "email", "created_at", "plan"]

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS, extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(data)

extrasaction="ignore" silently skips fields not in COLUMNS. restval="" fills any missing fields with empty strings.


10. Dates and Numbers Formatted Incorrectly in CSV

Symptom: Dates come out as integers (1706400000), timestamps lose their timezone (2024-01-28T00:00:00), or floats appear with excessive precision (89.98999999999999).

Fix — Convert types before writing:

from datetime import datetime, timezone
import json, csv

def normalize_record(record):
    result = {}
    for key, value in record.items():
        # Unix timestamps → ISO 8601
        if key.endswith("_at") and isinstance(value, (int, float)):
            result[key] = datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
        # Floats → rounded to 2 decimal places
        elif isinstance(value, float):
            result[key] = round(value, 2)
        else:
            result[key] = value
    return result

with open("data.json") as f:
    data = json.load(f)

normalized = [normalize_record(r) for r in data]

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=normalized[0].keys())
    writer.writeheader()
    writer.writerows(normalized)

For pandas, use .dt accessor methods:

df = pd.json_normalize(data, sep=".")

# Convert Unix timestamp columns to readable dates
for col in ["created_at", "updated_at"]:
    if col in df.columns:
        df[col] = pd.to_datetime(df[col], unit="s", utc=True).dt.strftime("%Y-%m-%d %H:%M:%S")

# Round floats
float_cols = df.select_dtypes(include="float").columns
df[float_cols] = df[float_cols].round(2)

df.to_csv("output.csv", index=False)

Quick Reference Table

| Error | Cause | Fix |
|-------|-------|-----|
| [object Object] in cells | Dict/list written as string | Flatten before writing; use json_normalize |
| JSONDecodeError | Malformed JSON | Validate with JSON validator; use json5 for lenient files |
| UnicodeDecodeError | Wrong file encoding | Add encoding="utf-8-sig"; detect with chardet |
| KeyError on field | Inconsistent records | Use DictWriter(restval="") with union of all keys |
| Double blank rows | Missing newline="" | open(file, "w", newline="") always |
| MemoryError | Huge file loaded at once | Stream with ijson; chunk with pandas |
| ValueError: arrays must all be same length | Dict of unequal-length lists in pd.DataFrame() | Wrap each list in pd.Series |
| Empty output | Wrong root key; generator exhausted | Unwrap the data array; materialize generators |
| Columns out of order | Non-deterministic key collection | Pass explicit fieldnames list to DictWriter |
| Dates/numbers malformatted | No type conversion | Normalize before writing; use .dt in pandas |


Frequently Asked Questions

Why does my JSON to CSV script work locally but fail in production?

Three common causes: (1) The production file has a different encoding than your local test file — always specify encoding="utf-8". (2) The production data has more records with edge cases your local sample didn't have — use DictWriter with restval="" and an explicit fieldnames list. (3) The production environment doesn't have pandas installed — test in the exact same environment you deploy to.

How do I debug a JSONDecodeError without reading the entire file?

Get the lineno and colno from the exception, then read just a few lines around that position. The code snippet in error #2 above shows exactly how to do this. Alternatively, paste the content into our JSON validator which highlights the exact error inline.

My CSV looks fine in a text editor but Excel shows it wrong. Why?

Excel has opinions about encoding (it prefers UTF-8 with BOM for CSV), date formats, and delimiters. Write with encoding="utf-8-sig" to add the BOM that Excel expects. If Excel misparses dates or numbers, use explicit string formatting before writing. If the delimiter is wrong (some locales expect ; instead of ,), use csv.writer(f, delimiter=";").
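For example, a sketch of Excel-friendly output (the semicolon delimiter is shown for locales that expect it — keep the default comma otherwise):

```python
import csv

rows = [{"id": 1, "name": "Zoë"}]

# utf-8-sig prepends the BOM Excel uses to detect UTF-8
with open("excel.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name"], delimiter=";")
    writer.writeheader()
    writer.writerows(rows)
```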

Can I convert JSON to CSV without writing any Python?

Yes. Our JSON to CSV converter handles nested JSON, inconsistent records, arrays, and mixed types directly in your browser — no code, no dependencies, no install. Use it for one-off conversions and come back to Python for automation.

How do I handle a JSON file where the array is nested inside a wrapper object?

Unwrap it before passing to the converter:

import json

with open("api_response.json") as f:
    response = json.load(f)

# Find the array — common wrapper patterns:
data = response.get("data") or response.get("items") or response.get("results") or response

if not isinstance(data, list):
    raise ValueError(f"Expected a list, got {type(data).__name__}. Keys: {list(response.keys())}")