
Convert MongoDB Export to CSV

mongoexport produces NDJSON with extended JSON types — ObjectId, ISODate, NumberLong — that need special handling before you get clean CSV. Here's every pattern you'll hit.

MongoDB's mongoexport tool writes NDJSON (one JSON document per line) using MongoDB Extended JSON format. This means _id fields look like {"$oid": "65f1a2..."}, dates look like {"$date": "2025-01-15T..."}, and large integers look like {"$numberLong": "123456"}. Standard JSON parsers don't flatten these automatically — you need to extract the inner value. This guide covers mongoexport NDJSON, Atlas Data Explorer JSON, mongoexport's built-in --type=csv mode, and exporting directly with pymongo.

Want to skip the code? Paste your JSON directly into the converter — it handles nested objects, arrays, and large files automatically.

Open JSON to CSV Converter

What mongoexport returns

mongoexport NDJSON output — two documents

{"_id":{"$oid":"65f1a2b3c4d5e6f7a8b9c0d1"},"name":"Alice Chen","email":"alice@example.com","age":32,"department":"Engineering","joined":{"$date":"2022-06-15T00:00:00.000Z"},"active":true,"scores":[95,87,92],"address":{"city":"Austin","state":"TX"}}
{"_id":{"$oid":"65f1a2b3c4d5e6f7a8b9c0d2"},"name":"Bob Kumar","email":"bob@example.com","age":28,"department":"Design","joined":{"$date":"2023-01-10T00:00:00.000Z"},"active":true,"scores":[88,91,85],"address":{"city":"Seattle","state":"WA"}}

Field mapping: JSON path → CSV column

JSON path        CSV column    Notes
_id.$oid         id            Extract the string from {$oid: ...}
name             name          Plain string, no transformation
email            email
age              age           Integer
department       department
joined.$date     joined        Extract from {$date: ...}
active           active        Boolean
scores           scores        Array, joined as "95|87|92"
address.city     city          Nested document
address.state    state

Extended JSON types ($oid, $date, $numberLong, $numberDecimal) all follow the same pattern: the real value is the string inside the inner object. Extract with doc.get('$oid') or doc.get('$date').

Python conversion

Convert mongoexport NDJSON to CSV — handles all Extended JSON types

import json
import csv
from pathlib import Path

def extract_extended_json(value):
    """Unwrap MongoDB Extended JSON types to plain Python values."""
    if not isinstance(value, dict):
        return value
    if "$oid" in value:
        return value["$oid"]
    if "$date" in value:
        # Returns ISO string; slice to date only if preferred
        return value["$date"]
    if "$numberLong" in value:
        return int(value["$numberLong"])
    if "$numberDecimal" in value:
        return float(value["$numberDecimal"])
    if "$numberInt" in value:
        return int(value["$numberInt"])
    if "$binary" in value:
        return f"<binary:{value['$binary'].get('subType', '')}>"
    # Unknown extended type — return as-is
    return value

def flatten_doc(doc, prefix="", sep="."):
    """Recursively flatten nested documents."""
    out = {}
    for key, value in doc.items():
        full_key = f"{prefix}{sep}{key}" if prefix else key
        value = extract_extended_json(value)
        if isinstance(value, dict):
            out.update(flatten_doc(value, full_key, sep))
        elif isinstance(value, list):
            # Join scalar arrays; skip nested object arrays (handle separately)
            if all(not isinstance(v, (dict, list)) for v in value):
                out[full_key] = "|".join(str(extract_extended_json(v)) for v in value)
            else:
                out[full_key] = json.dumps(value)  # keep complex arrays as JSON string
        else:
            out[full_key] = value
    return out

input_path = "export.json"   # mongoexport NDJSON output
output_path = "output.csv"

headers_seen = set()
rows = []

with open(input_path, encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        flat = flatten_doc(doc)
        headers_seen.update(flat.keys())
        rows.append(flat)

headers = sorted(headers_seen)

with open(output_path, "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=headers, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({k: row.get(k, "") for k in headers})

print(f"Exported {len(rows)} documents to {output_path}")

Using pandas for simpler flat collections (no Extended JSON types)

import pandas as pd

# If your collection has no Extended JSON types (no ObjectIds, plain dates)
df = pd.read_json("export.json", lines=True)

# Rename _id to id if present
if "_id" in df.columns:
    df = df.rename(columns={"_id": "id"})

# Flatten nested address object
if "address" in df.columns:
    address_df = pd.json_normalize(df["address"].tolist())
    address_df.columns = [f"address_{c}" for c in address_df.columns]
    df = pd.concat([df.drop(columns=["address"]), address_df], axis=1)

# Join array columns
if "scores" in df.columns:
    df["scores"] = df["scores"].apply(
        lambda x: "|".join(str(v) for v in x) if isinstance(x, list) else x
    )

df.to_csv("output.csv", index=False, encoding="utf-8-sig")

Export directly from MongoDB using pymongo (skip mongoexport)

from pymongo import MongoClient
import pandas as pd

client = MongoClient("mongodb+srv://user:pass@cluster.mongodb.net/")
db = client["mydb"]
collection = db["users"]

# Fetch all documents, exclude internal _id if not needed
docs = list(collection.find({}, {"_id": 0}))

# Or include _id as string:
# docs = list(collection.find({}))
# for doc in docs:
#     doc["_id"] = str(doc["_id"])

df = pd.json_normalize(docs, sep=".")

# Handle list columns
for col in df.columns:
    if df[col].apply(lambda x: isinstance(x, list)).any():
        df[col] = df[col].apply(
            lambda x: "|".join(str(v) for v in x) if isinstance(x, list) else x
        )

df.to_csv("users.csv", index=False, encoding="utf-8-sig")
print(f"Exported {len(df)} documents")
client.close()

Common issues with MongoDB exports

mongoexport default output is NDJSON, not a JSON array

mongoexport writes one JSON document per line with no outer array brackets — this is NDJSON. json.load() will fail because it expects a single JSON value. Read line by line with json.loads(), or use pd.read_json(path, lines=True).

# Wrong — fails on NDJSON
with open("export.json") as f:
    data = json.load(f)  # JSONDecodeError: Extra data

# Correct
with open("export.json") as f:
    docs = [json.loads(line) for line in f if line.strip()]

# Or with pandas
df = pd.read_json("export.json", lines=True)

mongoexport --type=csv requires an explicit field list

mongoexport has a built-in --type=csv option, but it requires --fields or --fieldFile to specify which fields to export. It can't auto-discover fields from the documents. Use this for simple, known schemas.

mongoexport \
  --uri="mongodb+srv://..." \
  --collection=users \
  --type=csv \
  --fields=name,email,age,department \
  --out=users.csv

Documents in the same collection can have different fields

MongoDB is schema-flexible — not every document has every field. When you flatten documents with different schemas into CSV, rows will have empty cells for missing fields. The script above handles this by collecting the union of headers across all documents before writing any rows, as in the sketch below.
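
Here's a minimal sketch of that pattern in isolation, using two hypothetical documents with different fields; the union of keys becomes the header row and missing fields come out as empty cells:

import csv
import io
import json

# Two hypothetical documents that don't share the same fields
lines = [
    '{"name": "Alice", "email": "alice@example.com"}',
    '{"name": "Bob", "phone": "555-0100"}',
]

rows = [json.loads(line) for line in lines]
headers = sorted({key for row in rows for key in row})  # union of all keys

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=headers)
writer.writeheader()
for row in rows:
    writer.writerow({k: row.get(k, "") for k in headers})  # blank cell for missing fields

print(buf.getvalue())
# email,name,phone
# alice@example.com,Alice,
# ,Bob,555-0100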

Frequently asked questions

What's the difference between mongoexport and mongodump?

mongoexport produces human-readable NDJSON (or CSV) — suitable for importing into spreadsheets, pandas, or other databases. mongodump produces a binary BSON format that's only readable by mongorestore. Use mongoexport for CSV conversion; use mongodump for full database backups.

How do I convert a MongoDB Atlas JSON export to CSV?

Atlas Data Explorer exports collections as a JSON array (not NDJSON). Use json.load() to read it, then apply the same flatten_doc function. The Extended JSON types ($oid, $date) are the same as mongoexport — the extraction logic is identical.
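
A minimal sketch, assuming the Atlas export is saved as atlas_export.json (a hypothetical filename) and reusing the flatten_doc helper from the script above:

import csv
import json

# Atlas Data Explorer export is a single JSON array, so one json.load() call
with open("atlas_export.json", encoding="utf-8") as f:
    docs = json.load(f)

rows = [flatten_doc(doc) for doc in docs]  # same Extended JSON handling as mongoexport
headers = sorted({key for row in rows for key in row})

with open("atlas_output.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=headers)
    writer.writeheader()
    for row in rows:
        writer.writerow({k: row.get(k, "") for k in headers})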

Can the online converter handle MongoDB NDJSON?

The JSON to CSV converter expects a JSON array. For NDJSON, convert it to an array first: wrap the lines in [ ] brackets and add a comma between documents, or use Python to load the NDJSON and write it back out as a standard array before pasting, as shown below.
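
For example, a short script (assuming the mongoexport file is named export.json) that rewrites NDJSON as a standard JSON array you can paste into the converter:

import json

# Read NDJSON line by line, then dump the documents back out as one JSON array
with open("export.json", encoding="utf-8") as f:
    docs = [json.loads(line) for line in f if line.strip()]

with open("export_array.json", "w", encoding="utf-8") as f:
    json.dump(docs, f, indent=2)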

How do I handle documents with deeply nested arrays of objects?

The flatten_doc function above stores complex arrays (arrays of objects) as JSON strings. For a true one-row-per-nested-item output, expand the documents yourself: for each parent document, iterate the nested array and emit one row per item with the parent fields repeated, as in the sketch below.
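
A sketch of that expansion, assuming a hypothetical "orders" array of objects on each document and reusing the flatten_doc helper from the script above:

import csv
import json

with open("export.json", encoding="utf-8") as f:
    docs = [json.loads(line) for line in f if line.strip()]

rows = []
for doc in docs:
    # Flatten the parent fields once, excluding the nested array
    parent = flatten_doc({k: v for k, v in doc.items() if k != "orders"})
    # One row per nested item, with the parent fields repeated on every row
    for item in doc.get("orders", [{}]):
        child = flatten_doc(item, prefix="orders")
        rows.append({**parent, **child})

headers = sorted({key for row in rows for key in row})
with open("orders.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=headers)
    writer.writeheader()
    for row in rows:
        writer.writerow({k: row.get(k, "") for k in headers})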

Related tools