How to Convert Excel to JSON (The Complete Guide)

Converting Excel to JSON is one of those tasks that sounds simple until you sit down to do it. An Excel file isn't flat data — it has sheets, merged cells, typed columns, formulas, and dates that Excel stores as serial numbers. Getting all of that into clean, typed JSON without silent data loss is the actual problem.

This guide covers four methods: Excel's built-in Power Query, Python with pandas, Python with openpyxl for fine-grained control, and a browser-based tool for when you just need clean output without writing code. Each section is honest about where the method breaks down.

What "Excel to JSON" actually means

Before choosing a method, decide what your JSON output should look like.

A worksheet with this data:

| id | name | active | joined | |----|---------|--------|------------| | 1 | Alice | TRUE | 2024-01-14 | | 2 | Bob | FALSE | 2024-03-02 |

Could reasonably produce:

[
  { "id": 1, "name": "Alice", "active": true, "joined": "2024-01-14" },
  { "id": 2, "name": "Bob",   "active": false, "joined": "2024-03-02" }
]

But naive converters often produce this instead:

[
  { "id": "1", "name": "Alice", "active": "TRUE", "joined": "45305" },
  { "id": "2", "name": "Bob",   "active": "FALSE", "joined": "45353" }
]

All strings. Booleans stringified. Dates as Excel serial numbers. The method you choose determines whether you get clean JSON or garbage you have to clean up downstream.

Method 1: Excel Power Query (no code)

Power Query (available in Excel 2016+ and Microsoft 365 under Data → Get & Transform) can export a worksheet as JSON, but the path is indirect and the output has gotchas.

Steps

Open your workbook and select the sheet you want to export.
Go to Data → From Table/Range. If your data isn't already a table, Excel will prompt you to convert it.
In the Power Query Editor, go to Home → Advanced Editor.
Replace the query with:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    JsonText = Text.FromBinary(Json.FromValue(Source))
in
    JsonText

Replace "Table1" with your actual table name.

Click Done, then Home → Close & Load. The result cell contains your JSON string.
Copy the cell value and paste it into a text file. Save as .json.

What Power Query gets right

No additional software needed — works entirely inside Excel.
Handles column names as JSON keys automatically.
Reasonably fast on tables up to ~50,000 rows.

What Power Query gets wrong

Type inference is inconsistent. Power Query treats columns as any type unless you explicitly set the column type in the query. A column containing 1, 2, 3 may produce [1, 2, 3] or ["1", "2", "3"] depending on how the table was formatted.

Dates become numbers or strings unpredictably. Excel stores dates as serial integers (days since 1900-01-01). If you don't set the column type to Date explicitly, Json.FromValue often emits the serial number (45305) instead of an ISO string ("2024-01-14").

Nested structures aren't possible. Power Query produces flat JSON — one object per row. If you need nested objects or arrays within rows, you'll need Python or a dedicated tool.

The output is a string in a cell. There's no built-in "export to file" step — you have to copy-paste.

Method 2: Python with pandas

pandas is the most common Python approach. It works well for flat data and handles large files efficiently.

Install

pip install pandas openpyxl

openpyxl is required as the Excel reading engine; pandas doesn't include it by default.

Basic conversion

import pandas as pd
import json

df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
result = df.to_dict(orient="records")

with open("output.json", "w") as f:
    json.dump(result, f, indent=2, default=str)

orient="records" produces a JSON array of objects — one per row — which is the format most APIs and tools expect.

Type preservation

By default, pandas infers types from the column data. Numbers come back as integers or floats, booleans as Python True/False, and dates as pandas.Timestamp objects.

The default=str in json.dump converts anything that isn't JSON-serializable (like Timestamp) to its string representation. That works, but produces locale-formatted strings like "2024-01-14 00:00:00" instead of ISO 8601 "2024-01-14".

For cleaner date output:

import pandas as pd
import json
from datetime import date, datetime

def serialize(obj):
    if isinstance(obj, (datetime, pd.Timestamp)):
        # Date-only if no time component
        if obj.hour == 0 and obj.minute == 0 and obj.second == 0:
            return obj.strftime("%Y-%m-%d")
        return obj.isoformat()
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Not serializable: {type(obj)}")

df = pd.read_excel("data.xlsx", sheet_name=0)
result = df.to_dict(orient="records")

with open("output.json", "w") as f:
    json.dump(result, f, indent=2, default=serialize)

NaN handling

Excel empty cells become float("nan") in pandas, which is not valid JSON. Fix with:

import math

def serialize(obj):
    if isinstance(obj, float) and math.isnan(obj):
        return None
    # ... rest of handler

Or use df.where(pd.notna(df), other=None) before converting.

Multiple sheets

xl = pd.ExcelFile("data.xlsx")
output = {}

for sheet_name in xl.sheet_names:
    df = xl.parse(sheet_name)
    output[sheet_name] = df.to_dict(orient="records")

with open("output.json", "w") as f:
    json.dump(output, f, indent=2, default=str)

Where pandas falls short

Formula cells — pandas reads the cached value (what Excel last computed), not the formula. Usually fine, but stale cached values cause silent errors.
Merged cells — only the top-left cell has a value; other cells in the merge are NaN. You'll need to forward-fill if the merge implies a shared value.
Custom number formats — a cell formatted as a percentage in Excel (0.15 displayed as "15%") comes back as 0.15. pandas has no knowledge of the display format.

Method 3: Python with openpyxl (fine-grained control)

openpyxl reads the workbook at the cell level, giving you access to the actual cell type, formula, display format, hyperlinks, and merged cell ranges. Use it when pandas doesn't give you enough control.

Install

pip install openpyxl

Basic conversion

from openpyxl import load_workbook
import json

wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active

rows = list(ws.iter_rows(values_only=True))
if not rows:
    print("[]")
else:
    headers = [str(h) for h in rows[0]]
    records = []
    for row in rows[1:]:
        obj = {}
        for key, val in zip(headers, row):
            obj[key] = val
        records.append(obj)

    print(json.dumps(records, indent=2, default=str))

data_only=True returns the last-computed value for formula cells, not the formula string. Without it, formula cells return the formula (e.g. "=SUM(A1:A10)").

Handling cell types explicitly

from openpyxl import load_workbook
from datetime import datetime, date
import json

def cell_value(cell):
    val = cell.value
    if val is None:
        return None
    if isinstance(val, bool):          # check bool before int — bool is a subclass of int
        return val
    if isinstance(val, int):
        return val
    if isinstance(val, float):
        return val
    if isinstance(val, (datetime, date)):
        if isinstance(val, datetime) and val.hour == 0 and val.minute == 0:
            return val.strftime("%Y-%m-%d")
        if isinstance(val, datetime):
            return val.isoformat()
        return val.isoformat()         # plain date
    return str(val)

wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active
rows = list(ws.iter_rows())

headers = [str(cell_value(c) or c.column_letter) for c in rows[0]]
records = [
    {headers[i]: cell_value(c) for i, c in enumerate(row)}
    for row in rows[1:]
]

print(json.dumps(records, indent=2))

Note the bool check before int — in Python, isinstance(True, int) returns True, so checking int first would convert booleans to 1 and 0.

Reading merged cells

Merged cells beyond the top-left return None in openpyxl. To forward-fill the value across the merge:

wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active

# Unmerge by copying values into merged cells
for merged_range in list(ws.merged_cells.ranges):
    top_left = ws.cell(merged_range.min_row, merged_range.min_col)
    value = top_left.value
    ws.unmerge_cells(str(merged_range))
    for row in ws.iter_rows(
        min_row=merged_range.min_row, max_row=merged_range.max_row,
        min_col=merged_range.min_col, max_col=merged_range.max_col
    ):
        for cell in row:
            cell.value = value

Where openpyxl falls short

Slower than pandas on large files (100k+ rows). For those, use pandas with openpyxl as the engine.
No built-in NaN handling — empty cells are Python None, which serializes cleanly to JSON null.
.xlsb format (binary Excel) is not supported. Convert to .xlsx first or use pyxlsb.

Method 4: Browser-based converter (no setup)

If you're working with a one-off file, need a quick answer, or don't want to write code, a browser-based converter is the fastest path.

csvjson.tools Excel to JSON handles the type-preservation problem correctly:

Numbers come back as number, not "29" or "3.14"
Booleans come back as true/false, not "TRUE" or "FALSE"
ISO date strings come back as "2024-01-14", not 45305 or "1/14/2024"
Multi-sheet workbooks are supported — choose which sheet to convert
Empty cells become null

The conversion runs entirely in your browser. Your file is never uploaded to a server.

When to use this vs code

Use the browser tool when:

You have one file and don't need to automate the process
You want to verify what your data looks like before writing a script
You're not in a Python environment

Use Python when:

You need to process many files or automate a pipeline
The file comes from an API or S3 bucket programmatically
You need to transform the data beyond a straight row-to-object conversion

Common problems and fixes

Dates appear as integers (e.g. `45305`)

Excel stores dates as the number of days since 1900-01-01. If your converter doesn't know to treat that number as a date, it emits the raw integer.

Power Query: Set the column type to Date explicitly before calling Json.FromValue.
pandas: pd.read_excel should infer date columns automatically if the cells are formatted as Date in Excel. If not, use parse_dates=["column_name"].
openpyxl with data_only=True: Returns Python datetime objects for date cells. Serialize with .isoformat().
Browser tool: Handled automatically — Excel to JSON converts date cells to ISO strings.

Booleans appear as `1` and `0`

In Python, bool is a subclass of int. If you check isinstance(val, int) before isinstance(val, bool), you'll coerce True to 1. Always check bool first.

Numbers appear as strings (`"29"` instead of `29`)

The converter is reading the formatted display value instead of the cell's underlying number. In openpyxl, use values_only=True or access cell.value directly. In pandas, ensure the column dtype is numeric — check df.dtypes.

Empty cells become empty strings instead of `null`

Power Query and some converters emit "" for empty cells. In pandas, use df.where(pd.notna(df), other=None) before serializing.

Choosing the right method

| Situation | Best method | |---|---| | Quick one-off file, no code | Browser tool | | Automated pipeline, flat data | pandas | | Need cell-level control (merged cells, types, formulas) | openpyxl | | No Python available, no code wanted | Power Query | | Multi-sheet workbook to single JSON | pandas or browser tool | | Large file (100k+ rows) | pandas with openpyxl engine |

Excel to JSON — upload an xlsx and get clean, typed JSON in one click
JSON to Excel — reverse: go from JSON back to a properly formatted xlsx
CSV to JSON — if your Excel data is already exported as CSV
Flatten JSON — normalize nested JSON into flat rows for analytics tools