How to Convert Excel to JSON (The Complete Guide)
Four practical methods to convert Excel spreadsheets to JSON — Power Query, Python pandas, openpyxl, and a browser tool. Covers nested structures, type preservation, and large files.
- What "Excel to JSON" actually means
- Method 1: Excel Power Query (no code)
- Steps
- What Power Query gets right
- What Power Query gets wrong
- Method 2: Python with pandas
- Install
- Basic conversion
- Type preservation
- NaN handling
- Multiple sheets
- Where pandas falls short
- Method 3: Python with openpyxl (fine-grained control)
- Install
- Basic conversion
- Handling cell types explicitly
- Reading merged cells
- Where openpyxl falls short
- Method 4: Browser-based converter (no setup)
- When to use this vs code
- Common problems and fixes
- Dates appear as integers (e.g. `45305`)
- Booleans appear as `1` and `0`
- Numbers appear as strings (`"29"` instead of `29`)
- Empty cells become empty strings instead of `null`
- Choosing the right method
- Related tools
How to Convert Excel to JSON (The Complete Guide)
Converting Excel to JSON is one of those tasks that sounds simple until you sit down to do it. An Excel file isn't flat data — it has sheets, merged cells, typed columns, formulas, and dates that Excel stores as serial numbers. Getting all of that into clean, typed JSON without silent data loss is the actual problem.
This guide covers four methods: Excel's built-in Power Query, Python with pandas, Python with openpyxl for fine-grained control, and a browser-based tool for when you just need clean output without writing code. Each section is honest about where the method breaks down.
What "Excel to JSON" actually means
Before choosing a method, decide what your JSON output should look like.
A worksheet with this data:
| id | name | active | joined | |----|---------|--------|------------| | 1 | Alice | TRUE | 2024-01-14 | | 2 | Bob | FALSE | 2024-03-02 |
Could reasonably produce:
[
{ "id": 1, "name": "Alice", "active": true, "joined": "2024-01-14" },
{ "id": 2, "name": "Bob", "active": false, "joined": "2024-03-02" }
]
But naive converters often produce this instead:
[
{ "id": "1", "name": "Alice", "active": "TRUE", "joined": "45305" },
{ "id": "2", "name": "Bob", "active": "FALSE", "joined": "45353" }
]
All strings. Booleans stringified. Dates as Excel serial numbers. The method you choose determines whether you get clean JSON or garbage you have to clean up downstream.
Method 1: Excel Power Query (no code)
Power Query (available in Excel 2016+ and Microsoft 365 under Data → Get & Transform) can export a worksheet as JSON, but the path is indirect and the output has gotchas.
Steps
- Open your workbook and select the sheet you want to export.
- Go to Data → From Table/Range. If your data isn't already a table, Excel will prompt you to convert it.
- In the Power Query Editor, go to Home → Advanced Editor.
- Replace the query with:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
JsonText = Text.FromBinary(Json.FromValue(Source))
in
JsonText
Replace "Table1" with your actual table name.
- Click Done, then Home → Close & Load. The result cell contains your JSON string.
- Copy the cell value and paste it into a text file. Save as
.json.
What Power Query gets right
- No additional software needed — works entirely inside Excel.
- Handles column names as JSON keys automatically.
- Reasonably fast on tables up to ~50,000 rows.
What Power Query gets wrong
Type inference is inconsistent. Power Query treats columns as any type unless you explicitly set the column type in the query. A column containing 1, 2, 3 may produce [1, 2, 3] or ["1", "2", "3"] depending on how the table was formatted.
Dates become numbers or strings unpredictably. Excel stores dates as serial integers (days since 1900-01-01). If you don't set the column type to Date explicitly, Json.FromValue often emits the serial number (45305) instead of an ISO string ("2024-01-14").
Nested structures aren't possible. Power Query produces flat JSON — one object per row. If you need nested objects or arrays within rows, you'll need Python or a dedicated tool.
The output is a string in a cell. There's no built-in "export to file" step — you have to copy-paste.
Method 2: Python with pandas
pandas is the most common Python approach. It works well for flat data and handles large files efficiently.
Install
pip install pandas openpyxl
openpyxl is required as the Excel reading engine; pandas doesn't include it by default.
Basic conversion
import pandas as pd
import json
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
result = df.to_dict(orient="records")
with open("output.json", "w") as f:
json.dump(result, f, indent=2, default=str)
orient="records" produces a JSON array of objects — one per row — which is the format most APIs and tools expect.
Type preservation
By default, pandas infers types from the column data. Numbers come back as integers or floats, booleans as Python True/False, and dates as pandas.Timestamp objects.
The default=str in json.dump converts anything that isn't JSON-serializable (like Timestamp) to its string representation. That works, but produces locale-formatted strings like "2024-01-14 00:00:00" instead of ISO 8601 "2024-01-14".
For cleaner date output:
import pandas as pd
import json
from datetime import date, datetime
def serialize(obj):
if isinstance(obj, (datetime, pd.Timestamp)):
# Date-only if no time component
if obj.hour == 0 and obj.minute == 0 and obj.second == 0:
return obj.strftime("%Y-%m-%d")
return obj.isoformat()
if isinstance(obj, date):
return obj.isoformat()
raise TypeError(f"Not serializable: {type(obj)}")
df = pd.read_excel("data.xlsx", sheet_name=0)
result = df.to_dict(orient="records")
with open("output.json", "w") as f:
json.dump(result, f, indent=2, default=serialize)
NaN handling
Excel empty cells become float("nan") in pandas, which is not valid JSON. Fix with:
import math
def serialize(obj):
if isinstance(obj, float) and math.isnan(obj):
return None
# ... rest of handler
Or use df.where(pd.notna(df), other=None) before converting.
Multiple sheets
xl = pd.ExcelFile("data.xlsx")
output = {}
for sheet_name in xl.sheet_names:
df = xl.parse(sheet_name)
output[sheet_name] = df.to_dict(orient="records")
with open("output.json", "w") as f:
json.dump(output, f, indent=2, default=str)
Where pandas falls short
- Formula cells — pandas reads the cached value (what Excel last computed), not the formula. Usually fine, but stale cached values cause silent errors.
- Merged cells — only the top-left cell has a value; other cells in the merge are NaN. You'll need to forward-fill if the merge implies a shared value.
- Custom number formats — a cell formatted as a percentage in Excel (0.15 displayed as "15%") comes back as
0.15. pandas has no knowledge of the display format.
Method 3: Python with openpyxl (fine-grained control)
openpyxl reads the workbook at the cell level, giving you access to the actual cell type, formula, display format, hyperlinks, and merged cell ranges. Use it when pandas doesn't give you enough control.
Install
pip install openpyxl
Basic conversion
from openpyxl import load_workbook
import json
wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active
rows = list(ws.iter_rows(values_only=True))
if not rows:
print("[]")
else:
headers = [str(h) for h in rows[0]]
records = []
for row in rows[1:]:
obj = {}
for key, val in zip(headers, row):
obj[key] = val
records.append(obj)
print(json.dumps(records, indent=2, default=str))
data_only=True returns the last-computed value for formula cells, not the formula string. Without it, formula cells return the formula (e.g. "=SUM(A1:A10)").
Handling cell types explicitly
from openpyxl import load_workbook
from datetime import datetime, date
import json
def cell_value(cell):
val = cell.value
if val is None:
return None
if isinstance(val, bool): # check bool before int — bool is a subclass of int
return val
if isinstance(val, int):
return val
if isinstance(val, float):
return val
if isinstance(val, (datetime, date)):
if isinstance(val, datetime) and val.hour == 0 and val.minute == 0:
return val.strftime("%Y-%m-%d")
if isinstance(val, datetime):
return val.isoformat()
return val.isoformat() # plain date
return str(val)
wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active
rows = list(ws.iter_rows())
headers = [str(cell_value(c) or c.column_letter) for c in rows[0]]
records = [
{headers[i]: cell_value(c) for i, c in enumerate(row)}
for row in rows[1:]
]
print(json.dumps(records, indent=2))
Note the bool check before int — in Python, isinstance(True, int) returns True, so checking int first would convert booleans to 1 and 0.
Reading merged cells
Merged cells beyond the top-left return None in openpyxl. To forward-fill the value across the merge:
wb = load_workbook("data.xlsx", data_only=True)
ws = wb.active
# Unmerge by copying values into merged cells
for merged_range in list(ws.merged_cells.ranges):
top_left = ws.cell(merged_range.min_row, merged_range.min_col)
value = top_left.value
ws.unmerge_cells(str(merged_range))
for row in ws.iter_rows(
min_row=merged_range.min_row, max_row=merged_range.max_row,
min_col=merged_range.min_col, max_col=merged_range.max_col
):
for cell in row:
cell.value = value
Where openpyxl falls short
- Slower than pandas on large files (100k+ rows). For those, use pandas with openpyxl as the engine.
- No built-in NaN handling — empty cells are Python
None, which serializes cleanly to JSONnull. .xlsbformat (binary Excel) is not supported. Convert to.xlsxfirst or usepyxlsb.
Method 4: Browser-based converter (no setup)
If you're working with a one-off file, need a quick answer, or don't want to write code, a browser-based converter is the fastest path.
csvjson.tools Excel to JSON handles the type-preservation problem correctly:
- Numbers come back as
number, not"29"or"3.14" - Booleans come back as
true/false, not"TRUE"or"FALSE" - ISO date strings come back as
"2024-01-14", not45305or"1/14/2024" - Multi-sheet workbooks are supported — choose which sheet to convert
- Empty cells become
null
The conversion runs entirely in your browser. Your file is never uploaded to a server.
When to use this vs code
Use the browser tool when:
- You have one file and don't need to automate the process
- You want to verify what your data looks like before writing a script
- You're not in a Python environment
Use Python when:
- You need to process many files or automate a pipeline
- The file comes from an API or S3 bucket programmatically
- You need to transform the data beyond a straight row-to-object conversion
Common problems and fixes
Dates appear as integers (e.g. 45305)
Excel stores dates as the number of days since 1900-01-01. If your converter doesn't know to treat that number as a date, it emits the raw integer.
- Power Query: Set the column type to
Dateexplicitly before callingJson.FromValue. - pandas:
pd.read_excelshould infer date columns automatically if the cells are formatted as Date in Excel. If not, useparse_dates=["column_name"]. - openpyxl with
data_only=True: Returns Pythondatetimeobjects for date cells. Serialize with.isoformat(). - Browser tool: Handled automatically — Excel to JSON converts date cells to ISO strings.
Booleans appear as 1 and 0
In Python, bool is a subclass of int. If you check isinstance(val, int) before isinstance(val, bool), you'll coerce True to 1. Always check bool first.
Numbers appear as strings ("29" instead of 29)
The converter is reading the formatted display value instead of the cell's underlying number. In openpyxl, use values_only=True or access cell.value directly. In pandas, ensure the column dtype is numeric — check df.dtypes.
Empty cells become empty strings instead of null
Power Query and some converters emit "" for empty cells. In pandas, use df.where(pd.notna(df), other=None) before serializing.
Choosing the right method
| Situation | Best method | |---|---| | Quick one-off file, no code | Browser tool | | Automated pipeline, flat data | pandas | | Need cell-level control (merged cells, types, formulas) | openpyxl | | No Python available, no code wanted | Power Query | | Multi-sheet workbook to single JSON | pandas or browser tool | | Large file (100k+ rows) | pandas with openpyxl engine |
Related tools
- Excel to JSON — upload an xlsx and get clean, typed JSON in one click
- JSON to Excel — reverse: go from JSON back to a properly formatted xlsx
- CSV to JSON — if your Excel data is already exported as CSV
- Flatten JSON — normalize nested JSON into flat rows for analytics tools