csvjson

Format Comparison

CSV vs Excel

CSV is a plain text format that works everywhere. Excel is a rich workbook format built for humans. They're both "spreadsheets" but they solve different problems — here's when to use which.

The same data, two formats

An order export — what it looks like in each format.

CSV — plain text
order_id,customer,amount,date,status
ORD-1001,Alice Chen,1299.00,2024-11-15,shipped
ORD-1002,Bob Kumar,149.00,2024-11-16,pending
ORD-1003,Sara Mills,89.99,2024-11-16,shipped

# File size: ~120 bytes
# Works in: pandas, MySQL, every spreadsheet app
# No formulas, no formatting, no multiple sheets
Excel (.xlsx) — what it stores
Sheet 1: Orders (same rows as CSV above)
  - Column A: order_id (Text format)
  - Column D: date (Date format, displays as Nov 15 2024)
  - Column C: amount (Currency format, shows $1,299.00)
  - Row 1: bold, frozen, background color

Sheet 2: Summary
  - Total revenue: =SUM(Orders!C2:C4)
  - Shipped orders: =COUNTIF(Orders!E:E,"shipped")

# File size: ~8KB (binary ZIP format)
# Requires Excel, Google Sheets, or openpyxl to read

Feature comparison

FeatureCSVExcel (.xlsx)
File formatPlain text — readable in any editorBinary ZIP archive (.xlsx is XML inside a zip)
File sizeSmall — just the dataLarger — stores formatting, formulas, metadata
Multiple sheetsOne sheet per fileMultiple sheets in one workbook
FormulasNot supported — values onlyFull formula support (SUM, VLOOKUP, etc.)
FormattingNone — no bold, colors, column widthsFull formatting: fonts, colors, borders, widths
Data typesEverything is a string unless parsedNative number, date, currency, boolean types
Date handlingStrings — format depends on locale/toolStored as serial numbers — timezone-safe
EncodingUTF-8, UTF-16, Latin-1 — no standard.xlsx always uses UTF-8 internally
Row limitUnlimited (any text file size)1,048,576 rows per sheet
Version controlGit-diffable — line-by-line changes visibleBinary diffs — git shows changed/unchanged only
CompatibilityEvery tool: pandas, SQL, Excel, Numbers, RBest in Excel; partial support in Google Sheets, LibreOffice
Automationtrivial — open(), read(), write()Requires openpyxl, xlrd, or com automation
Pivot tables / chartsNot stored in the fileFully embedded in the workbook

CSV — use it when

Pipelines, databases, and code

Every database import tool, ETL system, pandas workflow, and data pipeline accepts CSV. It's the lingua franca of data — no library needed to read or write it. openpyxl is 50 lines; csv.DictWriter is 3.

Version control and git

CSV diffs are human-readable in a pull request. You can see exactly which rows changed, which were added, and which were deleted. An Excel file in git shows only "binary file changed".

Large files

Excel caps at 1,048,576 rows. CSV has no limit. A 5-million-row export from a database goes to CSV. It also loads faster in pandas — no binary format to parse.

Sharing with any tool

CSV works in Excel, Google Sheets, Numbers, LibreOffice, R, Python, Ruby, Go, and every SQL client. An .xlsx file requires a specific library or application to open.

Excel — use it when

Sharing with business users

If the file goes to someone in finance, operations, or HR who lives in Excel, send .xlsx. Column widths are right, date columns look like dates, and they can sort and filter without any extra steps.

Formatting and presentation matter

Bold headers, currency formatting, conditional formatting for outliers, frozen header rows — none of this survives a CSV round-trip. If the spreadsheet needs to look right, use Excel.

Multiple related datasets

A monthly report with a summary sheet, a raw data sheet, and a pivot chart is naturally a single .xlsx workbook. In CSV, that's three files and a convention for how they relate.

Formulas and calculated columns

If the recipient needs to run their own calculations inside the file — SUMIF by region, VLOOKUP to join another table — Excel is the only choice. CSV files don't store formulas.

Common conversion pitfalls

These problems appear on almost every CSV ↔ Excel round-trip.

Leading zeros are dropped in CSV → Excel

Opening a CSV in Excel by double-clicking causes Excel to infer column types. A ZIP code "01234" becomes the number 1234. A product code "007-A" stays as a string, but "00007" becomes 7. Fix: import via Data > From Text/CSV and set those columns to Text explicitly.

# Python: write CSV with leading zeros preserved
import csv

rows = [{"zip": "01234", "sku": "00719"}]

with open("data.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=["zip", "sku"])
    writer.writeheader()
    writer.writerows(rows)

# In Excel: import with Data > From Text/CSV
# and set zip and sku columns to Text

Dates serialize differently in each format

In CSV, dates become strings — their format depends on which locale Excel used when it saved the file. A date that looks like "1/28/2025" in a US CSV looks like "28/01/2025" in a UK CSV. Excel stores dates as serial numbers (days since 1900-01-01) which is unambiguous but invisible in the file.

# Python: write dates safely to both formats
import pandas as pd
from datetime import date

df = pd.DataFrame({
    "order_date": [date(2024, 11, 15), date(2024, 11, 16)],
    "amount": [1299.00, 149.00],
})

# CSV: explicit ISO format avoids locale ambiguity
df["order_date"] = df["order_date"].astype(str)  # "2024-11-15"
df.to_csv("orders.csv", index=False)

# Excel: pandas writes native date cells
df["order_date"] = pd.to_datetime(df["order_date"])
df.to_excel("orders.xlsx", index=False)

CSV encoding is not standardized

Excel on Windows saves CSV in the system locale encoding (often Windows-1252 on Western systems). Pandas defaults to UTF-8. When you open a UTF-8 CSV by double-clicking in Excel on Windows, accented characters (café, naïve, Zürich) become garbage. Fix: write with encoding="utf-8-sig" (UTF-8 with BOM) so Excel recognizes the encoding.

# Always write CSVs with utf-8-sig when humans will open in Excel
df.to_csv("output.csv", index=False, encoding="utf-8-sig")

# utf-8 is fine for machine consumption (pipelines, databases)
df.to_csv("output.csv", index=False, encoding="utf-8")

Frequently asked questions

Should I send CSV or Excel to a colleague?

If they use Excel or Google Sheets and just need to view and filter the data, either works. Send .xlsx if formatting matters (column widths, number formats) or if you're including multiple sheets. Send .csv if they'll import it into another tool or if the data is large.

Can I open a CSV in Excel?

Yes. Double-clicking a .csv file opens it in Excel on most systems. For full control over how columns are parsed (especially for dates, leading zeros, and encoding), use Data > From Text/CSV instead of double-clicking. This lets you set each column's data type before import.

Does Excel have a row limit that CSV doesn't?

Yes. Excel is capped at 1,048,576 rows per sheet (2^20). A CSV file has no row limit — it's just a text file. If you're exporting millions of records from a database, CSV is the only option.

Why do my ZIP codes and phone numbers lose their leading zeros in Excel?

Excel auto-detects column types when opening a CSV by double-clicking. A value like "01234" looks like the number 1234 to Excel, so it drops the leading zero. Fix it by importing via Data > From Text/CSV and setting those columns to "Text" format. Or format the cells as Text before pasting.

Is CSV or Excel better for version control?

CSV is far better for git. Each row is a line of text, so a diff shows you exactly which rows changed. Excel is a binary format — git can detect that the file changed but can't show you what changed inside it. If your data lives in a repo, use CSV.

Can pandas read both CSV and Excel?

Yes. pd.read_csv() for CSV files, pd.read_excel() for .xlsx and .xls files (requires openpyxl or xlrd installed). Both return a DataFrame. pd.to_csv() and df.to_excel() write them out.

Convert between CSV and Excel instantly — no upload, no sign-up.