Convert GitHub API JSON to CSV
GitHub API responses are clean JSON arrays, but labels, assignees, and milestone fields are nested objects and arrays that don't map directly to CSV columns. Here's how to get a flat, usable spreadsheet from any GitHub endpoint.
GitHub's REST API returns issues, pull requests, commits, releases, and repository data as JSON arrays. Common use cases include project management exports, sprint reporting, open source contribution analysis, and release tracking. The main challenge is the nested structure: each issue has a user object, an array of label objects, an optional milestone object, and an optional assignee object. Flattening these correctly is the difference between a useful export and a mess.
Want to skip the code? Paste your JSON directly into the converter — it handles nested objects, arrays, and large files automatically.
Open JSON to CSV Converter

What the API returns
GitHub GET /repos/{owner}/{repo}/issues — single issue
```json
[
  {
    "number": 142,
    "title": "Fix pagination on mobile dashboard",
    "state": "open",
    "user": {
      "login": "alicechen",
      "id": 12345,
      "type": "User"
    },
    "labels": [
      {"id": 1, "name": "bug", "color": "d73a4a"},
      {"id": 2, "name": "mobile", "color": "0075ca"}
    ],
    "assignee": {
      "login": "bobkumar"
    },
    "milestone": {
      "number": 5,
      "title": "v2.4",
      "state": "open"
    },
    "comments": 3,
    "created_at": "2025-02-10T11:42:00Z",
    "updated_at": "2025-03-01T09:15:00Z",
    "closed_at": null,
    "body": "When navigating to page 2 on mobile..."
  }
]
```

Field mapping: JSON path → CSV column
| JSON path | CSV column | Notes |
|---|---|---|
| number | issue_number | The #142 displayed in GitHub UI |
| title | title | |
| state | state | open or closed |
| user.login | author | GitHub username of creator |
| labels[].name | labels | Join multiple labels with comma |
| assignee.login | assignee | null if unassigned |
| milestone.title | milestone | null if no milestone |
| comments | comment_count | Integer |
| created_at | created_at | ISO 8601 |
| updated_at | updated_at | ISO 8601 |
| closed_at | closed_at | null if still open |
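The mapping above can be sketched as a small pure function. This is a minimal sketch: column names follow the table, and the sample dict is a trimmed version of the issue shown earlier.

```python
def flatten_issue(issue: dict) -> dict:
    """Map one GitHub issue object to a flat CSV row, per the mapping table."""
    return {
        "issue_number": issue["number"],
        "title": issue["title"],
        "state": issue["state"],
        # Nested objects can be null, so fall back to {} before .get()
        "author": (issue.get("user") or {}).get("login", ""),
        "labels": ", ".join(l["name"] for l in issue.get("labels", [])),
        "assignee": (issue.get("assignee") or {}).get("login", ""),
        "milestone": (issue.get("milestone") or {}).get("title", ""),
        "comment_count": issue.get("comments", 0),
        "created_at": issue.get("created_at", ""),
        "updated_at": issue.get("updated_at", ""),
        "closed_at": issue.get("closed_at") or "",
    }

sample = {
    "number": 142,
    "title": "Fix pagination on mobile dashboard",
    "state": "open",
    "user": {"login": "alicechen"},
    "labels": [{"name": "bug"}, {"name": "mobile"}],
    "assignee": {"login": "bobkumar"},
    "milestone": {"title": "v2.4"},
    "comments": 3,
    "created_at": "2025-02-10T11:42:00Z",
    "updated_at": "2025-03-01T09:15:00Z",
    "closed_at": None,
}
row = flatten_issue(sample)
print(row["labels"])  # bug, mobile
```

The `or {}` fallbacks matter: `assignee` and `milestone` are `null` (not absent) for many issues, so a plain `issue.get("assignee", {})` would still hand you `None`.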
The body field (issue description) can contain newlines, commas, and quote characters. Python's csv module (and pandas to_csv) wraps such fields in double quotes and escapes embedded quotes automatically, so the data survives a round trip. Consider omitting body only if the file will be consumed by a naive parser that splits rows on raw newlines or commas.
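That quoting behavior is easy to verify with the stdlib csv module and an in-memory buffer:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["number", "title", "body"])
# A body containing a comma, embedded quotes, and a newline, all in one field
writer.writerow([142, "Fix pagination", 'Repro: go to page 2, then "refresh".\nExpected: list loads.'])
print(buf.getvalue())
```

The body cell comes out wrapped in double quotes with the inner quotes doubled (`""refresh""`), which any compliant CSV reader, including Excel, parses back into the original text.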
Python conversion
Export all issues from a repository (handles pagination and rate limits)
```python
import requests
import pandas as pd
import time

TOKEN = "ghp_..."  # GitHub personal access token
OWNER = "torvalds"
REPO = "linux"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Accept": "application/vnd.github+json",
    "X-GitHub-Api-Version": "2022-11-28",
}

issues = []
page = 1
while True:
    r = requests.get(
        f"https://api.github.com/repos/{OWNER}/{REPO}/issues",
        headers=headers,
        params={"state": "all", "per_page": 100, "page": page},
    )
    r.raise_for_status()

    # Back off before the rate limit reaches zero (the API returns 403 when exhausted)
    remaining = int(r.headers.get("X-RateLimit-Remaining", 1))
    if remaining < 5:
        reset_at = int(r.headers.get("X-RateLimit-Reset", time.time() + 60))
        wait = max(0, reset_at - time.time()) + 2
        print(f"Rate limit low, waiting {wait:.0f}s")
        time.sleep(wait)

    batch = r.json()
    if not batch:
        break
    issues.extend(batch)
    page += 1
    time.sleep(0.1)  # be polite

print(f"Fetched {len(issues)} issues/PRs")

# Flatten nested objects into the columns from the mapping table
rows = []
for issue in issues:
    rows.append({
        "issue_number": issue["number"],
        "title": issue["title"],
        "state": issue["state"],
        "author": issue["user"]["login"] if issue.get("user") else "",
        "labels": ", ".join(l["name"] for l in issue.get("labels", [])),
        "assignee": issue["assignee"]["login"] if issue.get("assignee") else "",
        "milestone": issue["milestone"]["title"] if issue.get("milestone") else "",
        "comment_count": issue.get("comments", 0),
        "is_pr": "pull_request" in issue,  # PRs appear in the issues endpoint
        "created_at": (issue.get("created_at") or "")[:10],
        "updated_at": (issue.get("updated_at") or "")[:10],
        "closed_at": (issue.get("closed_at") or "")[:10],
    })

df = pd.DataFrame(rows)
df.to_csv(f"{OWNER}_{REPO}_issues.csv", index=False, encoding="utf-8-sig")
print(f"Exported {len(df)} rows")
```

Export pull requests with review status and merge info
```python
import requests
import pandas as pd

TOKEN = "ghp_..."
OWNER = "your-org"
REPO = "your-repo"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Accept": "application/vnd.github+json",
}

prs = []
page = 1
while True:
    r = requests.get(
        f"https://api.github.com/repos/{OWNER}/{REPO}/pulls",
        headers=headers,
        params={"state": "all", "per_page": 100, "page": page},
    )
    r.raise_for_status()
    batch = r.json()
    if not batch:
        break
    prs.extend(batch)
    page += 1

rows = []
for pr in prs:
    rows.append({
        "pr_number": pr["number"],
        "title": pr["title"],
        "state": pr["state"],
        "author": pr["user"]["login"] if pr.get("user") else "",
        "base_branch": pr.get("base", {}).get("ref"),
        "head_branch": pr.get("head", {}).get("ref"),
        "draft": pr.get("draft", False),
        "merged": pr.get("merged_at") is not None,
        "labels": ", ".join(l["name"] for l in pr.get("labels", [])),
        "reviewers": ", ".join(rev["login"] for rev in pr.get("requested_reviewers", [])),
        "created_at": (pr.get("created_at") or "")[:10],
        "merged_at": (pr.get("merged_at") or "")[:10],
        "closed_at": (pr.get("closed_at") or "")[:10],
        # additions/deletions/changed_files are only returned by the single-PR
        # endpoint (GET /pulls/{number}); the list endpoint omits them, so
        # these default to 0 unless you fetch each PR individually
        "additions": pr.get("additions", 0),
        "deletions": pr.get("deletions", 0),
        "changed_files": pr.get("changed_files", 0),
    })

pd.DataFrame(rows).to_csv(f"{OWNER}_{REPO}_prs.csv", index=False, encoding="utf-8-sig")
print(f"Exported {len(rows)} pull requests")
```

Common issues with this API
Pull requests appear in the issues endpoint
GitHub's /issues endpoint returns both issues and pull requests. PRs have a 'pull_request' key in the response; plain issues don't. Filter with 'pull_request' in issue to separate them.
```python
issues_only = [i for i in all_issues if "pull_request" not in i]
prs_only = [i for i in all_issues if "pull_request" in i]
```

Rate limit: 5,000 requests per hour with a token
Unauthenticated requests are limited to 60/hour. With a personal access token or GitHub App token, you get 5,000/hour. Check the X-RateLimit-Remaining header and back off before you hit zero — the API returns 403 when exhausted.
```python
import time

remaining = int(r.headers.get("X-RateLimit-Remaining", 1))
reset_time = int(r.headers.get("X-RateLimit-Reset", 0))
if remaining < 10:
    wait = max(0, reset_time - time.time()) + 5
    time.sleep(wait)
```

assignees is an array, assignee is the first one
GitHub issues can have multiple assignees. The 'assignee' field is the primary assignee (or null). The 'assignees' field is the full array. If you need all assignees, join the assignees array — not just the single assignee field.
```python
"assignees": ", ".join(a["login"] for a in issue.get("assignees", []))
```

Frequently asked questions
How do I export issues from a private repository?
Use a personal access token (classic) with the 'repo' scope, or a fine-grained token with Issues read permission for that repository. Pass it in the Authorization header: 'Bearer ghp_...'.
Can I export GitHub Projects data to CSV?
GitHub Projects (v2) uses the GraphQL API, not the REST API. The REST /issues endpoint gives you issue metadata but not project-specific fields like status, priority, or sprint. For project fields, use the GraphQL API to query ProjectV2Items.
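A sketch of the GraphQL request shape for Projects v2. The org name and project number are placeholders, and the field-value fragments shown are only two of the possible types; actually sending the request requires POSTing the payload to https://api.github.com/graphql with a token that can read the project.

```python
import json

# Placeholder variables; replace with your organization and project number
query = """
query($org: String!, $number: Int!) {
  organization(login: $org) {
    projectV2(number: $number) {
      items(first: 100) {
        nodes {
          content { ... on Issue { number title } }
          fieldValues(first: 20) {
            nodes {
              ... on ProjectV2ItemFieldSingleSelectValue { name }
              ... on ProjectV2ItemFieldTextValue { text }
            }
          }
        }
      }
    }
  }
}
"""
payload = json.dumps({"query": query, "variables": {"org": "your-org", "number": 1}})
# requests.post("https://api.github.com/graphql", data=payload,
#               headers={"Authorization": f"Bearer {TOKEN}"})
```

Single-select fields (like a Status column) come back through the `ProjectV2ItemFieldSingleSelectValue` fragment, which is where board status lives.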
Why do I get a 422 error when fetching more than 1000 issues?
GitHub caps simple page-number pagination at 1,000 results (100 per page × 10 pages) on the Search API and some list endpoints. To page deeper on the issues endpoint, page by time instead: request sort=updated with direction=asc, and pass the latest updated_at timestamp you have seen as the since parameter on the next request.
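The since-based loop can be sketched with a stand-in fetch function (the real call would be GET /issues with sort=updated, direction=asc, and since; dedupe by issue number because since is inclusive, so the last item of one batch reappears at the start of the next):

```python
def fetch_page(since):
    """Stand-in for the API call: returns issues updated at or after `since`."""
    data = [
        {"number": 1, "updated_at": "2025-01-01T00:00:00Z"},
        {"number": 2, "updated_at": "2025-01-02T00:00:00Z"},
        {"number": 3, "updated_at": "2025-01-03T00:00:00Z"},
    ]
    return [d for d in data if d["updated_at"] >= since][:2]  # per_page=2

seen, since = {}, "1970-01-01T00:00:00Z"
while True:
    batch = fetch_page(since)
    new = [i for i in batch if i["number"] not in seen]
    if not new:
        break  # nothing but repeats left, so we are done
    for i in new:
        seen[i["number"]] = i
    since = new[-1]["updated_at"]  # advance the time cursor
print(len(seen))  # 3
```

Because the cursor is a timestamp rather than a page number, this walks arbitrarily far back without hitting the 10-page cap.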
How do I export commit history to CSV?
Use GET /repos/{owner}/{repo}/commits. Each commit has a commit.author.name, commit.author.date, commit.message, and sha. The structure is different from issues — use json_normalize(commits, sep='.') to flatten it, then rename the dot-notation columns.
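A minimal sketch of the commit shape and a flatten using those nested paths (the sample data here is illustrative, not from a real repository):

```python
sample_commits = [
    {
        "sha": "a1b2c3d",
        "commit": {
            "author": {"name": "Alice Chen", "date": "2025-03-01T09:15:00Z"},
            "message": "Fix pagination on mobile dashboard\n\nLonger description...",
        },
    },
]

rows = [
    {
        "sha": c["sha"],
        "author": c["commit"]["author"]["name"],
        "date": c["commit"]["author"]["date"],
        # Keep only the subject line; full messages often span many lines
        "message": c["commit"]["message"].splitlines()[0],
    }
    for c in sample_commits
]
print(rows[0]["author"])  # Alice Chen
```

Note that commit.author (the git author name embedded in the commit) is distinct from the top-level author object, which is the linked GitHub account and can be null.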