Feature: Add --batch flag to gh api for multi-endpoint requests #13333

@SebTardif

Description

Problem

LLM agents, CI pipelines, and automation scripts frequently need to call multiple GitHub API endpoints in sequence. Each gh api invocation spawns a new process, establishes a new TCP connection, performs TLS negotiation, and sends authentication headers — even when hitting the same host.

In a typical LLM agent session (using gh api to fetch issues, file contents, and repo metadata), I measured 15+ sequential gh api invocations, each incurring ~50ms process spawn overhead and ~100-200ms TCP+TLS setup to api.github.com. That's 2-3 seconds of pure connection overhead that could be eliminated by reusing a single HTTP client.

Related: #13174 (Batch operations for code search / in general)

Proposed Solution

Add a --batch flag to gh api that reads newline-separated API requests from stdin and processes them using a shared HTTP connection.

Input format

Two formats, auto-detected per line:

Plain paths (one per line, treated as GET):

repos/cli/cli
repos/cli/cli/issues/1
repos/cli/cli/pulls/2

JSON objects (full control per request):

{"method":"GET","url":"repos/cli/cli"}
{"method":"POST","url":"repos/cli/cli/issues/1/comments","body":{"body":"hello"},"headers":{"Accept":"application/vnd.github.raw+json"}}

Usage

# Fetch multiple endpoints, get JSON Lines output
printf "repos/cli/cli\nrepos/cli/cli/issues/1" | gh api --batch

# With jq filter (applied per-response)
printf "repos/cli/cli\nrepos/cli/cli/issues/1" | gh api --batch --jq '.full_name // .title'

# Placeholders work per-line
printf "repos/{owner}/{repo}\nrepos/{owner}/{repo}/issues?per_page=1" | gh api --batch

Output

Without --jq: JSON Lines output, one envelope object per response:

{"status":200,"path":"repos/cli/cli","body":{"full_name":"cli/cli",...}}
{"status":200,"path":"repos/cli/cli/issues/1","body":{"title":"Bug",...}}

With --jq: filter applied to each response body, one result per line.

Flag interactions

  • Mutually exclusive with --paginate and --input (both use stdin)
  • Compatible with --jq, --template, --hostname, --method (as default), --header, --cache
  • Requests are sequential (connection reuse, not concurrency), so rate limiting is naturally respected

Quantified Impact

Real measurements from an LLM agent session working with the aws/aws-cli repo:

| Pattern | Without --batch | With --batch | Savings |
|---|---|---|---|
| 5 file content fetches | 5 processes, 5 TCP connections | 1 process, 1 connection | 80% |
| 2 comment fetches | 2 processes, 2 connections | 1 process, 1 connection | 50% |
| Tree listing + 3 file fetches | 4 processes, 4 connections | 1 process, 1 connection | 75% |
| Entire session (~15 gh api calls) | ~15 processes | ~3-4 invocations | ~75% |

Prototype Implementation

I have a working prototype with full test coverage: SebTardif/cli@feat/api-batch

Changes (~380 lines):

  • pkg/cmd/api/batch.go — Input parsing (plain paths + JSON objects) and JSON Lines output
  • pkg/cmd/api/api.go — --batch flag, validation, and a batch execution loop reusing the existing httpRequest() and processResponse()
  • pkg/cmd/api/batch_test.go — 8 unit tests for parsing and output
  • pkg/cmd/api/api_test.go — 4 integration tests (basic batch, jq filter, error handling, flag existence)

All existing tests pass. Happy to open a PR whenever this is reviewed.

Why this benefits GitHub

  • Fewer API connections: N sequential requests become 1 TCP connection with HTTP keep-alive
  • Lower rate limit pressure: Automation tools hit rate limits less often with efficient batching
  • LLM agent adoption: As AI coding tools increasingly use gh for GitHub interaction, batch mode makes gh api the natural choice over raw HTTP clients
