sphinx-gp-llms

Alpha GitHub PyPI

LLM-friendly documentation outputs for Sphinx — llms.txt, llms-full.txt, docs.json, per-page Markdown

How to

Task recipes for common workflows.

API Reference

Every directive, role, and config value.

Dependents

Workspace packages that import this one.


Credits

The output formats follow conventions established by their respective communities:

  • llms.txt — proposed by Jeremy Howard (Answer.AI), September 2024. Specification at llmstxt.org.

  • llms-full.txt — community convention adopted by Anthropic, Cloudflare, GitBook, Hugging Face, and others.

  • docs.json — agent-manifest convention inspired by Lakebed (Ping, github.com/pingdotgg/span).

  • Per-page .md twins — convention popularized by Cloudflare (“Markdown for Agents”), Stripe, Anthropic, and Vercel.

  • Footer layout (Source: docs/page.md · Machine-readable: Markdown, raw source, docs.json, llms.txt, llms-full.txt) — inspired by docs.lakebed.dev.

A note on docs.json

Sphinx already ships an inter-project linking mechanism: objects.inv, the inventory file that powers intersphinx. It maps qualified names (classes, functions, config values) to URLs across Sphinx sites. It is not, however, designed for LLM consumption: the body of the file is zlib-compressed, and its domain-specific schema is oriented toward cross-reference resolution, not content discovery.
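To make the point about objects.inv concrete, here is a minimal sketch that builds and then decodes a tiny version 2 inventory using only the standard library. The project name, version, and the `example_option` entry are hypothetical; the layout (four plain-text header lines followed by a zlib-compressed body) is the part being illustrated, and real tooling would normally go through intersphinx instead.

```python
import re
import zlib

# One record per line: name, domain:role, priority, URI, display name.
RECORD = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def build_inventory(project, version, entries):
    """Assemble a version 2 inventory: plain-text header + zlib body."""
    header = (
        "# Sphinx inventory version 2\n"
        f"# Project: {project}\n"
        f"# Version: {version}\n"
        "# The remainder of this file is compressed using zlib.\n"
    ).encode("utf-8")
    body = ("\n".join(entries) + "\n").encode("utf-8")
    return header + zlib.compress(body)


def read_inventory(data):
    """Decode the records; a reader must decompress before seeing anything."""
    # Split off the four header lines; the remainder is the compressed body.
    payload = zlib.decompress(data.split(b"\n", 4)[4])
    records = []
    for line in payload.decode("utf-8").splitlines():
        match = RECORD.match(line.rstrip())
        if match is None:
            continue
        name, kind, priority, uri, dispname = match.groups()
        if uri.endswith("$"):  # "$" is shorthand for the object name
            uri = uri[:-1] + name
        if dispname == "-":  # "-" means "same as the name"
            dispname = name
        records.append(
            {"name": name, "type": kind, "uri": uri, "dispname": dispname}
        )
    return records


# Hypothetical entry, for illustration only.
demo = build_inventory(
    "demo", "0.1",
    ["example_option std:confval 1 configuration.html#confval-$ -"],
)
for record in read_inventory(demo):
    print(record["type"], record["name"], record["uri"])
```

Each record resolves a name to an anchor and nothing more: there are no page summaries, headings, or Markdown URLs for an agent to work with.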

docs.json fills a different role: a site-level beacon that tells agents where the documentation lives, what pages exist, and how to fetch them in Markdown. It carries agentEntrypoints (pointers to llms.txt, llms-full.txt, and itself), a flat pages[] array with per-page markdownUrl and headings[], and the project’s sourceRepository.
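As a rough illustration of how an agent might consume such a file, the sketch below parses a hand-written manifest and resolves each page's Markdown URL. Only the key names (agentEntrypoints, pages, markdownUrl, headings, sourceRepository) come from the description above; the value shapes, the entrypoint key names, the sample URLs, and the `markdown_targets` helper are assumptions made for the example, not the format this package emits.

```python
import json
from urllib.parse import urljoin

# Hypothetical manifest: key names follow the prose above, but the value
# shapes and every URL here are assumptions for illustration.
SAMPLE = """
{
  "agentEntrypoints": {
    "llmsTxt": "llms.txt",
    "llmsFullTxt": "llms-full.txt",
    "docsJson": "docs.json"
  },
  "sourceRepository": "https://example.invalid/org/project",
  "pages": [
    {"markdownUrl": "index.md", "headings": ["Install", "Credits"]},
    {"markdownUrl": "how-to/index.md", "headings": ["Recipes"]}
  ]
}
"""


def markdown_targets(manifest, base_url):
    """Return (absolute Markdown URL, headings) for every listed page."""
    targets = []
    for page in manifest.get("pages", []):
        markdown_url = page.get("markdownUrl")
        if not markdown_url:
            continue  # skip pages without a Markdown twin
        headings = list(page.get("headings", []))
        targets.append((urljoin(base_url, markdown_url), headings))
    return targets


manifest = json.loads(SAMPLE)
for url, headings in markdown_targets(manifest, "https://docs.example.invalid/"):
    print(url, headings)
```

Because pages[] is flat, a single pass is enough to enumerate everything an agent can fetch, with no navigation tree to walk.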

There is no published specification for this format. We first noticed the convention at Lakebed (github.com/pingdotgg/span). Other documentation platforms also emit a file named docs.json, but those are typically site-builder configuration files (theme colors, navigation structure) closer to a manifest.json than an agent-oriented content manifest.