sphinx-gp-sitemap¶
Alpha
Rendered output is stable. The Python API and Sphinx config value names may change without a major version bump. Pin your dependency to a specific version range in production.
Sitemap generator for Sphinx. The package registers every sitemap_*
config value the upstream
sphinx-sitemap exposes
and emits the same sitemap.xml shape (urlset, hreflang alternates,
optional <lastmod>), updated to Sphinx 8.1+ idioms. The hard
dependency on sphinx-last-updated-by-git is downgraded to a
soft on-demand load that activates only under
sitemap_show_lastmod = True.
For install, builder support, locale rules, and the lastmod / migration story, see the package README. This page covers integration with gp-sphinx, the emission pipeline, the trade-offs, and the auto-generated config-value reference.
Integration with gp-sphinx¶
sphinx_gp_sitemap ships in DEFAULT_EXTENSIONS,
so projects that build through merge_sphinx_config()
load it automatically. Passing docs_url= to that function auto-derives
both URL inputs the extension needs:
| Auto-derived | Source |
|---|---|
| site_url | docs_url (trailing-slash normalized) |
| sitemap_url_scheme | flat "{link}" scheme |
The flat scheme overrides the upstream default of
"{lang}{version}{link}" because git-pull.com sites deploy at the
project root, with no language or version directory in the URL space.
Multilingual or version-pinned hosts can still pass an explicit
sitemap_url_scheme through **overrides: merge_sphinx_config()
runs auto-derivation first and applies overrides last, so explicit
values win. The canonical mapping lives in "From docs_url".
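The derive-then-override order can be sketched as follows. This is a minimal illustration, not the gp-sphinx implementation: derive_sitemap_defaults is a hypothetical helper name, and the example URL is a placeholder.

```python
from __future__ import annotations


def derive_sitemap_defaults(docs_url: str | None, **overrides: object) -> dict[str, object]:
    """Hypothetical stand-in for the sitemap part of merge_sphinx_config()."""
    config: dict[str, object] = {}
    if docs_url:
        # Normalize to exactly one trailing slash so site_url + link joins cleanly.
        config["site_url"] = docs_url.rstrip("/") + "/"
        # Flat scheme: no language or version directory in the URL space.
        config["sitemap_url_scheme"] = "{link}"
    # Explicit overrides are applied last, so they win over derived values.
    config.update(overrides)
    return config


flat = derive_sitemap_defaults("https://docs.example.org/project")
versioned = derive_sitemap_defaults(
    "https://docs.example.org/project",
    sitemap_url_scheme="{lang}{version}{link}",
)
```

Applying the overrides with a plain dict update keeps the precedence rule easy to see: whatever the caller passes explicitly replaces the derived value.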
How sitemap.xml is built¶
After every HTML-family build, the extension serializes one <url>
element per built page to sitemap.xml in the output directory.
- Init: builder-inited initializes env.temp_data["sphinx_gp_sitemap_links"] to an empty list.
- Collect: html-page-context fires once per page. The handler computes the relative URL using the builder’s suffix (html_file_suffix or ".html" for the html builder; …/ for dirhtml, with the index emitted as the empty string), drops it when any pattern in sitemap_excludes matches, and appends a (relative_link, last_updated) tuple to the list.
- Compose: build-finished resolves site_url (or html_baseurl as fallback; if both are unset, the extension logs a message at INFO level and skips sitemap generation). For each collected link the handler formats site_url + sitemap_url_scheme.format(lang=…, version=…, link=…). The lang segment comes from app.builder.config.language followed by / (empty when no language is set); version likewise from app.builder.config.version.
- Hreflang: when sitemap_locales resolves to a non-empty list (explicit value, or auto-detected sub-directories of every entry in locale_dirs), each <url> gains <xhtml:link rel="alternate" hreflang="…"> siblings. The formatter rewrites underscores to hyphens for IANA compatibility (pt_BR → pt-BR). The sentinel sitemap_locales = [None] suppresses alternates explicitly.
- Lastmod (optional): when sitemap_show_lastmod = True, the config-inited handler runs app.setup_extension("sphinx_last_updated_by_git") once at the start of the build to lazy-load the supporting extension. If the import fails, sphinx-gp-sitemap logs a WARNING and disables the flag for the rest of the build: <lastmod> is omitted, but everything else is still emitted.
- Serialize: xml.etree.ElementTree.write() produces the file. When sitemap_indent > 0, ElementTree.indent() pretty-prints the tree with the configured width. ElementTree handles XML entity escaping for the URL text and attribute values automatically.
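The Compose, Hreflang, and Serialize steps can be approximated with the standard library alone. The sketch below is illustrative rather than the extension's code: build_sitemap is a hypothetical name, the URLs are placeholders, and composing each alternate's href by substituting the locale into the lang segment is an assumption.

```python
import io
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(site_url, scheme, links, lang="", version="", locales=(), indent=0):
    """Return sitemap XML for the given relative links (hypothetical helper)."""
    # Register prefixes so the output uses xmlns="..." and xhtml:link.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for link in links:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = site_url + scheme.format(lang=lang, version=version, link=link)
        for locale in locales:
            # Assumption: the alternate URL swaps the locale into the lang segment.
            href = site_url + scheme.format(lang=f"{locale}/", version=version, link=link)
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=locale.replace("_", "-"),  # pt_BR -> pt-BR
                href=href,
            )
    tree = ET.ElementTree(urlset)
    if indent > 0:
        # ET.indent requires Python 3.9 or newer.
        ET.indent(tree, space=" " * indent)
    buffer = io.BytesIO()
    # ElementTree escapes text and attribute values while serializing.
    tree.write(buffer, encoding="utf-8", xml_declaration=True)
    return buffer.getvalue().decode("utf-8")


minified = build_sitemap("https://docs.example.org/", "{link}", ["index.html", "a&b.html"])
pretty = build_sitemap(
    "https://docs.example.org/",
    "{lang}{version}{link}",
    ["index.html"],
    locales=["pt_BR"],
    indent=2,
)
```

Because the ampersand in a&b.html is escaped by the serializer, callers do not need to pre-escape URLs before handing them to the tree.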
Event hooks¶
config-inited → _maybe_enable_git_lastmod (lazy-load lastmod ext)
build-finished → _write_sitemap (enumerate found_docs +
XML serialization)
Both live in
sphinx_gp_sitemap/__init__.py.
Page enumeration runs once at build-finished over app.env.found_docs
using app.builder.get_target_uri(pagename) for each URL — no
html-page-context handler, so incremental builds (where Sphinx
fires the hook only for re-written pages) still emit a complete
sitemap. app.env.found_docs is part of the env Sphinx merges across
parallel-read workers, so the extension is parallel_write_safe
without per-handler aggregation logic.
Trade-offs¶
Drop-in for sphinx-sitemap with stricter URL handling. Upstream
reconstructed page URLs as pagename + html_file_suffix, which
diverges from the HTML builder’s actual <a href> output when
html_link_suffix is set (e.g. "/" for clean URLs) or when a
pagename contains characters Sphinx URL-quotes. sphinx-gp-sitemap
calls app.builder.get_target_uri(pagename) directly, matching the
links Sphinx emits on the page itself.
html_baseurl is re-registered defensively. Sphinx core
registers html_baseurl on most modern versions, but older trees and
some custom builders skip it. The setup() body wraps the
add_config_value("html_baseurl", …) call in
contextlib.suppress(ExtensionError) so the extension is robust
against either layout. The bare except BaseException upstream uses
is replaced by the narrow ExtensionError catch.
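The narrow catch might look like the sketch below. It is not the package's setup() body: register_html_baseurl is a hypothetical name, the rebuild value "html" is an assumption, and the ImportError fallback exists only so the snippet runs without Sphinx installed.

```python
import contextlib

try:
    from sphinx.errors import ExtensionError
except ImportError:  # stand-in so the sketch also runs without Sphinx
    class ExtensionError(Exception):
        """Placeholder for sphinx.errors.ExtensionError."""


def register_html_baseurl(app):
    """Register html_baseurl unless Sphinx core already did."""
    # Sphinx raises ExtensionError when a config value is already present;
    # suppressing only that type lets unrelated failures propagate.
    with contextlib.suppress(ExtensionError):
        # The rebuild value "html" is an assumption for this sketch.
        app.add_config_value("html_baseurl", None, "html")
```

Unlike a blanket except BaseException, this form does not swallow KeyboardInterrupt, SystemExit, or programming errors raised during registration.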
Config reference¶
Generated from app.add_config_value() registrations in
sphinx_gp_sitemap/__init__.py.
Config Value Index
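| Name | Type | Default | Rebuild |
|---|---|---|---|
| site_url | None \| str | None | |
| sitemap_url_scheme | str | '{lang}{version}{link}' | |
| sitemap_locales | None \| list | [] | |
| sitemap_filename | str | 'sitemap.xml' | |
| sitemap_excludes | list | [] | |
| sitemap_show_lastmod | bool | False | |
| sitemap_indent | int | 0 | |
| html_baseurl | None \| str | None | |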
site_url¶
Site base URL prepended to every sitemap entry. Auto-derived from docs_url (trailing-slash normalized) under gp-sphinx; falls back to html_baseurl when unset. If both are unset, sitemap generation is skipped with an INFO-level log message.
- Type: None | str
- Default: None
- Registered by: sphinx_gp_sitemap.setup()
sitemap_url_scheme¶
Per-URL composition template formatted with lang (language/ or empty), version (version/ or empty), and link (the page’s relative URL). Auto-set to flat {link} under gp-sphinx; multilingual or version-pinned hosts can pass {lang}{version}{link} via **overrides.
- Type: str
- Default: '{lang}{version}{link}'
- Registered by: sphinx_gp_sitemap.setup()
sitemap_locales¶
Locales emitted as <xhtml:link rel="alternate" hreflang=...> siblings on every URL. An empty list auto-detects sub-directories of every locale_dirs entry; [None] explicitly suppresses hreflang alternates. Underscores in locale codes become hyphens for IANA compatibility.
- Type: None | list
- Default: []
- Registered by: sphinx_gp_sitemap.setup()
sitemap_filename¶
Output filename written under the build’s outdir.
- Type: str
- Default: 'sitemap.xml'
- Registered by: sphinx_gp_sitemap.setup()
sitemap_excludes¶
fnmatch patterns matched against each page’s relative URL (after the builder applies its suffix). Matched pages are dropped from the sitemap; everything else is included.
- Type: list
- Default: []
- Registered by: sphinx_gp_sitemap.setup()
sitemap_show_lastmod¶
When True, lazy-loads sphinx-last-updated-by-git and emits a <lastmod> element per page from the source file’s latest commit timestamp. If the supporting extension is not installed, sphinx-gp-sitemap logs one warning and disables the flag for the rest of the build.
- Type: bool
- Default: False
- Registered by: sphinx_gp_sitemap.setup()
sitemap_indent¶
XML indent width in spaces. 0 minifies the output; any positive value pretty-prints via ElementTree.indent.
- Type: int
- Default: 0
- Registered by: sphinx_gp_sitemap.setup()
html_baseurl¶
Sphinx core’s canonical HTML base URL, re-registered defensively here to serve as the site_url fallback on Sphinx versions that ship without it.
- Type: None | str
- Default: None
- Registered by: sphinx_gp_sitemap.setup()
Package reference¶
Copyable config snippet
extensions = [
"sphinx_gp_sitemap",
]
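For a project that does not build through gp-sphinx, a fuller conf.py might look like the sketch below. The option names come from the config reference above; the URL and the exclude patterns are placeholders, not recommended values.

```python
# conf.py (sketch): explicit settings for a project outside gp-sphinx
extensions = [
    "sphinx_gp_sitemap",
]

# Placeholder base URL; html_baseurl is used as the fallback when this is unset.
site_url = "https://docs.example.org/"

# Flat scheme: pages live at the site root, with no language or version directory.
sitemap_url_scheme = "{link}"

# Example patterns only; each is matched against the page's relative URL.
sitemap_excludes = ["search.html", "genindex.html"]

# Requires sphinx-last-updated-by-git to be installed; otherwise the flag
# is disabled with a warning and <lastmod> is omitted.
sitemap_show_lastmod = True

# Pretty-print with two spaces; 0 keeps the output minified.
sitemap_indent = 2
```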
Package metadata
Source on GitHub: sphinx-gp-sitemap
PyPI: sphinx-gp-sitemap
Maturity:
Alpha