docs: confirm robots.txt present with explicit AI bot allowlist #43250
PR Triage
Breakdown: 0 additions, 0 deletions. PR was opened to confirm
Recommend: Close without merge.
Hey That said, there's one structural issue that needs addressing before this can be merged:
The
If you'd like a hand diagnosing why the deployed site isn't picking up the committed file, you can assign this prompt to your coding agent:

Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "patchdiff.githubusercontent.com"
See Network Configuration for more information.
GEO audit (2026-07-03) flagged robots_txt found: false on the deployed docs site (github.github.com/gh-aw/), costing 18 points site-wide, the largest single-file score gap in the audit.

What's here

docs/public/robots.txt already exists in the repo with a complete AI-crawler allowlist:
- User-agent: * + Allow: / as the permissive baseline
- GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, anthropic-ai, plus 21 additional crawlers
- Sitemap: https://github.github.com/gh-aw/sitemap.xml referenced at the end
- /gh-aw/robots.txt per the GitHub Pages project-site path (documented in docs/README.md)

The found: false audit result reflects the deployed site not yet serving the committed file at audit time; no file content changes were required.
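
For anyone who wants to re-check the served file once the deployment catches up, here is a minimal sketch of the kind of check involved. It relies only on Python's standard urllib.robotparser. The names audit_robots and EXPECTED_BOTS and the base_url default are hypothetical, chosen for illustration; this is not the GEO audit's own logic, which isn't shown in this PR. Only the six crawlers named above are checked, since the other 21 aren't listed here.

```python
from urllib.robotparser import RobotFileParser

# Crawlers named explicitly in the PR description (assumption: the remaining
# 21 entries would be added here once their names are known).
EXPECTED_BOTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "OAI-SearchBot", "anthropic-ai",
]


def audit_robots(text, base_url="https://github.github.com/gh-aw/"):
    """Summarize a robots.txt body (hypothetical helper, not the audit tool).

    Returns which expected bots have their own User-agent line, which are
    permitted to fetch base_url, and which sitemaps the file references.
    """
    parser = RobotFileParser()
    parser.parse(text.splitlines())

    # Collect every User-agent token in the file, ignoring comments and case.
    declared = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            declared.add(value.strip().lower())

    return {
        "explicit": {bot: bot.lower() in declared for bot in EXPECTED_BOTS},
        "allowed": {bot: parser.can_fetch(bot, base_url) for bot in EXPECTED_BOTS},
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
    }
```

To run it against the deployed site, fetch the body served at /gh-aw/robots.txt on github.github.com (for example with urllib.request.urlopen) and pass the decoded text to audit_robots. A 404 at that step would match the found: false result; a report with every "explicit" and "allowed" entry true plus the sitemap URL would suggest the committed file is now being served.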