static: robots.txt: add even more stupid, disrespecting AI crawlers
Signed-off-by: Christoph Heiss <christoph@c8h4.io>
parent ac3e86addb
commit a2766e6098
@@ -1,8 +1,5 @@
 # Primarily based on https://git.sr.ht/~sircmpwn/sr.ht-nginx/tree/master/item/robots.txt
 # All credit for collecting to Drew, the sourcehut crew and its contributers!
-#
-# Also some taken from here, thanks for the idea you AI shills!
-# https://github.com/samber/the-great-gpt-firewall
 
 # Too aggressive, marketing/SEO
 User-agent: SemrushBot
@@ -66,22 +63,10 @@ Disallow: /
 User-agent: GPTBot
 Disallow: /
 
-# ChatGPT plugins
-User-agent: ChatGPT-User
-Disallow: /
-
-# Common Crawl, used by e.g. OpenAI .. blargh
-User-agent: CCBot
-Disallow: /
-
 # Fairly certain that this is an LLM data vacuum
 User-agent: ClaudeBot
 Disallow: /
 
-# Claude
-User-agent: anthropic-ai
-Disallow: /
-
 # Same
 User-agent: Google-Extended
 Disallow: /
@@ -89,3 +74,35 @@ Disallow: /
 # Marketing
 User-agent: serpstatbot
 Disallow: /
+
+#
+# Thanks for the additional list, you AI shills!
+# https://github.com/samber/the-great-gpt-firewall
+#
+
+# ChatGPT plugins
+User-agent: ChatGPT-User
+Disallow: /
+
+# Common Crawl, used by e.g. OpenAI .. blargh
+User-agent: CCBot
+Disallow: /
+
+# Claude
+User-agent: anthropic-ai
+Disallow: /
+
+# Many thanks to
+# https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/
+# for the next few!
+User-agent: Omgilibot
+Disallow: /
+
+User-agent: Omgili
+Disallow: /
+
+User-agent: FacebookBot
+Disallow: /
+
+User-agent: Bytespider
+Disallow: /
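
One way to sanity-check rules like the ones added here is to feed them to Python's standard `urllib.robotparser` and ask whether each crawler may fetch the site root. This is only a sketch: the rule text is copied from the added lines of this diff, while `blocked_status`, the `https://example.org/` URL and the `SomeBrowser` agent are made-up names for illustration. It also only checks how a well-behaved parser reads the file; robots.txt is advisory, so it says nothing about whether a given crawler actually honours it.

```python
from urllib.robotparser import RobotFileParser

# Rules added by this commit (comments shortened; otherwise as in the diff).
ADDED_RULES = """\
# ChatGPT plugins
User-agent: ChatGPT-User
Disallow: /

# Common Crawl
User-agent: CCBot
Disallow: /

# Claude
User-agent: anthropic-ai
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: Omgili
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Bytespider
Disallow: /
"""

BLOCKED_AGENTS = [
    "ChatGPT-User", "CCBot", "anthropic-ai",
    "Omgilibot", "Omgili", "FacebookBot", "Bytespider",
]


def blocked_status(robots_txt, agents, url="https://example.org/"):
    """Return {agent: True if the rules forbid fetching `url`} (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


# "SomeBrowser" is a hypothetical agent that no rule mentions.
status = blocked_status(ADDED_RULES, BLOCKED_AGENTS + ["SomeBrowser"])
for agent, blocked in status.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

Running the same check against the full deployed file instead of this excerpt would also cover the older entries such as `GPTBot` and `ClaudeBot`.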