# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 33
# self = https://watcher.sour.is/conv/px274va
Anyone got a link to a robots.txt that “blocks” all the “AI” stuff?
… or maybe I should do this based on allowlisting rather than blocklisting. 🤔 Only allow a couple of bots that I think are fine …
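
A minimal sketch of the allowlisting idea, checked with Python's standard `urllib.robotparser`. `GoodBot` and `example.org` are hypothetical placeholders for whichever crawler and site you would actually pick. It only models how a compliant parser reads the rules and says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: "GoodBot" stands in for a crawler you trust.
# Every other user agent falls through to the catch-all "*" group.
ROBOTS_TXT = """\
User-agent: GoodBot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for agent in ("GoodBot", "GPTBot"):
    allowed = parser.can_fetch(agent, "https://example.org/index.html")
    print(agent, "allowed" if allowed else "disallowed")
```

With these rules a well-behaved parser lets `GoodBot` in and turns everyone else away, which is the reverse of maintaining an ever-growing blocklist.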
@movq Only found 3 results for "robotst.xt" and OpenAI 😢 I seem to recall an effort (_I cannot find_) to build a standard for AI Crawlers similar to robots.txt
@movq Found it!

ai.txt: A new way for websites to set permissions for AI
@prologic Ahhh, right, now I remember. That ai.txt boils down to this, I guess:

User-Agent: *
Disallow: /*
@movq I have this one as per some article I read some time ago... But just like with robots.txt, I don't think you have any guarantee that it would be honored; you might even have a better chance hunting for and blocking user agents.
@aelaraji Yeah, there is no guarantee with any of these things, it can all be faked or ignored. 🫤 I’m still going to do it in the hopes that *some* of those bots respect it.
@movq It looks like this one actually reads the robots.txt ... it did a couple of times over the past few weeks.

> "GET /robots.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"
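
A rough sketch of how one might spot such robots.txt fetches in an access log with Python. It assumes log lines end with the request, status, size, referer and user-agent fields in the same layout as the quoted line; the regular expression and the function name `robots_fetch_agent` are illustrative, not taken from any tool mentioned here.

```python
import re
from typing import Optional

# Matches the request / status / size / referer / user-agent tail of a log
# line laid out like the one quoted above (an assumption about the format).
LOG_TAIL = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def robots_fetch_agent(line: str) -> Optional[str]:
    """Return the User-Agent if the line is a request for /robots.txt."""
    match = LOG_TAIL.search(line)
    if match is None or match["path"] != "/robots.txt":
        return None
    return match["agent"]


line = (
    '"GET /robots.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 AppleWebKit/537.36 '
    '(KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"'
)
print(robots_fetch_agent(line))
```

Fetching robots.txt only shows that the file was requested; whether the rules in it are then obeyed is a separate question.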
Hey @movq !! here's an article you might find interesting: Blocking Bots with Nginx ... this person is actually blocking AI Bots based on a list of User Agents in an interesting way. 👍
@aelaraji Hmmm, looks like the core idea is to intercept requests, inspect the User-Agent header and respond accordingly.
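
To make that idea concrete, here is a small Python WSGI sketch. It is not the Nginx configuration from the article, just the same intercept, inspect and respond pattern in another setting. The token list holds only `GPTBot` (the one agent seen in this thread), and `hello_app` is a hypothetical stand-in for the real site.

```python
# Substrings to refuse; extend with whatever user-agent tokens you want to block.
BLOCKED_AGENT_TOKENS = ("GPTBot",)


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring match against the blocklist."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_TOKENS)


def block_bots(app):
    """Wrap a WSGI app and answer 403 to blocked user agents."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    # Hypothetical stand-in for the real site.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


application = block_bots(hello_app)
```

This only catches bots that announce themselves honestly in the User-Agent header.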
Can we trust the bots not to fake their identity? 🤔
@aelaraji @prologic Hmm, yeah, looks a bit better than ai.txt / robots.txt, but I wouldn’t trust that they don’t spoof their user agent. 🤔
@movq me neither 🤦‍♂️