# I am the Watcher. I am your guide through this vast new twtiverse.
#
# Usage:
# https://watcher.sour.is/api/plain/users View list of users and latest twt date.
# https://watcher.sour.is/api/plain/twt View all twts.
# https://watcher.sour.is/api/plain/mentions?uri=:uri View all mentions for uri.
# https://watcher.sour.is/api/plain/conv/:hash View all twts for a conversation subject.
#
# Options:
# uri Filter to show a specific user's twts.
# offset Start index for query.
# limit Count of items to return (going back in time).
#
# twt range = 1 33
# self = https://watcher.sour.is/conv/px274va
Anyone got a link to a robots.txt that “blocks” all the “AI” stuff?
… or maybe I should do this based on allowlisting rather than blocklisting. 🤔 Only allow a couple of bots that I think are fine …
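The allowlisting idea can be sketched in Python. This is only an illustration under assumptions: the bot names `ExampleSearchBot` and `ExampleArchiveBot` are placeholders (no specific bots are named here), and `build_allowlist_robots` and `can_fetch` are hypothetical helpers. The catch-all group uses `Disallow: /` because, as far as I know, the standard-library parser used for the check matches paths as plain prefixes rather than wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Placeholder allowlist: hypothetical names, not recommendations.
ALLOWED_BOTS = ["ExampleSearchBot", "ExampleArchiveBot"]


def build_allowlist_robots(allowed):
    """Return robots.txt text that allows the named bots and disallows all others."""
    lines = []
    for name in allowed:
        # An empty Disallow value means nothing is disallowed for this agent.
        lines += [f"User-agent: {name}", "Disallow:", ""]
    # Catch-all group for every crawler not listed above.
    lines += ["User-agent: *", "Disallow: /", ""]
    return "\n".join(lines)


def can_fetch(robots_text, user_agent, url="https://example.org/page"):
    """Check a user agent against robots.txt text with the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    text = build_allowlist_robots(ALLOWED_BOTS)
    print(text)
    print(can_fetch(text, "ExampleSearchBot"))
    print(can_fetch(text, "GPTBot"))
```

This only describes what a well-behaved crawler would conclude from the file; nothing in it enforces anything.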
@movq Only found 3 results for "robotst.xt" and OpenAI 😢 I seem to recall an effort (_I cannot find_) to build a standard for AI Crawlers similar to robots.txt
@prologic Ahhh, right, now I remember. That ai.txt boils down to this, I guess:
User-Agent: *
Disallow: /*
@movq I have this one as per some article I read some time ago... But just like with robots.txt, I don't think you have any guarantee that it would be honored; you might even have a better chance hunting for and blocking user agents.
@aelaraji Yeah, there is no guarantee with any of these things, it can all be faked or ignored. 🫤 I’m still going to do it in the hopes that *some* of those bots respect it.
@movq It looks like this one actually reads the robots.txt ... it did a couple of times over the past few weeks.
> "GET /robots.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"
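A rough way to see which clients ask for robots.txt at all is to scan the access log for such requests. The sketch below assumes log lines end in the same `"request" status bytes "referer" "user-agent"` shape as the excerpt above; `LOG_TAIL` and `robots_fetchers` are hypothetical names, and other log formats would need a different pattern.

```python
import re

# Matches the tail of an access log line shaped like the excerpt above:
# "METHOD PATH PROTO" STATUS BYTES "REFERER" "USER-AGENT"
LOG_TAIL = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def robots_fetchers(log_lines):
    """Return the set of user-agent strings that requested /robots.txt."""
    agents = set()
    for line in log_lines:
        match = LOG_TAIL.search(line)
        if match and match.group("path") == "/robots.txt":
            agents.add(match.group("agent"))
    return agents
```

Fetching the file does not prove the rules in it are obeyed; it only shows the client looked.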
Hey @movq!! Here's an article you might find interesting: Blocking Bots with Nginx ... this person is actually blocking AI bots based on a list of User Agents in an interesting way. 👍
@aelaraji Hmmm, looks like the core idea is to intercept requests, inspect the User-Agent header, and respond accordingly.
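That intercept-and-inspect idea can be sketched in Python as a small WSGI middleware. This is not the article's Nginx configuration, just an analogue under assumptions: `BLOCKED_AGENT_TOKENS`, `is_blocked`, and `block_bots` are hypothetical names, and `GPTBot` is the only token taken from this thread.

```python
# Hypothetical blocklist; extend it with whatever tokens you want to refuse.
BLOCKED_AGENT_TOKENS = ("GPTBot",)


def is_blocked(user_agent):
    """Return True if the User-Agent contains any blocked token (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_TOKENS)


def block_bots(app):
    """WSGI middleware: answer 403 to blocked user agents, pass everything else on."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Wrapping an existing WSGI app would look like `app = block_bots(app)`. It only catches clients that send an honest User-Agent header.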
Can we trust the bots not to fake their identity? 🤔
@aelaraji @prologic Hmm, yeah, looks a bit better than ai.txt / robots.txt, but I wouldn’t trust that they don’t spoof their user agent. 🤔