# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 10
# self = https://watcher.sour.is/conv/sux32qq
The bots have begun to access my website way more often. I’m now getting about 120k hits on https://www.uninformativ.de/git/ within a couple of hours.

They don’t cache anything, probably on purpose.

It comes in waves. I get about 100 hits (all at once) on that /git endpoint, all from different IPs. Then it takes a moment until I get another wave of about 500-1000 requests (all at once), where they do HEAD requests on some of the paths below /git. I assume they did a GET earlier and are now checking whether something has changed (roughly the conditional re-check sketched below).
It doesn’t pose a problem for my server’s performance – yet. But if more bots/companies start doing this, my website will go down from the load.
This probably means that I can no longer host my own website. I don’t want to deploy something like Anubis, because that ruins the whole thing: I want it to be accessible from ancient browsers, like those on OS/2 or Windows 3.11.
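
For what it’s worth, that GET-then-HEAD pattern amounts to a conditional re-check against the validators from an earlier response. A minimal sketch of the idea, assuming the bots compare ETag/Last-Modified (the function and stored values are made up for illustration, not anything they actually run):

```python
# Hypothetical conditional re-check: send a HEAD request and compare the
# validators (ETag / Last-Modified) against those seen on an earlier GET.
import urllib.request

def has_changed(url, last_etag=None, last_modified=None):
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        etag = resp.headers.get("ETag")
        modified = resp.headers.get("Last-Modified")
    if last_etag and etag:
        return etag != last_etag          # unchanged if the ETag still matches
    if last_modified and modified:
        return modified != last_modified  # unchanged if the timestamp still matches
    return True                           # no validators available: assume it changed

# Example: re-check one of the paths below /git.
print(has_changed("https://www.uninformativ.de/git/"))
```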

I’ll keep an eye on it for a while. Maybe try to block some IPs.

Sooner or later, I’ll take the website down and shift everything to Gopher.
Why do I care about this?

1. The load will become a problem at some point.
2. These crawlers and the current “AI” in general are breaking the rules. *I* am supposed to be paying for every little thing, *I* get sued for “piracy”. But apparently, these rules only apply to me. If I had more money, I could break them. Fuck that.
3. I simply don’t want it. Period.
“But all your stuff is MIT licensed! They are allowed to do that!”

Haha. As if they would care. They crawl everything they get their hands on.

Besides, that’s not true: the license states that the copyright notice must be retained. “AI” breaks that. They incorporate my code and my articles into their product and make it appear as if it were their own work.
@movq Right now I'm basically just blocking entire ASNs and large blocks of IPs from Anthropic, OpenAI, Microsoft and others.
@prologic Yeah, I’ve blocked some large subnets now (most likely overblocking a lot of stuff) and it has died down.

I’m not looking forward to doing this on a regular basis. This is supposed to be a fun hobby – and it was, for many years. Maybe that time is just over.
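
A rough sketch of what the ASN/subnet blocking mentioned above boils down to, assuming a plain CIDR blocklist (the ranges below are documentation placeholders; a real setup would do this in the firewall or reverse proxy, not in application code):

```python
# Check a client IP against a list of blocked CIDR ranges.
from ipaddress import ip_address, ip_network

# Placeholder ranges standing in for the large subnets / ASN prefixes
# mentioned above.
BLOCKED_NETS = [
    ip_network("203.0.113.0/24"),
    ip_network("198.51.100.0/24"),
]

def is_blocked(client_ip: str) -> bool:
    addr = ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETS)

print(is_blocked("203.0.113.42"))  # True
print(is_blocked("192.0.2.1"))     # False
```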
As expected: Didn’t last long. They’re coming from different IPs now.

I’ve read enough blog posts by other people to know that this is probably pointless. The bots have *so many* IPs/networks at their disposal …
@movq I heard about a defence against badly-behaved crawlers a while ago: an HTML zip bomb. This post explains how to do it. Essentially, web servers can serve compressed versions of webpages and, with a little trickery, one can replace the compressed page with a different file. After that, any bot that tries to crawl the page will instead download and unpack a zip bomb that will cause it to crash.
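
A minimal sketch of that trick, assuming gzip as the compression: pre-generate a tiny .gz file that expands to gigabytes of zeros, then serve it in place of the real compressed page (sizes and filename here are made up; nginx, for example, can serve pre-compressed files via its gzip_static module):

```python
# Write ~10 GiB of zeros through gzip; the file on disk ends up only a few
# megabytes, but a client that transparently decompresses the response has
# to expand the whole thing.
import gzip

CHUNK = b"\0" * (1024 * 1024)        # 1 MiB of zeros
TARGET = 10 * 1024 * 1024 * 1024     # uncompressed size: 10 GiB

with gzip.open("bomb.html.gz", "wb", compresslevel=9) as f:
    written = 0
    while written < TARGET:
        f.write(CHUNK)
        written += len(CHUNK)
```

Served with “Content-Encoding: gzip”, a crawler that naively unpacks responses ends up trying to hold those 10 GiB.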
@dce Yeah, I’ve read about that approach. Sounds clever. Truth is, I’m too tired. 😢 I don’t want to spend too much of my time fighting assholes.

I’ve now started blocking entire cloud hosters. Sorry, not sorry.