# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 19
# self = https://watcher.sour.is/conv/qed3omq
I just banned 41 bad user agents from accessing any of my services. 😱
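For anyone curious what a user-agent ban boils down to, here is a minimal sketch of the idea in Python (stdlib only): refuse requests whose User-Agent matches a deny list before they reach anything expensive. The agent strings below are well-known AI crawlers used as examples, not the actual 41 that were banned.

```python
# Minimal sketch: reject requests from deny-listed user agents up front.
# BAD_AGENTS is illustrative, not the real banned list.
from http.server import BaseHTTPRequestHandler, HTTPServer

BAD_AGENTS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")  # illustrative

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        if any(bad.lower() in ua.lower() for bad in BAD_AGENTS):
            self.send_error(403)  # banned user agent: reject outright
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()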
Bloody fucking hell. I _think_ one of Google's GenAI crawlers was just hitting my Gitea instance quite hard. Fuck 🤬 Geez
@prologic You might (not) enjoy this blog post: https://pod.geraspora.de/posts/17342163
@movq Yeah it's starting to piss me off too 🤣 Not nearly as much as that guy, but still. Anyway I'm having fun! Now I just need to find a good IP/Subnet list that I can blacklist entirely, ideally one that's updated frequently so I can refresh firewall rules.
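A rough sketch of what that refresh step could look like, assuming a plain-text blocklist of CIDRs (one per line, "#" comments allowed) at a hypothetical URL: validate each entry with the stdlib ipaddress module and emit an nftables-style element list. Actually loading it into the firewall is left out.

```python
# Fetch a CIDR blocklist, validate entries, print nftables set elements.
import ipaddress
import urllib.request

BLOCKLIST_URL = "https://example.com/bad-subnets.txt"  # hypothetical source

def fetch_subnets(url: str) -> list[str]:
    with urllib.request.urlopen(url, timeout=30) as resp:
        lines = resp.read().decode("utf-8", "replace").splitlines()
    subnets = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not entry:
            continue
        try:
            subnets.append(str(ipaddress.ip_network(entry, strict=False)))
        except ValueError:
            continue  # skip malformed entries rather than abort the run
    return subnets

if __name__ == "__main__":
    print("elements = {", ", ".join(fetch_subnets(BLOCKLIST_URL)), "}")
```

Running this on a cron schedule and reloading the resulting set is what makes a frequently updated list practical.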
Did you have a disallow rule in robots.txt? (I think not, because I can google several twtxt.net posts)
@doesnm No. I generally don't put up any robots.txt files at all really, because they mostly get ignored. I don't generally mind if "normal" web crawlers crawl things. But LLM(s) can go fuck themselves 🤣
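For reference, this is the kind of disallow rule being discussed (GPTBot and CCBot are real crawler agent names). A well-behaved bot honors it; the complaint above is precisely that many LLM scrapers don't.

```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```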
@prologic Yeah, robots.txt or ai.txt are not worth the effort. I have them, but they get ignored. Just now, I saw a stupid AI bot hitting one of my blog posts like crazy. Not just once, but hundreds of times, over and over. 🤦🙄
@movq Yeah I swear to god the engineers that write this shit™ don't know how to write distributed crawlers that don't hammer the shit™ out of their targets 🤦‍♂️
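For contrast, the polite version isn't hard. A minimal sketch (stdlib Python; MIN_DELAY is an illustrative value): track the last request time per host and sleep so no single host gets hit more than once every few seconds.

```python
# Per-host rate limiting: the difference between a crawler and a hammer.
import time
import urllib.request
from urllib.parse import urlparse

MIN_DELAY = 5.0  # illustrative: seconds between requests to the same host
_last_hit: dict[str, float] = {}

def polite_get(url: str) -> bytes:
    host = urlparse(url).netloc
    wait = _last_hit.get(host, 0.0) + MIN_DELAY - time.monotonic()
    if wait > 0:
        time.sleep(wait)  # back off instead of hammering the same host
    _last_hit[host] = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()

if __name__ == "__main__":
    for url in ("https://example.com/a", "https://example.com/b"):
        polite_get(url)  # the second call waits ~MIN_DELAY before firing
```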