# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 60515
# self = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=59215
# next = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=59315
# prev = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=59115
@bender So you mean, get fail2ban to look at my Caddy audit logs for violations and then just block at the firewall level for repeated violations? 🤔
@kat token will still be valid 👌
@kat 🙌
@kat Yeah that's what the admin function does. Normal user password reset is different but requires working email 🤣
@kat Speaking of KVM, Tiny Pilot and Jet KVM look really good!
@kat It'll be whatever the actual server's time zone is.
@kat Temporarily change the admin account on your pod to another account. Then log in with that and reset the password on your main account.
What didn't work? Hmmm 🤔
Hmm? 🤔
@seabirdie 👋 Welcome to Yarn.social 🙌
@kat Haha 🤣
Also yarnd supports video too 🤣
@kat Thanks! I built my own video hosting platform too but not nearly as fancy as what you use 🤣
@ 👋 Welcome to Yarn.social 🙌
@bender We're talking about the Web, right? 🤣
@aelaraji Nice! 🙌
@bender You're right, the scale wasn't that large, but after analyzing the logs, it definitely was a DDoS attack. 🤣 I woke up this morning to see six other small spikes like this, which I'll have to analyze later tonight…
@movq Yes
@kat What do you use for this btw? 🤔
So I need to figure out how to block ASN(s)...

Additionally, I'm thinking about how to detect DDoS attacks.

Here's one way I've come up with that's quite simple:

> Detecting DDoS attacks by tracking requests across multiple IPs in a sliding window. If total requests exceed a threshold in a given time, flag as potential DDoS.
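
A minimal sketch of that sliding-window idea, assuming the access log has already been reduced to `epoch_seconds client_ip` lines sorted by time. The function name, input format and message text are made up for illustration and are not taken from any real tool:

```shell
#!/bin/sh
# detect_flood WINDOW_SECONDS THRESHOLD
# Hypothetical helper: reads "epoch_seconds client_ip" lines (sorted by time)
# on stdin and prints a line whenever the total number of requests seen in the
# last WINDOW_SECONDS, across all IPs, exceeds THRESHOLD.
detect_flood() {
    awk -v window="$1" -v threshold="$2" '
    BEGIN { head = 1 }
    {
        # Append the new request to the window.
        tail++
        ts[tail] = $1 + 0
        ip[tail] = $2
        if (count[$2]++ == 0) distinct++

        # Evict requests that have fallen out of the window.
        while (ts[head] <= $1 - window) {
            if (--count[ip[head]] == 0) distinct--
            delete ts[head]
            delete ip[head]
            head++
        }

        total = tail - head + 1
        if (total > threshold)
            printf "%s potential DDoS: %d requests from %d IPs in the last %ds\n", $1, total, distinct, window
    }'
}
```

Something like `detect_flood 10 1000 < requests.txt` (a hypothetical input file) would then print a line every time more than 1000 requests landed within any 10 second span. The window and threshold are guesses that need tuning against normal traffic.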
@lyse Cool 👌
Hmmm so I've sustained two DDoS attacks on my Gitea server today. A few hours apart. Still analyzing the traffic...
For the time being... I've just blocked all of OpenAI's bots. They (_thankfully_) publish a JSON endpoint that you can use to block all OpenAI crawlers from reaching your server (_in my case, blocking it at the edge_). Example:


proxy-1:~# curl -qs https://openai.com/gptbot.json | jq -r '.prefixes[].ipv4Prefix' | xargs -I{} ./block-ip.sh {}


Where block-ip.sh is simply:


#!/bin/sh

ufw insert 1 deny from "$1" to any
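
Since that pipeline feeds remotely fetched data straight into a script that changes firewall rules, it may be worth checking each entry first. A rough sketch, assuming one prefix per line on stdin; the function name is hypothetical, the regex only checks the general shape of an IPv4 CIDR, and the function prints the `ufw` commands instead of running them:

```shell
#!/bin/sh
# ufw_rules_for_prefixes (hypothetical helper)
# Reads one IPv4 prefix per line on stdin and prints, without executing it,
# a ufw deny rule for every entry that looks like an IPv4 CIDR.
# Anything else is reported on stderr and skipped.
ufw_rules_for_prefixes() {
    while IFS= read -r prefix; do
        # Shape check only: four dotted numbers plus a /0../32 suffix.
        if printf '%s\n' "$prefix" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$'; then
            printf 'ufw insert 1 deny from %s to any\n' "$prefix"
        else
            printf 'skipping unexpected entry: %s\n' "$prefix" >&2
        fi
    done
}
```

The `jq` output from the one-liner above could be piped into `ufw_rules_for_prefixes` to review the generated rules before applying them.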
@aelaraji Yes! 👍 This is exactly what it is! 🤣 I will of course soon™ be hosting this service, likely at validator.twtxt.net 😅😅
@kat Haha 🤣 If someone figures this out, please let me know 🙏🙏 -- In the meantime, I'm going to very soon™ write a daemon that will watch the audit log for repeated violations and add the offenders to the network firewall.
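
The core of such a daemon could be as small as counting violations per client. A sketch under the assumption that the audit log has already been boiled down to one client IP per logged violation; how to extract that depends on the audit log format, so it is left out here, and the function name is made up:

```shell
#!/bin/sh
# repeat_offenders THRESHOLD (hypothetical helper)
# Reads one client IP per line on stdin, one line per logged violation,
# and prints every IP that appears at least THRESHOLD times.
repeat_offenders() {
    sort | uniq -c | awk -v min="$1" '$1 >= min { print $2 }'
}
```

Running this periodically over the recent part of the log and feeding the result to a firewall script would approximate the daemon without any long-running process.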
This is better:


proxy-1:~# ./audit-log-by-ip.sh 4.227.36.76 | coraza-log-formatter -m -
2025/01/04 23:17:04 4.227.36.76 58982 GET /external?aff-HY0BLO=&f=mediaonly&f=noreplies&nick=g1n&uri=https%3A%2F%2Fthe-president-codes.linegames.org null 0  On OWASP_CRS/4.7.0
Actionset: OWASP_CRS/4.7.0
Message: Bad User Agent
Severity: 0
Raw: SecRule REQUEST_HEADERS:User-Agent "@pmFromFile /etc/caddy/waf/bad_user_agents.txt" "id:2000,log,phase:1,deny,msg:'Bad User Agent'"
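
As far as I understand it, `@pmFromFile` does a case-insensitive phrase match of the header value against every line of the given file. A rough shell approximation for testing a list by hand, where the function name and the list contents in the usage note are only examples:

```shell
#!/bin/sh
# is_bad_user_agent USER_AGENT PHRASE_FILE (hypothetical helper)
# Succeeds if USER_AGENT contains any phrase from PHRASE_FILE, ignoring case.
# This only approximates the WAF rule: an empty line in PHRASE_FILE would
# match every user agent here, so keep the file free of blank lines.
is_bad_user_agent() {
    printf '%s\n' "$1" | grep -qiF -f "$2"
}
```

With a list containing `GPTBot`, a call like `is_bad_user_agent 'Mozilla/5.0 (compatible; gptbot/1.0)' bad_user_agents.txt` should succeed, while an ordinary browser user agent should not.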
Nice! I wrote another useful tool 👌


proxy-1:~# ./audit-log-by-ip.sh 4.227.36.76 | coraza-log-formatter -m -
Actionset: OWASP_CRS/4.7.0
Message: Bad User Agent
Severity: 0
Raw: SecRule REQUEST_HEADERS:User-Agent "@pmFromFile /etc/caddy/waf/bad_user_agents.txt" "id:2000,log,phase:1,deny,msg:'Bad User Agent'"
How in da fuq do you _actually_ make these fucking useless AI bots go away?


proxy-1:~# jq '. | select(.request.remote_ip=="4.227.36.76")' /var/log/caddy/access/mills.io.log | jq -s '. | last' | caddy-log-formatter -
4.227.36.76 - [2025-01-05 04:05:43.971 +0000] "GET /external?aff-QNAXWV=&f=mediaonly&f=noreplies&nick=g1n&uri=https%3A%2F%2Fmy-hero-ultra-impact-codes.linegames.org HTTP/2.0" 0 0
proxy-1:~# date
Sun Jan  5 04:05:49 UTC 2025


😱
Done.
@lyse Oh good! It works haha 🤣 I'll bump it up a bit 👌
And now I've applied rate limits on every site to reasonable values 👌
@bender Isn't that why I'm yarning my progress? 🤣
@kat I've actually moved most of my stuff off of Cloudflare now 🤣 I'm actually very happy with my edge proxy setup that reverse proxies, caches and acts as a web application firewall 🥳
@kat Have you seen the SSG that I built and use on all my static sites? zs 🤔
Oh gawd. I can't enable caching on my edge proxy everywhere 😱 Some shit™ doesn't deal with a caching reverse proxy in front of it very well, for some reason I don't have time to dig into right now 🤔
What's a reasonable per second or per minute rate limit that I could apply in general at my edge proxy for all clients? (_no matter what_) ... Like a good reasonable upper bound? 🤔
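
One way to put numbers behind that question is to measure what real clients actually do before picking a limit. A sketch, again assuming `epoch_seconds client_ip` lines on stdin and using a made-up function name, that reports each client's busiest calendar minute:

```shell
#!/bin/sh
# peak_per_minute (hypothetical helper)
# Reads "epoch_seconds client_ip" lines on stdin and prints "peak ip" pairs,
# where peak is the highest number of requests that client made within any
# single calendar minute, busiest clients first.
peak_per_minute() {
    awk '
    {
        # Bucket each request by client and by minute.
        key = $2 SUBSEP int($1 / 60)
        n = ++hits[key]
        if (n > peak[$2]) peak[$2] = n
    }
    END { for (ip in peak) print peak[ip], ip }' | sort -rn
}
```

A limit set comfortably above the peaks of well-behaved clients, and well below those of the crawlers, is one plausible starting point. Fixed one-minute buckets can understate bursts that straddle a minute boundary.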
@movq Yeah I swear to god the engineers that write this shit™ don't know how to write distributed crawlers that don't hammer the shit™ out of their targets 🤦‍♂️
@doesnm No. I generally don't put up any robots.txt files at all really, because they mostly get ignored. I don't generally mind if "normal" web crawlers crawl things. But LLM(s) can go fuck themselves 🤣
@movq Yeah it's starting to piss me off too 🤣 Not nearly as much as that guy, but still. Anyway I'm having fun! Now I just need to find a good IP/Subnet list that I can blacklist entirely, ideally one that's updated frequently so I can refresh firewall rules.
Bloody fucking hell. I _think_ one of Google's GenAI crawlers was just hitting my Gitea instance quite hard. Fuck 🤬 Geez
@movq Oh 🤦‍♂️
I just banned 41 bad user agents from accessing any of my services. 😱
@movq How do you manage to get those skylines on your photos? 🤔
@doesnm No, it's only designed for yarnd. What did you have in mind here? 🤔
@doesnm It is the same API that yarnc the command-line client uses.
i.e: Not much point in running a WAF on a static site. But OTOH if there's enough abuse from shitty assholes, there might be 🤔🤔
I'm just basically learning now how ModSecurity rules work and how to write my own.

The builtin OWASP rules are already working nicely 👌 -- And yeah I won't include the WAF on every site block, probably just my main/primary domain where I tend to run demo services and other things.
@kat If you've been following my yarns the other day about me getting off of Clownflare and building my own WAF, Proxy and effectively my own Edge network, you'll know I'm doing this at the very edge 🤣🤣
Having a lot of fun with Coraza today. A Web Application Firewall library written in Go that also happens to have a Caddy module.