# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 196320
# self = https://watcher.sour.is?offset=166899
# next = https://watcher.sour.is?offset=166999
# prev = https://watcher.sour.is?offset=166799
@stigatle @prologic testing 1 2 3 can either of you see this?
Hmm, I wonder if I banned too many IPs and caused these issues for myself 😆
twts are taking a very long time to post from yarn after the latest upgrade. Like a good 60 seconds.
@prologic I don't know if this is new, but I'm seeing:


Jul 25 16:01:17 buc yarnd[1921547]: time="2024-07-25T16:01:17Z" level=error msg="https://yarn.stigatle.no/user/stigatle/twtxt.txt: client.Do fail: Get \"https://yarn.stigatle.no/user/stigatle/twtxt.txt\": dial tcp 185.97.32.18:443: i/o timeout (Client.Timeout exceeded while awaiting headers)" error="Get \"https://yarn.stigatle.no/user/stigatle/twtxt.txt\": dial tcp 185.97.32.18:443: i/o timeout (Client.Timeout exceeded while awaiting headers)"


I no longer see twts from @stigatle at all.
[47°09′21″S, 126°43′24″W] Reading: 1.12 Sv
@prologic Have you been seeing any of my replies?
@abucci / @abucci Any interesting errors pop up in the server logs since the flaw got fixed (_unbounded receiveFile()_)? 🤔
Hmmm 🧐


for url in $(jq -r '.Twters[].avatar' cache.json | sed '/^$/d' | grep -v -E '(twtxt.net|anthony.buc.ci|yarn.stigatle.no|yarn.mills.io)' | sort -u); do echo "$url $(curl -I -s -o /dev/null -w '%header{content-length}' "$url")"; done
...


😅 Let's see... 🤔
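A possible refinement of that loop, to surface only the suspiciously large avatars; the 1 MiB cutoff is arbitrary and cache.json is the same dump as above:

for url in $(jq -r '.Twters[].avatar' cache.json | sed '/^$/d' | grep -v -E '(twtxt.net|anthony.buc.ci|yarn.stigatle.no|yarn.mills.io)' | sort -u); do
  # headers only; default to 0 when no Content-Length comes back
  size=$(curl -I -s -o /dev/null -w '%header{content-length}' "$url")
  [ "${size:-0}" -gt 1048576 ] 2>/dev/null && echo "$size $url"
done | sort -rn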
It shows up in my twtxt feed so that's good.
@movq My issue is, now that we have the chance of getting something fast, people artificially slow it down again. Whether they think it's cool that they added some slow animation, or it's just a lack of knowledge, or whatever. The absolute performance does not translate to the relative performance that I observe. Completely wasted potential. :-(

In today's economy, nobody optimizes something if it can just be called good enough with the next generation of hardware. That's especially the mindset of big corporations.

Anyway, getting sidetracked from the original post. :-)
@prologic will do, thanks for the tip!
This is a test. I am not seeing twts from @stigatle and it seems like @prologic might not be seeing twts from me. Do people see this?
@prologic I am not seeing twts from @stigatle anymore. Are you seeing twts from me?
@stigatle The one you sent is fine. I'm inspecting it now. I'm just saying, do yourself a favor and nuke your pod's garbage cache 🤣 It'll rebuild automatically in a much more pristine state.
@prologic you want a new cache from me - or was the one I sent OK for what you needed?
That was also a source of abuse that got plugged (_being able to fill up the cache with garbage data_)
Ooof


$ jq '.Feeds | keys[]' cache.json | wc -l
4402


If you both don't mind dropping your caches, I would recommend it: Settings -> Poderator Settings -> Refresh cache.
@prologic

./tools/dump_cache.sh: line 8: bat: command not found
No Token Provided



I don't have bat on my VPS and there is no package for installing it. Is cat a reasonable alternative?
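If bat is only being used to pretty-print the output in that script (an assumption, not verified), one quick and admittedly hacky workaround is to point it at cat instead:

sed -i 's/\bbat\b/cat/g' tools/dump_cache.sh   # GNU sed; swap bat for cat just in this script
git checkout -- tools/dump_cache.sh            # undo the local tweak afterwards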
@prologic No worries, thanks for working on the fix for it so fast :)
@prologic Yup. Didn't regret climbing these three hundred odd meters of elevation. :-)
@stigatle Thank you! 🙏
@prologic Try hitting this URL:

https://twtxt.net/external?nick=nosuchuser&uri=https://foo.com

Change nosuchuser to any phrase at all.

If you hit https://twtxt.net/external?nick=nosuchuser, you're given an error. If you hit that URL above with the uri parameter, you get a legitimate-looking page. I think that is a bug.
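For what it's worth, the difference is easy to see from the command line; the status codes are whatever the pod actually returns, not pinned down here:

# bogus nick alone: error page
curl -s -o /dev/null -w '%{http_code}\n' 'https://twtxt.net/external?nick=nosuchuser'
# bogus nick plus a uri: a full, legitimate-looking page gets rendered
curl -s -o /dev/null -w '%{http_code}\n' 'https://twtxt.net/external?nick=nosuchuser&uri=https://foo.com'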
@prologic here you go:
https://drive.proton.me/urls/XRKQQ632SG#LXWehEZMNQWF
@stigatle Ta. I hope my theory is right 😅
@prologic Hitting that URL returns a bunch of HTML even though there is no user named lovetocode999 on my pod. I think it should 404, and maybe with a delay, to discourage whatever this abuse is. Basically this can be used to DDoS a pod by forcing it to generate a bunch of HTML just by doing a bogus GET like this.
@prologic thank you. I ran it now as you said; I'll get the files put somewhere shortly.
But just have a look at the yarnd server logs too. Any new interesting errors? 🤔 No more multi-GB tmp files? 🤔
@stigatle You want to run backup_db.sh and dump_cache.sh. They pipe JSON to stdout and prompt for your admin password. Example:


URL=<your_pod_url> ADMIN=<your_admin_user> ./tools/dump_cache.sh > cache.json
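Presumably backup_db.sh takes the same environment variables and also writes JSON to stdout (not verified), so something along these lines:

URL=<your_pod_url> ADMIN=<your_admin_user> ./tools/backup_db.sh > backup.json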
I'm seeing GETs like this over and over again:

"GET /external?nick=lovetocode999&uri=https://vuf.minagricultura.gov.co/Lists/Informacin%20Servicios%20Web/DispForm.aspx?ID=8375144 HTTP/1.1" 200 35861 17.077914ms


always to nick=lovetocode999, but with different uris. What are these calls?
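One way to get a feel for what's actually being requested, assuming the pod logs to the systemd journal under a unit called yarnd (adjust to your setup):

# tally the distinct uri= targets behind those /external requests
journalctl -u yarnd --since today | grep -o 'nick=lovetocode999&uri=[^ "]*' | sort | uniq -c | sort -rn | head -20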
@stigatle Worky, worky now! :-)

Mate, these are some really nice gems! What a stunning landscape. I love it. Holy cow, that wooden church looks really sick. Even though I'm not a scroll guy and prefer simple, straight designs, I have to say that the interior craftsmanship is something to admire.
@prologic so, if I'm correct the dump tool made a pods.txt and a stats.txt file, those are the ones you want? or do you want the output that it spits out in the console window?
Just thinking out loud here... With that PR merged (_or if you built off that branch_), you _might_ hopefully see new errors pop up and we might catch this problematic bad feed in the act? Hmmm 🧐
@slashdot I _thought_ Sunday was the hottest day on Earth 🤦‍♂️ wtf is wrong with Slashdot these days?! 🤣
If we can figure out wtf is going on here and my theory is right, we can blacklist that feed, hell, even add it to the codebase as an "asshole".
@stigatle The problem is it'll only cause the attack to stop and error out. It won't stop your pod from trying to do this over and over again. That's why I need some help inspecting both your pods for "bad feeds".
@prologic I'm running it now. I'll keep an eye out for the tmp folder now (I built the branch you have made). I'll let you know shortly if it helped on my end.
@abucci / @stigatle Please git pull, rebuild and redeploy.

There is also a shell script in ./tools called dump_cache.sh. Please run this, dump your cache and share it with me. 🙏
I'm going to merge this...
@abucci Yeah I've had to block entire ASN(s) recently myself from bad actors, mostly bad AI bots actually, from Facebook and Claude AI
@stigatle I used the following hack to keep my VPS from running out of space: watch -n 60 rm -rf /tmp/yarnd-avatar-*, run in tmux so it keeps running.
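A slightly gentler variant of the same hack, which only removes temp files that have been sitting there for more than ten minutes (so in-flight downloads are less likely to get yanked away); the threshold is arbitrary:

# e.g. from cron every few minutes, or inside the same watch/tmux setup
find /tmp -maxdepth 1 -name 'yarnd-avatar-*' -mmin +10 -delete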
The vast majority of this traffic was coming from a single IP address. I blocked that IP on my VPS, and I sent an abuse report to the abuse email of the service provider. That ought to slow it down, but the vulnerability persists and I'm still getting traffic from other IPs that seem to be doing the same thing.
Or if y'all trust my monkey-ass coding skillz I'll just merge and you can do a git pull and rebuild 😅
@stigatle / @abucci My current working theory is that there is an asshole out there that has a feed that both your pods are fetching with a multi-GB avatar URL advertised in their feed's preamble (metadata). I'd love for you both to review this PR, and once merged, re-roll your pods and dump your respective caches and share with me using https://gist.mills.io/
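Purely to illustrate that theory, the offending feed's preamble might look something like this (entirely made-up values):

# nick   = some-innocent-looking-feed
# url    = https://example.com/twtxt.txt
# avatar = https://example.com/definitely-not-an-avatar.bin  <- responds with endless / multi-GB data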
@prologic yeah I still do have that issue, I compiled latest main, did not apply any patches or anything like that.
@stigatle I'm wondering whether you're having the same issue as @abucci still? Multi-GB yarnd-avatar-* files piling up in /tmp/? 🤔
@prologic yeah, I ran out of space again. also have the activitypub stuff turned off (just so you know).
@abucci So... The only way I see this happening at all is if your pod is fetching feeds which have multi-GB sized avatar(s) in their feed metadata. So the PR I linked earlier will plug that flaw. But now I want to confirm that theory. Can I get you to dump your cache to JSON for me and share it with me?
@abucci Yeah that should be okay, you get so much crap on the web 🤦‍♂️
@abucci sift is a tool I use for grep/find, etc.

> What would you like to know about the files?

Roughly what their contents are. I've been reviewing the code paths responsible and have found a flaw that needs to be fixed ASAP.

Here's the PR: https://git.mills.io/yarnsocial/yarn/pulls/1169
@prologic There are *a lot* of logs being generated by yarnd, which is also something I haven't seen before:


Jul 25 14:32:42 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:42 (162.211.155.2) "GET /twt/ubhq33a HTTP/1.1" 404 29 643.251µs
Jul 25 14:32:43 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:43 (162.211.155.2) "GET /twt/112073211746755451 HTTP/1.1" 400 12 505.333µs
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (111.119.213.103) "GET /twt/whau6pa HTTP/1.1" 200 37360 35.173255ms
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (162.211.155.2) "GET /twt/112343305123858004 HTTP/1.1" 400 12 455.069µs
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (168.199.225.19) "GET /external?nick=lovetocode999&uri=http%3A%2F%2Fwww.palapa.pl%2Fbaners.php%3Flink%3Dhttps%3A%2F%2Fwww.dwnewstoday.com HTTP/1.1" 200 36167 19.582077ms
Jul 25 14:32:44 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:44 (162.211.155.2) "GET /twt/112503061785024494 HTTP/1.1" 400 12 619.152µs
Jul 25 14:32:46 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:46 (162.211.155.2) "GET /twt/111863876118553837 HTTP/1.1" 400 12 817.678µs
Jul 25 14:32:46 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:46 (162.211.155.2) "GET /twt/112749994821704400 HTTP/1.1" 400 12 540.616µs
Jul 25 14:32:47 buc yarnd[1911318]: [yarnd] 2024/07/25 14:32:47 (103.204.109.150) "GET /external?nick=lovetocode999&uri=http%3A%2F%2Fampurify.com%2Fbbs%2Fboard.php%3Fbo_table%3Dfree%26wr_id%3D113858 HTTP/1.1" 200 36187 15.95329ms


I've seen that nick=lovetocode999 a bunch.
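To get a sense of how many distinct clients are behind this (again assuming the journal unit is called yarnd, which may differ on your box):

# count requests per source IP for the lovetocode999 pattern
journalctl -u yarnd --since today | grep 'nick=lovetocode999' | grep -oE '\([0-9.]+\)' | tr -d '()' | sort | uniq -c | sort -rn | head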
@prologic Inspect? What's sift? What would you like to know about the files?
@abucci I believe you are correct.
@abucci That's fucking insane 😱 I know which code-path is triggering this, but need to confirm a few other things... Some correlation with logs would also help...
Do you happen to have the activitypub feature turned on btw? In fact could you just list out what features you have enabled please? 🙏
@prologic 10 Gbytes has accumulated since I made that last post. It's coming in at a rate of 55 Mbits/second!
These should be getting cleaned up, but I'm very concerned about the sizes of these 🤔

https://git.mills.io/yarnsocial/yarn/src/commit/983fa87d4ea17f76537e19714ad8a6d19ba9d904/internal/utils.go#L658-L670
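To illustrate the concern (the real fix belongs in the Go code around receiveFile): the point is to refuse to pull more than a sane number of bytes for an avatar at all. curl can approximate the idea; the 1 MiB cap and $url are placeholders:

# abort the transfer once the body exceeds ~1 MiB
curl -s --max-filesize 1048576 -o /tmp/avatar-probe "$url" || echo "refused or aborted: $url exceeds the cap"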
Hah 😈


prologic@JamessMacStudio
Fri Jul 26 00:22:44
~/Projects/yarnsocial/yarn
 (main) 0
$ sift 'yarnd-avatar-*'
internal/utils.go:666:	tf, err := receiveFile(res.Body, "yarnd-avatar-*")


@abucci Don't suppose you could inspect one of those files, could you? Kinda wondering if there's some other abuse going on here that I need to plug? 🔌
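If it helps, a quick way to peek at one of those temp files without opening gigabytes in an editor; the filename is just one example taken from the listing further down:

# what does the payload claim to be, and what do the first bytes look like?
file /tmp/yarnd-avatar-2334738193
head -c 256 /tmp/yarnd-avatar-2334738193 | xxd | head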
@prologic I think there's more to it than that. I've updated, yet hundreds of gigabytes of junk is still accumulating.
@abucci Hmm that's a bit weird then. Lemme have a poke.
@prologic I'm still getting this crap:

abucci@buc:~/yarnd/yarn$ ls -lh /tmp/yarnd-avatar-*
-rw------- 1 abucci abucci 863M Jul 25 14:19 /tmp/yarnd-avatar-1594499680
-rw------- 1 abucci abucci 7.8G Jul 25 14:19 /tmp/yarnd-avatar-2144295337
-rw------- 1 abucci abucci 9.8G Jul 25 14:19 /tmp/yarnd-avatar-2334738193
-rw------- 1 abucci abucci  10G Jul 25 14:14 /tmp/yarnd-avatar-2494107777
-rw------- 1 abucci abucci 9.5G Jul 25 13:59 /tmp/yarnd-avatar-2619243454
-rw------- 1 abucci abucci  11G Jul 25 14:04 /tmp/yarnd-avatar-2922187513
-rw------- 1 abucci abucci 7.5G Jul 25 14:14 /tmp/yarnd-avatar-349775570
-rw------- 1 abucci abucci  10G Jul 25 14:09 /tmp/yarnd-avatar-3640724243
-rw------- 1 abucci abucci 901M Jul 25 14:19 /tmp/yarnd-avatar-3921595598
-rw------- 1 abucci abucci 9.5G Jul 25 13:59 /tmp/yarnd-avatar-609094539
-rw------- 1 abucci abucci 9.3G Jul 25 14:04 /tmp/yarnd-avatar-755173392
-rw------- 1 abucci abucci 7.9G Jul 25 14:09 /tmp/yarnd-avatar-984061000


Something like 100 Gbytes of this junk has accumulated since I updated and re-started the server. I'm now running the latest version of yarnd, so the update did not fix the problem. Something else is going wrong.

How are temporary files growing to 10 Gbytes in size? The files are named "yarnd-avatar-*", but why would avatars be so large?
Hmm, removed the cpu limits on this pod, not even sure why I had 'em set tbh. We decided at my day job that setting cpu limits on containers is a bit of a silly idea too. Anyway, pod should be much snappier now 😅
@movq Oh nothing much 🤣 Just a bunch of folks running really old versions of yarnd that were susceptible to abuse on the open web 🤣
What the heck is going on here today, so many messages. 😂