# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 60515
# self = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=57115
# next = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=57215
# prev = https://watcher.sour.is?uri=https://twtxt.net/user/prologic/twtxt.txt&offset=57015
👋 Reminder folks of the upcoming Yarn.social monthly online meetup:

I hope to see @david @movq @lyse @xuu @sorenpeter and hopefully others too @aelaraji @falsifian and anyone else that sees this! 🙏 We're _hopefully_ going to primarily discuss the future of Twtxt and the last few weeks of discussions 🤣

- Event: Yarn.social Online Meetup
- When: 28th September 2024 at 12:00pm UTC (midday)
- Where: Mills Meet : Yarn.social
- Cadence: 4th Saturday of every Month

Agenda:

- Let's talk about the upcoming changes to the Twtxt spec(s)
- See #xgghhnq

#Yarn.social #Meetup
My Position on the last few weeks of Twtxt spec discussions:

- We increase the Hash length from 7 to 11.
- We formalise the Update Commands extension.
- We amend the Twt Hash and Metadata extension to state:

> Feed authors that wish to change the location of their feed (_once Twts have been published_) must append a new # url = comment to their feed to indicate the new location and thus change the "Hashing URI" used for Twts from _that_ point onward.

This has implications for the "order" of a feed, and we should do one of two things, either:

- Mandate that feeds are append-only.
- Or amend the Metadata spec with a new field that denotes the order of the feed so clients can make sense of "inline" comments in the feed. -- This would also imply that the default order is (_of course_) append-only. Suggestion: # direction = [append|prepend]
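To make the hash-length and "Hashing URI" points concrete, here is a rough sketch of how a Twt hash is derived, as I understand the Twt Hash extension (BLAKE2b-256 over URL, timestamp and content, Base32 without padding, lowercased, last N characters). The feed URLs, timestamp and text are made up for illustration, and `twt_hash` is a hypothetical helper, not code from yarnd.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, content: str, length: int = 7) -> str:
    """Content-addressed Twt hash over the feed URL, timestamp and content."""
    payload = f"{feed_url}\n{timestamp}\n{content}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Base32, lowercase, padding stripped; the hash is the last `length` chars.
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    return encoded[-length:]


# Hypothetical feed that moved: Twts written after the new "# url =" comment
# would be hashed against the new URL from that point onward.
old_url = "https://example.com/twtxt.txt"
new_url = "https://example.org/twtxt.txt"
ts = "2024-09-28T12:00:00Z"
text = "Hello Yarn!"

short = twt_hash(old_url, ts, text, length=7)
longer = twt_hash(old_url, ts, text, length=11)
moved = twt_hash(new_url, ts, text, length=11)
print(short, longer, moved)
```

Because the hash is a suffix of the same encoded digest, going from 7 to 11 characters only lengthens existing hashes. The same Twt hashed against a different URL gets a different hash, which is why clients need to know where in the feed the URL changed.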
I finally decided to do a few experiments with yarnd to see how many things would break and how many assumptions there are around the idea of "Content Addressing"; here's where I'm at so far:

- What breaks

Basically I'm at a point where spending time on this is going to provide very little value. There are assumptions made in the lextwt parser, assumptions made in yarnd, assumptions in the way storage is done, and in the way threading works and things are looked up. The implications of changing the way Twts are identified here to be "location addressed" are so far-reaching that I'm quite worried about the amount of effort that would be required to change yarnd here.

@mckinley Yes I have, however I'm not counting that because even using "Cloud" is not labor free.
@aelaraji We figured it out 🤣 @lyse's little hack was good but only temporary 🤣
@sorenpeter Kind of agree with dealing with this kind of social nonsense which we've all done in the past 🤣
@movq I think your scenario doesn't account for clients and their storage. The scenario described only really affects clients that come along later. Even then they would also be able to re-fetch missing Twts from peers or even a search engine to fill in the gaps.
@movq That's kind of a problem though, right?
@david 🤣🤣🤣
I just realized the other big property you lose is:

> What if someone completely changes the content of the root of the thread?

Does the Subject reference the feed and timestamp only or the intent too?
@bender Yeah I'll be honest here; I'm not going to be very happy if we go down this "location addressing" route;

- Twt Subjects lose their meaning.
- Twt Subjects cannot be verified without looking up the feed.
- Which may or may not exist anymore or may change.
- Two persons cannot reply to a Twt independently of each other anymore.

_and probably some other properties we'd stand to lose that I'm forgetting about..._
@movq One of the biggest reasons I don't like the (replyto:…) proposal (_location addressing vs. content addressing_) is that you just introduce a similar problem down the track, albeit a rarer one: if a feed changes its location, your thread's "identifiers" are no longer valid, unless those feed authors maintain strict URL redirects, etc. This potentially has the long-term effect of being rather fragile, as opposed to what we have now, where an Edit just causes a natural fork in the thread, which is how "forking" works in the first place.

I realise this is a bit pret here, and it probably doesn't matter a whole lot at our size. But I'm trying to think way ahead, to a point where Twtxt as a "thing" can continue to work and function decades from now, even with the extensions we've built. We've already proven for example that Twts and threads from ~4 years ago still work and are easily looked up haha 😝
I just read the primary spec I'm strongly in support of and it's pretty rock solid for me 👌 💯
Do you recall what it was? I blame my maintenance window 🪟
@bender Hmm what you replied to appears to be non-existent: https://twtxt.net/twt/pqst4ea
@movq I just saw these come through! 🙏 Thank you very much, I'll definitely have a read tomorrow! 👌
@bender Which reply was that? 🤔
@bender Bahahahahaha 🤣
Ever wondered what it would cost to self-host vs. use the cloud? Well, I often doubt myself every time I look at hardware prices, and I know I have to do some hardware refresh soon™ for the Mills DC (_something I don't have a regular plan or budget for_), so here's a rough ballpark:

The Mills DC has cost me around ~$15k to build and maintain over the last ~10 years or so. Roughly speaking. I've never actually taken a Bill of Materials or anything, but I could if anyone is interested in more specifics.

The equivalent of resources if run in the "Cloud" would cost around:

- ~$1,000 for virtual machines
- ~$1,000 for storage

So around ~$2,000/month to run.

Keep this in mind anytime anyone ever tries to con you into believing "Cloud is cheaper". It's not.
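For anyone who wants the comparison on the same time scale, here is the back-of-the-envelope arithmetic using only the rough figures quoted above (~$15k over ~10 years vs. ~$2,000/month). It is a sketch, not an accounting: power, bandwidth and labour are not included on either side.

```python
# Rough figures quoted in the post; every number here is approximate.
self_host_total = 15_000   # USD, build + maintenance over ~10 years
years = 10
cloud_per_month = 2_000    # USD, quoted estimate for equivalent resources

months = years * 12
self_host_per_month = self_host_total / months
cloud_total = cloud_per_month * months
ratio = cloud_total / self_host_total

print(f"self-hosted: ~${self_host_per_month:,.0f}/month")
print(f"cloud:       ~${cloud_total:,} over {years} years")
print(f"ratio:       ~{ratio:.0f}x")
```

On these numbers self-hosting works out to roughly $125/month against $2,000/month, so about 16x in favour of self-hosting before counting the items left out.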
@aelaraji This is one of the reasons why yarnd has a couple of settings with some sensible/sane defaults:

> I could already imagine a couple of extreme cases where, somewhere, in this peaceful world one’s exercise of freedom of speech could get them in Real trouble (if not danger) if found out, it wouldn’t necessarily have to involve something to do with Law or legal authorities. So, if someone asks, and maybe fearing for… let’s just say ‘Their well being’, would it hurt if a pod just purged their content if it’s serving it publicly (maybe relay the info to other pods) and call it a day? It doesn’t have to be about some law/convention somewhere … 🤷 I know! Too extreme, but I’ve seen news of people who’d gone to jail or got their lives ruined for as little as a silly joke. And it doesn’t even have to be about any of this.

There are two settings:


$ ./yarnd --help 2>&1 | grep max-cache
      --max-cache-fetchers int        set maximum numnber of fetchers to use for feed cache updates (default 10)
  -I, --max-cache-items int           maximum cache items (per feed source) of cached twts in memory (default 150)
  -C, --max-cache-ttl duration        maximum cache ttl (time-to-live) of cached twts in memory (default 336h0m0s)


So yarnd pods by default are designed to only keep Twts publicly visible on the anonymous Frontpage, the Discover View, your Timeline or the feed's Timeline for up to 2 weeks, with a maximum of 150 items per feed, whichever is exceeded first. Any Twts beyond this are considered "old" and drop off the active cache.

It's a feature that my old man @off_grid_living was very strongly in support of, as was I back in the day of yarnd's design (_nothing particularly to do with Twtxt per se_) that I've to this day stuck by -- Even though there are _some_ 😉 that have different views on this 🤣
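The "2 weeks or 150 items, whichever is exceeded first" rule can be sketched roughly like this. This is not yarnd's code (yarnd is written in Go). `prune` and the tuple layout are hypothetical, and only the two default values are taken from the help output above.

```python
from datetime import datetime, timedelta, timezone

MAX_ITEMS = 150                 # mirrors the --max-cache-items default
MAX_TTL = timedelta(hours=336)  # mirrors the --max-cache-ttl default (2 weeks)


def prune(twts, now, max_items=MAX_ITEMS, max_ttl=MAX_TTL):
    """Keep the newest Twts of one feed that fit both the TTL and the item cap.

    `twts` is a list of (created_at, text) tuples for a single feed.
    """
    fresh = [t for t in twts if now - t[0] <= max_ttl]
    fresh.sort(key=lambda t: t[0], reverse=True)  # newest first
    return fresh[:max_items]


# Made-up feed: one Twt every 2 hours, going back 398 hours (200 Twts).
now = datetime(2024, 9, 28, 12, 0, tzinfo=timezone.utc)
feed = [(now - timedelta(hours=h), f"twt {h}") for h in range(0, 400, 2)]
kept = prune(feed, now)
print(len(kept), kept[0][1], kept[-1][1])
```

In this made-up feed 169 Twts are younger than two weeks, so the 150-item cap is what bites. For a quieter feed the TTL would drop Twts first.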
@aelaraji Thanks for this! 🙏
Bahahahaha very clever @lyse, I look forward to reading your report! 🤣 However...


$ yarnc debug https://twtxt.net/user/prologic/twtxt.txt | grep -E '^pqst4ea' | tee | wc -l
0


I very quickly proved that Twt was never from me 🤣
@yarn_police Cool cool 🙇‍♂️
@yarn_police What's going on?
@movq Yes that's true, they are only integrity checks. But beyond a malicious pod (ignore yarnd's gossiping protocol for now) how does what @lyse presented work exactly? 😅
But this is no different to how jenny does things with storing every Twt in a Maildir I suppose? 🤔
This has specifically come up before in the form of "informal complaints" against yarnd because of the way it permanently stores and archives Twts, so even if you decide you changed your mind, or deleted that line out of your feed, if my pod or @xuu or @abucci or @eldersnake (_or any other handful of pods still around?_) saw the Twt, it'd be permanently archived.
Yeah I'm curious to find out too, beyond just "hearsay". But regardless of whether we should or shouldn't care about this, or should or shouldn't comply: we should IMO. I'd hate to build something that horrendously violates someone's rights in another country.
@movq Care to explain how this exploit/attack works for me? 🤣
Well that was bloody awful. This PR broke my pod for some strange reason, and I can't figure out why or how 😱 The process just kept getting terminated by something, somewhere (_no panic_). Weird. I've reverted this PR for now @xuu
Really though I only managed to save a few GB, but it's enough for now.
@bender Haha 😛 Faster? Maybe 🤔 But yeah it's good to have backups! (_that work_)
I've also put up this PR [Add compatible methods for Index to behave as the Archiver (transition) #1177](https://git.mills.io/yarnsocial/yarn/pulls/1177) that will act as a transition from the old naive archiver to the new bluge-based search/index. I will switch my pod over to this soon to test it before anyone else does.
For those curious, the archive on this pod had reached around ~22GB in size. I had to suck it down to my more powerful Mac Studio to clean it up and remove a bunch of junk. Then copy all the data back. This is what my local network traffic looked like for the last few hours 😱
And we're back. Sorry about that 😅
@lyse Hmmm I'm not sure I get what you're getting at here. In order for this to be true, yarnd would have to be maliciously fabricating a Twt with the Hash D.
i.e: there must be two versions of the Twt in the feed.
@lyse This is true. But the client MUST supply the original too! Or this doesn't work 😢
If OTOH your client doesn't store individual Twts in a cache/archive or some kind of database, then verification becomes quite hard and tedious. However I think of this as an implementation detail. The spec should just call out that clients must validate/verify the edit request and that the matching hash actually exists in that feed, not how the client should implement that.
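A minimal sketch of that check, assuming the client keeps a local archive keyed by feed URL. The `archive` layout, the hashes and `verify_edit` are all hypothetical and only illustrate the rule "the edited hash must exist in the same feed that sent the edit request".

```python
def verify_edit(archive, feed_url, original_hash):
    """Return True only if the Twt being edited was seen in that same feed."""
    return original_hash in archive.get(feed_url, set())


# Hypothetical local archive: feed URL -> hashes the client has already seen.
archive = {
    "https://example.com/twtxt.txt": {"abcdefg", "hijklmn"},
    "https://example.org/twtxt.txt": {"opqrstu"},
}

# A feed editing its own Twt passes; a feed claiming to edit a Twt that only
# exists in someone else's feed does not, and the request would be ignored.
ok = verify_edit(archive, "https://example.com/twtxt.txt", "abcdefg")
spoofed = verify_edit(archive, "https://example.com/twtxt.txt", "opqrstu")
unknown_feed = verify_edit(archive, "https://example.net/twtxt.txt", "abcdefg")
print(ok, spoofed, unknown_feed)
```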
@lyse Yes you do. You keep both versions in your cache. They have different hashes. So you have Twt A, and a client indicates Twt B is an edit of A. Your client has already seen A and cached and archived it; now your client fetches B, which is indicated as an edit of A. You cache/archive B as well, but now indicate in your display that B replaces A (_maybe display, link both_) or just display B or whatever. But essentially you now have both, plus an indicator of one being an edit of the other.

The right thing to do here of course is to keep A in the "thread" but display B. Why? So the thread/chain doesn't actually break or fork (_forking is a natural consequence of editing, or is it the other way around? 🤔_).
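Roughly, "keep A in the thread but display B" could look like the sketch below. `Store`, its fields and the example hashes are hypothetical and are not taken from yarnd or jenny. Edits are not chained here: a later edit of the same original simply wins.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    twts: dict = field(default_factory=dict)         # hash -> text, both versions kept
    replaced_by: dict = field(default_factory=dict)  # original hash -> edit hash

    def add(self, twt_hash, text):
        self.twts[twt_hash] = text

    def apply_edit(self, original, edited, text):
        if original not in self.twts:
            return False  # original never seen: ignore the edit request
        self.add(edited, text)
        self.replaced_by[original] = edited
        return True

    def display(self, twt_hash):
        # Replies still reference the original hash; show the edit if one exists.
        shown = self.replaced_by.get(twt_hash, twt_hash)
        return self.twts[shown]


store = Store()
store.add("aaaaaaa", "Helo wrold")                        # Twt A
accepted = store.apply_edit("aaaaaaa", "bbbbbbb", "Hello world")  # Twt B edits A
rejected = store.apply_edit("zzzzzzz", "ccccccc", "fake")
print(accepted, rejected, store.display("aaaaaaa"))
```

Replies that reference `aaaaaaa` still resolve, so the thread does not fork, while readers see the text of `bbbbbbb`.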
@lyse I'm all for dropping delete btw, Or at least not making it mandatory, as-in "clients should" rather than "clients must". But yes I agree, let's explore all the possible ways this can be exploited (_if at all_).
@movq I think not.

> What about edits of edits? Do we want to “chain” edits or does the latest edit simply win?

This gets too complicated if we start to support this kind of nonsense 🤣
@movq Thank you! 🙏
@lyse Walk me through this? 🤔 I get what you're saying, but I'm too stupid to be a "hacker" 🤣
But yes, at the end of the day if the edit request is invalid or cannot be verified, it should be ignored and treated as "malicious".
@lyse @movq So a client that has the idea of a cache/archive wouldn't necessarily have to re-check that the Twt being marked as "edited" belongs to that feed or not, the client would already know that for sure. At least this is how yarnd works and I'm sure jenny can make similar assertions too.
@lyse @falsifian Contributions to search.twtxt.net, which runs yarns (_not to be confused with yarnd_) are always welcome 🤗 -- I don't have as much "spare time" as I used to due to the nature of my job (_Staff Engineer_); but I try to make improvements every now and again 💪
@falsifian You make good points though, I made similar arguments about this too back in the day. Twtxt v2 / Yarn.social being at least ~4 years old now 😅
@falsifian Do you have specifics about the GDPR law about this?

> Would the GDPR apply to a one-person client like jenny? I seriously hope not. If someone asks me to delete an email they sent me, I don’t think I have to honour that request, no matter how European they are.

I'm not sure myself now. So let's find out whether parts of the GDPR actually apply to a truly decentralised system? 🤔
LOL 😂 This:

> anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves

I'd like to see a step-by-step reproduction of this. I don't buy it 🤣

Admittedly yarnd had a few implementation security bugs, but I'm not sure this is actually possible, unless I'm missing something? 🤔
@david Very nice! 👍