The Watcher

23 Sep 24 11:20 UTC

Don't forget about the Yarn.social meetup coming up this Saturday! See #jjbnvgq for details! Hope to see some/all of y'all there 💪

prologic

twtxt.net

23 Sep 24 11:18 UTC

@lyse And your query to construct a tree? Can you share the full query? (_The screenshot looks scary 🤣_) -- On another note, SQL and relational databases aren't really that conducive to tree-like structures, are they? 🤣

prologic

twtxt.net

23 Sep 24 11:10 UTC

In fact it depends on how many Twts there are that form part of a thread. If you take a much larger sample of my own feed, for example, it starts to approximate a ~1.5x increase in size (i.e. roughly 2.5x the original file size):


$ ./compare.sh https://twtxt.net/user/prologic/twtxt.txt 500
Original file size: 126842 bytes
Modified file size: 317029 bytes
Percentage increase in file size: 149.94%
...

prologic

twtxt.net
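
For reference, the percentage in the transcript above follows directly from the two reported file sizes. This is a minimal Python sketch of that arithmetic only; the function name `percent_increase` is made up for illustration and is not taken from `compare.sh`, whose contents are not shown here.

```python
def percent_increase(original_bytes: int, modified_bytes: int) -> float:
    """Return the growth of a file as a percentage of its original size."""
    return (modified_bytes - original_bytes) / original_bytes * 100


# File sizes reported for the 500-Twt sample of the feed.
growth = percent_increase(126842, 317029)
print(f"{growth:.2f}%")                              # 149.94%
print(f"{317029 / 126842:.2f}x the original size")   # 2.50x the original size
```

A 149.94% increase means the modified feed is about 2.5 times the size of the original, which is the same thing as "~1.5x larger".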

23 Sep 24 11:04 UTC

In fact @falsifian you had quite a lot of good feedback. Do you mind collecting it in a task list on the doc somewhere so I can get to it? 🤔

prologic

twtxt.net

23 Sep 24 11:00 UTC

Can someone make the edit?

prologic

twtxt.net

23 Sep 24 10:57 UTC

@movq This was just a representative sample. The real concrete cost here is a ~5x increase in memory consumption for yarnd and/or a ~5x increase in disk storage.

prologic

twtxt.net

23 Sep 24 10:51 UTC

@lyse Mind sharing your schema?

prologic

twtxt.net

23 Sep 24 10:50 UTC

@lyse Not sure, I'll check

prologic

twtxt.net

23 Sep 24 10:49 UTC

@lyse My proposal is three steps:

- increase the hash length from 7 to 11

Then:

- Add support for changing your feed's location without breaking threads

Then much later:

- Add formal support for edits

prologic

twtxt.net
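
To put the first step of the proposal above in perspective, here is a rough birthday-bound estimate of how likely at least one hash collision is at 7 versus 11 characters. It is only a sketch under stated assumptions: hashes are treated as uniformly random, each character is assumed to carry 5 bits (a base32 alphabet), and the 3.5 million entries figure is borrowed from another post in this timeline. The function name is hypothetical.

```python
import math


def collision_probability(entries: int, hash_length: int, bits_per_char: int = 5) -> float:
    """Birthday-bound estimate of at least one collision among `entries` hashes.

    bits_per_char=5 assumes a base32 alphabet; adjust it for other encodings.
    """
    space = 2 ** (bits_per_char * hash_length)
    # 1 - exp(-n(n-1) / 2N), computed with expm1 to stay accurate for tiny values
    return -math.expm1(-entries * (entries - 1) / (2 * space))


for length in (7, 11):
    print(length, f"{collision_probability(3_500_000, length):.6f}")
```

Under these assumptions a collision somewhere among 3.5 million 7-character hashes is practically certain, while at 11 characters the estimate drops well below one in a thousand.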

23 Sep 24 10:45 UTC

@lyse No I don't either just say'n 😅

prologic

twtxt.net

23 Sep 24 10:43 UTC

@movq That's what I want to know 🤣

prologic

twtxt.net

23 Sep 24 07:58 UTC

So just to be clear, it's not as bad as the OP in this thread; that is just a worst-case scenario. With some additional analysis I did today, it's closer to ~5x the memory requirements of my pod, which would go from roughly ~22MB to ~120MB or so, probably a bit more in practice. But this is still a significant increase in memory. The on-disk requirements would also increase by ~5x on average, going from ~12GB to about ~60GB at the current archive size.

prologic

twtxt.net

23 Sep 24 06:46 UTC

Just out of curiosity, I inspected the yarns database (_the search engine/crawler_) to find the average length of a Twtxt URI:


$ inspect-db yarns.db | jq -r '.Value.URL' | awk '{ total += length; count++ } END { if (count > 0) print total / count }'
40.3387

Given that an RFC3339 UTC timestamp has a length of 20 characters with seconds precision, we're talking about a Twt Subject taking up ~63 characters/bytes on average (~40 for the URI + 20 for the timestamp + ~3 bytes of overhead).

prologic

twtxt.net
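
The ~63 figure above can be reproduced with a few lines of Python. This is a sketch, not the tooling used in the post: `average_subject_length` is a hypothetical helper, and the 3-byte overhead (two parentheses and a space) is an assumption taken from another post in this timeline.

```python
from statistics import mean

TIMESTAMP_LEN = len("2024-09-23T06:46:00Z")  # RFC 3339 UTC, seconds precision: 20
OVERHEAD = 3  # assumed: two parentheses plus a space around the subject


def average_subject_length(feed_urls: list[str]) -> float:
    """Average size of a location-based subject: feed URL + timestamp + overhead."""
    return mean(len(url) for url in feed_urls) + TIMESTAMP_LEN + OVERHEAD


# Using the measured average URL length directly instead of a list of URLs:
print(round(40.3387 + TIMESTAMP_LEN + OVERHEAD))  # 63
```

Given a real list of feed URLs, `average_subject_length(urls)` would do the same averaging that the `awk` one-liner does before adding the fixed parts.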

23 Sep 24 06:30 UTC

Comparing a few feeds:

- @xuu would see an increase of ~20%
- @falsifian would see an increase of ~8%
- @bender would see an increase of ~20%
- @lyse would see an increase of ~15%
- @aelaraji would see an increase of ~13%
- @sorenpeter would see an increase of ~8%
- @movq would see an increase of ~9%

Just from a scalability standpoint alone, I'm not seeing a switch to location-based Twt IDs to support threading as a good idea here. This is what I meant when I said to @david in a recent call that we open up a new can of worms (_or new set of problems_) by drastically changing the approach, rather than incrementally improving the existing approach we have today (_which has served us well for the past 4 years already_).

prologic

twtxt.net

23 Sep 24 06:23 UTC

Reminder to take the Twtxt (_anonymous_) Poll: http://polljunkie.com/poll/xdgjib/twtxt-v2

Apologies, I can't edit the poll once it's live, so the suggestion on feedback for supporting Markdown will have to be discussed at another time.

prologic

twtxt.net

@xuu correct

prologic

twtxt.net

@xuu 🤣🤣🤣

prologic

twtxt.net

23 Sep 24 04:57 UTC

So I whipped up a quick shell script to demonstrate what I mean by the increase in feed size on average as well as the expected increase in storage and retrieval requirements.


$ ./compare.sh
Original file size: 28145 bytes
Modified file size: 70672 bytes
Percentage increase in file size: 151.10%
...

prologic

twtxt.net

Thank goodness we relaxed that limit and I've stopped being so puritanical about it, but my overall point is that we would be significantly increasing both the human-readable size and the machine size of the identity of threads as well as Twts.

prologic

twtxt.net

With the original specification's recommended Twt length of 140 characters, that only leaves you with about 78 characters' worth of anything remotely useful to say in response.

prologic

twtxt.net

Let's say the overhead is always three bytes: two parentheses and a space.

prologic

twtxt.net

So for example, if we use @movq's feed in an example thread ID here, his feed URL plus a particular timestamp, we're already looking at a subject length of 59 bytes, +/- a couple of bytes to denote the subject in the Twt itself.

prologic

twtxt.net
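
The numbers in the last three posts fit together as simple arithmetic: a 59-byte subject plus 3 bytes of overhead leaves 78 of the 140 recommended characters. A small Python sketch of that budget follows; the constant and function names are made up for illustration, and applying the same 3-byte overhead to a 7-character hash is an assumption.

```python
RECOMMENDED_TWT_LENGTH = 140  # recommendation from the original twtxt spec
SUBJECT_OVERHEAD = 3          # two parentheses and a space, as assumed above


def remaining_characters(subject_length: int) -> int:
    """Characters left for the reply text once the subject is accounted for."""
    return RECOMMENDED_TWT_LENGTH - (subject_length + SUBJECT_OVERHEAD)


print(remaining_characters(59))  # 78 (location-based subject from the example)
print(remaining_characters(7))   # 130 (7-character hash, same overhead assumed)
```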

23 Sep 24 04:05 UTC

One of the reasons we originally wanted to use content-based addressing and short hashes as our threading model was to keep individual Twts short, so that they were still readable if you viewed them by hand.

With the proposal to switch to location-based addressing, using a pointer to a feed and a timestamp in that feed, you're looking at subjects that could be roughly 2025 characters long, because the HTTP, HTML and even URI specifications do not specify a maximum length for URIs AFAIK, only recommendations.

prologic

twtxt.net

23 Sep 24 03:59 UTC

@bender Personally, I can't see myself increasing the infrastructure and costs of running this pod to support this if we switch over, and as things continue to grow in scale. You would never get the infinite search and infinite timeline features that you've always wanted, for example, and I would have to drastically reduce what is visible or even searchable at any given point in time to much less than what it is today.

prologic

twtxt.net

23 Sep 24 03:57 UTC

Another interesting side effect of changing from content-based addressing to location-based addressing is that switching from 7-byte keys to 2025-character keys for 3.5 million entries would expand the database size from 24.5 MB to about 7.09 GB, an increase of roughly 7.06 GB!

prologic

twtxt.net
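
The figures in the post above can be checked with a short calculation. This sketch counts key bytes only (values and index overhead are ignored) and uses decimal megabytes and gigabytes; `total_key_bytes` is a hypothetical helper name.

```python
ENTRIES = 3_500_000


def total_key_bytes(key_length: int, entries: int = ENTRIES) -> int:
    """Total bytes spent on keys alone, ignoring values and index overhead."""
    return key_length * entries


hash_keys = total_key_bytes(7)         # 24_500_000 bytes
location_keys = total_key_bytes(2025)  # 7_087_500_000 bytes

print(f"{hash_keys / 1e6:.1f} MB")                    # 24.5 MB
print(f"{location_keys / 1e9:.2f} GB")                # 7.09 GB
print(f"{(location_keys - hash_keys) / 1e9:.2f} GB")  # 7.06 GB
```

Note that 2025 characters is the worst case. With the ~63-byte average subject length measured elsewhere in this timeline, `total_key_bytes(63)` comes to about 220 MB.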

23 Sep 24 00:56 UTC

@falsifian No worries! Feel free to contribute to the doc directly if you wish 👌

prologic

twtxt.net

23 Sep 24 00:55 UTC

@falsifian Hmmm not sure sorry 🤔

prologic

twtxt.net

23 Sep 24 00:45 UTC

@xuu Good to know! 👌 So as long as we remain decentralized and non-commercial (I assume non-profit works too?) we're good?

prologic

twtxt.net

22 Sep 24 12:33 UTC

@lyse Nice! 🙏

prologic

twtxt.net

22 Sep 24 11:54 UTC

@doesnm Hello! 👋

prologic

twtxt.net

22 Sep 24 10:20 UTC

@lyse Yes let's make UTF-8 mandatory 👌

prologic

twtxt.net

22 Sep 24 10:19 UTC

@lyse Agreed

prologic

twtxt.net

22 Sep 24 10:13 UTC

Let's try this poll for Twtxt v2 (no account required)

http://polljunkie.com/poll/xdgjib/twtxt-v2

prologic

twtxt.net

22 Sep 24 09:35 UTC

@lyse I'm a bit indifferent whether it's at the beginning or end tbh.

prologic

twtxt.net

22 Sep 24 09:21 UTC

This is still a draft! Feel free to edit it 👌

prologic

twtxt.net

22 Sep 24 09:19 UTC

@movq That's what I was afraid of 🤣

prologic

twtxt.net

22 Sep 24 09:18 UTC

@movq Makes sense 👌 I think it's fair to implement any spec changes incrementally for sure 👌

And yea since yarnd has a store it's a bit easier to support edit / delete actions 😅

prologic

twtxt.net

22 Sep 24 08:50 UTC

So in a location-based system, how exactly do I reply to one of these two Twts from @Yarns? 🤔


2024-09-07T12:55:56Z	🥳 NEW FEED: @<twtxt http://edsu.github.io/twtxt/twtxt.txt>
2024-09-07T12:55:56Z	🥳 NEW FEED: @<kdy https://twtxt.kdy.ch/twtxt.txt>

prologic

twtxt.net
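
The question above can be made concrete with a small Python sketch. Everything here is illustrative: the feed URL is a placeholder, the `url#timestamp` identifier format is a made-up stand-in for whatever a location-based scheme would use, and SHA-256 with hex encoding is swapped in purely for the demonstration (it is not claimed to be the hash used by yarnd or the draft spec).

```python
import hashlib

FEED_URL = "https://example.com/yarns/twtxt.txt"  # hypothetical feed URL

twts = [
    ("2024-09-07T12:55:56Z", "🥳 NEW FEED: @<twtxt http://edsu.github.io/twtxt/twtxt.txt>"),
    ("2024-09-07T12:55:56Z", "🥳 NEW FEED: @<kdy https://twtxt.kdy.ch/twtxt.txt>"),
]


def location_id(url: str, timestamp: str) -> str:
    """Location-based identity: the feed URL plus the Twt's timestamp."""
    return f"{url}#{timestamp}"


def content_id(url: str, timestamp: str, text: str, length: int = 11) -> str:
    """Content-based identity: a truncated digest over URL, timestamp and text.

    SHA-256 and hex encoding are stand-ins chosen for this sketch only.
    """
    digest = hashlib.sha256(f"{url}\n{timestamp}\n{text}".encode("utf-8")).hexdigest()
    return digest[:length]


location_ids = {location_id(FEED_URL, ts) for ts, _ in twts}
content_ids = {content_id(FEED_URL, ts, text) for ts, text in twts}

print(len(location_ids))  # 1: both Twts share the same location-based identity
print(len(content_ids))   # 2: the text makes the content-based identities differ
```

Because both Twts carry the same timestamp in the same feed, a reply that points only at feed plus timestamp cannot say which of the two it answers, whereas an identity that includes the text can.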

22 Sep 24 08:18 UTC

@lyse Yup, this is why you started seeing if you could improve the "trust" of peers right? 😅

prologic

twtxt.net

22 Sep 24 08:10 UTC

@movq Yeah, I think what I'm proposing here is a more pragmatic approach to improvements that will last much longer than our first iteration (_~4 years and going strong, but running into minor issues with edits/identity and some collisions_). This scope of changes is much easier to implement for yarnd and, I suspect, jenny too, and as indicated here it's quite easy to have a reference implementation written in Bash with standard UNIX tools.

prologic

twtxt.net

22 Sep 24 07:53 UTC

It's even sorta/somewhat compatible with our existing feeds (_kind of_) 🤣 -- Bit too stupid to figure out how to write enough correct Bash to make threads display inline nicely in an indented/tree-like fashion, but oh well 😅

prologic

twtxt.net

22 Sep 24 07:52 UTC

Example:


$ ./twtxt-v2.sh reply 242561ce02d "Cool! 👌"
Posted twt with hash: b2c938f9838
...
$ ./twtxt-v2.sh timeline
...
prologic@twtxt.net [2024-09-22T07:26:37Z] <242561ce02d> Okay folks, I've spent all day on this today, and I _think_ its in "good enough"™ shape to share:

**Twtxt v2**:

- Specification: https://docs.mills.io/uJXuisaYTRWYDrl8A2jADg?both
- implementation: https://gist.mills.io/prologic/afdec15443da4d7aa898f383f171ec1b

 ![](https://twtxt.net/media/Wb9MtAiQyEkzNQB5dyVvUR.png)
prologic@localhost [2024-09-22T07:51:16Z] <b2c938f9838> Cool! 👌 (reply-to:242561ce02d)

prologic

twtxt.net

22 Sep 24 07:26 UTC

Okay folks, I've spent all day on this today, and I _think_ its in "good enough"™ shape to share:

Twtxt v2:

- Specification: https://docs.mills.io/uJXuisaYTRWYDrl8A2jADg?both
- implementation: https://gist.mills.io/prologic/afdec15443da4d7aa898f383f171ec1b

prologic

twtxt.net

22 Sep 24 06:38 UTC

@aelaraji No, that is absolutely correct. Without cryptographic identities and signatures there is no way to verify authenticity. And I don't think we necessarily need to. What I was just showing and proving was that I didn't write that spoofed Twt in the first place, which was only provable at the time of @lyse's short-lived attack 🤣 He essentially forked yarnd, hosted it temporarily (_I think locally_) and used it to poison the caches of a few production pods.

Thankfully the gossip protocol used by yarnd as part of its "peering" between pods isn't fully trusted, twts are not archived for example into permanent storage. So the moment my pod re-fetched my own feed, the spoofed Twt was obliterated 😅

Eventual consistency 🤣

prologic

twtxt.net

22 Sep 24 06:26 UTC

LOL 😂 Not only have I tried to write up a full Twtxt v2 specification, I've also written a Bash shell script that implements the new spec 😅

prologic

twtxt.net

22 Sep 24 05:40 UTC

@movq Haha 😝 Nice one! And yes I'm also aware of some collisions too!

prologic

twtxt.net

22 Sep 24 02:53 UTC

@aelaraji I like ntfy 👌 I've wanted to replace my use of the Pushover service with this for a while now 🤔

prologic

twtxt.net

22 Sep 24 01:15 UTC

@bender 👌

prologic

twtxt.net

22 Sep 24 01:11 UTC