# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 51
# self = https://watcher.sour.is/conv/3ezvila
@lyse @movq I propose that we extend the Metadata spec once more and borrow from JSON API cursor pagination, for feed authors who wish to rotate or truncate their feed (feeds.twtxt.net already does at ~1MB, and soon yarnd will too). One can then _optionally_ provide # next = <link> and # prev = <link> Metadata in their feed to link to previous/older versions of their feed. I _think_ this is the easiest way to implement this that works no matter how you host or the client used.
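For illustration (the URLs are placeholders, not real feeds), the current feed's preamble might carry a prev link back to its archive, and the sealed archive a next link forward:

```
# In the current feed, e.g. https://example.com/twtxt.txt:
# prev = https://example.com/twtxt-2021-10-25.txt

# In the archived feed twtxt-2021-10-25.txt:
# next = https://example.com/twtxt.txt
```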
In addition, supporting "Range" requests _should_ be possible from a client's perspective, with fallback to a full request. I don't think this needs to be part of the spec at all. The way I would implement this is to keep track of the last position, re-fetch from that position minus a few hundred bytes just to be sure, and throw away any initial garbage. I'd probably also try to detect whether the feed is append-only or prepend-only and track this too (somewhere). If the Metadata preamble changes, seeking back a bit _should_ in theory work, or fall back.
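A sketch of that client-side approach (not from any real client; the backtrack size is arbitrary): resume a feed fetch with an HTTP Range request, back up a few hundred bytes to be safe, and fall back to a full request when the server ignores or rejects the range.

```python
import urllib.error
import urllib.request

BACKTRACK = 256  # re-fetch this many bytes before the last known position

def fetch_incremental(url, last_pos):
    """Return (new_bytes, new_position) for a feed at url."""
    start = max(0, last_pos - BACKTRACK)
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-"})
    try:
        resp = urllib.request.urlopen(req)
    except urllib.error.HTTPError as e:
        if e.code != 416:            # 416: range past end, e.g. feed shrank
            raise
        resp = urllib.request.urlopen(url)  # fall back to a full request
        start = 0
    with resp:
        status = resp.status
        data = resp.read()
    if status != 206:                # server ignored Range: we got everything
        start = 0
    new_pos = start + len(data)
    if start > 0:
        # Throw away the initial garbage: the leading partial line.
        data = data.split(b"\n", 1)[1] if b"\n" in data else b""
    return data, new_pos
```
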
@prologic I'll have a look at the JSON API Cursor Pagination Spec later. Podlove hints to RFC 5005, Feed Paging and Archiving, for podcast feeds. I'll have a read of that later, too.
@lyse Yup all good specs to draw inspiration from šŸ‘Œ
@prologic Exactly. :-)
@prologic Iā€™ll have a look at those specs, too. The gist of it really is just next and prev in metadata, IIUC. And that should work pretty well. šŸ¤”
@movq Yup šŸ‘Œ
@prologic @movq Basically yes. RFC 5005 makes a very interesting and subtle distinction between paged and archived feeds. Archive feeds are stable and navigating in them guarantees that no result will be duplicated or missed. On the other hand, paged feeds are not required to fulfil this property. The JSON Cursor Pagination is more like archived feeds. I reckon that we can use the archive approach, too. It suits us quite well, since we probably split off pages/archives by date ranges. Prologic might be the only one whose feed needs a max twts limit over a grouping by date. :-D So prev/next should do. Btw. I find it quite weird that RFC 5005 calls the link relations previous and next for paged but prev-archive and next-archive for archived feeds. Do we want to tie any paging/archiving semantics to that? Also use first and last? Too bad that link is already taken as titled hyperlinks in the metadata. What are your thoughts?
@lyse I think we should just go for the archival approach. next and prev seem fine to me as keys in the metadata.
@lyse ā€¦ does that mean that archived feeds are supposed to be ā€œfrozen in timeā€? Nothing ever added to them, either? (Sorry, havenā€™t read the actual RFC, yet.)
@movq Yes, frozen and can only be found using that search tool I keep forgetting the name of. šŸ™ƒ
@movq @thecanine Correct. Fixing typos would be the only allowed thing, but that's it. Clients that have fetched a certain archive feed need not redownload this particular one again in the future, since it's considered stable/fixed/frozen/unmodifiable. So people might not get any updates from this archive feed.
@movq like rotated Apache log files, yes I would think so šŸ‘Œ
@thecanine https://search.twtxt.net/
@lyse šŸ‘Œ
So itā€™d go like this?

- twtxt-2021-10-25.txt: Feed of twts starting at 2021-10-25. For now, this is the ā€œcurrentā€ feed.
- twtxt-2021-10-25.txt keeps growing for a while by appending new twts at the end.
- Once twtxt-2021-10-25.txt is ā€œfullā€, we add a next field to its metadata which points to ā€¦
- ā€¦ twtxt-2021-11-01.txt, which is initially empty except for a prev field that points back to twtxt-2021-10-25.txt. From this point on, twtxt-2021-11-01.txt is the ā€œcurrentā€ feed. It keeps growing in the same way. Eventually, itā€™s full and superseded by the next (partial) feed.
- twtxt.txt could then be a symlink to the ā€œcurrentā€ feed file.
- Non-current feeds could now indeed be considered as ā€œarchivedā€: Nothing ever changes in them anymore (except for metadata maybe? What if I change my nickname or feed URL?)

Well, at least thatā€™s how Iā€™d go about implementing this in jenny. šŸ™‚
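The rotation steps above could be sketched like this (a hypothetical illustration, not jenny's actual code; the file names, directory layout, and 1 MB threshold are assumptions):

```python
import os
from datetime import date

def rotate_if_full(feed_dir, current_link="twtxt.txt", max_bytes=1024 * 1024):
    """If the current feed is full, seal it and start a fresh one."""
    link = os.path.join(feed_dir, current_link)
    current_name = os.readlink(link)          # twtxt.txt -> current feed file
    current = os.path.join(feed_dir, current_name)
    if os.path.getsize(current) < max_bytes:
        return None                           # not full yet, nothing to do
    new_name = f"twtxt-{date.today().isoformat()}.txt"
    # Seal the old feed: its "next" now points at the new file.
    with open(current, "a") as f:
        f.write(f"# next = {new_name}\n")
    # The new feed starts with a "prev" link back to the archive.
    with open(os.path.join(feed_dir, new_name), "w") as f:
        f.write(f"# prev = {current_name}\n")
    # Re-point the twtxt.txt symlink at the new current feed.
    os.remove(link)
    os.symlink(new_name, link)
    return new_name
```
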
@movq I _think_ this is how I'd implement it too šŸ‘Œ
@movq
I would recommend a longer rotation, perhaps? The way I see it, you are proposing a monthly one. That can make metadata huge too. Maybe yearly, or every 6 months?
@quark I _think_ it's up to the feed author, no? I mean the feeds service at feeds.twtxt.net (_loosely based on similar code_) rotates feeds at roughly ~1MB -- soon yarnd will too, as in practice this turns out to be a "good" value and approximately a year of Twts for socially active persons šŸ˜‚
@quark Nah, that was just an example. šŸ™‚ Iā€™d probably go for ā€œ$n months or $m twts, whichever comes firstā€, both variables being customizable. I mean, thereā€™s no point in ā€œpaginatingā€ a feed if thereā€™s very little traffic.
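A tiny sketch of that rotation condition (the defaults here are invented; both limits would be user-settable):

```python
from datetime import datetime, timedelta, timezone

def should_rotate(feed_started, twt_count, max_months=6, max_twts=1000):
    """Rotate after $n months or $m twts, whichever comes first."""
    age_limit = timedelta(days=30 * max_months)  # coarse month approximation
    age = datetime.now(timezone.utc) - feed_started
    return age >= age_limit or twt_count >= max_twts
```
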
@movq

> both variables being customizable

Excellent! And I agree on everything else, yes.

@prologic

> I think itā€™s up to the feed author no?

Not if it is not made customisable. It is up to the client's author, in this case, @movq. ā˜ŗļø
@fastidious True šŸ¤£
@movq If I understood RFC 5005 correctly, the current and an archive feed are two separate things, there's no overlap between them. So an item is either in the current or in the archived feed, but not in both. So a symlink is not enough. But of course we could change that for our purpose. Regarding metadata changes, I'm not sure how to go about them. Changing the nick is probably no problem, but changing the URL causes twt hashes to break again. The never ending story.
@lyse

> changing the URL causes twt hashes to break again. The never ending story.

It's actually not really as big of a problem as we've made it out to be in past conversations. If enough clients have seen versions of the feed then they'll have old copies of the Twts with the previous Hash and thus there will still be a valid chain. What changing the URL or the content of an individual Twt really does is fork the chain. Which isn't so bad IMHO. Just something to be aware of, I guess? šŸ¤”
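The fork is easy to demonstrate with a toy hash in the spirit of the Twt Hash extension (this is only a sketch, not the exact spec algorithm; the real extension defines precise timestamp and URL normalisation rules):

```python
import base64
import hashlib

def twt_hash(url, timestamp, content):
    # Toy construction: hash the three fields joined by newlines, then take
    # the last 7 characters of the lowercase base32 digest.
    payload = f"{url}\n{timestamp}\n{content}".encode()
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode().lower().rstrip("=")[-7:]

a = twt_hash("https://example.com/twtxt.txt", "2021-10-25T12:00:00Z", "hello")
b = twt_hash("https://example.org/twtxt.txt", "2021-10-25T12:00:00Z", "hello")
assert a != b  # same twt under a new URL hashes differently: the chain forks
```
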
@prologic Well, it depends on the client. The original twtxt client for example just caches a copy of the current feed, so on next update cycle it just replaces everything it had for this feed with the current contents. I'd imagine that's a common model.
@lyse Yeah not sure really. yarndā€™s cache originally came from twet in the early days and morphed over time to what it is now.

The Twt Hash extension had the most influence.

I donā€™t think just sucking down the feed and storing the file is the most efficient nor useful thing to do given how weā€™ve extended the spec.