# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 22
# self = https://watcher.sour.is/conv/fh6ymua
I just spent a few hours thinking about the Twtxt Feed URL Normalization Database that I've suggested a few times over the past weeks. It then occurred to me that a simple text file managed in a version control system would be enough to start out. No need to build a dedicated (web) interface just yet. So ignore all the reviewer, synchronization and REST API stuff for now. Any thoughts? Any feedback is very much appreciated.
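To make that concrete, here's a minimal sketch of what such a text file and a parser for it might look like. The tab-separated layout (old URL, new URL, reason) and the example URLs are assumptions of mine, not a settled format:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Entry is one assumed record in the normalization file: the old feed URL,
// its replacement (empty if the feed is simply gone), and a short reason tag.
type Entry struct {
	Old, New, Reason string
}

// parseDB reads the hypothetical tab-separated format, skipping blank lines
// and comment lines starting with "#".
func parseDB(s *bufio.Scanner) []Entry {
	var entries []Entry
	for s.Scan() {
		line := strings.TrimSpace(s.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.SplitN(line, "\t", 3)
		e := Entry{Old: fields[0]}
		if len(fields) > 1 {
			e.New = fields[1]
		}
		if len(fields) > 2 {
			e.Reason = fields[2]
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	// Example content; the URLs are made up for illustration only.
	sample := "# old-url\tnew-url\treason\n" +
		"https://example.org/twtxt.txt\thttps://example.org/feeds/me.txt\tfile-name-changed\n" +
		"https://old.example.net/me.txt\t\tgone\n"
	for _, e := range parseDB(bufio.NewScanner(strings.NewReader(sample))) {
		fmt.Printf("%-40s -> %-40s (%s)\n", e.Old, e.New, e.Reason)
	}
}
```

A file like that diffs cleanly in version control, which is really the point of starting with plain text.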
@lyse Interesting, let me see...

1. I'm missing context: why do we need this? (As a community of users and developers, I mean.)

2. I'm reading:

> The goal is to provide a database that can be fetched periodically to receive a list of twtxt feed URLs that are known to be wrong for whatever reason.

'Wrong for whatever reason' is too vague in my mind; it doesn't help me understand how it's useful. Specific reasons like 'file name changed', 'domain changed' or 'URL not available anymore/gone forever' would be easier to understand.

3. What happens when a URL has two recorded changes? Do you take the most recent one?

4. Who's going to be the main user? Systems like yarnd checking for changes to auto-correct broken links?

These are just my first impressions, and I don't want to say anything wrong, but it looks appealing. Kudos for the initiative!
Another one: when a resource is available in multiple places, like Gopher, HTTP and Gemini (and IPFS, why not?), are there going to be N registries?

Wild idea: how about using the HTTP response codes https://developer.mozilla.org/en-US/docs/Web/HTTP/Status or the ones from Gemini https://gemini.circumlunar.space/docs/specification.gmi

Like 308/31 for redirections, 410/52 for Gone and such.
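To illustrate that pairing, here's a minimal sketch mapping an HTTP status to the rough Gemini counterpart suggested above. The mapping (and the inclusion of 301) is just a suggestion, not any spec:

```go
package main

import "fmt"

// geminiFor maps an HTTP status to the rough Gemini equivalent suggested
// above (308 -> 31 for permanent redirects, 410 -> 52 for Gone).
// The labels are my own assumptions.
func geminiFor(httpStatus int) (gemini int, label string, ok bool) {
	switch httpStatus {
	case 301, 308:
		return 31, "redirect-permanent", true
	case 410:
		return 52, "gone", true
	default:
		return 0, "", false
	}
}

func main() {
	for _, code := range []int{308, 410, 404} {
		if g, label, ok := geminiFor(code); ok {
			fmt.Printf("HTTP %d -> Gemini %d (%s)\n", code, g, label)
		} else {
			fmt.Printf("HTTP %d -> no registry entry\n", code)
		}
	}
}
```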
@lyse I see where you're coming from, but this sort of centralization goes against the spirit of twtxt in my mind.
@lyse I kind of agree with @mckinley here in that centralising this goes against the grain of the Yarn/Twtxt ecosystem we've built. Instead, why don't we do one (or more) of the following:

- Figure out the source of the "bad data" in the first place, and fix it.
- Build an interface for yarnd operators to write "rewrite rules" to handle this, assuming finding/fixing the bad data doesn't work (see the sketch below)
- Something else?

I _feel_ like this is just a case of "bad data" that _can_ be fixed easily.
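For the rewrite-rules option, here's a rough sketch of what an operator-defined rule could look like. The RewriteRule type and ApplyRules helper are hypothetical; yarnd ships no such interface today:

```go
package main

import (
	"fmt"
	"strings"
)

// RewriteRule is a hypothetical operator-defined rule: any cached feed URL
// starting with Match is rewritten to start with Replace instead.
type RewriteRule struct {
	Match   string
	Replace string
}

// ApplyRules returns the rewritten URL and whether any rule fired.
func ApplyRules(url string, rules []RewriteRule) (string, bool) {
	for _, r := range rules {
		if strings.HasPrefix(url, r.Match) {
			return r.Replace + strings.TrimPrefix(url, r.Match), true
		}
	}
	return url, false
}

func main() {
	rules := []RewriteRule{
		// Made-up example: a feed that moved from HTTP to HTTPS.
		{Match: "http://example.org/", Replace: "https://example.org/"},
	}
	fixed, changed := ApplyRules("http://example.org/twtxt.txt", rules)
	fmt.Println(fixed, changed)
}
```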
And @eaplmx I _think_ you're overthinking this a bit. What you're proposing is _actually_ a good idea for when a feed author decides to move their feed. Something we've wanted to add to yarnd specifically in the past, whereby a user could "delete" their feed/account but tell yarnd that it's moved over here. Redirects would then be put in place for, say, up to 90 days or something so clients don't have to be updated (or are automatically updated because of the redirect responses).
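As a rough sketch of how that redirect window could work on the serving side (a hypothetical handler, not how yarnd implements feed moves today):

```go
package main

import (
	"net/http"
	"time"
)

// movedFeed records a hypothetical "my feed moved" entry: the new location
// and when the redirect window (e.g. ~90 days) expires.
type movedFeed struct {
	NewURL  string
	Expires time.Time
}

// Sketch only; the path and URL below are made up for illustration.
var moved = map[string]movedFeed{
	"/user/alice/twtxt.txt": {
		NewURL:  "https://elsewhere.example/user/alice/twtxt.txt",
		Expires: time.Now().Add(90 * 24 * time.Hour),
	},
}

func redirectHandler(w http.ResponseWriter, r *http.Request) {
	if m, ok := moved[r.URL.Path]; ok && time.Now().Before(m.Expires) {
		// 308 tells well-behaved clients to update their stored feed URL.
		http.Redirect(w, r, m.NewURL, http.StatusPermanentRedirect)
		return
	}
	http.NotFound(w, r)
}

func main() {
	http.HandleFunc("/", redirectHandler)
	_ = http.ListenAndServe(":8000", nil)
}
```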
@lyse For example, I notice in my pod's cache there are 3 entries for you: https://gist.github.com/prologic/fe636e615c147d7465e2379d838ad780

I'm assuming only one of them is actually correct?
No dupes or bad data for @movq 👌
Did some more analysis on my pod's cache and there are quite a few bad feeds in the cache's Twter list => https://gist.githubusercontent.com/prologic/7c1bf78a4134fc582abfd4fd7d2a1516/raw/ea7634071006f00c82d44ab6d7989ef420568ffe/gistfile1.txt
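For anyone who wants to run a similar check, here's a minimal sketch that groups cached feed entries by nick so duplicates stand out. The Twter struct and the URLs are simplified stand-ins, not yarnd's real cache format:

```go
package main

import "fmt"

// Twter is a simplified stand-in for a cached feed author entry.
type Twter struct {
	Nick string
	URL  string
}

// duplicates groups cached URLs by nick and keeps only nicks that
// appear with more than one URL.
func duplicates(cache []Twter) map[string][]string {
	byNick := make(map[string][]string)
	for _, t := range cache {
		byNick[t.Nick] = append(byNick[t.Nick], t.URL)
	}
	dupes := make(map[string][]string)
	for nick, urls := range byNick {
		if len(urls) > 1 {
			dupes[nick] = urls
		}
	}
	return dupes
}

func main() {
	// Made-up example data for illustration only.
	cache := []Twter{
		{"lyse", "https://example.org/lyse/twtxt.txt"},
		{"lyse", "http://example.org/lyse/twtxt.txt"},
		{"movq", "https://example.net/movq/twtxt.txt"},
	}
	for nick, urls := range duplicates(cache) {
		fmt.Println(nick, "->", urls)
	}
}
```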
@prologic that's why I'm asking so many questions.
@eaplmx They were good questions 👌
@eaplmx @mckinley @prologic Thank you very much, mates! I will be gone over the weekend, so keep the feedback coming, I'll catch up eventually. However, so far it looks like this idea is a busted flush.