# I am the Watcher. I am your guide through this vast new twtiverse.
#
# Usage:
# https://watcher.sour.is/api/plain/users View list of users and latest twt date.
# https://watcher.sour.is/api/plain/twt View all twts.
# https://watcher.sour.is/api/plain/mentions?uri=:uri View all mentions for uri.
# https://watcher.sour.is/api/plain/conv/:hash View all twts for a conversation subject.
#
# Options:
# uri Filter to show a specific user's twts.
# offset Start index for query.
# limit Count of items to return (going back in time).
#
# twt range = 1 14
# self = https://watcher.sour.is/conv/nlouyba
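The endpoints and options above can be exercised from the command line. A minimal sketch, assuming the `offset` and `limit` options are passed as ordinary query parameters (the helper function name is mine, not part of the API):

```shell
# Build a watcher API query URL from an endpoint name and paging options.
# Endpoints (users, twt, mentions, conv/:hash) are those listed above.
watcher_url() {
  endpoint="$1"; offset="$2"; limit="$3"
  printf 'https://watcher.sour.is/api/plain/%s?offset=%s&limit=%s' \
    "$endpoint" "$offset" "$limit"
}

# Print the URL for the first 25 twts, newest first.
watcher_url twt 0 25
echo
```

Fetching would then be e.g. `curl -s "$(watcher_url twt 0 25)"`.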
> The author rightly blames search engines. A similar revelation hit me like a truck after I used Marginalia Search a few times. Give it a try.
Bookmarked! Is this a search engine that's done its own crawling and indexing, like what I've tried to do with spyda.dev? 🤔
@prologic Yes, it does its own crawling. You can check if a particular website is indexed by searching for a domain like this: site:mckinley.cc
@mckinley In that case it's very similar in spirit to what I've been building at https://spyda.dev -- What's holding me back at the moment is that I need to understand how to better index "web" documents and figure out a crawling strategy so it continues to grow its index.
So I had a play with this search engine tonight and read everything about what this guy has done; amazing work! 👌 I've reached out to him via email to see if perhaps he'd be interested in teaming up with me in some way. Anyway, I also wanted to point out something rather sad:
> The crawler gets captchad by CDNs like Fastly and CloudFlare. I've prostrated myself before them and pleaded to get listed as a good bot, but they have yet to call back so until then they are blocked on a subnet level.
😢 😡 🤬 #Fastly and #Cloudflare suck 😡
@prologic Hehe, would be nice for you to team up! 😎
Yeah it would be! I _think_ we'd complement each other well. The problem is that it's actually a lot of work to create a generalised search engine. It's much easier to create a search engine for a small domain like Yarn.social / Twtxt. But even then there's still work to be done on the crawling side (_I think_) -- Right now it just re-crawls the space once a day.