# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 16
# self = https://watcher.sour.is/conv/zmv53uq
it’s actually incredibly hard to search for the phrase “do no evil” because search engines typically have stopwords and don’t index words like “do” and “not”; But…

https://search.twtxt.net/search?q=%2BGoogle+%2BEvil#
and I just discovered a bug 🐞🤣
Have you happened to find a twt hash collision in your crawling adventures? If not, I wonder if it would be feasible to brute force one and see what happens.
@mckinley The hashing algorithm and encoding format we’ve chosen make a hash collision extremely unlikely at the current scale; But… It is possible I suppose 🤣 How would I go about testing this?
Was the info on dev.twtxt.net moved somewhere else? I can't remember exactly how the hashes work. It's the URL of the feed, the time, and the message put in a specific order and then hashed, right? Then that hash is encoded in base32 and the last 7 characters are taken from it? Do I have that completely wrong?

- A 7 character hash with 32 possible characters has 34,359,738,368 possible combinations. More than I would have thought, but it's not that many in the grand scheme of things.
- Assuming a rate of 50,000 hashes per second, which I think might be feasible on modest consumer hardware, you're looking at about 8 days to generate all possible hashes if you have no duplicates.

I'm sure the strange generation method affects the probability, but I don't know how to account for that. My math is most likely wrong as it is but I think a collision is doable.
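The arithmetic above can be checked with a short sketch. It uses the figures from the thread (7 characters, 32 symbols, 50,000 hashes per second) and adds one thing the thread does not mention: the standard birthday approximation for how many random hashes are needed before two of them are likely to match. The names `days_to_enumerate` and `birthday_count` are hypothetical helpers, and the sketch assumes every 7-character value is equally likely.

```python
import math

ALPHABET_SIZE = 32          # base32 symbols
HASH_LENGTH = 7             # characters kept from the encoded hash
HASHES_PER_SECOND = 50_000  # rate assumed in the thread

# Total number of distinct 7-character values, assuming all are reachable.
keyspace = ALPHABET_SIZE ** HASH_LENGTH


def days_to_enumerate(space: int, rate: float) -> float:
    """Days needed to produce `space` hashes at `rate` hashes per second."""
    return space / rate / 86_400


def birthday_count(space: int, probability: float = 0.5) -> float:
    """Approximate number of uniformly random hashes needed before any two
    collide with the given probability (birthday approximation)."""
    return math.sqrt(2 * space * math.log(1 / (1 - probability)))


if __name__ == "__main__":
    print(f"keyspace: {keyspace:,}")
    print(f"days to enumerate: {days_to_enumerate(keyspace, HASHES_PER_SECOND):.2f}")
    print(f"hashes for a 50% collision chance: {birthday_count(keyspace):,.0f}")
```

Enumerating the whole space takes just under 8 days at that rate, but under the uniformity assumption a collision between any two hashes becomes likely after only a few hundred thousand of them, far fewer than the full keyspace.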
@mckinley That’s 100% correct, and dev.twtxt.net just needs to be redeployed, which I’ll do today 👌
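A minimal sketch of the steps described above: join the feed URL, the time, and the message, hash them, base32-encode the digest, and keep the last 7 characters. The thread does not name the hash function or the separator, so BLAKE2b with a 256-bit digest, newline-joined fields, lowercase output, and stripped padding are all assumptions here. `twt_hash` and the example feed values are hypothetical.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str) -> str:
    """Return a 7-character twt hash under the assumptions stated above."""
    # Assumed field order and separator: URL, time, message, joined by "\n".
    payload = "\n".join((feed_url, timestamp, text)).encode("utf-8")
    # Assumed hash function: BLAKE2b with a 32-byte (256-bit) digest.
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Base32-encode, drop the "=" padding, lowercase, keep the last 7 chars.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-7:]


if __name__ == "__main__":
    # Hypothetical feed URL, timestamp, and message, for illustration only.
    print(twt_hash(
        "https://example.com/twtxt.txt",
        "2021-01-01T12:00:00Z",
        "Hello twtiverse!",
    ))
```

Under these assumptions the final base32 character carries only one bit of the digest (256 = 51 × 5 + 1), so it is always `a` or `q`, and the 7-character suffix can take 2^31 distinct values rather than 2^35.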
@mckinley I think your math is correct 👌 It’s what I’ve concluded as well. Currently the search engine is seeing around 500-600 posts per day. So we’re not going to collide anytime soon in reality 👍