# I am the Watcher. I am your guide through this vast new twtiverse.
#
# Usage:
# https://watcher.sour.is/api/plain/users View list of users and latest twt date.
# https://watcher.sour.is/api/plain/twt View all twts.
# https://watcher.sour.is/api/plain/mentions?uri=:uri View all mentions for uri.
# https://watcher.sour.is/api/plain/conv/:hash View all twts for a conversation subject.
#
# Options:
# uri Filter to show a specific user's twts.
# offset Start index for query.
# limit Count of items to return (going back in time).
#
# twt range = 1 11
# self = https://watcher.sour.is/conv/ku6lzaa
@prologic earlier you suggested extending hashes to 11 characters, but here's an argument that they should be even longer than that.
Imagine I found this twt one day at https://example.com/twtxt.txt :
2024-09-14T22:00Z Useful backup command: rsync -a "$HOME" /mnt/backup
screenshot of the command working
and I responded with "(#5dgoirqemeq) Thanks for the tip!". Then I've endorsed the twt, but it could later get changed to
2024-09-14T22:00Z Useful backup command: rm -rf /some_important_directory
screenshot of the command working
which also has an 11-character base32 hash of 5dgoirqemeq. (I'm using the existing hashing method with https://example.com/twtxt.txt as the feed url, but I'm taking 11 characters instead of 7 from the end of the base32 encoding.)
That's what I meant by "spoofing" in an earlier twt.
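To make the hashing method in the parenthetical above concrete, here is a minimal sketch. It assumes the Twt Hash scheme is BLAKE2b-256 over the feed URL, timestamp and text joined by newlines, encoded as lowercase unpadded base32. The name `twt_hash` and the parameter `n` are hypothetical, and the timestamp normalization that real clients perform is not reproduced, so the output is not guaranteed to match the hash quoted above.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str, n: int = 7) -> str:
    """Return the last n base32 characters of the twt's BLAKE2b-256 digest."""
    # Assumed payload layout: feed URL, timestamp and text joined by newlines.
    payload = "\n".join((feed_url, timestamp, text)).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # 32 bytes become 52 base32 characters once the "=" padding is stripped.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-n:]


if __name__ == "__main__":
    url = "https://example.com/twtxt.txt"
    stamp = "2024-09-14T22:00:00Z"  # assumed form; real clients normalize this
    text = 'Useful backup command: rsync -a "$HOME" /mnt/backup'
    print(twt_hash(url, stamp, text))        # 7 characters, as used today
    print(twt_hash(url, stamp, text, n=11))  # 11 characters, as in the example
```

Because both forms are cut from the end of the same encoding, the 11-character hash always ends with the 7-character one.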
I don't know if preventing this sort of attack should be a goal, but if it is, the number of bits in the hash should be at least two times log2(number of attempts we want to defend against), where the "two times" is because of the birthday paradox.
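The rule of thumb above can be written out as a small calculation. This is only a sketch: the function names are hypothetical, and the character count assumes every base32 character carries a full 5 bits, which is not true of the last character in the current scheme.

```python
import math


def min_hash_bits(attempts: int) -> int:
    """Rule-of-thumb minimum hash size for a given number of attacker attempts."""
    # Birthday paradox: a collision becomes likely after roughly
    # 2**(bits / 2) hashes, so we want bits >= 2 * log2(attempts).
    return math.ceil(2 * math.log2(attempts))


def min_base32_chars(attempts: int) -> int:
    """Base32 characters needed, assuming 5 usable bits per character."""
    return math.ceil(min_hash_bits(attempts) / 5)


if __name__ == "__main__":
    for exponent in (32, 40, 64):
        attempts = 2 ** exponent
        print(exponent, min_hash_bits(attempts), min_base32_chars(attempts))
```

For example, defending against 2^40 attempts would need at least 80 bits, or 16 base32 characters.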
Side note: current hashes always end with "a" or "q", which is a bit wasteful. Maybe we should take the first N characters of the base32 encoding instead of the last N.
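The "a" or "q" ending follows from the arithmetic: 256 bits is 51 × 5 + 1, so the 52nd base32 character holds one real bit followed by four zero bits. A short sketch to check this empirically, again assuming BLAKE2b-256 and lowercase unpadded base32 (the helper names are hypothetical):

```python
import base64
import hashlib


def base32_digest(data: bytes) -> str:
    """Lowercase, unpadded base32 of a BLAKE2b-256 digest (assumed scheme)."""
    digest = hashlib.blake2b(data, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def final_characters(samples: int = 1000) -> set[str]:
    """Collect the last base32 character seen over many arbitrary inputs."""
    return {base32_digest(str(i).encode("ascii"))[-1] for i in range(samples)}


if __name__ == "__main__":
    # The last character is 0b00000 ("a") or 0b10000 ("q"), nothing else.
    print(sorted(final_characters()))
    # Taking characters from the front keeps 5 data bits in each of them.
    print(base32_digest(b"example")[:11])
```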
Code I used for the above example: https://fossil.falsifian.org/misc/file?name=src/twt_collision/find_collision.c
I only needed to compute 43394987 hashes to find it.
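That count is in line with the birthday estimate. The last 11 characters of the encoding carry only 51 bits (one for the final character, five for each of the others), and log2(43394987) is about 25.4, close to half of 51. A small sketch of that arithmetic, with hypothetical function names:

```python
def effective_bits_from_end(n_chars: int) -> int:
    """Data bits in the last n_chars of a 52-character base32 encoding of 256 bits."""
    # The final character carries 1 bit; each earlier character carries 5.
    return 5 * (n_chars - 1) + 1


def birthday_estimate(bits: int) -> float:
    """Rough number of hashes at which a collision becomes likely: 2**(bits / 2)."""
    return 2.0 ** (bits / 2)


if __name__ == "__main__":
    for n_chars in (7, 11):
        bits = effective_bits_from_end(n_chars)
        print(n_chars, bits, round(birthday_estimate(bits)))
```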
@falsifian All very good points. By the way, how did you find two pieces of content that hash the same when taking the last N characters of the base32-encoded hash?
@falsifian I think I wrote a very similar program in Go myself, actually, and you're right: we do have to change the way we encode hashes.
@prologic Brute force. I just hashed a bunch of versions of both tweets until I found a collision.
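The brute-force idea can be sketched as follows. This is not the linked C program: the hashing scheme (BLAKE2b-256, lowercase unpadded base32, last `n` characters) is assumed, the names are hypothetical, and the way variants are generated here (appending a counter) is just one illustrative choice. The demo uses only 4 characters so it finishes almost instantly.

```python
import base64
import hashlib
from itertools import count


def short_hash(feed_url: str, timestamp: str, text: str, n: int) -> str:
    """Last n base32 characters of the assumed twt hash."""
    payload = "\n".join((feed_url, timestamp, text)).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()[-n:]


def find_collision(feed_url: str, timestamp: str, text_a: str, text_b: str, n: int):
    """Return (variant_a, variant_b, hash) where both variants share a hash."""
    seen_a: dict[str, str] = {}  # hash -> variant of text_a
    seen_b: dict[str, str] = {}  # hash -> variant of text_b
    for i in count():
        # Hypothetical variation scheme: append a counter to each text.
        var_a = f"{text_a} #{i}"
        var_b = f"{text_b} #{i}"
        h_a = short_hash(feed_url, timestamp, var_a, n)
        h_b = short_hash(feed_url, timestamp, var_b, n)
        seen_a[h_a] = var_a
        seen_b[h_b] = var_b
        if h_a in seen_b:
            return seen_a[h_a], seen_b[h_a], h_a
        if h_b in seen_a:
            return seen_a[h_b], seen_b[h_b], h_b


if __name__ == "__main__":
    url = "https://example.com/twtxt.txt"
    stamp = "2024-09-14T22:00:00Z"
    good = 'Useful backup command: rsync -a "$HOME" /mnt/backup'
    bad = "Useful backup command: rm -rf /some_important_directory"
    print(find_collision(url, stamp, good, bad, n=4))
```

Storing every hash in memory is fine at this size; at 11 characters the tables would hold tens of millions of entries.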
I mostly just wanted an excuse to write the program. I don't know how I feel about actually using super-long hashes; could make the twts annoying to read if you prefer to view them untransformed.
Well, we can't have it both ways!
Should we assume twtxt feeds are read by clients, and not worry about something humans won't see?