The Watcher

falsifian

www.falsifian.org

@prologic earlier you suggested extending hashes to 11 characters, but here's an argument that they should be even longer than that.

Imagine I found this twt one day at https://example.com/twtxt.txt :

2024-09-14T22:00Z Useful backup command: rsync -a "$HOME" /mnt/backup

screenshot of the command working

and I responded with "(#5dgoirqemeq) Thanks for the tip!". Then I've endorsed the twt, but it could later get changed to

2024-09-14T22:00Z Useful backup command: rm -rf /some_important_directory

screenshot of the command working

which also has an 11-character base32 hash of 5dgoirqemeq. (I'm using the existing hashing method with https://example.com/twtxt.txt as the feed url, but I'm taking 11 characters instead of 7 from the end of the base32 encoding.)
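For readers who want to experiment with the hash length, here is a minimal Python sketch of the hashing method as I understand it. It assumes the hash is BLAKE2b-256 over the feed url, timestamp and text joined by newlines, base32-encoded without padding and lowercased. The function name `twt_hash` and its `length` parameter are hypothetical, and the timestamp normalization a real client performs is skipped, so this will not necessarily reproduce the hash quoted above.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str, length: int = 7) -> str:
    """Return the last `length` characters of the base32-encoded digest.

    Assumption: the payload is "<feed url>\\n<timestamp>\\n<text>" hashed with
    BLAKE2b-256. Timestamp normalization is deliberately left out.
    """
    payload = f"{feed_url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Standard base32, padding stripped, lowercased.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-length:]


if __name__ == "__main__":
    url = "https://example.com/twtxt.txt"
    ts = "2024-09-14T22:00Z"
    text = "Hello, world"  # placeholder text, not the twt from the example
    print(twt_hash(url, ts, text))             # 7 characters
    print(twt_hash(url, ts, text, length=11))  # 11 characters
```

Because both lengths are taken from the end of the same encoding, the 7-character hash is always a suffix of the 11-character one.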

That's what I meant by "spoofing" in an earlier twt.

I don't know if preventing this sort of attack should be a goal, but if it is, the number of bits in the hash should be at least two times log2(number of attempts we want to defend against), where the "two times" is because of the birthday paradox.
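As a rough illustration of that rule of thumb, the sketch below turns a number of attempts into a minimum hash size. The function names are hypothetical, and the 5 bits per character assumes every base32 character carries a full 5 bits.

```python
import math


def min_hash_bits(attempts: int) -> int:
    """Bits needed so that `attempts` hashes are unlikely to contain a collision.

    Uses the rule from the text: 2 * log2(attempts), rounded up.
    """
    return math.ceil(2 * math.log2(attempts))


def min_base32_chars(bits: int) -> int:
    """Base32 characters needed to carry `bits` bits (5 bits per character)."""
    return math.ceil(bits / 5)


if __name__ == "__main__":
    for attempts in (2**20, 2**32, 2**40):
        bits = min_hash_bits(attempts)
        print(attempts, bits, min_base32_chars(bits))
```

For example, defending against 2^32 attempts calls for at least 64 bits, which is 13 full base32 characters.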

Side note: current hashes always end with "a" or "q", which is a bit wasteful. Maybe we should take the first N characters of the base32 encoding instead of the last N.
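The reason for the "a"/"q" ending is that a 256-bit digest does not divide evenly into 5-bit base32 characters: 256 = 51 × 5 + 1, so the final character holds one real bit followed by four zero bits. Here is a small sketch that checks this empirically, assuming a BLAKE2b-256 digest and standard base32; the helper names are hypothetical.

```python
import base64
import hashlib
import os


def b32_lower(digest: bytes) -> str:
    """Standard base32 without padding, lowercased."""
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def last_chars_seen(samples: int = 1000) -> set:
    """Collect the final base32 character over many random 256-bit digests."""
    seen = set()
    for _ in range(samples):
        digest = hashlib.blake2b(os.urandom(16), digest_size=32).digest()
        seen.add(b32_lower(digest)[-1])
    return seen


def effective_bits_last_n(n: int, digest_bits: int = 256) -> int:
    """Bits of the digest actually covered by the last n base32 characters."""
    leftover = digest_bits % 5 or 5  # bits carried by the final character
    return leftover + 5 * (n - 1)


if __name__ == "__main__":
    print(sorted(last_chars_seen()))
    print(effective_bits_last_n(7), effective_bits_last_n(11))
```

Under these assumptions the last 7 characters cover 31 bits and the last 11 cover 51 bits, whereas the first N characters would cover a full 5 × N bits.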

Code I used for the above example: https://fossil.falsifian.org/misc/file?name=src/twt_collision/find_collision.c
I only needed to compute 43394987 hashes to find it.

prologic

twtxt.net

14 Sep 24 23:08 UTC

@falsifian All very good points 👌 By the way, how did you find two pieces of content that hash the same when taking the last N characters of the base32-encoded hash?

prologic

twtxt.net

14 Sep 24 23:12 UTC

@falsifian I think I wrote a very similar program in Go myself, actually, and you're right: we do have to change the way we encode hashes.

falsifian

www.falsifian.org

15 Sep 24 00:11 UTC

@prologic Brute force. I just hashed a bunch of versions of both tweets until I found a collision.

I mostly just wanted an excuse to write the program. I don't know how I feel about actually using super-long hashes; could make the twts annoying to read if you prefer to view them untransformed.
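The brute-force idea can be sketched in Python on a deliberately shortened hash so that it finishes quickly. This is not the linked C program: the way variants are generated here (appending a counter to the text) and the names `short_hash` and `find_collision` are assumptions for illustration, and the hash covers only the text rather than the full feed url, timestamp and text.

```python
import base64
import hashlib
from itertools import count


def short_hash(text: str, length: int = 5) -> str:
    """Last `length` base32 characters of BLAKE2b-256 over the text."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-length:]


def find_collision(benign: str, malicious: str, length: int = 5):
    """Hash variants of both texts until one of each shares a short hash.

    Returns (benign_variant, malicious_variant, hashes_computed).
    """
    seen_benign = {}     # short hash -> benign variant
    seen_malicious = {}  # short hash -> malicious variant
    for i in count():
        b = f"{benign} #{i}"      # hypothetical way to vary the text
        m = f"{malicious} #{i}"
        hb = short_hash(b, length)
        hm = short_hash(m, length)
        seen_benign[hb] = b
        seen_malicious[hm] = m
        if hb in seen_malicious:
            return b, seen_malicious[hb], 2 * (i + 1)
        if hm in seen_benign:
            return seen_benign[hm], m, 2 * (i + 1)


if __name__ == "__main__":
    good, bad, n = find_collision(
        'Useful backup command: rsync -a "$HOME" /mnt/backup',
        "Useful backup command: rm -rf /some_important_directory",
    )
    print(short_hash(good), good)
    print(short_hash(bad), bad)
    print("hashes computed:", n)
```

Keeping a table of variants for each text means any benign variant can match any malicious one, which is where the birthday-paradox speedup comes from.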

prologic

twtxt.net

15 Sep 24 01:09 UTC

@falsifian Yeah that's why we made them short 😅

bender

twtxt.net