$ echo -n "https://twtxt.net/user/prologic/twtxt.txt\n2020-07-18T12:39:52Z\nHello World! 😊" | sha1sum | head -c 11
87fd9b0ae4e
1. Hash Structure: Cryptographic hashes are designed so that every bit of the raw digest is uniformly distributed, so the first few characters of the encoded output use the full range of the alphabet. The last characters may not maintain this randomness, because the encoding of the final bits can be constrained by how the encoding pads its output.
2. Collision Resistance: When using hashes, the goal is to minimize the risk of collisions (different inputs producing the same output). By using the first few characters, you leverage the full distribution of the hash. The last characters may not distribute in the same way, potentially increasing the likelihood of collisions.
3. Encoding Characteristics: Base32 encoding maps 5 bits to each character, and a 256-bit digest is 51 full characters plus 1 leftover bit. The final character therefore encodes only that single bit, so the ending of the encoded string carries far less variability than the beginning.
4. Use Cases: In many applications (like generating unique identifiers), the beginning of the hash is often the most informative and varied. Relying on the end might reduce the uniqueness of generated identifiers, especially if a prefix has a specific context or meaning.
In summary, using the first n characters generally preserves the intended randomness and collision resistance of the hash, making it a safer choice in most cases.
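The encoding point is easy to demonstrate (a quick sketch, not part of the original discussion): a 256-bit digest is 51 × 5 bits plus 1 leftover bit, so the last character of its unpadded Base32 encoding can only ever be `A` (bit 0) or `Q` (bit 1), while the first character ranges over all 32 symbols:

```python
import base64
import hashlib
import os

# Compare the spread of the first vs. last character of the (unpadded)
# Base32 encoding of many random 32-byte SHA-256 digests.
first, last = set(), set()
for _ in range(2000):
    digest = hashlib.sha256(os.urandom(16)).digest()
    b32 = base64.b32encode(digest).decode("ascii").rstrip("=")
    first.add(b32[0])
    last.add(b32[-1])

print(sorted(last))   # ['A', 'Q'] -- the final char encodes a single bit
print(len(first))     # 32 -- the first char uses the whole alphabet
```

This is exactly why truncating from the end of an encoded digest can cost entropy that truncating from the front does not.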
> What I was referring to in the OP: Sometimes I check the workphone simply out of curiosity. 😂
The probability of a Twt Hash collision depends on the size of the hash and the number of possible values it can take. For the Twt Hash, which uses a Blake2b 256-bit hash, Base32 encoding, and takes the last 7 characters, the space of possible hash values is significantly reduced.
### Breakdown:
1. Base32 encoding: Each character in the Base32 encoding represents 5 bits of information (since \( 2^5 = 32 \)).
2. 7 characters: With 7 characters, the total number of possible hashes is:
\[ 32^7 = 2^{35} = 34{,}359{,}738{,}368 \]
This gives about 34.4 billion possible hash values.
### Probability of Collision:
The probability of a hash collision depends on the number of hashes generated and can be estimated using the Birthday Paradox. The paradox tells us that collisions are more likely than expected when hashing a large number of items.
The approximate formula for the probability of at least one collision after generating \(n\) hashes is:
\[ P(\text{collision}) \approx 1 - e^{-\frac{n^2}{2M}} \]
Where:
- \(n\) is the number of generated Twt Hashes.
- \(M = 32^7 = 34{,}359{,}738{,}368\) is the total number of possible hash values.
For practical purposes, here are some example probabilities for different numbers of hashes \(n\):
- For 1,000 hashes:
\[ P(\text{collision}) \approx 1 - e^{-\frac{1000^2}{2 \cdot 2^{35}}} \approx 0.0000146 \, \text{(0.0015%)} \]
- For 10,000 hashes:
\[ P(\text{collision}) \approx 1 - e^{-\frac{10000^2}{2 \cdot 2^{35}}} \approx 0.00145 \, \text{(0.15%)} \]
- For 100,000 hashes:
\[ P(\text{collision}) \approx 1 - e^{-\frac{100000^2}{2 \cdot 2^{35}}} \approx 0.135 \, \text{(13.5%)} \]
### Conclusion:
- For small to moderate numbers of hashes (up to around 10,000–100,000), the collision probability stays low.
- However, as the number of Twts grows toward a million, a collision becomes near-certain: the 50% mark is reached at roughly \( n \approx \sqrt{2M \ln 2} \approx 218{,}000 \) hashes, due to the relatively small hash space (about 34 billion values).
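These probabilities are straightforward to check numerically. A short sketch using the same birthday-paradox approximation, with \(M = 32^7\) as defined above:

```python
import math

M = 32 ** 7  # 2**35 = 34,359,738,368 possible 7-character Base32 hashes

def p_collision(n: int, m: int = M) -> float:
    """Birthday-paradox approximation: P(at least one collision among n)."""
    return 1 - math.exp(-n * n / (2 * m))

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} hashes -> {p_collision(n):.4%}")
```

Solving `p_collision(n) == 0.5` for `n` gives the familiar birthday bound `sqrt(2 * M * ln 2)`, roughly 218,000 twts for this hash space.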
* a0826a65 - Add debug sub-command to yarnc (7 weeks ago) <James Mills>
I'd recommend a `git pull && make build`.
It works for us because there are enough people around and there’s a good chance that someone will be able to help.
Really, I am glad that we have this model. The alternative would be actual on-call duty, like, this week you’re the poor bastard who is legally required to fix shit. That’s just horrible, I don’t want that. 😅
What I was referring to in the OP: Sometimes I check the workphone simply out of curiosity. 😂
$ echo -n "https://twtxt.net/user/prologic/twtxt.txt\n2020-07-18T12:39:52Z\nHello World! 😊" | sha256sum | base32 | tr -d '=' | tr 'A-Z' 'a-z' | tail -c 12
tdqmjaeawqu
There's no `debug` command on my local `yarnc`. Are you talking about a potential future implementation here?
> The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII.
https://www.rfc-editor.org/rfc/rfc2046.html#section-4.1.2
https://www.rfc-editor.org/rfc/rfc6657#section-4
$ echo -n "https://twtxt.net/user/prologic/twtxt.txt\n2020-07-18T12:39:52Z\nHello World! 😊" | sha256sum | base64 | tr -d '=' | tail -c 12
NWY4MSAgLQo
The blake2b 256-bit digest algorithm is no longer supported by the openssl tool; however, blake2s256 is. I'm not sure why 🤔
$ echo -n "https://twtxt.net/user/prologic/twtxt.txt\n2020-07-18T12:39:52Z\nHello World! 😊" | openssl dgst -blake2s256 -binary | base32 | tr -d '=' | tr 'A-Z' 'a-z' | tail -c 7
zq4fgq
This obviously produces the wrong hash, which should be `o6dsrga`, as indicated by the `yarnc hash` utility:
$ yarnc hash -u https://twtxt.net/user/prologic/twtxt.txt -t 2020-07-18T12:39:52Z "Hello World! 😊"
o6dsrga
But at least the shell pipeline is "correct".
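Since `openssl` no longer exposes blake2b-256, Python's `hashlib` (which supports blake2b with a configurable `digest_size`) is one way around the tooling churn. A sketch, assuming the recipe discussed here: blake2b-256 over `url\ntimestamp\ntext`, Base32-encode, strip padding, lowercase, keep the last 7 characters:

```python
import base64
import hashlib

def twt_hash(url: str, timestamp: str, text: str) -> str:
    # blake2b with a 32-byte (256-bit) digest; hashlib supports this natively
    payload = f"{url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Base32-encode, strip '=' padding, lowercase, take the last 7 characters
    b32 = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return b32[-7:]

print(twt_hash(
    "https://twtxt.net/user/prologic/twtxt.txt",
    "2020-07-18T12:39:52Z",
    "Hello World! 😊",
))  # o6dsrga, matching `yarnc hash` above
```

Unlike the shell pipeline, there is no trailing-newline or hex-vs-binary pitfall to trip over here.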
openssl and b2sum -- Just trying to figure out how to make a shell pipeline again (_if you really want that_); as tools keep changing god damnit 🤣
$ ./yarnc debug ~/Public/twtxt.txt | tail -n 1
kp4zitq 2024-09-08T02:08:45Z (#wsdbfna) @<aelaraji https://aelaraji.com/twtxt.txt> My work has this thing called "compressed work", where you can **buy** extra time off (_as much as 4 additional weeks_) per year. It comes out of your pay though, so it's not exactly a 4-day work week but it could be useful, just haven't tired it yet as I'm not entirely sure how it'll affect my net pay
`yarnc`, and it would work for a simple twtxt. Now, for a more convoluted one it truly becomes a nightmare using that tool for the job. I know there are talks about changing this hash, so this might be a moot point *right now*, but it would be nice to have a tool that:
1. Would calculate the hash of a twtxt in a file.
2. Would calculate all hashes on a `twtxt.txt` (local and remote).
Again, something lovely to have after any looming changes occur.
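A minimal sketch of the second wish for local files (assuming the blake2b-256/Base32 scheme discussed earlier and the standard `timestamp<TAB>text` twtxt line format; the feed URL must be supplied, since it is part of the hash input but not of the file, and remote fetching is left out):

```python
import base64
import hashlib

def twt_hash(url: str, timestamp: str, text: str) -> str:
    payload = f"{url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()[-7:]

def hash_feed(path: str, feed_url: str) -> list[tuple[str, str]]:
    """Return (hash, timestamp) for every twt in a local twtxt.txt."""
    results = []
    with open(path, encoding="utf-8") as feed:
        for line in feed:
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and metadata/comment lines
            timestamp, _, text = line.partition("\t")
            results.append((twt_hash(feed_url, timestamp, text), timestamp))
    return results

# Example usage (hypothetical path and URL):
# for h, ts in hash_feed("twtxt.txt", "https://example.com/twtxt.txt"):
#     print(h, ts)
```

Extending this to remote feeds would only need an HTTP fetch in front of the same per-line loop.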
#!/bin/bash
set -e
trap 'echo "!! Something went wrong...!!"' ERR
#============= Variables ==========#
# Source files
LOCAL_DIR=$HOME/twtxt
TWTXT=$LOCAL_DIR/twtxt.txt
HTML=$LOCAL_DIR/log.html
TEMPLATE=$LOCAL_DIR/template.tmpl
# Destination
REMOTE_HOST=remoteHostName # Host already set up in ~/.ssh/config
WEB_DIR="path/to/html/content"
GOPHER_DIR="path/to/phlog/content"
GEMINI_DIR="path/to/gemini-capsule/content"
DIST_DIRS=("$WEB_DIR" "$GOPHER_DIR" "$GEMINI_DIR")
#============ Functions ===========#
# Building log.html:
build_page() {
    twtxt2html -T "$TEMPLATE" "$TWTXT" > "$HTML"
}
# Bulk Copy files to their destinations:
copy_files() {
    for DIR in "${DIST_DIRS[@]}"; do
        # Copy both `txt` and `html` files to the Web server and only `txt`
        # to gemini and gopher server content folders
        if [ "$DIR" == "$WEB_DIR" ]; then
            scp -C "$TWTXT" "$HTML" "$REMOTE_HOST:$DIR/"
        else
            scp -C "$TWTXT" "$REMOTE_HOST:$DIR/"
        fi
    done
}
#========== Call to functions ===========#
build_page && copy_files
It of course completely sucks at doing anything "intelligent" or complex, but if you just use it as a fancier auto complete it's actually half way decent 👌
Admittedly, I've been guilty of doing this myself sometimes. 🤦♂️ Sometimes, though, I think it's OK to show your achievements. 👌
Thank gods those posting their hideous squares have finally quieted down. LOL.
Isn't everything utf-8 anyway these days if not specified? 🤔 I mean, basically everything almost always uses utf-8 encoding, right? 😅
Neither `jenny` nor `yarnd` supports it at all. This was treated as a thread because I picked one of @falsifian's twtxts (with the "old subject") and replied to it (hence starting the thread).
`yarnd` makes this user-configurable via a preference 🤣 And displays both 😅