# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 19
# self = https://watcher.sour.is/conv/7kuwtmq
Ready for clone https://github.com/sorenpeter/pixelblog
@darch Yiha! I just explored the code online a little bit. The very first thing that came to mind is that you probably want to maintain a *.gitignore* in your repo and at least add this silly .DS_Store to it, it's of no use. Since git only tracks files and not directories some of the folders would be empty. So they do not exist after cloning the repository. There are two commonly used approaches:

1.) The software just creates the directories if they're not present. In my opinion that's the best solution 99% of the time.
2.) Add and commit an empty file, often named *.gitkeep* or something similar.

Also temporary editor files are very good candidates to exclude from git. They of course depend on your favorite editor, I always add *.sw? for Vim swap files and also *~ for good measure. Some editors I used in the past just append a tilde to their temp files, so it's an old habit. Of course, there are plenty of different suffixes, extensions and what not. I tell people to just start out with those the original author uses.

Other than some typos in the README and comments I haven't tried this out. A few years back I made the resolution to never execute PHP code again if I can help it. 8-)
@lyse

> A few years back I made the resolution to never execute PHP code again if I can help it. 8-)

Sadly the same here too, I used to actually be a PHP Web Developer once upon a time, never again 😅

----

But congrats @darch I hope pixelblog flourishes! Who knows maybe we can incorporate some ideas into yarnd over time 👌
@darch I just sent you a merge request that fixes the typos I spotted.

Now to one very severe security flaw, the filesystem traversal attack. You must never ever trust user input. Never. Ever. Not in a hundred years. Or the devil himself will kidnap your kids, your wife and yourself to steal their souls, rape and then painfully kill all of them.

Back to the user input. This also goes for the filename of the submitted file upload. You either have to sanitize it or, even better, just generate a new one that you know is safe. Currently you just use the user's filename, replace spaces with hyphens, convert it to lowercase and prepend the date. So basically, doing nothing in terms of sanitizing. But if the filename contains slashes, you're basically fucked. Imagine a user-supplied filename of ../../../../../etc/passwd or something similar. It will then overwrite system data or any of your scripts or whatever, if the user running the PHP script has sufficient permissions. Which it often has, at least enough to overwrite your own PHP scripts. So you should extract the submitted filename's basename at the very bare minimum. That would result in passwd in the example above. Maybe there are even more PHP-specific things to keep in mind, I don't know.

Okay, granted you check for the existence of the final file and abort, but it would still be possible to sneak files into places where they truly do not belong. Like optional configuration files an application would read if present but ignore if missing.
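
To make the basename idea concrete, here is a minimal sketch. It is written in Python rather than PHP, so treat it as an illustration of the logic only; PHP's `basename()` plays the role of `os.path.basename` here. The function name `safe_destination` and the upload directory passed to it are hypothetical and not taken from pixelblog.

```python
import os


def safe_destination(user_filename: str, upload_dir: str) -> str:
    """Return a path inside upload_dir, or raise ValueError (hypothetical helper)."""
    # Treat backslashes as separators too, then keep only the last component.
    name = os.path.basename(user_filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("no usable filename")
    root = os.path.realpath(upload_dir)
    dest = os.path.realpath(os.path.join(root, name))
    # Second line of defence: the resolved path must still be under the upload root.
    if os.path.commonpath([root, dest]) != root:
        raise ValueError("path escapes upload directory")
    return dest
```

With this, ../../../../../etc/passwd ends up as a file called passwd inside the upload directory instead of anywhere near /etc.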

Also checking the file extension to determine whether a file is of a certain type doesn't really work. You can just lie about the extension.
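
One way to look at the content instead of the extension is to compare the first bytes of the upload against known image signatures. The sketch below is in Python and only illustrates the idea; `sniff_image_type` and the signature table are made up for this example and cover just three formats. A matching header is still no proof that the rest of the file is a valid image, so actually decoding it with an image library would be the stronger check.

```python
# Leading "magic bytes" of a few common image formats.
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data: bytes):
    """Return the detected image type, or None if the header is unknown."""
    for signature, kind in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None
```

A PHP script renamed to holiday.jpg starts with `<?php` rather than one of these signatures, so it would come back as `None` no matter what its extension claims.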

I'm heading to bed now. Happy fixing my friend! :-)
@lyse where do I sign up to subscribe to that devil’s treatment? It sounds so enticing! Shagadelic, baby. Oh, stop it! 🤣
@darch Why PHP based and not pp based?
@adi Because apparently PHP is still commonly found in many cheap/free? hosting providers where allegedly nothing else is allowed? 🤔 Which to be frank, I don't really get because Lightsail is essentially free, Vultr VMs are a mere $2.50/month and I'm sure you could do interesting things on some rented space from a friend's server or infra 😅
@lyse Thanks for the PR fixing the typos and for reminding me to get rid of the Apple garbage files.
@lyse So to sanitize the files going through upload.php, is something like preg_replace("/[^a-z0-9\\.]/", "", strtolower($str)); // from: http://www.touchoftechnology.com/simple-way-to-clean-up-filenames-in-php/ enough, or should I use this https://gist.github.com/sumanthkumarc/2de2e2cc06c648a9f52c121501a181df or something completely different?

In relation to checking whether the uploaded files are in fact images, is this code from https://www.w3schools.com/php/php_file_upload.asp good?
@prologic You are more than welcome to nick whatever ideas and features you like - the only really new thing pixelblog is adding is the gallery view, which you are free to rewrite in Go.
@adi and @prologic to the question about PHP. My goal is to make something anyone with FTP access can deploy in an hour without having to use a command prompt. SSH access also often comes at an extra fee, if available at all. I know PHP is not the most efficient out there compared to Go and static site generators like pp, but my target users are not professional programmers like you guys - pixelblog - a twtxt frontend not just for hackers™
@darch I understand your points, but this is very cheap https://tinykvm.com/, not trying to convince you of anything.
@darch So it depends a little bit on your requirements whether you want to *somehow* preserve the original filename or if it is okay to just come up with a "random" one that has nothing to do with the user-supplied one. For the latter you could just use UUIDs, they're unique and you're done. Collisions are super unlikely.

Maybe even just use the current Unix timestamp in milli-, micro- or nanoseconds. Seconds-only precision increases the danger of collisions with parallel uploads. In any case you should check for duplicate filenames in case of clock adjustments. It's super simple and fast, though.

Or you could hash the data and use that as the filename, again checking for duplicates. That has the advantage that you can detect identical file uploads. Not entirely sure if that property is something you really want, but might work out in your favor. Uploading the exact same image is probably not of much use. Any hashing algorithm will do, cryptographic ones should be favored. Hashing does not come for free, some computational effort is required which heavily varies with the selected algorithm.
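
The three generated-name approaches can be sketched like this. It is Python for illustration only, all three function names are hypothetical, and the `extension` argument is assumed to come from your own type detection rather than from the user's filename. SHA-256 stands in here as one possible cryptographic hash.

```python
import hashlib
import time
import uuid


def uuid_name(extension: str) -> str:
    """Random name, independent of anything the user sent."""
    return f"{uuid.uuid4().hex}.{extension}"


def timestamp_name(extension: str) -> str:
    """Nanosecond Unix timestamp; still check for duplicates before writing."""
    return f"{time.time_ns()}.{extension}"


def hash_name(data: bytes, extension: str) -> str:
    """Content-derived name; identical uploads map to the same name."""
    return f"{hashlib.sha256(data).hexdigest()}.{extension}"
```

Only `hash_name` gives the same result for the same bytes, which is what makes duplicate uploads detectable; the other two produce a fresh name on every call.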

Now, if you want to keep as much of the original filename as possible for whatever reason, then basename($filename) is a very good start. Limiting even further to only alphanumeric characters plus dot (.), underscore (_) and dash (-) makes the result a tad better. (Make sure to put the dash as the last character in the character class of the regular expression, so it is taken literally and not as a range.) But then you also need to check for duplicates and handle them somehow, since höllo.jpg and høllo.jpg would both be truncated to the same name (hllo.jpg). Might be completely different images, though. Your filename might also end up (quite) empty or just consist of your extension (depending on the order of checks). You can easily see there are quite some things to be aware of with that whitelist approach.
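
A small sketch of that whitelist approach, again in Python for illustration; in PHP the same character class would go into `preg_replace`. The helper name `whitelist_name` is hypothetical.

```python
import os
import re


def whitelist_name(user_filename: str) -> str:
    """Keep only the basename, then drop every character outside the whitelist."""
    name = os.path.basename(user_filename.replace("\\", "/"))
    # The dash comes last in the character class so it is a literal dash.
    return re.sub(r"[^A-Za-z0-9._-]", "", name)
```

Both höllo.jpg and høllo.jpg come out as hllo.jpg, and a name made up entirely of non-ASCII characters is reduced to just .jpg, which is exactly why the duplicate and empty-name checks are still needed afterwards.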

So unless you really have to, I'd strongly recommend going the generated filename route. It'll make your life easier. Pick the approach whose properties suit your use case. Personally, I'd select UUIDs or hashing (probably SHA-1 or even successors).
@lyse in an easy to read twt you not only managed to explain to @darch the issues his code has, and potential solutions, but by now he fully understands he shouldn't be meddling with PHP programming, and should instead use a solution provided by people who do that for a living. 🤣 It will not be fully foolproof, but certainly better than what he has right now.
@david where is the fun in that 🤣
@darch I'll give you points for that! I agree it is fun to do what you like, and enjoy the fruits of your creation. You are almost there, only needing a few changes here and there. Don't give up, mate! If I come across as giving you a hard time, please understand I am simply pushing you to do better. ☺️ Good luck, and code on!