# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 41
# self = https://watcher.sour.is/conv/vm5bptq
Is this what I'm supposed to be seeing here?

cc @stackeffect
@prologic off topic, it looks like you need to update Chrome. 😝
@fastidious Bahahahaha 🤣 Sure fine I'll update 😂
@prologic @movq
Exactly, you see the correct UTF-8 encoded version (even with Content-Type: text/plain leaving out the charset declaration).

After following the utf8test twtxt feed myself, I now see that jenny does not handle it as UTF-8 when the charset is missing from the HTTP header, just like @quark has observed.

So should jenny always treat twtxt files as UTF-8 encoded? I'm not sure about this.
@stackeffect I think it should, yes.
@stackeffect jenny defers this decision to the requests library:

https://docs.python-requests.org/en/latest/user/advanced/#encodings

Honestly, I’d rather not interfere with that.

They refer to RFC 2616, which indeed says ISO-8859-1 should be the default for text/plain. However, RFC 7231 says in appendix B that this has been removed and it’s now up to the media type. When we look at https://www.iana.org/assignments/media-types/media-types.xhtml#text, we see RFC 2046 listed for text/plain. RFC 2046, including its update RFC 6657, specifies US-ASCII as the default (https://www.rfc-editor.org/rfc/rfc6657#section-4). So, uhm, which one is correct? ISO-8859-1 or US-ASCII? Neither of them specifies UTF-8 as a default for text/plain, though; that only applies to *new* text media type registrations.

It’s a rabbit hole. That’s why I’d like to defer this to requests.
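Note: the effect of those competing defaults can be illustrated with a minimal sketch. It uses only the Python standard library and is not jenny's or requests' actual code. The helper names, the sample feed line, and the choice of ISO-8859-1 as the fallback (the RFC 2616 default mentioned above) are assumptions made for illustration.

```python
from email.message import Message

# Assumption: ISO-8859-1 is the fallback for text/* without a charset,
# as RFC 2616 specified and as discussed in this thread.
LEGACY_DEFAULT = "iso-8859-1"


def charset_from_content_type(content_type: str, default: str = LEGACY_DEFAULT) -> str:
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def decode_body(body: bytes, content_type: str) -> str:
    """Decode a response body using the charset implied by its Content-Type."""
    return body.decode(charset_from_content_type(content_type))


# A hypothetical feed line with non-ASCII characters, encoded as UTF-8.
raw = "2021-10-01T12:00:00Z\tGrüße, twtxt 😁".encode("utf-8")

with_charset = decode_body(raw, "text/plain; charset=utf-8")  # decoded as UTF-8
without_charset = decode_body(raw, "text/plain")  # decoded as ISO-8859-1, garbled
```

Decoding with ISO-8859-1 never raises an error, because every byte maps to some character. That is why a missing charset shows up as garbled text rather than as a failure.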
@movq Is this when we link to ? 🤔 😂
@prologic I haven’t opened that image, but I’m pretty sure I know what it is. 😁 😁
@movq Hehe 😁
@movq
I'm not a Python programmer, so please bear with me.
The doc about encodings does also mention:

    If you require a different encoding, you can manually set the Response.encoding property

Wouldn't that be a one liner like (Ruby example)?

    'some text'.force_encoding('utf-8')

I understand that you do not want to interfere with requests. On the other hand, we know that received data must be UTF-8 (by the twtxt spec), and it burdens "publishers" to somehow add a charset property to the Content-Type header. But again, I'm not sure what "the right thing to do" (TM) is.
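Note: a Python counterpart to the Ruby one-liner might look like the sketch below. It works on raw bytes with the standard library only, so it runs without a network. `decode_twtxt_feed` is a hypothetical helper, not jenny's code, and replacing invalid bytes instead of raising is an assumption.

```python
def decode_twtxt_feed(body: bytes) -> str:
    """Decode a fetched feed as UTF-8, regardless of any Content-Type header.

    Assumption: invalid byte sequences are replaced with U+FFFD instead of
    raising, so one broken twt does not discard the whole feed.
    """
    return body.decode("utf-8", errors="replace")


# Hypothetical feed content, as it might arrive from a server that sends
# "Content-Type: text/plain" without a charset parameter.
body = "2021-10-01T12:00:00Z\tIt works 👌\n".encode("utf-8")

text = decode_twtxt_feed(body)
```

With requests, the equivalent would presumably be to assign "utf-8" to the Response.encoding property quoted above before reading Response.text.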
@stackeffect Hmmmmm, good point. I completely forgot that twtxt enforces UTF-8. 🤦 I think I'll change this, yes.
@stackeffect I pushed commit cb02422 to the repo. 👌

(Yes, the change is super simple. I just wasn’t sure earlier if I *wanted* to do this. But you’re absolutely right, twtxt says feeds must be UTF-8, so there’s no point in caring about the Content-Type header at all.)
Fun fact, I had to jump through some hoops to configure *my own web server* to serve my feed with Content-Type: text/plain; charset=utf-8 instead of just Content-Type: text/plain. 🤣 Maybe I’ll remove that hack from my config now …
@movq
Updated. Would it be possible for the subject to be moved to the beginning instead (like Yarn and tt do)?
@movq
I just pulled it, works like a charm (as expected) ;-)
@movq 👌
@movq It is still a good idea to have the correct header on your web server 😆 so that web browsers also render the text correctly, because web browsers definitely do not assume the character encoding of the content.