# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 15
# self = https://watcher.sour.is/conv/y7ldi4q
Google Says It'll Scrape Everything You Post Online for AI

> Google updated its privacy policy over the weekend, explicitly saying the company reserves the right to scrape just about everything you post online to build its AI tools.

Google can eat shit.
@abucci Oh 😱 Hmmm 🤔
Time to add


<meta name="googlebot" content="noindex,nofollow">


to everything I guess.
@prologic They were almost certainly doing this already, but now they're codifying it in their policies, essentially claiming ownership over everyone's web pages.
@abucci To be fair, it was already codified there. What is more interesting (to me) is how they're using a privacy policy (binding their users) in an attempt to get implicit licensing over materials out of the scope of those services, both from their users and others (or of authors unknown). Not that it matters much, I bet they'd argue such a license is unneeded, but the fact that they decided to have that wording there makes me curious about the legal basis of such a clause. Yes, I know Google has an extensive and capable legal team, but I'd still love seeing a legal analysis of the applicability of that under various jurisdictions.
@marado It can't possibly be defensible, which to me always signals an attempt at a power grab. They never explicitly said "we will use anything we scrape from the web to train our AI" before--that's new. There is growing pushback against that practice, with numerous legal cases winding through the legal system right now. Some day those cases will be heard and decided on by judges. So they're trying to get out ahead of that, in my opinion, and cement their claims to this data before there's a precedent set.
This should work as a robots.txt, right?


User-agent: Googlebot
Disallow: /
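
One way to sanity-check those two lines is to feed them to Python's standard-library `urllib.robotparser` and ask what it would permit. This is only a sketch: the `is_allowed` helper and the `example.com` URL are made up for illustration, and it shows how Python's parser reads the rules, not how Google's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted in the thread, verbatim.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of fetching them over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some/page"  # placeholder URL (assumption)
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(agent, page))
```

Under this parser the rules block `Googlebot` from every path and leave other user agents untouched, since no `User-agent: *` group is present.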
@abucci Where they now say they use it to train their AI models, they used to say "for language models", which isn't all that different (possibly extending the scope from text to images, audio and video?).
@marado It's very different. Language models are part of traditional search engines and translation engines. The new policy mentions Cloud AI and Bard specifically. This is a weird change and probably a good preemptive move, as I said previously. I'm not sure why you're downplaying it.