# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 10
# self = https://watcher.sour.is/conv/pnmdbsq
AI is changing scientists’ understanding of language learning | Ars Technica

OK, we can't have this debate all over again.
> But new insights into language learning are coming from an unlikely source: artificial intelligence. A new breed of large AI language models can write newspaper articles, poetry, and computer code and answer questions truthfully after being exposed to vast amounts of language input. And even more astonishingly, they all do it without the help of grammar.

The "AI" (actually neural network in most cases) models that can do this thing are exposed to trillions and trillions of words worth of *written* text. The typical person is exposed to many many orders of magnitude less language data over their whole lives, let alone in the first two years of life--you'd have to experience 16 words per second, every second, minute, and hour of every day for two years.
Yes, they mention this new (not yet peer reviewed) version of GPT-2 might be able to perform similarly with significantly less input. Still, it's carefully curated, written text input, and it's still a huge amount. There's no way people learn language like that. There has to be something else going on.
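The rough numbers behind that comparison, as a sanity check. The one-billion-word target and the two-year window are my own illustrative assumptions; the trillion-word corpus is just the ballpark usually quoted for these models:

```python
# Back-of-envelope: words per second a child would need to hear
# to match even a tiny fraction of an LLM-scale training corpus.
SECONDS_PER_YEAR = 365 * 24 * 3600     # ~31.5 million

corpus_words = 1e12                    # ~a trillion words (rough LLM-scale corpus)
target_words = 1e9                     # a "mere" billion words for the child
years = 2

rate = target_words / (years * SECONDS_PER_YEAR)
print(f"{rate:.1f} words/sec, nonstop, for {years} years")            # ~15.9 words/sec
print(f"and that is only {target_words / corpus_words:.1%} of the corpus")  # 0.1%
```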

Meanwhile, what doesn't get hyped at all, for some reason, is that there are wide-coverage, deep parsing techniques that produce deep semantic representations of the text they ingest, whose output looks a lot more like what people seem to do with language. I guess that's because those methods require some knowledge of linguistics to understand, and they don't produce headline-grabbing performance on simple tasks the way neural networks do.
You could literally re-write this same story by cherry-picking a best-in-class deep parser result and saying that scientists are re-affirming their belief in deep knowledge structures based on the latest results in computational linguistics. That's how you know it's a B.S. hype story and not grounded in evidence.
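To make that a bit more concrete: the deep grammar-based parsers meant here (HPSG/ERG-style systems emitting full semantic structures) don't fit in a few lines, but even a shallow off-the-shelf parse shows the kind of explicit, inspectable structure these symbolic pipelines produce, as opposed to a pile of opaque weights. A minimal sketch using spaCy's dependency parser as a stand-in (spaCy and the en_core_web_sm model are my choices, not anything mentioned in the thread):

```python
# A shallow stand-in for the deeper semantic parsers discussed above:
# a dependency parse gives every token an explicit grammatical role
# and head -- structure you can inspect and reason about directly.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Large language models write poetry without any grammar.")

for token in doc:
    print(f"{token.text:<10} {token.dep_:<10} head={token.head.text}")
```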
Interesting stuff πŸ€”
@prologic I hate all the hype around things like GPT-3 so much. A model with 200 billion parameters that ingested a trillion words of text damn well better perform well. It should also be able to make coffee and play chess and sing.
@abucci I agree! 💯 But aside from the "hype" (also agree), the _thing_ that really pissed me off about all this great "Machine Learning" (AI) -- really just very, very large neural nets -- is actually two things: a) it's all fucking written in Python, and b) despite OpenAI promoting "Open Source" and "Open Standards" for AI and Machine Learning, do you think you can get GPT-3 running anywhere besides their fucking cloud-based SaaS API?! 🙄 🤦‍♂️

I mean c'mon, fuck me. If you're going to build a company called "OpenAI" -- at least make an effort to let ordinary folks like you and me run a GPT-3 model on, say, a modest machine with an NVIDIA GPU or two. But no 🤦‍♂️
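As an aside on the "modest machine with a GPU or two" bit: even if the weights were downloadable, the memory math alone is brutal. A back-of-envelope sketch, assuming GPT-3's published 175B parameter count, FP16 weights, and a 24 GB consumer card (the precision and card size are my assumptions, not figures from this thread):

```python
# Why a "modest machine with a GPU or two" can't hold GPT-3, open weights or not.
# Assumptions: 175e9 parameters, 2 bytes each (FP16), weights only
# (no activations, KV cache, or optimizer state).
PARAMS = 175e9
BYTES_PER_PARAM = 2
GPU_MEM_BYTES = 24e9            # a high-end consumer card (24 GB)

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")                                  # ~350 GB
print(f"GPUs needed just to hold them: ~{weights_gb * 1e9 / GPU_MEM_BYTES:.0f}")  # ~15
```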
@prologic some of those language models take tens or hundreds of days to train, even spread across clusters of NVIDIA A100s, and use the electricity it would take to run a town. These results aren't replicable by anyone but the largest organizations.
Do you remember how, before they released these GPT language models, OpenAI was putting out press releases like "we found this amazing AI for NLP but it would be irresponsible of us to release it to the public because it's TOO GOOD"? Utter hype.
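And on the training-cost point above, a rough back-of-envelope sketch of why only the biggest players can do this, assuming the widely cited ~3.14e23 FLOPs estimate for GPT-3's training run, the A100's ~312 TFLOPS peak FP16 throughput, and ~30% sustained utilization (all assumptions on my part, not figures from the thread):

```python
# Back-of-envelope: how long GPT-3-scale training takes on A100s.
# Assumptions: ~3.14e23 training FLOPs, A100 peak ~312 TFLOPS (FP16),
# ~30% sustained utilization.
TRAIN_FLOPS = 3.14e23
A100_PEAK_FLOPS = 312e12
UTILIZATION = 0.30
SECONDS_PER_DAY = 86_400

def days_to_train(num_gpus: int) -> float:
    sustained = num_gpus * A100_PEAK_FLOPS * UTILIZATION
    return TRAIN_FLOPS / sustained / SECONDS_PER_DAY

for n in (1, 256, 1024):
    print(f"{n:>5} A100s -> ~{days_to_train(n):,.0f} days")
# ~38,800 days (about a century) on one GPU;
# ~150 days on 256 GPUs; ~38 days on 1,024 GPUs.
```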