The best description I've heard of it is that it's an extremely complicated autocomplete. The way it (most likely) works is that it reads through the sequence of text a user enters (the prompt), and then begins generating the next words that it deems likely to follow the prompt text. Very much like how autocomplete on a smartphone keyboard works. It's a generative model, which means the neural network is probably being trained to learn the mean, standard deviation, and possibly other statistics of some probabilistic generative model (undescribed by OpenAI, to my knowledge). There were some advances in LSTMs around the time GPT was becoming popular, so it's possible they use a variant of that.
Hope that suffices for now!
> Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model released in 2020 that uses deep learning to produce human-like text. Given an initial text as prompt, it will produce text that continues the prompt.
Which is even more confusing to me, mostly because it doesn't mention a neural network at all. Basically, I was (on my short-lived holiday) doing some R&D on neural networks, evolutionary algorithms, and other reading 😅
Like, if you type "The dog is", autocomplete will suggest some words for you that are likely to come next. Maybe "barking", "wet", "hungry", ... It ranks those by how probable it rates each follow-up word. It probably won't suggest words like "uranium" or "quickly", because you very rarely (if ever) encounter those words after "The dog is" in English sentences, so their probability is very low.
👆 That's the "autoregressive" part.
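If it helps, here's a toy Python sketch of that ranking idea. The words and probabilities are completely made up, not pulled from any real model:

```python
# Toy illustration with made-up numbers -- nothing here comes from GPT itself.
# The idea: for a given context, each candidate next word has a probability,
# and autocomplete just ranks the candidates by that probability.
next_word_probs = {
    "The dog is": {
        "barking": 0.30,
        "hungry": 0.25,
        "wet": 0.15,
        "asleep": 0.10,
        "uranium": 0.0001,   # almost never follows "The dog is"
    },
}

context = "The dog is"
ranked = sorted(next_word_probs[context].items(), key=lambda kv: kv[1], reverse=True)
for word, prob in ranked:
    print(f"{word}: {prob:.4f}")
```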
It gets these probabilities from a "language model", which is a fancy way of saying a table of probabilities. A literal lookup table of the probabilities would be wayyyyy too big to be practical, so neural networks are often used as a representation of the lookup table, and deep learning (many-layered neural networks + a learning algorithm) is the hotness lately so they use that.
👆 That's the "language model" part.
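Here's a rough back-of-the-envelope sketch of why the literal table is hopeless and what stands in for it. The sizes are guesses, and the "model" here is just a random linear layer with a softmax, only there to show the shape of the idea (the real thing is a huge trained transformer):

```python
import numpy as np

# Back-of-the-envelope numbers (made up, just to show the order of magnitude).
vocab_size = 50_000      # a GPT-style vocabulary is roughly this size
context_length = 10      # even a short context...
possible_contexts = vocab_size ** context_length
print(f"~{possible_contexts:.1e} contexts to store")  # astronomically many rows

# So instead of a table, a function maps a context to one probability per word.
def language_model(context_vector, weights):
    logits = weights @ context_vector           # a "score" for every vocab word
    exps = np.exp(logits - logits.max())        # softmax turns scores into
    return exps / exps.sum()                    # probabilities that sum to 1

rng = np.random.default_rng(0)
probs = language_model(rng.normal(size=64), rng.normal(size=(vocab_size, 64)))
print(probs.shape, round(probs.sum(), 3))       # (50000,) 1.0
```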
So, you enter a prompt for ChatGPT. It runs fancy autocomplete to pick a word that should come next. It runs fancy autocomplete again to see what word will come next *after the last word it predicted and some of your prompt words*. Repeat to generate as many words as needed. There's probably a heuristic or a special "END OF CHAT" token to indicate when it should stop generating and send its response to you. Uppercase and lowercase versions of the tokens are in there so it can generate those. Punctuation is in there so it can generate that. With a good enough "language model", it'll do nice stuff like close parens and quotes, correctly start a sentence with a capital letter, add paragraph breaks, and so on.
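Something like this loop, in very hand-wavy Python. The hard-coded table and the `<END>` token name are made up; the real thing runs a giant network instead of `fake_next_word_probs`:

```python
import random

END = "<END>"   # made-up stop token; stands in for whatever the real system uses

def fake_next_word_probs(context):
    # Stand-in for the real network: a tiny hard-coded table of continuations.
    canned = {
        ("The",): {"dog": 0.7, "cat": 0.3},
        ("The", "dog"): {"is": 0.9, "was": 0.1},
        ("The", "dog", "was"): {"barking": 1.0},
        ("The", "dog", "is"): {"barking": 0.6, "hungry": 0.4},
        ("The", "cat"): {"is": 1.0},
        ("The", "cat", "is"): {"asleep": 1.0},
    }
    return canned.get(tuple(context), {END: 1.0})

def generate(prompt_words, max_words=20):
    words = list(prompt_words)
    while len(words) < max_words:
        probs = fake_next_word_probs(words)                         # fancy autocomplete
        candidates, weights = zip(*probs.items())
        next_word = random.choices(candidates, weights=weights)[0]  # pick by probability
        if next_word == END:                                        # stop token -> done
            break
        words.append(next_word)                                     # feed it back in, repeat
    return " ".join(words)

print(generate(["The"]))   # e.g. "The dog is barking"
```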
There's really not much more to it than that, aside from a crapton of engineering to make all that work at the scale they're doing it.
Sadly I didn't come across RNNs though 😆 But that doesn't matter 🤔