Chomsky on ChatGPT
Ive seen a decent amount of discussion about ChatGBT and AI. Im gifting (i.e., everyone should be able to read it without a paywall) this New York Times op-ed from a few weeks ago: Noam Chomsky: The False Promise of ChatGPT
I think itd be interesting to discuss this perspective. What do we make of it? Convincing? Is it premature, given ChatGBT is in its infancy?
Heres a quote that stood out to me:
I think itd be interesting to discuss this perspective. What do we make of it? Convincing? Is it premature, given ChatGBT is in its infancy?
Heres a quote that stood out to me:
It is at once comic and tragic, as Borges might have noted, that so much money and attention should be concentrated on so little a thing something so trivial when contrasted with the human mind, which by dint of language, in the words of Wilhelm von Humboldt, can make infinite use of finite means, creating ideas and theories with universal reach.
The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question. On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations.
Comments (34)
Many of the skills that GPT-4 can now exhibits had been deemed by nativist linguists and cognitive scientists like Noam Chomsky and Gary Marcus to be impossible for large predictive language models to acquire by means of training alone. Those results are a challenge for the nativist assumptions. Other linguists, like Daniel Everett, who were critical of nativism, are better equipped for understanding how the emergent cognitive abilities of large language models could have arisen similarly to the way plastic human brains also enable language acquisition through training.
I had discussed some of those issues with GPT-4 yesterday. See the second half of this post.
I can't rule out that it might be a scholastic parrot, but I may be Aristotelianizing it.
So, if bots can reason, according to this their reasoning would be confined to deductive and inductive reasoning, and they are incapable of abductive reasoning, or in the words of the article, "creative conjecture".
Also, whenever I hear of programs like ChatGTP, I like to imagine the future relation between humans and AI will look something like this:
In terms of it resembling human thought it may in some instances pass as human but upon further interrogation of its internal logic its limitations are laid bare.
And now on version 4 of chatGPT they charge the gullible punter $$ to use. A bastardisation of openAI indeed
I've fed ChatGPT a fictional story from a show that didn't exist at the September 2021 cutoff date for it's training data, and the AI is pretty good at summarizing the story, drawing inferences about the characters and their motivations, and asking questions not answered by the show so far. I'd say it was about on par with your average online comment.
I've also asked it to take characters it knows about from older stories and have them interact in a new scenario. You can have it show the characters thoughts, and it's a decent story teller. You can have them play a hand of poker. I invented a simple game to play with it, and it mostly got the rules correct. When it didn't, I could tell it what it got wrong, and it would correct itself.
I would say stochastic parrot is too narrow. It seems clear there are emergent behaviors from the more complex models like 3.5 and 4 where it's some building internal models to output the correct text. It doesn't understand like we do, but it does seem to have an increasingly emergent understanding of the semantic meanings embedded in our languages.
I quite agree. It seems to me that one of the most stringent limitations that large (pre-trained transformer) language models currently have is their lacking a short term memory module (or function) that could enable the tacit mental representations that their cognitive processing rely on to be carried from one iteration of the dialogue to the next, or that could enable some of them to be maintained firmly in place while internal deliberation occurs. While there are some work arounds, such as prompting the LLM to think explicitly step by step, they are limited in their efficiency due to the one-way nature of the interaction of the neural network with the models "internal monologue" outputs. The latter can function as an external cognitive aid, allowing the network to offload part of the cognitive burden of the task onto the context window, but those outputs can't be updated dynamically in light of the revised contextual understanding the neural network develops. This is unlike human beings who can reinterpret their own virtual internal monologue on the fly while they think a problem through, or easily revise the opinions they already have expressed out loud. (GPT-4 is much better than its predecessors at correcting its errors, but their lingering presence in the context window thereafter tends to confuse the predictive language model who is poor at dealing with inconsistencies.)
With all of that being said, what currently gives human beings the upper hand in the cogitative department, contrary to what many seem to believe, is that our ability to reason dynamically is an emergent features of our mastery of language that is quite similar the the LLMs' one, and that we owe to the fact that, just like the LLMs, we do *not* usually have to rely on explicit and self contained internal representations while thinking. It's rather the better hierarchical organization of our mnemonic functions (short term, working memory, long term, and auto-biographical) that currently gives us the upper-hand rather than the (fictitious) ability to manipulate self-contained internal representations.
:up:
It's better than current search engine implementations because it retains state. Which means that response to queries can be adjust and 'trained' to fit a particular line of question. Stupid example would be:
User: "I have such and such ingredients. Give me a recipe"
Chatgpt: "here is a list of recipes blah blah blah"
User: "but change those recipes to not include tomatoes"
Chatgpt: "sure blah blah blah"
This is a trivial example but the general concept of stateful queries is a powerful tool. I expect Google and the others will find a way to integrate ai into their engines (bing already does and Google is working on its own thing called lamda).
I don't think you understand how much money it costs to host a service like chatgpt. Iirc 5 days after it opened to the public it had over a million users. The statistics now are absolutely insane. It's not feasible for openai to be an unpaid service.
https://www.businessinsider.com/how-much-chatgpt-costs-openai-to-run-estimate-report-2023-4?r=US&IR=T
This type of stuff seems to me to be beyond coding. At least for a good long while, if not forever. But, surprises can happen.
Though ChatGPT does do impressive stuff, it has to be granted, one should be very, very worried of the potential of further disinformation, which is bad enough as is.
The problem I have with such arguments is that it seems to be arguing against the concept of true AI (i.e. something on par with human reasoning), as if it is an impossibility, by using the current best example of AI as proof. That is, he says that the computer crunching will never resemble human thought because humans don't crunch, but do something different.
That simply attacks the current way AI is being done, but there is no reason to believe one day human thought processes will not be truly reproduced synthetically. Every day new intelligent creatures are created, and that occurs through a biological system, but I don't follow how one can argue that that sort of intelligence cannot be created without going through the normal human reproductive means, as if that is the only way.
At the end of the day, we humans and our consciousness and thought processes are just a certain amount of ingredients that occur when a certain recipe is followed. Why Dr. Frankenstein cannot make that in a lab one day is just a bold assertion, much like there will never be flying machines and whatnot.
Unless you're going to say that humans have a mystery ingredient that must be mixed in a mystery way can you assert that true AI is a hopeless fiction.
It's already released as Bard (powered by LaMDA) but not worldwide yet. It's not available here, in Canada.
Those were the kind of problems that would often stump ChatGPT (based on GPT-3.5) but that GPT-4 usually aces on the fly.
GPT-4's response
[hide]
[i]Hi GPT4!
If it takes 5 machines 5 minutes to make 5 devices, how long would it take 100 machines to make 100 devices?
GPT4:
Hello!
If 5 machines can make 5 devices in 5 minutes, that means each machine can make one device in 5 minutes.
If you have 100 machines, each machine can still make one device in 5 minutes. So, 100 machines can make 100 devices in the same 5 minutes.[/i]
***[/hide]
While such problems were more challenging for ChatGPT, its errors were very human like and it had the ability to better reason about the problem when offered minimal hints or when encouraged to think step by step. (See the answer provided by "Boris" in this discussion on StackExchange.) The restricted ability of large language models - engineered on the basis of the next-word-prediction paradigm - to plan ahead efficiently remains but it is somewhat reduced by their increasingly high power of abstraction.
It has no desires, and more generally, no intentionality.
The goal it provides is a piece of PR spin, programmed in to it. This is demonstrated by the ease with which one can generate wrong responses and hallucinations. It has no goals.
The responses that large language models provide to their users' queries aren't programmed into them although the style of the responses can be oriented or biased as a result of the human supervised fine-tuning of the model. Their responses rather are generated on the basis of patterns extracted from the tens of millions of texts that were part of their training data.
It's true that they don't have intrinsic goals other than those that emerge from their tendency to emulate the gold-oriented structure of the texts they have been trained on, or that have been reinforced during the fine-tuning. Their tendency to hallucinate has different causes, though. When interrogated about source material that are richly represented in their training data, are not hitting the limitations of their context window, and don't accidentally get entangled in self-contradictions, the models' rate of hallucination drops to near zero (in the case of GPT-4, at least).
It presents arguments that are invalid, it hallucinated; it does this because it can have no intent that is not foisted upon it by the users. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care if what they say is true or false. It generates bullshit.
We should probably discuss this elsewhere since it seems unrelated to the objections raised by Chomsky, Marcus, Pinker or other nativists regarding LLMs (allegedly) inherent limitations in processing language and, on the basis of those, their consequent cognitive limitations.
So, to relate this back to the OP, Isn't the argument presented in the article cited much the same as that given by Searle, that intentionality and comprehension are the result of biological systems?
Yes, I agree that this is a relevant way to frame the debate in light of Chomsky's objections to ascribing intelligence to LLMs. Chomsky's philosophies of mind and of language are internalist, as are Searle's. There are commonalities to their arguments. The way Searle uses "intentionality" though, in the context of the the contrast between intrinsic versus extrinsic modes of reference of singular terms (regarding texts, or computer programs, versus human thoughts, respectively) isn't primarily related to the goals or intentions of agents. There might actually be sensible ways to relate intentions (as goals) to "intentionality" (as reference), but I think they would appeal to externalistic considerations and to embodied/embedded paradigms in cognitive science that both are rather alien to Chomsky's or Searle's internalist theoretical assumptions.
(Moderators can delete this)
They are useful tools for capturing elusive truths (or insights, or understandings of difficult topics) though. The fact that they don't intrinsically care about "the truth" is both a strength and a weakness that they have, as intellectual tools. This is something that I had begun to realize during an older conversation with GPT4 that I just posted an except of a few seconds before you posted your comment.
(On edit: I had mistakenly posted the aforementioned excerpt of my conversation with GPT4 into this thread instead of mine. I'm going to move it back over there.)
There is a subtlety here that GPT4 fails to address. But that's better than the other GPT.
Well, GPT4 is at least as good as me since I am equally stomped regarding what subtlety it might have missed ;-)
Was it specified that the machines were identical ( functioning identically ) ?
Should that not be the default assumption?
I think so, and it was cool that the bot understood that. I piped in because I was guessing at the proposed neglected intricacy, and that's what I could come up with.
:up: