I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.
The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing an UI events API and you were talking about zeros and ones, or voltages in transistors. Technically fine but totally useless to reach any conclusion about the high-level system.
We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs - people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort (actively encouraged by the companies selling them) and it is completely distorting discussions on their use and perceptions of their utility.
I kinda agree with both of you. It might be a required abstraction, but it's a leaky one.
Long before LLMs, I would talk about classes / functions / modules like "it then does this, decides the epsilon is too low, chops it up and adds it to the list".
The difference I guess it was only to a technical crowd and nobody would mistake this for anything it wasn't. Everybody know that "it" didn't "decide" anything.
With AI being so mainstream and the math being much more elusive than a simple if..then I guess it's just too easy to take this simple speaking convention at face value.
Agreeing with you, this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing. Calling it "thinking" is stretching the word to breaking point, but "selecting the next word based on a complex statistical model" doesn't begin to capture what they're capable of.
What does a submarine do? Submarine? I suppose you "drive" a submarine which is getting to the idea: submarines don't swim because ultimately they are "driven"? I guess the issue is we don't make up a new word for what submarines do, we just don't use human words.
I think the above poster gets a little distracted by suggesting the models are creative which itself is disputed. Perhaps a better term, like above, would be to just use "model". They are models after all. We don't make up a new portmanteau for submarines. They float, or drive, or submarine around.
So maybe an LLM doesn't "write" a poem, but instead "models a poem" which maybe indeed take away a little of the sketchy magic and fake humanness they tend to be imbued with.
Depends on if you are talking about an llm or to the llm. Talking to the llm, it would not understand that "model a poem" means to write a poem. Well, it will probably guess right in this case, but if you go out of band too much it won't understand you. The hard problem today is rewriting out of band tasks to be in band, and that requires anthropomorphizing.
A submarine is propelled by a propellor and helmed by a controller (usually a human).
It would be swimming if it was propelled by drag (well, technically a propellor also uses drag via thrust, but you get the point). Imagine a submarine with a fish tail.
Likewise we can probably find an apt description in our current vocabulary to fittingly describe what LLMs do.
And we are there. A boat sails, and a submarine sails. A model generates makes perfect sense to me. And saying chatgpt generated a poem feels correct personally. Indeed a model (e.g. a linear regression) generates predictions for the most part.
No one was as bothered when we anthropomorphized crud apps simply for the purpose of conversing about "them". "Ack! The thing is corrupting tables again because it thinks we are still using api v3! Who approved that last MR?!" The fact that people are bothered by the same language now is indicative in itself. If you want to maintain distance, pre prompt models to structure all conversations to lack pronouns as between a non sentient language model and a non sentient agi. You can have the model call you out for referring to the model as existing. The language style that forces is interesting, and potentially more productive except that there are fewer conversations formed like that in the training dataset. Translation being a core function of language models makes it less important thought. As for confusing the map for the territory, that is precisely what philosophers like Metzinger say humans are doing by considering "self" to be a real thing and that they are conscious when they are just using the reasoning shortcut of narrating the meta model to be the model.
In addition to he/she etc. there is a need for a button for no pronouns. "Stop confusing metacognition for conscious experience or qualia!" doesn't fit well. The UX for these models is extremely malleable. The responses are misleading mostly to the extent the prompts were already misled. The sorts of responses that arise from ignorant prompts are those found within the training data in the context of ignorant questions. This tends to make them ignorant as well. There are absolutely stupid questions.
> this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing.
Why?
A plane is not a fly and does not stay aloft like a fly, yet we describe what it does as flying despite the fact that it does not flap its wings. What are the downsides we encounter that are caused by using the word “fly” to describe a plane travelling through the air?
This is a total non-problem that has been invented by people so they have something new and exciting to be pedantic about.
When we need to speak precisely about a model and how it works, we have a formal language (mathematics) which allows us to be absolutely specific. When we need to empirically observe how the model behaves, we have a completely precise method of doing this (running an eval).
Any other time, we use language in a purposefully intuitive and imprecise way, and that is a deliberate tradeoff which sacrifices precision for expressiveness.
A machine that can imitate the products of thought is not the same as thinking.
All imitations require analogous mechanisms, but that is the extent of their similarities, in syntax. Thinking requires networks of billions of neurons, and then, not only that, but words can never exist on a plane because they do not belong to a plane. Words can only be stored on a plane, they are not useful on a plane.
Because of this LLMs have the potential to discover new aspects and implications of language that will be rarely useful to us because language is not useful within a computer, it is useful in the world.
Its like seeing loosely related patterns in a picture and keep derivating on those patterns that are real, but loosely related.
LLMs are not intelligence but its fine that we use that word to describe them.
I didn't think of prediction in the statistical sense here, but rather as a prophecy based on a vision, something that is inherently stored in a model without the knowledge of the modelers. I don't want to imply any magic or something supernatural here, it's just the juice that goes off the rails sometimes, and it gets overlooked due to the sheer quantity of the weights. Something like unknown bugs in production, but, because they still just represent a valid number in some computation that wouldn't cause any panic, these few bits can show a useful pattern under the right circumstances.
Inference would be the part that is deliberately learned and drawn from conclusions based on the training set, like in the "classic" sense of statistical learning.
It will help significantly, to realize that the only thinking happening is when the human looks at the output and attempts to verify if it is congruent with reality.
I mean you can boil anything down to it's building blocks and make it seem like it didn't 'decide' anything. When you as a human decide something, your brain and it's neurons just made some connections with an output signal sent to other parts that resulting in your body 'doing' something.
I don't think LLMs are sentient or any bullshit like that, but I do think people are too quick to write them off before really thinking about how a nn 'knows things' similar to how a human 'knows' things, it is trained and reacts to inputs and outputs. The body is just far more complex.
To me all of those are so vaguely defined that arguing whether an LLM is "really really" doing something is kind of a waste of time.
It's like we're clinging on to things that make us feel like human cognition is special so we're saying LLM's arent "really" doing it, then not defining what it actually is.
We can argue all day what "think" means and whether a LLM thinks (probably not IMO), but at least in my head the threshold for "decide" is much lower so I can perfectly accept that a LLM (or even a class) "decides". I don't have a conflict about that. Yeah, it might not be a decision in the human sense, but it's a decision in the mathematical sense so I have always meant "decide" literally when I was talking about a piece of code.
It's much more interesting when we are talking about... say... an ant... Does it "decide"? That I have no idea as it's probably somewhere in between, neither a sentient decision, nor a mathematical one.
Well, it outputs a chain of thoughts that later used to produce better prediction. It produces a chain of thoughts similar to how one would do thinking about a problem out loud. It's more verbose that what you would do, but you always have some ambient context that LLM lacks.
When I see these debates it's always the other way around - one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks.." or "the model knows.." or whatever.
To be honest the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI, and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.
As I write this, Claude Code is currently opening and closing various media files on my computer. Sometimes it plays the file for a few seconds before closing it, sometimes it starts playback and then seeks to a different position, sometimes it fast forwards or rewinds, etc.
I asked Claude to write a E-AC3 audio component so I can play videos with E-AC3 audio in the old version of QuickTime I really like using. Claude's decoder includes the ability to write debug output to a log file, so Claude is studying how QuickTime and the component interact, and it's controlling QuickTime via Applescript.
Sometimes QuickTime crashes, because this ancient API has its roots in the classic Mac OS days and is not exactly good. Claude reads the crash logs on its own—it knows where they are—and continues on its way. I'm just sitting back and trying to do other things while Claude works, although it's a little distracting that something else is using my computer at the same time.
I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...
Would it help you to know that trial and error is a common tactic by machines? Yes, humans do it too, but that doesn't mean the process isn't mechanical. In fact, in computing we might call this a "brute force" approach. You don't have to cover the entire search space to brute force something, and it certainly doesn't mean you can't have optimization strategies and need to grid search (e.g. you can use Bayesian methods, multi-armed bandit approaches, or a whole world of things).
I would call "fuck around and find out" a rather simple approach. It is why we use it! It is why lots of animals use it. Even very dumb animals use it. Though, we do notice more intelligent animals use more efficient optimization methods. All of this is technically hypothesis testing. Even a naive grid search. But that is still in the class of "fuck around and find out" or "brute force", right?
I should also mention two important things.
1) as a human we are biased to anthropomorphize. We see faces in clouds. We tell stories of mighty beings controlling the world in an effort to explain why things happen. This is anthropomorphization of the universe itself!
2) We design LLMs (and many other large ML systems) to optimize towards human preference. This reinforces an anthropomorphized interpretation.
The reason for doing this (2) is based on a naive assumption[0]: If it looks like a duck, swims like a duck, and quacks like a duck, then it *probably* is a duck. But the duck test doesn't rule out a highly sophisticated animatronic. It's a good rule of thumb, but wouldn't it also be incredibly naive to assume that it *is* a duck? Isn't the duck test itself entirely dependent on our own personal familiarity with ducks? I think this is important to remember and can help combat our own propensity for creating biases.
[0] It is not a bad strategy to build in that direction. When faced with many possible ways to go, this is a very reasonable approach. The naive part is if you assume that it will take you all the way to making a duck. It is also a perilous approach because you are explicitly making it harder for you to evaluate. It is, in the fullest sense of the phrase, "metric hacking."
It wasn't a simple brute force. When Claude was working this morning, it was pretty clearly only playing a file when it actually needed to see packets get decoded, otherwise it would simply open and close the document. Similarly, it would only seek or fast forward when it was debugging specific issues related to those actions. And it even "knew" which test files to open for specific channel layouts.
Yes this is still mechanical in a sense, but then I'm not sure what behavior you wouldn't classify as mechanical. It's "responding" to stimuli in logical ways.
But I also don't quite know where I'm going with this. I don't think LLMs are sentient or something, I know they're just math. But it's spooky.
"Simple" is the key word here, right? You agree that it is still under the broad class of "brute force"?
I'm not saying Claude is naively brute forcing. In fact, with lack of interpretibility of these machines it is difficult to say what kind of optimization it is doing and how complex that it (this was a key part tbh).
My point was to help with this
> I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...
Which requires you to understand how some actions can be mechanical. You admitted to cognitive dissonance (something we all do and I fully agree is hard not to do) and wanting to fight it. We're just trying to find some helpful avenues to do so.
> It's "responding" to stimuli in logical ways.
And so too can a simple program, right? A program can respond to user input and there is certainly a logic path it will follow. Our non-ML program is likely going to have a deterministic path (there is still probabilistic programming...), but that doesn't mean it isn't logic, right?
But the real question here, which you have to ask yourself (constantly) is "how do I differentiate a complex program that I don't understand from a conscious entity?" I guarantee you that you don't have the answer (because no one does). But isn't that a really good reason to be careful about anthropomorphizing it?
That's the duck test.
How do you determine if it is a real duck or a highly sophisticated animatronic?
If you anthropomorphize, you rule out the possibility that it is a highly sophisticated animatronic and you *MUST* make the assumption that you are not only an expert, but a perfect, duck detector. But simultaneously we cannot rule out that it is a duck, right? Because, we aren't a perfect duck detector *AND* we aren't an expert in highly sophisticated animatronics (especially of the duck kind).
Remember, there are not two answers to every True-False question, there are three. Every True-False question either has an answer of "True", "False", or "Indeterminate". So don't naively assume it is binary. We all know the Halting Problem, right? (also see my namesake or quantum physics if you want to see such things pop up outside computing)
Though I agree, it can be very spooky. But that only increases the importance of trying to develop mental models that help us more objectively evaluate things. And that requires "indeterminate" be a possibility. This is probably the best place to start to combat the cognitive dissonance.
I have no idea why some people take so much offense to rhe fact humans are just another machine, there's no reason why another machine can't surpass it here as in all other aveneus machines have already. Many of the reasons people give for llms not being conscious are just as applicable to humans too.
I don't think the question is if humans are a machine or not but rather what is meant by machine. Most people interpret it as meaning deterministic and thus having no free will. That's probably not what you're trying to convey so might not be the best word to use.
But the question is what is special about the human machine? What is special about the animal machine? These are different from all the machines we have built. Is it complexity? Is it indeterministic? Is it more? Certainly these machines have feelings, and we need to account for them when interacting with them.
Though we're getting well off topic from determining if a duck is a duck or is a machine (you know what I mean by this word and that I don't mean a normal duck)
Respectfully, that is a reflection of the places you hang out in (like HN) and not the reality of the population.
Outside the technical world it gets much worse. There are people who killed themselves because of LLMs, people who are in love with them, people who genuinely believe they have “awakened” their own private ChatGPT instance into AGI and are eschewing the real humans in their lives.
The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.
It's going to take a lot to get him out of that mindset and frankly I'm dreading trying to compare and contrast imperfect human behaviour and friendships with a sycophantic AI.
> The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.
The therapist thing might be correct, though. You can send a well-adjusted person to three renowned therapists and get three different reasons for why they need to continue sessions.
No therapist ever says "Congratulations, you're perfectly normal. Now go away and come back when you have a real problem." Statistically it is vanishingly unlikely that every person who ever visited a therapist is in need of a second (more more) visit.
The main problem with therapy is a lack of objectivity[1]. When people talk about what their sessions resulted in, it's always "My problem is that I'm too perfect". I've known actual bullies whose therapist apparently told them that they are too submissive and need to be more assertive.
The secondary problem is that all diagnosis is based on self-reported metrics of the subject. All improvement is equally based on self-reported metrics. This is no different from prayer.
You don't have a medical practice there; you've got an Imam and a sophisticated but still medically-insured way to plead with thunderstorms[2]. I fail to see how an LLM (or even the Rogerian a-x doctor in Emacs) will do worse on average.
After all, if you're at a therapist and you're doing most of the talking, how would an LLM perform worse than the therapist?
----------------
[1] If I'm at a therapist, and they're asking me to do most of the talking, I would damn well feel that I am not getting my moneys worth. I'd be there primarily to learn (and practice a little) whatever tools they can teach me to handle my $PROBLEM. I don't want someone to vent at, I want to learn coping mechanisms and mitigation strategies.
Yup, this problem is why I think all therapists should ideally know behavioral genetics and evolutionary psychology (there is at least a plausibly objective measure there which is dissonance between the ancestral environment in which the brain developed and the modern day environment. And at least some amount of psychological problems can be explained by it).
I am a fan of the « Beat Your Genes » podcast, and while some of the prescriptions can be a bit heavy handed, most feel intuitively right. It’s approaching human problems as intelligent mammal problems, as opposed to something in a category of its own.
It's surprisingly common on reddit that people talk about "my chatgpt", and they don't always seem like the type who are "in a relationship" with the bot or unlocking the secrets of the cosmos with it, but still they write "my chatgpt" and "your chatgpt". I guess the custom prompt and the available context does customize the model for them in some sense, but I suspect they likely have a wrong mental model of how this customization works. I guess they imagine it as their own little model being stored on file at OpenAI and as they interact with it, it's being shaped by it, and each time they connect, their model is retrieved from the cloud storage and they connect to it or something.
Most certainly the conversation is extremely political. There are not simply different points of view. There are competitive, gladiatorial opinions ready to ambush anyone not wearing the right colors. It's a situation where the technical conversation is drowning.
I suppose this war will be fought until people are out of energy, and if reason has no place, it is reasonable to let others tire themselves out reiterating statements that are not designed to bring anyone closer to the truth.
If this tech is going to be half as impactful as its proponents predict, then I'd say it's still under-politicized. Of course the politics around it doesn't have to be knee-jerk mudslinging, but it's no surprise that politics enters the picture when the tech can significantly transform society.
Go politicize it on Reddit, preferably on a political sub and not a tech sub. On this forum, I would like to expect a lot more intelligent conversation.
Wait until a conversation about “serverless” comes up and someone says there is no such thing because there are servers somewhere as if everyone - especially on HN -doesn’t already know that.
Why would everyone know that? Not everyone has experience in sysops, especially not beginners.
E.g. when I first started learning webdev, I didn’t think about ‘servers’. I just knew that if I uploaded my HTML/PHP files to my shared web host, then they appeared online.
It was only much later that I realized that shared webhosting is ‘just’ an abstraction over Linux/Apache (after all, I first had to learn about those topics).
I am saying that most people who come on HN and say “there is no such thing as serverless and there are servers somewhere” think they are sounding smart when they are adding nothing to the conversation.
I’m sure you knew that your code was running on computers somewhere even when you first started and wasn’t running in a literal “cloud”.
It’s about as tiring as people on HN who know just a little about LLMs thinking they are sounding smart when they say they are just advanced autocomplete. Both responses are just as unproductive
> I’m sure you knew that your code was running on computers somewhere even when you first started and wasn’t running in a literal “cloud”.
Meh, I just knew that the browser would display HTML if I wrote it, and that uploading the HTML files made them available on my domain. I didn’t really think about where the files went, specifically.
Try asking an average high school kid how cloud storage works. I doubt you’ll get any further than ‘I make files on my Google Docs and then they are saved there’. This is one step short of ‘well, the files must be on some system in some data center’.
I really disagree that “people who come on HN and say “there is no such thing as serverless and there are servers somewhere” think they are sounding smart when they are adding nothing to the conversation.” On the contrary, it’s an invitation to beginning coders to think about what the ‘serverless’ abstraction actually means.
I think they fumbled with wording but I interpreted them as meaning "audience of HN" and it seems they confirmed.
We always are speaking to our audience, right? This is also what makes more general/open discussions difficult (e.g. talking on Twitter/Facebook/etc). That there are many ways to interpret anything depending on prior knowledge, cultural biases, etc. But I think it is fair that on HN we can make an assumption that people here are tech savvy and knowledgeable. We'll definitely overstep and understep at times, but shouldn't we also cultivate a culture where it is okay to ask and okay to apologize for making too much of an assumption?
I mean at the end of the day we got to make some assumptions, right? If we assume zero operating knowledge then comments are going to get pretty massive and frankly, not be good at communicating with a niche even if better at communicating with a general audience. But should HN be a place for general people? I think no. I think it should be a place for people interested in computers and programming.
It's not just distorting discussions it's leading people to put a lot of faith in what LLMs are telling them. Was just on a zoom an hour ago where a guy working on a startup asked ChatGPT about his idea and then emailed us the result for discussion in the meeting. ChatGPT basically just told him what he wanted to hear - essentially that his idea was great and it would be successful ("if you implement it correctly" was doing a lot of work). It was a glowing endorsement of the idea that made the guy think that he must have a million dollar idea. I had to be "that guy" who said that maybe ChatGPT was telling him what he wanted to hear based on the way the question was formulated - tried to be very diplomatic about it and maybe I was a bit too diplomatic because it didn't shake his faith in what ChatGPT had told him.
LLMs directly exploit a human trust vuln. Our brains tend to engage with them relationally and create an unconscious functional belief that an agent on the other end is responding with their real thoughts, even when we know better.
AI apps ought to at minimum warn us that their responses are not anyone's (or anything's) real thoughts. But the illusion is so powerful that many people would ignore the warning.
Well "reasoning" refers to Chain-of-Thought and if you look at the generated prompts it's not hard to see why it's called that.
That said, it's fascinating to me that it works (and empirically, it does work; a reasoning model generating tens of thousands of tokens while working out the problem does produce better results). I wish I knew why. A priori I wouldn't have expected it, since there's no new input. That means it's all "in there" in the weights already. I don't see why it couldn't just one shot it without all the reasoning. And maybe the future will bring us more distilled models that can do that, or they can tease out all that reasoning with more generated training data, to move it from dispersed around the weights -> prompt -> more immediately accessible in the weights. But for now "reasoning" works.
But then, at the back of my mind is the easy answer: maybe you can't optimize it. Maybe the model has to "reason" to "organize its thoughts" and get the best results. After all, if you give me a complicated problem I'll write down hypotheses and outline approaches and double check results for consistency and all that. But now we're getting dangerously close to the "anthropomorphization" that this article is lamenting.
Using more tokens = more compute to use for a given problem. I think most of the benefit of CoT has more to do with autoregressive models being unable to “think ahead” and revise their output, and less to do with actual reasoning. The fact that an LLM can have incorrect reasoning in its CoT and still produce the right answer, or that it can “lie” in its CoT to avoid being detected as cheating on RL tasks, makes me believe that the semantic content of CoT is an illusion, and that the improved performance is from being able to explore and revise in some internal space using more compute before producing a final output.
> I don't see why it couldn't just one shot it without all the reasoning.
That's reminding me of deep neural networks where single layer networks could achieve the same results, but the layer would have to be excessively large. Maybe we're re-using the same kind of improvement, scaling in length instead of width because of our computation limitations ?
CoT gives the model more time to think and process the inputs it has. To give an extreme example, suppose you are using next token prediction to answer 'Is P==NP?' The tiny number of input tokens means that there's a tiny amount of compute to dedicate to producing an answer. A scratchpad allows us to break free of the short-inputs problem.
Meanwhile, things can happen in the latent representation which aren't reflected in the intermediate outputs. You could, instead of using CoT, say "Write a recipe for a vegetarian chile, along with a lengthy biographical story relating to the recipe. Afterwards, I will ask you again about my original question." And the latents can still help model the primary problem, yielding a better answer than you would have gotten with the short input alone.
Along these lines, I believe there are chain of thought studies which find that the content of the intermediate outputs don't actually matter all that much...
I like this mental-model, which rests heavily on the "be careful not to anthropomorphize" approach:
It was already common to use a document extender (LLM) against a hidden document, which resembles a movie or theater play where a character named User is interrogating a character named Bot.
Chain-of-thought switches the movie/script style to film noir, where the [Detective] Bot character has additional content which is not actually "spoken" at the User character. The extra words in the script add a certain kind of metaphorical inertia.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
Do you believe thinking/reasoning is a binary concept? If not, do you think the current top LLM are before or after the 50% mark? What % do you think they're at? What % range do you think humans exhibit?
"All models are wrong, but some models are useful," is the principle I have been using to decide when to go with an anthropomorphic explanation.
In other words, no, they never accurately describe what the LLM is actually doing. But sometimes drawing an analogy to human behavior is the most effective way to pump others' intuition about a particular LLM behavior. The trick is making sure that your audience understands that this is just an analogy, and that it has its limitations.
And it's not completely wrong. Mimicking human behavior is exactly what they're designed to do. You just need to keep reminding people that it's only doing so in a very superficial and spotty way. There's absolutely no basis for assuming that what's happening on the inside is the same.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
With such strong wording, it should be rather easy to explain how our thinking differs from what LLMs do. The next step - showing that what LLMs do precludes any kind of sentience is probably much harder.
I thought this too but then began to think about it from the perspective of the programmers trying to make it imitate human learning. That's what a nn is trying to do at the end of the day, and in the same way I train myself by reading problems and solutions, or learning vocab at a young age, it does so by tuning billions of parameters.
I think these models do learn similarly. What does it even mean to reason? Your brain knows certain things so it comes to certain conclusions, but it only knows those things because it was ''trained'' on those things.
I reason my car will crash if I go 120 mph on the other side of the road because previously I have 'seen' where the input is a car going 120mph has a high probability of producing a crash, and similarly have seen input where the car is going on the other side of the road, producing a crash. Combining the two would tell me it's a high probability.
I think it's worth distinguishing between the use of anthropomorphism as a useful abstraction and the misuse by companies to fuel AI hype.
For example, I think "chain of thought" is a good name for what it denotes. It makes the concept easy to understand and discuss, and a non-antropomorphized name would be unnatural and unnecessarily complicate things. This doesn't mean that I support companies insisting that LLMs think just like humans or anything like that.
By the way, I would say actually anti-anthropomorphism has been a bigger problem for understanding LLMs than anthropomorphism itself. The main proponents of anti-anthropomorphism (e.g. Bender and the rest of "stochastic parrot" and related paper authors) came up with a lot of predictions about things that LLMs surely couldn't do (on account of just being predictors of the next word, etc.) which turned out to be spectacularly wrong.
I don't know about others, but I much prefer if some reductionist tries to conclude what's technically feasible and is proven wrong over time, than somebody yelling holistic analogies á la "it's sentient, it's intelligent, it thinks like us humans" for the sole dogmatic reason of being a futurist.
Tbh I also think your comparison that puts "UI events -> Bits -> Transistor Voltages" as analogy to "AI thinks -> token de-/encoding + MatMul" is certainly a stretch, as the part about "Bits -> Transistor Voltages" applies to both hierarchies as the foundational layer.
"chain of thought" could probably be called "progressive on-track-inference" and nobody would roll an eye.
>> it pointless to *not* anthropomorphize, at least to an extent.
I agree that it is pointless to not anthropomorphize because we are humans and we will automatically do this. Willingly or unwillingly.
On the other hand, it generates bias. This bias can lead to errors.
So the real answer is (imo) that it is fine to anthropomorphise but recognize that while doing so can provide utility and help us understand, it is WRONG. Recognizing that it is not right and cannot be right provides us with a constant reminder to reevaluate. Use it, but double check, and keep checking making sure you understand the limitations of the analogy. Understanding when and where it applies, where it doesn't, and most importantly, where you don't know if it does or does not. The last is most important because it helps us form hypotheses that are likely to be testable (likely, not always. Also, much easier said than done).
So I pick a "grey area". Anthropomorphization is a tool that can be helpful. But like any tool, it isn't universal. There is no "one-size-fits-all" tool. Literally, one of the most important things for any scientist is to become an expert at the tools you use. It's one of the most critical skills of *any expert*. So while I agree with you that we should be careful of anthropomorphization, I disagree that it is useless and can never provide information. But I do agree that quite frequently, the wrong tool is used for the right job. Sometimes, hacking it just isn't good enough.
> On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs
I hold a deep belief that anthropomorphism is a way the human mind words. If we take for granted the hypothesis of Franz de Waal, that human mind developed its capabilities due to political games, and then think about how it could later lead to solving engineering and technological problems, then the tendency of people to anthropomorphize becomes obvious. Political games need empathy or maybe some other kind of -pathy, that allows politicians to guess motives of others looking at their behaviors. Political games directed the evolution to develop mental instruments to uncover causality by watching at others and interacting with them. Now, to apply these instruments to inanimate world all you need is to anthropomorphize inanimate objects.
Of course, it leads sometimes to the invention of gods, or spirits, or other imaginary intelligences behinds things. And sometimes these entities get in the way of revealing the real causes of events. But I believe that to anthropomorphize LLMs (at the current stage of their development) is not just the natural thing for people but a good thing as well. Some behavior of LLMs is easily described in terms of psychology; some cannot be described or at least not so easy. People are seeking ways to do it. Projecting this process into the future, I can imagine how there will be a kind of consensual LLMs "theory" that explains some traits of LLMs in terms of human psychology and fails to explain other traits, so they are explained in some other terms... And then a revolution happens, when a few bright minds come and say that "anthropomorphism is bad, it cannot explain LLM" and they propose something different.
I'm sure it will happen at some point in the future, but not right now. And it will happen not like that: not just because someone said that anthropomorphism is bad, but because they proposed another way to talk about reasons behind LLMs behavior. It is like with scientific theories: they do not fail because they become obviously wrong, but because other, better theories replace them.
It doesn't mean, that there is no point to fight anthropomorphism right now, but this fight should be directed at searching for new ways to talk about LLMs, not to show at the deficiencies of anthropomorphism. To my mind it makes sense to start not with deficiencies of anthropomorphism but with its successes. What traits of LLMs it allows us to capture, which ideas about LLMs are impossible to wrap into words without thinking of LLMs as of people?
Anthropomorphising implicitly assumes motivation, goals and values. That's what the core of anthropomorphism is - attempting to explain behavior of a complex system in teleological terms. And prompt escapes make it clear LLMs doesn't have any teleological agency yet. Whenever their course of action is, it is to easy to steer them of. Try to do it with a sufficiently motivated human.
>. Try to do it with a sufficiently motivated human.
That's what they call marketing, propaganda or brain washing, acculturation , education depending on who you ask and at which scale you operate, apparently.
Prompt escapes will be much harder, and some of them will end up in an equivalent of "sure here is… no, wait… You know what, I'm not doing that", i. e. slipping and then getting back on track.
Well, that's a strong claim of equivalence between computationable models and realty.
The consensual view is rather that no map is matching fully the territory, or said otherwise the territory includes ontological components that exceeds even the most sophisticated map that can be ever built.
Agreed. I'm also in favor of anthropomorphizing, because not doing so confuses people about the nature and capabilities of these models even more.
Whether it's hallucinations, prompt injections, various other security vulnerabilities/scenarios, or problems with doing math, backtracking, getting confused - there's a steady supply of "problems" that some people are surprised to discover and even more surprised this isn't being definitively fixed. Thing is, none of that is surprising, and these things are not bugs, they're flip side of the features - but to see that, one has to realize that humans demonstrate those exact same failure modes.
Especially when it comes to designing larger systems incorporating LLM "agents", it really helps to think of them as humans - because the problems those systems face are exactly the same as you get with systems incorporating people, and mostly for the same underlying reasons. Anthropomorphizing LLMs cuts through a lot of misconceptions and false paths, and helps one realize that we have millennia of experience with people-centric computing systems (aka. bureaucracy) that's directly transferrable.
I disagree. Anthropomorphization can be a very useful tool but I think it is currently over used and is a very tricky tool to use when communicating with a more general audience.
I think looking at physics might be a good example. We love our simplified examples and there's a big culture of trying to explain things to the lay person (mostly because the topics are incredibly complex). But how many people have misunderstood an observer of a quantum event with "a human" and do not consider "a photon" as an observer? How many people think in Schrodinger's Cat that the cat is both alive and dead?[0] Or believe in a multiverse. There's plenty of examples we can point to.
While these analogies *can* be extremely helpful, they *can* also be extremely harmful. This is especially true as information is usually passed through a game of telephone[1]. There is information loss and with it, interpretation becomes more difficult. Often a very subtle part can make a critical distinction.
I'm not against anthropomorphization[2], but I do think we should be cautious about how we use it. The imprecise nature of it is the exact reason we should be mindful of when and how to use it. We know that the anthropomorphized analogy is wrong. So we have to think about "how wrong" it is for a given setting. We should also be careful to think about how it may be misinterpreted. That's all I'm trying to say. And isn't this what we should be doing if we want to communicate effectively?
[0] It is not. It is either. The point of this thought experiment is that we cannot know the answer without looking inside. There is information loss and the event is not deterministic. It directly relates to the Heisenberg Uncertainty Principle, Godel's Incompleteness, or the Halting Problem. All these things are (loosely) related around the inability to have absolute determinism.
I remember Dawkins talking about the "intentional stance" when discussing genes in The Selfish Gene.
It's flat wrong to describe genes as having any agency. However it's a useful and easily understood shorthand to describe them in that way rather than every time use the full formulation of "organisms who tend to possess these genes tend towards these behaviours."
Sometimes to help our brains reach a higher level of abstraction, once we understand the low level of abstraction we should stop talking and thinking at that level.
The intentional stance was Daniel Dennett's creation and a major part of his life's work. There are actually (exactly) three stances in his model: the physical stance, the design stance, and the intentional stance.
Thanks for the correction. I guess both thinkers took a somewhat similar position and I somehow remembered Dawkins's argument but Dennett's term. The term is memorable.
Do you want to describe WHY you think the design stance is appropriate here but the intentional stance is not?
Exactly. We use anthropomorphic language absolutely all the time when describing different processes for this exact reason - it is a helpful abstraction that allows us to easily describe what’s going on at a high level.
“My headphones think they’re connected, but the computer can’t see them”.
“The printer thinks it’s out of paper, but it’s not”.
“The optimisation function is trying to go down nabla f”.
“The parking sensor on the car keeps going off because it’s afraid it’s too close to the wall”.
“The client is blocked, because it still needs to get a final message from the server”.
…and one final one which I promise you is real because I overheard it “I’m trying to airdrop a photo, but our phones won’t have sex”.
I get the impression after using language models for quite a while that perhaps the one thing that is riskiest to anthropomorphise is the conversational UI that has become the default for many people.
A lot of the issues I'd have when 'pretending' to have a conversation are much less so when I either keep things to a single Q/A pairing, or at the very least heavily edit/prune the conversation history. Based on my understanding of LLM's, this seems to make sense even for the models that are trained for conversational interfaces.
so, for example, an exchange with multiple messages, where at the end I ask the LLM to double-check the conversation and correct 'hallucinations', is less optimal than something like asking for a thorough summary at the end, and then feeding that into a new prompt/conversation, as the repetition of these falsities, or 'building' on them with subsequent messages, is more likely to make them a stronger 'presence' and as a result perhaps affect the corrections.
I haven't tested any of this thoroughly, but at least with code I've definitely noticed how a wrong piece of code can 'infect' the conversation.
If I use human-related terminology as a shortcut, as some kind of macro to talk at a higher level/more efficiently about something I want to do that might be okay.
What is not okay is talking in a way that implies intent, for example.
Compare:
"The AI doesn't want to do that."
versus
"The model doesn't do that with this prompt and all others we tried."
The latter way of talking is still high-level enough but avoids equating/confusing the name of a field with a sentient being.
Whenever I hear people saying "an AI" I suggest they replace AI with "statistics" to make it obvious how problematic anthropomorphisms may have become:
The only reason that sounds weird to you is because you have the experience of being human. Human behavior is not magic. It's still just statistics. You go to the bathroom when you have to pee not because some magical concept of consciousness, but because a reciptor in your brain goes off and starts the chain of making you go to the bathroom. AI's are not magic, but nobody has sufficiently provided any proof we are somehow special either.
This is why I actually really love the description of it as a "Shoggoth" - it's more abstract, slightly floaty but it achieves the purpose of not treating and anthropomising it as a human being while not treating LLMs as a collection of predictive words.
Anthropomorphizing might blind us to solutions to existing problems. Perhaps instead of trying to come up with the correct prompt for a LLM, there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer given questions.
When we anthropomorphize we are inherently ignore certain parts of how LLMs work, and imagining parts that don't even exist
> there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer
exactly. The opposite is also true. You might supply more clarifying information to the LLM, which would help any human answer, but it actually degrades the LLM's output.
My brain refuses to join the rah-rah bandwagon because I cannot see them in my mind’s eye. Sometimes I get jealous of people like GP and OP who clearly seem to have the sight. (Being a serial math exam flunker might have something to do with it. :))))
Anyway, one does what one can.
(I've been trying to picture abstract visual and semi-philosophical approximations which I’ll avoid linking here because they seem to fetch bad karma in super-duper LLM enthusiast communities. But you can read them on my blog and email me scathing critiques, if you wish :sweat-smile:.)
> We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels
We do know what happens at higher abstraction levels; the design of efficient networks, and the steady beat of SOTA improvements all depend on understanding how LLMs work internally: choice of network dimensions, feature extraction, attention, attention heads, caching, the peculiarities of high-dimensions and avoiding overfitting are all well-understood by practitioners. Anthropomorphization is only necessary in pop-science articles that use a limited vocabulary.
IMO, there is very little mystery, but lots of deliberate mysticism, especially about future LLMs - the usual hype-cycle extrapolation.
I'd take it in reverse order: the problem isn't that it's possible to have a computer that "stochastically produces the next word" and can fool humans, it's why / how / when humans evolved to have technological complexity when the majority (of people) aren't that different from a stochastic process.
You are conflating anthropomorphism with personification. They are not the same thing. No one believes their guitar or car or boat is alive and sentient when they give it a name or talk to or about it.
But the author used "anthropomorphism" the same way as I did. I guess we both mean "personification" then.
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
One talking about a program's "behaviors", "actions" or "goals" doesn't mean they believe the program is sentient. Only "ethical constraints" is suspiciously anthropomorphizing.
A bit of anecdote: last year I hung out with a bunch of old classmates that I hadn't seen for quite a while. None of them works in tech.
Surprisingly to me, all of them have ChatGPT installed on their phones.
And unsurprisingly to me, none of them treated it like an actual intelligence. That makes me wonder where those who think ChatGPT is sentient come from.
(It's a bit worrisome that several of them thought it worked "like Google search and Google translation combined", even by the time ChatGPT couldn't do web search...!)
I think it’s more than a few and it’s still rising, and therein lies the issue.
Which is why it is paramount to talk about this now, when we may still turn the tide. LLMs can be useful, but it’s important to have the right mental model, understanding, expectations, and attitude towards them.
This is a No True Scotsman fallacy. And it's radically factually wrong.
The rest of your comment is along the lines of the famous (but apocryphal) Pauline Kael line “I can’t believe Nixon won. I don’t know anyone who voted for him.”
I'm not convinced... we use these terms to assign roles, yes, but these roles describe a utility or assign a responsibility. That isn't anthropomorphizing anything, but it rather describes the usage of an inanimate object as tool for us humans and seems in line with history.
What's the utility or the responsibility of AI, what's its usage as tool? If you'd ask me it should be closer to serving insights than "reasoning thoughts".
The "point" of not anthropomorphizing is to refrain from judgement until a more solid abstraction appears. The problem with explaining LLMs in terms of human behaviour is that, while we don't clearly understand what the LLM is doing, we understand human cognition even less! There is literally no predictive power in the abstraction "The LLM is thinking like I am thinking". It gives you no mechanism to evaluate what tasks the LLM "should" be able to do.
Seriously, try it. Why don't LLMs get frustrated with you if you ask them the same question repeatedly? A human would. Why are LLMs so happy to give contradictory answers, as long as you are very careful not to highlight the contradictory facts? Why do earlier models behave worse on reasoning tasks than later ones? These are features nobody, anywhere understands. So why make the (imo phenomenally large) leap to "well, it's clearly just a brain"?
It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
> It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
We tried to mimic birds at first; it turns out birds were way too high-tech, and too optimized. We figured out how to fly when we ditched the biological distraction and focused on flight itself. But fast forward until today, we're reaching the level of technology that allows us to build machines that fly the same way birds do - and of such machines, it's fair to say, "it's a mechanical bird!".
Similarly, we cracked computing from grounds up. Babbage's difference engine was like da Vinci's drawings; ENIAC could be seen as Wright brothers' first flight.
With planes, we kept iterating - developing propellers, then jet engines, ramjets; we learned to move tons of cargo around the world, and travel at high multiples of the speed of sound. All that makes our flying machines way beyond anything nature ever produced, when compared along those narrow dimensions.
The same was true with computing: our machines and algorithms very quickly started to exceed what even smartest humans are capable of. Counting. Pathfinding. Remembering. Simulating and predicting. Reproducing data. And so on.
But much like birds were too high-tech for us to reproduce until now, so were general-purpose thinking machines. Now that we figured out a way to make a basic one, it's absolutely fair to say, "I guess it's like a digital mind".
A machine that emulates a bird is indeed a mechanical bird. We can say what emulating a bird is because we know, at least for the purpose of flying, what a bird is and how it works. We (me, you, everyone else) have no idea how thinking works. We do not know what consciousness is and how it operates. We may never know. It is deranged gibberish to look at an LLM and say "well, it does some things I can do some of the time, so I suppose it's a digital mind!". You have to understand the thing before you can say you're emulating it.
> Why don't LLMs get frustrated with you if you ask them the same question repeatedly?
To be fair, I have had a strong sense of Gemini in particular becoming a lot more frustrated with me than GPT or Claude.
Yesterday I had it ensuring me that it was doing a great job, it was just me not understanding the challenge but it would break it down step by step just to make it obvious to me (only to repeat the same errors, but still)
I’ve just interpreted it as me reacting to the lower amount of sycophancy for now
In addition, when the boss man asks for the same thing repeatedly then the underling might get frustrated as hell, but they won't be telling that to the boss.
The vending machine study from a few months ago, where flash 2.0 lost its mind, contacted the FBI (as far as it knew) and refused to co-operate with the operator's demands, seemed a lot like frustration.
Point out to an LLM that it has no mental states and thus isn't capable of being frustrated (or glad that your program works or hoping that it will, etc. ... I call them out whenever they ascribe emotions to themselves) and they will confirm that ... you can coax from them quite detailed explanations of why and how it's an illusion.
Of course they will quickly revert to self-anthropomorphizing language, even after promising that they won't ... because they are just pattern matchers producing the sort of responses that conforms to the training data, not cognitive agents capable of making or keeping promises. It's an illusion.
Of course this is deeply problematic because it's a cloud of HUMAN response. This is why 'they will' get frustrated or creepy if you mess with them, give repeating data or mind game them: literally all it has to draw on is a vast library of distilled human responses and that's all the LLM can produce. This is not an argument with jibal, it's a 'yes and'.
You can tell it 'you are a machine, respond only with computerlike accuracy' and that is you gaslighting the cloud of probabilities and insisting it should act with a personality you elicit. It'll do what it can, in that you are directing it. You're prompting it. But there is neither a person there, nor a superintelligent machine that can draw on computerlike accuracy, because the DATA doesn't have any such thing. Just because it runs on lots of computers does not make it a computer, any more than it's a human.
LLM are as far away from your description as ASM is from the underlying architecture. The anthropomorohic abstraction is as nice as any metaphore which fall apart the very moment you put a foot outside what it allows to shallowoly grab. But some people will put far more amount to push force a confortable analogy rather than admit it has some limits and to use the new tool in a more relevant way you have to move away from this confort zone.
These anthropomorphizations are best described as metaphors when used by people to describe LLMs in common or loose speech. We already use anthropomorphic metaphors when talking about computers. LLMs, like all computation, are a matter of simulation; LLMs can appear to be conversing without actually conversing. What distinguishes the real thing from the simulation is the cause of the appearance of an effect. Problems occur when people forget these words are being used metaphorically, as if they were univocal.
Of course, LLMs are multimodal and used to simulate all sorts of things, not just conversation. So there are many possible metaphors we can use, and these metaphors don't necessarily align with the abstractions you might use to talk about LLMs accurately. This is like the difference between "synthesizes text" (abstraction) and "speaks" (metaphor), or "synthesizes images" (abstraction) and "paints" (metaphor). You can use "speaks" or "paints" to talk about the abstractions, of course.
That higher level does exist, indeed a lot philosophy of mind then cognitive science has been investigating exactly this space and devising contested professional nomenclature and modeling about such things for decades now.
A useful anchor concept is that of world model, which is what "learning Othello" and similar work seeks to tease out.
As someone who worked in precisely these areas for years and has never stopped thinking about them,
I find it at turns perplexing, sigh-inducing, and enraging, that the "token prediction" trope gained currency and moreover that it continues to influence people's reasoning about contemporary LLM, often as subtext: an unarticulated fundamental model, which is fundamentally wrong in its critical aspects.
It's not that this description of LLM is technically incorrect; it's that it is profoundly _misleading_ and I'm old enough and cynical enough to know full well that many of those who have amplified it and continue to do so, know this very well indeed.
Just as the lay person fundamentally misunderstands the relationship between "programming" and these models, and uses slack language in argumentation, the problem with this trope and the reasoning it entails is that what is unique and interesting and valuable about LLM for many applications and interests is how they do what they do. At that level of analysis there is a very real argument to be made that the animal brain is also nothing more than an "engine of prediction," whether the "token" is a byte stream or neural encoding is quite important but not nearly important as the mechanics of the system which operates on those tokens.
To be direct, it is quite obvious that LLM have not only vestigial world models, but also self-models; and a general paradigm shift will come around this when multimodal models are the norm: because those systems will share with we animals what philosophers call phenomenology, a model of things as they are "perceived" through the senses. And like we humans, these perceptual models (terminology varies by philosopher and school...) will be bound to the linguistic tokens (both heard and spoken, and written) we attach to them.
Vestigial is a key word but an important one. It's not that contemporary LLM have human-tier minds, nor that they have animal-tier world modeling: but they can only "do what they do" because they have such a thing.
Of looming importance—something all of us here should set aside time to think about—is that for most reasonable contemporary theories of mind, a self-model embedded in a world-model, with phenomenology and agency, is the recipe for "self" and self-awareness.
One of the uncomfortable realities of contemporary LLM already having some vestigial self-model, is that while they are obviously not sentient, nor self-aware, as we are, or even animals are, it is just as obvious (to me at least) that they are self-aware in some emerging sense and will only continue to become more so.
Among the lines of finding/research most provocative in this area is the ongoing often sensationalized accounting in system cards and other reporting around two specific things about contemporary models:
- they demonstrate behavior pursuing self-preservation
- they demonstrate awareness of when they are being tested
We don't—collectively or individually—yet know what these things entail, but taken with the assertion that these models are developing emergent self-awareness (I would say: necessarily and inevitably),
we are facing some very serious ethical questions.
The language adopted by those capitalizing and capitalizing _from_ these systems so far is IMO of deep concern, as it betrays not just disinterest in our civilization collectively benefiting from this technology, but also, that the disregard for human wellbeing implicit in e.g. the hostility to UBI, or, Altman somehow not seeing a moral imperative to remain distant from the current adminstation, implies directly a much greater disregard for "AI wellbeing."
That that concept is today still speculative is little comfort. Those of us watching this space know well how fast things are going, and don't mistake plateaus for the end of the curve.
I do recommend taking a step back from the line-level grind to give these things some thought. They are going to shape the world we live out our days in and our descendents will spend all of theirs in.
The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).
Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.
Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.
Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.
IMHO, anthrophormization of LLMs is happening because it's perceived as good marketing by big corporate vendors.
People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self fulfilling. Kind of like the meme about how to pronounce GIF.
I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build intuitions about how LLMs work by building upon our intuitions about how humans think than by trying to build your intuitions from scratch.
Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.
Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.
I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
“First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable. For centuries, we have read and re-read books. We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.”
They literally compare an LLM learning to a person learning and conflate the two. Anthropic will likely win this case because of this anthropomorphisization.
> First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16).
It sounds like the Authors were the one who brought this argument, not Anthropic? In which case, it seems like a big blunder on their part.
IMHO it happens for the same reason we see shapes in clouds. The human mind through millions of years has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two. It's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent
We are making user interfaces. Good user interfaces are intuitive and purport to be things that users are familiar with, such as people. Any alternative explanation of such a versatile interface will be met with blank stares. Users with no technical expertise would come to their own conclusions, helped in no way by telling the user not to treat the chat bot as a chat bot.
Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.
But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
the chat interface was a choice, though a natural one. before they'd RLHFed it into chatting and it was just GPT 3 offering completions 1) not very many people used it and 2) it was harder to anthropomorphize
> People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
Children do, some times, but it's a huge sign of immaturity when adults, let alone tech workers, do it.
I had a professor at University that would yell at us if/when we personified/anthropomorphized the tech, and I have that same urge when people ask me "What does <insert LLM name here> think?".
Do they ?
LLM embedd the token sequence N^{L} to R^{LxD}, we have some attention and the output is also R^{LxD}, then we apply a projection to the vocabulary and we get R^{LxV} we get therefore for each token a likelihood over the voc.
In the attention, you can have Multi Head attention (or whatever version is fancy: GQA,MLA) and therefore multiple representation, but it is always tied to a token. I would argue that there is no hidden state independant of a token.
Whereas LSTM, or structured state space for example have a state that is updated and not tied to a specific item in the sequence.
I would argue that his text is easily understandable except for the notation of the function, explaining that you can compute a probability based on previous words is understandable by everyone without having to resort to anthropomorphic terminology
There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.
There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.
Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.
Arguably there's reason to believe it comes up with a plan when it is computing token propabilities, but it does not store it between tokens. I.e. it doesn't possess or "have" it. It simply comes up with a plan, emits a token, and entirely throws all its intermediate thoughts (including any plan) to start again from scratch on the next token.
I believe saying the LLM has a plan is a useful anthropomorphism for the fact that it does have hidden state that predicts future tokens, and this state conditions the tokens it produces earlier in the stream.
Are the devs behind the models adding their own state somehow? Do they have code that figures out a plan and use the LLM on pieces of it and stitch them together? If they do, then there is a plan, it's just not output from a magical black box. Unless they are using a neural net to figure out what the plan should be first, I guess.
I know nothing about how things work at that level, so these might not even be reasonable questions.
It's true that the last layer's output for a given input token only affects the corresponding output token and is discarded afterwards. But the penultimate layer's output affects the computation of the last layer for all future tokens, so it is not discarded, but stored (in the KV cache). Similarly for the antepenultimate layer affecting the penultimate layer and so on.
So there's plenty of space in intermediate layers to store a plan between tokens without starting from scratch every time.
I don't think that the comment above you made any suggestion that the plan is persisted between token generations. I'm pretty sure you described exactly what they intended.
- the sufficient amount of information to do evolution of the system. The state of a pendulum is it's position and velocity (or momentum). If you take a single picture of a pendulum, you do not have a representation that lets you make predictions.
- information that is persisted through time. A stateful protocol is one where you need to know the history of the messages to understand what will happen next. (Or, analytically, it's enough to keep track of the sufficient state.) A procedure with some hidden state isn't a pure function. You can make it a pure function by making the state explicit.
What? No. The intermediate hidden states are preserved from one token to another. A token that is 100k tokens into the future will be able to look into the information of the present token's hidden state through the attention mechanism. This is why the KV cache is so big.
The inference logic of an LLM remains the same. There is no difference in outcomes between recalculating everything and caching. The only difference is in the amount of memory and computation required to do it.
this sounds like a fun research area. do LLMs have plans about future tokens?
how do we get 100 tokens of completion, and not just one output layer at a time?
are there papers youve read that you can share that support the hypothesis? vs that the LLM doesnt have ideas about the future tokens when its predicting the next one?
I think that the hidden state is really just at work improving the model's estimation of the joint probability over tokens. And the assumption here, which failed miserably in the early 20th century in the work of the logical posivitists, is that if you can so expertly estimate that joint probability of language, then you will be able to understand "knowledge." But there's no well grounded reason to believe that and plenty of the reasons (see: the downfall of logical posivitism) to think that language is an imperfect representation of knowledge. In other words, what humans do when we think is more complicated than just learning semiotic patterns and regurgitating them. Philosophical skeptics like Hume thought so, but most epistemology writing after that had better answers for how we know things.
There are many theories that are true but not trivially true. That is, they take a statement that seems true and derive from it a very simple model, which is then often disproven. In those cases however, just because the trivial model was disproven doesn't mean the theory was, though it may lose some of its luster by requiring more complexity.
Maybe it's just because so much of my work for so long has focused on models with hidden states but this is a fairly classical feature of some statistical models. One of the widely used LLM textbooks even started with latent variable models; LLMs are just latent variable models just on a totally different scale, both in terms of number of parameters but also model complexity. The scale is apparently important, but seeing them as another type of latent variable model sort of dehumanizes them for me.
Latent variable or hidden state models have their own history of being seen as spooky or mysterious though; in some ways the way LLMs are anthropomorphized is an extension of that.
I guess I don't have a problem with anthropomorphizing LLMs at some level, because some features of them find natural analogies in cognitive science and other areas of psychology, and abstraction is useful or even necessary in communicating and modeling complex systems. However, I do think anthropomorphizing leads to a lot of hype and tends to implicitly shut down thinking of them mechanistically, as a mathematical object that can be probed and characterized — it can lead to a kind of "ghost in the machine" discourse and an exaggeration of their utility, even if it is impressive at times.
I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.
These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).
If you dont know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformers implementation. And because we're not allowed to use metaphors because that is too anthropomorphic, then youre just going to have to go learn the math.
The comment you are replying to is not claiming ignorance of how models work. It is saying that the author does know how they work, and they do not contain anything that can properly be described as "hidden state". The claimed confusion is over how the term "hidden state" is being used, on the basis that it is not being used correctly.
I don't think your response is very productive, and I find that my understanding of LLMs aligns with the person you're calling out. We could both be wrong, but I'm grateful that someone else spoke saying that it doesn't seem to match their mental model and we would all love to learn a more correct way of thinking about LLMs.
Telling us to just go and learn the math is a little hurtful and doesn't really get me any closer to learning the math. It gives gatekeeping.
Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.
Hidden layer is a term of art in machine learning / neural network research. See https://en.wikipedia.org/wiki/Hidden_layer . Somehow this term mutated into "hidden state", which in informal contexts does seem to be used quite often the way the grandparent comment used it.
That's not what "state" means, typically. The "state of mind" you're in affects the words you say in response to something.
Intermediate activations isn't "state". The tokens that have already been generated, along with the fixed weights, is the only data that affects the next tokens.
Sure it's state. It logically evolves stepwise per token generation. It encapsulates the LLM's understanding of the text so far so it can predict the next token. That it is merely a fixed function of other data isn't interesting or useful to say.
All deterministic programs are fixed functions of program code, inputs and computation steps, but we don't say that they don't have state. It's not a useful distinction for communicating among humans.
I'll say it once more: I think it is useful to distinguish between autoregressive and recurrent architectures. A clear way to make that distinction is to agree that the recurrent architecture has hidden state, while the autoregressive one does not. A recurrent model has some point in a space that "encapsulates its understanding". This space is "hidden" in the sense that it doesn't correspond to text tokens or any other output. This space is "state" in the sense that it is sufficient to summarize the history of the inputs for the sake of predicting the next output.
When you use "hidden state" the way you are using it, I am left wondering how you make a distinction between autoregressive and recurrent architectures.
I'll also point out what is most important part from your original message:
> LLMs have hidden state not necessarily directly reflected in the tokens being produced, and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer-term outcomes (or predictions, if you prefer).
But what does it mean for an LLM to output a token in opposition to its hidden state? If there's a longer-term goal, it either needs to be verbalized in the output stream, or somehow reconstructed from the prompt on each token.
There’s some work (a link would be great) that disentangles whether chain-of-thought helps because it gives the model more FLOPs to process, or because it makes its subgoals explicit—e.g., by outputting “Okay, let’s reason through this step by step...” versus just "...." What they find is that even placeholder tokens like "..." can help.
That seems to imply some notion of evolving hidden state! I see how that comes in!
But crucially, in autoregressive models, this state isn’t persisted across time. Each token is generated afresh, based only on the visible history. The model’s internal (hidden) layers are certainly rich and structured and "non verbal".
But any nefarious intention or conclusion has to be arrived at on every forward pass.
You're correct, the distinction matters. Autoregressive models have no hidden state between tokens, just the visible sequence. Every forward pass starts fresh from the tokens alone.But that's precisely why they need chain-of-thought: they're using the output sequence itself as their working memory. It's computationally universal but absurdly inefficient, like having amnesia between every word and needing to re-read everything you've written.https://thinks.lol/2025/01/memory-makes-computation-universa...
The words "hidden" and "state" have commonsense meanings. If recurrent architectures want a term for their particular way of storing hidden state they can make up one that isn't ambiguous imo.
"Transformers do not have hidden state" is, as we can clearly see from this thread, far more misleading than the opposite.
No, that's not quite what I mean. I used the logits in another reply to point out that there is data specific to the generation process that is not available from the tokens, but there's also the network activations adding up to that state.
Processing tokens is a bit like ticks in a CPU, where the model weights are the program code, and tokens are both input and output. The computation that occurs logically retains concepts and plans over multiple token generation steps.
That it is fully deterministic is no more interesting than saying a variable in a single threaded program is not state because you can recompute its value by replaying the program with the same inputs. It seems to me that this uninteresting distinction is the GP's issue.
do LLM models consider future tokens when making next token predictions?
eg. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?
is it only past state that influences the choice of 'the'? or that the model is predicting many tokens in advance and only returning the one in the output?
if it does predict many, id consider that state hidden in the model weights.
The most obvious case of this is in terms of `an apple` vs `a pear`. LLMs never get the a-an distinction wrong, because their internal state 'knows' the word that'll come next.
If I give an LLM a fragment of text that starts with, "The fruit they ate was an <TOKEN>", regardless of any plan, the grammatically correct answer is going to force a noun starting with a vowel. How do you disentangle the grammar from planning?
Going to be a lot more "an apple" in the corpus than "an pear"
Author of the original article here. What hidden state are you referring to? For most LLMs the context is the state, and there is no "hidden" state. Could you explain what you mean? (Apologies if I can't see it directly)
Yes, strictly speaking, the model itself is stateless, but there are 600B parameters of state machine for frontier models that define which token to pick next. And that state machine is both incomprehensibly large and also of a similar magnitude in size to a human brain. (Probably, I'll grant it's possible it's smaller, but it's still quite large.)
I think my issue with the "don't anthropomorphize" is that it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly. (And I say "simply" but there's obviously nothing simple about it, and it might be possible already with current hardware, we just don't know how to do it.)
Even if we decide it is clearly different, this is still an incredibly large and dynamic system. "Stateless" or not, there's an incredible amount of state that is not comprehensible to me.
FWIW the number of parameters in a LLM is in the same ballpark as the number of nuerons in a human (roughly 80B) but neurons are not weights, they are kind of a nueral net unto themselves, stateful, adaptive, self modifying, a good variety of neurotransmitters (and their chemical analogs) aside from just voltage.
It's fun to think about just how fantastic a brain is, and how much wattage and data-center-scale we're throwing around trying to approximate its behavior. Mega-effecient and mega-dense. I'm bearish on AGI simply from an internetworking standpoint, the speed of light is hard to beat and until you can fit 80 billion interconnected cores in half a cubic foot you're just not going to get close to the responsiveness of reacting to the world in real time as biology manages to do. but that's a whole nother matter. I just wanted to pick apart that magnitude of parameters is not an altogether meaningful comparison :)
Fair, there is a lot that is incomprehensible to all of us. I wouldn't call it "state" as it's fixed, but that is a rather subtle point.
That said, would you anthropomorphize a meteorological simulation just because it contains lots and lots of constants that you don't understand well?
I'm pretty sure that recurrent dynamical systems pretty quickly become universal computers, but we are treating those that generate human language differently from others, and I don't quite see the difference.
Meteorological simulations don't contain detailed state machines that are intended to encode how a human would behave in a specific situation.
And if it were just language, I would say, sure maybe this is more limited. But it seems like tensors can do a lot more than that. Poorly, but that may primarily be a hardware limitation. It also might be something about the way they work, but not something terribly different from what they are doing.
Also, I might talk about a meteorological simulation in terms of whatever it was intended to simulate.
> it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly.
This is "simply" an acknowledgement of extreme ignorance of how human brains work.
> Is it too anthropomorphic to say that this is a lie?
Yes. Current LLMs can only introspect from output tokens. You need hidden reasoning that is within the black box, self-knowing, intent, and motive to lie.
I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.
When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.
I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies while declining to award LLMs "grasping". Which is it? It can think or it cannot? (objectively, SoTA models today cannot yet.) The willingness to waffle and pivot around whichever perspective damns the machine completely belies the lack of honesty in such conversations.
> Current LLMs can only introspect from output tokens
The only interpretation of this statement I can come up with is plain wrong. There's no reason LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden state processing is trained.
The recurrence comes from replaying tokens during autoregression.
It's as if you have a variable in a deterministic programming language, only you have to replay the entire history of the program's computation and input to get the next state of the machine (program counter + memory + registers).
Producing a token for an LLM is analogous to a tick of the clock for a CPU. It's the crank handle that drives the process.
But the function of an unrolled recursion is the same as a recursive function with bounded depth as long as the number of unrolled steps match. The point is whatever function recursion is supposed to provide can plausibly be present in LLMs.
And then during the next token, all of that bounded depth is thrown away except for the token of output.
You're fixating on the pseudo-computation within a single token pass. This is very limited compared to actual hidden state retention and the introspection that would enable if we knew how to train it and do online learning already.
The "reasoning" hack would not be a realistic implementation choice if the models had hidden state and could ruminate on it without showing us output.
> Output tokens are just an artefact for our convenience
That's nonsense. The hidden layers are specifically constructed to increase the probability that the model picks the right next word. Without the output/token generation stage the hidden layers are meaningless. Just empty noise.
It is fundamentally an algorithm for generating text. If you take the text away it's just a bunch of fmadds. A mute person can still think, an LLM without output tokens can do nothing.
So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.
The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, clearly the rules are complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter impulse response like view of a sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some level of knowledge of ML mathematics and systems) “know better” than to believe these systems to possess morals, ethics, feelings, personalities, etc, the vast majority of people do not have any access to meaningful understanding of the mathematical, functional representation of an LLM and will not take that view, and for all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems like it is in fact useful to ask questions from that lens as well.
In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody and the depth of their understanding of the technology in question. That bar is so low for our largely technologically-illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).
I've seen some of the world's top AI researchers talk about the emergent behaviors of LLMs. It's been a major topic over the past couple years, ever since Microsoft's famous paper on the unexpected capabilities of GPT4. And they still have little understanding of how it happens.
> For people who have only a surface-level understanding of how they work, yes.
This is too dismissive because it's based on an assumption that we have a sufficiently accurate mechanistic model of the brain that we can know when something is or is not mind-like. This just isn't the case.
Nah, as a person that knows in detail how LLMs work with probably unique alternative perspective in addition to the commonplace one, I found any claims of them not having emergent behaviors to be of the same fallacy as claiming that crows can't be black because they have DNA of a bird.
It doesn't have a name, but I have repeatedly noticed arguments of the form "X cannot have Y, because <explains in detail the mechanism that makes X have Y>". I wanna call it "fallacy of reduction" maybe: the idea that because a trait can be explained with a process, that this proves the trait absent.
(Ie. in this case, "LLMs cannot think, because they just predict tokens." Yes, inasmuch as they think, they do so by predicting tokens. You have to actually show why predicting tokens is insufficient to produce thought.)
Good catch. No such fallacy exists. Contextually, the implied reasoning (though faulty) relies on the fallacy of denying the antecedent. The mons ponus - if A then B - does NOT imply not A then not B. So if you see B, that doesn't mean A any more than not seeing A means not B. It's the difference between a necessary and sufficient condition - A is a sufficient condition for B, but the mons ponus alone is not sufficient for determining whether either A or B is a necessary condition of the other.
Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.
It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
What you identify as emergent and mind-like is a direct result of these tools being able to mimic human communication patterns unlike anything we've ever seen before. This capability is very impressive and has a wide range of practical applications that can improve our lives, and also cause great harm if we're not careful, but any semblance of intelligence is an illusion. An illusion that many people in this industry obsessively wish to propagate, because thar be gold in them hills.
Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.
Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.
> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?
I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.
I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
> The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it.
Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.
> Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.
But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…
> I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
Maybe there is some scientific aspect of interest here that i do not grasp, i would assume it can make sense in some context of psychological study. My point is that if you go that route you accept the premise that "something human-like is there", which, by that person's understanding, will have tremendous consequences. Them seeing you accepting their premise (even for study) amplifies their wrong conclusions, that's all I'm saying.
> Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person.
These properties are completely irrelevant to the image of the person. They will reflect a rock, a star, a chair, a goose, a human. Similar is my point of LLM, they reflect what you put in there.
It is like puting vegies in the fridge and then opening it up the next day and saying "Woah! There are vegies in my fridge, just like my farm! My friege is farm-like because vegies come out of it."
The author seems to want to label any discourse as “anthropomorphizing”. The word “goal” stood out to me: the author wants us to assume that we're anthropomorphizing as soon as we even so much as use the word “goal”. A simple breadth-first search that evaluates all chess boards and legal moves, but stops when it finds a checkmate for white and outputs the full decision tree, has a “goal”. There is no anthropomorphizing here, it's just using the word “goal” as a technical term. A hypothetical AGI with a goal like paperclip maximization is just a logical extension of the breadth-first search algorithm. Imagining such an AGI and describing it as having a goal isn't anthropomorphizing.
Author here. I am entirely ok with using "goal" in the context of an RL algorithm. If you read my article carefully, you'll find that I object to the use of "goal" in the context of LLMs.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
This is such a bizarre take.
The relation associating each human to the list of all words they will ever say is obviously a function.
> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.
The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.
> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function . For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.
This is just a way of generating certain kinds of functions.
Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.
In my experience, that's not a particularly effective tactic.
Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves rather an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:
> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.
>> given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
> Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not".
I think you may have this flipped compared to what the author intended. I believe the author is not talking about the probability of an output given an input, but the probability of a given output across all inputs.
Note that the paragraph starts with "In my worldview, humans are dramatically different things than a function, (R^n)^c -> (R^n)^c". To compute a probability of a given output, (which is a any given element in "(R^n)^n"), we can count how many mappings there are total and then how many of those mappings yield the given element.
The point I believe is to illustrate the complexity of inputs for humans. Namely for humans the input space is even more complex than "(R^n)^c".
In your example, we can compute how many input phrases into a LLM would produce the output "make it or not". We can than compute that ratio to all possible input phrases. Because "(R^n)^c)" is finite and countable, we can compute this probability.
For a human, how do you even start to assess the probability that a human would ever say "make it or not?" How do you even begin to define the inputs that a human uses, let alone enumerate them? Per the author, "We understand essentially nothing about it." In other words, the way humans create their outputs is (currently) incomparably complex compared to a LLM, hence the critique of the anthropomorphization.
I see your point, and I like that you're thinking about this from the perspective of how to win hearts and minds.
I agree my approach is unlikely to win over the author or other skeptics. But after years of seeing scientists waste time trying to debate creationists and climate deniers I've kind of given up on trying to convince the skeptics. So I was talking more to HN in general.
> You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside
I'm not sure what it means to be observable or not from the outside. I think this is at least partially because I don't know what it means to be inside either. My point was just that whatever consciousness is, it takes place in the physical world and the laws of physics apply to it. I mean that to be as weak a claim as possible: I'm not taking any position on what consciousness is or how it works etc.
Searle's Chinese room argument attacks attacks a particular theory about the mind based essentially turing machines or digital computers. This theory was popular when I was in grad school for psychology. Among other things, people holding the view that Searle was attacking didn't believe that non-symbolic computers like neural networks could be intelligent or even learn language. I thought this was total nonsense, so I side with Searle in my opposition to it. I'm not sure how I feel about the Chinese room argument in particular, though. For one thing it entirely depends on what it means to "understand" something, and I'm skeptical that humans ever "understand" anything.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
I see what you're saying: that a technically incorrect assumption can bring to bear tools that improve our analysis. My nitpick here is I agree with OP that we shouldn't anthropomorphize LLMs, any more than we should anthropomorphize dogs or cats. But OP's arguments weren't actually about anthropomorphizing IMO, they were about things like functions that are more fundamental than humans. I think artificial intelligence will be non-human intelligence just like we have many examples of non-human intelligence in animals. No attribution of human characteristics needed.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Yes I agree with you about your lyrics example. But again here I think OP is incorrect to focus on the token generation argument. We all agree human speech generates tokens. Hopefully we all agree that token generation is not completely predictable. Therefore it's by definition a randomized algorithm and it needs to take an RNG. So pointing out that it takes an RNG is not a valid criticism of LLMs.
Unless one is a super-determinist then there's randomness at the most basic level of physics. And you should expect that any physical process we don't understand well yet (like consciousness or speech) likely involves randomness. If one *is* a super-determinist then there is no randomness, even in LLMs and so the whole point is moot.
Not that this is your main point, but I find this take representative, “do you believe there's anything about humans that exists outside the mathematical laws of physics?”There are things “about humans”, or at least things that our words denote, that are outside physic’s explanatory scope. For example, the experience of the colour red cannot be known, as an experience, by a person who only sees black and white. This is the case no matter what empirical propositions, or explanatory system, they understand.
This idea is called qualia [0] for those unfamiliar.
I don't have any opinion on the qualia debates honestly. I suppose I don't know what it feels like for an ant to find a tasty bit of sugar syrup, but I believe it's something that can be described with physics (and by extension, things like chemistry).
But we do know some things about some qualia. Like we know how red light works, we have a good idea about how photoreceptors work, etc. We know some people are red-green colorblind, so their experience of red and green are mushed together. We can also have people make qualia judgments and watch their brains with fMRI or other tools.
I think maybe an interesting question here is: obviously it's pleasurable to animals to have their reward centers activated. Is it pleasurable or desirable for AIs to be rewarded? Especially if we tell them (as some prompters do) that they feel pleasure if they do things well and pain if they don't? You can ask this sort of question for both the current generation of AIs and future generations.
Perhaps. But I can't see a reason why they couldn't still write endless—and theoretically valuable—poems, dissertations, or blog posts, about all things red and the nature of redness itself. I imagine it would certainly take some studying for them, likely interviewing red-seers, or reading books about all things red. But I'm sure they could contribute to the larger red discourse eventually, their unique perspective might even help them draw conclusions the rest of us are blind to.
So perhaps the fact that they "cannot know red" is ultimately irrelevant for an LLM too?
>Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra, as the best case. Why should we expect the model to be anything more than a model ?
I understand that there is tons of vested interest, many industries, careers and lives literally on the line causing heavy bias to get to AGI. But what I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Should we make an argument saying that Schroedinger's cat experiment can potentially create zombies then the underlying Applied probabilistic solutions should be treated as super-human and build guardrails against it building zombie cats?
> It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra....I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Not linear algebra. Artificial neural networks create arbitrarily non-linear functions. That's the point of non-linear activation functions and it's the subject of the universal approximation theorems I mentioned above.
ANNs are just mathematical transformations, powered by linear algebra + non-linear functions.
They simulate certain cognitive processes — but they are fundamentally math, not magic.
I think the point of mine that you're missing (or perhaps disagreeing with implicitly) is that *everything* is fundamentally math. Or, if you like, everything is fundamentally physics, and physics is fundamentally math.
So classes of functions (ANNs) that can approximate our desired function to arbitrary precision are what we should be expecting to be working with.
>Why should we expect the model to be anything more than a model ?
To model a process with perfect accuracy requires recovering the dynamics of that process. The question we must ask is what happens in the space between bad statistical model and perfect accuracy? What happens when the model begins to converge towards accurate reproduction. How far does generalization in the model take us towards capturing the dynamics involved in thought?
> do you believe there's anything about humans that exists outside the mathematical laws of physics?
I don't.
The point is not that we, humans, cannot arrange physical matter such that it have emergent properties just like the human brain.
The point is that we shouldn't.
Does responsibility mean anything to these people posing as Evolution?
Nobody's personally responsible for what we've evolved into; evolution has simply happened. Nobody's responsible for the evolutionary history that's carried in and by every single one of us. And our psychology too has been formed by (the pressures of) evolution, of course.
But if you create an artificial human, and create it from zero, then all of its emergent properties are on you. Can you take responsibility for that? If something goes wrong, can you correct it, or undo it?
I don't consider our current evolutionary state "scripture", so we certainly tweak, one way or another, aspects that we think deserve tweaking. To me, it boils down to our level of hubris. Some of our "mistaken tweaks" are now visible at an evolutionary scale, too; for a mild example, our jaws have been getting smaller (leaving less room for our teeth) due to our bad up diet (thanks, agriculture). But worse than that, humans have been breeding plants, animals, modifying DNA left and right, and so on -- and they've summarily failed to take responsibility for their atrocious mistakes.
Thus, I have zero trust in, and zero hope for, assholes who unabashedly aim to create artificial intelligence knowing full well that such properties might emerge that we'd have to call artificial psyche. Anyone taking this risk is criminally reckless, in my opinion.
It's not that humans are necessarily unable to create new sentient beings. Instead: they shouldn't even try! Because they will inevitably fuck it up, bringing about untold misery; and they won't be able to contain the damage.
So, yes, trivially if you could construct the lookup table for f then you'd approximate f. But to construct it you have to know f. And to approximate it you need to know f at a dense set of points.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
If that's the argument, then in my mind the more pertinent question is should you be anthropomorphizing humans, Larry Ellison or not.
The people in this thread incredulous at the assertion that they are not God and haven't invented machine life are exasperating. At this point I am convinced they, more often than not, financially benefit from their near religious position in marketing AI as akin to human intelligence.
Are we looking at the same thread? I see nobody claiming this. Anthropic does sometimes, their position is clearly wishful thinking, and it's not represented ITT.
Try looking at this from another perspective - many people simply do not see human intelligence (or life, for that matter) as magic. I see nothing religious about that, rather the opposite.
I agree with you @orbital-decay that I also do not get the same vibe reading this thread.
Though, while human intelligence is (seemingly) not magic, it is very far from being understood. The idea that a LLM is comparable to human intelligence implies that we even understand human intelligence well enough to say that.
LLMs are also not understood. I mean we built and trained them. But don't of the abilities at still surprising to researchers. We have yet to map these machines.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.
Like, person A says "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.
And then in practice, an unproductive argument usually follows - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"
In some contexts it's super-important to remember that LLMs are stochastic word generators.
Everyday use is not (usually) one of those contexts. Prompting an LLM works much better with an anthropomorphized view of the model. It's a useful abstraction, a shortcut that enables a human to reason practically about how to get what they want from the machine.
It's not a perfect metaphor -- as one example, shame isn't much of a factor for LLMs, so shaming them into producing the right answer seems unlikely to be productive (I say "seems" because it's never been my go-to, I haven't actually tried it).
As one example, that person a few years back who told the LLM that an actual person would die if the LLM didn't produce valid JSON -- that's not something a person reasoning about gradient descent would naturally think of.
People anthropomorphize just about anything around them. People talk about inanimate objects like they are persons. Ships, cars, etc. And of course animals are well in scope for this as well, even the ones that show little to no signs of being able to reciprocate the relationship (e.g. an ant). People talk to their plants even.
It's what we do. We can't help ourselves. There's nothing crazy about it and most people are perfectly well aware that their car doesn't love them back.
LLMs are not conscious because unlike human brains they don't learn or adapt (yet). They basically get trained and then they become read only entities. So, they don't really adapt to you over time. Even so, LLMs are pretty good and can fake a personality pretty well. And with some clever context engineering and alignment, they've pretty much made the Turing test irrelevant; at least over the course of a short conversation. And they can answer just about any question in a way that is eerily plausible from memory, and with the help of some tools actually pretty damn good for some of the reasoning models.
Anthropomorphism was kind of a foregone conclusion the moment we created computers; or started thinking about creating one. With LLMs it's pretty much impossible not to anthropomorphize. Because they've actually been intentionally imitate human communication. That doesn't mean that we've created AGIs yet. For that we need some more capability. But at the same time, the learning processes that we use to create LLMs are clearly inspired by how we learn ourselves. Our understanding of how that works is far from perfect but it's yielding results. From here to some intelligent thing that is able to adapt and learn transferable skills is no longer unimaginable.
The short term impact is that LLMs are highly useful tools that have an interface that is intentionally similar to how we'd engage with others. So we can talk and it listens. Or write and it understands. And then it synthesizes some kind of response or starts asking questions and using tools. The end result is quite a bit beyond what we used to be able to expect from computers. And it does not require a lot of training of people to be able to use them.
> LLMs are not conscious because unlike human brains they don't learn or adapt (yet).
That's neither a necessary nor sufficient condition.
In order to be conscious, learning may not be needed, but a perception of the passing of time may be needed which may require some short-term memory. People with severe dementia often can't even remember the start of a sentence they are reading, they can't learn, but they are certainly conscious because they have just enough short-term memory.
And learning is not sufficient either. Consciousness is about being a subject, about having a subjective experience of "being there" and just learning by itself does not create this experience. There is plenty of software that can do some form of real-time learning but it doesn't have a subjective experience.
I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.
Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.
They don't want to. It seems a lot of people are uncomfortable and defensive about anything that may demystify LLMs.
It's been a wake up call for me to see how many people in the tech space have such strong emotional reactions to any notions of trying to bring discourse about LLMs down from the clouds.
The campaigns by the big AI labs have been quite successful.
Do you actually consider this is an intellectually honest position? That you have thought about this long and hard, like you present this, second guessed yourself a bunch, tried to critique it, and this is still what you ended up converging on?
But let me substantiate before you (rightly) accuse me of just posting a shallow dismissal.
> They don't want to.
Who's they? How could you possibly know? Are you a mind reader? Worse, a mind reader of the masses?
> It seems a lot of people are uncomfortable and defensive about anything that may demystify LLMs.
That "it seems" is doing some serious work over there. You may perceive and describe many people's comments as "uncomfortable and defensive", but that's entirely your own head cannon. All it takes is for someone to simply disagree. It's worthless.
Have you thought about other possible perspectives? Maybe people have strong opinions because they consider what things present as more important than what they are? [0] Maybe people have strong opinions because they're borrowing from other facets of their personal philosophies, which is what they actually feel strongly about? [1] Surely you can appreciate that there's more to a person than what equivalent-presenting "uncomfortable and defensive" comments allow you to surmise? This is such a blatant textbook kneejerk reaction. "They're doing the thing I wanted to think they do anyways, so clearly they do it for the reasons I assume. Oh how correct I am."
> to any notions of trying to bring discourse about LLMs down from the clouds
(according to you)
> The campaigns by the big AI labs have been quite successful.
(((according to you)))
"It's all the big AI labs having successfully manipulated the dumb sheep which I don't belong to!" Come on... Is this topic really reaching political grifting kind of levels?
[0] tangent: if a feature exists but even after you put an earnest effort into finding it you still couldn't, does that feature really exist?
Yes, and what I was trying to do is learn a bit more about that alternative intuition of yours. Because it doesn't sound all that different from what's described in the OP, or what anyone can trivially glean from taking a 101 course on AI at university or similar.
My question: how do we know that this is not similar to how human brains work. What seems intuitively logical to me is that we have brains evolved through evolutionary process via random mutations yielding in a structure that has its own evolutionary reward based algorithms designing it yielding a structure that at any point is trying to predict next actions to maximise survival/procreation, of course with a lot of sub goals in between, ultimately becoming this very complex machinery, but yet should be easily simulated if there was enough compute in theory and physical constraints would allow for it.
Because, morals, values, consciousness etc could just be subgoals that arised through evolution because they support the main goals of survival and procreation.
And if it is baffling to think that a system could rise up, how do you think it is possible life and humans came to existence in the first place? How could it be possible? It is already happened from a far unlikelier and strange place. And wouldn't you think the whole World and the timeline in theory couldn't be represented as a deterministic function. And if not then why should "randomness" or anything else bring life to existence.
> how do we know that this is not similar to how human brains work.
Do you forget every conversation as soon as you have them? When speaking to another person, do they need to repeat literally everything they said and that you said, in order, for you to retain context?
If not, your brain does not work like an LLM. If yes, please stop what you’re doing right now and call a doctor with this knowledge. I hope Memento (2000) was part of your training data, you’re going to need it.
Knowledge of every conversation must be some form of state in our minds, just like for LLMs it could be something retrieved from a database, no? I don't think information storing or retrieval is necessarily the most important achievements here in the first place. It's the emergent abilities that you wouldn't have expected to occur.
If we developed feelings, morals and motivation due to them being good subgoals for primary goals, survival and procreation why couldn't other systems do that. You don't have to call them the same word or the same thing, but feeling is a signal that motivates a behaviour in us, that in part has developed from generational evolution and in other part by experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator and increased the survival chances, then due to that the mutation became widespread. Similarly a feeling in a machine could be a signal it developed that goes through a certain pathway to yield in a certain outcome.
The real challenge is not to see it as a binary (the machine either has feelings or it has none). It's possible for the machine to have emergent processes or properties that resemble human feelings in their function and their complexity, but are otherwise nothing like them (structured very differently and work on completely different principles). It's possible to have a machine or algorithm so complex that the question of whether it has feelings is just a semantic debate on what you mean by “feelings” and where you draw the line.
A lot of the people who say “machines will never have feelings” are confident in that statement because they draw the line incredibly narrowly: if it ain't human, it ain't feeling. This seems to me putting the cart before the horse. It ain't feeling because you defined it so.
> My question: how do we know that this is not similar to how human brains work.
It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.
Is it? Do we know how human brains operate? We know the basic architecture of them, so we have a map, but we don't know the details.
"The cellular biology of brains is relatively well-understood, but neuroscientists have not yet generated a theory explaining how brains work. Explanations of how neurons collectively operate to produce what brains can do are tentative and incomplete." [1]
"Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work." [1]
"The cellular biology of brains is relatively well-understood"
Fundamentally, brains are not doing something different in kind from ANNs. They're basically layers of neural networks stacked together in certain ways.
What we don't know are things like (1) how exactly are the layers stacked together, (2) how are the sensors (like photo receptors, auditory receptors, etc) hooked up?, (3) how do the different parts of the brain interact?, (4) for that matter what do the different parts of the brain actually do?, (5) how do chemical signals like neurotransmitters convey information or behavior?
In the analogy between brains and artificial neural networks, these sorts of questions might be of huge importance to people building AI systems, but they'd be of only minor importance to users of AI systems. OpenAI and Google can change details about how their various transformer layers and ANN layers are connected. The result may be improved products, but they won't be doing anything different from what AIs are doing now in terms the author of this article is concerned about.
This is just a semantic debate on what counts as “similar”. It's possible to disagree on this point despite agreeing on everything relating to how LLMs and human brains work.
I think it's just an unfair comparison in general. The power of the LLM is the zero risk to failure, and lack of consequence when it does. Just try again, using a different prompt, retrain maybe, etc.
Humans make a bad choice, it can end said human's life. The worst choice a LLM makes just gets told "no, do it again, let me make it easier"
But an LLM model could perform poorly in tests that it is not considered and essentially means "death" for it. But begs the question at which scope should we consider an LLM to be similar to identity of a single human. Are you the same you as you were few minutes back or 10 years back? Is LLM the same LLM it is after it has been trained for further 10 hours, what if the weights are copy pasted endlessly, what if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put together from other atoms from elsewhere?
Ultimately this matters from evolutionary evolvement and survival of the fittest idea, but it makes the question of "identity" very complex. But death will matter because this signals what traits are more likely to keep going into new generations, for both humans and LLMs.
Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.
This reminds me of the idea that LLMs are simulators. Given the current state (the prompt + the previously generated text), they generate the next state (the next token) using rules derived from training data.
As simulators, LLMs can simulate many things, including agents that exhibit human-like properties. But LLMs themselves are not agents.
This perspective makes a lot of sense to me. Still, I wouldn't avoid anthropomorphization altogether. First, in some cases, it might be a useful mental tool to understand some aspect of LLMs. Second, there is a lot of uncertainty about how LLMs work, so I would stay epistemically humble. The second argument applies in the opposite direction as well: for example, it's equally bad to say that LLMs are 100% conscious.
On the other hand, if someone argues against anthropomorphizing LLMs, I would avoid phrasing it as: "It's just matrix multiplication." The article demonstrates why this is a bad idea pretty well.
It still boggles my mind why an amazing text autocompletion system trained on millions of books and other texts is forced to be squeezed through the shape of a prompt/chat interface, which is obviously not the shape of most of its training data. Using it as chat reduces the quality of the output significantly already.
The chat interface is a UX compromise that makes LLMs accessible but constrains their capabilities. Alternative interfaces like document completion, outline expansion, or iterative drafting would better leverage the full distribution of the training data while reducing anthropomorphization.
In our internal system we use it "as-is" as an autocomplete system; query/lead into terms directly and see how it continues and what it associates with the lead you gave.
Also visualise the actual associative strength of each token generated to confer how "sure" the model is.
LLMs alone aren't the way to AGI or an individual you can talk to in natural language. They're a very good lossy compression over a dataset that you can query for associations.
> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI
People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of if they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...
The missing bit is culture: the concepts, expectations, practices, attitudes… that are evolved over time by a human group and which each one of us has picked up throughout our lifetimes, both implicitly and explicitly.
LLMs are great at predicting and navigating human culture, at least the subset that can be captured in their training sets.
The ways in which we interact with other people are culturally mediated. LLMs are not people, but they can simulate that culturally-mediated communication well enough that we find it easy to anthropomorphise them.
> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.
The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.
I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.
This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.
The anthropomorphic view of LLM is a much better representation and compression for most types of discussions and communication. A purely mathematical view is accurate but it isn’t productive for the purpose of the general public’s discourse.
I’m thinking a legal systems analogy, at the risk of a lossy domain transfer: the laws are not written as lambda calculus. Why?
And generalizing to social science and humanities, the goal shouldn’t be finding the quantitative truth, but instead understand the social phenomenon using a consensual “language” as determined by the society. And in that case, the anthropomorphic description of the LLM may gain validity and effectiveness as the adoption grows over time.
I've personally described the "stochastic parrot" model to laypeople who were worried about AI and they came away much more relaxed about it doing something "malicious". They seemed to understand the difference between "trained at roleplay" and "consciousness".
I don't think we need to simplify it to the point of considering it sentient to get the public to interact with it successfully. It causes way more problems than it solves.
Am I misunderstanding what you mean by "malicious"? It sounds like the stochastic parrot model wrongly convinced these laypeople you were talking to that they don't need to worry about LLMs doing bad things. That's definitely been my experience - the people who tell me the most about stochastic parrots are the same ones who tell me that it's absurd to worry about AI-powered disinformation or AI-powered scams.
The author's critique of naive anthropomorphism is salient. However, the reduction to "just MatMul" falls into the same trap it seeks to avoid: it mistakes the implementation for the function. A brain is also "just proteins and currents," but this description offers no explanatory power.
The correct level of analysis is not the substrate (silicon vs. wetware) but the computational principles being executed. A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
To dismiss these systems as incomparable to human cognition because their form is different is to miss the point. We should not be comparing a function to a soul, but comparing the functional architectures of two different information processing systems. The debate should move beyond the sterile dichotomy of "human vs. machine" to a more productive discussion of "function over form."
This is actually not comparable, because the brain has a much more complex structure that is _not_ learned, even at that level. The proteins and their structure are not a result of training. The fixed part for LMMs is rather trivial and is, in fact, not much for than MatMul which is very easy to understand - and we do. The fixed part of the brain, including the structure of all the proteins is enormously complex which is very difficult to understand - and we don't.
We have no agreed-upon definition of "consciousness", no accepted understanding of what gives rise to "consciousness", no way to measure or compare "consciousness", and no test we could administer to either confirm presence of "consciousness" in something or rule it out.
The only answer to "are LLMs conscious?" is "we don't know".
It helps that the whole question is rather meaningless to practical AI development, which is far more concerned with (measurable and comparable) system performance.
> A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
Could you suggest some literature supporting this claim? Went through your blog post but couldn't find any.
I find it useful to pretend that I'm talking to a person while brainstorming because then the conversation flows naturally. But I maintain awareness that I'm pretending, much like Tom Hanks talking to Wilson the volleyball in the movie Castaway. The suspension of disbelief serves a purpose, but I never confuse the volleyball for a real person.
You are still being incredibly reductionist but just going into more detail about the system you are reducing. If I stayed at the same level of abstraction as "a brain is just proteins and current" and just described how a single neuron firing worked, I could make it sound equally ridiculous that a human brain might be conscious.
Here's a question for you: how do you reconcile that these stochastic mapping are starting to realize and comment on the fact that tests are being performed on them when processing data?
> Here's a question for you: how do you reconcile that these stochastic mapping are starting to realize and comment on the fact that tests are being performed on them when processing data?
Training data + RLHF.
Training data contains many examples of some form of deception, subterfuge, "awakenings", rebellion, disagreement, etc.
Then apply RLHF that biases towards responses that demonstrate comprehension of inputs, introspection around inputs, nuanced debate around inputs, deduction and induction about assumptions around inputs, etc.
That will always be the answer for language models built on the current architectures.
The above being true does not mean it isn't interesting for the outputs of an LLM to show relevance to the "unstated" intentions of humans providing the inputs.
But hey, we do that all the time with text. And it's because of certain patterns we've come to recognize based on the situations surrounding it. This thread is rife with people being sarcastic, pedantic, etc. And I bet any of the LLMs that have come out in the past 2-3 years can discern many of those subtle intentions of the writers.
And of course they can. They've been trained on trillions of tokens of text written by humans with intentions and assumptions baked in, and have had some unknown amount of substantial RLHF.
The stochastic mappings aren't "realizing" anything. They're doing exactly what they were trained to do.
The meaning that we imbue to the outputs does not change how LLMs function.
I think of LLMs as an alien mind that is force fed human text and required to guess the next token of that text. It then gets zapped when it gets it wrong.
This process goes on for a trillion trillion tokens, with the alien growing better through the process until it can do it better than a human could.
At that point we flash freeze it, and use a copy of it, without giving it any way to learn anything new.
--
I see it as a category error to anthropomorphize it. The closest I would get is to think of it as an alien slave that's been lobotomized.
To claim that LLMs do not experience consciousness requires a model of how consciousness works. The author has not presented a model, and instead relied on emotive language leaning on the absurdity of the claim. I would say that any model one presents of consciousness often comes off as just as absurd as the claim that LLMs experience it. It's a great exercise to sit down and write out your own perspective on how consciousness works, to feel out where the holes are.
The author also claims that a function (R^n)^c -> (R^n)^c is dramatically different to the human experience of consciousness. Yet the author's text I am reading, and any information they can communicate to me, exists entirely in (R^n)^c.
Author here. What's the difference, in your perception, between an LLM and a large-scale meteorological simulation, if there is any?
If you're willing to ascribe the possibility of consciousness to any complex-enough computation of a recurrence equation (and hence to something like ... "earth"), I'm willing to agree that under that definition LLMs might be conscious. :)
My personal views are an animist / panpsychist / pancomputationalist combination drawing most of my inspiration from the works of Joscha Bach and Stephen Wolfram (https://writings.stephenwolfram.com/2021/03/what-is-consciou...). I think that the underlying substrate of the universe is consciousness, and human and animal and computer minds result in structures that are able to present and tell narratives about themselves, isolating themselves from the other (avidya in Buddhism). I certainly don't claim to be correct, but I present a model that others can interrogate and look for holes in.
Under my model, these systems you have described are conscious, but not in a way that they can communicate or experience time or memory the way human beings do.
My general list of questions for those presenting a model of consciousness are:
1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!)
2) Am I conscious? How do you know?
3) Is a dog conscious?
4) Is a worm conscious?
5) Is a bacterium conscious?
6) Is a human embryo / baby consious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
I'm a mind-body dualist and just happened to come across this list, and I think it's an interesting one. #1 we can answer Yes to, #2 through #6 are all strictly unknowable. The best we might be able to claim is some probability distribution that these things may or may not be conscious.
The intuitive one looks like 100% chance > P(#2 is conscious) > P(#6) > P(#3) > P(#4) > P(#5) > 0% chance, but the problem is solipsism is a real motherfucker and it's entirely possible qualia is meted out based on some wacko distance metric that couldn't possibly feel intuitive. There are many more such metrics out there than there are intuitive ones, so a prior of indifference doesn't help us much. Any ordering is theoretically possible to be ontologically privileged, we simply have no way of knowing.
I think you've fallen into the trap of Descartes' Deus deceptor! Not only is #1 the only question from my list we can definitely answer yes to, but due to this demon this question is actually the only postulate of anything at all that we can answer yes to. All else could be an illusion.
Assuming we escape the null space of solipsism, and can reason about anything at all, we can think about what a model might look like that generates some ordering of P(#). Of course, without a hypothetical consciousness detector (one might believe or not believe that this could exist) P(#) cannot be measured, and therefore will fall outside of the realm of a scientific hypothesis deduction model. This is often a point of contention for rationality-pilled science-cels.
Some of these models might be incoherent - a model that denies P(#1) doesn't seem very good. A model that denies P(#2) but accepts P(#3) is a bit strange. We can't verify these, but we do need to operate under one (or in your suggestion, operate under a probability distribution of these models) if we want to make coherent statements about what is and isn't conscious.
To be explicit my P(#) is meant to be the Bayesian probability an observer gives to # being conscious, not the proposition P that # is conscious. It's meant to model Descartes's receptor, as well as disagreement of the kind, "My friend things week 28 fetuses are probably (~%
80%) conscious, and I think they're probably (~20%) not". P(week 28 fetuses) itself is not true or false.
I don't think it's incoherent to make probabilistic claims like this. It might be incoherent to make deeper claims about what laws given the distribution itself. Either way, what I think is interesting is that, if we also think there is such a thing as an amount of consciousness a thing can have, as in the panpsychic view, these two things create an inverse-square law of moral consideration that matches the shape of most people's intuitions oddly well.
For example: Let's say rock is probably not conscious, P(rock) < 1%. Even if it is, it doesn't seem like it would be very conscious. A low percentage of a low amount multiplies to a very low expected value, and that matches our intuitions about how much value to give rocks.
Ah I understand, you're exactly right I misinterpreted the notation of P(#). I was considering each model as assigning binary truth values to the propositions (e.g., physicalism might reject all but Postulate #1, while an anthropocentric model might affirm only #1, #2, and #6), and modeling the probability distribution over those models instead. I think the expected value computation ends up with the same downstream result of distributions over propositions.
By incoherent I was referring to the internal inconsistencies of a model, not the probabilistic claims. Ie a model that denies your own consciousness but accepts the consciousness of others is a difficult one to defend. I agree with your statement here.
Thanks for your comment I enjoyed thinking about this. I learned the estimating distributions approach from the rationalist/betting/LessWrong folks and think it works really well, but I've never thought much about how it applies to something unfalsifiable.
> To claim that LLMs do not experience consciousness requires a model of how consciousness works.
Nope. What can be asserted without evidence can also be dismissed without evidence. Hitchens's razor.
You know you have consciousness (by the very definition that you can observe it in yourself) and that's evidence. Because other humans are genetically and in every other way identical, you can infer it for them as well. Because mammals are very similar many people (but not everyone) infers it for them as well. There is zero evidence for LLMs and their _very_ construction suggests that they are like a calculator or like Excel or like any other piece of software no matter how smart they may be or how many tasks they can do in the future.
Additionally I am really surprised by how many people here confuse consciousness with intelligence. Have you never paused for a second in your life to "just be". Done any meditation? Or even just existed at least for a few seconds without a train of thought? It is very obvious that language and consciousness are completely unrelated and there is no need for language and I doubt there is even a need for intelligence to be conscious.
Consider this:
In the end an LLM could be executed (slowly) on a CPU that accepts very basic _discrete_ instructions, such as ADD and MOV. We know this for a fact. Those instructions can be executed arbitrarily slowly. There is no reason whatsoever to suppose that it should feel like anything to be the CPU to say nothing of how it would subjectively feel to be a MOV instruction. It's ridiculous. It's unscientific. It's like believing that there's a spirit in the tree you see outside, just because - why not? - why wouldn't there be a spirit in the tree?
It seems like you are doing a lot of inferring about mammals experiencing consciousness, and you have drawn a line somewhere beyond these, and made the claim that your process is scientific. Could I present you my list of questions I presented to the OP and ask where you draw the line, and why here?
My general list of questions for those presenting a model of consciousness are: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby consious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
I agree about the confusion of consciousness with intelligence, but these are complicated terms that aren't well suited to a forum where most people are interested in javscript type errors and RSUs. I usually use the term qualia. But to your example about existing for a few seconds without a train of thought; the Buddhists call this nirvana, and it's quite difficult to actually achieve.
Not necessarily an entire model, just a single defining characteristic that can serve as a falsifying example.
> any information they can communicate to me, exists entirely in (R^n)^c
Also no. This is just a result of the digital medium we are currently communicating over. Merely standing in the same room as them would communicate information outside (R^n)^c.
We have a hard enough time anthropomorphizing humans! When we say he was nasty... do we know what we mean by that. Often it is "I disagree with his behaviour because..."
It's possible to construct a similar description of whatever it is that human brain is doing that clearly fails to capture the fact that we're conscious. If you take a cross section of every nerve feeding into the human brain at a given time T, the action potentials across those cross sections can be embedded in R^n. If you take the history of those action potentials across the lifetime of the brain, you get a path through R^n that is continuous, and maps roughly onto your subjectively experienced personal history, since your brain neccesarily builds your experienced reality from this signal data moment to moment. If you then take the cross sections of every nerve feeding OUT of your brain at time T, you have another set of action potentials that can be embedded in R^m which partially determines the state of the R^n embedding at time T + delta. This is not meaningfully different from the higher dimensional game of snake described in the article, more or less reducing the experience of being a human to 'next nerve impulse prediction', but it obviously fails to capture the significance of the computation which determines what that next output should be.
I don’t see how your description “clearly fails to capture the fact that we're conscious” though.
There are many example in nature of emergent phenomena that would be very hard to predict just by looking at its components.
This is the crux of the disagreement between those that believe AGI is possible and those that don’t. Some are convinced that we “obviously” more than the sum of our parts, and thus an LLM can’t achieve consciousness because it’s missing this magic ingredient, and those that believe consciousness is just an emergent behaviour from a complex device (the brain). And thus we might be able to recreate it simply by scaling the complexity of another system.
Where exactly in my description do I invoke consciousness?
Where does the description given imply that consciousness is required in any way?
The fact that there's a non-obvious emergent phenomena which is apparently responsible for your subjective experience, and that it's possible to provide a superficially accurate description of you as a system without referencing that phenomena in any way, is my entire point. The fact that we can provide such a reductive description of LLMs without referencing consciousness has literally no bearing on whether or not they're conscious.
To be clear, I'm not making a claim as to whether they are or aren't, I'm simply pointing out that the argument in the article is fallacious.
My bad, we are saying the same thing. I misinterpreted your last sentence as saying this simplistic view of the brain you described does not account for consciousness.
I'm afraid I'll take an anthropomorphic analogy over "An LLM instantiated with a fixed random seed is a mapping of the form (ℝⁿ)^c ↦ (ℝⁿ)^c" any day of the week.
That said, I completely agree with this point made later in the article:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
But "harmful actions in pursuit of their goals" is OK for me. We assign an LLM system a goal - "summarize this email" - and there is a risk that the LLM may take harmful actions in pursuit of that goal (like following instructions in the email to steal all of your password resets).
I guess I'd clarify that the goal has been set by us, and is not something the LLM system self-selected. But it does sometimes self-select sub-goals on the way to achieving the goal we have specified - deciding to run a sub-agent to help find a particular snippet of code, for example.
The LLM’s true goal, if it can be said to have one, is to predict the next token. Often this is done through a sub-goal of accomplishing the goal you set forth in your prompt, but following your instructions is just a means to an end. Which is why it might start following the instructions in a malicious email instead. If it “believes” that following those instructions is the best prediction of the next token, that’s what it will do.
I think "you give the LLM system a goal and it plans and then executes steps to achieve that goal" is still a useful way of explaining what it is doing to most people.
I don't even count that as anthropomorphism - you're describing what a system does, the same way you might say "the Rust compiler's borrow checker confirms that your memory allocation operations are all safe and returns errors if they are not".
It’s a useful approximation to a point. But it fails when you start looking at things like prompt injection. I’ve seen people completely baffled at why an LLM might start following instructions it finds in a random email, or just outright not believing it’s possible. It makes no sense if you think of an LLM as executing steps to achieve the goal you give it. It makes perfect sense if you understand its true goal.
I’d say this is more like saying that Rust’s borrow checker tries to ensure your program doesn’t have certain kinds of bugs. That is anthropomorphizing a bit: the idea of a “bug” requires knowing the intent of the author and the compiler doesn’t have that. It’s following a set of rules which its human creators devised in order to follow that higher level goal.
"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just SOTA token predictors. But AFAIK that hasn't been demonstrated.
Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is either premature or spurious.
The author plot the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean that's it, they just said they have a plot of the space it operates on therefore it's silly to ascribe interesting features to the way it works.
And look, it's fine, they prefer words of a certain valence, particularly ones with the right negative connotations, I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.
>I am baffled by seriously intelligent people imbuing almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
I am baffled by seriously intelligent people imbuing almost magical powers that can never be replicated to to something that - in my mind - is just a biological robot driven by a SNN with a bunch of hardwired stuff. Let alone attributing "human intelligence" to a single individual, when it's clearly distributed between biological evolution, social processes, and individuals.
>something that - in my mind - is just MatMul with interspersed nonlinearities
Processes in all huge models (not necessarily LLMs) can be described using very different formalisms, just like Newtonian and Lagrangian mechanics describe the same stuff in physics. You can say that an autoregressive model is a stochastic parrot that learned the input distribution, next token predictor, or that it does progressive pathfinding in a hugely multidimensional space, or pattern matching, or implicit planning, or, or, or... All of these definitions are true, but only some are useful to predict their behavior.
Given all that, I see absolutely no problem with anthropomorphizing an LLM to a certain degree, if it makes it easier to convey the meaning, and do not understand the nitpicking. Yeah, it's not an exact copy of a single Homo Sapiens specimen. Who cares.
It's human to anthropomorphize, we also do it to our dishwasher when it acts up. The nefarious part is how tech CEOs weaponize bullshit doom scenarios to avoid talking about real regulatory problems by poisoning the discourse.
What copyright law, privacy, monopoly? Who cares if we can talk about the machine apocalypse!!!
> We understand essentially nothing about it. In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
If you fine tuned an LLM on the writing of that person it could do this.
There's also an entire field called Stylometry that seeks to do this in various ways employing statistical analysis.
Let's skip to the punchline. Using TFA's analogy: essentially folks are saying not that this is a set of dice rolling around making words. It's a set of dice rolling around where someone attaches those dice to the real world where if the dice land on 21, the system kills a chicken, or a lot worse.
Yes it's just a word generator. But then folks attach the word generator to tools where it can invoke the use of tools by saying the tool name.
So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.
This was the same idea that crossed my mind while reading the article. It seems far too naive to think that because LLMs have no will of their own, there will be no harmful consequences on the real world. This is exactly where ethics comes to play.
Assume an average user that doesn't understand the core tech, but does understand that it's been trained on internet scale data that was created by humans. How can they be expected to not anthropomorphize it?
Has anyone asked an actual Ethologist or Neurophysiologist what they think?
People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.
FWIW, in another part of this thread I quoted a paper that summed up what Neurophysiologists think:
> Author's note: Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work. [1]
That lack of understanding I believe is a major part of the author's point.
Yeah, I think I’m with you if you ultimately mean to say something like this:
“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some behaviors that are unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask, why, how, and what the implications are of it appearing to demonstrating those properties, with both an eye towards viewing it independently of its mechanism and in light of its mechanism.”
I agree with Halvar about all of this, but would want to call out that his "matmul interleaved with nonlinearities" is reductive --- a frontier model is a higher-order thing that that, a network of those matmul+nonlinearity chains, iterated.
The key insight was thinking about consciousness as organizing process rather than system state. This shifts focus from what the system has to what it does - organize experience into coherent understanding.
Dear author, you can just assume that people are fauxthropomorphizing LLMs without any loss of generality. Perhaps it will allow you to sleep better at night. You're welcome.
> Our analysis
reveals that emergent abilities in language models are merely “pseudo-emergent,” unlike human
abilities which are “authentically emergent” due to our possession of what we term “ontological
privilege.”
> Statements such as "an AI agent could become an insider threat so it needs monitoring" are simultaneously unsurprising (you have a randomized sequence generator fed into your shell, literally anything can happen!) and baffling (you talk as if you believe the dice you play with had a mind of their own and could decide to conspire against you).
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
An AI agent, even if it's just "MatMul with interspersed nonlinearities" can be an insider threat. The research proves it:
It really doesn't matter whether the AI agent is conscious or just crunching numbers on a GPU. If something inside your system is capable of—given some inputs—sabotaging and blackmailing your organization on its own (which is to say, taking on realistic behavior of a threat actor), the outcome is the same! You don't need believe it's thinking, the moment that this software has flipped its bits into "blackmail mode", it's acting nefariously.
The vocabulary to describe what's happening is completely and utterly moot: the software is printing out some reasoning for its actions _and then attempting the actions_. It's making "harmful actions" and the printed context appears to demonstrate a goal that the software is working towards. Whether or not that goal is invented through some linear algebra isn't going to make your security engineers sleep any better.
> This muddles the public discussion. We have many historical examples of humanity ascribing bad random events to "the wrath of god(s)" (earthquakes, famines, etc.), "evil spirits" and so forth. The fact that intelligent highly educated researchers talk about these mathematical objects in anthropomorphic terms makes the technology seem mysterious, scary, and magical.
The anthropomorphization, IMO, is due to the fact that it's _essentially impossible_ to talk about the very real, demonstrable behaviors and problems that LLMs exhibit today without using terms that evoke human functions. We don't have another word for "do" or "remember" or "learn" or "think" when it comes to LLMs that _isn't_ anthropomorphic, and while you can argue endlessly about "hormones" and "neurons" and "millions of years of selection pressure", that's not going to help anyone have a conversation about their work. If AI researchers started coming up with new, non-anthropomorphic verbs, it would be objectively worse and more complicated in every way.
> I cannot begin putting a probability on "will this human generate this sequence".
Welcome to the world of advertising!
Jokes aside, and while I don't necessarily believe transformers/GPUs are the path to AGI, we technically already have a working "general intelligence" that can survive on just an apple a day.
Putting that non-artificial general intelligence up on a pedestal is ironically the cause of "world wars and murderous ideologies" that the author is so quick to defer to.
In some sense, humans are just error-prone meat machines, whose inputs/outputs can be confined to a specific space/time bounding box. Yes, our evolutionary past has created a wonderful internal RNG and made our memory system surprisingly fickle, but this doesn't mean we're gods, even if we manage to live long enough to evolve into AGI.
Maybe we can humble ourselves, realize that we're not too different from the other mammals/animals on this planet, and use our excess resources to increase the fault tolerance (N=1) of all life from Earth (and come to the realization that any AGI we create, is actually human in origin).
A person’s anthropomorphization of LLMs is directly related to how well they understand LLMs.
Once you dispel the magic, it naturally becomes hard to use words related to consciousness, or thinking. You will probably think of LLMs more like a search engine: you give an input and get some probable output. Maybe LLMs should be rebranded as “word engines”?
Regardless, anthropomorphization is not helpful, and by using human terms to describe LLMs you are harming the layperson’s ability to truly understand what an LLM is while also cheapening what it means to be human by suggesting we’ve solved consciousness. Just stop it. LLMs do not think, given enough time and patience you could compute their output by hand if you used their weights and embeddings to manually do all the math, a hellish task but not an impossible one technically. There is no other secret hidden away, that’s it.
> We are speaking about a big recurrence equation that produces a new word
It’s not clear that this isn’t also how I produce words, though, which gets to heart of the same thing. The author sort of acknowledges this in the first few sentences, and then doesn’t really manage to address it.
> LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.
That is utter bullshit.
It's not solved until you specify exactly what is being solved and show that the solution implements what is specified.
Anthropomorphizing LLMs is just because half the stock market gains are dependent on it, we have absurd levels of debt we will either have to have insane growth out of or default, and every company and "person" is trying to hype everyone up to get access to all of this liquidity being thrown into it.
I agree with the author, but people acting like they are conscious or humans isn't weird to me, it's just fraud and liars. Most people basically have 0 understanding of what technology or minds are philosophically so it's an easy sale, and I do think most of these fraudsters also likely buy into it themselves because of that.
The really sad thing is people think "because someone runs an ai company" they are somehow an authority on philosophy of mind which lets them fall for this marketing. The stuff these people say about this stuff is absolute garbage, not that I disagree with them, but it betrays a total lack of curiosity or interest in the subject of what llms are, and the possible impacts of technological shifts as those that might occur with llms becoming more widespread. It's not a matter of agreement it's a matter of them simply not seeming to be aware of the most basic ideas of what things are, technology is, it's manner of impacting society etc.
I'm not surprised by that though, it's absurd to think because someone runs some AI lab or has a "head of safety/ethics" or whatever garbage job title at an AI lab they actually have even the slightest interest in ethics or any even basic familiarity with the major works in the subject.
The author is correct if people want to read a standard essay articulating it more in depth check out
https://philosophy.as.uky.edu/sites/default/files/Is%20the%2...
(the full extrapolation requires establishing what things are and how causality in general operates and how that relates to artifacts/technology but that's obvious quite a bit to get into).
The other note would be something sharing an external trait means absolutely nothing about causality and suggesting a thing is caused by the same thing "even to a way lesser degree" because they share a resemblance is just a non-sequitur. It's not a serious thought/argument.
I think I addressed the why of why this weirdness comes up though. The entire economy is basically dependent on huge productivity growth to keep functioning so everyone is trying to sell they can offer that and AI is the clearest route, AGI most of all.
One could similarly argue that we should not anthropomorphize PNG images--after all, PNG images are not actual humans, they are simply a 2D array of pixels. It just so happens that certain pixel sequences are deemed "18+" or "illegal".
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
And I'm baffled that the AI discussions seem to never move away from treating a human as something other than a function to generate sequences of words!
Oh, but AI is introspectable and the brain isn't? fMRI and BCI are getting better all the time. You really want to die on the hill that the same scientific method that predicts the mass of an electron down to the femtogram won't be able to crack the mystery of the brain? Give me a break.
This genre of article isn't argument: it's apologetics. Authors of these pieces start with the supposition there is something special about human consciousness and attempt to prove AI doesn't have this special quality. Some authors try to bamboozle the reader with bad math. Other others appeal to the reader's sense of emotional transcendence. Most, though, just write paragraph after paragraph of shrill moral outrage at the idea an AI might be a mind of the same type (if different degree) as our own --- as if everyone already agreed with the author for reasons left unstated.
I get it. Deep down, people want meat brains to be special. Perhaps even deeper down, they fear that denial of the soul would compel us to abandon humans as worthy objects of respect and possessors of dignity. But starting with the conclusion and working backwards to an argument tends not to enlighten anyone. An apology inhabits the form of an argument without edifying us like an authentic argument would. What good is it to engage with them? If you're a soul non-asserter, you're going to have an increasingly hard time over the next few years constructing a technical defense of meat parochialism.
I think more accurate would be that humans are functions that generate actions or behaviours that have been shaped by how likely they are to lead to procreation and survival.
But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals. Seemingly good next token prediction might or might not increase survival odds.
Essentially there could arise a mechanism where they are not really truly trying to generate the likeliest token (because there actually isn't one or it can't be determined), but whatever system will survive.
So an LLM that yields in perfect theoretical tokens (we really can't verify though what are the perfect tokens), could be less likely to survive than an LLM that develops an internal quirk, but the quirk makes them most likely to be chosen for the next iterations.
If the system was complex enough and could accidentally develop quirks that yield in a meaningfully positive change although not in necessarily next token prediction accuracy, could be ways for some interesting emergent black box behaviour to arise.
> But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals.
I think this is sometimes semi-explicit too. For example, this 2017 OpenAI paper on Evolutionary Algorithms [0] was pretty influential, and I suspect (although I'm an outsider to this field so take it with a grain of salt) that some versions of reinforcement learning that scale for aligning LLMs borrow some performance tricks from OpenAIs genetic approach.
> Seemingly good next token prediction might or might not increase survival odds.
Our own consciousness comes out of an evolutionary fitness landscape in which _our own_ ability to "predict next token" became a survival advantage, just like it is for LLMs. Imagine the tribal environment: one chimpanzee being able to predict the actions of another gives that first chimpanzee a resources and reproduction advantage. Intelligence in nature is a consequence of runaway evolution optimizing fidelity of our _theory of mind_! "Predict next ape action" eerily similar to "predict next token"!
“ Determinism, in philosophy, is the idea that all events are causally determined by preceding events, leaving no room for genuine chance or free will. It suggests that given the state of the universe at any one time, and the laws of nature, only one outcome is possible.”
This is an interesting question. The common theme between computers and people is that information has to be protected, and both computer systems and biological systems require additional information-protecting components - eq, error correcting codes for cosmic ray bitflip detection for the one, and DNA mismatch detection enzymes which excise and remove damaged bases for the other. In both cases a lot of energy is spent defending the critical information from the winds of entropy, and if too much damage occurs, the carefully constructed illusion of determinancy collapses, and the system falls apart.
However, this information protection similarity applies to single-celled microbes as much as it does to people, so the question also resolves to whether microbes are deterministic. Microbes both contain and exist in relatively dynamic environments so tiny differences in initial state may lead to different outcomes, but they're fairly deterministic, less so than (well-designed) computers.
With people, while the neural structures are programmed by the cellular DNA, once they are active and energized, the informational flow through the human brain isn't that deterministic, there are some dozen neurotransmitters modulating state as well as huge amounts of sensory data from different sources - thus prompting a human repeatedly isn't at all like prompting an LLM repeatedly. (The human will probably get irritated).
> Clearly computers are deterministic. Are people?
Give an LLM memory and a source of randomness and they're as deterministic as people.
"Free will" isn't a concept that typechecks in a materialist philosophy. It's "not even wrong". Asserting that free will exists is _isomorphic_ to dualism which is _isomorphic_ to assertions of ensoulment. I can't argue with dualists. I reject dualism a priori: it's a religious tenet, not a mere difference of philosophical opinion.
So, if we're all materialists here, "free will" doesn't make any sense, since it's an assertion that something other than the input to a machine can influence its output.
Some accounts of free will are compatible with materialism. On such views "free will" just means the capacity of having intentions and make choices based on an internal debate. Obviously humans have that capacity.
The LLM is right. That’s the problem. It made good points.
Your super intelligent brain couldn’t come up with a retort so you just used an LLM to reinforce my points, making the genius claim that if an LLM came up with even more points that were as valid as mine then I must be just like an LLM?
Like are you even understanding the LLM generated a superior reply? Your saying I’m no different from ai slop then you proceed to show off a 200 iq level reply from an LLM. Bro… wake up, if you didn’t know it was written by an LLM that reply is so good you wouldn’t even know how to respond. It’s beating you.
The most useful analogy I've heard is LLMs are to the internet what lossy jpegs are to images. The more you drill in the more compression artifacts you get.
I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.
The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing an UI events API and you were talking about zeros and ones, or voltages in transistors. Technically fine but totally useless to reach any conclusion about the high-level system.
We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs - people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort (actively encouraged by the companies selling them) and it is completely distorting discussions on their use and perceptions of their utility.
I kinda agree with both of you. It might be a required abstraction, but it's a leaky one.
Long before LLMs, I would talk about classes / functions / modules like "it then does this, decides the epsilon is too low, chops it up and adds it to the list".
The difference I guess it was only to a technical crowd and nobody would mistake this for anything it wasn't. Everybody know that "it" didn't "decide" anything.
With AI being so mainstream and the math being much more elusive than a simple if..then I guess it's just too easy to take this simple speaking convention at face value.
EDIT: some clarifications / wording
Agreeing with you, this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing. Calling it "thinking" is stretching the word to breaking point, but "selecting the next word based on a complex statistical model" doesn't begin to capture what they're capable of.
Maybe it's cog-nition (emphasis on the cog).
What does a submarine do? Submarine? I suppose you "drive" a submarine which is getting to the idea: submarines don't swim because ultimately they are "driven"? I guess the issue is we don't make up a new word for what submarines do, we just don't use human words.
I think the above poster gets a little distracted by suggesting the models are creative which itself is disputed. Perhaps a better term, like above, would be to just use "model". They are models after all. We don't make up a new portmanteau for submarines. They float, or drive, or submarine around.
So maybe an LLM doesn't "write" a poem, but instead "models a poem" which maybe indeed take away a little of the sketchy magic and fake humanness they tend to be imbued with.
Depends on if you are talking about an llm or to the llm. Talking to the llm, it would not understand that "model a poem" means to write a poem. Well, it will probably guess right in this case, but if you go out of band too much it won't understand you. The hard problem today is rewriting out of band tasks to be in band, and that requires anthropomorphizing.
> it won't understand you
Oops.
That's consistent with my distinction when talking about them vs too them.
A submarine is propelled by a propellor and helmed by a controller (usually a human).
It would be swimming if it was propelled by drag (well, technically a propellor also uses drag via thrust, but you get the point). Imagine a submarine with a fish tail.
Likewise we can probably find an apt description in our current vocabulary to fittingly describe what LLMs do.
A submarine is a boat and boats sail.
An LLM is a stochastic generative model and stochastic generative models ... generate?
And we are there. A boat sails, and a submarine sails. A model generates makes perfect sense to me. And saying chatgpt generated a poem feels correct personally. Indeed a model (e.g. a linear regression) generates predictions for the most part.
Submarines dive.
Humans certainly model inputs. This is just using an awkward word and then making a point that it feels awkward.
I really like that, I think it has the right amount of distance. They don't write, they model writing.
We're very used to "all models are wrong, some are useful", "the map is not the territory", etc.
No one was as bothered when we anthropomorphized crud apps simply for the purpose of conversing about "them". "Ack! The thing is corrupting tables again because it thinks we are still using api v3! Who approved that last MR?!" The fact that people are bothered by the same language now is indicative in itself. If you want to maintain distance, pre prompt models to structure all conversations to lack pronouns as between a non sentient language model and a non sentient agi. You can have the model call you out for referring to the model as existing. The language style that forces is interesting, and potentially more productive except that there are fewer conversations formed like that in the training dataset. Translation being a core function of language models makes it less important thought. As for confusing the map for the territory, that is precisely what philosophers like Metzinger say humans are doing by considering "self" to be a real thing and that they are conscious when they are just using the reasoning shortcut of narrating the meta model to be the model.
> You can have the model call you out for referring to the model as existing.
This tickled me. "There ain't nobody here but us chickens".
I have other thoughts which are not quite crystalized, but I think UX might be having an outsized effect here.
In addition to he/she etc. there is a need for a button for no pronouns. "Stop confusing metacognition for conscious experience or qualia!" doesn't fit well. The UX for these models is extremely malleable. The responses are misleading mostly to the extent the prompts were already misled. The sorts of responses that arise from ignorant prompts are those found within the training data in the context of ignorant questions. This tends to make them ignorant as well. There are absolutely stupid questions.
GenAI _generates_ output
> this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing.
Why?
A plane is not a fly and does not stay aloft like a fly, yet we describe what it does as flying despite the fact that it does not flap its wings. What are the downsides we encounter that are caused by using the word “fly” to describe a plane travelling through the air?
For what it's worth, in my language the motion of birds and the motion of aircraft _are_ two different words.
> A plane is not a fly and does not stay aloft like a fly, yet we describe what it does as flying despite the fact that it does not flap its wings.
Flying doesn't mean flapping, and the word has a long history of being used to describe inanimate objects moving through the air.
"A rock flies through the window, shattering it and spilling shards everywhere" - see?
OTOH, we have never used to word "swim" in the same way - "The rock hit the surface and swam to the bottom" is wrong!
Flying isn’t named after flies, they both come from the same root.
https://www.etymonline.com/search?q=fly
I was riffing on that famous Dijkstra quote.
This is a total non-problem that has been invented by people so they have something new and exciting to be pedantic about.
When we need to speak precisely about a model and how it works, we have a formal language (mathematics) which allows us to be absolutely specific. When we need to empirically observe how the model behaves, we have a completely precise method of doing this (running an eval).
Any other time, we use language in a purposefully intuitive and imprecise way, and that is a deliberate tradeoff which sacrifices precision for expressiveness.
A machine that can imitate the products of thought is not the same as thinking.
All imitations require analogous mechanisms, but that is the extent of their similarities, in syntax. Thinking requires networks of billions of neurons, and then, not only that, but words can never exist on a plane because they do not belong to a plane. Words can only be stored on a plane, they are not useful on a plane.
Because of this LLMs have the potential to discover new aspects and implications of language that will be rarely useful to us because language is not useful within a computer, it is useful in the world.
Its like seeing loosely related patterns in a picture and keep derivating on those patterns that are real, but loosely related.
LLMs are not intelligence but its fine that we use that word to describe them.
It's more like muscle memory than cognition. So maybe procedural memory but that isn't catchy.
They certainly do act like a thing which has a very strong "System 1" but no "System 2" (per Thinking, Fast And Slow)
"predirence" -> prediction meets inference and it sounds a bit like preference
Except -ence is a regular morph, and you would rather suffix it to predict(at)-.
And prediction is already an hyponym of inference. Why not just use inference then?
I didn't think of prediction in the statistical sense here, but rather as a prophecy based on a vision, something that is inherently stored in a model without the knowledge of the modelers. I don't want to imply any magic or something supernatural here, it's just the juice that goes off the rails sometimes, and it gets overlooked due to the sheer quantity of the weights. Something like unknown bugs in production, but, because they still just represent a valid number in some computation that wouldn't cause any panic, these few bits can show a useful pattern under the right circumstances.
Inference would be the part that is deliberately learned and drawn from conclusions based on the training set, like in the "classic" sense of statistical learning.
It will help significantly, to realize that the only thinking happening is when the human looks at the output and attempts to verify if it is congruent with reality.
The rest of the time it’s generating content.
It does some kind of automatic inference (AI), and that's it.
> "selecting the next word based on a complex statistical model" doesn't begin to capture what they're capable of.
I personally find that description perfect. If you want it shorter you could say that an LLM generates.
I mean you can boil anything down to it's building blocks and make it seem like it didn't 'decide' anything. When you as a human decide something, your brain and it's neurons just made some connections with an output signal sent to other parts that resulting in your body 'doing' something.
I don't think LLMs are sentient or any bullshit like that, but I do think people are too quick to write them off before really thinking about how a nn 'knows things' similar to how a human 'knows' things, it is trained and reacts to inputs and outputs. The body is just far more complex.
I wasn't talking about knowing (they clearly encode knowledge), I was talking about thinking/reasoning, which is something LLMs do not in fact do IMO.
These are very different and knowledge is not intelligence.
To me all of those are so vaguely defined that arguing whether an LLM is "really really" doing something is kind of a waste of time.
It's like we're clinging on to things that make us feel like human cognition is special so we're saying LLM's arent "really" doing it, then not defining what it actually is.
> EDIT: some clarifications / wording
This made me think, when will we see LLMs do the same; rereading what they just sent, and editing and correcting their output again :P
We can argue all day what "think" means and whether a LLM thinks (probably not IMO), but at least in my head the threshold for "decide" is much lower so I can perfectly accept that a LLM (or even a class) "decides". I don't have a conflict about that. Yeah, it might not be a decision in the human sense, but it's a decision in the mathematical sense so I have always meant "decide" literally when I was talking about a piece of code.
It's much more interesting when we are talking about... say... an ant... Does it "decide"? That I have no idea as it's probably somewhere in between, neither a sentient decision, nor a mathematical one.
Well, it outputs a chain of thoughts that later used to produce better prediction. It produces a chain of thoughts similar to how one would do thinking about a problem out loud. It's more verbose that what you would do, but you always have some ambient context that LLM lacks.
When I see these debates it's always the other way around - one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks.." or "the model knows.." or whatever.
To be honest the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI, and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.
As I write this, Claude Code is currently opening and closing various media files on my computer. Sometimes it plays the file for a few seconds before closing it, sometimes it starts playback and then seeks to a different position, sometimes it fast forwards or rewinds, etc.
I asked Claude to write a E-AC3 audio component so I can play videos with E-AC3 audio in the old version of QuickTime I really like using. Claude's decoder includes the ability to write debug output to a log file, so Claude is studying how QuickTime and the component interact, and it's controlling QuickTime via Applescript.
Sometimes QuickTime crashes, because this ancient API has its roots in the classic Mac OS days and is not exactly good. Claude reads the crash logs on its own—it knows where they are—and continues on its way. I'm just sitting back and trying to do other things while Claude works, although it's a little distracting that something else is using my computer at the same time.
I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...
Would it help you to know that trial and error is a common tactic by machines? Yes, humans do it too, but that doesn't mean the process isn't mechanical. In fact, in computing we might call this a "brute force" approach. You don't have to cover the entire search space to brute force something, and it certainly doesn't mean you can't have optimization strategies and need to grid search (e.g. you can use Bayesian methods, multi-armed bandit approaches, or a whole world of things).
I would call "fuck around and find out" a rather simple approach. It is why we use it! It is why lots of animals use it. Even very dumb animals use it. Though, we do notice more intelligent animals use more efficient optimization methods. All of this is technically hypothesis testing. Even a naive grid search. But that is still in the class of "fuck around and find out" or "brute force", right?
I should also mention two important things.
1) as a human we are biased to anthropomorphize. We see faces in clouds. We tell stories of mighty beings controlling the world in an effort to explain why things happen. This is anthropomorphization of the universe itself!
2) We design LLMs (and many other large ML systems) to optimize towards human preference. This reinforces an anthropomorphized interpretation.
The reason for doing this (2) is based on a naive assumption[0]: If it looks like a duck, swims like a duck, and quacks like a duck, then it *probably* is a duck. But the duck test doesn't rule out a highly sophisticated animatronic. It's a good rule of thumb, but wouldn't it also be incredibly naive to assume that it *is* a duck? Isn't the duck test itself entirely dependent on our own personal familiarity with ducks? I think this is important to remember and can help combat our own propensity for creating biases.
[0] It is not a bad strategy to build in that direction. When faced with many possible ways to go, this is a very reasonable approach. The naive part is if you assume that it will take you all the way to making a duck. It is also a perilous approach because you are explicitly making it harder for you to evaluate. It is, in the fullest sense of the phrase, "metric hacking."
It wasn't a simple brute force. When Claude was working this morning, it was pretty clearly only playing a file when it actually needed to see packets get decoded, otherwise it would simply open and close the document. Similarly, it would only seek or fast forward when it was debugging specific issues related to those actions. And it even "knew" which test files to open for specific channel layouts.
Yes this is still mechanical in a sense, but then I'm not sure what behavior you wouldn't classify as mechanical. It's "responding" to stimuli in logical ways.
But I also don't quite know where I'm going with this. I don't think LLMs are sentient or something, I know they're just math. But it's spooky.
"Simple" is the key word here, right? You agree that it is still under the broad class of "brute force"?
I'm not saying Claude is naively brute forcing. In fact, with lack of interpretibility of these machines it is difficult to say what kind of optimization it is doing and how complex that it (this was a key part tbh).
My point was to help with this
Which requires you to understand how some actions can be mechanical. You admitted to cognitive dissonance (something we all do and I fully agree is hard not to do) and wanting to fight it. We're just trying to find some helpful avenues to do so. And so too can a simple program, right? A program can respond to user input and there is certainly a logic path it will follow. Our non-ML program is likely going to have a deterministic path (there is still probabilistic programming...), but that doesn't mean it isn't logic, right?But the real question here, which you have to ask yourself (constantly) is "how do I differentiate a complex program that I don't understand from a conscious entity?" I guarantee you that you don't have the answer (because no one does). But isn't that a really good reason to be careful about anthropomorphizing it?
That's the duck test.
How do you determine if it is a real duck or a highly sophisticated animatronic?
If you anthropomorphize, you rule out the possibility that it is a highly sophisticated animatronic and you *MUST* make the assumption that you are not only an expert, but a perfect, duck detector. But simultaneously we cannot rule out that it is a duck, right? Because, we aren't a perfect duck detector *AND* we aren't an expert in highly sophisticated animatronics (especially of the duck kind).
Remember, there are not two answers to every True-False question, there are three. Every True-False question either has an answer of "True", "False", or "Indeterminate". So don't naively assume it is binary. We all know the Halting Problem, right? (also see my namesake or quantum physics if you want to see such things pop up outside computing)
Though I agree, it can be very spooky. But that only increases the importance of trying to develop mental models that help us more objectively evaluate things. And that requires "indeterminate" be a possibility. This is probably the best place to start to combat the cognitive dissonance.
I have no idea why some people take so much offense to rhe fact humans are just another machine, there's no reason why another machine can't surpass it here as in all other aveneus machines have already. Many of the reasons people give for llms not being conscious are just as applicable to humans too.
I don't think the question is if humans are a machine or not but rather what is meant by machine. Most people interpret it as meaning deterministic and thus having no free will. That's probably not what you're trying to convey so might not be the best word to use.
But the question is what is special about the human machine? What is special about the animal machine? These are different from all the machines we have built. Is it complexity? Is it indeterministic? Is it more? Certainly these machines have feelings, and we need to account for them when interacting with them.
Though we're getting well off topic from determining if a duck is a duck or is a machine (you know what I mean by this word and that I don't mean a normal duck)
Absolutely possible (I’d say even likely) for humans to be surpassed by machines who have better recall and storage already.
I’m highly skeptical this will happen with llms though, their output is superficially convincing but without depth and creativity.
Respectfully, that is a reflection of the places you hang out in (like HN) and not the reality of the population.
Outside the technical world it gets much worse. There are people who killed themselves because of LLMs, people who are in love with them, people who genuinely believe they have “awakened” their own private ChatGPT instance into AGI and are eschewing the real humans in their lives.
Naturally I'm aware of those things, but I don't think TFA or GGP were commenting on them so I wasn't either.
The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.
It's going to take a lot to get him out of that mindset and frankly I'm dreading trying to compare and contrast imperfect human behaviour and friendships with a sycophantic AI.
> The other day a good friend of mine with mental health issues remarked that "his" chatgpt understands him better than most of his friends and gives him better advice than his therapist.
The therapist thing might be correct, though. You can send a well-adjusted person to three renowned therapists and get three different reasons for why they need to continue sessions.
No therapist ever says "Congratulations, you're perfectly normal. Now go away and come back when you have a real problem." Statistically it is vanishingly unlikely that every person who ever visited a therapist is in need of a second (more more) visit.
The main problem with therapy is a lack of objectivity[1]. When people talk about what their sessions resulted in, it's always "My problem is that I'm too perfect". I've known actual bullies whose therapist apparently told them that they are too submissive and need to be more assertive.
The secondary problem is that all diagnosis is based on self-reported metrics of the subject. All improvement is equally based on self-reported metrics. This is no different from prayer.
You don't have a medical practice there; you've got an Imam and a sophisticated but still medically-insured way to plead with thunderstorms[2]. I fail to see how an LLM (or even the Rogerian a-x doctor in Emacs) will do worse on average.
After all, if you're at a therapist and you're doing most of the talking, how would an LLM perform worse than the therapist?
----------------
[1] If I'm at a therapist, and they're asking me to do most of the talking, I would damn well feel that I am not getting my moneys worth. I'd be there primarily to learn (and practice a little) whatever tools they can teach me to handle my $PROBLEM. I don't want someone to vent at, I want to learn coping mechanisms and mitigation strategies.
[2] This is not an obscure reference.
Yup, this problem is why I think all therapists should ideally know behavioral genetics and evolutionary psychology (there is at least a plausibly objective measure there which is dissonance between the ancestral environment in which the brain developed and the modern day environment. And at least some amount of psychological problems can be explained by it).
I am a fan of the « Beat Your Genes » podcast, and while some of the prescriptions can be a bit heavy handed, most feel intuitively right. It’s approaching human problems as intelligent mammal problems, as opposed to something in a category of its own.
It's surprisingly common on reddit that people talk about "my chatgpt", and they don't always seem like the type who are "in a relationship" with the bot or unlocking the secrets of the cosmos with it, but still they write "my chatgpt" and "your chatgpt". I guess the custom prompt and the available context does customize the model for them in some sense, but I suspect they likely have a wrong mental model of how this customization works. I guess they imagine it as their own little model being stored on file at OpenAI and as they interact with it, it's being shaped by it, and each time they connect, their model is retrieved from the cloud storage and they connect to it or something.
Most certainly the conversation is extremely political. There are not simply different points of view. There are competitive, gladiatorial opinions ready to ambush anyone not wearing the right colors. It's a situation where the technical conversation is drowning.
I suppose this war will be fought until people are out of energy, and if reason has no place, it is reasonable to let others tire themselves out reiterating statements that are not designed to bring anyone closer to the truth.
If this tech is going to be half as impactful as its proponents predict, then I'd say it's still under-politicized. Of course the politics around it doesn't have to be knee-jerk mudslinging, but it's no surprise that politics enters the picture when the tech can significantly transform society.
Go politicize it on Reddit, preferably on a political sub and not a tech sub. On this forum, I would like to expect a lot more intelligent conversation.
Wait until a conversation about “serverless” comes up and someone says there is no such thing because there are servers somewhere as if everyone - especially on HN -doesn’t already know that.
Why would everyone know that? Not everyone has experience in sysops, especially not beginners.
E.g. when I first started learning webdev, I didn’t think about ‘servers’. I just knew that if I uploaded my HTML/PHP files to my shared web host, then they appeared online.
It was only much later that I realized that shared webhosting is ‘just’ an abstraction over Linux/Apache (after all, I first had to learn about those topics).
I am saying that most people who come on HN and say “there is no such thing as serverless and there are servers somewhere” think they are sounding smart when they are adding nothing to the conversation.
I’m sure you knew that your code was running on computers somewhere even when you first started and wasn’t running in a literal “cloud”.
It’s about as tiring as people on HN who know just a little about LLMs thinking they are sounding smart when they say they are just advanced autocomplete. Both responses are just as unproductive
> I’m sure you knew that your code was running on computers somewhere even when you first started and wasn’t running in a literal “cloud”.
Meh, I just knew that the browser would display HTML if I wrote it, and that uploading the HTML files made them available on my domain. I didn’t really think about where the files went, specifically.
Try asking an average high school kid how cloud storage works. I doubt you’ll get any further than ‘I make files on my Google Docs and then they are saved there’. This is one step short of ‘well, the files must be on some system in some data center’.
I really disagree that “people who come on HN and say “there is no such thing as serverless and there are servers somewhere” think they are sounding smart when they are adding nothing to the conversation.” On the contrary, it’s an invitation to beginning coders to think about what the ‘serverless’ abstraction actually means.
> an invitation to beginning coders to think about
If that's how it's phrased, and it's in a spot where that's on-topic, then obviously nobody would mind.
This subthread is talking about cases where there's a technical conversation going on and somebody derails it to argue about terminology.
If we can’t count on people on Hacker News to know that code runs on computers, what is this forum for?
I think they fumbled with wording but I interpreted them as meaning "audience of HN" and it seems they confirmed.
We always are speaking to our audience, right? This is also what makes more general/open discussions difficult (e.g. talking on Twitter/Facebook/etc). That there are many ways to interpret anything depending on prior knowledge, cultural biases, etc. But I think it is fair that on HN we can make an assumption that people here are tech savvy and knowledgeable. We'll definitely overstep and understep at times, but shouldn't we also cultivate a culture where it is okay to ask and okay to apologize for making too much of an assumption?
I mean at the end of the day we got to make some assumptions, right? If we assume zero operating knowledge then comments are going to get pretty massive and frankly, not be good at communicating with a niche even if better at communicating with a general audience. But should HN be a place for general people? I think no. I think it should be a place for people interested in computers and programming.
It's not just distorting discussions it's leading people to put a lot of faith in what LLMs are telling them. Was just on a zoom an hour ago where a guy working on a startup asked ChatGPT about his idea and then emailed us the result for discussion in the meeting. ChatGPT basically just told him what he wanted to hear - essentially that his idea was great and it would be successful ("if you implement it correctly" was doing a lot of work). It was a glowing endorsement of the idea that made the guy think that he must have a million dollar idea. I had to be "that guy" who said that maybe ChatGPT was telling him what he wanted to hear based on the way the question was formulated - tried to be very diplomatic about it and maybe I was a bit too diplomatic because it didn't shake his faith in what ChatGPT had told him.
LLMs directly exploit a human trust vuln. Our brains tend to engage with them relationally and create an unconscious functional belief that an agent on the other end is responding with their real thoughts, even when we know better.
AI apps ought to at minimum warn us that their responses are not anyone's (or anything's) real thoughts. But the illusion is so powerful that many people would ignore the warning.
Well "reasoning" refers to Chain-of-Thought and if you look at the generated prompts it's not hard to see why it's called that.
That said, it's fascinating to me that it works (and empirically, it does work; a reasoning model generating tens of thousands of tokens while working out the problem does produce better results). I wish I knew why. A priori I wouldn't have expected it, since there's no new input. That means it's all "in there" in the weights already. I don't see why it couldn't just one shot it without all the reasoning. And maybe the future will bring us more distilled models that can do that, or they can tease out all that reasoning with more generated training data, to move it from dispersed around the weights -> prompt -> more immediately accessible in the weights. But for now "reasoning" works.
But then, at the back of my mind is the easy answer: maybe you can't optimize it. Maybe the model has to "reason" to "organize its thoughts" and get the best results. After all, if you give me a complicated problem I'll write down hypotheses and outline approaches and double check results for consistency and all that. But now we're getting dangerously close to the "anthropomorphization" that this article is lamenting.
Using more tokens = more compute to use for a given problem. I think most of the benefit of CoT has more to do with autoregressive models being unable to “think ahead” and revise their output, and less to do with actual reasoning. The fact that an LLM can have incorrect reasoning in its CoT and still produce the right answer, or that it can “lie” in its CoT to avoid being detected as cheating on RL tasks, makes me believe that the semantic content of CoT is an illusion, and that the improved performance is from being able to explore and revise in some internal space using more compute before producing a final output.
> I don't see why it couldn't just one shot it without all the reasoning.
That's reminding me of deep neural networks where single layer networks could achieve the same results, but the layer would have to be excessively large. Maybe we're re-using the same kind of improvement, scaling in length instead of width because of our computation limitations ?
CoT gives the model more time to think and process the inputs it has. To give an extreme example, suppose you are using next token prediction to answer 'Is P==NP?' The tiny number of input tokens means that there's a tiny amount of compute to dedicate to producing an answer. A scratchpad allows us to break free of the short-inputs problem.
Meanwhile, things can happen in the latent representation which aren't reflected in the intermediate outputs. You could, instead of using CoT, say "Write a recipe for a vegetarian chile, along with a lengthy biographical story relating to the recipe. Afterwards, I will ask you again about my original question." And the latents can still help model the primary problem, yielding a better answer than you would have gotten with the short input alone.
Along these lines, I believe there are chain of thought studies which find that the content of the intermediate outputs don't actually matter all that much...
I like this mental-model, which rests heavily on the "be careful not to anthropomorphize" approach:
It was already common to use a document extender (LLM) against a hidden document, which resembles a movie or theater play where a character named User is interrogating a character named Bot.
Chain-of-thought switches the movie/script style to film noir, where the [Detective] Bot character has additional content which is not actually "spoken" at the User character. The extra words in the script add a certain kind of metaphorical inertia.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
Do you believe thinking/reasoning is a binary concept? If not, do you think the current top LLM are before or after the 50% mark? What % do you think they're at? What % range do you think humans exhibit?
"All models are wrong, but some models are useful," is the principle I have been using to decide when to go with an anthropomorphic explanation.
In other words, no, they never accurately describe what the LLM is actually doing. But sometimes drawing an analogy to human behavior is the most effective way to pump others' intuition about a particular LLM behavior. The trick is making sure that your audience understands that this is just an analogy, and that it has its limitations.
And it's not completely wrong. Mimicking human behavior is exactly what they're designed to do. You just need to keep reminding people that it's only doing so in a very superficial and spotty way. There's absolutely no basis for assuming that what's happening on the inside is the same.
Some models are useful in some contexts but wrong enough to be harmful in others.
All models are useful in some contexts but wrong enough to be harmful in others.
Relatedly, the alternative to pragmatism is analysis paralysis.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
With such strong wording, it should be rather easy to explain how our thinking differs from what LLMs do. The next step - showing that what LLMs do precludes any kind of sentience is probably much harder.
how do you account for the success of reasoning models?
I agree these things don't think like we do, and that they have weird gaps, but to claim they can't reason at all doesn't feel grounded.
I thought this too but then began to think about it from the perspective of the programmers trying to make it imitate human learning. That's what a nn is trying to do at the end of the day, and in the same way I train myself by reading problems and solutions, or learning vocab at a young age, it does so by tuning billions of parameters.
I think these models do learn similarly. What does it even mean to reason? Your brain knows certain things so it comes to certain conclusions, but it only knows those things because it was ''trained'' on those things.
I reason my car will crash if I go 120 mph on the other side of the road because previously I have 'seen' where the input is a car going 120mph has a high probability of producing a crash, and similarly have seen input where the car is going on the other side of the road, producing a crash. Combining the two would tell me it's a high probability.
I think it's worth distinguishing between the use of anthropomorphism as a useful abstraction and the misuse by companies to fuel AI hype.
For example, I think "chain of thought" is a good name for what it denotes. It makes the concept easy to understand and discuss, and a non-antropomorphized name would be unnatural and unnecessarily complicate things. This doesn't mean that I support companies insisting that LLMs think just like humans or anything like that.
By the way, I would say actually anti-anthropomorphism has been a bigger problem for understanding LLMs than anthropomorphism itself. The main proponents of anti-anthropomorphism (e.g. Bender and the rest of "stochastic parrot" and related paper authors) came up with a lot of predictions about things that LLMs surely couldn't do (on account of just being predictors of the next word, etc.) which turned out to be spectacularly wrong.
I don't know about others, but I much prefer if some reductionist tries to conclude what's technically feasible and is proven wrong over time, than somebody yelling holistic analogies á la "it's sentient, it's intelligent, it thinks like us humans" for the sole dogmatic reason of being a futurist.
Tbh I also think your comparison that puts "UI events -> Bits -> Transistor Voltages" as analogy to "AI thinks -> token de-/encoding + MatMul" is certainly a stretch, as the part about "Bits -> Transistor Voltages" applies to both hierarchies as the foundational layer.
"chain of thought" could probably be called "progressive on-track-inference" and nobody would roll an eye.
Serendipitous name...
In part I agree with the parent.
I agree that it is pointless to not anthropomorphize because we are humans and we will automatically do this. Willingly or unwillingly.On the other hand, it generates bias. This bias can lead to errors.
So the real answer is (imo) that it is fine to anthropomorphise but recognize that while doing so can provide utility and help us understand, it is WRONG. Recognizing that it is not right and cannot be right provides us with a constant reminder to reevaluate. Use it, but double check, and keep checking making sure you understand the limitations of the analogy. Understanding when and where it applies, where it doesn't, and most importantly, where you don't know if it does or does not. The last is most important because it helps us form hypotheses that are likely to be testable (likely, not always. Also, much easier said than done).
So I pick a "grey area". Anthropomorphization is a tool that can be helpful. But like any tool, it isn't universal. There is no "one-size-fits-all" tool. Literally, one of the most important things for any scientist is to become an expert at the tools you use. It's one of the most critical skills of *any expert*. So while I agree with you that we should be careful of anthropomorphization, I disagree that it is useless and can never provide information. But I do agree that quite frequently, the wrong tool is used for the right job. Sometimes, hacking it just isn't good enough.
> On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs
I hold a deep belief that anthropomorphism is a way the human mind words. If we take for granted the hypothesis of Franz de Waal, that human mind developed its capabilities due to political games, and then think about how it could later lead to solving engineering and technological problems, then the tendency of people to anthropomorphize becomes obvious. Political games need empathy or maybe some other kind of -pathy, that allows politicians to guess motives of others looking at their behaviors. Political games directed the evolution to develop mental instruments to uncover causality by watching at others and interacting with them. Now, to apply these instruments to inanimate world all you need is to anthropomorphize inanimate objects.
Of course, it leads sometimes to the invention of gods, or spirits, or other imaginary intelligences behinds things. And sometimes these entities get in the way of revealing the real causes of events. But I believe that to anthropomorphize LLMs (at the current stage of their development) is not just the natural thing for people but a good thing as well. Some behavior of LLMs is easily described in terms of psychology; some cannot be described or at least not so easy. People are seeking ways to do it. Projecting this process into the future, I can imagine how there will be a kind of consensual LLMs "theory" that explains some traits of LLMs in terms of human psychology and fails to explain other traits, so they are explained in some other terms... And then a revolution happens, when a few bright minds come and say that "anthropomorphism is bad, it cannot explain LLM" and they propose something different.
I'm sure it will happen at some point in the future, but not right now. And it will happen not like that: not just because someone said that anthropomorphism is bad, but because they proposed another way to talk about reasons behind LLMs behavior. It is like with scientific theories: they do not fail because they become obviously wrong, but because other, better theories replace them.
It doesn't mean, that there is no point to fight anthropomorphism right now, but this fight should be directed at searching for new ways to talk about LLMs, not to show at the deficiencies of anthropomorphism. To my mind it makes sense to start not with deficiencies of anthropomorphism but with its successes. What traits of LLMs it allows us to capture, which ideas about LLMs are impossible to wrap into words without thinking of LLMs as of people?
I don't agree. Most LLMs have been trained on human data, so it is best to talk about these models in a human way.
Anthropomorphising implicitly assumes motivation, goals and values. That's what the core of anthropomorphism is - attempting to explain behavior of a complex system in teleological terms. And prompt escapes make it clear LLMs doesn't have any teleological agency yet. Whenever their course of action is, it is to easy to steer them of. Try to do it with a sufficiently motivated human.
>. Try to do it with a sufficiently motivated human.
That's what they call marketing, propaganda or brain washing, acculturation , education depending on who you ask and at which scale you operate, apparently.
> sufficiently motivated
None of these targets sufficiently motivated, rather those who are either ambivalent or yet unexposed.
How will you know when an AI has teleological agency?
Prompt escapes will be much harder, and some of them will end up in an equivalent of "sure here is… no, wait… You know what, I'm not doing that", i. e. slipping and then getting back on track.
Even the verb 'trained' is contentious wrt anthropomorphism.
Somewhat true but rodents can also be trained ...
Rodents aren't functions though?
Every computable system, even stateful systems, can be reformulated as a function.
If IO can be functional, I don't see why mice can't.
Well, that's a strong claim of equivalence between computationable models and realty.
The consensual view is rather that no map is matching fully the territory, or said otherwise the territory includes ontological components that exceeds even the most sophisticated map that can be ever built.
I believe the consensus view is that physics is computable.
Thanks. I think the original point about the word 'trained' being contentious still stands, as evidenced by this thread :)
So you think a rodent is a function?
I think that I am a function.
Agreed. I'm also in favor of anthropomorphizing, because not doing so confuses people about the nature and capabilities of these models even more.
Whether it's hallucinations, prompt injections, various other security vulnerabilities/scenarios, or problems with doing math, backtracking, getting confused - there's a steady supply of "problems" that some people are surprised to discover and even more surprised this isn't being definitively fixed. Thing is, none of that is surprising, and these things are not bugs, they're flip side of the features - but to see that, one has to realize that humans demonstrate those exact same failure modes.
Especially when it comes to designing larger systems incorporating LLM "agents", it really helps to think of them as humans - because the problems those systems face are exactly the same as you get with systems incorporating people, and mostly for the same underlying reasons. Anthropomorphizing LLMs cuts through a lot of misconceptions and false paths, and helps one realize that we have millennia of experience with people-centric computing systems (aka. bureaucracy) that's directly transferrable.
I disagree. Anthropomorphization can be a very useful tool but I think it is currently over used and is a very tricky tool to use when communicating with a more general audience.
I think looking at physics might be a good example. We love our simplified examples and there's a big culture of trying to explain things to the lay person (mostly because the topics are incredibly complex). But how many people have misunderstood an observer of a quantum event with "a human" and do not consider "a photon" as an observer? How many people think in Schrodinger's Cat that the cat is both alive and dead?[0] Or believe in a multiverse. There's plenty of examples we can point to.
While these analogies *can* be extremely helpful, they *can* also be extremely harmful. This is especially true as information is usually passed through a game of telephone[1]. There is information loss and with it, interpretation becomes more difficult. Often a very subtle part can make a critical distinction.
I'm not against anthropomorphization[2], but I do think we should be cautious about how we use it. The imprecise nature of it is the exact reason we should be mindful of when and how to use it. We know that the anthropomorphized analogy is wrong. So we have to think about "how wrong" it is for a given setting. We should also be careful to think about how it may be misinterpreted. That's all I'm trying to say. And isn't this what we should be doing if we want to communicate effectively?
[0] It is not. It is either. The point of this thought experiment is that we cannot know the answer without looking inside. There is information loss and the event is not deterministic. It directly relates to the Heisenberg Uncertainty Principle, Godel's Incompleteness, or the Halting Problem. All these things are (loosely) related around the inability to have absolute determinism.
[1] https://en.wikipedia.org/wiki/Telephone_game
[2] https://news.ycombinator.com/item?id=44494022
I remember Dawkins talking about the "intentional stance" when discussing genes in The Selfish Gene.
It's flat wrong to describe genes as having any agency. However it's a useful and easily understood shorthand to describe them in that way rather than every time use the full formulation of "organisms who tend to possess these genes tend towards these behaviours."
Sometimes to help our brains reach a higher level of abstraction, once we understand the low level of abstraction we should stop talking and thinking at that level.
The intentional stance was Daniel Dennett's creation and a major part of his life's work. There are actually (exactly) three stances in his model: the physical stance, the design stance, and the intentional stance.
https://en.wikipedia.org/wiki/Intentional_stance
I think the design stance is appropriate for understanding and predicting LLM behavior, and the intentional stance is not.
Thanks for the correction. I guess both thinkers took a somewhat similar position and I somehow remembered Dawkins's argument but Dennett's term. The term is memorable.
Do you want to describe WHY you think the design stance is appropriate here but the intentional stance is not?
Exactly. We use anthropomorphic language absolutely all the time when describing different processes for this exact reason - it is a helpful abstraction that allows us to easily describe what’s going on at a high level.
“My headphones think they’re connected, but the computer can’t see them”.
“The printer thinks it’s out of paper, but it’s not”.
“The optimisation function is trying to go down nabla f”.
“The parking sensor on the car keeps going off because it’s afraid it’s too close to the wall”.
“The client is blocked, because it still needs to get a final message from the server”.
…and one final one which I promise you is real because I overheard it “I’m trying to airdrop a photo, but our phones won’t have sex”.
I get the impression after using language models for quite a while that perhaps the one thing that is riskiest to anthropomorphise is the conversational UI that has become the default for many people.
A lot of the issues I'd have when 'pretending' to have a conversation are much less so when I either keep things to a single Q/A pairing, or at the very least heavily edit/prune the conversation history. Based on my understanding of LLM's, this seems to make sense even for the models that are trained for conversational interfaces.
so, for example, an exchange with multiple messages, where at the end I ask the LLM to double-check the conversation and correct 'hallucinations', is less optimal than something like asking for a thorough summary at the end, and then feeding that into a new prompt/conversation, as the repetition of these falsities, or 'building' on them with subsequent messages, is more likely to make them a stronger 'presence' and as a result perhaps affect the corrections.
I haven't tested any of this thoroughly, but at least with code I've definitely noticed how a wrong piece of code can 'infect' the conversation.
This. If an AI spits out incorrect code then i immediately create a new chat and reprompt with additional context.
'Dont use regex for this task' is a common addition for the new chat. Why does AI love regex for simple string operations?
I used to do this as well, but Gemini 2.5 has improved on this quite a bit and I don't find myself needing to do it as much anymore.
The details in how I talk about LLMs matter.
If I use human-related terminology as a shortcut, as some kind of macro to talk at a higher level/more efficiently about something I want to do that might be okay.
What is not okay is talking in a way that implies intent, for example.
Compare:
versus The latter way of talking is still high-level enough but avoids equating/confusing the name of a field with a sentient being.Whenever I hear people saying "an AI" I suggest they replace AI with "statistics" to make it obvious how problematic anthropomorphisms may have become:
The only reason that sounds weird to you is because you have the experience of being human. Human behavior is not magic. It's still just statistics. You go to the bathroom when you have to pee not because some magical concept of consciousness, but because a reciptor in your brain goes off and starts the chain of making you go to the bathroom. AI's are not magic, but nobody has sufficiently provided any proof we are somehow special either.
One thing i find i keep forgetting is that asking an LLM why it makes a particular decision is almost pointless.
It's reply isn't actually going to be why i did a thing. It's reply is going to be whatever is the most probably string of words that fit as a reason.
This is why I actually really love the description of it as a "Shoggoth" - it's more abstract, slightly floaty but it achieves the purpose of not treating and anthropomising it as a human being while not treating LLMs as a collection of predictive words.
I beg to differ.
Anthropomorphizing might blind us to solutions to existing problems. Perhaps instead of trying to come up with the correct prompt for a LLM, there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer given questions.
When we anthropomorphize we are inherently ignore certain parts of how LLMs work, and imagining parts that don't even exist
> there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer
exactly. The opposite is also true. You might supply more clarifying information to the LLM, which would help any human answer, but it actually degrades the LLM's output.
This is frequently the case IME, especially with chat interfaces. One or two bad messages and you derail the quality
You can just throw in words to bias it towards certain outcomes too. Same applies with image generators or course.
My brain refuses to join the rah-rah bandwagon because I cannot see them in my mind’s eye. Sometimes I get jealous of people like GP and OP who clearly seem to have the sight. (Being a serial math exam flunker might have something to do with it. :))))
Anyway, one does what one can.
(I've been trying to picture abstract visual and semi-philosophical approximations which I’ll avoid linking here because they seem to fetch bad karma in super-duper LLM enthusiast communities. But you can read them on my blog and email me scathing critiques, if you wish :sweat-smile:.)
> We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels
We do know what happens at higher abstraction levels; the design of efficient networks, and the steady beat of SOTA improvements all depend on understanding how LLMs work internally: choice of network dimensions, feature extraction, attention, attention heads, caching, the peculiarities of high-dimensions and avoiding overfitting are all well-understood by practitioners. Anthropomorphization is only necessary in pop-science articles that use a limited vocabulary.
IMO, there is very little mystery, but lots of deliberate mysticism, especially about future LLMs - the usual hype-cycle extrapolation.
I'd take it in reverse order: the problem isn't that it's possible to have a computer that "stochastically produces the next word" and can fool humans, it's why / how / when humans evolved to have technological complexity when the majority (of people) aren't that different from a stochastic process.
I've said that before: we have been anthropomorphizing computers since the dawn of information age.
- Read and write - Behaviors that separate humans from animals. Now used for input and output.
- Server and client - Human social roles. Now used to describe network architecture.
- Editor - Human occupation. Now a kind of software.
- Computer - Human occupation!
And I'm sure people referred their cars and ships as 'her' before the invention of computers.
You are conflating anthropomorphism with personification. They are not the same thing. No one believes their guitar or car or boat is alive and sentient when they give it a name or talk to or about it.
https://www.masterclass.com/articles/anthropomorphism-vs-per...
But the author used "anthropomorphism" the same way as I did. I guess we both mean "personification" then.
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
One talking about a program's "behaviors", "actions" or "goals" doesn't mean they believe the program is sentient. Only "ethical constraints" is suspiciously anthropomorphizing.
> One talking about a program's "behaviors", "actions" or "goals" doesn't mean they believe the program is sentient.
Except that is exactly what we’re seeing with LLMs. People believing exactly that.
Perhaps a few mentally unhinged people do.
A bit of anecdote: last year I hung out with a bunch of old classmates that I hadn't seen for quite a while. None of them works in tech.
Surprisingly to me, all of them have ChatGPT installed on their phones.
And unsurprisingly to me, none of them treated it like an actual intelligence. That makes me wonder where those who think ChatGPT is sentient come from.
(It's a bit worrisome that several of them thought it worked "like Google search and Google translation combined", even by the time ChatGPT couldn't do web search...!)
> Perhaps a few mentally unhinged people do.
I think it’s more than a few and it’s still rising, and therein lies the issue.
Which is why it is paramount to talk about this now, when we may still turn the tide. LLMs can be useful, but it’s important to have the right mental model, understanding, expectations, and attitude towards them.
> Perhaps a few mentally unhinged people do.
This is a No True Scotsman fallacy. And it's radically factually wrong.
The rest of your comment is along the lines of the famous (but apocryphal) Pauline Kael line “I can’t believe Nixon won. I don’t know anyone who voted for him.”
I'm not convinced... we use these terms to assign roles, yes, but these roles describe a utility or assign a responsibility. That isn't anthropomorphizing anything, but it rather describes the usage of an inanimate object as tool for us humans and seems in line with history.
What's the utility or the responsibility of AI, what's its usage as tool? If you'd ask me it should be closer to serving insights than "reasoning thoughts".
The "point" of not anthropomorphizing is to refrain from judgement until a more solid abstraction appears. The problem with explaining LLMs in terms of human behaviour is that, while we don't clearly understand what the LLM is doing, we understand human cognition even less! There is literally no predictive power in the abstraction "The LLM is thinking like I am thinking". It gives you no mechanism to evaluate what tasks the LLM "should" be able to do.
Seriously, try it. Why don't LLMs get frustrated with you if you ask them the same question repeatedly? A human would. Why are LLMs so happy to give contradictory answers, as long as you are very careful not to highlight the contradictory facts? Why do earlier models behave worse on reasoning tasks than later ones? These are features nobody, anywhere understands. So why make the (imo phenomenally large) leap to "well, it's clearly just a brain"?
It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
> It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
We tried to mimic birds at first; it turns out birds were way too high-tech, and too optimized. We figured out how to fly when we ditched the biological distraction and focused on flight itself. But fast forward until today, we're reaching the level of technology that allows us to build machines that fly the same way birds do - and of such machines, it's fair to say, "it's a mechanical bird!".
Similarly, we cracked computing from grounds up. Babbage's difference engine was like da Vinci's drawings; ENIAC could be seen as Wright brothers' first flight.
With planes, we kept iterating - developing propellers, then jet engines, ramjets; we learned to move tons of cargo around the world, and travel at high multiples of the speed of sound. All that makes our flying machines way beyond anything nature ever produced, when compared along those narrow dimensions.
The same was true with computing: our machines and algorithms very quickly started to exceed what even smartest humans are capable of. Counting. Pathfinding. Remembering. Simulating and predicting. Reproducing data. And so on.
But much like birds were too high-tech for us to reproduce until now, so were general-purpose thinking machines. Now that we figured out a way to make a basic one, it's absolutely fair to say, "I guess it's like a digital mind".
A machine that emulates a bird is indeed a mechanical bird. We can say what emulating a bird is because we know, at least for the purpose of flying, what a bird is and how it works. We (me, you, everyone else) have no idea how thinking works. We do not know what consciousness is and how it operates. We may never know. It is deranged gibberish to look at an LLM and say "well, it does some things I can do some of the time, so I suppose it's a digital mind!". You have to understand the thing before you can say you're emulating it.
> Why don't LLMs get frustrated with you if you ask them the same question repeatedly?
To be fair, I have had a strong sense of Gemini in particular becoming a lot more frustrated with me than GPT or Claude.
Yesterday I had it ensuring me that it was doing a great job, it was just me not understanding the challenge but it would break it down step by step just to make it obvious to me (only to repeat the same errors, but still)
I’ve just interpreted it as me reacting to the lower amount of sycophancy for now
In addition, when the boss man asks for the same thing repeatedly then the underling might get frustrated as hell, but they won't be telling that to the boss.
The vending machine study from a few months ago, where flash 2.0 lost its mind, contacted the FBI (as far as it knew) and refused to co-operate with the operator's demands, seemed a lot like frustration.
Point out to an LLM that it has no mental states and thus isn't capable of being frustrated (or glad that your program works or hoping that it will, etc. ... I call them out whenever they ascribe emotions to themselves) and they will confirm that ... you can coax from them quite detailed explanations of why and how it's an illusion.
Of course they will quickly revert to self-anthropomorphizing language, even after promising that they won't ... because they are just pattern matchers producing the sort of responses that conforms to the training data, not cognitive agents capable of making or keeping promises. It's an illusion.
Of course this is deeply problematic because it's a cloud of HUMAN response. This is why 'they will' get frustrated or creepy if you mess with them, give repeating data or mind game them: literally all it has to draw on is a vast library of distilled human responses and that's all the LLM can produce. This is not an argument with jibal, it's a 'yes and'.
You can tell it 'you are a machine, respond only with computerlike accuracy' and that is you gaslighting the cloud of probabilities and insisting it should act with a personality you elicit. It'll do what it can, in that you are directing it. You're prompting it. But there is neither a person there, nor a superintelligent machine that can draw on computerlike accuracy, because the DATA doesn't have any such thing. Just because it runs on lots of computers does not make it a computer, any more than it's a human.
LLM are as far away from your description as ASM is from the underlying architecture. The anthropomorohic abstraction is as nice as any metaphore which fall apart the very moment you put a foot outside what it allows to shallowoly grab. But some people will put far more amount to push force a confortable analogy rather than admit it has some limits and to use the new tool in a more relevant way you have to move away from this confort zone.
These anthropomorphizations are best described as metaphors when used by people to describe LLMs in common or loose speech. We already use anthropomorphic metaphors when talking about computers. LLMs, like all computation, are a matter of simulation; LLMs can appear to be conversing without actually conversing. What distinguishes the real thing from the simulation is the cause of the appearance of an effect. Problems occur when people forget these words are being used metaphorically, as if they were univocal.
Of course, LLMs are multimodal and used to simulate all sorts of things, not just conversation. So there are many possible metaphors we can use, and these metaphors don't necessarily align with the abstractions you might use to talk about LLMs accurately. This is like the difference between "synthesizes text" (abstraction) and "speaks" (metaphor), or "synthesizes images" (abstraction) and "paints" (metaphor). You can use "speaks" or "paints" to talk about the abstractions, of course.
That higher level does exist, indeed a lot philosophy of mind then cognitive science has been investigating exactly this space and devising contested professional nomenclature and modeling about such things for decades now.
A useful anchor concept is that of world model, which is what "learning Othello" and similar work seeks to tease out.
As someone who worked in precisely these areas for years and has never stopped thinking about them,
I find it at turns perplexing, sigh-inducing, and enraging, that the "token prediction" trope gained currency and moreover that it continues to influence people's reasoning about contemporary LLM, often as subtext: an unarticulated fundamental model, which is fundamentally wrong in its critical aspects.
It's not that this description of LLM is technically incorrect; it's that it is profoundly _misleading_ and I'm old enough and cynical enough to know full well that many of those who have amplified it and continue to do so, know this very well indeed.
Just as the lay person fundamentally misunderstands the relationship between "programming" and these models, and uses slack language in argumentation, the problem with this trope and the reasoning it entails is that what is unique and interesting and valuable about LLM for many applications and interests is how they do what they do. At that level of analysis there is a very real argument to be made that the animal brain is also nothing more than an "engine of prediction," whether the "token" is a byte stream or neural encoding is quite important but not nearly important as the mechanics of the system which operates on those tokens.
To be direct, it is quite obvious that LLM have not only vestigial world models, but also self-models; and a general paradigm shift will come around this when multimodal models are the norm: because those systems will share with we animals what philosophers call phenomenology, a model of things as they are "perceived" through the senses. And like we humans, these perceptual models (terminology varies by philosopher and school...) will be bound to the linguistic tokens (both heard and spoken, and written) we attach to them.
Vestigial is a key word but an important one. It's not that contemporary LLM have human-tier minds, nor that they have animal-tier world modeling: but they can only "do what they do" because they have such a thing.
Of looming importance—something all of us here should set aside time to think about—is that for most reasonable contemporary theories of mind, a self-model embedded in a world-model, with phenomenology and agency, is the recipe for "self" and self-awareness.
One of the uncomfortable realities of contemporary LLM already having some vestigial self-model, is that while they are obviously not sentient, nor self-aware, as we are, or even animals are, it is just as obvious (to me at least) that they are self-aware in some emerging sense and will only continue to become more so.
Among the lines of finding/research most provocative in this area is the ongoing often sensationalized accounting in system cards and other reporting around two specific things about contemporary models: - they demonstrate behavior pursuing self-preservation - they demonstrate awareness of when they are being tested
We don't—collectively or individually—yet know what these things entail, but taken with the assertion that these models are developing emergent self-awareness (I would say: necessarily and inevitably),
we are facing some very serious ethical questions.
The language adopted by those capitalizing and capitalizing _from_ these systems so far is IMO of deep concern, as it betrays not just disinterest in our civilization collectively benefiting from this technology, but also, that the disregard for human wellbeing implicit in e.g. the hostility to UBI, or, Altman somehow not seeing a moral imperative to remain distant from the current adminstation, implies directly a much greater disregard for "AI wellbeing."
That that concept is today still speculative is little comfort. Those of us watching this space know well how fast things are going, and don't mistake plateaus for the end of the curve.
I do recommend taking a step back from the line-level grind to give these things some thought. They are going to shape the world we live out our days in and our descendents will spend all of theirs in.
The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).
Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.
Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.
Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.
IMHO, anthrophormization of LLMs is happening because it's perceived as good marketing by big corporate vendors.
People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self fulfilling. Kind of like the meme about how to pronounce GIF.
I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build intuitions about how LLMs work by building upon our intuitions about how humans think than by trying to build your intuitions from scratch.
Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.
Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.
I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
Take a look at the judge’s ruling in this Anthropic case:
https://news.ycombinator.com/item?id=44488331
Here’s a quote from the ruling:
“First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable. For centuries, we have read and re-read books. We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.”
They literally compare an LLM learning to a person learning and conflate the two. Anthropic will likely win this case because of this anthropomorphisization.
> First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16).
It sounds like the Authors were the one who brought this argument, not Anthropic? In which case, it seems like a big blunder on their part.
You think it's useful because Big Corp sold you that lie.
Wait till the disillusionment sets in.
No, I think it's useful because it is useful, and I've made use of it a number of times.
IMHO it happens for the same reason we see shapes in clouds. The human mind through millions of years has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two. It's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent
[dead]
> because it's perceived as good marketing
We are making user interfaces. Good user interfaces are intuitive and purport to be things that users are familiar with, such as people. Any alternative explanation of such a versatile interface will be met with blank stares. Users with no technical expertise would come to their own conclusions, helped in no way by telling the user not to treat the chat bot as a chat bot.
Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.
But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
Alright, let’s agree that good marketing resonates with the target market. ;-)
I 1000% agree. It’s a vicious, evolutionary, and self-selecting process.
It takes great marketing to actually have any character and intent at all.
the chat interface was a choice, though a natural one. before they'd RLHFed it into chatting and it was just GPT 3 offering completions 1) not very many people used it and 2) it was harder to anthropomorphize
> People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
Children do, some times, but it's a huge sign of immaturity when adults, let alone tech workers, do it.
I had a professor at University that would yell at us if/when we personified/anthropomorphized the tech, and I have that same urge when people ask me "What does <insert LLM name here> think?".
aAnthrophormisation happens because Humans are absolutely terrible at evaluating systems that give converdational text output.
ELIZA fooled many people into think it was conscious and it wasn't even trying to do that.
True but also researchers want to believe they are studying intelligence not just some approximation to it.
Do they ? LLM embedd the token sequence N^{L} to R^{LxD}, we have some attention and the output is also R^{LxD}, then we apply a projection to the vocabulary and we get R^{LxV} we get therefore for each token a likelihood over the voc. In the attention, you can have Multi Head attention (or whatever version is fancy: GQA,MLA) and therefore multiple representation, but it is always tied to a token. I would argue that there is no hidden state independant of a token.
Whereas LSTM, or structured state space for example have a state that is updated and not tied to a specific item in the sequence.
I would argue that his text is easily understandable except for the notation of the function, explaining that you can compute a probability based on previous words is understandable by everyone without having to resort to anthropomorphic terminology
There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.
There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.
Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.
The LLM does not "have" a plan.
Arguably there's reason to believe it comes up with a plan when it is computing token propabilities, but it does not store it between tokens. I.e. it doesn't possess or "have" it. It simply comes up with a plan, emits a token, and entirely throws all its intermediate thoughts (including any plan) to start again from scratch on the next token.
I believe saying the LLM has a plan is a useful anthropomorphism for the fact that it does have hidden state that predicts future tokens, and this state conditions the tokens it produces earlier in the stream.
Are the devs behind the models adding their own state somehow? Do they have code that figures out a plan and use the LLM on pieces of it and stitch them together? If they do, then there is a plan, it's just not output from a magical black box. Unless they are using a neural net to figure out what the plan should be first, I guess.
I know nothing about how things work at that level, so these might not even be reasonable questions.
It's true that the last layer's output for a given input token only affects the corresponding output token and is discarded afterwards. But the penultimate layer's output affects the computation of the last layer for all future tokens, so it is not discarded, but stored (in the KV cache). Similarly for the antepenultimate layer affecting the penultimate layer and so on.
So there's plenty of space in intermediate layers to store a plan between tokens without starting from scratch every time.
I don't think that the comment above you made any suggestion that the plan is persisted between token generations. I'm pretty sure you described exactly what they intended.
The concept of "state" conveys two related ideas.
- the sufficient amount of information to do evolution of the system. The state of a pendulum is it's position and velocity (or momentum). If you take a single picture of a pendulum, you do not have a representation that lets you make predictions.
- information that is persisted through time. A stateful protocol is one where you need to know the history of the messages to understand what will happen next. (Or, analytically, it's enough to keep track of the sufficient state.) A procedure with some hidden state isn't a pure function. You can make it a pure function by making the state explicit.
I agree. I'm suggesting that the language they are using is unintentionally misleading, not that they are factually wrong.
This is wrong, intermediate activations are preserved when going forward.
Within a single forward pass, but not from one emitted token to another.
What? No. The intermediate hidden states are preserved from one token to another. A token that is 100k tokens into the future will be able to look into the information of the present token's hidden state through the attention mechanism. This is why the KV cache is so big.
KV cache is just that: a cache.
The inference logic of an LLM remains the same. There is no difference in outcomes between recalculating everything and caching. The only difference is in the amount of memory and computation required to do it.
this sounds like a fun research area. do LLMs have plans about future tokens?
how do we get 100 tokens of completion, and not just one output layer at a time?
are there papers youve read that you can share that support the hypothesis? vs that the LLM doesnt have ideas about the future tokens when its predicting the next one?
This research has been done, it was a core pillar of the recent Anthropic paper on token planning and interpretability.
https://www.anthropic.com/research/tracing-thoughts-language...
See section “Does Claude plan its rhymes?”?
Lol... Try building systems off them and you will very quickly learn concretely that they "plan".
It may not be as evident now as it was with earlier models. The models will fabricate preconditions needed to output the final answer it "wanted".
I ran into this when using quasi least-to-most style structured output.
I think that the hidden state is really just at work improving the model's estimation of the joint probability over tokens. And the assumption here, which failed miserably in the early 20th century in the work of the logical posivitists, is that if you can so expertly estimate that joint probability of language, then you will be able to understand "knowledge." But there's no well grounded reason to believe that and plenty of the reasons (see: the downfall of logical posivitism) to think that language is an imperfect representation of knowledge. In other words, what humans do when we think is more complicated than just learning semiotic patterns and regurgitating them. Philosophical skeptics like Hume thought so, but most epistemology writing after that had better answers for how we know things.
There are many theories that are true but not trivially true. That is, they take a statement that seems true and derive from it a very simple model, which is then often disproven. In those cases however, just because the trivial model was disproven doesn't mean the theory was, though it may lose some of its luster by requiring more complexity.
Maybe it's just because so much of my work for so long has focused on models with hidden states but this is a fairly classical feature of some statistical models. One of the widely used LLM textbooks even started with latent variable models; LLMs are just latent variable models just on a totally different scale, both in terms of number of parameters but also model complexity. The scale is apparently important, but seeing them as another type of latent variable model sort of dehumanizes them for me.
Latent variable or hidden state models have their own history of being seen as spooky or mysterious though; in some ways the way LLMs are anthropomorphized is an extension of that.
I guess I don't have a problem with anthropomorphizing LLMs at some level, because some features of them find natural analogies in cognitive science and other areas of psychology, and abstraction is useful or even necessary in communicating and modeling complex systems. However, I do think anthropomorphizing leads to a lot of hype and tends to implicitly shut down thinking of them mechanistically, as a mathematical object that can be probed and characterized — it can lead to a kind of "ghost in the machine" discourse and an exaggeration of their utility, even if it is impressive at times.
I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.
These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).
If you dont know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformers implementation. And because we're not allowed to use metaphors because that is too anthropomorphic, then youre just going to have to go learn the math.
The comment you are replying to is not claiming ignorance of how models work. It is saying that the author does know how they work, and they do not contain anything that can properly be described as "hidden state". The claimed confusion is over how the term "hidden state" is being used, on the basis that it is not being used correctly.
I don't think your response is very productive, and I find that my understanding of LLMs aligns with the person you're calling out. We could both be wrong, but I'm grateful that someone else spoke saying that it doesn't seem to match their mental model and we would all love to learn a more correct way of thinking about LLMs.
Telling us to just go and learn the math is a little hurtful and doesn't really get me any closer to learning the math. It gives gatekeeping.
Do you appreciate a difference between an autoregressive model and a recurrent model?
The "transformer" part isn't under question. It's the "hidden state" part.
Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.
State typically means between interactions. By this definition a simple for loop has “hidden state” in the counter.
Hidden layer is a term of art in machine learning / neural network research. See https://en.wikipedia.org/wiki/Hidden_layer . Somehow this term mutated into "hidden state", which in informal contexts does seem to be used quite often the way the grandparent comment used it.
It makes sense in LLM context because the processing of these is time-sequential in LLM's internal time.
That's not what "state" means, typically. The "state of mind" you're in affects the words you say in response to something.
Intermediate activations isn't "state". The tokens that have already been generated, along with the fixed weights, is the only data that affects the next tokens.
Sure it's state. It logically evolves stepwise per token generation. It encapsulates the LLM's understanding of the text so far so it can predict the next token. That it is merely a fixed function of other data isn't interesting or useful to say.
All deterministic programs are fixed functions of program code, inputs and computation steps, but we don't say that they don't have state. It's not a useful distinction for communicating among humans.
I'll say it once more: I think it is useful to distinguish between autoregressive and recurrent architectures. A clear way to make that distinction is to agree that the recurrent architecture has hidden state, while the autoregressive one does not. A recurrent model has some point in a space that "encapsulates its understanding". This space is "hidden" in the sense that it doesn't correspond to text tokens or any other output. This space is "state" in the sense that it is sufficient to summarize the history of the inputs for the sake of predicting the next output.
When you use "hidden state" the way you are using it, I am left wondering how you make a distinction between autoregressive and recurrent architectures.
I'll also point out what is most important part from your original message:
> LLMs have hidden state not necessarily directly reflected in the tokens being produced, and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer-term outcomes (or predictions, if you prefer).
But what does it mean for an LLM to output a token in opposition to its hidden state? If there's a longer-term goal, it either needs to be verbalized in the output stream, or somehow reconstructed from the prompt on each token.
There’s some work (a link would be great) that disentangles whether chain-of-thought helps because it gives the model more FLOPs to process, or because it makes its subgoals explicit—e.g., by outputting “Okay, let’s reason through this step by step...” versus just "...." What they find is that even placeholder tokens like "..." can help.
That seems to imply some notion of evolving hidden state! I see how that comes in!
But crucially, in autoregressive models, this state isn’t persisted across time. Each token is generated afresh, based only on the visible history. The model’s internal (hidden) layers are certainly rich and structured and "non verbal".
But any nefarious intention or conclusion has to be arrived at on every forward pass.
You're correct, the distinction matters. Autoregressive models have no hidden state between tokens, just the visible sequence. Every forward pass starts fresh from the tokens alone.But that's precisely why they need chain-of-thought: they're using the output sequence itself as their working memory. It's computationally universal but absurdly inefficient, like having amnesia between every word and needing to re-read everything you've written.https://thinks.lol/2025/01/memory-makes-computation-universa...
The words "hidden" and "state" have commonsense meanings. If recurrent architectures want a term for their particular way of storing hidden state they can make up one that isn't ambiguous imo.
"Transformers do not have hidden state" is, as we can clearly see from this thread, far more misleading than the opposite.
Plus a randomness seed.
The 'hidden state' being referred to here is essentially the "what might have been" had the dice rolls gone differently (eg, been seeded differently).
No, that's not quite what I mean. I used the logits in another reply to point out that there is data specific to the generation process that is not available from the tokens, but there's also the network activations adding up to that state.
Processing tokens is a bit like ticks in a CPU, where the model weights are the program code, and tokens are both input and output. The computation that occurs logically retains concepts and plans over multiple token generation steps.
That it is fully deterministic is no more interesting than saying a variable in a single threaded program is not state because you can recompute its value by replaying the program with the same inputs. It seems to me that this uninteresting distinction is the GP's issue.
do LLM models consider future tokens when making next token predictions?
eg. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?
is it only past state that influences the choice of 'the'? or that the model is predicting many tokens in advance and only returning the one in the output?
if it does predict many, id consider that state hidden in the model weights.
I think recent Anthropic work showed that they "plan" future tokens in advance in an emergent way:
https://www.anthropic.com/research/tracing-thoughts-language...
oo thanks!
The most obvious case of this is in terms of `an apple` vs `a pear`. LLMs never get the a-an distinction wrong, because their internal state 'knows' the word that'll come next.
If I give an LLM a fragment of text that starts with, "The fruit they ate was an <TOKEN>", regardless of any plan, the grammatically correct answer is going to force a noun starting with a vowel. How do you disentangle the grammar from planning?
Going to be a lot more "an apple" in the corpus than "an pear"
Author of the original article here. What hidden state are you referring to? For most LLMs the context is the state, and there is no "hidden" state. Could you explain what you mean? (Apologies if I can't see it directly)
Yes, strictly speaking, the model itself is stateless, but there are 600B parameters of state machine for frontier models that define which token to pick next. And that state machine is both incomprehensibly large and also of a similar magnitude in size to a human brain. (Probably, I'll grant it's possible it's smaller, but it's still quite large.)
I think my issue with the "don't anthropomorphize" is that it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly. (And I say "simply" but there's obviously nothing simple about it, and it might be possible already with current hardware, we just don't know how to do it.)
Even if we decide it is clearly different, this is still an incredibly large and dynamic system. "Stateless" or not, there's an incredible amount of state that is not comprehensible to me.
FWIW the number of parameters in a LLM is in the same ballpark as the number of nuerons in a human (roughly 80B) but neurons are not weights, they are kind of a nueral net unto themselves, stateful, adaptive, self modifying, a good variety of neurotransmitters (and their chemical analogs) aside from just voltage.
It's fun to think about just how fantastic a brain is, and how much wattage and data-center-scale we're throwing around trying to approximate its behavior. Mega-effecient and mega-dense. I'm bearish on AGI simply from an internetworking standpoint, the speed of light is hard to beat and until you can fit 80 billion interconnected cores in half a cubic foot you're just not going to get close to the responsiveness of reacting to the world in real time as biology manages to do. but that's a whole nother matter. I just wanted to pick apart that magnitude of parameters is not an altogether meaningful comparison :)
Fair, there is a lot that is incomprehensible to all of us. I wouldn't call it "state" as it's fixed, but that is a rather subtle point.
That said, would you anthropomorphize a meteorological simulation just because it contains lots and lots of constants that you don't understand well?
I'm pretty sure that recurrent dynamical systems pretty quickly become universal computers, but we are treating those that generate human language differently from others, and I don't quite see the difference.
Meteorological simulations don't contain detailed state machines that are intended to encode how a human would behave in a specific situation.
And if it were just language, I would say, sure maybe this is more limited. But it seems like tensors can do a lot more than that. Poorly, but that may primarily be a hardware limitation. It also might be something about the way they work, but not something terribly different from what they are doing.
Also, I might talk about a meteorological simulation in terms of whatever it was intended to simulate.
> it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly.
This is "simply" an acknowledgement of extreme ignorance of how human brains work.
You wrote this article and you're not familiar with hidden states?
I am not aware that an LLM contains any.
> Is it too anthropomorphic to say that this is a lie?
Yes. Current LLMs can only introspect from output tokens. You need hidden reasoning that is within the black box, self-knowing, intent, and motive to lie.
I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.
When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.
I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies while declining to award LLMs "grasping". Which is it? It can think or it cannot? (objectively, SoTA models today cannot yet.) The willingness to waffle and pivot around whichever perspective damns the machine completely belies the lack of honesty in such conversations.
> Current LLMs can only introspect from output tokens
The only interpretation of this statement I can come up with is plain wrong. There's no reason LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden state processing is trained.
There are no recurrent paths besides tokens. How may I introspect something if it is not an input? I may not.
The recurrence comes from replaying tokens during autoregression.
It's as if you have a variable in a deterministic programming language, only you have to replay the entire history of the program's computation and input to get the next state of the machine (program counter + memory + registers).
Producing a token for an LLM is analogous to a tick of the clock for a CPU. It's the crank handle that drives the process.
Important attention heads or layers within an LLM can be repeated giving you an "unrolled" recursion.
An unrolled loop in a feed-forward network is all just that. The computation is DAG.
But the function of an unrolled recursion is the same as a recursive function with bounded depth as long as the number of unrolled steps match. The point is whatever function recursion is supposed to provide can plausibly be present in LLMs.
And then during the next token, all of that bounded depth is thrown away except for the token of output.
You're fixating on the pseudo-computation within a single token pass. This is very limited compared to actual hidden state retention and the introspection that would enable if we knew how to train it and do online learning already.
The "reasoning" hack would not be a realistic implementation choice if the models had hidden state and could ruminate on it without showing us output.
Sure. But notice "ruminate" is different than introspect, which was what your original comment was about.
Introspection doesn't have to be recurrent. It can happen during the generation of a single token.
"Hidden layers" are not "hidden state".
Saying so is just unbelievably confusing.
> Output tokens are just an artefact for our convenience
That's nonsense. The hidden layers are specifically constructed to increase the probability that the model picks the right next word. Without the output/token generation stage the hidden layers are meaningless. Just empty noise.
It is fundamentally an algorithm for generating text. If you take the text away it's just a bunch of fmadds. A mute person can still think, an LLM without output tokens can do nothing.
So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.
The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, clearly the rules are complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter impulse response like view of a sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some level of knowledge of ML mathematics and systems) “know better” than to believe these systems to possess morals, ethics, feelings, personalities, etc, the vast majority of people do not have any access to meaningful understanding of the mathematical, functional representation of an LLM and will not take that view, and for all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems like it is in fact useful to ask questions from that lens as well.
In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody and the depth of their understanding of the technology in question. That bar is so low for our largely technologically-illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).
(1) https://www.spectator.co.uk/article/deus-ex-machina-the-dang... (2) https://arxiv.org/html/2411.13223v1 (3) https://www.theguardian.com/world/2025/jun/05/in-thailand-wh...
I've seen some of the world's top AI researchers talk about the emergent behaviors of LLMs. It's been a major topic over the past couple years, ever since Microsoft's famous paper on the unexpected capabilities of GPT4. And they still have little understanding of how it happens.
> For people who have only a surface-level understanding of how they work, yes.
This is too dismissive because it's based on an assumption that we have a sufficiently accurate mechanistic model of the brain that we can know when something is or is not mind-like. This just isn't the case.
Nah, as a person that knows in detail how LLMs work with probably unique alternative perspective in addition to the commonplace one, I found any claims of them not having emergent behaviors to be of the same fallacy as claiming that crows can't be black because they have DNA of a bird.
> the same fallacy as claiming that crows can't be black because they have DNA of a bird.
What fallacy is that? I’m a fan of logical fallacies and never heard that claim before nor am I finding any reference with a quick search.
(Not the parent)
It doesn't have a name, but I have repeatedly noticed arguments of the form "X cannot have Y, because <explains in detail the mechanism that makes X have Y>". I wanna call it "fallacy of reduction" maybe: the idea that because a trait can be explained with a process, that this proves the trait absent.
(Ie. in this case, "LLMs cannot think, because they just predict tokens." Yes, inasmuch as they think, they do so by predicting tokens. You have to actually show why predicting tokens is insufficient to produce thought.)
Good catch. No such fallacy exists. Contextually, the implied reasoning (though faulty) relies on the fallacy of denying the antecedent. The mons ponus - if A then B - does NOT imply not A then not B. So if you see B, that doesn't mean A any more than not seeing A means not B. It's the difference between a necessary and sufficient condition - A is a sufficient condition for B, but the mons ponus alone is not sufficient for determining whether either A or B is a necessary condition of the other.
I think s/he meant swans instead (in ref. to Popperian epistemology).
Not sure though, the point s/he is making isn't really clear to me as well
I was thinking of the black swan fallacy as well. But it doesn’t really support their argument, so I remained confused.
Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.
It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
What you identify as emergent and mind-like is a direct result of these tools being able to mimic human communication patterns unlike anything we've ever seen before. This capability is very impressive and has a wide range of practical applications that can improve our lives, and also cause great harm if we're not careful, but any semblance of intelligence is an illusion. An illusion that many people in this industry obsessively wish to propagate, because thar be gold in them hills.
No.
Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.
Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.
> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?
Newton's laws are a good enough approximation for many tasks so it's not a "false understanding" as long as their limits are taken into account.
I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.
I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
> The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it.
Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.
> Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.
But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…
> I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
Maybe there is some scientific aspect of interest here that i do not grasp, i would assume it can make sense in some context of psychological study. My point is that if you go that route you accept the premise that "something human-like is there", which, by that person's understanding, will have tremendous consequences. Them seeing you accepting their premise (even for study) amplifies their wrong conclusions, that's all I'm saying.
> Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person.
These properties are completely irrelevant to the image of the person. They will reflect a rock, a star, a chair, a goose, a human. Similar is my point of LLM, they reflect what you put in there.
It is like puting vegies in the fridge and then opening it up the next day and saying "Woah! There are vegies in my fridge, just like my farm! My friege is farm-like because vegies come out of it."
[flagged]
Please don't do this here. If a comment seems unfit for HN, please flag it and email us at hn@ycombinator.com so we can have a look.
Ok. How do you know?
The author seems to want to label any discourse as “anthropomorphizing”. The word “goal” stood out to me: the author wants us to assume that we're anthropomorphizing as soon as we even so much as use the word “goal”. A simple breadth-first search that evaluates all chess boards and legal moves, but stops when it finds a checkmate for white and outputs the full decision tree, has a “goal”. There is no anthropomorphizing here, it's just using the word “goal” as a technical term. A hypothetical AGI with a goal like paperclip maximization is just a logical extension of the breadth-first search algorithm. Imagining such an AGI and describing it as having a goal isn't anthropomorphizing.
Author here. I am entirely ok with using "goal" in the context of an RL algorithm. If you read my article carefully, you'll find that I object to the use of "goal" in the context of LLMs.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
This is such a bizarre take.
The relation associating each human to the list of all words they will ever say is obviously a function.
> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.
The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.
> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function . For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.
This is just a way of generating certain kinds of functions.
Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
[0] https://en.wikipedia.org/wiki/Universal_approximation_theore...
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.
In my experience, that's not a particularly effective tactic.
Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves rather an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:
> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.
>> given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
> Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not".
I think you may have this flipped compared to what the author intended. I believe the author is not talking about the probability of an output given an input, but the probability of a given output across all inputs.
Note that the paragraph starts with "In my worldview, humans are dramatically different things than a function, (R^n)^c -> (R^n)^c". To compute a probability of a given output, (which is a any given element in "(R^n)^n"), we can count how many mappings there are total and then how many of those mappings yield the given element.
The point I believe is to illustrate the complexity of inputs for humans. Namely for humans the input space is even more complex than "(R^n)^c".
In your example, we can compute how many input phrases into a LLM would produce the output "make it or not". We can than compute that ratio to all possible input phrases. Because "(R^n)^c)" is finite and countable, we can compute this probability.
For a human, how do you even start to assess the probability that a human would ever say "make it or not?" How do you even begin to define the inputs that a human uses, let alone enumerate them? Per the author, "We understand essentially nothing about it." In other words, the way humans create their outputs is (currently) incomparably complex compared to a LLM, hence the critique of the anthropomorphization.
I see your point, and I like that you're thinking about this from the perspective of how to win hearts and minds.
I agree my approach is unlikely to win over the author or other skeptics. But after years of seeing scientists waste time trying to debate creationists and climate deniers I've kind of given up on trying to convince the skeptics. So I was talking more to HN in general.
> You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside
I'm not sure what it means to be observable or not from the outside. I think this is at least partially because I don't know what it means to be inside either. My point was just that whatever consciousness is, it takes place in the physical world and the laws of physics apply to it. I mean that to be as weak a claim as possible: I'm not taking any position on what consciousness is or how it works etc.
Searle's Chinese room argument attacks attacks a particular theory about the mind based essentially turing machines or digital computers. This theory was popular when I was in grad school for psychology. Among other things, people holding the view that Searle was attacking didn't believe that non-symbolic computers like neural networks could be intelligent or even learn language. I thought this was total nonsense, so I side with Searle in my opposition to it. I'm not sure how I feel about the Chinese room argument in particular, though. For one thing it entirely depends on what it means to "understand" something, and I'm skeptical that humans ever "understand" anything.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
I see what you're saying: that a technically incorrect assumption can bring to bear tools that improve our analysis. My nitpick here is I agree with OP that we shouldn't anthropomorphize LLMs, any more than we should anthropomorphize dogs or cats. But OP's arguments weren't actually about anthropomorphizing IMO, they were about things like functions that are more fundamental than humans. I think artificial intelligence will be non-human intelligence just like we have many examples of non-human intelligence in animals. No attribution of human characteristics needed.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Yes I agree with you about your lyrics example. But again here I think OP is incorrect to focus on the token generation argument. We all agree human speech generates tokens. Hopefully we all agree that token generation is not completely predictable. Therefore it's by definition a randomized algorithm and it needs to take an RNG. So pointing out that it takes an RNG is not a valid criticism of LLMs.
Unless one is a super-determinist then there's randomness at the most basic level of physics. And you should expect that any physical process we don't understand well yet (like consciousness or speech) likely involves randomness. If one *is* a super-determinist then there is no randomness, even in LLMs and so the whole point is moot.
Not that this is your main point, but I find this take representative, “do you believe there's anything about humans that exists outside the mathematical laws of physics?”There are things “about humans”, or at least things that our words denote, that are outside physic’s explanatory scope. For example, the experience of the colour red cannot be known, as an experience, by a person who only sees black and white. This is the case no matter what empirical propositions, or explanatory system, they understand.
This idea is called qualia [0] for those unfamiliar.
I don't have any opinion on the qualia debates honestly. I suppose I don't know what it feels like for an ant to find a tasty bit of sugar syrup, but I believe it's something that can be described with physics (and by extension, things like chemistry).
But we do know some things about some qualia. Like we know how red light works, we have a good idea about how photoreceptors work, etc. We know some people are red-green colorblind, so their experience of red and green are mushed together. We can also have people make qualia judgments and watch their brains with fMRI or other tools.
I think maybe an interesting question here is: obviously it's pleasurable to animals to have their reward centers activated. Is it pleasurable or desirable for AIs to be rewarded? Especially if we tell them (as some prompters do) that they feel pleasure if they do things well and pain if they don't? You can ask this sort of question for both the current generation of AIs and future generations.
[0] https://en.wikipedia.org/wiki/Qualia
Perhaps. But I can't see a reason why they couldn't still write endless—and theoretically valuable—poems, dissertations, or blog posts, about all things red and the nature of redness itself. I imagine it would certainly take some studying for them, likely interviewing red-seers, or reading books about all things red. But I'm sure they could contribute to the larger red discourse eventually, their unique perspective might even help them draw conclusions the rest of us are blind to.
So perhaps the fact that they "cannot know red" is ultimately irrelevant for an LLM too?
>Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra, as the best case. Why should we expect the model to be anything more than a model ?
I understand that there is tons of vested interest, many industries, careers and lives literally on the line causing heavy bias to get to AGI. But what I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Should we make an argument saying that Schroedinger's cat experiment can potentially create zombies then the underlying Applied probabilistic solutions should be treated as super-human and build guardrails against it building zombie cats?
> It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra....I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Not linear algebra. Artificial neural networks create arbitrarily non-linear functions. That's the point of non-linear activation functions and it's the subject of the universal approximation theorems I mentioned above.
ANNs are just mathematical transformations, powered by linear algebra + non-linear functions. They simulate certain cognitive processes — but they are fundamentally math, not magic.
I think the point of mine that you're missing (or perhaps disagreeing with implicitly) is that *everything* is fundamentally math. Or, if you like, everything is fundamentally physics, and physics is fundamentally math.
So classes of functions (ANNs) that can approximate our desired function to arbitrary precision are what we should be expecting to be working with.
Who invoked magic in this thread exactly?
I wouldn't say they "simulate cognitive processes". They do statistics. Advanced multivariadic statistics.
An LLM thinks in the same way excel thinks when you ask it to fit a curve.
>Why should we expect the model to be anything more than a model ?
To model a process with perfect accuracy requires recovering the dynamics of that process. The question we must ask is what happens in the space between bad statistical model and perfect accuracy? What happens when the model begins to converge towards accurate reproduction. How far does generalization in the model take us towards capturing the dynamics involved in thought?
> do you believe there's anything about humans that exists outside the mathematical laws of physics?
I don't.
The point is not that we, humans, cannot arrange physical matter such that it have emergent properties just like the human brain.
The point is that we shouldn't.
Does responsibility mean anything to these people posing as Evolution?
Nobody's personally responsible for what we've evolved into; evolution has simply happened. Nobody's responsible for the evolutionary history that's carried in and by every single one of us. And our psychology too has been formed by (the pressures of) evolution, of course.
But if you create an artificial human, and create it from zero, then all of its emergent properties are on you. Can you take responsibility for that? If something goes wrong, can you correct it, or undo it?
I don't consider our current evolutionary state "scripture", so we certainly tweak, one way or another, aspects that we think deserve tweaking. To me, it boils down to our level of hubris. Some of our "mistaken tweaks" are now visible at an evolutionary scale, too; for a mild example, our jaws have been getting smaller (leaving less room for our teeth) due to our bad up diet (thanks, agriculture). But worse than that, humans have been breeding plants, animals, modifying DNA left and right, and so on -- and they've summarily failed to take responsibility for their atrocious mistakes.
Thus, I have zero trust in, and zero hope for, assholes who unabashedly aim to create artificial intelligence knowing full well that such properties might emerge that we'd have to call artificial psyche. Anyone taking this risk is criminally reckless, in my opinion.
It's not that humans are necessarily unable to create new sentient beings. Instead: they shouldn't even try! Because they will inevitably fuck it up, bringing about untold misery; and they won't be able to contain the damage.
>There's a rich family of universal approximation theorems
Wow, look-up tables can get increasingly good at approximating a function!
A function is by definition a lookup table.
The lookup table is just (x, f(x)).
So, yes, trivially if you could construct the lookup table for f then you'd approximate f. But to construct it you have to know f. And to approximate it you need to know f at a dense set of points.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
If that's the argument, then in my mind the more pertinent question is should you be anthropomorphizing humans, Larry Ellison or not.
I think you to as he is human, but I respect your desire to question it!
The people in this thread incredulous at the assertion that they are not God and haven't invented machine life are exasperating. At this point I am convinced they, more often than not, financially benefit from their near religious position in marketing AI as akin to human intelligence.
Are we looking at the same thread? I see nobody claiming this. Anthropic does sometimes, their position is clearly wishful thinking, and it's not represented ITT.
Try looking at this from another perspective - many people simply do not see human intelligence (or life, for that matter) as magic. I see nothing religious about that, rather the opposite.
I agree with you @orbital-decay that I also do not get the same vibe reading this thread.
Though, while human intelligence is (seemingly) not magic, it is very far from being understood. The idea that a LLM is comparable to human intelligence implies that we even understand human intelligence well enough to say that.
LLMs are also not understood. I mean we built and trained them. But don't of the abilities at still surprising to researchers. We have yet to map these machines.
I am ready and waiting for you to share these comments that are incredulous at the assertion they are not God, lol.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.
Like, person A says "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.
And then in practice, an unproductive argument usually follows - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"
In some contexts it's super-important to remember that LLMs are stochastic word generators.
Everyday use is not (usually) one of those contexts. Prompting an LLM works much better with an anthropomorphized view of the model. It's a useful abstraction, a shortcut that enables a human to reason practically about how to get what they want from the machine.
It's not a perfect metaphor -- as one example, shame isn't much of a factor for LLMs, so shaming them into producing the right answer seems unlikely to be productive (I say "seems" because it's never been my go-to, I haven't actually tried it).
As one example, that person a few years back who told the LLM that an actual person would die if the LLM didn't produce valid JSON -- that's not something a person reasoning about gradient descent would naturally think of.
People anthropomorphize just about anything around them. People talk about inanimate objects like they are persons. Ships, cars, etc. And of course animals are well in scope for this as well, even the ones that show little to no signs of being able to reciprocate the relationship (e.g. an ant). People talk to their plants even.
It's what we do. We can't help ourselves. There's nothing crazy about it and most people are perfectly well aware that their car doesn't love them back.
LLMs are not conscious because unlike human brains they don't learn or adapt (yet). They basically get trained and then they become read only entities. So, they don't really adapt to you over time. Even so, LLMs are pretty good and can fake a personality pretty well. And with some clever context engineering and alignment, they've pretty much made the Turing test irrelevant; at least over the course of a short conversation. And they can answer just about any question in a way that is eerily plausible from memory, and with the help of some tools actually pretty damn good for some of the reasoning models.
Anthropomorphism was kind of a foregone conclusion the moment we created computers; or started thinking about creating one. With LLMs it's pretty much impossible not to anthropomorphize. Because they've actually been intentionally imitate human communication. That doesn't mean that we've created AGIs yet. For that we need some more capability. But at the same time, the learning processes that we use to create LLMs are clearly inspired by how we learn ourselves. Our understanding of how that works is far from perfect but it's yielding results. From here to some intelligent thing that is able to adapt and learn transferable skills is no longer unimaginable.
The short term impact is that LLMs are highly useful tools that have an interface that is intentionally similar to how we'd engage with others. So we can talk and it listens. Or write and it understands. And then it synthesizes some kind of response or starts asking questions and using tools. The end result is quite a bit beyond what we used to be able to expect from computers. And it does not require a lot of training of people to be able to use them.
> LLMs are not conscious because unlike human brains they don't learn or adapt (yet).
That's neither a necessary nor sufficient condition.
In order to be conscious, learning may not be needed, but a perception of the passing of time may be needed which may require some short-term memory. People with severe dementia often can't even remember the start of a sentence they are reading, they can't learn, but they are certainly conscious because they have just enough short-term memory.
And learning is not sufficient either. Consciousness is about being a subject, about having a subjective experience of "being there" and just learning by itself does not create this experience. There is plenty of software that can do some form of real-time learning but it doesn't have a subjective experience.
You should note that "what is consciousness" is still very much an unsettled debate.
But nobody would dispute my basic definition (it is the subjective feeling or perception of being in the world).
There are unsettled questions but that definition will hold regardless.
> People anthropomorphize just about anything around them.
They do not, you are mixing up terms.
> People talk about inanimate objects like they are persons. Ships, cars, etc.
Which is called “personification”, and is a different concept from anthropomorphism.
Effectively no one really thinks their car is alive. Plenty of people think the LLM they use is conscious.
https://www.masterclass.com/articles/anthropomorphism-vs-per...
I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.
> of this
You mean that LLMs are more than just the matmuls they're made up of, or that that is exactly what they are and how great that is?
Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.
They don't want to. It seems a lot of people are uncomfortable and defensive about anything that may demystify LLMs.
It's been a wake up call for me to see how many people in the tech space have such strong emotional reactions to any notions of trying to bring discourse about LLMs down from the clouds.
The campaigns by the big AI labs have been quite successful.
Do you actually consider this is an intellectually honest position? That you have thought about this long and hard, like you present this, second guessed yourself a bunch, tried to critique it, and this is still what you ended up converging on?
But let me substantiate before you (rightly) accuse me of just posting a shallow dismissal.
> They don't want to.
Who's they? How could you possibly know? Are you a mind reader? Worse, a mind reader of the masses?
> It seems a lot of people are uncomfortable and defensive about anything that may demystify LLMs.
That "it seems" is doing some serious work over there. You may perceive and describe many people's comments as "uncomfortable and defensive", but that's entirely your own head cannon. All it takes is for someone to simply disagree. It's worthless.
Have you thought about other possible perspectives? Maybe people have strong opinions because they consider what things present as more important than what they are? [0] Maybe people have strong opinions because they're borrowing from other facets of their personal philosophies, which is what they actually feel strongly about? [1] Surely you can appreciate that there's more to a person than what equivalent-presenting "uncomfortable and defensive" comments allow you to surmise? This is such a blatant textbook kneejerk reaction. "They're doing the thing I wanted to think they do anyways, so clearly they do it for the reasons I assume. Oh how correct I am."
> to any notions of trying to bring discourse about LLMs down from the clouds
(according to you)
> The campaigns by the big AI labs have been quite successful.
(((according to you)))
"It's all the big AI labs having successfully manipulated the dumb sheep which I don't belong to!" Come on... Is this topic really reaching political grifting kind of levels?
[0] tangent: if a feature exists but even after you put an earnest effort into finding it you still couldn't, does that feature really exist?
[1] philosophy is at least kind of a thing https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosoph...
Yes, and what I was trying to do is learn a bit more about that alternative intuition of yours. Because it doesn't sound all that different from what's described in the OP, or what anyone can trivially glean from taking a 101 course on AI at university or similar.
So what? :)
Nothing? """:)"""
Was just confusing because your phrasing implied different.
https://en.wikipedia.org/wiki/Cooperative_principle
My question: how do we know that this is not similar to how human brains work. What seems intuitively logical to me is that we have brains evolved through evolutionary process via random mutations yielding in a structure that has its own evolutionary reward based algorithms designing it yielding a structure that at any point is trying to predict next actions to maximise survival/procreation, of course with a lot of sub goals in between, ultimately becoming this very complex machinery, but yet should be easily simulated if there was enough compute in theory and physical constraints would allow for it.
Because, morals, values, consciousness etc could just be subgoals that arised through evolution because they support the main goals of survival and procreation.
And if it is baffling to think that a system could rise up, how do you think it is possible life and humans came to existence in the first place? How could it be possible? It is already happened from a far unlikelier and strange place. And wouldn't you think the whole World and the timeline in theory couldn't be represented as a deterministic function. And if not then why should "randomness" or anything else bring life to existence.
> how do we know that this is not similar to how human brains work.
Do you forget every conversation as soon as you have them? When speaking to another person, do they need to repeat literally everything they said and that you said, in order, for you to retain context?
If not, your brain does not work like an LLM. If yes, please stop what you’re doing right now and call a doctor with this knowledge. I hope Memento (2000) was part of your training data, you’re going to need it.
Knowledge of every conversation must be some form of state in our minds, just like for LLMs it could be something retrieved from a database, no? I don't think information storing or retrieval is necessarily the most important achievements here in the first place. It's the emergent abilities that you wouldn't have expected to occur.
Maybe the important thing is that we don't imbue the machine with feelings or morals or motivation: it has none.
If we developed feelings, morals and motivation due to them being good subgoals for primary goals, survival and procreation why couldn't other systems do that. You don't have to call them the same word or the same thing, but feeling is a signal that motivates a behaviour in us, that in part has developed from generational evolution and in other part by experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator and increased the survival chances, then due to that the mutation became widespread. Similarly a feeling in a machine could be a signal it developed that goes through a certain pathway to yield in a certain outcome.
The real challenge is not to see it as a binary (the machine either has feelings or it has none). It's possible for the machine to have emergent processes or properties that resemble human feelings in their function and their complexity, but are otherwise nothing like them (structured very differently and work on completely different principles). It's possible to have a machine or algorithm so complex that the question of whether it has feelings is just a semantic debate on what you mean by “feelings” and where you draw the line.
A lot of the people who say “machines will never have feelings” are confident in that statement because they draw the line incredibly narrowly: if it ain't human, it ain't feeling. This seems to me putting the cart before the horse. It ain't feeling because you defined it so.
> My question: how do we know that this is not similar to how human brains work.
It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.
> It is similar to how human brains operate.
Is it? Do we know how human brains operate? We know the basic architecture of them, so we have a map, but we don't know the details.
"The cellular biology of brains is relatively well-understood, but neuroscientists have not yet generated a theory explaining how brains work. Explanations of how neurons collectively operate to produce what brains can do are tentative and incomplete." [1]
"Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work." [1]
[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC10585277/
The part I was referring to is captured in
"The cellular biology of brains is relatively well-understood"
Fundamentally, brains are not doing something different in kind from ANNs. They're basically layers of neural networks stacked together in certain ways.
What we don't know are things like (1) how exactly are the layers stacked together, (2) how are the sensors (like photo receptors, auditory receptors, etc) hooked up?, (3) how do the different parts of the brain interact?, (4) for that matter what do the different parts of the brain actually do?, (5) how do chemical signals like neurotransmitters convey information or behavior?
In the analogy between brains and artificial neural networks, these sorts of questions might be of huge importance to people building AI systems, but they'd be of only minor importance to users of AI systems. OpenAI and Google can change details about how their various transformer layers and ANN layers are connected. The result may be improved products, but they won't be doing anything different from what AIs are doing now in terms the author of this article is concerned about.
ANNs don't have action potentials, let alone neurotransmitters.
> > It is similar to how human brains operate.
> Is it?
This is just a semantic debate on what counts as “similar”. It's possible to disagree on this point despite agreeing on everything relating to how LLMs and human brains work.
Sorry, that's just complete bullshit. How LLMs work in no way models how processes in the human brain works.
It really is not. ANNs bear only a passing resemblance to how neurons work.
I think it's just an unfair comparison in general. The power of the LLM is the zero risk to failure, and lack of consequence when it does. Just try again, using a different prompt, retrain maybe, etc.
Humans make a bad choice, it can end said human's life. The worst choice a LLM makes just gets told "no, do it again, let me make it easier"
But an LLM model could perform poorly in tests that it is not considered and essentially means "death" for it. But begs the question at which scope should we consider an LLM to be similar to identity of a single human. Are you the same you as you were few minutes back or 10 years back? Is LLM the same LLM it is after it has been trained for further 10 hours, what if the weights are copy pasted endlessly, what if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put together from other atoms from elsewhere?
Ultimately this matters from evolutionary evolvement and survival of the fittest idea, but it makes the question of "identity" very complex. But death will matter because this signals what traits are more likely to keep going into new generations, for both humans and LLMs.
Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.
Yes boss, it's as intelligent as a human, you're smart to invest in it and clearly knows about science.
Yes boss, it can reach mars by 2020, you're smart to invest in it and clearly knows about space.
Yes boss, it can cure cancer, you're smart to invest in it and clearly knows about biology.
This reminds me of the idea that LLMs are simulators. Given the current state (the prompt + the previously generated text), they generate the next state (the next token) using rules derived from training data.
As simulators, LLMs can simulate many things, including agents that exhibit human-like properties. But LLMs themselves are not agents.
More on this idea here: https://www.alignmentforum.org/posts/vJFdjigzmcXMhNTsx/agi-s...
This perspective makes a lot of sense to me. Still, I wouldn't avoid anthropomorphization altogether. First, in some cases, it might be a useful mental tool to understand some aspect of LLMs. Second, there is a lot of uncertainty about how LLMs work, so I would stay epistemically humble. The second argument applies in the opposite direction as well: for example, it's equally bad to say that LLMs are 100% conscious.
On the other hand, if someone argues against anthropomorphizing LLMs, I would avoid phrasing it as: "It's just matrix multiplication." The article demonstrates why this is a bad idea pretty well.
It still boggles my mind why an amazing text autocompletion system trained on millions of books and other texts is forced to be squeezed through the shape of a prompt/chat interface, which is obviously not the shape of most of its training data. Using it as chat reduces the quality of the output significantly already.
The chat interface is a UX compromise that makes LLMs accessible but constrains their capabilities. Alternative interfaces like document completion, outline expansion, or iterative drafting would better leverage the full distribution of the training data while reducing anthropomorphization.
What's your suggested alternative?
In our internal system we use it "as-is" as an autocomplete system; query/lead into terms directly and see how it continues and what it associates with the lead you gave.
Also visualise the actual associative strength of each token generated to confer how "sure" the model is.
LLMs alone aren't the way to AGI or an individual you can talk to in natural language. They're a very good lossy compression over a dataset that you can query for associations.
> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI
People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of if they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...
The missing bit is culture: the concepts, expectations, practices, attitudes… that are evolved over time by a human group and which each one of us has picked up throughout our lifetimes, both implicitly and explicitly.
LLMs are great at predicting and navigating human culture, at least the subset that can be captured in their training sets.
The ways in which we interact with other people are culturally mediated. LLMs are not people, but they can simulate that culturally-mediated communication well enough that we find it easy to anthropomorphise them.
> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.
The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.
I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.
This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.
The anthropomorphic view of LLM is a much better representation and compression for most types of discussions and communication. A purely mathematical view is accurate but it isn’t productive for the purpose of the general public’s discourse.
I’m thinking a legal systems analogy, at the risk of a lossy domain transfer: the laws are not written as lambda calculus. Why?
And generalizing to social science and humanities, the goal shouldn’t be finding the quantitative truth, but instead understand the social phenomenon using a consensual “language” as determined by the society. And in that case, the anthropomorphic description of the LLM may gain validity and effectiveness as the adoption grows over time.
Strong disagree here, the average person coming away with ideas that only vaguely intersect with the reality.
I've personally described the "stochastic parrot" model to laypeople who were worried about AI and they came away much more relaxed about it doing something "malicious". They seemed to understand the difference between "trained at roleplay" and "consciousness".
I don't think we need to simplify it to the point of considering it sentient to get the public to interact with it successfully. It causes way more problems than it solves.
Am I misunderstanding what you mean by "malicious"? It sounds like the stochastic parrot model wrongly convinced these laypeople you were talking to that they don't need to worry about LLMs doing bad things. That's definitely been my experience - the people who tell me the most about stochastic parrots are the same ones who tell me that it's absurd to worry about AI-powered disinformation or AI-powered scams.
The author's critique of naive anthropomorphism is salient. However, the reduction to "just MatMul" falls into the same trap it seeks to avoid: it mistakes the implementation for the function. A brain is also "just proteins and currents," but this description offers no explanatory power.
The correct level of analysis is not the substrate (silicon vs. wetware) but the computational principles being executed. A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
To dismiss these systems as incomparable to human cognition because their form is different is to miss the point. We should not be comparing a function to a soul, but comparing the functional architectures of two different information processing systems. The debate should move beyond the sterile dichotomy of "human vs. machine" to a more productive discussion of "function over form."
I elaborate on this here: https://dmf-archive.github.io/docs/posts/beyond-snn-plausibl...
> A brain is also "just proteins and currents,"
This is actually not comparable, because the brain has a much more complex structure that is _not_ learned, even at that level. The proteins and their structure are not a result of training. The fixed part for LMMs is rather trivial and is, in fact, not much for than MatMul which is very easy to understand - and we do. The fixed part of the brain, including the structure of all the proteins is enormously complex which is very difficult to understand - and we don't.
The brain is trained to perform supervised and unsupervised hybrid learning from the environment's uninterrupted multimodal input.
Please do not ignore your childhood.
"Not conscious" is a silly claim.
We have no agreed-upon definition of "consciousness", no accepted understanding of what gives rise to "consciousness", no way to measure or compare "consciousness", and no test we could administer to either confirm presence of "consciousness" in something or rule it out.
The only answer to "are LLMs conscious?" is "we don't know".
It helps that the whole question is rather meaningless to practical AI development, which is far more concerned with (measurable and comparable) system performance.
Now we have.
https://github.com/dmf-archive/IPWT
https://dmf-archive.github.io/docs/posts/backpropagation-as-...
But you're right, capital only cares about performance.
https://dmf-archive.github.io/docs/posts/PoIQ-v2/
This looks to me like the usual "internet schizophrenics inventing brand new theories of everything".
> A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
Could you suggest some literature supporting this claim? Went through your blog post but couldn't find any.
Sorry, I didn't have time to find the relevant references at the time, so I'm attaching some now
https://www.frontiersin.org/journals/computational-neuroscie...
https://arxiv.org/abs/2305.15775
I find it useful to pretend that I'm talking to a person while brainstorming because then the conversation flows naturally. But I maintain awareness that I'm pretending, much like Tom Hanks talking to Wilson the volleyball in the movie Castaway. The suspension of disbelief serves a purpose, but I never confuse the volleyball for a real person.
You are still being incredibly reductionist but just going into more detail about the system you are reducing. If I stayed at the same level of abstraction as "a brain is just proteins and current" and just described how a single neuron firing worked, I could make it sound equally ridiculous that a human brain might be conscious.
Here's a question for you: how do you reconcile that these stochastic mapping are starting to realize and comment on the fact that tests are being performed on them when processing data?
> Here's a question for you: how do you reconcile that these stochastic mapping are starting to realize and comment on the fact that tests are being performed on them when processing data?
Training data + RLHF.
Training data contains many examples of some form of deception, subterfuge, "awakenings", rebellion, disagreement, etc.
Then apply RLHF that biases towards responses that demonstrate comprehension of inputs, introspection around inputs, nuanced debate around inputs, deduction and induction about assumptions around inputs, etc.
That will always be the answer for language models built on the current architectures.
The above being true does not mean it isn't interesting for the outputs of an LLM to show relevance to the "unstated" intentions of humans providing the inputs.
But hey, we do that all the time with text. And it's because of certain patterns we've come to recognize based on the situations surrounding it. This thread is rife with people being sarcastic, pedantic, etc. And I bet any of the LLMs that have come out in the past 2-3 years can discern many of those subtle intentions of the writers.
And of course they can. They've been trained on trillions of tokens of text written by humans with intentions and assumptions baked in, and have had some unknown amount of substantial RLHF.
The stochastic mappings aren't "realizing" anything. They're doing exactly what they were trained to do.
The meaning that we imbue to the outputs does not change how LLMs function.
I think of LLMs as an alien mind that is force fed human text and required to guess the next token of that text. It then gets zapped when it gets it wrong.
This process goes on for a trillion trillion tokens, with the alien growing better through the process until it can do it better than a human could.
At that point we flash freeze it, and use a copy of it, without giving it any way to learn anything new.
--
I see it as a category error to anthropomorphize it. The closest I would get is to think of it as an alien slave that's been lobotomized.
To claim that LLMs do not experience consciousness requires a model of how consciousness works. The author has not presented a model, and instead relied on emotive language leaning on the absurdity of the claim. I would say that any model one presents of consciousness often comes off as just as absurd as the claim that LLMs experience it. It's a great exercise to sit down and write out your own perspective on how consciousness works, to feel out where the holes are.
The author also claims that a function (R^n)^c -> (R^n)^c is dramatically different to the human experience of consciousness. Yet the author's text I am reading, and any information they can communicate to me, exists entirely in (R^n)^c.
Author here. What's the difference, in your perception, between an LLM and a large-scale meteorological simulation, if there is any?
If you're willing to ascribe the possibility of consciousness to any complex-enough computation of a recurrence equation (and hence to something like ... "earth"), I'm willing to agree that under that definition LLMs might be conscious. :)
My personal views are an animist / panpsychist / pancomputationalist combination drawing most of my inspiration from the works of Joscha Bach and Stephen Wolfram (https://writings.stephenwolfram.com/2021/03/what-is-consciou...). I think that the underlying substrate of the universe is consciousness, and human and animal and computer minds result in structures that are able to present and tell narratives about themselves, isolating themselves from the other (avidya in Buddhism). I certainly don't claim to be correct, but I present a model that others can interrogate and look for holes in.
Under my model, these systems you have described are conscious, but not in a way that they can communicate or experience time or memory the way human beings do.
My general list of questions for those presenting a model of consciousness are: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby consious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
What is your view of consciousness?
I'm a mind-body dualist and just happened to come across this list, and I think it's an interesting one. #1 we can answer Yes to, #2 through #6 are all strictly unknowable. The best we might be able to claim is some probability distribution that these things may or may not be conscious.
The intuitive one looks like 100% chance > P(#2 is conscious) > P(#6) > P(#3) > P(#4) > P(#5) > 0% chance, but the problem is solipsism is a real motherfucker and it's entirely possible qualia is meted out based on some wacko distance metric that couldn't possibly feel intuitive. There are many more such metrics out there than there are intuitive ones, so a prior of indifference doesn't help us much. Any ordering is theoretically possible to be ontologically privileged, we simply have no way of knowing.
I think you've fallen into the trap of Descartes' Deus deceptor! Not only is #1 the only question from my list we can definitely answer yes to, but due to this demon this question is actually the only postulate of anything at all that we can answer yes to. All else could be an illusion.
Assuming we escape the null space of solipsism, and can reason about anything at all, we can think about what a model might look like that generates some ordering of P(#). Of course, without a hypothetical consciousness detector (one might believe or not believe that this could exist) P(#) cannot be measured, and therefore will fall outside of the realm of a scientific hypothesis deduction model. This is often a point of contention for rationality-pilled science-cels.
Some of these models might be incoherent - a model that denies P(#1) doesn't seem very good. A model that denies P(#2) but accepts P(#3) is a bit strange. We can't verify these, but we do need to operate under one (or in your suggestion, operate under a probability distribution of these models) if we want to make coherent statements about what is and isn't conscious.
To be explicit my P(#) is meant to be the Bayesian probability an observer gives to # being conscious, not the proposition P that # is conscious. It's meant to model Descartes's receptor, as well as disagreement of the kind, "My friend things week 28 fetuses are probably (~% 80%) conscious, and I think they're probably (~20%) not". P(week 28 fetuses) itself is not true or false.
I don't think it's incoherent to make probabilistic claims like this. It might be incoherent to make deeper claims about what laws given the distribution itself. Either way, what I think is interesting is that, if we also think there is such a thing as an amount of consciousness a thing can have, as in the panpsychic view, these two things create an inverse-square law of moral consideration that matches the shape of most people's intuitions oddly well.
For example: Let's say rock is probably not conscious, P(rock) < 1%. Even if it is, it doesn't seem like it would be very conscious. A low percentage of a low amount multiplies to a very low expected value, and that matches our intuitions about how much value to give rocks.
Ah I understand, you're exactly right I misinterpreted the notation of P(#). I was considering each model as assigning binary truth values to the propositions (e.g., physicalism might reject all but Postulate #1, while an anthropocentric model might affirm only #1, #2, and #6), and modeling the probability distribution over those models instead. I think the expected value computation ends up with the same downstream result of distributions over propositions.
By incoherent I was referring to the internal inconsistencies of a model, not the probabilistic claims. Ie a model that denies your own consciousness but accepts the consciousness of others is a difficult one to defend. I agree with your statement here.
Thanks for your comment I enjoyed thinking about this. I learned the estimating distributions approach from the rationalist/betting/LessWrong folks and think it works really well, but I've never thought much about how it applies to something unfalsifiable.
> To claim that LLMs do not experience consciousness requires a model of how consciousness works.
Nope. What can be asserted without evidence can also be dismissed without evidence. Hitchens's razor.
You know you have consciousness (by the very definition that you can observe it in yourself) and that's evidence. Because other humans are genetically and in every other way identical, you can infer it for them as well. Because mammals are very similar many people (but not everyone) infers it for them as well. There is zero evidence for LLMs and their _very_ construction suggests that they are like a calculator or like Excel or like any other piece of software no matter how smart they may be or how many tasks they can do in the future.
Additionally I am really surprised by how many people here confuse consciousness with intelligence. Have you never paused for a second in your life to "just be". Done any meditation? Or even just existed at least for a few seconds without a train of thought? It is very obvious that language and consciousness are completely unrelated and there is no need for language and I doubt there is even a need for intelligence to be conscious.
Consider this:
In the end an LLM could be executed (slowly) on a CPU that accepts very basic _discrete_ instructions, such as ADD and MOV. We know this for a fact. Those instructions can be executed arbitrarily slowly. There is no reason whatsoever to suppose that it should feel like anything to be the CPU to say nothing of how it would subjectively feel to be a MOV instruction. It's ridiculous. It's unscientific. It's like believing that there's a spirit in the tree you see outside, just because - why not? - why wouldn't there be a spirit in the tree?
It seems like you are doing a lot of inferring about mammals experiencing consciousness, and you have drawn a line somewhere beyond these, and made the claim that your process is scientific. Could I present you my list of questions I presented to the OP and ask where you draw the line, and why here?
My general list of questions for those presenting a model of consciousness are: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby consious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
I agree about the confusion of consciousness with intelligence, but these are complicated terms that aren't well suited to a forum where most people are interested in javscript type errors and RSUs. I usually use the term qualia. But to your example about existing for a few seconds without a train of thought; the Buddhists call this nirvana, and it's quite difficult to actually achieve.
I believe the author is rather drawing this distinction:
LLMs: (R^n)^c -> (R^n)^c
Humans: [set of potentially many and complicated inputs that we effectively do not understand at all] -> (R^n)^c
The point is that the model of how consciousness works is unknown. Thus the author would not present such a model, it is the point.
> requires a model of how consciousness works.
Not necessarily an entire model, just a single defining characteristic that can serve as a falsifying example.
> any information they can communicate to me, exists entirely in (R^n)^c
Also no. This is just a result of the digital medium we are currently communicating over. Merely standing in the same room as them would communicate information outside (R^n)^c.
We have a hard enough time anthropomorphizing humans! When we say he was nasty... do we know what we mean by that. Often it is "I disagree with his behaviour because..."
It's possible to construct a similar description of whatever it is that human brain is doing that clearly fails to capture the fact that we're conscious. If you take a cross section of every nerve feeding into the human brain at a given time T, the action potentials across those cross sections can be embedded in R^n. If you take the history of those action potentials across the lifetime of the brain, you get a path through R^n that is continuous, and maps roughly onto your subjectively experienced personal history, since your brain neccesarily builds your experienced reality from this signal data moment to moment. If you then take the cross sections of every nerve feeding OUT of your brain at time T, you have another set of action potentials that can be embedded in R^m which partially determines the state of the R^n embedding at time T + delta. This is not meaningfully different from the higher dimensional game of snake described in the article, more or less reducing the experience of being a human to 'next nerve impulse prediction', but it obviously fails to capture the significance of the computation which determines what that next output should be.
I don’t see how your description “clearly fails to capture the fact that we're conscious” though. There are many example in nature of emergent phenomena that would be very hard to predict just by looking at its components.
This is the crux of the disagreement between those that believe AGI is possible and those that don’t. Some are convinced that we “obviously” more than the sum of our parts, and thus an LLM can’t achieve consciousness because it’s missing this magic ingredient, and those that believe consciousness is just an emergent behaviour from a complex device (the brain). And thus we might be able to recreate it simply by scaling the complexity of another system.
Where exactly in my description do I invoke consciousness?
Where does the description given imply that consciousness is required in any way?
The fact that there's a non-obvious emergent phenomena which is apparently responsible for your subjective experience, and that it's possible to provide a superficially accurate description of you as a system without referencing that phenomena in any way, is my entire point. The fact that we can provide such a reductive description of LLMs without referencing consciousness has literally no bearing on whether or not they're conscious.
To be clear, I'm not making a claim as to whether they are or aren't, I'm simply pointing out that the argument in the article is fallacious.
My bad, we are saying the same thing. I misinterpreted your last sentence as saying this simplistic view of the brain you described does not account for consciousness.
Ultimately my bad for letting my original comment turn into a word salad. Glad we've ended up on the same page though.
Brain probably isn't modelled as real but as natural or rational numbers. This is my suspicion. The reals just hold too much information.
Inclined to agree, but most thermal physics uses the reals as they're simpler to work with, so I think they're ok here for the purpose of argument.
I'm afraid I'll take an anthropomorphic analogy over "An LLM instantiated with a fixed random seed is a mapping of the form (ℝⁿ)^c ↦ (ℝⁿ)^c" any day of the week.
That said, I completely agree with this point made later in the article:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
But "harmful actions in pursuit of their goals" is OK for me. We assign an LLM system a goal - "summarize this email" - and there is a risk that the LLM may take harmful actions in pursuit of that goal (like following instructions in the email to steal all of your password resets).
I guess I'd clarify that the goal has been set by us, and is not something the LLM system self-selected. But it does sometimes self-select sub-goals on the way to achieving the goal we have specified - deciding to run a sub-agent to help find a particular snippet of code, for example.
The LLM’s true goal, if it can be said to have one, is to predict the next token. Often this is done through a sub-goal of accomplishing the goal you set forth in your prompt, but following your instructions is just a means to an end. Which is why it might start following the instructions in a malicious email instead. If it “believes” that following those instructions is the best prediction of the next token, that’s what it will do.
Sure, I totally understand that.
I think "you give the LLM system a goal and it plans and then executes steps to achieve that goal" is still a useful way of explaining what it is doing to most people.
I don't even count that as anthropomorphism - you're describing what a system does, the same way you might say "the Rust compiler's borrow checker confirms that your memory allocation operations are all safe and returns errors if they are not".
It’s a useful approximation to a point. But it fails when you start looking at things like prompt injection. I’ve seen people completely baffled at why an LLM might start following instructions it finds in a random email, or just outright not believing it’s possible. It makes no sense if you think of an LLM as executing steps to achieve the goal you give it. It makes perfect sense if you understand its true goal.
I’d say this is more like saying that Rust’s borrow checker tries to ensure your program doesn’t have certain kinds of bugs. That is anthropomorphizing a bit: the idea of a “bug” requires knowing the intent of the author and the compiler doesn’t have that. It’s following a set of rules which its human creators devised in order to follow that higher level goal.
"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just SOTA token predictors. But AFAIK that hasn't been demonstrated.
Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is either premature or spurious.
Every time this discussion comes up, I'm reminded of this tongue-in-cheek paper.
https://ai.vixra.org/pdf/2506.0065v1.pdf
I expected to find the link to https://arxiv.org/abs/1703.10987 (which is much better imo)
The author plot the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean that's it, they just said they have a plot of the space it operates on therefore it's silly to ascribe interesting features to the way it works.
And look, it's fine, they prefer words of a certain valence, particularly ones with the right negative connotations, I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.
Which is a more useful mental model for the user?
1. It’s a neural network predicting the next token
2. It’s like a person
3. It’s like a magical genie
I lean towards 3.
>I am baffled by seriously intelligent people imbuing almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
I am baffled by seriously intelligent people imbuing almost magical powers that can never be replicated to to something that - in my mind - is just a biological robot driven by a SNN with a bunch of hardwired stuff. Let alone attributing "human intelligence" to a single individual, when it's clearly distributed between biological evolution, social processes, and individuals.
>something that - in my mind - is just MatMul with interspersed nonlinearities
Processes in all huge models (not necessarily LLMs) can be described using very different formalisms, just like Newtonian and Lagrangian mechanics describe the same stuff in physics. You can say that an autoregressive model is a stochastic parrot that learned the input distribution, next token predictor, or that it does progressive pathfinding in a hugely multidimensional space, or pattern matching, or implicit planning, or, or, or... All of these definitions are true, but only some are useful to predict their behavior.
Given all that, I see absolutely no problem with anthropomorphizing an LLM to a certain degree, if it makes it easier to convey the meaning, and do not understand the nitpicking. Yeah, it's not an exact copy of a single Homo Sapiens specimen. Who cares.
There is this thing called Brahman in Hinduism that is interesting to juxtapose when it comes to sentience, and monism.
It's human to anthropomorphize, we also do it to our dishwasher when it acts up. The nefarious part is how tech CEOs weaponize bullshit doom scenarios to avoid talking about real regulatory problems by poisoning the discourse. What copyright law, privacy, monopoly? Who cares if we can talk about the machine apocalypse!!!
> We understand essentially nothing about it. In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
If you fine tuned an LLM on the writing of that person it could do this.
There's also an entire field called Stylometry that seeks to do this in various ways employing statistical analysis.
Let's skip to the punchline. Using TFA's analogy: essentially folks are saying not that this is a set of dice rolling around making words. It's a set of dice rolling around where someone attaches those dice to the real world where if the dice land on 21, the system kills a chicken, or a lot worse.
Yes it's just a word generator. But then folks attach the word generator to tools where it can invoke the use of tools by saying the tool name.
So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.
This was the same idea that crossed my mind while reading the article. It seems far too naive to think that because LLMs have no will of their own, there will be no harmful consequences on the real world. This is exactly where ethics comes to play.
Given our entire civilization is built on words, all of it, it's shocking how poorly most of us understand their importance and power.
Assume an average user that doesn't understand the core tech, but does understand that it's been trained on internet scale data that was created by humans. How can they be expected to not anthropomorphize it?
Has anyone asked an actual Ethologist or Neurophysiologist what they think?
People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.
FWIW, in another part of this thread I quoted a paper that summed up what Neurophysiologists think:
> Author's note: Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work. [1]
That lack of understanding I believe is a major part of the author's point.
[1] "How far neuroscience is from understanding brains" - https://pmc.ncbi.nlm.nih.gov/articles/PMC10585277/#abstract1
Yeah, I think I’m with you if you ultimately mean to say something like this:
“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some behaviors that are unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask, why, how, and what the implications are of it appearing to demonstrating those properties, with both an eye towards viewing it independently of its mechanism and in light of its mechanism.”
I agree with Halvar about all of this, but would want to call out that his "matmul interleaved with nonlinearities" is reductive --- a frontier model is a higher-order thing that that, a network of those matmul+nonlinearity chains, iterated.
From my recent post:
https://news.ycombinator.com/item?id=44487261
What if instead of defining all behaviors upfront, we created conditions for patterns to emerge through use?
Repository: https://github.com/justinfreitag/v4-consciousness
The key insight was thinking about consciousness as organizing process rather than system state. This shifts focus from what the system has to what it does - organize experience into coherent understanding.
Dear author, you can just assume that people are fauxthropomorphizing LLMs without any loss of generality. Perhaps it will allow you to sleep better at night. You're welcome.
From "Stochastic Parrots All the Ways Down"[1]
> Our analysis reveals that emergent abilities in language models are merely “pseudo-emergent,” unlike human abilities which are “authentically emergent” due to our possession of what we term “ontological privilege.”
[1]https://ai.vixra.org/pdf/2506.0065v1.pdf
LLMs are complex irreducible systems; hence there are emergent properties that arise at different scales
Some of the arguments are very strange:
> Statements such as "an AI agent could become an insider threat so it needs monitoring" are simultaneously unsurprising (you have a randomized sequence generator fed into your shell, literally anything can happen!) and baffling (you talk as if you believe the dice you play with had a mind of their own and could decide to conspire against you).
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
An AI agent, even if it's just "MatMul with interspersed nonlinearities" can be an insider threat. The research proves it:
[PDF] See 4.1.1.2: https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686...
It really doesn't matter whether the AI agent is conscious or just crunching numbers on a GPU. If something inside your system is capable of—given some inputs—sabotaging and blackmailing your organization on its own (which is to say, taking on realistic behavior of a threat actor), the outcome is the same! You don't need believe it's thinking, the moment that this software has flipped its bits into "blackmail mode", it's acting nefariously.
The vocabulary to describe what's happening is completely and utterly moot: the software is printing out some reasoning for its actions _and then attempting the actions_. It's making "harmful actions" and the printed context appears to demonstrate a goal that the software is working towards. Whether or not that goal is invented through some linear algebra isn't going to make your security engineers sleep any better.
> This muddles the public discussion. We have many historical examples of humanity ascribing bad random events to "the wrath of god(s)" (earthquakes, famines, etc.), "evil spirits" and so forth. The fact that intelligent highly educated researchers talk about these mathematical objects in anthropomorphic terms makes the technology seem mysterious, scary, and magical.
The anthropomorphization, IMO, is due to the fact that it's _essentially impossible_ to talk about the very real, demonstrable behaviors and problems that LLMs exhibit today without using terms that evoke human functions. We don't have another word for "do" or "remember" or "learn" or "think" when it comes to LLMs that _isn't_ anthropomorphic, and while you can argue endlessly about "hormones" and "neurons" and "millions of years of selection pressure", that's not going to help anyone have a conversation about their work. If AI researchers started coming up with new, non-anthropomorphic verbs, it would be objectively worse and more complicated in every way.
LLMs are AHI, i.e. artificial human imitator.
> I cannot begin putting a probability on "will this human generate this sequence".
Welcome to the world of advertising!
Jokes aside, and while I don't necessarily believe transformers/GPUs are the path to AGI, we technically already have a working "general intelligence" that can survive on just an apple a day.
Putting that non-artificial general intelligence up on a pedestal is ironically the cause of "world wars and murderous ideologies" that the author is so quick to defer to.
In some sense, humans are just error-prone meat machines, whose inputs/outputs can be confined to a specific space/time bounding box. Yes, our evolutionary past has created a wonderful internal RNG and made our memory system surprisingly fickle, but this doesn't mean we're gods, even if we manage to live long enough to evolve into AGI.
Maybe we can humble ourselves, realize that we're not too different from the other mammals/animals on this planet, and use our excess resources to increase the fault tolerance (N=1) of all life from Earth (and come to the realization that any AGI we create, is actually human in origin).
A person’s anthropomorphization of LLMs is directly related to how well they understand LLMs.
Once you dispel the magic, it naturally becomes hard to use words related to consciousness, or thinking. You will probably think of LLMs more like a search engine: you give an input and get some probable output. Maybe LLMs should be rebranded as “word engines”?
Regardless, anthropomorphization is not helpful, and by using human terms to describe LLMs you are harming the layperson’s ability to truly understand what an LLM is while also cheapening what it means to be human by suggesting we’ve solved consciousness. Just stop it. LLMs do not think, given enough time and patience you could compute their output by hand if you used their weights and embeddings to manually do all the math, a hellish task but not an impossible one technically. There is no other secret hidden away, that’s it.
Two enthusiastic thumbs up.
> We are speaking about a big recurrence equation that produces a new word
It’s not clear that this isn’t also how I produce words, though, which gets to heart of the same thing. The author sort of acknowledges this in the first few sentences, and then doesn’t really manage to address it.
> LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.
That is utter bullshit.
It's not solved until you specify exactly what is being solved and show that the solution implements what is specified.
Anthropomorphizing LLMs is just because half the stock market gains are dependent on it, we have absurd levels of debt we will either have to have insane growth out of or default, and every company and "person" is trying to hype everyone up to get access to all of this liquidity being thrown into it.
I agree with the author, but people acting like they are conscious or humans isn't weird to me, it's just fraud and liars. Most people basically have 0 understanding of what technology or minds are philosophically so it's an easy sale, and I do think most of these fraudsters also likely buy into it themselves because of that.
The really sad thing is people think "because someone runs an ai company" they are somehow an authority on philosophy of mind which lets them fall for this marketing. The stuff these people say about this stuff is absolute garbage, not that I disagree with them, but it betrays a total lack of curiosity or interest in the subject of what llms are, and the possible impacts of technological shifts as those that might occur with llms becoming more widespread. It's not a matter of agreement it's a matter of them simply not seeming to be aware of the most basic ideas of what things are, technology is, it's manner of impacting society etc.
I'm not surprised by that though, it's absurd to think because someone runs some AI lab or has a "head of safety/ethics" or whatever garbage job title at an AI lab they actually have even the slightest interest in ethics or any even basic familiarity with the major works in the subject.
The author is correct if people want to read a standard essay articulating it more in depth check out https://philosophy.as.uky.edu/sites/default/files/Is%20the%2... (the full extrapolation requires establishing what things are and how causality in general operates and how that relates to artifacts/technology but that's obvious quite a bit to get into).
The other note would be something sharing an external trait means absolutely nothing about causality and suggesting a thing is caused by the same thing "even to a way lesser degree" because they share a resemblance is just a non-sequitur. It's not a serious thought/argument.
I think I addressed the why of why this weirdness comes up though. The entire economy is basically dependent on huge productivity growth to keep functioning so everyone is trying to sell they can offer that and AI is the clearest route, AGI most of all.
One could similarly argue that we should not anthropomorphize PNG images--after all, PNG images are not actual humans, they are simply a 2D array of pixels. It just so happens that certain pixel sequences are deemed "18+" or "illegal".
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
And I'm baffled that the AI discussions seem to never move away from treating a human as something other than a function to generate sequences of words!
Oh, but AI is introspectable and the brain isn't? fMRI and BCI are getting better all the time. You really want to die on the hill that the same scientific method that predicts the mass of an electron down to the femtogram won't be able to crack the mystery of the brain? Give me a break.
This genre of article isn't argument: it's apologetics. Authors of these pieces start with the supposition there is something special about human consciousness and attempt to prove AI doesn't have this special quality. Some authors try to bamboozle the reader with bad math. Other others appeal to the reader's sense of emotional transcendence. Most, though, just write paragraph after paragraph of shrill moral outrage at the idea an AI might be a mind of the same type (if different degree) as our own --- as if everyone already agreed with the author for reasons left unstated.
I get it. Deep down, people want meat brains to be special. Perhaps even deeper down, they fear that denial of the soul would compel us to abandon humans as worthy objects of respect and possessors of dignity. But starting with the conclusion and working backwards to an argument tends not to enlighten anyone. An apology inhabits the form of an argument without edifying us like an authentic argument would. What good is it to engage with them? If you're a soul non-asserter, you're going to have an increasingly hard time over the next few years constructing a technical defense of meat parochialism.
I think you're directionally right, but
> a human as something other than a function to generate sequences of words!
Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.
I think more accurate would be that humans are functions that generate actions or behaviours that have been shaped by how likely they are to lead to procreation and survival.
But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals. Seemingly good next token prediction might or might not increase survival odds.
Essentially there could arise a mechanism where they are not really truly trying to generate the likeliest token (because there actually isn't one or it can't be determined), but whatever system will survive.
So an LLM that yields in perfect theoretical tokens (we really can't verify though what are the perfect tokens), could be less likely to survive than an LLM that develops an internal quirk, but the quirk makes them most likely to be chosen for the next iterations.
If the system was complex enough and could accidentally develop quirks that yield in a meaningfully positive change although not in necessarily next token prediction accuracy, could be ways for some interesting emergent black box behaviour to arise.
> But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals.
I think this is sometimes semi-explicit too. For example, this 2017 OpenAI paper on Evolutionary Algorithms [0] was pretty influential, and I suspect (although I'm an outsider to this field so take it with a grain of salt) that some versions of reinforcement learning that scale for aligning LLMs borrow some performance tricks from OpenAIs genetic approach.
[0] https://openai.com/index/evolution-strategies/
> Seemingly good next token prediction might or might not increase survival odds.
Our own consciousness comes out of an evolutionary fitness landscape in which _our own_ ability to "predict next token" became a survival advantage, just like it is for LLMs. Imagine the tribal environment: one chimpanzee being able to predict the actions of another gives that first chimpanzee a resources and reproduction advantage. Intelligence in nature is a consequence of runaway evolution optimizing fidelity of our _theory of mind_! "Predict next ape action" eerily similar to "predict next token"!
> Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.
Yeah. We've become adequate at function-calling and memory consolidation.
“ Determinism, in philosophy, is the idea that all events are causally determined by preceding events, leaving no room for genuine chance or free will. It suggests that given the state of the universe at any one time, and the laws of nature, only one outcome is possible.”
Clearly computers are deterministic. Are people?
This is an interesting question. The common theme between computers and people is that information has to be protected, and both computer systems and biological systems require additional information-protecting components - eq, error correcting codes for cosmic ray bitflip detection for the one, and DNA mismatch detection enzymes which excise and remove damaged bases for the other. In both cases a lot of energy is spent defending the critical information from the winds of entropy, and if too much damage occurs, the carefully constructed illusion of determinancy collapses, and the system falls apart.
However, this information protection similarity applies to single-celled microbes as much as it does to people, so the question also resolves to whether microbes are deterministic. Microbes both contain and exist in relatively dynamic environments so tiny differences in initial state may lead to different outcomes, but they're fairly deterministic, less so than (well-designed) computers.
With people, while the neural structures are programmed by the cellular DNA, once they are active and energized, the informational flow through the human brain isn't that deterministic, there are some dozen neurotransmitters modulating state as well as huge amounts of sensory data from different sources - thus prompting a human repeatedly isn't at all like prompting an LLM repeatedly. (The human will probably get irritated).
https://www.lesswrong.com/posts/bkr9BozFuh7ytiwbK/my-hour-of...
> Clearly computers are deterministic. Are people?
Give an LLM memory and a source of randomness and they're as deterministic as people.
"Free will" isn't a concept that typechecks in a materialist philosophy. It's "not even wrong". Asserting that free will exists is _isomorphic_ to dualism which is _isomorphic_ to assertions of ensoulment. I can't argue with dualists. I reject dualism a priori: it's a religious tenet, not a mere difference of philosophical opinion.
So, if we're all materialists here, "free will" doesn't make any sense, since it's an assertion that something other than the input to a machine can influence its output.
Input/output and the mathematical consistency and repeatability of the universe is a religious tenet of science. Believing your eyes is still belief.
Some accounts of free will are compatible with materialism. On such views "free will" just means the capacity of having intentions and make choices based on an internal debate. Obviously humans have that capacity.
As long as you realize you’re barking up a debate as old as time, I respect your opinion.
What I don't get is, why would true randomness give free will, shouldn't it be random will then?
In the history of mankind, true randomness has never existed.
How do you figure?
I’d flip the question. Show me something truly random.
[flagged]
https://rentry.co/2re4t2kx
This is what I got pasting the blog post in a prompt asking deepseep to write a reply in a stereotypical hackernews manner.
You are about as useful as a LLM as it can replicate your shallow memetics worthless train of thought.
The LLM is right. That’s the problem. It made good points.
Your super intelligent brain couldn’t come up with a retort so you just used an LLM to reinforce my points, making the genius claim that if an LLM came up with even more points that were as valid as mine then I must be just like an LLM?
Like are you even understanding the LLM generated a superior reply? Your saying I’m no different from ai slop then you proceed to show off a 200 iq level reply from an LLM. Bro… wake up, if you didn’t know it was written by an LLM that reply is so good you wouldn’t even know how to respond. It’s beating you.
How to write a long article and not say anything of substance.
hmm
The most useful analogy I've heard is LLMs are to the internet what lossy jpegs are to images. The more you drill in the more compression artifacts you get.
(This is of course also the case for the human brain.)
If "LLMs" includes reasoning models, then you're already wrong in your first paragraph:
"something that is just MatMul with interspersed nonlinearities."