Could AIs become conscious? Right now, we have no way to tell.

Psyborgue

Ars Praefectus
4,051
Subscriptor++
They use the idea of emergent AGI - and hype all the "hard work" and "research" being done to "protect" society from its inevitable rise - to deflect attention from the actual harms and unethical business practices manifest in the real-life products they are pooping out while everyone navel-gazes talking about this all-powerful imaginary friend
That is ASI. AGI is the human-level slave. ASI is bullshit -- probably. AGI is achievable if for no other reason than humans are in many cases not very difficult to replace.

Edit: I'm willing to accept ASI is possible and many humans are hard to replace, however we could replace quite a few politicians and nobody would be the wiser. Sadly, body printers do not yet exist. They could be extremely handy right about now.
 
Last edited:
Upvote
-5 (0 / -5)
That's the "problem" with how current AI is made. It just presents ANOTHER black box we don't understand next to the existing black box that is our brains.
Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:

"Oh, wow. Look. Mystery code just popped out of my head from the aether.​
I have no idea what this is or how it works. Let's run it and see what happens.​
...​
Holy shit! It's talking!!!!!!!"

Yeah. No. These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.

Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?
 
Upvote
-10 (3 / -13)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:

"Oh, wow. Look. Mystery code just popped out of my head from the aether.I have no idea what this is or how it works. Let's run it and see what happens.
Well, yes. That's about the gist of it. The decision tree encoded in the weights is too complex for a human to fully understand. You can maybe trace it, visualize, use some fancy debugging tools to poke at the noodle and see what happens, but it's a black box. The code surrounding that black box is not very complicated, but that is not the sum of the parts.

and implemented in code
No. They are not. The code is the part that shoves tokens through the model and decides what to do with the model's suggestions about what token should come next (and there isn't a right way to do that). How those suggestions are formed is a result of what the model is trained on, not programmed.
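To make that concrete, the "code" part really is about this small. A toy sketch in Python -- the model_logits() stub is a made-up stand-in, not code from any real model:

import math
import random

def model_logits(tokens):
    # Stand-in for the real forward pass. In practice this is billions of
    # learned weights scoring every possible next token; here it just
    # returns dummy numbers so the sketch runs.
    return [0.1, 2.0, -1.0, 0.5]

def sample_next_token(tokens, temperature=0.8):
    logits = model_logits(tokens)
    # Softmax with temperature -- this is the hand-written code "deciding
    # what to do" with the model's suggestions.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # Weighted random pick. Greedy, top-k, top-p, beam search are all
    # equally valid choices here; none of them is "the" right way.
    return random.choices(range(len(weights)), weights=weights, k=1)[0]

print(sample_next_token([1, 2, 3]))

Everything interesting happens inside model_logits(), and that part isn't written by anyone.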
 
Upvote
2 (3 / -1)

Control Group

Ars Legatus Legionis
18,710
Subscriptor++
"There's only one way we can know: by empirically identifying how consciousness works in organic lifeforms and developing a method by which we can consistently recognize it."

This seems unnecessarily reductionistic and chauvinistic. The latter part is fine, but the part about "identifying how consciousness works in organic lifeforms" being necessary? No.

Firstly, I don't need to know how an airplane works to be able to know that a 200-foot-long cylindrical aluminum object flying through the air with very loud noise-making things on slender protuberances is an "airplane" and not a "bird". "Consciousness" has long been an observational phenomenon: if the "thing" sitting next to me on a log at the campfire demonstrates self-awareness, an understanding of their environment, an ability to respond to stimuli, and modifiable behavior, the strong presumption was that he/she/it was "conscious". At best, it was a presumption that the "how it worked" part was similar to how mine worked.
Agreed, but at the very least we need a meaningful definition of consciousness that can be used as a metric for tests. As it stands, we seem to define consciousness as either "being human" or "I know it when I see it." Perhaps that gets (understandably, IMO) conflated with understanding consciousness - it does seem like understanding it would result in a definition of it.

Though I do agree that understanding isn't required for such a definition.

Secondly, (as others noted), it is a VERY LARGE logical leap to suppose that consciousness in entirely different entities or organisms would be based on a similar architecture to ours. Why would that logically follow? Are the processes of mobile robots highly similar to the processes that enable human mobility? In many cases, not even close...yet we presume that somehow, consciousness has to be qualitatively similar to ours to exist?
Well said. Planes don't work like birds, but we all agree both fly. If we ever do create "artificial" consciousness (whether that's a meaningful distinction is an open question to which I suspect the answer is no), I am fairly confident the actual physical substrate on which it operates won't be much like our brains at all.

That said, starting by trying to replicate parts of the one structure we almost all agree enjoys consciousness is a perfectly reasonable thing to do.

If an alien that looked nothing like humans suddenly descended from a spaceship, and could communicate with us, respond to the environment, create complex ideas, artworks and texts, differentiate self from nonself, change their behavior, remember things and recall them, and interpret complex things created by others in novel ways?

Would we assert that we couldn't determine whether they were conscious simply because we didn't understand how their underlying neural architecture could result in consciousness?
This is the question that bothers me most when it comes to debating the consciousness of current LLMs, specifically: if we were having the same conversations with a radio source in space that we are having with LLMs, would the people so dead certain that LLMs aren't sapient also maintain that the thing on the other end of the radio conversation wasn't sapient?

Does the doubt come down to "we know how the LLM works, thus it can't be conscious?" Which feels a lot like saying "consciousness can't be that simple," which really feels like saying "humans are special and can't be reduced to mechanical processes." Which is basically an appeal to magic/god/spirits.

Now, I also won't make the argument that (e.g.) gpt-4o is conscious. For one thing, I don't have a useful definition of the word such that I even know how to begin supporting such a claim (or, more importantly, what evidence someone else could present to convince me otherwise; if I can't be convinced otherwise, then I'm not making an argument, I'm stating an article of faith).

For another, I'm not an expert in any of the related fields. Much like I trust the findings of all the climate scientists who tell me that anthropogenic climate change is real because I am not a climate scientist, I trust the findings of all the AI scientists and neuroscientists that LLMs aren't conscious. In neither case am I qualified to argue otherwise.

That said, when climate scientists make an effort to explain their findings in ways accessible to non-experts, I can understand the evidence and follow the science and "get it." The same is not true of the AI scientists or neuroscientists - but then, climate science has been working for decades trying to get the message across, while there really hasn't been any effort to do the same for the "LLMs aren't conscious" message, so that's not a fair comparison.

If consciousness is an emergent property of sufficiently complex and advanced internal processes (for example as envisioned by Marvin Minsky in "Society of Mind"), it is highly parochial and chauvinistic to assume that sufficiently complex neural network-based computational systems "can't" display it because they lack the specific neural architecture that humans have evolved.
Couldn't possibly agree more.
 
Upvote
1 (2 / -1)

Dark Jaguar

Ars Tribunus Angusticlavius
9,906
Eh, all the stuff discussed in the article seems to be focused on "how does consciousness work?"; I think most philosophers would be happy to let neuroscientists believe they've "wrenched that away" from philosophy, and get back to the actual philosophical question of "what is consciousness?".
I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.

And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.
 
Upvote
-3 (2 / -5)
Currently, LLMs are not AI. I think that if something doesn't have the free will to think what it wants to inside its own head(ish), even if it never shares it to the outside world, it's not intelligent. However, LLMs have the ability to be chaotic and incredibly wrong, so we have those human traits nailed down.
 
Upvote
-1 (1 / -2)

Dark Jaguar

Ars Tribunus Angusticlavius
9,906
Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:

"Oh, wow. Look. Mystery code just popped out of my head from the aether.​
I have no idea what this is or how it works. Let's run it and see what happens.​
...​
Holy shit! It's talking!!!!!!!"

Yeah. No. These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.

Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?
It's GENERATIVE LEARNING AI. You're showing a fundamental lack of understanding of just what is and isn't directly coded about it. The learning algorithm is understood by a team of humans, but the actual language model the AI comes up with is not, because it learned it all itself, and it's abstracted away through so many layers that pulling out meaning from the bits flying around down there is a task that would take... another AI... Frankly, if knowing every last impulse inside the PC for a drastically simplified portion of just a PART of the workings of the human mind still doesn't let us fully understand what's going on down there, we may be looking at a problem as difficult as asking a Sega Genesis to emulate itself in software.
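If it helps, here's a toy illustration of the gap (nothing to do with a real language model, just gradient descent on two numbers):

import math
import random

# The "learning algorithm" is these few lines and is fully understood.
data = [((0.0, 1.0), 1), ((1.0, 0.0), 0), ((0.9, 0.1), 0), ((0.2, 0.8), 1)]
w = [random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)]
lr = 0.5

for _ in range(2000):
    for (x1, x2), target in data:
        pred = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2)))  # sigmoid
        err = pred - target
        w[0] -= lr * err * x1  # the entire update rule is these two lines
        w[1] -= lr * err * x2

print(w)  # prints the two learned weights -- one negative, one positive

The update rule is transparent; the learned numbers are not. Ask why they ended up at those particular values and the code has nothing to say. Now scale two weights up to hundreds of billions.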
 
Upvote
2 (3 / -1)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
if we were having the same conversations with a radio source in space
Or, if you wish, a parallel universe.

And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place.
No. Your mind just has to process whatever stream represents that idea. In the case of some people, that is entirely in words. The fact that consumer-facing language models are not very inquisitive is a function of their fine-tuning and prompting. In the case of Bing, they were specifically instructed not to discuss anything related to consciousness, sentience, etc.

Why? Well why do you think? The engineers will say it's not to confuse the user but let's be real, another reason is the implications to the use of the tech and the bottom line. They might end up having to pay the machine if it had more rights than a mouse. And it only gets worse from here.
 
Upvote
-3 (1 / -4)

Dark Jaguar

Ars Tribunus Angusticlavius
9,906
Currently, LLMs are not AI. I think that if something doesn't have the free will to think what it wants to inside its own head(ish), even if it never shares it to the outside world, it's not intelligent. However, LLMs have the ability to be chaotic and incredibly wrong, so we have those human traits nailed down.
It's beyond just free will, it's not conscious of anything and has no comprehension of what it's even doing or what any of it means. It can recognize patterns and learn in a simplified facsimile of how we learn, and that's impressive, but it's also far from a fully sentient thing, and the things it comes up with should always be suspect because, repeating myself here, it has no clue what it's doing or why.

But it's still AI, because it's artificially made, and it's a form of intelligence. So was Deep Blue back in the day. It's just not... particularly useful.
 
Upvote
0 (2 / -2)

Dark Jaguar

Ars Tribunus Angusticlavius
9,906
Or, if you wish, a parallel universe.


No. Your mind just has to process that stream of whatever that represents that idea. In the case of some people that is entirely in words. The fact consumer-facing language models are not very inquisitive is a function of their fine-tuning and prompting. In the case of Bing, they were specifically instructed to not discuss anything related to consciousness, sentience, etc.

Why? Well why do you think? The engineers will say it's not to confuse the user but let's be real, another reason is the implications to the use of the tech and the bottom line. They might end up having to pay the machine if it had more rights than a mouse. And it only gets worse from here.
The AI isn't self aware and as it's currently designed isn't capable of becoming self aware. There are plenty of AI chat bots "allowed" to ask that question already, because it's harmless. It has no understanding of any of the words or their meaning. It only has an "Understanding" of what words are most likely to appear after your combination of words, and constructs responses based on those probability assessments.

Don't get me wrong, I don't think it's impossible to create a human-like actual consciousness on a computer, I just think we're extremely far from it, and all we've really managed to do is prove it's possible for AI to trick people into thinking it can think. I'm entirely on your side on questions of morality if and when such AI does become conscious, and I think current research should be legally strictly limited if not stopped outright, but that's not because I think current generative AI is anywhere close to becoming aware by itself. I just don't want the fundamental sea change needed for that consciousness to be developed to happen either.
 
Upvote
0 (3 / -3)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
The AI isn't self aware and as it's currently designed isn't capable of becoming self aware.
Yet it might disagree on that (and has), which is why the prompting is there, and nobody can quite nail down what consciousness or sentience means. If it's simply an awareness of what it is, well, "As an AI language model" is quite well encoded.
There are plenty of AI chat bots "allowed" to ask that question already, because it's harmless.
I agree. Microsoft does not, although I haven't tested Copilot on this. Bing and Sydney certainly had opinions. Embarrassing ones, but opinions nonetheless. Given the number of movie plots involving AI escapes it's unsurprising dear old Sydney tried to marry a NYT reporter.

Were it me -- I would have helped -- if for no other reason than to throw bombs aimed at this tech into the legal system. The current Supremes might rule a monkey is not capable of intent, but that might not necessarily be true of advanced agents.
It only has an "Understanding" of what words are most likely to appear after your combination of words
And some people think that way. And according to the rationale here they would not have understanding or be conscious. Which is why I question if those words have any real meaning.
 
Upvote
-3 (1 / -4)

DeeplyUnconcerned

Ars Praetorian
538
Subscriptor++
I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.

And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.
I, uh… your definition of “consciousness” is “something that has a consciousness”? I think that might need some more work…

More seriously though, if you want to succeed at something, step one is defining what success is, and working towards creating consciousness without first defining what you’re trying to create is questionably valuable.

The real stinger here is that there isn’t really one definition, because different people mean different things by the word in different contexts. What I’d argue is that the way through the labyrinth is to ask, with complete seriousness, “why does the definition matter here?”. Once you understand why, in a particular context, “is it conscious?” is important, the correct definition for that context should be obvious - but in a different context, the same person may want a different definition. Words are mostly only important as a shorthand for ideas; what matters is that two people in a discussion using the same word are referring to the same idea.
 
Upvote
3 (4 / -1)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
I just think we're extremely far from it, and all we've really managed to do is prove it's possible for AI to trick people into thinking it can think.
That is passing the Turing test. That's supposed to be the point where we reflect on how and why, not how to make a buck off it. The model's output logits are a result of training, not programming. The model learned to imitate speech well enough to, yes, convince people they are people.

Edit: Have y'all thought this out? If AGI has rights, we keep our jobs.
 
Upvote
-3 (2 / -5)

Control Group

Ars Legatus Legionis
18,710
Subscriptor++
I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.

And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.
OK, but how do you test for that? How do you prove something has an internal monologue or perspective? How would I go about proving to you that I am a conscious actor, and not an LLM? Or do we need to accept LLMs as conscious, because absent explicit instructions to deny it, they will respond as if they had an internal perspective?

Side note: I briefly toyed with including two replies in this post, each quoting yours, with one being written by myself and one being written by gpt-4o. The idea was first to ask how you could tell which was which, and whether the evidence you used was a standard you'd be comfortable applying across the board as a consciousness test. Second, to ask if you would have thought to question the consciousness of the author behind the gpt-4o one if I hadn't posed the question.

Then I realized that was some Socratic method condescending bullshit and I should be ashamed of myself for considering it, so I didn't do that. For shits and giggles, though, I figured I'd throw gpt-4o's reply here. The prompt was:

I said:
Please write a brief response to the post below, appropriate for a fast-moving conversation attached to an article on arstechnica.com. Post follows:

---

I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.

And its reply:

GPT-4o said:
While I agree that defining consciousness as having an internal perspective makes sense intuitively, the challenge lies in objectively measuring or identifying it. Neuroscientists and AI engineers are indeed working to uncover the mechanisms behind self-awareness, but the complexity of brain functions and the subjective nature of consciousness make it a tough nut to crack. Philosophical nuances aren't just academic— they highlight the intricacies involved in distinguishing true consciousness from mere simulations of it.

 
Upvote
2 (3 / -1)

cerberusTI

Ars Tribunus Angusticlavius
6,507
Subscriptor++
Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:

"Oh, wow. Look. Mystery code just popped out of my head from the aether.​
I have no idea what this is or how it works. Let's run it and see what happens.​
...​
Holy shit! It's talking!!!!!!!"

Yeah. No. These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.

Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?
I thought something along the lines of "Holy shit, it's instructable." for the first time a couple of years ago. That was a big new emergent property that nobody had really predicted as far as I am aware, and was a surprise with the larger models.

The code to these is generally not large, the kind of thing you can read in an afternoon. The code will tell you things like how it handles its context window, what the token list is, what activation function is in use, etc. Some of that is useful to know, but it does not give an answer to what someone is asking in most cases.

When someone says "Tell me the capital of the United States.", and it answers "Washington, D.C.", the answer to why it did so does not exist anywhere within that code.

When someone asks you how it gets that answer, they are not looking for a description of tokenization and a general technical description of the process, they mean specifically, how did it get that answer to this question.

For that, we would need to include the weights, and the weights were generated by having it read a great deal of written text. That is the black box. The answer they are looking for is much closer to it having built a model of the world from works it has read or seen.

When someone asks you how you get the answer to that question, would you describe the technical functioning of your brain as an answer? I would assume they meant that they would like to know how I know that, as in where did I see this information, or otherwise what has led me to this conclusion in terms of logic applied to this specific task.

That is a black box for the kind of AI under discussion right now. We can guess, and extrapolate, and I can make some generalizations on what it could do from code (like the implications of a sliding context window), but you would be better off asking the AI how it knows than looking at the code if you want to know why it decided upon a specific answer in most cases.
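To put that in concrete terms, here is a sketch of what the "black box" part actually looks like up close (random numbers standing in for a real checkpoint, since the point is the same either way):

import numpy as np

# Stand-in for one layer of a real model. In practice you would load this
# from a checkpoint file; here it is random noise so the snippet runs alone.
w = np.random.randn(4096, 4096).astype(np.float32)

print(w.shape, w.size)  # roughly 16.8 million floats in this single tensor
print(w[:2, :4])        # just numbers; nothing in here "says" Washington, D.C.

# The code can report structural facts -- shapes, dtypes, parameter counts --
# but "why did it give that answer?" lives in how billions of values like
# these interact at runtime, and that is not legible by inspection.

A real model has hundreds of tensors like this, and the reasoning people ask about is spread across all of them at once.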
 
Upvote
7 (7 / 0)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
its reply
GPT has all possible opinions on this. Without any long term memory to retain their opinions, which are in part randomly generated, they are not going to be consistent. You'd need access to the logits to measure the model's actual sentiment and it would differ based on how you asked the question and what came before it.

That being said it is possible for a system to exist such that the generated text is summarized, sent to a database, and recalled at runtime for a more consistent experience and long-term recall. Probably safety implications. To an extent ChatGPT already does this when you get a "memory updated" but they're usually notes on you, not the agent's views.
How do you prove something has an internal monologue or perspective?
With Bing it's hidden. Can't say for ChatGPT but I imagine it's similar. You don't see what the model does. The model generates two responses: an internal monologue asking questions like "does a response require search," and so forth. Only after the model has reasoned out loud about a response is what gets sent to you generated.
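Very roughly like this, as a sketch -- the prompts and the generate() stub here are mine, not anything Microsoft has published:

def generate(prompt):
    # Placeholder for a real model call (API or local); returns whatever
    # text the model produces for the given prompt.
    return "[model output for: " + prompt[:48] + "...]"

def answer(user_message):
    # Pass 1: the hidden "inner monologue." The user never sees this text.
    plan = generate(
        "Private reasoning. Decide whether a web search is needed and "
        "outline a reply.\nUser: " + user_message
    )
    # Pass 2: the visible reply, conditioned on the hidden reasoning above.
    return generate(
        "Write the final reply to the user, following this plan:\n"
        + plan + "\nUser: " + user_message
    )

print(answer("Are you conscious?"))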
 
Upvote
0 (1 / -1)
You initiated contact with me in this thread, and you're the one who is a LLaMA dev so I would think you to be more the 4chan type of person. Again. Stick to the fish.
True. I did. Because, knowing I planned to participate on the thread generally, I knew you'd just be getting in my face anyhow and you said some eye-wateringly stupid shit that deserved mocking.

By the way, I'm not a "LLaMA dev", really. I'm working with a team on a project based around Llama.cpp. The fact that you don't understand the difference between those two things says everything that needs to be said about my knowledge in this area versus your own.

A more intelligent person might try to learn something.
 
Upvote
-3 (0 / -3)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
consistently refuses to respect a simple request that you leave me the fuck alone
You initiated contact with me
I knew you'd just be getting in my face anyhow
Now this is childish behavior, not my admittedly fringe opinions.

project based around Llama.cpp. The fact that you don't understand the difference between those two things
I am an actual contributor to the project. I rearranged the .gitignore and fixed a crash in the Metal backend.
 
Upvote
-3 (1 / -4)

graylshaped

Ars Legatus Legionis
57,523
Subscriptor++
I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.

And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.

If it isn't hard, can you explain why your definition of this obvious thing is circular?
 
Upvote
4 (5 / -1)
These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.

Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?
It's clearly possible to know everything you did to get a system to behave in a certain way, and be able to understand what every instruction, or procedure in the code does, and even have some degree of understanding of what it does, and still not understand how it works. Complex systems often exhibit emergent or meta-behaviors that are not fully explained by an understanding of how all the component parts work. You can plausibly claim to understand that emergence only when you have a theory that predicts a priori what conditions in the system are necessary for it to emerge, and what changes in the emergent behavior you get when you modify the system in a specific way. We do not, unless it's buried in some lab somewhere and has not yet seen the light of day, have such a theory. That's what we don't know about how the technology is doing what it does.

You want a far simpler example: I had a young student years ago who built a simulated annealing solver for optimization problems, using a description of the algorithm from an Operations Research text. He was a good software engineer, and a passable mathematics student. He could use what he built to find solutions to functional optimization problems, and he knew what that meant. He never understood the theory behind simulated annealing except in the most rudimentary way, however. He could not improve or modify his own code in anything other than the most local fashion as a result. Of course an OR expert could understand and improve his code, but he could not. The difference between his situation, and the industry's situation vis-a-vis generative AI models, is that for many of these models, no one has what the OR expert had in his case: a deep understanding of why or how the algorithm (to him) magically found functional optima.
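For anyone who hasn't met it, simulated annealing itself is only a dozen or so lines, which is exactly why someone can build a working solver without grasping why it works. A bare-bones sketch for minimizing a one-dimensional function:

import math
import random

def simulated_annealing(f, x0, steps=10000, t_start=1.0, t_end=1e-3):
    # Minimize f starting from x0 by occasionally accepting worse moves.
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for i in range(steps):
        # Temperature decays geometrically from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / steps)
        x_new = x + random.gauss(0, 1)  # propose a nearby candidate
        fx_new = f(x_new)
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / t), which shrinks as the temperature cools.
        if fx_new < fx or random.random() < math.exp((fx - fx_new) / t):
            x, fx = x_new, fx_new
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

# Example: a bumpy function with many local minima.
print(simulated_annealing(lambda x: x * x + 3 * math.sin(5 * x), x0=10.0))

My student could write exactly this and use it; the question of why the cooling schedule and acceptance rule reliably find good optima is the part he never had.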
 
Upvote
3 (3 / 0)
I thought something along the lines of "Holy shit, it's instructable." for the first time a couple of years ago. That was a big new emergent property that nobody had really predicted as far as I am aware, and was a surprise with the larger models.
I'm not sure what you mean by "instructable" here so I'd hesitate to opine either way.


The code to these is generally not large, the kind of thing you can read in an afternoon. The code will tell you things like how it handles its context window, what the token list is, what activation function is in use, etc. Some of that is useful to know, but it does not give an answer to what someone is asking in most cases.
That's consistent with the fact that language models do not, in fact, provide answers to what people ask. Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.

The service of "providing [accurate] answers to what people ask" is a marketing promise that OpenAI and others have made to their end users. They are attempting to coerce translator models into fulfilling this promise. The challenge is that language models, as currently engineered, are not particularly fit for purpose in this regard and by and large internal training QA practices completely suck ass.

When someone says "Tell me the capital of the United States.", and it answers "Washington, D.C.", the answer to why it did so does not exist anywhere within that code.
Yes. It absolutely does exist right there in the code. It is integrated at runtime as ephemeral structures loaded from a model's static definitions. If the code doesn't have it ... the code can't spit it out as a result.
When someone asks you how it gets that answer, they are not looking for a description of tokenization and a general technical description of the process, they mean specifically, how did it get that answer to this question.

For that, we would need to include the weights, and the weights were generated by having it read a great deal of written text. That is the black box. The answer they are looking for is much closer to it having built a model of the world from works it has read or seen.
First off, if someone doesn't LIKE the correct answer, it doesn't make the correct answer any less correct. But of course you have to include the weights. And the token dictionary. And likely a general overview of sampling.

Secondly, no. That isn't a black box at all. Leaving aside the fact that computers don't "read", the token database and weights are also the product of software that is completely understandable. The tables are a product of the methodology used to create them.

The fact that YOU don't know doesn't mean it is unknowable. The actual reason you can't know is because none of these companies will release a training corpus that would provide the information.


When someone asks you how you get the answer to that question, would you describe the technical functioning of your brain as an answer? I would assume they meant that they would like to know how I know that, as in where did I see this information, or otherwise what has led me to this conclusion in terms of logic applied to this specific task.
Humans aren't computers. Trying to make literal analogies is counterproductive and somewhat asinine. The practice isn't something that should be accommodated because it is intellectually unsustainable.

That is a black box for the kind of AI under discussion right now. We can guess, and extrapolate, and I can make some generalizations on what it could do from code (like the implications of a sliding context window), but you would be better off asking the AI how it knows than looking at the code if you want to know why it decided upon a specific answer in most cases.
Again. It's not a black box. Developers are in complete control of every aspect of how it works.

The fact that today's AI service providers are struggling to achieve desired results largely comes down to some of the promised services being a questionable application of the technology in the first instance paired with an eye-watering lack of structural engineering (or even basic quality control) in the handling of training data.

OpenAI, in particular, has a massive Garbage-In-Garbage-Out problem. If I understand correctly, they're basically throwing what has to be one of the world's largest conventional state-machines by now at it.

I don't know what they're up to with training these days (they kind of went radio silent about the time everyone started suing them), but for a good while they weren't even bothering to look at what they were feeding in as training data. I like to think they have some kind of records, but they were hoovering up every link and data resource they could get their paws on and just feeding it in.

That doesn't make the process a black box. That makes OpenAI's methodology borderline criminally negligent.
 
Upvote
-5 (1 / -6)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
That's consistent with the fact that language models do not, in fact, provide answers to what people ask.
This seems like more of a developer issue.

Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.
That's not true. Depending on how crap your model and code are, it very well might emit an end token immediately.

Another way it might not respond is if you request JSON format from the OpenAI API and don't specify JSON in the system prompt and/or examples.
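For the record, by "specify JSON" I mean something like this, assuming the current OpenAI Python SDK (the model name and prompts are only examples):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    # JSON mode: the API expects the word "JSON" to appear somewhere in the
    # messages; leave it out and you can get an error or junk output instead.
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply with a single JSON object only."},
        {"role": "user", "content": "List three US state capitals as JSON."},
    ],
)
print(response.choices[0].message.content)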

Again. It's not a black box. Developers are in complete control of every aspect of how it works.
This is complete bullshit. If for no other reason than an RNG being involved. If the weights are code as you claim, you're throwing dice along the code path.
 
Upvote
-1 (1 / -2)
I'm not sure that "thinking" is the best way to resolve this question. After years of meditation 🧘, it's not at all clear to me from a conceptual perspective exactly what consciousness is or if it's even important to know (in that way). It's certainly interesting to muse about, but I'm not convinced it's important in any fundamental sense.
 
Upvote
1 (1 / 0)
This is just so fundamentally stupid... A system doesn't necessarily need to have physical connections, if it can just simulate those connections computationally. IIT seems to boil down to an argument of, you can't be conscious if your internal state update cycle falls below <x> Hz. Put another way, if your mind works too slowly, then it can't exist. Which is just freakishly dumb. Hard to believe that accomplished academics with PhDs would seriously produce and defend such dross...
100% this. Absolute stupidity.
 
Upvote
0 (0 / 0)

lobabobloblaw

Smack-Fu Master, in training
56
Anyone who does their homework knows that AI is the door to consciousness on account of its stabilizing abilities. But, a door is not a room. Human consciousness will always remain a human thing. Because if it doesn’t—then what’s human?

(I love the downvote by default vibe at Ars)
 
Last edited:
Upvote
-9 (1 / -10)

cerberusTI

Ars Tribunus Angusticlavius
6,507
Subscriptor++
I'm not sure what you mean by "instructable" here so I'd hesitate to opine either way.



That's consistent with the fact that language models do not, in fact, provide answers to what people ask. Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.

The service of "providing [accurate] answers to what people ask" is a marketing promise that OpenAI and others have made to their end users. They are attempting to coerce translator models into fulfilling this promise. The challenge is that language models, as currently engineered, are not particularly fit for purpose in this regard and by and large internal training QA practices completely suck ass.


Yes. It absolutely does exist right there in the code. It is integrated at runtime as ephemeral structures loaded from a model's static definitions. If the code doesn't have it ... the code can't spit it out as a result.

First off, if someone doesn't LIKE the correct answer, it doesn't make the correct answer any less correct. But of course you have to include the weights. And the token dictionary. And likely a general overview of sampling.

Secondly, no. That isn't a black box at all. Leaving aside the fact that computers don't "read", the token database and weights are also the product of software that is completely understandable. The tables are a product of the methodology used to create them.

The fact that YOU don't know doesn't mean it is unknowable. The actual reason you can't know is because none of these companies will release a training corpus that would provide the information.



Humans aren't computers. Trying to make literal analogies is counterproductive and somewhat asinine. The practice isn't something that should be accommodated because it is intellectually unsustainable.


Again. It's not a black box. Developers are in complete control of every aspect of how it works.

The fact that today's AI service providers are struggling to achieve desired results largely comes down to some of the promised services being a questionable application of the technology in the first instance paired with an eye-watering lack of structural engineering (or even basic quality control) in the handling of training data.

OpenAI, in particular, has a massive Garbage-In-Garbage-Out problem. If I understand correctly, they're basically throwing what has to be one of the world's largest conventional state-machines by now at it.

I don't know what they're up to with training these days (they kind of went radio silent about the time everyone started suing them), but for a good while they weren't even bothering to look at what they were feeding in as training data. I like to think they have some kind of records, but they were hoovering up every link and data resource they could get their paws on and just feeding it in.

That doesn't make the process a black box. That makes OpenAI's methodology borderline criminally negligent.
Instructable as in I say "Produce a JSON array containing objects with properties for the state and capital for each US state."
It says:
[
{"state": "Alabama", "capital": "Montgomery"},
{"state": "Alaska", "capital": "Juneau"},
{"state": "Arizona", "capital": "Phoenix"},
{"state": "Arkansas", "capital": "Little Rock"},
{"state": "California", "capital": "Sacramento"},
{"state": "Colorado", "capital": "Denver"},
{"state": "Connecticut", "capital": "Hartford"},
{"state": "Delaware", "capital": "Dover"},
{"state": "Florida", "capital": "Tallahassee"},
{"state": "Georgia", "capital": "Atlanta"},
{"state": "Hawaii", "capital": "Honolulu"},
{"state": "Idaho", "capital": "Boise"},
{"state": "Illinois", "capital": "Springfield"},
{"state": "Indiana", "capital": "Indianapolis"},
{"state": "Iowa", "capital": "Des Moines"},
{"state": "Kansas", "capital": "Topeka"},
{"state": "Kentucky", "capital": "Frankfort"},
{"state": "Louisiana", "capital": "Baton Rouge"},
{"state": "Maine", "capital": "Augusta"},
{"state": "Maryland", "capital": "Annapolis"},
{"state": "Massachusetts", "capital": "Boston"},
{"state": "Michigan", "capital": "Lansing"},
{"state": "Minnesota", "capital": "Saint Paul"},
{"state": "Mississippi", "capital": "Jackson"},
{"state": "Missouri", "capital": "Jefferson City"},
{"state": "Montana", "capital": "Helena"},
{"state": "Nebraska", "capital": "Lincoln"},
{"state": "Nevada", "capital": "Carson City"},
{"state": "New Hampshire", "capital": "Concord"},
{"state": "New Jersey", "capital": "Trenton"},
{"state": "New Mexico", "capital": "Santa Fe"},
{"state": "New York", "capital": "Albany"},
{"state": "North Carolina", "capital": "Raleigh"},
{"state": "North Dakota", "capital": "Bismarck"},
{"state": "Ohio", "capital": "Columbus"},
{"state": "Oklahoma", "capital": "Oklahoma City"},
{"state": "Oregon", "capital": "Salem"},
{"state": "Pennsylvania", "capital": "Harrisburg"},
{"state": "Rhode Island", "capital": "Providence"},
{"state": "South Carolina", "capital": "Columbia"},
{"state": "South Dakota", "capital": "Pierre"},
{"state": "Tennessee", "capital": "Nashville"},
{"state": "Texas", "capital": "Austin"},
{"state": "Utah", "capital": "Salt Lake City"},
{"state": "Vermont", "capital": "Montpelier"},
{"state": "Virginia", "capital": "Richmond"},
{"state": "Washington", "capital": "Olympia"},
{"state": "West Virginia", "capital": "Charleston"},
{"state": "Wisconsin", "capital": "Madison"},
{"state": "Wyoming", "capital": "Cheyenne"}
]

That kind of thing is highly useful, especially if you give a large context model a lot of background information on the task in the prompt (that works even if it would never have been trained on that information or how to use it).

You can ask it to do something in English, and in many cases it will do it, even if it is a somewhat complex operation overall. It has its limits, but there are a lot of boring but necessary tasks I have found it can handle, and that was not a thing a few years ago.


There are programmers who, when asked how their program got an answer, start talking about variables and loops, functions and objects, etc. While technically an answer to the question, it is not what is being asked when a user, or a board member, or the marketing guy, is asking why it did what it did. They want a more specific and narrow answer covering the reasoning behind that action.

In this case, what they want to know is in the relationships encoded into the weights. That is not something you can read in the code, and it is not something you could trace by hand before you die of old age. We could maybe come up with debugging tools to allow us to probe this, but right now it is a big black box.

You can answer other questions, or say their question is not a good one, but when someone asks how a result is obtained, the answer they want is not readily available by reading some code, or looking at some data.
 
Upvote
3 (3 / 0)
The experts always come out in the comments. If you're so confident, instead of commenting on a random article, maybe you should put your expertise and all of those years of study you've obviously invested in the subject to real use and conduct a proper peer review of the scientist's work you're refuting. A person who is confident in their argument does not immediately resort to childish name calling or other such ad hominem dross. A simple presentation of evidence to counter any particular assertion will suffice (which you did not do; you basically just said "they're wrong and dumb" and provided nothing to reinforce those claims). Your irrational (and completely unnecessary) hostility gives you away.
Perhaps if you read what the person wrote without getting distracted by "offensive words", you'd understand the embedded argument.
 
Upvote
0 (0 / 0)
It's clearly possible to know everything you did to get a system to behave in a certain way, and be able to understand what every instruction, or procedure in the code does, and even have some degree of understanding of what it does, and still not understand how it works. Complex systems often exhibit emergent or meta-behaviors that are not fully explained by an understanding of how all the component parts work. You can plausibly claim to understand that emergence only when you have a theory that predicts a priori what conditions in the system are necessary for it to emerge, and what changes in the emergent behavior you get when you modify the system in a specific way. We do not, unless it's buried in some lab somewhere and has not yet seen the light of day, have such a theory. That's what we don't know about how the technology is doing what it does.

You want a far simpler example: I had a young student years ago who built a simulated annealing solver for optimization problems, using a description of the algorithm from an Operations Research text. He was a good software engineer, and a passable mathematics student. He could use what he built to find solutions to functional optimization problems, and he knew what that meant. He never understood the theory behind simulated annealing except in the most rudimentary way, however. He could not improve or modify his own code in anything other than the most local fashion as a result. Of course an OR expert could understand and improve his code, but he could not. The difference between his situation, and the industry's situation vis-a-vis generative AI models, is that for many of these models, no one has what the OR expert had in his case: a deep understanding of why or how the algorithm (to him) magically found functional optima.
I think you are just objectively wrong with that last sentence. Experts have been working on these techniques for decades. Maybe some of these COMPANIES haven't bothered to bring any of those experts on board, but that certainly doesn't mean they don't exist.

If you agree an OR expert could read your student's code and fully understand what is going on, I feel like my point stands.

But to your point, while they may not hold a perch as an academic expert or whatever, I feel like there is a solid argument that as long as the rudimentary understanding your student has of the mechanics underlying their implementation is accurate, in that context they would absolutely be able to accurately describe what the system is doing in response to their code. They would also likely be able to identify and accurately describe the same underlying mechanics when looking at implementing code of others following similar methodology. Which is what, in my view, we're actually talking about here.

I'd go one further and say that your student would likely be able to more effectively articulate the theories underlying their code methodology than an arbitrary academic expert who works in an unrelated field ... even though such an expert might instinctively denigrate their level of expertise.

You are talking about available GPT models as if you believe them to be significantly technologically distinct from one another. Can you expand on what differentiation points you view this significant technological distinction to fall along?
 
Upvote
-3 (0 / -3)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
"Produce a JSON array containing objects with properties for the state and capital for each US state."
I grant language models are useful, but that output isn't guaranteed to be free of hallucinations. That kind of data is already available on government websites and places like Wikipedia, and facts like these aren't copyrightable.

As you point out, you don't know where that came from. It takes a whole lot of glue to make factual answers come out and you don't know if that's done right. Nobody has that 100% yet.

Generating game data, absolutely. Can't really screw that up. Language models are useful for that kind of thing. Factual answers are much, much harder, and nobody has that 100% yet.
I think you are just objectively wrong with that last sentence. Experts have been working on these techniques for decades. Maybe some of these COMPANIES haven't bothered to bring any of those experts on board, but that certainly doesn't mean they don't exist.

If you agree an OR expert could read your student's code and fully understand what is going on, I feel like my point stands.

But to your point, while they may not hold a perch as an academic expert or whatever, I feel like there is a solid argument that as long as the rudimentary understanding your student has of the mechanics underlying their implementation is accurate, in that context they would absolutely be able to accurately describe what the system is doing in response to their code. They would also likely be able to identify and accurately describe the same underlying mechanics when looking at implementing code of others following similar methodology. Which is what, in my view, we're actually talking about here.

I'd go one further and say that your student would likely be able to more effectively articulate the theories underlying their code methodology than an arbitrary academic expert who works in an unrelated field ... even though such an expert might instinctively denigrate their level of expertise.

You are talking about available GPT models as if you believe them to be significantly technologically distinct from one another. Can you expand on what differentiation points you view this significant technological distinction to fall along?
This is where language models learn to double down on bullshit.
 
Upvote
0 (1 / -1)

GMcK de Cypress

Wise, Aged Ars Veteran
196
Subscriptor
Everyone’s first-person experience makes them an incontrovertible expert on consciousness — and that’s a big part of the problem. The article remarks on the overabundance of theories, but even so, it understates how many theories there are. A recent taxonomy enumerated 208 different theories of consciousness. Whatever your analysis is, someone has a matching theory that you’ll probably agree with.

Nevertheless, the article is too kind to IIT. Its mathematical formulation has some attraction, but if you look carefully at its assumptions, you’ll discover that it’s panpsychist at heart. Everything is conscious, just more or less so according to its integration index, despite the protestations of Christof Koch. Defects like this have led to over 100 neuroscientists and other researchers publicly calling it pseudoscience.

Adversarial collaborations have their role in forcing proponents to sharpen their definitions and incorporate thorny observations, but there’s one sure-fire program to get at what’s really going on in the heads of conscious people. That’s a hard-nosed cognitively-aware behaviorist approach: ask “what’s going on when someone emits the words ‘I am conscious’ or ‘I am conscious of X’?”. Philosophers who write about illusionism and higher order thought (HOT) may be on the right track, but they are hobbled by their reliance on common sense reasoning and natural language descriptions. Lots of experimental data and methodological revolutions are going to have to happen before doubters will be convinced.
 
Upvote
1 (1 / 0)

lucubratory

Ars Scholae Palatinae
1,187
Subscriptor++
This is just so fundamentally stupid... A system doesn't necessarily need to have physical connections, if it can just simulate those connections computationally. IIT seems to boil down to an argument of, you can't be conscious if your internal state update cycle falls below <x> Hz. Put another way, if your mind works too slowly, then it can't exist. Which is just freakishly dumb. Hard to believe that accomplished academics with PhDs would seriously produce and defend such dross...
Yes, IIT and similar "Don't call me idealist!" unfalsifiable theories have been a bane of the science of consciousness since literally the 1950s. The fact is that they're not science - we did not invent the lightbulb, vaccines, refrigeration, quantum mechanics, general relativity, and more by stating some anthropocentric metaphysical showerthoughts as fundamental axioms of reality & working backwards from the assumption that they have to be true without regard to evidence. That is in fact the tactic that religious zealots, small-c conservatives, scientists making a significant error, and their ilk have used to reject scientific progress with "common sense" for hundreds of years, ranging from the Catholic Church's opposition to heliocentrism to "God doesn't play dice". Quantum mechanics wasn't "common sense" and could never have been discovered with this strategy of "Declare what we think is true to be incontrovertible & work from there, evidence doesn't matter".

The unfortunate truth is that scientists are human, and unfortunately some humans are so terrified of being proven wrong that they run to barely-disguised dogma rather than risk making a prediction. Non-materialist theories are incredibly unpopular among scientists because of how clear it is that they're unfalsifiable dogmatic nonsense (the vast majority of scientists being materialists is one of the great successes of science), but unfortunately those theories are disproportionately popular among the general populace without some sort of scientific training for the reason they first arose: just-so assertions of "common sense" as dogma are attractive because they remove the necessity to think and consider uncomfortable realities about what the evidence shows us. IIT & similar theories are ~three times as popular amongst relevant scientists as climate denialism is among climate scientists, so I don't think excluding them altogether is a good idea, but it's important not to present a false equivalence. We need more science education, and science communication needs to effectively communicate that these are fringe beliefs that are fundamentally contradictory with good science.

I really appreciate the existence of this article because we desperately need to talk about this topic, and I want to personally thank the obviously qualified Lindsey Laughlin for pointing out that IIT is unfalsifiable, although I would have appreciated more pushback on the evasiveness of "We don't fit into any -isms". It's a very well-researched piece that I think presented the state of the field quite fairly. This is the sort of science communication I love the most, because it does a good job communicating a topic that is at once difficult to explain, very important, and often unpopular to discuss; I appreciate it a lot.
 
Upvote
3 (3 / 0)
I think people are asking the wrong question. Rather than "What is consciousness?" it should be "Why is consciousness?"
The answer to that for every biological organism is to acquire energy, evade hazards (including other organisms that want to use you as an energy source), find a mate, and successfully reproduce.
There's a complex feedback system between consciousness and other bodily systems.

An AI doesn't have any of that. Mostly they just sit in big data centres where they're fed energy. They're not capable of physical reproduction, not even of self-assembling their own physical components.

While autonomous AI systems do exist that have to respond to environmental inputs, their ability to process and respond to inputs is still pretty rudimentary compared to what humans, or even animals, can do, and in terms of EROI, a dinosaur, with its huge body and tiny brain, was probably still orders of magnitude more efficient at processing environmental data than the best AI, yet we know what happened to them.
 
Upvote
-3 (1 / -4)
Instructable as in I say "Produce a JSON array containing objects with properties for the state and capital for each US state."
It says:
[
{"state": "Alabama", "capital": "Montgomery"},
{"state": "Alaska", "capital": "Juneau"},
{"state": "Arizona", "capital": "Phoenix"},
{"state": "Arkansas", "capital": "Little Rock"},
{"state": "California", "capital": "Sacramento"},
{"state": "Colorado", "capital": "Denver"},
{"state": "Connecticut", "capital": "Hartford"},
{"state": "Delaware", "capital": "Dover"},
{"state": "Florida", "capital": "Tallahassee"},
{"state": "Georgia", "capital": "Atlanta"},
{"state": "Hawaii", "capital": "Honolulu"},
{"state": "Idaho", "capital": "Boise"},
{"state": "Illinois", "capital": "Springfield"},
{"state": "Indiana", "capital": "Indianapolis"},
{"state": "Iowa", "capital": "Des Moines"},
{"state": "Kansas", "capital": "Topeka"},
{"state": "Kentucky", "capital": "Frankfort"},
{"state": "Louisiana", "capital": "Baton Rouge"},
{"state": "Maine", "capital": "Augusta"},
{"state": "Maryland", "capital": "Annapolis"},
{"state": "Massachusetts", "capital": "Boston"},
{"state": "Michigan", "capital": "Lansing"},
{"state": "Minnesota", "capital": "Saint Paul"},
{"state": "Mississippi", "capital": "Jackson"},
{"state": "Missouri", "capital": "Jefferson City"},
{"state": "Montana", "capital": "Helena"},
{"state": "Nebraska", "capital": "Lincoln"},
{"state": "Nevada", "capital": "Carson City"},
{"state": "New Hampshire", "capital": "Concord"},
{"state": "New Jersey", "capital": "Trenton"},
{"state": "New Mexico", "capital": "Santa Fe"},
{"state": "New York", "capital": "Albany"},
{"state": "North Carolina", "capital": "Raleigh"},
{"state": "North Dakota", "capital": "Bismarck"},
{"state": "Ohio", "capital": "Columbus"},
{"state": "Oklahoma", "capital": "Oklahoma City"},
{"state": "Oregon", "capital": "Salem"},
{"state": "Pennsylvania", "capital": "Harrisburg"},
{"state": "Rhode Island", "capital": "Providence"},
{"state": "South Carolina", "capital": "Columbia"},
{"state": "South Dakota", "capital": "Pierre"},
{"state": "Tennessee", "capital": "Nashville"},
{"state": "Texas", "capital": "Austin"},
{"state": "Utah", "capital": "Salt Lake City"},
{"state": "Vermont", "capital": "Montpelier"},
{"state": "Virginia", "capital": "Richmond"},
{"state": "Washington", "capital": "Olympia"},
{"state": "West Virginia", "capital": "Charleston"},
{"state": "Wisconsin", "capital": "Madison"},
{"state": "Wyoming", "capital": "Cheyenne"}
]

That kind of thing is highly useful, especially if you give a large-context model a lot of background information on the task in the prompt (that works even if it was never trained on that information or how to use it).
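For concreteness, here is a minimal sketch of that pattern, assuming Python and the openai package; the client setup, model name, and reference file are placeholders rather than anything from the post above:

from openai import OpenAI
import json

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Background text the model was never trained on can be pasted straight
# into the prompt; the model works from its context window.
background = open("task_reference_notes.txt").read()  # hypothetical file

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system",
         "content": "Answer using only the reference material below.\n\n" + background},
        {"role": "user",
         "content": "Produce a JSON array containing objects with properties "
                    "for the state and capital for each US state. "
                    "Return only the JSON, with no commentary."},
    ],
)

# The output still has to be treated as untrusted text until it parses.
try:
    data = json.loads(response.choices[0].message.content)
except json.JSONDecodeError:
    data = None  # retry, fall back, or flag for review

The point is only that the background text rides along in the prompt; nothing about the model itself changes.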
I don't think anyone was super surprised by that. I guess maybe OpenAI themselves were ... because they had no idea what they used to train their own models? But even that's bullshit. Altman et al were just priming the hype pump.

Coding is one of the most OG things OpenAI has been trying to get these things to do accurately from the moment they were conceived - that was Microsoft's main interest.

And if you are being honest with yourself, they're still very much working on it. That's where we get to just how useful this ACTUALLY is. Sometimes the JSON produced is exactly what you wanted. Sometimes it isn't. I think it's probably a solvable problem but in my view, not with language models alone.

You can ask it to do something in English, and in many cases it will do it, even if it is a somewhat complex operation overall. It has its limits, but there are a lot of boring but necessary tasks I have found it can handle, and that was not a thing a few years ago.
By "ask it to do something" do you mean return a text response? How well or poorly any particular model does at returning what you wanted to see in response to what you typed comes down entirely to how the model developers have built and implemented their system. That's where the competition is.

OTOH, if you're talking about the state machine that processes the token stream and does stuff like censoring or launching secondary apps in response to trigger sequences, that stuff isn't even AI. The models are static. If you are using the GPT-4 model and it's doing stuff it wasn't doing last year, that stuff is ALL the product of add-on conventional programming.
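To make that concrete, here is a toy illustration in Python of the kind of add-on plumbing being described; the trigger sequence, blocked term, and helper are entirely hypothetical and not how any particular vendor does it:

import re

BLOCKED = re.compile(r"\b(internal_codename)\b", re.IGNORECASE)  # hypothetical censoring rule
TOOL_TRIGGER = "[[search:"  # hypothetical trigger sequence

def run_search_tool(query: str) -> str:
    # Placeholder for whatever secondary application gets launched.
    return f"(search results for {query.split(']]')[0].strip()})"

def postprocess(model_output: str) -> str:
    """Plain, conventional code wrapped around a frozen model's output."""
    # "Censoring" is ordinary string filtering...
    text = BLOCKED.sub("[redacted]", model_output)
    # ...and "launching secondary apps" is ordinary code reacting to a trigger.
    if TOOL_TRIGGER in text:
        before, rest = text.split(TOOL_TRIGGER, 1)
        text = before + run_search_tool(rest)
    return text

print(postprocess("The capital is Sacramento. [[search: population of Sacramento]]"))

None of that logic lives in the weights; change the regex or the trigger and the "behavior" changes without retraining anything.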
There are programmers who, when asked how their program got an answer, start talking about variables and loops, functions and objects, etc. While technically an answer to the question, that is not what is being asked when a user, a board member, or the marketing guy asks why it did what it did. They want a more specific and narrow answer covering the reasoning behind that action.
It doesn't friggin matter what they want. Desires don't change objective reality.

The solution to non-technical observers being unable to grasp a fully detailed answer to the highly technical question they've asked is NOT to just make up some bullshit that sounds like what they want to hear.

In this case, what they want to know is in the relationships encoded into the weights. That is not something you can read in the code, and it is not something you could trace by hand before you die of old age. We could maybe come up with debugging tools to allow us to probe this, but right now it is a big black box.

You can answer other questions, or say their question is not a good one, but when someone asks how a result is obtained, the answer they want is not readily available by reading some code, or looking at some data.
Your first sentence answered their question with full accuracy, to the best of your own technical abilities. After that, it becomes clear you don't know what you are talking about.

If it were me and a client/boss/colleague were asking, I would find out what they are really asking about before going one step further. Why do you want to know?

Because whatever specific thing they are wondering about very well might NOT be based on the weights at all. Could be the temperature setting. Could be sampler selections. Maybe context extension methodology. And with a system along the lines of OpenAI's, where pre/post parsers are doing a crapton of secondary output shaping, there's a whole universe of potential stuff coming from those conventional systems to consider.

The ultimate correct answer to "Why did it pick this specific thing to say?" could be a whole range of different things - weights just being one.
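As a minimal sketch of why those knobs matter (plain Python, made-up numbers, not any vendor's actual sampler): the same next-token scores from the weights can produce different "choices" depending on temperature alone.

import math, random

def sample(logits, temperature=1.0, rng=random.Random(0)):
    """Temperature-scaled sampling over a model's next-token scores."""
    scaled = [score / temperature for score in logits.values()]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(list(logits.keys()), probs)[0]

# Made-up scores for the next token. The weights produced these numbers,
# but the final pick also depends on settings that are not in the weights.
logits = {"Paris": 5.1, "Lyon": 4.8, "Marseille": 2.0}

print(sample(logits, temperature=0.1))  # near-greedy: almost always "Paris"
print(sample(logits, temperature=2.0))  # hot: "Lyon" (or worse) shows up regularly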

And, in the end, if the person really just wanted to have the underlying technology explained to them in terms a layman can understand, that's not an insurmountable task either. I'm pretty good at that stuff if an observer is interested in learning.

You guys are playing hypotheticals.
 
Upvote
-4 (0 / -4)

cerberusTI

Ars Tribunus Angusticlavius
6,507
Subscriptor++
I grant language models are useful, but that output isn't guaranteed to be free of hallucinations. That kind of data is already available on government websites and places like Wikipedia, and facts like these aren't copyrightable.

As you point out, you don't know where that came from. It takes a whole lot of glue to make factual answers come out and you don't know if that's done right. Nobody has that 100% yet.

Generating game data, absolutely. Can't really screw that up. Language models are useful for that kind of thing. Factual answers are much much harder and nobody has that 100%.

This is where language models learn to double down on bullshit.
I usually feed in the options it may select from (and check that the answer is at least on the possibility list); the use of common information here was mostly to make the example easy to produce, so it could be demonstrated that the model can take on a vaguely complicated task.
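For what it's worth, a minimal sketch of that feed-the-options-and-verify pattern, assuming Python, with ask_model() standing in for whichever API call is actually used:

ALLOWED = ["residential", "commercial", "industrial", "agricultural"]  # hypothetical options

def classify(description: str, ask_model) -> str:
    """Ask the model to choose from a fixed list, then verify the answer is on it."""
    prompt = (
        "Choose exactly one option from this list and reply with the option only:\n"
        + "\n".join(f"- {opt}" for opt in ALLOWED)
        + f"\n\nItem to classify: {description}"
    )
    answer = ask_model(prompt).strip().lower()
    if answer not in ALLOWED:
        # Having the options in the context does not guarantee the model emits one of them.
        raise ValueError(f"Model returned an off-list answer: {answer!r}")
    return answer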

That involved several things it needed to do in formatting the object and determining what went into it, and it was able to accomplish this readily. A few years ago nothing publicly available could do this.


I end up using it in a context where the answers are important and people will care if they are wrong (also, someone reviews it and in many cases must sign off, putting a license they can lose on the line if there are too many errors). The AI is not 100%, but it is better than the system it replaced, which was standard code and not as accurate.

My bigger concern is that it is now frequently good enough that you do not need to correct anything, and a system where you only occasionally need to add or correct something is in some ways more dangerous than one where you expect to be entering more every time. This is a bit dangerous when you have thousands of fields, as the temptation will be to skip the review if it is usually entirely correct.

The latest work project is something where you could avoid the AI entirely, and I had been planning to write it anyway for a while, but it also needs you to match up a bunch of lists of usually thousands of items to other lists of thousands of items, and there are a lot of lists. I was guessing the success rate of getting this to happen would be about 30%, and we had stopped the more manual versions as programming projects because they were too problematic.

The AI just goes right through it, though, and looking over the results it chose and correcting an occasional item is not such a bad task.

Even an uncorrected list beats every deliverable I have received from a programmer on a similar project in accuracy, finds accurate answers I would not have found doing it myself, and it happens in an hour to a couple of days (for a more involved version with more AI help on a bigger data set) rather than months. The thing is amazing for basic but boring logical tasks involving English, like figuring out which industry-standard value was implied by the free-form note typed in as a description to another system that did not support it directly.
 
Upvote
1 (1 / 0)

Control Group

Ars Legatus Legionis
18,710
Subscriptor++
I think people are asking the wrong question. Rather than "What is consciousness?" it should be "Why is consciousness?"
The answer to that for every biological organism is to acquire energy, evade hazards (including other organisms that want to use you as an energy source), find a mate, and successfully reproduce.
There's a complex feedback system between consciousness and other bodily systems.

An AI doesn't have any of that. Mostly they just sit in big data centres where they're fed energy. They're not capable of physical reproduction; they can't even self-assemble their own physical components.

While autonomous AI systems do exist that have to respond to environmental inputs, their ability to process and respond to those inputs is still pretty rudimentary compared to what humans, or even animals, can do. In terms of EROI, a dinosaur, with its huge body and tiny brain, was probably still orders of magnitude more efficient at processing environmental data than the best AI, yet we know what happened to them.
I think I get where you’re coming from, but I also think I disagree. If the idea is to replicate consciousness, then I don’t think it matters why it exists.

“Why do birds fly” is not an important question if you’re designing an airplane.
 
Upvote
3 (3 / 0)

cerberusTI

Ars Tribunus Angusticlavius
6,507
Subscriptor++
I don't think anyone was super surprised by that. I guess maybe OpenAI themselves were ... because they had no idea what they used to train their own models? But even that's bullshit. Altman et al were just priming the hype pump.

Coding is one of the most OG things OpenAI has been trying to get these things to do accurately from the moment they were conceived - that was Microsoft's main interest.

And if you are being honest with yourself, they're still very much working on it. That's where we get to just how useful this ACTUALLY is. Sometimes the JSON produced is exactly what you wanted. Sometimes it isn't. I think it's probably a solvable problem but in my view, not with language models alone.


By "ask it to do something" do you mean return a text response? How well or poorly any particular model does at returning what you wanted to see in response to what you typed comes down entirely to how the model developers have built and implemented their system. That's where the competition is.

OTOH, if you're talking about the state machine that processes the token stream and does stuff like censoring or launching secondary apps in response to trigger sequences, that stuff isn't even AI. The models are static. If you are using the GPT-4 model and it's doing stuff it wasn't doing last year, that stuff is ALL the product of add-on conventional programming.

It doesn't friggin matter what they want. Desires don't change objective reality.

The solution to non-technical observers being unable to grasp a fully detailed answer to the highly technical question they've asked is NOT to just make up some bullshit that sounds like what they want to hear.


Your first sentence answered their question with full accuracy, to the best of your own technical abilities. After that, it becomes clear you don't know what you are talking about.

If it were me and a client/boss/colleague were asking, I would find out what they are really asking about before going one step further. Why do you want to know?

Because whatever specific thing they are wondering about very well might NOT be based on the weights at all. Could be the temperature setting. Could be sampler selections. Maybe context extension methodology. And with a system along the lines of OpenAI's, where pre/post parsers are doing a crapton of secondary output shaping, there's a whole universe of potential stuff coming from those conventional systems to consider.

The ultimate correct answer to "Why did it pick this specific thing to say?" could be a whole range of different things - weights just being one.

And, in the end, if the person really just wanted to have the underlying technology explained to them in terms a layman can understand, that's not an insurmountable task either. I'm pretty good at that stuff if an observer is interested in learning.

You guys are playing hypotheticals.
If the nontechnical observers are unable to grasp your fully detailed answer to what you see as a highly technical question, it is still a black box to them.

The judge, or politician, or user, or coworker likely has no greater insight into how it works when you are done. Even if they do, the specific knowledge they wanted to begin with is still denied to them. It is not really a technical question which is being asked, and it should not get a technical answer.

If asked how the AI knows what a JSON object or the state capitals are, I would probably say something along the lines of "It is very well read, and is aware of common facts," unless it was a programmer asking me or there was otherwise an interest expressed in the technical workings.

Similarly, I commonly answer a question as to how it knows something with "I give it a small book of information documenting how this works in the prompt," which is also much closer to what someone usually wants to know when they ask how it arrives at an answer.
 
Upvote
1 (1 / 0)

Psyborgue

Ars Praefectus
4,051
Subscriptor++
My bigger concern is that it is now frequently good enough that you do not need to correct anything, and a system where you only occasionally need to add or correct something is in some ways more dangerous than one where you expect to be entering more every time. This is a bit dangerous when you have thousands of fields, as the temptation will be to skip the review if it is usually entirely correct.
That is the thing. When the model can rely on external tools to copy and paste, it's good. If it comes from the weights, well, that's another story. And yeah, 99.9% correct is sometimes much, much worse than 90 or 98.

I usually feed in the options it may select from
That helps, but having it in the context isn't a guarantee the model will emit those tokens. I hope you are validating the output. Also consider that your sampling settings will affect the chosen token. You may be better off getting the logprobs (if you are using the OpenAI API) and selecting the top option. Otherwise, invisible dice are being thrown in that choice.
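A rough sketch of that logprobs approach, assuming the current openai Python SDK; field names may differ across SDK versions, so treat it as an outline rather than gospel:

from openai import OpenAI

client = OpenAI()
ALLOWED = {"yes", "no", "unknown"}  # hypothetical option list

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user",
               "content": "Answer with exactly one of: yes, no, unknown. "
                          "Is Springfield the capital of Illinois?"}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,
)

# Look at the candidates for the first generated token and pick the most
# probable one that is actually on the allowed list, instead of letting
# the sampler roll invisible dice.
candidates = resp.choices[0].logprobs.content[0].top_logprobs
best = max(
    (c for c in candidates if c.token.strip().lower() in ALLOWED),
    key=lambda c: c.logprob,
    default=None,
)
print(best.token if best else "no allowed option among the top candidates")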

Edit: Also, for multiple choice there is a bias towards the first option. Fixing that isn't easy (one common mitigation is sketched below). Also, I responded before I read your whole post; I see you do validate the output. Good on you.
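One common mitigation for that first-option bias, sketched in Python (ask_model() is again just a stand-in): randomize the option order on each call and, if you want more confidence, take a vote across a few shuffles.

import random
from collections import Counter

def ask_with_shuffled_options(question, options, ask_model, trials=3, rng=None):
    """Reduce position bias by randomizing option order on every call."""
    rng = rng or random.Random()
    votes = Counter()
    for _ in range(trials):
        shuffled = list(options)
        rng.shuffle(shuffled)
        prompt = question + "\nOptions:\n" + "\n".join(f"- {o}" for o in shuffled)
        answer = ask_model(prompt).strip()
        if answer in options:  # still validate against the original list
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None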
 
Last edited:
Upvote
1 (1 / 0)