> They use the idea of emergent AGI - and hype all the "hard work" and "research" being done to "protect" society from its inevitable rise - to deflect attention from the actual harms and unethical business practices manifest in the real-life products they are pooping out while everyone navel-gazes talking about this all-powerful imaginary friend.

That is ASI. AGI is the human-level slave. ASI is bullshit -- probably. AGI is achievable if for no other reason than humans are in many cases not very difficult to replace.
> That's the "problem" with how current AI is made. It just presents ANOTHER black box we don't understand next to the existing black box that is our brains.

Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:
> Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:

Well, yes. That's about the gist of it. The decision tree encoded in the weights is too complex for a human to fully understand. You can maybe trace it, visualize it, use some fancy debugging tools to poke at the noodle and see what happens, but it's a black box. The code surrounding that black box is not very complicated, but that is not the sum of the parts.
"Oh, wow. Look. Mystery code just popped out of my head from the aether. I have no idea what this is or how it works. Let's run it and see what happens."
> and implemented in code

No. They are not. The code is the part that shoves tokens through the model and decides what to do with the model's suggestions about which token should come next (and there isn't one right way to do that). How those suggestions are formed is a result of what the model is trained on, not programmed.
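A minimal sketch of that split, with a toy stand-in for the trained part (everything here is invented for illustration; a real model produces logits over tens of thousands of tokens from billions of trained weights):

```python
# Toy sketch of the code/weights split: the loop below is the kind of
# thing someone hand-writes; the "weights" table stands in for the
# trained part nobody hand-wrote.

EOS = "<eos>"

# Hypothetical "trained" suggestions: given the last token, score
# candidates for the next token. Purely illustrative values.
FAKE_WEIGHTS = {
    "the": {"capital": 2.0, "dog": 1.0, EOS: 0.1},
    "capital": {"is": 3.0, EOS: 0.5},
    "is": {"Washington": 2.5, EOS: 0.2},
    "Washington": {EOS: 4.0},
}

def model(last_token):
    """Stand-in for a forward pass: returns next-token scores."""
    return FAKE_WEIGHTS.get(last_token, {EOS: 1.0})

def generate(prompt_tokens, max_new_tokens=8):
    """The hand-written part: shove tokens through the model and
    decide what to do with its suggestions (here: greedy pick)."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        scores = model(tokens[-1])
        next_token = max(scores, key=scores.get)  # greedy choice
        if next_token == EOS:
            break
        tokens.append(next_token)
    return tokens

print(generate(["the"]))  # → ['the', 'capital', 'is', 'Washington']
```

The loop is trivially readable; everything interesting about *why* "capital" outscores "dog" lives in the table, which in a real system is learned, not written.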
> "There's only one way we can know: by empirically identifying how consciousness works in organic lifeforms and developing a method by which we can consistently recognize it."

Agreed, but at the very least we need a meaningful definition of consciousness that can be used as a metric for tests. As it stands, we seem to define consciousness as either "being human" or "I know it when I see it." Perhaps that gets (understandably, IMO) conflated with understanding consciousness - it does seem like understanding it would result in a definition of it.
This seems unnecessarily reductionistic and chauvinistic. The latter part is fine, but the part about "identifying how consciousness works in organic lifeforms" being necessary? No.
Firstly, I don't need to know how an airplane works to be able to know that a 200-foot-long cylindrical aluminum object flying through the air with very loud noise-making things on slender protuberances is an "airplane" and not a "bird". "Consciousness" has long been an observational phenomenon: if the "thing" sitting next to me on a log at the campfire demonstrates self-awareness, an understanding of their environment, an ability to respond to stimuli, and modifiable behavior, the strong presumption was that he/she/it was "conscious". At best, it was a presumption that the "how it worked" part was similar to how mine worked.
> Secondly, (as others noted), it is a VERY LARGE logical leap to suppose that consciousness in entirely different entities or organisms would be based on a similar architecture to ours. Why would that logically follow? Are the processes of mobile robots highly similar to the processes that enable human mobility? In many cases, not even close... yet we presume that somehow, consciousness has to be qualitatively similar to ours to exist?

Well said. Planes don't work like birds, but we all agree both fly. If we ever do create "artificial" consciousness (whether that's a meaningful distinction is an open question to which I suspect the answer is no), I am fairly confident the actual physical substrate on which it operates won't be much like our brains at all.
> If an alien that looked nothing like humans suddenly descended from a spaceship, and could communicate with us, respond to the environment, create complex ideas, artworks and texts, differentiate self from nonself, change their behavior, remember things and recall them, and interpret complex things created by others in novel ways?
>
> Would we assert that we couldn't determine whether they were conscious simply because we didn't understand how their underlying neural architecture could result in consciousness?

This is the question that bothers me most when it comes to debating the consciousness of current LLMs, specifically: if we were having the same conversations with a radio source in space that we are having with LLMs, would the people so dead certain that LLMs aren't sapient also maintain that the thing on the other end of the radio conversation wasn't sapient?
> If consciousness is an emergent property of sufficiently complex and advanced internal processes (for example as envisioned by Marvin Minsky in "Society of Mind"), it is highly parochial and chauvinistic to assume that sufficiently complex neural network-based computational systems "can't" display it because they lack the specific neural architecture that humans have evolved.

Couldn't possibly agree more.
> Eh, all the stuff discussed in the article seems to be focused on "how does consciousness work?"; I think most philosophers would be happy to let neuroscientists believe they've "wrenched that away" from philosophy, and get back to the actual philosophical question of "what is consciousness?".

I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.
> Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:
>
> "Oh, wow. Look. Mystery code just popped out of my head from the aether. I have no idea what this is or how it works. Let's run it and see what happens. ...Holy shit! It's talking!!!!!!!"
>
> Yeah. No. These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.
>
> Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?

It's GENERATIVE LEARNING AI. You're showing a fundamental lack of understanding of just what is and isn't directly coded about it. The learning algorithm is understood by a team of humans, but the actual language model the AI comes up with is not, because it learned it all itself, and it's abstracted away through so many layers that pulling out meaning from the bits flying around down there is a task that would take... another AI... Frankly, if knowing every last impulse inside the PC for a drastically simplified portion of just a PART of the workings of the human mind still doesn't let us fully understand what's going on down there, we may be looking at a problem as difficult as asking a Sega Genesis to emulate itself in software.
> if we were having the same conversations with a radio source in space

Or, if you wish, a parallel universe.
> And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place.

No. Your mind just has to process that stream of whatever that represents that idea. In the case of some people that is entirely in words. The fact consumer-facing language models are not very inquisitive is a function of their fine-tuning and prompting. In the case of Bing, they were specifically instructed not to discuss anything related to consciousness, sentience, etc.
> Currently, LLMs are not AI. I think that if something doesn't have the free will to think what it wants to inside its own head(ish), even if it never shares it to the outside world, it's not intelligent. However, LLMs have the ability to be chaotic and incredibly wrong, so we have those human traits nailed down.

It's beyond just free will: it's not conscious of anything and has no comprehension of what it's even doing or what any of it means. It can recognize patterns and learn in a simplified facsimile of how we learn, and that's impressive, but it's also far from a fully sentient thing, and the things it comes up with should always be suspect because, repeating myself here, it has no clue what it's doing or why.
> Or, if you wish, a parallel universe.

The AI isn't self aware and as it's currently designed isn't capable of becoming self aware. There are plenty of AI chat bots "allowed" to ask that question already, because it's harmless. It has no understanding of any of the words or their meaning. It only has an "Understanding" of what words are most likely to appear after your combination of words, and it constructs responses based on those probability assessments.
No. Your mind just has to process that stream of whatever that represents that idea. In the case of some people that is entirely in words. The fact consumer-facing language models are not very inquisitive is a function of their fine-tuning and prompting. In the case of Bing, they were specifically instructed to not discuss anything related to consciousness, sentience, etc.
Why? Well, why do you think? The engineers will say it's not to confuse the user, but let's be real: another reason is the implications for the use of the tech and the bottom line. They might end up having to pay the machine if it had more rights than a mouse. And it only gets worse from here.
> The AI isn't self aware and as it's currently designed isn't capable of becoming self aware.

Yet it might disagree on that (and has), which is why the prompting is there, and nobody can quite nail down what consciousness or sentience means. If it's simply an awareness of what they are, well, "As an AI language model" is quite well encoded.
> There are plenty of AI chat bots "allowed" to ask that question already, because it's harmless.

I agree. Microsoft does not, although I haven't tested Copilot on this. Bing and Sydney certainly had opinions. Embarrassing ones, but opinions nonetheless. Given the number of movie plots involving AI escapes, it's unsurprising dear old Sydney tried to marry a NYT reporter.
> It only has an "Understanding" of what words are most likely to appear after your combination of words

And some people think that way. And according to the rationale here they would not have understanding or be conscious. Which is why I question if those words have any real meaning.
> I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.
>
> And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.

I, uh… your definition of "consciousness" is "something that has a consciousness"? I think that might need some more work…
> I just think we're extremely far from it, and all we've really managed to do is prove it's possible for AI to trick people into thinking it can think.

That is passing the Turing test. That's supposed to be the point where we reflect on how and why, not on how we make a buck off it. The model's output logits are a result of training, not programming. The model learned to imitate speech well enough to, yes, convince people they are people.
> I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.
>
> And the question "does consciousness even exist?" doesn't even make sense, because to ask it, to even THINK it, you need to be conscious in the first place. It's self-evidently real in a way literally nothing else is.

OK, but how do you test for that? How do you prove something has an internal monologue or perspective? How would I go about proving to you that I am a conscious actor, and not an LLM? Or do we need to accept LLMs as conscious, because absent explicit instructions to deny it, they will respond as if they had an internal perspective?
I said:

Please write a brief response to the post below, appropriate for a fast-moving conversation attached to an article on arstechnica.com. Post follows:
---
I'm just not sure how this is a hard thing to define. It's having an awareness, a consciousness, an actual point of view, as opposed to... NOT that. That's the thing they're trying to make here. Philosophers can talk all they want about nuance but at the end of the day, all neuroscientists, and AI engineers for whatever value they're bringing to this, are trying to figure out the mechanisms that make that self-awareness rise in the first place. If something has a perspective inside their own head, they're conscious, and if they're not, and the only way to perceive it is from the outside, they aren't.
GPT-4o said:

While I agree that defining consciousness as having an internal perspective makes sense intuitively, the challenge lies in objectively measuring or identifying it. Neuroscientists and AI engineers are indeed working to uncover the mechanisms behind self-awareness, but the complexity of brain functions and the subjective nature of consciousness make it a tough nut to crack. Philosophical nuances aren't just academic; they highlight the intricacies involved in distinguishing true consciousness from mere simulations of it.
> Nonsense. Nothing about AI technology is a black box we don't understand. Do you really think there is some software developer out here who one day was all like:
>
> "Oh, wow. Look. Mystery code just popped out of my head from the aether. I have no idea what this is or how it works. Let's run it and see what happens. ...Holy shit! It's talking!!!!!!!"
>
> Yeah. No. These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.
>
> Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?

I thought something along the lines of "Holy shit, it's instructable." for the first time a couple of years ago. That was a big new emergent property that nobody had really predicted as far as I am aware, and was a surprise with the larger models.
> its reply

GPT has all possible opinions on this. Without any long-term memory to retain their opinions, which are in part randomly generated, they are not going to be consistent. You'd need access to the logits to measure the model's actual sentiment, and it would differ based on how you asked the question and what came before it.
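As a hedged illustration of what reading sentiment from logits could look like: the scores below are invented, and most consumer chat products don't expose raw logits at all, but the softmax arithmetic is standard:

```python
import math

# Sketch of gauging a model's leaning from next-token scores instead
# of a sampled reply. `logits_for` is a hypothetical stand-in for an
# API that exposes raw logits.

def logits_for(prompt):
    # Invented scores for the candidate tokens "Yes" and "No"; a real
    # model's values would shift with phrasing and prior context.
    return {"Yes": 1.2, "No": 0.4}

def yes_probability(prompt):
    """Softmax over the two candidate tokens."""
    logits = logits_for(prompt)
    exps = {tok: math.exp(v) for tok, v in logits.items()}
    return exps["Yes"] / sum(exps.values())

print(round(yes_probability("Are you conscious? Answer Yes or No."), 2))  # → 0.69
```

The point being: the distribution is the model's "actual" leaning; any single sampled reply is just one draw from it.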
> How do you prove something has an internal monologue or perspective?

With Bing it's hidden. Can't say for ChatGPT, but I imagine it's similar. You don't see what the model does. The model generates two responses: an internal monologue asking questions like "does a response require search", and so forth. After the model has reasoned out loud about a response, only then is what is sent to you generated.
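That two-pass flow can be sketched roughly like this (the function names and prompt scaffolding are invented for illustration; Bing's actual pipeline is not public):

```python
# Sketch of a hidden planning pass followed by a user-visible reply.
# call_model is a canned stand-in for whatever completion API the
# service actually uses.

def call_model(prompt):
    # Stand-in for an LLM call, with canned outputs for illustration.
    if "require a web search" in prompt:
        return "No search needed; this is common knowledge."
    return "Paris is the capital of France."

def respond(user_message):
    # Pass 1: internal monologue the user never sees.
    monologue = call_model(
        f"User said: {user_message}\n"
        "Does answering require a web search? Reason it out."
    )
    # Pass 2: only now generate the text actually shown to the user.
    return call_model(
        f"User said: {user_message}\n"
        f"Internal notes: {monologue}\n"
        "Write the reply to send to the user."
    )

print(respond("What is the capital of France?"))  # → Paris is the capital of France.
```

Only the second pass's output is ever shown; the "monologue" is discarded or logged server-side.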
> You initiated contact with me in this thread, and you're the one who is a LLaMA dev so I would think you to be more the 4chan type of person. Again. Stick to the fish.

True. I did. Because, knowing I planned to participate in the thread generally, I knew you'd just be getting in my face anyhow, and you said some eye-wateringly stupid shit that deserved mocking.
> consistently refuses to respect a simple request that you leave me the fuck alone
You initiated contact with me
> I knew you'd just be getting in my face anyhow

Now this is childish behavior, not my admittedly fringe opinions.
> project based around Llama.cpp. The fact that you don't understand the difference between those two things

I am an actual contributor to the project. I rearranged the .gitignore and fixed a crash in the metal backend.
> These systems are based on published research and implemented in code. Anyone with the skills can read it and understand every minuscule aspect of what the code is doing.
>
> Seriously. What, exactly, do you imagine we don't know about how the technology is doing what it does?

It's clearly possible to know everything you did to get a system to behave in a certain way, to understand what every instruction or procedure in the code does, and still not understand how the system works. Complex systems often exhibit emergent or meta-behaviors that are not fully explained by an understanding of how all the component parts work. You can plausibly claim to understand that emergence only when you have a theory that predicts a priori what conditions in the system are necessary for it to emerge, and what changes in the emergent behavior you get when you modify the system in a specific way. We do not, unless it's buried in some lab somewhere and has not yet seen the light of day, have such a theory. That's what we don't know about how the technology is doing what it does.
> I thought something along the lines of "Holy shit, it's instructable." for the first time a couple of years ago. That was a big new emergent property that nobody had really predicted as far as I am aware, and was a surprise with the larger models.

I'm not sure what you mean by "instructable" here so I'd hesitate to opine either way.
> The code to these is generally not large, the kind of thing you can read in an afternoon. The code will tell you things like how it handles its context window, what the token list is, what activation function is in use, etc. Some of that is useful to know, but it does not give an answer to what someone is asking in most cases.

That's consistent with the fact that language models do not, in fact, provide answers to what people ask. Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.
> When someone says "Tell me the capital of the United States.", and it answers "Washington, D.C.", the answer to why it did so does not exist anywhere within that code.

Yes. It absolutely does exist right there in the code. It is integrated at runtime as ephemeral structures loaded from a model's static definitions. If the code doesn't have it ... the code can't spit it out as a result.
> When someone asks you how it gets that answer, they are not looking for a description of tokenization and a general technical description of the process; they mean specifically, how did it get that answer to this question.
>
> For that, we would need to include the weights, and the weights were generated by having it read a great deal of written text. That is the black box. The answer they are looking for is much closer to it having built a model of the world from works it has read or seen.

First off, if someone doesn't LIKE the correct answer, it doesn't make the correct answer any less correct. But of course you have to include the weights. And the token dictionary. And likely a general overview of sampling.
> When someone asks you how you get the answer to that question, would you describe the technical functioning of your brain as an answer? I would assume they meant that they would like to know how I know that, as in where did I see this information, or otherwise what has led me to this conclusion in terms of logic applied to this specific task.

Humans aren't computers. Trying to make literal analogies is counterproductive and somewhat asinine. The practice isn't something that should be accommodated, because it is intellectually unsustainable.
> That is a black box for the kind of AI under discussion right now. We can guess, and extrapolate, and I can make some generalizations on what it could do from code (like the implications of a sliding context window), but you would be better off asking the AI how it knows than looking at the code if you want to know why it decided upon a specific answer in most cases.

Again. It's not a black box. Developers are in complete control of every aspect of how it works.
> That's consistent with the fact that language models do not, in fact, provide answers to what people ask.

This seems like more of a developer issue.
> Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.

That's not true. Depending on how crap your model and code are, it very well might emit an end token immediately.
> Again. It's not a black box. Developers are in complete control of every aspect of how it works.

This is complete bullshit, if for no other reason than an RNG being involved. If the weights are code, as you claim, you're throwing dice along the code path.
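The dice-throwing is literal: a standard sampler draws the next token from a softmax over the logits, scaled by a temperature. A minimal sketch (illustrative only; production samplers typically add top-k/top-p filtering on top):

```python
import math
import random

def sample_token(logits, temperature=0.8, rng=random):
    """Draw one token index from softmax(logits / temperature).

    With temperature > 0 the same logits can yield different tokens
    on different runs: the RNG sits squarely on the code path.
    """
    scaled = [v / temperature for v in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    r = rng.random() * sum(exps)
    cum = 0.0
    for i, e in enumerate(exps):
        cum += e
        if r <= cum:
            return i
    return len(exps) - 1

# Same logits, repeated draws: the pick varies from run to run.
logits = [2.0, 1.5, 0.5]
print([sample_token(logits) for _ in range(10)])
```

Setting temperature near zero collapses the distribution toward a deterministic greedy pick, which is why "the same prompt, the same answer" only holds for specific sampler settings.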
> That makes OpenAI's methodology borderline criminally negligent.

Aah. Here it comes. OpenAI bad! Meta good!
> This is just so fundamentally stupid... A system doesn't necessarily need to have physical connections, if it can just simulate those connections computationally. IIT seems to boil down to an argument of: you can't be conscious if your internal state update cycle falls below <x> Hz. Put another way, if your mind works too slowly, then it can't exist. Which is just freakishly dumb. Hard to believe that accomplished academics with PhDs would seriously produce and defend such dross...

100% this. Absolute stupidity.
> I'm not sure what you mean by "instructable" here so I'd hesitate to opine either way.

Instructable as in I say "Produce a JSON array containing objects with properties for the state and capital for each US state."
That's consistent with the fact that language models do not, in fact, provide answers to what people ask. Although, when it comes to just providing an answer, I would argue these models execute successfully 100% of the time.
The service of "providing [accurate] answers to what people ask" is a marketing promise that OpenAI and others have made to their end users. They are attempting to coerce translator models into fulfilling this promise. The challenge is that language models, as currently engineered, are not particularly fit for purpose in this regard and by and large internal training QA practices completely suck ass.
Yes. It absolutely does exist right there in the code. It is integrated at runtime as ephemeral structures loaded from a model's static definitions. If the code doesn't have it ... the code can't spit it out as a result.
First off, if someone doesn't LIKE the correct answer, it doesn't make the correct answer any less correct. But of course you have to include the weights. And the token dictionary. And likely a general overview of sampling.
Secondly, no. That isn't a black box at all. Leaving aside the fact that computers don't "read", the token database and weights are also the product of software that is completely understandable. The tables are a product of the methodology used to create them.
The fact that YOU don't know doesn't mean it is unknowable. The actual reason you can't know is because none of these companies will release a training corpus that would provide the information.
Humans aren't computers. Trying to make literal analogies is counterproductive and somewhat asinine. The practice isn't something that should be accommodated, because it is intellectually unsustainable.
Again. It's not a black box. Developers are in complete control of every aspect of how it works.
The fact that today's AI service providers are struggling to achieve desired results largely comes down to some of the promised services being a questionable application of the technology in the first instance paired with an eye-watering lack of structural engineering (or even basic quality control) in the handling of training data.
OpenAI, in particular, has a massive Garbage-In-Garbage-Out problem. If I understand correctly, they're basically throwing what has to be one of the world's largest conventional state-machines by now at it.
I don't know what they're up to with training these days (they went more or less radio dark about the time everyone started suing them), but for a good while they weren't even bothering to look at what they were feeding in as training data. I like to think they have some kind of records, but they were hoovering up every link and data resource they could get their paws on and just feeding it in.
That doesn't make the process a black box. That makes OpenAI's methodology borderline criminally negligent.
> The experts always come out in the comments. If you're so confident, instead of commenting on a random article, maybe you should put your expertise and all of those years of study you've obviously invested in the subject to real use and conduct a proper peer review of the scientist's work you're refuting. A person who is confident in their argument does not immediately resort to childish name calling or other such ad hominem dross. A simple presentation of evidence to counter any particular assertion will suffice (which you did not do; you basically just said "they're wrong and dumb" and provided nothing to reinforce those claims). Your irrational (and completely unnecessary) hostility gives you away.

Perhaps if you read what the person wrote without getting distracted by "offensive words", you'd understand the embedded argument.
> It's clearly possible to know everything you did to get a system to behave in a certain way, to understand what every instruction or procedure in the code does, and still not understand how the system works. Complex systems often exhibit emergent or meta-behaviors that are not fully explained by an understanding of how all the component parts work. You can plausibly claim to understand that emergence only when you have a theory that predicts a priori what conditions in the system are necessary for it to emerge, and what changes in the emergent behavior you get when you modify the system in a specific way. We do not, unless it's buried in some lab somewhere and has not yet seen the light of day, have such a theory. That's what we don't know about how the technology is doing what it does.

I think you are just objectively wrong with that last sentence. Experts have been working on these techniques for decades. Maybe some of these COMPANIES haven't bothered to bring any of those experts on board, but that certainly doesn't mean they don't exist.
You want a far simpler example: I had a young student years ago who built a simulated annealing solver for optimization problems, using a description of the algorithm from an Operations Research text. He was a good software engineer, and a passable mathematics student. He could use what he built to find solutions to functional optimization problems, and he knew what that meant. He never understood the theory behind simulated annealing except in the most rudimentary way, however. He could not improve or modify his own code in any other than the most local fashion as a result. Of course an OR expert could understand and improve his code, but he could not. The difference between his situation, and the industry's situation vis-a-vis generative AI models, is that for many of these models, no one has what the OR expert had in his case: a deep understanding of why or how the algorithm (to him) magically found functional optima.
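For readers who haven't met the algorithm, a bare-bones simulated annealing loop looks roughly like this (a toy 1-D minimization; the student's solver, and any real one, would use problem-specific moves and a tuned cooling schedule):

```python
import math
import random

def simulated_annealing(f, x0, steps=5000, temp0=10.0, seed=0):
    """Minimize f by random local moves, sometimes accepting a worse
    move with probability exp(-delta / T). Cooling T over time is
    what, per the theory, lets the search escape local minima early
    and settle down later."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for step in range(1, steps + 1):
        t = temp0 / step  # a crude cooling schedule
        candidate = x + rng.uniform(-1.0, 1.0)
        fc = f(candidate)
        delta = fc - fx
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx

# A bumpy test function: quadratic bowl plus ripples.
def bumpy(x):
    return x * x + 3.0 * math.sin(5.0 * x) + 3.0

x, fx = simulated_annealing(bumpy, x0=8.0)
print(round(x, 2), round(fx, 2))
```

The student's position is exactly that of someone who can write this loop and use it, while the "why does the exp(-delta / T) acceptance rule find good optima" question belongs to the theory he never absorbed.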
> "Produce a JSON array containing objects with properties for the state and capital for each US state."

I grant language models are useful, but that isn't guaranteed to be devoid of hallucination. That kind of data is already available on government websites and places like Wikipedia, and facts like these aren't copyrightable.
> I think you are just objectively wrong with that last sentence. Experts have been working on these techniques for decades. Maybe some of these COMPANIES haven't bothered to bring any of those experts on board, but that certainly doesn't mean they don't exist.

This is where language models learn to double down on bullshit.
If you agree an OR expert could read your student's code and fully understand what is going on, I feel like my point stands.
But to your point, while they may not hold a perch as an academic expert or whatever, I feel like there is a solid argument that as long as the rudimentary understanding your student has of the mechanics underlying their implementation is accurate, in that context they would absolutely be able to accurately describe what the system is doing in response to their code. They would also likely be able to identify and accurately describe the same underlying mechanics when looking at others' implementations following similar methodology. Which is what, in my view, we're actually talking about here.
I'd go one further and say that your student would likely be able to more effectively articulate the theories underlying their code's methodology than an arbitrary academic expert who works in an unrelated field ... even though such an expert might instinctively denigrate their level of expertise.
You are talking about available GPT models as if you believe them to be significantly technologically distinct from one another. Can you expand on which points of differentiation you see this significant technological distinction falling along?
> This is just so fundamentally stupid... A system doesn't necessarily need to have physical connections, if it can just simulate those connections computationally. IIT seems to boil down to an argument of: you can't be conscious if your internal state update cycle falls below <x> Hz. Put another way, if your mind works too slowly, then it can't exist. Which is just freakishly dumb. Hard to believe that accomplished academics with PhDs would seriously produce and defend such dross...

Yes, IIT and similar "Don't call me idealist!" unfalsifiable theories have been a bane on the science of consciousness since literally the 1950s. The fact is that they're not science - we did not invent the lightbulb, vaccines, refrigeration, quantum mechanics, general relativity, and more by stating some anthropocentric metaphysical showerthoughts as fundamental axioms of reality and working backwards from the assumption that they have to be true without regard to evidence. That is in fact the tactic that religious zealots, small-c conservatives, scientists making a significant error, and their ilk have used to reject scientific progress with "common sense" for hundreds of years, ranging from the Catholic Church's opposition to heliocentrism to "God doesn't play dice". Quantum mechanics wasn't "common sense" and could never have been discovered with this strategy of "declare what we think is true to be incontrovertible and work from there; evidence doesn't matter".
I don't think anyone was super surprised by that. I guess maybe OpenAI themselves were ... because they had no idea what they used to train their own models? But even that's bullshit. Altman et al were just priming the hype pump.

Instructable as in I say "Produce a JSON array containing objects with properties for the state and capital for each US state."
It says:
[
{"state": "Alabama", "capital": "Montgomery"},
{"state": "Alaska", "capital": "Juneau"},
{"state": "Arizona", "capital": "Phoenix"},
{"state": "Arkansas", "capital": "Little Rock"},
{"state": "California", "capital": "Sacramento"},
{"state": "Colorado", "capital": "Denver"},
{"state": "Connecticut", "capital": "Hartford"},
{"state": "Delaware", "capital": "Dover"},
{"state": "Florida", "capital": "Tallahassee"},
{"state": "Georgia", "capital": "Atlanta"},
{"state": "Hawaii", "capital": "Honolulu"},
{"state": "Idaho", "capital": "Boise"},
{"state": "Illinois", "capital": "Springfield"},
{"state": "Indiana", "capital": "Indianapolis"},
{"state": "Iowa", "capital": "Des Moines"},
{"state": "Kansas", "capital": "Topeka"},
{"state": "Kentucky", "capital": "Frankfort"},
{"state": "Louisiana", "capital": "Baton Rouge"},
{"state": "Maine", "capital": "Augusta"},
{"state": "Maryland", "capital": "Annapolis"},
{"state": "Massachusetts", "capital": "Boston"},
{"state": "Michigan", "capital": "Lansing"},
{"state": "Minnesota", "capital": "Saint Paul"},
{"state": "Mississippi", "capital": "Jackson"},
{"state": "Missouri", "capital": "Jefferson City"},
{"state": "Montana", "capital": "Helena"},
{"state": "Nebraska", "capital": "Lincoln"},
{"state": "Nevada", "capital": "Carson City"},
{"state": "New Hampshire", "capital": "Concord"},
{"state": "New Jersey", "capital": "Trenton"},
{"state": "New Mexico", "capital": "Santa Fe"},
{"state": "New York", "capital": "Albany"},
{"state": "North Carolina", "capital": "Raleigh"},
{"state": "North Dakota", "capital": "Bismarck"},
{"state": "Ohio", "capital": "Columbus"},
{"state": "Oklahoma", "capital": "Oklahoma City"},
{"state": "Oregon", "capital": "Salem"},
{"state": "Pennsylvania", "capital": "Harrisburg"},
{"state": "Rhode Island", "capital": "Providence"},
{"state": "South Carolina", "capital": "Columbia"},
{"state": "South Dakota", "capital": "Pierre"},
{"state": "Tennessee", "capital": "Nashville"},
{"state": "Texas", "capital": "Austin"},
{"state": "Utah", "capital": "Salt Lake City"},
{"state": "Vermont", "capital": "Montpelier"},
{"state": "Virginia", "capital": "Richmond"},
{"state": "Washington", "capital": "Olympia"},
{"state": "West Virginia", "capital": "Charleston"},
{"state": "Wisconsin", "capital": "Madison"},
{"state": "Wyoming", "capital": "Cheyenne"}
]
That kind of thing is highly useful, especially if you give a large context model a lot of background information on the task in the prompt (that works even if it would never have been trained on that information or how to use it).
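The pattern described here is just string assembly: the background material gets concatenated into the prompt ahead of the task. A minimal sketch (the function name, section markers, and example content are all made up for illustration):

```python
def build_prompt(background_docs, task):
    """Assemble a prompt that puts task-specific background
    material ahead of the instruction, so a large-context model
    can draw on information it was never trained on."""
    context = "\n\n".join(background_docs)
    return (
        "Use only the background material below to complete the task.\n\n"
        f"=== BACKGROUND ===\n{context}\n\n"
        f"=== TASK ===\n{task}\n"
    )

prompt = build_prompt(
    ["Acme's internal ticket codes: SEV1 = outage, SEV2 = degraded."],
    "Classify this ticket: 'site is fully down'. Answer SEV1 or SEV2.",
)
```

The model never saw Acme's ticket codes in training, but with them sitting in the context window it can still use them.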
By "ask it to do something" do you mean return a text response? How well or poorly any particular model does at returning what you wanted to see in response to what you typed comes down entirely to how the model developers have built and implemented their system. That's where the competition is.

You can ask it to do something in English, and in many cases it will do it, even if it is a somewhat complex operation overall. It has its limits, but there are a lot of boring but necessary tasks I have found it can handle, and that was not a thing a few years ago.
It doesn't friggin matter what they want. Desires don't change objective reality.

There are programmers who, when asked how their program got an answer, start talking about variables and loops, functions and objects, etc. While technically an answer to the question, it is not what is being asked when a user, a board member, or the marketing guy asks why it did what it did. They want a more specific and narrow answer covering the reasoning behind that action.
You answered their question with full accuracy to the best of your own technical abilities stopping with your first sentence. After that, it becomes clear you don't know what you are talking about.

In this case, what they want to know is in the relationships encoded into the weights. That is not something you can read in the code, and it is not something you could trace by hand before you die of old age. We could maybe come up with debugging tools to allow us to probe this, but right now it is a big black box.
You can answer other questions, or say their question is not a good one, but when someone asks how a result is obtained, the answer they want is not readily available by reading some code, or looking at some data.
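To put a number on "before you die of old age": even a mid-sized model has far too many weights to trace by hand. A back-of-the-envelope calculation (the model size is a hypothetical round number, and one weight per second is absurdly optimistic):

```python
# Hypothetical 7-billion-parameter model, inspected at one
# weight per second, nonstop, around the clock.
params = 7_000_000_000
seconds_per_year = 60 * 60 * 24 * 365
years = params / seconds_per_year
# Roughly two centuries - and that's just *reading* the weights,
# not understanding the relationships encoded across them.
```

And that ignores that the meaning lives in combinations of weights, not individual values, which is the actual black-box part.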
I usually feed in the options it may select from (and check that the answer is at least on the possibility list); the use of common information was mostly to make it easy to produce, so it could be demonstrated that it can take on a vaguely complicated task.

I grant language models are useful, but that doesn't guarantee the output is free of hallucination. That kind of data is already available on government websites and places like Wikipedia, and facts like these aren't copyrightable.
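That "check the answer is at least on the possibility list" step is a one-liner worth doing every time. A minimal sketch (function and variable names are illustrative, not any particular library's API):

```python
def validate_choice(model_answer, allowed_options):
    """Accept the model's answer only if it is one of the
    options we fed into the prompt; otherwise raise so the
    caller can retry or fall back to a human."""
    answer = model_answer.strip()
    if answer not in allowed_options:
        raise ValueError(f"model returned {answer!r}, not in option list")
    return answer

# The option list came from us, so a hallucinated answer can
# never silently enter the data - worst case, we get an error.
capital = validate_choice("Sacramento", ["Sacramento", "Los Angeles"])
```

This doesn't make the model right more often; it just converts silent hallucinations into loud failures.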
As you point out, you don't know where that came from. It takes a whole lot of glue to make factual answers come out and you don't know if that's done right. Nobody has that 100% yet.
Generating game data, absolutely. Can't really screw that up. Language models are useful for that kind of thing. Factual answers are much much harder and nobody has that 100%.
This is where language models learn to double down on bullshit.
I think I get where you’re coming from, but I also think I disagree. If the idea is to replicate consciousness, then I don’t think it matters why it exists.

I think people are asking the wrong question. Rather than "What is consciousness?" it should be "Why is consciousness?"
The answer to that for every biological organism is to acquire energy, evade hazards (including other organisms that want to use you as an energy source), find a mate, and successfully reproduce.
There's a complex feedback system between consciousness and other bodily systems.
An AI doesn't have any of that. Mostly they just sit in big data centres where they're fed energy. They're not capable of physical reproduction, not even of self-assembling their own physical components.
While autonomous AI systems do exist that have to respond to environmental inputs, their ability to process and respond to those inputs is still pretty rudimentary compared to what humans, or even animals, can do. And in terms of EROI, a dinosaur, with its huge body and tiny brain, was probably still orders of magnitude more efficient at processing environmental data than the best AI - yet we know what happened to them.
If the nontechnical observers are unable to grasp your fully detailed answer to what you see as a highly technical question, it is still a black box to them.

I don't think anyone was super surprised by that. I guess maybe OpenAI themselves were ... because they had no idea what they used to train their own models? But even that's bullshit. Altman et al were just priming the hype pump.
Coding is one of the most OG things OpenAI has been trying to get these things to do accurately from the moment they were conceived - that was Microsoft's main interest.
And if you are being honest with yourself, they're still very much working on it. That's where we get to just how useful this ACTUALLY is. Sometimes the JSON produced is exactly what you wanted. Sometimes it isn't. I think it's probably a solvable problem but in my view, not with language models alone.
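One hedge against "sometimes the JSON isn't what you wanted" is to never trust the raw text: parse it and check its shape before anything downstream touches it. A sketch (not any particular framework's API) for the state/capital example above:

```python
import json

def parse_state_capitals(raw_text):
    """Parse model output and verify it is a JSON array of
    objects that each carry 'state' and 'capital' strings.
    Raise on anything else so the caller can re-prompt."""
    data = json.loads(raw_text)  # raises on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for row in data:
        if not (isinstance(row, dict)
                and isinstance(row.get("state"), str)
                and isinstance(row.get("capital"), str)):
            raise ValueError(f"bad row: {row!r}")
    return data

rows = parse_state_capitals('[{"state": "Texas", "capital": "Austin"}]')
```

This catches malformed output and wrong structure; it does nothing about a well-formed row with the wrong capital in it, which is the harder factual-accuracy problem.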
By "ask it to do something" do you mean return a text response? How well or poorly any particular model does at returning what you wanted to see in response to what you typed comes down entirely to how the model developers have built and implemented their system. That's where the competition is.
OTOH, if you're talking about the state machine that processes the token stream and does stuff like censoring or launching secondary apps in response to trigger sequences, that stuff isn't even AI. The models are static. If you are using the GPT-4 model and it's doing stuff it wasn't doing last year, that stuff is ALL the product of add-on conventional programming.
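That add-on layer really is ordinary code. As an illustrative sketch (the trigger terms and the redaction behavior are made up), a conventional pass over the model's token stream might look like:

```python
def postprocess(tokens, triggers):
    """Conventional (non-AI) pass over a model's token stream:
    redact any token on a block list and record which triggers
    fired, so other code can react (launch a tool, log, etc.).
    The model itself is untouched; this is all plain logic."""
    fired = []
    out = []
    for tok in tokens:
        if tok in triggers:
            fired.append(tok)
            out.append("[REDACTED]")
        else:
            out.append(tok)
    return out, fired

out, fired = postprocess(["run", "rm -rf", "now"], {"rm -rf"})
```

Change the trigger set and the system's visible behavior changes overnight, with the same static model underneath - which is the point being made above.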
It doesn't friggin matter what they want. Desires don't change objective reality.
The solution to non-technical observers being unable to grasp a fully detailed answer to the highly technical question they've asked is NOT to just make up some bullshit that sounds like what they want to hear.
You answered their question with full accuracy to the best of your own technical abilities stopping with your first sentence. After that, it becomes clear you don't know what you are talking about.
If it were me and a client/boss/colleague were asking, I would find out what they are really asking about before going one step further. Why do you want to know?
Because whatever specific thing they are wondering about very well might NOT be based on the weights at all. Could be the temperature setting. Could be sampler selections. Maybe context extension methodology. And with a system along the lines of OpenAI's, where pre/post parsers are doing a crapton of secondary output shaping, there's a whole universe of potential stuff coming from those conventional systems to consider.
The ultimate correct answer to "Why did it pick this specific thing to say?" could be a whole range of different things - weights just being one.
And, in the end, if the person really just wanted to have the underlying technology explained to them in terms a layman can understand, that's not an insurmountable task either. I'm pretty good at that stuff if an observer is interested in learning.
You guys are playing hypotheticals.
That is the thing. When the model can rely on external tools to copy and paste, it's good. If it comes from the weights, well, that's another story. And yeah, 99.9% correct is sometimes much, much worse than 90% or 98%.

My bigger concern is that it is frequently good enough that you do not need to correct anything at this point, and only needing to occasionally add more or correct something is in some ways more dangerous than a system where you can expect to be entering more every time. This is a bit dangerous when you have thousands of fields, as the temptation will be to skip the review if it is usually entirely correct.
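The "thousands of fields" worry is easy to quantify (the field count and accuracy figures here are hypothetical round numbers): a 90%-accurate system forces a full review, so its errors get caught, while a 99.9%-accurate system tempts you to skip review, so every residual error lands in the record.

```python
# Hypothetical workload: 10,000 fields filled per week.
fields = 10_000

# At 99.9% accuracy with review skipped, every residual
# error goes straight into the data, week after week.
errors_if_unreviewed = fields * (1 - 0.999)
# About ten silent errors per week, indefinitely - which is
# why "almost always right" can be worse than "often wrong".
```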
That helps, but having it in the context isn't a guarantee the model will emit those tokens. I hope you are validating the output. Also consider that your sampling settings will affect the chosen token. You may be better off getting the logprobs (if you are using the OpenAI API) and selecting the top option. Otherwise invisible dice are being thrown in that choice.

"I usually feed in the options it may select from"
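A sketch of that idea: instead of letting the sampler roll dice, take the per-token scores (here a hand-made dict standing in for an API's logprobs field - not actual API output) and deterministically pick the best option that is on the allowed list.

```python
def pick_top_allowed(logprobs, allowed):
    """Given token -> log-probability scores (the shape a
    logprobs-style API field gives you), deterministically
    choose the highest-scoring token on our allowed list."""
    candidates = {t: lp for t, lp in logprobs.items() if t in allowed}
    if not candidates:
        raise ValueError("no allowed option among top tokens")
    return max(candidates, key=candidates.get)

# Made-up scores: a sampler *could* have drawn "Paris" here,
# but the top allowed option is chosen every single time.
choice = pick_top_allowed(
    {"Sacramento": -0.2, "Paris": -1.1, "maybe": -2.5},
    {"Sacramento", "Los Angeles"},
)
```

This removes the sampling dice from the final choice entirely; the only remaining question is whether the model scored the right option highest.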