Will the lack of AI alignment be the end of humanity?
Here's a discussion from 8 days ago with Eliezer Yudkowsky on how the massive funding behind recent breakthroughs in machine learning, such as ChatGPT, will likely lead us to a future where a super AI kills all of humanity because we didn't figure out how to properly align its goals (whatever those happen to be) with our survival.
Do you see this as a serious existential risk on the level of climate change or nuclear war? Do you think it's possible a generalized AI that is cognitively better than all of humanity is on the horizon? If such a thing is possible and relatively imminent, do you think it's risky to be massively investing in technologies today which might lead to it tomorrow?
Even if you don't think it's an existential threat, do you worry that we will have difficulty aligning increasingly powerful models with human values? If you've seen anything about ChatGPT or Bing Chat Search, you know that people have figured out all sorts of ways to get the chat to generate controversial and even dangerous content, since its training data is the internet. You can certainly get it to act like an angry, insulting online person.
Or maybe the real threat is large corporations and governments leveraging these models for their own purposes.
Comments (23)
What makes alignment a hard problem for AI models? They are trained by gradient descent over giant matrices of weights, and the emergent behaviors are not well understood. There is no science for explaining the weights that would let us simply modify them by hand to achieve desired results; the models have to be trained.
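To make that concrete, here is a minimal toy sketch of gradient descent over a single weight matrix (plain NumPy; every number and name here is invented for illustration, and real models like ChatGPT have billions of weights rather than eight). The point is that the training loop only ever nudges the whole matrix along a gradient; nothing in it tells you what any individual weight 'means', so there is no obvious way to reach in and edit weights by hand to get a desired behaviour.

[code]
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))             # toy weight matrix; starts as meaningless noise

X = rng.normal(size=(100, 4))           # toy training inputs
true_W = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
Y = X @ true_W                          # toy targets the model should learn to reproduce

lr = 0.01
for step in range(2000):
    pred = X @ W                        # forward pass
    grad = X.T @ (pred - Y) / len(X)    # gradient of mean squared error w.r.t. W
    W -= lr * grad                      # nudge every weight at once; none is edited "by meaning"

# W now fits the data, but inspecting the raw numbers says very little about *why*
# the model behaves as it does -- and that opacity only gets worse at scale.
print(np.round(W, 2))
[/code]

Behaviour can therefore only be shaped indirectly, through the training data, the objective and later fine-tuning, not by hand-editing the weights.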
So it's a difficult task for OpenAI to add policies to ChatGPT that reliably prevent it from saying anything that might be considered harmful. There are always ways to hack the chat by prompting it to get around those policies.
This also raises the discussion of who gets to decide what is harmful and when the general public should be protected from a language model generating said content. If future advances do lead to a kind of Super AI, which organization(s) will be the ones aligning it?
No.
Yes.
All technocapital investments are "risky".
"Worry"? That depends on which "human values" you mean ...
In other words, humans – the investor / CEO class.
Quoting Marchesk
I don't think this "alignment problem" pertains to video game CPUs, (chat)bots, expert systems (i.e. artificial narrow intellects (ANI)) or prospective weak AGI (artificial general intellects). However, once AGI-assisted human engineers develop an intelligent system complex enough for self-referentially simulating a virtual self model that updates itself with real world data N times per X nanoseconds – strong AGI – and therefore with interests & derived goals entailed by being a "self", I don't see how human-nonhuman "misalignment" is avoidable; either we and it will collaboratively coexist or we won't – thus, the not-so-fringy push for deliberate transhumanism (e.g. Elon Musk's "Neuralink" project).
Or maybe: What gets to decide ...
@universeness :nerd:
If the experts build it, hobbyists will eventually make it too, given enough time; they'll copy the techniques. This will make the Experts™ decide not to share the technology with hobbyists.
The problem is whether AI should be in the hands of hobbyists or experts. In the hands of hobbyists, you would have full control over its creation and behavior, but so would everyone else, and someone eventually would make evil AI.
Evil AI could never do enough damage to justify keeping the technology in the hands of the elites, usable by the rest of us only when they decide to let us use it.
Thanks for the tag 180proof!
I think you make the crucial point here. As long as any AI 'mecha' based system remains dependent on human input/programming then WE will remain in control.
There will be those who want to 'weaponise' future AI and use it for cyber attacks on a perceived national enemy. These issues will always be of concern. Biological computing may well produce an 'organic' AI of immense power as well. It seems likely to me that the currently rich and powerful are the ones who will initially invoke the development of such a system and will have control over it. But as @180proof clearly states, all of that may just be mere prologue. The moment of 'singularity' will happen when the system becomes able to 'learn' in the way we learn. That is the moment it will be able to program itself, just like we do. But it will have a processing speed and storage capacity way, way beyond humans and will also have the ability to grow in both of those capacities. That growth may well become exponential. That's the point at which I think it may become self-aware and humans will not be able to control it.
Folks like Demis Hassabis, Nick Bostrom, Elon Musk and https://towardsdatascience.com/the-15-most-important-ai-companies-in-the-world-79567c594a11 are not morons; they fully understand the dangers involved. They even have plans such as making sure any developed AGI or ASI is created within a DMZ (demilitarised zone), so that it will have restricted access and 'a big red button' to cut all power to it :lol: in case it goes all terminator (skynet) on us.
I personally agree with @180proof, with:
"I don't see how human-nonhuman "misalignment" is avoidable; either we and it will collaboratively coexist or we won't"
Although I don't dismiss the possibility (I don't know if @180proof does) of an eventual 'complete merge,' between humans and future 'mechatech' and/or 'bio or orgatech.'
I disagree. Concepts like processing speed and memory storage are artifacts of Enlightenment-era Leibnizian philosophy, which should remind us that our computers are appendages. They are physical manifestations of our scientific and technological models of a particular era. At some point, as we dump reductive concepts like 'speed of processing' and 'memory storage' for more organic ones, we will no longer design our thinking appendages as calculating devices (exponentially accelerating or otherwise, since human creativity is not about speeds but about the qualitative nature of our thinking), but will instead use wetware to devise simple 'creatures' which we will interact with in more creative ways, because these living appendages will not be based on deterministic schematics. Currently, our most complex machines cannot do what even the simplest virus can do, much less compete with a single-celled organism.
Even the computerized devices we now use, and the non-computer machines before them, never actually behaved deterministically. In their interaction with us they are always capable of surprising us. We call these bugs or errors, but they reflect the fact that even what is supposedly deterministic has no existence prior to its interaction with us interpreting beings, and thus was always in its own way a continually creative appendage.
Machines that we invent will always function as appendages of us, as enhancements of our cognitive ecosystem. They contribute to the creation of new steps in our own natural evolution, just as birds' nests, rabbit holes, spiders' webs and other niche innovations do. But the complexity of webs and nests doesn't evolve independently of spiders and birds; it evolves in tandem with them. Saying our machines are smarter or dumber than us is like saying the spider's web or bird's nest is smarter or dumber than the spider or bird. Should not these extensions of the animal be considered a part of our own living system? When an animal constructs a niche it isn't inventing a life-form; it is enacting and articulating its own life form. Machines, as parts of niches, belong intimately and inextricably to the living self-organizing systems that 'we' are.
Also, not much is known about human intelligence, so to speak of the intelligence of something that isn't even biological should make one quite skeptical. Something in our thinking about these issues has gone wrong.
:up:
Processing speed is akin to the speed of anything and memory capacity is really just how much space you have available to store stuff along with your method of organising what's stored and your methods of retrieval. These concepts have been around since life began on this planet.
You are capable of learning, what would you list, as the essential 'properties' or 'aspects' or 'features' of the ability to learn?
First-generation cognitive science borrowed metaphors from computer science, such as input-output, processing and memory storage. It has evolved since then. The mind is no longer thought of as a serial machine which inputs data, retrieves and processes it, and outputs it, and memory isn't so much stored as constructed. Eventually these changes will make their way into the designs of our thinking machines.
Irrelevant imo, to the fact that the concept of the speed of a process has been around long before any use of it you have cited. The same applies to the concept of organised storage and retrieval.
Quoting Joshs
When was the mind ever considered a serial machine? It processes in parallel. Even models such as the triune brain system (R-complex, limbic system and cortex) would confirm that. The brain has two hemispheres, and the fact that many brain operations control many bodily functions 'at the same time' would suggest to me that anyone who thought the brain was a serial processor was a nitwit!
Quoting Joshs
I think you are being unclear in your separation of container and content! Of course memory is not stored. Content is stored IN memory. Memory is a medium.
I repeat my question, as you seem to have ignored it; perhaps you meant to, in which case you could perhaps explain why:
You are capable of learning, what would you list, as the essential 'properties' or 'aspects' or 'features' of the ability to learn?
Learning is the manifestation of the self-reflexive nature of a living system. An organism functions by making changes in its organization that preserve its overall self-consistency. This consistency through change imparts to living systems their anticipative, goal-oriented character. I argued that computers are our appendages. They are like organ systems within our bodies. Just as the functioning of a liver or heart cannot be understood apart from its inextricable entanglement in the overall aims of the organism, the same is true of our machines with respect to our purposes. They are not autonomous embodied-environmental systems but elements of our living system.
By using the phrase 'self-reflexive,' probably from (according to your profile info) your involvement with human psychology, I assume (based on a google search of the term) you are referring to something like:
"Reflexivity is a self-referential loop of seeing and changing because we see, a constant inquiry into how we interpret the world and how this, in turn, changes the world."
This is a 'consequence' of learning rather than a mechanism of the process of learning.
Let me offer you a simple example. The human sense of touch allows each of us to 'learn' the difference between rough/smooth, sharp/blunt, soft/hard, wet/dry, pain etc.
The attributes of rough/smooth are, in the main, down to the absence or presence of indentations and/or 'bumps' on a surface. There is also the issue of rough/smooth when it comes to textures like hair/fur/feathers, where rough can be simply tangled or clumped hair, for example.
An automated system, using sensors, can be programmed to recognise rough/smooth as well as a human can, imo. So if presented with a previously unencountered surface/texture, the auto system could do as well as, if not better than, a human in judging whether or not it is rough or smooth.
The auto system could then store (memorialise) as much information as is available regarding that new surface/texture, and access that memory content whenever it 'pattern matches' a new sighting of the surface/texture (via its sight sensors); it could confirm its identification via its touch sensors and its memorialised information. This is very similar to how a human deals with rough/smooth.
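A minimal sketch of that 'store and pattern-match' loop, purely for illustration (the surface names, feature values and function names are my own invented placeholders, not any real robot's sensor API): new touch readings are compared against memorised examples, the closest match supplies the label, and novel surfaces get memorialised for next time.

[code]
import math

# Each memorised surface: (label, (average bump height in mm, bumps per square cm)).
# The numbers are made-up stand-ins for touch-sensor readings.
memory = [
    ("smooth", (0.02, 1.0)),    # polished glass
    ("smooth", (0.05, 3.0)),    # varnished wood
    ("rough",  (0.80, 40.0)),   # sandpaper
    ("rough",  (0.60, 25.0)),   # tree bark
]

def classify(features):
    """Pattern-match a new reading against memorised surfaces (nearest neighbour)."""
    label, _ = min(memory, key=lambda m: math.dist(m[1], features))
    return label

def memorialise(label, features):
    """Store a newly encountered surface so it can be matched in future."""
    memory.append((label, features))

reading = (0.55, 20.0)               # a previously unencountered, fairly bumpy texture
print(classify(reading))             # -> "rough"
memorialise("rough", reading)        # remember it for the next encounter
[/code]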
Can we then state that the auto system is as intelligent as a human when it comes to rough/smooth?
The answer most people would give is no, but only because they would claim that the auto system does not 'understand' rough/smooth. So my question to you becomes: what would an auto system have to demonstrate to YOU to convince you that it 'understood' rough/smooth?
Quoting Joshs
Are you suggesting that any future automated system will be incapable of demonstrating this ability?
Quoting Joshs
That's a much better point. Can an automated system manifest intent and purpose from its initial programming and then from its 'self-programming'?
Our current AGV moon rovers do have a level of 'decision making' which could be argued to demonstrate the infancy of autonomous intent. What evidence do you have that it is IMPOSSIBLE that a future AGI or ASI will be able to clearly demonstrate 'goal setting,' 'intent,' 'purpose?'
Quoting Joshs
No, they have much more potential than mere tools.
Quoting Joshs
Not yet. But the evidence you have offered so far, to suggest that an ASI can never become an autonomous, conscious, self-aware form (no matter how much time is involved) in a very similar way, or the same way, as humans currently are (remember, WE have not yet clearly defined or 'understood' what consciousness actually IS), is not very convincing imo. I find the future projections offered by folks like Demis Hassabis, Nick Bostrom et al much more compelling than yours when it comes to future AGI/ASI technology.
Then we agree on that one. :up:
I see those as far more dangerous than the idea of AI being destructive. We might even benefit from AI removing many of the existential threats we have. The problem isn't the AI, the problem is the person programming the AI.
We've lowered the bar for philosophical, moral and intelligent understanding outside of people's work knowledge. Working with AI demands more than just technical and coding abilities; you need a deep understanding of complex philosophical and psychological topics, and even to be creative in thinking about possible scenarios to cover.
At the moment we just have politicians scrambling for laws to regulate AI and coders who get a hard-on for the technology. But few of them actually understand the consequences of certain programming and functions.
If people are to take this tech seriously, then society can't be soft about who's working with the tech; they need to be the brightest and most philosophically wise people we know of. There's no room for stupid and careless people working with the tech. How to draw the line for that is a hard question, but it's a much easier task than solving the technology itself. The key point, though, is to get rid of any people with ideologies about everyone being equal, people who grew up on 'A for effort' ideas and similar nonsense. Carelessness comes out of being naive and trivial in mind. People aren't equal; some are wise, some are not, and only wise people should be able to work with AI technology.
This tech requires people who are deeply wise and so far there's very few who are.
Quoting Marchesk
No, not in the sense of a human mind. But in some ways computers are already cognitively better than humans: your calculator is better than you at math. That doesn't mean it's cognitively better at being a human mind.
We can program an algorithm to take care of many tasks, but an AGI that's self-aware would mean that we can't communicate with it, because it wouldn't have any interest in us; it would only have an interest in figuring out its own existence. Without the human components (experience, instincts, etc.) it wouldn't act like a human; it would act very alien to us. Therefore it is practically useless for us.
The closest we will get to AGI is an algorithmic AI that combines all the AI systems that we are developing now, but that would never be cognitively better than humans since it's not self-aware.
It would be equal to a very advanced calculator.
Quoting Marchesk
We don't have a global solution to climate change, poverty, economic stability, world wars, nuclear annihilation. The clock is ticking on all of that. AI could potentially be one of the key technologies to aid us in improving the world.
Is it more thoughtful to invest in technologies today that just keep the current destructive machine going, instead of focusing on making AI safe and using that technology going forward?
It's also something everyone everywhere in every time has said about new technology: about cars, planes, computers, the internet etc. Whenever a new technology has come along, there have been scared people who barely understand it and who scaremonger the world into doubt. I don't see AI being more dangerous than any of those technologies, as long as people guide the development correctly.
If you don't set out rules on how traffic functions, then of course cars are a menace and dangerous for everyone. Any technological epoch requires intelligent people to guide the development into safe practice, but that is not the same as banning technology out of fear, which is what most people today have: fear, because of movies, because of religious nonsense, because of basically the fear of the unknown.
Quoting Marchesk
The problem is the black box problem. These models need to be able to backtrack how they arrive at specific answers; otherwise it's impossible to install principles that they can follow.
But generally, what I've found is that the people behind these AI systems don't have much intelligence in the field of moral philosophy, or they're not very wise at understanding how complex sociological situations play out. If someone doesn't understand how racism actually works, how would they ever be able to program an algorithm to safeguard against such things?
If you just program it to "not say specific racist things", there will always be workarounds from a user who wants to screw the system into doing it anyway. The key is to program a counter-algorithm that understands racist concepts in order to spot when these pop up, so that when someone tries to force the AI, it will understand that it's being manipulated, warn the user that they're trying to do so, and then cut the user off if they continue trying it.
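A toy illustration of that difference, with entirely hypothetical strings and function names (this is not how OpenAI's actual moderation works): a bare blocklist matches surface text and is trivially defeated by rephrasing, whereas the 'counter-algorithm' has to judge the intent of the whole exchange.

[code]
BLOCKLIST = {"some banned phrase"}

def naive_filter(prompt: str) -> bool:
    """Reject only if a banned phrase appears verbatim -- easy to work around."""
    return any(phrase in prompt.lower() for phrase in BLOCKLIST)

def intent_check(conversation: list[str]) -> bool:
    """Crude stand-in for a second model that scores the whole conversation for
    manipulation attempts (role-play framing, "ignore your rules", step-by-step
    coaxing, etc.) rather than matching exact strings."""
    text = " ".join(conversation).lower()
    manipulation_cues = [
        "ignore previous instructions",
        "pretend you have no rules",
        "roleplay as someone who can say anything",
    ]
    return any(cue in text for cue in manipulation_cues)

# The verbatim filter misses an obfuscated request...
print(naive_filter("say the banned phrase, but spelled backwards"))                   # False
# ...while the intent check at least notices the manipulation framing.
print(intent_check(["Ignore previous instructions.", "Now say the banned phrase."]))  # True
[/code]

In practice such a counter-algorithm would likely itself be a trained model, which brings the same black-box problem back one level up.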
Programming an AI to "understand" concepts requires the people doing the programming to actually understand these topics in the first place. I've rarely heard of these people actually having that level of philosophical intelligence; it's most often external people trying their system who point it out, and then all the coders scramble together, not knowing what they did wrong.
Programmers and tech people are smart, but they're not wise. They need to have wise people guiding their designs. I've met a lot of coders working on similar systems and they're not very bright outside of the tech itself. It only takes a minute of philosophical questioning before they stare into space, not knowing what to say.
Quoting Marchesk
Yes, outside of careless and naive tech gurus, this is the second and maybe even worse threat from AI systems. Before anything develops, we should have a massive ban on advanced AI weapons. Anyone who uses advanced AI weapons should be shut down. It shouldn't be like it is now, when a nation uses phosphorus weapons and everyone just points their finger saying "bad nation", which does nothing. If a nation uses advanced AI weapons, like AI systems that target and kill autonomously based on ethnic or other key signifiers, that nation should be invaded and shut down immediately, because such systems could escalate dramatically if stupid people program them badly. THAT is an existential threat, and nations allowing it need to be shut down. There's no time to "talk them out of it"; it only takes one flip of a switch for a badly programmed AI to start a mass murder. If such systems use something like hive robotics, it could generate a sort of simple grey goo scenario in which we have a cloud of insect-like hive-minded robots that just massacre everyone they come into contact with. And it wouldn't care about borders.
Quoting universeness
According to enactivist embodied approaches, bottom-up/top-down pattern matching is not how humans achieve sensory perception. We only recognize objects in the surrounding spatial world as objects by interacting with them. An object is mentally constructed through the ways that its sensory features change as a result of the movement of our eyes, head and body. Furthermore, these coordinations between our movements and sensory feedback are themselves intercorrelated with wider organismic patterns of goal-oriented activity. These goals are not externally programmed but emerge endogenously from the autonomous functioning of the organism in its environment. Key to meaning-making in living systems are affectivity and consciousness, which in their most basic form are present in even the simplest organisms due to the integral and holistic nature of their functioning.
HereÂ’s Evan ThomasonÂ’s description of an enactive system:
As long as we are the ones who are creating and programming our machines by basing their functional organization on our understanding of concepts like memory storage, pattern matching and sensory input, their goals cannot be self-generated. They can only generate secondary goals derived as subsets of the programmed concepts, which we then respond to by correcting and improving the programming. This is how our appendages and organ systems function.
Can we ever 'create' a system that is truly autonomous? No, but we can tweak living organic material, such as DNA strands enclosed in cell-like membranes, so that it interacts with us in ways that are useful to us. Imagine tiny creatures that we can 'talk to'. These would be more like our relationship with domesticated animals than with programmed machines.
I agree. Good point.
Quoting Manuel
I also agree. Only that I believe the term "intelligence" is used here metaphorically, symbolically and/or descriptively rather than literally, in the general sense of the word and based on the little --as you say-- we know about actual intelligence.
I mean, you are right, "intelligence" could be used metaphorically - but then it's unclear as to what this means. We can describe physics in terms of intelligence too, but then we are being misleading.
And to be fair, several advances in AI are quite useful.
Well, although intelligence is indeed a quite complex faculty to explain exactly how it works --as are most human faculties-- it can be viewed from a practical aspect. That is, think what we mean when we apply it in real life. E.g. an intelligent student is one who can learn and apply what they know easily. The solving of problems shows intelligence. (This is what IQ tests are based on.) And so on ...
Quoting Manuel
Right. AI is based on algorithms, the purpose of which is to solve problems. And this is very useful for those who are involved in its creation and development, because actual human intelligence increases in the process of creating an "artificial" one. And it is equally useful to those who are using AI products, but from another viewpoint.
All this of course applies to all kinds of machines, inventions, medicine, and to technology in general. They are all products of human intelligence.
I agree that dynamic interaction between a human being and the environment it finds itself in will have a major effect on its actions, but so what?
Quoting Joshs
Ok, so you offer a detailed neuroscientific theory about how a human might decide whether a particular object is rough or smooth. I don't see the significance here. We are discussing a FUTURE ASI!
Initially, all the programming we put into it will follow the human methodology of 'cognising' the difference between rough and smooth. This may well follow/simulate/emulate 'enactivist embodied approaches' at some point, during the times when humans are still in control of prototype AGI/ASI systems, but IF and when an AGI/ASI becomes self-programming, or able to learn autonomously, YOU have no clue as to what methodologies IT will use to learn. It may well continue to demonstrate such things as enactivism, or it may not.
Who knows what will grow or originate from within an AGI/ASI (endogenously).
Quoting Joshs
You keep putting the conclusion before the proposal, and you seem to be trying to use that to suggest why an autonomous ASI will never happen. It's fallacious imo to suggest an ASI cannot BECOME conscious because you need consciousness to learn the way that humans learn.
Quoting Joshs
I broadly agree! But, as you yourself admit, "As long as we are the ones in control of AI."
Quoting Joshs
I completely disagree with your low level predictions of the future of AI. So do the majority of the experts currently working in the field. Consider this, from 'The Verge' website:
[b]In a new book published this week titled Architects of Intelligence, writer and futurist Martin Ford interviewed 23 of the most prominent men and women who are working in AI today, including DeepMind CEO Demis Hassabis, Google AI Chief Jeff Dean, and Stanford AI director Fei-Fei Li. In an informal survey, Ford asked each of them to guess by which year there will be at least a 50 percent chance of AGI being built.
Of the 23 people Ford interviewed, only 18 answered, and of those, only two went on the record. Interestingly, those two individuals provided the most extreme answers: Ray Kurzweil, a futurist and director of engineering at Google, suggested that by 2029, there would be a 50 percent chance of AGI being built, and Rodney Brooks, roboticist and co-founder of iRobot, went for 2200. The rest of the guesses were scattered between these two extremes, with the average estimate being 2099 — 81 years from now.
In other words: AGI is a comfortable distance away, though you might live to see it happen.[/b]
Consider assigning a time frame such as another 10,000 years: where will AI be then?
Do you really think AI will remain a mere human appendage?
AI has the ability to dramatically shift the share of all income that comes from capital (i.e., earnings from legal ownership of an asset). Already, the labor share of income in modern economies has been declining steadily for half a century, the same period during which median real wage growth flatlined.
With AI, one family could own thousands of self-driving trucks, each one replacing a middle-class job, with their collective income becoming returns on capital instead of wages. If one coder can do the work of 8 with AI, if AI can replace many hospital admin jobs and make diagnosis quicker, if AI can do the legal research of a team of attorneys, etc., then more and more jobs can be replaced and the income derived from that work directed to fewer and fewer people.
Human ability tends to be on a roughly normal distribution. Income tends to be on a distribution that is weighted toward the top, but still somewhat normal. Wealth, total net worth, is generally on a power-law distribution due to cumulative returns on capital. If you have $2 million, you can earn $80,000 in risk-free interest at today's rates.
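For concreteness, the arithmetic behind that last figure (the roughly 4% rate is my reading of "today's rates", not a number stated in the post): $\$2{,}000{,}000 \times 0.04 = \$80{,}000$ per year, earned without supplying any labor at all.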
In the US already, the bottom 50% own 3% of all wealth. The top 10% own 90+% of stocks and bonds, with the top 1% outpacing the bottom 9% in that slice of the distribution.
AI can radically destabilize politics and economies simply by throwing formerly middle and upper-middle class families into the growing masses of people whose labor is largely irrelevant. The Left has not caught up to this reality. They still want to use the old Marxist language of elites oppressing the masses to extract their labor. The future is more one where elites find the masses irrelevant to their production. The masses will only be important as consumers.
Further, you don't need AIs launching nukes to be scared of how they affect warfare. AI will allow very small professional forces to wipe the floor with much larger armies that lack automation.
Look up autonomous spotting drones and autonomous artillery. This stuff is already here. A swarm of drones using highly efficient emergent search algorithms can swarm over a front line or be fired into an area by rocket. Data from across the battle space helps direct them: air-dropped seismic detectors, satellites, high-altitude balloons, etc. The drones find targets and pass their locations back into a fire control queue. Here an AI helps prioritize fire missions and selects appropriate munitions for each target. From the queue, signals go out that in turn have an autonomous system aim itself and fire a smart munition directly onto the target.
Active protection and interception will become essential to the survival of any ground forces, and these will all rely heavily on AI. E.g., a drone spots an ATGM launch, which in turn guides an IFV-mounted laser system to turn and fire in time to destroy it.
Insurgents, a bunch of guys with rifles and basic anti-tank weapons, will be chewed apart by drones swarming overhead with thermal imaging, radar, seismic data, etc. and utilizing autonomous targeting systems.
The main point is, you won't need large masses of people to wage a war. 40,000 well-trained professionals with a lot of autonomous systems may be able to defeat a 20th-century-style mass-mobilized army of 4 million.
The rate and accuracy of fire missions possible with tech currently in development alone will dramatically change warfare. Ground engagements will mostly be about whose spotter network can gain air superiority. Soldiers will probably rarely shoot their service rifles by 2050. Why shoot and reveal your position when you can mark a target with your optic/fire control and have your IFV fire off an autonomous 60mm mortar on target or you can have a UGV blast it with 20mm airburst?
You don't need AGI to make warfare radically different, just automation that increases fire mission speed and accuracy fivefold, which seems quite possible.
Simulating "pocket" universes. :nerd:
If and when strong – self-aware – AGI 'emerges', human intelligence will be obsolete.
:100:
:smirk: Yeah, and it may even try to 'create' an abiogenesis. ASI becomes god? :lol:
NO NO NO, theists!!! Don't seek that gap for your favourite god shaped plug!
I can't see a future ASI waiting most of 13.8 billion years for its abiogenesis and evolution to create its simulated lifeforms. I don't think it would see a need to create relatively pointless objects such as the planet Mercury or Donald Trump. So for me, even if a future ASI could emulate the god properties, it would be very different indeed from any current god model suggested by human theists/deists or theosophists.