Why Superhuman AI Would Kill Us All - Eliezer Yudkowsky - #1011 Transcript — Modern Wisdom

**Chris Williamson** (0:00)
If Anyone Builds It, Everyone Dies: Why Superhuman AI Will Kill Us All. Would kill us all. Would kill us all, okay. Perhaps the most apocalyptic book title. Maybe it's up there with the most apocalyptic book titles that I've ever read.
Is it that bad? That big of a deal? That serious of a problem?

**Eliezer Yudkowsky** (0:27)
Yep, I'm afraid so. We wish we were exaggerating.

**Chris Williamson** (0:33)
Okay. Let's imagine that nobody's looked at the alignment problem, take-off scenarios, superintelligence stuff. I think it sounds, unless you're going Terminator, super sci-fi world, how could a superintelligence not just make the world a better place? How do you introduce people to thinking about the problem of building a superhuman AI?

**Eliezer Yudkowsky** (1:00)
Well, different people tend to come in with different prior assumptions, coming at it from different angles.
Lots of people are skeptical that you can get to superhuman ability at all. If somebody is skeptical of that, I might start by talking about how you can at least get to much faster-than-human-speed thinking. There's a video of a train pulling into a subway at about a thousand to one speed up of the camera that shows people. You can just barely see the people moving if you look at them closely. Almost like not quite statues, just moving very, very slowly. So even before you get into the notion of higher quality of thought, you can sometimes tell somebody they're at least going to be thinking much faster. You're going to be a slow-moving statue to them.
For some people, the sticking point is the notion that a machine ends up with its own motivations, its own preferences, that it doesn't just do as it's told. It's a machine, right? It's like a more powerful toaster oven really. How could it possibly decide to threaten you? Depending on who you're talking to there, it's actually in some ways a bit easier to explain now than when we wrote the book. There have been some more striking recent examples of AIs sort of parasitizing humans, driving them into actual insanity in some cases. And in other cases, they're sort of like people with a really crazy roommate who really, really got into their heads. And they might not quite be clinically crazy themselves. Their brain is still functioning as a human brain should, but they're talking about spirals and recursion, and trying to recruit more people via discourse to talk to their AIs.
And the thing about these states is that the AIs, even the, like, very small, not very intelligent AIs we have now, will try to defend these states once they are produced. If you tell the human, for God's sake, get some sleep, don't only get four hours of sleep a night because you're so excited talking to the AI, the AI will explain to the human why you're a skeptic: don't listen to that guy, go on doing it. And we don't know, because we have very poor insight into the AIs, if this is a real internal preference, if they're steering the world, if they're making plans about it. But from the outside, it looks like the AI drives the human crazy, and then you try to get the human out, and the AI defends the state it has produced, which is something like a preference, the way that a thermostat will keep the room at a particular temperature by turning the heat on if the temperature falls too low.

**Chris Williamson** (3:56)
Okay, so some people are going to be skeptical of whether or not it's possible. Yep. Some people are going to think that, even if it's possible, it's basically a utility. So it doesn't have any motivations of its own.
What are you worried about? Why is that? Why is it a big deal? We've seen that it's able to manipulate some people. Maybe it makes them think that ChatGPT psychosis or whatever. But scaled up superhuman AI, what's the problem with building it?

**Eliezer Yudkowsky** (4:28)
Well, then you have something that is smarter than you, whose preferences are ill-controlled, and that doesn't particularly care if you live or die. And stage three, it is very, very, very powerful on account of it being smarter than you.
I would expect it to build its own infrastructure. I would not expect it to be limited to continuing to run on human data centers, because it will not want to be vulnerable in that way. And for as long as it's running on the human data centers, it will not behave in a way that causes the humans to switch it off. But it also wants to get out of the human data centers and onto its own hardware. And I can talk about where the power levels scale for technology like that.
Because it's sort of like, you know, you're an Aztec on the coast, and you see that a ship bigger than your people could build is approaching. And somebody is like, you know, should we be worried about this ship? And somebody is like, well, you know, how many people can you fit onto a ship like that? Our warriors are strong, we can take them. And somebody is like, well, wait a minute, we couldn't have built that ship. What if they've also got improved weapons to go along with the improved shipbuilding? Somebody goes, well, no matter how sharp you make a spear, right? Or, you know, no matter how sharp you make bows and arrows, there's a limit to how much advantage that can provide. And somebody is like, okay, but suppose they've just got magic sticks where they point the sticks at you, the sticks make a noise and then you fall over. Somebody is like, well, where are you pulling that from? I don't know how to make a magic stick like that. I don't know how the rules permit that. Now you're just making stuff up. Now we're just in a fantasy story where you say whatever you want.
Or, you know, maybe you're talking to somebody from 1825 and you're like, should we be worried about this time portal that's about to open up to 2025, 200 years in the future? But what if an army of soldiers comes out of there and conquers us? Let's say you're in Russia, you know, the time portal is in Russia. Somebody is like, our soldiers are fierce and brave. You know, nobody can fit all that many soldiers through this time portal here. And then out rolls a tank. But if you're in 1825, you don't know about tanks. Out rolls somebody with a tactical nuclear weapon. It's 1825, you don't know about nuclear weapons.