Superintelligence and existential AI risk ought to be taken seriously
The arguments by two Norwegian AI experts against doing so do not hold up to scrutiny
Last week, the Norwegian weekly news magazine Morgenbladet had a large feature article on AI risk, where a multitude of AI experts of various strands were interviewed about the various kinds of risks they see — including the meta-risk of focusing on the wrong risks. The article reinforces my impression that AI debate in Norway has a lot in common with its counterpart in neighboring Sweden. In particular, there is a small number of voices that are familiar with cutting-edge research on AI risk and AI safety, and with the mostly US-centered discourse around this topic1 led by thinkers like Yoshua Bengio, Geoffrey Hinton, Ajeya Cotra, Daniel Kokotajlo, Zvi Mowshowitz, Max Tegmark, Joe Carlsmith, Eliezer Yudkowsky and Nate Soares. These voices are pitted against another group who paint themselves as representing the academic establishment and who are dismissive of the existential risk concerns raised by the former group.
In the Morgenbladet feature, the former group is represented mainly by Aksel Braanen Sterri, who is research director at the Oslo-based think tank Langsikt. He says many interesting things, but little or nothing that I am inclined to push back against. So, in order to make this essay more pointed, I will focus on statements made in Morgenbladet by two representatives of the other group: Inga Strümke and Arnoldo Frigessi.
*
Inga Strümke is a physicist who works in machine learning and who in recent years has become Norwegian media’s favorite go-to person for discussing the societal implications of AI technology.2 In Morgenbladet, she stresses, with reference to the aforementioned US-centered discourse, how “extremely important it is that we do not import it to Norway”, because “we should rather focus our limited energy on concrete problems and opportunities here and now”.3
This implicitly assumes that the total level of attention and resources spent on existential AI risk concerns, and on the more down-to-earth issues that Strümke is more interested in, is some fixed amount, putting the two fields in a kind of zero-sum game. I believe that this assumption is likely wrong, and that the total amount of resources is better viewed as dynamically expandable, especially if the two fields manage to coexist and cooperate in a friendly manner rather than imagining themselves to be in direct competition over a fixed piece of the pie. Some arguments in this direction are given in Section 3 of my paper On the troubled relation between AI ethics and AI safety.
But even if I should happen to be wrong about this attention dynamics aspect, the issue of whether we who are concerned about existential AI risk ought to shut up cannot be settled by this consideration alone. This is because if the risk that a superintelligent AI takes over and wipes out humanity by, say, 2030 (as suggested in a much-discussed report by Kokotajlo et al.) is real and substantial, then obviously we need to talk about this risk and how to mitigate it (regardless of whether the attention game between AI safetyists and more Strümke-style AI ethicists is zero-sum or positive-sum). Hence, if Strümke wants to argue that, in order to protect Norway from the allegedly unhealthy US discourse on existential AI risk, we should shut up, she needs to make the case that the risk in question is negligible.
The closest that Strümke comes in the Morgenbladet feature to making such a case is when she says that “at present, there is no empirical or scientific basis for assuming that the predictions coming out of Silicon Valley will come true”, and adds that “claims about superintelligence are based on intuition, analogies, extrapolation, and technological optimism”. But I am not impressed by this argument. In fact, with the contrast she implies between scientific rigor on one hand, and intuition, analogies, and extrapolation on the other, she displays a troubling naivety about what science actually is. All three of the phenomena she contrasts with scientific rigor are, in fact, unavoidable components of it. (The fourth term, technological optimism, can be set aside here, since in this context it merely denotes a different assessment than Strümke’s of how quickly technological development can be expected to proceed, which can hardly, in itself, be considered a form of unscientific thinking.)
For science to be practically useful, it needs to say something about the future. But since the scientific observations and data we rely on are necessarily always located in the past, we must resort to extrapolation if we are to say anything at all about what lies ahead. Similarly, we must make use of analogies, such as between what we have observed in the laboratory and how the observed phenomena may be expected to play out in the wild, or between something observed under certain conditions in 2025 and how it might reappear under somewhat different conditions in 2027. And intuition, too, is inescapable, not just because researchers are human and intuition permeates human thought, but more importantly because no matter how mathematically precise and formally articulated our scientific models may be, there is always a residual element of intuition in our judgments about how much trust to place in their connection to reality, and in our analogies and extrapolations.
Contrary to what Strümke claims, assessments of the possible near-term emergence of superintelligence rest on a substantial body of empirical evidence concerning AI development, as well as on scientific analyses of that evidence. Perhaps the most well-known and striking example is the study conducted by the American AI evaluation organization METR on how deep the tasks are that language models are able to complete. Depth here is measured in terms of how long the tasks take human experts to complete, and what METR finds is that language models’ capabilities in this respect have grown from mere seconds in 2019 to many hours today. The observations closely follow an exponential curve with a doubling time of about seven months.
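To make concrete what it would mean for this curve to continue, here is a minimal back-of-the-envelope sketch in Python. The roughly seven-month doubling time is taken from the discussion above, but the starting task horizon and the projection dates are illustrative assumptions of mine, not METR’s actual data points.

```python
# Back-of-the-envelope extrapolation of an exponential task-horizon trend.
# Assumptions (hypothetical): a 2-hour task horizon "today" and a constant
# doubling time of 7 months; only the doubling time is taken from the text.

DOUBLING_TIME_MONTHS = 7
start_horizon_hours = 2.0  # assumed current horizon, for illustration only

for months_ahead in (0, 12, 24, 36, 48):
    horizon = start_horizon_hours * 2 ** (months_ahead / DOUBLING_TIME_MONTHS)
    weeks = horizon / 40  # expressed in 40-hour working weeks
    print(f"{months_ahead:2d} months ahead: ~{horizon:6.1f} hours (~{weeks:4.1f} work weeks)")
```

Under these assumptions the horizon grows from a couple of hours to several weeks of expert work within four years; whether the real curve keeps following the exponential is, of course, precisely the question at issue.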
Where this trajectory will lead in the years to come, no one knows with certainty. Perhaps development will continue along this exponential curve; perhaps the tendency toward further acceleration (i.e., even shorter doubling times) that we have seen since 2024 will intensify; or perhaps a ceiling will soon be reached, causing progress to level off. In the first two scenarios, there is considerable reason to think that within a few years we may reach a tipping point where the most advanced AI developers are no longer flesh-and-blood humans but AI systems themselves. This could create a kind of turbocharged feedback loop in development (a phenomenon studied in other empirically grounded scientific work, such as that of Eth and Davidson, 2025), with superintelligence following shortly thereafter. In the third scenario, by contrast, we are more likely headed toward the technologically more modest future that Strümke seems to envision.
Determining which of these scenarios is most likely is of utmost importance for our collective ability to prepare and to steer developments toward outcomes that are beneficial for humanity. Nothing is gained by dismissing the entire discussion as unscientific, as Strümke does.
Nor is she right in imagining that her own implicit predictions — of imminent saturation and flattening of development curves — are any less reliant on intuition, analogies, and extrapolation than those that involve superintelligence and other more dramatic possibilities, and are therefore somehow more scientific. On the contrary, I would argue that by avoiding making her assumptions explicit, and by seeking to sweep the entire discussion under the rug, she takes an approach that is, if anything, less scientific.
We are all prone to a kind of inductive complacency, expecting the future to remain essentially the same as the present. In many cases there is of course good reason to expect such continuity, but when some quantity is undergoing rapid change in some direction, as exemplified by current AI development, it is logically impossible for everything to remain the same: either the quantity itself or its rate of change must turn out different from today. Something has to give: either we will see an abrupt breaking and flattening of current development curves, or we will get AIs that are enormously more capable than those of today. One may reasonably disagree about which of these futures is more likely, but no one should be granted a free pass to treat their own view as the default and to dismiss all others as unscientific.
*
Arnoldo Frigessi comes, like me, from the academic discipline of mathematical statistics, and I recall with fondness the interactions we had at some statistics conferences in the 00s. He holds positions at the University of Oslo and the Norwegian Computing Centre. In the Morgenbladet feature, he dismisses the idea “that we can catastrophically lose control” of AI, and backs this up with the claim that “there is no mathematics that shows that the [AI] systems can develop a will of their own”.
I find this utterly unconvincing, for multiple independent reasons. The first and simplest is that even if we were to accept Frigessi’s claim, it is still the case that there is no mathematics that shows that the [AI] systems cannot develop a will of their own, so we are then in a position where mathematics does not settle the issue of whether AIs can develop a will of their own. Recycling an argument I made in the Strümke section above, my stance is that in such situations of high uncertainty, “no one should be granted a free pass to treat their own view as the default and to dismiss all others”.4
But I would like to spell out two other slightly more involved reasons for not being impressed by Frigessi’s argument, one having to do with his insistence on “mathematics”, and the other with “a will of their own”. Let me begin with the latter.
It is my experience from discussions of these sorts of AI matters that to many people, the term “will” comes with some heavy baggage in the form of highly anthropocentric and sometimes almost mysterian connotations of what it means to have a “will”. Since a superintelligent AI whose optimization target is some world-state that does not include humans is just as dangerous as one that “wills” such a state, my favorite move here is to simply drop all that baggage by not talking about “will” but instead about the much better understood notion of optimization targets. That takes us into entirely unmysterious waters, because even the simplest thermostat has an optimization target, such as that of keeping room temperature as close to 20°C as possible.
Here I imagine Frigessi objecting by saying that although this is a legitimate example of an optimization target, I am ignoring the “of their own” part of his claim, because the 20°C target was specified by us humans rather than being the AI’s own choice. That is a valid objection, so let me modify my example by connecting the thermostat to a large language model, tasked with figuring out a suitable temperature target, based on what would be convenient for people in the room, along with energy conservation concerns and whatever other relevant aspects the model can think of. The model then feeds this target into the thermostat, and the AI system as a whole (thermostat plus large language model) has thereby developed a target of its own.
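For concreteness, here is a minimal sketch in Python of the thermostat-plus-language-model system just described. The llm_choose_setpoint function is a hypothetical stand-in for an actual language-model call, not a real API; the point is only to illustrate how the combined system, rather than any explicit human instruction, ends up fixing the optimization target.

```python
# Sketch of the combined system: a language model picks the temperature
# target, and a simple thermostat controller then optimizes toward it.

def llm_choose_setpoint(context: str) -> float:
    """Hypothetical stand-in for asking a language model to pick a target
    temperature, weighing occupant comfort, energy use, and whatever else
    it deems relevant. A real model's answer emerges from its training,
    not from any human explicitly specifying the number."""
    return 19.5  # illustrative output

class Thermostat:
    """The simplest possible optimizer: push room temperature toward a target."""
    def __init__(self, target_celsius: float):
        self.target = target_celsius

    def control(self, current_celsius: float) -> str:
        if current_celsius < self.target - 0.5:
            return "heat"
        if current_celsius > self.target + 0.5:
            return "cool"
        return "idle"

# The system as a whole has arrived at its own optimization target.
setpoint = llm_choose_setpoint("meeting room, six occupants, high electricity price")
thermostat = Thermostat(setpoint)
print(thermostat.control(current_celsius=22.0))  # prints "cool"
```

Nothing mysterious has happened here, and yet the specific setpoint was not chosen by any human.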
At this point, Frigessi might decide to press on and say that even in this modified example, the target is ours and not the AI’s, because it was implicitly laid down by our training of and instructions to the language model. In doing so he would be following in the footsteps of Ada Lovelace, who wrote in 1842 about Charles Babbage’s Analytical Engine that it “has no pretensions to originate anything. It can do whatever we know how to order it to perform” (italics in original). I think that (given what we now know) this would be a mistake, and indeed Alan Turing, in his iconic 1950 paper Computing machinery and intelligence, already pointed out what is so untenable about Lovelace’s view, namely that the same argument, mutatis mutandis, applies to show that not even humans can originate anything. So Lovelace’s implicit definition of originality and creativity needs, in order to remain relevant, to be replaced by something less stringent. Turing proposes a definition involving the machine’s ability to surprise us, something that we of course encounter daily in today’s AIs, but which Turing in fact observed already in some of the early computers he was working on. I have expanded on these ideas elsewhere (such as in this book and in this paper), but to be honest I think neither I nor anyone else has been able to improve in any essential way on the crisp formulations in Turing’s original paper.
At the end of the day, I think the debate about what AIs can be said to originate on their own may turn out to be merely a semantic issue with little relevance to real-world outcomes. If, when the AI-controlled bulldozers arrive to turn everything we hold dear into paperclip factories, Frigessi says “Calm down, no need to worry, paperclip production is not the AI’s own will, but merely a thing we somehow installed in it”, I will find cold comfort in his words.
It remains to say something about Frigessi’s insistence that arguments about what AIs can do should take the form of mathematics. Having an academic background similar to his, I can see where this comes from, but I nevertheless find it misguided. I could easily add some mathematical formalism to, e.g., my thermostat example above, in order to better satisfy the tastes of mathematically inclined readers like Frigessi, yet I will not do so, because it is against my professional ethics to use mathematics merely as decoration to make arguments look more impressive, rather than for gaining insights that were not readily available without the mathematics.
If Frigessi wants to attain a better understanding of AI goals and motivations (or their “will”), then I strongly recommend that he acquaint himself with the theory of orthogonality and instrumental convergence, without worrying too much about its relative lack of mathematical formalism. The classic treatment of this is Nick Bostrom’s 2014 book Superintelligence, but there is plenty of more recent material for the interested reader to choose from, including my own papers from 2019 and 2025, and better yet, the recent book by Yudkowsky and Soares. This theory used to be somewhat divorced from direct empirical observation, but this is no longer true, given, e.g., the experimental work by Apollo and Anthropic on the highly worrying ways in which modern large language models exhibit the instrumentally convergent goal of self-preservation.
*
The positions taken here by Inga Strümke and Arnoldo Frigessi are representative of what I consider to be two of the three main kinds of arguments employed for dismissing existential AI risk concerns, namely (in Strümke’s case) “But how could AI ever become smarter than us?” and (in Frigessi’s) “Why would it ever want to hurt us?”.5 That both kinds are represented is a strength of the Morgenbladet piece, which is clearly meant to give a broad and balanced overview of the AI risk debate landscape. Still, I hope that some of its readers find their way over to the present essay, so as to learn why Strümke’s and Frigessi’s arguments for not taking issues around superintelligence and existential AI risk seriously do not hold up to scrutiny.
1. Obviously, in the Swedish instantiation of the AI risk debate, I mean this group of voices to include my own.
2. I am nowhere near having the analogous position in Swedish AI discourse, so in this respect Strümke has been markedly more successful than me. And as the reader will soon find out, there are major differences in how we think about AI. But there are also some similarities between us, including the odd coincidence that within a time span of just two years, we have both published books about AI with almost the same title, and strikingly similar Orwellian-inspired cover images.
3. Here and in what follows, all quotes from Morgenbladet are my own translations from the Norwegian original.
4. I don’t know whether Frigessi is inclined to break the symmetry here by saying “But surely, the burden of proof is on you to show that…”. To which a younger me would have felt tempted to respond along the lines of “On the contrary, by the precautionary principle, it is you who…”. However, with increased age and (one hopes) maturity I have come to resent that kind of burden-of-proof tennis as being unworthy of us rational scientists whose job it is to search for truth without being burdened by overly dogmatic prejudices about what this truth is. See my 2020 essay On science, uncertainty, the atomic bomb, and covid-19 for some related considerations.
5. The third kind is AI successionism, which I’ve written about at some length in an earlier Substack essay, and which is characterized by the claim that AI replacing humanity would actually be a good thing. That kind of thinking does not surface in the Morgenbladet feature, which is a bit of a relief for me in writing the present piece, because it is in a sense more difficult to counter than the other two kinds of argument. The difficulty stems from the fact that it takes place on the other side of Hume’s is-ought divide, where it is less clear that any evidence or rational argument at all has the power to settle disagreements.


