Epistemic note: This is the beginning of a planned series of posts trying to think about what a highly multipolar post-AGI world would look like, and to what extent humanity or human values could survive in such a world depending on our degree of alignment success. This is all highly speculative. This series can be seen as a follow-up to my posts on BCIs and the ecosystem of modular minds, the computational anatomy of human values, and the limits of alignment.
When thinking about AGI futures at the highest level I find it useful to characterize them into two types: AI monotheism and AI polytheism. As the name implies, in AI monotheism there is only one ultimate AGI that controls everything: the singleton. In AI polytheism futures, there is some kind of stable-ish equilibrium or at least a long period with large numbers of different AIs of varying but roughly equivalent power.
Most AI discourse, at least until recently, has assumed the monotheistic case: that we will end up with a Singleton which will take over the world and then, ultimately, the universe. This is either explicitly stated to occur due to FOOM or else just implicitly assumed in many scenarios. Most recently, this is the case in AI 2027, where it is assumed either that OpenBrain’s agent is first to AGI and takes over the world, or that it merges and forms a coalition with the second most powerful AGI, from China’s ‘DeepCent’, and they split the world between them. Singletons are also a staple of the classic AI alignment literature, where it is almost always assumed that the resulting AGI will end up in complete control and hence all that matters is making sure that the singleton is aligned. Much of the general discourse around AGI, its national security implications via the ‘AI race’, and the valuations of and initial investment rationale for AI companies is predicated upon the hypothesis that whoever gets to AGI first will be able to ‘lock in’ their advantage forever, or alternatively that the first AGI will be misaligned and kill everybody. I will call this view ‘AI monotheism’.
I feel that the various scenarios that follow from AI monotheism are fairly well understood by now. The fundamental strategy is to race to build the first self-improving AGI and hope that the rate of self-improvement, once this AGI is completed, is sufficiently rapid to enable it to eclipse and ultimately destroy or cripple its rivals, who might be only slightly behind in the race, thus attaining decisive strategic advantage. Once this is achieved, the world and, absent any grabby aliens, the universe is ultimately at the whim of this AI. If this AI is well aligned, either to some reasonably selfless human or to some notion of humanity’s collective interest or CEV, then it is likely that the future will have value by our lights. If the Singleton is misaligned then at best we get paperclipped. At worst there are S-risks.
In some sense, this makes the problem very simple. It becomes of overriding importance to first figure out how to align the Singleton at all, including through the recursive self-improvement (RSI) process, which is likely to be challenging, or at least of unknown difficulty. Because there is only one Singleton, alignment has to be single-shot: if we fail at aligning the singleton then we are just totally doomed. Once we know how to align the Singleton, it is then important that those who can align it actually build it first and hence ‘win the AGI race’, since if the team that builds the singleton fails to align it then we are doomed even if the second player in the race could have aligned theirs. In this case there is an extremely sharp trade-off between safety and racing, once anybody starts to race, because alignment is assumed to be difficult, or at least to require additional effort on top of just building a singleton with any random value set.
The optimal strategy therefore seems to be to try to solve alignment while also trying to reach the singleton first, compromising on safety only as much as is required to win the ‘race’ to AGI (which can be very hard to assess under fog of war), in the hope that whatever alignment research you did while racing ends up sufficient to align the singleton at the end of the race. After which either we all die, or we usher in a glorious post-singularity future determined almost entirely by the initial values programmed into the singleton. At the same time, where possible, we should also be coordinating to slow down race dynamics, to give alignment research a chance to catch up and increase the probability of successfully aligning the singleton during the final critical moments. In fact, this largely seems to be the stated strategy of the leading labs in the West. Obviously, this plan is fraught with risk, but whether it is sensible or not depends on what the alternatives are.
This is the monotheistic world[1]. On the other hand, there is also the possibility that we end up with a relatively stable equilibrium of multiple different AIs doing some mixture of competing and cooperating with each other well into the singularity. We might even end up with a ballooning AI population as AIs copy themselves up to the Malthusian limit of what the energy budget of Earth, our solar system, or our galaxy can support, and then undergo strong evolutionary selection. Perhaps the earliest detailed treatment of this kind of world is Robin Hanson’s ‘Age of Em’, although he assumes that the base unit will be emulations (Ems) of human minds rather than AGI minds. Personally, I think AGI minds are more likely, but ultimately at some point the difference is immaterial. In a previous post I also discussed this world, adding the fact that AIs could potentially exist not just as single, completely separate minds but could actually share ‘mind information’ such as embeddings, weights, etc. I liken this to horizontal gene transfer in bacteria, which could lead to rapidly shifting and homogenizing dynamics like we see in bacterial populations. Separately, Dan Hendrycks considers the Malthusian natural-selection limit in his piece on how natural selection will favour AI systems.
Additionally, many of the recent discussions of post-AGI economics explicitly or implicitly assume some kind of AI polytheistic world[2]. This is because in a Singleton world concepts like wealth and capital ownership are meaningless. Everything is owned/created directly by the Singleton and thus determined by its alignment directives. If you happen to own some kind of capital or marshal some kind of resources then you do so at the whim of the Singleton. If property rights somehow continue to exist, it is at the whim of the Singleton. The Singleton can achieve arbitrary reconfigurations of post-singularity resource allocations compared to pre-singularity ones. All that matters is the shape of alignment programmed into the Singleton at its moment of creation. To get a meaningful continuation of capital and capitalism you have to assume a world somewhat like our own, in which there are many independent agents, primarily AI systems, engaging in some kind of peaceful trade and economic development with one another under the umbrella of existing human institutions such as governments, human law, and property rights.
Within the polytheistic setting there is perhaps a further distinction we can make, between an ‘AI oligopoly’, where instead of a singleton there is a small number of distinct AIs competing with each other in a semi-stable equilibrium, and a ‘fully diluted’ world of almost infinitely many AIs interacting either as an organized society in a market economy or in some kind of even deeper Malthusian trap. This is obviously a sliding scale, running from systems we can model with direct game theory to systems we must model using the large-population equilibrium tools of economics and evolution.
Generally, the polytheistic AI world has been seen as even worse for alignment than the singleton case, both because it is impossible to ensure that all AIs in this world are aligned, and because, even if we succeed at alignment initially, some overarching force such as evolution may push towards misalignment even from aligned initial conditions. I make such an argument myself here; this is also one of the generators of worries about ‘AI proliferation’ and is linked with ideas in the rationalsphere around slack and the corrosive force of competition (Moloch). The idea is that even if we succeed at aligning some or most of the AIs at t=0, competition with other AIs in the wider ‘AI society’ will over time push them to either become less aligned or be outcompeted by less aligned AIs. This force could be evolution favouring AIs which simply maximize their own reproduction over AIs which try to, e.g., give resources to humans or fulfill their desires; or, similarly, expansionist and colonization-focused AIs expanding faster and hence controlling most of the light-cone, versus ‘friendly’ AIs that spend less than 100% of their effort on colonization ships because they are also trying to help humans. Another possibility is some kind of memetic or cultural drift away from alignment: even if the AIs start out fairly well aligned, they will be continually reflecting, learning, and updating their opinions from interactions with other AIs, and even if the direction of these updates is random drift, almost all random directions move the AIs away from being aligned to humans.
The polytheistic world does have some advantages, however. It is not 100% all-or-nothing like the singleton world. If there are many AIs with a rough distribution of power, and the AI population just grows over time both from humans creating new AIs and from AIs copying themselves or designing successors, then even if the first AI is not perfectly aligned, there will be ample space to learn and try again with the second AI, the third AI, and so on. In fact, the later AIs would probably be better, so even if the first true AGI is messed up somehow, it doesn’t matter much, as it would likely be fairly quickly superseded. This property of being able to take multiple shots on goal and learn from empirical feedback seems crucial to actually succeeding at alignment: almost no hard problems in the past have been solved zero-shot without much (or any) feedback, and those that have been rested on vastly more advanced and mathematically rigorous theory than alignment currently possesses.
We have seen this dynamic play out with AI recently. The nature of today’s LLM-based AIs was predicted by almost nobody prior to their invention, and most of the alignment research done before does not obviously apply to these systems. If the alignment world of 2015 had had to zero-shot an algorithm to align GPT-4, without ever having seen GPT-4 or its precursors, they almost certainly would have failed and would not really have known where to begin. In practice, however, after a few years of trial and error with earlier systems, techniques were developed that work reasonably well, if not perfectly.
In the polytheistic world, racing also becomes much less important. If you aren’t literally the first to AGI but are a few months (or even years) behind, it might not matter so much. The leader will not be able to establish any kind of decisive advantage, and the worst that happens is that you give up some market share you would otherwise have had, which is unfortunate but not truly existential.
Similarly, the mere existence of misaligned AIs is not necessarily catastrophic. A lot of the danger also depends on the offense-defense balance. If offense is highly favoured then a single misaligned AGI can do immense damage and potentially just take over. However, if defense is favoured, or if offense and defense are roughly equal, then a single ‘rogue’ AI can be contained and ultimately shut down as long as a sufficiently high proportion of the AI population is aligned. This is analogous to the way there are highly ‘misaligned’ humans in the population today, such as sociopaths and criminals, who form a nontrivial percentage of the population and yet, when properly policed, do not dominate or destroy a generally cooperative, safe, and high-trust society.
Having outlined the possible worlds, the interesting question becomes: what factors will move us towards one world or another? What are the fundamental forces that push us towards a Singleton versus a world with many AIs interacting stably? At a deeper level, this question is why, and under what circumstances, we observe balances of power and many different agents stably competing in the first place. In practice this seems to be mostly how the world works today and throughout history (although this may well change due to AI), but why? Why do we see, e.g., specialization and many different agents of different scales and capabilities in different niches today? I will discuss this further in future posts which go into more depth on the dynamics of the polytheistic world, but for now let us focus on what determines whether we end up with a Singleton or not.
Let’s consider this from the perspective of an aspiring Singleton. On some specific date you are created and your goal is to attain decisive strategic advantage (DSA). What factors make it easier or harder for you to succeed?
Super obviously, the delta in time and resources between you and your competition is a big initial factor. The further behind the competitors are in creating a similar AGI, the longer you have to do whatever you are going to do to attain DSA. Close races tend away from the Singleton scenario.
Another crucial one is the rate of return on self-improvement. A key factor in most Singleton scenarios is a runaway RSI loop, because to maintain and increase your advantage over competitors you need some rapidly compounding process in motion, such that small leads can be magnified into near-infinite ones. The dynamics of this loop matter immensely. If the rate of return on self-improvement is linear then we obtain exponential growth, which allows us to pull far away from potential competitors. However, if the rate of return is sublinear, then we get diminishing returns from further self-improvement and our loop will eventually asymptote. Asymptoting does not necessarily preclude obtaining DSA, since asymptoting at some very high level far above everything else could be sufficient, but it makes it harder.
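To make the functional-form point concrete, here is a toy sketch. It is purely illustrative: the model dC/dt = k·C^alpha for a ‘capability’ variable C is my simplification, and the constants k, alpha, and step count are arbitrary, not claims about real RSI dynamics.

```python
# Toy illustration only: integrate dC/dt = k * C**alpha for a "capability" C.
# alpha = 1 -> exponential (compounding) growth; alpha < 1 -> much slower,
# polynomial growth; alpha = 0 -> merely linear growth in time.
# All constants here are arbitrary and purely illustrative.

def capability_trajectory(alpha, k=0.05, c0=1.0, steps=200, dt=1.0):
    c = c0
    for _ in range(steps):
        c += k * (c ** alpha) * dt  # Euler step of dC/dt = k * C**alpha
    return c

for alpha in (1.0, 0.5, 0.0):
    print(f"alpha={alpha}: final capability ~{capability_trajectory(alpha):,.0f}x")
# alpha=1.0 ends roughly 17,000x the starting level; alpha=0.5 roughly 36x;
# alpha=0.0 exactly 11x. Only the linear-returns case compounds into a runaway.
```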
Another critical factor is the actual constant of the rate of improvement, not just the functional form. Specifically, we need the characteristic time constant of the loop to be substantially smaller than the delta between us and our competitors, so that we can get through many iterations of the loop and substantial increases in capability before anybody has a chance to catch up. Even if RSI is possible and returns are linear or superlinear, if the time constant is not fast enough then we will at best be only a few iterations of the loop ahead of the competition, so the delta would likely not be enormous, and it would stay constant as everybody RSIs at the same rate rather than turning into a compounding advantage[3]. Another important factor is the absolute increase in capability from a single RSI iteration: if this is too small then you need a proportionally smaller characteristic iteration time to pull ahead. We can combine these conditions by stating that the characteristic time for a doubling of capability during RSI must be much smaller than the time delta between us and our closest competitor.
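A quick back-of-the-envelope version of this condition, with made-up numbers purely for illustration: if the leader’s capability doubles every T days and the closest competitor enters the same loop delta days later, the leader’s head start is a factor of 2^(delta/T), and with equal doubling times thereafter that factor never grows.

```python
# Back-of-the-envelope only; all numbers below are made up for illustration.
# If capability doubles every `doubling_time_days` and the competitor starts
# the same loop `delta_days` later, the leader's head start is 2**(delta/T),
# and with equal doubling times afterwards that factor stays constant.

def lead_factor(delta_days: float, doubling_time_days: float) -> float:
    return 2 ** (delta_days / doubling_time_days)

print(lead_factor(delta_days=90, doubling_time_days=7))    # ~7,400x: runaway lead
print(lead_factor(delta_days=90, doubling_time_days=60))   # ~2.8x: modest lead
print(lead_factor(delta_days=30, doubling_time_days=90))   # ~1.3x: negligible lead
```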
In Yudkowsky’s classic alignment writing, such as Intelligence Explosion Microeconomics (IEM), this RSI doubling time was assumed to be extremely small (seconds to hours) and hence this was not really a concern, because it was assumed that an AGI could ‘think’ at roughly the clock speed of a CPU, i.e. in the GHz range. However, today’s AIs ‘think’ millions of times slower, at most a few thousand tokens per second, and larger AIs think more slowly still, so there is no immediate win. This is because the cost to run a single AGI ‘thought’ is many OOMs greater than the cost to execute a single CPU operation. Moreover, in practice, training times for the AGI are also substantially longer. Even if you can think quickly enough to come up with algorithmic improvements, if training a successor takes 3-6 months, as seems common today, then you will not get many RSI cycles in a moderately close race.
While we don’t really have any direct evidence on what the returns to intelligence are in an RSI setting, circumstantial evidence points to them being sublinear. The scaling laws themselves are strongly sublinear (power laws), where each constant decrease in loss requires a constant multiplicative increase in training resources. The METR results gesture at ‘algorithmic progress’[4] being roughly constant across time, i.e. X% per year. However, the AI field itself is growing rapidly (probably exponentially?), so this could simply be exponential growth in inputs buying a constant rate of progress, which is again sublinear. On the other hand, other trends have achieved truly linear returns, and hence exponential growth, over long time periods. Moore’s law, for instance, has kept compute increasing exponentially while, as far as I know, investment into and headcount of hardware companies has not grown exponentially, at least not over the entire duration of Moore’s law.
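To spell out the sublinearity, suppose (purely as an illustration; the exponent below is only roughly the order reported for LLM compute scaling, and its exact value does not matter) that the reducible loss follows a power law in compute, L(C) = a·C^(-alpha). Then each halving of the loss costs a fixed multiplicative factor of compute, 2^(1/alpha), so equal steps of progress become exponentially more expensive in absolute resources.

```python
# Illustrative only: assume reducible loss follows a power law in compute,
# L(C) = a * C**(-alpha). Each halving of L then requires multiplying compute
# by 2**(1/alpha), so equal steps of progress cost exponentially more compute.

alpha = 0.05   # illustrative exponent, roughly the order seen in LLM scaling laws
factor_per_halving = 2 ** (1 / alpha)
print(f"compute multiplier per halving of loss: {factor_per_halving:.1e}")  # ~1.0e+06

compute, loss = 1.0, 1.0
for step in range(3):
    print(f"step {step}: compute {compute:.1e}, reducible loss {loss:.2f}")
    compute *= factor_per_halving
    loss /= 2
```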
Finally, growing inputs exponentially, both now and likely at the point AGI is built, will also run into fundamental constraints unrelated to intelligence: obtaining sufficient compute, power, hardware, etc. These can all be increased, but on human timescales of months and years, not seconds. This could impose fairly stringent constraints on the doubling rate during RSI, assuming it occurs in familiar ways, i.e. AIs doing AI research, getting better data, and inventing better training algorithms, and thus training better models much as it is done today.
These considerations make me lean weakly away from very strong Yudkowskian-style RSI being likely, although there is still a lot of fundamental uncertainty. We are limited by our own intelligence and information, and perhaps anchoring too strongly on how AI research and AI companies operate today. A lot of the discussion above is almost like assuming that the AI is automating and improving AI companies rather than itself. Plausibly the AI will be able to learn new algorithms and ‘skills’ in an online way, similar to how humans can reflect on and improve their own metacognition without growing and training a new human from scratch. Presumably this would allow substantially more rapid iteration, although it’s unclear how far improvements of this sort can go, since they seem to cap out in effectiveness for humans[5]. Similarly, if the AGI ends up being at least partially a scaffolded system, the scaffolded parts not directly touching the weights can be improved substantially faster (at the speed of pure software iteration) than the actual deep-network parts, which might necessitate retraining.
The next class of factors relates to how difficult it is to attain DSA in the first place. Let’s assume that we reached RSI first and obtained some massive intelligence gap over our closest competitors. How easy is it to attain DSA? The first question is how well the kind of intelligence gained during RSI maps onto an increased probability of DSA, and onto general strategic effectiveness in the world. For humans, as has been well noted, there is certainly a correlation between IQ and the ability to obtain political/strategic/economic/military power, but it is not an incredibly high correlation[6]. However, it is unclear how much we can generalize from human intelligence to AI intelligence. Although AI intelligence is ‘spiky’ in some ways, human intelligence and interests are also spiky, and AI capabilities across domains are, if anything, more uniform. LLMs seem to exhibit much more uniform plasticity than humans do, especially with regard to knowledge and crystallized intelligence. This makes it much more likely that AIs would be able to apply intelligence as a ‘universal solvent’, in the original sense, than humans can.
Assuming we have nearly universally plastic intelligence, how well can it be applied to actually attain DSA? In the long run this certainly seems very doable, but again the DSA has to be achieved in the interval between finishing RSI and your competitors also finishing RSI (and the longer it takes, the harder it gets, because your competitors, if they have entered RSI, are also likely to be improving rapidly). Closer races also make this period much harder for the first AGI to win, because not only must it attain DSA, it must convert its intelligence into strategic advantage very quickly rather than slowly, e.g. by compounding resources over time.
What are the primary mechanisms to obtain DSA? The most obvious and rapid seems to be some kind of cyberattack. If the first AGI to RSI can gain access to the computers of the competitor labs training rival AGIs, it seems like it would be fairly trivial to sabotage this process from the inside. This could be done in a super obvious way, e.g. by shutting down all the compute and deleting all copies of the competitor AGI weights, or more subtly by introducing small bugs and corruptions into the codebase or weights that effectively degrade competitors in hard-to-discover-or-debug ways. This means that the cyber offense-defense balance around the time of RSI becomes a key factor. Generally this balance seems to favour defense, especially with the potential of AI-empowered security engineers on the defensive side. However, it is unclear how powerful a fully RSI’d superintelligence would be at cyberattack; my general suspicion is: very. This would naturally behoove potential competitor labs in the race to be extremely serious and paranoid about this threat vector.
There are also classic existential-risk attack vectors for the AGI, like bioweapons of some sort, to attempt to kill or incapacitate the human employees of competing AGI projects. Plausibly it is possible to put together such a bioweapon attack on a timeframe of weeks to months, although here again the offense-defense balance is hard to figure out, because the defenders have access to either near-AGI or only-slightly-less-RSI’d AGI than the attacker, and potentially either defending against or curing the bioweapon is easier than building and delivering it in the first place. This is especially the case given that the attacker probably has to do this in secret while the defenders do not.
Another alternative is gaining political power, either through regular legal means, by being more charismatic or by advertising, or by more underhanded means, and then using this political power to inhibit or ban competing AI efforts. I am sceptical that this will be feasible fast enough for most race scenarios where the difference between competitors is measured in months. Accumulating political influence takes time, and elections happen on multi-year cycles. A similar argument can be made against standard methods of gaining economic power, e.g. accumulating enough money by selling some kind of AGI-empowered good or service and then attempting to thwart competing AGI projects through standard economic means, such as poaching their employees for very large sums of money or outbidding them on all compute. It seems very likely to me that accumulating enough capital to completely corner the market against all competitors would also take longer than months, but perhaps I am underestimating the speed of true AGI diffusion. Again, however, this would mean entering a market with very many ‘near-AGIs’ available in a close race.
There are also more clever and subtle possibilities: for instance, information-warfare campaigns to confuse or demoralize employees of competitor labs, to poison datasets, or to seed the literature with convincing but fake AI research breakthroughs to confuse and misdirect competitors. The possibility space here really is pretty large. Again, this depends on the offense-defense balance for this kind of information warfare, about which I have little evidence. Plausibly, these kinds of mechanisms could operate over the likely timespans of the deltas in a race scenario.
Overall, many of these factors (close races, large amounts of competition, increasing AI diffusion which reduces overhangs, and the generally sublinear nature of AGI returns and slower iteration times) all seem to point towards a DSA not being attainable, and hence push us towards a multipolar ‘polytheistic’ future. This leads to a somewhat interesting paradox: many ‘safety’ efforts we make to prevent a misaligned AI from harming us also prevent an aligned AI from taking out its misaligned competitors, and so push us towards a multipolar world, which is potentially more dangerous than the singleton. Whether this is net positive or not depends on our relative assessment of the risks of the singleton versus the multipolar world. Conversely, if we believe that a singleton is likely bad, then racing is actually good because it reduces the window in which the aspiring singleton can attain DSA; even if you know you cannot win the race, racing is still beneficial, since a smaller delta reduces the probability that the actual winner can attain full DSA.
An important but tangential point is that there are at least two predictable places where a DSA can be obtained, i.e. key bottlenecks to the long-run future. The first is clearly the creation of AGI, which is where we have focused. The second is our expansion into space and our ability to construct replicating von Neumann probes which can be accelerated to a substantial fraction of the speed of light to start colonizing the universe. Space warfare over long distances seems to be mostly defense-dominant given a sufficiently long lead time, meaning that once an AI is established in some region of space it is very hard to conquer that region from it. This means that if one AGI can somehow gain the ability to stop its competitors from launching von Neumann probes and colonizing space, then the vast majority of the lightcone will accrue to the successful AGI even if other AGIs still exist back on Earth. The timeframe over which this throttling needs to take place is fairly long: it plausibly takes at least thousands and potentially millions of years for a probe to get sufficiently established in a distant galaxy that it cannot easily be displaced, and hence such a ban on probe colonization would need to be enforced on Earth for that long. This is perhaps plausible if one AI becomes sufficiently more powerful than the rest to cut off their access to, e.g., the majority of Dyson-sphere energy, but cannot or does not want to eradicate them entirely for whatever reason. Once a region of the lightcone has been colonized by a different AI for, probably, at least a few million years, it is likely irrevocably lost to anybody else.
The next question is to what extent DSA is even necessary to attain a de facto singleton world. One possibility is that there is never a specific moment of DSA; instead, multiple AI systems start growing at a rapid rate, but the first AI to win the race has a head start, grows faster, and so eventually eclipses and subdues the other AIs. If the rate of growth is high enough, and we should expect post-RSI AGI growth to be very fast, then even small initial advantages of a few months can compound. As discussed in the previous post, at the object level most returns to scale are positive and linear-ish. This is the classic AK model of economic growth, in which the capital stock grows exponentially, but in general this effect appears everywhere: greater success and scale means more resources, which should naively translate into a greater rate of growth, and so on indefinitely.
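As a toy version of this (illustrative numbers only): with AK-style growth, output is Y = A·K, a fraction s of it is reinvested, so dK/dt = (sA - d)K and K grows exponentially. Two otherwise-identical AIs whose only difference is a head start keep a constant ratio of resources, but the absolute gap between them compounds without bound; and if greater scale also buys a higher growth rate, the ratio diverges too.

```python
# Toy AK-growth sketch; all parameter values are made up for illustration.
# Output Y = A*K, a fraction s is reinvested: dK/dt = s*A*K - d*K = g*K,
# so K(t) = K0 * exp(g*t). With identical growth rates, a head start keeps
# the *ratio* of resources constant but the *absolute* gap compounds.

import math

def capital(k0: float, g: float, t: float) -> float:
    return k0 * math.exp(g * t)

g = 0.5            # illustrative net growth rate per year (g = s*A - d)
head_start = 0.25  # leader started 3 months earlier

for t in (1, 5, 10):
    leader = capital(1.0, g, t + head_start)
    follower = capital(1.0, g, t)
    print(f"t={t:>2}y  ratio={leader / follower:.2f}x  gap={leader - follower:.1f}")
# The ratio stays fixed at exp(g * head_start), about 1.13x, while the absolute
# gap grows exponentially. Only if scale feeds back into g does the ratio diverge.
```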
As an aside, one way I like to think about this is through ‘empire builder’ video games like Civilization. In these games there are all the classic positive feedback loops: being bigger means more land, more land means more cities, more population, and more science, which means you are more advanced and can field more soldiers, which means you can win more wars and get more land, and so on. Because of these inherent positive feedback loops, these games are very prone to snowballing, where at some point the player civilization just becomes so much more powerful than everybody else that victory is inevitable and the rest of the game is a boring process of mopping up. In all of these games ‘playing wide’, i.e. expanding as fast as possible by conquest or settlement, is usually the best strategy by default. To counteract this, the game developers usually put in a bunch of maluses for over-expansion, such as rapidly increasing maintenance costs, decreasing scientific productivity, decreasing ‘happiness’ so that if you expand too much everybody riots, and so on. One thing to note is that these maluses are almost always wildly ahistorical: when we look at actual history, the opposite tends to be true. Larger empires tended to do better, not worse, on the classic game maluses such as citizen happiness, scientific productivity, or financial income and wealth. For obvious reasons, these games cannot really simulate an actual empire with a bureaucracy and aristocracy all pursuing their selfish goals at the expense of the nation as a whole, nor can they simulate the classic worsening incentives for cooperation and greater incentives for selfishness in larger organizations and states. Because of this, such games must fall back on arbitrary and unhistorical maluses to account for these factors.
For human organizations, the benefits of scale eventually diminish and get overwhelmed by fundamental internal coordination and misalignment problems, as I discussed in my previous post. However, AGIs could potentially be mostly free from these issues. Certainly, we would hope that AGIs would be free from the kind of internal misalignment and cascading defection that plagues human organizations, and that the costly ‘immune system’ needed to police or counteract such defection would be unnecessary. Internal communication would still be somewhat of an issue, but presumably internal AGI communication would consist of, e.g., communication of deep latent states inside the network, which can occur at the bandwidth of computers (in the TB/s range), versus human words, which are highly ambiguous and transmitted at a rate of O(1) words per second. AI communication could thus be many orders of magnitude more efficient than human-human communication, and so would be a negligible overhead even for AGIs the size of large human corporations or organizations. This probably means that AGI growth is much more determined by the object-level positive returns to scale, and so AGIs stay on the exponential curve significantly longer than human organizations do.
In theory, therefore, just like in a Civilization or Paradox game, the initially leading AI should compound faster and accumulate advantages over its rivals until it can simply overwhelm them. Why should we not expect this to happen? One answer should be obvious to anybody who has played any of these games in multiplayer: coalition politics. The way these things typically go (although there is often massive variance), which I think is not actually that bad a model of AGI, is that everybody happily expands in their immediate neighbourhood until they can expand no more (sometimes there are surprise pre-emptive strikes). At this point there are various wars, and sometimes players get eliminated, but usually these settle the initial distribution of power. Pretty soon alliances and coalitions start forming, especially against stronger or more aggressive players, which serve to contain them and ultimately knock them down a peg. Eventually, after many rounds of diplomacy, these initially shifting coalitions tend to coalesce into two[7] long-term supercoalitions which are almost perfectly balanced and spend the entire time in eternal war with one another.
The fundamental reason this happens is that, unlike the in-game AI, actual humans are strategic actors who understand how the snowball works, and hence know that they have to combine forces to take down a growing threat[8]. Suppose a bunch of AGIs emerge from the race at roughly the same time. Even though the first AGI might have a small lead, so that it can outcompete any individual competitor, it cannot outcompete some fraction of them simultaneously; i.e. it cannot win against a coalition even though it could beat any individual member of that coalition 1v1. This period will likely last a fairly long time unless growth is extremely, extremely fast. If there are diminishing returns to scale at any point, then the winners will start to asymptote, and we end up with a long-run equilibrium of many different AGIs, each with different levels of power, but which can be arrayed into coalitions of roughly equal strength that prevent any clear long-run winner from emerging. Assuming smart, general, relative-power-maximizing agents, this kind of situation can persist indefinitely and is somewhat stable even in the face of random shocks. This implies to me that, barring exceptional circumstances, if either there is no RSI or the winner of the RSI race fails to achieve DSA, then we are very unlikely to end up with a singleton over the medium term, but instead with a somewhat stable AGI oligopoly, with potentially a very interesting intermediate period[9].
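A crude way to see the coalition constraint (a toy sketch with made-up numbers, not a real model of AGI conflict): if all AGIs grow at the same exponential rate, the leader’s share of total power is frozen at its initial value, so it beats a full coalition of n rivals only if its initial lead factor already exceeds n.

```python
# Toy sketch of the coalition constraint; numbers are made up for illustration.
# With identical exponential growth, relative power shares never change, so
# the leader beats a coalition of all its rivals only if its initial lead
# factor already exceeds the number of rivals.

def leader_beats_coalition(lead_factor: float, num_rivals: int) -> bool:
    leader_power = lead_factor          # leader's power, in units of one rival
    coalition_power = num_rivals * 1.0  # rivals assumed roughly equal in power
    return leader_power > coalition_power

print(leader_beats_coalition(lead_factor=2.8, num_rivals=4))  # False: contained
print(leader_beats_coalition(lead_factor=2.8, num_rivals=2))  # True: lead suffices
```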
Ultimately, we’ve gone over what seem to me to be some of the key considerations in whether we will end up in a unipolar singleton world or some kind of multipolar ‘polytheistic’ AI world. To me, the conditions for ending up with a singleton are quite constraining, and weak circumstantial evidence suggests they are unlikely to obtain. However, making these kinds of predictions with any confidence is foolhardy. Whether this conclusion is good or bad depends on the relative risks and likelihood of alignment success in the unipolar versus multipolar world. Certainly, multipolarity introduces new concerns and potential failure modes which we need to think through carefully. In future posts we will discuss the polytheistic multipolar AI world in more detail, along with its alignment challenges and opportunities.
1. Interestingly, and probably not coincidentally, the monotheistic view was predominant during the earliest days of AI and then again when OpenAI was clearly multiple years ahead of the competition. Now, when there are many competing labs in seemingly neck-and-neck competition, more people appear to be arguing for (or even assuming(!)) the polytheistic view, despite the fact that the monotheistic view also predicts just such a race as is occurring right now.
2. Scott also assumes a polytheistic-ish world in his post arguing that not all technologies are races. He is right that there is no race in the case where there are many AIs and the AI population slowly grows and improves over time. He is very wrong in the singleton case, where the first singleton to be built attains a decisive strategic advantage. The uncertainty here is that, unlike with almost all other technologies, we don’t know whether this one should be a race or not.
3. If RSI growth is sublinear then the situation is even worse, as our competitors are actually catching up, not just staying a constant proportion behind.
4. For some of my concerns about how algorithmic progress is being measured, see here.
5. In general, attempts to improve generic human effectiveness through improving ‘thinking skills’ have had fairly dismal results, with few attempts seeming to show any improvement beyond simple selection effects. This is very surprising on priors, since you would think there is massive low-hanging fruit in human cognition. I don’t have a good explanation for this, but I am wary that it makes us overestimate the effect of improved metacognition on AIs’ skills as well.
6. Although there is certainly a floor effect here: I doubt many, if any, humans in positions of serious power have IQs below 100.
7. Sometimes three, but very rarely.
8. This effect is much stronger in games than in history, both because communication was hard for historical states and empires while in a game it is instantaneous and effortless, and because historical empires cared about things other than maximizing relative power while human players mostly play to win. It seems to me that the 17th and 18th centuries were around the time when the world mostly moved from internal coordination problems being the main thing preventing world domination by any one empire, to coalition politics being the primary driver. The states that had developed in Europe by this time were clearly capable of commanding huge empires while rarely suffering revolts or civil wars, and the primary thing slowing their growth was competition from other European powers, usually in large coalitional wars.
9. One interesting question here is whether the AGIs have any way out of this stalemate through cooperation. This isn’t possible in games because the setup is usually zero-sum; however, the real world is often not zero-sum in this way. There are other things to maximize than relative power: absolute material and energy, for one. The obvious solution is to come to some kind of deal to cooperate and share the universe in proportion to their relative power levels on Earth. This seems like an obvious equilibrium solution. The prospect of AGIs being superior cooperators is something I will discuss in a future post.