My current high-level strategic picture of the world

Follow up to: My strategic picture of the work that needs to be done, A view of the main kinds of problems facing us

This post outlines my current epistemic state regarding the most crucial problems facing humanity and the leverage points that we could, at least in principle, intervene on to solve them. 

This is based on my current off-the-cuff impressions, as opposed to careful research. Some of the things that I say here are probably importantly wrong (and if you know that to be the case, please let me know). 

My next step here is to more carefully research the different “legs” of this strategic outline, shoring up my understanding of each, and clarifying my sense of how tractable each one is as an intervention point.

None of this constitutes a plan. More like, this is a first sketch, to facilitate more detailed elaboration.

The Goal

My overall goal here is to explore the possible ways by which humanity achieves existential victory. By existential victory, I mean:

The human race[1] survives the acute risk period, and enters a stable period in which we (or our descendants) are able to safely reflect on what a good universe entails, and then act to make the reachable universe good.

This entails humanity surviving all existential risk and getting to a state where existential risk is minimized (for instance, because we are now protected from most disasters by an aligned superintelligence, or a coalition of aligned superintelligences).

Possibly, there is an additional constraint that the human race not just survive, but remain “healthy” along some key dimensions, such as control over our world, intellectual vigor, and freedom from oppressive power structures and trauma, if detriments along those dimensions are irreparable and would therefore permanently limit our ability to reflect on what is Good.

This document describes the two basic trajectories that I can currently see, by which we might systematically achieve that goal (as opposed to succeeding by luck).

The Two Problems

In order to get to that kind of safe attractor state there appear to be two fundamental classes of problems facing humanity: technical AI alignment, and civilizational sanity.

By “technical AI alignment”, I mean the problem of discovering how to build and deploy super-humanly powerful AI systems (embodied either in a singleton, or an “ecosystem” of AIs), safely, in a way that doesn’t extinct humanity, and broadly leaves humans in control of the trajectory of the universe.

By “civilizational sanity”, I mean to point at the catch-all category of whatever causes high-leverage decision makers to make wise, scope-sensitive, non-self-destructive choices.

Civilizational Sanity includes whatever factors cause your society to do things like “saving ~500,000 lives by running human challenge trials on all existing COVID vaccines in February 2020, scaling up vaccine production in parallel with market mechanisms, and then administering vaccinations, en masse, to everyone who wants, with minimal delay”, or something at least that effective, rather than what the US actually did.

It also includes whatever it takes for a government to identify and successfully carry through good macroeconomic policy (which I’ve heard is NGDP targeting, though I don’t personally know).

And it includes whatever factors cause it to be the case that your civilization suddenly acquiring god-like powers (via transformative AI or some other method), results in increased eudaimonia instead of in some kind of disaster.

I think that the only shot we have of exiting the critical risk period by something other than luck is sufficient success at solving AI alignment or sufficient success at solving civilizational sanity, and implementing our solution.

(The “Strategic Background” section of this post from MIRI outlines a similar perspective of the high level problem as I outline in this document. However it elaborates, in more detail, a path by which AI alignment would allow humanity to exit the acute risk period (minimally aligned AI -> AGI powered technological development -> risk mitigating technology -> pivotal act that stabilizes the world), and de-emphasizes broad-based civilizational sanity improvements as another path out of the acute risk period.)

Substitution

To some degree, solutions to either technical alignment or civilizational sanity can substitute for each other, insofar as a full solution to one of these problems would approximately obviate the need for solving the other.

For instance, if we had a full and complete understanding of AI alignment, including rigorous proofs and safe demonstrations of alignment failures, fully-worked-out safe engineering approaches, and crisp theory tying it all together, we would be able to exit the critical risk period. 

Even if it wasn’t practical for a small team to code up an aligned AI and foom, with that level of detail it would be easy to convince the existing AI community (or perhaps just the best equipped team) to build aligned AI, because one could make the case very strongly for the danger of conventional approaches, and provide a crisply-defined alternative.

On the flip side, at some sufficiently high level of global civilizational sanity, key actors would recognize the huge cost of unaligned AI, and successfully coordinate to prevent anyone from building unaligned AI until alignment theory has been worked out.

We can make partial progress on either of these problems. The task facing humanity as a whole is to make sufficient progress on one, the other, or both, of these problems in order to exit the acute risk period. Speaking allegorically, we need the total progress on both to “sum to 1.” [2]

A note on “sufficiency”

Above, I write “I think that the only shot we have of exiting the critical risk period by something other than luck is sufficient success at solving AI alignment or sufficient success at solving civilizational sanity…”.

I want to clearly highlight that the word “sufficient” is doing a lot of work in that sentence. “Sufficient” progress on AI alignment or “sufficient” progress on civilizational sanity is not yet operationalized enough to be a target. I don’t know what constitutes “enough” progress on either one of these, and I don’t know if I could recognize it if I saw it. 

Civilizational Sanity, in particular, is always a two place function: I can only judge a civilization to be insane relative to my own epistemic process. If societal decision making improves, but my own process improves even faster, the world will still seem mad to me, from my new more privileged vantage point. So in that sense, the goal posts should be constantly moving. 

My key claim is only that there is some frontier defined by these axes such that, if the world moves past that frontier, we will be out of the acute risk period, even though I don’t know where that frontier lies.  
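As a purely illustrative toy, here is a minimal sketch of what a “sum to 1” frontier would look like if progress on each problem could be scored on a 0-to-1 scale. The function name, the scoring scale, the linear shape of the frontier, and the threshold value are all assumptions made for illustration; nothing above claims the real frontier is linear, or that it can be measured at all.

```python
def past_frontier(alignment: float, sanity: float, threshold: float = 1.0) -> bool:
    """Toy model of the "sum to 1" allegory.

    `alignment` and `sanity` are hypothetical progress scores in [0, 1].
    The linear frontier and the default threshold are illustrative
    assumptions, not a claim about where the real frontier lies.
    """
    if not (0.0 <= alignment <= 1.0 and 0.0 <= sanity <= 1.0):
        raise ValueError("progress scores must lie in [0, 1]")
    # Progress on one axis substitutes for progress on the other.
    return alignment + sanity >= threshold


# A full solution to either problem suffices on its own.
print(past_frontier(1.0, 0.0))
print(past_frontier(0.0, 1.0))
# Partial progress on both can suffice, or fall short.
print(past_frontier(0.5, 0.5))
print(past_frontier(0.4, 0.4))
```

The only point of the sketch is the substitution structure: a shortfall on one axis can be made up on the other, while the location and shape of the threshold remain unknown.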

A note on timelines

When I talk about civilizational sanity interventions as a line of attack on AI risk, folks often express skepticism that we have enough time: AI timelines are short, so short that it seems unlikely that plans that attempt to reform the decision making process of the whole world will bear fruit before the 0 hour. [3]

I think that this is wrong-headed. It might very well be the case that we don’t have time for any sufficiently good general sanity boosting plans to reach fruition. But it might just as well be the case that we don’t have time for our technical AI alignment research to progress enough to be practically useful.

Our basic situation (I’m claiming) is that we need to get either to correct alignment theory or to a generally sane civilization before the transformative AI countdown reaches 0. But we don’t know how long either of those projects will take. Reforming the decision processes of the powerful places in the world might take a century or more, but so might solving technical alignment.

Absent more detailed models about both approaches, I don’t think we can assume that one is more tractable, more reliable, or faster, than the other.

AI alignment in particular?

This breakdown is focused on the AI alignment problem in particular (it’s taking up half of the problem space), which may give the impression that AI risk is the only, or perhaps the most dangerous, existential risk.

While AI risk does seem to me to pose the plurality of the risk to humanity, that isn’t the main reason for breaking things down in this way. 

Rather it’s more that every intervention that I can see that has a shot of moving us out of the acute risk period goes through either powerful AI, or much saner civilization, or both. [I would be excited to have counterexamples, if you can think of any.]

We need protection against bio-risk, nuclear war, and civilizational collapse / decline. But robust protection against any one of those doesn’t protect us from the others by default. Aligned AI and a robustly sane civilization are both general enough that a sufficiently good version of either one would eliminate or mitigate the other risks. Any other solution-areas that have that property, and don’t flow through aligned AI or a generally sane civilization, would deserve their own treatment in this strategic map, but as of yet, I can’t think of any.

Technical AI alignment

I don’t have much to say about the details of this project. In broad strokes, we’re hoping to get a correct enough philosophical understanding of the concepts relevant to AI alignment, formalize that understanding as math, and eventually develop those formalizations into practical engineering approaches. (Elaboration on this trajectory here.)

(There are some folks who are going straight for developing engineering frameworks [links], hoping that they’ll either work, or give us a more concrete, and more nuanced understanding of the problems that need to be solved.)

It would be quite important if there were better or faster ways to make progress here. But my current sense of things is that it is just a matter of people doing the research work + recruiting more people who can do the research work. See my diagram here.

Civilizational Sanity

Follow up to: What are some Civilizational Sanity Interventions

This second category is much less straightforward. 

Within the broad problem space of “causing high-level human decision making to be systematically sane”, I can see a number of specific lines of attack, but I have wide error bars on how tractable each one is.

Those lines of attack are:

  1. Unblocking governance innovation
  2. Powerful intelligence enhancement
  3. Reliable, scalable, highly effective resolution of psychological trauma
  4. Chinese ascendency

I’m sure this list isn’t exhaustive. These four are the only interventions that I currently know of that seem like (from my current epistemic state) they could transform society enough that we could, for instance, handle AI risk gracefully. 

Relationship between these legs

In particular, there’s an important open question of how these approaches relate to each other, and the broader civilizational sanity project. 

I described above that I think that “AI alignment” and “civilizational sanity” have an “or” or a “sum” relationship: sufficient progress on only one of them can allow us to exit the critical risk period.

There might be a similar relationship between the following civilizational sanity interventions: pushing on any one of them, far enough, leads to a large jump in civilizational sanity, kicking off a positive feedback loop. OR it might instead be that only some of these approaches attack the fundamental problem, and without success on that one front, we won’t see large effects from the others.

Unblocking Innovation in Governance

Better Governance is Possible

The most obvious way to improve the sanity of high-leverage decisions on planet earth is governmental reform.

Our governmental decision making processes are a mess. National politics is tribal politics writ large: instead of a societal-level epistemology trying to select the best policies, we have a bludgeoning match over which coalitions are best, and which people should be in charge. Politicians are selected on the basis of electability, not expertise, or even alignment with society, yet somehow we seem to be ending up with candidates that no one is enthusiastic about. Congress is famously in a semi-constant self-strangle-hold, unable to get anything done. And the constraints of politics force those politicians to say absurd things in contradiction with, for instance, basic economic theory, and to grandstand about things that don’t matter and (even worse) things that do.

The current system has all kinds of analytically demonstrable game-theoretic drawbacks that make undesirable outcomes all but inevitable, including a two-party system that no one likes much, principal-agent problems between the populace and the government, and net societal losses as a result of allocation of benefits to special interest groups.

There hasn’t been a major innovation in high-level governance since the invention and wide-scale deployment of democracy in the 18th century. It seems like we can do better. We could, in principle, have governmental institutions that are effective epistemologies, able to identify problems and to determine and act on policies at the frontier of society’s various tradeoffs instead of at the frontier of the tradeoffs of political expediency.

And because governments have so much influence, more effective information processing in that sector could lead to better institution designs in all other sectors. Public policy is in part a matter of creating and regulating other institutions. Saner government decision making entails setting up efficient and socially-beneficial incentives for health, education, etc, which selects for effective institutions in those more specific sectors. In this way, government is a meta-institution that shapes other institutions. (It’s unclear to me to what degree this is true. How much does better policy at the governmental level, automatically correct the inefficiencies of, say, the medical bureaucracy?)

One might therefore think that a particularly high leverage intervention is to develop new systems of governance. But humanity has a pretty large backlog of governance innovations that seem much better than our current setups on a number of dimensions, from the simple, like using Single Transferable Vote instead of First Past the Post, to the radical, like Futarchy, or the abolition of private property in favor of a COST system.

It seems to me that the bottleneck for better governmental systems is not possible alternatives, but rather the opportunity to experiment with those alternatives. Apparently, there are approximately no venues available for governmental innovation on planet earth.

This is not very surprising, because incumbents in power benefit from the existing power structure and therefore oppose replacing it with a different mechanism. In general, everyone who has the ability to gatekeep experiments with new governance mechanisms has reason to feel threatened by those experiments.

However, widespread experimentation and innovation in governance would likely be a huge deal, because it would allow humanity as a whole to identify the most successful mechanisms, which, having been shown to work, could be tried at larger scales, and eventually widely adopted.

Experimentation Leads to Eventual Wide Adoption

The basic argument that merely allowing experimentation will eventually lead to better governance on a global scale is as follows: 

Many governance mechanisms, if tried, will not only 1) surpass existing systems, but 2) do so in a legible way, both in aggregate outcomes (like economic productivity, employment, and tax rates), and from direct engagement with those systems (for instance, once voters become familiar with Futarchy, it might seem absurd to elect individuals who are supposed to both represent one’s values and have good plans for achieving those values).

If the condition of “legible superiority” holds, there would be pressure to replicate those mechanisms elsewhere, at all different scales. Eventually, the best innovations simply become the new standard practices.

Similarly, for many incentive-aligning interventions, not using such methods is a stable attractor: it is in the interests of those in power to resist their adoption. But widespread use of such methods is also a stable attractor: once they are common, it is in the interests of those in power to keep using them. As Robin Hanson says of prediction markets:

I’d say if you look at the example of cost accounting, you can imagine a world where nobody does cost accounting. You say of your organization, “Let’s do cost accounting here.”

That’s a problem because you’d be heard as saying, “Somebody around here is stealing and we need to find out who.” So that might be discouraged.

In a world where everybody else does cost accounting, you say, “Let’s not do cost accounting here.” That will be heard as saying, “Could we steal and just not talk about it?” which will also seem negative.

Similarly, with prediction markets, you could imagine a world like ours where nobody does them, and then your proposing to do it will send a bad signal. You’re basically saying, “People are bullshitting around here. We need to find out who and get to the truth.”

But in a world where everybody was doing it, it would be similarly hard not to do it. If every project with a deadline had a betting market and you say, “Let’s not have a betting market on our project deadline,” you’d be basically saying, “We’re not going to make the deadline, folks. Can we just set that aside and not even talk about it?”

This may generalize to many institution designs that are better than the status quo.
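The two-attractor structure in that argument can be sketched as a toy coordination game. This is a minimal illustration under stated assumptions, not a model taken from Hanson: the function name, the payoff form (a fixed intrinsic benefit of adopting, plus a signaling penalty proportional to the share of organizations doing the opposite), and all parameter values are hypothetical.

```python
def simulate_adoption(
    initial_share: float,
    benefit: float = 0.2,      # assumed intrinsic gain from using the mechanism
    signal_cost: float = 1.0,  # assumed penalty for deviating from the norm
    revise_rate: float = 0.2,  # assumed fraction of organizations revising per round
    rounds: int = 50,
) -> float:
    """Return the share of adopters after `rounds` of best-response updates."""
    share = initial_share
    for _ in range(rounds):
        # Adopting sends a bad signal in proportion to how many do NOT adopt.
        adopt_payoff = benefit - signal_cost * (1.0 - share)
        # Abstaining sends a bad signal in proportion to how many DO adopt.
        abstain_payoff = -signal_cost * share
        target = 1.0 if adopt_payoff > abstain_payoff else 0.0
        # Only a fraction of organizations reconsider in each round.
        share += revise_rate * (target - share)
    return share


# Starting from a world where almost nobody adopts, adoption dies out.
print(simulate_adoption(0.1))
# Starting from a world where almost everybody adopts, adoption persists.
print(simulate_adoption(0.9))
```

With these made-up numbers the tipping point sits at an adoption share of 0.4 rather than 0.5, because the intrinsic benefit partly offsets the signaling cost. On this toy picture, what experimentation venues would need to do is push adoption past that point, after which the dynamics carry it the rest of the way.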

For these reasons, finding ways around the general moratorium on governmental innovation, so that new governance mechanisms can be tried, has possibly huge dividends.

Strategies to allow for Experimentation

Currently, the only approaches I’m aware of for creating spaces for governmental innovation are charter cities and seasteading.

Charter cities are bottlenecked on legal restrictions, and on the practical coordination problem of getting a critical mass of residents. But I’m hopeful that COVID has caused a permanent shift to remote work, which will give people more freedom in where to live, and increase competition-in-governance between cities and states that want to attract talent.

Seasteading is currently bottlenecked on the engineering problem of creating livable floating structures, cheaply enough to be scalable. [Double check if cost is actually the key concern.]

Repeatable reform templates

I wonder if there might be a third, more abstract, line of attack on unblocking governance innovation: developing a repeatable method to change existing governmental structures in a way that gives powerful incumbents an incentive to go along with the change.

If it were possible to simply buy out incumbents and overhaul the system, that might be a huge opportunity. However, I guess that in most liberal democracies, this is both illegal and generally repugnant (plus politicians are beholden to their party which might object), such that existing power-holders would not accept a straightforward “money for institutional reform” trade.

But there may be some other version which, in practice, incentivizes power-holders to initiate governmental reform. Possibly by letting those power-holders keep their power for some length of time, and also receive the credit for the change. Or maybe a setup that targets those people before they take power, when they are more idealistic, and more inclined to make an agreement to cause reform, conditional on all their peers doing the same, in the style of a free-state agreement.

If we could find a repeatable “template” for making such deals, it might unlock the ability to iteratively improve existing governmental structures.

I’m not aware of any academic research in this area (both historical case studies of how these kinds of shifts have occurred in the past and analytic models of how to incentivize such changes seem quite useful to me), nor any practical projects aiming for something like this.

Intelligence enhancement

One might posit that the sorts of incentive problems that lead to bizarre institutional policies are the inevitable result of the fact that doing better requires understanding many abstract, non-intuitive concepts and/or careful reasoning in complicated domains, and the average person is of average intelligence, which is insufficient to systematically identify better policies and institutional set-ups over worse ones at the current margin.

In this view, the fundamental problem is that our civilizational decision making processes are much worse than is theoretically possible, because we are collectively not smart enough to do better. Some of us can identify the best policies (or at least determine that one policy is better than another), some of the time, but that relies on understanding that is esoteric to many more people, including many crucial decision makers.

But if the average intelligence of the population as a whole were higher, more good ideas would seem obviously good to more people, and it would be substantially easier to get a critical mass of acceptance of sane policies on the object level, as well as better information processing mechanisms. (For instance, if the IQ curve were shifted 35 points to the right, many more people would be able to “see at a glance” why prediction markets are an elegant way of aggregating information.)

More intelligence -> More understanding of important principles -> Saner policies

So it might be that the most effective lever on civilizational sanity is intervening on biological intelligence.

The most plausible way to do this is via widespread genetic enhancement, with either selection methods like iterated embryo selection, or direct gene editing using methods like CRISPR.

My current understanding is that these methods are bottlenecked on our current knowledge of the genetic predictors of intelligence: if we knew those more completely, we would basically be able to start human genetic engineering for intelligence. It seems like that knowledge is going to continue to trickle in as we get better at doing genomic analysis and collect larger and larger data sets. [Note: this is just my background belief. Double-check] Possibly, better machine learning methods will lead to a sudden jump in the rate of progress on this project?

On the face of it this suggests that any project that could provide a breakthrough in decoding the genetic predictors of intelligence could be high leverage.

Aside from that, there’s some risk that society will fork down a path in which human genetic enhancement is considered unethical and is banned. I’m not that worried about this possibility, because as long as some people or groups are doing this for their children, there is competitive pressure on others to do the same. I think it is pretty unlikely that China will forgo this opportunity: it is competitive, at the national level, with the rest of the world, and its families already regularly exert huge efforts to give their children competitive advantages over their peers. And if China invests in human genetic enhancement, the US will do the same out of a fear of Chinese dominance.

Some other avenues for human intelligence enhancement include nootropics, which seem much less promising because of the basic algernonic argument, and brain-computer interfaces like Neuralink. Of the latter, it is currently unknown which dimensions of human cognition can be readily improved, and whether such augmentation will lend itself to wisdom or whatever the precursors to civilizational sanity are.

There’s also the possibility of using sufficiently aligned AI assistants to augment our effective intelligence and decision making. Absent our alignment research giving us very clear criteria for aligned systems, this seems like a very tricky proposition, because of the problems described in this post. But in worlds where AI technology continues to improve along its current trajectory, it might be that using limited AI systems as leverage for improving our decision making and research apparatus, to further improve our alignment technologies, is the best way to go.

A note on improving public understanding by methods other than intelligence enhancement:

Possibly there are other ways to substantially increase each person’s individual intellectual reach, so that we can all come to understand more, without increasing biological intelligence. Things in the vein of “better education”. 

I’m pretty dubious of these. 

I think I have far above average skill in communicating (both teaching and being taught) complex or abstract ideas. But even being pretty skilled, for a human, it is just hard. Even when the conditions are exceptional (a motivated student working one-on-one with a skilled tutor who understands the material and can model / pace to the student’s epistemic state), it just takes many focused hours to grasp many important concepts.

I think that any educational intervention effective enough to actually move the needle on civilizational sanity would have to be very radical: so transformative that it would be a general boost in a person’s learning ability, i.e. an increase in effective intelligence. That said, if anyone has ideas for interventions that could increase most people’s intellectual grasp, I would love to hear them.

(…Possibly a dedicated and well executed campaign to educate the public at large in some small set of extremely important concepts, with the goal of shifting what sorts of explanations sound plausible to most people (raising the standard for what kinds of economic claims people can make in public with a straight face, for instance), would be helpful on the margin. But this seems to me like an enormous undertaking which would require pedagogical and mass-communication knowledge that I don’t know if anyone has. And I’m not sure how helpful it would be. Even if the whole world understood econ 101, the real world is more complicated than econ 101, such that I don’t know how much that alone would aid people’s assessment of which policies are best. I suppose it would cut out some first-order class of mistakes.)

I do think there are definitely ways to increase our collective intellectual reach, so that societies can systematically land on correct conclusions without increasing any individual person’s intellectual reach or understanding. These include the governance mechanisms I alluded to in the last section. 

There might also exist society-wide “public services” that could do something like this while side-stepping government bureaucracy entirely, like the dream of Arbital. I’m not sure how optimistic I should be about those kinds of interventions. The only comparable historical examples that I can think of are Wikipedia and public libraries. Both of these seem like clearly beneficial public goods with huge flow-through effects, which make information easily available to people who want it and wouldn’t otherwise have access. But neither one seems to have obviously improved high-level civilizational decision making relative to the counterfactual.

Clearing “Trauma”??

[The following section is much more speculative, and I don’t yet know what to think of it.]

There’s another story in which the main source of our world’s dysfunction is self-perpetuating trauma patterns. 

There are many variations of this story, which differ in important details. I’ll outline one version here, noting that something like this could be true without this particular story being true.

According to this view…

virtually everyone is traumatized (or if you prefer, “socialized”), into dysfunctional and/or exploitative behavior patterns, to greater or lesser degrees, in the course of growing up. 

The central problem isn’t (just) that everyone is following their local self-interest in globally destructive systems, it is actually much worse than that: people are conditioned in such a way that they are not even acting in their narrow self interest. Instead humans myopically focus on goals, and execute strategies, that are both 1) globally harmful and 2) not even aligned with their own “true” reflective preferences, due to false assumptions underlying their engagement with the world. This myopia also inhibits their ability to think clearly about parts of the world that are related to their trauma.

(As a case in point, I think it is probably the case that there are lots of people who are aggressively pursuing AGI, and who instinctively flinch away from any thought that AGI might be dangerous, because they have a deep, unarticulated belief that if they can be successful at that, their parents will love them, or they won’t feel lonely any more, or something like that.)

They’ve been conditioned to feel threatened or triggered by a huge class of behaviors that are globally productive, like accurate tracking of harms, and many kinds of positive-sum arrangements.

Furthermore, the core reason why most people can’t seem to think or to have “beliefs in the sense of anticipations about the world” is not (mostly) a matter of intelligence, but rather that their default reasoning and sense-making functions have been damaged by the institutions and social contexts in which they participate (school, for instance).

Those traumatizing contexts are not designed by conscious malice, but they are also not necessarily incidental. It’s possible that they have been optimized to be traumatizing, via unconscious social hill-climbing.

This is because trauma-patterns are replicators: they have enough expressive power to recreate themselves in other humans, and are therefore subject to a selection pressure that gradually promotes the variations that are most effective at propagating themselves. (Furthermore, there’s a hypothesis that for a traumatized mind, one of the best ways to control the environment to make it safe is to similarly traumatize people in the environment.) The net result is horrendous systems of hurt people hurting people, as a way to pass on that particular flavor of hurt to future generations.

Part of the hypothesis here is that these trauma patterns have always been a thing in human societies, but there has also typically been a counter-force, namely that if you need to work together and have a good understanding of the physical world to survive in a harsh environment, your epistemology can’t be damaged too badly, or you’ll die. But in the modern world, we’ve become so wealthy, and most people have become so divorced from actual production, that that counter-force is much diminished.

Implications for Improving the World

If this story is true, governmental reform is likely to fail for seemingly-mysterious reasons, because there is selection pressure optimizing against good institutional epistemology, over and above bureaucratic inertia and the incentives of entrenched power-holders. If you don’t defuse the underlying trauma-patterns, any system that you try to build will either fail or be subverted by those trauma-patterns.

And under this story, it’s unclear how much intelligence enhancement would help. All else being equal, it seems (?) that being smarter helps in developmental work, and healing from one’s personal traumas, but it might also be the case that greater intelligence enables more efficient propagation of trauma patterns. 

If this story is largely correct, it implies that the actual bottleneck for the world is understanding trauma and trauma resolution methods well enough to heal trauma-patterns at scale. If we can do that, the agency and intellect of the world (which is currently mostly suppressed), will be unblocked, and most of the other problems of the world will approximately solve themselves.

I also don’t know to what extent there already exist methods for reliably and rapidly resolving trauma patterns, and the degree to which the bottleneck is actually one of 1 to n scaling rather than 0 to 1 discovery. Certainly there are various methods that at least some people have gotten at least some benefit from, though it remains unclear how much of the total potential benefit even the best methods provide to the people who have gotten the most from them.

I don’t know what to think of all of this yet: the degree to which trauma is at the root of the world’s ills, the degree to which things have actually been optimized to be traumatizing as opposed to ending up that way by accident, or even whether “trauma” is a meaningful category pointing at a real phenomenon that is different from “learning” in a principled way.

I’ll note that even if the strong version of this story is not correct, it might still be the case that many people’s intellectual capability is handicapped by psychological baggage. So research into effective trauma-resolution methods might be an effective line of attack on improving the world’s intellectual capability. For instance, finding a non-scalable method for reliably resolving trauma might be an important win, because at minimum, we could apply it to all of the AI safety researchers. This might be one of the possible gains on the table for speeding progress on the alignment problem.

(Though this is also something to be careful of, since such methods would likely have some kind of psychological side effects, and we don’t necessarily want to reshape the psyches of earth’s contingent of alignment researchers all in the same way. I worry that we might have already done this to some degree with circling: Circling seems quite good and quite helpful, but I think that we should be concerned that if we make a mistake about what directions are good to push the culture of our small AI safety community, we’re likely to destroy a lot of value.)

The Rise of China??

In the first section describing what I meant by civilizational sanity up above, I noted “sensible response to COVID” as one indicator of civilizational sanity. Notably, China’s COVID response seems, overall, to have been much more effective than the West’s.

This doesn’t seem like an aberration, either. As a non-expert foreigner, looking in, it looks like China’s society/government is overall more like an agent than the US government. It seems possible to imagine the PRC having a coherent “stance” on AI risk. If Xi Jinping came to the conclusion that AGI was an existential risk, I imagine that that could actually be propagated through the Chinese government, and Chinese society, in a way that has a pretty good chance of leading to strong constraints on AGI development (like the nationalization, or at least the auditing, of any AGI projects).

Whereas if Joe Biden, or Donald Trump, or anyone else who is anything close to a “leader of the US government”, got it into their head that AI risk was a problem…the issue would immediately be politicized, with everyone in the media taking sides on one of two lowest-common denominator narratives each straw-manning the other. One side would attempt to produce (probably senseless) legislation in the frame of preventing the bad guys from doing bad things, while the other side goes to absurd lengths to block them as a matter of principle, and in the end we’re left with some regulation on tech companies that doesn’t cleave to the actual shape of the problems at all, and pisses off researchers who are frustrated that this anthropomorphizing, “AI risk” hubbub, just made their lives much harder, alienating them.

(One might think that this is actually a national security issue, and it would be taken more seriously than that, but COVID was a huge public health issue, and we managed to politicize wearing masks.)

So, maybe it would be good for the world if China was the dominant world power?

I think that overall, China’s society, and high level decision making is currently saner than that of the western world. So maybe on the margin, the world is better off if China were more dominant. 

However, I have a number of reservations.

  1. China’s human rights record is not great. Apparently, there is an ongoing genocide of the Uighurs, happening right now. My deontology is pretty reluctant to put mass murderers in charge of the world.
    1. I’m not sure how to think about this. Genocide is extremely bad. And furthermore we have a strong, coordinated norm to censure and take action against it (although, obviously not that strong, since I don’t know of a single person who has taken any action other than (occasionally) tweeting news articles, in this case). But also, I’m not sure whether I should just parse this as standard practice for great powers / ruling empires. The US has committed similarly bad atrocities in its history (slavery and the extermination/relocation of the Indians come to mind), and as far as I know, continues to commit similar atrocities. And the stakes are literally astronomical. Does the specter of extinction and the weight of all future earth-originating civilization mean we should just neglect contemporary genocide in our realpolitik calculations? I’m not comfortable with that, but I don’t know what to think about it.
  2. I don’t have a strong reason to expect that China’s institutions are fundamentally better functioning than the US’s; I think they’re just younger. If China is exhibiting the kind of functionality and decisiveness that the US was enjoying 60 years ago, then it seems pretty plausible that 60 years from now (or maybe sooner than that, on the general principle that the world is moving faster now), the Chinese system will be similarly sclerotic and dysfunctional.
    1. Indeed, we might make a more specific argument that institutions are able to remain functional so long as there is growth, because a growing pie means everyone can win. But when growth slows or stops there’s no longer a selection pressure for effectiveness, and institutions entrench themselves because rent seeking is a better strategy. (Or maybe the causality goes the other way: there’s a continual, gradual, increase in rent-seeking as actors entrench their power-bases, which gradually cuts out production, until all (or almost all) that’s left is rent-seeking.) In any case, I think China has got to be nearing the top of its explosive s-curve, and I don’t expect its national agency to be robust to that.
  3. I would guess, not knowing much more than a stereotype of Chinese culture, that even if it is saner and more effective than western culture right now, the west has more of the generators that can lead to further increases in civilizational sanity. I might be totally off base here, but the East’s emphasis on conformity and social hierarchy seems like it would make it even MORE resistant to, say, the wide-scale adoption of prediction markets than the US is. (Though maybe the ruling party is enough of an unincentivized incentivizer to overcome this effect?) I suspect that it is even less likely to generate the kind of iconoclastic thinkers who would think up the idea of prediction markets in the first place. It would be quite bad if we got some boost in civilizational sanity with the rise of China, but Chinese dominance then curtailed any further improvement on that dimension.
  4. It is currently unclear to me how much it matters which culture the intelligence explosion takes place in.
    1. Under the assumption of a strong attractor in the human CEV, it seems like it doesn’t matter much at all: we’re all, currently, so radically confused about Goodness, that the apparently-huge cultural differences are just noise. And even if that’s not true, I would guess that the differences between my ideal future, and some human descended society, are probably massively outweighed by the looming probability of extinction and a sterile universe. Chinese people live happy lives in China, now, and have lived happy lives throughout history, even if they tolerate a level of conformity and restriction-of-expression that I would find stifling, to say the least.
    2. However, I think it might not be an exaggeration to say that the CPC believes that thoughts should be censored to serve the state. I can imagine technologically augmented versions of thought control that are so severe as to permanently damage human civilization’s ability to think together, which might constitute the sort of irreparable “damage” that prevents us from deliberating to discover, and then executing on, a good future. If this sort of technology is more likely to come from China than from the west, Chinese supremacy might be disastrous.
    3. It does seem really important that AGI not lock the future into an inescapable immortal dictatorship (Probably? Maybe most people just live basically happy lives in an immortal dictatorship?). And I want to track if that is more likely to result from an intelligence explosion directed by China than by my native culture.

Summing up

  • The problem facing humanity in this era is figuring out how to exit the acute risk period, systematically, instead of by luck. 
  • The only ways that I can see to do this, depend on aligned AI or a much saner human civilization. 
  • So the problem breaks down into two subproblems: solve AI alignment or achieve enough civilizational sanity.
  • AI alignment research is going apace, and if there are ways to speed it up, that would be great.
  • I can currently see four lines of attack on civilizational sanity: unblocking innovation in governance, intelligence enhancement, and possibly widespread trauma resolution or Chinese ascendancy.
  • All of those plans might turn out to be on-net bad for the world, on further reflection.

Questions

Some of my questions for going forward:

  1. How long until transformative AI arrives?
  2. Are there tractable ways to speed technical AI alignment substantially?
  3. Are there tractable ways to unblock governance experimentation?
  4. Follow up on charter city projects
  5. What’s blocking seasteading? Is it cost, as I believe?
  6. How large are the expected flow-through effects of governmental sanity interventions on other sectors?
  7. Conditional on unblocking innovation in governance, how long is it likely to take for the best innovations to propagate outward until they are standard best practices?
  8. What’s the bottleneck for human genetic intelligence augmentation?
  9. Along what dimensions would Neuralink improve human capabilities?
  10. Is “trauma” a natural kind? To what extent is it true that psychological trauma is driving exploitative and counter-productive organizational patterns in the world?
  11. How much saner is China? How long will the Chinese system remain “alive”?
  12. How different will the long term future be, if the intelligence explosion happens in one culture rather than another?

Footnotes

[1] –  Or some civilization or other mechanism, bearing human values.

[2] –  Though of course, there isn’t a linear relationship between the individual progress bars, and total victory. We might be “70%” of the way to a full solution to both problems (whatever that means), but between the two, not have enough of the right pieces to get a combined solution that lets us exit the critical risk period. That’s why it is only allegorical.

[3] – And, in contrast, I sometimes talk with people who are so pessimistic about alignment work, that they take it for granted that the thing to do is take over the world by conventional means.

Psychoanalyzing, people seem to gravitate to the line of attack that is within their skillset, and therefore feels more comfortable to think about. This seems like a perfectly good heuristic for specialization, but it doesn’t seem like a particularly good way to identify which approach is more tractable in the abstract.

A view of the main kinds of problems facing us

I’ve decided that I want to make more of a point of writing down my macro-strategic thoughts, because writing things down often produces new insights and refinements, and so that other folks can engage with them.

This is one frame or lens that I tend to think with a lot. This might be more of a lens or a model-let than a full break-down.

There are two broad classes of problems that we need to solve: we have some pre-paradigmatic science to figure out, and we have the problem of civilizational sanity.

Preparadigmatic science

There are a number of hard scientific or scientific-philosophical problems that we’re facing down as a species.

Most notably, the problem of AI alignment, but also finding technical solutions to various risks caused by bio-technology, possibly getting our bearings with regards to what civilizational collapse means and how it is likely to come about, possibly getting a handle on the risk of a simulation shut-down, possibly making sense of the large scale cultural, political, and cognitive shifts that are likely to follow from new technologies that disrupt existing social systems (like VR?).

Basically, for every x-risk, and every big shift to human civilization, there is work to be done even making sense of the situation, and framing the problem.

As this work progresses it eventually transitions into incremental science / engineering, as the problems are clarified and specified, and the good methodologies for attacking those problems solidify.

(Work on bio-risk, might already be in this phase. And I think that work towards human genetic enhancement is basically incremental science.)

To my rough intuitions, it seems like these problems, in order of pressingness are:

  1. AI alignment
  2. Bio-risk
  3. Human genetic enhancement
  4. Social, political, civilizational collapse

…where that ranking is mostly determined by which one will have a very large impact on the world first.

So there’s the object-level work of just trying to make progress on these puzzles, plus a bunch of support work for doing that object level work.

The support work includes

  • Operations that make the research machines run (ex: MIRI ops)
  • Recruitment (and acclimation) of people who can do this kind of work (ex: CFAR)
  • Creating and maintaining infrastructure that enables intellectually fruitful conversations (ex: LessWrong)
  • Developing methodology for making progress on the problems (ex: CFAR, a little, but in practice I think that this basically has to be done by the people trying to do the object level work.)
  • Other stuff.

So we have a whole ecosystem of folks who are supporting this preparadigmatic development.

Civilizational Sanity

I think that in most worlds, if we completely succeeded at the pre-paradigmatic science, and the incremental science and engineering that follows it, the world still wouldn’t be saved.

Broadly, one way or the other, there are huge technological and social changes heading our way, and human decision makers are going to decide how to respond to those changes, possibly in ways that will have very long term repercussions on the trajectory of earth-originating life.

As a central example, if we more-or-less-completely solved AI alignment, from a full theory of agent-foundations, all the way down to the specific implementation, we would still find ourselves in a world where humanity has attained god-like power over the universe, which we could very well abuse, and end up with a much much worse future than we might otherwise have had. And by default, I don’t expect humanity to refrain from using new capabilities rashly and unwisely.

Completely solving alignment does give us a big leg up on this problem, because we’ll have the aid of superintelligent assistants in our decision making, or we might just have an AI system implement our CEV in classic fashion.

I would say that “aligned superintelligent assistants” and “AIs implementing CEV”, are civilizational sanity interventions: technologies or institutions that help humanity’s high level decision-makers to make wise decisions in response to huge changes that, by default, they will not comprehend.

I gave some examples of possible Civ Sanity interventions here.

Also, I think that some forms of governance / policy work that OpenPhil, OpenAI, and FHI have done count as part of this category, though I want to cleanly distinguish between pushing for object-level policy proposals that you’ve already figured out, and instantiating systems that make it more likely that good policies will be reached and acted upon in general.

Overall, this class of interventions seems neglected by our community, compared to doing and supporting preparadigmatic research. That might be justified. There’s reason to think that we are well equipped to make progress on hard, important research problems, but changing the way the world works seems like it might be harder on some absolute scale, or less suited to our abilities.