It seems likely that this century will see the creation of human-level machine intelligence. (AI progress forecasting is a complex topic, but for an intro you might look at Holden Karnofsky’s Are we “trending towards” transformative AI?, part of his series on “the most important century”, and the Metaculus questions When will the first weakly general AI system be devised, tested, and publicly known of? and When will the first general AI system be devised, tested and publicly known of?.) When I say this, I’m not thinking of any one design schema in particular, or of one particular way it could come about; rather, “human-level machine intelligence” is a pointer to a large swathe of the design-space of AI systems whose cognitive work, in the relevant domains, is on par with (or exceeds) human cognitive work.

Shortly thereafter, things will probably get a bit crazy – think of what happened on planet Earth the last time something slightly smarter than chimps arose, and then note that any machine-based intelligence will also have much more direct access to its substrate of thinking than we do, being able to modify and improve it much more easily; e.g. it will be able to modify its own code and upgrade its hardware with relative ease. What follows, then, goes by the cute name of intelligence explosion. See this FAQ if the description so far has left you nonplussed. For some related reading, see The Power of Intelligence and the Metaculus question Transition Time From AGI to Superintelligence.

“AI alignment” is a term for our civilizational attempt to ensure that this craziness produces as few bad consequences, and as many good consequences, as possible. As one might expect from a little craziness, the bad ones are potentially unboundedly bad, such as human extinction, while the good ones are potentially indescribably good – though Bostrom does attempt to describe a glimpse in his Letter from Utopia.

To the extent that this view corresponds to reality – and I think it does quite a bit – working on AI alignment seems to me among the most urgent and worthwhile activities of our age. Consequently, I try to spend a considerable part of my waking hours working on it and thinking about it.

If you are new to AI and the whole premise is plainly unintelligible to you, I recommend WaitButWhy’s two-part article The AI Revolution: The Road to Superintelligence. Nick Bostrom has a nice TED talk on this topic, as does Sam Harris.

If you’ve read/watched all that and you’re antsy to learn/do more, you can shoot me an email describing your background, and I can try to advise you on the next steps.

(I’m also particularly eager to hear from other people in Croatia who might be interested in working on AI alignment.)