About this site
(Header image: Around the Bend, Carl Gaertner; h/t Robin Sloan and The Bonfoey Gallery)
The development or adoption of new technologies can create new risks. Sometimes these risks are global in scope; prominent examples include fossil fuels, nuclear weapons, bioweapons, and gain-of-function research.
The purpose of this site is to:
- Lay out the case for global risk from future progress in deep learning as clearly and simply as possible;
- Describe the two fundamental safety problems underlying the risk; and
- Document the engineering approaches that look most likely to solve these problems.
My hope is that this will help to grow a small, healthy network of deep learning safety research projects making steady progress towards new training methods that don’t suffer from the failure modes described here.
Relationship to other work
What’s the relationship between what I write on this site and other approaches to potential risks from advanced AI? (E.g. 1, 2, 3, 4, 5)
First, very little on this site is original to me; the picture I’m presenting is my current understanding of the situation, but it’s the product of prior work by many others over the past couple of decades. I’m cribbing most heavily from Paul Christiano.
Second, people working on AI risk hold a wide variety of views about how risks could play out and what kinds of work are most likely to help. This site mostly doesn’t address those disagreements; my approach is to give the case for my own views as succinctly as possible, rather than peppering each page with exhaustive links and discussions of dissenting views. So, while I will use phrases like “the case for risk” and “fundamental safety problems,” I hope it’s clear that this is the picture as I see it, not an established consensus view.
Thanks to Open Philanthropy for a grant supporting this work.