Lots of Links
How to use this page: This page is intended as a reference for when you want to look up something specific.
AI safety is a new and fast-growing research field. This means that everything is a bit of a mess. If you are new to the field and feel overwhelmed or don't know where to start, we recommend contacting AI Safety Quest for guidance. Alternatively, just have a look at the highlighted links and ignore the rest.
Page maintenance: This page is sporadically maintained by Linda Linsefors. Please reach out if you want to help keep it up to date, or if you have other comments or suggestions.
linda.linsefors@gmail.com
Last update: 2024-12-13
Contents
News and Community
List of communities
News
AI Safety Newsletter by Center for AI Safety
AI Safety in China by Concordia AI (安远AI)
Don't Worry About the Vase | Substack or WordPress <- Not just AI
Community Blogs
AI Safety Discussion – This group is primarily for people who have experience in AI/ML and/or are familiar with AI safety. Beginners are encouraged to join the AI Safety Discussion (Open) group.
Twitter/X (Very incomplete since I (Linda) don't use Twitter. Please help.)
AI Safety Core by JJ Balisanyuka-Smith
⏹️ = Stop AI, ⏸️ = Pause AI
(These symbols are used on Twitter to show support for these types of policies.)
Other
AI Safety Google Group – used by various people for various announcements
AI Alignment Slack group – discussions, networking, etc.
AI Existential Safety Community from FLI
Local groups (click to expand)
Australia & New Zealand
Asia
Africa
Equiano Institute - A grassroots interdisciplinary responsible AI lab for Africa and the Global South
Netherlands
UK & Ireland
Europe
South America
North America
Fellowships, Internships, Training Programs, Study Guides, etc
AISafety.com Events and Training – a calendar showing all upcoming programs and other events
Programs
AI Safety Camp (AISC) – online, part-time research program.
ML4Good - Intensive in-person Bootcamps
Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
Swiss Existential Risk Initiative (CHERI), Research Fellowship
Center for Human-Compatible AI (CHAI), Research Fellowship, Collaborations, Internships
Center for the Governance of AI (GovAI), Research Fellows, Summer and Winter Fellowships
ML for Alignment Bootcamp (MLAB) – not currently running, but you can sign up for news on future iterations or request access to their curriculum
Self-study guides
AI Safety Fundamentals
Introduction to ML Safety – Self Study Curriculum
Alignment Research Engineer Accelerator (ARENA) – follow links from each module to find full curriculum
Levelling Up in AI Safety Research Engineering by Gabriel Mukobi
Study Guide by John Wentworth
Agent Foundations for Superintelligence-Robust Alignment (AFSRA) by plex
Alignment reading list by Lucius Bushnaq
Reading Guide for the Global Politics of Artificial Intelligence
Description of Refine, including initial reading list by Adam Shimi
Introduction to AI Safety, Ethics, and Society textbook + lectures by Dan Hendrycks
"Key Phenomena in AI Risk" - Reading Curriculum Principal Curator: TJ (particlemania)
See AISafety.com Courses for more
Old self-study guides (pre-LLMs)
MIRI’s Research Guide (despite the name it is actually more of a study guide)
OpenAI’s Spinning Up in Deep RL (teaches you AI and not AI safety, but still useful)
Other lists
List of AI Safety Technical Courses, Reading Lists, and Curriculums from Nonlinear
List of some university courses on X-risk | Pablo's miscellany <- Not AI Safety specific
Top US policy master's programs <- Not AI Safety specific
Research and Career Advice
Research/General Advice
Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety) by Andrew Critch
Inside Views, Impostor Syndrome, and the Great LARP by John Wentworth
You Are Not Measuring What You Think You Are Measuring by John Wentworth
Research productivity tip: "Solve The Whole Problem Day" by Steven Byrnes
Alignment Research Field Guide by MIRI – the best part is 3A. Transmitters and receivers
Focus on the places where you feel shocked everyone's dropping the ball by Nate Soares
Career Advice
FAQ: Advice for AI alignment researchers by Rohin Shah
Beneficial AI Research Career Advice by Adam Gleave
Technical AI Safety Careers by AI Safety Fundamentals
PhD Advice
Should you do a PhD? by Linda
Leveraging Academia and Deliberate Grad School by Andrew Critch
A Survival Guide to a PhD by Andrej Karpathy
How to Write an Email to a Potential Ph.D. Advisor/Professor
List of (mostly academic) AI Safety Professionals – look here for potential PhD supervisors
Job listings
80,000 Hours Job Board WARNING: This job board includes listings of jobs at AI capabilities labs such as OpenAI. Please don't apply for these jobs. Even so-called safety roles at these labs should not be assumed to be good options for someone who cares about AI safety.
Many job openings are posted in #opportunities in the AI Alignment Slack
Career Coaching
Arkose – Career coaching for machine learning professionals interested in contributing to technical AI safety work.
Free Coaching for Early/Mid-Career Software Engineers with Jeffrey Yun
Research Maps, Reviews, Information Databases, etc
Ten Levels of AI Alignment Difficulty by Sammy Martin
Mapping the Conceptual Territory in AI Existential Safety and Alignment by Jack Koch
AI Alignment 2018-19 Review by Rohin Shah
2020 AI Alignment Literature Review and Charity Comparison by Larks
An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 by Neel Nanda
Probability Calculator designed by Will Petillo to estimate p(doom), the probability of AI destroying the world
Code libraries
Inspect - a framework for large language model evaluations created by the UK AI Safety Institute.
NNsight (/ɛn.saɪt/) - a package for interpreting and manipulating the internals of deep learning models.
TransformerLens - a library for doing mechanistic interpretability of GPT-2-style language models (see the short usage sketch below).
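As a rough illustration of what working with these libraries looks like, here is a minimal TransformerLens sketch (a generic example, not taken from this page or the library's documentation; the prompt and layer index are arbitrary). It loads a pretrained GPT-2 and caches its activations so they can be inspected.

```python
# Minimal TransformerLens sketch: load a small GPT-2-style model and
# cache its internal activations for inspection.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")

# Forward pass that also caches every intermediate activation.
logits, cache = model.run_with_cache("AI safety is")

# Look at one cached activation, e.g. the residual stream after layer 0.
resid = cache["resid_post", 0]
print(resid.shape)  # (batch, sequence_position, d_model)
```

NNsight and Inspect follow a broadly similar pattern (wrap a model, then trace or evaluate it), but check each library's own documentation for the exact API.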
Research Agendas
Technical AI safety
MIRI: Agent Foundations for Aligning Machine Intelligence with Human Interests (2017) and Alignment for Advanced Machine Learning Systems research agendas
CLR: Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda (+ includes some questions related to AI governance)
Paul Christiano’s research agenda summary (and FAQ and talk) (2018)
Synthesising a human's preferences into a utility function (example use and talk), Stuart Armstrong, (2019)
The Learning-Theoretic AI Alignment Research Agenda, Vanessa Kosoy, (2018)
Research Priorities for Robust and Beneficial Artificial Intelligence, Stuart Russell, Daniel Dewey, Max Tegmark, (2016)
Concrete Problems in AI Safety, Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané, (2016)
AGI Safety Literature Review, Tom Everitt, Gary Lea, Marcus Hutter, (2018)
AI Services as a Research Paradigm, Vojta Kovarik, (2020)
Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems, Sandhya Saisubramanian, Shlomo Zilberstein, Ece Kamar, (2020)
AI Research Considerations for Human Existential Safety (ARCHES), Andrew Critch, David Krueger, (2020)
How do we become confident in the safety of a machine learning system? by Evan Hubinger
Research Agenda: Using Neuroscience to Achieve Safe and Beneficial AGI by Steve Byrnes
Unsolved Problems in ML Safety by Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt
Foundational Challenges in Assuring Alignment and Safety of Large Language Models, Usman Anwar, et al.
AI governance
AI Impacts: promising research projects and possible empirical investigations
Governance of AI program at FHI: Allan Dafoe's AI governance research agenda
Center for a New American Security: Artificial Intelligence and Global Security Initiative Research Agenda
FLI: A survey of research questions for robust and beneficial AI (+ some aspects also fall into technical AI safety)
Luke Muehlhauser’s list of research questions to improve our strategic picture of superintelligence (2014)
Books, Papers, Podcasts, and Videos
(Non-exhaustive list of AI safety material)
Books
Introduction to AI Safety, Ethics, and Society (textbook) by Dan Hendrycks, 2024
Taming the Machine: Ethically Harness the Power of AI by Nell Watson, 2024
Uncontrollable: The Threat of Artificial Superintelligence and the Race to Save the World by Darren McKee, 2023
The Alignment Problem by Brian Christian, 2020
Human Compatible by Stuart Russell, 2019
Reframing Superintelligence by Eric Drexler, 2019
The AI Does Not Hate You: Superintelligence, Rationality and the Race to Save the World by Tom Chivers, 2019
Artificial Intelligence Safety and Security by Roman Yampolskiy, 2018
Superintelligence: Paths, Dangers, Strategies by Nick Bostrom, 2014
Some other reading
Victoria Krakovna's AI safety resources (contains a list of motivational resources and key papers for some AI Safety subfields)
Pragmatic AI Safety by Thomas Woodside
X-Risk Analysis for AI Research by Dan Hendrycks, Mantas Mazeika
Podcasts
80k's Podcast (Effective Altruism podcast with some AI Safety episodes)
The Nonlinear Library – a repository of text-to-speech content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs
YouTube
(Most of these channels are a mix of AI Safety content and other content)
Robert Miles discusses AI on Computerphile and on his own YouTube channel
SlateStarCodex Meetups (recorded talks)
CEA Artificial Intelligence playlist
Other Videos
AISafety.video – a large collection of videos
AI Safety Research Groups
(Many of these groups work on a combination of AI Safety and other X-risks)
There are many academic and independent researchers who are interested in AI safety and are not covered by this list. We are not going to list specific individuals publicly, so please contact us if you want to find more AI safety researchers.
Technical AI safety
Center for Human-Compatible Artificial Intelligence (CHAI), University of California, Berkeley
Future of Humanity Institute (FHI), University of Oxford
Center on Long-Term Risk (CLR), London
Ought, San Francisco
Redwood Research, Berkeley
AI governance
The Center for the Study of Existential Risk (CSER), University of Cambridge
Future of Humanity Institute (FHI), University of Oxford
Global Catastrophic Risk Institute (GCRI), various locations
Median Group, Berkeley
Center for Security and Emerging Technology (CSET), Washington
Forecast and strategy
Future of Humanity Institute (FHI), University of Oxford
Leverhulme Center for the Future of Intelligence (CFI), University of Cambridge
Organisations working on both technical safety and capabilities
(These orgs employ people who are doing valuable technical AI safety research, which is good. But they also employ people doing AI capabilities research, which is bad since it speeds up AI development and reduces the time we have left to solve AI safety.)
Outreach and Advocacy Groups and Initiatives
Existential Risk Observatory – collects and spreads information about existential risks
Future of Life Institute (FLI) – outreach, podcast and conferences
Palisade Research – produces demos of dangerous AI capabilities
Funding
Grants
The Center on Long-Term Risk Fund (CLR Fund) – S-risk focused
Survival and Flourishing Fund (SFF) – awards and facilitates grants to existing charities
Career development and transition funding | Open Philanthropy
Open Philanthropy Undergraduate Scholarship | Open Philanthropy
AI Safety: Neuro/Security/Cryptography/Multipolar Approaches | Foresight Institute
Fundraising platforms
Housing
CEEALAR / EA Hotel – a group house in Blackpool, UK which provides free food and housing for people working on Effective Altruist projects (including AI safety)
Other lists
Field building Orgs and Initiatives
AI Safety Support (AISS) – that's us!
Lightcone Infrastructure – building tech, infrastructure and community
Berkeley Existential Risk Initiative (BERI) – support for academic researchers