OpenAI pledges 20% of its compute to fight rogue superintelligent AI of the future.
Superintelligence could arrive this decade, OpenAI says in its official announcement of the superalignment effort.
Artificial intelligence is advancing at breakneck speed, and there is currently no known way to control a superintelligent AI system, should one ever appear. Alignment, in this context, is the collection of techniques that aims to keep AI systems in line with human intent and values so that humans remain in control.
Last year, OpenAI noted that “unaligned AGI could pose substantial risks to humanity” and in this latest announcement, the company fully embraced the idea that superintelligence “could lead to the disempowerment of humanity or even human extinction.”
These are dramatic scenarios, and hard evidence either way is scarce. OpenAI's stated goal is to keep human supervision and governance advancing in step with improvements in AI and AGI.
The new effort will see OpenAI dedicate 20% of the compute it has secured to date to “solve the core technical challenges of superintelligence alignment in four years.”
OpenAI’s superalignment team is co-led by Jan Leike, the ML researcher who heads the company’s alignment team, and Ilya Sutskever, OpenAI co-founder and chief scientist. The company is currently hiring for positions on the team.
The team writes:
Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.
Jan Leike & Ilya Sutskever, OpenAI Superalignment
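To see why RLHF is tied to human oversight, it helps to know that reward models in RLHF are typically trained on pairwise human preference judgments. The toy sketch below (in PyTorch, with a hypothetical architecture and made-up dimensions, not OpenAI's actual code) shows the standard Bradley-Terry preference loss; note that every training example presupposes a human who could reliably judge which of two outputs is better.

```python
import torch
import torch.nn as nn

# Toy reward model: maps a response embedding to a scalar score.
# Architecture and dimensions are illustrative only.
reward_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

def preference_loss(chosen_emb, rejected_emb):
    """Bradley-Terry pairwise loss used to train reward models from
    human comparisons: push the human-preferred response's score
    above the rejected one's."""
    r_chosen = reward_model(chosen_emb)
    r_rejected = reward_model(rejected_emb)
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Each (chosen, rejected) pair exists only because a *human* judged one
# response better than the other -- the supervision bottleneck the quote
# refers to. Random tensors stand in for response embeddings here.
chosen, rejected = torch.randn(8, 128), torch.randn(8, 128)
loss = preference_loss(chosen, rejected)
loss.backward()
optimizer.step()
```

If the model being supervised is far smarter than the human labelers, those preference judgments stop being a reliable training signal, which is the scaling problem the team describes.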
The plan is to build a roughly human-level automated alignment researcher, an AI system that would significantly speed up progress on the technical problems that make a hypothetical superintelligent AI hard to supervise. Banking on AI itself to help rein in malicious super-AI programs of the future, the new team also intends to contribute its findings openly to the alignment and safety of non-OpenAI models.