“It seems probable that once the machine thinking method has started, it would not take long to outstrip our feeble powers. [...] At some stage, therefore, we should have to expect the machines to take control.”
Alan Turing, 1951
Artificial intelligence (AI) has undergone a spectacular transformation in recent years, evolving from specialized systems into models capable of performing an ever-growing variety of increasingly complex tasks. This trend, far from slowing down, is accelerating, regularly outpacing expert predictions and raising fundamental questions about the future capabilities of this technology and the implications of its large-scale deployment.
AI was long confined to specialized domains, but large language models (LLMs) have considerably broadened the diversity and complexity of the tasks it can perform. Every year, AI surpasses human capabilities in new areas, and every year its breakthroughs are more significant than the year before. Among the world's leading experts in this technology, many expect it to surpass all human cognitive abilities within just a few years.
To objectively assess the progress of AI, researchers define measurable capabilities through standardized evaluations called benchmarks. These benchmarks cover a wide range of tasks, from natural language understanding and mathematical problem-solving to writing computer code and image recognition.
Figure: Evolution of the performance of top-tier AI models on various standardized evaluations between 2000 and 2024. AI is progressing with increasing speed, constantly developing new capabilities, and reaching or even exceeding human-level performance in a growing number of domains.
Source: International AI Safety Report
Here are a few examples of the spectacular progress made by AI in recent years:
The performance of AI models grows in a relatively predictable way as their size (number of parameters), the amount of training data, and the computing power used for their training (training compute) are increased. These empirical relationships are known as scaling laws.
These are not fundamental laws of physics, but empirically observed relationships that have proven remarkably accurate in recent years. They indicate a strong correlation between the resources invested in the models and the performance they demonstrate.
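To make this concrete, scaling laws are often written as power laws of the following general form (after Hoffmann et al., 2022). The functional form is standard in the research literature but is not taken from this document, and the constants are fitted empirically for each model family:

```latex
% One common formulation of a neural scaling law (Hoffmann et al., 2022):
% expected training loss as a function of model size and data volume.
L(N, D) \;\approx\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}}
% N: number of parameters, D: number of training tokens,
% E, A, B, alpha, beta: empirically fitted positive constants.
```

Because the exponents are positive, increasing N and D drives the expected loss down in a smooth, predictable way, which is what allows laboratories to forecast a model's performance before committing to a training run.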
Thus, recent AI progress is largely based on the exponential advancement of several underlying factors:
These factors combine to produce an exponential improvement in AI capabilities. Compared to the top-performing AI models of 2020, the largest models of 2025 have 100 times more parameters, are trained on 100 times more data, and use 1,000 times more compute for their training.
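As an order-of-magnitude illustration of how such multiples accumulate, the short sketch below compounds assumed annual growth factors over the five years separating 2020 and 2025. The per-year factors are assumptions chosen for the illustration, not figures taken from this document.

```python
# Illustrative only: compound hypothetical annual growth factors over 5 years
# to see how the order-of-magnitude multiples quoted above can arise.
# The per-year factors below are assumptions for the sketch, not sourced figures.

years = 5
annual_growth = {
    "parameters": 2.5,       # assumed ~2.5x per year
    "training data": 2.5,    # assumed ~2.5x per year
    "training compute": 4.0, # assumed ~4x per year
}

for resource, factor in annual_growth.items():
    total = factor ** years
    print(f"{resource}: ~{total:,.0f}x over {years} years")

# parameters: ~98x, training data: ~98x, training compute: ~1,024x,
# i.e. roughly the 100x / 100x / 1,000x multiples mentioned above.
```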
Although human intelligence is the origin of artificial intelligence, there is no scientific argument to support the claim that human intelligence constitutes an upper bound for AI. Unlike traditional software explicitly programmed to perform a task, modern AI systems, particularly those based on deep learning, are trained by being exposed to immense amounts of data, "playing" against themselves, and interacting with their environment.
For example, AlphaZero, developed by DeepMind, learned to master complex games like Go and chess from scratch, with no prior knowledge or data beyond the rules of the game. By playing millions of games against itself, the program gradually rediscovered known game strategies, invented new ones, and, after only a few hours of training, far surpassed both the best human players and the strongest existing specialized programs (Silver et al., 2018). AI can therefore achieve superhuman levels of performance without being limited by human knowledge or approaches.
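The principle of improving purely through self-play can be illustrated with a toy example. The sketch below is not AlphaZero (no neural network, no tree search); it is a minimal tabular learner for the simple game of Nim that starts with nothing but the rules and improves by playing against itself. All names and hyperparameters are illustrative.

```python
# Toy illustration of learning from self-play, in the spirit of (but far simpler
# than) AlphaZero: tabular learning on the game of Nim, starting from zero
# knowledge beyond the rules.
import random
from collections import defaultdict

PILE = 21          # starting number of sticks
MOVES = (1, 2, 3)  # legal moves: remove 1, 2 or 3 sticks; taking the last stick wins
EPSILON = 0.1      # exploration rate
ALPHA = 0.5        # learning rate

# Q[(sticks_remaining, move)] -> estimated value of the move for the player to act
Q = defaultdict(float)

def choose(sticks, explore=True):
    """Pick a legal move, mostly greedily with respect to the shared Q-table."""
    legal = [m for m in MOVES if m <= sticks]
    if explore and random.random() < EPSILON:
        return random.choice(legal)
    return max(legal, key=lambda m: Q[(sticks, m)])

def self_play_game():
    """Play one game in which both sides use (and update) the same Q-table."""
    history = []   # (state, move) pairs, alternating between the two players
    sticks = PILE
    while sticks > 0:
        move = choose(sticks)
        history.append((sticks, move))
        sticks -= move
    # The player who removed the last stick wins (+1), the opponent loses (-1);
    # propagate the outcome backwards, flipping sign at each earlier move.
    reward = 1.0
    for state, move in reversed(history):
        Q[(state, move)] += ALPHA * (reward - Q[(state, move)])
        reward = -reward

for _ in range(50_000):
    self_play_game()

# After training, greedy play should approximate the known optimal strategy
# for this game: leave the opponent a multiple of 4 sticks whenever possible.
print([(s, choose(s, explore=False)) for s in range(1, 9)])
```

Even this crude loop typically converges on the game's well-known winning strategy without ever seeing a human game, the same property AlphaZero demonstrated at a vastly larger scale.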
Although scaling laws do not allow us to predict precisely what new capabilities AI will develop, or when, they do allow us to anticipate exponential progress in AI capabilities. Scaling laws make it possible to extrapolate general trends in performance improvement with some confidence, as long as resources—compute, data, model size—continue to increase. They have enabled AI labs to plan massive investments in anticipation of uninterrupted progress. While the emergence of qualitatively new capabilities—for example, the ability to reason in multiple steps or generate functional code—is often a surprise, the increase in performance on existing evaluations is predictable and systematically confirmed empirically.
Training increasingly large and complex models requires ever-growing amounts of energy, data, and computing power. This raises questions about the existence of potential limits to this exponential growth.
Initially, a major concern was that the amount of quality data available on the internet could soon be exhausted, thus limiting the training of future models. However, research into synthetic data is progressing rapidly. This refers to data generated by other AIs, which can be used to train new models. If the quality and diversity of synthetic data can be maintained and improved, it could potentially circumvent the limitation of available "real" data. Models are already being trained with a significant share of synthetic data, and this proportion is increasing.
Training state-of-the-art AI models is extremely costly in terms of computing power and energy.
Access to computing power is the primary bottleneck for AI progress. It requires considerable investment from governments and tech giants in specialized chips (GPUs, TPUs) and the construction of gigantic data centers. The competition for access to these resources is intense.
As long as AI development remains a strategic priority, it is likely that significant energy resources will continue to be allocated to it, potentially by building new power plants or reallocating existing power, even if this creates strain on electrical grids and raises environmental questions.
Although these limits are serious, massive investments and continuous innovations in algorithmic and hardware efficiency suggest that they will not significantly slow down progress in the short and medium term, even if they pose long-term sustainability questions.
Read chapter 1 of the AI Safety Atlas.
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Statement on AI Risk, 2023, signed by over 600 experts, including Nobel laureates and Turing Award winners
Intelligence is the foundation of all human inventions, from the most beneficial to the most dangerous. By pushing the boundaries of our cognitive abilities, AI can be expected to accelerate innovation in all fields and democratize all kinds of technologies—for better or for worse.
The development of artificial intelligence exposes human societies to three categories of risks: malicious uses of the technology, systemic risks arising from its large-scale deployment, and the loss of control over increasingly capable and autonomous systems.
Until now, building a weapon of mass destruction required the resources of a state, which severely limited the number of actors capable of inflicting widespread harm on humanity.
In the near future, AI could significantly lower the technical and financial barriers to developing destructive technologies.
For example, AI could enable the creation of an artificial pandemic, equip terrorists to launch devastating cyberattacks on critical infrastructure, or trigger a global race for lethal autonomous weapons.
Large language models can provide detailed, step-by-step instructions for producing toxic molecules and known pathogens (Soice et al., 2023). Recent evaluations have shown that AI models can generate experimental virology protocols judged superior to those of 94% of human experts (Götting et al., 2025).
AI could also facilitate the invention of new, more dangerous biological and chemical weapons. In one experiment, an AI model originally designed to assess the toxicity of drug molecules was repurposed to predict and formulate new, extremely toxic chemical compounds in just a few hours (Urbina et al., 2022). In molecular biology, specialized models can help design complex biological structures with desired properties (Abramson et al., 2024; Hayes et al., 2024) and could be used to increase the contagiousness or lethality of a pathogen (Sandbrink et al., 2024).
Once these new agents are designed digitally, the necessary DNA sequences could be ordered from synthesis labs. Here too, AI can be used to help bypass security protocols aimed at preventing the production of pathogens (Soice et al., 2023; Wittmann et al., 2024). Since early 2025, advanced language models have proven to be more proficient than human experts on a test assessing the ability to carry out experimental virology protocols in a lab (Hendrycks et al., 2025).
In early 2025, the companies OpenAI and Anthropic indicated that their most advanced models were approaching the point where they could assist non-specialists in creating a biological weapon.
AI significantly lowers the technical barriers to cybercrime. It allows for the automation and refinement of many stages of a cyberattack, from reconnaissance to vulnerability exploitation. Large language models can help generate malicious code, identify software flaws (Metta et al., 2024; NCSC, 2024; Allamanis et al., 2024), and create personalized and highly convincing phishing and social engineering campaigns (Park et al., 2024). This could lead to a proliferation of sophisticated and hard-to-counter attacks, including against critical infrastructure such as hospitals or energy grids.
In mid-2025, Google stated that its most advanced model (Gemini 2.5 Pro) "could present a considerable risk of serious harm in the absence of appropriate mitigation measures" within "the coming months."
The growing integration of AI into military systems, particularly through the development of autonomous weapons, raises concerns about the risk of an arms race, of unintentional escalation driven by the speed of algorithmic decisions, and of a lowering of the threshold for initiating conflicts (Simmons-Edler et al., 2024).
Read chapter 2.3 of the AI Safety Atlas.
Artificial intelligence is becoming increasingly intertwined with human structures: the economy, the information ecosystem, and digital and physical infrastructures. The interactions between technology and human societies give rise to systemic risks.
By facilitating the production and dissemination of disinformation on a massive scale, AI is plunging human societies into informational chaos and endangering democracies. By automating all or part of cognitive tasks, AI could cause massive job destruction. AI also poses first-order threats to fundamental human rights.
AI allows for the massive, low-cost, and targeted generation of content (texts, articles, images, sounds, videos) that is increasingly difficult or even impossible to distinguish from content created by humans (International AI Safety Report, 2025). Experiments have shown that AI-generated content can be as persuasive, or even more so, than that produced by humans (Salvi et al., 2024), and allows for the effective exploitation of individual psychological vulnerabilities (Park et al., 2024). These capabilities could facilitate large-scale disinformation and public opinion manipulation campaigns, undermine democratic processes, and worsen geopolitical tensions.
But generative AI does not operate on a blank slate; it exists within an information landscape largely structured by social media.
Recommendation AIs, which build social media news feeds and automatically select the videos we watch, can be considered "humanity's first encounter" with artificial intelligence: five billion people consume content recommended by these algorithms for an average of two and a half hours daily. These algorithms systematically amplify hate, lies, and outrage at the expense of peace, honesty, and nuance, fueling disinformation and the polarization of public debate.
AI, particularly large language models, demonstrates capabilities that surpass those of humans in an ever-growing variety of complex cognitive tasks (International AI Safety Report, 2025). These skills now extend to domains that, until recently, were exclusively the preserve of human intelligence.
Unlike technologies that have driven past waves of automation, advanced AI systems possess a versatility and autonomy that allow them to automate increasingly long, complex, and varied tasks and projects (METR, 2025), with less and less human supervision and at a cost that is a fraction of human workers' wages. The very rapid adoption of this technology within companies (Bick et al., 2024), combined with the growing range of automatable tasks, could severely compromise professional retraining opportunities for workers (International AI Safety Report, 2025).
According to an analysis by the International Monetary Fund, 60% of jobs in advanced economies are exposed to AI, compared with 40% globally (Cazzaniga et al., 2024). Among these, about half are considered directly threatened by automation.
If deployed in certain areas without adequate safeguards, artificial intelligence would pose direct threats to fundamental human rights (International AI Safety Report, 2025):
The development of cutting-edge artificial intelligence requires colossal resources: specialized computing power, energy, and very rare talent (Maslej et al., 2024). These costs, amounting to hundreds of millions or even billions of dollars to train a single state-of-the-art model (Epoch AI, 2024), create considerable barriers to entry. Only a small number of tech giants, mainly based in the United States and China, are currently in a position to develop these advanced AI models.
This concentrated economic power can translate into disproportionate political and social influence. Although AI could generate spectacular economic productivity gains, history shows that these benefits are likely to primarily profit a minority at the expense of the majority, unless institutions implement equitable redistribution of profits and effective protection for workers (Acemoglu & Johnson, 2023).
This concentration thus poses risks of dependency for other countries (Korinek & Stiglitz, 2021) and raises fundamental questions for democracy and the governance of technologies destined to become so deeply integrated into our societies.
Beyond malicious uses and societal upheavals, the development of an AI that significantly surpasses human capabilities in many domains raises a fundamental risk: that of a rapid or gradual loss of control.
If we create systems whose objectives or behavior we can no longer control, the consequences could be irreversible and potentially catastrophic for humanity. Even without assuming that machines could emancipate themselves from their creators, we risk entrusting AI with an ever-widening range of social functions (friendships and intimate relationships), political functions (planning, decision support, arbitration), and economic functions (large-scale automation), to the point of largely losing our grip on the future of our societies.
Yes, the scientific community takes the risk of losing control of machines that surpass human intelligence very seriously, and these concerns are not new.
From the very beginning of computing, pioneers such as Alan Turing, I. J. Good, and Norbert Wiener put forward theoretical arguments in support of this possibility. This fear, popularized among the general public by numerous works of science fiction, has in parallel found a solid theoretical foundation, attracting growing interest within the scientific community.
In recent years, the dramatic acceleration of AI capabilities and associated risks has prompted eminent scientists such as Yoshua Bengio (Turing Award winner, most cited researcher in computer science), Geoffrey Hinton (Nobel laureate and Turing Award winner), Ilya Sutskever, and Stuart Russell to study these risks and alert the public and policymakers (FLI, 2023; CAIS, 2024).
A survey of thousands of AI researchers reveals that about half of them estimate there is a greater than 10% probability that "humanity's inability to control future advanced AI systems will lead to the extinction of the human species or its profound and irreversible disempowerment" (Grace et al., 2024). The same survey reveals that three-quarters of the experts questioned are seriously or extremely concerned about several risks of malicious AI use, such as the fabrication of biological weapons.
Although there is currently no formal scientific consensus on the exact severity of this risk, an existential threat should be taken into account even if its probability were deemed low. Moreover, a converging body of evidence, including theoretical arguments, empirical examples, and forward-looking analyses, indicates that this risk is far higher than the levels considered acceptable in other critical industries.
The risk of losing control does not necessarily come from a "malicious" AI, but from several fundamental characteristics of intelligent systems optimizing objectives in complex environments:
Yes, empirical research shows that some advanced AI models can develop and implement strategies of manipulation and deception to achieve their goals (Hagendorff, 2024) or to prevent their own modification or deletion (Hubinger et al., 2024).
Several recent experiments conducted in controlled environments (Apollo, 2024; Redwood, 2024) have thus demonstrated a propensity for certain AI systems to:
These undesirable propensities and capabilities are likely to worsen and become increasingly difficult to detect as AI models advance.
Artificial General Intelligence (AGI) refers to an AI system capable of matching or surpassing human cognitive abilities across a wide range of tasks. Most experts expect AGI to be developed in the coming decades, or even within the next few years, and their median predicted date has moved earlier year after year (Grace et al., 2024).
Once AGI is achieved, some specialists anticipate a rapid acceleration of AI progress: an AGI capable of recursive self-improvement could quickly reach a level of intelligence far surpassing that of humans, becoming a "superintelligence" (Good, 1965).
A superintelligence, by definition, would have cognitive abilities (strategic planning, social manipulation, scientific research, engineering, hacking) that are qualitatively superior to those of humans. If its goals are not perfectly aligned with ours, it could use its intelligence to take control of its environment and pursue its own goals, to the detriment of human interests (Bostrom, 2014).
Although this type of scenario is inherently speculative, a growing proportion of AI safety experts consider it plausible enough to be taken seriously.
Read chapters 2.4, 2.5, and 2.6 of the AI Safety Atlas.
“Many risks arising from AI are inherently international in nature, and so are best addressed through international cooperation.”
Bletchley Declaration, 2023. Signed by the European Union and 28 countries, including the United States and China
The challenges posed by artificial intelligence extend far beyond national borders. It is imperative that the international community cooperates to mitigate and prevent risks with potentially catastrophic and irreversible consequences. Tech companies are engaged in a global race to develop ever more powerful AI. This race, if left unchecked, risks ending in disaster.
The speed of technological progress dangerously contrasts with the slow pace of implementing technical and regulatory safeguards. While geopolitical rivalries and economic competition complicate the establishment of global regulation, this must not obscure the urgent need for international cooperation to ensure this technology serves the interests of all humanity.
Even amidst geopolitical tensions, there are areas where nations have a common interest in cooperating. History has shown this with the regulation of nuclear, biological, and chemical weapons. Today, facing AI, humanity must agree on the capabilities and uses that present such grave dangers that they must be universally prohibited.
A loss of control over AI or its large-scale malicious use would constitute a fundamental and common threat to all of humanity. States therefore have a shared interest in avoiding such catastrophic scenarios, which makes cooperation on implementing red lines both essential and realistic.
History offers precedents where the international community, including geopolitical rivals, has successfully agreed on binding regulations for technologies with catastrophic risks. The Treaty on the Non-Proliferation of Nuclear Weapons (1968) and the Biological Weapons Convention (1975) were negotiated and ratified in the midst of the Cold War, because the consequences of non-cooperation were deemed unacceptable by all parties, despite mutual distrust and hostility.
International cooperation initiatives on AI safety already exist, such as the international AI Summits and discussions within the UN. The priority is to agree on clear red lines concerning the most dangerous capabilities and uses.
We believe these red lines must be defined within international bodies like the UN or at future international AI Summits. The IDAIS Beijing statement provides examples of the types of capabilities that should be prohibited:
These risks must be precisely defined and quantified through standardized evaluations. AI systems must be categorized according to the level of risk they represent with regard to these capabilities, and these risk levels must be delineated by clear thresholds. The systems' capabilities and safety protocols would then be assessed through rigorously supervised evaluations.
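As a purely illustrative sketch of the mechanism described above, the snippet below maps evaluation scores to risk tiers using explicit thresholds. The capability categories, tier names, and numerical thresholds are hypothetical and do not correspond to any existing standard.

```python
# Minimal sketch of mapping standardized evaluation scores to risk tiers.
# Tier names, capability categories and threshold values are hypothetical
# illustrations, not an existing regulatory standard.
from dataclasses import dataclass

@dataclass
class Threshold:
    capability: str   # capability family being evaluated
    low: float        # score above which the system enters the "elevated" tier
    high: float       # score above which the system enters the "critical" tier

THRESHOLDS = [
    Threshold("autonomous replication", low=0.2, high=0.5),
    Threshold("cyber-offense uplift",   low=0.3, high=0.6),
    Threshold("biorisk uplift",         low=0.1, high=0.4),
]

def risk_tier(scores: dict[str, float]) -> str:
    """Return the overall tier implied by the worst capability score."""
    tier = "baseline"
    for t in THRESHOLDS:
        score = scores.get(t.capability, 0.0)
        if score >= t.high:
            return "critical"   # crosses a red line: deployment prohibited
        if score >= t.low:
            tier = "elevated"   # allowed only with additional safeguards
    return tier

# Example: hypothetical evaluation results for one model
print(risk_tier({"autonomous replication": 0.1,
                 "cyber-offense uplift": 0.45,
                 "biorisk uplift": 0.05}))   # -> "elevated"
```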
Ensuring compliance with red lines requires establishing a robust regulatory framework at the national and international levels. This framework must include a mandatory registration system for advanced AI systems with competent national authorities, starting from the early stages of their design.
The training and deployment of these systems will then be strictly conditional on compliance with harmonized global standards. It will be up to the developers to continuously demonstrate, through evaluation results covering the entire development cycle, that the risks are under control and below the defined thresholds.
Independent oversight and sanction mechanisms must be established to ensure compliance with these standards, with coordination envisioned at the multilateral level. Finally, this effort must be supported by international scientific collaboration and a substantial investment in AI safety research and development (for example, an amount equivalent to a significant fraction of the models' training costs).
The absence of a binding regulatory framework for AI encourages a "race to the bottom" on safety, where competitive pressure pushes all actors to take increasing risks so as not to be left behind.
Today, in most parts of the world, cutting-edge AI models are deployed with fewer regulatory constraints than a toaster. It is urgent to build a robust legal framework ensuring that the actors developing these technologies are held responsible for the damage and harm caused by their malfunctions. If the technology progresses faster than our ability to assess and manage its risks, its deployment must be slowed down.
First, competition among tech companies incentivizes them to prioritize capability development over thorough safety evaluations that might slow down the deployment of their models. This results in a "race to the bottom" on safety (Armstrong et al., 2016; CeSIA, 2025). The scale of investment in capabilities, compared with the resources allocated to safety, reflects companies' current priorities (International AI Safety Report, 2025).
Second, voluntary commitments have already been made, but the lack of verification and enforcement mechanisms has led to these commitments being flouted (Gunapala et al., 2025). For example, the leading AI companies committed at the AI Seoul Summit (UK Department for Science, Innovation & Technology, 2024) to transparently identify, evaluate, and control the risks associated with their frontier models. A year later, several of them had not honored their commitments (Seoul Tracker, 2025), and all have largely insufficient safety policies (SaferAI, 2025).
Third, the very nature of the technology means that even if the majority of actors adopted stringent safety measures, a single less scrupulous company or an unforeseen failure in a widely deployed system could be enough to trigger severe consequences on a global scale (the "weakest link problem") (International AI Safety Report, 2025).
The scale and global nature of AI-related risks therefore demand a coordinated and binding response at the international level, similar to what exists for other high-risk technologies like nuclear power or biotechnology.
An effective AI framework rests on five pillars:
Its full implementation will take time, but the urgency requires starting without delay. The strategy must be pragmatic and progressive. Several steps are achievable quickly if a coalition of willing states (e.g., within the G7 or the OECD) takes the initiative.
In the short term (1-2 years), it is possible to:
These initial measures would create momentum and lay the institutional groundwork necessary to negotiate the more binding elements, such as an international treaty defining red lines and their associated verification mechanisms.