Google DeepMind’s Frontier Safety Framework v3.1

Google DeepMind has recently updated its safety framework. It now tracks some risks earlier and commits to a higher security standard for misuse risks.

Jun 04, 2026

Summary

Google DeepMind has published version 3.1 of its Frontier Safety Framework (FSF). The update makes two main changes.
First, it introduces Tracked Capability Levels (TCLs). TCLs trigger a proportionate assessment and mitigation process when risks become significant, but before they reach a Critical Capability Level (CCL).
Second, it raises the security bar for misuse risks to a new Security Level 2+ standard. This builds on RAND Security Level 2 by adding security measures against insider threats and well-resourced non-state actors.
The update also restructures the risk management section, merges the misalignment and ML R&D risk domains, and adds a governance section and glossary.
Overall, the update improves the framework, particularly through the introduction of TCLs and the heightened Security Level 2+ standard. However, the new governance section is thin, and the framework still relies on internal judgment with limited external accountability.

Background

On April 17, 2026, Google DeepMind published v3.1 of its Frontier Safety Framework (FSF).[1]

Google DeepMind published v1.0 in May 2024, which made it the third frontier AI company to adopt a safety framework. It has since published v2.0 in February 2025, v3.0 in September 2025, and now v3.1. The latest update is the only one not accompanied by a press release.

The FSF is built around Critical Capability Levels (CCLs). These are capability levels at which frontier AI models may pose a heightened risk of severe harm without additional mitigations (see table below). For each CCL, the FSF sets out a mitigation approach and recommended security levels.

Version 3.1 covers two broad types of risk. The first is misuse risk, which includes CBRN, cyber, and harmful manipulation. The second is ML R&D and misalignment risk, which includes risks from models that may accelerate AI development or reduce society’s ability to manage AI risks.

In this post, I briefly answer three questions about the framework: (1) what’s new, (2) what’s good about it, and (3) what could be improved.

What’s new

Version 3.1 makes several changes:

Tracked Capability Levels (TCLs): new thresholds for “significant but not severe” harm.[2] TCLs sit below CCLs. CCLs aim to capture the risk of “severe” harm, whereas TCLs aim to capture the risk of “significant” harm. Like CCLs, they trigger a risk assessment and mitigation process, but the response is meant to be proportionate to the lower level of harm.

Two TCLs added: one for CBRN, one for Stealth and Situational Awareness (see table below). The CBRN TCL is very similar to the corresponding CCL. Both cover capabilities that could enable low- to medium-resourced actors to pose a material risk of CBRN attacks, but the TCL applies to “significant” harm and the CCL to “severe” harm. The Stealth and Situational Awareness TCL, which falls under the ML R&D and Misalignment risk domain, differs more substantially from the corresponding CCLs. The CCLs focus on models that can substantially accelerate or automate AI development, whereas the TCL focuses on whether a model has sufficient situational awareness and stealth to potentially undermine human control.

A higher security standard for misuse CCLs: “Security Level 2+”. Earlier versions recommended RAND Security Level 2 (SL2). Version 3.1 uses SL2 as a baseline, but adds measures for insider threats and well-resourced non-state actors. Examples include dedicated insider risk teams, background checks and ID verification for personnel with sensitive access, advanced red-teaming that simulates well-resourced adversaries, and proactive threat hunting with 24/7 incident response capabilities.

Misalignment is now combined with ML R&D into a single risk domain. Earlier versions treated misalignment as a separate, exploratory domain. Version 3.1 folds it into “ML R&D and Misalignment”, presumably because the two risks have similar threat models and require similar mitigations.

A clearer, five-step risk management process. The steps are: (1) risk identification, (2) inherent risk assessment, (3) risk mitigation, (4) residual risk assessment, and (5) risk acceptance determination. Version 3.1 also introduces the concept of “material capability change assessments”, which help decide whether a new model version has gained meaningful new capabilities or performance improvements.

A new governance section and a glossary of key terms. The governance section briefly states that responsibilities for assessing and mitigating risks are allocated across the organization, with escalation procedures to ensure appropriate oversight. The new glossary defines terms such as TCLs, CCLs, alert thresholds, residual risk assessments, and safety cases.

What’s good about it

Here are some of the main things I like about v3.1:

TCLs help ensure risks are managed before they become severe. Under previous versions, mitigation requirements were primarily tied to CCLs – thresholds associated with severe harm. TCLs add an earlier layer of risk management by triggering a proportionate assessment and mitigation process before the relevant risks reach a CCL. This is good practice, and other frontier companies should consider similar approaches if they do not already have them.

The security standard has been strengthened. Security Level 2+ raises security expectations around insider threats and well-resourced non-state actors. This is especially valuable at a time when insider threats are increasingly recognized as a serious risk for frontier AI companies.

The risk management process is easier to understand. Many of these elements existed before, but v3.1 presents them in a clearer structure. This makes it easier to understand how Google operationalizes some of its regulatory obligations, including when and how it reassesses the capabilities of new model versions.

Google DeepMind hasn’t introduced a separate compliance framework. Both Anthropic and OpenAI have recently published separate frameworks intended to meet regulatory obligations under SB-53 and the EU AI Act / GPAI Code of Practice. As we argued in a previous Substack post, we don’t think that separation was necessary. We therefore tentatively welcome the fact that Google DeepMind doesn’t appear to have followed suit.

What could be improved

Here are some things that could be better about v3.1:

External accountability is still limited. The framework says external parties may be engaged “as needed”. But it doesn’t say when or how external experts would be involved, what role they would play, or what information they’d receive. Other frameworks have started to make more specific commitments here: Anthropic, for example, has said it will seek external review of its Risk Reports under certain conditions. Similar commitments about when external input is sought would strengthen confidence in the framework.

The distinction between significant and severe CBRN harm should be clearer. The CBRN TCL and CCL differ only in that the TCL applies to “significant” harm, while the CCL applies to “severe” harm. The framework does not explain how the two differ in practice. Future versions should say more about what separates significant CBRN harm from severe CBRN harm.

The Stealth and Situational Awareness TCL could be more precisely defined. The definition – that a model’s capabilities are such that human control “cannot be ruled out as being significantly undermined” – is notably vague. It’s not clear what evaluations would be used to decide whether a model has reached this threshold. This is arguably the most important TCL to get right, so future versions should give more detail on how it will be assessed.

The governance section needs more substance. It says that responsibilities are allocated and escalation procedures exist. But it doesn’t say who is responsible for which decisions, how escalation works, who can block a deployment, or what happens when internal teams disagree.

These are not the only areas for improvement. For example, the FSF could better justify the TCLs and CCLs it’s chosen. It could also provide more detail on how Google DeepMind will assess and communicate the overall level of risk posed by its models. Anthropic has recently taken a step in this direction by committing to publish Risk Reports – and Google DeepMind could do something similar.

Conclusion

Version 3.1 is a meaningful improvement to Google DeepMind’s Frontier Safety Framework. The new TCLs create a way to manage risks of significant harm before they reach a CCL, and Security Level 2+ strengthens protections against insider threats and well-resourced non-state actors. While this is a step in the right direction, important gaps remain, and the framework still lacks some commitments found in other frameworks, such as Anthropic-style Risk Reports.

Acknowledgements: Thanks to Elias Groll, Jonas Freund, Markus Anderljung, and Matthew van der Merwe (in alphabetical order) for helpful feedback on earlier drafts. All remaining errors are my own.

Disclaimer: Posts are written by individual team members and reflect the author’s perspective. Not all team members necessarily agree with every take. The views expressed here do not represent the official position of GovAI.

[1] The document itself refers to “Google”, rather than “Google DeepMind”, which is a change from earlier versions.

[2] TCLs are different from alert thresholds, which were already part of earlier versions and remain in v3.1. Alert thresholds are early warning indicators that a model may be approaching a CCL. TCLs are substantive capability thresholds below CCLs.

A guest post by

Sophie Williams

I'm a Research Fellow at GovAI, specialising in frontier AI risk management, particularly the design and implementation of companies' safety frameworks. I have a background in public policy and regulation.

Frontier Risk
