Responsible Scaling Policy
BLUF
The Responsible Scaling Policy (RSP) is Anthropic’s voluntary governance framework for managing the catastrophic and dual-use risks of increasingly capable frontier AI models. Introduced in 2023 and updated in 2024, it establishes a tiered “AI Safety Level” (ASL) classification system, loosely modeled on biosafety levels (BSL-1 through BSL-4), that determines what deployment, access, and research restrictions apply to models at each capability tier. The RSP is the primary public reference for Anthropic’s self-regulatory commitments. In principle, it governs whether and how models like Claude Mythos may be deployed, under what access controls, and with what usage restrictions.
ASL Classification System
| Level | Risk Threshold | Implication |
|---|---|---|
| ASL-1 | No meaningful catastrophic risk (e.g., a 2018-era LLM or a chess-only system) | No specific deployment restrictions |
| ASL-2 | Early signs of dangerous capabilities, but no meaningful uplift beyond what search engines or textbooks already provide | Current deployment standard; enhanced monitoring required |
| ASL-3 | Substantial uplift to mass-casualty weapon development or autonomous cyberattack | Restricted deployment; government notification; mandatory safety measures |
| ASL-4 | Catastrophic potential (autonomous replication, major weapons development) | No deployment without extraordinary safety measures |
As of 2024, Anthropic has classified Claude 3 and Claude 3.5 models as ASL-2. No public ASL classification for Claude Mythos has been released.
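The tier structure in the table above can be sketched as an ordered lookup. This is an illustrative data structure only, not Anthropic's implementation: the names `ASL`, `IMPLICATIONS`, `implication_for`, and `at_least` are hypothetical, and the implication strings are copied from the table's "Implication" column.

```python
from enum import IntEnum


class ASL(IntEnum):
    """AI Safety Levels, ordered so that a higher value is a higher risk tier."""
    ASL_1 = 1
    ASL_2 = 2
    ASL_3 = 3
    ASL_4 = 4


# Implication text copied from the table; the mapping itself is an assumption
# made for illustration, not a published machine-readable policy.
IMPLICATIONS = {
    ASL.ASL_1: "No specific deployment restrictions",
    ASL.ASL_2: "Current deployment standard; enhanced monitoring required",
    ASL.ASL_3: "Restricted deployment; government notification; mandatory safety measures",
    ASL.ASL_4: "No deployment without extraordinary safety measures",
}


def implication_for(level: ASL) -> str:
    """Return the deployment implication listed for a given tier."""
    return IMPLICATIONS[level]


def at_least(level: ASL, floor: ASL) -> bool:
    """True if `level` is at or above `floor` in the tier ordering."""
    return level >= floor


if __name__ == "__main__":
    for level in ASL:
        print(f"{level.name.replace('_', '-')}: {implication_for(level)}")
```

Using an ordered enum makes the "tiered" aspect explicit: a check such as `at_least(level, ASL.ASL_3)` expresses "this tier or any higher one" without enumerating each level.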
Analytical Significance
The RSP represents an attempt to self-regulate frontier AI development in the absence of binding government regulation. Its limitations are analytically significant:
- Self-reporting: Anthropic determines its own ASL classification without independent audit or mandatory government disclosure
- Enforcement gap: No legal mechanism compels compliance; the RSP is a policy commitment, not a legal obligation
- Mythos gap: No published RSP evaluation for Claude Mythos exists — the most capable Anthropic model as of April 2026 lacks a public capability-risk assessment, even as NSA operationalizes it for vulnerability research
- Comparison: OpenAI’s Preparedness Framework is structurally similar. Both are voluntary self-governance instruments, and neither has binding external enforcement
Key Connections
- Anthropic — policy author and sole enforcer
- Claude Mythos — the model for which no public RSP evaluation exists
- Constitutional AI — Anthropic’s alignment training methodology; related to, but distinct from, the capability evaluations and safeguards on which RSP safety claims rest
- Dual-Use Technology — RSP is specifically designed to manage dual-use AI capability risks
Sources
- Anthropic RSP documentation (anthropic.com/responsible-scaling-policy) — [High confidence — primary]
- UK AI Safety Institute evaluations (2023–2024) — [High confidence]
- Center for AI Safety (CAIS): RSP analysis — [Medium confidence]