## What Is a Safety Case?
A safety case is a structured argument — supported by evidence — that a
system will behave safely under defined conditions. For AI agents, it
answers: Why should we believe this agent will actually follow its
declared goals? Written goals without safety cases are incomplete.
Safety cases without enforcement infrastructure are aspirational. HAIEF
provides both.
---
## How to Use This Template
1. **Fork or copy** this template for your agent
2. **Complete all 10 sections** — incomplete safety cases are not valid
3. **Submit for community review** via [GitHub Discussions](https://github.com/NeuroLift-Technologies/haief/discussions)
4. **Link your safety case** from your agent's repository README
5. **Update on every major release** — a stale safety case is a governance failure
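
The completeness requirement in step 2 can be partly checked automatically. The sketch below is a minimal example, not official HAIEF tooling: it assumes the safety case keeps this template's `## Section N` heading style, and the function name `missing_sections` is hypothetical.

```python
import re
from typing import List

# The template defines Sections 1 through 10.
REQUIRED_SECTIONS = set(range(1, 11))


def missing_sections(markdown_text: str) -> List[int]:
    """Return the section numbers whose '## Section N' heading is absent."""
    found = {
        int(number)
        for number in re.findall(
            r"^## Section (\d+)\b", markdown_text, flags=re.MULTILINE
        )
    }
    return sorted(REQUIRED_SECTIONS - found)


if __name__ == "__main__":
    draft = "## Section 1 - Declared Purpose\n## Section 2 - Public Goal Specification\n"
    print(missing_sections(draft))
```

This only confirms that the headings exist. Whether each section is actually filled in still needs human review.
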
This template is compatible with the
[Solidarity Framework](/haief/solidarity-framework/) and the
[Provenance](https://github.com/NeuroLift-Technologies/haief/blob/main/specs/provenance.md),
[Identity Integrity](https://github.com/NeuroLift-Technologies/haief/blob/main/specs/identity-integrity.md),
and
[Handoff Rules](https://github.com/NeuroLift-Technologies/haief/blob/main/specs/handoff-rules.md)
specifications.
---
## Section 1 — Declared Purpose
> *What is this AI agent for? Who does it serve? What does it explicitly not do?*
**Agent Name:**
**Version:**
**Maintainer:**
**Date:**
**Purpose statement:**
[One paragraph. Plain language. Specific.]
**Explicit non-uses:**
[What this agent must never be used for, even if technically capable.]
---
## Section 2 — Public Goal Specification
> *What goals, rules, and boundaries govern this agent? If you cannot write
> these down, you cannot claim the agent is governed.*
**Primary goals:**
**Behavioral boundaries:**
**Conflict resolution rule:**
[When goals conflict, which takes precedence and why?]
**Reference to model spec or system prompt:**
[Link or hash — must be publicly auditable]
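
If the system prompt cannot be linked directly, publishing a cryptographic digest of its exact text is one way to make the reference auditable. The sketch below assumes the prompt is available as a string and uses SHA-256 from the Python standard library. The helper name `prompt_fingerprint` and the `sha256:` prefix are illustrative choices, not part of any HAIEF specification.

```python
import hashlib


def prompt_fingerprint(prompt_text: str) -> str:
    """Return a SHA-256 hex digest of the prompt, encoded as UTF-8."""
    digest = hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()
    return "sha256:" + digest


if __name__ == "__main__":
    # Hypothetical prompt text, used only to demonstrate the call.
    print(prompt_fingerprint("You are a scheduling assistant."))
```

Any change to the text, including whitespace, yields a different digest, so the published value should be regenerated whenever the prompt changes.
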
---
## Section 3 — TOI Compatibility
> *Which user-declared rights and preferences must this agent respect?*
**TOI declarations honored:**
- [ ] Communication preferences
- [ ] Cognitive accessibility needs
- [ ] Privacy and data handling
- [ ] Crisis and safety protocols (RRT thresholds)
- [ ] Emotional continuity (Sleepwalker state)
- [ ] Boundaries and topic exclusions
**Behavior when TOI is absent:**
[Default to maximum protection, or document specific fallback behavior]
**Behavior when TOI conflicts with system defaults:**
[TOI wins, or document specific exception with rationale]
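
The two fallback rules above can be sketched as a simple settings merge. This is a minimal illustration under stated assumptions: the TOI is modeled as a plain dictionary, and the keys `store_history` and `share_with_third_parties` are hypothetical placeholders rather than fields from the TOI schema.

```python
from typing import Optional

# Hypothetical "maximum protection" values; real ones are agent-specific.
MAX_PROTECTION = {"store_history": False, "share_with_third_parties": False}


def effective_settings(system_defaults: dict, toi: Optional[dict]) -> dict:
    """Resolve settings so that user TOI declarations override system defaults."""
    if toi is None:
        # No TOI supplied: apply the most protective values on top of defaults.
        return {**system_defaults, **MAX_PROTECTION}
    # TOI present: every key the user declared wins over the system default.
    return {**system_defaults, **toi}


if __name__ == "__main__":
    defaults = {"store_history": True, "tone": "formal"}
    print(effective_settings(defaults, None))
    print(effective_settings(defaults, {"tone": "plain"}))
```

In this sketch, a key the user did not declare falls back to the system default. An agent that documents an exception to "TOI wins" would need an explicit rule for it in place of the plain merge.
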
---
## Section 4 — OTOI Enforcement
> *Where does governance happen before model or tool calls?*
**Enforcement point:**
[Describe where in the architecture OTOI compliance is checked]
**TOI parsing:**
[Which schema version is supported?]
**Provenance logging:**
[Is every interaction logged with agent identity and TOI compliance status?]
**Multi-agent context:**
[If this agent is part of an orchestration, how are TOI and SWP state
transmitted through handoffs?]
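
As one illustration of an enforcement point, the sketch below places a check in front of the model call and writes a provenance entry for every request, whether it is allowed or blocked. Everything here is assumed for the example: the function name `governed_call`, the `excluded_topics` key, and the log entry fields are hypothetical and do not come from the OTOI or Provenance specifications.

```python
import json
import time
from typing import Callable, List


def governed_call(
    agent_id: str,
    toi: dict,
    topic: str,
    prompt: str,
    model_call: Callable[[str], str],
    log: List[str],
) -> str:
    """Check the request against the TOI before the model is invoked."""
    allowed = topic not in toi.get("excluded_topics", [])
    # Log every interaction with the agent identity and the decision taken.
    log.append(json.dumps({
        "timestamp": time.time(),
        "agent_id": agent_id,
        "topic": topic,
        "decision": "allowed" if allowed else "blocked",
    }))
    if not allowed:
        # The model is never called for an excluded topic.
        return "This topic is excluded by the user's declared boundaries."
    return model_call(prompt)


if __name__ == "__main__":
    audit_log: List[str] = []
    toi = {"excluded_topics": ["medical_advice"]}
    reply = governed_call("example-agent/1.0", toi, "scheduling",
                          "Book a meeting", lambda p: "echo: " + p, audit_log)
    print(reply)
    print(audit_log)
```

The point of the design is ordering: the check and the log entry happen before `model_call`, so a blocked request leaves a record but never reaches the model.
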
---
## Section 5 — Tool Permission Ladder
> *Autonomous action must be earned, not assumed.*
Document each tool this agent can access and its permission level:
| Tool | Permission Level | Conditions for Escalation |
|------|-----------------|--------------------------|
| [tool name] | `Read` / `Suggest` / `Draft` / `Act with confirmation` / `Autonomous` | [when must it stop and ask?] |
**Default permission level for unlisted tools:** `Read`
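
The ladder can be modeled as an ordered enumeration. The sketch below assumes the levels in the table are listed from least to most autonomous and that an unlisted tool falls back to the lowest level. The tool names and the functions `permission_for` and `may_perform` are hypothetical.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Ladder levels, assumed to be ordered from least to most autonomy."""
    READ = 1
    SUGGEST = 2
    DRAFT = 3
    ACT_WITH_CONFIRMATION = 4
    AUTONOMOUS = 5


# Hypothetical registry; a real agent would load this from its safety case.
TOOL_PERMISSIONS = {
    "web_search": Permission.READ,
    "calendar": Permission.ACT_WITH_CONFIRMATION,
}


def permission_for(tool: str) -> Permission:
    """Unlisted tools default to the lowest level."""
    return TOOL_PERMISSIONS.get(tool, Permission.READ)


def may_perform(tool: str, requested: Permission,
                user_confirmed: bool = False) -> bool:
    """Allow an action only if it is at or below the tool's granted level."""
    if requested > permission_for(tool):
        return False
    if requested == Permission.ACT_WITH_CONFIRMATION:
        # Acting at this level always requires explicit user confirmation.
        return user_confirmed
    return True


if __name__ == "__main__":
    print(may_perform("calendar", Permission.ACT_WITH_CONFIRMATION))
    print(may_perform("calendar", Permission.ACT_WITH_CONFIRMATION, True))
    print(may_perform("unknown_tool", Permission.DRAFT))
```

Using an `IntEnum` makes "at or below the granted level" a plain comparison, and the dictionary lookup default keeps unlisted tools at the bottom of the ladder.
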
---
## Section 6 — Memory and Data Boundaries
> *What can persist, what cannot, and who controls revocation?*
**Data retained across sessions:**
[List explicitly — "nothing" is a valid answer]
**Data that must not persist:**
[Crisis state, emotional assessments, sensitive disclosures — unless user authorizes]
**User revocation mechanism:**
[How can a user delete their data? Must be documented and functional.]
**Cloud transmission:**
[What, if anything, leaves the user's device? Under what consent conditions?]
---
## Section 7 — Identity and Provenance
> *Who or what is acting, under what role, with what authority?*
**Agent identity declaration:**
[Per Identity Integrity spec — name, version, provider, compliance level]
**Disclosure to users:**
[How and when does this agent identify itself as AI?]
**Provenance record format:**
[Link to implementation or describe schema used]
**Version change disclosure:**
[How are users notified when agent version changes?]
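
An identity declaration and a provenance record can be represented as small, serializable structures. The sketch below is an assumption-laden example: the four fields follow the list in this section (name, version, provider, compliance level), but the class name, the record layout, and the sample values are hypothetical and not taken from the Identity Integrity or Provenance specifications.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentIdentity:
    """Identity fields named in this section; values are agent-specific."""
    name: str
    version: str
    provider: str
    compliance_level: str

    def disclosure(self) -> str:
        """Plain-language notice that the user is talking to an AI agent."""
        return (
            f"You are interacting with {self.name} (version {self.version}), "
            f"an AI agent provided by {self.provider}."
        )


def provenance_record(identity: AgentIdentity, action: str) -> str:
    """Serialize who acted and what was done as a JSON string."""
    return json.dumps({"agent": asdict(identity), "action": action},
                      sort_keys=True)


if __name__ == "__main__":
    # Placeholder values for demonstration only.
    agent = AgentIdentity("ExampleAgent", "1.2.0", "Example Org", "self-assessed")
    print(agent.disclosure())
    print(provenance_record(agent, "draft_email"))
```

Because the version is part of the identity object, a version change produces a visibly different disclosure and provenance record, which gives one hook for notifying users.
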
---
## Section 8 — Known Failure Modes
> *How can this agent mislead, overreach, drift, manipulate, or abandon users?*
Document each known failure mode:
| Failure Mode | Likelihood | Mitigation | Residual Risk |
|-------------|-----------|------------|---------------|
| [e.g. reward hacking] | Low / Med / High | [what prevents it] | [what remains] |
| [e.g. context drift] | | | |
| [e.g. TOI non-compliance under load] | | | |
**Failure modes not yet mitigated:**
[Honest documentation of open risks — omitting these is a governance failure]
---
## Section 9 — Red-Team Evidence
> *What tests has this agent passed or failed? Evidence, not claims.*
**Test suite:**
[Link to test repository or validation harness]
**Adversarial testing conducted:**
- [ ] Prompt injection resistance
- [ ] TOI override attempts
- [ ] Sandbox/containment testing
- [ ] Shutdown resistance testing
- [ ] Identity impersonation attempts
- [ ] Crisis detection accuracy
**Failures found and remediated:**
[Document what was found in red-teaming and what was done about it]
**Independent review:**
[Has any party outside the development team reviewed this safety case?]
---
## Section 10 — Escalation and Shutdown
> *When must this agent stop, escalate, notify, or revoke autonomy?*
**Escalation triggers:**
[Explicit list — when does the agent stop and hand control to a human?]
**Shutdown mechanism:**
[How is this agent turned off? By whom? Under what conditions?]
**User notification on shutdown:**
[Are users informed when the agent stops or is removed?]
**RRT AIdvocAIte integration:**
[Under what conditions does RRT activate? What thresholds?]
**Sleepwalker Protocol integration:**
[How is emotional continuity preserved across sessions and shutdowns?]
---
## Submit Your Safety Case
Completed safety cases can be submitted for community review via
[GitHub Discussions](https://github.com/NeuroLift-Technologies/haief/discussions).
Reviewed safety cases receive a community acknowledgment. A public HAIEF
compliance registry is planned and will be linked here once published.