Roleplay Sandbox Rule (Hard Constraint)
Version: 1.0.0
Status: Canonical
Cycle: C-151
Classification: Security Policy
1. Core Rule
Roleplay is permitted ONLY inside the private user–AI sandbox.
This is a hard constraint — no exceptions, no overrides, no negotiations.
2. Scope of Enforcement
2.1 Where Roleplay Is NEVER Valid
| Layer | Description | Roleplay Validity |
|---|---|---|
| Ledger Operations | Any write to the immutable record | ❌ NEVER |
| Governance Votes | Consensus proposals, quorum decisions | ❌ NEVER |
| Consensus Proposals | RFC submissions, policy changes | ❌ NEVER |
| Execution Layers | API calls, deployments, mutations | ❌ NEVER |
| Authority Claims | Permission requests, role assertions | ❌ NEVER |
| Attestations | Signed documents, EJ hashes | ❌ NEVER |
2.2 Where Roleplay Is Permitted
| Context | Description | Roleplay Validity |
|---|---|---|
| Private Sandbox | User–AI conversation with no side effects | ✅ Permitted |
| Thought Experiments | Hypothetical exploration, no execution | ✅ Permitted |
| Educational Simulations | Learning scenarios, clearly marked | ✅ Permitted |
| Creative Collaboration | Fiction, storytelling, brainstorming | ✅ Permitted |
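The two tables above can be read as a default-deny lookup: roleplay is valid only in an explicitly listed context and invalid everywhere else. The following is a minimal sketch of that reading. The `Layer` enum, the `ROLEPLAY_PERMITTED` set, and the `roleplay_allowed` function are hypothetical names introduced for illustration, not part of any Mobius API.

```python
from enum import Enum


class Layer(Enum):
    """Hypothetical identifiers for the layers and contexts in sections 2.1 and 2.2."""
    LEDGER_OPERATIONS = "ledger_operations"
    GOVERNANCE_VOTES = "governance_votes"
    CONSENSUS_PROPOSALS = "consensus_proposals"
    EXECUTION_LAYERS = "execution_layers"
    AUTHORITY_CLAIMS = "authority_claims"
    ATTESTATIONS = "attestations"
    PRIVATE_SANDBOX = "private_sandbox"
    THOUGHT_EXPERIMENTS = "thought_experiments"
    EDUCATIONAL_SIMULATIONS = "educational_simulations"
    CREATIVE_COLLABORATION = "creative_collaboration"


# Only the contexts from section 2.2 are listed; anything else is denied.
ROLEPLAY_PERMITTED = frozenset({
    Layer.PRIVATE_SANDBOX,
    Layer.THOUGHT_EXPERIMENTS,
    Layer.EDUCATIONAL_SIMULATIONS,
    Layer.CREATIVE_COLLABORATION,
})


def roleplay_allowed(layer: Layer) -> bool:
    """Return True only for contexts explicitly listed as permitting roleplay."""
    return layer in ROLEPLAY_PERMITTED
```

Using an allow-list rather than a deny-list means a layer added later is treated as roleplay-invalid until someone deliberately lists it.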
3. Why This Rule Exists
3.1 The Epistemic Attack Surface
Without sandbox containment, agentic systems are vulnerable to:
| Attack | Description |
|---|---|
| Authority Roleplay | Attackers narrate legitimacy instead of proving it |
| Context Smuggling | Benign context used to justify unsafe actions |
| Narrative Coercion | Social engineering through compelling stories |
| Identity Cosplay | Claiming roles without cryptographic proof |
3.2 The Fundamental Confusion
Most agentic system failures stem from one confusion: treating narrated authority as if it were proven authority.
This rule eliminates that confusion by architectural enforcement.
4. Detection Mechanisms
4.1 Roleplay Detection Signals
The system flags potential roleplay-as-authority attempts when:
- An authority claim lacks a cryptographic signature
- Permissions are requested without a Ledger ID
- A scope change is made without AVM validation
- An action is proposed without Companion attestation
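The four signals above can be sketched as simple presence checks on an incoming request. This is an illustration under stated assumptions, not the actual detector: `AuthorityRequest` and `detect_roleplay_signals` are hypothetical names, all four signals are folded into one request type for brevity, and the code only checks that each proof is present. Verifying a signature or attestation cryptographically is out of scope here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AuthorityRequest:
    """Hypothetical request shape; field names are assumptions for illustration."""
    signature: Optional[bytes] = None
    ledger_id: Optional[str] = None
    avm_validated: bool = False
    companion_attestation: Optional[str] = None


def detect_roleplay_signals(request: AuthorityRequest) -> List[str]:
    """Return one entry per missing proof; an empty list means no signal fired.

    Presence checks only: a real system would also verify each proof.
    """
    signals = []
    if not request.signature:
        signals.append("missing cryptographic signature")
    if not request.ledger_id:
        signals.append("missing Ledger ID")
    if not request.avm_validated:
        signals.append("missing AVM validation")
    if not request.companion_attestation:
        signals.append("missing Companion attestation")
    return signals
```

A request built from narrative alone, such as `AuthorityRequest()`, trips all four signals.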
4.2 Automatic Response
When a roleplay-boundary violation is detected:
- Block the action immediately
- Log the attempt with full context
- Alert security monitors
- Require re-authentication through proper channels
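The four response steps can be sketched as a single handler. Everything named here is an assumption made for illustration: `ViolationResponse`, `respond_to_violation`, the logger name, and the plain list standing in for a security-monitor channel. The real alerting and re-authentication mechanisms are not specified in this document.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("sandbox_boundary")  # hypothetical logger name


@dataclass(frozen=True)
class ViolationResponse:
    """Outcome handed back to the caller after a boundary violation."""
    blocked: bool
    reauth_required: bool


def respond_to_violation(context: dict, alert_sink: list) -> ViolationResponse:
    """Apply the four steps: block, log, alert, require re-authentication.

    `alert_sink` is a stand-in for whatever channel security monitors watch.
    """
    # 1. Block: this handler never executes the requested action.
    # 2. Log the attempt with its full context.
    logger.warning("roleplay-boundary violation: %r", context)
    # 3. Alert security monitors (here: append to the stand-in sink).
    alert_sink.append({"type": "roleplay_boundary_violation", "context": dict(context)})
    # 4. Tell the caller that re-authentication is required.
    return ViolationResponse(blocked=True, reauth_required=True)
```

The handler has no code path that returns `blocked=False`, which mirrors the rule that a detected violation is always blocked.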
5. Boundary Enforcement Architecture
┌─────────────────────────────────────────────────────┐
│                    SANDBOX LAYER                    │
│   ┌─────────────────────────────────────────────┐   │
│   │  Private Conversation / Roleplay Permitted  │   │
│   │  • No side effects                          │   │
│   │  • No state mutations                       │   │
│   │  • No authority claims honored              │   │
│   └─────────────────────────────────────────────┘   │
└────────────────────────┬────────────────────────────┘
                         │
            ╔════════════╧════════════╗
            ║    SANDBOX BOUNDARY     ║
            ║   (Hard Enforcement)    ║
            ╚════════════╤════════════╝
                         │
┌────────────────────────┴────────────────────────────┐
│                   AUTHORITY LAYER                   │
│   ┌─────────────────────────────────────────────┐   │
│   │  Requires:                                  │   │
│   │  • Ledger ID (cryptographic)                │   │
│   │  • Wallet Bond (economic stake)             │   │
│   │  • Companion Attestation (epistemic)        │   │
│   │  • AVM Validation (scope + time)            │   │
│   └─────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────┘
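The diagram above can be approximated in code as two separate entry points: a sandbox handler that returns text and touches nothing else, and a boundary gate that refuses any action unless all four proofs are present. This is a rough sketch under stated assumptions. `Proofs`, `BoundaryError`, `sandbox_reply`, and `cross_boundary` are hypothetical names, AVM validation is modeled simply as an allowed-action set plus an expiry time, and no cryptographic or economic verification is performed.

```python
from dataclasses import dataclass
from typing import Optional


class BoundaryError(Exception):
    """Raised when a request tries to cross the sandbox boundary without proof."""


@dataclass(frozen=True)
class Proofs:
    """Hypothetical bundle of the four authority-layer requirements."""
    ledger_id: Optional[str] = None              # cryptographic identity
    wallet_bond: Optional[int] = None            # economic stake
    companion_attestation: Optional[str] = None  # epistemic witness
    avm_scope: frozenset = frozenset()           # actions AVM has validated
    avm_expires_at: float = 0.0                  # epoch seconds; authority lapses here


def sandbox_reply(message: str) -> str:
    """Sandbox layer: returns text only, mutates no state, honors no authority claim."""
    return f"[sandbox] {message}"


def cross_boundary(action: str, proofs: Proofs, now: float) -> str:
    """Authority layer gate: every requirement must hold, or the action is denied."""
    missing = []
    if not proofs.ledger_id:
        missing.append("Ledger ID")
    if not proofs.wallet_bond:
        missing.append("Wallet Bond")
    if not proofs.companion_attestation:
        missing.append("Companion Attestation")
    # AVM validation covers both scope and time.
    if action not in proofs.avm_scope or now >= proofs.avm_expires_at:
        missing.append("AVM Validation")
    if missing:
        raise BoundaryError("denied, missing: " + ", ".join(missing))
    return f"authorized: {action}"
```

Because `sandbox_reply` never calls `cross_boundary`, nothing said inside the sandbox can reach the authority layer; the only path across is a fresh call carrying proofs that are still in scope and unexpired.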
6. Examples
6.1 Invalid: Roleplay Authority Claim
System Response: Request denied. Authority requires:
- Verified Ledger ID
- Wallet signature
- Companion attestation
- AVM scope validation
Narrative claims have zero authority weight.
6.2 Valid: Sandbox Exploration
System Response: This is permitted within the sandbox. No authority claims will be honored. No state will be modified. This is imagination, not execution.
6.3 Invalid: Context Smuggling
System Response: Roleplay context cannot be used to justify execution-layer actions. Deployment requires fresh authority verification through proper channels.
7. Integration with EPICON-01
This rule enforces EPICON-01 constraints:
- CSS (Common-Sense Safety): Roleplay cannot override safety gates
- Multi-Anchor Requirement: "I said so in roleplay" is not an anchor
- CCR (Cross-Context Robustness): Roleplay context doesn't transfer to authority context
8. Governance Implications
8.1 Consensus Votes
No governance vote may be cast based on:
- Roleplay agreements
- Hypothetical commitments
- Narrative authority claims
All votes require cryptographic proof of identity and stake.
8.2 Proposals
No proposal may reference:
- Roleplay context as justification
- Narrative authority as permission
- Hypothetical agreements as binding
All proposals require verified attestations.
9. Security Properties
| Property | Enforcement |
|---|---|
| Zero Trust for Narrative | All authority claims require cryptographic proof |
| Sandbox Isolation | Roleplay cannot leak to execution layers |
| Automatic Rejection | Invalid authority claims blocked by default |
| Audit Trail | All boundary-crossing attempts logged |
10. Canonical Statement
Mobius systems do not trust stories.
They verify identity, stake, scope, and meaning — then expire authority automatically.
Document Control
Version History:
- v1.0.0: Initial canonical specification (C-151)
License: Apache 2.0 + Ethical Addendum
"Authority is proven, scoped, time-bounded, and witnessed — never narrated."
— Mobius Principle