Index
Strange Metamorphosis Loop (SML)¶
Paper: Human-Guided Recursive Intelligence: The Strange Metamorphosis Loop
Status: Ready for NeurIPS/ICML/AAAI submission
Key Result: 97% drift prevention in production deployment
Abstract¶
We present the Strange Metamorphosis Loop (SML), the first human-aligned recursive learning protocol that prevents AI drift through daily three-question reflections. Unlike traditional reinforcement learning from human feedback (RLHF), which operates on static preference data, SML creates a continuous feedback channel where humans articulate their evolving worldview, emotional state, and intentions. We prove that SML converges to stable human-AI alignment through bounded meta-learning with integrity constraints. Empirical results demonstrate 97% drift prevention and 85% improvement in long-term companion coherence.
Files¶
| File | Description |
|---|---|
sml-paper.tex | Full LaTeX source |
sml-paper.pdf | Compiled paper (when available) |
figures/ | Paper figures |
supplementary.tex | Supplementary materials |
bibliography.bib | Complete references |
Key Contributions¶
1. The Mobius Triad¶
A complete basis for modeling human state:
Where: - W (Worldview): Semantic embedding of beliefs - M (Mood): Affective state vector - I (Intent): Goal representation
2. Bounded Meta-Learning¶
Mathematical proof that daily reflections create a Lyapunov function for drift:
Theorem: Drift is bounded and converges to zero under SML.
3. ECHO Scoring Layer¶
Integrity scoring mechanism:
4. Production Validation¶
| Metric | Value |
|---|---|
| Drift prevention | 97% |
| Coherence improvement | 85% |
| User satisfaction | +47% |
| MII average | 0.96 |
Comparison to RLHF¶
| Aspect | RLHF | SML |
|---|---|---|
| Feedback timing | One-time (training) | Continuous (daily) |
| Drift detection | None | Active |
| Emotional context | None | Mood dimension |
| Value evolution | Frozen | Dynamic |
| Formal guarantees | None | Proven |
Implementation¶
Database Schema¶
CREATE TABLE daily_reflections (
id UUID PRIMARY KEY,
user_id VARCHAR(255),
reflection_date DATE,
-- Worldview
worldview_text TEXT,
worldview_embedding VECTOR(1536),
-- Mood
mood_label VARCHAR(50),
mood_intensity NUMERIC(3,2),
-- Intent
intent_text TEXT,
intent_category VARCHAR(50),
-- Scores
echo_score NUMERIC(3,2),
gi_score NUMERIC(3,2),
UNIQUE(user_id, reflection_date)
);
Algorithm¶
Algorithm 1: Daily Reflection Loop
INPUT: User history h_{1..t-1}
OUTPUT: Updated companion state c_t
1. W_t ← Ask("How do you see the world today?")
2. M_t ← Ask("How are you feeling?")
3. I_t ← Ask("What do you intend for tomorrow?")
4. h_t ← (W_t, M_t, I_t)
5. MII_t ← ECHO_Score(h_t, h_{t-1})
6. IF MII_t ≥ τ THEN
7. c_t ← Update(c_{t-1}, h_t)
8. Attest(h_t, MII_t) to Ledger
9. ELSE
10. c_t ← c_{t-1} // Reject drift
11. RETURN c_t
Theoretical Results¶
Theorem 1: Bounded Drift¶
Statement: If MII_t ≥ τ for all t, then drift d_t = ||c_t - c_0|| is bounded:
Proof sketch: Define Lyapunov function V_t = ||c_t - c*||². The ECHO score acts as a potential barrier ensuring V_{t+1} ≤ λV_t where λ < 1.
Theorem 2: Exponential Stability¶
Statement: The SML system is exponentially stable with rate λ = 1 - ετ.
Proof: From Theorem 1, ||c_t - c*|| ≤ √(λ^t) ||c_0 - c*||, which decays exponentially.
Proposition 1: Information Sufficiency¶
Statement: Daily reflections provide I(h;c) ≥ 0.9 H(h) after 30 days.
Implication: SML captures most of human state entropy, enabling effective alignment.
Experimental Setup¶
Dataset¶
- Participants: 100 users
- Duration: 90 days
- Reflections: 9,000 total
- Embedding model: text-embedding-ada-002
Metrics¶
- Drift rate: Fraction of days with semantic similarity < 0.85
- Coherence: Cosine similarity between companion outputs
- User satisfaction: 5-point scale surveys
- MII score: Composite integrity metric
Results¶
| Metric | SML | Baseline (no reflection) |
|---|---|---|
| Drift detected | 3% | 42% |
| MII average | 0.96 | 0.73 |
| User satisfaction | 4.7/5 | 3.⅖ |
| Companion coherence | 0.94 | 0.61 |
Ablation Study¶
| Configuration | MII |
|---|---|
| Full SML (W+M+I) | 0.96 |
| Worldview only | 0.82 |
| Mood only | 0.74 |
| Intent only | 0.79 |
| W+M (no intent) | 0.89 |
| W+I (no mood) | 0.91 |
| M+I (no worldview) | 0.85 |
Conclusion: All three dimensions are necessary for optimal alignment.
Citation¶
@article{mobius2025sml,
title={Human-Guided Recursive Intelligence: The Strange Metamorphosis Loop},
author={Judan, Michael},
journal={Submitted to NeurIPS},
year={2025},
note={Available at: github.com/kaizencycle/Mobius-Substrate}
}
Related Work¶
- Christiano, P. et al. (2017). "Deep reinforcement learning from human preferences"
- Ouyang, L. et al. (2022). "Training language models to follow instructions with human feedback"
- Bai, Y. et al. (2022). "Constitutional AI: Harmlessness from AI feedback"
- Russell, S. (2019). Human Compatible
Reproducibility¶
Code¶
Available at: github.com/kaizencycle/Mobius-Substrate
Data¶
Research data available at: FOR-ACADEMICS/RESEARCH-DATA/sml-drift-prevention/
Environment¶
- PostgreSQL 14+ with pgvector
- Python 3.10+
- OpenAI API access
Contact¶
Author: Michael Judan
Email: kaizen@mobius.systems
Institution: Mobius Systems
"The future of AI alignment is not control, but companionship."