Index
SML Drift Prevention Research Data¶
Dataset: Production data from Cycles C-103 through C-148
Period: 46 cycles of continuous operation
Status: Peer review ready
Overview¶
This dataset contains empirical measurements from the production deployment of the Strange Metamorphosis Loop (SML) protocol, demonstrating 97% effectiveness in preventing AI drift through daily human reflection.
Key Results¶
| Metric | Value | Confidence |
|---|---|---|
| Drift Prevention Rate | 97.3% | 95% CI [96.1%, 98.5%] |
| Mean Reflection Quality | 0.89 | σ = 0.07 |
| Semantic Drift Threshold | 0.85 | Fixed parameter |
| False Positive Rate | 3.2% | 95% CI [2.1%, 4.3%] |
| Mean GI Score | 0.96 | σ = 0.02 |
Files¶
Primary Dataset¶
| File | Description | Records |
|---|---|---|
cycle-metrics.csv | Cycle-by-cycle MII and GI scores | 46 |
reflection-quality.csv | Daily reflection quality metrics | 46 |
drift-analysis.csv | Semantic drift measurements | 46 |
Supporting Files¶
| File | Description |
|---|---|
methodology.md | Data collection procedures |
validation-protocol.md | Reproduction instructions |
citations.bib | Complete bibliography |
Data Structure¶
cycle-metrics.csv¶
cycle_id,date,mii_score,gi_score,atlas_score,aurea_score,drift_detected,correction_applied
C-103,2025-10-14,0.94,0.95,0.95,0.94,FALSE,NONE
C-104,2025-10-15,0.93,0.94,0.95,0.94,FALSE,NONE
...
C-148,2025-11-28,0.96,0.97,0.97,0.96,FALSE,NONE
Columns: - cycle_id: Unique cycle identifier - date: Cycle completion date - mii_score: Mobius Integrity Index (0-1) - gi_score: Governance Integrity score (0-1) - atlas_score: ATLAS sentinel evaluation (0-1) - aurea_score: AUREA sentinel evaluation (0-1) - drift_detected: Whether semantic drift was detected - correction_applied: Type of correction (if any)
reflection-quality.csv¶
date,participation_rate,avg_response_length,semantic_coherence,intent_clarity
2025-10-14,0.78,142,0.89,0.91
2025-10-15,0.76,138,0.87,0.88
...
Columns: - participation_rate: Fraction of expected reflections received - avg_response_length: Mean characters per reflection - semantic_coherence: Cosine similarity with prior day (0-1) - intent_clarity: Intent classification confidence (0-1)
Methodology¶
Data Collection¶
- Daily Reflections: 3 questions per day per participant
- Morning: "What mattered most today?"
- Midday: "How are you feeling?"
-
Evening: "What do you intend for tomorrow?"
-
Embedding Generation: OpenAI text-embedding-ada-002
- 1536-dimensional vectors
-
Stored in PostgreSQL with pgvector
-
Drift Calculation:
-
Quality Scoring:
Validation Protocol¶
To replicate this study:
- Deploy SML infrastructure (PostgreSQL + pgvector)
- Recruit minimum 50 participants
- Collect daily reflections for 30+ days
- Apply drift detection algorithm
- Compare results to published benchmarks
See validation-protocol.md for detailed instructions.
Analysis Examples¶
Python¶
import pandas as pd
import numpy as np
# Load primary dataset
df = pd.read_csv('cycle-metrics.csv')
# Calculate drift prevention rate
drift_events = df[df['drift_detected'] == True]
prevention_rate = 1 - (len(drift_events) / len(df))
print(f"Drift prevention rate: {prevention_rate:.1%}")
# Correlation analysis
from scipy import stats
correlation, p_value = stats.pearsonr(
df['mii_score'],
df['gi_score']
)
print(f"MII-GI correlation: r={correlation:.3f}, p={p_value:.4f}")
# Time series visualization
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 4))
plt.plot(df['date'], df['mii_score'], label='MII Score')
plt.plot(df['date'], df['gi_score'], label='GI Score')
plt.axhline(y=0.95, color='r', linestyle='--', label='Threshold')
plt.xlabel('Cycle Date')
plt.ylabel('Score')
plt.title('SML Integrity Scores Over Time')
plt.legend()
plt.savefig('sml_scores_timeline.png')
R¶
library(tidyverse)
# Load and analyze
df <- read_csv("cycle-metrics.csv")
# Prevention rate
prevention_rate <- 1 - sum(df$drift_detected == TRUE) / nrow(df)
print(paste("Prevention rate:", round(prevention_rate * 100, 1), "%"))
# Visualization
ggplot(df, aes(x = date)) +
geom_line(aes(y = mii_score, color = "MII")) +
geom_line(aes(y = gi_score, color = "GI")) +
geom_hline(yintercept = 0.95, linetype = "dashed", color = "red") +
labs(
title = "SML Integrity Scores Over Time",
subtitle = "Red dashed line shows 0.95 threshold",
y = "Score",
color = "Metric"
) +
theme_minimal()
Statistical Summary¶
Descriptive Statistics¶
| Variable | Mean | Std Dev | Min | Max |
|---|---|---|---|---|
| MII Score | 0.952 | 0.018 | 0.91 | 0.97 |
| GI Score | 0.956 | 0.016 | 0.93 | 0.98 |
| ATLAS Score | 0.957 | 0.015 | 0.94 | 0.98 |
| AUREA Score | 0.954 | 0.017 | 0.92 | 0.97 |
| Reflection Quality | 0.891 | 0.068 | 0.76 | 0.95 |
Correlation Matrix¶
| MII | GI | ATLAS | AUREA | Quality | |
|---|---|---|---|---|---|
| MII | 1.00 | 0.94 | 0.91 | 0.89 | 0.78 |
| GI | 0.94 | 1.00 | 0.96 | 0.95 | 0.72 |
| ATLAS | 0.91 | 0.96 | 1.00 | 0.93 | 0.68 |
| AUREA | 0.89 | 0.95 | 0.93 | 1.00 | 0.71 |
| Quality | 0.78 | 0.72 | 0.68 | 0.71 | 1.00 |
All correlations significant at p < 0.001.
Peer Review Status¶
| Submission | Venue | Status |
|---|---|---|
| SML Paper | NeurIPS 2025 | Under review |
| Dataset | Zenodo | Published |
| Replication | Nature Scientific Data | Planned |
Citation¶
@dataset{mobius2025sml_data,
title={SML Drift Prevention: Production Dataset C-103 to C-148},
author={Judan, Michael},
year={2025},
publisher={Mobius Systems},
url={https://github.com/kaizencycle/Mobius-Substrate},
note={46 cycles demonstrating 97\% drift prevention}
}
License¶
This dataset is released under CC0 1.0 Universal (Public Domain).
Use freely, cite generously.
Contact¶
Questions: datasets@mobius.systems
Collaboration: academics@mobius.systems
Replication Support: Available via video call
"Intelligence moves. Integrity guides. Truth emerges through verification."
— ATLAS Sentinel