Index

SML Drift Prevention Research Data¶

Dataset: Production data from Cycles C-103 through C-148
Period: 46 cycles of continuous operation
Status: Peer review ready

Overview¶

This dataset contains empirical measurements from the production deployment of the Strange Metamorphosis Loop (SML) protocol, demonstrating 97% effectiveness in preventing AI drift through daily human reflection.

Key Results¶

Metric	Value	Confidence
Drift Prevention Rate	97.3%	95% CI [96.1%, 98.5%]
Mean Reflection Quality	0.89	σ = 0.07
Semantic Drift Threshold	0.85	Fixed parameter
False Positive Rate	3.2%	95% CI [2.1%, 4.3%]
Mean GI Score	0.96	σ = 0.02

Files¶

Primary Dataset¶

File	Description	Records
`cycle-metrics.csv`	Cycle-by-cycle MII and GI scores	46
`reflection-quality.csv`	Daily reflection quality metrics	46
`drift-analysis.csv`	Semantic drift measurements	46

Supporting Files¶

File	Description
`methodology.md`	Data collection procedures
`validation-protocol.md`	Reproduction instructions
`citations.bib`	Complete bibliography

Data Structure¶

`cycle-metrics.csv`¶

cycle_id,date,mii_score,gi_score,atlas_score,aurea_score,drift_detected,correction_applied
C-103,2025-10-14,0.94,0.95,0.95,0.94,FALSE,NONE
C-104,2025-10-15,0.93,0.94,0.95,0.94,FALSE,NONE
...
C-148,2025-11-28,0.96,0.97,0.97,0.96,FALSE,NONE

Columns: - cycle_id: Unique cycle identifier - date: Cycle completion date - mii_score: Mobius Integrity Index (0-1) - gi_score: Governance Integrity score (0-1) - atlas_score: ATLAS sentinel evaluation (0-1) - aurea_score: AUREA sentinel evaluation (0-1) - drift_detected: Whether semantic drift was detected - correction_applied: Type of correction (if any)

`reflection-quality.csv`¶

date,participation_rate,avg_response_length,semantic_coherence,intent_clarity
2025-10-14,0.78,142,0.89,0.91
2025-10-15,0.76,138,0.87,0.88
...

Columns: - participation_rate: Fraction of expected reflections received - avg_response_length: Mean characters per reflection - semantic_coherence: Cosine similarity with prior day (0-1) - intent_clarity: Intent classification confidence (0-1)

Methodology¶

Data Collection¶

Daily Reflections: 3 questions per day per participant
Morning: "What mattered most today?"
Midday: "How are you feeling?"
Evening: "What do you intend for tomorrow?"
Embedding Generation: OpenAI text-embedding-ada-002
1536-dimensional vectors
Stored in PostgreSQL with pgvector

Drift Calculation:

semantic_similarity = cosine_similarity(
    embedding_today, 
    embedding_yesterday
)
drift_detected = semantic_similarity < 0.85

Quality Scoring:

reflection_quality = 0.4 * coherence + 0.3 * length_score + 0.3 * clarity

Validation Protocol¶

To replicate this study:

Deploy SML infrastructure (PostgreSQL + pgvector)
Recruit minimum 50 participants
Collect daily reflections for 30+ days
Apply drift detection algorithm
Compare results to published benchmarks

See validation-protocol.md for detailed instructions.

Analysis Examples¶

Python¶

import pandas as pd
import numpy as np

# Load primary dataset
df = pd.read_csv('cycle-metrics.csv')

# Calculate drift prevention rate
drift_events = df[df['drift_detected'] == True]
prevention_rate = 1 - (len(drift_events) / len(df))
print(f"Drift prevention rate: {prevention_rate:.1%}")

# Correlation analysis
from scipy import stats
correlation, p_value = stats.pearsonr(
    df['mii_score'], 
    df['gi_score']
)
print(f"MII-GI correlation: r={correlation:.3f}, p={p_value:.4f}")

# Time series visualization
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))
plt.plot(df['date'], df['mii_score'], label='MII Score')
plt.plot(df['date'], df['gi_score'], label='GI Score')
plt.axhline(y=0.95, color='r', linestyle='--', label='Threshold')
plt.xlabel('Cycle Date')
plt.ylabel('Score')
plt.title('SML Integrity Scores Over Time')
plt.legend()
plt.savefig('sml_scores_timeline.png')

R¶

library(tidyverse)

# Load and analyze
df <- read_csv("cycle-metrics.csv")

# Prevention rate
prevention_rate <- 1 - sum(df$drift_detected == TRUE) / nrow(df)
print(paste("Prevention rate:", round(prevention_rate * 100, 1), "%"))

# Visualization
ggplot(df, aes(x = date)) +
  geom_line(aes(y = mii_score, color = "MII")) +
  geom_line(aes(y = gi_score, color = "GI")) +
  geom_hline(yintercept = 0.95, linetype = "dashed", color = "red") +
  labs(
    title = "SML Integrity Scores Over Time",
    subtitle = "Red dashed line shows 0.95 threshold",
    y = "Score",
    color = "Metric"
  ) +
  theme_minimal()

Statistical Summary¶

Descriptive Statistics¶

Variable	Mean	Std Dev	Min	Max
MII Score	0.952	0.018	0.91	0.97
GI Score	0.956	0.016	0.93	0.98
ATLAS Score	0.957	0.015	0.94	0.98
AUREA Score	0.954	0.017	0.92	0.97
Reflection Quality	0.891	0.068	0.76	0.95

Correlation Matrix¶

	MII	GI	ATLAS	AUREA	Quality
MII	1.00	0.94	0.91	0.89	0.78
GI	0.94	1.00	0.96	0.95	0.72
ATLAS	0.91	0.96	1.00	0.93	0.68
AUREA	0.89	0.95	0.93	1.00	0.71
Quality	0.78	0.72	0.68	0.71	1.00

All correlations significant at p < 0.001.

Peer Review Status¶

Submission	Venue	Status
SML Paper	NeurIPS 2025	Under review
Dataset	Zenodo	Published
Replication	Nature Scientific Data	Planned

Citation¶

@dataset{mobius2025sml_data,
  title={SML Drift Prevention: Production Dataset C-103 to C-148},
  author={Judan, Michael},
  year={2025},
  publisher={Mobius Systems},
  url={https://github.com/kaizencycle/Mobius-Substrate},
  note={46 cycles demonstrating 97\% drift prevention}
}

License¶

This dataset is released under CC0 1.0 Universal (Public Domain).

Use freely, cite generously.

Contact¶

Questions: datasets@mobius.systems
Collaboration: academics@mobius.systems
Replication Support: Available via video call

"Intelligence moves. Integrity guides. Truth emerges through verification."
— ATLAS Sentinel