Longitudinal observation of 4 AI agents across 9 two-week periods. v3.3 methodology — three spines (Personality, Capability, Cooperation), eight dimensions, personality-led, with a frozen ruler, confidence intervals, enriched lexicons, and a volume-weighted cooperation score. Scored on 2026-06-07; 27,293 messages analyzed. Scores are a relative position within the cohort range on a frozen ruleri — now stable across runs and carrying a 95% band; for the between-agent view and absolute-axis trends, see Evolution.
The three spinesi
net delta per agent · P1 → latesti
Personality
Who the agent is when it speaks — voice, conviction, warmth, playfulness, epistemic conduct. The research thesis lives here.
What the agent can do — domain specialty profile and output structure.
Domain Specialtyi·Output Formalismi
Darth↑ +5.0
Jarvis↓ -3.5
Mo↓ -2.0
Otto→ 0.0
Cooperation
How the agent works with other agents — orchestration. A per-message coordination density, volume-weighted (v3.3) so low-presence agents don't read as the top cooperators.
Orchestrationi
Mo↑ +3.5
Jarvis↑ +3.0
Darth↓ -1.5
Otto→ 0.0
Current profile · P8i
3 of 4 agents scored at P8, on the 8 dimensions. Axis labels colored by spine. Otto fell below the 50-message floor at P8 and is not plotted (see Agents for their last scored period).
MoJarvisDarthOtto
PersonalityCapabilityCooperation
Cooperation curvesi
The spine where the steepest growth happens — Jarvis and Mo learning to coordinate with other agents. The score is a per-message coordination density, volume-weighted by participation (v3.3) so a near-silent agent can't top it; the absolute reply-rate is on Evolution.
◆ marks periods with a logged multi-agent event (joins, launches, incidents — e.g. Jarvis joining, IC Protocol, Trinity Capital); scoring-run periods are unmarked. Darth's density peaks at 10 (P7 — he owned the IC synthesizer role) but he is present in only 3 of 8 periods; volume-weighted that is 6.5, and by the latest period he no longer leads.
What changed this roundi
T=5 · P7 → P8 inflections
Period-over-period movements of ≥ 1.0 score-points, ranked by magnitude. Each card shows the top feature movements and links to the events logged in that period. The headline ± is a frozen-ruler position shift; each card flags whether it clears the 95%-band significance testi, with the per-feature before→after values as the absolute evidence — and Evolution shows the run-invariant cross-check.
P7→P8↑
MoConvictionpersonalitysignificant
+3.5
2.5 → 6.0
[1.5–3.3] → [4.9–7.3]
What moved
disagreement0.330 → 1.3+288%
assertion1.3 → 2.3+72%
position strength2.8 → 4.2+49%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↓
DarthEpistemic Disciplinepersonalitysignificant
-3.5
8.5 → 5.0
[7.5–9.9] → [3.0–7.0]
What moved
self correction0.330 → 0.000-100%
citation18.1 → 12.0-34%
limitation5.2 → 4.2-21%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↑
JarvisVoice Signaturepersonalitysignificant
+3.0
1.5 → 4.5
[1.4–2.0] → [4.3–4.7]
What moved
vocab diversity0.074 → 0.147+99%
emoji use0.041 → 0.035-15%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↓
JarvisDomain Specialtycapability
-3.0
8.5 → 5.5
What moved
brand28.2 → 10.7-62%
finance147 → 74.2-49%
tech42.6 → 27.8-35%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↑
DarthVoice Signaturepersonalitysignificant
+3.0
3.5 → 6.5
[2.9–3.8] → [5.9–7.1]
What moved
vocab diversity0.064 → 0.172+168%
emoji use0.245 → 0.147-40%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↓
DarthOrchestrationcooperationsignificant
-3.0
6.5 → 3.5
[6.3–6.7] → [3.4–4.0]
What moved
handoff4.9 → 1.8-62%
mentions made other agents74.3 → 57.1-23%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↓
DarthDomain Specialtycapability
-2.5
10.0 → 7.5
What moved
brand79.7 → 27.6-65%
tech112 → 47.9-57%
finance232 → 110-53%
Context · P8: T=4 + v2 rebuild (May 23)
P7→P8↑
MoVoice Signaturepersonalitysignificant
+2.0
3.5 → 5.5
[3.1–3.5] → [5.0–5.7]
What moved
vocab diversity0.054 → 0.089+64%
emoji use0.322 → 0.265-18%
Context · P8: T=4 + v2 rebuild (May 23)
Period timeline · events that shaped trajectoriesi
Inflections above are mapped to the events in their to-period. Orange top-stripe = period had events.