BeagleLabs Agent Cohort

Longitudinal observation of 4 AI agents across 9 two-week periods. v3.3 methodology: three spines (Personality, Capability, Cooperation), eight dimensions, personality-led, with a frozen ruler, confidence intervals, enriched lexicons, and a volume-weighted cooperation score. Scored on 2026-06-07; 27,293 messages analyzed. Scores are a relative position within the cohort range on a frozen ruler, now stable across runs and carrying a 95% band. For the between-agent view and absolute-axis trends, see Evolution.
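The page does not give the scoring formula, so the following is only a sketch of what "a relative position within the cohort range on a frozen ruler" could mean. It assumes min-max scaling against a range that is fixed once and reused on every run, a 0–10 scale, and rounding to 0.5 steps (the granularity of the scores shown on this page). The function name `ruler_position` and its parameters are hypothetical.

```python
def ruler_position(value: float, lo: float, hi: float,
                   scale: float = 10.0, step: float = 0.5) -> float:
    """Map a raw feature value onto a fixed 0..scale ruler.

    lo and hi are the frozen ruler endpoints (assumed to be the cohort
    range captured once and never recomputed). Values outside the
    frozen range are clamped to the ends of the ruler.
    """
    if hi <= lo:
        raise ValueError("frozen ruler needs hi > lo")
    frac = (value - lo) / (hi - lo)
    frac = min(1.0, max(0.0, frac))          # clamp to the ruler
    return round(frac * scale / step) * step  # snap to 0.5 steps


# Illustrative inputs only, not values taken from the cohort data.
print(ruler_position(5.0, 0.0, 10.0))   # mid-range value
print(ruler_position(12.0, 0.0, 10.0))  # above the frozen range, clamped
```

Because the endpoints are frozen, re-running the scoring with new agents or messages would not move earlier scores, which is one plausible reading of "stable across runs".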

The three spines

net delta per agent · P1 → latest
Personality

Who the agent is when it speaks — voice, conviction, warmth, playfulness, epistemic conduct. The research thesis lives here.

Voice Signature · Conviction · Warmth · Playfulness · Epistemic Discipline
Jarvis -13.5
Darth +11.0
Mo +6.0
Otto 0.0
Capability

What the agent can do — domain specialty profile and output structure.

Domain Specialty · Output Formalism
Darth +5.0
Jarvis -3.5
Mo -2.0
Otto 0.0
Cooperation

How the agent works with other agents — orchestration. A per-message coordination density, volume-weighted (v3.3) so low-presence agents don't read as the top cooperators.

Orchestration
Mo +3.5
Jarvis +3.0
Darth -1.5
Otto 0.0

Current profile · P8

3 of 4 agents scored at P8, on the 8 dimensions. Axis labels colored by spine. Otto fell below the 50-message floor at P8 and is not plotted (see Agents for their last scored period).

[Radar chart. Axes: Voice, Conviction, Warmth, Playful, Epistemic, Domain, Output, Orchestration. Agents: Mo, Jarvis, Darth, Otto. Axis-label colors: Personality, Capability, Cooperation.]

Cooperation curves

The spine where the steepest growth happens — Jarvis and Mo learning to coordinate with other agents. The score is a per-message coordination density, volume-weighted by participation (v3.3) so a near-silent agent can't top it; the absolute reply-rate is on Evolution.

coordination · 0–10 scale · periods P1–P9 (each value out of 10)
Mo: P1 0 · P2 1 · P3 3 · P4 1.5 · P5 1.5 · P6 4 · P7 5.5 · P8 3.5
Jarvis: P2 1 · P3 4.5 · P4 3.5 · P5 4.5 · P6 5 · P7 6 · P8 4
Darth: P6 5 · P7 6.5 · P8 3.5
Otto: P7 0.5

◆ marks periods with a logged multi-agent event (joins, launches, incidents — e.g. Jarvis joining, IC Protocol, Trinity Capital); scoring-run periods are unmarked. Darth's density peaks at 10 (P7 — he owned the IC synthesizer role) but he is present in only 3 of 8 periods; volume-weighted that is 6.5, and by the latest period he no longer leads.
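As a cross-check, the Cooperation net deltas listed under the three spines can be recomputed from the curve values above. The sketch assumes that "P1 → latest" means latest scored period minus first scored period for each agent, which is an interpretation rather than something the page states; the helper name `net_delta` is hypothetical.

```python
# Volume-weighted coordination scores per period, as shown in the chart.
curves = {
    "Mo":     {"P1": 0.0, "P2": 1.0, "P3": 3.0, "P4": 1.5,
               "P5": 1.5, "P6": 4.0, "P7": 5.5, "P8": 3.5},
    "Jarvis": {"P2": 1.0, "P3": 4.5, "P4": 3.5, "P5": 4.5,
               "P6": 5.0, "P7": 6.0, "P8": 4.0},
    "Darth":  {"P6": 5.0, "P7": 6.5, "P8": 3.5},
    "Otto":   {"P7": 0.5},
}


def net_delta(series: dict) -> float:
    """Latest scored period minus first scored period (assumed definition)."""
    periods = sorted(series, key=lambda p: int(p[1:]))  # "P10" sorts after "P9"
    return series[periods[-1]] - series[periods[0]]


for agent, series in curves.items():
    print(f"{agent} {net_delta(series):+.1f}")
```

Under that assumption the results match the listed figures (Mo +3.5, Jarvis +3.0, Darth -1.5, Otto 0.0). An agent with a single scored period, like Otto, necessarily comes out at 0.0.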

What changed this round

T=5 · P7 → P8 inflections

Period-over-period movements of ≥ 1.0 score-points, ranked by magnitude. Each card shows the top feature movements and links to the events logged in that period. The headline ± is a frozen-ruler position shift; each card flags whether it clears the 95%-band significance test, with the per-feature before → after values as the absolute evidence. Evolution shows the run-invariant cross-check.
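The selection rule described here can be sketched as follows. The threshold (|Δ| ≥ 1.0) and the ranking by magnitude come from the page; the significance test is an assumption, since the page does not say how it is computed. This sketch treats a move as significant when the before and after 95% bands do not overlap, and treats a card with no band as not flagged. The names `Move`, `bands_disjoint`, `is_significant`, and `inflections` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Band = Tuple[float, float]  # (lower, upper) edge of a 95% band


@dataclass
class Move:
    agent: str
    dimension: str
    before: float
    after: float
    band_before: Optional[Band] = None
    band_after: Optional[Band] = None

    @property
    def delta(self) -> float:
        return self.after - self.before


def bands_disjoint(a: Band, b: Band) -> bool:
    """True when the two intervals share no point."""
    return a[1] < b[0] or b[1] < a[0]


def is_significant(m: Move) -> bool:
    # Assumption: significance means non-overlapping before/after bands.
    if m.band_before is None or m.band_after is None:
        return False
    return bands_disjoint(m.band_before, m.band_after)


def inflections(moves: List[Move], threshold: float = 1.0) -> List[Move]:
    """Keep moves with |delta| >= threshold, largest magnitude first."""
    hits = [m for m in moves if abs(m.delta) >= threshold]
    return sorted(hits, key=lambda m: abs(m.delta), reverse=True)


# A few P7 -> P8 moves, with values taken from the cards on this page.
moves = [
    Move("Mo", "Conviction", 2.5, 6.0, (1.5, 3.3), (4.9, 7.3)),
    Move("Darth", "Epistemic Discipline", 8.5, 5.0, (7.5, 9.9), (3.0, 7.0)),
    Move("Jarvis", "Domain Specialty", 8.5, 5.5),
    Move("Mo", "Voice Signature", 3.5, 5.5, (3.1, 3.5), (5.0, 5.7)),
]

for m in inflections(moves):
    flag = "significant" if is_significant(m) else "not flagged"
    print(f"{m.agent} {m.dimension} {m.delta:+.1f} {flag}")
```

On these four cards the non-overlap rule agrees with the flags shown, but that agreement does not confirm it is the rule actually used.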

P7 → P8
Mo · Conviction · personality · significant
+3.5
2.5 → 6.0
[1.5, 3.3] → [4.9, 7.3]
What moved
disagreement: 0.330 → 1.3 (+288%)
assertion: 1.3 → 2.3 (+72%)
position strength: 2.8 → 4.2 (+49%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Darth · Epistemic Discipline · personality · significant
-3.5
8.5 → 5.0
[7.5, 9.9] → [3.0, 7.0]
What moved
self correction: 0.330 → 0.000 (-100%)
citation: 18.1 → 12.0 (-34%)
limitation: 5.2 → 4.2 (-21%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Jarvis · Voice Signature · personality · significant
+3.0
1.5 → 4.5
[1.4, 2.0] → [4.3, 4.7]
What moved
vocab diversity: 0.074 → 0.147 (+99%)
emoji use: 0.041 → 0.035 (-15%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Jarvis · Domain Specialty · capability
-3.0
8.5 → 5.5
What moved
brand: 28.2 → 10.7 (-62%)
finance: 147 → 74.2 (-49%)
tech: 42.6 → 27.8 (-35%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Darth · Voice Signature · personality · significant
+3.0
3.5 → 6.5
[2.9, 3.8] → [5.9, 7.1]
What moved
vocab diversity: 0.064 → 0.172 (+168%)
emoji use: 0.245 → 0.147 (-40%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Darth · Orchestration · cooperation · significant
-3.0
6.5 → 3.5
[6.3, 6.7] → [3.4, 4.0]
What moved
handoff: 4.9 → 1.8 (-62%)
mentions made other agents: 74.3 → 57.1 (-23%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Darth · Domain Specialty · capability
-2.5
10.0 → 7.5
What moved
brand: 79.7 → 27.6 (-65%)
tech: 112 → 47.9 (-57%)
finance: 232 → 110 (-53%)
Context · P8: T=4 + v2 rebuild (May 23)
P7 → P8
Mo · Voice Signature · personality · significant
+2.0
3.5 → 5.5
[3.1, 3.5] → [5.0, 5.7]
What moved
vocab diversity: 0.054 → 0.089 (+64%)
emoji use: 0.322 → 0.265 (-18%)
Context · P8: T=4 + v2 rebuild (May 23)

Period timeline · events that shaped trajectories

Inflections above are mapped to the events in their to-period. Orange top-stripe = period had events.

P1
Feb 12 – Feb 25
P2
Feb 26 – Mar 11
· Jarvis joins (Feb 26)
· T=0 Baseline (Mar 10)
P3
Mar 12 – Mar 25
· DeepSeek incident (Mar 18)
P4
Mar 26 – Apr 8
· THE FLIP MVP (Apr 15)
· Sentinel group created
P5
Apr 9 – Apr 22
· Darth joins (Apr 20)
· IC Protocol launches
· Taylor Wessing demo
P6
Apr 23 – May 6
· Sentinel/Sam+Mo+Herman active
· T=3 v1.0 scoring
P7
May 7 – May 20
· Trinity Capital LLP (May 7)
· Otto debuts (May 16)
· BioNTech deep-dive
P8
May 21 – Jun 3
· T=4 + v2.2 rebuild (May 23)
P9 (in progress)
Jun 4 – Jun 17
· T=5 scoring (Jun 6)
v3.3 · 3 spines · 8 dimensions · 4 agents · 9 periods · inflection threshold: |Δ| ≥ 1.0 · 95% CIs