| From | Pattern | Our adoption |
|---|---|---|
| Superpowers | TDD iron law | Validator HARD RULE: behavioural assertions need failing-then-passing test |
| gstack | Boil the lake | Added to debugger / ui-qa / curator / product / architect prompts |
| claudecode-orchestrator | Quality through truth | Validator EVIDENCE RULE: quote source output to claim PASS |
| claudecode-orchestrator | Service smoke-test | bin/service-smoke-test.sh + smokeTest.onDone/onFeaturePass |
| Hermes | Named checkpoints | CHECKPOINT <name> decision verb + triggerOn: post-planner auto-fire |
| Composio | Git worktree per agent | branchIsolation.useWorktrees + worktree_* fns |
| Conductor | Two-mode parallelism | parallelWorkers.mode: "lane" \| "competition" |
| Agent Swarm | Per-agent IDENTITY.md | ~/autonomous-harness/identity/<role>.md cross-mission append-only |
| MOLTRON | Self-evolving learnings | Worker TRICK: convention promoted by curator |
| Ralph Loop | 4-layer memory | L0 runtime / L1 raw / L2 summary / L3 MEMORY / L4 identity |
| GSD | Per-phase orchestrators with state-to-disk | Fresh-context per role per iteration |
| Agentwise | Real-time dashboard | Next.js /harness UI |
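
The Validator EVIDENCE RULE in the table above (quote source output to claim PASS) could be enforced mechanically along these lines. This is a minimal sketch, not the harness's actual check: the `Verdict` type, its field names, and `evidence_holds` are hypothetical names introduced for illustration, and the whitespace normalisation is an assumption about how strict the match should be.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """A validator verdict; the field names are hypothetical."""
    status: str    # "PASS" or "FAIL"
    evidence: str  # text the validator claims to be quoting


def _normalise(text: str) -> str:
    # Collapse whitespace runs so line wrapping does not defeat the match.
    return " ".join(text.split())


def evidence_holds(verdict: Verdict, captured_output: str) -> bool:
    """Accept a PASS only if its quoted evidence occurs in the real output."""
    if verdict.status != "PASS":
        return True  # the rule only constrains PASS claims
    quote = _normalise(verdict.evidence)
    # An empty quote is no evidence at all, so it never justifies a PASS.
    return bool(quote) and quote in _normalise(captured_output)
```

Under this sketch, a PASS whose quote paraphrases the output instead of copying it is rejected, which is the behaviour the rule is after: the claim has to be traceable to text the command actually produced.
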

| From | Pattern | Why rejected |
|---|---|---|
| OpenSwarm | Linear/Notion task source | User explicit: no external task source |
| OpenSwarm | LanceDB vector memory | Overkill at our scale |
| Claude-Swarm | Tmux-based agent messaging | File-based simpler |
| Agentwise | Discord/Slack control | UI-coupled |
| MOLTRON | Workers rewrite their own prompts | Audit / determinism preferred |
| AutoGPT | Infinite loop, no decision verbs | Drift catastrophe |
| LangChain | Heavyweight Python runtime | Bash + claude won |
| All “skill packs” | Human as orchestrator | We needed autonomous loop |

| Pattern | Why we needed it | Where it lives |
|---|---|---|
| Decisions timeline + ghost rate | Detect orchestrator parser regressions | /api/harness/:slug/decisions + dashboard panel |
| Cost-per-feature attribution | Per-session cost tracking is too coarse | Filename tag pattern: <ts>-<role>-<fid>.jsonl → /usage byFeature |
| Plan-reviewer as autonomous gate | gstack’s /plan-ceo-review is human-triggered; ours fires automatically | prompts/plan-reviewer.md + run.sh gate |
| Autonomous Product role | MOLTRON evolves capabilities; we wanted scope expansion | prompts/product.md + GOAL.md vs SPEC.md diff + proposals CRUD |
| Proposals CRUD with bulk accept/reject | Operator triage workflow | .harness/proposals/*.md + dashboard 🪄 tab |
| Mission cost cap with SIGSTOP auto-pause (later removed) | Soft warn before hard cap | Removed at user request after $135 mission |
| Per-feature timeline aggregating snapshots + agent runs + debug + PR | Forensics replacement for log scrolling | /api/harness/:slug/features/:id/timeline |
| Health endpoint with 9 composite checks | One-glance system health | /api/harness/:slug/health |
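
The cost-per-feature row above relies on the `<ts>-<role>-<fid>.jsonl` filename tag. A rough sketch of how such an aggregation might work is shown below. Several details are assumptions, not taken from the harness: that `<ts>` is purely numeric, that `<fid>` contains no hyphens (roles such as `ui-qa` do), and that each JSONL record carries a `cost_usd` field. The function name `cost_by_feature` is likewise hypothetical.

```python
import json
import re
from collections import defaultdict
from pathlib import Path

# Assumed shape: numeric timestamp, role (may contain hyphens, e.g. "ui-qa"),
# and a hyphen-free feature id. The source only gives <ts>-<role>-<fid>.jsonl.
LOG_NAME = re.compile(r"^(?P<ts>\d+)-(?P<role>.+)-(?P<fid>[^-]+)\.jsonl$")


def cost_by_feature(log_dir: Path) -> dict[str, float]:
    """Sum the hypothetical `cost_usd` field of every record, keyed by feature id."""
    totals: dict[str, float] = defaultdict(float)
    for path in sorted(log_dir.glob("*.jsonl")):
        match = LOG_NAME.match(path.name)
        if match is None:
            continue  # untagged session log: cannot be attributed to a feature
        for line in path.read_text().splitlines():
            if line.strip():  # tolerate blank lines between records
                record = json.loads(line)
                totals[match["fid"]] += float(record.get("cost_usd", 0.0))
    return dict(totals)
```

Putting the feature id in the filename means attribution needs no extra index: any tool that can list the log directory can rebuild the per-feature totals.
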

We bet that:
- A small bash supervisor with no agent-runtime opinion is the right unit of generality.
- Stable per-role markdown prompts make prompt-cache hits dominate (99.94% hit rate).
- Files on disk are the right state model: any role can re-derive its state from them.
- A curator role doing memory compaction prevents unbounded raw.md growth.
- Observability beats autonomy — we’d rather see the orchestrator’s parse-ghost rate than have a more autonomous orchestrator we can’t observe.
So far the bet is paying off: a 99.94% cache hit rate, 22/37 features green at iteration 5 of the live mission, $135 per mission, and an effective cost of $0.05 per role invocation.
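
The parse-ghost rate named in the bets above can be illustrated with a small sketch. It assumes a "ghost" is an orchestrator output from which no decision verb can be parsed. That definition is a guess at the intent, not something the notes spell out. Only `CHECKPOINT` appears in this document as a decision verb; `CONTINUE` and `DONE` are placeholders, and `ghost_rate` is a hypothetical name.

```python
import re

# Hypothetical verb set: only CHECKPOINT is named in the source.
DECISION_VERBS = ("CHECKPOINT", "CONTINUE", "DONE")

# A decision counts only when a verb starts a line (leading spaces allowed).
_VERB_LINE = re.compile(
    r"^\s*(" + "|".join(DECISION_VERBS) + r")\b",
    re.MULTILINE,
)


def ghost_rate(orchestrator_outputs: list[str]) -> float:
    """Fraction of outputs in which no decision verb could be parsed."""
    if not orchestrator_outputs:
        return 0.0  # no outputs yet: report no ghosts rather than divide by zero
    ghosts = sum(1 for text in orchestrator_outputs if not _VERB_LINE.search(text))
    return ghosts / len(orchestrator_outputs)
```

A rising value from a function like this would point at a parser or prompt regression, not at worse decisions, which is why it works as an observability signal.
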