Hermes

Referenced as a target in the gstack installer (./setup --host hermes installs gstack into ~/.hermes/skills/). Public-facing details are sparse; what we know comes from the comparative writeups.

Most opinionated of the popular Claude Code frameworks. Built around autonomous orchestration: Claude operates with significant autonomy on long-running multi-step workflows, with planning, persistent memory, multi-agent coordination, retry/error handling, and human-in-the-loop checkpoints.

Designed for “AI-powered product” use cases rather than personal-tool use.

  • Named human-in-the-loop checkpoints. Borrowed verbatim. Our CHECKPOINT <name> decision verb is the structural equivalent. With triggerOn: post-planner, the checkpoint fires automatically once the planner and plan-reviewer have both passed.
  • Distinct from escalation. Hermes’ insight: a checkpoint is “we hit the milestone, please approve” — not “I’m stuck.” We split the verbs and the UI banners accordingly.
  • Autonomous retry/error handling at the framework level. Our retry is stateful at the feature level: the attempts counter increments on each try, and the debugger fires once attempts ≥ 3. Hermes seems to retry at the agent level; we prefer disk-tracked retry at the workflow level.
  • Multi-agent coordination protocol. We didn’t need a wire protocol; files-on-disk are the coordination surface.
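The disk-tracked retry described above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the one-JSON-file-per-feature layout, the `record_attempt` helper, the `next_agent` field, and the `implementer` agent name are all assumptions made for the example. Only the attempts counter and the threshold of 3 come from the text.

```python
import json
import tempfile
from pathlib import Path

DEBUGGER_THRESHOLD = 3  # from the text: the debugger fires once attempts >= 3


def record_attempt(state_dir: Path, feature: str) -> dict:
    """Increment the on-disk attempt counter for one feature and return the new state."""
    # Hypothetical layout: one JSON state file per feature.
    path = state_dir / f"{feature}.json"
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"feature": feature, "attempts": 0}
    state["attempts"] += 1
    # "implementer" is a placeholder name for whichever agent normally retries.
    at_threshold = state["attempts"] >= DEBUGGER_THRESHOLD
    state["next_agent"] = "debugger" if at_threshold else "implementer"
    path.write_text(json.dumps(state, indent=2))
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for _ in range(3):
            state = record_attempt(Path(tmp), "login-form")
            print(state["attempts"], state["next_agent"])
```

Because the counter lives in a file rather than in an agent's memory, any process that can read the directory sees the same retry state, which is also why a wire protocol is unnecessary in this design.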
| Hermes | Papercup |
| --- | --- |
| Multi-agent wire protocol | File-based coordination |
| Built around “AI product” use case | Built around autonomous coding |
| Checkpoint defined per workflow | Checkpoint defined per triggerOn config |
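To make the checkpoint/escalation split and the per-triggerOn definition concrete, here is a small sketch. Everything beyond the `CHECKPOINT` verb, the `triggerOn` key, and the `post-planner` value is assumed for illustration: the `ESCALATE` verb name, the `plan-approval` checkpoint name, the config shape, and the banner wording are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verb(Enum):
    CHECKPOINT = "CHECKPOINT"  # "we hit the milestone, please approve"
    ESCALATE = "ESCALATE"      # hypothetical name for the "I'm stuck" verb


@dataclass(frozen=True)
class Decision:
    verb: Verb
    name: str


# Hypothetical config shape: checkpoint name -> the event that triggers it.
CHECKPOINTS = {"plan-approval": {"triggerOn": "post-planner"}}


def decide(event: str, planner_passed: bool, reviewer_passed: bool) -> Optional[Decision]:
    """Return a CHECKPOINT decision when a configured trigger fires, else None."""
    if not (planner_passed and reviewer_passed):
        return None  # the checkpoint only auto-fires after both stages pass
    for name, cfg in CHECKPOINTS.items():
        if cfg["triggerOn"] == event:
            return Decision(Verb.CHECKPOINT, name)
    return None


def banner(decision: Decision) -> str:
    """Render a different UI banner per verb (wording is illustrative only)."""
    if decision.verb is Verb.CHECKPOINT:
        return f"Approval requested: {decision.name}"
    return f"Agent is stuck: {decision.name}"


if __name__ == "__main__":
    fired = decide("post-planner", planner_passed=True, reviewer_passed=True)
    if fired is not None:
        print(banner(fired))
```

Keeping the two verbs as separate enum members means the UI never has to guess whether a pause is a milestone approval or a request for help.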

We don’t have first-hand experience with Hermes — the public surface is mostly comparison writeups. The named-checkpoint pattern is the highest-value thing we lifted. If you’re building Hermes-style “AI products” in production, the framework is worth a deep look.