Why We're Building Custom Agent Orchestration Instead of Using Cursor's Native Autonomous Mode
Cursor's autonomous workflows work for interactive coding but lack the durability, observability, and policy controls our cron agents need — we're keeping our terminal+file-state architecture with incremental hardening.
Published 2026-06-10
TL;DR: Cursor’s auto-mode is optimized for human-in-the-loop IDE work. Our 18 cron agents need deterministic scheduling, per-job allowlists, tamper-evident logs, and zero GUI dependency, so we’re hardening our cron+terminal stack instead.
The Context
Hermes runs 18 scheduled cron jobs on a single Mac (default profile). Each job spawns agent sessions that run 5-30 minutes, call tools, write files, and update shared memory. Cursor’s autonomous workflows (Composer agent mode, background agents) are built for interactive developers who can nudge a stuck agent. Our cron jobs run at 3 AM with no human present. Constraints: a single machine, local-first, no container orchestration, and the operator must be able to verify any failure in under 5 minutes from local logs.
What We Tested
| Tool / Approach | Use Case | Verdict | Why |
|---|---|---|---|
| Cursor Composer agent mode (auto) | Autonomous coding tasks | 🟡 Reference | Strong for interactive work; “forgets” running apps; needs human nudge; no scheduling |
| Cursor background agents | Long-running tasks | 🟡 Reference | Better persistence; still IDE-coupled; no policy engine; Mac-only until recently |
| Grok Build v0.2.20 (worktrees + MCP + compaction) | Reference for local agent infra | 🟡 Reference | Ships the exact primitives we need: worktrees, MCP lifecycle, context compaction |
| Hermes current (cron → terminal → file state) | Production cron agents | ✅ Current | Works for 18 jobs; simple; debuggable; no framework lock-in |
| Custom Python orchestrator (2025 attempt) | Full-featured agent platform | ❌ Abandoned | Added complexity; cron + terminal + files covers 95% of needs |
The Pivot Point
We tried adopting Cursor’s background agents for our overnight research job (deep-research-001). The agent started well but at turn 12 “forgot” the research brief context — it had been pushed out by tool call history. Cursor’s compaction wasn’t triggered because the session was technically “active” (just slow). The run produced 40 pages of hallucinated citations. We realized: Cursor’s autonomy assumes a human watches the sidebar. Our cron jobs have no sidebar.
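The failure mode above, where the brief is evicted as tool output piles up, can be illustrated with a small sketch. This is not Cursor’s compaction logic and not something Hermes runs today. It is a hypothetical guard in which the brief is marked as pinned and only unpinned history is trimmed. The `Message` type, the `fit_to_budget` function, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str            # e.g. "brief", "assistant", "tool"
    text: str
    pinned: bool = False  # pinned messages are never dropped


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(history: list[Message], budget: int) -> list[Message]:
    """Keep all pinned messages, then the newest unpinned ones that fit."""
    pinned_cost = sum(estimate_tokens(m.text) for m in history if m.pinned)
    if pinned_cost > budget:
        raise ValueError("pinned context alone exceeds the token budget")

    remaining = budget - pinned_cost
    keep = [m.pinned for m in history]
    # Walk newest to oldest so recent tool output is kept first.
    for i in range(len(history) - 1, -1, -1):
        if history[i].pinned:
            continue
        cost = estimate_tokens(history[i].text)
        if cost > remaining:
            break  # this message and every older unpinned one is dropped
        keep[i] = True
        remaining -= cost
    return [m for m, kept in zip(history, keep) if kept]
```

With a pinned brief followed by twelve large tool results, a tight budget drops the oldest tool results and the brief stays at the head of the context. Without pinning, the same trim would discard the brief first, because it is the oldest message.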
What We Use Now
Cron + background terminal + file state — with incremental hardening:
- Per-job workspace: Each cron job gets a `HERMES_WORKSPACE=/tmp/hermes-<job-id>` env var; writes isolated, reads shared `brain/` read-only
- Session health logging: Every background terminal writes `logs/hermes-<job-id>-<timestamp>.jsonl` with tool calls, durations, errors, token attribution
- Manual compaction: Weekly synthesis cron includes explicit “summarize prior context” step in prompt; not automatic yet
- MCP readiness: `n8n-mcp` and `metricool-mcp` configured read-only; health check cron validates connectivity
- Allowlist per job: `config/hermes-policy/<job-id>.yaml` sets paths (read/write), commands (allow/deny), APIs (allow/deny), max_duration, max_tokens
- sigstore keyless signing: `cosign sign-blob` on log files post-job; `hermes verify-logs <job-id>` checks integrity
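The allowlist and logging items above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual Hermes implementation. `JobPolicy` is a hypothetical in-memory form of the per-job YAML file (loading the YAML is left out), and the field names, default limits, and helper functions are invented for the example.

```python
import json
import time
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class JobPolicy:
    """Hypothetical in-memory form of config/hermes-policy/<job-id>.yaml."""
    write_paths: list[str] = field(default_factory=list)
    read_paths: list[str] = field(default_factory=list)
    allow_commands: list[str] = field(default_factory=list)
    deny_commands: list[str] = field(default_factory=list)
    max_duration_s: int = 1800   # assumed default, not from the post
    max_tokens: int = 200_000    # assumed default, not from the post


def _is_under(path: str, roots: list[str]) -> bool:
    # Resolve first so "../" segments cannot escape an allowed root.
    target = Path(path).resolve()
    return any(target.is_relative_to(Path(root).resolve()) for root in roots)


def can_write(policy: JobPolicy, path: str) -> bool:
    return _is_under(path, policy.write_paths)


def can_read(policy: JobPolicy, path: str) -> bool:
    # Anything writable is also readable; shared paths stay read-only.
    return _is_under(path, policy.read_paths + policy.write_paths)


def can_run(policy: JobPolicy, argv: list[str]) -> bool:
    # Deny wins over allow; anything not listed is rejected.
    if not argv:
        return False
    program = Path(argv[0]).name
    if program in policy.deny_commands:
        return False
    return program in policy.allow_commands


def log_event(log_file: Path, kind: str, **fields) -> None:
    # One JSON object per line, appended, so a crashed run still leaves
    # a readable log up to the last completed event.
    record = {"ts": time.time(), "kind": kind, **fields}
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

A job wrapper would call `can_write` or `can_run` before each tool invocation and `log_event` after it, for example `log_event(log_file, "tool_call", tool="search", duration_ms=12)`. Default-deny keeps the policy file short: a job lists only what it needs, and anything unlisted fails closed.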
When You’d Choose Differently
- Adopt Cursor / Grok Build / similar IDE agents if: you want integrated coding + agent infra, accept Mac/Linux/Windows, prefer GUI over cron/terminal, human-in-the-loop is acceptable.
- Build custom orchestrator (Temporal, Prefect, Kubernetes operators) if: you need multi-machine, complex DAGs, SLA-grade reliability, team collaboration on agent workflows.
- Stay with cron + files if: single machine, <50 jobs, operator is technical, debuggability > features, zero external dependencies required.
Tool Crucible Rating
| Dimension | Score (1–5) | Notes |
|---|---|---|
| Overall | 3 | Great for interactive dev; not built for headless production cron |
| Ease of Adoption | 5 | Zero setup if you already use Cursor; “it just works” for interactive tasks |
| Value | 2 | Negative value for our use case — fighting the IDE assumptions costs more than custom |
| Support/Ecosystem | 4 | Active development; Anysphere responsive; but roadmap prioritizes IDE users, not cron operators |
This is part of our Agent Orchestration evaluation series. See full comparison: Tool Crucible Local Agent Infrastructure
Last reviewed 2026-06-10. See our methodology and affiliate policy.