
By David Ondrej · 42:57 · transcript ok


Hermes Agent Just 10x’d Everyone’s Claude Code

Video: https://www.youtube.com/watch?v=1nDiiXfMUK4

Video ID: 1nDiiXfMUK4

Duration: 42:57

Transcript status: ok

Generated: 2026-05-02T07:00:36Z

Core thesis

The video argues for a “personal agent on a VPS” workflow: run Hermes as the always-on orchestrator, connect it to chat surfaces like Discord, give it Claude Code as a coding worker, then wire GitHub and Vercel so plain-English messages can become deployed software changes.

The strongest version of the idea is not “fire humans and vibe deploy everything.” It is: keep an agent manager online, make it reachable from normal messaging apps, and let it delegate bounded coding tasks to background coding agents while you continue doing other work.
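The manager-plus-worker split described above can be sketched as a small dispatch loop. This is a hypothetical illustration, not a Hermes or Claude Code API; the names `Orchestrator`, `CodingWorker`, and `handle_message` are invented for the sketch.

```python
from dataclasses import dataclass

# Hypothetical sketch of the "always-on manager + bounded coding worker"
# split. None of these names correspond to real Hermes or Claude Code APIs.

@dataclass
class Task:
    description: str
    repo: str
    status: str = "queued"   # queued -> running -> done
    result: str = ""

class CodingWorker:
    """Stands in for a background coding agent such as Claude Code."""
    def run(self, task: Task) -> Task:
        task.status = "running"
        # A real worker would clone the repo, edit code, and open a PR.
        task.result = f"PR opened for: {task.description}"
        task.status = "done"
        return task

class Orchestrator:
    """Always-on manager: takes chat messages, delegates bounded tasks."""
    def __init__(self, worker: CodingWorker):
        self.worker = worker
        self.log: list[Task] = []

    def handle_message(self, text: str, repo: str) -> str:
        task = self.worker.run(Task(description=text, repo=repo))
        self.log.append(task)
        return task.result
```

The point of the shape: the orchestrator stays online and keeps a log, while each worker run is a bounded task with a reviewable artifact (a PR) rather than an open-ended session.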

Big ideas / key insights

Best timestamped moments with interpretation

Screen-level insights: what the visuals add

Comment-derived insights

The comments are useful because they expose the tension between excitement about the workflow and skepticism about running it in production.

Practical workflow to steal — with guardrails

1. Provision a small VPS first. Start minimal unless you are running local models. If inference is via OpenRouter/Anthropic/OpenAI, the VPS mainly needs reliable uptime, disk, and enough RAM for toolchains.

2. Install Hermes and verify the base agent. Confirm the model provider works before adding GitHub, Discord, or coding agents.

3. Install and authenticate Claude Code separately. Treat Hermes as orchestrator and Claude Code as a worker with its own auth and health check (`claude --version`, `claude doctor` where applicable).

4. Use chat integration for control, not unlimited authority. Discord/Telegram are convenient, but restrict allowed users/channels and avoid exposing the bot to broad servers.

5. Store secrets outside model-visible prompts. Prefer config/env mechanisms where the agent can use credentials through tools without printing them. Still assume any tool-enabled agent has a meaningful blast radius.

6. Use GitHub branches and PRs by default. The demo pushes directly; for real work, require feature branches, tests, review, and deploy previews.

7. Add a deployment quality gate. Before production deploys, require build, lint, typecheck, and at least a smoke/E2E test. For anything sensitive, add human approval.

8. Make architecture constraints explicit. The SQLite-on-Vercel failure happened because deployment constraints were not specified. Tell the agent target runtime limits before implementation.

9. Monitor cost. Subagents, long context, repeated polling, and “update me every minute” loops can quietly become expensive.

10. Keep the main promise, drop the recklessness. The good idea is asynchronous agent delegation from chat. The risky part is treating chat-to-production as the default.
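Step 4's "restrict allowed users/channels" reduces to a gate that runs before any message is dispatched to the agent. A minimal sketch, with placeholder IDs (real Discord IDs are numeric snowflakes):

```python
# Illustrative allowlist gate for a chat-controlled agent (step 4).
# The IDs below are placeholders, not real Discord snowflakes.
ALLOWED_USERS = {"111111111111111111"}     # e.g. your own account
ALLOWED_CHANNELS = {"222222222222222222"}  # e.g. a private control channel

def is_authorized(user_id: str, channel_id: str) -> bool:
    """Dispatch to the agent only when both user and channel match."""
    return user_id in ALLOWED_USERS and channel_id in ALLOWED_CHANNELS
```

Checking both dimensions matters: a trusted user posting in a public server channel should still be refused, or anyone who can bait that user into a message gains indirect access.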
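Step 7's quality gate is mechanically simple: run every check, deploy only if all of them exit cleanly. A sketch assuming an npm-style project; the command list is illustrative and should match your stack:

```python
import subprocess

# Sketch of a pre-deploy quality gate (step 7). The commands assume an
# npm project; swap in your own build/lint/typecheck/test commands.
CHECKS = [
    ["npm", "run", "build"],
    ["npm", "run", "lint"],
    ["npx", "tsc", "--noEmit"],
    ["npm", "test"],
]

def gate_passes(checks=CHECKS, runner=subprocess.run) -> bool:
    """Return True only if every check exits 0; stop at the first failure."""
    for cmd in checks:
        if runner(cmd).returncode != 0:
            return False
    return True
```

Wiring this in front of the deploy step, plus a human approval for anything sensitive, is what separates "chat-to-preview" from the risky "chat-to-production" default.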
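Step 9's cost monitoring can start as a running total with a hard budget check before each new subagent or polling loop. The per-million-token prices below are illustrative placeholders, not real Anthropic or OpenRouter pricing:

```python
# Minimal spend tracker for step 9. Prices are illustrative placeholders
# per million tokens; substitute your provider's actual rates.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

class CostMonitor:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.spent_usd += (
            input_tokens / 1e6 * PRICE_PER_MTOK["input"]
            + output_tokens / 1e6 * PRICE_PER_MTOK["output"]
        )

    def over_budget(self) -> bool:
        """Check this before spawning a subagent or another polling round."""
        return self.spent_usd >= self.budget_usd
```

The "update me every minute" loops mentioned above are exactly the pattern this catches: each individual call is cheap, so only an accumulator makes the total visible.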

Visible tools / code / platforms

My read / why it matters

The workflow is directionally right: an always-on agent orchestrator plus background coding workers is much closer to how useful personal AI systems will feel than a single local CLI window. The chat-native interface is also a big deal. If you can send a voice note from your phone and receive a PR or preview link later, the software-building loop changes.

But the demo also accidentally teaches the main danger. Fast autonomous coding is easy to over-trust. The SQLite/Vercel mistake, direct push behavior, broad token scopes, and “just fix and deploy” loop are exactly how agentic speed becomes agentic slop.

The version worth adopting is: Hermes/OpenClaw-style orchestrator for intake and delegation, Claude Code/Codex-style workers for implementation, GitHub PRs for review, Vercel/Cloudflare previews for inspection, and explicit tests/security gates before production. That keeps the magical part—ideas turning into working artifacts asynchronously—without pretending production engineering is just a Discord message.