← Back to library

Build a proactive agent workflow with Claude Code

Claude22:03Transcript ✅Added May 21, 12:40 am GMT+8

Actionable Insights

  • Build a docs-sync routine first, not a production-changing routine (evidence: /schedule demo at 7:29 and Routines UI at 13:42-14:12). Use Claude Code Routines with a prompt like: “weekly, diff source repo main against docs repo; identify user-facing changes; open a docs PR only when evidence exists; include changed files and screenshots.” Connect source repo, docs repo, GitHub, and Slack with least-privilege scopes. Pass/fail: after 4 runs, at least 3 PRs should be relevant and there should be zero auto-merges or broad connector grants.
  • First-run checklist for the docs-sync routine: repo access confirmed; connector scope limited; branch naming such as routine/docs-sync-YYYY-MM-DD; PR template includes source commits, docs pages touched, screenshots/rendered-doc check, and CI status; Slack notification goes to a low-noise channel; reviewer is explicitly assigned. Compare this with GitHub Actions if your team already has strong CI/event infrastructure. Caution: Routines reduce cron/hosting boilerplate, but they do not replace permissions design or human review.
  • Design every routine with a three-cell spec: trigger, context, steerability (evidence: transcript around 8:32-9:02 and routine configuration frame at 14:12). Trigger can be schedule, GitHub event, or API webhook; context should include only needed repos/connectors; steerability should define how to watch, stop, resume, and review. Use Claude Code best practices for repo context hygiene.
  • Add a critic/reviewer routine only after the generator emits structured evidence (evidence: agent-on-agent review discussion later in the talk). The review routine should check changed files, rendered output, and whether the PR maps to actual source changes. Evaluation: critic comments should find real defects or explicitly approve with evidence; if it only paraphrases, remove it.

Core thesis

The useful shift is not “let AI write more code”; it is designing an operating loop where agents have the right context, tools, triggers, isolation, verification, and human control points. The video is strongest when treated as workflow design evidence, not as proof that autonomy removes engineering responsibility.

Big ideas / key insights

  • Routines reduce cron/hosting/session-state boilerplate for proactive Claude Code jobs. Verdict preview: agree, confidence High. Supported by transcript 2:53-5:58 and Anthropic Routines docs. Overclaim risk: routines do not remove the need to design permissions, review policy, budgets, and rollback boundaries.
  • Scheduled docs-sync agents are a good first production use case. Verdict preview: agree, confidence Medium. Transcript demo 6:28-15:14 is concrete, and docs are lower-risk than automatic production changes. Practical takeaway: start with PR creation, not auto-merge.
  • Agent-on-agent review can keep autonomous outputs honest. Verdict preview: mixed, confidence Medium. The generator/critic pattern is sensible, but a second model is not a guarantee. Keep deterministic tests and human review for high-risk changes.

Best timestamped moments with interpretation

  • 0:19 — Hello everyone. How are you? Good. Okay. Amazing. Welcome to the last workshop session of the day. I hope you have all enjoyed the very first day of code with Claude. Uh my name…
  • 0:49 — Today I’m here to talk to you about how to build a proactive agent workflow with cloud code. Um, can I get a show of hands? Who has used our routines feature inside of Cloud Cod…
  • 1:20 — keep your hands up if you’ve enjoyed building all of that infra and maintaining that job. All right, we have one guy back there. We have one guy. Thank you. Thank you for your e…
  • 1:52 — really powerful coding tool, but we want to take Claude code and turn it into a really powerful coding teammate. A teammate notices when something breaks and does something abou…
  • 2:22 — teammate of tomorrow. So today we’ll be talking about four things. I’ll go through some of the challenges that you folks have felt uh building proactive agents today. I’ll go th…
  • 2:53 — So I want to talk first through the challenges with building proactive agents today. We all know it’s doable. Um but I want to talk about what what’s a little bit cumbersome wit…
  • 3:24 — things like hosting, data persistence, and authentication. Basically, you’ll need to build a whole infrastruct outside of your prompts, which is doable, but it’s a lot of work a…
  • 3:54 — up. Um, but again, there’s there’s a lot of infra that you need to build yourself here. Finally, the challenge with building proactive agents today is sometimes you want to be a…
  • 4:25 — these three issues um, and build routines. Routines is a brand new feature inside of Cloud Code. It’s an automation where you can kick off a remote Claude code session by only d…
  • 4:56 — and developed this routines feature inside of cloud code. The first thing is that we wanted these agents to be always available. These agents these routines run on claude codes …
  1. Start with a low-risk workflow that produces reviewable artifacts: docs PRs, smoke-test reports, migration plans, or issue triage.
  2. Encode context in files the agent can repeatedly read (CLAUDE.md, checklists, ADRs, runbooks).
  3. Give tools deliberately: browser automation, GitHub, Slack/Linear, cloud logs, or local panes only when the task needs them.
  4. Require evidence before completion: diffs, screenshots, command output, test results, and cited source links.
  5. Promote autonomy gradually: observe → steer → require PR review → allow constrained auto-actions only after measured reliability.

Comment insights

  • (12 likes) @Paisley-Qz: still no new sonnet ..
  • (2 likes) @dimiemo: still no new sonnet
  • (0 likes) @hankli0130: Nice job 👍
  • (0 likes) @raajthegoat: this is cool
  • (0 likes) @asas_AI: wow
  • (0 likes) @alexkhaerov: NewGen re-invented the cron.
  • (0 likes) @ronyra: the best (first)
  • (0 likes) @nathanwells4809: Here before the context limit joke hits too comment

Distilled read: the comments are light and mostly reactive. Useful caveats include concern about context/token exhaustion, skepticism that routines are “cron reinvented,” and interest in model/version availability. Treat the comment section as weak signal, not technical validation.

Deep research

External sources checked or used as context:

Research synthesis: the strongest support comes from first-party docs for the named tools plus established software-delivery research that emphasizes feedback loops, CI/CD, platform engineering, and sociotechnical constraints. The strongest contradiction is not that these tools are useless; it is that output metrics or demos do not prove organization-wide productivity, reliability, or safety without measuring downstream quality, review load, incident rate, and developer experience.

Verdict

  • Claim: Routines reduce cron/hosting/session-state boilerplate for proactive Claude Code jobs.
    • Verdict: agree
    • Confidence: High
    • Evidence and limits: Supported by transcript 2:53-5:58 and Anthropic Routines docs. Overclaim risk: routines do not remove the need to design permissions, review policy, budgets, and rollback boundaries.
    • Practical takeaway: Apply the pattern, but keep measurable guardrails and human approval for irreversible/high-risk actions.
  • Claim: Scheduled docs-sync agents are a good first production use case.
    • Verdict: agree
    • Confidence: Medium
    • Evidence and limits: Transcript demo 6:28-15:14 is concrete, and docs are lower-risk than automatic production changes. Practical takeaway: start with PR creation, not auto-merge.
    • Practical takeaway: Apply the pattern, but keep measurable guardrails and human approval for irreversible/high-risk actions.
  • Claim: Agent-on-agent review can keep autonomous outputs honest.
    • Verdict: mixed
    • Confidence: Medium
    • Evidence and limits: The generator/critic pattern is sensible, but a second model is not a guarantee. Keep deterministic tests and human review for high-risk changes.
    • Practical takeaway: Apply the pattern, but keep measurable guardrails and human approval for irreversible/high-risk actions.

Screen-level insights

  • 0:19 title slide introduces Maya Nielan, Anthropic, and the proactive-agent workflow topic.
  • 7:29 terminal demo shows /schedule Once a week review all the new changes merged to main... create a PR to update docs, proving the routine is prompt-driven, not just conceptual.
  • 13:42-14:12 Claude Code Routines web UI shows connected source/docs repos, GitHub/Slack connectors, weekly trigger, and generated multi-step instructions.

Why the visual step matters: it prevents the analysis from treating a polished talk as only words. Frames show whether the speaker demonstrated an actual UI/CLI/workflow, whether claims were backed by concrete configuration, and where the video only provided stage narration rather than product evidence.

My read / why it matters

The practical opportunity is to make agent work inspectable and boring: clear triggers, scoped context, isolated execution, repeatable verification, and concise human review. The risk is mistaking “agent can act” for “agent should act.” Teams that win will build operating systems around agents, not just prompts.

Verification notes

  • Source/evidence audit: Main claims were tied to transcript timestamps, extracted comments, frame observations, and named external sources above. First-party docs were preferred for product capabilities.
  • Transcript/comment/frame fidelity audit: Timestamped moments were taken from the extraction markdown; comment insights are explicitly marked as weak where comments were sparse; screen claims are limited to visible UI/text and nearby transcript.
  • Hallucination/overclaim audit: Verdicts distinguish demo/internal claims from independently verified facts. Organization-wide productivity claims are marked mixed unless supported beyond the video.
  • Actionable Insights audit: Top bullets were rewritten as executable workflows with first steps, tools/links, evaluation criteria, and cautions. Residual uncertainty remains around fast-changing Claude Code feature availability and any private/internal metrics presented in talks.