
Segment 14: Jun Yu Tan (Tusk): Fence, OS level guardrails, and deterministic safety for coding agents

AI Engineer · 9h 27m · Transcript ✅ · Added May 29, 12:54 am GMT+8

  • Timestamp: 03:38:09
  • Duration: 10m 23s
  • Livestream range: 03:38:09 → 03:48:32
  • Transcript evidence: 20 chunks, about 1767 words

Actionable Insights

  1. Turn Fence into an operating checklist. For each agent workflow, write down the user, the input, the tool boundary, the review step, and the failure condition.
  2. Separate capability from accountability. The recurring lesson in this chapter is that more capable AI changes who does the work, but not who owns the outcome. When applying it to secure agent execution and harnesses, write down what the system may do autonomously and what still requires explicit human judgment.
  3. Instrument the loop before scaling it. The useful operating loop is: capture context, let the tool act, review the result, preserve the learning, and tighten the next run. Write down acceptance criteria and review notes early so the workflow can be audited later.
  4. Design for the failure mode, not the demo. The polished demo of Fence and OS-level guardrails matters less than the places where the approach breaks: weak context, unsafe permissions, weak evaluation, unclear ownership, latency, or poor human review.
  5. Convert this into a safe agent execution checklist. The durable takeaway from Jun Yu Tan (Tusk) is to turn “Fence, OS level guardrails, and deterministic safety for coding agents” into explicit operating rules: what the system may do, what it must prove, what evidence a reviewer needs, and where a human must stay accountable. The next useful artifact is a short checklist or eval case that someone can actually run.
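One way to make the checklist in item 1 something a person can actually run is to record it as data and flag blank entries before an agent is launched. The sketch below is a minimal illustration only; the class name `AgentRunChecklist`, its field names, and the sample values are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, fields


@dataclass
class AgentRunChecklist:
    """Hypothetical record of the five items the checklist asks for."""
    user: str               # who the agent acts on behalf of
    input_source: str       # where the task text or data comes from
    tool_boundary: str      # what the agent may execute or reach
    review_step: str        # who or what checks the result
    failure_condition: str  # what counts as a failed or unsafe run

    def missing(self) -> list[str]:
        """Return the names of items that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Sample values are invented for illustration.
draft = AgentRunChecklist(
    user="on-call developer",
    input_source="GitHub issue text",
    tool_boundary="",
    review_step="pull request review",
    failure_condition="",
)
unfilled = draft.missing()
```

A run could be refused while `unfilled` is non-empty, which keeps the tool boundary and the failure condition from being decided after the fact.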

What they actually use/show that is worth copying

  • container isolation: Container isolation is the safety idea worth copying. Assume the agent will make mistakes, then make sure those mistakes happen inside a boundary that limits blast radius.
  • Codex as software lifecycle agent: The harness is the product. Model capability becomes dependable only when planning, tools, execution, review, and rollback are explicit.
  • Daytona sandbox boundaries: This is a hard safety mechanism, not a prompt-only policy. The useful pattern is to restrict what the agent can execute and where failures can spread.
  • Cursor / Baby Cursor: The harness is the product. Model capability becomes dependable only when planning, tools, execution, review, and rollback are explicit.
  • Bluelabs relationship AI: This is a concrete mechanism from the talk. The useful question is whether it reduces friction, improves reliability, or makes human review easier in a real workflow.
  • to-do planning tools and states: This is a concrete mechanism from the talk. The useful question is whether it reduces friction, improves reliability, or makes human review easier in a real workflow.
  • production traces as eval ground truth: The practical value is that behavior becomes measurable. Instead of vibe-checking the agent, the speaker is using traces, tests, logs, or evals to make failures visible and repeatable.

Core thesis

Jun Yu Tan (Tusk) uses this chapter to make a specific argument about Fence, OS-level guardrails, and deterministic safety for coding agents. The useful pattern is not just the named product; it is how the segment exposes the operating model for secure agent execution and harnesses: humans keep taste, accountability, and deployment judgment while agents or models absorb more of the execution loop.

The chapter starts from an analogy to SQL injection: “…this by, you know, training developers to sanitize inputs harder. Uh, we solved it with prepared statements, right, by moving this boundary into the driver.” The point of the analogy is that the fix was structural rather than behavioral: the boundary moved into a layer that enforces it on every call. That opening also frames the segment as a concrete slice of the broader AIE Singapore Day 2 theme: agentic systems are moving from demos into production workflows, evaluation harnesses, creative tools, owned infrastructure, robotics, and enterprise runtimes. Read this analysis as a talk-level summary nested inside the livestream page, not as a generic summary of the entire livestream.
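The prepared-statement analogy can be made concrete with Python's standard `sqlite3` module. This is a minimal sketch of the general technique, not code from the talk; the table, function names, and payload are illustrative assumptions.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Boundary held by convention: the caller must remember to sanitize,
    # because the value is spliced into the SQL text itself.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Boundary held by the driver: the value is bound as a parameter
    # and is never parsed as SQL.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # injected condition matches every row
safe_rows = find_user_safe(conn, payload)      # payload is treated as a literal name
```

The same input leaks every row through the string-built query and matches nothing through the bound one. The difference comes from where the boundary sits, not from how careful the caller is.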

Comment insights

The extracted YouTube comments do not provide reliable speaker-specific audience reactions for Jun Yu Tan (Tusk), so this section does not attempt a sentiment read on the talk. The useful audience-facing read is content-based instead: this segment is valuable for viewers who care about Fence, OS-level guardrails, and deterministic safety for coding agents, especially the concrete implementation choices and operating constraints called out in the transcript.

Deep research

The research value of this talk is the practical architecture behind Fence, OS level guardrails, and deterministic safety for coding agents. Jun Yu Tan (Tusk) is not only making a broad claim; the useful details are the concrete mechanisms named in the transcript: container isolation, Codex as software lifecycle agent, Daytona sandbox boundaries, Cursor / Baby Cursor, Bluelabs relationship AI, to-do planning tools and states.

The main question to take away is how those mechanisms change the workflow. What becomes cheaper, what needs a stronger checkpoint, and what must remain human-owned? For this talk, the strongest evidence is in the speaker’s examples rather than in generic AI optimism. Use the named tools and operating choices as the starting point for further research, then validate whether the same pattern fits your own environment, security constraints, and evaluation loop.

Verdict

  • The talk contains a specific operating lesson about Fence, OS level guardrails, and deterministic safety for coding agents: Agree. The speaker gives enough segment-level evidence to extract concrete implications rather than treating it as generic conference commentary.
  • The named tools/examples should be copied blindly: Disagree. They are useful design references, but each needs to be checked against local security, data, latency, cost, and human-review requirements.
  • The most valuable part is the concrete workflow detail: Agree. The strongest takeaways are the mechanisms, constraints, and examples the speaker actually names.
  • The implementation details are transcript-supported: Agree. This page cites details such as container isolation, Codex as software lifecycle agent, Daytona sandbox boundaries, Cursor / Baby Cursor.
  • Human accountability disappears when agents improve: Disagree. The recurring production pattern is to move execution into tools while keeping ownership, review, and failure handling explicit.

Screen-level insights

  • 3:39:07 - opening frame: Jun Yu Tan (Tusk) frames the talk around Fence, OS-level guardrails, and deterministic safety for coding agents, with the useful setup being: “I’ve [trawled] through Twitter to see what people think about this flag or permission prompts in general. Um the top row represents uh some kind of prompt fatigue, right?”
  • 3:44:46 - container isolation: The talk names this as a contrast. In the quoted passage the configuration states what the agent can reach and which commands it can never run, with no container runtime involved. The relevant evidence is: “…you can reach and commands you can never run and that’s it. There’s no daemon, no image, no container runtime. So here’s a quick demo. Uh I think this is running a little fast but I can explain it.”
  • 3:42:40 - Codex as software lifecycle agent: The talk shows or names this as part of the actual workflow. The relevant evidence is: “open-ended. I’m calling this agent overreach. Right? The interesting thing here is that there may or may not um be a malicious attacker, right? Unlike those um above. Sometimes agents just execute projection. They hallucinate. They get prompt injected.”
  • 3:47:50 - Daytona sandbox boundaries: The talk shows or names this as part of the actual workflow. The relevant evidence is: “achieve defense in depth and most teams already have one of these layers, right? If you’re using Claude Code you have probably been on auto mode. If you’re security conscious you might already run agents in containers or cloud sandboxes, uh, but what I want more…”
  • 3:39:38 - Cursor / Baby Cursor: The talk shows or names this as part of the actual workflow. The relevant evidence is: “bit uneasy about what the agents can do or already have been burnt by, you know, sometimes agents just like deleting uh costly data or even the entire system. So this is the UX failure mode right here.”
  • 3:45:48 - closing implication: The later part of the talk turns the idea into a practical takeaway. In the demo a simple file change succeeds, but the commit-and-push step fails because of the Fence config: “now I’m just asking it to, you know, like just update the README of today’s date, just make a simple file change. Um it does that but now um when it you know tries to um create a commit and push the commit to remote this fails because um in our fence config we have um…”
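The quoted demo describes a policy with two parts: destinations the agent can reach and commands it can never run. The sketch below shows only what a deterministic allow/deny decision looks like. It is not Fence's configuration format or implementation, and the rule contents are hypothetical, since the transcript excerpt cuts off before stating what the Fence config contains. The names `Policy`, `command_allowed`, and `domain_allowed` are assumptions introduced for illustration.

```python
import shlex
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: reachable hosts plus forbidden command prefixes."""
    allowed_domains: frozenset = frozenset()
    denied_commands: tuple = ()  # each entry is a tuple of leading tokens


def command_allowed(policy: Policy, command_line: str) -> bool:
    # Deny when the command starts with any forbidden token sequence.
    tokens = shlex.split(command_line)
    for denied in policy.denied_commands:
        if tuple(tokens[: len(denied)]) == tuple(denied):
            return False
    return True


def domain_allowed(policy: Policy, url: str) -> bool:
    # Allow only hosts that are listed explicitly; everything else is refused.
    host = urlparse(url).hostname or ""
    return host in policy.allowed_domains


# Illustrative rules only, chosen to mirror the outcome described in the demo.
policy = Policy(
    allowed_domains=frozenset({"api.github.com"}),
    denied_commands=(("git", "push"),),
)
commit_ok = command_allowed(policy, "git commit -m 'update README'")
push_ok = command_allowed(policy, "git push origin main")
```

The same input always yields the same decision, with no model judgment or permission prompt in the loop. A user-space string check like this one is easy to bypass, for example with `sh -c 'git push'`, which is one reason the talk's framing puts enforcement at the OS level instead.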

Verification notes

Verified against the extracted transcript for Jun Yu Tan (Tusk)’s talk on Fence, OS level guardrails, and deterministic safety for coding agents. The supported claims in this page are based on concrete tools/artifacts named in the talk: container isolation, Codex as software lifecycle agent, Daytona sandbox boundaries, Cursor / Baby Cursor, Bluelabs relationship AI, to-do planning tools and states, production traces as eval ground truth. I treated auto-caption wording cautiously, kept only details that are explicitly present in the segment transcript, and avoided importing claims from adjacent speakers or from the overall conference description.