Why Your AI UX Is Broken (and It’s Not the Model’s Fault) — Mike Christensen, Ably

AI Engineer · 18:37 · Added May 18, 4:40 pm GMT+8

Actionable Insights

Stop binding the agent stream to one browser connection: Put a durable session/channel between agent and clients so reconnects, multiple tabs, and mobile handoff can replay state. Direct HTTP/SSE streaming is easy to start with but breaks down for resumability, multiple devices/tabs, live control, interruptions, and human handoff. As with every insight in this list, pilot the change before rolling it out: write down the exact input, expected output, owner, and success metric, capture before/after examples, and keep the old path available until the new one passes repeated checks. The main risk is overgeneralizing the speaker's demo beyond the evidence; if the video or comments only showed a narrow case, keep the rollout narrow and require fresh proof before broad adoption.
Choose bidirectional control when users can steer/stop agents: SSE is simple for one-way streaming, but it only carries data from server to client, so stop, interruption, and steering need a separate control path or a WebSocket-style bidirectional channel.
Persist ordered events: Store agent tokens, events, and tool states with sequence IDs so clients can resume from the last seen event without asking the agent to rebuild client-specific streams. The reconnect scenario at 6:13 shows why: the client drops mid-stream while the LLM keeps generating, and without a durable layer the events have nowhere reliable to go.
Design for multi-surface continuity: Test the full sequence: start in one tab, open another tab, disconnect the network, resume, cancel from the second tab, and hand off to a human operator. The demo at 16:29–17:59 covers the same ground: disconnect/reconnect, cancel from another tab, and human participant handoff.
Separate vendor pitch from architecture: Ably sells real-time messaging, but the architectural pattern is generally useful: durable session state plus pub/sub/replay. The core architecture is simple and valid; teams should evaluate whether they need a managed real-time platform or can implement a smaller durable-event layer themselves.
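
The first, third, and fourth insights above can be sketched together as a tiny in-memory session layer. This is a minimal illustration, not the talk's or Ably's implementation: the names Session, Event, publish, attach, and detach are hypothetical, and a real session would persist its log instead of holding it in a Python list.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    seq: int   # position in the session log, starting at 1
    kind: str  # illustrative kinds: "token", "tool_state", "done"
    data: str


Listener = Callable[[Event], None]


@dataclass
class Session:
    """In-memory stand-in for a durable session (assumption: no real storage)."""

    log: list[Event] = field(default_factory=list)
    listeners: dict[str, Listener] = field(default_factory=dict)

    def publish(self, kind: str, data: str) -> Event:
        # The agent appends to the log without knowing who is attached.
        event = Event(seq=len(self.log) + 1, kind=kind, data=data)
        self.log.append(event)
        for deliver in list(self.listeners.values()):
            deliver(event)
        return event

    def attach(self, client_id: str, deliver: Listener, after_seq: int = 0) -> None:
        # Replay every event the client has not seen, then go live.
        for event in self.log[after_seq:]:
            deliver(event)
        self.listeners[client_id] = deliver

    def detach(self, client_id: str) -> None:
        self.listeners.pop(client_id, None)


session = Session()
tab_a: list[Event] = []
tab_b: list[Event] = []

session.attach("tab-a", tab_a.append)
session.publish("token", "Hel")
session.detach("tab-a")                 # network drops on the first tab
session.publish("token", "lo")          # the agent keeps generating
session.attach("tab-b", tab_b.append)   # a second tab opens and gets a full replay
session.attach("tab-a", tab_a.append, after_seq=tab_a[-1].seq)  # first tab resumes
session.publish("done", "")
```

Both tabs end up with the same three events in order, even though the first tab was offline while the second token was produced. The agent side only ever calls publish, which is the decoupling the pattern is after.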

Core thesis

Christensen argues that many AI chat products feel broken because their UX architecture assumes a single client-server stream. Direct HTTP/SSE streaming is easy to start with but breaks down for resumability, multiple devices/tabs, live control, interruptions, and human handoff. The proposed pattern is a durable session layer that decouples agents from clients.

Comment insights

The top comments correctly compress the talk: “put a stateful layer between your agent and clients so that you can interact with the same conversation on two different tabs.” Another notes the talk is partly a build-up to an Ably sales pitch. Both points are useful: the core architecture is simple and valid, but teams should evaluate whether they need a managed real-time platform or can implement a smaller durable-event layer themselves.

Deep research

Supporting sources and concepts:

SSE is one-way from server to client, so cancellation/steering requires another request/control channel.
WebSockets and pub/sub channels support bidirectional live interaction but need auth, ordering, replay, and backpressure design.
Durable event logs/session stores are a common distributed-systems pattern for resumability and multi-client sync.
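
A separate control path can be sketched as an inbox that the agent drains between steps. This is an assumption-laden illustration: the Control message kinds ("cancel", "steer"), the run_agent function, and the step names are hypothetical, and a real system would carry these messages over a WebSocket, a pub/sub channel, or a plain HTTP endpoint.

```python
from collections import deque
from collections.abc import Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    kind: str          # hypothetical kinds: "cancel" or "steer"
    payload: str = ""


def run_agent(steps: list[str], inbox: deque[Control]) -> Iterator[str]:
    """Yield one output per step, checking the control inbox before each step."""
    instruction = "default"
    for step in steps:
        while inbox:
            msg = inbox.popleft()
            if msg.kind == "cancel":
                yield "[cancelled]"
                return
            if msg.kind == "steer":
                instruction = msg.payload
        yield f"{step} [{instruction}]"


inbox: deque[Control] = deque()
agent = run_agent(["plan", "draft", "review", "publish"], inbox)

out = [next(agent)]                         # runs "plan" with the default instruction
inbox.append(Control("steer", "be brief"))  # could be sent from any tab or device
out.append(next(agent))                     # "draft" picks up the new instruction
inbox.append(Control("cancel"))             # explicit stop, independent of any stream
out.extend(agent)                           # the agent stops before "review"
```

Because cancel is an explicit message, a dropped output stream never has to be interpreted as a stop request; the client can disconnect and resume without affecting the run.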

Limiting evidence:

A durable session layer adds infrastructure complexity, storage, auth, and event-ordering responsibilities.
SSE can still work for simple one-tab chat apps or with a separate cancellation endpoint.
Ably-specific scale claims are vendor claims; the architecture should be evaluated independently.
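
For the simple one-tab case, plain SSE can already resume if each frame carries an id field, because the browser's EventSource sends the last id it saw back in the Last-Event-ID request header when it reconnects. The sketch below shows only the framing and the resume filter; format_sse and resume_frames are hypothetical helper names, the log is a plain list, header names are assumed to be already normalized, and cancellation would still need its own endpoint (not shown).

```python
def format_sse(event_id: int, data: str, event: str = "message") -> str:
    """Serialize one event in the SSE wire format (fields, then a blank line)."""
    lines = [f"id: {event_id}", f"event: {event}"]
    # Multi-line payloads need one "data:" field per line.
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"


def resume_frames(log: list[str], headers: dict[str, str]) -> list[str]:
    """Frames to send on (re)connect, skipping what the client already saw."""
    raw = headers.get("Last-Event-ID", "0")
    last_seen = int(raw) if raw.isdigit() else 0
    return [
        format_sse(i, data)
        for i, data in enumerate(log, start=1)
        if i > last_seen
    ]


log = ["Hel", "lo", "!"]
first_connect = resume_frames(log, {})
reconnect = resume_frames(log, {"Last-Event-ID": "2"})
```

This only works if the server keeps the log somewhere that outlives the dropped connection, which is the same durable-event requirement in a smaller form.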

Verdict

Direct HTTP/SSE is fine for prototypes but limited for rich AI UX: Agree, high confidence.
Durable sessions solve reconnect and multi-device continuity: Agree, high confidence if event persistence/replay is implemented correctly.
WebSockets are always required: Mixed. Bidirectional control is required when users can steer or stop agents; WebSockets are one option, and SSE with a separate control endpoint is another.
The talk is longer than the core idea: Agree, but the demos/examples still clarify failure modes.

Screen-level insights

0:37 default SSE pattern: The talk names Vercel AI SDK and similar frameworks as using SSE by default. This grounds the critique in common implementations.
3:40 core capabilities slide: The transcript lists resumability, continuity across surfaces, and steering while the agent works.
6:13 reconnect scenario: The client drops mid-stream while the LLM keeps generating; without a durable layer, the events have nowhere reliable to go.
8:14 cancellation ambiguity: A closed SSE connection could mean either "disconnected, will resume" or "cancel the run", and the server cannot tell which. This is the cleanest technical explanation for needing explicit control semantics.
16:29–17:59 demo frames: The demo shows disconnect/reconnect, cancel from another tab, and human participant handoff. Those visuals support the multi-surface claim.

Verification notes

Verification passes performed: source/evidence audit against transcript/comment evidence; fidelity audit for SSE/resume/cancel claims; hallucination audit separating architecture from Ably-specific marketing; Actionable Insights audit producing implementation checks. Residual uncertainty: the extraction does not include complete demo code or benchmark data.
