Architecting for Model Step-Changes: A Fireside with Vercel's Guillermo Rauch

Channel: Anthropic
Date: May 6, 2026
Duration: 27 min
Tags: Vercel, v0, Model Upgrades, Architecture, Agent-First

TL;DR

Guillermo Rauch (Vercel CEO) and Angela Jiang (Anthropic) discuss how Vercel ships day-one compatibility with major model releases. The key: abstract model capabilities behind evaluation-driven interfaces, so when a more capable model ships, the product gets better automatically rather than requiring a rewrite. The conversation covers Vercel's bets on agents, v0's evolution, and the organizational change required to become "agent-pilled."

Summary

Day-One Readiness

When Anthropic released Opus 4.5, v0 shipped support on day one. Guillermo explains that this wasn't heroic last-minute engineering but the result of an architectural philosophy: every v0 capability is defined by what it should produce (an eval target), not by how the model produces it. When a more capable model ships, the team re-runs the evals, quality goes up, and no rewrite is required.
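
To make the "eval target, not implementation" idea concrete, here is a minimal Python sketch. It is not Vercel's code: the `Capability` class, the stub models, and the string checks are all hypothetical, chosen only to show a capability described by what good output looks like, with the model as a swappable argument.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for a real model client: any callable from prompt to text.
Model = Callable[[str], str]


@dataclass(frozen=True)
class Capability:
    """A product capability described by its eval target, not by model-specific logic."""
    name: str
    prompt: str
    checks: tuple[Callable[[str], bool], ...]  # each check encodes one aspect of "good output"

    def score(self, model: Model) -> float:
        """Return the fraction of checks that the model's output passes."""
        output = model(self.prompt)
        passed = sum(1 for check in self.checks if check(output))
        return passed / len(self.checks)


# Two stub models (illustrative only) standing in for an older and a newer release.
def older_model(prompt: str) -> str:
    return "<form><input name='email'></form>"


def newer_model(prompt: str) -> str:
    return "<form><label for='email'>Email</label><input id='email' name='email' required></form>"


signup_form = Capability(
    name="signup-form",
    prompt="Generate a sign-up form with an email field.",
    checks=(
        lambda out: "<form" in out,     # produces a form at all
        lambda out: "<label" in out,    # labels its inputs
        lambda out: "required" in out,  # marks the field as required
    ),
)

# Swapping the model changes the score; the capability definition is untouched.
old_score = signup_form.score(older_model)
new_score = signup_form.score(newer_model)
```

Nothing in `signup_form` names a specific model, so moving to a newer one in this sketch means re-running `score`, not rewriting the capability. Real evals would use far richer checks than substring matches.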

Becoming Agent-Pilled

Guillermo uses "agent-pilled" to describe a genuine organizational transformation: not just adopting AI tools, but restructuring how work gets done. Vercel has dedicated teams that spend significant time reviewing agent outputs, designing eval suites, and working out how to delegate progressively larger chunks of engineering work to agentic systems. The company is building this skill deliberately; it is not a by-product of tool adoption.

v0's Trajectory

v0 started as a component generator — paste a description, get a React component. Today it generates complete applications: routing, database schemas, API routes, deployment configuration. Each capability step was enabled by a corresponding improvement in model capability. Guillermo describes the pattern as "the model gets better, we get out of the way."

Eval-Driven Architecture

Vercel's approach to quality is entirely eval-driven. Before any model change ships, it runs against a battery of representative v0 tasks — UI generation, form handling, API integration, responsive layout. The eval scores are the product manager's metrics. If a new model passes the evals with higher scores, it ships. No one reviews individual examples manually at scale.
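
The ship decision described above can be sketched as a small gate in Python. This is an illustration under stated assumptions, not Vercel's pipeline: the four task categories come from the talk, but the prompts, the substring scorers, the `/api/users` path, the stub models, and the "no category may regress" rule are all hypothetical.

```python
from statistics import mean
from typing import Callable

Model = Callable[[str], str]     # hypothetical model interface: prompt in, text out
Scorer = Callable[[str], float]  # returns a score in [0, 1]

# Task categories named in the talk; prompts and scorers are made up for illustration.
EVAL_SUITE: dict[str, tuple[str, Scorer]] = {
    "ui_generation": ("Build a pricing card.", lambda out: 1.0 if "<div" in out else 0.0),
    "form_handling": ("Build a login form.", lambda out: 1.0 if "<form" in out else 0.0),
    "api_integration": ("Fetch users from /api/users.", lambda out: 1.0 if "fetch(" in out else 0.0),
    "responsive_layout": ("Make a two-column layout responsive.", lambda out: 1.0 if "@media" in out else 0.0),
}


def run_suite(model: Model) -> dict[str, float]:
    """Run every task in the suite against one model and collect per-task scores."""
    return {name: scorer(model(prompt)) for name, (prompt, scorer) in EVAL_SUITE.items()}


def should_ship(baseline: dict[str, float], candidate: dict[str, float]) -> bool:
    """Assumed policy: ship only if the mean score improves and no task regresses."""
    no_regression = all(candidate[task] >= baseline[task] for task in baseline)
    return no_regression and mean(candidate.values()) > mean(baseline.values())


# Stub models standing in for the current and candidate releases.
def current_model(prompt: str) -> str:
    return "<div><form></form></div>"


def candidate_model(prompt: str) -> str:
    return (
        "<div><form></form></div>"
        "<script>fetch('/api/users')</script>"
        "<style>@media (max-width: 600px) {}</style>"
    )


baseline_scores = run_suite(current_model)
candidate_scores = run_suite(candidate_model)
ship = should_ship(baseline_scores, candidate_scores)
```

The talk only says a model ships when it passes with higher scores. The per-category regression check is an extra guard added here so that a gain in one area cannot hide a loss in another.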

The Bets That Didn't Pan Out

Guillermo is candid: Vercel built a lot of scaffolding to compensate for early model limitations (structured output parsing, output validation layers, fallback handlers), and it became dead weight as models improved. The lesson: build less scaffolding, design evals, and let the model catch up. Most workarounds for model limitations are temporary.

Notable Quotes

"v0 was ready for Opus 4.5 on day one — not because we knew it was coming, but because we don't build around what today's model can do. We build around what quality means."

"Agent-pilled isn't a vibe. It's a restructuring. We have engineers whose primary job is reviewing agent outputs and writing evals. That's new."

"The biggest waste we made was building scaffolding to compensate for model limitations. Most of it is gone now. Build less, eval more."
