How Datadog Built a Universal Machine Tool for Claude Code

Channel Anthropic
Date May 6, 2026
Duration 31 min
Tags Datadog, Claude Code, Enterprise, Tool Framework, Engineering Platform
TL;DR

Datadog reached 90% AI coding tool adoption across its engineering org in four months, with Claude Code accounting for two-thirds of that usage. But as sessions grew more ambitious, the tools generated per session became unmaintainable. Sesh Nalla (VP Engineering) explains how Datadog built Temper, a constrained framework that makes tools secure, reusable, and composable across the entire org.

Summary

The Adoption Curve

Sesh Nalla opens with a remarkable stat: 90% of Datadog's engineers adopted AI coding tools for production work in the four months after launch, with Claude Code driving two-thirds of that usage. This wasn't mandated — it was demand-driven. Engineers saw peers shipping faster and wanted the same. The pace of adoption created a new problem: the tools being generated to support those sessions.

The Tool Sprawl Problem

As Claude Code sessions became more ambitious — spanning multiple services, requiring Datadog-specific API access, integrating with internal monitoring — engineers started generating custom tools per session. A tool to query the metrics API. A tool to parse log formats. A tool to trigger deployments. Each session generated its own version. None composed. None were tested. Several had security issues — overly broad API access, credentials in tool code, no audit logging.

Temper: The Solution

Temper is a constrained framework for building Claude Code tools that are secure, reusable, and composable. One constraint Nalla states explicitly is least privilege: the framework rejects any tool that violates it.
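
To make the idea of a framework-level constraint concrete, here is a minimal sketch of a validator that refuses over-privileged tools. It is not Temper's actual API, which the talk summary does not show; `ToolSpec`, `ToolRejected`, `validate`, the scope strings, and the secret markers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


class ToolRejected(Exception):
    """Raised when a tool definition breaks a framework constraint."""


@dataclass(frozen=True)
class ToolSpec:
    # All fields are hypothetical; they only illustrate the idea.
    name: str
    requested_scopes: frozenset  # permissions the tool asks for
    used_scopes: frozenset       # permissions its operations actually need
    source: str = ""             # tool source text, scanned for inlined secrets


# Assumed markers of hard-coded credentials; a real scanner would be stricter.
SECRET_MARKERS = ("api_key=", "password=", "secret=")


def validate(spec: ToolSpec) -> None:
    """Reject specs that over-request access or embed credentials."""
    if "*" in spec.requested_scopes:
        raise ToolRejected(f"{spec.name}: wildcard scope is not allowed")
    excess = spec.requested_scopes - spec.used_scopes
    if excess:
        raise ToolRejected(f"{spec.name}: unused scopes requested: {sorted(excess)}")
    lowered = spec.source.lower()
    for marker in SECRET_MARKERS:
        if marker in lowered:
            raise ToolRejected(f"{spec.name}: credential found in tool code")


# A tool that asks only for what it uses passes silently.
validate(ToolSpec("query_metrics",
                  frozenset({"metrics:read"}),
                  frozenset({"metrics:read"})))

# A tool that asks for more than it uses is rejected before it can ship.
try:
    validate(ToolSpec("trigger_deploy",
                      frozenset({"deploy:write", "metrics:read"}),
                      frozenset({"deploy:write"})))
except ToolRejected as err:
    print(err)
```

The design point is that the check runs at registration time, so an over-scoped tool never reaches a session, rather than being caught by a later review.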

The Compound Effect

The breakthrough insight behind Temper is that tools should compound. A one-off session tool is sunk cost; a Temper tool is an investment. Every new tool added to the library makes every future Claude Code session at Datadog more capable. After six months, Datadog engineers starting a new session can invoke 200+ battle-tested, security-reviewed tools covering their entire internal stack.
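
The compounding idea can be illustrated with a small shared registry in which tools are registered once, invoked by name, and combined into new tools. This is a sketch under assumptions, not Datadog's implementation: `ToolLibrary`, its methods, and the example tool names are hypothetical.

```python
from typing import Callable, Dict


class ToolLibrary:
    """Hypothetical shared registry: register once, reuse by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}
        self.invocations: Dict[str, int] = {}

    def register(self, name: str, fn: Callable) -> None:
        # Refusing duplicates nudges sessions toward reuse over regeneration.
        if name in self._tools:
            raise ValueError(f"tool {name!r} already exists; reuse it instead")
        self._tools[name] = fn
        self.invocations[name] = 0

    def invoke(self, name: str, *args, **kwargs):
        # Count every call so reuse can be measured later.
        self.invocations[name] += 1
        return self._tools[name](*args, **kwargs)

    def compose(self, name: str, *steps: str) -> None:
        """Register a new tool that pipes each step's output into the next."""
        def pipeline(value):
            for step in steps:
                value = self.invoke(step, value)
            return value
        self.register(name, pipeline)


library = ToolLibrary()
# Parse "key=value key=value" log lines into a dict.
library.register("parse_log",
                 lambda line: dict(kv.split("=", 1) for kv in line.split()))
library.register("is_error", lambda record: record.get("level") == "error")
# The composed tool is built entirely from existing library tools.
library.compose("log_line_is_error", "parse_log", "is_error")

print(library.invoke("log_line_is_error", "level=error service=web"))  # True
```

Each composed tool adds to the library without duplicating the pieces it is built from, which is the sense in which every addition makes later sessions more capable.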

Results

Since deploying Temper, Datadog has seen session length increase (engineers are taking on more ambitious tasks), Claude Code-related security incidents fall to zero (down from several per month with ad-hoc tools), and tool reuse reach 78% (tools are being used rather than regenerated). Engineering teams are also proactively contributing tools to the shared library.
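
The summary does not say how the reuse rate is defined. One plausible reading, assumed here, is the share of tool invocations served by an existing library tool rather than a newly generated one. The function name and the counts below are illustrative only and are not Datadog's data.

```python
def reuse_rate(reused: int, regenerated: int) -> float:
    """Fraction of tool invocations served by an existing library tool.

    Assumed definition: reused / (reused + regenerated).
    """
    total = reused + regenerated
    if total == 0:
        raise ValueError("no tool invocations recorded")
    return reused / total


# Made-up counts chosen to reproduce a 78% rate.
print(f"{reuse_rate(reused=78, regenerated=22):.0%}")  # 78%
```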

Notable Quotes

"We had 90% adoption in four months. That's incredible. And it created a tool sprawl problem we didn't anticipate — because 90% of engineers generating tools means a lot of tools, and most of them were one-offs."

"The key insight behind Temper is that tools should be an investment, not a cost. If you generate a tool and use it once, you've wasted it. If you generate a tool and it gets used 10,000 times across the org, you've compounded."

"Security isn't a post-hoc layer. It's a constraint at the design level. Temper makes it impossible to ship a tool that violates least-privilege — the framework rejects it."
