
Datadog Feature Flags

by DataMarvin
9 hours ago

Feature flags are not a new idea. The basic concept — wrap a code change in a conditional, control who sees it — has been around for decades. What's changed is the operational context. As systems grow more complex and AI-powered features become standard, the gap between "the flag is on" and "the flag is working well" has become a real problem.

Datadog Feature Flags, which reached general availability in early 2026, takes a different approach from standalone flag tools. Its core proposition is simple: feature flags should be native to your observability layer, not separate from it.


1. The Problem With Standalone Feature Flags

A typical feature flag workflow with a standalone tool looks like this:

Engineer flips flag ON for 10% of users
        ↓
Waits and watches
        ↓
Checks flag tool dashboard: "flag is on for X users"
        ↓
Checks Datadog (or Grafana, or whatever) separately: "latency is up?"
        ↓
Tries to manually correlate: "was it the flag? was it something else?"

The problem is context switching. The flag state lives in one system; the performance data lives in another. When something goes wrong — and in production, something always eventually goes wrong — you're spending time connecting dots that should already be connected.

The blast radius of a bad release depends on how many users are exposed and how quickly you can detect the issue and respond. Every minute spent on manual correlation is another minute in which the bad experience keeps spreading.
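To make the manual correlation step concrete, here is a minimal Python sketch of what the engineer is effectively doing by hand: joining an export from the flag tool with an export from the monitoring tool. The data, the `request_id` join key, and the function name `mean_latency_by_flag` are all hypothetical and only serve as illustration.

```python
from statistics import mean

# Hypothetical export from the flag tool: request_id -> was the flag ON?
flag_exposure = {"r1": True, "r2": False, "r3": True, "r4": False}

# Hypothetical export from the monitoring tool: request_id -> latency in ms
latency_ms = {"r1": 310.0, "r2": 120.0, "r3": 290.0, "r4": 130.0, "r5": 125.0}


def mean_latency_by_flag(exposure, latency):
    """Join the two exports on request_id and average latency per flag state."""
    buckets = {True: [], False: []}
    for request_id, flag_on in exposure.items():
        # Requests that appear in only one export cannot be correlated.
        if request_id in latency:
            buckets[flag_on].append(latency[request_id])
    return {state: mean(values) for state, values in buckets.items() if values}


result = mean_latency_by_flag(flag_exposure, latency_ms)
print(result)  # {True: 300.0, False: 125.0}
```

Even in this toy version, the join only works if both systems share an identifier, and request `r5` silently drops out because the flag tool never recorded it.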


2. What Datadog Feature Flags Does Differently

Datadog Feature Flags is built natively into the Datadog observability stack. Flag evaluations are automatically embedded into APM traces, RUM sessions, and logs — so when you look at a trace showing elevated latency, you can see right there which flags were active for that request.

The core capability set breaks into three areas.

Observability-driven rollouts

Rather than rolling out a flag on a schedule and hoping for the best, Datadog Feature Flags lets you drive rollout decisions from live telemetry. You define health criteria using Datadog metrics — error rate, p95 latency, SLO status — and the platform uses those signals to gate progression.

A canary release might look like:

Start: 1% of traffic → flag ON
        ↓
Monitor: error rate < 0.5%, p95 latency < 200ms for 30 min
        ↓
If healthy → advance to 10% → 25% → 50% → 100%
If degraded → automatic rollback, no manual intervention needed

The rollout logic is data-driven, not schedule-driven. The 30-minute window in the example is only an observation period: the rollout advances because the health criteria held throughout that window, not because the time elapsed. You define what "healthy" means and let the system enforce it.
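The gating logic in the canary example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Datadog's implementation: `HealthSample`, `is_healthy`, and `next_rollout_percent` are hypothetical names, the thresholds and stages are the ones from the example above, and holding at the current stage when no data is available is an assumed policy.

```python
from dataclasses import dataclass

# Traffic percentages from the canary example above.
STAGES = [1, 10, 25, 50, 100]


@dataclass
class HealthSample:
    error_rate: float       # fraction of requests, e.g. 0.004 means 0.4%
    p95_latency_ms: float


def is_healthy(sample, max_error_rate=0.005, max_p95_ms=200.0):
    """Health criteria from the example: error rate < 0.5%, p95 < 200 ms."""
    return sample.error_rate < max_error_rate and sample.p95_latency_ms < max_p95_ms


def next_rollout_percent(current, window):
    """Decide the next traffic percentage from the samples in one observation window.

    Returns 0 to signal a rollback.
    """
    if not window or current not in STAGES:
        return current  # assumed policy: hold when there is nothing to judge
    if not all(is_healthy(s) for s in window):
        return 0  # any degraded sample triggers a rollback
    idx = STAGES.index(current)
    return STAGES[min(idx + 1, len(STAGES) - 1)]


healthy_window = [HealthSample(0.001, 150.0), HealthSample(0.002, 180.0)]
degraded_window = [HealthSample(0.001, 150.0), HealthSample(0.020, 450.0)]

print(next_rollout_percent(1, healthy_window))    # 10
print(next_rollout_percent(10, degraded_window))  # 0
```

A production version would also need to decide how to treat partial windows and noisy metrics. The sketch only shows that progression is a function of observed health rather than of elapsed time.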

Automated rollbacks and circuit breakers

The flip side of automated canary progression is automated rollback. You can configure circuit breakers that watch specific Datadog monitors or SLOs, and trigger an instant flag rollback the moment a threshold is crossed.

This matters because the alternative — someone on-call noticing a spike, filing a ticket, tracking down the responsible flag, manually rolling back — can take 20–30 minutes even in a well-run org. An automated circuit breaker can respond in seconds.
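The circuit breaker idea can be sketched as a small Python class. This is a hypothetical illustration: `FlagCircuitBreaker`, `disable_flag`, and the in-memory `flag_state` dictionary are invented for the example and do not correspond to any Datadog API. The 0.5% threshold is borrowed from the canary example.

```python
from typing import Callable


class FlagCircuitBreaker:
    """Trips once when a watched metric crosses a threshold, then stays open."""

    def __init__(self, threshold: float, rollback: Callable[[], None]):
        self.threshold = threshold
        self.rollback = rollback
        self.tripped = False

    def observe(self, value: float) -> bool:
        """Feed one metric reading; returns True if the breaker is (now) tripped."""
        if not self.tripped and value > self.threshold:
            self.tripped = True
            self.rollback()  # called exactly once, on the first breach
        return self.tripped


# Hypothetical in-memory flag store standing in for the real flag service.
flag_state = {"new-checkout": True}
rollback_calls = []


def disable_flag():
    flag_state["new-checkout"] = False
    rollback_calls.append("new-checkout")


breaker = FlagCircuitBreaker(threshold=0.005, rollback=disable_flag)
for error_rate in [0.001, 0.002, 0.020, 0.030]:
    breaker.observe(error_rate)

print(flag_state)           # {'new-checkout': False}
print(len(rollback_calls))  # 1
```

The breaker deliberately stays tripped after the first breach, so a metric that keeps fluctuating around the threshold does not toggle the flag back and forth.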

Flag lifecycle management and stale flag cleanup

One of the less glamorous but genuinely painful problems in any organization that uses feature flags at scale: stale flags. Flags that were supposed to be temporary linger in codebases for months or years, accumulating as quiet technical debt.

Datadog Feature Flags has built-in governance and lifecycle management, including automated detection and cleanup of stale flags. The Bits AI and MCP (Model Context Protocol) integrations can identify flags that haven't been evaluated recently and surface them for removal, which reduces the long-term maintenance burden.
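As a rough illustration of what "haven't been evaluated recently" could mean in practice, the following Python sketch flags anything idle for longer than a cutoff. The function name `find_stale_flags`, the 30-day default, and the sample data are assumptions made for the example. The source does not say which criteria Datadog actually uses.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def find_stale_flags(
    last_evaluated: dict[str, Optional[datetime]],
    now: datetime,
    max_idle: timedelta = timedelta(days=30),  # assumed cutoff
) -> list[str]:
    """Return flag keys never evaluated, or idle for longer than max_idle."""
    stale = []
    for key, timestamp in last_evaluated.items():
        if timestamp is None or now - timestamp > max_idle:
            stale.append(key)
    return sorted(stale)


now = datetime(2026, 3, 1, tzinfo=timezone.utc)
last_evaluated = {
    "new-checkout": now - timedelta(hours=2),
    "legacy-banner": now - timedelta(days=120),
    "beta-search": None,                       # never evaluated
    "dark-mode": now - timedelta(days=30),     # exactly at the cutoff, not stale
}

stale = find_stale_flags(last_evaluated, now)
print(stale)  # ['beta-search', 'legacy-banner']
```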


3. How It Connects to the Rest of Datadog

The key integration points:

APM (Application Performance Monitoring): Flag evaluations are attached to distributed traces. If a specific flag variant is causing elevated error rates or latency spikes, you can filter traces by flag state and see the difference immediately, without leaving APM.

RUM (Real User Monitoring): Flag states are embedded in user session data. You can see which flag variants a user was exposed to in a session replay, and correlate flag exposure with frontend signals like Core Web Vitals or rage clicks.

SLOs: You can tie flag rollout gating directly to SLO health. If the SLO for a downstream service starts degrading, the flag progression pauses or rolls back automatically.

Product Analytics: Once the capabilities of Eppo (the experimentation platform Datadog acquired in 2025) are more deeply integrated, this layer will add statistical experimentation on top, measuring not just whether the flag broke something (reliability) but whether it moved the business metrics you care about (conversion, retention, revenue).
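Filtering traces by flag state, as described for the APM integration, amounts to a group-by once the flag evaluation travels with the trace. The Python sketch below shows that idea on plain dictionaries. The trace shape (`flags`, `error`) and the function `error_rate_by_variant` are hypothetical and are not Datadog's trace schema.

```python
from collections import Counter


def error_rate_by_variant(traces, flag_key):
    """Group traces by the variant of one flag and compute the error rate per variant."""
    totals, errors = Counter(), Counter()
    for trace in traces:
        variant = trace["flags"].get(flag_key)
        if variant is None:
            continue  # the flag was not evaluated on this request
        totals[variant] += 1
        if trace["error"]:
            errors[variant] += 1
    return {variant: errors[variant] / totals[variant] for variant in totals}


# Hypothetical traces, each carrying the flag evaluations made during the request.
traces = [
    {"flags": {"new-checkout": "on"}, "error": True},
    {"flags": {"new-checkout": "on"}, "error": False},
    {"flags": {"new-checkout": "on"}, "error": True},
    {"flags": {"new-checkout": "on"}, "error": False},
    {"flags": {"new-checkout": "off"}, "error": False},
    {"flags": {"new-checkout": "off"}, "error": False},
    {"flags": {"new-checkout": "off"}, "error": True},
    {"flags": {"new-checkout": "off"}, "error": False},
    {"flags": {}, "error": True},
]

rates = error_rate_by_variant(traces, "new-checkout")
print(rates)  # {'on': 0.5, 'off': 0.25}
```

No join across systems is needed here because the flag state is already part of each record, which is the point of the native integration.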


4. Pricing Model

Feature Flags pricing is based on the monthly volume of flag configuration requests. The first 1 million requests per month are included at no cost. Above that, usage is billed per additional 1 million monthly flag configuration requests.
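The billing arithmetic can be sketched as follows. Two things here are assumptions rather than facts from Datadog's pricing: partial millions are rounded up to a whole billable unit, and `price_per_million` is a placeholder parameter, since no unit price is quoted here.

```python
FREE_TIER_REQUESTS = 1_000_000   # included at no cost, per the pricing description
UNIT_SIZE = 1_000_000            # billing unit: 1 million requests


def billable_units(monthly_requests: int) -> int:
    """Number of additional 1M-request units above the free tier.

    Assumption: a partial unit is rounded up to a full unit.
    """
    overage = max(0, monthly_requests - FREE_TIER_REQUESTS)
    return -(-overage // UNIT_SIZE)  # integer ceiling division


def estimate_monthly_cost(monthly_requests: int, price_per_million: float) -> float:
    """Estimated cost; price_per_million is a hypothetical placeholder value."""
    return billable_units(monthly_requests) * price_per_million


print(billable_units(800_000))    # 0
print(billable_units(1_000_000))  # 0
print(billable_units(3_500_000))  # 3
```

With 3.5 million requests, 2.5 million are above the free tier, which becomes 3 billable units under the round-up assumption.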

Feature Flags can be used independently and don't require licensing other Datadog products, but they leverage the APM and RUM SDKs for server-side and client-side implementations. Feature flag evaluations are automatically correlated with Datadog APM, RUM, and Product Analytics data — but you must already be a customer of the relevant Datadog products to use those correlations.

In practice: the flag management itself is cheap or free at low volumes, but the full value — observability correlation, automated rollbacks — only materializes if you're already using Datadog APM and RUM.


5. Datadog Feature Flags vs. Standalone Tools (GrowthBook, LaunchDarkly)

| | Datadog Feature Flags | Standalone tools (GrowthBook, LaunchDarkly) |
| --- | --- | --- |
| Core strength | Observability-native, tight telemetry correlation | Purpose-built flag management, broader SDK support |
| Automated rollbacks | Yes, triggered by live Datadog telemetry | Limited or manual |
| Statistical experimentation | Planned (via Eppo integration) | Varies; GrowthBook has strong stats engine |
| Stale flag management | Built-in (Bits AI) | Manual or limited |
| Pricing model | Usage-based (MFCRs) | Seat-based or usage-based |
| Best for | Teams already deeply on Datadog | Teams wanting best-in-class flag management independently |

The honest tradeoff: if your team is already using Datadog for observability, Feature Flags gives you something no standalone tool can replicate — flag state embedded natively in every trace and session. If you're not on Datadog, or if you need features like warehouse-native experimentation, a standalone tool may still be the better fit.


6. What This Signals About Where Feature Flags Are Heading

Datadog's move into feature flags is part of a broader convergence. For years, observability and experimentation lived in separate worlds — one owned by platform/SRE teams, the other by product/data teams. The infrastructure needed to connect them (flag state → telemetry correlation, automated rollback, A/B statistical analysis) existed only at large companies that built it internally.

What's happening now — with Datadog Feature Flags natively connecting to APM and RUM, and Eppo's statistical engine being integrated over time — is that this full-stack capability is becoming available as a commercial product. The question for most teams is no longer "can we afford to build this?" but "which platform do we consolidate on?"


Takeaway

Datadog Feature Flags is not just another flag management tool. Its differentiated value is the native connection between flag state and observability data — making automated, data-driven rollouts and rollbacks practical for teams that are already on the Datadog stack.

One-sentence summary:

Datadog Feature Flags turns feature rollouts from a manual, cross-tool coordination problem into an automated, telemetry-driven workflow — for teams already living in Datadog.
