Tags: Architecture, incident-response, feature-flags, data-corruption

A Feature Flag We Forgot About Caused a Production Incident

A stale flag's default value routes financial transactions through deprecated code, corrupting data for 6 hours.

Situation

You're the tech lead at a fintech company. An alert fires: financial transaction records don't match between two systems. Investigation reveals that an 18-month-old feature flag is evaluating to its default value because the flag provider is timing out. That default routes transactions through deprecated code that writes data in an old format.
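
The failure mode above can be sketched in a few lines. This is a minimal illustration, not the actual system: the flag name, its default, the record formats, and the provider functions are all hypothetical, since the scenario does not name them. It shows how a flag client that falls back to a default on timeout can silently switch the write path.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
import time

# Hypothetical flag name and default; the scenario does not specify either.
FLAG_NAME = "use_new_ledger_writer"
FLAG_DEFAULT = False  # chosen long ago, when the old path was still the safe choice

_executor = ThreadPoolExecutor(max_workers=4)


def evaluate_flag(fetch, name, default, timeout_s=0.2):
    """Return the provider's value, or `default` if the provider is slow or unreachable."""
    future = _executor.submit(fetch, name)
    try:
        return future.result(timeout=timeout_s)
    except (FutureTimeout, ConnectionError):
        # No error surfaces to the caller: the default is returned as if it were a real answer.
        return default


def write_transaction(txn, fetch):
    """Build the record to persist, choosing the code path from the flag."""
    if evaluate_flag(fetch, FLAG_NAME, FLAG_DEFAULT):
        # Current path (hypothetical format): integer minor units plus currency.
        return {"format": "v2", "amount_minor": txn["amount_minor"], "currency": txn["currency"]}
    # Deprecated path (hypothetical format): float amount, no currency field.
    return {"format": "v1", "amount": txn["amount_minor"] / 100}


def healthy_provider(name):
    return True  # flag has been on in production for a long time


def slow_provider(name):
    time.sleep(1.0)  # simulates the provider timing out
    return True


txn = {"amount_minor": 1999, "currency": "USD"}
healthy_record = write_transaction(txn, healthy_provider)
outage_record = write_transaction(txn, slow_provider)
print(healthy_record["format"], outage_record["format"])
```

With the healthy provider the record takes the current path; during the simulated outage the same call takes the deprecated path without raising anything, which is why the first visible symptom is a data mismatch rather than an error.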

Stakes

  • Financial transaction data corruption spreading for 6 hours
  • Flag provider outage causing unexpected default evaluation
  • Deprecated code path writing incompatible data formats

The incident is still spreading: data corruption has been running for 6 hours. What's your immediate priority?