A Feature Flag We Forgot About Caused a Production Incident
A stale flag's default value routes financial transactions through deprecated code, corrupting data for 6 hours.
Situation
You're the tech lead at a fintech company. An alert fires: financial transaction records don't match between two systems. Investigation reveals an 18-month-old feature flag is evaluating to its default value because the flag provider is timing out, routing transactions through deprecated code that writes data in an old format.
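The failure mode described above can be sketched in a few lines. This is a minimal illustration, not the actual system: the flag name `use_new_ledger_writer`, the `FlagProviderTimeout` exception, the provider stubs, and the `v1`/`v2` record shapes are all hypothetical, chosen only to show how a timeout plus a stale default silently selects the deprecated path.

```python
from typing import Callable


class FlagProviderTimeout(Exception):
    """Hypothetical error raised when the flag provider does not answer in time."""


def evaluate_flag(fetch: Callable[[str], bool], name: str, default: bool) -> bool:
    """Return the provider's value, or `default` if the provider times out."""
    try:
        return fetch(name)
    except FlagProviderTimeout:
        # No error surfaces to the caller: the stale default is used silently.
        return default


def write_transaction(amount_cents: int, use_new_writer: bool) -> dict:
    """Build a transaction record in the new or the deprecated format (both assumed)."""
    if use_new_writer:
        return {"schema": "v2", "amount_cents": amount_cents}
    # Deprecated path: old record shape that downstream systems no longer expect.
    return {"schema": "v1", "amount": amount_cents / 100}


def healthy_provider(name: str) -> bool:
    """Stub provider: the flag has been on for everyone for a long time."""
    return True


def timing_out_provider(name: str) -> bool:
    """Stub provider: simulates the outage."""
    raise FlagProviderTimeout(name)


# The default was set when the flag was created and never revisited.
STALE_DEFAULT = False

normal = write_transaction(
    1250, evaluate_flag(healthy_provider, "use_new_ledger_writer", STALE_DEFAULT)
)
during_outage = write_transaction(
    1250, evaluate_flag(timing_out_provider, "use_new_ledger_writer", STALE_DEFAULT)
)
```

In this sketch the same call site produces a `v2` record while the provider is healthy and a `v1` record during the outage, with no exception raised, which is why the mismatch only shows up later when two systems are compared.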
Stakes
- Financial transaction data corruption spreading for 6 hours
- Flag provider outage causing unexpected default evaluation
- Deprecated code path writing incompatible data formats
The incident is still spreading: data corruption has been running for 6 hours. What's your immediate priority?