A feature flag created 18 months ago was still in the codebase. When the flag provider had a timeout, the flag evaluated to its default value — which no longer matched the state of the system. The result was a data corruption bug that took three days to fully remediate.
Feature flags are treated as temporary by design but permanent by neglect. This flag was supposed to be removed after a two-week rollout 18 months ago. Instead, it became invisible load-bearing infrastructure — and when it failed, it failed silently into a state that corrupted downstream data.
You're the tech lead at a fintech company. An alert fires: financial transaction records don't match between two systems. Investigation reveals a feature flag that's been in the codebase for 18 months is suddenly evaluating to its default value because the flag provider is timing out. The default routes transactions through a deprecated code path that writes data in an old format. It's been running this way for 6 hours. What do you do?
No hints. Just judgment.
Hardcoding the flag and deploying immediately is the right instinct — stop the bleeding first. But deploying without establishing the corruption boundary first means the remediation team has no clean scope for recovery. They end up diffing entire datasets instead of querying a precise time window. Two minutes of scoping before the fix saves days of recovery work after.