Rotating API Secrets Without Downtime: A Practical Approach
Hardcoded API keys and secrets that never rotate are a ticking time bomb. Here's how to set up automated secret rotation without bringing your application down.
Jean-Pierre Broeders
Freelance DevOps Engineer
Somewhere in a config file sits an API key that's three years old. Nobody remembers who created it. Sound familiar? Then this article is worth reading.
Hardcoded credentials are a leading cause of API-related data breaches. Not because developers don't know better, but because secret rotation is surprisingly tricky to get right in practice. The fear that "something will break" keeps keys unchanged for months, sometimes years.
The Dual-Key Pattern
The core of zero-downtime rotation comes down to one simple principle: always have two valid keys at the same time. One active, one freshly created or about to expire.
The flow:
1. Generate a new key (key B) while the current one (key A) is still valid
2. Deploy key B to all consumers
3. Wait until all services have switched to key B
4. Revoke key A
Sounds straightforward. The real complexity lives in steps 2 and 3: how do you know for sure that every service has switched over?
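One way to answer that question is to look at which key each request was actually authenticated with. The sketch below assumes a hypothetical usage log with one line per request (`<unix-timestamp> <key-id> <consumer-name>`); the log format and both function names are illustrative and not part of any real service.

```bash
#!/usr/bin/env bash
# Sketch only. Assumes a hypothetical usage log, one request per line:
#   <unix-timestamp> <key-id> <consumer-name>

# Print each consumer that used key $1 at or after unix time $2 (log file: $3).
consumers_still_using() {
  local key_id="$1" since="$2" log_file="$3"
  awk -v key="$key_id" -v since="$since" \
    '$2 == key && $1 >= since { print $3 }' "$log_file" | sort -u
}

# Exit status 0 only if no consumer used the key since the cutoff.
safe_to_revoke() {
  [ -z "$(consumers_still_using "$1" "$2" "$3")" ]
}
```

For example, `safe_to_revoke key-A "$(( $(date +%s) - 3600 ))" usage.log` succeeds only when key A has gone unused for an hour. Silence in the log only proves that active consumers have switched; a job that runs once a month would not show up in a one-hour window.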
Implementation with Azure Key Vault
Azure Key Vault supports secret versioning out-of-the-box, making it a natural place to orchestrate rotation.
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

// Fetch the latest version by name only, with no hardcoded version ID
var client = new SecretClient(
    new Uri("https://my-vault.vault.azure.net/"),
    new DefaultAzureCredential());

KeyVaultSecret secret = await client.GetSecretAsync("external-api-key");
string apiKey = secret.Value;
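The trick is not to cache the secret indefinitely. A TTL of 5 to 15 minutes strikes a good balance between performance and rotation speed. The TTL also sets a lower bound on the overlap window: the old key has to stay valid for at least one full TTL after the new version is stored, because a consumer may have refreshed its cache just before that moment.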
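using System.Collections.Concurrent;
using Azure.Security.KeyVault.Secrets;

public class RotatingSecretProvider
{
    private readonly SecretClient _client;
    private readonly TimeSpan _cacheTtl = TimeSpan.FromMinutes(10);

    // Cache per secret name, so different secrets never overwrite each other
    private readonly ConcurrentDictionary<string, (string Value, DateTime Expiry)> _cache = new();

    public RotatingSecretProvider(SecretClient client) => _client = client;

    public async Task<string> GetSecretAsync(string name)
    {
        if (_cache.TryGetValue(name, out var entry) && DateTime.UtcNow < entry.Expiry)
            return entry.Value;

        // Cache miss or expired entry: fetch the latest version from Key Vault
        KeyVaultSecret secret = await _client.GetSecretAsync(name);
        _cache[name] = (secret.Value, DateTime.UtcNow.Add(_cacheTtl));
        return secret.Value;
    }
}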
Automated Rotation with an Azure Function
Manual rotation beats no rotation, but it doesn't scale. An Azure Function with a timer trigger can fully automate this.
[Function("RotateApiKeys")]
public async Task Run(
    [TimerTrigger("0 0 2 1 * *")] TimerInfo timer, // 02:00 on the 1st of every month
    FunctionContext context)
{
    var logger = context.GetLogger("RotateApiKeys");

    // Step 1: Generate new key at the external service
    var newKey = await _externalService.RegenerateSecondaryKeyAsync();

    // Step 2: Store in Key Vault as a new version
    await _secretClient.SetSecretAsync("external-api-key", newKey);

    // Step 3: Wait for consumers to pick up the new key (longer than the 10-minute cache TTL)
    await Task.Delay(TimeSpan.FromMinutes(20));

    // Step 4: Revoke the old key
    await _externalService.RevokeOldKeyAsync();

    logger.LogInformation("Key rotation completed: {Time}", DateTime.UtcNow);
}
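That Task.Delay is the weak spot. On the Consumption plan a function times out after 5 minutes by default (10 at most), so a 20-minute wait never finishes there, and a restart in the middle of the delay leaves the old key unrevoked. A more robust approach splits rotation into two phases: one function that generates the new key, and a second one that revokes the old key an hour later.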
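The two-phase split can be sketched independently of any particular scheduler. The example below assumes a hypothetical state file that phase one writes after storing the new key and that phase two checks before revoking; the file, the function names and the one-hour grace period are illustrative.

```bash
#!/usr/bin/env bash
# Sketch only: the state file and function names are hypothetical.

GRACE_SECONDS=3600   # how long the old key stays valid after rotation

# Phase 1: call after the new key has been stored in the vault.
# Records when the overlap window started ($1: state file, $2: unix time).
record_rotation() {
  printf '%s\n' "$2" > "$1"
}

# Phase 2: exit status 0 only if a rotation is pending and the grace
# period has fully elapsed ($1: state file, $2: current unix time).
revoke_due() {
  local state_file="$1" now="$2" started
  [ -f "$state_file" ] || return 1   # nothing pending
  started="$(cat "$state_file")"
  [ $(( now - started )) -ge "$GRACE_SECONDS" ]
}

# Call after the old key has been revoked, so phase 2 does not fire twice.
clear_rotation() {
  rm -f "$1"
}
```

Because the pending state survives a restart, the revoke step can run on a frequent schedule and simply do nothing until `revoke_due` succeeds. The same idea carries over to two timer-triggered functions, provided the timestamp is stored somewhere durable.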
Common Mistakes
A few patterns that show up regularly in codebases:
| Mistake | Why It Fails | Fix |
|---|---|---|
| Secret in environment variable at deploy time | Stays until next deploy, sometimes weeks | Fetch at runtime from vault with short TTL |
| One key for all environments | Compromise in staging = compromise in production | Separate keys per environment, rotate independently |
| Rotation without monitoring | Silent failures until a customer calls | Alert on 401/403 spikes after rotation |
| No fallback when vault is down | Vault outage = full application outage | Keep serving the last known good secret from cache while the vault is unreachable |
Monitoring After Rotation
This part gets forgotten a lot. After every rotation, active monitoring should verify everything still works. Simplest approach: a health check endpoint that makes a lightweight API call with the current credentials.
# Simple health check after rotation
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $(az keyvault secret show \
    --vault-name my-vault \
    --name external-api-key \
    --query value -o tsv)" \
  https://api.external-service.com/health
If this returns a 401 within 30 minutes of rotation, something went wrong and an automatic rollback should kick in.
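How the status code translates into an action can be sketched as a small decision function. The mapping is an assumption made for illustration: treating 401 and 403 as a rollback signal follows the text above, while retrying on 5xx or on no response at all (curl reports `000` in that case) is a choice made for this sketch. The action names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the action names are hypothetical.

# Map the HTTP status from the health check ($1) to a follow-up action.
action_for_status() {
  case "$1" in
    2??)      echo "ok" ;;          # credentials accepted
    401|403)  echo "rollback" ;;    # new key rejected: restore the previous one
    5??|000)  echo "retry" ;;       # service or network problem, not a key problem
    *)        echo "investigate" ;; # anything else needs a human look
  esac
}
```

Separating "the key was rejected" from "the service is unreachable" matters here: rolling back on a 503 would undo a perfectly good rotation.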
Git Pre-commit Hooks as a Safety Net
Even with the best rotation strategy, a key sometimes leaks into code. A pre-commit hook prevents secrets from ever reaching the repository.
#!/bin/bash
# .git/hooks/pre-commit (must be executable: chmod +x .git/hooks/pre-commit)
# Simple regex check for common secret patterns
PATTERNS="(ghp_[a-zA-Z0-9]{36}|AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{48})"

# Only inspect added lines, so a commit that removes a secret is not blocked
if git diff --cached --diff-filter=ACM -U0 | grep -E '^\+' | grep -qE "$PATTERNS"; then
    echo "❌ Possible secret found in staged files!"
    echo "Use a vault or environment variable instead of hardcoded keys."
    exit 1
fi
Tools like gitleaks or trufflehog do this more thoroughly, but a simple regex hook already catches the most recognizable token formats.
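Before relying on such a hook, it helps to test the pattern against sample input. The sketch below reuses the three patterns from the hook and reads a diff from standard input, so it can be exercised without a repository. The function name is hypothetical, and the sample token in the usage note is the placeholder access key from the AWS documentation, not a real credential.

```bash
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.

PATTERNS='(ghp_[a-zA-Z0-9]{36}|AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{48})'

# Read a unified diff on stdin; exit status 0 if an added line looks like a secret.
# Lines starting with "+++ " are file headers, not content, and are skipped.
diff_adds_secret() {
  grep -E '^\+' | grep -v '^+++ ' | grep -qE "$PATTERNS"
}
```

Piping `+aws_key = "AKIAIOSFODNN7EXAMPLE"` into `diff_adds_secret` succeeds, while the same text on a removed line (prefixed with `-`) or an unchanged context line does not. That way, cleaning a leaked key out of a file is never blocked by the check.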
Wrapping Up
Secret rotation doesn't have to be scary. The dual-key pattern prevents downtime, a vault with versioning makes it manageable, and automation via a timer-triggered function ensures it actually happens. Combine that with post-rotation monitoring and pre-commit hooks as a safety net, and the risk of leaked or stale credentials drops dramatically.
Start small. Pick the key that hasn't been changed the longest — probably everyone on the team knows which one that is — and set up rotation for it. The rest follows naturally.
