Predictive Autoscaling: Scale Your Database Before Demand Spikes
Reactive autoscaling has a fundamental problem: it waits for something to go wrong. Your database hits 95% CPU, the autoscaler wakes up, requests a resize, and for the next few minutes your application eats latency while the new resources come online. If your traffic is predictable (and most production traffic is), this delay is avoidable.
FoundryDB's predictive autoscaling engine learns your workload's seasonal patterns and scales your database before demand spikes arrive. It combines real-time metric thresholds with historical baselines, anomaly detection, and configurable cost limits so you stay fast without overspending.
How It Works
The autoscaling system operates at two levels: a reactive policy that you configure per service via the API, and a predictive engine that runs continuously across all services.
Reactive Autoscale Policies
Every FoundryDB service supports a metric-based autoscale policy. You define which metrics to watch, what thresholds trigger a scale-up or scale-down, and the boundaries the autoscaler must stay within.
The four supported metrics are:
| Metric | What it tracks |
|---|---|
| cpu_percent | CPU utilization across the primary node |
| memory_percent | Memory pressure from buffers, caches, and active queries |
| connections_percent | Percentage of max connections in use |
| disk_percent | Storage utilization on the data volume |
Each metric has an independent threshold_up (trigger scale-up), threshold_down (trigger scale-down), and duration_seconds (how long the metric must sustain that level before acting). This prevents transient spikes from causing unnecessary tier changes.
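As a rough illustration, the duration check could be sketched like this. The function name, sampling interval, and return values are assumptions for the sketch, not FoundryDB internals:

```python
# Illustrative sketch: a metric must stay past its threshold for the
# whole duration window before the autoscaler acts on it.

def evaluate_metric(samples, threshold_up, threshold_down,
                    duration_seconds, interval_seconds=60):
    """Return 'up', 'down', or None for a series of metric readings.

    `samples` holds the most recent readings, oldest first, taken every
    `interval_seconds`. Only the last `duration_seconds` worth counts.
    """
    window = max(1, duration_seconds // interval_seconds)
    recent = samples[-window:]
    if len(recent) < window:
        return None  # not enough history yet
    if all(s > threshold_up for s in recent):
        return "up"
    if all(s < threshold_down for s in recent):
        return "down"
    return None  # transient spike or normal range: no action

# A 5-minute duration at 60 s sampling needs 5 consecutive readings.
print(evaluate_metric([82, 85, 90, 88, 93], 80, 30, 300))  # up
print(evaluate_metric([82, 85, 25, 88, 93], 80, 30, 300))  # None
```

The second call shows the noise filter at work: one reading back inside the normal range resets the decision to a no-op.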
Predictive Engine
The predictive layer builds a seasonal baseline from your workload history. It looks at the same hour of the same day of the week over the past 7 days to compute a mean and standard deviation for CPU utilization. When current CPU deviates significantly from this baseline (measured by z-score), the engine acts preemptively rather than waiting for a hard threshold breach.
Three triggers drive predictive decisions:
- Anomaly spike. A z-score above 2.5 combined with CPU already above 75% triggers an aggressive scale-up of 2 tiers. This handles sudden, unexpected load that the seasonal model didn't predict.
- Sustained growth. Three consecutive metric windows (each 15 minutes) above the scale-up threshold trigger a 1-tier increase. This catches gradual load increases like organic traffic growth.
- Sustained low. Seven consecutive windows below 30% CPU trigger a 1-tier decrease. The higher bar for scale-down (7 windows vs. 3) prevents premature downsizing during temporary lulls.
Every decision (including no-ops) is recorded to an audit table for debugging, reporting, and cooldown enforcement.
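A minimal sketch of this decision logic, assuming the names and numbers from the triggers above (the engine's actual internals are not shown in this document):

```python
import statistics

# Illustrative only: mirrors the three predictive triggers described above.
Z_SPIKE, CPU_FLOOR = 2.5, 75.0    # anomaly-spike gate
HIGH_WINDOWS, LOW_WINDOWS = 3, 7  # sustained growth / sustained low
LOW_CPU = 30.0

def predictive_decision(current_cpu, baseline, windows, threshold_up=80.0):
    """Return a tier delta: +2 spike, +1 growth, -1 low, 0 no-op.

    baseline: CPU samples for the same hour/day from workload history.
    windows:  mean CPU of recent 15-minute windows, oldest first.
    """
    spread = statistics.stdev(baseline) or 1e-9  # guard a flat baseline
    z = (current_cpu - statistics.mean(baseline)) / spread
    if z > Z_SPIKE and current_cpu > CPU_FLOOR:
        return +2  # sudden load the seasonal model didn't predict
    if len(windows) >= HIGH_WINDOWS and all(
            w > threshold_up for w in windows[-HIGH_WINDOWS:]):
        return +1  # gradual growth
    if len(windows) >= LOW_WINDOWS and all(
            w < LOW_CPU for w in windows[-LOW_WINDOWS:]):
        return -1  # sustained low traffic
    return 0       # no-op (still written to the audit table)
```

Note the ordering: the anomaly check runs first, so a genuine spike scales 2 tiers immediately rather than waiting for three high windows to accumulate.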
Configuring an Autoscale Policy
Set up a CPU-based autoscale policy using the REST API:
curl -u user:password -X PUT \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy \
-H "Content-Type: application/json" \
-d '{
"enabled": true,
"metrics": [
{
"metric": "cpu_percent",
"threshold_up": 80,
"threshold_down": 30,
"duration_seconds": 300
},
{
"metric": "memory_percent",
"threshold_up": 85,
"threshold_down": 40,
"duration_seconds": 300
}
],
"min_plan": "tier-2",
"max_plan": "tier-8",
"cooldown_seconds": 300
}'
This tells the autoscaler: scale up when CPU exceeds 80% for 5 minutes, scale down when it stays below 30% for 5 minutes, and never go below tier-2 (2 vCPU, 4 GB) or above tier-8 (8 vCPU, 16 GB). After any scaling action, wait at least 5 minutes before making another decision.
The min_plan and max_plan boundaries are your cost controls. Setting max_plan to tier-8 means your bill will never exceed the hourly rate for that tier, regardless of what the autoscaler recommends.
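In effect, the boundary enforcement amounts to a clamp. This sketch assumes a simple ordered tier-N naming scheme for illustration; the function is not part of the FoundryDB API:

```python
# Hypothetical sketch of how min_plan/max_plan bound any recommendation.
TIERS = [f"tier-{n}" for n in range(1, 13)]

def clamp_plan(recommended, min_plan, max_plan):
    """Whatever tier the autoscaler recommends, the result stays inside
    [min_plan, max_plan] -- this is the hard cost ceiling."""
    lo, hi = TIERS.index(min_plan), TIERS.index(max_plan)
    rec = TIERS.index(recommended)
    return TIERS[max(lo, min(rec, hi))]

print(clamp_plan("tier-10", "tier-2", "tier-8"))  # tier-8: capped by max_plan
```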
Storage Autoscaling
Storage autoscaling works independently from compute autoscaling. Unlike compute, storage can only grow (you cannot shrink a disk without risking data loss). Configure it alongside your compute policy:
curl -u user:password -X PUT \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy \
-H "Content-Type: application/json" \
-d '{
"enabled": true,
"storage_auto_scale": {
"enabled": true,
"threshold_percent": 85,
"increment_gb": 50,
"max_size_gb": 1000
}
}'
When disk usage crosses 85%, the autoscaler adds 50 GB. It repeats as needed until the disk reaches the 1000 GB cap. The storage autoscaler checks every 5 minutes, with a 60-minute cooldown between expansions.
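The growth rule can be sketched as a single check per evaluation cycle. Parameter names match the policy fields above; the function itself is illustrative, not the actual implementation:

```python
# One storage-autoscaler check: grow by increment_gb when usage crosses
# threshold_percent, never exceeding max_size_gb. Storage only grows.

def next_storage_size(size_gb, used_gb, threshold_percent=85,
                      increment_gb=50, max_size_gb=1000):
    """Return the disk size after one autoscaler evaluation."""
    if size_gb >= max_size_gb:
        return size_gb                      # cap reached: no growth
    if used_gb / size_gb * 100 < threshold_percent:
        return size_gb                      # below threshold: no-op
    return min(size_gb + increment_gb, max_size_gb)

print(next_storage_size(500, 430))  # 86% used -> grows to 550
print(next_storage_size(980, 900))  # near cap -> clamped to 1000
```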
Default values if you enable storage autoscaling without specifying parameters:
| Parameter | Default |
|---|---|
| threshold_percent | 80% |
| increment_gb | 10 GB |
| max_size_gb | 500 GB |
| cooldown_minutes | 60 |
Scaling History and Audit Trail
Every scaling operation records who triggered it: user (manual), auto_scale (reactive policy), or system (predictive engine). You can query your service's scaling history to see what happened, when, and why:
curl -u user:password \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy
The response includes last_scale_at so you can see when the most recent action occurred. The predictive engine also maintains a separate audit table that logs the z-score, seasonal baseline, current CPU average, and confidence level for every decision, including decisions where it evaluated your service and chose not to scale.
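If you script against this endpoint, a small helper can turn last_scale_at into an elapsed time. The ISO-8601 timestamp format (with a trailing Z) is an assumption here, as is the helper itself:

```python
from datetime import datetime, timezone

def minutes_since_last_scale(policy, now=None):
    """Minutes elapsed since the policy's last_scale_at, or None if the
    service has never been scaled. Assumes an ISO-8601 UTC timestamp."""
    ts = policy.get("last_scale_at")
    if ts is None:
        return None
    last = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - last).total_seconds() / 60
```

A check like this is handy for confirming that a cooldown window has elapsed before expecting another scaling action.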
Reactive vs. Predictive: When Each Fires
The two systems complement each other. Here is how they divide responsibility:
| Scenario | Reactive | Predictive |
|---|---|---|
| Monday morning traffic ramp-up | Fires after CPU crosses 80% | Fires before, based on last Monday's pattern |
| Unexpected viral traffic spike | Fires after 5 min sustained threshold | Fires immediately via anomaly detection (z-score > 2.5) |
| Gradual organic growth over weeks | Fires when thresholds breach | Fires on sustained growth (3 consecutive high windows) |
| Weekend low traffic | Scales down after 5 min below threshold | Scales down after 7 consecutive low windows (~105 min) |
| One-off batch job spike | May fire if sustained > duration_seconds | Ignores if within seasonal norms |
The predictive engine is deliberately conservative. It requires a z-score above 2.5 for anomaly spikes (roughly 99th percentile deviation), 3 sustained high windows for growth, and 7 sustained low windows for scale-down. A 6-hour cooldown prevents thrashing from rapid successive decisions.
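The "roughly 99th percentile" framing can be checked against the standard normal CDF, which puts a z-score of 2.5 at about the 99.4th percentile:

```python
from statistics import NormalDist

# Fraction of a normal distribution falling below a z-score of 2.5.
percentile = NormalDist().cdf(2.5) * 100
print(round(percentile, 1))  # 99.4
```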
Dry-Run Mode
Before trusting the autoscaler with production changes, enable dry-run mode. In this mode, the predictive engine evaluates your workload and records all decisions to the audit log, but does not execute any tier changes. Review the decisions over a few days to verify the engine's judgment before going live.
Disabling Autoscaling
Remove the autoscale policy entirely:
curl -u user:password -X DELETE \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy
The response confirms the timestamp when autoscaling was disabled. All existing resources remain at their current tier and size.
Best Practices
Start with reactive, then layer predictive. Set conservative thresholds (80% up, 30% down) and monitor scaling events for a week. Once you trust the reactive behavior, enable the predictive engine for its look-ahead advantage.
Set meaningful plan boundaries. Your min_plan should be the smallest tier that handles your baseline traffic. Your max_plan should be the largest tier your budget allows. The gap between them is the autoscaler's operating range.
Use duration to filter noise. A duration_seconds of 300 (5 minutes) prevents one-off query spikes from triggering a resize. Reduce it to 60 only if your application is genuinely latency-sensitive to brief spikes.
Monitor storage thresholds proactively. Unlike compute, storage cannot scale down. Set max_size_gb to a value you are comfortable paying for indefinitely, and monitor disk growth trends in your metrics dashboard.
Review the audit log. The predictive scaling decisions table records every evaluation. If the engine is making decisions you disagree with, adjust the min_plan/max_plan boundaries or the reactive thresholds.
What's Next
- Read the Scaling operations guide for manual vertical and horizontal scaling
- Explore monitoring and metrics to see the data that drives autoscaling decisions
- Try FoundryDB free and set up your first autoscale policy in under 5 minutes