Predictive Autoscaling: Scale Your Database Before Demand Spikes

7 min read
FoundryDB Team
Engineering @ FoundryDB

Reactive autoscaling has a fundamental problem: it waits for something to go wrong. Your database hits 95% CPU, the autoscaler wakes up, requests a resize, and for the next few minutes your application eats latency while the new resources come online. If your traffic is predictable (and most production traffic is), this delay is avoidable.

FoundryDB's predictive autoscaling engine learns your workload's seasonal patterns and scales your database before demand spikes arrive. It combines real-time metric thresholds with historical baselines, anomaly detection, and configurable cost limits so you stay fast without overspending.

How It Works

The autoscaling system operates at two levels: a reactive policy that you configure per service via the API, and a predictive engine that runs continuously across all services.

Reactive Autoscale Policies

Every FoundryDB service supports a metric-based autoscale policy. You define which metrics to watch, what thresholds trigger a scale-up or scale-down, and the boundaries the autoscaler must stay within.

The four supported metrics are:

Metric                 What it tracks
cpu_percent            CPU utilization across the primary node
memory_percent         Memory pressure from buffers, caches, and active queries
connections_percent    Percentage of max connections in use
disk_percent           Storage utilization on the data volume

Each metric has an independent threshold_up (trigger scale-up), threshold_down (trigger scale-down), and duration_seconds (how long the metric must sustain that level before acting). This prevents transient spikes from causing unnecessary tier changes.
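
To make the duration check concrete, here is a minimal sketch of how a sustained-threshold check can work. The one-sample-per-minute cadence is an assumption made for the example; FoundryDB's actual sampling interval may differ.

from typing import Sequence

def sustained_breach(samples: Sequence[float], threshold: float,
                     duration_seconds: int, sample_interval_seconds: int = 60) -> bool:
    # Number of consecutive recent samples that must breach the threshold
    # before the autoscaler is allowed to act.
    needed = max(1, duration_seconds // sample_interval_seconds)
    recent = samples[-needed:]
    return len(recent) >= needed and all(s > threshold for s in recent)

# CPU sampled once a minute; threshold_up=80, duration_seconds=300 (5 samples).
cpu = [62, 71, 83, 85, 88, 90, 91]
print(sustained_breach(cpu, threshold=80, duration_seconds=300))  # True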

Predictive Engine

The predictive layer builds a seasonal baseline from your workload history. It looks at the same hour of the same day of the week over the past 7 days to compute a mean and standard deviation for CPU utilization. When current CPU deviates significantly from this baseline (measured by z-score), the engine acts preemptively rather than waiting for a hard threshold breach.
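
To make that concrete, here is a rough sketch of the z-score computation. The baseline_samples list stands in for CPU readings taken from matching historical windows; how many samples the engine actually keeps, and exactly how it buckets them, are internal details not covered here.

import statistics

def cpu_z_score(current_cpu: float, baseline_samples: list[float]) -> float:
    # Mean and standard deviation of the seasonal baseline.
    mean = statistics.mean(baseline_samples)
    stdev = statistics.pstdev(baseline_samples) or 1e-9  # avoid divide-by-zero
    # How many standard deviations the current reading sits from the baseline.
    return (current_cpu - mean) / stdev

baseline = [41.0, 44.5, 39.8, 43.2, 40.6]     # hypothetical historical readings
print(round(cpu_z_score(78.0, baseline), 2))  # large positive z-score -> anomaly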

Three triggers drive predictive decisions:

  • Anomaly spike. A z-score above 2.5 combined with CPU already above 75% triggers an aggressive scale-up of 2 tiers. This handles sudden, unexpected load that the seasonal model didn't predict.
  • Sustained growth. Three consecutive metric windows (each 15 minutes), all above the scale-up threshold, trigger a 1-tier increase. This catches gradual load increases like organic traffic growth.
  • Sustained low. Seven consecutive windows, all below 30% CPU, trigger a 1-tier decrease. The higher bar for scale-down (7 windows vs. 3) prevents premature downsizing during temporary lulls.

Every decision (including no-ops) is recorded to an audit table for debugging, reporting, and cooldown enforcement.
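
Putting the three triggers together, the decision logic looks roughly like the sketch below. The tier deltas, window counts, and thresholds come from the description above; the function name and structure are illustrative, not the engine's actual code.

def predictive_decision(z_score: float, current_cpu: float,
                        window_avgs: list[float],
                        threshold_up: float = 80.0,
                        threshold_low: float = 30.0) -> int:
    """Return a tier delta (+2, +1, -1, or 0). window_avgs are the
    per-15-minute CPU averages, most recent last."""
    # Anomaly spike: far outside the seasonal baseline and already hot.
    if z_score > 2.5 and current_cpu > 75.0:
        return +2
    # Sustained growth: three consecutive high windows.
    if len(window_avgs) >= 3 and all(w > threshold_up for w in window_avgs[-3:]):
        return +1
    # Sustained low: seven consecutive quiet windows.
    if len(window_avgs) >= 7 and all(w < threshold_low for w in window_avgs[-7:]):
        return -1
    return 0  # no-op (still recorded in the audit table)
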

Configuring an Autoscale Policy

Set up a CPU-based autoscale policy using the REST API:

curl -u user:password -X PUT \
  https://api.foundrydb.com/managed-services/{id}/autoscale-policy \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "metrics": [
      {
        "metric": "cpu_percent",
        "threshold_up": 80,
        "threshold_down": 30,
        "duration_seconds": 300
      },
      {
        "metric": "memory_percent",
        "threshold_up": 85,
        "threshold_down": 40,
        "duration_seconds": 300
      }
    ],
    "min_plan": "tier-2",
    "max_plan": "tier-8",
    "cooldown_seconds": 300
  }'

This tells the autoscaler: scale up when CPU exceeds 80% for 5 minutes, scale down when it stays below 30% for 5 minutes, and never go below tier-2 (2 vCPU, 4 GB) or above tier-8 (8 vCPU, 16 GB). After any scaling action, wait at least 5 minutes before making another decision.

The min_plan and max_plan boundaries are your cost controls. Setting max_plan to tier-8 means your bill will never exceed the hourly rate for that tier, regardless of what the autoscaler recommends.
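
To illustrate how the boundaries and cooldown interact, the sketch below clamps a recommended tier into the configured range and skips any action still inside the cooldown window. The simple tier-1 through tier-12 ladder is an assumption made for the example; the real set of plans may differ.

import time

# Assumed tier ladder, for illustration only.
TIERS = [f"tier-{n}" for n in range(1, 13)]

def apply_policy(recommended: str, min_plan: str, max_plan: str,
                 last_scale_at: float, cooldown_seconds: int = 300):
    """Clamp the recommended tier into [min_plan, max_plan] and honor cooldown.
    Returns the tier to move to, or None if no action should be taken yet."""
    if time.time() - last_scale_at < cooldown_seconds:
        return None  # still inside the cooldown from the previous action
    idx = min(max(TIERS.index(recommended), TIERS.index(min_plan)),
              TIERS.index(max_plan))
    return TIERS[idx]

# The engine recommends tier-9, but max_plan caps the move at tier-8.
print(apply_policy("tier-9", "tier-2", "tier-8", last_scale_at=0))  # tier-8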

Storage Autoscaling

Storage autoscaling works independently from compute autoscaling. Unlike compute, storage can only grow (you cannot shrink a disk without risking data loss). Configure it alongside your compute policy:

curl -u user:password -X PUT \
  https://api.foundrydb.com/managed-services/{id}/autoscale-policy \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "storage_auto_scale": {
      "enabled": true,
      "threshold_percent": 85,
      "increment_gb": 50,
      "max_size_gb": 1000
    }
  }'

When disk usage crosses 85%, the autoscaler adds 50 GB. It repeats as needed until the disk reaches the 1000 GB cap. The storage autoscaler checks every 5 minutes, with a 60-minute cooldown between expansions.
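
Here is a rough sketch of that growth-only expansion logic, using the parameters from the example above; the function and argument names are illustrative, not part of the API.

def next_disk_size(current_gb: int, used_gb: float,
                   threshold_percent: int = 85,
                   increment_gb: int = 50,
                   max_size_gb: int = 1000) -> int:
    """Return the new disk size, or the current size if no expansion is needed.
    Storage only grows; there is no shrink path."""
    if current_gb >= max_size_gb:
        return current_gb  # already at the cap
    if (used_gb / current_gb) * 100 < threshold_percent:
        return current_gb  # below the usage threshold
    return min(current_gb + increment_gb, max_size_gb)

print(next_disk_size(200, used_gb=175))  # 87.5% used -> grows to 250
print(next_disk_size(980, used_gb=900))  # would exceed the cap -> clamped to 1000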

Default values if you enable storage autoscaling without specifying parameters:

Parameter            Default
threshold_percent    80%
increment_gb         10 GB
max_size_gb          500 GB
cooldown_minutes     60

Scaling History and Audit Trail

Every scaling operation records who triggered it: user (manual), auto_scale (reactive policy), or system (predictive engine). You can query your service's scaling history to see what happened, when, and why:

curl -u user:password \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy

The response includes last_scale_at so you can see when the most recent action occurred. The predictive engine also maintains a separate audit table that logs the z-score, seasonal baseline, current CPU average, and confidence level for every decision, including decisions where it evaluated your service and chose not to scale.
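
As a minimal Python equivalent of the call above, the sketch below fetches the policy and returns last_scale_at. It assumes the response is JSON with a top-level last_scale_at field, as described above.

import requests

def last_scale_time(service_id: str, user: str, password: str):
    """Fetch the autoscale policy and return last_scale_at, if present."""
    resp = requests.get(
        f"https://api.foundrydb.com/managed-services/{service_id}/autoscale-policy",
        auth=(user, password),
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("last_scale_at")

# print(last_scale_time("my-service-id", "user", "password"))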

Reactive vs. Predictive: When Each Fires

The two systems complement each other. Here is how they divide responsibility:

Scenario                             Reactive                                    Predictive
Monday morning traffic ramp-up       Fires after CPU crosses 80%                 Fires before, based on last Monday's pattern
Unexpected viral traffic spike       Fires after 5 min sustained threshold       Fires immediately via anomaly detection (z-score > 2.5)
Gradual organic growth over weeks    Fires when thresholds are breached          Fires on sustained growth (3 consecutive high windows)
Weekend low traffic                  Scales down after 5 min below threshold     Scales down after 7 consecutive low windows (~105 min)
One-off batch job spike              May fire if sustained > duration_seconds    Ignores it if within seasonal norms

The predictive engine is deliberately conservative. It requires a z-score above 2.5 for anomaly spikes (roughly 99th percentile deviation), 3 sustained high windows for growth, and 7 sustained low windows for scale-down. A 6-hour cooldown prevents thrashing from rapid successive decisions.

Dry-Run Mode

Before trusting the autoscaler with production changes, enable dry-run mode. In this mode, the predictive engine evaluates your workload and records all decisions to the audit log, but does not execute any tier changes. Review the decisions over a few days to verify the engine's judgment before going live.
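
As a sketch of what enabling dry-run mode might look like through the policy endpoint, the example below uses a hypothetical dry_run field; the actual parameter name may differ, so check the API reference before relying on it.

import requests

# Hypothetical example: the "dry_run" field name is an assumption, not a
# documented parameter. Consult the API reference for the real field.
policy = {
    "enabled": True,
    "dry_run": True,  # evaluate and audit decisions without executing them
    "metrics": [
        {"metric": "cpu_percent", "threshold_up": 80,
         "threshold_down": 30, "duration_seconds": 300}
    ],
    "min_plan": "tier-2",
    "max_plan": "tier-8",
}
requests.put(
    "https://api.foundrydb.com/managed-services/{id}/autoscale-policy",
    auth=("user", "password"),
    json=policy,
    timeout=10,
)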

Disabling Autoscaling

Remove the autoscale policy entirely:

curl -u user:password -X DELETE \
https://api.foundrydb.com/managed-services/{id}/autoscale-policy

The response confirms the timestamp when autoscaling was disabled. All existing resources remain at their current tier and size.

Best Practices

Start with reactive, then layer predictive. Set conservative thresholds (80% up, 30% down) and monitor scaling events for a week. Once you trust the behavior, the predictive engine adds the look-ahead advantage.

Set meaningful plan boundaries. Your min_plan should be the smallest tier that handles your baseline traffic. Your max_plan should be the largest tier your budget allows. The gap between them is the autoscaler's operating range.

Use duration to filter noise. A duration_seconds of 300 (5 minutes) prevents one-off query spikes from triggering a resize. Reduce it to 60 only if your application is genuinely latency-sensitive to brief spikes.

Monitor storage thresholds proactively. Unlike compute, storage cannot scale down. Set max_size_gb to a value you are comfortable paying for indefinitely, and monitor disk growth trends in your metrics dashboard.

Review the audit log. The predictive scaling decisions table records every evaluation. If the engine is making decisions you disagree with, adjust the min_plan/max_plan boundaries or the reactive thresholds.

What's Next