
FoundryDB Edge Now Stays Up On Its Own: Per-Location HA and Autoscaling

4 min read
FoundryDB Team
Engineering @ FoundryDB

The edge already puts a managed front end in front of your app: your own domain with an automatic certificate, caching, rate limiting, a WAF, and request analytics, served from EU points of presence. The piece you could not see was what happened when a machine behind that edge had a bad day. Today that piece is done. Every point of presence now runs as an active node with a hot standby, each location autoscales with your traffic, and if a whole location goes dark, your requests are routed to another location and brought home when the original recovers. None of it is something you configure.

What changed

Until now a point of presence was a single node serving your domain. It worked, but a single node is a single thing that can fail. We have made each location resilient and elastic, and the platform keeps it that way on its own.

  • Per-location high availability. Every point of presence is now an active node with a hot standby ready next to it. If the active node fails, the address that serves your traffic moves to the standby within seconds. Your domain keeps pointing at the same address, so there is no DNS change to wait on and no certificate to re-issue.
  • Location-level rerouting. If an entire location goes down, your traffic is automatically repointed to another location. When the original location recovers, your traffic returns home.
  • Autoscaling. Each location scales out when traffic rises and scales back in when it subsides, so a spike is absorbed without you sizing anything ahead of time.
  • Self-healing. Whenever a node is lost or a standby is promoted, the platform restores the location to full redundancy on its own, so you are not running with a thinner safety margin until someone notices.
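
To make the location-level rerouting rule concrete, here is a minimal sketch of the decision it describes: serve from the home location while it is healthy, fall back to another location when it is not, and return home on recovery. This is a toy model, not FoundryDB's implementation. The Router class, its methods, and the location names ("fra", "ams", "par") are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy model of location-level rerouting. Every name here is hypothetical."""

    home: str                    # location that normally serves the domain
    fallbacks: list[str]         # other locations, in order of preference
    healthy: dict[str, bool] = field(default_factory=dict)

    def set_health(self, location: str, is_up: bool) -> None:
        """Record the latest health result for a location."""
        self.healthy[location] = is_up

    def target(self) -> str | None:
        """Return the location that should serve traffic right now."""
        # Home is checked first, so traffic returns as soon as it recovers.
        # A location with no recorded health result is treated as up.
        for location in [self.home, *self.fallbacks]:
            if self.healthy.get(location, True):
                return location
        return None  # every known location is down


router = Router(home="fra", fallbacks=["ams", "par"])
print(router.target())           # home is healthy
router.set_health("fra", False)
print(router.target())           # home is down, first healthy fallback serves
router.set_health("fra", True)
print(router.target())           # home recovered, traffic returns
```

Checking the home location first on every decision is what gives the "returns home" behavior for free in this sketch; a real system would likely wait for the home location to stay healthy for some time before moving traffic back.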

Failover without a DNS change

Failover is fast and invisible because your domain points at a serving address, not at a specific machine. When the active node fails, that serving address moves to the hot standby. Your DNS record never changes, so there is nothing to propagate and no stale lookups to wait out. The certificate belongs to the domain rather than the machine, so nothing needs to be re-issued either.

The serving address is what moves. Your domain, your DNS, and your certificate stay exactly as they are.
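
If it helps to see the idea in code, the sketch below models a location as one fixed serving address with an active node and a hot standby behind it. It is an illustration under stated assumptions, not how the platform is built: the Node and Location classes, the reconcile method, and the node names are hypothetical, and 203.0.113.10 is a reserved documentation address.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


class Location:
    """Toy model: one serving address, an active node, and a hot standby."""

    def __init__(self, serving_address: str) -> None:
        # This is what the domain's DNS record points at. It never changes.
        self.serving_address = serving_address
        self._next_id = 1
        self.active = self._new_node()
        self.standby = self._new_node()

    def _new_node(self) -> Node:
        node = Node(f"node-{self._next_id}")
        self._next_id += 1
        return node

    def holder(self) -> str:
        """Name of the node currently answering on the serving address."""
        return self.active.name

    def reconcile(self) -> None:
        """Fail over if needed, then restore the location to a full pair."""
        # A broken standby is replaced first so there is something to promote.
        if not self.standby.healthy:
            self.standby = self._new_node()
        # If the active node failed, the address moves to the standby and a
        # fresh standby is brought up behind it (the self-healing step).
        if not self.active.healthy:
            self.active, self.standby = self.standby, self._new_node()


loc = Location("203.0.113.10")
loc.active.healthy = False       # simulate the active node failing
loc.reconcile()
print(loc.serving_address, loc.holder(), loc.standby.name)
```

Note that reconcile never touches serving_address. Only the node behind it changes, which is why nothing on the DNS or certificate side has to react.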

Scaling with your traffic

A point of presence is not a fixed-size box anymore. As requests climb, the location adds capacity to keep serving them, and as the surge passes it releases that capacity again. You do not pick a node count or trip a switch when a campaign goes live.

Scaling is automatic and gradual rather than instant, and it follows real demand in both directions.
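
As a rough picture of what "gradual, in both directions" can look like, the sketch below computes a node count for the next interval from the current request rate, moving at most one node per step. Every number and name in it is an assumption made for illustration: the function desired_nodes, the per-node capacity of 1000 requests per second, the floor of two nodes (one active plus one standby), and the ceiling of ten are not published FoundryDB values.

```python
import math


def desired_nodes(current: int, requests_per_sec: float,
                  per_node_capacity: float = 1000.0,
                  min_nodes: int = 2, max_nodes: int = 10,
                  max_step: int = 1) -> int:
    """Return the node count for the next interval (hypothetical policy)."""
    # Nodes needed to serve the current demand, rounded up.
    needed = math.ceil(requests_per_sec / per_node_capacity)
    # Clamp to the assumed floor and ceiling.
    needed = max(min_nodes, min(max_nodes, needed))
    # Move toward the target by at most max_step nodes per interval.
    if needed > current:
        return min(current + max_step, needed)
    if needed < current:
        return max(current - max_step, needed)
    return current


nodes = 2
for load in [500, 4200, 4200, 4200, 800, 800]:
    nodes = desired_nodes(nodes, load)
    print(f"{load:>5} req/s -> {nodes} nodes")
```

The step limit is the design choice that makes scaling gradual rather than instant: a spike adds capacity over several intervals, and capacity is released just as steadily once the surge passes.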

What this means for you

You get the resilience you would otherwise build and babysit, without building or babysitting it.

  • A node failure does not interrupt your domain. The serving address moves to the standby within seconds and your users keep being served.
  • A location going down does not take you offline. Traffic shifts to another location and comes back when the first one recovers.
  • A traffic spike does not require you to provision ahead. The location scales out to meet it and back in afterward.
  • You do not stand up load balancers, write health checks, or script failover. That is the platform's job, and it runs continuously, including the work of restoring full redundancy after a failure.

A note on honesty: this is automatic, not magic. Failover happens within seconds rather than truly instantly, scaling tracks demand rather than being unlimited, and rerouting around a downed location is automatic rather than imperceptible. What you get is a front end that absorbs the common failures and the common spikes on its own, so they stop being your pager's problem.

Already on, nothing to turn on

If you are running an app behind the FoundryDB edge, you already have this. There is no setting to flip and no migration to run. Your existing domains, cache rules, rate limits, and WAF settings carry over unchanged, and the analytics you already watch keep reporting through failovers and scaling events.

Point a domain at the edge and let it run. Keeping each location up, redundant, and right-sized is the platform's job now, not yours. See the edge gateway docs to get started.