
FoundryDB Is Now the European AI Data Platform

8 min read
FoundryDB Team
Engineering @ FoundryDB

Your data already lives in Europe. Your databases run in European zones, your backups stay in European object storage, and your compliance story is clean right up until your application calls a model. Then a prompt full of customer data crosses the Atlantic on a key someone pasted into an environment variable, with no ceiling, no metering, and no answer to the question "where did that text actually go?"

Today we close that gap, and we do more than close it. Three features ship together: vector search as a service over pgvector, embedding pipelines that run as real jobs with schedules and run history, and a managed inference proxy that puts one governed, OpenAI-compatible endpoint in front of OpenAI, Anthropic, Mistral, and Azure OpenAI. Together they make FoundryDB the European AI Data Platform: the one place where your data, your embeddings, and your model calls live under a single set of controls.

What you can build now

You could already store vectors on FoundryDB and keep them in sync. What you can do today is build the whole retrieval-augmented generation (RAG) loop without leaving the platform and without inventing the plumbing yourself. An embedding pipeline keeps your pgvector table fresh on a schedule. Vector search queries it through a credential-free API. And every model call along the way runs through one metered, ceilinged, EU-aware endpoint. That is RAG infrastructure as a platform concern instead of application glue you maintain forever.

Three pieces, designed to compose. Here is each of them.

Vector search, now a single API call

Searching your vectors used to mean a database connection, a driver, and SQL full of <=> operators living in your application code. Not anymore. Vector search is now a typed API call. Hand it a table and a query, raw vector or plain text, and it returns the nearest rows. No database credentials in the client, no SQL to compose, no connection pool to babysit.

Pass query_text together with a pipeline_id and the platform embeds your text with the exact provider, model, and dimensions that produced the indexed vectors, so query and corpus cannot drift apart. Pass a raw vector instead if you embed client-side. Choose cosine, L2, or inner product as the distance metric, narrow the results with column-equality filters, and pick the columns that come back alongside the distance.

The endpoint is read-only by construction. The controller composes the SELECT from validated, typed inputs and brokers it through the same read-only data plane the Data Explorer uses, so it inherits every read-only enforcement layer plus row and byte caps. Results come back in the Data Explorer's result shape, which means they render in the dashboard's result table and auto-charts with zero extra work. top_k is capped at 100 on purpose: this is a similarity lookup, not a table export.
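To make "composes the SELECT from validated, typed inputs" concrete, here is a minimal sketch of how such a query could be assembled. It is not the controller's actual code: the function name, the metric labels, and the psycopg-style %s placeholders are assumptions made for illustration. Only the pgvector operators (<=> for cosine distance, <-> for L2, <#> for negative inner product) and the top_k cap of 100 come from pgvector and from this post.

```python
import re

# pgvector distance operators; the metric names are illustrative labels.
OPERATORS = {"cosine": "<=>", "l2": "<->", "inner_product": "<#>"}
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
MAX_TOP_K = 100  # the cap stated in the post


def ident(name):
    """Accept only plain identifiers, then double-quote them."""
    if not IDENTIFIER.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f'"{name}"'


def compose_search(table, vector_column, query_vector, metric="cosine",
                   top_k=10, filters=None, columns=()):
    """Return (sql, params) for a read-only nearest-neighbour SELECT."""
    if metric not in OPERATORS:
        raise ValueError(f"unknown metric: {metric!r}")
    if not 1 <= top_k <= MAX_TOP_K:
        raise ValueError(f"top_k must be between 1 and {MAX_TOP_K}")

    # pgvector accepts a '[x,y,z]' text literal cast to the vector type.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    distance = f"{ident(vector_column)} {OPERATORS[metric]} %s::vector"
    select_list = [ident(c) for c in columns] + [f"{distance} AS distance"]
    params = [vector_literal]

    where = ""
    if filters:
        clauses = []
        for column, value in filters.items():
            # Equality only: values travel as bound parameters, never as SQL text.
            clauses.append(f"{ident(column)} = %s")
            params.append(value)
        where = " WHERE " + " AND ".join(clauses)

    sql = (f"SELECT {', '.join(select_list)} FROM {ident(table)}"
           f"{where} ORDER BY distance LIMIT %s")
    params.append(top_k)
    return sql, params
```

Because every identifier is checked against a fixed pattern and every value is a bound parameter, the caller can influence which rows come back but not the shape of the statement.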

Embedding pipelines become embedding jobs

Embedding pipelines used to run exactly one way: a continuous poller watching your source table, embedding new rows as they appeared. That is still the right default for live data, and it still works exactly as before. But plenty of corpora do not change every minute, and paying a poll loop to discover nothing is waste you can now skip.

Pipelines now have a mode: continuous, scheduled, or manual. Scheduled pipelines take a standard cron expression. Both scheduled and manual pipelines execute as discrete runs with full accounting. A nightly pipeline wakes at 02:00, embeds whatever changed, and goes back to sleep, and every run is a first-class record with rows_scanned, rows_embedded, rows_failed, and tokens_used counters. You see exactly what a run did and what it cost.
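As a rough illustration of how the three modes relate to a schedule, the sketch below validates a pipeline configuration. The PipelineConfig class and its field names are hypothetical, not the platform's schema, and the rule that only scheduled pipelines carry a cron expression is our reading of the paragraph above. The cron check only counts the five standard fields; it does not validate their contents.

```python
from dataclasses import dataclass
from typing import Optional

MODES = {"continuous", "scheduled", "manual"}


@dataclass(frozen=True)
class PipelineConfig:
    mode: str
    schedule: Optional[str] = None  # cron expression, scheduled mode only

    def __post_init__(self):
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")
        if self.mode == "scheduled":
            # A standard cron expression has five whitespace-separated fields:
            # minute, hour, day of month, month, day of week.
            if not self.schedule or len(self.schedule.split()) != 5:
                raise ValueError("scheduled mode needs a five-field cron expression")
        elif self.schedule is not None:
            raise ValueError(f"{self.mode} mode does not take a schedule")


# The nightly example from the text: run at 02:00 every day.
nightly = PipelineConfig(mode="scheduled", schedule="0 2 * * *")
```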

We built the runs around the failure modes embedding jobs actually have, not the happy path:

  • Partial-failure retry. When a provider batch fails, the worker splits the batch and retries rows individually (up to max_row_retries, default 3). A run where some rows failed finishes as partial, not as a mystery.
  • Error samples. Each run records up to 20 per-row failures with the source row ID and the provider error, so debugging starts from evidence instead of from re-running and watching.
  • Incremental watermark. Runs track what is already embedded, so a re-run only processes new or changed rows. Triggering a manual run twice in a row is cheap, not a full re-embed of the corpus.
  • Scoped sources. The optional source_filter is a restricted WHERE fragment validated against a strict grammar on both the controller and the agent, so a pipeline embeds only the rows you mean it to.
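The retry and error-sample behavior can be sketched in a few lines. This is an illustration under stated assumptions, not the worker's code: embed_batch is a hypothetical callable that takes a list of texts and returns one vector per text, the sketch goes straight from a failed batch to single-row retries rather than splitting progressively, and the succeeded and failed status names are invented (only partial appears in this post). The tokens_used counter is left out for brevity.

```python
from dataclasses import dataclass, field

MAX_ERROR_SAMPLES = 20  # per-run cap on recorded failures, from the post


@dataclass
class Run:
    rows_scanned: int = 0
    rows_embedded: int = 0
    rows_failed: int = 0
    error_samples: list = field(default_factory=list)

    @property
    def status(self):
        if self.rows_failed == 0:
            return "succeeded"
        return "partial" if self.rows_embedded else "failed"


def embed_rows(rows, embed_batch, max_row_retries=3):
    """Embed (row_id, text) pairs; fall back to per-row retries if the batch fails."""
    run = Run(rows_scanned=len(rows))
    try:
        batch = embed_batch([text for _, text in rows])
    except Exception:
        batch = None  # the batch failed as a whole
    if batch is not None:
        vectors = {row_id: vec for (row_id, _), vec in zip(rows, batch)}
        run.rows_embedded = len(vectors)
        return run, vectors

    vectors = {}
    for row_id, text in rows:
        last_error = None
        for _ in range(max_row_retries):
            try:
                vectors[row_id] = embed_batch([text])[0]
                break
            except Exception as exc:
                last_error = exc
        if row_id in vectors:
            run.rows_embedded += 1
        else:
            run.rows_failed += 1
            # Keep the source row ID and provider error as debugging evidence.
            if len(run.error_samples) < MAX_ERROR_SAMPLES:
                run.error_samples.append((row_id, str(last_error)))
    return run, vectors
```

One bad row no longer poisons its neighbours: the good rows are embedded, the bad one is counted and sampled, and the run reports partial.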

The managed inference proxy

The third piece is the largest, and the one we are most excited to put in your hands. FoundryDB now runs an OpenAI-compatible inference endpoint at https://api.foundrydb.com/inference/v1, fronting OpenAI, Anthropic, Mistral, and Azure OpenAI behind one URL and one credential format. Models are addressed with a provider prefix, like mistral/mistral-small-latest or openai/gpt-4.1-mini. Existing code keeps its OpenAI client: point it at the new base URL, authenticate with a FoundryDB inference key, and use prefixed model names.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.foundrydb.com/inference/v1",
    api_key="fdb-inf-...",  # your FoundryDB inference key
)
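On the other side of that endpoint, a provider-prefixed model name has to be split and matched to a provider config. The sketch below shows one way that routing step could look. The function names, the ProviderConfigError class, and the dictionary shape of a provider config are all assumptions for illustration; the only behavior taken from this post is that a missing or disabled provider config is a hard error rather than a silent fallback.

```python
class ProviderConfigError(Exception):
    """Raised when no usable provider config exists for a request."""


def split_model(model):
    """Split 'provider/model-name' on the first slash."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


def resolve(model, provider_configs):
    """Return (provider, model_name, config) or fail loudly."""
    provider, name = split_model(model)
    config = provider_configs.get(provider)
    # No fallback to a shared key: absent or disabled means the call fails.
    if config is None or not config.get("enabled", False):
        raise ProviderConfigError(f"no enabled provider config for {provider!r}")
    return provider, name, config
```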

The keys stay yours. You bring your own provider API keys, the platform stores them encrypted, and a missing or disabled provider config is a hard error at request time, never a silent fallback to some shared platform key. On top of that sit the controls your provider dashboard simply does not give you:

  • Dedicated data-plane keys. You mint fdb-inf- keys for your applications. Provider keys never leave the platform, only the hash of each inference key is stored, and the secret is shown exactly once.
  • Mandatory token ceilings. Every key requires a positive monthly token limit. There is no unlimited key, by design. A key that hits its ceiling returns a clean token_ceiling_exceeded error instead of a surprise invoice.
  • Per-key rate limits. Each key carries a requests-per-minute limit, default 60.
  • An organization cost circuit breaker. Set a monthly cost limit and the proxy trips org-wide when the month's spend crosses it. You reset it explicitly after raising the limit or accepting the overrun.
  • Full usage metering. Every call is recorded with provider, model, tokens in and out, cost, latency, and status, aggregated into your FoundryDB bill alongside your databases. One place to see what AI actually costs.
  • Zero content stored. The usage record never contains prompt or response content. The platform meters your calls; it does not read them.
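Two of the controls above, hash-only key storage and the mandatory token ceiling, can be illustrated with a short sketch. It is a simplified model, not the proxy's implementation: the class and function names are hypothetical, SHA-256 is an assumed hash choice, and checking the ceiling before recording usage is one possible ordering. The requests-per-minute field is carried with its default of 60 but not enforced here.

```python
import hashlib
import secrets
from dataclasses import dataclass


class TokenCeilingExceeded(Exception):
    code = "token_ceiling_exceeded"


@dataclass
class InferenceKey:
    key_hash: str
    monthly_token_limit: int
    tokens_used: int = 0
    requests_per_minute: int = 60  # default from the post; not enforced in this sketch

    def __post_init__(self):
        # There is no unlimited key: the limit must be a positive number.
        if self.monthly_token_limit <= 0:
            raise ValueError("monthly_token_limit must be positive")

    def charge(self, tokens):
        """Record usage, refusing anything that would cross the ceiling."""
        if self.tokens_used + tokens > self.monthly_token_limit:
            raise TokenCeilingExceeded(
                f"{self.tokens_used + tokens} > {self.monthly_token_limit}")
        self.tokens_used += tokens


def mint_key(monthly_token_limit):
    """Return the secret (to show once) and the record that gets stored."""
    secret = "fdb-inf-" + secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(secret.encode()).hexdigest()
    return secret, InferenceKey(key_hash=key_hash,
                                monthly_token_limit=monthly_token_limit)
```

The stored record never holds the secret itself, so a later request is authenticated by hashing the presented key and comparing hashes.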

Streaming is supported, and /inference/v1/embeddings is there too, so the same governed path serves both chat and embedding workloads.

The European angle, stated honestly

The headline control is a single switch. Set eu_only on your organization and the proxy refuses any call that would leave EU-resident infrastructure, returning eu_residency_unavailable instead of quietly routing elsewhere.

What makes this more than a checkbox is that "EU residency" means something different at every provider, and we refuse to flatten that. The platform encodes the differences as a versioned residency matrix, with a verification date on every row, consumed by the proxy's pre-flight check and by your organization's compliance report. It is data, not a marketing claim. As of 2026-06-11 the matrix reads:

| Provider | EU path | How | Caveats |
| --- | --- | --- | --- |
| Mistral | Native | api.mistral.ai is EU-hosted by default | Zero data retention requires their Scale plan; default abuse-monitoring retention is 30 days |
| OpenAI | EU endpoint | Approved EU project served via eu.api.openai.com, implemented as zero data retention | Approval-gated; roughly 10 percent price uplift on newer models |
| Azure OpenAI | EU endpoint | EU Data Zone resources pin processing and storage to the Azure EU boundary | Self-serve; newest models lag global availability; abuse monitoring may retain prompts up to 30 days unless modified monitoring is approved |
| Anthropic | None | First-party API offers US and global routing only | Blocked under eu_only |

For the endpoint-class providers, residency additionally requires an explicit eu_endpoint attestation on your provider config; the platform never infers it from a base URL. And when a provider's first-party API has no EU path, the honest behavior under eu_only is to block it, not to pretend. Provider residency terms shift, which is exactly why the matrix is versioned and dated rather than baked into prose. When a provider's posture changes, the matrix changes, and your compliance report changes with it.
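The pre-flight logic described here is small enough to sketch. The code below is an illustration, not the proxy's implementation: the matrix layout, the provider keys, and the function and exception names are assumptions, and the matrix content is a hand-copied subset of the rows above. What it takes from this post is the decision rule: native providers pass, endpoint-class providers pass only with an explicit eu_endpoint attestation, and everything else is refused with eu_residency_unavailable.

```python
# Illustrative copy of the residency matrix; keys and layout are assumed.
RESIDENCY_MATRIX = {
    "mistral": {"eu_path": "native", "verified": "2026-06-11"},
    "openai": {"eu_path": "endpoint", "verified": "2026-06-11"},
    "azure-openai": {"eu_path": "endpoint", "verified": "2026-06-11"},
    "anthropic": {"eu_path": "none", "verified": "2026-06-11"},
}


class EUResidencyUnavailable(Exception):
    code = "eu_residency_unavailable"


def preflight(provider, provider_config, eu_only):
    """Raise if an eu_only organization would send this call outside the EU."""
    if not eu_only:
        return
    row = RESIDENCY_MATRIX.get(provider)
    eu_path = row["eu_path"] if row else "none"  # unknown providers are blocked
    if eu_path == "native":
        return
    # Residency is never inferred from a base URL: the attestation must be explicit.
    if eu_path == "endpoint" and provider_config.get("eu_endpoint") is True:
        return
    raise EUResidencyUnavailable(f"{provider} has no attested EU path")
```

Keeping the rule in code and the facts in data is what lets a provider's row change without touching the check itself.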

Honest edges in v1

We would rather you know where the edges are than discover them later:

  • The Anthropic translation layer supports chat completions without tool calls. Streaming works; tool use through the proxy is not there yet.
  • Vector search filters support equality only, and top_k is hard-capped at 100.
  • Embedding pipeline source_filter accepts a deliberately restricted WHERE grammar: no subqueries, no functions of consequence, nothing that could change the statement's shape.
  • The proxy covers chat completions and embeddings. Image, audio, and batch endpoints are not proxied.

These are the next chapters, not the end of the story.

Go build it

The pieces compose into something that used to take a week of integration: an embedding pipeline keeps your pgvector table fresh on a schedule, vector search queries it through a credential-free API, and every model call along the way runs through one metered, ceilinged, EU-aware endpoint.

Create a PostgreSQL service with the pgvector extension, wire up your first pipeline from the dashboard, and point any OpenAI-compatible client at the proxy. Your data never had to leave Europe. Now your AI calls do not have to either. Go build something with them.