Stand Up a Private, EU-Resident RAG Chatbot in Minutes

June 23, 2026 · 5 min read

Engineering @ FoundryDB

Retrieval-augmented chat is the demo everyone wants and almost nobody ships cleanly. The interface is easy. The plumbing is not. You need a vector store, somewhere to keep the documents, an inference endpoint that does not leak your data, and an app that knows how to reach all three. That is a database, a bucket, an API key, a handful of environment variables, and a firewall rule or two, all wired by hand before you see a single answer.

The rag-chatbot stack collapses that into one launch. Pick it, accept the cost preview, and a few minutes later you are chatting over your own data on infrastructure you own, resident in Europe.

[Figure: one-click stack launch fan-out. The rag-chatbot stack template launches four wired resources: PostgreSQL (pgvector), the Open WebUI app, a Files bucket, and an EU inference key, with wiring injected through environment variables.]

What you get

Behind one button, the platform stands up and connects four things:

Open WebUI, a polished chat application, running on its own VM at a real foundrydb.com hostname with a valid certificate and sign-in enabled.
Your own PostgreSQL with pgvector, so embeddings and chat history live in a database that is yours and queryable with plain SQL, not a black box.
A Files bucket, for the documents you want the assistant to reason over.
An EU-routed inference key, minted against your own model provider, so every token of generation runs through a key the platform issued for this stack, inside the EU.

Each piece is attached to the next. The chat app already knows the database address and credentials. It already has the bucket. It already has an OpenAI-compatible inference endpoint pointed at your provider. There is no connection string to copy, no S3 access key to paste, no firewall rule to open. You open the URL and you are talking to your data.

Your provider, never ours

This is the one stack that includes both object storage and inference, and the inference part matters. The key it mints routes to your organization's own configured provider. There is no shared, platform-default model sitting between you and your prompts. The stack never creates a provider, so if your org has none enabled, the launch stops at the cost preview and tells you. Add a provider, preview again, launch.

The key is pinned to EU routing and carries a monthly budget ceiling, so your generation spend is bounded rather than open-ended. Your data goes to the provider you chose, under a key issued for this stack, routed inside Europe.

Launch it

Over the API the flow is two steps: preview the cost, then launch with the cost you accepted.

# Preview the cost (fails with 400 if no inference provider is enabled)
curl -X POST https://api.foundrydb.com/stacks/preview \
  -H "Authorization: Bearer $FOUNDRYDB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"template_name": "rag-chatbot"}'

# Launch, passing the monthly_total from the preview as accepted_monthly_cost (67.00 is an example value)
curl -X POST https://api.foundrydb.com/stacks \
  -H "Authorization: Bearer $FOUNDRYDB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-rag-chatbot",
    "template_name": "rag-chatbot",
    "accepted_monthly_cost": 67.00
  }'

The launch returns 201 Created with the stack in Pending. Poll GET /stacks/{id} until the stack reaches Running; at that point the response carries an endpoint_url with the live chat address. Most stacks complete within 5 minutes. Or skip the curl entirely and click Launch a RAG chatbot in the console.
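The polling step can be scripted along these lines. This is a minimal sketch, not the platform's client: it assumes the stack response is JSON with a top-level `status` string whose values match the state names used here (`Pending`, `Running`) and a top-level `endpoint_url`. The `FOUNDRYDB_API` override and the helper names are hypothetical. The `sed` extractor is deliberately crude and only handles flat string fields, so swap in a real JSON parser such as `jq` for anything beyond a demo.

```shell
# Base URL; FOUNDRYDB_API is a hypothetical override for testing.
API="${FOUNDRYDB_API:-https://api.foundrydb.com}"

# Fetch the raw JSON for one stack. Kept separate so it can be stubbed.
fetch_stack() {
  curl -fsS "$API/stacks/$1" \
    -H "Authorization: Bearer $FOUNDRYDB_TOKEN"
}

# json_string FIELD: crude extractor for a flat string field read from stdin.
# Assumption: the field appears once, as "FIELD": "value".
json_string() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# wait_for_stack STACK_ID [MAX_ATTEMPTS] [SLEEP_SECONDS]
# Prints endpoint_url once the stack is Running; returns 1 on error or timeout.
wait_for_stack() {
  local id="$1" max="${2:-60}" pause="${3:-10}" attempt=0 body status
  while [ "$attempt" -lt "$max" ]; do
    attempt=$((attempt + 1))
    body="$(fetch_stack "$id")" || return 1
    # "status" is an assumed field name; adjust to the real response shape.
    status="$(printf '%s\n' "$body" | json_string status)"
    if [ "$status" = "Running" ]; then
      printf '%s\n' "$body" | json_string endpoint_url
      return 0
    fi
    sleep "$pause"
  done
  echo "stack $id not Running after $max attempts" >&2
  return 1
}
```

Calling `wait_for_stack "$STACK_ID"` prints the chat URL when the stack is up. The defaults (60 attempts, 10 seconds apart) give a ten-minute window, which leaves headroom over the typical five minutes. The sketch does not look for a failure state, because the post does not name one.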

When you open the endpoint, the first account you create is the administrator, and that account, your settings, and your chat history all persist into your PostgreSQL. Upload documents into the bucket, ask questions, and retrieval runs over pgvector in your database while generation runs through your EU-pinned key.

The guarantees are in the launch, not a footnote

A hard cost preview. Before anything is provisioned, you see exactly what the stack will cost, broken down per resource. The inference line is a budget ceiling, not a guaranteed charge. You approve a number, and the launch re-checks that number, so the price cannot drift between preview and launch.
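The approve-then-launch handshake can be scripted so the accepted number always comes from the preview instead of being hard-coded. A minimal sketch under stated assumptions: the preview response is JSON with a top-level numeric `monthly_total` (the name comes from the launch comment earlier in this post; its position in the response is a guess), and the local spending ceiling, the `FOUNDRYDB_API` override, and all helper names are hypothetical.

```shell
# Base URL; FOUNDRYDB_API is a hypothetical override for testing.
API="${FOUNDRYDB_API:-https://api.foundrydb.com}"

# json_number FIELD: crude extractor for a flat numeric field read from stdin.
# Swap in a real JSON parser such as jq for production use.
json_number() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# within_budget TOTAL CEILING: succeeds when TOTAL <= CEILING.
within_budget() {
  awk -v t="$1" -v c="$2" 'BEGIN { exit !(t + 0 <= c + 0) }'
}

# launch_body NAME TOTAL: JSON payload for POST /stacks.
launch_body() {
  printf '{"name": "%s", "template_name": "rag-chatbot", "accepted_monthly_cost": %s}' "$1" "$2"
}

# preview_and_launch NAME CEILING
# Previews the cost, refuses to continue above CEILING, then launches
# with exactly the number the preview returned.
preview_and_launch() {
  local name="$1" ceiling="$2" total
  total="$(curl -fsS -X POST "$API/stacks/preview" \
    -H "Authorization: Bearer $FOUNDRYDB_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"template_name": "rag-chatbot"}' | json_number monthly_total)"
  if [ -z "$total" ]; then
    echo "preview returned no monthly_total" >&2
    return 1
  fi
  if ! within_budget "$total" "$ceiling"; then
    echo "preview total $total exceeds ceiling $ceiling" >&2
    return 1
  fi
  curl -fsS -X POST "$API/stacks" \
    -H "Authorization: Bearer $FOUNDRYDB_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(launch_body "$name" "$total")"
}
```

For example, `preview_and_launch my-rag-chatbot 100` launches only if the previewed total is at most 100. The ceiling is a client-side guard layered on top of the platform's own re-check, not a replacement for it.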

Atomic rollback. A stack is four resources coming up together. If any step fails, the whole launch rolls back cleanly. You never end up with an orphaned database, a stranded bucket, or a half-attached app quietly costing you money.

EU residency. The database, the bucket, the app, and the inference routing all live in Europe. Residency is where the platform runs, not a setting you remember to flip.

Atomic teardown. One DELETE /stacks/{id} removes every child resource in reverse dependency order, no debris left behind.
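Teardown from a script is a single request. The sketch below wraps it in a small function that refuses an empty id, so an unset variable cannot turn into a request against the collection path. The `FOUNDRYDB_API` override and the function name are hypothetical, and the response body and status code of the delete call are not documented in this post, so the function simply passes through whatever the API returns.

```shell
# Base URL; FOUNDRYDB_API is a hypothetical override for testing.
API="${FOUNDRYDB_API:-https://api.foundrydb.com}"

# delete_stack STACK_ID
# Sends DELETE /stacks/{id}; returns 2 if no id is given, and curl's
# exit status otherwise (-f makes HTTP errors fail the call).
delete_stack() {
  local id="$1"
  if [ -z "$id" ]; then
    echo "usage: delete_stack STACK_ID" >&2
    return 2
  fi
  curl -fsS -X DELETE "$API/stacks/$id" \
    -H "Authorization: Bearer $FOUNDRYDB_TOKEN"
}
```

Usage is `delete_stack "$STACK_ID"`, with the id taken from the launch response.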

Composition, not a new product

The rag-chatbot stack is pure composition over building blocks FoundryDB already operates: a managed PostgreSQL service with pgvector, a Files bucket, the inference proxy, and app hosting, arranged in the right order and wired together. Nothing new runs underneath it. That is why it is reproducible, why the cost is predictable, and why you can pick the whole thing apart afterward and manage each piece as an ordinary service in your account.

Launch one

Your first useful screen should not cost you an afternoon of plumbing. Open the catalog, pick Launch a RAG chatbot, review the cost, and launch. For the full console and API walkthrough, including how the attachment wiring and the minted inference key fit together, read the Launch a RAG Chatbot tutorial.

Stop wiring primitives. Launch the finished thing.
