Building a RAG Pipeline with OpenSearch as the Vector Store

April 14, 2026 · 7 min read

Engineering @ FoundryDB

Retrieval-Augmented Generation (RAG) augments a language model's response by first retrieving relevant context from a database, then passing that context into the prompt. OpenSearch is a natural fit for the retrieval step: it runs the embedding model internally, stores the vectors, and returns ranked results in a single query. This post shows the retrieval step with real scores from a live OpenSearch 2.19.1 cluster managed by FoundryDB, and explains how to wire the retrieved chunks into a prompt and call an LLM.
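
As a rough illustration of the single-query retrieval described above, the sketch below builds a search body using the `neural` query type from OpenSearch's neural-search plugin, which embeds the query text server-side and runs a k-NN search. The vector field name `embedding`, the source field `text`, and the model ID placeholder are assumptions for illustration, not values taken from the cluster used in this post.

```python
import json


def build_neural_query(question: str, model_id: str, k: int = 3,
                       vector_field: str = "embedding") -> dict:
    """Build a search body that asks OpenSearch to embed the question and run k-NN."""
    return {
        "size": k,
        "_source": ["text"],  # assumed name of the field holding the chunk text
        "query": {
            "neural": {
                vector_field: {  # assumed name of the knn_vector field
                    "query_text": question,
                    "model_id": model_id,  # ID of the embedding model deployed in the cluster
                    "k": k,
                }
            }
        },
    }


if __name__ == "__main__":
    # "<your-model-id>" is a placeholder; substitute the ID of your deployed model.
    body = build_neural_query("How are backups scheduled?", model_id="<your-model-id>")
    print(json.dumps(body, indent=2))
```

The body would be sent to the index's `_search` endpoint with whichever HTTP client or OpenSearch client you already use; the sketch deliberately stops at building the request so it runs without a cluster.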

The examples use a dedicated knowledge base index holding six database documentation chunks, embedded with all-MiniLM-L6-v2 (384 dimensions). The retrieval step, the prompt assembly, and the complete prompt were all tested on a live FoundryDB cluster.
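
A minimal sketch of what such an index could look like, assuming a `text_embedding` ingest processor fills the vector field at index time. The field names `text` and `embedding`, the pipeline name, and the model ID are hypothetical; only the 384-dimension figure comes from the post.

```python
EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2, as stated in the post


def build_ingest_pipeline(model_id: str) -> dict:
    """Ingest pipeline body: embed the `text` field into `embedding` on write."""
    return {
        "description": "Embed documentation chunks at index time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    # source field -> vector field (both names are assumptions)
                    "field_map": {"text": "embedding"},
                }
            }
        ],
    }


def build_index_body(pipeline_name: str) -> dict:
    """Index body: enable k-NN and declare a vector field of the model's dimension."""
    return {
        "settings": {
            "index.knn": True,
            "default_pipeline": pipeline_name,
        },
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {"type": "knn_vector", "dimension": EMBEDDING_DIM},
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical names; replace with your own pipeline name and model ID.
    print(build_ingest_pipeline("<your-model-id>"))
    print(build_index_body("kb-embedding-pipeline"))
```

The mapping's `dimension` has to match the embedding model's output size, so swapping in a different model means changing `EMBEDDING_DIM` and reindexing.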

Figure: RAG loop, composed from FoundryDB primitives (retrieve → augment → generate → answer). The app embeds the question, vector search returns the top-k matches from pgvector, the matches are combined with the question into a prompt, and an EU-routed LLM provider generates the answer. Stages: PostgreSQL source, embedding pipeline, pgvector column, vector search, prompt + context, inference proxy / LLM.
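
The retrieve → augment → generate loop can be sketched as three small functions. This is an illustrative outline, not the code used in the post: `retrieve` and `generate` are hypothetical callables you would back with a real search request and a real LLM call, the `text` source field is an assumed name, and the stub response contains made-up placeholder values rather than real scores.

```python
from typing import Callable, Dict, List


def extract_chunks(response: dict) -> List[Dict]:
    """Pull chunk text and relevance score out of a standard search response."""
    return [
        {"text": hit["_source"]["text"], "score": hit["_score"]}
        for hit in response["hits"]["hits"]
    ]


def build_prompt(question: str, chunks: List[Dict]) -> str:
    """Augment: number the retrieved chunks and place them ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {chunk['text']}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str,
           retrieve: Callable[[str], dict],
           generate: Callable[[str], str]) -> str:
    """Run one pass of the loop: retrieve, augment, generate."""
    chunks = extract_chunks(retrieve(question))
    prompt = build_prompt(question, chunks)
    return generate(prompt)


if __name__ == "__main__":
    # Stubs with placeholder data so the sketch runs without a cluster or an LLM.
    def fake_retrieve(question: str) -> dict:
        return {"hits": {"hits": [
            {"_score": 1.0, "_source": {"text": "placeholder chunk A"}},
            {"_score": 0.5, "_source": {"text": "placeholder chunk B"}},
        ]}}

    def fake_generate(prompt: str) -> str:
        return f"(stub answer for a prompt of {len(prompt)} characters)"

    print(answer("placeholder question?", fake_retrieve, fake_generate))
```

Passing the retrieval and generation steps in as callables keeps the prompt assembly testable on its own and makes it easy to swap the search backend or the LLM provider later.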