Connect LangChain to FoundryDB pgvector

Use FoundryDB's managed PostgreSQL with pgvector as a vector store for LangChain applications. This guide covers setup, document loading, and similarity search, and takes under 10 minutes.

Prerequisites

  • A FoundryDB PostgreSQL service with pgvector
  • Python 3.10+
  • An OpenAI API key

Install the required Python packages (langchain-postgres uses the psycopg 3 driver, not psycopg2):

pip install langchain langchain-openai langchain-postgres "psycopg[binary]"

Step 1: Create a PostgreSQL Service

curl -u $USER:$PASS -X POST https://api.foundrydb.com/managed-services \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-vectordb",
    "database_type": "postgresql",
    "version": "17",
    "plan_name": "tier-2",
    "zone": "se-sto1",
    "storage_size_gb": 50,
    "storage_tier": "maxiops"
  }'

Or use the RAG Pipeline template which includes PostgreSQL with pgvector pre-configured.
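Provisioning takes a few minutes, so it helps to poll until the service is ready before connecting. A minimal sketch — note that the `GET /managed-services/{name}` endpoint and the `state` field are assumptions; check the FoundryDB API reference for the exact response shape:

```python
import json
import urllib.request

def is_running(service: dict) -> bool:
    """True once the service payload reports a running state (field name assumed)."""
    return service.get("state") == "running"

def fetch_service(name: str, auth_header: str) -> dict:
    """Fetch one managed service's status (hypothetical endpoint; adjust to the real API)."""
    req = urllib.request.Request(
        f"https://api.foundrydb.com/managed-services/{name}",
        headers={"Authorization": auth_header},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Loop on `fetch_service("my-vectordb", ...)` with a short sleep until `is_running(...)` returns True.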

Step 2: Enable pgvector

PGPASSWORD=YOUR_PASSWORD psql \
"host=my-vectordb.abc123.db.foundrydb.com user=app_user dbname=defaultdb sslmode=require" \
-c "CREATE EXTENSION IF NOT EXISTS vector;"
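To confirm the extension is active, you can query the catalog over the same psql connection:

```sql
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```

One row with `extname = vector` means pgvector is ready to use.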

Step 3: Use LangChain with pgvector

import os
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector
from langchain_core.documents import Document

os.environ["OPENAI_API_KEY"] = "sk-..."

CONNECTION = "postgresql+psycopg://app_user:PASSWORD@my-vectordb.abc123.db.foundrydb.com:5432/defaultdb?sslmode=require"

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
vector_store = PGVector(
    embeddings=embeddings,
    collection_name="langchain_docs",
    connection=CONNECTION,
    use_jsonb=True,
)

# Add documents
docs = [
    Document(page_content="FoundryDB supports PostgreSQL, MySQL, MongoDB, Valkey, and Kafka.", metadata={"source": "docs"}),
    Document(page_content="pgvector enables vector similarity search in PostgreSQL.", metadata={"source": "features"}),
    Document(page_content="Valkey is a Redis-compatible in-memory data store.", metadata={"source": "features"}),
]
vector_store.add_documents(docs)

# Search
results = vector_store.similarity_search_with_score("What databases does FoundryDB support?", k=3)
for doc, score in results:
    print(f"[{score:.3f}] {doc.page_content}")
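If the database password contains characters such as `@`, `:`, or `/`, URL-encode it before splicing it into the connection string, or SQLAlchemy will misparse the URL. A minimal sketch using the standard library (the password value is an example only; use the driver prefix matching your installed driver):

```python
from urllib.parse import quote_plus

password = "p@ss:word/123"  # example only
encoded = quote_plus(password)

CONNECTION = (
    "postgresql+psycopg://app_user:"
    f"{encoded}"
    "@my-vectordb.abc123.db.foundrydb.com:5432/defaultdb?sslmode=require"
)
print(encoded)  # p%40ss%3Aword%2F123
```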

Step 4: Use as a Retriever in a RAG Chain

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o-mini")
retriever = vector_store.as_retriever(search_kwargs={"k": 3})

template = """Answer based on context. If you can't, say so.

Context: {context}
Question: {question}
Answer:"""

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | ChatPromptTemplate.from_template(template)
    | llm
    | StrOutputParser()
)

print(chain.invoke("What is Valkey?"))

Performance Tips

  • HNSW index: Tune with m = 24, ef_construction = 100 for better recall at the cost of slower index builds
  • Connection pooling: Enable PgBouncer for high-concurrency workloads
  • Batch inserts: Use add_documents() with batches of 100-500
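The batch-insert tip can be sketched as follows. The helper name and default `batch_size` are illustrative; `store` is the `PGVector` instance from Step 3:

```python
def add_in_batches(store, docs, batch_size=200):
    """Insert documents in fixed-size chunks to keep transactions small."""
    ids = []
    for start in range(0, len(docs), batch_size):
        ids.extend(store.add_documents(docs[start:start + batch_size]))
    return ids
```

For example, `add_in_batches(vector_store, docs)` inserts 200 documents per call instead of one large transaction.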

Next Steps

  • Build a full RAG pipeline with Kafka ingestion and Valkey caching
  • Explore MongoDB Atlas or OpenSearch as alternative vector stores