Full-Text Search with Managed OpenSearch: From Zero to Production

· 7 min read
FoundryDB Team
Engineering @ FoundryDB

Every application eventually outgrows LIKE '%query%'. Once your product catalog, help center, or log pipeline crosses a few million documents, you need an inverted index, not a sequential scan. OpenSearch provides exactly that: full-text search with relevance scoring, custom analyzers, aggregations, and a visualization layer built in.

The hard part is running it. JVM heap tuning, cluster formation, TLS certificate management, shard rebalancing, and snapshot configuration turn a "quick search feature" into a permanent ops project. FoundryDB removes all of that. You get a managed OpenSearch 2.19 cluster with TLS, authentication, automated snapshots, and monitoring out of the box.

This guide walks through deploying OpenSearch on FoundryDB and building production-quality search: from provisioning to index design, full-text queries, custom analyzers, and aggregations.

Provision OpenSearch in Under Five Minutes

Create a managed OpenSearch service with a single API call. FoundryDB handles JVM configuration (heap is automatically set to 50% of available memory, capped at 31 GB for compressed oops), the security plugin, TLS certificates, and systemd service management.

curl -u YOUR_API_KEY: -X POST https://api.foundrydb.com/managed-services \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-search",
    "database_type": "opensearch",
    "version": "2",
    "plan_name": "tier-4",
    "zone": "se-sto1",
    "storage_size_gb": 100,
    "storage_tier": "maxiops"
  }'
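The heap rule mentioned above (50% of memory, capped at 31 GB) is easy to reason about with a small sketch. The function name and the sample memory sizes below are illustrative assumptions, not FoundryDB plan definitions.

```python
def jvm_heap_gb(total_memory_gb: float, cap_gb: float = 31.0) -> float:
    """Half of the machine's RAM, capped so the JVM keeps compressed oops."""
    if total_memory_gb <= 0:
        raise ValueError("total_memory_gb must be positive")
    return min(total_memory_gb / 2, cap_gb)


# Hypothetical memory sizes, only to show where the cap starts to apply.
for ram in (8, 32, 64, 128):
    print(f"{ram} GB RAM -> {jvm_heap_gb(ram):g} GB heap")
```

Past roughly 62 GB of RAM the heap stops growing, so extra memory on larger machines goes to the operating system's file cache rather than the JVM.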

Within minutes, your service is live at product-search.db.foundrydb.com:9200 with HTTPS and HTTP Basic authentication. Retrieve your credentials from the dashboard or the API:

curl -u YOUR_API_KEY: \
https://api.foundrydb.com/managed-services/{id}/database-users

Verify the cluster is healthy:

curl -u admin:YOUR_PASS https://product-search.db.foundrydb.com:9200/_cluster/health

You should see "status": "green" and "number_of_nodes": 1. When you need more capacity, add data nodes through the API without downtime.
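If you want to script that check, for example in a deploy pipeline, a minimal sketch could look like the following. It assumes you have already parsed the JSON returned by _cluster/health into a dict; the helper name health_problems is hypothetical.

```python
def health_problems(health: dict, expected_nodes: int = 1) -> list[str]:
    """Return human-readable problems found in a _cluster/health response."""
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"status is {status!r}, expected 'green'")
    nodes = health.get("number_of_nodes")
    if nodes != expected_nodes:
        problems.append(f"{nodes} node(s) reported, expected {expected_nodes}")
    return problems


# Hand-written sample containing only the two fields discussed above.
sample = {"status": "green", "number_of_nodes": 1}
print(health_problems(sample) or "cluster looks healthy")
```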

Design Your Index

Good search starts with good mappings. Define field types explicitly rather than relying on dynamic mapping, which often guesses wrong (mapping a product SKU as text when it should be keyword, for example).

curl -u admin:YOUR_PASS -X PUT \
  https://product-search.db.foundrydb.com:9200/products \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.refresh_interval": "1s",
      "analysis": {
        "analyzer": {
          "product_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "asciifolding", "edge_ngram_filter"]
          }
        },
        "filter": {
          "edge_ngram_filter": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 15
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "name": {"type": "text", "analyzer": "product_analyzer", "search_analyzer": "standard"},
        "description": {"type": "text"},
        "category": {"type": "keyword"},
        "price": {"type": "float"},
        "tags": {"type": "keyword"},
        "created_at": {"type": "date"}
      }
    }
  }'

A few things to note about this mapping:

  • text vs keyword: Use text for fields that need full-text search (tokenized, analyzed). Use keyword for exact matches, filters, and aggregations.
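  • Custom analyzer: The product_analyzer applies lowercasing, ASCII folding (so a search for "cafe" matches a product named "café"), and edge n-grams for autocomplete/prefix matching. We use standard as the search analyzer to avoid n-gram expansion at query time. Note that standard does not fold accents, so a query typed as "café" will not match the folded index terms; use a search analyzer with lowercase and asciifolding if your users are likely to type accented queries.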
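  • Shard count: Three shards work well for indexes up to a few hundred gigabytes. With number_of_replicas set to 1 on a single-node cluster, the replica shards have nowhere to be allocated, so cluster health reports yellow until you add a second data node or set replicas to 0. FoundryDB's default limits allow up to 1000 fields per index and a 1-second refresh interval, both configurable.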
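To see why the index and search analyzers differ, it helps to look at the terms each one produces. The sketch below is a rough Python approximation of product_analyzer, not OpenSearch's actual implementation: a regular expression stands in for the standard tokenizer, and Unicode decomposition stands in for asciifolding. All function names are illustrative.

```python
import re
import unicodedata


def ascii_fold(token: str) -> str:
    """Strip diacritics, approximating the asciifolding filter."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def standard_tokens(text: str) -> list[str]:
    """Rough stand-in for the standard tokenizer plus the lowercase filter."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def index_terms(text: str, min_gram: int = 2, max_gram: int = 15) -> set[str]:
    """Terms stored at index time: folded tokens expanded into edge n-grams."""
    terms = set()
    for token in standard_tokens(text):
        folded = ascii_fold(token)
        # Tokens shorter than min_gram produce no n-grams at all.
        for size in range(min_gram, min(max_gram, len(folded)) + 1):
            terms.add(folded[:size])
    return terms


terms = index_terms("Wireless Ergonomic Keyboard")
print(sorted(t for t in terms if t.startswith("key")))

# At query time the text is only tokenized and lowercased, so a partially
# typed word matches when it equals one of the stored prefixes.
print([t in terms for t in standard_tokens("keyb")])
```

Because the n-gram expansion happens only at index time, a query for "keyb" is looked up as the single term "keyb" rather than being expanded into "ke", "key", and so on, which would match far too many documents.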

Index Documents

Bulk indexing is significantly faster than single-document inserts. Use the _bulk API for any batch larger than a handful of documents:

curl -u admin:YOUR_PASS -X POST \
https://product-search.db.foundrydb.com:9200/products/_bulk \
-H "Content-Type: application/x-ndjson" \
-d '
{"index": {}}
{"name": "Wireless Ergonomic Keyboard", "description": "Bluetooth mechanical keyboard with split layout and adjustable tenting", "category": "peripherals", "price": 149.99, "tags": ["keyboard", "ergonomic", "bluetooth"], "created_at": "2026-03-15"}
{"index": {}}
{"name": "USB-C Docking Station Pro", "description": "Triple display dock with 100W passthrough charging and 2.5GbE ethernet", "category": "peripherals", "price": 229.00, "tags": ["dock", "usb-c", "displays"], "created_at": "2026-03-20"}
{"index": {}}
{"name": "Noise Cancelling Headphones", "description": "Over-ear wireless headphones with adaptive ANC and 40-hour battery", "category": "audio", "price": 349.99, "tags": ["headphones", "anc", "wireless"], "created_at": "2026-04-01"}
'
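When the documents come from application code rather than a hand-written request, the newline-delimited body can be generated. This is a minimal sketch that only builds the request body; sending it is left to whichever HTTP client you use, and the function name build_bulk_body is an assumption for illustration.

```python
import json


def build_bulk_body(docs: list[dict]) -> str:
    """Build a _bulk body: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # action line, auto-generated ID
        lines.append(json.dumps(doc))            # document source on one line
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


docs = [
    {"name": "Wireless Ergonomic Keyboard", "category": "peripherals", "price": 149.99},
    {"name": "Noise Cancelling Headphones", "category": "audio", "price": 349.99},
]
print(build_bulk_body(docs), end="")
```

json.dumps emits each object on a single line and escapes any newlines inside string values, which is what the NDJSON format requires.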

Full-Text Search Queries

Simple Match

The match query tokenizes your search term and finds documents containing any of the tokens:

curl -u admin:YOUR_PASS -X GET \
https://product-search.db.foundrydb.com:9200/products/_search \
-H "Content-Type: application/json" \
-d '{"query": {"match": {"name": "wireless keyboard"}}}'

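This returns only the keyboard: it is the only product whose name contains "wireless" or "keyboard". The headphones mention "wireless" in their description, which this query does not search. Each result includes a _score reflecting how well it matches.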

Search across multiple fields at once, with optional field boosting:

curl -u admin:YOUR_PASS -X GET \
  https://product-search.db.foundrydb.com:9200/products/_search \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "multi_match": {
        "query": "ergonomic bluetooth",
        "fields": ["name^3", "description", "tags^2"],
        "type": "best_fields"
      }
    }
  }'

The ^3 on name multiplies the relevance score of a match in the product name by three, so name matches generally outrank matches that occur only in the description. Because boosts are applied at query time, this is the fastest way to tune relevance without reindexing.
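The arithmetic behind best_fields can be sketched in a few lines: each field's score is multiplied by its boost, the highest boosted score becomes the document score, and the optional tie_breaker parameter adds a fraction of the remaining fields' scores. The raw per-field scores below are made-up numbers for illustration; real scores come from OpenSearch's BM25 ranking.

```python
def best_fields_score(
    field_scores: dict[str, float],
    boosts: dict[str, float],
    tie_breaker: float = 0.0,
) -> float:
    """Combine per-field scores the way a best_fields multi_match does."""
    boosted = [score * boosts.get(field, 1.0) for field, score in field_scores.items()]
    if not boosted:
        return 0.0
    best = max(boosted)
    # The remaining fields only contribute when tie_breaker is above zero.
    return best + tie_breaker * (sum(boosted) - best)


boosts = {"name": 3.0, "tags": 2.0}               # mirrors name^3 and tags^2
raw_scores = {"name": 1.5, "description": 2.0}    # hypothetical raw scores

print(best_fields_score(raw_scores, boosts))
print(best_fields_score(raw_scores, boosts, tie_breaker=0.5))
```

In this example the description has the higher raw score, but the boost lifts the name match above it, which is the effect the ^3 is meant to have.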

Boolean Queries for Filtering and Faceting

Combine full-text search with structured filters using bool queries. Clauses under filter only include or exclude documents: they skip scoring, which makes them faster, and their results can be cached:

curl -u admin:YOUR_PASS -X GET \
  https://product-search.db.foundrydb.com:9200/products/_search \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          {"multi_match": {"query": "wireless", "fields": ["name^2", "description"]}}
        ],
        "filter": [
          {"term": {"category": "peripherals"}},
          {"range": {"price": {"lte": 200}}}
        ]
      }
    }
  }'

This finds "wireless" products in the "peripherals" category priced at 200 or less. Only the keyboard matches.

Aggregations for Analytics

OpenSearch aggregations turn your search index into an analytics engine. Combine them with queries to build faceted navigation or dashboards.

curl -u admin:YOUR_PASS -X GET \
  https://product-search.db.foundrydb.com:9200/products/_search \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "aggs": {
      "by_category": {
        "terms": {"field": "category"},
        "aggs": {
          "avg_price": {"avg": {"field": "price"}},
          "price_range": {"stats": {"field": "price"}}
        }
      },
      "popular_tags": {
        "terms": {"field": "tags", "size": 10}
      }
    }
  }'

Setting "size": 0 returns only aggregation results, no document hits. This pattern powers category filters ("Peripherals (2)"), price range sliders, and tag clouds in search UIs.
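Turning the response into facet labels is mostly a matter of walking the buckets array that a terms aggregation returns. The sample response below is hand-written and trimmed to the relevant fields, with counts matching the three documents indexed earlier; facet_labels is an illustrative helper name.

```python
def facet_labels(response: dict, agg_name: str) -> list[str]:
    """Format terms-aggregation buckets as 'Label (count)' strings."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [f"{str(b['key']).title()} ({b['doc_count']})" for b in buckets]


# Trimmed, hand-written response in the shape of a terms aggregation result.
sample_response = {
    "aggregations": {
        "by_category": {
            "buckets": [
                {"key": "peripherals", "doc_count": 2},
                {"key": "audio", "doc_count": 1},
            ]
        }
    }
}

print(facet_labels(sample_response, "by_category"))
```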

Monitor Cluster Health

FoundryDB exposes key OpenSearch metrics through its monitoring API: cluster_health, active_shards, indexing_rate, search_rate, jvm_heap_used, and fielddata_evictions. Export these to Datadog, Prometheus, Grafana Cloud, or any of the seven supported destinations.

curl -u YOUR_API_KEY: \
"https://api.foundrydb.com/managed-services/{id}/metrics?metric=search_rate&period=1h"

Watch jvm_heap_used and fielddata_evictions closely. If heap pressure stays above 75% or fielddata evictions spike, it is time to either scale up (bigger plan) or scale out (add data nodes). Both operations are a single API call with no downtime.
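The word "stays" matters here: a single spike during garbage collection is normal, while sustained pressure is not. A simple way to express that in an alerting script is to require several consecutive samples above the threshold. This sketch assumes you have already extracted heap usage as a list of percentages; the shape of the metrics API response is not covered in this guide, and the five-sample window is an arbitrary choice.

```python
def sustained_heap_pressure(samples: list[float], threshold: float = 75.0,
                            min_consecutive: int = 5) -> bool:
    """True if heap usage exceeded the threshold for N samples in a row."""
    run = 0
    for pct in samples:
        run = run + 1 if pct > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical heap-usage percentages, oldest first.
print(sustained_heap_pressure([70, 82, 60, 81, 79]))      # isolated spikes
print(sustained_heap_pressure([76, 78, 81, 80, 79, 83]))  # sustained pressure
```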

Index Lifecycle Management

Production search indexes grow indefinitely without governance. Use OpenSearch's ISM (Index State Management) policies to automate rollover and cleanup:

curl -u admin:YOUR_PASS -X PUT \
  https://product-search.db.foundrydb.com:9200/_plugins/_ism/policies/logs-policy \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "description": "Roll over daily, delete after 30 days",
      "default_state": "hot",
      "states": [
        {
          "name": "hot",
          "actions": [{"rollover": {"min_size": "30gb", "min_index_age": "1d"}}],
          "transitions": [{"state_name": "delete", "conditions": {"min_index_age": "30d"}}]
        },
        {
          "name": "delete",
          "actions": [{"delete": {}}],
          "transitions": []
        }
      ]
    }
  }'

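This is especially useful for log analytics and event data where you want hot storage for recent data and automatic cleanup for older indexes. The policy only acts on indexes it is attached to (for example through an ism_template block with index_patterns in the policy), and the rollover action expects the index to be written through a rollover alias.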
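If you manage several retention tiers, generating the policy body from parameters avoids copy-paste drift. The sketch below builds a two-state policy of the same shape, including the default_state field that ISM requires; the function name and its defaults are assumptions for illustration, and the result still has to be sent to the _plugins/_ism/policies endpoint.

```python
import json


def build_ism_policy(rollover_size: str = "30gb", rollover_age: str = "1d",
                     delete_after: str = "30d") -> dict:
    """Build a hot -> delete ISM policy body with configurable thresholds."""
    return {
        "policy": {
            "description": (
                f"Roll over at {rollover_size} or {rollover_age}, "
                f"delete after {delete_after}"
            ),
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [
                        {"rollover": {"min_size": rollover_size,
                                      "min_index_age": rollover_age}}
                    ],
                    "transitions": [
                        {"state_name": "delete",
                         "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
        }
    }


print(json.dumps(build_ism_policy(delete_after="90d"), indent=2))
```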

Scaling Your Cluster

When a single node is not enough, add data nodes to distribute shards and increase throughput:

curl -u YOUR_API_KEY: -X POST \
https://api.foundrydb.com/managed-services/{id}/nodes \
-H "Content-Type: application/json" \
-d '{"role": "data"}'

FoundryDB handles cluster discovery, transport TLS between nodes, and shard rebalancing automatically. New nodes join the cluster using seed-based discovery, and the security plugin's shared CA ensures mutual TLS on the transport layer (port 9300) without manual certificate distribution.

What to Build Next

OpenSearch on FoundryDB covers more than product search. Common patterns include:

  • Log analytics: Ship application logs via Filebeat or Fluent Bit, use ISM for retention, and query with aggregations for dashboards.
  • Event analytics pipeline: Pair with managed Kafka for real-time ingestion. See the Event Analytics tutorial for a complete walkthrough.
  • Autocomplete: The edge n-gram analyzer shown above powers type-ahead search with sub-50ms latency on most plans.
  • Geospatial search: Use geo_point and geo_shape field types for location-based queries.

Ready to add search to your stack? Create a free FoundryDB account and provision OpenSearch in under five minutes. Check the OpenSearch documentation for connection examples in Python, Node.js, and curl, or explore the full API reference.