
Event Streaming with Managed Kafka: Patterns That Scale

7 min read
FoundryDB Team
Engineering @ FoundryDB

Event-driven architecture has moved from buzzword to baseline. If you are building microservices, real-time analytics, or data pipelines in 2026, Kafka is likely somewhere in the stack. The challenge is not whether to use it, but how to operate it without drowning in ZooKeeper configs, TLS certificate rotation, and broker rebalancing.

This post covers practical streaming patterns on FoundryDB's managed Kafka: topic design, partitioning strategies, consumer groups, schema enforcement, and monitoring. All examples use FoundryDB's Kafka 4.0 with KRaft mode and SASL/SCRAM authentication.

Why Managed Kafka

Running Kafka in production has historically meant managing ZooKeeper quorums, rolling broker upgrades, and debugging ISR shrinkage at 2 AM. FoundryDB removes that overhead entirely.

Every Kafka service on FoundryDB runs in KRaft mode (no ZooKeeper dependency), with SASL/SCRAM-SHA-256 authentication enabled by default and TLS on all client connections. You get per-topic ACLs, automated backups, consumer lag monitoring, and broker scaling through a single API. Provisioning takes under five minutes.

Provisioning a Kafka Service

A single API call creates a production-ready Kafka cluster. The CLI works just as well.

# Via CLI
fdb services create \
  --name events-prod \
  --engine kafka \
  --version 4.0 \
  --plan tier-4 \
  --storage 200 \
  --storage-tier maxiops \
  --zone eu-helsinki

# Via API
curl -u admin:password -X POST \
  https://api.foundrydb.com/managed-services \
  -H "Content-Type: application/json" \
  -d '{
    "name": "events-prod",
    "database_type": "kafka",
    "version": "4.0",
    "plan_name": "tier-4",
    "storage_size_gb": 200,
    "storage_tier": "maxiops",
    "zone": "eu-helsinki"
  }'

Once the service is running, grab your bootstrap server and SASL credentials from fdb services get events-prod --format json. Every example below assumes these are set as environment variables.

Topic Design Patterns

Topic design is where most Kafka architectures succeed or fail. A poorly designed topic forces you into expensive rework later. Three patterns cover the majority of use cases.

Event Sourcing

Store every state change as an immutable event. The topic becomes your system of record, and current state is derived by replaying events.

curl -u admin:password -X POST \
  https://api.foundrydb.com/managed-services/{id}/kafka/topics \
  -H "Content-Type: application/json" \
  -d '{
    "name": "orders.events",
    "partitions": 12,
    "replication_factor": 3,
    "config": {
      "cleanup.policy": "compact",
      "retention.ms": "-1",
      "min.insync.replicas": "2"
    }
  }'

Use cleanup.policy: compact with infinite retention. Kafka keeps the latest value for each key, so your topic does not grow unbounded while still preserving the full current state.
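To make the replay idea concrete, here is a minimal sketch of how a consumer derives current state from a compacted topic: read from offset 0 and keep the latest value per key. Pure Python, no Kafka client; the event shape is illustrative.

```python
def replay(events):
    """Fold an ordered event stream into current state, keyed by order_id.

    Compaction keeps the latest record per key, so replaying the topic
    yields exactly one current value for every order.
    """
    state = {}
    for event in events:
        state[event["order_id"]] = event["status"]
    return state

events = [
    {"order_id": "order-1", "status": "CREATED"},
    {"order_id": "order-2", "status": "CREATED"},
    {"order_id": "order-1", "status": "PAID"},
    {"order_id": "order-1", "status": "SHIPPED"},
]
print(replay(events))  # latest status per order
```

Note that compaction only removes superseded records for the same key; ordering within a key is preserved, which is what makes the fold correct.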

Change Data Capture (CDC)

Capture row-level changes from a source database and stream them into Kafka. Each topic mirrors a table: cdc.public.users, cdc.public.orders. Consumers build materialized views, update search indexes, or sync data warehouses.

Use cleanup.policy: delete with a retention window that matches your recovery SLA (7 to 30 days is typical). Partition by the table's primary key so all changes to a given row land on the same partition in order.
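The "same row, same partition" guarantee comes from keyed partitioning: Kafka's default partitioner hashes the serialized key (murmur2 in the Java client) modulo the partition count. The sketch below uses Python's hashlib as a stand-in for the real hash; the point is only that the mapping is deterministic.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Stand-in for Kafka's default keyed partitioner (not murmur2)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every change to the same primary key lands on the same partition,
# so a consumer sees that row's changes in order.
p1 = partition_for("user-42", 12)
p2 = partition_for("user-42", 12)
```

One corollary: changing the partition count changes the mapping, so plan partition counts up front for CDC topics where per-key ordering matters.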

Command/Query Separation (CQRS)

Separate write-side commands from read-side queries using dedicated topics. Commands like payments.commands carry intent ("charge this card"), while events like payments.events carry facts ("payment succeeded"). This decouples write validation from read model updates and lets each side scale independently.
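A minimal sketch of the write side, assuming illustrative payload shapes: a handler validates a command from payments.commands and emits a fact destined for payments.events. The function names and fields are hypothetical, not a FoundryDB API.

```python
def handle_charge_command(command: dict) -> dict:
    """Validate a payments.commands message and return a payments.events fact."""
    if command["amount_cents"] <= 0:
        return {"type": "payment_rejected",
                "payment_id": command["payment_id"],
                "reason": "invalid amount"}
    # In a real service, the charge would happen here before the fact is emitted.
    return {"type": "payment_succeeded",
            "payment_id": command["payment_id"],
            "amount_cents": command["amount_cents"]}

event = handle_charge_command({"payment_id": "pay-7", "amount_cents": 4999})
```

Read models subscribe only to the events topic, so they never see rejected or in-flight commands.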

Partitioning Strategies

Partitions are the unit of parallelism in Kafka. The partition key determines which partition a message lands on, and getting it right is critical for ordering guarantees and even load distribution.

| Strategy | Partition Key | Ordering Guarantee | Best For |
| --- | --- | --- | --- |
| By entity ID | order_id, user_id | All events for one entity in order | Event sourcing, CDC |
| By tenant | tenant_id | All events for one tenant in order | Multi-tenant SaaS |
| Round-robin | null (no key) | None | Logs, metrics, high-throughput analytics |

By entity ID is the most common choice. It guarantees that all events for a given order or user arrive at the same consumer in sequence, which is essential for correct state reconstruction.

By tenant works well for SaaS platforms where you need tenant-level ordering but want to spread load across partitions. Watch out for hot partitions if one tenant produces disproportionately more traffic.

Round-robin (null key) maximizes throughput by distributing messages evenly, but sacrifices ordering. Use it only when message order does not matter.

Consumer Groups and Exactly-Once Semantics

Consumer groups let you scale processing horizontally. Kafka assigns each partition to exactly one consumer within a group, so adding consumers (up to the partition count) increases throughput linearly.
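The "up to the partition count" ceiling falls out of the assignment rule. This sketch mimics round-robin assignment; real assignors (range, cooperative-sticky) differ in detail, but the one-consumer-per-partition property is the same.

```python
def assign(partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin each partition to exactly one group member."""
    assignment = {c: [] for c in consumers}
    for p in range(partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 12 partitions, 3 consumers: 4 partitions each.
assignment = assign(12, ["c1", "c2", "c3"])
# A 13th consumer would sit idle -- there is no partition left to give it.
```

This is why partition count is effectively your maximum parallelism: overprovision partitions modestly when creating a topic rather than trying to add consumers past the ceiling later.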

# Monitor consumer lag via the FoundryDB API
curl -u admin:password \
  https://api.foundrydb.com/managed-services/{id}/kafka/consumer-groups/order-processors/lag

For exactly-once processing, combine Kafka's idempotent producer with transactional consumers. The producer assigns sequence numbers to each message, and the consumer commits offsets within a transaction alongside its output writes. In Java:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties props = new Properties();
props.put("bootstrap.servers", System.getenv("KAFKA_BOOTSTRAP"));
props.put("security.protocol", "SASL_SSL");
props.put("sasl.mechanism", "SCRAM-SHA-256");
props.put("sasl.jaas.config",
    "org.apache.kafka.common.security.scram.ScramLoginModule required "
        + "username=\"" + System.getenv("KAFKA_USER") + "\" "
        + "password=\"" + System.getenv("KAFKA_PASSWORD") + "\";");

// Exactly-once producer
props.put("enable.idempotence", "true");
props.put("transactional.id", "order-processor-1");
props.put("acks", "all");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.initTransactions();
The key settings are enable.idempotence: true, acks: all, and a stable transactional.id. On the broker side, FoundryDB defaults to min.insync.replicas: 2, so acks=all requires at least two brokers to acknowledge before a write is considered committed.
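For non-JVM services, the same exactly-once settings translate directly to a confluent_kafka (Python) config dict. This is a sketch with placeholder credentials; in practice, read them from the environment as in the Java example.

```python
# Equivalent exactly-once producer config for confluent_kafka (Python).
exactly_once_conf = {
    "bootstrap.servers": "events-prod.db.foundrydb.com:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "SCRAM-SHA-256",
    "sasl.username": "order-service",
    "sasl.password": "your-password",   # placeholder
    "enable.idempotence": True,
    "transactional.id": "order-processor-1",  # must be stable across restarts
    "acks": "all",
}
# With Producer(exactly_once_conf), the transactional loop is:
# init_transactions(), then per batch: begin_transaction(), produce()
# the outputs, send_offsets_to_transaction(), commit_transaction().
```

A stable transactional.id is what lets the broker fence zombie instances of the same processor after a restart.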

Schema Registry for Contract Enforcement

As your topic count grows, schema drift becomes a real problem. A producer changes a field name, and three downstream consumers break silently. Schema Registry enforces contracts between producers and consumers.

FoundryDB supports Confluent-compatible Schema Registry. Register Avro or JSON Schema definitions, and the serializer validates every message before it hits the broker.

from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

schema_registry = SchemaRegistryClient({
    "url": "https://events-prod.db.foundrydb.com:8081"
})

order_schema = """{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "status", "type": {"type": "enum", "name": "Status",
        "symbols": ["CREATED", "PAID", "SHIPPED", "DELIVERED"]}},
    {"name": "amount_cents", "type": "long"},
    {"name": "timestamp", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}"""

serializer = AvroSerializer(schema_registry, order_schema)

producer = Producer({
    "bootstrap.servers": "events-prod.db.foundrydb.com:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "SCRAM-SHA-256",
    "sasl.username": "order-service",
    "sasl.password": "your-password",
})

producer.produce(
    topic="orders.events",
    key="order-12345",
    value=serializer(
        {"order_id": "order-12345", "status": "CREATED",
         "amount_cents": 4999, "timestamp": 1775212800000},
        # The serializer needs the topic and field context to pick the
        # registry subject (orders.events-value by default).
        SerializationContext("orders.events", MessageField.VALUE),
    ),
)
producer.flush()

Set schema compatibility to BACKWARD (the default) so consumers using older schemas can still read messages produced with newer schemas. This lets you evolve schemas safely by adding optional fields without breaking existing consumers.
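In practice, "adding optional fields" means adding fields with defaults. The sketch below shows a BACKWARD-compatible evolution of the OrderEvent schema; the currency field is illustrative, not part of the original schema.

```python
# New field added in schema v2: safe under BACKWARD compatibility because
# it carries a default, so v2 consumers can still decode v1 messages.
new_field = {"name": "currency", "type": "string", "default": "EUR"}

order_schema_v2_fields = [
    {"name": "order_id", "type": "string"},
    {"name": "amount_cents", "type": "long"},
    new_field,
]
# Removing a field, or adding one *without* a default, would be rejected
# by the registry when you try to register the new schema version.
```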

Monitoring: What to Watch

Two metrics matter more than all others for Kafka operations.

Consumer lag measures how far behind a consumer group is from the latest offset. Rising lag means consumers cannot keep up with producers. The fix is usually more consumer instances (up to your partition count) or faster processing logic. Monitor via the FoundryDB API:

curl -u admin:password \
"https://api.foundrydb.com/managed-services/{id}/metrics?metric=consumer_lag&period=1h"

Under-replicated partitions counts partitions whose follower replicas have fallen behind the leader. A non-zero count means your cluster is at risk of data loss if another broker fails. In a managed environment, FoundryDB handles broker recovery automatically, but this metric is still worth alerting on: a sustained non-zero value warrants a support ticket.

Other metrics to track: messages_in_per_sec (throughput), bytes_in_per_sec and bytes_out_per_sec (network saturation), and disk usage (plan your retention window accordingly).

Connecting with SASL/SCRAM

Every Kafka client needs four things: bootstrap server, security protocol, SASL mechanism, and credentials. Here is a minimal Python consumer for reference:

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    "orders.events",
    bootstrap_servers="events-prod.db.foundrydb.com:9093",
    security_protocol="SASL_SSL",
    sasl_mechanism="SCRAM-SHA-256",
    sasl_plain_username="order-analytics",
    sasl_plain_password="your-password",
    group_id="analytics-workers",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    print(f"Partition {message.partition}, Offset {message.offset}: {message.value}")

The bootstrap port is always 9093 for SASL connections. Port 9092 is reserved for internal broker communication and is not exposed to clients.

Next Steps

Start with one topic and one consumer group. Get the partitioning strategy right, enforce schemas early, and monitor consumer lag from day one. Retrofitting these patterns into an existing system is significantly harder than building them in from the start.

Create your first Kafka service on FoundryDB in under five minutes with the quick start guide, or dive into the full Kafka documentation for topic management, ACLs, and broker scaling. If you are building event-driven microservices, the RAG pipeline tutorial shows how Kafka integrates with PostgreSQL and Valkey on the same platform.