Introduction
DeltaForge is a versatile, high-performance Change Data Capture (CDC) engine built in Rust. It streams database changes into downstream systems like Kafka, Redis, and NATS - giving you full control over how events are routed, transformed, and delivered. Built-in schema discovery automatically infers and tracks the shape of your data as it flows through, including deep inspection of nested JSON structures.
Pipelines are defined declaratively in YAML, making it straightforward to onboard new use cases without custom code.
| Built with | Sources | Processors | Sinks | Output Formats |
|
Rust |
MySQL · PostgreSQL |
JavaScript · Outbox |
Kafka · Redis · NATS |
|
Why DeltaForge?
Core Capabilities
- ⚡ Powered by Rust : Predictable performance, memory safety, and minimal resource footprint.
- 🔌 Pluggable architecture : Sources, processors, and sinks are modular and independently extensible.
- 🧩 Declarative pipelines : Define sources, transforms, sinks, and commit policies in version-controlled YAML with environment variable expansion for secrets.
- 📦 Reliable checkpointing : Resume safely after restarts with at-least-once delivery guarantees.
- 🔀 Dynamic routing : Route events to per-table topics, streams, or subjects using templates or JavaScript logic.
- 📤 Transactional outbox : Publish domain events atomically with database writes. Per-aggregate routing, raw payload delivery, zero polling.
- 🛠️ Cloud-native ready : Single binary, Docker images, JSON logs, Prometheus metrics, and liveness/readiness probes for Kubernetes.
Schema Intelligence
- 🔍 Schema sensing : Automatically infer and track schema from event payloads, including deep inspection of nested JSON structures.
- 🗺️ High-cardinality handling : Detect and normalize dynamic map keys (session IDs, trace IDs) to prevent false schema evolution events.
- 🏷️ Schema fingerprinting : SHA-256 based change detection with schema-to-checkpoint correlation for reliable replay.
- 🗃️ Source-owned semantics : Preserves native database types (PostgreSQL arrays, MySQL JSON, etc.) instead of normalizing to a universal type system.
Operational Features
- 🔄 Graceful failover : Handles source failover with automatic schema revalidation - no manual intervention needed.
- 🧬 Zero-downtime schema evolution : Detects DDL changes and reloads schemas automatically, no pipeline restart needed.
- 🎯 Flexible table selection : Wildcard patterns (
db.*,schema.prefix%) for easy onboarding. - 📀 Transaction boundaries : Optionally keep source transactions intact across batches.
- ⚙️ Commit policies : Control checkpoint behavior with
all,required, orquorummodes across multiple sinks. - 🔧 Live pipeline management : Pause, resume, patch, and inspect running pipelines via the REST API.
- 🗄️ Safe initial snapshot : Consistent parallel backfill of existing tables before streaming begins, with binlog/WAL retention validation, background guards, and crash-resume at table granularity.
Use Cases
DeltaForge is designed for:
- Real-time data synchronization : Keep caches, search indexes, and analytics systems in sync with your primary database.
- Event-driven architectures : Stream database changes to Kafka or NATS for downstream microservices.
- Transactional messaging : Use the outbox pattern to publish domain events atomically with database writes - no distributed transactions needed.
- Audit trails and compliance : Capture every mutation with full before/after images for SOC2, HIPAA, or GDPR requirements.
- Lightweight ETL : Transform, filter, and route data in-flight with JavaScript processors - no Spark or Flink cluster needed.
DeltaForge is not a DAG-based stream processor. It is a focused CDC engine meant to replace tools like Debezium when you need a lighter, cloud-native, and more customizable runtime.
Getting Started
| Guide | Description |
|---|---|
| Quickstart | Get DeltaForge running in minutes |
| CDC Overview | Understand Change Data Capture concepts |
| Configuration | Pipeline spec reference |
| Development | Build from source, contribute |
Quick Example
metadata:
name: orders-to-kafka
tenant: acme
spec:
source:
type: mysql
config:
dsn: ${MYSQL_DSN}
tables: [shop.orders]
processors:
- type: javascript
inline: |
(event) => {
event.processed_at = Date.now();
return [event];
}
sinks:
- type: kafka
config:
brokers: ${KAFKA_BROKERS}
topic: order-events
Installation
Docker (recommended):
docker pull ghcr.io/vnvo/deltaforge:latest
From source:
git clone https://github.com/vnvo/deltaforge.git
cd deltaforge
cargo build --release
See the Development Guide for detailed build instructions and available Docker image variants.
License
DeltaForge is dual-licensed under MIT and Apache 2.0.