Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

DeltaForge

Version Arch License

Introduction

DeltaForge is a versatile, high-performance Change Data Capture (CDC) engine built in Rust. It streams database changes into downstream systems like Kafka, Redis, and NATS - giving you full control over how events are routed, transformed, and delivered. Built-in schema discovery automatically infers and tracks the shape of your data as it flows through, including deep inspection of nested JSON structures.

Pipelines are defined declaratively in YAML, making it straightforward to onboard new use cases without custom code.

Built with Sources Processors Sinks Output Formats
Rust
Rust
MySQL PostgreSQL
MySQL · PostgreSQL
JavaScript
JavaScript · Outbox
Kafka Redis NATS
Kafka · Redis · NATS
Native Debezium CloudEvents

Why DeltaForge?

Core Capabilities

  • Powered by Rust : Predictable performance, memory safety, and minimal resource footprint.
  • 🔌 Pluggable architecture : Sources, processors, and sinks are modular and independently extensible.
  • 🧩 Declarative pipelines : Define sources, transforms, sinks, and commit policies in version-controlled YAML with environment variable expansion for secrets.
  • 📦 Reliable checkpointing : Resume safely after restarts with at-least-once delivery guarantees.
  • 🔀 Dynamic routing : Route events to per-table topics, streams, or subjects using templates or JavaScript logic.
  • 📤 Transactional outbox : Publish domain events atomically with database writes. Per-aggregate routing, raw payload delivery, zero polling.
  • 🛠️ Cloud-native ready : Single binary, Docker images, JSON logs, Prometheus metrics, and liveness/readiness probes for Kubernetes.

Schema Intelligence

  • 🔍 Schema sensing : Automatically infer and track schema from event payloads, including deep inspection of nested JSON structures.
  • 🗺️ High-cardinality handling : Detect and normalize dynamic map keys (session IDs, trace IDs) to prevent false schema evolution events.
  • 🏷️ Schema fingerprinting : SHA-256 based change detection with schema-to-checkpoint correlation for reliable replay.
  • 🗃️ Source-owned semantics : Preserves native database types (PostgreSQL arrays, MySQL JSON, etc.) instead of normalizing to a universal type system.

Operational Features

  • 🔄 Graceful failover : Handles source failover with automatic schema revalidation - no manual intervention needed.
  • 🧬 Zero-downtime schema evolution : Detects DDL changes and reloads schemas automatically, no pipeline restart needed.
  • 🎯 Flexible table selection : Wildcard patterns (db.*, schema.prefix%) for easy onboarding.
  • 📀 Transaction boundaries : Optionally keep source transactions intact across batches.
  • ⚙️ Commit policies : Control checkpoint behavior with all, required, or quorum modes across multiple sinks.
  • 🔧 Live pipeline management : Pause, resume, patch, and inspect running pipelines via the REST API.
  • 🗄️ Safe initial snapshot : Consistent parallel backfill of existing tables before streaming begins, with binlog/WAL retention validation, background guards, and crash-resume at table granularity.

Use Cases

DeltaForge is designed for:

  • Real-time data synchronization : Keep caches, search indexes, and analytics systems in sync with your primary database.
  • Event-driven architectures : Stream database changes to Kafka or NATS for downstream microservices.
  • Transactional messaging : Use the outbox pattern to publish domain events atomically with database writes - no distributed transactions needed.
  • Audit trails and compliance : Capture every mutation with full before/after images for SOC2, HIPAA, or GDPR requirements.
  • Lightweight ETL : Transform, filter, and route data in-flight with JavaScript processors - no Spark or Flink cluster needed.

DeltaForge is not a DAG-based stream processor. It is a focused CDC engine meant to replace tools like Debezium when you need a lighter, cloud-native, and more customizable runtime.

Getting Started

GuideDescription
QuickstartGet DeltaForge running in minutes
CDC OverviewUnderstand Change Data Capture concepts
ConfigurationPipeline spec reference
DevelopmentBuild from source, contribute

Quick Example

metadata:
  name: orders-to-kafka
  tenant: acme

spec:
  source:
    type: mysql
    config:
      dsn: ${MYSQL_DSN}
      tables: [shop.orders]

  processors:
    - type: javascript
      inline: |
        (event) => {
          event.processed_at = Date.now();
          return [event];
        }

  sinks:
    - type: kafka
      config:
        brokers: ${KAFKA_BROKERS}
        topic: order-events

Installation

Docker (recommended):

docker pull ghcr.io/vnvo/deltaforge:latest

From source:

git clone https://github.com/vnvo/deltaforge.git
cd deltaforge
cargo build --release

See the Development Guide for detailed build instructions and available Docker image variants.

License

DeltaForge is dual-licensed under MIT and Apache 2.0.