Guarantees & Correctness

This page defines DeltaForge’s data delivery guarantees, ordering model, transaction semantics, failure handling, and operational boundaries. Every claim here is backed by the implementation — no aspirational statements.

Delivery Guarantees

Per-sink delivery tiers

Sink	Delivery guarantee	Dedup mechanism	Consumer action required
Kafka (`exactly_once: true`)	End-to-end exactly-once	Kafka two-phase commit per batch	Set `isolation.level=read_committed`
Kafka (default)	At-least-once (idempotent producer)	Retries are deduped by rdkafka; crash-replay produces duplicates	Dedup by event ID or idempotency key
NATS JetStream	At-least-once + server-side dedup	`Nats-Msg-Id` header within `duplicate_window`	Configure `duplicate_window` on stream
Redis Streams	At-least-once + consumer-side dedup	`idempotency_key` field in XADD payload	Check `idempotency_key` before processing

Terminology rule: “exactly-once” is used only when DeltaForge guarantees no duplicates without consumer cooperation. All other sinks are “at-least-once” with a stated dedup mechanism. This distinction matters — calling NATS or Redis “exactly-once” would be misleading because dedup depends on server configuration or consumer behavior outside DeltaForge’s control.

What “at-least-once” means

No data loss: every event from the source is delivered to the sink at least once. Checkpoints are saved only after the sink acknowledges delivery — never before.
Duplicates on crash recovery: if DeltaForge crashes after delivering a batch but before saving the checkpoint, that batch is replayed on restart. Consumers must handle duplicates (see Consumer Guidance below).
No silent drops: events are never discarded. If delivery fails, the batch is retried with exponential backoff until it succeeds or a fatal error stops the pipeline.

What “exactly-once” means (Kafka)

With exactly_once: true, each batch is wrapped in a Kafka transaction (begin_transaction / commit_transaction). Consumers using isolation.level=read_committed only see committed batches — no partial deliveries. If a transaction fails, it is aborted and retried from the same checkpoint position.

Exactly-once overhead is ~7-11% with properly tuned batch sizes. See the Performance guide for benchmark details.

Ordering Model

Within a source

Events are emitted in the source’s native order:

MySQL: binlog file + position order. WriteRowsEvent batches preserve row order within each binlog event.
PostgreSQL: LSN (Log Sequence Number) order. One WAL message per row change.
Turso: change_id order (monotonically increasing).

DeltaForge does not reorder events. The source order is preserved through the pipeline.

Within a batch

All events in a batch maintain their source order. The delivery task processes batches in FIFO order from a bounded channel — no reordering between batches.

Per-primary-key ordering (the core guarantee)

DeltaForge guarantees per-primary-key ordering within a table under non-sharded operation. This means: for any single row identified by its primary key, all changes (INSERT, UPDATE, DELETE) are delivered to the sink in the exact order they occurred in the source database.

For Kafka specifically: the default message key is the serialized primary key, so events for the same row always go to the same partition and arrive in order. With dynamic routing (key template), ordering follows the resolved key — events with the same key are ordered; different keys may land in different partitions.

Cross-table ordering

There is no global ordering across tables. Events from different tables may be interleaved across batches. This is by design — enforcing global ordering would require single-threaded delivery, which would cap throughput.

However, when batch.respect_source_tx: true (the default), all rows from a single database transaction are kept in the same batch (see Transaction Boundaries below). This preserves causal ordering within a transaction.

Ordering under retries

When a batch delivery fails and is retried, the batch is re-delivered as a unit in the same order. No reordering occurs within or across retries. The max_inflight=1 setting (default) ensures strict ordering; with max_inflight > 1, batches are still delivered in FIFO order by the single-threaded delivery task.

Cross-sink ordering

All sinks receive the same batch simultaneously. The relative order of events is identical across all sinks.

Transaction Boundaries

How it works

When batch.respect_source_tx: true (the default), the coordinator checks each event’s tx_end flag before splitting a batch:

MySQL: tx_end is set on the last row of each XID (transaction commit) event.
PostgreSQL: tx_end is set on the COMMIT WAL record.
Turso: each change is its own transaction (tx_end always true).

The batch accumulator will not split a batch at a point that would separate rows from the same transaction. If the batch limit (max_events or max_bytes) is reached mid-transaction, the batch grows beyond the limit to include all remaining rows in that transaction.

What this guarantees

All rows from one database transaction appear in the same batch.
Each batch is delivered atomically to each sink (all events in the batch succeed or fail together).
With Kafka exactly_once: true, the entire batch is committed as a single Kafka transaction — consumers see all rows from the DB transaction atomically.
Cross-table transactions: a transaction spanning tables A and B is emitted as a single batch containing events for both tables, tagged with the same tx_id. The batch is delivered atomically. This is stronger than “tagged but not grouped” — all events from one DB transaction are in one batch and delivered as a unit.

Precise transaction semantics

To avoid ambiguity, here is exactly what DeltaForge guarantees about transactions:

Events from one source transaction are emitted contiguously within a single batch.
Multi-table transactions preserve commit grouping — all rows from tables A and B in one DB transaction appear in the same batch.
Within a single sink, events from one transaction are delivered atomically (the batch succeeds or fails as a unit).
Across heterogeneous sinks, DeltaForge does not guarantee atomic commit. Kafka may commit a transaction while Redis is still retrying. Each sink’s checkpoint tracks its own progress independently.
Retries do not break transaction grouping — a retried batch contains the same events in the same order.
Under non-sharded operation, no sink may observe partial progress within a source transaction (the batch is the commit unit).

Edge cases

A single database transaction that exceeds max_events or max_bytes is still kept in one batch. The limits are exceeded rather than the transaction being split.
With respect_source_tx: false, batches are split purely by size/time limits regardless of transaction boundaries. Cross-table transaction atomicity is not preserved in this mode.

Failure Isolation

Per-sink independence

All sinks deliver concurrently. One sink’s failure does not block other sinks:

The coordinator dispatches the same batch to all sinks simultaneously.
Each sink’s delivery result is collected independently.
Only sinks that delivered successfully get their checkpoints advanced.
Failed sinks remain at their prior checkpoint position — they will receive the same batch again on retry or restart.

Required vs. optional sinks

Each sink is marked required: true (default) or required: false:

Required: must succeed for the pipeline to consider the batch delivered. If a required sink fails, no checkpoint advances for any sink.
Optional (best-effort): failures are logged but don’t prevent the pipeline from advancing. Optional sinks that fail will catch up on restart via replay from their own checkpoint.

Commit policy

The commit policy determines when checkpoints advance:

Policy	Behavior
`required` (default)	All `required: true` sinks must acknowledge
`all`	Every sink (required and optional) must acknowledge
`quorum`	At least N sinks must acknowledge

The policy is checked before any checkpoint is committed. If the policy isn’t satisfied, no sink advances — this prevents optional sinks from getting ahead of failed required sinks.

Per-sink checkpoints

Each sink maintains its own checkpoint key ({source_id}::sink::{sink_id}). On restart, the source replays from the minimum checkpoint across all sinks. This means:

A fast sink is never held back by a slow one during normal operation.
A slow or failed sink only causes replay for itself, not re-delivery to sinks that are already ahead.
Adding a new sink triggers replay from the source’s earliest available position for that sink only.

Fatal errors

Some errors are unrecoverable and stop the pipeline immediately:

Kafka ProducerFenced: another producer instance started with the same transactional.id. The broker fences the old producer permanently.
Permanent auth revocation: credentials are invalid and retrying won’t help.

Fatal errors return SinkError::Fatal and are not retried. The pipeline stops and requires operator intervention.

Error Classification & Retry

Retry behavior by sink

All sinks use exponential backoff with jitter. The classification determines whether an error is retried:

Kafka:

Error	Classification	Behavior
Queue full	Retryable	Backoff, retry (100ms base, 10s max, 3 attempts)
Message timeout	Retryable	Backoff, retry
Broker connection failure	Retryable	Backoff, retry
Authentication failure	Non-retryable	Fail immediately
Message too large	Non-retryable	Fail immediately
Producer fenced	Fatal	Pipeline stops
Transaction commit failure (fatal)	Fatal	Pipeline stops

NATS:

Error	Classification	Behavior
Connection failure	Retryable	Backoff, retry (50ms base, 5s max, 3 attempts)
Publish timeout	Retryable	Backoff, retry
Authentication failure	Non-retryable	Fail immediately
No responders	Non-retryable	Fail immediately

Redis:

Error	Classification	Behavior
Connection failure	Retryable	Backoff, retry (50ms base, 5s max, 3 attempts)
Command timeout	Retryable	Backoff, retry
NOAUTH / WRONGPASS	Non-retryable	Fail immediately
Permission denied	Non-retryable	Fail immediately

After retry exhaustion

If all retry attempts fail for a retryable error, the error is propagated to the coordinator. The coordinator’s behavior depends on the commit policy:

Required sink: the batch is not committed, and the pipeline will retry the entire batch on the next cycle.
Optional sink: the failure is logged, and the pipeline continues with other sinks.

Checkpoint Semantics

When checkpoints are saved

The checkpoint commit follows a strict sequence:

1. Accumulate events from source into a batch
2. Run processors (transform, filter)
3. Deliver batch to ALL sinks concurrently
4. Check commit policy (required/all/quorum)
5. Commit per-sink checkpoints (only for successful sinks)

Key invariant: a checkpoint is saved only after the sink has acknowledged delivery AND the commit policy is satisfied. This is the foundation of at-least-once delivery.

On crash recovery

DeltaForge reads per-sink checkpoints from the checkpoint store.
The source resumes from the minimum checkpoint across all sinks.
Sinks that were already ahead of the minimum position receive duplicate events — they must handle these idempotently (or use exactly-once mode).
Sinks that were behind receive their missing events.

Checkpoint storage

Checkpoints are stored in SQLite (default) with WAL mode and synchronous=NORMAL for durability. The checkpoint store survives SIGKILL — no graceful shutdown required for checkpoint safety.

Backpressure

DeltaForge implements end-to-end backpressure without dropping events:

Source → [event channel] → Accumulator → [batch channel (max_inflight)] → Delivery → Sinks

Sink slow: delivery task blocks waiting for sink acknowledgement.
Batch channel full: accumulator blocks waiting to enqueue the next batch (bounded by max_inflight).
Event channel full: source blocks waiting to enqueue the next event.
Source slows: the database connection idles until the channel has capacity.

No events are dropped at any stage. Backpressure propagates from the slowest sink all the way back to the source connection.

max_inflight controls the pipeline depth: higher values allow overlapping batch building with delivery (better throughput), lower values reduce memory usage and latency.

Consumer Guidance

Idempotency key

Every event has a deterministic idempotency key in the format:

{tenant}|{db}.{table}|{tx_id}|{event_id}

This key is identical across replays — the same source event always produces the same key.

Kafka with exactly_once: true: set isolation.level=read_committed on consumers. No application-level dedup needed.
Kafka without exactly_once: use the event’s id field (UUID v7) or the idempotency key to detect duplicates.
NATS JetStream: server-side dedup via Nats-Msg-Id header. Configure duplicate_window on the stream to cover your maximum expected downtime (default: 2 minutes).
Redis Streams: check the idempotency_key field in the stream entry before processing. Use a Redis SET or application-level tracking to remember processed keys.

Dedup window

How long should consumers remember processed event IDs? Match your maximum expected DeltaForge downtime:

Scenario	Recommended window
Normal operation (no crashes)	No dedup needed (at-most-once per run)
Planned restarts	5 minutes
Unplanned crashes with auto-restart	15-30 minutes
Disaster recovery	Match your RPO

Correctness Test Matrix

Every guarantee is backed by a test. This matrix maps guarantees to their verification:

Guarantee	Test	Type	Status
No data loss (at-least-once)	`crash_recovery` chaos scenario	Chaos	Exists
Kafka end-to-end exactly-once	`exactly_once` chaos scenario + `kafka_sink_exactly_once_*`	Chaos + Integration	Exists
Producer fencing detection	`kafka_sink_exactly_once_producer_fencing`	Integration	Exists
Per-primary-key ordering	Events keyed by PK → same Kafka partition	By design	Verified via Kafka partition assignment
Transaction boundary preservation	`respect_source_tx` + `check_and_split` coordinator logic	Unit	Exists
Per-sink checkpoint independence	`test_per_sink_checkpoint_only_advances_on_success`	Unit	Exists
Per-sink checkpoint legacy fallback	`per_sink_proxy_falls_back_to_legacy_key`	Unit	Exists
Commit policy gate before checkpoint	`test_per_sink_checkpoint_only_advances_on_success`	Unit	Exists
DLQ routes per-event failures	`test_dlq_routes_failed_events_and_pipeline_continues`	Unit	Exists
DLQ all-fail batch	`test_dlq_all_events_fail_no_send`	Unit	Exists
DLQ overflow (drop_oldest)	`dlq::overflow_drop_oldest`	Unit	Exists
DLQ overflow (reject)	`dlq::overflow_reject_drops_new`	Unit	Exists
DLQ overflow (block)	`dlq::overflow_block_waits_for_ack`	Unit	Exists
DLQ cleanup expired	`dlq::cleanup_expired_removes_old_entries`	Unit	Exists
Partial batch timer flush	`test_partial_batch_flushed_by_timer`	Unit	Exists
Network partition recovery	`network_partition` chaos scenario	Chaos	Exists
Sink outage recovery	`sink_outage` chaos scenario	Chaos	Exists
Schema drift handling	`schema_drift` chaos scenario	Chaos	Exists
MySQL failover detection	`failover` chaos scenario	Chaos	Exists
Postgres failover detection	`pg_failover` chaos scenario	Chaos	Exists
Binlog purge detection	`binlog_purge` chaos scenario	Chaos	Exists
Replication slot drop detection	`slot_dropped` chaos scenario	Chaos	Exists
NATS dedup within window	Verify `Nats-Msg-Id` prevents duplicates	Integration	Planned
Redis idempotency key	Verify consumer-side dedup via key	Integration	Planned
Snapshot → CDC handoff	No gaps or duplicates at boundary	Integration	Planned

Limitations

These are not guaranteed and are documented honestly:

No cross-table global ordering — events from different tables may be interleaved. This is by design; enforcing global order would require single-threaded delivery and cap throughput. Use respect_source_tx: true to preserve ordering within database transactions.
No stateful stream processing — DeltaForge does not support joins, aggregations, or windowing. For stateful processing, consume DeltaForge’s output with Apache Flink, ksqlDB, or Kafka Streams.
Dead letter queue — when journal.enabled: true, poison events (serialization/routing failures) are routed to a DLQ instead of blocking the pipeline. Without DLQ enabled, a single bad event will still block. See the DLQ page.
No schema registry integration (yet) — schema sensing detects structural drift and can halt on breaking changes, but there is no Confluent Schema Registry or Avro/Protobuf encoding support. Planned for a future release.
Snapshot consistency — initial snapshots use lock-free parallel reads. The snapshot is eventually consistent with the CDC stream; there may be a brief overlap period where both snapshot rows and CDC events for the same row are delivered. Consumers should use the event timestamp or idempotency key to resolve.

Keyboard shortcuts

DeltaForge CDC Framework - User Guide