Skip to content

Durability Configuration

Configure data safety vs performance trade-offs for your workload.

Overview

NornicDB provides configurable durability settings that balance data safety with write performance. The default settings are optimized for most workloads, but you can enable Strict Durability Mode for critical data like financial transactions.

Quick Start

# Default: Good balance (recommended for most workloads)
# No configuration needed - these are the defaults

# Strict: Maximum safety for financial/critical data
NORNICDB_STRICT_DURABILITY=true

Configuration Options

Environment Variables

Variable Default Description
NORNICDB_STRICT_DURABILITY false Enable maximum safety mode (2-5x slower)
NORNICDB_WAL_SYNC_MODE batch WAL sync strategy: batch, immediate, or none
NORNICDB_WAL_SYNC_INTERVAL 100ms Interval for batch sync mode
NORNICDB_WAL_AUTO_COMPACTION_ENABLED true Enable automatic snapshots + WAL truncation
NORNICDB_WAL_RETENTION_MAX_SEGMENTS 0 Keep at most N sealed WAL segments (0 = unlimited)
NORNICDB_WAL_RETENTION_MAX_AGE 0s Keep sealed WAL segments newer than this duration (0 = unlimited)
NORNICDB_WAL_LEDGER_RETENTION_DEFAULTS false Enable ledger retention defaults when unset

Sync Modes Explained

┌─────────────────────────────────────────────────────────────────┐
│                    DURABILITY SPECTRUM                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  FASTEST ◄─────────────────────────────────────────► SAFEST    │
│                                                                 │
│  "none"            "batch" (DEFAULT)         "immediate"        │
│  (no fsync)        (100ms fsync)             (every write)      │
│                                                                 │
│  Performance: 10-100x          1x                 0.1-0.5x      │
│  Data risk:   HIGH          ~100ms loss            NONE         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

batch (Default)

NORNICDB_WAL_SYNC_MODE=batch
NORNICDB_WAL_SYNC_INTERVAL=100ms
  • Behavior: Buffers writes and fsyncs every 100ms
  • Data Risk: Up to 100ms of data loss on crash
  • Performance: Good throughput for most workloads
  • Use Case: General purpose, web applications, caching

immediate

NORNICDB_WAL_SYNC_MODE=immediate
# Or simply:
NORNICDB_STRICT_DURABILITY=true
  • Behavior: fsync after every write
  • Data Risk: NONE - every committed write is durable
  • Performance: 2-5x slower writes
  • Use Case: Financial data, transactions, audit logs

none

NORNICDB_WAL_SYNC_MODE=none
  • Behavior: No fsync - relies on OS buffer cache
  • Data Risk: HIGH - entire buffer can be lost on crash
  • Performance: 10-100x faster writes
  • Use Case: Testing, development, temporary data

WAL Retention (Ledger Use Cases)

You can retain sealed WAL segments for audit/ledger workflows. This keeps a historical mutation log while still allowing periodic snapshots.

# Keep the most recent 24 segments and 7 days of history
NORNICDB_WAL_RETENTION_MAX_SEGMENTS=24
NORNICDB_WAL_RETENTION_MAX_AGE=168h

# Or enable ledger defaults when unset (opt-in)
NORNICDB_WAL_LEDGER_RETENTION_DEFAULTS=true

YAML configuration:

database:
  wal_retention_max_segments: 24
  wal_retention_max_age: "168h"

Auto-compaction remains enabled by default (preserves existing behavior). These settings are opt-in and only control how long sealed WAL segments are retained after snapshots.

Do not confuse WAL retention with MVCC historical retention:

  • WAL retention controls how long sealed log segments are kept after snapshots
  • MVCC retention controls how many historical node/edge versions remain available for snapshot reads

For MVCC history tuning, see Historical Reads & MVCC Retention.

Auto-Compaction Toggle

If you need a strictly append-only WAL without periodic truncation, disable auto-compaction:

NORNICDB_WAL_AUTO_COMPACTION_ENABLED=false
database:
  wal_auto_compaction_enabled: false

With auto-compaction disabled, you must manage snapshots and WAL growth manually.

Strict Durability Mode

For maximum data safety, enable strict durability:

NORNICDB_STRICT_DURABILITY=true

This automatically enables:

Setting Value Effect
WAL Sync Mode immediate fsync every write
Badger SyncWrites true Sync underlying storage
AsyncEngine Flush 10ms More frequent cache flushes

When to Use Strict Mode

Use Strict Durability for: - Financial transactions - Payment processing - Audit logging - Compliance-critical data (HIPAA, SOX, PCI-DSS) - Data that cannot be regenerated

Don't need Strict Durability for: - Caches - Session data - Search indexes - Analytics - Embeddings (regenerable)

Performance Comparison

Benchmark results on Apple M3 Max with NVMe SSD:

Mode Writes/sec Latency (p99) Data Safety
none 150,000 0.1ms Low
batch (default) 50,000 2ms Medium
immediate 15,000 10ms High

Docker Configuration

# docker-compose.yml
services:
  nornicdb:
    image: timothyswt/nornicdb-arm64-metal:latest
    environment:
      # Default mode (recommended)
      - NORNICDB_WAL_SYNC_MODE=batch
      - NORNICDB_WAL_SYNC_INTERVAL=100ms

      # OR: Strict mode for financial data
      # - NORNICDB_STRICT_DURABILITY=true

Programmatic Configuration

import "github.com/orneryd/nornicdb/pkg/storage"

// Default: batch mode (good balance)
cfg := storage.DefaultWALConfig()
// cfg.SyncMode = "batch"
// cfg.BatchSyncInterval = 100ms

// Strict: immediate sync
cfg := &storage.WALConfig{
    Dir:      "/data/wal",
    SyncMode: "immediate",
}

wal, err := storage.NewWAL("/data/wal", cfg)

Crash Recovery

NornicDB includes robust crash recovery regardless of sync mode with multiple layers of corruption prevention:

WAL Record Format (v2)

Each WAL entry uses an atomic binary format designed to detect any form of corruption:

┌──────────────────────────────────────────────────────────────────────────┐
│                         WAL RECORD FORMAT (v2)                           │
├──────────┬─────────┬────────┬─────────────┬────────┬──────────┬─────────┤
│ Magic    │ Version │ Length │ Payload     │ CRC32  │ Trailer  │ Padding │
│ "WALE"   │   2     │ uint32 │ JSON bytes  │ uint32 │ 8 bytes  │ 0-7 B   │
│ 4 bytes  │ 1 byte  │ 4 bytes│ N bytes     │ 4 bytes│ canary   │ align   │
└──────────┴─────────┴────────┴─────────────┴────────┴──────────┴─────────┘
                                                          ↑          ↑
                                              0xDEADBEEFFEEDFACE    8-byte
                                              detects incomplete   prevents
                                                  writes          torn headers

Corruption Prevention Layers

  1. Magic Header (WALE): Validates record boundaries and format detection
  2. Version Byte: Forward/backward compatibility between WAL versions
  3. Length Prefix: Detects truncated payloads from partial writes
  4. CRC32-C Checksum: Hardware-accelerated detection of bit flips and corruption
  5. Trailer Canary (0xDEADBEEFFEEDFACE): Confirms complete write - if missing or wrong, the write was interrupted
  6. 8-Byte Alignment: Prevents torn headers on sector boundaries (512B/4KB sectors)
  7. Transaction Rollback: Incomplete transactions are automatically rolled back on recovery

How Each Layer Protects Your Data

Failure Mode Detection Mechanism
Crash mid-header write Partial magic or missing length
Crash mid-payload write Length mismatch or truncated read
Crash after payload, before CRC Missing CRC bytes detected
Crash after CRC, before trailer Missing trailer canary
Bit flip in payload CRC32-C mismatch
Sector-aligned torn write 8-byte alignment ensures atomic header
Incomplete transaction Missing commit marker triggers rollback

Recovery Behavior

On startup, NornicDB scans the WAL and:

  1. Commits complete entries - All layers validated
  2. ⚠️ Skips incomplete entries - Detected via missing trailer/CRC
  3. 🔄 Rolls back partial transactions - No commit marker found
  4. 🔁 Regenerates embeddings - Corrupted embedding entries skipped (safe to regenerate)

The sync mode only affects how much data might be lost on crash:

Mode Maximum Data Loss Recovery Time
immediate 0 < 1 second
batch ~100ms < 1 second
none OS buffer (seconds) < 1 second

Monitoring

Monitor durability metrics via Prometheus:

# WAL write latency
histogram_quantile(0.99, rate(nornicdb_wal_write_duration_seconds_bucket[5m]))

# Sync operations per second
rate(nornicdb_wal_syncs_total[5m])

# Flush errors (should be 0)
nornicdb_async_engine_flush_errors_total

See Also


Configure for your workload → Use defaults for most cases, strict for critical data.