# 19. Vault Mode (Cloud Backup)

Added (2026-02-09): Defines the Blind Replica architecture — a relay operating in vault mode for long-term encrypted backup. See docs/research/blind-replica-architecture.md for full research context.
## 19.1 Overview

Vault mode is a relay configuration that provides long-term encrypted storage for Cloud Backup. It reuses the existing relay binary, protocol, and `BlobStorage` trait — the only differences are retention policy and storage backend.
| Property | Transit Mode (current) | Vault Mode (new) |
|---|---|---|
| Purpose | Ephemeral message passing | Long-term encrypted backup |
| Retention | TTL (default 7 days) | Unlimited |
| Storage | Local SQLite | Object storage (R2/B2/S3) |
| Content blobs | Not stored | Stored via iroh-blobs |
| Cleanup | cleanup_expired() runs hourly | No expiry cleanup |
| Cursor tracking | Per-group | Per-group (same mechanism) |
| Zero-knowledge | Yes | Yes |
| Deployment | Global, latency-sensitive | Centralized, near storage region |
## 19.2 Design Principles

| Principle | Meaning |
|---|---|
| No new crate | Vault mode is a relay config variant, not a separate binary |
| No new protocol | Same envelopes, same content refs, same push/pull |
| No TTL | Stored blobs do not expire; deleted only on explicit request |
| Object storage | Blobs stored in R2/B2, not local SQLite |
| Thin proxy | Vault relay holds cursor metadata locally; blob data is remote |
| Separate deployment | Transit and vault relays run independently for failure isolation |
## 19.3 Storage Modes

The relay’s `StorageConfig` gains a `mode` field:

```toml
[storage]
mode = "transit"  # Current behavior: SQLite, TTL cleanup
# or
mode = "vault"    # New: object storage, no TTL
```

Transit mode (default): Unchanged from Section 7. SQLite storage, TTL-based cleanup, per-group quotas. All existing behavior is preserved.
Vault mode: The `BlobStorage` trait (already defined in sync-relay/src/storage/mod.rs) gains a new implementation backed by object storage. The trait contract does not change — `store_blob()`, `get_blobs_after()`, `cleanup_expired()`, etc. all keep the same signatures. The vault implementation simply:

- Stores blob payloads in object storage (R2/B2) keyed by `{group_id}/{cursor}`
- Keeps cursor metadata in a local SQLite database (lightweight, under 100 MB for 100K users)
- Returns a no-op from `cleanup_expired()` (nothing expires)
## 19.4 Object Storage Backend

The new `ObjectStorage` struct implements the existing `BlobStorage` trait.
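The actual trait definition lives in sync-relay/src/storage/mod.rs and is not reproduced here; the following is a minimal, runnable sketch under assumed signatures, where a `BTreeMap` stands in for the remote bucket and the method names come from the operations table in this section.

```rust
// Sketch of a vault-mode BlobStorage implementation. The trait and its
// method signatures here are assumptions for illustration; only the method
// names come from this document. A BTreeMap stands in for the R2/B2 bucket.
use std::collections::BTreeMap;

trait BlobStorage {
    fn store_blob(&mut self, group_id: &str, cursor: u64, payload: Vec<u8>);
    fn get_blobs_after(&self, group_id: &str, after: u64) -> Vec<(u64, Vec<u8>)>;
    fn cleanup_expired(&mut self) -> usize;
}

struct ObjectStorage {
    objects: BTreeMap<String, Vec<u8>>,      // stands in for the object storage bucket
    cursors: BTreeMap<(String, u64), usize>, // local metadata: (group, cursor) -> payload size
}

impl ObjectStorage {
    fn key(group_id: &str, cursor: u64) -> String {
        format!("{}/{:015}", group_id, cursor) // zero-padded for prefix scans
    }
}

impl BlobStorage for ObjectStorage {
    fn store_blob(&mut self, group_id: &str, cursor: u64, payload: Vec<u8>) {
        // PUT object + INSERT cursor metadata locally
        self.cursors.insert((group_id.to_string(), cursor), payload.len());
        self.objects.insert(Self::key(group_id, cursor), payload);
    }

    fn get_blobs_after(&self, group_id: &str, after: u64) -> Vec<(u64, Vec<u8>)> {
        // SELECT cursors locally, then GET each object by key
        self.cursors
            .range((group_id.to_string(), after + 1)..(group_id.to_string(), u64::MAX))
            .map(|((_, c), _)| (*c, self.objects[&Self::key(group_id, *c)].clone()))
            .collect()
    }

    fn cleanup_expired(&mut self) -> usize {
        0 // no-op: nothing expires in vault mode
    }
}

fn main() {
    let mut vault = ObjectStorage { objects: BTreeMap::new(), cursors: BTreeMap::new() };
    vault.store_blob("g1", 1, vec![0xaa]);
    vault.store_blob("g1", 2, vec![0xbb]);
    assert_eq!(vault.get_blobs_after("g1", 0).len(), 2); // full pull
    assert_eq!(vault.get_blobs_after("g1", 1).len(), 1); // incremental pull
    assert_eq!(vault.cleanup_expired(), 0);
    println!("ok");
}
```

The point of the sketch is the split the text describes: payload bytes go to the remote bucket, while ordering metadata stays in a small local store so pulls need only one local query before fetching objects.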
Object key scheme:

```
{bucket}/{group_id_hex}/{cursor_padded}
```

Example: `vk-vault/a1b2c3d4.../000000000042`

The cursor is zero-padded to 15 digits for lexicographic ordering, enabling prefix scans for pull operations.
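The padding rule can be checked directly; `make_object_key` below is a hypothetical helper name, with the 15-digit width taken from the text above.

```rust
// Hypothetical key builder illustrating the zero-padded cursor scheme.
fn make_object_key(group_id_hex: &str, cursor: u64) -> String {
    format!("{}/{:015}", group_id_hex, cursor)
}

fn main() {
    assert_eq!(make_object_key("a1b2c3d4", 42), "a1b2c3d4/000000000000042");
    // Zero-padding makes string order match numeric cursor order, so a
    // lexicographic prefix scan returns blobs in push order.
    let k9 = make_object_key("a1b2c3d4", 9);
    let k42 = make_object_key("a1b2c3d4", 42);
    let k100 = make_object_key("a1b2c3d4", 100);
    assert!(k9 < k42 && k42 < k100);
    println!("ok");
}
```

Without the padding, "9" would sort after "100" as a string, breaking range scans over object keys.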
Operations mapping:
| BlobStorage method | Object storage behavior |
|---|---|
| `store_blob()` | PUT object + INSERT cursor metadata locally |
| `get_blobs_after()` | SELECT cursors locally → GET objects by key |
| `get_max_cursor()` | SELECT from local metadata |
| `mark_delivered()` | UPDATE local metadata (same as transit) |
| `cleanup_expired()` | No-op (returns 0 deleted) |
| `get_group_storage()` | SUM from local metadata (payload sizes tracked locally) |
| `get_blob()` | GET single object by blob_id |
API compatibility:
| Provider | API | Notes |
|---|---|---|
| Cloudflare R2 | S3-compatible | Free egress (recommended for restore-heavy workload) |
| Backblaze B2 | S3-compatible | Cheaper storage, paid egress |
| AWS S3 | Native | Higher cost, global availability |
| MinIO | S3-compatible | Self-hosted option |
All providers use the S3-compatible API via the rust-s3 or aws-sdk-s3 crate. The implementation is provider-agnostic.
19.5 Vault Mode Configuration
Section titled “19.5 Vault Mode Configuration”[storage]mode = "vault"
# Object storage backendbackend = "r2" # "r2", "b2", "s3", "minio"bucket = "vk-vault"endpoint = "https://xxx.r2.cloudflarestorage.com"access_key_id = "${VAULT_ACCESS_KEY}"secret_access_key = "${VAULT_SECRET_KEY}"region = "auto" # R2 uses "auto"
# Local metadata database (cursor tracking, delivery status)metadata_database = "/data/vault-meta.db"
# Quotasmax_group_storage = 53687091200 # 50 GB fair use cap per groupmax_blob_size = 10485760 # 10 MB per blob (larger than transit's 1MB)
[storage.content]enabled = true # Accept iroh-blobs for content storagemax_content_per_group = 53687091200 # 50 GB content cap per groupEnvironment variable overrides (for container deployment):
| Variable | Purpose |
|---|---|
| `VAULT_ACCESS_KEY` | Object storage access key |
| `VAULT_SECRET_KEY` | Object storage secret key |
| `VAULT_BUCKET` | Bucket name |
| `VAULT_ENDPOINT` | S3-compatible endpoint URL |
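A common resolution order for such overrides is environment first, config file second. This is a sketch of that pattern, not the relay's actual config loader; `resolve_setting` is a hypothetical helper.

```rust
// Hypothetical env-override resolution: an environment variable, when set,
// takes precedence over the value from the TOML config file.
use std::env;

fn resolve_setting(env_key: &str, config_value: &str) -> String {
    env::var(env_key).unwrap_or_else(|_| config_value.to_string())
}

fn main() {
    // set_var is wrapped in unsafe because it is an unsafe fn in the 2024 edition.
    unsafe { env::set_var("VAULT_BUCKET", "vk-vault-prod") };
    assert_eq!(resolve_setting("VAULT_BUCKET", "vk-vault"), "vk-vault-prod");

    // When the variable is unset, the config file value wins.
    unsafe { env::remove_var("VAULT_ENDPOINT") };
    assert_eq!(
        resolve_setting("VAULT_ENDPOINT", "https://xxx.r2.cloudflarestorage.com"),
        "https://xxx.r2.cloudflarestorage.com"
    );
    println!("ok");
}
```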
## 19.6 Content Blob Storage

Transit relays do not store content blobs — large files transfer device-to-device via iroh-blobs (Section 17). Vault relays additionally participate as an iroh-blobs peer to capture content for backup.
When Cloud Backup is ON:
The vault relay runs an iroh-blobs server that accepts content blob transfers. Each blob is stored as a separate object in the same bucket, keyed by content hash:

```
{bucket}/{group_id_hex}/content/{content_hash_hex}
```

Content blobs use the same encrypt-then-hash pipeline defined in Section 17 — the vault relay receives already-encrypted ciphertext and its BLAKE3 hash. It stores the ciphertext without decryption.
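Both key shapes share the group prefix, which is what later makes per-group deletion a single prefix sweep. A tiny sketch with hypothetical helper names:

```rust
// Hypothetical key builders: sync blobs are keyed by padded cursor, content
// blobs by hash under a "content/" segment. Both live under the group prefix.
fn sync_key(group_id_hex: &str, cursor: u64) -> String {
    format!("{}/{:015}", group_id_hex, cursor)
}

fn content_key(group_id_hex: &str, content_hash_hex: &str) -> String {
    format!("{}/content/{}", group_id_hex, content_hash_hex)
}

fn main() {
    let s = sync_key("a1b2c3d4", 42);
    let c = content_key("a1b2c3d4", "9f86d081");
    assert_eq!(c, "a1b2c3d4/content/9f86d081");
    // Shared group prefix: deleting a group is one prefix delete over both kinds.
    assert!(s.starts_with("a1b2c3d4/") && c.starts_with("a1b2c3d4/"));
    println!("ok");
}
```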
## 19.7 Client Cloud Backup Toggle

The sync-client gains a `CloudBackup` configuration:

```rust
enum CloudBackup {
    Off, // Default. Push to transit relay(s) only.
    On {
        vault_relay_address: String, // Vault relay endpoint
    },
}
```

Behavior when ON:
- `push()` sends to transit relay(s) AND vault relay
- Content transfers also target the vault relay’s iroh-blobs endpoint
- Vault push is fire-and-forget (does not block the primary push acknowledgement)
Behavior when OFF:
- Same as current behavior. No vault relay interaction.
The toggle is a local user preference. It does not affect the protocol, the encryption, or any relay behavior. The vault relay does not know whether a user has the toggle ON or OFF — it simply receives pushes like any other relay.
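The fire-and-forget behavior can be sketched as follows. The real client runs on tokio and would spawn an async task; std threads and a channel keep the sketch self-contained, and all names (`push_transit`, `push_vault_fire_and_forget`) are illustrative.

```rust
// Sketch: the transit push is awaited for its acknowledgement, while the
// vault push runs in the background and never blocks the caller.
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn push_transit(blob: &[u8]) -> usize {
    blob.len() // stand-in: returns an "ack" immediately
}

fn push_vault_fire_and_forget(blob: Vec<u8>, done: mpsc::Sender<usize>) {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10)); // simulate a slower vault write
        let _ = done.send(blob.len());
    });
}

fn main() {
    let blob = vec![0u8; 64];
    let (tx, rx) = mpsc::channel();

    push_vault_fire_and_forget(blob.clone(), tx);
    // The caller gets the transit ack without waiting for the vault.
    assert_eq!(push_transit(&blob), 64);
    // The vault push still completes in the background.
    assert_eq!(rx.recv().unwrap(), 64);
    println!("ok");
}
```

The design choice this illustrates: a slow or unreachable vault relay can never delay the primary push acknowledgement.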
## 19.8 Restore Flow

Restore uses the existing pull protocol. No new messages or endpoints are required.
Full restore (all devices lost):
- Application (VardKista) provides group_id and vault relay address (from Recovery Kit — see docs/research/blind-replica-architecture.md)
- sync-client connects to vault relay
- `sync-client.pull(group_id, after_cursor: 0)` — pulls all stored blobs from the beginning
- Content blobs retrieved via iroh-blobs from vault relay
- Application decrypts and rebuilds local state
Partial restore (existing device available):
- New device pairs with existing device (existing pairing flow)
- New device syncs from transit relay AND vault relay
- CRDTs merge state from all sources — no conflict resolution needed
The vault relay does not distinguish between a “restore” pull and a normal pull. Both are the same operation: `get_blobs_after(group_id, cursor)`.
## 19.9 Quotas and Fair Use

| Limit | Value | Enforcement |
|---|---|---|
| Max storage per group | 50 GB | get_group_storage() check before store_blob() |
| Max blob size | 10 MB | Checked in handle_push() (same mechanism as transit, higher cap) |
| Max content per group | 50 GB | Separate tracking for iroh-blobs content |
| Total per-group cap | 100 GB | Sync messages + content combined |
When a group exceeds its quota, store_blob() returns QuotaExceeded. The client handles this gracefully — the application can prompt the user to manage storage.
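A minimal sketch of that check, using the limits from the table; `check_store` and the `StoreError` variants beyond `QuotaExceeded` are assumptions, not the relay's actual types.

```rust
// Quota check sketch: reject a blob that is too large, or one that would
// push the group past its 50 GB cap. Values come from the quota table.
const MAX_GROUP_STORAGE: u64 = 53_687_091_200; // 50 GB
const MAX_BLOB_SIZE: u64 = 10_485_760;         // 10 MB

#[derive(Debug, PartialEq)]
enum StoreError {
    BlobTooLarge,   // hypothetical variant
    QuotaExceeded,  // named in the text above
}

fn check_store(current_group_bytes: u64, blob_len: u64) -> Result<(), StoreError> {
    if blob_len > MAX_BLOB_SIZE {
        return Err(StoreError::BlobTooLarge);
    }
    if current_group_bytes + blob_len > MAX_GROUP_STORAGE {
        return Err(StoreError::QuotaExceeded);
    }
    Ok(())
}

fn main() {
    assert_eq!(check_store(0, 1024), Ok(()));
    assert_eq!(check_store(0, MAX_BLOB_SIZE + 1), Err(StoreError::BlobTooLarge));
    // One byte of headroom still admits a 1-byte blob; at the cap it does not.
    assert_eq!(check_store(MAX_GROUP_STORAGE - 1, 1), Ok(()));
    assert_eq!(check_store(MAX_GROUP_STORAGE, 1), Err(StoreError::QuotaExceeded));
    println!("ok");
}
```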
## 19.10 Data Lifecycle

| Event | Action |
|---|---|
| Cloud Backup ON | Vault relay starts receiving pushes. Full sync from transit relay history (if available within TTL). |
| Cloud Backup OFF | Vault relay stops receiving new pushes. Existing data preserved for 30 days. |
| 30 days after OFF | All group data purged from object storage. Cursor metadata deleted. |
| Re-enable within 30 days | Resume from last cursor. No re-sync needed. |
| Re-enable after 30 days | Full re-sync from device. |
| User requests deletion | Immediate: all blobs for group deleted from object storage. Metadata purged. |
Deletion is simple: The vault relay controls the R2 bucket. Deletion = DELETE all objects with the group’s prefix + DELETE local metadata rows. No scattered copies in analytics, CDNs, or backup systems.
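Because every object for a group sits under the group's key prefix, the deletion path reduces to a prefix sweep. A runnable sketch, with a `BTreeMap` standing in for the bucket's list-and-delete API and `delete_group` as a hypothetical name:

```rust
// Prefix-deletion sketch: list every object under "{group_id}/" and delete
// it. In a real deployment this maps to ListObjects + DeleteObject calls.
use std::collections::BTreeMap;

fn delete_group(bucket: &mut BTreeMap<String, Vec<u8>>, group_id_hex: &str) -> usize {
    let prefix = format!("{}/", group_id_hex);
    let doomed: Vec<String> = bucket
        .range(prefix.clone()..)                      // seek to the group's prefix
        .take_while(|(k, _)| k.starts_with(&prefix))  // stop at the next group
        .map(|(k, _)| k.clone())
        .collect();
    for k in &doomed {
        bucket.remove(k); // DELETE object
    }
    doomed.len()
}

fn main() {
    let mut bucket = BTreeMap::new();
    bucket.insert("aaaa/000000000000001".to_string(), vec![1]);
    bucket.insert("aaaa/content/9f86d081".to_string(), vec![2]);
    bucket.insert("bbbb/000000000000001".to_string(), vec![3]);

    // Both sync and content blobs for the group fall under one prefix.
    assert_eq!(delete_group(&mut bucket, "aaaa"), 2);
    // Other groups' data is untouched.
    assert_eq!(bucket.len(), 1);
    println!("ok");
}
```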
## 19.11 Security Properties

All security properties from Section 4 are preserved. Additionally:
| Property | How |
|---|---|
| Zero-knowledge | Vault relay stores encrypted blobs. Same ciphertext as transit. No keys. |
| No new attack surface | Same protocol, same message types, same authentication. |
| Object storage isolation | Each group’s data under a unique prefix. No cross-group access. |
| Deletion completeness | Object storage has no hidden caches or replicas beyond what the bucket provides. |
| Metadata minimality | Vault metadata is cursor positions and payload sizes. No content, no filenames, no timestamps beyond ordering. |
What the vault relay can observe (unchanged from transit):
- Blob sizes
- Group IDs (opaque)
- Device public keys (pseudonymous)
- Timing of push/pull operations
- IP addresses (at connection level, not logged)
What the vault relay cannot observe:
- Blob contents (encrypted)
- File names, types, or structure
- User identity or account information
- Relationships between groups
## 19.12 Infrastructure Sizing

Vault relays and transit relays are separate deployments with different scaling profiles:
| Property | Transit Relay | Vault Relay |
|---|---|---|
| Container size | CX22 (4GB RAM, 40GB disk) | CX11 (2GB RAM, 20GB disk) |
| Scaling factor | Concurrent connections | Bandwidth (restore throughput) |
| Data location | Local SQLite | Object storage (remote) |
| Local disk usage | 5-20 GB (active messages) | <1 GB (cursor metadata only) |
| Global distribution | Yes (latency-sensitive) | No (colocate near object storage) |
| Failure impact | Real-time sync degrades | Backup/restore unavailable, sync unaffected |
See docs/research/blind-replica-architecture.md for detailed cost analysis at scale.
# Appendix A: Crate Dependencies

## sync-types

```toml
[dependencies]
serde = { version = "1", features = ["derive"] }
rmp-serde = "1"
uuid = { version = "1", features = ["v4", "serde"] }
```

## sync-core

```toml
[dependencies]
sync-types = { path = "../sync-types" }
```

## sync-client

```toml
[dependencies]
sync-types = { path = "../sync-types" }
sync-core = { path = "../sync-core" }
sync-content = { path = "../sync-content" }  # Large content transfer
tokio = { version = "1", features = ["rt", "sync", "time"] }
clatter = "2.2"            # Hybrid Noise protocol (ML-KEM-768 + X25519)
iroh = "0.96"              # Endpoint, connections, discovery (all tiers) - requires cargo patch
argon2 = "0.5"
chacha20poly1305 = "0.10"  # Supports XChaCha20
rand = "0.8"
thiserror = "1"
tracing = "0.1"
```

## sync-content

```toml
[dependencies]
sync-types = { path = "../sync-types" }
iroh-blobs = "0.98"        # Content-addressed storage with BLAKE3/Bao
iroh = "0.96"              # Endpoint for transfers - requires cargo patch
chacha20poly1305 = "0.10"  # XChaCha20-Poly1305 for content encryption
blake3 = "1"               # Hashing ciphertext for content address
hkdf = "0.12"              # Content key derivation from GroupSecret
sha2 = "0.10"              # HKDF-SHA256
tokio = { version = "1", features = ["rt", "sync", "fs"] }
thiserror = "1"
tracing = "0.1"
```

## sync-relay

```toml
[dependencies]
sync-types = { path = "../sync-types" }
tokio = { version = "1", features = ["full"] }
iroh = "0.96"    # Endpoint for accepting client connections (QUIC) - requires cargo patch
clatter = "2.2"  # Hybrid Noise protocol (ML-KEM-768 + X25519)
sqlx = { version = "0.8", default-features = false, features = ["sqlite", "runtime-tokio", "derive"] }
axum = "0.7"     # Health/metrics HTTP endpoints only
tower = "0.4"
tracing = "0.1"
tracing-subscriber = "0.3"
config = "0.14"
```

## sync-bridge

```toml
[package]
name = "zerok-sync-bridge"

[lib]
crate-type = ["rlib"]

[dependencies]
sync-client = { path = "../sync-client" }
sync-types = { path = "../sync-types" }
tokio = { version = "1", features = ["rt-multi-thread", "sync", "time"] }
thiserror = "1"
tracing = "0.1"
```

## sync-node (napi-rs)

```toml
[package]
name = "zerok-sync-node"

[lib]
crate-type = ["cdylib"]

[dependencies]
sync-bridge = { path = "../sync-bridge" }
napi = { version = "3", features = ["async", "tokio_rt"] }
napi-derive = "3"

[build-dependencies]
napi-build = "2"
```

## sync-python (PyO3)

```toml
[package]
name = "zerok-sync-python"

[lib]
name = "zerok_sync"
crate-type = ["cdylib"]

[dependencies]
sync-bridge = { path = "../sync-bridge" }
pyo3 = { version = "0.28", features = ["extension-module", "abi3-py310"] }
pyo3-async-runtimes = { version = "0.28", features = ["tokio-runtime"] }
tokio = { version = "1", features = ["rt-multi-thread"] }
```

## Framework Integration: Tauri Plugin (updated)

```toml
# tauri-plugin-sync — now depends on sync-bridge, not sync-client directly
[dependencies]
sync-bridge = { path = "../sync-bridge" }
tauri = "2"
tauri-plugin = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```