Read/Write Separation
Production data and AI applications typically have two very different workloads on the same data:
- Writes / ingest — pulling from source systems, normalizing, accelerating, indexing, and refreshing. CPU-, network-, and memory-heavy. Bursty. Doesn't need to be co-located with the application.
- Reads — answering user-facing requests, feeding context to AI agents, serving dashboards. Latency-sensitive. Often horizontally scaled with the application itself.
Running both on the same Spice instance forces a single hardware shape, refresh schedule, and failure domain on workloads that have nothing in common. Read/write separation splits them into two tiers: a centralized write/ingest cluster that owns refresh and acceleration, and one or more lightweight read instances (typically sidecars next to the application) that serve queries from a local materialized copy.
The two tiers communicate through two channels:
- Snapshots in object storage — the cluster periodically writes a compact acceleration file (DuckDB or SQLite) to S3, GCS, or ADLS. Read instances bootstrap from the latest snapshot on startup and (optionally) refresh from snapshots on a schedule. No live network dependency on the cluster.
- Live query delegation — when a read instance needs data outside its materialized working set (a historical query, a cross-dataset join, a broad search), it transparently delegates to the cluster over Arrow Flight. See Cluster-Sidecar Architecture.
Most production deployments use both: snapshots for the steady-state working set, and live delegation for the long tail.
When to use read/write separation
Use this pattern when:
- Application instances need sub-millisecond reads but data refresh, ingestion, or acceleration would saturate them.
- The same datasets are read by many replicas, each currently re-ingesting from source systems.
- Read instances need to start fast — autoscaling, scale-to-many agent containers, or ephemeral Cloud Run / Knative workloads where cold-starting from source is too slow.
- Upstream data sources have rate or cost limits that prevent every replica from connecting directly.
- Read instances run outside the cluster's network — at the edge, in another VPC, on a developer laptop — and cannot maintain a permanent dependency on the source system.
It is overkill when one Spice instance is sufficient (start with Sidecar) or when the workload is purely batch/analytical with relaxed latency (use Microservice).
How it works
The cluster (write tier)
The cluster owns every refresh, acceleration, and search index for the datasets in scope. It runs as a standalone Spice deployment — typically a Kubernetes Deployment or StatefulSet, or a managed Spice Cloud app — and holds the only credentials to the source systems.
Cluster Spicepod responsibilities:
- Connect to every source: object stores, OLTP databases, lakehouses, search indices, message queues.
- Run all refresh schedules, CDC, and stream ingest.
- Accelerate to file-mode engines (DuckDB or SQLite) so the materialization can be exported as a snapshot.
- Write snapshots to a shared object store after each refresh.
# cluster spicepod.yaml
snapshots:
  enabled: true
  location: s3://spiceai-snapshots/prod/
  params:
    s3_auth: iam_role

datasets:
  - from: s3://my-lake/orders/
    name: orders
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_check_interval: 5m
      snapshots: enabled # write a new snapshot after every refresh
      snapshots_trigger: refresh_complete
      snapshots_compaction: enabled
      params:
        duckdb_file: /data/orders.db

  - from: postgres:public.customers
    name: customers
    params:
      pg_host: postgres.internal
      pg_user: ${ secrets:PG_USER }
      pg_pass: ${ secrets:PG_PASS }
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes # CDC
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 10m
      params:
        duckdb_file: /data/customers.db
Snapshots are partitioned by date and dataset (month=YYYY-MM/day=YYYY-MM-DD/dataset=<name>/...), so retention is a normal object-store lifecycle rule. See Snapshots for the full configuration reference.
The read instances (read tier)
Read instances run alongside applications — typically as Kubernetes pod sidecars, but the same configuration works in Cloud Run, on bare metal, or on a developer laptop. They never connect to source systems; their only external dependencies are the snapshot bucket and (optionally) the cluster's Arrow Flight endpoint.
Read Spicepod responsibilities:
- Bootstrap each accelerated dataset from the latest snapshot on startup. No source connection required.
- Optionally refresh from newer snapshots on a schedule (bootstrap_only mode polls for new snapshots without writing them).
- Optionally delegate queries that fall outside the materialized working set to the cluster over Arrow Flight.
# read instance spicepod.yaml
snapshots:
  enabled: true
  location: s3://spiceai-snapshots/prod/
  bootstrap_on_failure_behavior: fallback # try older snapshots if the newest fails
  params:
    s3_auth: iam_role

datasets:
  - from: s3://my-lake/orders/ # same source URL, but never used at runtime
    name: orders
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      snapshots: bootstrap_only # download only; never write back
      params:
        duckdb_file: /local/orders.db

  - from: postgres:public.customers
    name: customers
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      snapshots: bootstrap_only
      params:
        duckdb_file: /local/customers.db
snapshots: bootstrap_only is the key setting — read instances read snapshots but never write them, so multiple replicas don't race to upload. Combine with a periodic refresh trigger to pick up new snapshots without re-querying the source.
Live delegation for the long tail
Snapshots cover the working set. For queries that span beyond it — historical analytics, cross-dataset joins, distributed search — read instances delegate to the cluster using a spiceai connector entry pointing at the cluster's Arrow Flight endpoint.
# read instance spicepod.yaml (continued)
datasets:
  - from: spiceai:orders_history
    name: orders_history
    params:
      endpoint: grpcs://cluster.spice.svc.cluster.local:50051
      api_key: ${ secrets:CLUSTER_API_KEY }
The application sees a single SQL surface — accelerated tables and delegated tables compose normally in joins and CTEs. See Cluster-Sidecar Architecture for the conceptual model.
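For example, a single request can join the locally accelerated orders table with the delegated orders_history table. A sketch against the sidecar's HTTP SQL endpoint (the column names are hypothetical):

# one query spanning a local table (orders) and a delegated table (orders_history)
curl -X POST http://127.0.0.1:8090/v1/sql \
  -H 'Content-Type: text/plain' \
  -d "SELECT o.customer_id, o.total, h.total AS total_last_year
      FROM orders o
      JOIN orders_history h ON o.customer_id = h.customer_id"

The runtime serves orders from the local DuckDB file and forwards the orders_history scan to the cluster.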
Operational model
Bootstrap and refresh on read instances
When a read instance starts:
- For each accelerated dataset, Spice checks for the local file (duckdb_file / sqlite_file).
- If the file is absent and snapshots are enabled, Spice lists the snapshot prefix, downloads the newest snapshot for that dataset, and the dataset becomes ready immediately.
- If no snapshot is found, behavior is governed by bootstrap_on_failure_behavior:
  - warn (default) — boot empty and refresh from the source. Avoid on read-tier instances that should not have source access.
  - fallback — try older snapshots until one loads.
  - retry — keep retrying the newest snapshot.
For read instances that must hold no source credentials, set bootstrap_on_failure_behavior: fallback or retry and ensure the dataset is never configured with usable source credentials.
Steady-state refresh on read instances is configured per dataset:
acceleration:
  refresh_check_interval: 1m # check the snapshot bucket every minute
  snapshots: bootstrap_only
When a newer snapshot is available, the dataset hot-swaps without restarting the pod.
Snapshot retention and storage
Snapshots are written to Hive-partitioned paths so retention is straightforward:
s3://spiceai-snapshots/prod/
  month=2026-05/day=2026-05-01/dataset=orders/orders_20260501T120000Z.db
  month=2026-05/day=2026-05-02/dataset=orders/orders_20260502T120000Z.db
Apply an object-store lifecycle rule (S3 lifecycle, GCS Object Lifecycle Management, ADLS Lifecycle) to expire old partitions. Most deployments keep 24–72 hours of refresh-triggered snapshots and a daily archive beyond that.
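As a sketch, the equivalent S3 setup is a single rule scoped to the snapshot prefix (the bucket name and the 72-hour window are illustrative):

# lifecycle.json: expire refresh-triggered snapshots after 72 hours
{
  "Rules": [
    {
      "ID": "expire-spice-snapshots",
      "Filter": { "Prefix": "prod/" },
      "Status": "Enabled",
      "Expiration": { "Days": 3 }
    }
  ]
}

aws s3api put-bucket-lifecycle-configuration \
  --bucket spiceai-snapshots \
  --lifecycle-configuration file://lifecycle.json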
The snapshot bucket is the only shared dependency between the tiers, so keep it in the same region as the read instances and apply VPC endpoints / Private Google Access to keep traffic on the private network.
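On AWS, for example, an S3 gateway endpoint keeps that traffic private (the region, VPC, and route-table IDs are placeholders):

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc123 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0def456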
Versioning the Spicepod
The cluster and the read instances share dataset names but not full Spicepods. Two patterns work well:
- Fork two Spicepods from a common base. Keep datasets: definitions in a shared file and merge the cluster-only and read-only fields at deploy time (Helm value overlays, Kustomize, Jsonnet).
- Single Spicepod, role-based behavior. Use Spicepod includes and environment-specific values to switch snapshots: enabled (cluster) vs snapshots: bootstrap_only (reads) per role.
Whichever approach is chosen, keep schema changes backward-compatible by default — read instances may be running snapshots from a previous cluster version during a rollout.
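Whichever mechanism produces the final files, the per-dataset delta between the two roles is small. A sketch of what the merged outputs should differ by, using the fields from the examples above:

# cluster role (write tier)
acceleration:
  refresh_check_interval: 5m # poll the source
  snapshots: enabled
  snapshots_trigger: refresh_complete

# read role (read tier)
acceleration:
  refresh_check_interval: 1m # poll the snapshot bucket instead
  snapshots: bootstrap_only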
Deploy on Kubernetes
The reference topology runs the cluster as a StatefulSet (or SpicepodSet on Spice.ai Enterprise) and the read instances as sidecars in application pods. Both use the same Spice Helm chart.
Cluster release
# cluster-values.yaml
replicaCount: 3

stateful:
  enabled: true
  storageClass: gp3 # or hyperdisk-balanced, managed-csi-premium
  size: 100Gi

serviceAccount:
  create: true
  name: spiceai-cluster
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/SpiceAIClusterRole

spicepod:
  # full Spicepod with sources, refresh schedules, and snapshots: enabled
  ...
helm upgrade --install spiceai-cluster spiceai/spiceai \
  -n spiceai-cluster --create-namespace \
  -f cluster-values.yaml
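A quick smoke test after the release settles (the Service name spiceai-cluster is an assumption; use whatever the chart actually creates):

kubectl -n spiceai-cluster port-forward svc/spiceai-cluster 8090:8090 &
curl http://127.0.0.1:8090/v1/ready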
Read instance sidecars
Read instances are deployed as a sidecar container in application pods, configured via a ConfigMap that holds the read-tier Spicepod. The application points at 127.0.0.1:8090 (HTTP) or 127.0.0.1:50051 (Arrow Flight) — no service discovery needed.
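Creating that ConfigMap from the read-tier Spicepod is a one-liner (the my-app namespace and the read-spicepod.yaml filename are illustrative):

kubectl create configmap spiceai-read-spicepod \
  --from-file=spicepod.yaml=read-spicepod.yaml \
  -n my-app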
# application Deployment
spec:
  template:
    spec:
      serviceAccountName: spiceai-read # IRSA / Workload Identity for snapshot bucket
      volumes:
        - name: spicepod
          configMap:
            name: spiceai-read-spicepod
        - name: accel
          emptyDir: {} # ephemeral; bootstrapped from snapshots
      containers:
        - name: app
          image: my-app:1.2.3
          env:
            - name: SPICEAI_HTTP_URL
              value: http://127.0.0.1:8090
        - name: spiceai
          image: spiceai/spiceai:1.11.5
          args: ['--http', '0.0.0.0:8090', '--flight', '0.0.0.0:50051']
          volumeMounts:
            - name: spicepod
              mountPath: /spicepod
              readOnly: true
            - name: accel
              mountPath: /local
          readinessProbe:
            httpGet: { path: /v1/ready, port: 8090 }
          livenessProbe:
            httpGet: { path: /health, port: 8090 }
The read sidecar's ServiceAccount only needs read access to the snapshot bucket. It should not be granted source-system credentials — that's what makes the read tier safe to scale to many replicas.
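On AWS with IRSA, that translates into a minimal policy along these lines, scoped to the bucket and prefix used above:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::spiceai-snapshots/prod/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::spiceai-snapshots",
      "Condition": { "StringLike": { "s3:prefix": "prod/*" } }
    }
  ]
}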
Spice.ai Enterprise
For production, the Spice.ai Enterprise Kubernetes Operator manages both tiers as custom resources:
- SpicepodSet — per-replica StatefulSets for the cluster, with automatic PVC resizing, configurable update strategies, and crashloop protection.
- SpicepodCluster — distributed scheduler/executor tiers when the cluster itself is large enough to need its own internal split.
- Sidecar injection via webhook, so application teams add a single annotation to opt in.
Capacity sizing
Rough first-pass sizing rules:
| Tier | Typical shape |
|---|---|
| Cluster (writer) | 3+ replicas. Memory sized for the largest accelerated dataset. Network bandwidth for source ingest. |
| Read instance | 1 replica per application pod. 0.5–2 vCPU, 512Mi–4Gi memory, 10–50Gi local SSD. |
| Snapshot bucket | Standard tier, same region. Lifecycle rule sized to refresh frequency × number of datasets × 24–72h. |
Read-tier memory is dominated by the working set of the file-mode acceleration engine. DuckDB compaction (snapshots_compaction: enabled) typically reduces snapshot size by 30–60%.
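As a worked example with illustrative numbers: 20 datasets snapshotting every 5 minutes at ~100 MB per compacted snapshot write 288 × 20 × 100 MB ≈ 576 GB per day, so a 72-hour lifecycle rule holds roughly 1.7 TB at steady state.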
Security model
The split simplifies the credential surface area:
- Cluster — holds source credentials, snapshot write credentials, and cluster-internal mTLS. Runs in a private subnet; no public ingress.
- Read instances — hold snapshot read credentials and a per-instance Arrow Flight token to the cluster (for live delegation). No source credentials.
- Application — talks to its sidecar over loopback. No outbound credentials at all.
Compromising a read instance exposes at most read access to the snapshot bucket and the delegated query surface — never the source systems.
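Inside Kubernetes, the cluster-side boundary can also be enforced with a NetworkPolicy. A sketch that admits only labeled read-tier namespaces, and only on the Flight port (the namespace and label names are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: spiceai-cluster-ingress
  namespace: spiceai-cluster
spec:
  podSelector: {} # every pod in the cluster tier
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              spice-read-tier: 'true' # namespaces running read sidecars
      ports:
        - protocol: TCP
          port: 50051 # Arrow Flight delegation only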
Observability
Both tiers expose the same metrics and tracing endpoints. Practical splits:
- Cluster dashboards — refresh duration, snapshot upload size and latency, source connector errors, ingest queue depth.
- Read dashboards — bootstrap duration, snapshot age (write-time vs current-time), query latency p50/p99, delegation rate (queries served locally vs forwarded to the cluster).
A high delegation rate is a signal to expand the materialized working set. A growing snapshot age is a signal that the cluster is falling behind on refresh.
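If both signals are exported to Prometheus, they reduce to two expressions. The metric names below are hypothetical placeholders rather than Spice's actual metric names; substitute whatever your deployment exposes:

# delegation rate: fraction of queries forwarded to the cluster (hypothetical metrics)
sum(rate(spice_queries_delegated_total[5m])) / sum(rate(spice_queries_total[5m]))

# snapshot age in seconds (hypothetical gauge of the loaded snapshot's write time)
time() - spice_snapshot_written_timestamp_seconds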
Related
- Cluster-Sidecar Architecture — the conceptual model and live-delegation pattern.
- Snapshots — full reference for snapshot configuration, triggers, and modes.
- Sidecar Architecture — single-instance precursor to this pattern.
- Cluster Architecture — internal scheduler/executor split for the cluster tier (Spice.ai Enterprise).
- Kubernetes Deployment Guide — Helm, Argo CD, and Flux options for the cluster.
- CI/CD — automating cluster and read-instance rollouts.
- Spice.ai Enterprise Kubernetes Operator — recommended for production self-hosted deployments.
