13 posts tagged with "distributed-query"

Distributed Query related topics and usage

View All Tags

Spice v2.1.0 (Jul 9, 2026)

July 9, 2026 · 36 min read

Jack Eadie

Token Plumber at Spice AI

Spice v2.1.0 is now available! 🎉

Spice v2.1.0 is the next minor release of Spice, headlined by high-throughput Cayenne CDC, scaling and resilience improvements to PostgreSQL logical replication, expanded distributed query with Iceberg catalog scans and broadcast joins, and the upgrade to DataFusion v54 (including v53), Arrow v58.3, and Vortex v0.74. The release also adds experimental adaptive self-tuning for the Cayenne accelerator, distributed GLM inference, and a range of security, search, and connector improvements.

Highlights in v2.1.0 include:

High-Throughput Cayenne CDC — in-memory CDC tier, dedicated compaction runtime, and write-path optimizations that cut replication lag on high-volume CDC workloads
PostgreSQL Replication at Scale — multiple changes-mode datasets share a single replication slot, unchanged-TOAST recovery, and resilient reconnects across rolling deploys
Distributed Query — distributed Ballista scans of Iceberg catalog tables, broadcast joins for small dimension tables, and shared scheduler job state with failover
DataFusion v54 — upgrade to DataFusion v54 (folding in v53), Arrow v58.3, and Vortex v0.74, bringing faster joins, scans, and planning
Adaptive Self-Tuning (Experimental) — opt-in closed-loop tuning and maintained aggregates that adapt Cayenne to hardware, schema, and live workload

What's New in v2.1.0

High-Throughput Cayenne CDC

A major focus of v2.1 is Spice Cayenne write-path throughput for change-data-capture (HTAP) workloads:

In-Memory CDC Tier: A new in-memory CDC tier and follow-ups cut replication lag on hot upsert tables, with bounded mem-tier checkpointing and O(1) per-scan deletion views, plus a two-phase off-fence checkpoint on the ingest path.
Dedicated Compaction Runtime: A dedicated compaction runtime with CDC pipelining and protected snapshots isolates compaction from query and ingest paths, with parallelized deletion-vector writes, per-batch directory-barrier coalescing, and size-aware parallel encode for protected-snapshot compaction.
Incremental Protected Snapshot Compaction: Incremental compaction of protected snapshots (used in Cayenne's merge-on-read deletion index) reduces disk usage and improves query performance.
Smaller WAL & Metadata-Only Publish: cayenne_insert_record table IDs are stored as 16-byte raw-UUID BLOBs, cutting CDC WAL volume ~34%; upsert commits publish metadata-only, dropping per-key insert records; transient staged CDC deltas are light-encoded.
Delta-Write Encoding Levels: A new cayenne_delta_encoding setting (default auto) selects delta-write encoding, and cayenne_compression_strategy: zstd is now fully wired.
In-Memory CDC Sharding: PK-hash intra-apply sharding parallelizes in-memory CDC apply.
Scan Safety Under Write: In-flight scans are ref-counted so snapshot GC can't delete Vortex files mid-read; in-RAM scan parallelism, query admission control, and sound scan output ordering improve read behavior under sustained CDC.

Delta-write encoding effort and Vortex compression are tunable per accelerator. cayenne_delta_encoding: auto (the default) size-gates fresh CDC/append writes — small deltas use a light scheme and are re-encoded during compaction — or pin an explicit level 0..10 (7 is the full default cascade); cayenne_compression_strategy selects the Vortex compression:

acceleration:
  engine: cayenne
  refresh_mode: changes
  params:
    cayenne_delta_encoding: auto # 'auto' (default), or pin a level 0..10 (7 = full cascade)
    cayenne_compression_strategy: zstd # 'btrblocks' (default) or 'zstd'

Change Data Capture & HTAP

PostgreSQL logical replication (CDC, refresh_mode: changes, introduced in v2.0) gets significant scaling and resilience work in v2.1:

Shared Replication Slot: Multiple refresh_mode: changes PostgreSQL datasets on the same connection can name the same pg_replication_slot to share a single replication slot, walsender decoder, and publication, with decoded changes multiplexed by (schema, table) to each dataset. This collapses the slot count from one-per-dataset to one — staying well under Postgres's default max_replication_slots = 10.

datasets:
  - from: postgres:public.orders
    name: orders
    params:
      pg_db: mydb
      pg_replication_slot: spice_cdc # shared slot name
    acceleration:
      refresh_mode: changes
  - from: postgres:public.customers
    name: customers
    params:
      pg_db: mydb
      pg_replication_slot: spice_cdc # same name -> one slot, walsender & publication
    acceleration:
      refresh_mode: changes

Unchanged-TOAST Recovery: Under REPLICA IDENTITY FULL, when an UPDATE leaves a large TOASTed column unchanged, pgoutput sends an "unchanged" marker; Spice now fills that value from the old tuple — its old value is its current value — so updates no longer error or drop columns. Without an old tuple, the error persists with a hint to enable REPLICA IDENTITY FULL.
Transient Walsender Contention: Slot-contention errors during rolling deploys — SQLSTATE 55006 ("replication slot is active for PID") and 53300 ("requested standby connections exceeds max_wal_senders") — are now classified as transient and retried with backoff instead of fatally terminating the stream. Replication connections are also released at shutdown start (not process exit), freeing walsender seats for replacement instances.
Strict CDC Param Validation: PostgreSQL CDC parameters are strictly validated rather than silently defaulted.
Debezium Schema Evolution: Fixes for Debezium schema-evolution support, including tombstone-message handling and sign-extension of minimal-width base64 decimals.

Distributed Query

Distributed Query gains:

Distributed Iceberg Catalog Scans: Ballista distributes scans of Iceberg catalog tables across executors.
Broadcast Joins: Small dimension tables are broadcast to executors for distributed joins.
Shared Scheduler Job State with Failover: Ballista job state is shared so the scheduler can fail over without losing in-flight work.

Performance & Query Engine

Apache DataFusion is upgraded to v54, folding in v53, alongside Arrow v58.3 and Vortex v0.74 (with a pin bump adding intra-file decode split and a per-execution kernel cache). Two DataFusion releases land in this upgrade:

DataFusion v54 (release notes): adds LATERAL joins, SQL lambda functions (x -> expr with array_transform/array_filter/array_any_match), spilling nested-loop joins, and a faster arrow-avro reader. Performance work includes morsel-driven Parquet scans (up to ~2x faster for skewed scans), 20-50x faster sort-merge semi/anti/mark joins, redundant-sort-key pruning, NDV-based cardinality estimation, and inner_product/cosine_distance functions.
DataFusion v53 (release notes): adds LIMIT-aware Parquet row-group pruning, broader filter pushdown through joins and UNION, nested-field pushdown (get_field into the scan), faster query planning (some plans dropping from ~4-5ms to ~100us), and 42 faster built-in functions.

Federation deny-list enforcement and catalog DDL are restored after the DataFusion upgrades, and a cost-based left-deep join reordering rule is added for Cayenne acceleration.

AI & LLM

Native GLM Support with Distributed Inference: Native GLM model support with surfaced reasoning_content, including tensor-parallel GLM inference. Load a GLM model with model_type: glm4 (glm4moe and glm4moelite are also supported):

models:
  - name: glm
    from: huggingface:huggingface.co/THUDM/glm-4-9b-chat
    params:
      model_type: glm4

For large models, GLM inference can be distributed across nodes (tensor parallelism) via the mistral.rs pure-TCP ring all-reduce backend — no NCCL/system dependency. This is a Spice.ai Enterprise feature requiring the distributed build. Run the same model on each node, changing only node_rank:

models:
  - name: glm
    from: huggingface:huggingface.co/THUDM/glm-4-9b-chat
    params:
      model_type: glm4
      distributed_backend: ring
      nodes: 10.0.4.21,10.0.4.22 # ordered host/IP per rank; the ring backend currently requires exactly 2
      node_rank: 0 # rank of THIS node in [0, world_size); rank 0 serves the API. Set node_rank: 1 on 10.0.4.22

NSQL Context Endpoint: A new GET /v1/nsql/context endpoint returns the SQL dialect, dataset schemas (with optional sample rows), and registered functions that Spice injects into natural-language-to-SQL (POST /v1/nsql) requests — useful for inspecting or caching exactly what the model sees:

# Inspect the context injected into /v1/nsql requests (examples_limit default 3, max 100)
curl "http://localhost:8090/v1/nsql/context?include_examples=true&examples_limit=3"

Returns the dialect, per-dataset schema (keys, indexes, searchable columns), the registered function inventory, and sample rows (abbreviated):

{
  "context": "# Spice.ai NSQL Context",
  "instructions": [
    "Write SQL for the Spice runtime, which uses Apache DataFusion with the SQL parser configured for the PostgreSQL dialect.",
    "Use table and column descriptions, primary keys, foreign keys, unique constraints, and indexes when choosing joins and filters."
  ],
  "sql": {
    "engine": "Apache DataFusion",
    "version": "54.0.0",
    "dialect": "PostgreSQL",
    "parser": "DataFusion SQL parser configured with PostgreSQL dialect"
  },
  "datasets": [
    {
      "name": "sales.orders",
      "table": "orders",
      "description": "Customer orders",
      "columns": [
        { "name": "order_id", "data_type": "Int64", "nullable": false, "primary_key": true, "indexed": true },
        { "name": "customer_id", "data_type": "Utf8", "nullable": false, "vector_search": true, "full_text_search": true }
      ],
      "primary_key": ["order_id"],
      "foreign_keys": [
        { "columns": ["customer_id"], "foreign_table": "spice.sales.customers", "foreign_columns": ["id"] }
      ]
    }
  ],
  "functions": {
    "summary": "Spice SQL runs on Apache DataFusion ... Run SELECT * FROM list_udfs() to inspect the full registered function inventory",
    "search": [
      { "name": "vector_search", "syntax": "vector_search(dataset, 'query text'[, column])" },
      { "name": "text_search", "syntax": "text_search(dataset, 'query text'[, column])" }
    ]
  },
  "samples": [
    { "title": "Example rows for `sales.orders`", "content": "| order_id | customer_id |\n| --- | --- |\n| 42 | CUST-1 |" }
  ]
}

Search & Vectors

S3 Vectors Pagination: QueryVectors paginates for top-K up to 10,000.
Elasticsearch kNN Candidate Pool: The default kNN candidate pool is raised from 10 to 1000 for better recall.

SQL & Query Engine

FlightSQL Substrait Plans: CommandStatementSubstraitPlan support.
Large Result Streaming: Flight streaming is optimized for large result sets.
Write Authorization: The SQL tool allows writes for ReadWrite API keys.
Schema Evolution Policies: on_schema_change supports widening-only evolution and a drop_and_recreate policy.

Security & Connectors

Kafka mTLS: Mutual TLS configuration is surfaced in the Kafka data connector.
Secret Resolution at Startup: Secret references are checked and reported at startup.
DuckDB HNSW: Upgrade to DuckDB v1.5.3 with the statically linked VSS (HNSW) vector extension.

Adaptive Self-Tuning (Experimental)

The Spice Cayenne accelerator gains experimental opt-in self-tuning. cayenne_tuning: auto derives configuration from the detected hardware and inferred schema, while adaptive additionally runs a per-table closed-feedback controller that adapts flush caps, the in-memory CDC tier, compaction cadence, and write concurrency toward operator SLOs (replication lag, freshness, query latency, queries-per-hour). Cayenne can also maintain aggregates incrementally — with predicate-aware delta serving and incremental retraction — and fold whole-table SUM/AVG/COUNT/MIN/MAX from statistics. These features are experimental and disabled by default.

datasets:
  - from: postgres:public.orders
    name: orders
    acceleration:
      engine: cayenne
      refresh_mode: changes
      params:
        cayenne_tuning: adaptive # 'auto' (static, env- + schema-derived) or 'adaptive' (closed-loop)

Observability

Per-Dataset Query Attribution: The query_executions metric gains a datasets dimension.
HTAP Diagnostics: Improved HTAP replication diagnostics on non-convergence.
Cayenne Write Observability: Write-phase observability for the in-memory CDC tier.

Notable Bug Fixes

Cayenne Utf8View: The Utf8View read schema avoids a hash-join offset overflow.
Cayenne metastore: cayenne_metastore: turso is honored for partitioned tables and the dataset checkpoint.
Dual-write detection: Dual-write accelerated tables are detected behind the metadata-enrichment wrapper.
digest_many collisions: Values are length-prefixed so column boundaries can't collide.
Turso WAL checkpoint: WAL checkpoints route through the native Turso connection.
TLS status probe: The status check probes the metrics endpoint over HTTPS when TLS is enabled.
Search snippet offsets: Character chunk offsets persist so search snippets aren't shifted or garbled.
Async query chunk offsets: /v1/queries chunk row_offset uses the cumulative offset rather than chunk_index * chunk_size.

Dependency Updates

Dependency / Component	Version
DataFusion	v54
Arrow (arrow-rs)	v58.3
Vortex	v0.74
iceberg-rust	v0.9.1
DuckDB	v1.5.3
Rust toolchain	v1.95.0

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.1.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.1.0 image:

docker pull spiceai/spiceai:2.1.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.1.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Make DuckDB schema cast logic more robust by @sgrebnov in #10991
perf(cayenne): reduce allocation overheads in hot paths by @lukekim in #10950
improve error message for 'params.spiceai_region' by @Jeadie in #10954
fix(acceleration): rename WriteMode variants to fix #10960 by @phillipleblanc in #10974
add Scylla, bigquery and turso to throughput benchmarking by @Jeadie in #11006
deps(ballista): pull in shuffle-on-object-store correctness fixes (datafusion-ballista PRs #42 + #43) by @phillipleblanc in #10919
fix(kafka): seek to sidecar offsets via post_rebalance callback on restart by @ewgenius in #11007
Propagate source comments into schema metadata by @lukekim in #10944
feat(chbench-driver): better alignment to BenchBase by @sgrebnov in #11012
fix: handle EXISTS/NOT EXISTS subqueries in federation analyzer by @sgrebnov in #10996
Enable filter pushdown in spicepod defined UDTFs by @Jeadie in #11004
Refactor spice dataset configuration command by @Jeadie in #10999
fix: ensure HashJoinExec partition counts match when join input is statically empty by @phillipleblanc in #11025
feat(chbench): Improve OLTP throughput and reduce PostgreSQL CDC overhead by @sgrebnov in #11018
feat(cluster): add distributed query observability metrics by @phillipleblanc in #10990
fix: delegate truncate in PolyTableProvider to inner write provider by @claudespice in #11036
Remove default runtime features - enable explicitly in spiced by @phillipleblanc in #11037
fix: preserve field and schema metadata in Vortex physical schema calculation by @claudespice in #11013
Expose metadata descriptions via PostgreSQL UDFs by @lukekim in #11032
fix: route Turso WAL checkpoint through native turso connection (fixes #10657) by @claudespice in #11048
fix: add missing truncate delegation and guards for wrapper TableProviders by @claudespice in #11014
Update DataConnector statuses by @krinart in #11052
remove redundant readonly checks by @Jeadie in #10975
Fix Unity Catalog connector compatibility with OSS Unity Catalog by @ewgenius in #11026
refactor(cdc): reduce CDC sub-batch splits for interleaved upsert/delete workloads by @sgrebnov in #11051
feat(cayenne): allow inline writes with pending deletions (deletes/upserts) by @sgrebnov in #11031
fix(sql-tool): defer read-only gate to caller's API key role by @phillipleblanc in #11040
fix: map Gemini Recitation finish reason to ContentFilter by @claudespice in #11046
feat(cayenne): fast-path CDC deletes by extracting PK values from filters by @sgrebnov in #11049
fix(cluster): gate scheduler readiness on executor partition loads by @phillipleblanc in #10992
fix(snowflake): enforce function deny-list in federation pushdown by @claudespice in #11057
perf(cdc): Last-write-wins dedup in group_into_sub_batches to reduce sub-batch splits by @sgrebnov in #11059
[Bug] Timing between reconnect and AllocateInitialPartitions leaves connection without flight_sql_client by @Jeadie in #10805
Define 'trait QueryEngine' to refactor runtime crate by @Jeadie in #11028
fix(snowflake): apply Spice function deny-list in extracted connector crate by @claudespice in #11071
perf(cayenne): keep CDC upsert PK keysets resident to avoid per-batch full-table rebuilds by @lukekim in #11074
fix(postgres-replication): emit recovery log + reduce reconnect-warn volume by @claudespice in #11084
fix metadata on search indexing by @Jeadie in #11080
perf(cayenne): scale CDC inline flush caps with memory + storage class by @lukekim in #11087
feat(cayenne): merge-on-read position deletes for PK upsert tables + memory-pool accounting by @lukekim in #11085
Support tuple-IN composite PK extraction in Cayenne delete fast-path by @sgrebnov in #11093
feat(cluster): report per-executor table statistics so distributed JoinSelection can size joins by @phillipleblanc in #11089
feat(cluster): NDV-aware executor stats so CDC q18 join swap fires by @phillipleblanc in #11098
Improve HTAP replication diagnostics on non-convergence by @sgrebnov in #11100
Normalize DataType::Null to Int32 in acceleration schema for duckdb by @krinart in #11062
Fix debezium schema evolution support by @ewgenius in #11095
feat(cayenne): incremental write-path executor statistics for distributed join sizing by @phillipleblanc in #11104
fix(cache): periodic moka maintenance to drain invalidation predicates (#11077) by @phillipleblanc in #11106
fix: validate Snowflake account identifiers and auth config by @Jeadie in #11024
fix: trace external mcp server tool calls by @ewgenius in #11058
Remove unwrap from test code; drop clippy::unwrap_used suppressions by @phillipleblanc in #11108
Upgrade to DuckDB 1.5.3 + statically link the VSS (HNSW) extension by @sgrebnov in #11107
Fix fetched_at for HTTP connector by @Jeadie in #11116
fix(search): propagate LIMIT to base TableScan in VectorScanTableProvider (fixes #8368) by @claudespice in #11124
feat(runtime): add spicebench feature to register the Cayenne catalog connector by @phillipleblanc in #11122
fix(cayenne): tombstone inline-checkpointed rows on upsert to prevent duplicate PKs by @sgrebnov in #11129
Remove possibility of a deadlock in RuntimeStatus by @krinart in #11114
Fix Windows build: vendored-vss duckdb-rs + adapt to table-providers mongodb API by @phillipleblanc in #11140
localpod: synchronize child refreshes when parent uses in-memory (arrow) accelerator by @phillipleblanc in #11139
Add datasets dimension to the query_executions metric by @phillipleblanc in #11138
fix(spiceai): keep correlated subqueries out of JOIN ON for Spice Cloud federation by @phillipleblanc in #11143
fix(duckdb): normalize timestamp columns to microsecond precision (fixes #10627) by @claudespice in #11145
feat(cayenne): sharded parallel Vortex encode with key/time clustering by @lukekim in #11144
fix(cluster): prevent DoPut write pipeline self-deadlock under ingest backpressure by @phillipleblanc in #11160
feat(chbench): configurable HTAP concurrency, DuckDB query overrides, and OLTP rate control by @sgrebnov in #11162
fix(http): preserve non-JSON response rows instead of crashing nested decomposition (fixes #11155) by @claudespice in #11161
Use declared schema in DynamoDB/MongoDB/Debezium by @krinart in #11066
fix(cluster): prevent partitioned datasets from staying Refreshing by @phillipleblanc in #11157
fix(runtime): don't list postgres as a valid accelerator engine when postgres-accel is disabled by @sgrebnov in #11169
fix(spark): recover stale or broken Spark Connect sessions on failure by @lukekim in #11171
fix(secrets): don't abort secret lookup precedence walk on a failing store by @phillipleblanc in #11175
feat(cayenne): bound aggregate write concurrency, conservative defaults, and write/read observability by @lukekim in #11170
feat(unity_catalog): support Unity Catalog credential vending for Delta Lake tables by @phillipleblanc in #11180
fix(secrets): keep failed secret stores registered so lookups report the init root cause by @phillipleblanc in #11181
fix(debezium): sign-extend minimal-width base64 decimals instead of zero-padding by @claudespice in #11184
fix(deps): update hickory-resolver to 0.26 (evicts hickory-proto 0.25.x) by @phillipleblanc in #11183
refactor(secrets): derive secret store metadata from a single registry table by @phillipleblanc in #11188
perf(cayenne): cut CDC replication lag on hot upsert tables by @lukekim in #11191
feat(cayenne): async inline-fallback (per-tombstone published flag) + 64c/256GB tuning by @lukekim in #11194
Add HTTP connector mTLS support by @lukekim in #11127
feat(snowflake): push AT TIME ZONE as CONVERT_TIMEZONE and pin session to UTC by @lukekim in #11190
feat(cdc): make cdc_max_coalesce_age_ms a real apply-loop linger by @sgrebnov in #11196
feat(cayenne): delta-write encoding levels (cayenne_delta_encoding, default auto) + make compression_strategy=zstd real by @lukekim in #11199
Add NSQL context endpoint by @lukekim in #11075
fix(federation): respect the Spice function deny-list across all SQL connectors; dialect-aware DuckDB pushdown by @claudespice in #11186
fix: surface unknown/applied cayenne_* runtime.params at startup (fixes #10970) by @claudespice in #11133
perf(cayenne): plain-fsync ordering tier on the staged-commit hot path by @lukekim in #11198
fix(kafka): decode JSON payloads to Arrow directly — fixes lossy Decimal128 + removes double-parse (#11192) by @claudespice in #11207
feat(cayenne): self-tuning accelerator — hardware + schema + closed-loop adaptive (auto/adaptive modes) by @lukekim in #11213
perf(cayenne): CDC throughput — SF-100 @10K txn/s toward <5s lag + 5K QPH by @lukekim in #11206
fix(cluster): distribute accelerated tables wrapped by metadata/index providers by @phillipleblanc in #11226
Improve Cayenne adaptive tuning and schema safety by @lukekim in #11237
feat: Add cayenne_file_pruning param by @peasee in #11239
Debezium connector - handle tombstone messages in kafka topic, with schema evolution enabled by @ewgenius in #11197
feat(cayenne): broadcast small-dimension joins to executors by @phillipleblanc in #11245
fix: scope request context across the managed query runtime by @phillipleblanc in #11253
fix: Strip inference columns from table schema on query by @peasee in #11251
fix(flightsql): don't drop un-pushed FilterExec predicates in distributed pushdown rules by @claudespice in #11256
feat(postgres): share one replication slot across multiple changes-mode datasets by @phillipleblanc in #11255
Upgrade to DataFusion v53.1, Arrow v58.3, Vortex v0.74, and dependencies by @lukekim in #11118
feat(connectors): support file_format: vortex everywhere parquet is supported by @lukekim in #11282
perf(cayenne): metadata-only publish — drop per-key insert records on upsert commit by @lukekim in #11260
fix(kafka): harden fetch_latest_message for multi-partition topics by @ewgenius in #11285
perf(cayenne): bound mem-tier checkpoint churn + O(1) per-scan deletion view by @lukekim in #11249
fix(udfs): length-prefix digest_many values so column boundaries can't collide (fixes #11272) by @claudespice in #11288
fix: restore federation deny-list enforcement regressed by the DataFusion 53 upgrade by @claudespice in #11294
fix(postgres): recover unchanged-TOAST columns from the old tuple; classify walsender contention as transient by @phillipleblanc in #11293
feat: deepen extended schema inference and wire it into cayenne compaction sharding/sorting by @lukekim in #11284
perf(cayenne): in-memory CDC tier follow-ups + write-phase observability by @lukekim in #11278
Support per-dataset CDC tunable overrides by @sgrebnov in #11295
feat(cayenne): harden adaptive auto-tuner (controller hygiene, mem-tier actuator, delete/burst signals, single opt-in) by @lukekim in #11302
fix(cayenne): scan inlined-view capture starvation under sustained CDC (analytical QPH) by @lukekim in #11299
fix(vortex): don't row-evaluate hash-join dynamic filters in the scan by @sgrebnov in #11307
fix(deps): bump rust-postgres crates (RUSTSEC-2026-0178/0179) by @lukekim in #11313
fix: strict validation of Postgres CDC params instead of silent defaults (fixes #11274) by @claudespice in #11304
fix(duckdb): always quote on-refresh sort columns so reserved-word names don't break refresh by @claudespice in #11305
feat(flightsql): fall back to original connection when endpoint location is unreachable by @melks in #11287
perf(cayenne): light-encode transient staged CDC deltas by @lukekim in #11311
feat(acceleration): widening-only schema evolution via on_schema_change by @lukekim in #11261
deps(vortex): bump pin to spiceai-53 HEAD — intra-file decode split + per-execution kernel cache by @lukekim in #11314
feat(github): enhance GitHub component validation and error handling by @lukekim in #11259
feat(cayenne): goal-driven adaptive tuning toward operator SLOs (lag, freshness, query latency, QPH) by @lukekim in #11310
fix(cayenne): broadcast-join rewrite must bail on ambiguous columns, NULL-equal joins, and residual filters by @claudespice in #11252
Add Cayenne maintained aggregates by @lukekim in #11235
perf(cayenne): single-hash composite deletion filter via KeyDeletionIndex::get_batch by @phillipleblanc in #11325
fix(Vortex): decline only the InList membership conjunct of hash-join dynamic filters by @sgrebnov in #11335
fix: scope SQL UDF arg inlining to args-table columns (fixes #11273) by @claudespice in #11337
fix(refresh): restore S3 ETag/Version refresh-skip behind provider wrappers by @phillipleblanc in #11339
fix(cayenne): ref-count in-flight scans so GC can't delete Vortex files mid-read by @phillipleblanc in #11321
fix(runtime): retry object-store dataset load when source files are not yet available by @phillipleblanc in #11342
feat(s3): default to path-style for dotted bucket names on standard AWS by @phillipleblanc in #11347
fix(runtime): resolve accelerated table through metadata-enrichment wrapper by @phillipleblanc in #11345
fix: detect dual-write accelerated tables behind the metadata-enrichment wrapper by @claudespice in #11351
feat(cayenne): incremental seq-prefix bake — shrink the merge-on-read deletion index by @lukekim in #11326
fix(adbc): prevent Spice-specific UDFs from being pushed down to ADBC sources by @krinart in #11297
fix: Query Redshift schema details from svv_redshift tables by @peasee in #11362
perf(cayenne): tune Turso connection PRAGMAs + jitter metastore retries by @lukekim in #11359
Upgrade to DataFusion 54 by @sgrebnov in #11360
feat(runtime): dedicated CDC-apply tokio runtime + per-runtime tokio metrics by @lukekim in #11370
fix(cayenne): spill oversized hash joins via sort-merge to avoid OOM by @lukekim in #11371
fix(cayenne): honor cayenne_metastore: turso for partitioned tables and the dataset checkpoint by @phillipleblanc in #11365
perf(cayenne): in-RAM scan parallelism, query admission control, skip no-op deletion encode, sound scan output_ordering by @lukekim in #11332
fix(catalog): restore DDL after DataFusion 54 broke transparent catalog-provider downcasts by @phillipleblanc in #11375
feat(flightsql): infer schema via SELECT * LIMIT 1 when GetTables is unimplemented by @melks in #11286
fix(cayenne): Utf8View read schema avoids hash-join offset overflow by @lukekim in #11379
feat(cluster): support distributed (Ballista) scans of Iceberg tables by @phillipleblanc in #11378
feat(optimizer): cost-based left-deep join reordering for Cayenne acceleration by @sgrebnov in #11377
cli - fix service-account auth in spice cloud * commands by @ewgenius in #11316
fix: Support external Redshift table schema inference and Hive external type parsing by @peasee in #11399
feat(llms): native GLM support — opt-in flash-attn + surface reasoning_content by @lukekim in #11400
feat(s3_vectors): paginate QueryVectors for topK up to 10,000 by @bjchambers in #11405
fix: surface .env parse errors with line numbers instead of silently skipping by @Oxygen56 in #11306
fix(status): probe metrics endpoint over https when TLS is enabled by @phillipleblanc in #11393
fix(mcp): record task_history spans for tool calls proxied through /v1/mcp by @phillipleblanc in #11397
Properly handle date_trunc in BigQueryDialect by @krinart in #11416
fix(snowflake): honor column scale in Int64 timestamp arm and cast TIME by @claudespice in #11418
fix: /v1/queries chunk row_offset uses cumulative offset, not chunk_index * chunk_size (fixes #11271) by @claudespice in #11398
feat(cayenne): incremental retraction for maintained aggregates + anchor bench by @lukekim in #11389
feat(cluster): support distributed (Ballista) scans of Iceberg catalog tables by @phillipleblanc in #11419
Simplify chat/responses models by @Jeadie in #10997
feat(llms): distributed tensor-parallel GLM inference via mistral.rs ring backend by @lukekim in #11406
Optimize Flight streaming for large result sets by @lukekim in #11420
fix(deps): evict rustls 0.21 / rustls-webpki 0.101.7 (GHSA-82j2-j2ch-gfr8) by @phillipleblanc in #11428
feat(cayenne): in-memory CDC intra-apply sharding (PK-hash shards) by @lukekim in #11421
fix(cayenne): shard CDC upserts with pending deletions so the N>1 slot-ack advances by @lukekim in #11445
fix(cluster): keep built-in avg over Spark avg (distributed aggregate state schema mismatch) by @phillipleblanc in #11434
feat(views): support params.file_format for embedding chunking by @Jeadie in #11424
fix: offload blocking sync calls off the primary async runtime by @phillipleblanc in #11435
fix(udfs): rebind dot_product alias to Spice's inner_product on DataFusion 54 by @lukekim in #11443
feat(secrets): add full-fidelity reference iteration and a resolution-status API by @phillipleblanc in #11195
fix: Deny unsupported array functions for Postgres pushdown by @peasee in #11450
fix(cli): spice query honors --http-endpoint instead of failing on a Flight connect by @phillipleblanc in #11452
fix(cayenne): coordinate query-pool + in-memory CDC tier budgets to prevent adaptive OOM by @lukekim in #11449
Surface mTLS config in Kafka data connector by @v1gnesh in #11372
feat(secrets): check and report secret references at startup by @phillipleblanc in #11457
Default cayenne_force_view_types to false by @sgrebnov in #11459
fix: resolve table-reference qualification in results-cache invalidation (fixes #11266) by @claudespice in #11460
feat(cayenne): metadata aggregate pushdown — fold whole-table SUM/AVG/COUNT/MIN/MAX from statistics by @bjchambers in #11414
fix(search): restore numeric trunc and fix SortPreservingMergeExec planning error (DF54) by @Jeadie in #11415
fix(cluster): route Ballista shuffle/temp to the data PVC by @phillipleblanc in #11454
fix(search): default Elasticsearch kNN candidate pool to 1000 instead of 10 (fixes #11264) by @claudespice in #11467
feat(acceleration): add on_schema_change drop_and_recreate policy by @lukekim in #11462
feat(cayenne): predicate-aware maintained aggregates serve filtered analytical queries from the CDC delta by @lukekim in #11458
feat(cluster): shared Ballista job state with scheduler failover by @phillipleblanc in #11436
feat(cayenne): extend HLL NDV sketching to string and date columns by @bjchambers in #11468
fix(search): persist character chunk offsets so search snippets aren't shifted/garbled (fixes #11269) by @claudespice in #11479
feat(cayenne): storage-aware adaptive CDC tuning — calibration probe, IMDS, I/O-cliff fast path, infeasible-SLO feedback by @lukekim in #11463
fix(cayenne): LIMIT N under-delivers on key-deletion tables by @lukekim in #11490
fix(cayenne): live/tier-accurate join build-side stats (merge-on-read deletes + never-shrink NDV) by @lukekim in #11496
perf(cayenne): kernel-space I/O hygiene — compaction fadvise + staged-commit barrier reduction by @lukekim in #11495
feat(cayenne): feed maintained-aggregate IVM from the staged-disk CDC path by @lukekim in #11491
feat(cayenne): global adaptive-tuning SLOs with per-dataset overrides; QPH global-only by @lukekim in #11497
Update search snapshots by @sgrebnov in #11473
fix: Support reading column types longer than 128 chars in Redshift by @peasee in #11500
fix(cluster): distributed (Ballista) query-execution config + scheduled SF10 bench by @phillipleblanc in #11478
fix(http): retry transient response-body read failures; de-flake backoff test by @claudespice in #11482
Re-land orphaned deletion-vector cleanup during retention deletes by @lukekim in #11501
fix(cayenne): re-upsert over a pending delete tombstone records an insert-record (overwrite resurrection) by @bjchambers in #11469
fix: Flight DoPut silently dropped client batches on early sink completion by @claudespice in #11507
Upgrade OpenTelemetry to 0.32 and reqwest to 0.13 by @phillipleblanc in #11506
fix(queries): run async /v1/queries jobs under the submitting request context by @phillipleblanc in #11505
fix(cayenne): restore append-only current-snapshot compaction by @Jeadie in #11439
perf(cayenne): orphaned deletion-vector cleanup off the write path, behind a knob by @bjchambers in #11517
fix(datafusion): accurate projected scan byte size so hash joins build the smaller side by @sgrebnov in #11503
fix(cayenne): user-visible DELETE WHERE pk IN (...) reports the real row count by @lukekim in #11514
fix(cayenne): seed persisted num_rows for hash-join sizing by @sgrebnov in #11515
fix(cayenne): correctness & memory_limit fixes from perf audit (2 P0, 3 P1) by @lukekim in #11516
feat(cayenne): wire orphaned-DV cleanup knob to spicepod params + doc sync by @bjchambers in #11523
Fix s3 vectors API by @krinart in #11536
fix: Placeholder table initialization lock swap by @peasee in #11540
chore(cluster): bump ballista pin for the null-aware anti-join fix by @phillipleblanc in #11544
fix(runtime-tools): fix memory table identifier validation rejecting valid names by @Jeadie in #11546
Bump datafusion to include spiceai/datafusion#181 by @Jeadie in #11563
Update deny.toml by @krinart in #11571
Bump datafusion (spiceai/datafusion#182) and datafusion-table-providers (#27): fix q16 CollectLeft planning error and SQLite q6 wrong revenue by @Jeadie in #11598

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0...v2.1.0

Spice v2.0.1 (Jun 17, 2026)

June 17, 2026 · 4 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Spice v2.0.1 is now available! 🛠️

Spice v2.0.1 is a patch release focused on reliability and performance. It speeds up Apache Iceberg reads and fixes bugs across AWS S3 and object-store datasets, data acceleration, distributed query, and authenticated access.

What's New in v2.0.1

Faster Iceberg Reads with Parallel File Scanning

The Apache Iceberg reader now scans data files in parallel (#11331), improving read throughput and latency for Iceberg tables that span many files.

AWS S3 & Object-Store Reliability

Three fixes improve S3 and object-store dataset behavior:

Refresh-skip restored (#11339): ETag/Version-based refresh-skip works reliably again, so unchanged S3 objects are no longer re-downloaded on every refresh.
Retry when source files are not yet available (#11342): an object-store dataset whose source files are not present at startup now retries and becomes ready once the data appears, instead of failing permanently.
Path-style addressing for dotted bucket names (#11347): on standard AWS, buckets whose names contain dots now default to path-style addressing, avoiding TLS wildcard certificate errors under virtual-hosted-style HTTPS.

Data Acceleration & Distributed Query Fixes

Two fixes ensure accelerated datasets behave correctly in more configurations:

Acceleration endpoints (#11345): /v1/datasets/{name}/acceleration/refresh (and the related update-refresh-sql, partition-filters, and snapshots endpoints) now work for all accelerated datasets, fixing cases where some incorrectly reported Table is not accelerated.
Distributed clusters (#11226): the distributed query coordinator now serves accelerated data from executors for all accelerated datasets, instead of falling back to reading from the source for some.

Authenticated Query Fixes

With authentication enabled, queries now consistently run as the requesting user (#11253), so per-user behavior such as results caching is correctly scoped to each user.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.1 image:

docker pull spiceai/spiceai:2.0.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix(ci): key testoperator & validator artifacts by checked-out commit by @sgrebnov in #11281
chore(deps): fix cargo-deny advisory failures on release/2.0 by @phillipleblanc in #11333
chore(deps): bump iceberg-rust to parallel file scanning fork (release/2.0) by @phillipleblanc in #11331
fix(refresh): restore S3 ETag/Version refresh-skip behind provider wrappers by @phillipleblanc in #11339
fix(runtime): retry object-store dataset load when source files are not yet available by @phillipleblanc in #11342
feat(s3): default to path-style for dotted bucket names on standard AWS by @phillipleblanc in #11347
fix(runtime): resolve accelerated table through metadata-enrichment wrapper by @phillipleblanc in #11345
fix(cluster): distribute accelerated tables wrapped by metadata/index providers by @phillipleblanc in #11226
fix: scope request context across the managed query runtime by @phillipleblanc in #11253

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0...v2.0.1

Spice v2.0-stable (Jun 5, 2026)

June 5, 2026 · 94 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

53 releases since Spice 1.0-stable, Spice.ai OSS has reached the 2.0-stable milestone! 🎉

Spice v2.0.0 is the next major release of Spice and a major milestone in the project's development, advancing Spice from a single-node engine into a distributed data and query platform built for enterprise AI agents. These agents need low-latency, governed access to data spread across many production systems, and because they generate their own queries autonomously, that access has to be sandboxed, observable, and able to absorb occasional heavy analytical queries without overwhelming the underlying systems. The release is headlined by multi-node distributed query, now generally available — multi-active, highly-available, and object-store-native, built on Apache Ballista — distributing both query execution and ingestion across executors with data-local routing and per-executor statistics for distributed join planning. Alongside it, the Spice Cayenne data accelerator is generally available, built on the Vortex compressed columnar format, with a high-throughput CDC write path, MERGE INTO, SQL-defined partitioning, inline writes, a dedicated compaction runtime, and write-path statistics for distributed join sizing. The engine also moves to DataFusion v52 with sort pushdown, a rewritten merge join, and dynamic filters, and the Spice CLI is rewritten in Rust as a single self-contained binary.

v2.0 also expands real-time and write-path capabilities across the platform: native CDC from MongoDB Change Streams and PostgreSQL WAL logical replication, durable Kafka CDC offsets, DML write-back for PostgreSQL, Snowflake, DynamoDB, Arrow, and DuckLake, DDL and MERGE INTO for Iceberg catalogs, mutual TLS across server endpoints and outbound connectors, HashiCorp Vault and Azure Key Vault secret stores, user-defined functions, hybrid search with Elasticsearch and DuckDB HNSW vector indexes, provider-aware LLM prompt caching, and the Responses API across all model providers.

Highlights in v2.0.0 include:

Spice Cayenne (GA) — generally available on the Vortex compressed columnar format, with WAL-staged writes, inline low-latency writes, fast-path CDC deletes, merge-on-read position deletes, composite & SQL-defined partitioning, MERGE INTO, dedicated compaction runtime, and join-sizing statistics maintained on the write path
Multi-Active HA Distributed Query (GA) — multi-node distributed query built on Apache Ballista, with object-store-native clustering, dynamic cluster sizing, distributed ingestion, data-local query routing, per-executor table statistics for distributed join planning, and async queries via /v1/queries
Mutual TLS (mTLS) — public mTLS for HTTP and Flight, TLS cert hot-reload, and mTLS client certificates for FlightSQL and Spice.ai connectors
Enterprise Authentication & Authorization — OIDC bearer-token verification and Cedar-based authorization policy with per-principal row- and column-level filtering
New Secret Stores — HashiCorp Vault and Azure Key Vault
CDC Sources — native MongoDB Change Streams, PostgreSQL WAL logical replication, and durable Kafka CDC offsets — no Debezium or Kafka middleware required
DML & DDL — INSERT/UPDATE/DELETE write-back for PostgreSQL, Snowflake, DynamoDB, and Arrow; CREATE TABLE/DROP TABLE and MERGE INTO for Iceberg catalogs
User-Defined Functions — SQL UDFs in spicepods, remote UDFs over HTTP, and optional geospatial ST_* UDFs
On-Demand Dataset Loading & Unified Query Cancellation — faster startup and end-to-end cancellation across HTTP, Flight, FlightSQL, and MCP
Dynamic HTTP Connector — OAuth2 refresh tokens, pagination, dynamic headers, subquery-driven parameters, and rate-control state persisted across restarts
Storage-Profile Accelerator Tuning & refresh_mode: snapshot — storage-aware acceleration defaults and point-in-time snapshot acceleration
Search & Vectors — Elasticsearch data connector with native hybrid search, DuckDB HNSW vector engine with a statically linked VSS extension, multi-vector MaxSim embeddings, and a rerank() UDTF
AI & LLM — provider-aware prompt caching, Responses API across all providers, MCP Streamable HTTP transport, and a searchable LLM tool registry
New Data Connectors — Elasticsearch (Alpha), GCS (Alpha), Azure Cosmos DB (Alpha), Git (RC), ADBC, DuckLake (Beta), and catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
Rust CLI — single-binary spice CLI with spice query async REPL, shell completions, and --output=json
Dependency upgrades including DataFusion v52.5, DuckDB v1.5.3, Arrow v57.2, iceberg-rust v0.9.1, Turso v0.6.1, and Vortex v0.69

Spice v2.0 includes several breaking changes. Review the breaking changes section before upgrading.

Distribution Changes

AI/ML support including local LLM/ML model and hosted LLM inference is now included in the default Spice build and image. The separate models build variant has been removed.

With models now included by default, the data-only distribution (without AI/ML support) is only published in nightly builds. Official production-ready data-only distributions are available exclusively through Spice Cloud and the Enterprise release.

A new Network Attached Storage (NAS) distribution with built-in SMB and NFS data connector support is also available in nightly builds and with Spice.ai Enterprise.

Distribution / Variant	Open Source	Spice Cloud	Enterprise
Default	✅	✅	✅
Data	Nightly only	✅	✅
NAS (SMB + NFS)	Nightly only	❌	✅
Metal (macOS)	✅	✅	✅
CUDA (Linux)	Nightly only	✅	✅
Allocator variants	Nightly only	✅	✅
ODBC connector	Local build only	✅	✅

Native Windows builds are no longer provided; use WSL for local development. For more details, see the Distributions documentation.

What's New in v2.0.0

Spice Cayenne Reaches General Availability

The Spice Cayenne data accelerator is generally available in v2.0, with a major focus across the release candidates on write-path throughput, correctness, and distributed operation.

Write path & ingest:

Staged Append Writes: WAL-based staged append writes prevent partial writes and data loss on stream errors — batches commit atomically.
Inline Writes: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore, bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows, inline data stays query-visible via an in-memory union scan, and rows are checkpointed to Vortex when thresholds are reached. Inline writes now also proceed with pending deletions in flight, and inline flush caps scale with available memory and storage class.
Fast-Path CDC Deletes: DELETE statements whose filters identify primary keys directly — including composite keys expressed as (k1, k2) IN ((...), (...)) — skip the table scan entirely.
Merge-On-Read Position Deletes: Primary-key upsert tables use position deletes with memory-pool accounting, avoiding full-table rewrites on update-heavy workloads.
Resident Upsert Keysets: CDC upsert primary-key keysets stay resident between batches, avoiding per-batch full-table rebuilds.
CDC Sub-Batch Efficiency: Interleaved upsert/delete workloads produce fewer sub-batch splits, with last-write-wins deduplication applied within batches.
Dedicated Compaction Runtime: Background compaction runs on a dedicated thread pool with CDC pipelining and protected snapshots, isolating compaction work from query and ingest paths.

Query & planning:

Join Filter Propagation: Filters propagate across equi-join keys, with range fallback for large join filters and IN-list rewrites.
Write-Path Join-Sizing Statistics: Cayenne maintains live row counts and HyperLogLog-based distinct-value estimates on the write path, so distributed JoinSelection can correctly size joins without rescans.
Scan-Result Cache: A new scan-result cache accelerates hot reads, with parallel Vortex partition writes and lock-free deletion caches with bloom-prefiltered probes.

SQL & catalog:

MERGE INTO: Upsert-style MERGE INTO for Cayenne catalog tables, distributed across executors in cluster mode.
PARTITION BY in SQL: Define partitioning directly in CREATE TABLE ... PARTITION BY (...); metadata is persisted in the catalog and survives restarts.
Composite Partitioning: partition_by: [col1, col2] with hierarchical path-like keys.
File-Based Retention Deletes: Time-based retention uses file-level deletes for both position-based and primary-key tables.

Correctness: Synchronized partition commits, correct NULL-sentinel handling for nullable partition expressions, tombstoned inline-checkpointed rows on upsert (preventing duplicate primary keys), and live reads through expired protected snapshots.

Multi-Active HA Distributed Query (GA)

Spice.ai Enterprise feature. See High Availability.

Distributed Query is generally available. Built on Apache Ballista, it distributes query execution across multiple active executor nodes with no single point of failure, reading directly from object storage rather than relying on a central cluster.

Distributed query supports two execution modes:

Synchronous: Queries for accelerated datasets are distributed across executors and results stream back in real-time — best for interactive, latency-sensitive queries.
Asynchronous: Queries submitted via the HTTP /v1/queries API materialize results to object storage for later retrieval — best for long-running analytical and batch workloads.

Key capabilities:

Dynamic Cluster Sizing: The planner adjusts parallelism to the number of active executors as nodes join or leave.
Distributed Ingestion: Ingestion for partitioned accelerated tables is distributed across executors, with partition-aware write-through splitting scheduler-side Flight DoPut writes to the responsible executors.
Data-Local Query Routing: Cayenne catalog queries route to the executors holding the relevant partitions.
Per-Executor Table Statistics: Executors report table statistics — including NDV-aware estimates — so distributed JoinSelection can size joins correctly, fixing out-of-memory conditions on large semi-joins.
Readiness & Failure Detection: /v1/ready gates on a configurable executor quorum for safe rolling deployments; scheduler readiness additionally waits for executor partition loads; executor heartbeat timeout reduced from 180s to 30s.
Distributed DML & DDL: UPDATE/DELETE forwarding to all executors, executor DDL sync for late joiners, and distributed MERGE INTO.
Cluster Observability: New cluster metrics (including scheduler_active_executors_count), distributed runtime.task_history replication, and a Grafana dashboard.
Ballista S3 Shuffle: Async queries with runtime.params.shuffle_location: s3://... complete reliably with executor-environment-derived S3 clients.

Security: Mutual TLS, Secret Stores, and Hardening

Several capabilities in this section are Spice.ai Enterprise features. See Enterprise Security.

Mutual TLS across the platform:

Public mTLS for HTTP and Flight: client_auth_mode: request (optional, for migration windows) or required (strict) client-certificate verification.
TLS Cert Hot-Reload: The runtime reloads TLS certificates on SIGHUP for zero-downtime rotation.
Outbound mTLS Client Certificates: FlightSQL and Spice.ai data connectors present client certificates to upstream services; the spice sql REPL supports mTLS client auth.

runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt

Authentication & Authorization (Spice.ai Enterprise):

OIDC Authentication: Validate OIDC bearer tokens (JWTs) issued by enterprise identity providers — Microsoft Entra ID, Okta, Auth0, AWS Cognito, and Google — for secure access to runtime endpoints, standalone or combined with API keys.
Principal-Based Policy Enforcement: Fine-grained, Cedar-based authorization policy configured under runtime.authorization governs allow/deny access across datasets, models, tools, and endpoints. Combined with identity SQL functions (current_principal(), current_principal_email(), current_principal_groups()), policies enforce per-principal row-level filtering and column masking.

New Secret Stores: HashiCorp Vault (KV v1/v2; token, approle, kubernetes, and jwt auth with automatic lease renewal) and Azure Key Vault (service principal, managed identity, workload identity, Azure CLI, or auto-detect; sovereign cloud support).

Hardening:

Read-only API Key Enforcement on the Flight DoGet path and async query endpoints.
Per-Principal Cache Namespacing: SQL, search, and caching-accelerator caches are namespaced per authenticated principal so cached results never cross identity boundaries.
API Key Timing Leak & Remote-UDF SSRF: Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoints.
Snowflake Function Deny-List: A function deny-list is enforced in Snowflake federation pushdown, and Snowflake account identifiers and auth configuration are validated at startup.
MCP allowed_hosts: MCP servers can be restricted to an explicit allowlist of upstream hosts.

Change Data Capture (CDC) Sources

See Change Data Capture (CDC) for an overview of CDC in Spice.

MongoDB Change Streams: MongoDB datasets with refresh_mode: changes stream changes natively into any local accelerator — no Debezium or Kafka required.
PostgreSQL Native Replication (WAL): PostgreSQL datasets stream INSERT/UPDATE/DELETE directly from logical replication using pgoutput decoding, with automatic per-replica slot management, an initial REPEATABLE READ bootstrap snapshot, and durable LSN acknowledgement.
Kafka CDC Offset Persistence: Kafka CDC offsets persist in sidecar tables for durable, resumable streams across restarts and failovers.
Pipelined CDC Ingestion: Source reads overlap with batch apply, with envelope coalescing and improved nullability propagation.
Debezium Schema Evolution: Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload.

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes

DML, DDL, and Write-Back

Spice v2.0 turns more connectors and catalogs into full read/write tables:

PostgreSQL DML: INSERT, UPDATE, and DELETE write-back on PostgreSQL datasets, with foreign-key metadata exposed via the PostgreSQL catalog connector.
Snowflake DML: INSERT, UPDATE, and DELETE write-back on Snowflake datasets.
DynamoDB DML: INSERT, UPDATE, and DELETE for DynamoDB, complementing read and CDC streaming.
Arrow Primary Key Upserts: Native update-or-insert semantics for in-memory Arrow-accelerated tables.
DDL for Iceberg: CREATE TABLE and DROP TABLE via FlightSQL and /v1/sql for Iceberg, with catalog.access: read_write_create.
DuckLake INSERT: DuckLake catalog tables with read_write access support INSERT.

SQL & User-Defined Functions

See the SQL Reference for the full SQL surface area.

User-Defined Functions: Define reusable SQL UDFs as first-class spicepod components, or invoke remote functions over HTTP (Spice.ai Enterprise), plus table user functions.
Spatial SQL UDFs: Optional geospatial ST_* UDFs for geometry workloads.
JSON UDTFs: flatten_json, json_tree, and flatten_json_properties table-valued functions for JSON transformation and schema decomposition (with options such as expand_maps). See JSON Functions and Operators.
PostgreSQL Metadata UDFs: Dataset and column descriptions are exposed via PostgreSQL-compatible UDFs (obj_description, col_description), so BI tools and psql surface Spice metadata.
FlightSQL Substrait Plans: CommandStatementSubstraitPlan support for clients submitting Substrait-encoded plans.
SQL REPL Expanded View: Toggle \x for a vertical key-value layout on wide result sets.
Prepared statement, federation, and unparsing fixes across the engine, including keeping correlated subqueries out of JOIN ON conditions for Spice Cloud federation and correct EXISTS/NOT EXISTS subquery handling in the federation analyzer.

Runtime Features

On-Demand Dataset Loading: Datasets can be deferred — registered with a declared schema at startup (columns[].type, columns[].nullable) and fully resolved on first reference, reducing startup time and memory for large spicepods.
Unified Query Cancellation: HTTP, Flight, FlightSQL, MCP, and internal execution paths honour a unified cancellation signal — disconnects, REPL Ctrl-C, and cancelled HTTP requests cancel the query end-to-end.
Storage-Profile Accelerator Tuning: acceleration.storage_profile (auto, local_ssd, ebs, tmpfs) applies storage-aware defaults across DuckDB, SQLite, Turso, and Cayenne file-mode accelerators; auto detects the backing storage.
refresh_mode: snapshot (Spice.ai Enterprise): Point-in-time snapshot acceleration with SQLite/Turso WAL flushing and Cayenne metastore slice integration, now reporting accurate readiness when no snapshot exists yet.
Structured Component Errors: /v1/datasets?status=true and /v1/models?status=true return structured error objects (category, type, code) and human-readable error_message fields; the CLI shows an ERROR column.
Actionable Config Errors: Parameter typos, missing secret references, and unknown engine names produce specific, actionable errors with suggestions.

Spicepod v2

Spicepods now support version: v2, the default for spice init, while v1 spicepods continue to work with automatic migration of deprecated fields.

Version	Status
`v2`	Default. Used by `spice init`.
`v1`	Supported. Deprecated fields auto-migrate.
`v1beta1`	Removed. No longer accepted.

v1 (deprecated)	v2 (preferred)	Notes
`runtime.results_cache`	`runtime.caching.sql_results`	All fields migrate automatically. `cache_max_size` → `max_size`.
`runtime.memory_limit`	`runtime.query.memory_limit`	Auto-migrated. `query.memory_limit` takes priority if both set.
`runtime.temp_directory`	`runtime.query.temp_directory`	Auto-migrated. `query.temp_directory` takes priority if both set.
`dataset.invalid_type_action`	`dataset.unsupported_type_action`	Auto-migrated. v2 adds a new `string` variant.

New v2 fields include runtime.ready_state, runtime.query.spill_compression, runtime.caching.sql_results.stale_while_revalidate_ttl, runtime.caching.sql_results.encoding, scheduler partition-assignment configuration, and catalog.access: read_write_create.

Data Connectors & Catalogs

New connectors:

Elasticsearch (Alpha, Spice.ai Enterprise): Query Elasticsearch indexes as SQL tables with native hybrid search — vector_search() kNN, text_search() BM25, and rrf() fusion — plus Elasticsearch as a backing vector engine, direct FTS engine configuration, and index lifecycle controls.
GCS (Alpha): Federated queries against Google Cloud Storage, with Iceberg table support.
Azure Cosmos DB (Alpha): Read-only NoSQL / Core SQL API connector with cross-partition scans and schema inference.
Git (RC): HTTPS/SSH auth, Git LFS support, and per-repo connection resilience.
ADBC: Data connector and catalog with full query federation, BigQuery support, and schema/table discovery.
DuckLake (Beta): Lakehouse-style data management with DuckDB as the metadata catalog and object storage for data — ACID transactions, time travel, and schema evolution on Parquet.
Self-Hosted Spice Connector: Connect Spice to another self-hosted Spice runtime as a federated source.

New catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake, using native metadata catalogs for schema and table discovery. Unity Catalog compatibility extends to OSS Unity Catalog deployments, and DDL-defined catalogs can expose and query views.

HTTP connector: OAuth2 refresh-token authentication, query-parameter and no-limit pagination, dynamic request headers parameterised from query predicates, subquery-driven request parameters for fan-out queries, response metadata as queryable columns, map-to-array conversion, shared and persistent rate-control state across restarts and replicas, no caching of transient 429/5xx errors, and a correctly populated fetched_at column.

JSON ingestion: Single-object documents, JSONL, BOM-prefixed input, Socrata SODA responses, format auto-detection, and RFC 6901 json_pointer extraction of nested payloads.

Databricks: Resilience controls, Unity Catalog-aware permission prechecks with structured advisory errors, Classic SQL Warehouse foreign-table compatibility, connect_timeout/client_timeout parameters, a Databricks SQL dialect for federation, and Delta Lake column mapping (Name and Id modes).

Other connector improvements: MongoDB SRV support; MySQL mysql_zero_date_behavior; Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, and TIMESTAMP_LTZ types plus key-pair auth; ClickHouse Date32; S3 s3_url_style for path-style addressing and faster Parquet reads; GraphQL custom auth headers; Oracle and MSSQL sort/limit pushdown; GitHub GraphQL resilience; and improved Kafka reliability.

AI & LLM

Provider-Aware Prompt Caching: LLM calls automatically use provider-side prompt caching (e.g., Anthropic, OpenAI) for system prompts and tool descriptions, reducing latency and cost.
Responses API Across All Providers: The Responses API works with every configured model provider, including streaming response.output_text.delta events and Authorization: Bearer header support.
Multi-Vector Embeddings with MaxSim: List-of-string columns produce one embedding per element with MaxSim/mean/sum scoring for ColBERT-style late-interaction retrieval, plus a _match column identifying the best-matching element.
rerank() UDTF: Reorder results from vector_search, text_search, or rrf using any registered chat model as a reranker, with automatic query propagation and pushdown support.
Searchable LLM Tool Registry: Agents discover tools via semantic search instead of enumerating every tool in the system prompt.
MCP Improvements: Streamable HTTP transport (/v1/mcp) on rmcp v1.5.0, native auth for streamable HTTP tools (mcp_auth_token, mcp_headers), external MCP server tool calls traced in task history, and configurable allowed_hosts.
Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations.

Search & Vectors

DuckDB Vector Engine: vector_engine: duckdb uses DuckDB's HNSW index for fast approximate nearest-neighbor search without an external vector store. In v2.0.0, the DuckDB VSS extension is statically linked into the bundled DuckDB, so HNSW vector search works out-of-the-box on clean machines with no extension download. HNSW indexes are preserved across data refresh, and cosine_distance pushes down via array_cosine_distance.
Hybrid Search: Combine kNN vector search and BM25 full-text search with reciprocal rank fusion (rrf()), backed by Tantivy, Elasticsearch, or DuckDB.
Full-Text Search Performance: Significantly faster Tantivy ingestion with rollback-on-error, and search metadata is correctly preserved on indexing and in Vortex physical schema calculation.
Embedding Validation: row_id columns are validated during dataset initialization.

Caching

Improvements across Caching:

Stale-While-Revalidate: runtime.caching.sql_results.stale_while_revalidate_ttl serves stale results while revalidating in the background.
Cache Encoding: Optional compression (e.g., zstd) for SQL results cache entries.
Retention Policies for cached query results, and improved CDC-driven cache invalidation (including view plan invalidation on updates).
Idle Cache Maintenance: Periodic maintenance drains invalidation predicates on idle caches, fixing unbounded memory growth in rarely-read caches.

Performance & Query Engine

Apache DataFusion is upgraded to v52.5 over the course of the release cycle, bringing:

Sort Pushdown to Scans: ~30x faster top-K queries on pre-sorted data; Parquet scans reverse row-group order for DESC on ASC-sorted files.
Rewritten Sort-Merge Join: Up to three orders of magnitude faster in pathological cases (e.g., TPC-H Q21: minutes → milliseconds).
Dynamic Filters: MIN/MAX aggregates and hash-join build sides prune files, row groups, and rows during execution.
Faster CASE Expressions, statistics caching, and prefix-aware list-files caching for faster planning.
TableProvider DELETE/UPDATE hooks and the RelationPlanner API for extensible SQL planning.
Strict Overflow Handling: try_cast_to errors on overflow instead of silently producing NULLs.

Additional engine work: default query memory limit raised from 70% to 90% with GreedyMemoryPool, partial aggregation optimization for FlightSQLExec, improved partitioned query planning, and metastore transaction support to prevent concurrent conflicts.

Rust CLI

The Spice CLI is completely rewritten from Go to Rust — a single spice binary built from the same codebase as spiced, with full feature parity across 27+ commands.

spice query: Interactive REPL for async queries with multi-line SQL, progress indication, and cancellation.
spice dataset configure: Non-interactive flag-based configuration (--from, --description, --param KEY=VALUE, --set) alongside interactive prompts.
spice completions: Shell completion script generation.
--output=json: Machine-readable output for scripting; spice login --output adds env, json, and keychain modes.
spice init writes a yaml-language-server schema directive for IDE completions.

Observability

OpenTelemetry: Exporter fixes, authenticated metrics export, configurable metric name prefix (runtime.telemetry.metric_prefix), delta temporality by default, and OTLP resource attributes via runtime.telemetry.properties.
Query Metrics: The query_executions metric gains a datasets dimension for per-dataset query attribution.
Ingestion Metrics: rows_written, bytes_written, and dataset_acceleration_size_bytes for acceleration refresh and Flight DoPut/ADBC ingestion, and EXPLAIN ANALYZE metrics in FlightSQLExec.
Task History: Distributed task history in cluster mode and tracing for external MCP server tool calls.

Notable Bug Fixes

localpod synchronization: localpod child datasets correctly track parent refreshes when the parent uses the in-memory Arrow accelerator.
Spice Cloud federation: Correlated subqueries are kept out of JOIN ON conditions, fixing rejected federated queries.
refresh_mode: snapshot: No longer reports Ready with empty data when no snapshot exists.
Search metadata: Field and schema metadata preserved on search indexing and in Vortex physical schema calculation.
HTTP connector: fetched_at column is correctly populated.
Connector correctness: DynamoDB Streams transient-error retries and typed-NULL DML handling; ScyllaDB physical filter pushdown disabled to fix incorrect results; MSSQL TOP N pushdown; DuckDB DELETE/UPDATE on full and caching refresh modes; Turso checked arithmetic for timestamp conversions; ODBC queries no longer silently return 0 rows on failure; Flight GetFlightInfo/DoGet schema parity.

Dependency Updates

Dependency / Component	Version
DataFusion	v52.5
Ballista	v52
Arrow (arrow-rs)	v57.2
DuckDB	v1.5.3 (with statically linked VSS)
iceberg-rust	v0.9.1
Turso (libsql)	v0.6.1
Vortex	v0.69.0
delta_kernel	v0.18.2
rmcp (MCP)	v1.5.0
mistral.rs	v0.8.x (candle v0.10.1)
ADBC Core	v0.23
Rust toolchain	v1.94.1

Contributors

Breaking Changes

Models included by default: The separate models build variant has been removed. Local LLM inference is always included in the default build and image.
Windows native builds removed: Use WSL for local development.
Spicepod version defaults to v2: spice init creates version: v2 spicepods. v1 remains supported with auto-migration; v1beta1 is no longer accepted.

Flattened runtime.scheduler configuration: The nested runtime.scheduler.partition_management block is flattened and renamed:

# Before
runtime:
  scheduler:
    partition_management:
      interval: 30s
      max_assignments_per_cycle: 16
      discovery_timeout: 10s

# After
runtime:
  scheduler:
    partition_assignment_interval: 30s
    max_assignments_per_interval: 16
    partition_discovery_timeout: 10s

S3 metadata columns renamed: location, last_modified, size → _location, _last_modified, _size.
Default query memory limit changed: Increased from 70% to 90%.
Metric renames: accelerated_refresh metrics renamed to acceleration_refresh; last_refresh_time gauge renamed to include the milliseconds unit.
DuckDB parameter rename: partitioned_write_flush_threshold → partitioned_write_flush_threshold_rows.
/v1/search API: Always returns an array in matches, even for single results.
/v1/evals API removed.
Perplexity model provider removed.
x.ai model endpoint: x.ai models exclusively use the /v1/responses endpoint.

Upgrade Guide from v1.x

Most v1 spicepods continue to work on v2.0 — v1 remains supported and deprecated fields auto-migrate at load time — so many deployments can upgrade by updating the binary or image alone. The steps below cover the breaking changes that may require manual action. Review each before upgrading a production deployment.

1. Build, image, and platform changes

Models are now included by default. The separate models build variant (and the corresponding -models image tags) has been removed; local LLM inference is always included in the default build and image. If your deployment pinned a models build or -models-tagged image, switch to the default build/image.
Native Windows builds are removed. Use WSL for local Windows development.

2. Adopt Spicepod `v2` (recommended)

spice init now creates version: v2 spicepods. v1 spicepods remain supported with automatic migration, but v1beta1 is no longer accepted. To move to v2, set version: v2 and update the following fields — each auto-migrates from v1, but updating now clears the deprecation:

v1 (deprecated)	v2 (preferred)
`runtime.results_cache`	`runtime.caching.sql_results` (`cache_max_size` → `max_size`)
`runtime.memory_limit`	`runtime.query.memory_limit`
`runtime.temp_directory`	`runtime.query.temp_directory`
`dataset.invalid_type_action`	`dataset.unsupported_type_action`

3. Update changed configuration

DuckDB parameter rename: partitioned_write_flush_threshold → partitioned_write_flush_threshold_rows.
Default query memory limit raised from 70% to 90%. If you relied on the previous default to leave headroom for other processes on the host, set it explicitly via runtime.query.memory_limit.

4. Update queries and API clients

S3 metadata columns renamed: location, last_modified, size → _location, _last_modified, _size. Update any queries that reference these columns.
/v1/search always returns an array in matches, even for a single result. Update clients that assumed a scalar value.
/v1/evals API removed. Remove integrations that depend on it.

5. Update model providers

Perplexity model provider removed. Re-point affected models to another provider.
x.ai models use the /v1/responses endpoint exclusively. Ensure x.ai integrations target the Responses API.

6. Update observability

Metric renames: accelerated_refresh → acceleration_refresh, and the last_refresh_time gauge is renamed to include the milliseconds unit. Update dashboards and alerts that reference these metric names.

After updating, restart the runtime and verify datasets and models report ready via /v1/datasets?status=true and /v1/models?status=true (the CLI shows a Ready/ERROR column).

Cookbook Updates

New Spice Cookbook recipes added during the v2.0 release cycle:

Async Queries: Submit long-running queries asynchronously and retrieve results later.
DuckLake Catalog: Lakehouse-style data management with ACID transactions and time travel.
Distributed Query: Run Spice in multi-active distributed cluster mode.
mTLS: Mutual TLS for HTTP and Flight endpoints.
Elasticsearch Connector: Query Elasticsearch indexes as SQL tables.
MCP Server: Use Spice as an MCP server over Streamable HTTP.
Snowflake DML: Write-back to Snowflake with INSERT/UPDATE/DELETE.
PostgreSQL, MySQL, and MSSQL Catalogs: Schema and table discovery for external databases.
Full-Text Search: BM25 full-text search over accelerated datasets.

The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0 image:

docker pull spiceai/spiceai:2.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
fix: features check by @phillipleblanc in #9014
fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
URL table support by @lukekim in #9018
ScyllaDB key filter by @lukekim in #8997
fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
Fix duplicate snapshot creation on startup by @sgrebnov in #9029
Add ScyllaDB and SMB to the README table by @krinart in #9034
Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
Release notes for release release/1.11.0-rc.2 by @krinart in #9016
ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
Add missing protoc step to setup-cc action by @krinart in #9041
fix: Update Search integration test snapshots by @app/github-actions in #9013
Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
Cayenne S3 Express One Zone improvements by @lukekim in #9015
Add zlib1g-dev to CI by @lukekim in #9052
Improve validation and logging for hash indexes by @lukekim in #9047
Upgrade Vortex with CASE-WHEN by @lukekim in #9051
x.ai models now exclusively use /v1/responses endpoint by @lukekim in #9400
Improvements for snapshot schema comparison by @krinart in #9401
v2.0 breaking changes by @lukekim in #9233
Create PartitionManagementTask for scheduler to update accelerated table partition assignments by @Jeadie in #9378
refactor(Cayenne): route all write orchestration through CayenneDataSink by @sgrebnov in #9402
Refactor benchmark to use QueryExecutor trait by @Jeadie in #9418
feat: Add spidapter build and release workflow by @peasee in #9427
Testoperator: add support for api-key when connecting to external spice instance by @sgrebnov in #9421
Initial implementation of Ducklake catalog & data connectors by @lukekim in #9083
Require aws_lc_rs since jsonwebtoken upgrade by @Jeadie in #9426
feat: Add spidapter tool by @peasee in #9425
Add release notes for 1.11.2 patch release by @sgrebnov in #9430
feat(spidapter): integrate system-adapter-protocol with SCP provisioning by @phillipleblanc in #9434
Add DuckLake TPCH E2E workflow and federated Spicepod configuration by @lukekim in #9431
fix(spidapter): use Flight handshake auth instead of x-api-key header by @phillipleblanc in #9435
[spidapter] Keep only what sparks joy by @Jeadie in #9439
Refactor binary operator balancing by @Jeadie in #9424
feat: Add Iceberg DDL support (CREATE TABLE / DROP TABLE) for default catalog override by @phillipleblanc in #9440
Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
Update spidapter for new system-adapter-protocol by @sgrebnov in #9442
docs: fix typos and syntax errors in style guide and error handling docs by @cluster2600 in #9445
Add acceleration refresh ingestion metrics (rows_written, bytes_written) by @phillipleblanc in #9461
Refactor(Cayenne): Replace CatalogError and string based errors with Snafu errors by @sgrebnov in #9403
Replace deprecated claude-3-5-haiku-latest with claude-haiku-4-5 by @Jeadie in #9492
Fix #9481: Preserve schema in results cache for empty query results by @phillipleblanc in #9485
Fix partition by serializing by @Jeadie in #9474
query: reconcile execution stream nullability with logical plan schema by @phillipleblanc in #9486
initial spice-cloud-client crate and spice cloud metrics --app <app-name>. by @Jeadie in #9480
feat: Return dataset error message in datasets API by @peasee in #9487
Spicebench by @lukekim in #9447
build(deps): consolidate dependabot dependency updates by @phillipleblanc in #9504
fix(cluster): route non-partitioned accelerated tables in distributed mode by @phillipleblanc in #9508
Enable core scalar UDFs in refresh SQL by @sgrebnov in #9502
Fix metrics in Spidapter again by @Jeadie in #9497
fix(cluster): tolerate Completed->status propagation race in distributed query handle by @phillipleblanc in #9510
feat: Support distributed ingestion in cayenne catalog by @peasee in #9506
Fix Cayenne duplicate primary keys after DELETE + UPSERT CDC sequences by @krinart in #9494
fix(cluster): rewrite table scans inside subqueries for distributed execution by @phillipleblanc in #9518
fix: Set catalog mode to readwritecreate in spidapter by @peasee in #9519
Upgrade AWS SDK crates & set APN user-agent in AWS SDK credential bridge by @lukekim in #8328
feat(runtime): add runtime ready_state on_registration semantics by @lukekim in #9522
fix: Add spidapter post-setup retries by @peasee in #9526
Make partition discovery more robust and make initialization non-blocking by @sgrebnov in #9499
Make lint-rust-fix support targeted packages and features by @Jeadie in #9511
Handle new Cloud SCP API by @Jeadie in #9532
Refactor and simplify streaming benchmarks by @krinart in #9405
fix: ensure spidapter only increments attempts on failures by @peasee in #9534
feat: Support specifying app resources in spidapter by @peasee in #9536
test(runtime): Spice Cayenne DDL integration test by @lukekim in #9535
fix: Handle schema evolution mismatch errors during data refresh by @lukekim in #9527
fix: resolve clippy lint warnings by @phillipleblanc in #9547
pr-builds --tag <TAG> for build_and_release.yml by @Jeadie in #9507
Add --output flag to spice login with env/json/keychain modes by @Jeadie in #9541
Don't use 'PartitionedTableScanRewrite' in async distributed query by @Jeadie in #9548
feat(spidapter): add local backend mode with single executor by @phillipleblanc in #9531
support chat template in HF by @Jeadie in #9543
fix(cayenne): stream PK retention deletes and run OOM regression in CI by @phillipleblanc in #9533
cayenne: Staged append writes to prevent partial writes and data loss on stream error by @sgrebnov in #9491
AcceleratedTable::scan use FederatedTable::scan when ClusterRole::Scheduler by @Jeadie in #9550
Upgrade to delta-kernel-rs v0.18.2 by @lukekim in #9528
Run cayenne tests as part of PR CI by @sgrebnov in #9554
Upgrade to DataFusion v52.2.0 by @lukekim in #9419
Remove Snapshot Compaction + Add snapshot existence check by @krinart in #9523
Update dependencies by @lukekim in #9566
fix: Update benchmark snapshots by @app/github-actions in #9565
fix: Compare Cayenne table configuration on startup by @peasee in #9529
Make Refresh::refresh_sql more robust to alterations over time. by @Jeadie in #9549
fix: Update datafusion-table-providers dependency to latest revision by @lukekim in #9574
Unset AWS_ENDPOINT_URL when empty by @krinart in #9575
fix: allow BytesProcessedExec repartitioning for unordered input by @lukekim in #9540
Sanitize DataFusion errors by @lukekim in #9530
Add conditional logging for partition assignments by @Jeadie in #9577
use 'properly early exit on SIGTERM' by @Jeadie in #9573
Update datafusion to 52.2.0 by @phillipleblanc in #9582
Ensure we query one and only one partition per request by @Jeadie in #9416
feat: Add support for Spicepod version v2 by @lukekim in #9583
[SpiceDQ] Improve error messages; Avoid race condition on allocate_initial_partitions. by @Jeadie in #9579
Update ballista dependencies to latest 52.0.0 revision by @lukekim in #9581
Fix Databricks spark_connect mode always disabled by @phillipleblanc in #9586
Support partitioning in Arrow accelerator by @Jeadie in #9571
Fix spice query CLI response deserialization by @phillipleblanc in #9588
fix: Update benchmark snapshots by @app/github-actions in #9584
fix: Share RuntimeEnv across Cayenne read/write/delete paths for targeted list_files_cache invalidation by @sgrebnov in #9589
feat: Add file:// state_location support for async queries scheduler by @phillipleblanc in #9590
Update endgame links by @krinart in #9598
ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
feat: Implement catalog connectors for various databases by @lukekim in #9509
Refactor and clean up code across multiple crates by @lukekim in #9620
fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
Properly install postgres in install-postgres action by @krinart in #9629
fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
Update spicepod.schema.json by @app/github-actions in #9640
Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
testoperator for partitioned arrow accelerator by @Jeadie in #9635
Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
docs: update distribution details and add NAS support in release notes by @lukekim in #9650
Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
perf: Cache Turso metastore connection across operations by @penberg in #9646
Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
Improve CDC cache invalidation by @krinart in #9651
Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
Upgrade to Rust v1.93.1 by @lukekim in #9669
Security fixes and improvements by @lukekim in #9666
feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
Skip caching http error response + add response_headers by @krinart in #9670
refactor: Remove v1/evals functionality by @Jeadie in #9420
Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
Enable on_zero_results: use_source for views by @krinart in #9699
fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
Fixes for anthropic by @Jeadie in #9707
Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
[SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
Upgrade to Turso v0.5 by @lukekim in #9628
feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
fix: Ensure Cayenne respects target file size by @peasee in #9730
refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
[SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
Add executor resource and replica support to cloud app config by @ewgenius in #9734
feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
JSON file format improvements by @lukekim in #9743
[SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
Make S3 metadata column handling more robust by @sgrebnov in #9762
Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
Improve JSON SODA support by @lukekim in #9795
Add ADBC Data Connector by @lukekim in #9723
docs: Release Cayenne as RC by @peasee in #9766
cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
Include updated JSON formats in HTTPS connector by @lukekim in #9800
Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
Pass through authentication to ADBC connector by @lukekim in #9801
Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
Fix http connector by @krinart in #9818
Fix ADBC Connector build and test by @lukekim in #9813
Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
Filter pushdown improvements for SqlTable by @lukekim in #9852
feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
feat: Add custom auth header support for GraphQL connector by @krinart in #9899
Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
When executor connects, send DDL for existing tables by @Jeadie in #9904
fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
Remove Perplexity support by @Jeadie in #9910
Fix refresh_sql support for debezium constraints by @krinart in #9912
Implement DML for DynamoDBTableProvider by @lukekim in #9915
chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
Add full query federation support for ADBC data connector by @lukekim in #9953
Make time_format deserialization case-insensitive by @claudespice in #9955
Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
Add caching retention by @krinart in #9984
fix: improve Databricks schema error detection and messages by @lukekim in #9987
fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
fix: Address data correctness bugs found in audit by @sgrebnov in #10015
fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
Improve GitHub connector ref handling and resilience by @lukekim in #10023
feat: Add spice completions command for shell completion generation by @lukekim in #10024
fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
Implement RefreshDataset for distributed control stream by @Jeadie in #10055
perf: Improve S3 parquet read performance by @sgrebnov in #10064
fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
fix partial aggregation deduplication on string checking by @lukekim in #10078
fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
Add mode: file_update acceleration mode by @krinart in #10108
Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
fix: executor startup failures by @Jeadie in #10155
feat: Distributed runtime.task_history support by @Jeadie in #10156
fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
Fix linux/macos spice upgrade by @phillipleblanc in #10194
Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
feat: Add pagination support to HTTP data connector by @lukekim in #10228
feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187
fix: Full Text Search schema mismatch with ADBC connector by @lukekim in #10235
docs: Update v2.0.0-rc.2 release notes with latest changes by @lukekim in #10238
Fix append refresh dedup failure when refresh_sql selects column subset by @sgrebnov in #10225
Revert "Properly mark dataset as Ready on Scheduler (#10215)" by @sgrebnov in #10242
Fix failing merge conflicts for benchmarks by @krinart in #10247
fix(github): fetch commits for dynamic and slash refs by @lukekim in #10233
Upgrade DataFusion to v52.5.0-rc1 by @lukekim in #10249
Merge develop to trunk (2026-04-09) by @claudespice in #10248
fix: Validate embedding row_id columns during dataset init (fixes #8226) by @claudespice in #10208
fix: Update tpch benchmark snapshots for federated/glue[csv].yaml by @app/github-actions in #10244
feat(databricks): add resilience controls, UC awareness, and task history instrumentation by @lukekim in #10246
fix: Make PartitionManager resilient to bare vs fully qualified table references by @sgrebnov in #10257
fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml by @app/github-actions in #10256
Merge develop to trunk (2026-04-10) by @claudespice in #10251
Improve Snowflake/ADBC dataset registration performance and observability by @lukekim in #10266
Fixes for kafka connector by @krinart in #10263
fix(runtime): gate otel code tags, suppress aws sdk noise, and unblock connector init by @lukekim in #10260
fix(runtime): avoid regionless AWS SDK loads by @lukekim in #10271
Add versioned release install workflow coverage by @lukekim in #10276
fix(runtime): handle HTTP JSON unions and spicepod reloads by @lukekim in #10277
Databricks UC permission prechecks: explicit denial as permanent error, ambiguous cases advisory by @lukekim in #10274
Revert component status changes re-introduced by develop merge (#10248) by @sgrebnov in #10293
Fix broken CI workflows by @ewgenius in #10294
Group dependabot updates by ecosystem by @lukekim in #10296
fix(tests): Replace flaky S3 Vectors snapshot tests with structural validation by @lukekim in #10301
Update test_github_workflows snapshot by @lukekim in #10304
fix(ci): fix Bedrock runner mismatch and snapshot auto-merge failure by @ewgenius in #10306
feat(http): Add map-to-array conversion and query-parameter pagination by @lukekim in #10295
New crate: datafusion-ddl by @Jeadie in #10205
Make Databricks UC permission checks advisory with structured error reporting by @lukekim in #10283
build(deps): bump the github-actions-dependencies group with 4 updates by @app/dependabot in #10298
fix: Clear cached plans on view updates by @peasee in #10312
build(deps): bump the aws-sdk group with 7 updates by @app/dependabot in #10299
Code out of runtime. by @Jeadie in #10178
fix: Respect function registry denies for accelerated table filter pushdown by @peasee in #10311
fix: Don't block heartbeat when all slots acquired by @peasee in #10322
fix: strip only outer parens in get_table_partition_expr_from_ctx by @Jeadie in #10323
Upgrade datafusion-table-providers with MongoDB SRV support by @lukekim in #10317
fix: Avoid pushing down bucketing partition expressions into executors by @peasee in #10324
Upgrade datafusion-table-providers to d1b911a5 and bump adbc to 0.23 by @lukekim in #10329
fix: Update Search integration test snapshots by @app/github-actions in #10308
Handle foreign table + Classic sql warehouse combination gracefully by @krinart in #10318
New crate datafusion-flightsql by @Jeadie in #10201
Set tantivy=warn unless very verbose logging by @Jeadie in #10338
Remove image registry and image name options from spidapter by @ewgenius in #10241
build(deps): bump sysinfo from 0.37.2 to 0.38.4 by @app/dependabot in #10291
build(deps): bump futures from 0.3.31 to 0.3.32 by @app/dependabot in #10289
New crate 'datafusion-dml' by @Jeadie in #10334
Jeadie/26 04 16/spice sql by @Jeadie in #10343
Add Teraswitch/Pittsburgh apt mirrors + retry config for CI runners by @lukekim in #10349
Implement sort pushdown and fix pushdown gaps across providers by @lukekim in #10337
Merge develop to trunk (2026-04-16) by @claudespice in #10345
Update candle and mistral.rs lock-step pins by @lukekim in #10278
docs: fix status badges in README by @lukekim in #10350
Migrate secrets to vars by @krinart in #10354
Add limit pushdown and improve sort pushdown for Oracle and MSSQL by @sgrebnov in #10351
Fix ubuntu mirror configuration by @ewgenius in #10359
fix: Increase throughput test default ready_wait from 30s to 300s (fixes #8207) by @claudespice in #10344
Add auth headers support to OTEL metrics exporter by @lukekim in #10347
fix(github): shrink GraphQL page size on gateway errors; lower comment defaults by @lukekim in #10355
Relax apt mirror substitution failure to warning in CI action by @ewgenius in #10361
feat(http): Add OAuth2 refresh-token auth to HTTP connector by @lukekim in #10348
Upgrade Rust toolchain to 1.94.1 by @lukekim in #10353
Handle order by and sort in PartitionedTableScanRewrite by @Jeadie in #9656
Fix OTEL Exporter by @krinart in #10363
Pin spiceai candle / TEI forks to merged revs; drop local [patch] overrides by @lukekim in #10362
Integrate spiceio and makefile_targets into pr.yml by @lukekim in #10357
ci: skip artifact compression for test binaries/archives by @lukekim in #10381
chore(deps): bump spiceai/candle, spiceai/mistral.rs, aws-lc-rs, tantivy, rand by @lukekim in #10379
Bump datafusion-table-providers (#10375) by @lukekim in #10384
fix: Update Search integration test snapshots by @app/github-actions in #10376
v2.0.0-rc.3 preparation by @ewgenius in #10382
fix(spicepod): JSON schema accepts string or {name: expr} for partition_by by @lukekim in #10352
fix: Use ROUND for Turso decimal BETWEEN comparisons (fixes #9872) by @claudespice in #10360
Revert "v2.0.0-rc.3 preparation" from trunk by @ewgenius in #10386
Add on_schema_resolved dataset ready state by @lukekim in #10368
feat: Add Elasticsearch data connector with hybrid search support by @lukekim in #10258
ci: bump test archive upload compression-level to 1 by @lukekim in #10388
feat(git-connector): promote Git connector to RC status by @lukekim in #10385
feat(postgres): stream WAL directly to Spice accelerators by @lukekim in #10364
Add schema decomposition to the HTTP connector by @lukekim in #10393
fix(cayenne): Skip catalog refresh state reload for existing providers by @sgrebnov in #10396
Make cayenne-flightsql tool by @Jeadie in #10356
build(deps): bump the github-actions-dependencies group with 2 updates by @app/dependabot in #10398
Update openapi.json by @app/github-actions in #10272
Merge develop to trunk — 2026-04-19 by @claudespice in #10407
feat(otel): default OTLP push exporter to delta temporality by @phillipleblanc in #10412
fix: Restore analyzer rule ordering to run federation before type coercion by @sgrebnov in #10415
fix: Map Utf8/LargeUtf8 to STRING in Databricks/Spark SQL dialects by @sgrebnov in #10420
feat(otel): add metric name prefix at runtime.telemetry.metric_prefix by @phillipleblanc in #10418
fix: Map LargeUtf8 to VARCHAR in Athena ODBC dialect by @sgrebnov in #10419
feat(cluster): connector-driven object store registration on executors by @phillipleblanc in #10414
build(deps): bump ubuntu from 22.04 to 24.04 in the docker-dependencies group by @app/dependabot in #10397
fix: Update benchmark snapshots Apr 20 by @app/github-actions in #10417
feat(otel): apply runtime.telemetry.properties as resource attributes on exported metrics by @phillipleblanc in #10416
Publish RC releases to DockerHub; upgrade runners to ubuntu-24.04 by @lukekim in #10428
feat: Add Azure Cosmos DB (NoSQL) data connector (RC) by @lukekim in #10392
feat(datafusion): flatten_json_properties + json_tree UDTFs by @lukekim in #10406
Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL by @lukekim in #10365
feat(embeddings): multi-vector embeddings with MaxSim + late-interaction by @lukekim in #10408
Update GH runners for CUDA builds by @ewgenius in #10432
fix(delta_lake): register object stores on cluster executors by @phillipleblanc in #10436
DF-native DML by @krinart in #10327
ci: run Build and Test on spiceai-macos; split install jobs by profile by @lukekim in #10434
Improve search UDTFs: text_search, vector_search, rrf by @lukekim in #10387
fix(model2vec): Improve robustness of model loading for sentence-transformers layouts by @sgrebnov in #10444
Merge develop to trunk — 2026-04-21 by @claudespice in #10448
Enable filter pushdown for vector_search UDTF by @sgrebnov in #10447
Support Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, TIMESTAMP_LTZ types by @lukekim in #10451
Fix Databricks tests by @krinart in #10449
fix(cluster): forward register_object_stores through connector wrappers by @phillipleblanc in #10460
Fixes for vector-search by @krinart in #10455
Add expand_maps option and flatten_json UDTF by @lukekim in #10452
fix: Update Search integration test snapshots by @app/github-actions in #10458
Fix physical codec decode ambiguity for empty protobuf messages by @sgrebnov in #10466
chore(logging): demote s3_single_file_cached skip refresh log to debug by @phillipleblanc in #10467
Enable filter pushdown for rrf UDTF by @sgrebnov in #10465
feat(cluster): consolidate distributed state into cluster.json by @phillipleblanc in #10463
feat(cayenne): Add column statistics and data inlining by @lukekim in #10314
docs(copilot): flag missing wrapper delegation when adding default trait methods by @phillipleblanc in #10461
Wire Elasticsearch vector engine write path through acceleration by @lukekim in #10453
Add helm lint CI by @ewgenius in #10468
Fix Azure and GCS acceleration snapshot object store credential handling by @phillipleblanc in #10486
Update spicepod.schema.json by @app/github-actions in #10485
fix(secrets): harden AWS Secrets Manager secret store by @lukekim in #10478
Update datafusion-ballista crate by @sgrebnov in #10488
feat(secrets): add ParameterSpec and more params for AWS secrets manager by @phillipleblanc in #10487
Add rerank UDTF for hybrid search with query auto-propagation by @lukekim in #10469
Fix flatten_json_properties by @krinart in #10475
fix: preserve field and schema metadata in expand_views_schema by @claudespice in #10494
Upgrade rmcp to upstream 1.5.0; switch MCP server to Streamable HTTP by @lukekim in #10491
fix: handle Snowflake TIMESTAMP_LTZ wire format and prevent nanosecond overflow by @claudespice in #10493
Lint parity in Makefile by @krinart in #10492
Add connect_timeout/client_timeout params to Databricks sql_warehouse mode by @lukekim in #10495
fix(tracing): suppress opentelemetry INFO logs at all verbosity levels by @lukekim in #10497
DynamoDB DML by @krinart in #10470
feat(cayenne): native vector search via SIMD similarity UDFs by @lukekim in #10456
fix(cli): suppress banner for all JSON-producing cloud subcommands (fixes #10498) by @claudespice in #10510
fix(deps): bump openssl to 0.10.78 by @phillipleblanc in #10509
fix(s3): quiet AWS SDK credential probe when no region is configured by @phillipleblanc in #10506
fix(cdc): emit ready signal on caught-up Kafka/Debezium streams (#5201) by @phillipleblanc in #10504
runtime-cluster crate + Run partition discovery before forwarding refresh to executors by @krinart in #10490
Update lint-rust target to use --keep-going by @Jeadie in #10508
Add TPC-H SF100 s3[parquet]-duckdb[file] benchmark spicepod by @lukekim in #10524
Remove dev-profile install steps from pr.yml by @Jeadie in #10507
fix: add missing NULL check on Timestamp path in append refresh by @claudespice in #10518
fix: return error on Decimal128/256 overflow instead of silently dropping scale by @claudespice in #10519
fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable by @claudespice in #10520
feat(devx): make config errors, CLI, and REPL lead users to success by @lukekim in #10489
fix(rerank): defer execution to RerankExec, enable filters and projection pushdown by @sgrebnov in #10514
fix(llms): support Gemma models with missing attention_bias config field by @lukekim in #10523
Fix vector_search silently ignoring named limit/column/include_score args by @sgrebnov in #10527
fix: split unsupported filters locally in scan() for UseSource mode by @ewgenius in #10528
feat(secrets): add Azure Key Vault secret store by @lukekim in #10496
Bump mistralrs by @krinart in #10532
Fix benchmark configurations and CI build issues by @sgrebnov in #10535
Fix catalog query overrides for MySQL and MSSQL benchmarks by @sgrebnov in #10543
For Cayenne, preserve matched columns for MERGE ... ON <cols> by @Jeadie in #10340
build(deps): bump the aws-sdk group across 1 directory with 5 updates by @app/dependabot in #10538
docs: update AI agent instructions (git workflow + Rust 1.94) by @lukekim in #10544
fix: Update tpch benchmark snapshots by @app/github-actions in #10529
fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-duckdb[file].yaml by @app/github-actions in #10525
Extract runtime-datafusion from runtime by @krinart in #10545
Use generic DML extension planner for Cayenne by @Jeadie in #10437
fix: Update Search integration test snapshots by @app/github-actions in #10552
Fix security and correctness audit issues by @lukekim in #10526
fix(MySQL): revert MySQL result column reorder to fix federated query failures by @sgrebnov in #10557
Fix protoc installation by @krinart in #10566
fix: Disable Ballista dynamic filters on HashJoinExec by @peasee in #10548
Support views on DDL catalogs by @Jeadie in #10554
Update datafusion by @Jeadie in #10422
Improve full-text search indexing performance by @sgrebnov in #10464
feat(mysql): add mysql_zero_date_behavior parameter (null|error) by @phillipleblanc in #10573
fix(snowflake): declare private_key in connector PARAMETERS (fixes #10517) by @claudespice in #10559
Honour CARGO_TARGET_DIR in Makefiles by @Jeadie in #10569
Enable cosine_distance pushdown to DuckDB accelerator via array_cosine_distance by @sgrebnov in #10564
fix: Update test snapshots by @app/github-actions in #10570
fix: Update tpch benchmark snapshots by @app/github-actions in #10560
feat(snapshots): make snapshots an optional feature by @phillipleblanc in #10574
Enforce read-only API key restrictions on Flight DoGet and async query paths by @Jeadie in #10551
Improved security posture on Github workflows by @Jeadie in #10556
fix: Update datafusion-table-providers to improve SqlTable filter pushdown by @sgrebnov in #10595
feat(secrets): add HashiCorp Vault secret store by @phillipleblanc in #10561
fix: delegate update() in UpsertDedupTableProvider to inner provider by @claudespice in #10593
Add DuckDB vector engine support by @lukekim in #10562
Sharepoint - add object-store listing connector with expanded auth and write support by @lukekim in #10473
fix: Install protoc from source by @peasee in #10597
Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
Add Snowflake DML support by @lukekim in #10747
Add MongoDB Change Streams support by @lukekim in #10813
Add user-defined functions by @lukekim in #10571
Add table user functions and gate HTTP servers by @lukekim in #10675
feat: add on-demand dataset loading by @phillipleblanc in #10629
feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
Add unified query cancellation across all paths by @lukekim in #10390
Add dynamic HTTP request headers by @lukekim in #10604
feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
Add nolimit HTTP pagination max pages by @lukekim in #10673
Add shared HTTP rate control for connectors by @lukekim in #10648
Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
Add provider-aware LLM prompt caching by @lukekim in #10645
Add searchable registry mode for LLM tools by @lukekim in #10647
feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
Add self-hosted Spice connector support by @phillipleblanc in #10546
Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
Add configurable allowed_hosts for MCP by @lukekim in #10638
fix: make Helm chart probes configurable by @peasee in #10696
Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
Fix DuckDB DELETE/UPDATE on full and caching refresh mode datasets by @phillipleblanc in #10632
Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
fix: add NULL check to CDC primary key extraction by @lukekim in #10684
Properly handle nullability during CDC processing by @krinart in #10803
Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
Support Responses API across model providers by @lukekim in #10724
Update xAI default model and handle Grok model retirements by @Jeadie in #10723
Improve cli table layout by @krinart in #10725
TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
Fix DuckLake catalog include filter being ignored by @phillipleblanc in #10738
Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
feat: Allow full version tags in spicepod version by @peasee in #10748
Add Arrow primary key upserts by @lukekim in #10749
fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
Allow arbitrary filenames when specifying spicepod path + kind validation by @krinart in #10777
fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
fix: label all columns in spice cloud metrics table output by @claudespice in #10784
fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
Improve CDC ingestion performance by @lukekim in #10789
Fix tool_search/tool_invoke spans by @lukekim in #10791
Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
Remove permanent errors from DynamoDB Streams by @krinart in #10794
Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
Executors should create catalog tables on join by @Jeadie in #10807
Add compressed file support for listing connectors by @lukekim in #10809
Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
Add range fallback for large join filters by @lukekim in #10816
Improve Cayenne join filter pushdown by @lukekim in #10818
Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
fix: Deny nondistributed cayenne catalog by @peasee in #10821
Enable parallel Cayenne Vortex writes by @lukekim in #10822
Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
Add response.output_text.delta to responses API by @krinart in #10828
feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
Upgrade Turso to v0.6.0 by @sgrebnov in #10843
feat(cli): add spice feedback command to open community Slack by @lukekim in #10856
Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
fix: Require dim-side statistics for CayennePropagateFilterAcrossEquiJoinKeys by @sgrebnov in #10863
fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
Improve DynamoDB Bootstrap performance by @krinart in #10616
fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
Persist HTTP rate-control state in object storage by @lukekim in #10697
Rate limit metrics HTTP endpoint by @lukekim in #10162
feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
Optimize Cayenne catalog maintenance paths by @lukekim in #10904
Centralize DuckDB settings for accelerator by @ewgenius in #10895
deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
Authorization header + Bump async-openai + responses_adapter fix by @krinart in #10911
Tune accelerators by storage profile by @lukekim in #10913
feat: add dataset-level on_schema_change config by @lukekim in #10908
Handle NULL sentinel for nullable partition expressions by @Jeadie in #10880
fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1→128 concurrency, sqlite + turso lanes) by @lukekim in #10943
feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
Explicit error if v1/search requests a table without search index by @Jeadie in #10968
Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
Extend append tests with arrow engine configurations by @sgrebnov in #10959
Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
Remove eval related things for v2.0.0 by @Jeadie in #10945
build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
fix: Add publish = false to chbench-driver by @sgrebnov in #10939
[Bug] Timing between reconnect and AllocateInitialPartitions leaves connection without flight_sql_client by @Jeadie in #10805
Fix: refresh_mode: snapshot reports Ready with empty data when no snapshot exists by @sgrebnov in #10979
fix(cluster): gate scheduler readiness on executor partition loads by @phillipleblanc in #10992
fix: handle EXISTS/NOT EXISTS subqueries in federation analyzer by @sgrebnov in #10996
Refactor spice dataset configuration command by @Jeadie in #10999
fix: preserve field and schema metadata in Vortex physical schema calculation by @claudespice in #11013
fix: validate Snowflake account identifiers and auth config by @Jeadie in #11024
Fix Unity Catalog connector deserialization failure with OSS Unity Catalog by @ewgenius in #11026
feat(cayenne): allow inline writes with pending deletions (deletes/upserts) by @sgrebnov in #11031
Expose metadata descriptions via PostgreSQL UDFs by @lukekim in #11032
Remove default runtime features - enable explicitly in spiced by @phillipleblanc in #11037
feat(cayenne): fast-path CDC deletes by extracting PK values from filters by @sgrebnov in #11049
Cayenne optimizer rules: auto relevance test for q21-shape (all-Cayenne CH-Bench) and runtime rule selection by @lukekim in #11050
refactor(cdc): reduce CDC sub-batch splits for interleaved upsert/delete workloads by @sgrebnov in #11051
fix(snowflake): enforce function deny-list in federation pushdown by @claudespice in #11057
fix(mcp): trace external server tool calls in task history by @ewgenius in #11058
perf(cdc): Last-write-wins dedup in group_into_sub_batches to reduce sub-batch splits by @sgrebnov in #11059
PM edits to v2.0.0-rc5 by @lukekim in #11067
fix(snowflake): wire deny-list in extracted connector crate (#10703) by @claudespice in #11071
perf(cayenne): keep CDC upsert PK keysets resident to avoid per-batch full-table rebuilds by @lukekim in #11074
Fix metadata on search indexing by @Jeadie in #11080
feat(cayenne): merge-on-read position deletes for PK upsert tables + memory-pool accounting by @lukekim in #11085
perf(cayenne): scale CDC inline flush caps with memory + storage class by @lukekim in #11087
feat(cluster): report per-executor table statistics so distributed JoinSelection can size joins by @phillipleblanc in #11089
Improve Cayenne CDC write and compaction path tracing by @sgrebnov in #11091
Support tuple-IN composite PK extraction in Cayenne delete fast-path by @sgrebnov in #11093
feat(cluster): NDV-aware executor stats so CDC q18 join swap fires by @phillipleblanc in #11098
feat(cayenne): maintain join-sizing stats on the write path by @phillipleblanc in #11104
fix(cache): run periodic moka maintenance for idle caches by @phillipleblanc in #11106
Upgrade to DuckDB 1.5.3 + statically link the VSS (HNSW) extension by @sgrebnov in #11107
Fix fetched_at for HTTP connector by @Jeadie in #11116
fix(cayenne): tombstone inline-checkpointed rows on upsert to prevent duplicate PKs by @sgrebnov in #11129
feat: dedicated compaction runtime for Cayenne + CDC pipelining, protected snapshots, and test coverage by @lukekim in #11130
Add datasets dimension to the query_executions metric by @phillipleblanc in #11138
Fix #11137: localpod child not tracking parent refreshes with in-memory (arrow) parent accelerator by @phillipleblanc in #11139
Fix Windows build: vendor the VSS extension (drop nested submodule) by @phillipleblanc in #11140
fix(spiceai): keep correlated subqueries out of JOIN ON for Spice Cloud federation by @phillipleblanc in #11143
Refactor spice dataset configuration command by @Jeadie in #10999
feat(cayenne): sharded parallel Vortex encode with key/time clustering by @lukekim in #11144
fix(cluster): prevent DoPut write pipeline self-deadlock under ingest backpressure by @phillipleblanc in #11160
fix(cayenne): only warn on genuine protected-snapshot amplification by @lukekim in #11158

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.6...v2.0.0

Spice v2.0-rc.2 (Apr 10, 2026)

April 10, 2026 · 28 min read

Evgenii Khramkov

Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.2! 🔥

v2.0.0-rc.2 is the second release candidate for advanced testing of v2.0, building on v2.0.0-rc.1.

Highlights in this release candidate include:

Distributed Spice Cayenne Query and Write Improvements with data-local query routing and partition-aware write-through
DataFusion v52.4.0 Upgrade with aligned arrow-rs, datafusion-federation, and datafusion-table-providers
MERGE INTO for Spice Cayenne catalog tables with distributed support across executors
PARTITION BY Support for Cayenne enabling SQL-defined partitioning in CREATE TABLE statements
ADBC Data Connector & Catalog with full query federation, BigQuery support, and schema/table discovery
Databricks Lakehouse Federation Improvements with improved reliability, resilience, DESCRIBE TABLE fallback, and source-native type parsing
Delta Lake Column Mapping supporting Name and Id mapping modes
HTTP Pagination support for paginated API endpoints in the HTTP data connector
New Catalog Connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
JSON Ingestion Improvements with single-object support, soda (Socrata Open Data) format support, json_pointer extraction, and auto-detection
Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations
Dependency upgrades including Turso v0.5.3, iceberg-rust v0.9, and Vortex improvements

What's New in v2.0.0-rc.2

Distributed Cayenne Query and Write Improvements

Distributed query for Cayenne-backed tables now has better partition awareness for both reads and writes.

Key improvements:

Data-Local Query Routing: Cayenne catalog queries can now be routed to executors that hold the relevant partitions, improving distributed query efficiency.
Partition-Aware Write Through: Scheduler-side Flight DoPut ingestion now splits partitioned Cayenne writes and forwards them to the responsible executors instead of routing through a single raw-forward path.
Dynamic Partition Assignment: Newly observed partitions can be added and assigned atomically as data arrives, with persisted partition metadata for future routing.
Better Cluster Coordination: Partition management is now separated for accelerated and federated tables, improving routing behavior for distributed Cayenne catalog workloads.
Distributed UPDATE/DELETE DML: UPDATE and DELETE statements for Cayenne catalog tables are now forwarded to all executors in distributed mode, with all executors required to succeed.
Distributed runtime.task_history: Task history is now replicated across the distributed cluster for observability.
RefreshDataset Control Stream: Dataset refresh operations are now distributed via the control stream to executors.
Executor DDL Sync: When an executor connects, it receives DDL for all existing tables, ensuring late-joining executors have full table state.

MERGE INTO for Spice Cayenne

Spice now supports MERGE INTO statements for Cayenne catalog tables, enabling upsert-style data operations with full distributed support.

Key improvements:

MERGE INTO Support: Execute MERGE INTO statements against Cayenne catalog tables for combined insert/update/delete operations.
Distributed MERGE: MERGE operations are automatically distributed across executors in cluster mode.
Data Safety: Duplicate source keys are detected and prevented to avoid data loss during MERGE operations.
Chunked Delete Filters: Large MERGE delete filter lists are chunked to prevent stack overflow with Vortex IN-list expressions.

`PARTITION BY` Support for Cayenne

SQL Partition Management: Spice now supports PARTITION BY for Cayenne-backed CREATE TABLE statements, enabling partition definitions to be expressed directly in SQL and persisted in the Cayenne catalog.

Key improvements:

SQL Partition Definition: Define Cayenne table partitioning directly in SQL using CREATE TABLE ... PARTITION BY (...).
Partition Validation: Partition expressions are parsed and validated during DDL analysis before table creation.
Persisted Partition Metadata: Partition metadata is stored in the Cayenne catalog and can be reloaded by the runtime after restart.
Distributed DDL Support: Partition metadata is forwarded when CREATE TABLE is distributed to executors in cluster mode.
Improved Type Support: Partition utilities now support newer string scalar variants such as Utf8View.

Example:

CREATE TABLE events (id INT, region TEXT, ts TIMESTAMP) PARTITION BY (region)

Catalog Connector Enhancements

Spice now includes additional catalog connectors for major database systems, improving schema discovery and federation workflows across external data systems.

Key improvements:

New Catalog Connectors: Added catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake.
Schema and Table Discovery: Connectors use native metadata catalogs such as information_schema / INFORMATION_SCHEMA to discover schemas and tables.
Improved Federation Workflows: These connectors make it easier to expose external database metadata through Spice for cross-system federation scenarios.
PostgreSQL Partitioned Tables: Fixed schema discovery for PostgreSQL partitioned tables.

Example PostgreSQL catalog configuration:

catalogs:
  - from: pg
    name: pg
    include:
      - 'public.*'
    params:
      pg_host: localhost
      pg_port: 5432
      pg_user: postgres
      pg_pass: ${secrets:POSTGRES_PASSWORD}
      pg_db: my_database
      pg_sslmode: disable

JSON Ingestion Improvements

JSON ingestion is now more flexible and robust.

Key improvements:

More JSON Formats: Added support for single-object JSON documents, auto-detected JSON formats, and Socrata SODA responses.
json_pointer Extraction: Extract nested payloads before schema inference and reading using RFC 6901 JSON Pointer syntax.
Better Auto-Detection: JSON format detection now handles arrays, objects, JSONL, and BOM-prefixed input more reliably, including single multi-line objects.
SODA Support: Added schema extraction and data conversion for Socrata Open Data API responses.
Broader Compatibility: Improved handling for BOM-prefixed files, CRLF-delimited JSONL, nested payloads, mixed structures, and wrapped documents.

Example using json_pointer to extract nested data from an API response:

datasets:
  - from: https://api.example.com/v1/data
    name: users
    params:
      json_pointer: /data/users

DataFusion v52.4.0 Upgrade

Apache DataFusion has been upgraded from v52.2.0 to v52.4.0, with aligned updates across arrow-rs, datafusion-federation, and datafusion-table-providers.

Key improvements:

DataFusion v52.4.0: Brings the latest fixes and compatibility improvements across query planning and execution.
Strict Overflow Handling: try_cast_to now uses strict cast to return errors on overflow instead of silently producing NULL values.
Federation Fix: Fixed SQL unparsing for Inexact filter pushdown with aliases.
Partial Aggregation Optimization: Improved partial aggregation performance for FlightSQLExec.

Dependency Upgrades

Dependency	Version / Update
Turso (libsql)	v0.5.3 (from v0.4.4)
iceberg-rust	v0.9
Vortex	Map type support, stack-safe IN-lists
arrow-rs	Arrow v57.2.0
datafusion-federation	Updated for DataFusion v52.4.0 alignment
datafusion-table-providers	Updated for DataFusion v52.4.0 alignment
datafusion-ballista	Bumped to fix BatchCoalescer schema mismatch panic

Other Improvements

Cayenne released as RC: Cayenne data accelerator is now promoted to release candidate status.
File Update Acceleration Mode: Added mode: file_update acceleration mode for file-based data refresh.
spice completions Command: New CLI command for generating shell completion scripts, with auto-detection of shell directory.
--endpoint Flag: Added --endpoint flag to spice run with scheme-based routing for custom endpoints.
mTLS Client Auth: Added mTLS client authentication support to the spice sql REPL.
DynamoDB DML: Implemented DML (INSERT, UPDATE, DELETE) support for the DynamoDB table provider.
Caching Retention: Added retention policies for cached query results.
GraphQL Custom Auth Headers: Added custom authorization header support for the GraphQL connector.
ClickHouse Date32 Support: Added Date32 type support for the ClickHouse connector.
AWS IAM Role Source: Added iam_role_source parameter for fine-grained AWS credential configuration.
S3 Metadata Columns: Metadata columns renamed to _location, _last_modified, _size for consistency, with more robust handling in projected queries.
S3 URL Style: Added s3_url_style parameter for S3 connector URL addressing (path-style vs virtual-hosted). Useful for S3-compatible stores like MinIO:
```
params:
  s3_endpoint: https://minio.local:9000
  s3_url_style: path
```
S3 Parquet Performance: Improved S3 parquet read performance.
HTTP Caching: Transient HTTP error responses such as 429 and 5xx are no longer cached, preventing stale error payloads from being served from cache.
HTTP Connector Metadata: Added response_headers as structured map data for HTTP datasets.

Views on_zero_results: Accelerated views now support on_zero_results: use_source to fall back to the source when no results are found:

views:
  - name: sales_summary
    sql: |
      SELECT region, SUM(amount) as total
      FROM sales
      GROUP BY region
    acceleration:
      enabled: true
      on_zero_results: use_source

Flight DoPut Ingestion Metrics: Added rows_written and bytes_written metrics for Flight DoPut / ADBC ETL ingestion.
EXPLAIN ANALYZE Metrics: Added metrics for EXPLAIN ANALYZE in FlightSQLExec.
Scheduler Executor Metrics: Added scheduler_active_executors_count metric for monitoring active executors.
Query Memory Limit: Updated default query memory limit from 70% to 90%, with GreedyMemoryPool for improved memory management.
MetastoreTransaction Support: Added transaction support to prevent concurrent metastore transaction conflicts.
Iceberg REST Catalog: Coerce unsupported Arrow types to Iceberg v2 equivalents in the REST catalog API.
CDC Cache Invalidation: Improved cache invalidation for CDC-backed datasets.
Spice.ai Connector Alignment: Parameter names aligned across catalog and data connectors for Spice.ai Cloud.
Cayenne File Size: Cayenne now correctly respects the configured target file size (defaults to 128MB).
Cayenne Primary Keys: Properly set primary_keys/on_conflict for Cayenne tables.
Turso Metastore Performance: Cached metastore connections and prepared statements for improved Turso and SQLite metastore performance.
Turso SQL Robustness: More robust SQL unparsing and date comparison handling for Turso.
Dictionary Type Normalization: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration.
GitHub Connector Resilience: Improved GraphQL client resilience, performance, and ref filter handling.
ODBC Fix: Fixed ODBC queries silently returning 0 rows on query failure.
Anthropic Fixes: Fixed compatibility issues with Anthropic model provider.
v1/responses API Fix: The /v1/responses API now correctly preserves client instructions when system_prompt is set.
Shared Acceleration Snapshots: Show an error when snapshots are enabled on a shared acceleration file.
Distributed Mode Error Handling: Improved error handling for distributed mode and state_location configuration.
Helm Chart: Added support for ServiceAccount annotations and AWS IRSA example.
Perplexity Removed: Removed Perplexity model provider support.
Rust v1.93.1: Upgraded Rust toolchain to v1.93.1.

Contributors

Breaking Changes

S3 metadata columns renamed: S3 metadata columns renamed from location, last_modified, size to _location, _last_modified, _size.
v1/evals API removed: The /v1/evals endpoint has been removed.
Perplexity removed: Perplexity model provider support has been removed.
Default query memory limit changed: Default query memory limit increased from 70% to 90%.

Upgrading

To upgrade to v2.0.0-rc.2, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.2

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.2 image:

docker pull spiceai/spiceai:2.0.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.2

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
feat: Implement catalog connectors for various databases by @lukekim in #9509
Refactor and clean up code across multiple crates by @lukekim in #9620
fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
Properly install postgres in install-postgres action by @krinart in #9629
fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
Update spicepod.schema.json by @app/github-actions in #9640
Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
testoperator for partitioned arrow accelerator by @Jeadie in #9635
Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
docs: update distribution details and add NAS support in release notes by @lukekim in #9650
Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
perf: Cache Turso metastore connection across operations by @penberg in #9646
Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
Improve CDC cache invalidation by @krinart in #9651
Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
Upgrade to Rust v1.93.1 by @lukekim in #9669
Security fixes and improvements by @lukekim in #9666
feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
Skip caching http error response + add response_headers by @krinart in #9670
refactor: Remove v1/evals functionality by @Jeadie in #9420
Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
Enable on_zero_results: use_source for views by @krinart in #9699
fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
Fixes for anthropic by @Jeadie in #9707
Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
[SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
Upgrade to Turso v0.5 by @lukekim in #9628
feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
fix: Ensure Cayenne respects target file size by @peasee in #9730
refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
[SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
Add executor resource and replica support to cloud app config by @ewgenius in #9734
feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
JSON file format improvements by @lukekim in #9743
[SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
Make S3 metadata column handling more robust by @sgrebnov in #9762
Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
Improve JSON SODA support by @lukekim in #9795
Add ADBC Data Connector by @lukekim in #9723
docs: Release Cayenne as RC by @peasee in #9766
cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
Include updated JSON formats in HTTPS connector by @lukekim in #9800
Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
Pass through authentication to ADBC connector by @lukekim in #9801
Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
Fix http connector by @krinart in #9818
Fix ADBC Connector build and test by @lukekim in #9813
Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
Filter pushdown improvements for SqlTable by @lukekim in #9852
feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
feat: Add custom auth header support for GraphQL connector by @krinart in #9899
Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
When executor connects, send DDL for existing tables by @Jeadie in #9904
fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
Remove Perplexity support by @Jeadie in #9910
Fix refresh_sql support for debezium constraints by @krinart in #9912
Implement DML for DynamoDBTableProvider by @lukekim in #9915
chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
Add full query federation support for ADBC data connector by @lukekim in #9953
Make time_format deserialization case-insensitive by @vyershov in #9955
Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
Add caching retention by @krinart in #9984
fix: improve Databricks schema error detection and messages by @lukekim in #9987
fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
fix: Address data correctness bugs found in audit by @sgrebnov in #10015
fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
Improve GitHub connector ref handling and resilience by @lukekim in #10023
feat: Add spice completions command for shell completion generation by @lukekim in #10024
fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
Implement RefreshDataset for distributed control stream by @Jeadie in #10055
perf: Improve S3 parquet read performance by @sgrebnov in #10064
fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
fix partial aggregation deduplication on string checking by @lukekim in #10078
fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
Add mode: file_update acceleration mode by @krinart in #10108
Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
fix: executor startup failures by @Jeadie in #10155
feat: Distributed runtime.task_history support by @Jeadie in #10156
fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
Fix linux/macos spice upgrade by @phillipleblanc in #10194
Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
feat: Add pagination support to HTTP data connector by @lukekim in #10228
feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.1...v2.0.0-rc.2

Spice v2.0-rc.1 (Mar 4, 2026)

March 4, 2026 · 23 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.1! 🚀

v2.0.0-rc.1 is the first release candidate for early testing of v2.0.

Highlights in this release candidate include:

Active-Active Highly-Available Distributed Query that is object-store-native and built on Apache Ballista, with dynamic cluster sizing, distributed ingestion, and cluster observability
Spice Cayenne RC with staged append writes, file-based retention deletes, composite partitioning, and distributed ingestion
DataFusion v52.2.0 Upgrade with sort pushdown, a new merge join, and dynamic filters
DDL Support for CREATE TABLE and DROP TABLE via SQL for Iceberg and Cayenne catalogs
DuckLake Catalog & Data Connector for lakehouse-style data management
GCS Data Connector (Alpha) for Google Cloud Storage
Rust CLI Rewrite for a unified single-binary experience
Dependency upgrades including DuckDB v1.4.4, delta_kernel v0.18.2, and mistral.rs

Spice v2.0 includes several breaking changes. Review the breaking changes section before upgrading.

Distribution Changes

AI/ML support including local LLM/ML model and hosted LLM inference is now included in the default Spice build and image. The separate models build variant has been removed.

A new Network Attached Storage (NAS) distribution with built-in SMB and NFS data connector support is also now available in nightly builds and with Spice.ai Enterprise.

Distribution / Variant	Open Source	Spice Cloud	Enterprise
Default	✅	✅	✅
Data	Nightly only	✅	✅
NAS (SMB + NFS)	Nightly only	❌	✅
Metal (macOS)	✅	✅	✅
CUDA (Linux)	Nightly only	✅	✅
Allocator variants	Nightly only	✅	✅
ODBC connector	Local build only	✅	✅

For more details, see the Distributions documentation.

What's New in v2.0.0-rc.1

Active-Active HA Distributed Query

Distributed Query exits Beta with active-active highly-available object-store-based distributed query.

Distributed query supports two execution modes:

Synchronous: Queries for accelerated datasets are distributed across executors and results are streamed back in real-time. Non-accelerated datasets execute only on the scheduler. Best for interactive queries where low latency is critical.
Asynchronous: Queries are submitted via the new HTTP-only /v1/queries API and results are materialized to object storage for later retrieval. Best for long-running analytical workloads, batch processing, and non-accelerated datasets in distributed mode.

Key improvements:

Dynamic Cluster Sizing: The query planner automatically adjusts parallelism based on the number of active executors in the cluster, ensuring optimal resource utilization as nodes are added or removed.
Distributed Ingestion: Data ingestion for partitioned accelerated tables is now distributed across executor nodes, enabling higher throughput and parallel data loading in cluster mode. Regular (non-partitioned) accelerated tables do not distribute ingestion loads.
Synchronous Execution on Scheduler: /v1/sql and FlightSQL queries now execute synchronously on the scheduler when appropriate, reducing inter-node overhead for queries that don't benefit from distribution.
Faster Failure Detection: Executor heartbeat timeout reduced from 180s to 30s, enabling the cluster to quickly detect and respond to executor failures.
Cluster Observability: New metrics and Grafana dashboard for monitoring distributed query clusters.

Spice Cayenne Improvements

The Spice Cayenne data accelerator exits Beta with significant reliability and performance improvements:

Staged Append Writes: WAL-based staged append writes prevent partial writes and data loss on stream errors. Batches are written to a WAL file before being committed, ensuring atomicity.
File-Based Retention Deletes: Time-based retention now supports file-level deletes for both position-based and primary-key tables, reducing I/O overhead compared to row-level deletion.
Multiple Partition Expressions: Support for composite partitioning with partition_by: [col1, col2] using hierarchical path-like keys (e.g., 2025/10/15).
Distributed Ingestion: Cayenne catalog now supports distributed ingestion across executor nodes in cluster mode, including UPDATE operations.
Improved Robustness: Fixed CDC edge case where DELETE + UPSERT sequences could produce duplicate primary keys across protected snapshots. Improved upsert handling during runtime restarts.

DataFusion v52.2.0 Upgrade

Apache DataFusion has been upgraded to v52.2.0, bringing significant performance improvements, new query features, and enhanced extensibility.

Performance Improvements:

Faster CASE Expressions: Lookup-table-based evaluation for certain CASE expressions avoids repeated evaluation, accelerating common ETL patterns
MIN/MAX Aggregate Dynamic Filters: Queries with MIN/MAX aggregates now create dynamic filters during scan to prune files and rows as tighter bounds are discovered during execution
New Merge Join: Rewritten sort-merge join (SMJ) operator with speedups of three orders of magnitude in pathological cases (e.g., TPC-H Q21: minutes → milliseconds)
Caching Improvements: New statistics cache for file metadata avoids repeatedly recalculating statistics, significantly improving planning time. A prefix-aware list-files cache accelerates evaluating partition predicates for Hive partitioned tables
Improved Hash Join Filter Pushdown: Build-side hash map contents are now passed dynamically to probe-side scans for pruning files, row groups, and individual rows

Major Features:

Sort Pushdown to Scans: Sorts are pushed into data sources, enabling ~30x performance improvement on pre-sorted data with top-K queries. Parquet scans now reverse row group order for DESC queries on ASC-sorted files
TableProvider supports DELETE and UPDATE: New hooks for DELETE and UPDATE statements in the TableProvider trait, enabling Iceberg and Cayenne connectors to implement SQL DELETE and UPDATE operations
More Extensible SQL Planning: New RelationPlanner API for extending SQL planning for FROM clauses, enabling support for vendor-specific SQL dialects

DDL Support for Iceberg and Cayenne

SQL Schema Management: Spice now supports CREATE TABLE and DROP TABLE DDL operations for Iceberg and Cayenne catalogs via FlightSQL and the /v1/sql API. DML validation has been updated for catalog-level writability.

DuckLake Catalog & Data Connector

Lakehouse-Style Data Management: New DuckLake catalog and data connector enable lakehouse-style data management with DuckDB as the metadata catalog and object storage for data files. DuckLake provides ACID transactions, time travel, and schema evolution on top of Parquet files.

GCS Data Connector (Alpha)

Google Cloud Storage Support: New Google Cloud Storage data connector enables federated queries against data stored in GCS buckets, with Iceberg table support.

Rust CLI Rewrite

Unified Single-Binary Experience: The Spice CLI has been completely rewritten from Go to Rust, eliminating the Go dependency and providing a single spice binary built from the same codebase as spiced. This improves startup performance, reduces distribution size, and ensures consistent behavior between CLI and runtime.

Key Features:

Full Feature Parity: All 27+ CLI commands re-implemented in Rust with identical behavior
New spice query Command: Interactive REPL for async queries via the /v1/queries API with multi-line SQL input, spinner progress indicator, Ctrl+C cancellation, and partial query ID matching
--output=json Flag: Machine-readable JSON output for CLI commands, enabling scripting and automation
spice login --output: New output modes (env, json, keychain) for flexible credential management
spice cloud metrics: New command for Spice Cloud deployment metrics

Models Included by Default

Local LLM/ML model inference (via mistral.rs) is now included in the default Spice build. The separate models build variant has been removed. This simplifies installation and ensures all users have access to local AI inference capabilities.

Error Propagation for Dataset and Model Status APIs

The /v1/datasets and /v1/models APIs now return structured error information when a component is in an Error state. The ?status=true query parameter must be passed to retrieve the real-time component status, including the error state and details. Previously, the status field only indicated Error with no further detail. Now, two new fields are included when ?status=true is specified:

error: A structured object with category, type, and code fields for programmatic error handling (e.g. { "category": "dataset", "type": "auth", "code": "dataset.auth" }).
error_message: A human-readable description of why the component entered an error state.

These fields are only present when ?status=true is passed and the component is in an error state.

Example /v1/datasets?status=true response:

[
  {
    "from": "postgres:syncs",
    "name": "daily_journal",
    "replication_enabled": false,
    "acceleration_enabled": true,
    "status": "Ready"
  },
  {
    "from": "databricks:hive_metastore.default.messages",
    "name": "messages",
    "replication_enabled": false,
    "acceleration_enabled": true,
    "status": "Error",
    "error": {
      "category": "dataset",
      "type": "auth",
      "code": "dataset.auth"
    },
    "error_message": "Unable to authenticate with datasource credentials"
  }
]

The spice datasets and spice models CLI commands now include an ERROR column that displays the error message for any component in an error state.

Additional Dependency Upgrades

Dependency	Version
Ballista	v52.0.0
DuckDB	v1.4.4
delta_kernel	v0.18.2
mistral.rs	v0.7.0 (candle fork removed, now uses candle 0.9.2 from crates.io)
Turso (libsql)	v0.4.4
Vortex	Upgraded with CASE-WHEN support
AWS SDK	Multiple crates updated + APN user-agent support

Other Improvements

Spicepod v2 Support: Spicepods now support version v2, and spice init generates spicepod.yaml files with version: v2 by default while maintaining backward compatibility for existing v1 spicepods.
x.ai Models: x.ai models now exclusively use the /v1/responses endpoint with rate limiting support.
HuggingFace Chat Templates: Added support for chat templates in HuggingFace model configurations.
Databricks SQL Dialect: Added Databricks SQL dialect for DataFusion unparser, improving federation query generation.
Snowflake: Added snowflake_private_key parameter for key-pair authentication.
Acceleration Metrics: New rows_written, bytes_written, and dataset_acceleration_size_bytes metrics for acceleration refresh ingestion.
Refresh SQL UDFs: Core scalar UDFs are now enabled in refresh SQL expressions.
FlightSQL: Fixed TLS connection handling for grpc+tls:// endpoints with custom CA certificate support.
FlightSQL: Fixed schema consistency by expanding view types and verifying field names.
Hash Index: Fixed query correctness when hash index is used with additional filters.
Results Cache: Fixed schema preservation for empty query results.
Query Nullability: Reconciled execution stream nullability with logical plan schema.
Schema Evolution: Graceful handling of schema evolution mismatch errors during data refresh.
Internal YAML Parser: Replaced deprecated serde_yaml with an internal YAML implementation.

Spicepod v1 to v2 Changes

Spicepod v2 introduces configuration improvements while maintaining backward compatibility with v1. Existing v1 spicepods continue to work — deprecated fields are automatically migrated at load time.

Version support:

Version	Status
`v2`	Default. Used by `spice init`.
`v1`	Supported. Deprecated fields auto-migrate.
`v1beta1`	Removed. No longer accepted.

Configuration changes:

v1 (deprecated)	v2 (preferred)	Notes
`runtime.results_cache`	`runtime.caching.sql_results`	All fields migrate automatically. `cache_max_size` → `max_size`.
`runtime.memory_limit`	`runtime.query.memory_limit`	Auto-migrated. `query.memory_limit` takes priority if both set.
`runtime.temp_directory`	`runtime.query.temp_directory`	Auto-migrated. `query.temp_directory` takes priority if both set.
`dataset.invalid_type_action`	`dataset.unsupported_type_action`	Auto-migrated. v2 adds a new `string` variant.

New v2 fields:

runtime.ready_state — Controls when the runtime reports ready (on_load default, or on_registration).
runtime.flight.do_put_rate_limit_enabled — Enable/disable FlightSQL DoPut rate limiting (default: true).
runtime.query.spill_compression — Compression for query spill files (e.g., lz4_frame).
runtime.scheduler.partition_management — Configure partition assignment interval, limits, and timeouts for distributed mode.
runtime.caching.sql_results.stale_while_revalidate_ttl — Serve stale cached results while revalidating in the background.
runtime.caching.sql_results.encoding — Cache entry compression (e.g., zstd).
catalog.access: read_write_create — New access mode for catalogs that support DDL operations.

Migration note: When both the deprecated v1 field and its v2 equivalent are set, the v2 field takes priority.

Contributors

Breaking Changes

Cayenne and Distributed Query exit Beta: Beta warnings have been removed from documentation and code. Both features are now considered GA-ready.
Models included by default: The separate models build variant has been removed. Local LLM inference is now always included.
Spicepod version defaults to v2: New spicepods created with spice init now default to version: v2. Existing v1 spicepods remain supported, and v1beta1 is no longer accepted.
Windows native builds removed: Native Windows builds are no longer provided. Use WSL for local development instead.
Metric renames: accelerated_refresh metrics renamed to acceleration_refresh for consistency. last_refresh_time gauge renamed to include milliseconds unit.
Caching config renamed: ResultsCache replaced with SQLResultsCacheConfig in configuration.
DuckDB parameter rename: partitioned_write_flush_threshold renamed to partitioned_write_flush_threshold_rows.
v1/search API: The /v1/search API now always returns an array in matches, even for single results.
x.ai model endpoint: x.ai models now exclusively use the /v1/responses endpoint.
Error messages: Error messages across S3 Vectors, ScyllaDB, Snowflake, ClickHouse, and other components have been refactored for clarity and consistency.

Cookbook Updates

New and updated Spice Cookbook recipes:

Async Queries: Submit long-running queries asynchronously and retrieve results later.
DuckLake Catalog Connector: Use DuckLake for lakehouse-style data management with ACID transactions and time travel.

The Spice Cookbook includes 88 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.1 image:

docker pull spiceai/spiceai:2.0.0-rc.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
fix: features check by @phillipleblanc in #9014
fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
URL table support by @lukekim in #9018
ScyllaDB key filter by @lukekim in #8997
fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
Fix duplicate snapshot creation on startup by @sgrebnov in #9029
Add ScyllaDB and SMB to the README table by @krinart in #9034
Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
Release notes for release release/1.11.0-rc.2 by @krinart in #9016
ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
Add missing protoc step to setup-cc action by @krinart in #9041
fix: Update Search integration test snapshots by @app/github-actions in #9013
Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
Cayenne S3 Express One Zone improvements by @lukekim in #9015
Add zlib1g-dev to CI by @lukekim in #9052
Improve validation and logging for hash indexes by @lukekim in #9047
Upgrade Vortex with CASE-WHEN by @lukekim in #9051
x.ai models now exclusively use /v1/responses endpoint by @lukekim in #9400
Improvements for snapshot schema comparison by @krinart in #9401
v2.0 breaking changes by @lukekim in #9233
Create PartitionManagementTask for scheduler to update accelerated table partition assignments by @Jeadie in #9378
refactor(Cayenne): route all write orchestration through CayenneDataSink by @sgrebnov in #9402
Refactor benchmark to use QueryExecutor trait by @Jeadie in #9418
feat: Add spidapter build and release workflow by @peasee in #9427
Testoperator: add support for api-key when connecting to external spice instance by @sgrebnov in #9421
Initial implementation of Ducklake catalog & data connectors by @lukekim in #9083
Require aws_lc_rs since jsonwebtoken upgrade by @Jeadie in #9426
feat: Add spidapter tool by @peasee in #9425
Add release notes for 1.11.2 patch release by @sgrebnov in #9430
feat(spidapter): integrate system-adapter-protocol with SCP provisioning by @phillipleblanc in #9434
Add DuckLake TPCH E2E workflow and federated Spicepod configuration by @lukekim in #9431
fix(spidapter): use Flight handshake auth instead of x-api-key header by @phillipleblanc in #9435
[spidapter] Keep only what sparks joy by @Jeadie in #9439
Refactor binary operator balancing by @Jeadie in #9424
feat: Add Iceberg DDL support (CREATE TABLE / DROP TABLE) for default catalog override by @phillipleblanc in #9440
Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
Update spidapter for new system-adapter-protocol by @sgrebnov in #9442
docs: fix typos and syntax errors in style guide and error handling docs by @cluster2600 in #9445
Add acceleration refresh ingestion metrics (rows_written, bytes_written) by @phillipleblanc in #9461
Refactor(Cayenne): Replace CatalogError and string based errors with Snafu errors by @sgrebnov in #9403
Replace deprecated claude-3-5-haiku-latest with claude-haiku-4-5 by @Jeadie in #9492
Fix #9481: Preserve schema in results cache for empty query results by @phillipleblanc in #9485
Fix partition by serializing by @Jeadie in #9474
query: reconcile execution stream nullability with logical plan schema by @phillipleblanc in #9486
initial spice-cloud-client crate and spice cloud metrics --app <app-name>. by @Jeadie in #9480
feat: Return dataset error message in datasets API by @peasee in #9487
Spicebench by @lukekim in #9447
build(deps): consolidate dependabot dependency updates by @phillipleblanc in #9504
fix(cluster): route non-partitioned accelerated tables in distributed mode by @phillipleblanc in #9508
Enable core scalar UDFs in refresh SQL by @sgrebnov in #9502
Fix metrics in Spidapter again by @Jeadie in #9497
fix(cluster): tolerate Completed->status propagation race in distributed query handle by @phillipleblanc in #9510
feat: Support distributed ingestion in cayenne catalog by @peasee in #9506
Fix Cayenne duplicate primary keys after DELETE + UPSERT CDC sequences by @krinart in #9494
fix(cluster): rewrite table scans inside subqueries for distributed execution by @phillipleblanc in #9518
fix: Set catalog mode to readwritecreate in spidapter by @peasee in #9519
Upgrade AWS SDK crates & set APN user-agent in AWS SDK credential bridge by @lukekim in #8328
feat(runtime): add runtime ready_state on_registration semantics by @lukekim in #9522
fix: Add spidapter post-setup retries by @peasee in #9526
Make partition discovery more robust and make initialization non-blocking by @sgrebnov in #9499
Make lint-rust-fix support targeted packages and features by @Jeadie in #9511
Handle new Cloud SCP API by @Jeadie in #9532
Refactor and simplify streaming benchmarks by @krinart in #9405
fix: ensure spidapter only increments attempts on failures by @peasee in #9534
feat: Support specifying app resources in spidapter by @peasee in #9536
test(runtime): Spice Cayenne DDL integration test by @lukekim in #9535
fix: Handle schema evolution mismatch errors during data refresh by @lukekim in #9527
fix: resolve clippy lint warnings by @phillipleblanc in #9547
pr-builds --tag <TAG> for build_and_release.yml by @Jeadie in #9507
Add --output flag to spice login with env/json/keychain modes by @Jeadie in #9541
Don't use 'PartitionedTableScanRewrite' in async distributed query by @Jeadie in #9548
feat(spidapter): add local backend mode with single executor by @phillipleblanc in #9531
support chat template in HF by @Jeadie in #9543
fix(cayenne): stream PK retention deletes and run OOM regression in CI by @phillipleblanc in #9533
cayenne: Staged append writes to prevent partial writes and data loss on stream error by @sgrebnov in #9491
AcceleratedTable::scan use FederatedTable::scan when ClusterRole::Scheduler by @Jeadie in #9550
Upgrade to delta-kernel-rs v0.18.2 by @lukekim in #9528
Run cayenne tests as part of PR CI by @sgrebnov in #9554
Upgrade to DataFusion v52.2.0 by @lukekim in #9419
Remove Snapshot Compaction + Add snapshot existence check by @krinart in #9523
Update dependencies by @lukekim in #9566
fix: Update benchmark snapshots by @app/github-actions in #9565
fix: Compare Cayenne table configuration on startup by @peasee in #9529
Make Refresh::refresh_sql more robust to alterations over time. by @Jeadie in #9549
fix: Update datafusion-table-providers dependency to latest revision by @lukekim in #9574
Unset AWS_ENDPOINT_URL when empty by @krinart in #9575
fix: allow BytesProcessedExec repartitioning for unordered input by @lukekim in #9540
Sanitize DataFusion errors by @lukekim in #9530
Add conditional logging for partition assignments by @Jeadie in #9577
use 'properly early exit on SIGTERM' by @Jeadie in #9573
Update datafusion to 52.2.0 by @phillipleblanc in #9582
Ensure we query one and only one partition per request by @Jeadie in #9416
feat: Add support for Spicepod version v2 by @lukekim in #9583
[SpiceDQ] Improve error messages; Avoid race condition on allocate_initial_partitions. by @Jeadie in #9579
Update ballista dependencies to latest 52.0.0 revision by @lukekim in #9581
Fix Databricks spark_connect mode always disabled by @phillipleblanc in #9586
Support partitioning in Arrow accelerator by @Jeadie in #9571
Fix spice query CLI response deserialization by @phillipleblanc in #9588
fix: Update benchmark snapshots by @app/github-actions in #9584
fix: Share RuntimeEnv across Cayenne read/write/delete paths for targeted list_files_cache invalidation by @sgrebnov in #9589
feat: Add file:// state_location support for async queries scheduler by @phillipleblanc in #9590
Update endgame links by @krinart in #9598

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.2...v2.0.0-rc.1

Spice v1.11.0 (Jan 28, 2026)

January 28, 2026 · 58 min read

William Croxson

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-stable! ⚡

In Spice v1.11.0, Spice Cayenne reaches Beta status with acceleration snapshots, Key-based deletion vectors, and Amazon S3 Express One Zone support. DataFusion has been upgraded to v51 along with Arrow v57.2, and iceberg-rust v0.8.0. v1.11 adds several DynamoDB & DynamoDB Streams improvements such as JSON nesting, and adds significant improvements to Distributed Query with active-active schedulers and mTLS for enterprise-grade high-availability and secure cluster communication.

This release also adds new SMB, NFS, and ScyllaDB Data Connectors (Alpha), Prepared Statements with full SDK support (gospice, spice-rs, spice-dotnet, spice-java, spice.js, and spicepy), Google LLM Support for expanded AI inference capabilities, and significant improvements to caching, observability, and Hash Indexing for Arrow Acceleration.

What's New in v1.11.0

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous performance and stability improvements.

Key Enhancements:

Key-based Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. Key-based deletion vectors are more memory-efficient than positional vectors for sparse deletions.
S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Improved Reliability:

Resolved FuturesUnordered reentrant drop crashes
Fixed memory growth issues related to Vortex metrics allocation
Metadata catalog now properly respects cayenne_file_path location
Added warnings for unparseable configuration values

For more details, refer to the Cayenne Documentation.

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

SQL Pipe Operators: Support for |> syntax for inline transforms
DESCRIBE <query>: Returns the schema of any query without executing it
Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Apache Arrow has been upgraded to v57.2, bringing major performance improvements and new capabilities.

Key Features:

4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

DynamoDB Connector Enhancements

Added JSON nesting for DynamoDB Streams
Improved batch deletion handling

Distributed Query Improvements

High Availability Clusters: Spice now supports running multiple active schedulers in an active/active configuration for production deployments. This eliminates the scheduler as a single point of failure and enables graceful handling of node failures.

Multiple schedulers run simultaneously, each capable of accepting queries
Schedulers coordinate via a shared S3-compatible object store
Executors discover all schedulers automatically
A load balancer distributes client queries across schedulers

Example HA configuration:

runtime:
  scheduler:
    state_location: s3://my-bucket/spice-cluster
    params:
      region: us-east-1

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: S3, ABFS, and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

Exponential backoff for scheduler disconnection recovery
Increased gRPC message size limit from 16MB to 100MB for large query plans
HTTP health endpoint for cluster executors
Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

V3 Metadata Support: Full support for Iceberg V3 table metadata format
INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
SQL Catalog Updates: Implement update_table and register_table for SQL catalog
S3 Tables Catalog: Implement update_table for S3 Tables catalog
Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Features:

Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically.

CLI Commands:

# List all snapshots for a dataset
spice acceleration snapshots taxi_trips

# Get details of a specific snapshot
spice acceleration snapshot taxi_trips 3

# Set the current snapshot for rollback (requires runtime restart)
spice acceleration set-snapshot taxi_trips 2

HTTP API Endpoints:

Method	Endpoint	Description
GET	`/v1/datasets/{dataset}/acceleration/snapshots`	List all snapshots for a dataset
GET	`/v1/datasets/{dataset}/acceleration/snapshots/{id}`	Get details of a specific snapshot
POST	`/v1/datasets/{dataset}/acceleration/snapshots/current`	Set the current snapshot for rollback

For more details, refer to the Acceleration Snapshots Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol

SDK Support:

SDK	Support	Min Version	Method
gospice (Go)	✅ Full	v8.0.0+	`SqlWithParams()` with typed constructors (`Int32Param`, `StringParam`, `TimestampParam`, etc.)
spice-rs (Rust)	✅ Full	v3.0.0+	`query_with_params()` with `RecordBatch` parameters
spice-dotnet (.NET)	✅ Full	v0.3.0+	`QueryWithParams()` with typed parameter builders
spice-java (Java)	✅ Full	v0.5.0+	`queryWithParams()` with typed `Param` constructors (`Param.int64()`, `Param.string()`, etc.)
spice.js (JavaScript)	✅ Full	v3.1.0+	`query()` with parameterized query support
spicepy (Python)	✅ Full	v3.1.0+	`query()` with parameterized query support

Example (Go):

import "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
  <groupId>ai.spice</groupId>
  <artifactId>spiceai</artifactId>
  <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

Google Chat Models: Access Google's Gemini models for chat completions
Google Embeddings: Generate embeddings using Google's text embedding models
Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

Single files: SELECT * FROM 's3://bucket/data.parquet'
Directories/prefixes: SELECT * FROM 's3://bucket/data/'
Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

Automatic file format detection (Parquet, CSV, JSON, etc.)
Hive-style partition inference with filter pushdown
Schema inference from files
Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

/v1/queries endpoint: Submit queries and retrieve results asynchronously

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

Snapshot monitoring widgets
Improved accelerated datasets section
Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

Primary key hash index support
Secondary index support for non-primary key columns
Composite key support with proper null value handling

Example configuration:

datasets:
  - from: postgres:users
    name: users
    acceleration:
      enabled: true
      engine: arrow
      primary_key: user_id
      indexes:
        '(tenant_id, user_id)': unique  # Composite hash index

For more details, refer to the Hash Index Documentation.

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Developer Experience Improvements

Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
Improved Spicepod Schema: Improved JSON schema generation for better IDE support and validation
Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

Model Listing: New functionality to list available models across multiple AI providers
DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters
Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
Reliability: Initialize secrets before object stores in cluster executor mode
Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

Generate certificates using spice cluster tls init and spice cluster tls add
Update scheduler and executor startup commands with --node-mtls-* arguments
For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

Old Name	New Name
`--cluster-mode`	`--role`
`--cluster-ca-certificate-file`	`--node-mtls-ca-certificate-file`
`--cluster-certificate-file`	`--node-mtls-certificate-file`
`--cluster-key-file`	`--node-mtls-key-file`
`--cluster-address`	`--node-bind-address`
`--cluster-advertise-address`	`--node-advertise-address`
`--cluster-scheduler-url`	`--scheduler-address`

Removed CLI Arguments:

--cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0 image:

docker pull spiceai/spiceai:1.11.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

DataFusion: Upgraded to v51 (release notes)
Arrow: Upgraded to v57.2 (release notes)
iceberg-rust: Upgraded to v0.8.0 (release notes)

What's Changed

Changelog

OTel exporter for push metrics by @lukekim in #8442
fix: Update benchmark snapshots by @app/github-actions in #8448
Add TPCH append tests to scheduled dispatch workflow by @sgrebnov in #8451
Add snapshot creation logging by @krinart in #8469
Fix PeriodicReader panic by @krinart in #8471
Benchmarks: increase readiness timeout for turso acceleration (TPC-H) by @sgrebnov in #8470
fix: Pin CUDA build actions to commits by @peasee in #8477
Add Criterion benchmarking to chunking crate. by @Jeadie in #8431
DuckDB agg pushdown: gate behind accelerator parameter by @mach-kernel in #8474
Rename aggregate_pushdown_optimization -> optimizer_duckdb_aggregate_pushdown by @ewgenius in #8485
Handle throttling exception for DynamoDB streams by @phillipleblanc in #8492
docs: Add release notes by @peasee in #8478
Update spicepod.schema.json by @app/github-actions in #8496
Move 'test_projection_pushdown' to runtime-datafusion by @Jeadie in #8490
Fix OTEL metrics HTTP exporter client setup by @phillipleblanc in #8489
Update endgame to include new caching accelerator cookbook by @phillipleblanc in #8487
DynamoDB tests and fixes by @lukekim in #8491
Align make lint-rust-fix with make lint-rust by @Jeadie in #8499
fix: Remove unused Cayenne parameters by @peasee in #8500
Force task history captured_plan outputs to be captured even if they would be filtered out otherwise by @phillipleblanc in #8501
release: post-release updates by @peasee in #8503
CI: Fix E2E models dispatch by @mach-kernel in #8505
Use an isolated Tokio runtime for refresh tasks that is separate from the main query API by @phillipleblanc in #8504
Update openapi.json by @app/github-actions in #8512
Update dependencies by @phillipleblanc in #8513
fix: Avoid double hashing cache key by @peasee in #8511
fix: Eagerly drop cached records for results larger than max by @peasee in #8516
feat: Support vortex zstd compressor by @peasee in #8515
warning if column is defined in spicepod but is non-existant by @Jeadie in #8498
Return summarized spicepods from /v1/spicepods by @phillipleblanc in #8404
Fix for idle DynamoDB Stream by @krinart in #8506
Use Datafusion::Plan over Datafusion::Internal for user-facing search errors. by @Jeadie in #8484
DDB Streams Integration Test + Memory Acceleration + Improved Warning by @krinart in #8520
Upgrade to gospice v8 by @lukekim in #8524
Vortex file format for object store by @Jeadie in #8525
docs: Add missing cookbooks to endgame, focus area section by @peasee in #8527
ListingTableConnector: Drop partition columns that reoccur in file schema by @mach-kernel in #8519
fix(cluster): initialize secrets before object stores in executor by @sgrebnov in #8532
Use separate Tokio runtime for SWR refreshes by @phillipleblanc in #8530
fix: SQL Results Cache SWR triggers incorrect cache miss metric by @phillipleblanc in #8529
feat: testoperator: OTLP streaming metrics / connect to existing instance / infinite mode / query timeout by @phillipleblanc in #8537
Update openapi.json by @app/github-actions in #8518
Add better attributes for search testing by @Jeadie in #8531
fix: Improve Cayenne errors, ID selection for table/partition creation by @peasee in #8523
Percent-encode Kubernetes secret name path segment by @phillipleblanc in #8522
Initial parameterisation of Search integration tests by @Jeadie in #8066
fix: Add warning when multiple partitions are defined for the same table by @peasee in #8540
Add DuckDB file-mode support for search test parameterization. by @Jeadie in #8541
fix: Add recursion depth limits to prevent DoS via deeply nested data (DynamoDB + S3 Vectors) by @phillipleblanc in #8544
Remove the clippy::too_many_lines lint by @phillipleblanc in #8549
feat: Add spice cluster tls commands by @phillipleblanc in #8550
Move OTel ingestion to Flight port, remove separate OTel port 50052 by @phillipleblanc in #8551
Move verbose tool init messages to trace by @phillipleblanc in #8552
Add SANs to spice cluster tls add certificates by @phillipleblanc in #8554
Add cpu, gpu, and memory to telemetry by @lukekim in #8483
feat: Add workflows and workflow runs to GitHub Data Connector by @peasee in #8548
Add S3Vectors option for paramterised search tests by @Jeadie in #8555
Distributed query: TLS + API key by @mach-kernel in #8468
Refactor RRF SQL to use LogicalPlanBuilder by @Jeadie in #7968
Require mTLS for distributed query cluster mode by @phillipleblanc in #8580
Fix stats for rewritten DistributeFileScanOptimizer plans by @mach-kernel in #8581
Bump actions/cache from 4.3.0 to 5.0.1 by @app/dependabot in #8573
Show user-friendly error on empty DDB table by @krinart in #8586
Add (Deprecated) labels to deprecated spice sql params by @krinart in #8588
Fix kafka warning when security.protocol is set to PLAINTEXT by @krinart in #8587
Upgrade dependencies by @phillipleblanc in #8593
Release notes for v1.10.1 by @Jeadie in #8568
Run search benchmarks twice a week. by @Jeadie in #8592
Add cayenne data accelerator by @Jeadie in #8553
SQL allowlist for tools: sql, list_datasets, search, table_schema by @Jeadie in #8449
use rstest for llms integration tests by @Jeadie in #8566
Post v1.10.1 housekeeping by @Jeadie in #8600
Add checklist for SDK publication in end_game.md by @Jeadie in #8602
Fix IMAGE_TAG assignment for Docker compatibility by @Jeadie in #8599
Modify cluster arguments to spiced for UX review by @phillipleblanc in #8603
Update QA analytics with new release data by @Jeadie in #8601
Remove 'tract-core' dependency. by @Jeadie in #8605
Google embedding and chat models. by @Jeadie in #8423
Fix test_github_workflows integration test by @sgrebnov in #8607
Update openapi.json by @app/github-actions in #8604
ci: Upload artifacts to MinIO eagerly after each build step by @phillipleblanc in #8615
fix: SQLite accelerator decimal/date handling by @phillipleblanc in #8606
Configure mTLS for executor-to-executor gRPC connections by @sgrebnov in #8617
feat: Enable localpod with caching mode accelerator for tiered caching by @phillipleblanc in #8621
Add Cayenne S3 Express One Zone support for data files by @lukekim in #8502
Add snapshot interval for acceleration snapshots by @phillipleblanc in #8627
Add dataset_load_parallelism parameter to spicepod.yml by @peasee in #8630
Json Nesting for DynamoDB by @krinart in #8623
Restore deprecated open-telemetry flag in spiced by @phillipleblanc in #8629
Implement batching for Kafka/Debezium + null Decimal handling by @krinart in #8622
fix: Status field in /v1/datasets & /v1/models by @lukekim in #8633
Add Spice test operator improvements by @sgrebnov in #8625
Add v1.10.2 release notes by @sgrebnov in #8640
Align object-store vortex with runtime feature flagging by @Jeadie in #8620
Extended LLM & search tests on cron by @Jeadie in #8624
Test-operator: emit main metrics as part of load tests by @sgrebnov in #8639
Build a local docker image from an existing Spice binary by @phillipleblanc in #8619
fix: Use runtime-rate-control for GitHub Data Connector by @peasee in #8638
Upgrade dependencies by @phillipleblanc in #8655
Bump headers-accept from 0.1.4 to 0.3.0 by @app/dependabot in #8644
Update openapi.json by @app/github-actions in #8637
Update SECURITY.md - Include v1.10.2 by @sgrebnov in #8661
Serialize acceleration snapshots with refresh writes by @phillipleblanc in #8652
Update AI Installation test to use minilm_l6_v2 by @sgrebnov in #8659
Fixes for search integration test CI by @Jeadie in #8656
fix: Use a GitHub rate controller per auth context by @peasee in #8662
fix: Update Search integration test snapshots by @app/github-actions in #8654
Make E2E Test Release Installation (AI, Local HF model) test more robust by @sgrebnov in #8666
Fix issue with location predicate for custom S3 endpoints + regression integration test by @phillipleblanc in #8668
fix: Validate schema match before projection pushdown in UnionProjectionPushdownOptimizer by @phillipleblanc in #8669
Proper batch commit for kafka/debezium by @krinart in #8671
Improve spicepod json schema generation by @ewgenius in #8547
Start the anonymous telemetry exporter asynchronously by @phillipleblanc in #8679
fix: Move enforce-pulls to hosted runner by @phillipleblanc in #8686
Update QA analytics with 1.10.2 release data by @sgrebnov in #8667
fix: Azure does not support suffix range requests by @phillipleblanc in #8685
Remove spicepod-validator cargo build from build-dev target by @Jeadie in #8684
fix: Update test snapshots by @app/github-actions in #8680
fix: Update Search integration test snapshots by @app/github-actions in #8681
SMB and NFS Data Connectors by @lukekim in #8674
Upgrade to openai-async v0.32 by @lukekim in #8635
v1.10.3 release notes by @phillipleblanc in #8693
Upgrade dependencies by @phillipleblanc in #8704
fix: Support NDJSON and LDJSON by @lukekim in #8649
move OpenAI overrides to non-prefixed by @Jeadie in #8678
Update Google LLM param: google_dimensions -> dimensions by @Jeadie in #8677
Make cluster mTLS optional with insecure flag by @phillipleblanc in #8703
Revert "fix: Move enforce-pulls to hosted runner (#8686)" by @phillipleblanc in #8709
Initial 'testoperator run text-to-sql' by @Jeadie in #8618
Add support for abfss by @krinart in #8706
Add testoperator TPCH dispatch for ABFS with hierarchical namespace disabled + versioning enabled by @phillipleblanc in #8711
Update openapi.json by @app/github-actions in #8692
cluster: validate --role argument by @phillipleblanc in #8717
Upgrade to Turso v0.3.2 by @lukekim in #8716
Rename --insecure to --allow-insecure-connections to be consistent with existing naming by @lukekim in #8720
Remove 'testoperator run http-consistency/http-overhead' by @Jeadie in #8708
refactor: Remove cluster feature flag by @phillipleblanc in #8718
Docs: Distributed query ADR by @mach-kernel in #8608
Use model.datasets to allowlist on tools by @Jeadie in #8714
cluster: quality of life improvements to starting cluster mode locally by @phillipleblanc in #8719
Docs: Ballista extension ADR by @mach-kernel in #8616
Improve deprecation messages when going from prefixed -> non-prefixed. by @Jeadie in #8724
Remove tools from auto-defaults by @Jeadie in #8725
Make distinct providers for vector spilling, vector partitioning. by @Jeadie in #8546
cluster: default scheduler address port by @phillipleblanc in #8728
Add Makefile targets for testoperator by @Jeadie in #8729
text-to-sql dispatch in testoperator by @Jeadie in #8705
DR-006: High Availability Distributed Query with Stateless Schedulers by @lukekim in #8721
DR-007: mTLS for Distributed Query Cluster Communication by @lukekim in #8722
SMB and NFS improvements by @lukekim in #8710
fix: Cluster executors use scheduler's temp_directory for shuffle files by @phillipleblanc in #8733
use 'max_message_size' in flight service too by @Jeadie in #8730
Add page-level retry for transient GraphQL errors with backoff and increase GitHub rate limit buffer up to 100 by @ewgenius in #8726
Make testoperator Dockerfile; CI to build docker image to ghcr.io. by @Jeadie in #8732
cluster: UnionProjectionPushdownOptimizer: Add projection pushdown diagnostics for union children by @phillipleblanc in #8734
Fix column projection order mismatch with location metadata columns by @phillipleblanc in #8738
Fixes for testoperator. by @Jeadie in #8737
Improve Cayenne Deletion Vectors with KeyBased support by @lukekim in #8713
Fix testoperator_dispatch.yaml by @Jeadie in #8740
Add spice cloud CLI commands by @lukekim in #8528
Add FTP, NFS, & SMB TPCH SF1 spicepods by @lukekim in #8739
Prepared Statements by @lukekim in #7588
Schedule dispatch of testoperator run text-to-sql. by @Jeadie in #8745
Fix minio for ai benchmark CI by @Jeadie in #8743
Upgrade to Rust 1.91 by @phillipleblanc in #8749
fix: Update benchmark snapshots by @app/github-actions in #8763
Benchmarks: make row count validation skip logic configurable by scale factor, query set, and overrides by @sgrebnov in #8756
Make benchmark tests more robust by @sgrebnov in #8766
Add parameter to force using iam_role for DynamoDB by @krinart in #8767
fix: Update Search integration test snapshots by @app/github-actions in #8735
v1.10.4 release notes by @phillipleblanc in #8790
Trace metrics export errors by @sgrebnov in #8791
v1.10.4 SECURITY.md update by @phillipleblanc in #8800
Add timezone database to Docker image to fix Cayenne acceleration panic by @sgrebnov in #8799
Upgrade dependencies by @phillipleblanc in #8801
Fix table_allowlist for table sampling and NSQL by @Jeadie in #8789
Cayenne primary key on-conflict handling by @lukekim in #8788
fix: Update benchmark snapshots by @app/github-actions in #8773
fix: correctly identify deprecated openai_* parameters by @phillipleblanc in #8809
fix: Update benchmark snapshots by @app/github-actions in #8812
Use workspace version for cayenne crate by @phillipleblanc in #8811
Don't CAST strings which breaks push down optimizer by @lukekim in #8810
fix: Update benchmark snapshots by @app/github-actions in #8815
Update async-openai to latest revision 4dcd633aad6f - brings fix for openai compatible model providers by @ewgenius in #8816
Add auth/iam_role_source to DynamoDB connector by @krinart in #8808
DynamoDB fixes: JSON nesting for Streams, proper batch deletions by @krinart in #8821
Rough roadmap for 2026-2027 by @lukekim in #8805
Release notes for v1.11.0-rc1 by @ewgenius in #8786
Make S3V integration tests prepare_for_aws_tests more robust by @sgrebnov in #8820
Bump rsa from 0.9.9 to 0.9.10 in the cargo group across 1 directory by @app/dependabot in #8819
Add timezone database to Release and CUDA Docker images to fix Cayene panic by @sgrebnov in #8832
fix: UnionProjectionPushdownOptimizer - Schema change during transform_down breaks parent nodes by @phillipleblanc in #8831
Update grafana/datadog example dashboards by @krinart in #8833
Add Dev bird bench as text-to-sql queryset in CI. by @Jeadie in #8753
Update testoperator scheduler to use release/1.11 branch by @ewgenius in #8829
Spice Cayenne fixes and test spicepods for Beta & RC by @lukekim in #8787
testoperator dispatch all bird-bench database variants by @Jeadie in #8835
feat: Improve column statistics handling with safe access and defaults by @phillipleblanc in #8836
cluster: mTLS verification by @phillipleblanc in #8837
fix: 8770: Unsupported ScalarFunctionExpr in ORDER BY by @lukekim in #8838
Workflow tweaks by @lukekim in #8845
Cayenne: metadata catalog should respect cayenne_file_path location by @sgrebnov in #8844
Expand Cayenne feature coverage by @lukekim in #8848
docs: HA distributed query decisions by @phillipleblanc in #8817
fix(optimizer): Fix correctness issues in UnionProjectionPushdownOptimizer by @phillipleblanc in #8851
Pin reqwest to 0.12.24 to fix HuggingFace embedding model download by @ewgenius in #8853
Fix builds and pin to Ubuntu 22.04 by @lukekim in #8856
Revert "Fix builds and pin to Ubuntu 22.04" by @lukekim in #8861
Ensure setup Rust is run by @lukekim in #8862
fix: Ubuntu 24.04+ renamed libaio1 to libaio1t64 by @lukekim in #8865
Upgrade to Pulls with Spice v2 by @lukekim in #8866
Add limit and configuration name to 'testoperator run text-to-sql' by @Jeadie in #8839
PR check and test optimization by @lukekim in #8868
Upgrade S3 Vectors SDK and improve test robustness by @lukekim in #8867
[Testoperator] Query level and improved aggregate level for NSQL by @Jeadie in #8840
Add docker build for private branches for ghcr.io/spiceai/spiceai-dev by @phillipleblanc in #8873
Expand the data acceleration round-trip test coverage by @lukekim in #8855
fix: Provide a better error for improper bucket UDF arguments by @peasee in #8849
ScyllaDB Data Connector by @lukekim in #8827
Use tokio-rusqlite for Spice Cayenne SQLite by @lukekim in #8857
Cayenne: fix FuturesUnordered reentrant drop crash by @sgrebnov in #8863
Bump github/codeql-action from 4.31.9 to 4.31.10 by @app/dependabot in #8884
Bump golang.org/x/sys from 0.39.0 to 0.40.0 by @app/dependabot in #8881
Bump github.com/spiceai/gospice/v8 from 8.0.0 to 8.0.1 by @app/dependabot in #8883
Bump roaring from 0.11.2 to 0.11.3 by @app/dependabot in #8885
Bump golang.org/x/mod from 0.31.0 to 0.32.0 by @app/dependabot in #8882
Bump aws-sdk-s3 from 1.115.0 to 1.119.0 by @app/dependabot in #8887
Bump libc from 0.2.177 to 0.2.180 by @app/dependabot in #8886
Bump tokio-util from 0.7.17 to 0.7.18 by @app/dependabot in #8889
Bump governor from 0.10.2 to 0.10.4 by @app/dependabot in #8888
fix: flaky test test_concurrent_partition_creation by @phillipleblanc in #8898
Update Cayenne snapshots for TPC-DS by @lukekim in #8890
Add more buckets to histogram metrics by @krinart in #8850
feat: Add HTTP health endpoint for cluster executors by @phillipleblanc in #8899
feat: Implement model listing functionality for multiple providers by @lukekim in #8901
feat: Initial HA schedulers distributed query implementation by @phillipleblanc in #8852
fix: infer executor role from --scheduler-address when --role is omitted by @phillipleblanc in #8903
Improve install scripts and remove jq dependency by @lukekim in #8847
Benchmarks: sort PartitionedUnionExec children for deterministic snapshot comparison by @sgrebnov in #8877
Cayenne: share VortexFileCache across partitions via CayenneContext by @sgrebnov in #8880
Update ballista to add exponential backoff for scheduler disconnection by @phillipleblanc in #8905
Configurably add BirdBench evidence to testoperator text-to-SQL. by @Jeadie in #8904
Helm: Allow command override via values.yaml by @sgrebnov in #8906
Fix distributed query gRPC message size limit (16MB -> 100MB) by @phillipleblanc in #8900
OS specific setup actions by @lukekim in #8909
Cayenne should warn if unable to parse configuration value by @sgrebnov in #8907
Add snapshots widgets to example dashboard by @krinart in #8910
Add quality criteria for the features by @krinart in #8897
Improve Accelerated Datasets section for Grafana/Datadog dashboards by @krinart in #8915
Use HTTP traceparent in NSQL to support concurrency in 'testoperator run text-to-SQL' by @Jeadie in #8912
Remove setup for cc from integration_models.yml by @Jeadie in #8917
Propagate Azure and GCS credentials to executors in cluster mode by @phillipleblanc in #8918
Cayenne: fix memory growth due to vortex metrics allocation by @sgrebnov in #8908
fix(caching): Deduplicate refresh requests for JSON array responses by @sgrebnov in #8921
fix(caching): Return cached data directly for unfiltered queries (SELECT *) by @sgrebnov in #8919
Correct MinIO path syntax for spiced download by @Jeadie in #8916
Acceleration snapshots compaction + Improved Snapshots UX by @krinart in #8858
Change base image from bookworm-slim to trixie-slim by @Jeadie in #8923
Add testoperator run text-to-sql metrics from LogicalPlan by @Jeadie in #8895
Fix spicepod dependencies in testoperator by @Jeadie in #8875
Update copilot instructions for data correctness by @lukekim in #8922
Add BootstrapStatus + Snapshot bootstrapping parallelization by @krinart in #8926
fix: add missing feature-gate for AWS Secrets Manager error variant by @phillipleblanc in #8928
refactor: make ConnectorParams fields public for external connectors by @phillipleblanc in #8929
fix(caching): SWR refreshes only accessed entry instead of all stale rows by @sgrebnov in #8931
Cayenne: include cayenne_metadata_dir to known params by @sgrebnov in #8933
Rename Ingestion Lag chart in example dashboards by @krinart in #8932
fix(caching): Fix HTTP caching always MISS when projection excludes fetched_at by @sgrebnov in #8930
Reset expiry after snapshot bootstraping for Caching by @krinart in #8925
Set use_ssl=false for sccache by @lukekim in #8945
Hash indexing for Arrow Acceleration by @lukekim in #8924
[Cayenne] Acceleration snapshots support by @lukekim in #7973
perf(caching): Non-blocking cache writes on cache miss by @sgrebnov in #8948
Update NSQL models by @lukekim in #8951
Hash Index Key verification by @lukekim in #8949
Add snapshots_creation_policy param by @krinart in #8954
Remove candle & cudarc from non-models build by @lukekim in #8955
Acceleration Snapshots API and CLI by @lukekim in #8934
Ignore test for data_components arrow::indexed::test_primary_key_value_matches_batch by @Jeadie in #8962
fix: Update benchmark snapshots by @app/github-actions in #8965
Hash Index secondary index support by @lukekim in #8958
fix: Support primary key constraints in partitioned DuckDB tables mode by @sgrebnov in #8966
perf(caching): Batch cache writes by @sgrebnov in #8959
CI perf optimizations by @lukekim in #8968
Fix Makefile linting by @Jeadie in #8970
Fixes in testoperator run text-to-sql. by @Jeadie in #8927
implement Chat::as_sql for xAI anthropic by @Jeadie in #8957
Fix duckdb_file_path in search integration test by @Jeadie in #8972
fix: Update benchmark snapshots by @app/github-actions in #8971
Maintenance updates to Anthropic API by @Jeadie in #8956
Add CacheBackend Trait, implement pingora-lru, and add throughput tests by @lukekim in #8080
fix: Update benchmark snapshots by @app/github-actions in #8974
fix: Update benchmark snapshots by @app/github-actions in #8975
Make accelerator shutdown more robust by @lukekim in #8969
feat(duckdb): Add on_refresh_sort_columns for post-write data ordering (initial version) by @sgrebnov in #8964
Proper handling for initial snapshot by @krinart in #8911
fix: Remove --no-default-features from cargo-hack command in features workflow by @phillipleblanc in #8977
build(deps): bump actions/cache from 5.0.1 to 5.0.2 by @app/dependabot in #8983
build(deps): bump actions/checkout from 4 to 6 by @app/dependabot in #8982
build(deps): bump actions/setup-go from 6.1.0 to 6.2.0 by @app/dependabot in #8984
build(deps): bump github.com/olekukonko/tablewriter from 1.1.2 to 1.1.3 by @app/dependabot in #8979
build(deps): bump github.com/klauspost/compress from 1.18.2 to 1.18.3 by @app/dependabot in #8980
Add /v1/queries and Arrow Flight async APIs by @lukekim in #8946
build(deps): bump Vampire/setup-wsl from 5 to 6 by @app/dependabot in #8981
fix: Update Search integration test snapshots by @app/github-actions in #8973
build(deps): bump insta from 1.46.0 to 1.46.1 by @app/dependabot in #8988
build(deps): bump schemars from 1.1.0 to 1.2.0 by @app/dependabot in #8985
fix: Update benchmark snapshots by @app/github-actions in #8978
fix: Data correctness edge cases by @lukekim in #8953
Correct MinIO path syntax for spiced download (Part 2) by @Jeadie in #8995
Make .spice/data in search integration tests by @Jeadie in #8992
fix: Hash index composite keys null values by @lukekim in #9001
Update Cayenne status to Beta by @lukekim in #9002
fix: Disable TPC-DS result validation (not yet supported) by @sgrebnov in #9004
feat: Upgrade to DataFusion v51 and dependencies by @lukekim in #8864
Improvements for snapshots_creation_policy by @krinart in #9003
fix(ci): restore cached spicepod-validator binary instead of lookup-only by @phillipleblanc in #9007
Update version by @krinart in #9010
Update lock file - https://github.com/spiceai/spiceai/commit/53babbf07ca8c1c7b2e1da42ce58c465d9bc9276/
fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
fix: features check by @phillipleblanc in #9014
URL table support by @lukekim in #9018
ScyllaDB key filter by @lukekim in #8997
fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
Fix duplicate snapshot creation on startup by @sgrebnov in #9029
Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
Release notes for release release/1.11.0-rc.2 by @krinart in #9016
ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
Improve validation and logging for hash indexes by @lukekim in #9047
Pin to ubuntu-22.04 by @lukekim in #9068
Fix broken telemetry for testoperator by @krinart in #9054
Fix release builds by @lukekim in #9069
Spice 1.11.0-rc3 release notes by @krinart in #9070
Update spicepod.schema.json by @app/github-actions in #9071
Add missing protoc step to setup-cc action by @krinart in #9041
Fix TLS connection for grpc+tls:// Flight SQL endpoints and add custom CA certificate support by @phillipleblanc in #9073
Update 1.11.0-rc.3 release notes by @krinart in #9082
Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
Cayenne S3 Express One Zone improvements by @lukekim in #9015
Add zlib1g-dev to CI by @lukekim in #9052
Upgrade Vortex with CASE-WHEN by @lukekim in #9051
fix: Cayenne CatalogError handling for constraint violations by @lukekim in #9050
Fix Docker build failing to copy shared libraries due to ldd output parsing by @phillipleblanc in #9058
feat: Change /v1/sql and FlightSQL to use local execution in cluster mode by @phillipleblanc in #9055
Remove unmaintained dependencies by @lukekim in #9045
Enable cayenne + changes stream by @Jeadie in #9053
feat(cli): add spice query command for async queries REPL by @phillipleblanc in #9057
Remove unncessary allocations by @lukekim in #9059
Add dataset_acceleration_size_bytes metric by @krinart in #9062
Fix tracing of sql_query beneath tool_use::sample_data. by @Jeadie in #9043
Basic script to run distributed spice by @Jeadie in #9049
Add integration tests for Acceleration Snapshots by @krinart in #9067
Upgrade CUDA toolkit to 12.6.0 by @sgrebnov in #9079
Install required protoc dependency for CUDA build by @sgrebnov in #9080
feat(cluster): add executor control stream heartbeat by @phillipleblanc in #9072
feat: Fix async queries API and integrate Ballista shuffle improvements by @lukekim in #9075
Remove models variant (now default) & Windows builds (use WSL) by @lukekim in #9063
Cayenne: share upload semaphore across partitions to bound memory growth and optimize I/O by @sgrebnov in #9078
Snowflake data connector - add snowflake_private_key parameter by @ewgenius in #9085
Rewrite Go CLI in Rust by @phillipleblanc in #9061
GCS Data Connector (Alpha) by @lukekim in #9084
Skip 'latest' Docker tag for pre-release versions by @sgrebnov in #9077
Add HTTP endpoints for acceleration snapshots API by @phillipleblanc in #9065
Cayenne: Allow append mode with both primary_key and time_column by @sgrebnov in #9090
Add Cluster Observability (Metrics+Dashboard) by @phillipleblanc in #9066
proto for 'CayenneAccelerationExec' by @Jeadie in #9094
Add 'anthropic-beta' header for structured outputs by @Jeadie in #9093
Cayenne: refactor write path to use insert_into() as single entry point (part 1) by @sgrebnov in #9088
Fix testoperator dispatch by @sgrebnov in #9097
Fix setup-spiced GH action (_models suffix does not exist anymore) by @sgrebnov in #9102
build(deps): bump github/codeql-action from 4.31.10 to 4.31.11 by @app/dependabot in #9108
Remove DistributeFileScanOptimizer and UnionProjectionPushdownOptimizer & set target_partitions dynamically based on cluster capacity by @phillipleblanc in #9100
build(deps): bump zip from 2.4.2 to 6.0.0 by @app/dependabot in #9111
fix: Preserve query parameter order in HTTP connector to match filter values by @sgrebnov in #9114
Add PollNow interrupt for Ballista executors to reduce task scheduling latency by @phillipleblanc in #9098
Revert "GCS Data Connector (Alpha) " by @lukekim in #9084
Fix stack overflow for CDC batching by @krinart in #9115
fix: update Ballista fork to include executor timeout fix by @phillipleblanc in #9124
Properly propagate SIGINT/SIGTERM from CLI to runtime by @krinart in #9127
fix: Use the same vortex dependency as ballista by @peasee in #9123
release: Bump version to 1.11.0 for stable - https://github.com/spiceai/spiceai/commit/14d09f8e262008df69ded898ed3bebee08471508/
Cayenne snapshots with shared metadata by @lukekim in #9118
Improve error handling for URL tables with Azure URLs by @phillipleblanc in #9129
Add missing Windows build step for spice CLI in build_and_release workflow by @phillipleblanc in #9143
Fix install-dev to use debug build path for spice binary by @phillipleblanc in #9142
fix: CLI builds by @peasee in #9145
Always create initial snapshots (unless bootstrapped) + when no snapshots exist by @krinart in #9119
fix(cayenne): Fix upsert with pending deletions causing duplicate PKs by @sgrebnov in #9152
fix(flightrepl): Add chrono-tz feature to flightrepl for timezone formatting by @sgrebnov in #9153
fix(delta_lake): Preserve container name in ABFSS URLs for Azure Delta Lake tables by @sgrebnov in #9155
fix: Make CLI system and asset type detection more robust by @peasee in #9148
fix: Set query set properly on benchmarks telemetry metrics attributes by @peasee in #9162
fix: Download _models variant - https://github.com/spiceai/spiceai/commit/27f3058d0007595b02c198755c3b22319032ff30/
fix: Helm chart image tag - https://github.com/spiceai/spiceai/commit/7405c8df0db4ecce0ed6d4a4d424553d604b9036/
Revert "Remove models variant (now default) & Windows builds (use WSL) " by @lukekim in #9063
fix(cli): Several CLI fixes from the Go to Rust migration by @lukekim in #9157

Spice v1.11.0-rc.2 (Jan 22, 2026)

January 22, 2026 · 24 min read

Viktor Yershov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.2! ⭐

v1.11.0-rc.2 is the second release candidate for advanced test of v1.11. It brings Spice Cayenne to Beta status with acceleration snapshots support, a new ScyllaDB Data Connector, upgrades to DataFusion v51, Arrow 57.2, and iceberg-rust v0.8.0. It includes significant improvements to distributed query, caching, and observability.

What's New in v1.11.0-rc.2

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous stability improvements.

Improved Reliability:

Fixed timezone database issues in Docker images that caused acceleration panics
Resolved FuturesUnordered reentrant drop crashes
Fixed memory growth issues related to Vortex metrics allocation
Metadata catalog now properly respects cayenne_file_path location
Added warnings for unparseable configuration values

Example configuration with snapshots:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

SQL Pipe Operators: Support for |> syntax for inline transforms
DESCRIBE <query>: Returns the schema of any query without executing it
Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Spice has been upgraded to Apache Arrow Rust 57.2.0, bringing major performance improvements and new capabilities.

Key Features:

4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

V3 Metadata Support: Full support for Iceberg V3 table metadata format
INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
SQL Catalog Updates: Implement update_table and register_table for SQL catalog
S3 Tables Catalog: Implement update_table for S3 Tables catalog
Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Key Feature Improvements in v1.11:

Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically. List, create, and restore snapshots directly from the command line or via HTTP.

For more details, refer to the Acceleration Snapshots Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Distributed Query Improvements

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: Azure and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

Exponential backoff for scheduler disconnection recovery
Increased gRPC message size limit from 16MB to 100MB for large query plans
HTTP health endpoint for cluster executors
Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

DynamoDB Connector Enhancements

Added JSON nesting for DynamoDB Streams
Proper batch deletion handling

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

Single files: SELECT * FROM 's3://bucket/data.parquet'
Directories/prefixes: SELECT * FROM 's3://bucket/data/'
Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

Automatic file format detection (Parquet, CSV, JSON, etc.)
Hive-style partition inference with filter pushdown
Schema inference from files
Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

/v1/queries endpoint: Submit queries and retrieve results asynchronously
Arrow Flight async support: Non-blocking query execution via Arrow Flight protocol

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

Snapshot monitoring widgets
Improved accelerated datasets section
Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Additional Improvements

Model Listing: New functionality to list available models across multiple AI providers
DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use ScyllaDB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0-rc.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:v1.11.0-rc.2 image:

docker pull spiceai/spiceai:v1.11.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

DataFusion: Upgraded to v51 (release notes)
Arrow: Upgraded to v57 (release notes)
iceberg-rust: Upgraded to v0.8.0 (release notes)

Changelog

Add timezone database to Docker image to fix Cayenne acceleration panic by @sgrebnov in #8799
Upgrade dependencies by @phillipleblanc in #8801
Fix table_allowlist for table sampling and NSQL by @Jeadie in #8789
Cayenne primary key on-conflict handling by @lukekim in #8788
fix: Update benchmark snapshots by @app/github-actions in #8773
fix: correctly identify deprecated openai_* parameters by @phillipleblanc in #8809
fix: Update benchmark snapshots by @app/github-actions in #8812
Use workspace version for cayenne crate by @phillipleblanc in #8811
Don't CAST strings which breaks push down optimizer by @lukekim in #8810
fix: Update benchmark snapshots by @app/github-actions in #8815
Update async-openai to latest revision 4dcd633aad6f - brings fix for openai compatible model providers by @ewgenius in #8816
Add auth/iam_role_source to DynamoDB connector by @krinart in #8808
DynamoDB fixes: JSON nesting for Streams, proper batch deletions by @krinart in #8821
Rough roadmap for 2026-2027 by @lukekim in #8805
Release notes for v1.11.0-rc1 by @ewgenius in #8786
Make S3V integration tests prepare_for_aws_tests more robust by @sgrebnov in #8820
Bump rsa from 0.9.9 to 0.9.10 in the cargo group across 1 directory by @app/dependabot in #8819
Add timezone database to Release and CUDA Docker images to fix Cayene panic by @sgrebnov in #8832
fix: UnionProjectionPushdownOptimizer - Schema change during transform_down breaks parent nodes by @phillipleblanc in #8831
Update grafana/datadog example dashboards by @krinart in #8833
Add Dev bird bench as text-to-sql queryset in CI. by @Jeadie in #8753
Update testoperator scheduler to use release/1.11 branch by @ewgenius in #8829
Spice Cayenne fixes and test spicepods for Beta & RC by @lukekim in #8787
testoperator dispatch all bird-bench database variants by @Jeadie in #8835
feat: Improve column statistics handling with safe access and defaults by @phillipleblanc in #8836
cluster: mTLS verification by @phillipleblanc in #8837
fix: 8770: Unsupported ScalarFunctionExpr in ORDER BY by @lukekim in #8838
Workflow tweaks by @lukekim in #8845
Cayenne: metadata catalog should respect cayenne_file_path location by @sgrebnov in #8844
Expand Cayenne feature coverage by @lukekim in #8848
docs: HA distributed query decisions by @phillipleblanc in #8817
fix(optimizer): Fix correctness issues in UnionProjectionPushdownOptimizer by @phillipleblanc in #8851
Pin reqwest to 0.12.24 to fix HuggingFace embedding model download by @ewgenius in #8853
Fix builds and pin to Ubuntu 22.04 by @lukekim in #8856
Revert "Fix builds and pin to Ubuntu 22.04" by @lukekim in #8861
Ensure setup Rust is run by @lukekim in #8862
fix: Ubuntu 24.04+ renamed libaio1 to libaio1t64 by @lukekim in #8865
Upgrade to Pulls with Spice v2 by @lukekim in #8866
Add limit and configuration name to 'testoperator run text-to-sql' by @Jeadie in #8839
PR check and test optimization by @lukekim in #8868
Upgrade S3 Vectors SDK and improve test robustness by @lukekim in #8867
[Testoperator] Query level and improved aggregate level for NSQL by @Jeadie in #8840
Add docker build for private branches for ghcr.io/spiceai/spiceai-dev by @phillipleblanc in #8873
Expand the data acceleration round-trip test coverage by @lukekim in #8855
fix: Provide a better error for improper bucket UDF arguments by @peasee in #8849
ScyllaDB Data Connector by @lukekim in #8827
Use tokio-rusqlite for Spice Cayenne SQLite by @lukekim in #8857
Cayenne: fix FuturesUnordered reentrant drop crash by @sgrebnov in #8863
Bump github/codeql-action from 4.31.9 to 4.31.10 by @app/dependabot in #8884
Bump golang.org/x/sys from 0.39.0 to 0.40.0 by @app/dependabot in #8881
Bump github.com/spiceai/gospice/v8 from 8.0.0 to 8.0.1 by @app/dependabot in #8883
Bump roaring from 0.11.2 to 0.11.3 by @app/dependabot in #8885
Bump golang.org/x/mod from 0.31.0 to 0.32.0 by @app/dependabot in #8882
Bump aws-sdk-s3 from 1.115.0 to 1.119.0 by @app/dependabot in #8887
Bump libc from 0.2.177 to 0.2.180 by @app/dependabot in #8886
Bump tokio-util from 0.7.17 to 0.7.18 by @app/dependabot in #8889
Bump governor from 0.10.2 to 0.10.4 by @app/dependabot in #8888
fix: flaky test test_concurrent_partition_creation by @phillipleblanc in #8898
Update Cayenne snapshots for TPC-DS by @lukekim in #8890
Add more buckets to histogram metrics by @krinart in #8850
feat: Add HTTP health endpoint for cluster executors by @phillipleblanc in #8899
feat: Implement model listing functionality for multiple providers by @lukekim in #8901
feat: Initial HA schedulers distributed query implementation by @phillipleblanc in #8852
fix: infer executor role from --scheduler-address when --role is omitted by @phillipleblanc in #8903
Improve install scripts and remove jq dependency by @lukekim in #8847
Benchmarks: sort PartitionedUnionExec children for deterministic snapshot comparison by @sgrebnov in #8877
Cayenne: share VortexFileCache across partitions via CayenneContext by @sgrebnov in #8880
Update ballista to add exponential backoff for scheduler disconnection by @phillipleblanc in #8905
Configurably add BirdBench evidence to testoperator text-to-SQL. by @Jeadie in #8904
Helm: Allow command override via values.yaml by @sgrebnov in #8906
Fix distributed query gRPC message size limit (16MB -> 100MB) by @phillipleblanc in #8900
OS specific setup actions by @lukekim in #8909
Cayenne should warn if unable to parse configuration value by @sgrebnov in #8907
Add snapshots widgets to example dashboard by @krinart in #8910
Add quality criteria for the features by @krinart in #8897
Improve Accelerated Datasets section for Grafana/Datadog dashboards by @krinart in #8915
Use HTTP traceparent in NSQL to support concurrency in 'testoperator run text-to-SQL' by @Jeadie in #8912
Remove setup for cc from integration_models.yml by @Jeadie in #8917
Propagate Azure and GCS credentials to executors in cluster mode by @phillipleblanc in #8918
Cayenne: fix memory growth due to vortex metrics allocation by @sgrebnov in #8908
fix(caching): Deduplicate refresh requests for JSON array responses by @sgrebnov in #8921
fix(caching): Return cached data directly for unfiltered queries (SELECT *) by @sgrebnov in #8919
Correct MinIO path syntax for spiced download by @Jeadie in #8916
Acceleration snapshots compaction + Improved Snapshots UX by @krinart in #8858
Change base image from bookworm-slim to trixie-slim by @Jeadie in #8923
Add testoperator run text-to-sql metrics from LogicalPlan by @Jeadie in #8895
Fix spicepod dependencies in testoperator by @Jeadie in #8875
Update copilot instructions for data correctness by @lukekim in #8922
Add BootstrapStatus + Snapshot bootstrapping parallelization by @krinart in #8926
fix: add missing feature-gate for AWS Secrets Manager error variant by @phillipleblanc in #8928
refactor: make ConnectorParams fields public for external connectors by @phillipleblanc in #8929
fix(caching): SWR refreshes only accessed entry instead of all stale rows by @sgrebnov in #8931
Cayenne: include cayenne_metadata_dir to known params by @sgrebnov in #8933
Rename Ingestion Lag chart in example dashboards by @krinart in #8932
fix(caching): Fix HTTP caching always MISS when projection excludes fetched_at by @sgrebnov in #8930
Reset expiry after snapshot bootstraping for Caching by @krinart in #8925
Set use_ssl=false for sccache by @lukekim in #8945
Hash indexing for Arrow Acceleration by @lukekim in #8924
[Cayenne] Acceleration snapshots support by @lukekim in #7973
perf(caching): Non-blocking cache writes on cache miss by @sgrebnov in #8948
Update NSQL models by @lukekim in #8951
Hash Index Key verification by @lukekim in #8949
Add snapshots_creation_policy param by @krinart in #8954
Remove candle & cudarc from non-models build by @lukekim in #8955
Acceleration Snapshots API and CLI by @lukekim in #8934
Ignore test for data_components arrow::indexed::test_primary_key_value_matches_batch by @Jeadie in #8962
fix: Update benchmark snapshots by @app/github-actions in #8965
Hash Index secondary index support by @lukekim in #8958
fix: Support primary key constraints in partitioned DuckDB tables mode by @sgrebnov in #8966
perf(caching): Batch cache writes by @sgrebnov in #8959
CI perf optimizations by @lukekim in #8968
Fix Makefile linting by @Jeadie in #8970
Fixes in testoperator run text-to-sql. by @Jeadie in #8927
implement Chat::as_sql for xAI anthropic by @Jeadie in #8957
Fix duckdb_file_path in search integration test by @Jeadie in #8972
fix: Update benchmark snapshots by @app/github-actions in #8971
Maintenance updates to Anthropic API by @Jeadie in #8956
Add CacheBackend Trait, implement pingora-lru, and add throughput tests by @lukekim in #8080
fix: Update benchmark snapshots by @app/github-actions in #8974
fix: Update benchmark snapshots by @app/github-actions in #8975
Make accelerator shutdown more robust by @lukekim in #8969
feat(duckdb): Add on_refresh_sort_columns for post-write data ordering (initial version) by @sgrebnov in #8964
Proper handling for initial snapshot by @krinart in #8911
fix: Remove --no-default-features from cargo-hack command in features workflow by @phillipleblanc in #8977
build(deps): bump actions/cache from 5.0.1 to 5.0.2 by @app/dependabot in #8983
build(deps): bump actions/checkout from 4 to 6 by @app/dependabot in #8982
build(deps): bump actions/setup-go from 6.1.0 to 6.2.0 by @app/dependabot in #8984
build(deps): bump github.com/olekukonko/tablewriter from 1.1.2 to 1.1.3 by @app/dependabot in #8979
build(deps): bump github.com/klauspost/compress from 1.18.2 to 1.18.3 by @app/dependabot in #8980
Add /v1/queries and Arrow Flight async APIs by @lukekim in #8946
build(deps): bump Vampire/setup-wsl from 5 to 6 by @app/dependabot in #8981
fix: Update Search integration test snapshots by @app/github-actions in #8973
build(deps): bump insta from 1.46.0 to 1.46.1 by @app/dependabot in #8988
build(deps): bump schemars from 1.1.0 to 1.2.0 by @app/dependabot in #8985
fix: Update benchmark snapshots by @app/github-actions in #8978
fix: Data correctness edge cases by @lukekim in #8953
Correct MinIO path syntax for spiced download (Part 2) by @Jeadie in #8995
Make .spice/data in search integration tests by @Jeadie in #8992
fix: Hash index composite keys null values by @lukekim in #9001
Update Cayenne status to Beta by @lukekim in #9002
fix: Disable TPC-DS result validation (not yet supported) by @sgrebnov in #9004
feat: Upgrade to DataFusion v51 and dependencies by @lukekim in #8864
Improvements for snapshots_creation_policy by @krinart in #9003
fix(ci): restore cached spicepod-validator binary instead of lookup-only by @phillipleblanc in #9007
Update version by @krinart in #9010

Spice v1.11.0-rc.1 (Jan 6, 2026)

January 7, 2026 · 17 min read

Evgenii Khramkov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.1! ⭐

v1.11.0-rc.1 is the first release candidate for early testing of v1.11 features including Distributed Query with mTLS for enterprise-grade secure cluster communication, new SMB and NFS Data Connectors for direct network-attached storage access, Prepared Statements for improved query performance and security, Cayenne Accelerator Enhancements with Key-based deletion vectors and Amazon S3 Express One Zone support, Google LLM Support for expanded AI inference capabilities, and Spice Java SDK v0.5.0 with parameterized query support.

What's New in v1.11.0-rc.1

Distributed Query with mTLS

Enterprise-Grade Secure Cluster Communication: Distributed query cluster mode now enables mutual TLS (mTLS) by default for secure communication between schedulers and executors. Internal cluster communication includes highly privileged RPC calls like fetching Spicepod configuration and expanding secrets. mTLS ensures only authenticated nodes can join the cluster and access sensitive data.

Key Features:

Mutual TLS Authentication: All executor-to-scheduler and executor-to-executor gRPC connections on the internal cluster port (50052) are secured with mTLS, securing communication, and preventing unauthorized nodes from joining the cluster
Certificate Management CLI: New developer spice cluster tls init and spice cluster tls add commands for generating CA certificates and node certificates with proper SANs (Subject Alternative Names)
Simplified CLI Arguments: Renamed cluster arguments for clarity (--role, --scheduler-address, --node-mtls-*) with --scheduler-address implying --role executor
Port Separation: Public services (Flight queries, HTTP API, Prometheus metrics) remain on ports 50051, 8090, and 9090 respectively, while internal cluster services (SchedulerGrpcServer, ClusterService) are isolated on port 50052 with mTLS enforced
Development Mode: Use --allow-insecure-connections flag to disable mTLS requirement for local development and testing

Quick Start:

# Generate certificates for development
spice cluster tls init
spice cluster tls add scheduler1
spice cluster tls add executor1

# Start scheduler
spiced --role scheduler \
  --node-mtls-ca-certificate-file ca.crt \
  --node-mtls-certificate-file scheduler1.crt \
  --node-mtls-key-file scheduler1.key

# Start executor
spiced --role executor \
  --scheduler-address https://scheduler1:50052 \
  --node-mtls-ca-certificate-file ca.crt \
  --node-mtls-certificate-file executor1.crt \
  --node-mtls-key-file executor1.key

For more details, refer to the Distributed Query Documentation.

SMB and NFS Data Connectors

Key Features:

SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

Prepared Statements

Key Features:

Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol

SDK Support:

SDK	Support	Min Version	Method
gospice (Go)	✅ Full	v8.0.0+	`SqlWithParams()` with typed constructors (`Int32Param`, `StringParam`, `TimestampParam`, etc.)
spice-rs (Rust)	✅ Full	v3.0.0+	`query_with_params()` with `RecordBatch` parameters
spice-dotnet (.NET)	❌ Not yet	-	Coming soon
spice-java (Java)	✅ Full	v0.5.0+	`queryWithParams()` with typed `Param` constructors (`Param.int64()`, `Param.string()`, etc.)
spice.js (JavaScript)	❌ Not yet	-	Coming soon
spicepy (Python)	❌ Not yet	-	Coming soon

Example (Go):

import "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Cayenne Accelerator Enhancements

The Spice Cayenne data accelerator has been improved with several key enhancements:

KeyBased Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. KeyBased deletion vectors are more memory-efficient than positional vectors for sparse deletions.
S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Example spicepod.yaml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: fast_data
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      params:
        # Use S3 Express One Zone for data files
        cayenne_s3express_bucket: my-express-bucket--usw2-az1--x-s3

For more details, refer to the Cayenne Documentation.

Google LLM Support

Key Features:

Google Chat Models: Access Google's Gemini models for chat completions
Google Embeddings: Generate embeddings using Google's text embedding models
Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

Spice Java SDK v0.5.0

Key Features:

SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
  <groupId>ai.spice</groupId>
  <artifactId>spiceai</artifactId>
  <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

OpenTelemetry Improvements

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.

Developer Experience Improvements

Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
Improved Spicepod Schema: Enhanced JSON schema generation for better IDE support and validation
Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
Reliability: Initialize secrets before object stores in cluster executor mode
Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

Distributed Query Cluster Mode Requires mTLS

Migration Steps:

Generate certificates using spice cluster tls init and spice cluster tls add
Update scheduler and executor startup commands with --node-mtls-* arguments
For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

Old Name	New Name
`--cluster-mode`	`--role`
`--cluster-ca-certificate-file`	`--node-mtls-ca-certificate-file`
`--cluster-certificate-file`	`--node-mtls-certificate-file`
`--cluster-key-file`	`--node-mtls-key-file`
`--cluster-address`	`--node-bind-address`
`--cluster-advertise-address`	`--node-advertise-address`
`--cluster-scheduler-url`	`--scheduler-address`

Removed CLI Arguments:

--cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To try v1.11.0-rc.1, use one of the following methods:

CLI:

spice upgrade --version 1.11.0-rc.1

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0-rc.1 image:

docker pull spiceai/spiceai:1.11.0-rc.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0-rc.1

AWS Marketplace:

🎉 Spice is available in the AWS Marketplace!

What's Changed

Changelog

OTel exporter for push metrics by @lukekim in #8442
fix: Update benchmark snapshots by @app/github-actions in #8448
Add TPCH append tests to scheduled dispatch workflow by @sgrebnov in #8451
Add snapshot creation logging by @krinart in #8469
Fix PeriodicReader panic by @krinart in #8471
Benchmarks: increase readiness timeout for turso acceleration (TPC-H) by @sgrebnov in #8470
fix: Pin CUDA build actions to commits by @peasee in #8477
Add Criterion benchmarking to chunking crate. by @Jeadie in #8431
DuckDB agg pushdown: gate behind accelerator parameter by @mach-kernel in #8474
Rename aggregate_pushdown_optimization -> optimizer_duckdb_aggregate_pushdown by @ewgenius in #8485
Handle throttling exception for DynamoDB streams by @phillipleblanc in #8492
docs: Add release notes by @peasee in #8478
Update spicepod.schema.json by @app/github-actions in #8496
Move 'test_projection_pushdown' to runtime-datafusion by @Jeadie in #8490
Fix OTEL metrics HTTP exporter client setup by @phillipleblanc in #8489
Update endgame to include new caching accelerator cookbook by @phillipleblanc in #8487
DynamoDB tests and fixes by @lukekim in #8491
Align make lint-rust-fix with make lint-rust by @Jeadie in #8499
fix: Remove unused Cayenne parameters by @peasee in #8500
Force task history captured_plan outputs to be captured even if they would be filtered out otherwise by @phillipleblanc in #8501
release: post-release updates by @peasee in #8503
CI: Fix E2E models dispatch by @mach-kernel in #8505
Use an isolated Tokio runtime for refresh tasks that is separate from the main query API by @phillipleblanc in #8504
Update openapi.json by @app/github-actions in #8512
Update dependencies by @phillipleblanc in #8513
fix: Avoid double hashing cache key by @peasee in #8511
fix: Eagerly drop cached records for results larger than max by @peasee in #8516
Revert "fix: Move enforce-pulls to hosted runner (#8686)" by @phillipleblanc in #8709
Initial 'testoperator run text-to-sql' by @Jeadie in #8618
Add support for abfss by @krinart in #8706
Add testoperator TPCH dispatch for ABFS with hierarchical namespace disabled + versioning enabled by @phillipleblanc in #8711
Update openapi.json by @app/github-actions in #8692
cluster: validate --role argument by @phillipleblanc in #8717
Upgrade to Turso v0.3.2 by @lukekim in #8716
Rename --insecure to --allow-insecure-connections to be consistent with existing naming by @lukekim in #8720
Remove 'testoperator run http-consistency/http-overhead' by @Jeadie in #8708
refactor: Remove cluster feature flag by @phillipleblanc in #8718
Docs: Distributed query ADR by @mach-kernel in #8608
Use model.datasets to allowlist on tools by @Jeadie in #8714
cluster: quality of life improvements to starting cluster mode locally by @phillipleblanc in #8719
Docs: Ballista extension ADR by @mach-kernel in #8616
Improve deprecation messages when going from prefixed -> non-prefixed. by @Jeadie in #8724
Remove tools from auto-defaults by @Jeadie in #8725
Make distinct providers for vector spilling, vector partitioning. by @Jeadie in #8546
cluster: default scheduler address port by @phillipleblanc in #8728
Add Makefile targets for testoperator by @Jeadie in #8729
text-to-sql dispatch in testoperator by @Jeadie in #8705
DR-006: High Availability Distributed Query with Stateless Schedulers by @lukekim in #8721
DR-007: mTLS for Distributed Query Cluster Communication by @lukekim in #8722
SMB and NFS improvements by @lukekim in #8710
fix: Cluster executors use scheduler's temp_directory for shuffle files by @phillipleblanc in #8733
use 'max_message_size' in flight service too by @Jeadie in #8730
Add page-level retry for transient GraphQL errors with backoff and increase GitHub rate limit buffer up to 100 by @ewgenius in #8726
Make testoperator Dockerfile; CI to build docker image to ghcr.io. by @Jeadie in #8732
cluster: UnionProjectionPushdownOptimizer: Add projection pushdown diagnostics for union children by @phillipleblanc in #8734
Fix column projection order mismatch with location metadata columns by @phillipleblanc in #8738
Fixes for testoperator. by @Jeadie in #8737
Improve Cayenne Deletion Vectors with KeyBased support by @lukekim in #8713
Fix testoperator_dispatch.yaml by @Jeadie in #8740
Add spice cloud CLI commands by @lukekim in #8528
Add FTP, NFS, & SMB TPCH SF1 spicepods by @lukekim in #8739
Prepared Statements by @lukekim in #7588
Schedule dispatch of testoperator run text-to-sql. by @Jeadie in #8745
Fix minio for ai benchmark CI by @Jeadie in #8743
Upgrade to Rust 1.91 by @phillipleblanc in #8749
fix: Update benchmark snapshots by @app/github-actions in #8763
Benchmarks: make row count validation skip logic configurable by scale factor, query set, and overrides by @sgrebnov in #8756
Make benchmark tests more robust by @sgrebnov in #8766
Add parameter to force using iam_role for DynamoDB by @krinart in #8767
fix: Update Search integration test snapshots by @app/github-actions in #8735
v1.10.4 release notes by @phillipleblanc in #8790
Trace metrics export errors by @sgrebnov in #8791
fix: correctly identify deprecated openai_* parameters by @phillipleblanc in #8809
Don't CAST strings which breaks push down optimizer by @lukekim in #8810
Add timezone database to Docker image to fix Cayenne acceleration panic by @sgrebnov in #8799
Update async-openai to latest revision 4dcd633aad6f - brings fix for openai compatible model providers by @ewgenius in #8816
Add auth/iam_role_source to DynamoDB connector by @krinart in #8808
DynamoDB fixes: JSON nesting for Streams, proper batch deletions by @krinart in #8821

Spice v1.10.0 (Dec 9, 2025)

December 9, 2025 · 18 min read

William Croxson

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.10.0! ⚡

Spice v1.10.0 introduces a new Caching Acceleration Mode with stale-while-revalidate (SWR) semantics for disk-persisted, low-latency queries with background refresh. This release also adds the TinyLFU eviction policy for the SQL results cache, a preview of the DynamoDB Streams connector for real-time CDC, S3 location predicate pruning for faster partitioned queries, improved distributed query execution, and multiple security hardening improvements.

What's New in v1.10.0

Caching Acceleration Mode

Low-Latency Queries with Background Refresh: This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern. Queries return cached results immediately while data refreshes asynchronously in the background, eliminating query latency spikes during refresh cycles. Cached data persists to disk using DuckDB, SQLite, or Cayenne file modes.

Key Features:

Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background, reducing query latency
Disk Persistence: Cached results persist across restarts using DuckDB, SQLite, or Cayenne file modes
Configurable Refresh: Control refresh intervals with refresh_check_interval to balance freshness and source load

Recommendation: Use retention configuration with caching acceleration to ensure stale data is cleaned up over time.

Example spicepod.yaml configuration:

datasets:
  - from: http://localhost:7400
    name: cached_data
    time_column: fetched_at
    acceleration:
      enabled: true
      engine: duckdb
      mode: file # Persist cache to disk
      refresh_mode: caching
      refresh_check_interval: 10m
      retention_check_enabled: true
      retention_period: 24h
      retention_check_interval: 1h

For more details, refer to the Data Acceleration Documentation.

TinyLFU Cache Eviction Policy

Higher Cache Hit Rates for SQL Results Cache: A new TinyLFU cache eviction policy is now available for the SQL results cache. TinyLFU is a probabilistic cache admission policy that maintains higher hit rates than LRU while keeping memory usage predictable, making it ideal for workloads with varying query frequency patterns.

Example spicepod.yaml configuration:

runtime:
  caching:
    sql_results:
      enabled: true
      eviction_policy: tiny_lfu # default: lru

For more details, refer to the Caching Documentation and the Moka TinyLFU Documentation for details of the algorithm.

DynamoDB Streams Data Connector (Preview)

Real-Time Change Data Capture for DynamoDB: The DynamoDB connector now integrates with DynamoDB Streams for real-time change data capture (CDC). This enables continuous synchronization of DynamoDB table changes into Spice for real-time query, search, and LLM-inference.

Key Features:

Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables as they occur
Table Bootstrapping: Performs an initial full table scan before streaming changes, ensuring complete data consistency
Acceleration Integration: Works with refresh_mode: changes to incrementally update accelerated datasets

Note: DynamoDB Streams must be enabled on your DynamoDB table. This feature is in preview.

Example spicepod.yaml configuration:

datasets:
  - from: dynamodb:my_table
    name: orders_stream
    acceleration:
      enabled: true
      refresh_mode: changes # Enable Streams capture

For more details, refer to the DynamoDB Connector Documentation.

OpenTelemetry Metrics Exporter

Spice can now push metrics to an OpenTelemetry collector, enabling integration with platforms such as Jaeger, New Relic, Honeycomb, and other OpenTelemetry-compatible backends.

Key Features:

Protocol Support: Supports the gRPC (default port 4317) protocol
Configurable Push Interval: Control how frequently metrics are pushed to the collector

Example spicepod.yaml configuration for gRPC:

runtime:
  telemetry:
    enabled: true
    otel_exporter:
      endpoint: 'localhost:4317'
      push_interval: '30s'

For more details, refer to the Observability & Monitoring Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down location filter predicates to S3 listing operations. For partitioned datasets (e.g., year=2025/month=12/), Spice now skips listing irrelevant partitions entirely, significantly reducing query latency and S3 API costs.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling direct integration with AWS's managed table format for S3. Use standard SQL INSERT INTO to write data.

For more details, refer to the S3 Data Connector Documentation and Glue Data Connector Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

Fixed executor registration in cluster mode for more reliable distributed deployments
Improved hostname resolution for Flight server binding, enabling better executor discovery
Distributed accelerator registration: Data accelerators now properly register in distributed mode
Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

SQL Identifier Quoting: Hardened SQL identifier quoting across all database connectors (PostgreSQL, MySQL, DuckDB, etc.) to prevent SQL injection attacks through table or column names
Token Redaction: Sensitive authentication tokens are now fully redacted in debug and error output, preventing accidental credential exposure in logs
Path Traversal Prevention: Fixed tar extraction operations to prevent directory traversal vulnerabilities when processing archived files
Input Sanitization: Added strict validation for top_n_sample order_by clause parsing to prevent injection attacks
Glue Credential Handling: Prevented automatic loading of AWS credentials from environment in Glue connector, ensuring explicit credential configuration

Developer Experience Improvements

Health probe metrics: Added health probe latency metrics for better observability
CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 82 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0 image:

docker pull spiceai/spiceai:1.10.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Test-operator: Add tpcds_q8 to the default row-count validation skip list by @sgrebnov in #8185
fix: Remove unwrap_used from test by @peasee in #8212
Run glue_iceberg_integration_test_catalog as part of main integration tests by @sgrebnov in #8222
Add TPCH sf100 testoperator spicepods with dispatch by @Jeadie in #8192
Build with CPU native flags by @lukekim in #8224
fix: Apply assertion clippy in CI/Makefile only by @peasee in #8229
feat: Support running queries only in testoperator by @peasee in #8211
DuckDB query planning: aggregate pushdown by @mach-kernel in #8174
install.sh improvements by @lukekim in #8252
Fix .clear history by @lukekim in #8254
Improve the output of dataset loading by @lukekim in #8256
Refactor view validation by @lukekim in #8258
Upgrade AWS crates by @lukekim in #8259
fix: Pushdown dynamic filters to partition scans by @peasee in #8240
Harden SQL identifier quoting in connectors by @phillipleblanc in #8276
Cayenne sort_columns on insert by @lukekim in #8091
Redact token debug output by @phillipleblanc in #8280
fix: Cayenne configuration options by @lukekim in #8281
Prevent path traversal in untar by @phillipleblanc in #8284
Fix cluster mode executor registration by @mach-kernel in #8292
Unignore s3_vectors_kafka_stream test by @Jeadie in #8289
Post-release house keeping by @krinart in #8293
Improve generate_changelog script by @krinart in #8273
Acceleration mode caching by @lukekim in #8237
Sanitization and security checks by @lukekim in #7854
Add health probe latency metric by @phillipleblanc in #8300
Add distributed registration for data accelerators by @phillipleblanc in #8299
Pass IndexedTableProvider down in 'changes_stream' and 'append_stream' by @Jeadie in #8295
Add dynamodb-streams crate by @krinart in #8283
Distributed query: resolve executor hostname when determining Flight server binding by @mach-kernel in #8304
Return computed embeddings from index for partitioned S3Vectors by @Jeadie in #8306
[DDB Streams] Skeleton for DynamoDB Streams by @krinart in #8296
DistributeFileScanOptimizer: Improve planning performance by @mach-kernel in #8305
feat: Add an ExactLeftAccumulator implementation by @peasee in #8302
deps: Upgrade Vortex to 0.56 by @peasee in #8311
DynamoDB table bootstrapping + streaming by @krinart in #8312
Avoid calling S3Vector list_vectors (or equivalent) when indexing into VectorIndexs by @Jeadie in #8282
Add on_conflict testing support to append benchmark by @sgrebnov in #8314
docker: Add valid home directory to fix duckdb extension loading issue by @phillipleblanc in #8318
Add GH Workflow to run Append benchmark test by @sgrebnov in #8321
Exclude MySQL SF100 from test-operator dispatch by @sgrebnov in #8320
feat: Update clippy lints by @peasee in #8317
Add S3 location predicate pruning to listing connector by @phillipleblanc in #8319
Review feedback for caching mode accelerator by @phillipleblanc in #8326
Also include Dockerfile home changes for release build by @phillipleblanc in #8327
Change communication channel from Discord to Slack by @Jeadie in #8330
Replace Discord link with Slack link in README by @Jeadie in #8331
fix(glue): Prevent OpenDAL from automatic loading of AWS credentials from environment by @sgrebnov in #8337
Block on index read for FTS queries by @Jeadie in #8339
Fix search query provider by @Jeadie in #8343
Support for writing into AWS S3 Tables by @sgrebnov in #8344
Acceleration file_create mode by @lukekim in #8347
Don't block on lock in FTS query path by @Jeadie in #8348
feat: Add an optimizer rule to replace join accumulator for Cayenne by @peasee in #8316
S3 Vectors limit updates by @lukekim in #8352
Sanitize top_n_sample order_by parsing by @phillipleblanc in #8356
Add distributed registration for data connectors by @phillipleblanc in #8354
Improve IndexerExec to properly handle limit pushdown by @sgrebnov in #8366
Fix Cayenne partition_by metadata flaky integration test by @phillipleblanc in #8367
Rework caching accelerator to use the stale-while-revalidate pattern. by @phillipleblanc in #8365
Add TinyLFU caching policy by @lukekim in #8370
Make Arrow acceleration on_conflict verification more robust by @sgrebnov in #8375
Add additional test for verify_on_conflict_matches_primary_key (Arrow acceleration) by @sgrebnov in #8376
Add v1.10.0-rc1 release notes by @mach-kernel in #8373
docs: Remove DuckDB agg pushdown from release notes by @peasee in #8383
Testoperator dispatch: add Append support and test configurations by @sgrebnov in #8360
fix: Increase TPCDS DuckDB connection pool size by @peasee in #8386
fix: Update benchmark snapshots by @app/github-actions in #8385
fix: Update benchmark snapshots by @app/github-actions in #8389
Fix Windows build by @phillipleblanc in #8391
DuckDB aggregate pushdown: fix partitioning and schema rewrite bugs by @mach-kernel in #8397
Delta table: Store current snapshot ref with table instance by @mach-kernel in #8358
GetAppDefinition: Check if executor is part of cluster by @mach-kernel in #8396
1.10.0-rc1 housekeeping by @mach-kernel in #8394
Change debug log to warning for vector engine config by @Jeadie in #8378
Clarify /v1/nsql datasets sampling hint by @phillipleblanc in #8395
Use bmi1 target feature for x86_64 by @phillipleblanc in #8401
benchmarks: Default to update snapshots when run on a non-release branch by @phillipleblanc in #8402
Update threat model for v1.9.2 by @phillipleblanc in #8400
Fix iceberg tables metadata - assign ids to all fields, including nested by @ewgenius in #8351
Fix databricks_spark_connect_m2m_integration_test_catalog snapshot by @ewgenius in #8403
Move all GitHub Actions workflows to use exact commit sha by @phillipleblanc in #8409
Upgrade datafusion-tableproviders (df v50) by @lukekim in #8261
Batching for CDC by @krinart in #8359
fix: make DuckDB attachments logic more robust by @sgrebnov in #8411
Persistent checkpoints for DynamoDB Streams by @krinart in #8345
Distributed query: Support AsyncFuncExec and Spice UDFs in Ballista by @mach-kernel in #8414
Pin GitHub Actions to fix Testoperator build action by @sgrebnov in #8416
Watermarks support for DynamoDB by @krinart in #8417
Fix typo in .vscode/launch.json by @sgrebnov in #8415
DuckSqlExec: Update equivalence properties when rewriting schema by @phillipleblanc in #8420
Validate that the commit for datafusion-table-providers exists on the spiceai branch by @phillipleblanc in #8421
Append Tests: add support for retention testing by @sgrebnov in #8419
New crate google-genai by @Jeadie in #8390
federation: Improve error message and add debug logging for cast failures by @phillipleblanc in #8422
DynamoDB Streams Error Handling by @krinart in #8418
Append tests: add support for with_retention_data to dispatch by @sgrebnov in #8430
Append tests: add support for test metrics reporting by @sgrebnov in #8432
test-operator: fix metrics reporting by @sgrebnov in #8435
Follow-up improvements and bug fixes by @krinart in #8433
Periodic snapshots for append/changes streams by @krinart in #8407
Add support for caching_stale_if_error to caching accelerator; fix multiple upstream requests during SWR; fix Arrow accelerator by @phillipleblanc in #8425
Disable dataset health monitor for dynamic HTTP connector by @phillipleblanc in #8441
Metrics + snapshots_trigger_threshold for DynamoDB Streams by @krinart in #8437
Clear the in-flight revalidations cache after a revalidation has completed. by @phillipleblanc in #8443
dont build 'spicepod-validator' on 'make install' by @Jeadie in #8426
fix: Update benchmark snapshots by @app/github-actions in #8436
fix: Disable Cayenne HashJoin rewriter optimizer by @peasee in #8439
Testoperator: add duckdb-partitioned query override by @sgrebnov in #8446
Add a check to validate that results cache SWR and caching accelerator SWR are not both set. by @phillipleblanc in #8445
OTel exporter for push metrics by @lukekim in #8442
fix: Update benchmark snapshots by @app/github-actions in #8448
Add snapshot creation logging by @krinart in #8469
Fix PeriodicReader panic by @krinart in #8471
fix: Pin CUDA build actions to commits by @peasee in #8477
DuckDB agg pushdown: gate behind accelerator parameter by @mach-kernel in #8474
Rename aggregate_pushdown_optimization -> optimizer_duckdb_aggregate_pushdown by @ewgenius in #8485
Handle throttling exception for DynamoDB streams by @phillipleblanc in #8492

Spice v1.10.0-rc.1 (Dec 2, 2025)

December 3, 2025 · 11 min read

David Stancu

Principal Software Engineer at Spice AI

Announcing the release of Spice v1.10.0-rc.1! ⚡

v1.10.0-rc1 is a release candidate for early testing of v1.10 features including an all new caching acceleration mode, tiny_lfu caching policy, a new DynamoDB Streams connector (Preview), improvements to the DynamoDB connector, faster distributed query execution, S3 connector improvements, and security hardening for v1.10.0-stable.

What's New in v1.10.0-rc1

Caching Acceleration Mode with SWR and TinyLFU

This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern using Data Accelerators such as DuckDB or Cayenne, enabling queries to return file-persisted cached results immediately while asynchronously refreshing data in the background. Combined with the new TinyLFU cache eviction policy, Spice can now maintain higher cache hit rates while keeping memory usage predictable.

Key Features:

Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background
Data Accelerator Support: Cached accelerators can persist data to disk using DuckDB, SQLite, or Cayenne file modes.
TinyLFU Cache Policy: Probabilistic cache admission policy that maintains high hit rates with minimal overhead
Predictable Memory Usage: Configurable memory limits with automatic eviction of less frequently used entries

Example Spicepod.yml configuration:

runtime:
  caching:
    sql_results:
      enabled: true
      eviction_policy: tiny_lfu # default lru

datasets:
  - from: s3://my-bucket/data.parquet
    name: cached_data
    acceleration:
      enabled: true
      engine: duckdb
      mode: file # Persist cache to disk
      refresh_mode: caching
      refresh_check_interval: 10m

For more details, refer to the Data Acceleration Documentation and Caching Documentation.

DynamoDB Streams Data Connector in Preview

DynamoDB Connector now integrates with DynamoDB Streams which enables real-time streaming with support for both table bootstrapping and continuous change data capture (CDC). This connector automatically detects changes in DynamoDB tables and streams them into Spice for real-time query, search, and LLM-inference.

Key Features:

Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables
Table Bootstrapping: Initial full table load before streaming changes

Example Spicepod.yml configuration:

datasets:
  - from: dynamodb:my_table
    name: orders_stream
    acceleration:
      enabled: true
      refresh_mode: changes

For more details, refer to the DynamoDB Connector Documentation.

Cayenne Accelerator Enhancements

The Cayenne data accelerator now supports:

Sort Columns Configuration: Optimize inserts by pre-sorting data on specified columns for improved query performance

Example Spicepod.yml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: sorted_data
    acceleration:
      enabled: true
      engine: cayenne
      mode: file_create
      params:
        sort_columns: timestamp,region

For more details, refer to the Cayenne Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down predicates to S3 listing operations. This optimization is especially effective for partitioned datasets stored in S3.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling fast integration with AWS's table format for S3.

For more details, refer to the S3 Tables Data Connector Documentation and Glue Data Connection Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

Fixed executor registration in cluster mode for more reliable distributed deployments
Improved hostname resolution for Flight server binding, enabling better executor discovery
Distributed accelerator registration: Data accelerators now properly register in distributed mode
Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

SQL identifier quoting: Hardened SQL identifier quoting across all connectors to prevent injection attacks
Token redaction: Sensitive tokens are now fully redacted in debug output to prevent credential leakage
Path traversal prevention: Fixed tar extraction to prevent path traversal vulnerabilities
Input sanitization: Added validation for top_n_sample order_by parsing
Improved credential handling: Improved credential management in Glue connector

Developer Experience Improvements

Health probe metrics: Added health probe latency metrics for better observability
CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates. The Spice Cookbook still offers 82+ recipes to help you prototype quickly.

Upgrading

To try v1.10.0-rc1, use one of the following methods:

CLI:

spice upgrade --version 1.10.0-rc1

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0-rc1 image:

docker pull spiceai/spiceai:1.10.0-rc1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.10.0-rc1

AWS Marketplace:

🎉 Spice is available in the AWS Marketplace.

What's Changed

Changelog

Test-operator: Add tpcds_q8 to the default row-count validation skip list by @sgrebnov in #8185
fix: Remove unwrap_used from test by @peasee in #8212
Run glue_iceberg_integration_test_catalog as part of main integration tests by @sgrebnov in #8222
Add TPCH sf100 testoperator spicepods with dispatch by @Jeadie in #8192
Build with CPU native flags by @lukekim in #8224
Make copilot check for empty copyright header by @krinart in #8245
fix: Apply assertion clippy in CI/Makefile only by @peasee in #8229
feat: Support running queries only in testoperator by @peasee in #8211
DuckDB query planning: aggregate pushdown by @mach-kernel in #8174
install.sh improvements by @lukekim in #8252
Fix .clear history by @lukekim in #8254
fix: Pushdown dynamic filters to partition scans by @peasee in #8240
Harden SQL identifier quoting in connectors by @phillipleblanc in #8276
Cayenne sort_columns on insert by @lukekim in #8091
Redact token debug output by @phillipleblanc in #8280
fix: Cayenne configuration options by @lukekim in #8281
Prevent path traversal in untar by @phillipleblanc in #8284
Fix cluster mode executor registration by @mach-kernel in #8292
Unignore s3_vectors_kafka_stream test by @Jeadie in #8289
Post-release house keeping by @krinart in #8293
Improve generate_changelog script by @krinart in #8273
Acceleration mode caching by @lukekim in #8237
Sanitization and security checks by @lukekim in #7854
Add health probe latency metric by @phillipleblanc in #8300
Add distributed registration for data accelerators by @phillipleblanc in #8299
Pass IndexedTableProvider down in 'changes_stream' and 'append_stream' by @Jeadie in #8295
Add dynamodb-streams crate by @krinart in #8283
Distributed query: resolve executor hostname when determining Flight server binding by @mach-kernel in #8304
Return computed embeddings from index for partitioned S3Vectors by @Jeadie in #8306
[DDB Streams] Skeleton for DynamoDB Streams by @krinart in #8296
DistributeFileScanOptimizer: Improve planning performance by @mach-kernel in #8305
feat: Add an ExactLeftAccumulator implementation by @peasee in #8302
deps: Upgrade Vortex to 0.56 by @peasee in #8311
DynamoDB table bootstrapping + streaming by @krinart in #8312
Avoid calling S3Vector list_vectors (or equivalent) when indexing into VectorIndexs by @Jeadie in #8282
Add on_conflict testing support to append benchmark by @sgrebnov in #8314
docker: Add valid home directory to fix duckdb extension loading issue by @phillipleblanc in #8318
Add GH Workflow to run Append benchmark test by @sgrebnov in #8321
Exclude MySQL SF100 from test-operator dispatch by @sgrebnov in #8320
feat: Update clippy lints by @peasee in #8317
Add S3 location predicate pruning to listing connector by @phillipleblanc
Review feedback for caching mode accelerator by @phillipleblanc in #8326
Also include Dockerfile home changes for release build by @phillipleblanc in #8327
Change communication channel from Discord to Slack by @Jeadie in #8330
Replace Discord link with Slack link in README by @Jeadie in #8331
fix(glue): Prevent OpenDAL from automatic loading of AWS credentials from environment by @sgrebnov in #8337
Block on index read for FTS queries by @Jeadie in #8339
Fix search query provider by @Jeadie in #8343
Support for writing into AWS S3 Tables by @sgrebnov in #8344
Acceleration file_create mode by @lukekim in #8347
Don't block on lock in FTS query path by @Jeadie in #8348
feat: Add an optimizer rule to replace join accumulator for Cayenne by @peasee in #8316
S3 Vectors limit updates by @lukekim in #8352
Sanitize top_n_sample order_by parsing by @phillipleblanc in #8356
Update version to v1.10.0-rc.1 by @ewgenius in #8362
Improve IndexerExec to properly handle limit pushdown by @sgrebnov in #8366
Fix Cayenne partition_by metadata flaky integration test by @phillipleblanc in #8367
Rework caching accelerator to use the stale-while-revalidate pattern. by @phillipleblanc in #8365
Add TinyLFU caching policy by @lukekim in #8370

What's New in v2.1.0​

High-Throughput Cayenne CDC​

Change Data Capture & HTAP​

Distributed Query​

Performance & Query Engine​

AI & LLM​

Search & Vectors​

SQL & Query Engine​

Security & Connectors​

Adaptive Self-Tuning (Experimental)​

Observability​

Notable Bug Fixes​

Dependency Updates​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v2.0.1​

Faster Iceberg Reads with Parallel File Scanning​

AWS S3 & Object-Store Reliability​

Data Acceleration & Distributed Query Fixes​

Authenticated Query Fixes​

Contributors​

Breaking Changes​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

Highlights in v2.0.0 include:​

Distribution Changes​

What's New in v2.0.0​

Spice Cayenne Reaches General Availability​

Multi-Active HA Distributed Query (GA)​

Security: Mutual TLS, Secret Stores, and Hardening​

Change Data Capture (CDC) Sources​

DML, DDL, and Write-Back​

SQL & User-Defined Functions​

Runtime Features​

Spicepod v2​

Data Connectors & Catalogs​

AI & LLM​

Search & Vectors​

Caching​

Performance & Query Engine​

Rust CLI​

Observability​

Notable Bug Fixes​

Dependency Updates​

Contributors​

Breaking Changes​

Upgrade Guide from v1.x​

1. Build, image, and platform changes​

2. Adopt Spicepod v2 (recommended)​

3. Update changed configuration​

4. Update queries and API clients​

5. Update model providers​

6. Update observability​

Cookbook Updates​

Upgrading​

What's Changed​

Changelog​

What's New in v2.0.0-rc.2​

Distributed Cayenne Query and Write Improvements​

MERGE INTO for Spice Cayenne​

PARTITION BY Support for Cayenne​

Catalog Connector Enhancements​

JSON Ingestion Improvements​

DataFusion v52.4.0 Upgrade​

Dependency Upgrades​

Other Improvements​

Contributors​

Breaking Changes​

Upgrading​

What's Changed​

Changelog​

Distribution Changes​

What's New in v2.0.0-rc.1​

Active-Active HA Distributed Query​

What's New in v2.1.0

High-Throughput Cayenne CDC

Change Data Capture & HTAP

Distributed Query

Performance & Query Engine

AI & LLM

Search & Vectors

SQL & Query Engine

Security & Connectors

Adaptive Self-Tuning (Experimental)

Observability

Notable Bug Fixes

Dependency Updates

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Changelog

What's New in v2.0.1

Faster Iceberg Reads with Parallel File Scanning

AWS S3 & Object-Store Reliability

Data Acceleration & Distributed Query Fixes

Authenticated Query Fixes

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Changelog

Highlights in v2.0.0 include:

Distribution Changes

What's New in v2.0.0

Spice Cayenne Reaches General Availability

Multi-Active HA Distributed Query (GA)

Security: Mutual TLS, Secret Stores, and Hardening

Change Data Capture (CDC) Sources

DML, DDL, and Write-Back

SQL & User-Defined Functions

Runtime Features

Spicepod v2

Data Connectors & Catalogs

AI & LLM

Search & Vectors

Caching

Performance & Query Engine

Rust CLI

Observability

Notable Bug Fixes

Dependency Updates

Contributors

Breaking Changes

Upgrade Guide from v1.x

1. Build, image, and platform changes

2. Adopt Spicepod `v2` (recommended)

3. Update changed configuration

4. Update queries and API clients

5. Update model providers

6. Update observability

Cookbook Updates

Upgrading

What's Changed

Changelog

What's New in v2.0.0-rc.2

Distributed Cayenne Query and Write Improvements

MERGE INTO for Spice Cayenne

`PARTITION BY` Support for Cayenne

Catalog Connector Enhancements

JSON Ingestion Improvements

DataFusion v52.4.0 Upgrade

Dependency Upgrades

Other Improvements

Contributors

Breaking Changes

Upgrading

What's Changed

Changelog

Distribution Changes

What's New in v2.0.0-rc.1

Active-Active HA Distributed Query