Spice v2.0-stable (Jun 5, 2026)
Fifty-three releases after Spice 1.0-stable, Spice.ai OSS has reached the 2.0-stable milestone!
Spice v2.0.0 is the next major release of Spice and a milestone in the project's development, advancing Spice from a single-node engine into a distributed data and query platform built for enterprise AI agents. These agents need low-latency, governed access to data spread across many production systems, and because they generate their own queries autonomously, that access has to be sandboxed, observable, and able to absorb occasional heavy analytical queries without overwhelming the underlying systems. The release is headlined by multi-node distributed query, now generally available. It is multi-active, highly available, object-store-native, and built on Apache Ballista, distributing both query execution and ingestion across executors with data-local routing and per-executor statistics for distributed join planning. Alongside it, the Spice Cayenne data accelerator is generally available, built on the Vortex compressed columnar format, with a high-throughput CDC write path, MERGE INTO, SQL-defined partitioning, inline writes, a dedicated compaction runtime, and write-path statistics for distributed join sizing. The engine also moves to DataFusion v52 with sort pushdown, a rewritten merge join, and dynamic filters, and the Spice CLI is rewritten in Rust as a single self-contained binary.
v2.0 also expands real-time and write-path capabilities across the platform: native CDC from MongoDB Change Streams and PostgreSQL WAL logical replication, durable Kafka CDC offsets, DML write-back for PostgreSQL, Snowflake, DynamoDB, Arrow, and DuckLake, DDL and MERGE INTO for Iceberg catalogs, mutual TLS across server endpoints and outbound connectors, HashiCorp Vault and Azure Key Vault secret stores, user-defined functions, hybrid search with Elasticsearch and DuckDB HNSW vector indexes, provider-aware LLM prompt caching, and the Responses API across all model providers.
Highlights in v2.0.0 include:
- Spice Cayenne (GA): generally available on the Vortex compressed columnar format, with WAL-staged writes, inline low-latency writes, fast-path CDC deletes, merge-on-read position deletes, composite & SQL-defined partitioning, MERGE INTO, dedicated compaction runtime, and join-sizing statistics maintained on the write path
- Multi-Active HA Distributed Query (GA): multi-node distributed query built on Apache Ballista, with object-store-native clustering, dynamic cluster sizing, distributed ingestion, data-local query routing, per-executor table statistics for distributed join planning, and async queries via `/v1/queries`
- Mutual TLS (mTLS): public mTLS for HTTP and Flight, TLS cert hot-reload, and mTLS client certificates for FlightSQL and Spice.ai connectors
- Enterprise Authentication & Authorization: OIDC bearer-token verification and Cedar-based authorization policy with per-principal row- and column-level filtering
- New Secret Stores: HashiCorp Vault and Azure Key Vault
- CDC Sources: native MongoDB Change Streams, PostgreSQL WAL logical replication, and durable Kafka CDC offsets, with no Debezium or Kafka middleware required
- DML & DDL: INSERT/UPDATE/DELETE write-back for PostgreSQL, Snowflake, DynamoDB, and Arrow; `CREATE TABLE`/`DROP TABLE` and `MERGE INTO` for Iceberg catalogs
- User-Defined Functions: SQL UDFs in spicepods, remote UDFs over HTTP, and optional geospatial `ST_*` UDFs
- On-Demand Dataset Loading & Unified Query Cancellation: faster startup and end-to-end cancellation across HTTP, Flight, FlightSQL, and MCP
- Dynamic HTTP Connector: OAuth2 refresh tokens, pagination, dynamic headers, subquery-driven parameters, and rate-control state persisted across restarts
- Storage-Profile Accelerator Tuning & `refresh_mode: snapshot`: storage-aware acceleration defaults and point-in-time snapshot acceleration
- Search & Vectors: Elasticsearch data connector with native hybrid search, DuckDB HNSW vector engine with a statically linked VSS extension, multi-vector MaxSim embeddings, and a `rerank()` UDTF
- AI & LLM: provider-aware prompt caching, Responses API across all providers, MCP Streamable HTTP transport, and a searchable LLM tool registry
- New Data Connectors: Elasticsearch (Alpha), GCS (Alpha), Azure Cosmos DB (Alpha), Git (RC), ADBC, DuckLake (Beta), and catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
- Rust CLI: single-binary `spice` CLI with `spice query` async REPL, shell completions, and `--output=json`
- Dependency upgrades including DataFusion v52.5, DuckDB v1.5.3, Arrow v57.2, iceberg-rust v0.9.1, Turso v0.6.1, and Vortex v0.69
Spice v2.0 includes several breaking changes. Review the breaking changes section before upgrading.
Distribution Changes
AI/ML support including local LLM/ML model and hosted LLM inference is now included in the default Spice build and image. The separate models build variant has been removed.
With models now included by default, the data-only distribution (without AI/ML support) is only published in nightly builds. Official production-ready data-only distributions are available exclusively through Spice Cloud and the Enterprise release.
A new Network Attached Storage (NAS) distribution with built-in SMB and NFS data connector support is also available in nightly builds and with Spice.ai Enterprise.
| Distribution / Variant | Open Source | Spice Cloud | Enterprise |
|---|---|---|---|
| Default | โ | โ | โ |
| Data | Nightly only | โ | โ |
| NAS (SMB + NFS) | Nightly only | โ | โ |
| Metal (macOS) | โ | โ | โ |
| CUDA (Linux) | Nightly only | โ | โ |
| Allocator variants | Nightly only | โ | โ |
| ODBC connector | Local build only | โ | โ |
Native Windows builds are no longer provided; use WSL for local development. For more details, see the Distributions documentation.
What's New in v2.0.0
Spice Cayenne Reaches General Availability
The Spice Cayenne data accelerator is generally available in v2.0, with a major focus across the release candidates on write-path throughput, correctness, and distributed operation.
Write path & ingest:
- Staged Append Writes: WAL-based staged append writes prevent partial writes and data loss on stream errors; batches commit atomically.
- Inline Writes: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore, bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows, inline data stays query-visible via an in-memory union scan, and rows are checkpointed to Vortex when thresholds are reached. Inline writes now also proceed with pending deletions in flight, and inline flush caps scale with available memory and storage class.
- Fast-Path CDC Deletes: `DELETE` statements whose filters identify primary keys directly, including composite keys expressed as `(k1, k2) IN ((...), (...))`, skip the table scan entirely.
- Merge-On-Read Position Deletes: Primary-key upsert tables use position deletes with memory-pool accounting, avoiding full-table rewrites on update-heavy workloads.
- Resident Upsert Keysets: CDC upsert primary-key keysets stay resident between batches, avoiding per-batch full-table rebuilds.
- CDC Sub-Batch Efficiency: Interleaved upsert/delete workloads produce fewer sub-batch splits, with last-write-wins deduplication applied within batches.
- Dedicated Compaction Runtime: Background compaction runs on a dedicated thread pool with CDC pipelining and protected snapshots, isolating compaction work from query and ingest paths.
Query & planning:
- Join Filter Propagation: Filters propagate across equi-join keys, with range fallback for large join filters and IN-list rewrites.
- Write-Path Join-Sizing Statistics: Cayenne maintains live row counts and HyperLogLog-based distinct-value estimates on the write path, so distributed `JoinSelection` can correctly size joins without rescans.
- Scan-Result Cache: A new scan-result cache accelerates hot reads, with parallel Vortex partition writes and lock-free deletion caches with bloom-prefiltered probes.
SQL & catalog:
- MERGE INTO: Upsert-style `MERGE INTO` for Cayenne catalog tables, distributed across executors in cluster mode.
- `PARTITION BY` in SQL: Define partitioning directly in `CREATE TABLE ... PARTITION BY (...)`; metadata is persisted in the catalog and survives restarts.
- Composite Partitioning: `partition_by: [col1, col2]` with hierarchical path-like keys.
- File-Based Retention Deletes: Time-based retention uses file-level deletes for both position-based and primary-key tables.
Correctness: Synchronized partition commits, correct NULL-sentinel handling for nullable partition expressions, tombstoned inline-checkpointed rows on upsert (preventing duplicate primary keys), and live reads through expired protected snapshots.
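As a rough illustration of the composite partitioning option above, the following spicepod fragment is a minimal sketch rather than a verified configuration. It assumes that `partition_by` is set under the dataset's `acceleration` block and that Cayenne is selected with `engine: cayenne`; the source path, dataset name, and column names (`region`, `event_date`) are hypothetical.

```yaml
datasets:
  - from: s3://example-bucket/events/   # hypothetical source location
    name: events                         # hypothetical dataset name
    acceleration:
      enabled: true
      engine: cayenne                    # assumed engine identifier for Spice Cayenne
      # Composite partitioning: the two columns form a hierarchical,
      # path-like partition key (for example region first, then event_date).
      partition_by: [region, event_date]
```

Ordering the columns from coarse to fine would keep related partitions close together in the hierarchical key, though the best ordering depends on the workload's filter patterns.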
Multi-Active HA Distributed Query (GA)
Spice.ai Enterprise feature. See High Availability.
Distributed Query is generally available. Built on Apache Ballista, it distributes query execution across multiple active executor nodes with no single point of failure, reading directly from object storage rather than relying on a central cluster.
Distributed query supports two execution modes:
- Synchronous: Queries for accelerated datasets are distributed across executors and results stream back in real time. Best for interactive, latency-sensitive queries.
- Asynchronous: Queries submitted via the HTTP `/v1/queries` API materialize results to object storage for later retrieval. Best for long-running analytical and batch workloads.
Key capabilities:
- Dynamic Cluster Sizing: The planner adjusts parallelism to the number of active executors as nodes join or leave.
- Distributed Ingestion: Ingestion for partitioned accelerated tables is distributed across executors, with partition-aware write-through splitting scheduler-side Flight `DoPut` writes to the responsible executors.
- Data-Local Query Routing: Cayenne catalog queries route to the executors holding the relevant partitions.
- Per-Executor Table Statistics: Executors report table statistics, including NDV-aware estimates, so distributed `JoinSelection` can size joins correctly, fixing out-of-memory conditions on large semi-joins.
- Readiness & Failure Detection: `/v1/ready` gates on a configurable executor quorum for safe rolling deployments; scheduler readiness additionally waits for executor partition loads; executor heartbeat timeout reduced from 180s to 30s.
- Distributed DML & DDL: UPDATE/DELETE forwarding to all executors, executor DDL sync for late joiners, and distributed `MERGE INTO`.
- Cluster Observability: New cluster metrics (including `scheduler_active_executors_count`), distributed `runtime.task_history` replication, and a Grafana dashboard.
- Ballista S3 Shuffle: Async queries with `runtime.params.shuffle_location: s3://...` complete reliably with executor-environment-derived S3 clients.
Security: Mutual TLS, Secret Stores, and Hardening
Several capabilities in this section are Spice.ai Enterprise features. See Enterprise Security.
Mutual TLS across the platform:
- Public mTLS for HTTP and Flight: `client_auth_mode: request` (optional, for migration windows) or `required` (strict) client-certificate verification.
- TLS Cert Hot-Reload: The runtime reloads TLS certificates on `SIGHUP` for zero-downtime rotation.
- Outbound mTLS Client Certificates: FlightSQL and Spice.ai data connectors present client certificates to upstream services; the `spice sql` REPL supports mTLS client auth.
runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt
Authentication & Authorization (Spice.ai Enterprise):
- OIDC Authentication: Validate OIDC bearer tokens (JWTs) issued by enterprise identity providers (Microsoft Entra ID, Okta, Auth0, AWS Cognito, and Google) for secure access to runtime endpoints, standalone or combined with API keys.
- Principal-Based Policy Enforcement: Fine-grained, Cedar-based authorization policy configured under `runtime.authorization` governs allow/deny access across datasets, models, tools, and endpoints. Combined with identity SQL functions (`current_principal()`, `current_principal_email()`, `current_principal_groups()`), policies enforce per-principal row-level filtering and column masking.
New Secret Stores: HashiCorp Vault (KV v1/v2; token, approle, kubernetes, and jwt auth with automatic lease renewal) and Azure Key Vault (service principal, managed identity, workload identity, Azure CLI, or auto-detect; sovereign cloud support).
Hardening:
- Read-only API Key Enforcement on the Flight `DoGet` path and async query endpoints.
- Per-Principal Cache Namespacing: SQL, search, and caching-accelerator caches are namespaced per authenticated principal so cached results never cross identity boundaries.
- API Key Timing Leak & Remote-UDF SSRF: Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoints.
- Snowflake Function Deny-List: A function deny-list is enforced in Snowflake federation pushdown, and Snowflake account identifiers and auth configuration are validated at startup.
- MCP `allowed_hosts`: MCP servers can be restricted to an explicit allowlist of upstream hosts.
Change Data Capture (CDC) Sources
See Change Data Capture (CDC) for an overview of CDC in Spice.
- MongoDB Change Streams: MongoDB datasets with `refresh_mode: changes` stream changes natively into any local accelerator, with no Debezium or Kafka required.
- PostgreSQL Native Replication (WAL): PostgreSQL datasets stream INSERT/UPDATE/DELETE directly from logical replication using `pgoutput` decoding, with automatic per-replica slot management, an initial `REPEATABLE READ` bootstrap snapshot, and durable LSN acknowledgement.
- Kafka CDC Offset Persistence: Kafka CDC offsets persist in sidecar tables for durable, resumable streams across restarts and failovers.
- Pipelined CDC Ingestion: Source reads overlap with batch apply, with envelope coalescing and improved nullability propagation.
- Debezium Schema Evolution: Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload.
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes
DML, DDL, and Write-Back
Spice v2.0 turns more connectors and catalogs into full read/write tables:
- PostgreSQL DML: `INSERT`, `UPDATE`, and `DELETE` write-back on PostgreSQL datasets, with foreign-key metadata exposed via the PostgreSQL catalog connector.
- Snowflake DML: `INSERT`, `UPDATE`, and `DELETE` write-back on Snowflake datasets.
- DynamoDB DML: `INSERT`, `UPDATE`, and `DELETE` for DynamoDB, complementing read and CDC streaming.
- Arrow Primary Key Upserts: Native update-or-insert semantics for in-memory Arrow-accelerated tables.
- DDL for Iceberg: `CREATE TABLE` and `DROP TABLE` via FlightSQL and `/v1/sql` for Iceberg, with `catalog.access: read_write_create`.
- DuckLake INSERT: DuckLake catalog tables with `read_write` access support `INSERT`.
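To make the Iceberg DDL access level more concrete, here is a hedged sketch of a catalog entry. Only the `access: read_write_create` value comes from these notes; the `from` URL, its format, and the catalog name are hypothetical placeholders, and the placement of `access` directly on the catalog entry is an assumption based on the `catalog.access` field name.

```yaml
catalogs:
  # Hypothetical Iceberg REST catalog endpoint; replace with a real one.
  - from: iceberg:https://iceberg.example.com/v1/namespaces
    name: ice                      # hypothetical catalog name
    # Assumed placement of the access level named in these notes; it is the
    # level the notes associate with CREATE TABLE and DROP TABLE.
    access: read_write_create
```

With a catalog configured along these lines, `CREATE TABLE` and `DROP TABLE` statements would be submitted through FlightSQL or `/v1/sql`, as described in the list above.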
SQL & User-Defined Functions
See the SQL Reference for the full SQL surface area.
- User-Defined Functions: Define reusable SQL UDFs as first-class spicepod components, or invoke remote functions over HTTP (Spice.ai Enterprise), along with user-defined table functions.
- Spatial SQL UDFs: Optional geospatial `ST_*` UDFs for geometry workloads.
- JSON UDTFs: `flatten_json`, `json_tree`, and `flatten_json_properties` table-valued functions for JSON transformation and schema decomposition (with options such as `expand_maps`). See JSON Functions and Operators.
- PostgreSQL Metadata UDFs: Dataset and column descriptions are exposed via PostgreSQL-compatible UDFs (`obj_description`, `col_description`), so BI tools and `psql` surface Spice metadata.
- FlightSQL Substrait Plans: `CommandStatementSubstraitPlan` support for clients submitting Substrait-encoded plans.
- SQL REPL Expanded View: Toggle `\x` for a vertical key-value layout on wide result sets.
- Prepared statement, federation, and unparsing fixes across the engine, including keeping correlated subqueries out of `JOIN ON` conditions for Spice Cloud federation and correct `EXISTS`/`NOT EXISTS` subquery handling in the federation analyzer.
Runtime Features
- On-Demand Dataset Loading: Datasets can be deferred, meaning they are registered with a declared schema at startup (`columns[].type`, `columns[].nullable`) and fully resolved on first reference, reducing startup time and memory for large spicepods.
- Unified Query Cancellation: HTTP, Flight, FlightSQL, MCP, and internal execution paths honour a unified cancellation signal, so disconnects, REPL `Ctrl-C`, and cancelled HTTP requests cancel the query end-to-end.
- Storage-Profile Accelerator Tuning: `acceleration.storage_profile` (`auto`, `local_ssd`, `ebs`, `tmpfs`) applies storage-aware defaults across DuckDB, SQLite, Turso, and Cayenne file-mode accelerators; `auto` detects the backing storage.
- `refresh_mode: snapshot` (Spice.ai Enterprise): Point-in-time snapshot acceleration with SQLite/Turso WAL flushing and Cayenne metastore slice integration, now reporting accurate readiness when no snapshot exists yet.
- Structured Component Errors: `/v1/datasets?status=true` and `/v1/models?status=true` return structured `error` objects (`category`, `type`, `code`) and human-readable `error_message` fields; the CLI shows an `ERROR` column.
- Actionable Config Errors: Parameter typos, missing secret references, and unknown engine names produce specific, actionable errors with suggestions.
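The storage-profile setting described above could look roughly like the fragment below. This is a sketch under assumptions: the `storage_profile` key and its values come from these notes, while the dataset source and name are hypothetical and `mode: file` is assumed to be how a file-mode accelerator is selected.

```yaml
datasets:
  - from: postgres:orders          # hypothetical source table
    name: orders                   # hypothetical dataset name
    acceleration:
      enabled: true
      engine: duckdb
      mode: file                   # assumed: storage profiles apply to file-mode accelerators
      # One of: auto, local_ssd, ebs, tmpfs. With auto, the runtime
      # detects the backing storage and picks defaults accordingly.
      storage_profile: auto
```

Pinning an explicit profile such as `ebs` may be preferable where detection is ambiguous, for example inside containers with layered storage.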
Spicepod v2
Spicepods now support version: v2, the default for spice init, while v1 spicepods continue to work with automatic migration of deprecated fields.
| Version | Status |
|---|---|
| `v2` | Default. Used by `spice init`. |
| `v1` | Supported. Deprecated fields auto-migrate. |
| `v1beta1` | Removed. No longer accepted. |

| v1 (deprecated) | v2 (preferred) | Notes |
|---|---|---|
| `runtime.results_cache` | `runtime.caching.sql_results` | All fields migrate automatically. `cache_max_size` → `max_size`. |
| `runtime.memory_limit` | `runtime.query.memory_limit` | Auto-migrated. `query.memory_limit` takes priority if both set. |
| `runtime.temp_directory` | `runtime.query.temp_directory` | Auto-migrated. `query.temp_directory` takes priority if both set. |
| `dataset.invalid_type_action` | `dataset.unsupported_type_action` | Auto-migrated. v2 adds a new `string` variant. |
New v2 fields include runtime.ready_state, runtime.query.spill_compression, runtime.caching.sql_results.stale_while_revalidate_ttl, runtime.caching.sql_results.encoding, scheduler partition-assignment configuration, and catalog.access: read_write_create.
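As an illustration of the field migrations in the table above, the following is a minimal sketch of a spicepod written directly in the v2 layout. The field paths follow the table; the pod name, dataset, and all values (sizes, paths) are hypothetical, and the value formats are assumptions rather than documented defaults.

```yaml
version: v2
kind: Spicepod
name: my_app                          # hypothetical pod name

runtime:
  caching:
    sql_results:                      # v1: runtime.results_cache
      max_size: 128MiB                # v1: cache_max_size (illustrative value)
  query:
    memory_limit: 8GiB                # v1: runtime.memory_limit (illustrative value)
    temp_directory: /var/tmp/spice    # v1: runtime.temp_directory (illustrative path)

datasets:
  - from: postgres:my_table
    name: my_table
    unsupported_type_action: string   # v1: invalid_type_action; string is the new v2 variant
```

Because v1 fields auto-migrate, a pod can be moved over incrementally: switching `version` first and renaming fields afterwards should both load, per the migration notes above.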
Data Connectors & Catalogs
New connectors:
- Elasticsearch (Alpha, Spice.ai Enterprise): Query Elasticsearch indexes as SQL tables with native hybrid search (`vector_search()` kNN, `text_search()` BM25, and `rrf()` fusion), plus Elasticsearch as a backing vector engine, direct FTS engine configuration, and index lifecycle controls.
- GCS (Alpha): Federated queries against Google Cloud Storage, with Iceberg table support.
- Azure Cosmos DB (Alpha): Read-only NoSQL / Core SQL API connector with cross-partition scans and schema inference.
- Git (RC): HTTPS/SSH auth, Git LFS support, and per-repo connection resilience.
- ADBC: Data connector and catalog with full query federation, BigQuery support, and schema/table discovery.
- DuckLake (Beta): Lakehouse-style data management with DuckDB as the metadata catalog and object storage for data, providing ACID transactions, time travel, and schema evolution on Parquet.
- Self-Hosted Spice Connector: Connect Spice to another self-hosted Spice runtime as a federated source.
New catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake, using native metadata catalogs for schema and table discovery. Unity Catalog compatibility extends to OSS Unity Catalog deployments, and DDL-defined catalogs can expose and query views.
HTTP connector: OAuth2 refresh-token authentication, query-parameter and no-limit pagination, dynamic request headers parameterised from query predicates, subquery-driven request parameters for fan-out queries, response metadata as queryable columns, map-to-array conversion, shared and persistent rate-control state across restarts and replicas, no caching of transient 429/5xx errors, and a correctly populated fetched_at column.
JSON ingestion: Single-object documents, JSONL, BOM-prefixed input, Socrata SODA responses, format auto-detection, and RFC 6901 json_pointer extraction of nested payloads.
Databricks: Resilience controls, Unity Catalog-aware permission prechecks with structured advisory errors, Classic SQL Warehouse foreign-table compatibility, connect_timeout/client_timeout parameters, a Databricks SQL dialect for federation, and Delta Lake column mapping (Name and Id modes).
Other connector improvements: MongoDB SRV support; MySQL mysql_zero_date_behavior; Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, and TIMESTAMP_LTZ types plus key-pair auth; ClickHouse Date32; S3 s3_url_style for path-style addressing and faster Parquet reads; GraphQL custom auth headers; Oracle and MSSQL sort/limit pushdown; GitHub GraphQL resilience; and improved Kafka reliability.
AI & LLM
- Provider-Aware Prompt Caching: LLM calls automatically use provider-side prompt caching (e.g., Anthropic, OpenAI) for system prompts and tool descriptions, reducing latency and cost.
- Responses API Across All Providers: The Responses API works with every configured model provider, including streaming `response.output_text.delta` events and `Authorization: Bearer` header support.
- Multi-Vector Embeddings with MaxSim: List-of-string columns produce one embedding per element with MaxSim/mean/sum scoring for ColBERT-style late-interaction retrieval, plus a `_match` column identifying the best-matching element.
- `rerank()` UDTF: Reorder results from `vector_search`, `text_search`, or `rrf` using any registered chat model as a reranker, with automatic query propagation and pushdown support.
- Searchable LLM Tool Registry: Agents discover tools via semantic search instead of enumerating every tool in the system prompt.
- MCP Improvements: Streamable HTTP transport (`/v1/mcp`) on rmcp v1.5.0, native auth for streamable HTTP tools (`mcp_auth_token`, `mcp_headers`), external MCP server tool calls traced in task history, and configurable `allowed_hosts`.
- Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations.
Search & Vectors
- DuckDB Vector Engine: `vector_engine: duckdb` uses DuckDB's HNSW index for fast approximate nearest-neighbor search without an external vector store. In v2.0.0, the DuckDB VSS extension is statically linked into the bundled DuckDB, so HNSW vector search works out-of-the-box on clean machines with no extension download. HNSW indexes are preserved across data refresh, and `cosine_distance` pushes down via `array_cosine_distance`.
- Hybrid Search: Combine kNN vector search and BM25 full-text search with reciprocal rank fusion (`rrf()`), backed by Tantivy, Elasticsearch, or DuckDB.
- Full-Text Search Performance: Significantly faster Tantivy ingestion with rollback-on-error, and search metadata is correctly preserved on indexing and in Vortex physical schema calculation.
- Embedding Validation: `row_id` columns are validated during dataset initialization.
Caching
Improvements across Caching:
- Stale-While-Revalidate: `runtime.caching.sql_results.stale_while_revalidate_ttl` serves stale results while revalidating in the background.
- Cache Encoding: Optional compression (e.g., `zstd`) for SQL results cache entries.
- Retention Policies for cached query results, and improved CDC-driven cache invalidation (including view plan invalidation on updates).
- Idle Cache Maintenance: Periodic maintenance drains invalidation predicates on idle caches, fixing unbounded memory growth in rarely-read caches.
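The two SQL results cache options above might be combined as in the fragment below. This is a sketch only: the field paths come from these notes, while the `enabled` flag and the duration value are assumptions chosen for illustration.

```yaml
runtime:
  caching:
    sql_results:
      enabled: true                      # assumed flag; the cache may already be on by default
      # Serve a stale cached result for up to this long while a
      # background refresh repopulates the entry (illustrative value).
      stale_while_revalidate_ttl: 30s
      # Compress cached entries; zstd is the example given in these notes.
      encoding: zstd
```

A longer stale window trades freshness for lower tail latency, so it tends to suit dashboards and agent lookups more than transactional reads.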
Performance & Query Engine
Apache DataFusion is upgraded to v52.5 over the course of the release cycle, bringing:
- Sort Pushdown to Scans: ~30x faster top-K queries on pre-sorted data; Parquet scans reverse row-group order for `DESC` on `ASC`-sorted files.
- Rewritten Sort-Merge Join: Up to three orders of magnitude faster in pathological cases (e.g., TPC-H Q21: minutes → milliseconds).
- Dynamic Filters: `MIN`/`MAX` aggregates and hash-join build sides prune files, row groups, and rows during execution.
- Faster `CASE` Expressions, statistics caching, and prefix-aware list-files caching for faster planning.
- `TableProvider` DELETE/UPDATE hooks and the `RelationPlanner` API for extensible SQL planning.
- Strict Overflow Handling: `try_cast_to` errors on overflow instead of silently producing NULLs.
Additional engine work: default query memory limit raised from 70% to 90% with GreedyMemoryPool, partial aggregation optimization for FlightSQLExec, improved partitioned query planning, and metastore transaction support to prevent concurrent conflicts.
Rust CLI
The Spice CLI is completely rewritten from Go to Rust as a single `spice` binary built from the same codebase as `spiced`, with full feature parity across 27+ commands.
- `spice query`: Interactive REPL for async queries with multi-line SQL, progress indication, and cancellation.
- `spice dataset configure`: Non-interactive flag-based configuration (`--from`, `--description`, `--param KEY=VALUE`, `--set`) alongside interactive prompts.
- `spice completions`: Shell completion script generation.
- `--output=json`: Machine-readable output for scripting; `spice login --output` adds `env`, `json`, and `keychain` modes.
- `spice init` writes a `yaml-language-server` schema directive for IDE completions.
Observability
- OpenTelemetry: Exporter fixes, authenticated metrics export, configurable metric name prefix (`runtime.telemetry.metric_prefix`), delta temporality by default, and OTLP resource attributes via `runtime.telemetry.properties`.
- Query Metrics: The `query_executions` metric gains a `datasets` dimension for per-dataset query attribution.
- Ingestion Metrics: `rows_written`, `bytes_written`, and `dataset_acceleration_size_bytes` for acceleration refresh and Flight `DoPut`/ADBC ingestion, and `EXPLAIN ANALYZE` metrics in `FlightSQLExec`.
- Task History: Distributed task history in cluster mode and tracing for external MCP server tool calls.
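For the OpenTelemetry settings listed above, a configuration could be sketched as follows. The two field paths are taken from these notes; treating `properties` as a key-value map, along with the prefix and attribute values shown, is an assumption made for illustration.

```yaml
runtime:
  telemetry:
    # Prefix applied to exported metric names (hypothetical value).
    metric_prefix: spice_prod
    # Assumed shape: a map of OTLP resource attributes attached to
    # exported telemetry (hypothetical keys and values).
    properties:
      deployment.environment: production
      service.namespace: data-platform
```

A per-environment prefix can help keep staging and production series apart when both export to the same collector.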
Notable Bug Fixes
- localpod synchronization: `localpod` child datasets correctly track parent refreshes when the parent uses the in-memory Arrow accelerator.
- Spice Cloud federation: Correlated subqueries are kept out of `JOIN ON` conditions, fixing rejected federated queries.
- `refresh_mode: snapshot`: No longer reports Ready with empty data when no snapshot exists.
- Search metadata: Field and schema metadata preserved on search indexing and in Vortex physical schema calculation.
- HTTP connector: `fetched_at` column is correctly populated.
- Connector correctness: DynamoDB Streams transient-error retries and typed-NULL DML handling; ScyllaDB physical filter pushdown disabled to fix incorrect results; MSSQL `TOP N` pushdown; DuckDB DELETE/UPDATE on `full` and `caching` refresh modes; Turso checked arithmetic for timestamp conversions; ODBC queries no longer silently return 0 rows on failure; Flight `GetFlightInfo`/`DoGet` schema parity.
Dependency Updates
| Dependency / Component | Version |
|---|---|
| DataFusion | v52.5 |
| Ballista | v52 |
| Arrow (arrow-rs) | v57.2 |
| DuckDB | v1.5.3 (with statically linked VSS) |
| iceberg-rust | v0.9.1 |
| Turso (libsql) | v0.6.1 |
| Vortex | v0.69.0 |
| delta_kernel | v0.18.2 |
| rmcp (MCP) | v1.5.0 |
| mistral.rs | v0.8.x (candle v0.10.1) |
| ADBC Core | v0.23 |
| Rust toolchain | v1.94.1 |
Contributors
Breaking Changes
- Models included by default: The separate `models` build variant has been removed. Local LLM inference is always included in the default build and image.
- Windows native builds removed: Use WSL for local development.
- Spicepod version defaults to `v2`: `spice init` creates `version: v2` spicepods. `v1` remains supported with auto-migration; `v1beta1` is no longer accepted.
- Flattened `runtime.scheduler` configuration: The nested `runtime.scheduler.partition_management` block is flattened and renamed:

  # Before
  runtime:
    scheduler:
      partition_management:
        interval: 30s
        max_assignments_per_cycle: 16
        discovery_timeout: 10s

  # After
  runtime:
    scheduler:
      partition_assignment_interval: 30s
      max_assignments_per_interval: 16
      partition_discovery_timeout: 10s

- S3 metadata columns renamed: `location`, `last_modified`, `size` → `_location`, `_last_modified`, `_size`.
- Default query memory limit changed: Increased from 70% to 90%.
- Metric renames: `accelerated_refresh` metrics renamed to `acceleration_refresh`; `last_refresh_time` gauge renamed to include the milliseconds unit.
- DuckDB parameter rename: `partitioned_write_flush_threshold` → `partitioned_write_flush_threshold_rows`.
- `/v1/search` API: Always returns an array in `matches`, even for single results.
- `/v1/evals` API removed.
- Perplexity model provider removed.
- x.ai model endpoint: x.ai models exclusively use the `/v1/responses` endpoint.
Upgrade Guide from v1.x
Most v1 spicepods continue to work on v2.0: v1 remains supported and deprecated fields auto-migrate at load time, so many deployments can upgrade by updating the binary or image alone. The steps below cover the breaking changes that may require manual action. Review each before upgrading a production deployment.
1. Build, image, and platform changes
- Models are now included by default. The separate `models` build variant (and the corresponding `-models` image tags) has been removed; local LLM inference is always included in the default build and image. If your deployment pinned a `models` build or `-models`-tagged image, switch to the default build/image.
- Native Windows builds are removed. Use WSL for local Windows development.
2. Adopt Spicepod v2 (recommended)
`spice init` now creates `version: v2` spicepods. v1 spicepods remain supported with automatic migration, but v1beta1 is no longer accepted. To move to v2, set `version: v2` and update the following fields. Each auto-migrates from v1, but updating now clears the deprecation:
| v1 (deprecated) | v2 (preferred) |
|---|---|
| `runtime.results_cache` | `runtime.caching.sql_results` (`cache_max_size` → `max_size`) |
| `runtime.memory_limit` | `runtime.query.memory_limit` |
| `runtime.temp_directory` | `runtime.query.temp_directory` |
| `dataset.invalid_type_action` | `dataset.unsupported_type_action` |
3. Update changed configuration
- DuckDB parameter rename: `partitioned_write_flush_threshold` → `partitioned_write_flush_threshold_rows`.
- Default query memory limit raised from 70% to 90%. If you relied on the previous default to leave headroom for other processes on the host, set it explicitly via `runtime.query.memory_limit`.
4. Update queries and API clients
- S3 metadata columns renamed: `location`, `last_modified`, `size` → `_location`, `_last_modified`, `_size`. Update any queries that reference these columns.
- `/v1/search` always returns an array in `matches`, even for a single result. Update clients that assumed a scalar value.
- `/v1/evals` API removed. Remove integrations that depend on it.
5. Update model providers
- Perplexity model provider removed. Re-point affected models to another provider.
- x.ai models use the `/v1/responses` endpoint exclusively. Ensure x.ai integrations target the Responses API.
6. Update observability
- Metric renames: `accelerated_refresh` → `acceleration_refresh`, and the `last_refresh_time` gauge is renamed to include the milliseconds unit. Update dashboards and alerts that reference these metric names.
After updating, restart the runtime and verify datasets and models report ready via /v1/datasets?status=true and /v1/models?status=true (the CLI shows a Ready/ERROR column).
Cookbook Updates
New Spice Cookbook recipes added during the v2.0 release cycle:
- Async Queries: Submit long-running queries asynchronously and retrieve results later.
- DuckLake Catalog: Lakehouse-style data management with ACID transactions and time travel.
- Distributed Query: Run Spice in multi-active distributed cluster mode.
- mTLS: Mutual TLS for HTTP and Flight endpoints.
- Elasticsearch Connector: Query Elasticsearch indexes as SQL tables.
- MCP Server: Use Spice as an MCP server over Streamable HTTP.
- Snowflake DML: Write-back to Snowflake with INSERT/UPDATE/DELETE.
- PostgreSQL, MySQL, and MSSQL Catalogs: Schema and table discovery for external databases.
- Full-Text Search: BM25 full-text search over accelerated datasets.
The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v2.0.0, use one of the following methods:
CLI:
spice upgrade
Homebrew:
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:2.0.0 image:
docker pull spiceai/spiceai:2.0.0
For available tags, see DockerHub.
Helm:
helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0
AWS Marketplace:
Spice is available in the AWS Marketplace.
What's Changed
Changelog
- Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
- fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
- fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
- fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
- fix: features check by @phillipleblanc in #9014
- fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
- URL table support by @lukekim in #9018
- ScyllaDB key filter by @lukekim in #8997
- fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
- Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
- HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
- Fix duplicate snapshot creation on startup by @sgrebnov in #9029
- Add ScyllaDB and SMB to the README table by @krinart in #9034
- Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
- Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
- Release notes for release release/1.11.0-rc.2 by @krinart in #9016
- ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
- ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
- fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
- Add missing protoc step to setup-cc action by @krinart in #9041
- fix: Update Search integration test snapshots by @app/github-actions in #9013
- Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
- Cayenne S3 Express One Zone improvements by @lukekim in #9015
- Add zlib1g-dev to CI by @lukekim in #9052
- Improve validation and logging for hash indexes by @lukekim in #9047
- Upgrade Vortex with CASE-WHEN by @lukekim in #9051
- x.ai models now exclusively use /v1/responses endpoint by @lukekim in #9400
- Improvements for snapshot schema comparison by @krinart in #9401
- v2.0 breaking changes by @lukekim in #9233
- Create PartitionManagementTask for scheduler to update accelerated table partition assignments by @Jeadie in #9378
- refactor(Cayenne): route all write orchestration through CayenneDataSink by @sgrebnov in #9402
- Refactor benchmark to use QueryExecutor trait by @Jeadie in #9418
- feat: Add spidapter build and release workflow by @peasee in #9427
- Testoperator: add support for api-key when connecting to external spice instance by @sgrebnov in #9421
- Initial implementation of Ducklake catalog & data connectors by @lukekim in #9083
- Require aws_lc_rs since jsonwebtoken upgrade by @Jeadie in #9426
- feat: Add spidapter tool by @peasee in #9425
- Add release notes for 1.11.2 patch release by @sgrebnov in #9430
- feat(spidapter): integrate system-adapter-protocol with SCP provisioning by @phillipleblanc in #9434
- Add DuckLake TPCH E2E workflow and federated Spicepod configuration by @lukekim in #9431
- fix(spidapter): use Flight handshake auth instead of x-api-key header by @phillipleblanc in #9435
- [spidapter] Keep only what sparks joy by @Jeadie in #9439
- Refactor binary operator balancing by @Jeadie in #9424
- feat: Add Iceberg DDL support (CREATE TABLE / DROP TABLE) for default catalog override by @phillipleblanc in #9440
- Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
- Update spidapter for new system-adapter-protocol by @sgrebnov in #9442
- docs: fix typos and syntax errors in style guide and error handling docs by @cluster2600 in #9445
- Add acceleration refresh ingestion metrics (rows_written, bytes_written) by @phillipleblanc in #9461
- Refactor(Cayenne): Replace CatalogError and string based errors with Snafu errors by @sgrebnov in #9403
- Replace deprecated claude-3-5-haiku-latest with claude-haiku-4-5 by @Jeadie in #9492
- Fix #9481: Preserve schema in results cache for empty query results by @phillipleblanc in #9485
- Fix partition by serializing by @Jeadie in #9474
- query: reconcile execution stream nullability with logical plan schema by @phillipleblanc in #9486
- initial spice-cloud-client crate and spice cloud metrics --app <app-name>. by @Jeadie in #9480
- feat: Return dataset error message in datasets API by @peasee in #9487
- Spicebench by @lukekim in #9447
- build(deps): consolidate dependabot dependency updates by @phillipleblanc in #9504
- fix(cluster): route non-partitioned accelerated tables in distributed mode by @phillipleblanc in #9508
- Enable core scalar UDFs in refresh SQL by @sgrebnov in #9502
- Fix metrics in Spidapter again by @Jeadie in #9497
- fix(cluster): tolerate Completed->status propagation race in distributed query handle by @phillipleblanc in #9510
- feat: Support distributed ingestion in cayenne catalog by @peasee in #9506
- Fix Cayenne duplicate primary keys after DELETE + UPSERT CDC sequences by @krinart in #9494
- fix(cluster): rewrite table scans inside subqueries for distributed execution by @phillipleblanc in #9518
- fix: Set catalog mode to readwritecreate in spidapter by @peasee in #9519
- Upgrade AWS SDK crates & set APN user-agent in AWS SDK credential bridge by @lukekim in #8328
- feat(runtime): add runtime ready_state on_registration semantics by @lukekim in #9522
- fix: Add spidapter post-setup retries by @peasee in #9526
- Make partition discovery more robust and make initialization non-blocking by @sgrebnov in #9499
- Make lint-rust-fix support targeted packages and features by @Jeadie in #9511
- Handle new Cloud SCP API by @Jeadie in #9532
- Refactor and simplify streaming benchmarks by @krinart in #9405
- fix: ensure spidapter only increments attempts on failures by @peasee in #9534
- feat: Support specifying app resources in spidapter by @peasee in #9536
- test(runtime): Spice Cayenne DDL integration test by @lukekim in #9535
- fix: Handle schema evolution mismatch errors during data refresh by @lukekim in #9527
- fix: resolve clippy lint warnings by @phillipleblanc in #9547
- pr-builds --tag <TAG> for build_and_release.yml by @Jeadie in #9507
- Add --output flag to spice login with env/json/keychain modes by @Jeadie in #9541
- Don't use 'PartitionedTableScanRewrite' in async distributed query by @Jeadie in #9548
- feat(spidapter): add local backend mode with single executor by @phillipleblanc in #9531
- support chat template in HF by @Jeadie in #9543
- fix(cayenne): stream PK retention deletes and run OOM regression in CI by @phillipleblanc in #9533
- cayenne: Staged append writes to prevent partial writes and data loss on stream error by @sgrebnov in #9491
- AcceleratedTable::scan use FederatedTable::scan when ClusterRole::Scheduler by @Jeadie in #9550
- Upgrade to delta-kernel-rs v0.18.2 by @lukekim in #9528
- Run cayenne tests as part of PR CI by @sgrebnov in #9554
- Upgrade to DataFusion v52.2.0 by @lukekim in #9419
- Remove Snapshot Compaction + Add snapshot existence check by @krinart in #9523
- Update dependencies by @lukekim in #9566
- fix: Update benchmark snapshots by @app/github-actions in #9565
- fix: Compare Cayenne table configuration on startup by @peasee in #9529
- Make Refresh::refresh_sql more robust to alterations over time. by @Jeadie in #9549
- fix: Update datafusion-table-providers dependency to latest revision by @lukekim in #9574
- Unset AWS_ENDPOINT_URL when empty by @krinart in #9575
- fix: allow BytesProcessedExec repartitioning for unordered input by @lukekim in #9540
- Sanitize DataFusion errors by @lukekim in #9530
- Add conditional logging for partition assignments by @Jeadie in #9577
- use 'properly early exit on SIGTERM' by @Jeadie in #9573
- Update datafusion to 52.2.0 by @phillipleblanc in #9582
- Ensure we query one and only one partition per request by @Jeadie in #9416
- feat: Add support for Spicepod version v2 by @lukekim in #9583
- [SpiceDQ] Improve error messages; Avoid race condition on allocate_initial_partitions. by @Jeadie in #9579
- Update ballista dependencies to latest 52.0.0 revision by @lukekim in #9581
- Fix Databricks spark_connect mode always disabled by @phillipleblanc in #9586
- Support partitioning in Arrow accelerator by @Jeadie in #9571
- Fix spice query CLI response deserialization by @phillipleblanc in #9588
- fix: Update benchmark snapshots by @app/github-actions in #9584
- fix: Share RuntimeEnv across Cayenne read/write/delete paths for targeted list_files_cache invalidation by @sgrebnov in #9589
- feat: Add file:// state_location support for async queries scheduler by @phillipleblanc in #9590
- Update endgame links by @krinart in #9598
- ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
- fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
- feat: Implement catalog connectors for various databases by @lukekim in #9509
- Refactor and clean up code across multiple crates by @lukekim in #9620
- fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
- Properly install postgres in install-postgres action by @krinart in #9629
- fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
- Update spicepod.schema.json by @app/github-actions in #9640
- Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
- fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
- Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
- testoperator for partitioned arrow accelerator by @Jeadie in #9635
- Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
- Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
- Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
- docs: update distribution details and add NAS support in release notes by @lukekim in #9650
- Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
- perf: Cache Turso metastore connection across operations by @penberg in #9646
- Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
- Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
- chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
- perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
- Improve CDC cache invalidation by @krinart in #9651
- Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
- fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
- fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
- s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
- docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
- Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
- Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
- Upgrade to Rust v1.93.1 by @lukekim in #9669
- Security fixes and improvements by @lukekim in #9666
- feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
- Skip caching http error response + add response_headers by @krinart in #9670
- refactor: Remove v1/evals functionality by @Jeadie in #9420
- Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
- Enable on_zero_results: use_source for views by @krinart in #9699
- fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
- Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
- Fixes for anthropic by @Jeadie in #9707
- Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
- [SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
- Upgrade to Turso v0.5 by @lukekim in #9628
- feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
- fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
- fix: Ensure Cayenne respects target file size by @peasee in #9730
- refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
- [SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
- Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
- Add executor resource and replica support to cloud app config by @ewgenius in #9734
- feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
- Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
- Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
- JSON file format improvements by @lukekim in #9743
- [SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
- Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
- Make S3 metadata column handling more robust by @sgrebnov in #9762
- Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
- Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
- Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
- Improve JSON SODA support by @lukekim in #9795
- Add ADBC Data Connector by @lukekim in #9723
- docs: Release Cayenne as RC by @peasee in #9766
- cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
- Include updated JSON formats in HTTPS connector by @lukekim in #9800
- Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
- Pass through authentication to ADBC connector by @lukekim in #9801
- Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
- Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
- Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
- Fix http connector by @krinart in #9818
- Fix ADBC Connector build and test by @lukekim in #9813
- Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
- Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
- fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
- fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
- fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
- fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
- Filter pushdown improvements for SqlTable by @lukekim in #9852
- feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
- Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
- feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
- Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
- Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
- Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
- fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
- feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
- feat: Add custom auth header support for GraphQL connector by @krinart in #9899
- Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
- When executor connects, send DDL for existing tables by @Jeadie in #9904
- fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
- fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
- fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
- Remove Perplexity support by @Jeadie in #9910
- Fix refresh_sql support for debezium constraints by @krinart in #9912
- Implement DML for DynamoDBTableProvider by @lukekim in #9915
- chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
- Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
- Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
- Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
- fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
- Add full query federation support for ADBC data connector by @lukekim in #9953
- Make time_format deserialization case-insensitive by @claudespice in #9955
- Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
- fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
- ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
- Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
- fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
- fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
- Add caching retention by @krinart in #9984
- fix: improve Databricks schema error detection and messages by @lukekim in #9987
- fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
- fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
- fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
- fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
- fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
- fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
- Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
- fix: Address data correctness bugs found in audit by @sgrebnov in #10015
- fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
- Improve GitHub connector ref handling and resilience by @lukekim in #10023
- feat: Add spice completions command for shell completion generation by @lukekim in #10024
- fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
- Implement RefreshDataset for distributed control stream by @Jeadie in #10055
- perf: Improve S3 parquet read performance by @sgrebnov in #10064
- fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
- feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
- fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
- fix partial aggregation deduplication on string checking by @lukekim in #10078
- fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
- fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
- feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
- Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
- feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
- feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
- Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
- Add mode: file_update acceleration mode by @krinart in #10108
- Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
- fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
- feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
- fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
- fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
- feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
- Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
- fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
- fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
- fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
- fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
- Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
- fix: executor startup failures by @Jeadie in #10155
- feat: Distributed runtime.task_history support by @Jeadie in #10156
- fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
- feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
- fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
- Fix linux/macos spice upgrade by @phillipleblanc in #10194
- Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
- fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
- Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
- Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
- fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
- fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
- fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
- feat: Add pagination support to HTTP data connector by @lukekim in #10228
- feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
- fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
- feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
- fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
- fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187
- fix: Full Text Search schema mismatch with ADBC connector by @lukekim in #10235
- docs: Update v2.0.0-rc.2 release notes with latest changes by @lukekim in #10238
- Fix append refresh dedup failure when refresh_sql selects column subset by @sgrebnov in #10225
- Revert "Properly mark dataset as Ready on Scheduler (#10215)" by @sgrebnov in #10242
- Fix failing merge conflicts for benchmarks by @krinart in #10247
- fix(github): fetch commits for dynamic and slash refs by @lukekim in #10233
- Upgrade DataFusion to v52.5.0-rc1 by @lukekim in #10249
- Merge develop to trunk (2026-04-09) by @claudespice in #10248
- fix: Validate embedding row_id columns during dataset init (fixes #8226) by @claudespice in #10208
- fix: Update tpch benchmark snapshots for federated/glue[csv].yaml by @app/github-actions in #10244
- feat(databricks): add resilience controls, UC awareness, and task history instrumentation by @lukekim in #10246
- fix: Make PartitionManager resilient to bare vs fully qualified table references by @sgrebnov in #10257
- fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml by @app/github-actions in #10256
- Merge develop to trunk (2026-04-10) by @claudespice in #10251
- Improve Snowflake/ADBC dataset registration performance and observability by @lukekim in #10266
- Fixes for kafka connector by @krinart in #10263
- fix(runtime): gate otel code tags, suppress aws sdk noise, and unblock connector init by @lukekim in #10260
- fix(runtime): avoid regionless AWS SDK loads by @lukekim in #10271
- Add versioned release install workflow coverage by @lukekim in #10276
- fix(runtime): handle HTTP JSON unions and spicepod reloads by @lukekim in #10277
- Databricks UC permission prechecks: explicit denial as permanent error, ambiguous cases advisory by @lukekim in #10274
- Revert component status changes re-introduced by develop merge (#10248) by @sgrebnov in #10293
- Fix broken CI workflows by @ewgenius in #10294
- Group dependabot updates by ecosystem by @lukekim in #10296
- fix(tests): Replace flaky S3 Vectors snapshot tests with structural validation by @lukekim in #10301
- Update test_github_workflows snapshot by @lukekim in #10304
- fix(ci): fix Bedrock runner mismatch and snapshot auto-merge failure by @ewgenius in #10306
- feat(http): Add map-to-array conversion and query-parameter pagination by @lukekim in #10295
- New crate: datafusion-ddl by @Jeadie in #10205
- Make Databricks UC permission checks advisory with structured error reporting by @lukekim in #10283
- build(deps): bump the github-actions-dependencies group with 4 updates by @app/dependabot in #10298
- fix: Clear cached plans on view updates by @peasee in #10312
- build(deps): bump the aws-sdk group with 7 updates by @app/dependabot in #10299
- Code out of runtime. by @Jeadie in #10178
- fix: Respect function registry denies for accelerated table filter pushdown by @peasee in #10311
- fix: Don't block heartbeat when all slots acquired by @peasee in #10322
- fix: strip only outer parens in get_table_partition_expr_from_ctx by @Jeadie in #10323
- Upgrade datafusion-table-providers with MongoDB SRV support by @lukekim in #10317
- fix: Avoid pushing down bucketing partition expressions into executors by @peasee in #10324
- Upgrade datafusion-table-providers to d1b911a5 and bump adbc to 0.23 by @lukekim in #10329
- fix: Update Search integration test snapshots by @app/github-actions in #10308
- Handle foreign table + Classic sql warehouse combination gracefully by @krinart in #10318
- New crate datafusion-flightsql by @Jeadie in #10201
- Set tantivy=warn unless very verbose logging by @Jeadie in #10338
- Remove image registry and image name options from spidapter by @ewgenius in #10241
- build(deps): bump sysinfo from 0.37.2 to 0.38.4 by @app/dependabot in #10291
- build(deps): bump futures from 0.3.31 to 0.3.32 by @app/dependabot in #10289
- New crate 'datafusion-dml' by @Jeadie in #10334
- Jeadie/26 04 16/spice sql by @Jeadie in #10343
- Add Teraswitch/Pittsburgh apt mirrors + retry config for CI runners by @lukekim in #10349
- Implement sort pushdown and fix pushdown gaps across providers by @lukekim in #10337
- Merge develop to trunk (2026-04-16) by @claudespice in #10345
- Update candle and mistral.rs lock-step pins by @lukekim in #10278
- docs: fix status badges in README by @lukekim in #10350
- Migrate secrets to vars by @krinart in #10354
- Add limit pushdown and improve sort pushdown for Oracle and MSSQL by @sgrebnov in #10351
- Fix ubuntu mirror configuration by @ewgenius in #10359
- fix: Increase throughput test default ready_wait from 30s to 300s (fixes #8207) by @claudespice in #10344
- Add auth headers support to OTEL metrics exporter by @lukekim in #10347
- fix(github): shrink GraphQL page size on gateway errors; lower comment defaults by @lukekim in #10355
- Relax apt mirror substitution failure to warning in CI action by @ewgenius in #10361
- feat(http): Add OAuth2 refresh-token auth to HTTP connector by @lukekim in #10348
- Upgrade Rust toolchain to 1.94.1 by @lukekim in #10353
- Handle order by and sort in PartitionedTableScanRewrite by @Jeadie in #9656
- Fix OTEL Exporter by @krinart in #10363
- Pin spiceai candle / TEI forks to merged revs; drop local [patch] overrides by @lukekim in #10362
- Integrate spiceio and makefile_targets into pr.yml by @lukekim in #10357
- ci: skip artifact compression for test binaries/archives by @lukekim in #10381
- chore(deps): bump spiceai/candle, spiceai/mistral.rs, aws-lc-rs, tantivy, rand by @lukekim in #10379
- Bump datafusion-table-providers (#10375) by @lukekim in #10384
- fix: Update Search integration test snapshots by @app/github-actions in #10376
- v2.0.0-rc.3 preparation by @ewgenius in #10382
- fix(spicepod): JSON schema accepts string or {name: expr} for partition_by by @lukekim in #10352
- fix: Use ROUND for Turso decimal BETWEEN comparisons (fixes #9872) by @claudespice in #10360
- Revert "v2.0.0-rc.3 preparation" from trunk by @ewgenius in #10386
- Add on_schema_resolved dataset ready state by @lukekim in #10368
- feat: Add Elasticsearch data connector with hybrid search support by @lukekim in #10258
- ci: bump test archive upload compression-level to 1 by @lukekim in #10388
- feat(git-connector): promote Git connector to RC status by @lukekim in #10385
- feat(postgres): stream WAL directly to Spice accelerators by @lukekim in #10364
- Add schema decomposition to the HTTP connector by @lukekim in #10393
- fix(cayenne): Skip catalog refresh state reload for existing providers by @sgrebnov in #10396
- Make cayenne-flightsql tool by @Jeadie in #10356
- build(deps): bump the github-actions-dependencies group with 2 updates by @app/dependabot in #10398
- Update openapi.json by @app/github-actions in #10272
- Merge develop to trunk - 2026-04-19 by @claudespice in #10407
- feat(otel): default OTLP push exporter to delta temporality by @phillipleblanc in #10412
- fix: Restore analyzer rule ordering to run federation before type coercion by @sgrebnov in #10415
- fix: Map Utf8/LargeUtf8 to STRING in Databricks/Spark SQL dialects by @sgrebnov in #10420
- feat(otel): add metric name prefix at runtime.telemetry.metric_prefix by @phillipleblanc in #10418
- fix: Map LargeUtf8 to VARCHAR in Athena ODBC dialect by @sgrebnov in #10419
- feat(cluster): connector-driven object store registration on executors by @phillipleblanc in #10414
- build(deps): bump ubuntu from 22.04 to 24.04 in the docker-dependencies group by @app/dependabot in #10397
- fix: Update benchmark snapshots Apr 20 by @app/github-actions in #10417
- feat(otel): apply runtime.telemetry.properties as resource attributes on exported metrics by @phillipleblanc in #10416
- Publish RC releases to DockerHub; upgrade runners to ubuntu-24.04 by @lukekim in #10428
- feat: Add Azure Cosmos DB (NoSQL) data connector (RC) by @lukekim in #10392
- feat(datafusion): flatten_json_properties + json_tree UDTFs by @lukekim in #10406
- Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL by @lukekim in #10365
- feat(embeddings): multi-vector embeddings with MaxSim + late-interaction by @lukekim in #10408
- Update GH runners for CUDA builds by @ewgenius in #10432
- fix(delta_lake): register object stores on cluster executors by @phillipleblanc in #10436
- DF-native DML by @krinart in #10327
- ci: run Build and Test on spiceai-macos; split install jobs by profile by @lukekim in #10434
- Improve search UDTFs: text_search, vector_search, rrf by @lukekim in #10387
- fix(model2vec): Improve robustness of model loading for sentence-transformers layouts by @sgrebnov in #10444
- Merge develop to trunk - 2026-04-21 by @claudespice in #10448
- Enable filter pushdown for `vector_search` UDTF by @sgrebnov in #10447
- Support Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, TIMESTAMP_LTZ types by @lukekim in #10451
- Fix Databricks tests by @krinart in #10449
- fix(cluster): forward register_object_stores through connector wrappers by @phillipleblanc in #10460
- Fixes for vector-search by @krinart in #10455
- Add expand_maps option and flatten_json UDTF by @lukekim in #10452
- fix: Update Search integration test snapshots by @app/github-actions in #10458
- Fix physical codec decode ambiguity for empty protobuf messages by @sgrebnov in #10466
- chore(logging): demote s3_single_file_cached skip refresh log to debug by @phillipleblanc in #10467
- Enable filter pushdown for `rrf` UDTF by @sgrebnov in #10465
- feat(cluster): consolidate distributed state into cluster.json by @phillipleblanc in #10463
- feat(cayenne): Add column statistics and data inlining by @lukekim in #10314
- docs(copilot): flag missing wrapper delegation when adding default trait methods by @phillipleblanc in #10461
- Wire Elasticsearch vector engine write path through acceleration by @lukekim in #10453
- Add helm lint CI by @ewgenius in #10468
- Fix Azure and GCS acceleration snapshot object store credential handling by @phillipleblanc in #10486
- Update spicepod.schema.json by @app/github-actions in #10485
- fix(secrets): harden AWS Secrets Manager secret store by @lukekim in #10478
- Update `datafusion-ballista` crate by @sgrebnov in #10488
- feat(secrets): add ParameterSpec and more params for AWS secrets manager by @phillipleblanc in #10487
- Add rerank UDTF for hybrid search with query auto-propagation by @lukekim in #10469
- Fix flatten_json_properties by @krinart in #10475
- fix: preserve field and schema metadata in expand_views_schema by @claudespice in #10494
- Upgrade rmcp to upstream 1.5.0; switch MCP server to Streamable HTTP by @lukekim in #10491
- fix: handle Snowflake TIMESTAMP_LTZ wire format and prevent nanosecond overflow by @claudespice in #10493
- Lint parity in Makefile by @krinart in #10492
- Add connect_timeout/client_timeout params to Databricks sql_warehouse mode by @lukekim in #10495
- fix(tracing): suppress opentelemetry INFO logs at all verbosity levels by @lukekim in #10497
- DynamoDB DML by @krinart in #10470
- feat(cayenne): native vector search via SIMD similarity UDFs by @lukekim in #10456
- fix(cli): suppress banner for all JSON-producing cloud subcommands (fixes #10498) by @claudespice in #10510
- fix(deps): bump openssl to 0.10.78 by @phillipleblanc in #10509
- fix(s3): quiet AWS SDK credential probe when no region is configured by @phillipleblanc in #10506
- fix(cdc): emit ready signal on caught-up Kafka/Debezium streams (#5201) by @phillipleblanc in #10504
- `runtime-cluster` crate + Run partition discovery before forwarding refresh to executors by @krinart in #10490
- Update lint-rust target to use `--keep-going` by @Jeadie in #10508
- Add TPC-H SF100 s3[parquet]-duckdb[file] benchmark spicepod by @lukekim in #10524
- Remove dev-profile install steps from pr.yml by @Jeadie in #10507
- fix: add missing NULL check on Timestamp path in append refresh by @claudespice in #10518
- fix: return error on Decimal128/256 overflow instead of silently dropping scale by @claudespice in #10519
- fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable by @claudespice in #10520
- feat(devx): make config errors, CLI, and REPL lead users to success by @lukekim in #10489
- fix(rerank): defer execution to RerankExec, enable filters and projection pushdown by @sgrebnov in #10514
- fix(llms): support Gemma models with missing attention_bias config field by @lukekim in #10523
- Fix vector_search silently ignoring named limit/column/include_score args by @sgrebnov in #10527
- fix: split unsupported filters locally in scan() for UseSource mode by @ewgenius in #10528
- feat(secrets): add Azure Key Vault secret store by @lukekim in #10496
- Bump mistralrs by @krinart in #10532
- Fix benchmark configurations and CI build issues by @sgrebnov in #10535
- Fix catalog query overrides for MySQL and MSSQL benchmarks by @sgrebnov in #10543
- For Cayenne, preserve matched columns for `MERGE ... ON <cols>` by @Jeadie in #10340
- build(deps): bump the aws-sdk group across 1 directory with 5 updates by @app/dependabot in #10538
- docs: update AI agent instructions (git workflow + Rust 1.94) by @lukekim in #10544
- fix: Update tpch benchmark snapshots by @app/github-actions in #10529
- fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-duckdb[file].yaml by @app/github-actions in #10525
- Extract `runtime-datafusion` from `runtime` by @krinart in #10545
- Use generic DML extension planner for Cayenne by @Jeadie in #10437
- fix: Update Search integration test snapshots by @app/github-actions in #10552
- Fix security and correctness audit issues by @lukekim in #10526
- fix(MySQL): revert MySQL result column reorder to fix federated query failures by @sgrebnov in #10557
- Fix `protoc` installation by @krinart in #10566
- fix: Disable Ballista dynamic filters on HashJoinExec by @peasee in #10548
- Support views on DDL catalogs by @Jeadie in #10554
- Update datafusion by @Jeadie in #10422
- Improve full-text search indexing performance by @sgrebnov in #10464
- feat(mysql): add mysql_zero_date_behavior parameter (null|error) by @phillipleblanc in #10573
- fix(snowflake): declare `private_key` in connector PARAMETERS (fixes #10517) by @claudespice in #10559
- Honour `CARGO_TARGET_DIR` in Makefiles by @Jeadie in #10569
- Enable `cosine_distance` pushdown to DuckDB accelerator via `array_cosine_distance` by @sgrebnov in #10564
- fix: Update test snapshots by @app/github-actions in #10570
- fix: Update tpch benchmark snapshots by @app/github-actions in #10560
- feat(snapshots): make snapshots an optional feature by @phillipleblanc in #10574
- Enforce read-only API key restrictions on Flight DoGet and async query paths by @Jeadie in #10551
- Improved security posture on Github workflows by @Jeadie in #10556
- fix: Update datafusion-table-providers to improve SqlTable filter pushdown by @sgrebnov in #10595
- feat(secrets): add HashiCorp Vault secret store by @phillipleblanc in #10561
- fix: delegate update() in UpsertDedupTableProvider to inner provider by @claudespice in #10593
- Add DuckDB vector engine support by @lukekim in #10562
- Sharepoint - add object-store listing connector with expanded auth and write support by @lukekim in #10473
- fix: Install protoc from source by @peasee in #10597
- Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
- feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
- Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
- Add Snowflake DML support by @lukekim in #10747
- Add MongoDB Change Streams support by @lukekim in #10813
- Add user-defined functions by @lukekim in #10571
- Add table user functions and gate HTTP servers by @lukekim in #10675
- feat: add on-demand dataset loading by @phillipleblanc in #10629
- feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
- feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
- Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
- Add unified query cancellation across all paths by @lukekim in #10390
- Add dynamic HTTP request headers by @lukekim in #10604
- feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
- feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
- Add nolimit HTTP pagination max pages by @lukekim in #10673
- Add shared HTTP rate control for connectors by @lukekim in #10648
- Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
- fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
- Add provider-aware LLM prompt caching by @lukekim in #10645
- Add searchable registry mode for LLM tools by @lukekim in #10647
- feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
- feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
- Add self-hosted Spice connector support by @phillipleblanc in #10546
- Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
- Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
- Add configurable allowed_hosts for MCP by @lukekim in #10638
- fix: make Helm chart probes configurable by @peasee in #10696
- Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
- feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
- Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
- Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
- Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
- Fix DuckDB DELETE/UPDATE on `full` and `caching` refresh mode datasets by @phillipleblanc in #10632
- Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
- Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
- fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
- fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
- Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
- fix: add NULL check to CDC primary key extraction by @lukekim in #10684
- Properly handle nullability during CDC processing by @krinart in #10803
- Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
- Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
- Support Responses API across model providers by @lukekim in #10724
- Update xAI default model and handle Grok model retirements by @Jeadie in #10723
- Improve cli table layout by @krinart in #10725
- TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
- Fix DuckLake catalog `include` filter being ignored by @phillipleblanc in #10738
- Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
- feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
- perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
- feat: Allow full version tags in spicepod version by @peasee in #10748
- Add Arrow primary key upserts by @lukekim in #10749
- fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
- feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
- perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
- fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
- Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
- Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
- Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
- feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
- Allow arbitrary filenames when specifying spicepod path + `kind` validation by @krinart in #10777
- fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
- Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
- fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
- fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
- fix: label all columns in `spice cloud metrics` table output by @claudespice in #10784
- fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
- fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
- Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
- Improve CDC ingestion performance by @lukekim in #10789
- Fix `tool_search`/`tool_invoke` spans by @lukekim in #10791
- Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
- Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
- Remove permanent errors from DynamoDB Streams by @krinart in #10794
- Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
- Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
- Executors should create catalog tables on join by @Jeadie in #10807
- Add compressed file support for listing connectors by @lukekim in #10809
- Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
- Add range fallback for large join filters by @lukekim in #10816
- Improve Cayenne join filter pushdown by @lukekim in #10818
- Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
- fix: Deny nondistributed cayenne catalog by @peasee in #10821
- Enable parallel Cayenne Vortex writes by @lukekim in #10822
- Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
- Add `response.output_text.delta` to responses API by @krinart in #10828
- feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
- Upgrade Turso to v0.6.0 by @sgrebnov in #10843
- feat(cli): add `spice feedback` command to open community Slack by @lukekim in #10856
- Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
- feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
- fix: Require dim-side statistics for `CayennePropagateFilterAcrossEquiJoinKeys` by @sgrebnov in #10863
- fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
- fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
- fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
- fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
- fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
- Improve DynamoDB Bootstrap performance by @krinart in #10616
- fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
- fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
- fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
- Persist HTTP rate-control state in object storage by @lukekim in #10697
- Rate limit metrics HTTP endpoint by @lukekim in #10162
- feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
- feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
- fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
- fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
- fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
- Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
- fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
- fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
- Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
- Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
- feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
- Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
- Optimize Cayenne catalog maintenance paths by @lukekim in #10904
- Centralize DuckDB settings for accelerator by @ewgenius in #10895
- deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
- Authorization header + Bump async-openai + `responses_adapter` fix by @krinart in #10911
- Tune accelerators by storage profile by @lukekim in #10913
- feat: add dataset-level on_schema_change config by @lukekim in #10908
- Handle `NULL` sentinel for nullable partition expressions by @Jeadie in #10880
- fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
- Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
- Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
- fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
- Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
- perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
- feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1-128 concurrency, sqlite + turso lanes) by @lukekim in #10943
- feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
- Explicit error if v1/search requests a table without search index by @Jeadie in #10968
- Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
- Extend append tests with arrow engine configurations by @sgrebnov in #10959
- Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
- Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
- fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
- Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
- docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
- Remove eval related things for v2.0.0 by @Jeadie in #10945
- build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
- fix: Add publish = false to chbench-driver by @sgrebnov in #10939
- [Bug] Timing between reconnect and AllocateInitialPartitions leaves connection without flight_sql_client by @Jeadie in #10805
- Fix: `refresh_mode: snapshot` reports Ready with empty data when no snapshot exists by @sgrebnov in #10979
- fix(cluster): gate scheduler readiness on executor partition loads by @phillipleblanc in #10992
- fix: handle EXISTS/NOT EXISTS subqueries in federation analyzer by @sgrebnov in #10996
- Refactor spice dataset configuration command by @Jeadie in #10999
- fix: preserve field and schema metadata in Vortex physical schema calculation by @claudespice in #11013
- fix: validate Snowflake account identifiers and auth config by @Jeadie in #11024
- Fix Unity Catalog connector deserialization failure with OSS Unity Catalog by @ewgenius in #11026
- feat(cayenne): allow inline writes with pending deletions (deletes/upserts) by @sgrebnov in #11031
- Expose metadata descriptions via PostgreSQL UDFs by @lukekim in #11032
- Remove default runtime features - enable explicitly in spiced by @phillipleblanc in #11037
- feat(cayenne): fast-path CDC deletes by extracting PK values from filters by @sgrebnov in #11049
- Cayenne optimizer rules: auto relevance test for q21-shape (all-Cayenne CH-Bench) and runtime rule selection by @lukekim in #11050
- refactor(cdc): reduce CDC sub-batch splits for interleaved upsert/delete workloads by @sgrebnov in #11051
- fix(snowflake): enforce function deny-list in federation pushdown by @claudespice in #11057
- fix(mcp): trace external server tool calls in task history by @ewgenius in #11058
- perf(cdc): Last-write-wins dedup in `group_into_sub_batches` to reduce sub-batch splits by @sgrebnov in #11059
- PM edits to v2.0.0-rc5 by @lukekim in #11067
- fix(snowflake): wire deny-list in extracted connector crate (#10703) by @claudespice in #11071
- perf(cayenne): keep CDC upsert PK keysets resident to avoid per-batch full-table rebuilds by @lukekim in #11074
- Fix metadata on search indexing by @Jeadie in #11080
- feat(cayenne): merge-on-read position deletes for PK upsert tables + memory-pool accounting by @lukekim in #11085
- perf(cayenne): scale CDC inline flush caps with memory + storage class by @lukekim in #11087
- feat(cluster): report per-executor table statistics so distributed JoinSelection can size joins by @phillipleblanc in #11089
- Improve Cayenne CDC write and compaction path tracing by @sgrebnov in #11091
- Support tuple-IN composite PK extraction in Cayenne delete fast-path by @sgrebnov in #11093
- feat(cluster): NDV-aware executor stats so CDC q18 join swap fires by @phillipleblanc in #11098
- feat(cayenne): maintain join-sizing stats on the write path by @phillipleblanc in #11104
- fix(cache): run periodic moka maintenance for idle caches by @phillipleblanc in #11106
- Upgrade to DuckDB 1.5.3 + statically link the VSS (HNSW) extension by @sgrebnov in #11107
- Fix fetched_at for HTTP connector by @Jeadie in #11116
- fix(cayenne): tombstone inline-checkpointed rows on upsert to prevent duplicate PKs by @sgrebnov in #11129
- feat: dedicated compaction runtime for Cayenne + CDC pipelining, protected snapshots, and test coverage by @lukekim in #11130
- Add `datasets` dimension to the `query_executions` metric by @phillipleblanc in #11138
- Fix #11137: localpod child not tracking parent refreshes with in-memory (arrow) parent accelerator by @phillipleblanc in #11139
- Fix Windows build: vendor the VSS extension (drop nested submodule) by @phillipleblanc in #11140
- fix(spiceai): keep correlated subqueries out of JOIN ON for Spice Cloud federation by @phillipleblanc in #11143
- feat(cayenne): sharded parallel Vortex encode with key/time clustering by @lukekim in #11144
- fix(cluster): prevent DoPut write pipeline self-deadlock under ingest backpressure by @phillipleblanc in #11160
- fix(cayenne): only warn on genuine protected-snapshot amplification by @lukekim in #11158
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.6...v2.0.0

