Spice v2.0-rc.4 (Apr 30, 2026)
Announcing the release of Spice v2.0-rc.4!
v2.0.0-rc.4 is the fourth release candidate for advanced testing of v2.0, building on v2.0.0-rc.3.
Highlights in this release candidate include:
- Elasticsearch Data Connector (Alpha) with native hybrid search (BM25 full-text + kNN vector + RRF)
- PostgreSQL Native CDC via WAL logical replication, eliminating the need for Debezium or Kafka
- Multi-vector Embeddings with MaxSim for ColBERT-style late-interaction retrieval
- Rerank UDTF for hybrid search pipelines with automatic query propagation
- HashiCorp Vault and Azure Key Vault Secret Stores for enterprise secret management
- DuckDB Vector Engine with HNSW index support
- Azure Cosmos DB Connector (RC), Git Connector promoted to RC
- MCP Streamable HTTP transport
- Read-only API Key Enforcement on Flight DoGet and async query paths
What's New in v2.0.0-rc.4
Elasticsearch Data Connector (Alpha, Spice.ai Enterprise)
The new Elasticsearch data connector enables querying Elasticsearch indexes as SQL tables with full hybrid search support. Currently available in Spice.ai Enterprise.
Key capabilities:
- SQL Table Access: Query any Elasticsearch index with standard SQL via a native DataFusion TableProvider.
- kNN Vector Search: Use the vector_search() UDTF against Elasticsearch-backed vector fields.
- BM25 Full-Text Search: Use the text_search() UDTF for native Elasticsearch full-text queries.
- Hybrid Search: Combine kNN and BM25 results with the rrf() UDTF for reciprocal rank fusion.
- Elasticsearch as a Vector Engine: Accelerated datasets can use Elasticsearch as the backing vector engine for embedding storage and retrieval.
Example configuration:
datasets:
  - from: elasticsearch:my_index
    name: my_data
    params:
      elasticsearch_endpoint: https://my-cluster.es.io:9200
      elasticsearch_username: ${secrets:es_user}
      elasticsearch_password: ${secrets:es_password}
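To make the hybrid search step concrete, here is a minimal Python sketch of reciprocal rank fusion, the technique behind rrf(). It is an illustration only, not Spice's implementation: the function name, the sample document ids, and the constant k=60 (a commonly used value) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one scored list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result orders from a kNN search and a BM25 search.
knn_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([knn_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that rank well in both lists rise to the top, even when neither search placed them first.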
PostgreSQL Native Replication via WAL
Postgres datasets configured with refresh_mode: changes can now stream changes directly from PostgreSQL logical replication (WAL) into any local accelerator, with no Debezium or Kafka required.
Key capabilities:
- Native Logical Replication: Uses pgoutput decoding to stream INSERT/UPDATE/DELETE events.
- Automatic Slot Management: Each Spice replica creates a distinct replication slot (spice_<dataset>_<hash>), so multi-replica deployments work automatically. Publications are shared.
- Bootstrap Snapshot: An initial REPEATABLE READ snapshot seeds the accelerator before replication begins.
- LSN Acknowledgement: The LsnCommitter sends the durable LSN back to Postgres so WAL segments are reclaimed.
- All Accelerators Supported: Works with DuckDB, SQLite, Postgres, Cayenne, and Arrow accelerators.
Example configuration:
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_db: mydb
      pg_publication: my_publication # optional; auto-created if omitted
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes
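The snapshot-then-stream flow described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Spice runtime: ChangeEvent and Accelerator are hypothetical names, LSNs are reduced to plain integers, and "durable" here just means applied to an in-memory dict.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChangeEvent:
    lsn: int                    # WAL position, simplified to an integer
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


@dataclass
class Accelerator:
    rows: dict = field(default_factory=dict)
    durable_lsn: int = 0        # highest LSN that is safe to acknowledge

    def seed(self, snapshot, snapshot_lsn):
        """Load the bootstrap snapshot before any streamed change."""
        self.rows = dict(snapshot)
        self.durable_lsn = snapshot_lsn

    def apply(self, event):
        """Apply one change; skip events the snapshot already covers."""
        if event.lsn <= self.durable_lsn:
            return
        if event.op == "delete":
            self.rows.pop(event.key, None)
        elif event.op in ("insert", "update"):
            self.rows[event.key] = event.row
        else:
            raise ValueError(f"unknown operation: {event.op}")
        self.durable_lsn = event.lsn


acc = Accelerator()
acc.seed({1: {"name": "a"}}, snapshot_lsn=100)
for ev in [
    ChangeEvent(90, "insert", 9, {"name": "stale"}),  # older than snapshot
    ChangeEvent(101, "insert", 2, {"name": "b"}),
    ChangeEvent(102, "update", 1, {"name": "a2"}),
    ChangeEvent(103, "delete", 2),
]:
    acc.apply(ev)
print(acc.rows, acc.durable_lsn)
```

Reporting durable_lsn back to the source is what lets Postgres discard WAL segments that no consumer still needs.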
Multi-vector Embeddings with MaxSim (Late Interaction)
Column-level embeddings now support list-of-string columns, producing one embedding vector per list element and enabling ColBERT-style late-interaction retrieval.
Key capabilities:
- Multi-vector per Row: Columns of type List<String> produce List<FixedSizeList<F32, D>>, with one embedding per list element.
- MaxSim / Mean / Sum Scoring: The per-row score is the max, mean, or sum of the cosine similarities over the list elements. Default is MaxSim (ColBERT).
- _match Column: Returns the specific list element that produced the highest cosine similarity.
- No Schema Changes Required: Works with existing embedding configurations; activates automatically for list-type columns.
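The scoring modes above can be illustrated with a short Python sketch. It follows the description in these notes (one query vector compared against each element vector of a row) and is not Spice's implementation; the function names and the toy two-dimensional vectors are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_row(query_vec, element_vecs, mode="max"):
    """Return (score, index of the best-matching element) for one row."""
    sims = [cosine(query_vec, vec) for vec in element_vecs]
    if not sims:
        return 0.0, None
    best = max(range(len(sims)), key=sims.__getitem__)
    if mode == "max":
        score = sims[best]
    elif mode == "mean":
        score = sum(sims) / len(sims)
    elif mode == "sum":
        score = sum(sims)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return score, best


# Hypothetical row: three text chunks and their toy 2-d embeddings.
chunks = ["intro", "pricing", "setup guide"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]

score, best = score_row(query, vectors, mode="max")
print(score, chunks[best])  # the chunk a _match-style column would surface
```

The index of the best element is what makes it possible to report which list entry matched, in addition to the row score.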
Rerank UDTF for Hybrid Search
A new rerank() table-valued function reorders scored results from vector_search, text_search, or rrf by a reranker model's relevance judgements. See Search Functionality for an overview of search UDTFs.
Key capabilities:
- Auto Query Propagation: The query string is automatically inherited from a nested search UDTF, so it does not need to be repeated.
- Any Chat Model as Reranker: Any registered chat/completion model can serve as a reranker via the built-in LlmRerank adapter (listwise prompt by default; pointwise available).
- Filter and Projection Pushdown: The RerankExec physical node supports pushdown, reducing data movement.
- Extensible: A new RerankerModelStore sits alongside ChatModelStore and EmbeddingModelStore; native providers (Cohere, Voyage, BGE) can be added without runtime plumbing changes.
SELECT * FROM rerank(
  rrf(vector_search('my_table', 'query text'), text_search('my_table', 'query text')),
  document => content
) LIMIT 10;
New Secret Stores: HashiCorp Vault and Azure Key Vault
Two new enterprise-grade Secret Stores are now available.
HashiCorp Vault (hashicorp_vault):
- KV v2 (default) and KV v1 mount support.
- Auth methods: token, approle, kubernetes, jwt.
- Token leases are cached and automatically re-acquired on expiry.
secrets:
  - from: hashicorp_vault:secret/my-app
    name: my_secrets
    params:
      hashicorp_vault_addr: https://vault.example.com
      hashicorp_vault_auth_method: approle
      hashicorp_vault_role_id: ${env:VAULT_ROLE_ID}
      hashicorp_vault_secret_id: ${secrets:vault_secret_id}
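The lease behavior mentioned above (cache the token, re-acquire it on expiry) can be sketched as follows. This is a generic pattern under stated assumptions, not the Spice or Vault client code: LeaseCache, the login callable, and the five-second safety margin are all hypothetical.

```python
import time


class LeaseCache:
    """Cache a login token and log in again shortly before it expires."""

    def __init__(self, login, clock=time.monotonic, margin=5.0):
        self._login = login    # callable returning (token, ttl_seconds)
        self._clock = clock    # injectable clock, handy for testing
        self._margin = margin  # renew this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def token(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, ttl = self._login()
            self._token = token
            self._expires_at = now + ttl
        return self._token


# Usage with a fake login and a fake clock (both purely illustrative).
logins = []
fake_now = [0.0]


def fake_login():
    logins.append(1)
    return f"token-{len(logins)}", 60.0


cache = LeaseCache(fake_login, clock=lambda: fake_now[0])
first = cache.token()   # performs the initial login
fake_now[0] = 30.0
second = cache.token()  # still within the lease, served from cache
fake_now[0] = 56.0
third = cache.token()   # inside the renewal margin, logs in again
print(first, second, third)
```

Renewing slightly before the lease ends avoids handing out a token that expires while a request is in flight.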
Azure Key Vault (azure_keyvault):
- Per-key caching with single-flight fetch coalescing.
- Auth methods: service principal, managed identity, workload identity, Azure CLI, or auto-detect.
- Supports sovereign clouds via the endpoint parameter.
secrets:
  - from: azure_keyvault:my-vault
    name: my_secrets
    params:
      azure_keyvault_auth_method: managed_identity
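Single-flight fetch coalescing means that concurrent requests for the same key share one upstream call. The Python sketch below shows the general pattern with threads; it is an assumption-laden illustration, not the Spice implementation, and SingleFlightCache and its fetch callable are hypothetical names.

```python
import threading


class SingleFlightCache:
    """Per-key cache where concurrent misses share a single fetch."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: key -> value (the slow remote call)
        self._lock = threading.Lock()
        self._values = {}     # key -> cached value
        self._inflight = {}   # key -> Event set when the fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                event = threading.Event()
                self._inflight[key] = event
        if leader:
            try:
                value = self._fetch(key)
                with self._lock:
                    self._values[key] = value
                return value
            finally:
                # Always release waiters, even if the fetch raised.
                with self._lock:
                    del self._inflight[key]
                event.set()
        # Follower: wait for the leader, then read the cache. If the
        # leader failed, this retry elects a new leader.
        event.wait()
        return self.get(key)
```

For example, eight threads asking for the same secret at once would trigger one call to the fetch function, and every thread would receive the same value.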
DuckDB Vector Engine
DuckDB-accelerated tables can now use DuckDB's HNSW index for vector search via the vector_engine: duckdb option, enabling fast approximate nearest-neighbor search without an external vector store.
Example configuration:
datasets:
  - from: postgres:public.documents
    name: documents
    columns:
      - name: content
        embeddings:
          - from: hf_minilm
            row_id: id
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
    vectors:
      enabled: true
      engine: duckdb
      params:
        duckdb_distance_metric: cosine
        duckdb_hnsw_m: 16
        duckdb_hnsw_ef_construction: 64
        duckdb_hnsw_ef_search: 32

embeddings:
  - from: huggingface:huggingface.co/minishlab/potion-base-2M
    name: hf_minilm
New and Promoted Connectors
Azure Cosmos DB (RC):
A new read-only Azure Cosmos DB NoSQL / Core SQL API connector built on the azure_data_cosmos 0.30 SDK. Supports cross-partition scans, schema inference from document samples, and key-based auth (connection string or account endpoint + key).
Git Connector (RC):
The Git data connector is promoted to RC status with HTTPS/SSH auth (git_token, git_username/git_password, git_ssh_key), Git LFS support (enable_lfs), and per-repo connection resilience (semaphore, bounded retries with exponential backoff, permanent-error circuit breaking).
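The resilience pattern named above (bounded retries with exponential backoff, and no retries for permanent errors) can be sketched generically in Python. This is not the Git connector's code: ResilientRepo, PermanentError, and the delay values are hypothetical, and the per-repo semaphore is left out for brevity.

```python
import time


class PermanentError(Exception):
    """A failure that retrying cannot fix, such as rejected credentials."""


class ResilientRepo:
    def __init__(self, max_attempts=4, base_delay=0.5, max_delay=8.0,
                 sleep=time.sleep):
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._sleep = sleep    # injectable so tests need not wait
        self._tripped = None   # remembers the first permanent error

    def call(self, op):
        if self._tripped is not None:
            raise self._tripped  # circuit open: fail fast, no network call
        for attempt in range(1, self.max_attempts + 1):
            try:
                return op()
            except PermanentError as err:
                self._tripped = err  # trip the breaker and stop retrying
                raise
            except Exception:
                if attempt == self.max_attempts:
                    raise  # retry budget exhausted
                # Delays double each time: 0.5s, 1s, 2s, ... capped at max_delay.
                delay = min(self.max_delay, self.base_delay * 2 ** (attempt - 1))
                self._sleep(delay)
```

Transient failures are retried with growing pauses, while a permanent failure stops all further attempts against that repository until the client is recreated.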
DynamoDB Write Support (DML)
DynamoDB datasets now support write-back via INSERT, UPDATE, and DELETE operations, complementing the existing read and CDC streaming capabilities.
MCP Streamable HTTP Transport
The MCP server has been upgraded to rmcp 1.5.0 and switched to the Streamable HTTP transport (/v1/mcp), replacing the previous SSE-based endpoint. The client-side transport is updated to StreamableHttpClientTransport.
Security Improvements
Read-only API Key Enforcement: API keys with read-only scope are now strictly enforced on the Flight DoGet path and on async query endpoints, preventing write operations from being issued under a read-only key.
GitHub Workflow Hardening: CI workflows have been hardened with improved security posture to reduce supply-chain risk.
Developer Experience Improvements
- Actionable Config Errors: Parameter typos, missing secret references, and unknown engine names now produce specific, actionable error messages with Levenshtein-based suggestions, rather than silent drops or generic "missing required parameter" messages.
- spice init Improvements: Written spicepods now include a yaml-language-server: $schema=... directive for IDE completions. Creation messages print regardless of log level.
- REPL Improvements: Log filter honors RUST_LOG when -v is not passed; version banner moves to stderr and prints only on an interactive TTY.
- 403 / 401 Routing: HTTP 403 responses route to a new PermissionDenied variant; 401 messages point at spice login / SPICE_API_KEY.
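As an illustration of how Levenshtein-based suggestions for mistyped parameters can work, here is a small Python sketch. It shows the general technique only; the suggest helper, the distance threshold of 2, and the sample parameter names are assumptions rather than Spice's actual logic or message format.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def suggest(name, known, max_distance=2):
    """Return the closest known name, or None if nothing is close enough."""
    best = min(known, key=lambda k: (levenshtein(name, k), k), default=None)
    if best is not None and levenshtein(name, best) <= max_distance:
        return best
    return None


known_params = ["pg_host", "pg_port", "pg_db"]
typo = "pg_prot"
hint = suggest(typo, known_params)
if hint:
    print(f"unknown parameter '{typo}'; did you mean '{hint}'?")
```

Capping the distance keeps the tool from suggesting unrelated names when the input is far from every known parameter.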
OpenTelemetry Improvements
See Observability & Monitoring and the runtime.telemetry reference for full configuration details.
- Metric Name Prefix: Configure a prefix for all exported OTLP metric names via runtime.telemetry.metric_prefix.
- Delta Temporality Default: The OTLP push exporter now defaults to delta temporality, matching Prometheus and most backends.
- Resource Attributes: runtime.telemetry.properties are applied as OTLP resource attributes on exported metrics.
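For readers unfamiliar with the term, delta temporality means each export carries the change since the previous export instead of a running total. The Python sketch below converts a cumulative counter series to deltas; it is a conceptual illustration with a hypothetical function name, not the exporter's code.

```python
def cumulative_to_delta(samples):
    """Convert cumulative counter readings into per-interval deltas.

    The first reading, and any reading lower than its predecessor
    (treated here as a counter reset), is reported as-is.
    """
    deltas = []
    previous = None
    for value in samples:
        if previous is None or value < previous:
            deltas.append(value)
        else:
            deltas.append(value - previous)
        previous = value
    return deltas


# Cumulative request counts at five export intervals; the drop to 3
# models a process restart.
readings = [5, 8, 8, 15, 3]
print(cumulative_to_delta(readings))
```

With delta temporality the receiving backend sums the points to recover totals, so a restart does not appear as a negative jump.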
Full-text Search Performance
Tantivy full-text search ingestion performance is significantly improved with better batch handling and a rollback-on-error path.
SQL and Query Engine
- DataFusion Upgrade: Updated to a newer DataFusion revision with additional bug fixes and performance improvements.
- Views on DDL Catalogs: DDL-defined catalogs (e.g., Unity Catalog) can now expose and query views.
- flatten_json / json_tree / expand_maps UDTFs: New table-valued functions for JSON transformation, map expansion, and schema decomposition in query pipelines. See JSON Functions and Operators.
- cosine_distance Pushdown to DuckDB: cosine_distance is now pushed down to DuckDB accelerators via array_cosine_distance.
- Snowflake Type Support: Added support for OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, and TIMESTAMP_LTZ types in the Snowflake connector.
- MySQL Zero-Date Behavior: The MySQL connector adds a new mysql_zero_date_behavior parameter (null or error) controlling how MySQL zero-date values (0000-00-00) are handled.
- Databricks Timeouts: The Databricks connector adds new connect_timeout and client_timeout parameters for sql_warehouse mode.
Dependency Updates
| Dependency / Component | Version / Update |
|---|---|
| DataFusion | Updated |
| rmcp | v1.5.0 (from fork pin) |
| mistral.rs | Updated |
| openssl | 0.10.78 |
Breaking Changes
No breaking changes.
Cookbook Updates
No new cookbook recipes.
The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v2.0.0-rc.4, use one of the following methods:
CLI:
spice upgrade v2.0.0-rc.4
Homebrew:
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:2.0.0-rc.4 image:
docker pull spiceai/spiceai:2.0.0-rc.4
For available tags, see DockerHub.
Helm:
helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.4
AWS Marketplace:
Spice is available in the AWS Marketplace.
What's Changed
Changelog
- Integrate spiceio and makefile_targets into pr.yml by @lukekim in #10357
- ci: skip artifact compression for test binaries/archives by @lukekim in #10381
- chore(deps): bump spiceai/candle, spiceai/mistral.rs, aws-lc-rs, tantivy, rand by @lukekim in #10379
- Bump datafusion-table-providers (#10375) by @lukekim in #10384
- fix: Update Search integration test snapshots by @app/github-actions in #10376
- v2.0.0-rc.3 preparation by @ewgenius in #10382
- fix(spicepod): JSON schema accepts string or {name: expr} for partition_by by @lukekim in #10352
- fix: Use ROUND for Turso decimal BETWEEN comparisons (fixes #9872) by @claudespice in #10360
- Revert "v2.0.0-rc.3 preparation" from trunk by @ewgenius in #10386
- Add on_schema_resolved dataset ready state by @lukekim in #10368
- feat: Add Elasticsearch data connector with hybrid search support by @lukekim in #10258
- ci: bump test archive upload compression-level to 1 by @lukekim in #10388
- feat(git-connector): promote Git connector to RC status by @lukekim in #10385
- feat(postgres): stream WAL directly to Spice accelerators by @lukekim in #10364
- Add schema decomposition to the HTTP connector by @lukekim in #10393
- fix(cayenne): Skip catalog refresh state reload for existing providers by @sgrebnov in #10396
- Make cayenne-flightsql tool by @Jeadie in #10356
- build(deps): bump the github-actions-dependencies group with 2 updates by @app/dependabot in #10398
- Update openapi.json by @app/github-actions in #10272
- Merge develop to trunk - 2026-04-19 by @claudespice in #10407
- feat(otel): default OTLP push exporter to delta temporality by @phillipleblanc in #10412
- fix: Restore analyzer rule ordering to run federation before type coercion by @sgrebnov in #10415
- fix: Map Utf8/LargeUtf8 to STRING in Databricks/Spark SQL dialects by @sgrebnov in #10420
- feat(otel): add metric name prefix at runtime.telemetry.metric_prefix by @phillipleblanc in #10418
- fix: Map LargeUtf8 to VARCHAR in Athena ODBC dialect by @sgrebnov in #10419
- feat(cluster): connector-driven object store registration on executors by @phillipleblanc in #10414
- build(deps): bump ubuntu from 22.04 to 24.04 in the docker-dependencies group by @app/dependabot in #10397
- fix: Update benchmark snapshots Apr 20 by @app/github-actions in #10417
- feat(otel): apply runtime.telemetry.properties as resource attributes on exported metrics by @phillipleblanc in #10416
- Publish RC releases to DockerHub; upgrade runners to ubuntu-24.04 by @lukekim in #10428
- feat: Add Azure Cosmos DB (NoSQL) data connector (RC) by @lukekim in #10392
- feat(datafusion): flatten_json_properties + json_tree UDTFs by @lukekim in #10406
- Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL by @lukekim in #10365
- feat(embeddings): multi-vector embeddings with MaxSim + late-interaction by @lukekim in #10408
- Update GH runners for CUDA builds by @ewgenius in #10432
- fix(delta_lake): register object stores on cluster executors by @phillipleblanc in #10436
- DF-native DML by @krinart in #10327
- ci: run Build and Test on spiceai-macos; split install jobs by profile by @lukekim in #10434
- Improve search UDTFs: text_search, vector_search, rrf by @lukekim in #10387
- fix(model2vec): Improve robustness of model loading for sentence-transformers layouts by @sgrebnov in #10444
- Merge develop to trunk - 2026-04-21 by @claudespice in #10448
- Enable filter pushdown for vector_search UDTF by @sgrebnov in #10447
- Support Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, TIMESTAMP_LTZ types by @lukekim in #10451
- Fix Databricks tests by @krinart in #10449
- fix(cluster): forward register_object_stores through connector wrappers by @phillipleblanc in #10460
- Fixes for vector-search by @krinart in #10455
- Add expand_maps option and flatten_json UDTF by @lukekim in #10452
- fix: Update Search integration test snapshots by @app/github-actions in #10458
- Fix physical codec decode ambiguity for empty protobuf messages by @sgrebnov in #10466
- chore(logging): demote s3_single_file_cached skip refresh log to debug by @phillipleblanc in #10467
- Enable filter pushdown for rrf UDTF by @sgrebnov in #10465
- feat(cluster): consolidate distributed state into cluster.json by @phillipleblanc in #10463
- feat(cayenne): Add column statistics and data inlining by @lukekim in #10314
- docs(copilot): flag missing wrapper delegation when adding default trait methods by @phillipleblanc in #10461
- Wire Elasticsearch vector engine write path through acceleration by @lukekim in #10453
- Add helm lint CI by @ewgenius in #10468
- Fix Azure and GCS acceleration snapshot object store credential handling by @phillipleblanc in #10486
- Update spicepod.schema.json by @app/github-actions in #10485
- fix(secrets): harden AWS Secrets Manager secret store by @lukekim in #10478
- Update datafusion-ballista crate by @sgrebnov in #10488
- feat(secrets): add ParameterSpec and more params for AWS secrets manager by @phillipleblanc in #10487
- Add rerank UDTF for hybrid search with query auto-propagation by @lukekim in #10469
- Fix flatten_json_properties by @krinart in #10475
- fix: preserve field and schema metadata in expand_views_schema by @claudespice in #10494
- Upgrade rmcp to upstream 1.5.0; switch MCP server to Streamable HTTP by @lukekim in #10491
- fix: handle Snowflake TIMESTAMP_LTZ wire format and prevent nanosecond overflow by @claudespice in #10493
- Lint parity in Makefile by @krinart in #10492
- Add connect_timeout/client_timeout params to Databricks sql_warehouse mode by @lukekim in #10495
- fix(tracing): suppress opentelemetry INFO logs at all verbosity levels by @lukekim in #10497
- DynamoDB DML by @krinart in #10470
- feat(cayenne): native vector search via SIMD similarity UDFs by @lukekim in #10456
- fix(cli): suppress banner for all JSON-producing cloud subcommands (fixes #10498) by @claudespice in #10510
- fix(deps): bump openssl to 0.10.78 by @phillipleblanc in #10509
- fix(s3): quiet AWS SDK credential probe when no region is configured by @phillipleblanc in #10506
- fix(cdc): emit ready signal on caught-up Kafka/Debezium streams (#5201) by @phillipleblanc in #10504
- runtime-cluster crate + Run partition discovery before forwarding refresh to executors by @krinart in #10490
- Update lint-rust target to use --keep-going by @Jeadie in #10508
- Add TPC-H SF100 s3[parquet]-duckdb[file] benchmark spicepod by @lukekim in #10524
- Remove dev-profile install steps from pr.yml by @Jeadie in #10507
- fix: add missing NULL check on Timestamp path in append refresh by @claudespice in #10518
- fix: return error on Decimal128/256 overflow instead of silently dropping scale by @claudespice in #10519
- fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable by @claudespice in #10520
- feat(devx): make config errors, CLI, and REPL lead users to success by @lukekim in #10489
- fix(rerank): defer execution to RerankExec, enable filters and projection pushdown by @sgrebnov in #10514
- fix(llms): support Gemma models with missing attention_bias config field by @lukekim in #10523
- Fix vector_search silently ignoring named limit/column/include_score args by @sgrebnov in #10527
- fix: split unsupported filters locally in scan() for UseSource mode by @ewgenius in #10528
- feat(secrets): add Azure Key Vault secret store by @lukekim in #10496
- Bump mistralrs by @krinart in #10532
- Fix benchmark configurations and CI build issues by @sgrebnov in #10535
- Fix catalog query overrides for MySQL and MSSQL benchmarks by @sgrebnov in #10543
- For Cayenne, preserve matched columns for MERGE ... ON <cols> by @Jeadie in #10340
- build(deps): bump the aws-sdk group across 1 directory with 5 updates by @app/dependabot in #10538
- docs: update AI agent instructions (git workflow + Rust 1.94) by @lukekim in #10544
- fix: Update tpch benchmark snapshots by @app/github-actions in #10529
- fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-duckdb[file].yaml by @app/github-actions in #10525
- Extract runtime-datafusion from runtime by @krinart in #10545
- Use generic DML extension planner for Cayenne by @Jeadie in #10437
- fix: Update Search integration test snapshots by @app/github-actions in #10552
- Fix security and correctness audit issues by @lukekim in #10526
- fix(MySQL): revert MySQL result column reorder to fix federated query failures by @sgrebnov in #10557
- Fix protoc installation by @krinart in #10566
- fix: Disable Ballista dynamic filters on HashJoinExec by @peasee in #10548
- Support views on DDL catalogs by @Jeadie in #10554
- Update datafusion by @Jeadie in #10422
- Improve full-text search indexing performance by @sgrebnov in #10464
- feat(mysql): add mysql_zero_date_behavior parameter (null|error) by @phillipleblanc in #10573
- fix(snowflake): declare private_key in connector PARAMETERS (fixes #10517) by @claudespice in #10559
- Honour CARGO_TARGET_DIR in Makefiles by @Jeadie in #10569
- Enable cosine_distance pushdown to DuckDB accelerator via array_cosine_distance by @sgrebnov in #10564
- fix: Update test snapshots by @app/github-actions in #10570
- fix: Update tpch benchmark snapshots by @app/github-actions in #10560
- feat(snapshots): make snapshots an optional feature by @phillipleblanc in #10574
- Enforce read-only API key restrictions on Flight DoGet and async query paths by @Jeadie in #10551
- Improved security posture on Github workflows by @Jeadie in #10556
- fix: Update datafusion-table-providers to improve SqlTable filter pushdown by @sgrebnov in #10595
- feat(secrets): add HashiCorp Vault secret store by @phillipleblanc in #10561
- fix: delegate update() in UpsertDedupTableProvider to inner provider by @claudespice in #10593
- Add DuckDB vector engine support by @lukekim in #10562
- Sharepoint - add object-store listing connector with expanded auth and write support by @lukekim in #10473
- fix: Install protoc from source by @peasee in #10597
Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.3...v2.0.0-rc.4

