2 posts tagged with "Embedding"

Embedding related topics and usage.

Spice v2.0-rc.4 (Apr 30, 2026)

· 22 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.4! 🚀

v2.0.0-rc.4 is the fourth release candidate for advanced testing of v2.0, building on v2.0.0-rc.3.

Highlights in this release candidate include:

  • Elasticsearch Data Connector (Alpha) with native hybrid search (BM25 full-text + kNN vector + RRF)
  • PostgreSQL Native CDC via WAL logical replication, eliminating the need for Debezium or Kafka
  • Multi-vector Embeddings with MaxSim for ColBERT-style late-interaction retrieval
  • Rerank UDTF for hybrid search pipelines with automatic query propagation
  • HashiCorp Vault and Azure Key Vault Secret Stores for enterprise secret management
  • DuckDB Vector Engine with HNSW index support
  • Azure Cosmos DB Connector (Alpha), Git Connector promoted to RC
  • MCP Streamable HTTP transport
  • Read-only API Key Enforcement on Flight DoGet and async query paths

What's New in v2.0.0-rc.4

Elasticsearch Data Connector (Alpha, Spice.ai Enterprise)

The new Elasticsearch data connector enables querying Elasticsearch indexes as SQL tables with full hybrid search support. Currently available in Spice.ai Enterprise.

Key capabilities:

  • SQL Table Access: Query any Elasticsearch index with standard SQL via a native DataFusion TableProvider.
  • kNN Vector Search: Use the vector_search() UDTF against Elasticsearch-backed vector fields.
  • BM25 Full-Text Search: Use the text_search() UDTF for native Elasticsearch full-text queries.
  • Hybrid Search: Combine kNN and BM25 results with the rrf() UDTF for reciprocal rank fusion.
  • Elasticsearch as a Vector Engine: Accelerated datasets can use Elasticsearch as the backing vector engine for embedding storage and retrieval.
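
Reciprocal rank fusion, the algorithm behind the rrf() UDTF, scores each document by the sum of 1/(k + rank) across the input rankings. A minimal sketch (the smoothing constant k = 60 is the conventional default from the RRF literature; the value Spice uses is not stated here):

```python
# Sketch of reciprocal rank fusion (RRF), the technique behind rrf().
# k = 60 is the common default constant; Spice's actual value is an
# assumption here, not taken from the release notes.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" places highly in both the BM25 and kNN lists, so it tops the
# fused ranking even though it wins neither list's top position alone.
bm25 = ["a", "b", "c"]
knn = ["b", "d", "a"]
fused = rrf_fuse([bm25, knn])  # "b" ranks first
```

This is why hybrid search is robust: a document only needs to rank well in one retriever to surface, and agreement between retrievers compounds.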

Example configuration:

datasets:
  - from: elasticsearch:my_index
    name: my_data
    params:
      elasticsearch_endpoint: https://my-cluster.es.io:9200
      elasticsearch_username: ${secrets:es_user}
      elasticsearch_password: ${secrets:es_password}
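
With the dataset registered, the three UDTFs compose into a hybrid query. A sketch, assuming my_data has a searchable text column with embeddings configured (the query string is illustrative):

```sql
-- Hybrid search over the dataset above: BM25 and kNN results fused
-- with reciprocal rank fusion. Sketch only; exact result columns may
-- vary by configuration.
SELECT *
FROM rrf(
  vector_search('my_data', 'quarterly revenue guidance'),
  text_search('my_data', 'quarterly revenue guidance')
)
LIMIT 10;
```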

PostgreSQL Native Replication via WAL

Postgres datasets configured with refresh_mode: changes can now stream changes directly from PostgreSQL logical replication (WAL) into any local accelerator, without requiring Debezium or Kafka.

Key capabilities:

  • Native Logical Replication: Uses pgoutput decoding to stream INSERT/UPDATE/DELETE events.
  • Automatic Slot Management: Each Spice replica creates a distinct replication slot (spice_<dataset>_<hash>), so multi-replica deployments work automatically. Publications are shared.
  • Bootstrap Snapshot: An initial REPEATABLE READ snapshot seeds the accelerator before replication begins.
  • LSN Acknowledgement: The LsnCommitter sends the durable LSN back to Postgres so that WAL segments can be reclaimed.
  • All Accelerators Supported: Works with DuckDB, SQLite, Postgres, Cayenne, and Arrow accelerators.

Example configuration:

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_db: mydb
      pg_publication: my_publication # optional; auto-created if omitted
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes
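
Logical replication also has standard server-side prerequisites that are PostgreSQL requirements, not Spice-specific: wal_level must be logical (a restart is needed after changing it), and the role Spice connects as needs the REPLICATION attribute. A sketch (the role name is a placeholder):

```sql
-- One-time PostgreSQL server prerequisites for logical replication.
ALTER SYSTEM SET wal_level = 'logical';   -- requires a server restart
ALTER ROLE spice_user WITH REPLICATION;   -- 'spice_user' is a placeholder
```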

Multi-vector Embeddings with MaxSim (Late Interaction)

Column-level embeddings now support list-of-string columns, producing one embedding vector per list element and enabling ColBERT-style late-interaction retrieval.

Key capabilities:

  • Multi-vector per Row: Columns of type List<String> produce List<FixedSizeList<F32, D>>, one embedding per list element.
  • MaxSim / Mean / Sum Scoring: Per-row score is the max, mean, or sum cosine over the list elements. Default is MaxSim (ColBERT).
  • _match Column: Returns the specific list element that produced the highest cosine similarity.
  • No Schema Changes Required: Works with existing embedding configurations; activates automatically for list-type columns.
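
The per-row scoring described above can be sketched in a few lines. The vectors and function names here are illustrative toys, not Spice APIs; the second return value corresponds to what the _match column surfaces:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def maxsim_score(query_vec, element_vecs):
    """MaxSim: score a row by its best-matching list element."""
    sims = [cosine(query_vec, v) for v in element_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return sims[best], best  # (row score, index of the matching element)

# Toy 2-D vectors: the second element points nearly the same direction
# as the query, so it is the match that determines the row's score.
query = [1.0, 0.0]
elements = [[0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
score, idx = maxsim_score(query, elements)  # idx == 1
```

Mean and sum scoring replace the max with a mean or sum over the same per-element cosines.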

Rerank UDTF for Hybrid Search

A new rerank() table-valued function reorders scored results from vector_search, text_search, or rrf by a reranker model's relevance judgements. See Search Functionality for an overview of search UDTFs.

Key capabilities:

  • Auto Query Propagation: The query string is automatically inherited from a nested search UDTF; no repetition required.
  • Any Chat Model as Reranker: Any registered chat/completion model can serve as a reranker via the built-in LlmRerank adapter (listwise prompt by default; pointwise available).
  • Filter and Projection Pushdown: The RerankExec physical node supports pushdown, reducing data movement.
  • Extensible: A new RerankerModelStore sits alongside ChatModelStore and EmbeddingModelStore; native providers (Cohere, Voyage, BGE) can be added without runtime plumbing changes.

SELECT * FROM rerank(
  rrf(vector_search('my_table', 'query text'), text_search('my_table', 'query text')),
  document => content
) LIMIT 10;

New Secret Stores: HashiCorp Vault and Azure Key Vault

Two new enterprise-grade Secret Stores are now available.

HashiCorp Vault (hashicorp_vault):

  • KV v2 (default) and KV v1 mount support.
  • Auth methods: token, approle, kubernetes, jwt.
  • Token leases are cached and automatically re-acquired on expiry.

secrets:
  - from: hashicorp_vault:secret/my-app
    name: my_secrets
    params:
      hashicorp_vault_addr: https://vault.example.com
      hashicorp_vault_auth_method: approle
      hashicorp_vault_role_id: ${env:VAULT_ROLE_ID}
      hashicorp_vault_secret_id: ${secrets:vault_secret_id}

Azure Key Vault (azure_keyvault):

  • Per-key caching with single-flight fetch coalescing.
  • Auth methods: service principal, managed identity, workload identity, Azure CLI, or auto-detect.
  • Supports sovereign clouds via endpoint parameter.

secrets:
  - from: azure_keyvault:my-vault
    name: my_secrets
    params:
      azure_keyvault_auth_method: managed_identity

DuckDB Vector Engine

DuckDB-accelerated tables can now use DuckDB's HNSW index for vector search via the vector_engine: duckdb option, enabling fast approximate nearest-neighbor search without an external vector store.

Example configuration:

datasets:
  - from: postgres:public.documents
    name: documents
    columns:
      - name: content
        embeddings:
          - from: hf_minilm
            row_id: id
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      vectors:
        enabled: true
        engine: duckdb
        params:
          duckdb_distance_metric: cosine
          duckdb_hnsw_m: 16
          duckdb_hnsw_ef_construction: 64
          duckdb_hnsw_ef_search: 32

embeddings:
  - from: huggingface:huggingface.co/minishlab/potion-base-2M
    name: hf_minilm
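
Queries against the accelerated dataset then use the same vector_search() UDTF shown earlier, served by the HNSW index. A sketch, with an illustrative query string (result columns depend on the dataset schema):

```sql
-- Approximate nearest-neighbor search answered by DuckDB's HNSW index.
SELECT id, content
FROM vector_search('documents', 'how do I rotate credentials?')
LIMIT 5;
```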

New and Promoted Connectors

Azure Cosmos DB (Alpha):

A new read-only Azure Cosmos DB NoSQL / Core SQL API connector built on the azure_data_cosmos 0.30 SDK. Supports cross-partition scans, schema inference from document samples, and key-based auth (connection string or account endpoint + key).

Git Connector (RC):

The Git data connector is promoted to RC status with HTTPS/SSH auth (git_token, git_username/git_password, git_ssh_key), Git LFS support (enable_lfs), and per-repo connection resilience (semaphore, bounded retries with exponential backoff, permanent-error circuit breaking).

DynamoDB Write Support (DML)

DynamoDB datasets now support write-back via INSERT, UPDATE, and DELETE operations, complementing the existing read and CDC streaming capabilities.

MCP Streamable HTTP Transport

The MCP server has been upgraded to rmcp 1.5.0 and switched to the Streamable HTTP transport (/v1/mcp), replacing the previous SSE-based endpoint. The client-side transport is updated to StreamableHttpClientTransport.

Security Improvements

Read-only API Key Enforcement: API keys with read-only scope are now strictly enforced on the Flight DoGet path and on async query endpoints, preventing write operations from being issued under a read-only key.

GitHub Workflow Hardening: CI workflows have been hardened with improved security posture to reduce supply-chain risk.

Developer Experience Improvements

  • Actionable Config Errors: Parameter typos, missing secret references, and unknown engine names now produce specific, actionable error messages with Levenshtein-based suggestions, rather than silent drops or generic "missing required parameter" messages.
  • spice init Improvements: Written spicepods now include a yaml-language-server: $schema=... directive for IDE completions. Creation messages print regardless of log level.
  • REPL Improvements: Log filter honors RUST_LOG when -v is not passed; version banner moves to stderr and prints only on an interactive TTY.
  • 403 / 401 Routing: HTTP 403 responses route to a new PermissionDenied variant; 401 messages point at spice login / SPICE_API_KEY.

OpenTelemetry Improvements

See Observability & Monitoring and the runtime.telemetry reference for full configuration details.

  • Metric Name Prefix: Configure a prefix for all exported OTLP metric names via runtime.telemetry.metric_prefix.
  • Delta Temporality Default: The OTLP push exporter now defaults to delta temporality, matching Prometheus and most backends.
  • Resource Attributes: runtime.telemetry.properties are applied as OTLP resource attributes on exported metrics.
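
Putting the three settings together, a sketch of the corresponding spicepod configuration (key placement should be checked against the runtime.telemetry reference; the property values are illustrative):

```yaml
runtime:
  telemetry:
    metric_prefix: spice_            # prefixed onto exported OTLP metric names
    properties:                      # applied as OTLP resource attributes
      deployment.environment: production
      service.team: data-platform
```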

Full-text Search Performance

Tantivy full-text search ingestion performance is significantly improved with better batch handling and a rollback-on-error path.

SQL and Query Engine

  • DataFusion Upgrade: Updated to a newer DataFusion revision with additional bug fixes and performance improvements.
  • Views on DDL Catalogs: DDL-defined catalogs (e.g., Unity Catalog) can now expose and query views.
  • flatten_json / json_tree / expand_maps UDTFs: New table-valued functions for JSON transformation, map expansion, and schema decomposition in query pipelines. See JSON Functions and Operators.
  • cosine_distance Pushdown to DuckDB: cosine_distance is now pushed down to DuckDB accelerators via array_cosine_distance.
  • Snowflake Type Support: Added support for OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, and TIMESTAMP_LTZ types in the Snowflake connector.
  • MySQL Zero-Date Behavior: The MySQL connector adds a new mysql_zero_date_behavior parameter (null or error) controlling how MySQL zero-date values (0000-00-00) are handled.
  • Databricks Timeouts: The Databricks connector adds new connect_timeout and client_timeout parameters for sql_warehouse mode.

Dependency Updates

| Dependency / Component | Version / Update       |
| ---------------------- | ---------------------- |
| DataFusion             | Updated                |
| rmcp                   | v1.5.0 (from fork pin) |
| mistral.rs             | Updated                |
| openssl                | 0.10.78                |

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.4, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.4

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.4 image:

docker pull spiceai/spiceai:2.0.0-rc.4

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.4

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.3...v2.0.0-rc.4

Spice v1.0.4 (Feb 17, 2025)

· 3 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v1.0.4 🏎️

Spice v1.0.4 improves partition pruning for Delta Lake tables, significantly increasing scan efficiency and reducing overhead. xAI tool calling is more robust and the spice trace CLI command now provides expanded, detailed output for deeper analysis. Additionally, a bug has been fixed to correctly apply column name case-sensitivity in refresh SQL, indexes, and primary keys.

Highlights in v1.0.4

  • Improved Append-Based Refresh: When using an append-based acceleration where the time_column format differs from the physical partition format, two new dataset configuration options, time_partition_column and time_partition_format, can be configured to improve partition pruning and exclude irrelevant partitions during refreshes.

For example, when the time_column format is timestamp but the data is physically partitioned by date, as below:

my_delta_table/
├── _delta_log/
├── date_col=2023-12-31/
├── date_col=2024-02-04/
├── date_col=2025-01-01/
└── date_col=2030-06-15/

Partition pruning can be optimized using the configuration:

datasets:
  - from: delta_lake://my_delta_table
    name: my_delta_table
    time_column: created_at # A fine-grained timestamp
    time_format: timestamp
    time_partition_column: date_col # Data is physically partitioned by `date_col`
    time_partition_format: date
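
The pruning idea can be sketched as widening a fine-grained time_column filter to the coarser partition column so whole date_col=... folders are skipped. The function and names below are illustrative, not Spice APIs:

```python
from datetime import datetime

def partitions_to_scan(partitions: list[str],
                       start: datetime, end: datetime) -> list[str]:
    """Keep only the date partitions overlapping [start, end].

    A timestamp filter on created_at is widened to whole days, so any
    partition folder entirely outside the range is never read.
    """
    return [p for p in partitions
            if start.date() <= datetime.strptime(p, "%Y-%m-%d").date() <= end.date()]

# The four partitions from the layout above: only two overlap the
# filter, so the 2023 and 2030 folders are pruned from the refresh.
parts = ["2023-12-31", "2024-02-04", "2025-01-01", "2030-06-15"]
keep = partitions_to_scan(parts, datetime(2024, 1, 1), datetime(2025, 6, 30))
```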
  • Expanded spice trace output: The spice trace CLI command now includes additional details, such as task status, and supports the optional flags --include-input and --include-output for detailed tracing.

Example spice trace output:

TREE                     STATUS  DURATION  TASK
a97f52ccd7687e64         ✅      673.14ms  ai_chat
├── 4eebde7b04321803     ✅      0.04ms    tool_use::list_datasets
└── 4c9049e1bf1c3500     ✅      671.91ms  ai_completion

Example spice trace --include-input --include-output output:

TREE                     STATUS  DURATION  TASK                     OUTPUT
a97f52ccd7687e64         ✅      673.14ms  ai_chat                  The capital of New York is Albany.
├── 4eebde7b04321803     ✅      0.04ms    tool_use::list_datasets  []
└── 4c9049e1bf1c3500     ✅      671.91ms  ai_completion            [{"content":"The capital of New York is Albany.","refusal":null,"tool_calls":null,"role":"assistant","function_call":null,"audio":null}]

Contributors

  • @Jeadie
  • @peasee
  • @phillipleblanc
  • @Sevenannn
  • @sgrebnov
  • @lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.4, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.4 image:

docker pull spiceai/spiceai:1.0.4

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

- Do not return underlying content of chunked embedding column by default during tool_use::document_similarity by @Jeadie in https://github.com/spiceai/spiceai/pull/4802
- Fix Snowflake Case-Sensitive Identifiers support by @sgrebnov in https://github.com/spiceai/spiceai/pull/4813
- Prepare for 1.0.4 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4801
- Add support for a time_partition_column by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4784
- Prevent the automatic normalization of refresh_sql columns to lowercase by @sgrebnov in https://github.com/spiceai/spiceai/pull/4787
- Implement partition pruning for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4783
- Fix constraint verification for columns with uppercase letters by @sgrebnov in https://github.com/spiceai/spiceai/pull/4785
- Add truncate command for spice trace by @peasee in https://github.com/spiceai/spiceai/pull/4771
- Implement Cache-Control: no-cache to bypass results cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4763
- Prompt user to download runtime when running spice sql by @Sevenannn in https://github.com/spiceai/spiceai/pull/4747
- Add vector search tracing by @peasee in https://github.com/spiceai/spiceai/pull/4757
- Update spice trace output format by @Jeadie in https://github.com/spiceai/spiceai/pull/4750
- Fix tool call arguments in Grok messages by @Jeadie in https://github.com/spiceai/spiceai/pull/4741

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.3...v1.0.4