16 posts tagged with "duckdb"

DuckDB database topics and usage

Spice v1.5.0 (July 21, 2025)

July 22, 2025 · 14 min read

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.5.0! 🔍

Spice v1.5.0 brings major upgrades to search and retrieval. It introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and tantivy-powered full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. The release also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.

What's New in v1.5.0

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle: ingesting data, creating embeddings with models such as Amazon Titan or Cohere via AWS Bedrock (or others available on HuggingFace), and storing the resulting vectors in S3 vector buckets.

Spice integration with Amazon S3 Vectors

Example Spicepod.yml configuration for S3 Vectors:

datasets:
  - from: s3://my_data_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id

Example SQL query using S3 Vectors:

SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score

For more details, refer to the S3 Vectors Documentation.

SQL-integrated Search: Vector and BM25-scored full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Read the DuckDB v1.2.0 announcement.
Read the DuckDB v1.3.0 announcement.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
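
To make the two partitioning UDFs concrete, here is a minimal Python sketch of their described behavior. It is an illustration only: the CRC32 hash is an assumption standing in for whatever hash Spice uses internally, and the account IDs are made up.

```python
import zlib


def truncate(width, value):
    """Align value down to the nearest lower multiple of width."""
    return (value // width) * width


def bucket(num_buckets, value):
    """Map a value to one of num_buckets buckets via a stable hash.

    Assumption: CRC32 is a stand-in; the hash Spice actually uses is not stated.
    """
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


# Group some hypothetical account IDs into 100 partitions.
account_ids = [17, 42, 1001, 42, 77777]
partitions = {}
for account_id in account_ids:
    partitions.setdefault(bucket(100, account_id), []).append(account_id)

# A filter such as `account_id = 42` only needs the partition that 42 hashes to;
# every other partition can be skipped (pruned).
target_partition = bucket(100, 42)
candidates = partitions[target_partition]
```

Because equal values always hash to the same bucket, an equality or IN-list predicate on the partition column identifies the few partitions worth scanning.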

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour

For more details, refer to Scheduled Refreshes.
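
To show what a schedule such as '0 * * * *' selects, here is a small Python sketch of five-field cron matching. The cron_matches helper is hypothetical and deliberately limited: it handles only * and plain integers and combines all fields with AND. It is not how Spice evaluates refresh_cron.

```python
from datetime import datetime


def cron_matches(expr, when):
    """Return True if `when` matches a 5-field cron expression.

    Fields: minute, hour, day of month, month, day of week (0 = Sunday).
    Only `*` and plain integers are handled; ranges, lists and steps are not.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    # isoweekday() returns 7 for Sunday; cron conventionally uses 0.
    actual = [when.minute, when.hour, when.day, when.month, when.isoweekday() % 7]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


on_the_hour = cron_matches("0 * * * *", datetime(2025, 7, 21, 14, 0))
half_past = cron_matches("0 * * * *", datetime(2025, 7, 21, 14, 30))
```

With the hourly expression, only timestamps whose minute is 0 match, so the view would be refreshed once at the top of each hour.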

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
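
Reciprocal rank fusion can be sketched in a few lines of Python. This is a generic illustration of the technique, not Spice's implementation: the constant k = 60 is a commonly used default and an assumption here, and the issue IDs are invented.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))


# Hypothetical per-column results for the `title` and `body` embeddings.
title_hits = ["issue-7", "issue-3", "issue-9"]
body_hits = ["issue-3", "issue-5", "issue-7"]
fused = reciprocal_rank_fusion([title_hits, body_hits])
```

Items that rank well in both columns rise to the top, while items found in only one column still appear, just lower down.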

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger

See the Oracle Data Connector documentation.

GitHub Data Connector: The GitHub data connector now supports querying and accelerating members, the users of an organization.

Example Spicepod.yml configuration:

datasets:
  - from: github:github.com/spiceai/members # General format: github.com/[org-name]/members
    name: spiceai.members
    params:
      # With GitHub Apps (recommended)
      github_client_id: ${secrets:GITHUB_SPICEHQ_CLIENT_ID}
      github_private_key: ${secrets:GITHUB_SPICEHQ_PRIVATE_KEY}
      github_installation_id: ${secrets:GITHUB_SPICEHQ_INSTALLATION_ID}
      # With GitHub Tokens
      # github_token: ${secrets:GITHUB_TOKEN}

See the GitHub Data Connector documentation.

Spice.ai Cloud Data Connector: Graduated to Stable.

spice-rs SDK Release: The Spice Rust SDK has been updated to v3.0.0. This release optimizes the Spice client API and adds robust query retries and custom metadata configuration for Spice queries.

Breaking Changes

Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.

Cookbook Updates

Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0 or pull the v1.5.0 Docker image (spiceai/spiceai:1.5.0).

What's Changed

Dependencies

delta_kernel: Upgraded to v0.12.1
DuckDB: Upgraded from v1.1.3 to v1.3.2
iceberg-rust: Upgraded from v0.4.0 to v0.5.1

Changelog

fix: openai model endpoint (#6394) by @Sevenannn in #6394
Enable configuring otel endpoint from spice run (#6360) by @Advayp in #6360
Enable Oracle connector in default build configuration (#6395) by @sgrebnov in #6395
fix llm integraion test (#6398) by @Sevenannn in #6398
Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
Fix model nsql integration tests (#6365) by @Sevenannn in #6365
Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
Improve error messages (#6405) by @lukekim in #6405
build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
add supported types (#6409) by @kczimm in #6409
Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
Provide error message when partition by expression changes (#6415) by @kczimm in #6415
Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446
v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
Oracle: add automated TPC-H SF1 benchmark tests (#6449) by @sgrebnov in #6449
fix: Update benchmark snapshots (#6455) by @app/github-actions in #6455
Preserve ArrowError in arrow_tools::record_batch (#6454) by @mach-kernel in #6454
fix: Update benchmark snapshots (#6465) by @app/github-actions in #6465
Add option to preinstall Oracle ODPI-C library in Docker image (#6466) by @sgrebnov in #6466
Include Oracle connector (federated mode) in automated benchmarks (#6467) by @sgrebnov in #6467
Update crates/llms/src/bedrock/embed/mod.rs by @lukekim in #6468
v1.5.0-rc.3 release notes (#6474) by @lukekim in #6474
Add integration tests for S3 Vectors filters pushdown (#6469) by @sgrebnov in #6469
check for indexedtableprovider when finding tables to search on (#6478) by @Jeadie in #6478
Parse fully qualified table names in UDTFs (#6461) by @Jeadie in #6461
Add integration test for S3 Vectors to cover data update (overwrite) (#6480) by @sgrebnov in #6480
Add 'Run all tests' option for models tests and enable Bedrock tests (#6481) by @sgrebnov in #6481
Add support for a members table type for the GitHub Data Connector (#6464) by @Advayp in #6464
S3 vector data cannot be null (#6483) by @Jeadie in #6483
Don't infer FixedSizeList size during indexing vectors. (#6487) by @Jeadie in #6487
Add support for retention_sql acceleration param (#6488) by @sgrebnov in #6488
Make dataset refresh progress tracing less verbose (#6489) by @sgrebnov in #6489
Use RwLock on tantivy index in FullTextDatabaseIndex for update concurrency (#6490) by @Jeadie in #6490
Add tests for dataset retention logic and refactor retention code (#6495) by @sgrebnov in #6495
Upgade dependabot dependencies (#6497) by @phillipleblanc in #6497
Add periodic tracing of data loading progress during dataset refresh (#6499) by @sgrebnov in #6499
Promote Oracle Data Connector to Alpha (#6503) by @sgrebnov in #6503
Use AWS SDK to provide credentials for Iceberg connectors (#6498) by @phillipleblanc in #6498
Add integration tests for partitioning (#6463) by @kczimm in #6463
Use top-level table in full-text search JOIN ON (#6491) by @Jeadie in #6491
Use accelerated table in vector_search JOIN operations when appropriate (#6516) by @Jeadie in #6516
Fix 'additional_column' for quoted columns (fix for qualified columns broke it) (#6512) by @Jeadie in #6512
Also use AWS SDK for inferring credentials for S3/Delta/Databricks Delta data connectors (#6504) by @phillipleblanc in #6504
Add per-dataset availability monitor configuration (#6482) by @phillipleblanc in #6482
Suppress the warning from the AWS SDK if it can't load credentials (#6533) by @phillipleblanc in #6533
Change default value of check_availability from default to auto (#6534) by @lukekim in #6534
README.md improvements for v1.5.0 (#6539) by @lukekim in #6539
Temporary disable s3_vectors_basic (#6537) by @sgrebnov in #6537
Ensure binder errors show before query and other (#6374) by @suhuruli in #6374
Update spiceai/duckdb-rs -> DuckDB 1.3.2 + index fix (#6496) by @mach-kernel in #6496
Update table-providers to latest version with DuckDB fixes (#6535) by @phillipleblanc in #6535
S3: default to public access if no auth is provided (#6532) by @sgrebnov in #6532

Spice v1.3.2 (June 2, 2025)

June 2, 2025 · 2 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.3.2! ❄️

Spice v1.3.2 is a patch release with fixes to the DuckDB data accelerator and Snowflake data connector.

Changes:

DuckDB Data Accelerator: Supports ORDER BY rand() for randomized result ordering and ORDER BY NULL for SQL compatibility.
Snowflake Data Connector: Adds TIMESTAMP_NTZ(0) type for timestamps with seconds precision.

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.3.2 image:

docker pull spiceai/spiceai:1.3.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

Handle Snowflake Timestamp NTZ with seconds precision (#6084) by @kczimm in #6084
Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.3.1...v1.3.2

Spice v1.3.0 (May 19, 2025)

May 20, 2025 · 9 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.3.0! 🏎️

Spice v1.3.0 accelerates data and AI applications with significantly improved query performance, reliability, and expanded Databricks integration. New support for the Databricks SQL Statement Execution API enables direct SQL queries on Databricks SQL Warehouses, complementing Mosaic AI model serving and embeddings (introduced in v1.2.2) and existing Databricks catalog and dataset integrations. This release upgrades to DataFusion v46, optimizes results caching performance, and strengthens security with least-privilege sandboxing improvements.

What's New in v1.3.0

Databricks SQL Statement Execution API Support: Added support for the Databricks SQL Statement Execution API, enabling direct SQL queries against Databricks SQL Warehouses for optimized performance in analytics and reporting workflows.

Example spicepod.yml configuration:

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_awesome_table
    params:
      mode: sql_warehouse
      databricks_endpoint: ${env:DATABRICKS_ENDPOINT}
      databricks_sql_warehouse_id: ${env:DATABRICKS_SQL_WAREHOUSE_ID}
      databricks_token: ${env:DATABRICKS_TOKEN}

For details, see the Databricks Data Connector documentation.

Improved Results Cache Performance & Hashing Algorithm: Spice now supports an alternative results cache hashing algorithm, ahash, in addition to siphash, which remains the default. Configure it via:
```
runtime:
  results_cache:
    hashing_algorithm: ahash # or siphash
```
The hashing algorithm determines how cache keys are hashed before being stored, impacting both lookup speed and protection against potential denial-of-service (DoS) attacks.

Using ahash improves performance for large queries or query plans. Combined with results cache optimizations, it reduces 99th percentile request latency and increases total requests/second for queries with large result sets (100k+ cached rows). The following charts show performance tested against TPC-H Query #17 on a scale factor 5 dataset (30+ million rows, 5GB):

Charts: Latency and Req/sec

Note: ahash was not available in v1.2.2, so it is excluded from comparisons.

To learn more, refer to the Results Cache Hashing Algorithm documentation.

SQL Query Performance: Optimized the critical SQL query path, reducing overhead and improving response times for simple queries by 10-20%.

DuckDB Acceleration: Fixed a bug in the DuckDB acceleration engine causing query failures under high concurrency when querying datasets accelerated into multiple DuckDB files.

Container Security: The container image now runs as a non-root user with enhanced sandboxing and includes only essential dependencies for a slimmer, more secure image.

DataFusion v46 Highlights

Spice.ai is built on the DataFusion query engine. The v46 release brings:

Faster Performance 🚀: DataFusion 46 introduces significant performance enhancements, including a 2x faster median() function for large datasets without grouping, 10–100% speed improvements in FIRST_VALUE and LAST_VALUE window functions by avoiding sorting, and a 40x faster uuid() function. Additional optimizations, such as a 50% faster repeat() string function, accelerated chr() and to_hex() functions, improved grouping algorithms, and Parquet row group pruning with NOT LIKE filters, further boost overall query efficiency.
New range() Table Function: A new table-valued function range(start, stop, step) has been added to make it easy to generate integer sequences — similar to PostgreSQL’s generate_series() or Spark’s range(). Example: SELECT * FROM range(1, 10, 2);
UNION [ALL | DISTINCT] BY NAME Support: DataFusion now supports UNION BY NAME and UNION ALL BY NAME, which align columns by name instead of position. This matches functionality found in systems like Spark and DuckDB and simplifies combining heterogeneously ordered result sets.

Example:
```
SELECT col1, col2 FROM t1
UNION ALL BY NAME
SELECT col2, col1 FROM t2;
```
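
The difference between aligning by position and aligning by name can be illustrated with a short Python sketch, using lists of dictionaries as stand-ins for tables. The helper names and sample rows are hypothetical, and the sketch assumes both inputs carry the same set of column names.

```python
def union_all_by_position(left, right):
    """Concatenate rows, taking each row's values in its own column order."""
    return [tuple(row.values()) for row in left + right]


def union_all_by_name(left, right):
    """Concatenate rows, reading every row in the left input's column order."""
    columns = list(left[0].keys())
    return [tuple(row[name] for name in columns) for row in left + right]


# t1 selects (col1, col2); t2 selects (col2, col1), as in the SQL example.
t1 = [{"col1": 1, "col2": "a"}]
t2 = [{"col2": "b", "col1": 2}]

by_position = union_all_by_position(t1, t2)
by_name = union_all_by_name(t1, t2)
```

By position, the second row ends up with its values swapped relative to the first; by name, both rows line up as (col1, col2).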

See the DataFusion 46.0.0 release notes for details.

Spice.ai adopts the latest-minus-one DataFusion release for quality assurance and stability. The upgrade to DataFusion v47 is planned for Spice v1.4.0 in June.

Breaking Changes

The container image now always runs as a non-root user (UID/GID 65534) with minimal dependencies, resulting in a smaller, more secure image. Standard Linux tools, including bash, are no longer included.

Kubernetes Deployments:

Use of the v1.3.0+ Helm chart is required, which includes a securityContext ensuring the sandbox user has required file access.
For deployments using a lower version than the v1.3.0 Helm chart, add the following securityContext to the pod specification:

securityContext:
  runAsUser: 65534
  runAsGroup: 65534
  fsGroup: 65534

See the Docker Sandbox Guide for details on how to update custom Docker images to restore the previous behavior.

Cookbook Updates

Added Accelerated Views: Pre-calculate and materialize data derived from one or more underlying datasets.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.3.0 image:

docker pull spiceai/spiceai:1.3.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

DataFusion: Upgraded to v46
Apache Arrow: Upgraded to v54.3.0
delta_kernel: Upgraded to v0.10.0

Changelog

update to 1.2.2 by @Jeadie in #5806
Move sandboxing logic to Dockerfile by @phillipleblanc in #5808
Add note to run installation health workflow after release is marked as official by @Sevenannn in #5797
ROADMAP updates May 13, 2025 by @lukekim in #5809
Update qa_analytics.csv by @kczimm in #5810
post-release housekeeping by @Jeadie in #5811
Fix flaky DataBricks M2M integration tests by @phillipleblanc in #5818
Add DataFusion request context extension to http routes by @ewgenius in #5807
Use Utf8 for partition columns by @phillipleblanc in #5820
Use full path for location metadata column by @phillipleblanc in #5819
Remove the DataFusion reference from the flight service and use the reference from the request context instead by @ewgenius in #5821
Upgrade delta_kernel to 0.10 by @phillipleblanc in #5823
fix: Update benchmark snapshots by @app/github-actions in #5827
Update qa_analytics.csv by @kczimm in #5824
fix: Update benchmark snapshots by @app/github-actions in #5826
fix: Update benchmark snapshots by @app/github-actions in #5825
Fix dispatch spicepod reference for file[parquet]-duckdb[file]-indexes and file[parquet]-duckdb[memory]-indexes by @phillipleblanc in #5837
Fix spice run --http-endpoint in CLI by @Jeadie in #5812
Prevent excessively copying RawCacheKey by @peasee in #5838
Make DuckDB database attachments logic more robust by @sgrebnov in #5839
Simplify Databricks U2M auth flow, by moving user auth to the request context by @ewgenius in #5842
Update to new MCP crate by @Jeadie in #5758
Disable the query tracker when task history is disabled by @peasee in #5852
Set fsGroup on PodSpec to force volumes to be mounted with permission to docker image by @phillipleblanc in #5854
Clarify Helm release steps by @phillipleblanc in #5855
Avoid cloning cached results by @peasee in #5853
Upgrade to DataFusion 46 by @phillipleblanc in #5543
Update openapi.json by @app/github-actions in #5856
Adapt to Arrow 54 changes in Dict IDs preserving (Arrow IPC) by @sgrebnov in #5866
fix: Update benchmark snapshots by @app/github-actions in #5867
Fix s3[parquet]-duckdb[file-many] benchmark Spicepod configuration by @sgrebnov in #5868
fix: Update benchmark snapshots by @app/github-actions in #5869
feat: Refactor caching, support hashing algorithms by @peasee in #5859
Overried health checks for Databricks models in U2M auth mode by @ewgenius in #5858
Update trunk to 1.4.0-unstable by @phillipleblanc in #5878
fix: Pass parameters to testoperator explain plan by @peasee in #5883
Disallow schema updates for existing accelerated tables by @phillipleblanc in #5887
Deferrable registration for Databricks U2M datasets by @ewgenius in #5860

See the full list of changes at: v1.2.2...v1.3.0

Spice v1.1.0 (Mar 31, 2025)

April 1, 2025 · 20 min read

Luke Kim

Founder and CEO of Spice AI

Model-Context-Protocol (MCP) support in Spice.ai Open Source

Announcing the release of Spice v1.1.0! 🤖

Spice v1.1.0 introduces full support for the Model-Context-Protocol (MCP), expanding how models and tools connect. Spice can now act as both an MCP Server, with the new /v1/mcp/sse API, and an MCP Client, supporting stdio and SSE-based servers. This release also introduces a new Web Search tool with Perplexity model support, advanced evaluation workflows with custom eval scorers, including LLM-as-a-judge, and an IMAP Data Connector for federated SQL queries across email servers. Alongside these features, v1.1.0 includes automatic NSQL query retries, expanded task tracing, and request draining for HTTP server shutdowns, delivering improved reliability, flexibility, and observability.

Highlights in v1.1.0

Spice as an MCP Server and Client: Spice now supports the Model Context Protocol (MCP) for expanded tool discovery and connectivity. Spice can:
1. Run stdio-based MCP servers internally.
2. Connect to external MCP servers over the SSE protocol (Streamable HTTP is coming soon).

For more details, see the MCP documentation.

Usage
```
tools:
  - name: google_maps
    from: mcp:npx
    params:
      mcp_args: -y @modelcontextprotocol/server-google-maps
```
Spice as an MCP Server

Tools in Spice can be accessed via MCP, for example from an IDE such as Cursor or Windsurf. Set the MCP Server URL to http://localhost:8090/v1/mcp/sse.

Perplexity Model Support: Spice now supports Perplexity-hosted models, enabling advanced web search and retrieval capabilities. Example configuration:

models:
  - name: webs
    from: perplexity:sonar
    params:
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
      perplexity_search_domain_filter:
        - docs.spiceai.org
        - huggingface.co

For more details, see the Perplexity documentation.

Web Search Tool: The new Web Search Tool enables Spice models to search the web for information using search engines like Perplexity. Example configuration:

tools:
  - name: the_internet
    from: websearch
    description: 'Search the web for information.'
    params:
      engine: perplexity
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }

For more details, see the Web Search Tool documentation.

Eval Scorers: Eval scorers assess model performance on evaluation cases. Spice includes built-in scorers:

match: Exact match.
json_match: JSON equivalence.
includes: Checks if actual output includes expected output.
fuzzy_match: Normalized subset matching.
levenshtein: Levenshtein distance.
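
As a rough illustration of what these built-in scorers measure, here is a minimal Python sketch of four of them. The function names and return conventions are assumptions for illustration; how Spice normalizes or reports scores (for example, turning a Levenshtein distance into a 0 to 1 score) is not stated here, so the sketch returns the raw edit distance.

```python
import json


def match(actual, ideal):
    """Exact match: 1.0 when the strings are identical."""
    return 1.0 if actual == ideal else 0.0


def json_match(actual, ideal):
    """JSON equivalence: compare parsed values, ignoring key order and whitespace."""
    return 1.0 if json.loads(actual) == json.loads(ideal) else 0.0


def includes(actual, ideal):
    """1.0 when the actual output contains the expected output."""
    return 1.0 if ideal in actual else 0.0


def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

For instance, json_match treats '{"a": 1, "b": 2}' and '{"b":2,"a":1}' as equal even though match would not.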

Custom scorers can use embedding models or LLMs as judges. Example:

evals:
  - name: australia
    dataset: cricket_questions
    scorers:
      - hf_minilm
      - judge
      - match
embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
models:
  - name: judge
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      system_prompt: |
        Compare these stories and score their similarity (0.0 to 1.0).
        Story A: {{ .actual }}
        Story B: {{ .ideal }}

For more details, see the Eval Scorers documentation.

IMAP Data Connector: Query emails stored in IMAP servers using federated SQL. Example:
```
datasets:
  - from: imap:[email protected]
    name: emails
    params:
      imap_access_token: ${secrets:IMAP_ACCESS_TOKEN}
```
For more details, see the IMAP Data Connector documentation.

Automatic NSQL Query Retries: Failed NSQL queries are now automatically retried, improving reliability for federated queries. For more details, see the NSQL documentation.

Enhanced Task Tracing: Task history now includes chat completion IDs, and runtime readiness is traced for better observability. Use the runtime.task_history table to query task details. See the Task History documentation.

Vector Search with Keyword Filtering: The vector search API now includes an optional list of keywords as a parameter, to pre-filter SQL results before performing a vector search. When vector searching via a chat completion, models will automatically generate keywords relevant to the search. See the Vector Search API documentation.

Improved Refresh Behavior on Startup: Spice won't automatically refresh an accelerated dataset on startup if it doesn't need to. See the Refresh on Startup documentation.

Graceful Shutdown for HTTP Server: The HTTP server now drains requests for graceful shutdowns, ensuring smoother runtime termination.
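
The keyword pre-filtering described under Vector Search with Keyword Filtering can be pictured with a small Python sketch: rows are first narrowed by keyword, and only the survivors are ranked by vector similarity. Everything here is an assumption for illustration, including the row layout, the "any keyword matches" rule, and the use of cosine similarity; it does not reflect the actual API or its parameters.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(rows, query_vec, keywords=None, limit=3):
    """Keep rows containing any keyword, then rank the survivors by similarity."""
    if keywords:
        lowered = [k.lower() for k in keywords]
        rows = [r for r in rows if any(k in r["text"].lower() for k in lowered)]
    ranked = sorted(rows, key=lambda r: cosine(r["vec"], query_vec), reverse=True)
    return [r["id"] for r in ranked[:limit]]


# Hypothetical rows with toy two-dimensional embeddings.
rows = [
    {"id": 1, "text": "Cricket bat review", "vec": [1.0, 0.0]},
    {"id": 2, "text": "Tennis racket review", "vec": [0.9, 0.1]},
    {"id": 3, "text": "Cricket ball review", "vec": [0.0, 1.0]},
]
unfiltered = search(rows, [1.0, 0.0])
filtered = search(rows, [1.0, 0.0], keywords=["cricket"])
```

The pre-filter removes the tennis row even though its embedding is close to the query, which is the point of combining keywords with vector search.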

New Contributors 🎉

@Garamda made their first contribution in github.com/spiceai/spiceai/pull/4840
@sergey-shandar made their first contribution in github.com/spiceai/spiceai/pull/4868
@benrussell made their first contribution in github.com/spiceai/spiceai/pull/5126

Contributors

@sgrebnov
@phillipleblanc
@peasee
@Jeadie
@lukekim
@benrussell
@Sevenannn
@sergey-shandar
@Garamda
@johnnynunez

Breaking Changes

No breaking changes.

Cookbook Updates

The Spice Cookbook now has 74 recipes that make it easy to get started with Spice!

Upgrading

To upgrade to v1.1.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.1.0 image:

docker pull spiceai/spiceai:1.1.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

- release: Bump chart, and versions for next release by @peasee in <https://github.com/spiceai/spiceai/pull/4464>
- feat: Schedule testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/4503>
- fix: Remove on zero results arguments from benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/4533>
- fix: Don't snapshot clickbench benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/4534>
- docs: v1.0.1 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4529>
- Update acknowledgements by @github-actions in <https://github.com/spiceai/spiceai/pull/4535>
- In spiced_docker, propagate setup to publish-cuda by @Jeadie in <https://github.com/spiceai/spiceai/pull/4543>
- Upgrade Rust to 1.84 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4541>
- Upgrade dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4546>
- Revert "Use OpenAI golang client in `spice chat` (#4491)" by @Jeadie in <https://github.com/spiceai/spiceai/pull/4564>
- feat: add schema inference for the Spice.ai Data Connector by @peasee in <https://github.com/spiceai/spiceai/pull/4579>
- Remove 'tools: builtin' by @Jeadie in <https://github.com/spiceai/spiceai/pull/4607>
- feat: Add initial IMAP connector by @peasee in <https://github.com/spiceai/spiceai/pull/4587>
- feat: Add email content loading by @peasee in <https://github.com/spiceai/spiceai/pull/4616>
- feat: Add SSL and Auth parameters for IMAP by @peasee in <https://github.com/spiceai/spiceai/pull/4613>
- Change /v1/models to be OpenAI compatible by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4624>
- Use `pdf-extract` crate to extract text from PDF documents by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4615>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4628>
- Add 1.0.2 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4627>
- Fix cuda::ffi by @Jeadie in <https://github.com/spiceai/spiceai/pull/4649>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4654>
- fix: Spice.ai schema inference by @peasee in <https://github.com/spiceai/spiceai/pull/4674>
- Add SQL Benchmark with sample eval configuration based on TPCH by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4549>
- Update Helm chart to Spice v1.0.2 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4655>
- Update v1.0.2 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4639>
- Fix E2E AI release install test on self-hosted runners (macos) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4675>
- Main performance metrics calculation for Text to SQL Benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4681>
- Add eval datasets / test scripts for model grading criteria by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4663>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4684>
- Add testoperator for `evals` running by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4688>
- Add GH Workflow to run Text to SQL benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4689>
- Add 1.0.2 as supported version to SECURITY.md by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4695>
- Text-To-SQL benchmark: trace failed tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4705>
- Text-To-SQL benchmark: extend list of benchmarking models by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4707>
- Text-To-SQL: increase sql coverage, add more advanced tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4713>
- Use model that supports tools in hf_test by @Jeadie in <https://github.com/spiceai/spiceai/pull/4712>
- Fix Spice.ai E2E test by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4723>
- Return non-existing model for v1/chat endpoint by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4718>
- Update Helm chart for 1.0.3 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4742>
- Update dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4740>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4744>
- Update SECURITY.md with 1.0.3 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4745>
- Add basic smoke test of perplexity LLM to llm integration tests. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4735>
- Don't run integration tests on PRs when only CLI is changed by @Jeadie in <https://github.com/spiceai/spiceai/pull/4751>
- Prompt user to upgrade through brew / do another clean install when spice is installed through homebrew / at non-standard path by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4746>
- feat: Search with keyword filtering by @peasee in <https://github.com/spiceai/spiceai/pull/4759>
- Fix search benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4765>
- feat: Add IMAP access token parameter by @peasee in <https://github.com/spiceai/spiceai/pull/4769>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4774>
- Mark trunk builds as unstable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4776>
- feat: Release Spice.ai RC by @peasee in <https://github.com/spiceai/spiceai/pull/4753>
- fix: Validate columns and keywords in search by @peasee in <https://github.com/spiceai/spiceai/pull/4775>
- Run models E2E tests on PR by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4798>
- fix: models runtime not required for cloud chat by @peasee in <https://github.com/spiceai/spiceai/pull/4781>
- Only open one PR for openapi.json by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4807>
- docs: Release IMAP Alpha by @peasee in <https://github.com/spiceai/spiceai/pull/4797>
- Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4809>
- Initial spice cli e2e tests with spice upgrade tests by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4764>
- Log CLI and Runtime Versions on startup by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4816>
- Sort keys for openai by @Jeadie in <https://github.com/spiceai/spiceai/pull/4766>
- Remove docs index trigger from the endgame template by @ewgenius in <https://github.com/spiceai/spiceai/pull/4832>
- Release notes for v1.0.4 by @Jeadie in <https://github.com/spiceai/spiceai/pull/4827>
- Update SECURITY.md by @Jeadie in <https://github.com/spiceai/spiceai/pull/4829>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4831>
- Don't print URL by @lukekim in <https://github.com/spiceai/spiceai/pull/4838>
- add 'eval_run' to 'spice trace' by @Jeadie in <https://github.com/spiceai/spiceai/pull/4841>
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4843>
- Fix 'actual" and "output" columns in `eval.results`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4835>
- Fix string escaping of system prompt by @Jeadie in <https://github.com/spiceai/spiceai/pull/4844>
- update helm chart to v1.0.4 by @Jeadie in <https://github.com/spiceai/spiceai/pull/4828>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4806>
- fix: Skip sccache in PR for external users by @peasee in <https://github.com/spiceai/spiceai/pull/4851>
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in <https://github.com/spiceai/spiceai/pull/4804>
- Debug log cuda detection failure in spice by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4852>
- fix: Set RUSTC wrapper explicitly by @peasee in <https://github.com/spiceai/spiceai/pull/4854>
- Improve trace UX for `ai_completion`, fix infinite tool calls by @Jeadie in <https://github.com/spiceai/spiceai/pull/4853>
- Allow homebrew spice cli to upgrade the runtime by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4811>
- Add support for MCP tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4808>
- fix: Rustc wrapper actions by @peasee in <https://github.com/spiceai/spiceai/pull/4867>
- Provide link to supported OS list when user platform is not supported by @Garamda in <https://github.com/spiceai/spiceai/pull/4840>
- Always download spice runtime version matched with spice cli version by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4761>
- Disable flaky integration test by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4871>
- fix: sccache actions setup by @peasee in <https://github.com/spiceai/spiceai/pull/4873>
- Fixing Go installation in the setup script for Linux Arm64 by @sergey-shandar in <https://github.com/spiceai/spiceai/pull/4868>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4864>
- DuckDB acceleration: Use temp table only for append with conflict resolution by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4874>
- Trace the output of streamed `chat/completions` to runtime.task_history. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4845>
- Always pass `X-API-Key` in spice api calls header if detected in env by @ewgenius in <https://github.com/spiceai/spiceai/pull/4878>
- Revert "DuckDB acceleration: Use temp table only for append with conflict resolution" by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4886>
- Allow overriding spicerack base url in the CLI by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4892>
- Add test Spicepod for DuckDB full acceleration with constraints by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4891>
- Refactor Parameter Handling by @Advayp in <https://github.com/spiceai/spiceai/pull/4833>
- Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4898>
- Update to latest async-openai fork. Update secrecy by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4911>
- Fix mcp tools build by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4916>
- Add more test spicepods by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4923>
- task: Add more dispatch files by @peasee in <https://github.com/spiceai/spiceai/pull/4933>
- run spiceai benchmark test using test operator by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4920>
- Convert sequential search code block to parallel async by @Garamda in <https://github.com/spiceai/spiceai/pull/4936>
- fix: Throughput metric calculation by @peasee in <https://github.com/spiceai/spiceai/pull/4938>
- Update dependabot dependencies & `cargo update` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4872>
- Improve servers shutdown sequence during runtime termination by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4942>
- Semantic model for views. Views visible in `table_schema` & `list_datasets` tools. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4946>
- update openai-async by @Jeadie in <https://github.com/spiceai/spiceai/pull/4948>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4961>
- fix: Redundant results snapshotting by @peasee in <https://github.com/spiceai/spiceai/pull/4956>
- Create schema for views if not exist by @Jeadie in <https://github.com/spiceai/spiceai/pull/4957>
- Bump Jimver/cuda-toolkit from 0.2.21 to 0.2.22 by @dependabot in <https://github.com/spiceai/spiceai/pull/4969>
- List available operations in `spice trace <operation>` by @Jeadie in <https://github.com/spiceai/spiceai/pull/4953>
- Initial commit of release analytics by @lukekim in <https://github.com/spiceai/spiceai/pull/4975>
- Remove spaces from CSV by @lukekim in <https://github.com/spiceai/spiceai/pull/4977>
- Fix Spice pods watcher by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4984>
- feat: Add appendable data sources for the testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/4949>
- Omit timestamp when warning regarding datasets with hyphens by @Advayp in <https://github.com/spiceai/spiceai/pull/4987>
- Update helm chart to v1.0.5 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4990>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/4989>
- Update end_game template by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4991>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4993>
- Add v1.0.5 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4994>
- Supported Versions: include v1.0.5 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4995>
- Dependabot updates by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4992>
- Switch to basic markdown formatting for vector search by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4934>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/5001>
- feat: Add TPCDS FileAppendableSource for testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/5002>
- Update `ring` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5003>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/5006>
- feat: Add ClickBench FileAppendableSource for testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/5004>
- feat: Validate append test table counts by @peasee in <https://github.com/spiceai/spiceai/pull/5008>
- feat: Add append spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/5009>
- Improve Vector Search performance for large content w/o primary key defined by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5010>
- Don't try to downgrade Arc in test_acceleration_duckdb_single_instance by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5014>
- feat: Add an initial testoperator vector search command by @peasee in <https://github.com/spiceai/spiceai/pull/5011>
- feat: Update testoperator workflows for automatic snapshot updates by @peasee in <https://github.com/spiceai/spiceai/pull/5018>
- Fix Vector Search when additional columns include embedding column by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5022>
- Include test for primary key passed as additional column in Vector Search by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5024>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5020>
- upgrade mistral.rs by @Jeadie in <https://github.com/spiceai/spiceai/pull/4952>
- fix: Indexes for TPCDS SQLite Spicepod by @peasee in <https://github.com/spiceai/spiceai/pull/5038>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5035>
- Include local files in generated Spicepod package by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5041>
- update mistral.rs to 'spiceai' branch rev by @Jeadie in <https://github.com/spiceai/spiceai/pull/5029>
- Configure spiced as an MCP SSE server by @Jeadie in <https://github.com/spiceai/spiceai/pull/5039>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5052>
- fix: Disable benchmarks schedule, enable testoperator schedule by @peasee in <https://github.com/spiceai/spiceai/pull/5058>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5060>
- Update ROADMAP.md March 2025 by @lukekim in <https://github.com/spiceai/spiceai/pull/5061>
- fix: Testoperator data setup by @peasee in <https://github.com/spiceai/spiceai/pull/5068>
- fix: All HTTP endpoints to hang when adding an invalid dataset with --pods-watcher-enabled by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5050>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5073>
- Integration tests for MCP tooling by @Jeadie in <https://github.com/spiceai/spiceai/pull/5053>
- OpenAPI docs for MCP by @Jeadie in <https://github.com/spiceai/spiceai/pull/5057>
- fix: Acceleration federation test by @peasee in <https://github.com/spiceai/spiceai/pull/5090>
- fix: Allow spiced commit in testoperator dispatch by @peasee in <https://github.com/spiceai/spiceai/pull/5098>
- fix: Use RefreshOverrides for the refresh API definition by @peasee in <https://github.com/spiceai/spiceai/pull/5095>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5094>
- fix: Increase tries for refresh_status_change_to_ready test by @peasee in <https://github.com/spiceai/spiceai/pull/5099>
- feat: Testoperator reports on max and median memory usage by @peasee in <https://github.com/spiceai/spiceai/pull/5101>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5105>
- fix: Fail testoperator on failed queries by @peasee in <https://github.com/spiceai/spiceai/pull/5106>
- Update Helm chart to 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5107>
- Update SECURITY.md to include 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5109>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5108>
- Add QA analytics for 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5110>
- add env variables to tools, usable in MCP stdio by @Jeadie in <https://github.com/spiceai/spiceai/pull/5097>
- HF downloads obey SIGTERM by @Jeadie in <https://github.com/spiceai/spiceai/pull/5044>
- Add v1.0.6 release notes into trunk by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5111>
- Remove redundant mod name for iceberg integration tests by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5112>
- Use fixed data directory for test operator by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5103>
- Improvements for evals by @Jeadie in <https://github.com/spiceai/spiceai/pull/5040>
- Make McpProxy trait for MCP passthrough by @Jeadie in <https://github.com/spiceai/spiceai/pull/5115>
- Properly handle '/' for tool names. by @Jeadie in <https://github.com/spiceai/spiceai/pull/5116>
- Use retry logic when loading tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/5120>
- Exclude slow tests from regular pr runs by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5119>
- Fix test operator snapshot update by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5130>
- spice init: Fixes windows bug where full path is used for spicepod name by @benrussell in <https://github.com/spiceai/spiceai/pull/5126>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5131>
- Implement graceful shutdown for HTTP server by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5102>
- Update enhancement.md by @lukekim in <https://github.com/spiceai/spiceai/pull/5142>
- Add GitHub Workflow and PoC Spicepod configuration to run FinanceBench tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5145>
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5155>
- De-duplicate attachments in DuckDBAttachments by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5156>
- v1.0.7 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5153>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5160>
- Update Helm chart to 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5159>
- Add github token to macos test release download tasks by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5161>
- update security.md for 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5162>
- Update roadmap.md by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5163>
- Add a performance comparison section for 1.0.7 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5164>
- docs: Add snafu error variant point to style guide by @peasee in <https://github.com/spiceai/spiceai/pull/5167>
- Fix 1.0.7 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5168>
- Adjust DuckDB connection pool size based on DuckDB accelerator instances usage by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5117>
- Add automatic retry for NSQL queries by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5169>
- Include chat completion id to task history by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5170>
- Trace when all runtime components are ready by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5171>
- Update qa_analytics.csv for 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5165>
- Set default tool recursion limit to 10 to prevent infinite loops by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5173>
- Add support for `schema_source_path` param for object-store data connectors by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5178>
- Run license check and check changes on self-hosted macOS runners by @lukekim in <https://github.com/spiceai/spiceai/pull/5179>
- Add MCP by @lukekim in <https://github.com/spiceai/spiceai/pull/5183>

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.0...release/1.1

Spice v1.0.7 (Mar 26, 2025)

March 27, 2025 · 4 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.0.7 🏎️

Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.

Highlights in v1.0.7

DuckDB Memory Usage: Memory usage when using DuckDB has been significantly improved for data loads and refreshes through expanded use of zero-copy Arrow and multi-threaded data loading. When a duckdb_memory_limit is specified, disk spilling has been improved for larger-than-memory workloads. In addition, a new temp_directory runtime parameter supports storing temporary files in a different location from the DuckDB data file for higher throughput. For example, temp_directory could be set to a high-IOPS io2 EBS volume that is separate from the volume holding the duckdb_file_path.

Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.

For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.
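
The settings described above might be combined roughly as follows. This is a minimal sketch, not a verified configuration: the parameter names duckdb_memory_limit, temp_directory, and duckdb_file_path are taken from the text above, while their placement in the Spicepod, the dataset name, the paths, and the 8GB limit are illustrative assumptions. Check the runtime parameters and DuckDB Data Accelerator documentation for the exact keys.

```yaml
# Hypothetical Spicepod fragment. Key nesting and all values are assumptions.
runtime:
  # Assumed placement of the temp_directory runtime parameter.
  # Points at a separate high-IOPS volume (hypothetical mount point).
  temp_directory: /mnt/fast-scratch/spice-tmp

datasets:
  - from: s3://my_data_bucket/data/   # placeholder source
    name: lineitem                    # placeholder dataset name
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        # Parameter name as written in these release notes.
        duckdb_file_path: /data/spice/lineitem.db
        # Illustrative cap. With a limit set, larger-than-memory loads spill to disk.
        duckdb_memory_limit: 8GB
```

Keeping the temporary directory on a different volume from the DuckDB data file means spill traffic and data-file writes do not compete for the same disk throughput.
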
Schema Inference Performance for Object-Store Data Connectors: Schema inference performance has been improved, especially for large numbers of objects (1M+ objects) when using object-store based data connectors by making the object-listing and selection more efficient.

Performance

Compared to previous versions, Spice v1.0.7 loads DuckDB accelerated datasets significantly faster. The results below use the TPC-H lineitem dataset at Scale Factor 100 (600M rows):

Without Indexes

5x faster, 28% less memory usage.

Version	Load Time	Peak Memory Usage
v1.0.6	16m 3s	32GB
v1.0.7	3m 149ms	24.4GB

With Indexes

2.5x faster. Higher memory usage in v1.0.7 is due to better resource utilization to achieve faster load times. Use the duckdb_memory_limit parameter to control memory usage.

Version	Load Time	Peak Memory Usage
v1.0.6	27m 9s	50GB
v1.0.7	11m 30s	77GB

Documentation

DuckDB Data Accelerator: Expanded with additional resource usage guidance.
Memory: A new section for memory considerations has been added to the Reference section.

Contributors

@phillipleblanc
@sgrebnov
@peasee
@Sevenannn

Breaking Changes

No breaking changes.

Upgrading

To upgrade to v1.0.7, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.7 image:

docker pull spiceai/spiceai:1.0.7

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

DataFusion Table Providers: Upgraded from 760ece6ac52b7d180d697f347642af403c2e711c to 9ba9dce19a1fdbd5e22cc2e445c5b3ea731944b4.

Changelog

- fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
- Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
- Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
- Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
- Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
- Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
- Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
- Add `temp_directory` runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
- Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7
