Skip to main content

6 posts tagged with "caching"

Results Cache related topics and usage

View All Tags

Spice v1.11.0 (Jan 28, 2026)

· 58 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-stable! ⚡

In Spice v1.11.0, Spice Cayenne reaches Beta status with acceleration snapshots, Key-based deletion vectors, and Amazon S3 Express One Zone support. DataFusion has been upgraded to v51 along with Arrow v57.2, and iceberg-rust v0.8.0. v1.11 adds several DynamoDB & DynamoDB Streams improvements such as JSON nesting, and adds significant improvements to Distributed Query with active-active schedulers and mTLS for enterprise-grade high-availability and secure cluster communication.

This release also adds new SMB, NFS, and ScyllaDB Data Connectors (Alpha), Prepared Statements with full SDK support (gospice, spice-rs, spice-dotnet, spice-java, spice.js, and spicepy), Google LLM Support for expanded AI inference capabilities, and significant improvements to caching, observability, and Hash Indexing for Arrow Acceleration.

What's New in v1.11.0

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous performance and stability improvements.

Key Enhancements:

  • Key-based Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. Key-based deletion vectors are more memory-efficient than positional vectors for sparse deletions.
  • S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Improved Reliability:

  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

For more details, refer to the Cayenne Documentation.

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Apache Arrow has been upgraded to v57.2, bringing major performance improvements and new capabilities.

Arrow 57 Parquet Metadata Parsing Performance

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Improved batch deletion handling

Distributed Query Improvements

High Availability Clusters: Spice now supports running multiple active schedulers in an active/active configuration for production deployments. This eliminates the scheduler as a single point of failure and enables graceful handling of node failures.

  • Multiple schedulers run simultaneously, each capable of accepting queries
  • Schedulers coordinate via a shared S3-compatible object store
  • Executors discover all schedulers automatically
  • A load balancer distributes client queries across schedulers

Example HA configuration:

runtime:
scheduler:
state_location: s3://my-bucket/spice-cluster
params:
region: us-east-1

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: S3, ABFS, and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Features:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
- from: s3://my-bucket/data.parquet
name: my_dataset
acceleration:
enabled: true
engine: cayenne
mode: file
snapshots: enabled
snapshots_trigger: time_interval
snapshots_trigger_threshold: 1h
snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically.

CLI Commands:

# List all snapshots for a dataset
spice acceleration snapshots taxi_trips

# Get details of a specific snapshot
spice acceleration snapshot taxi_trips 3

# Set the current snapshot for rollback (requires runtime restart)
spice acceleration set-snapshot taxi_trips 2

HTTP API Endpoints:

MethodEndpointDescription
GET/v1/datasets/{dataset}/acceleration/snapshotsList all snapshots for a dataset
GET/v1/datasets/{dataset}/acceleration/snapshots/{id}Get details of a specific snapshot
POST/v1/datasets/{dataset}/acceleration/snapshots/currentSet the current snapshot for rollback

For more details, refer to the Acceleration Snapshots Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

  • Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol

SDK Support:

SDKSupportMin VersionMethod
gospice (Go)✅ Fullv8.0.0+SqlWithParams() with typed constructors (Int32Param, StringParam, TimestampParam, etc.)
spice-rs (Rust)✅ Fullv3.0.0+query_with_params() with RecordBatch parameters
spice-dotnet (.NET)✅ Fullv0.3.0+QueryWithParams() with typed parameter builders
spice-java (Java)✅ Fullv0.5.0+queryWithParams() with typed Param constructors (Param.int64(), Param.string(), etc.)
spice.js (JavaScript)✅ Fullv3.1.0+query() with parameterized query support
spicepy (Python)✅ Fullv3.1.0+query() with parameterized query support

Example (Go):

import "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
"SELECT * FROM products WHERE price > $1 AND category = $2",
spice.Float64Param(10.0),
spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
// With automatic type inference
ArrowReader reader = client.queryWithParams(
"SELECT * FROM products WHERE price > $1 AND category = $2",
10.0, "electronics");

// With explicit typed parameters
ArrowReader reader = client.queryWithParams(
"SELECT * FROM products WHERE price > $1 AND category = $2",
Param.float64(10.0),
Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., doubleFloat64, StringUtf8)
  • Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
  • Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;

try (SpiceClient client = new SpiceClient()) {
// With automatic type inference
ArrowReader reader = client.queryWithParams(
"SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
5.0);

// With explicit typed parameters for precise control
ArrowReader reader = client.queryWithParams(
"SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
Param.int64(12345),
Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
<groupId>ai.spice</groupId>
<artifactId>spiceai</artifactId>
<version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

  • Google Chat Models: Access Google's Gemini models for chat completions
  • Google Embeddings: Generate embeddings using Google's text embedding models
  • Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
- from: google:gemini-2.0-flash
name: gemini
params:
google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
- from: google:text-embedding-004
name: google_embeddings
params:
google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
params:
url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

  • Primary key hash index support
  • Secondary index support for non-primary key columns
  • Composite key support with proper null value handling

Example configuration:

datasets:
- from: postgres:users
name: users
acceleration:
enabled: true
engine: arrow
primary_key: user_id
indexes:
'(tenant_id, user_id)': unique # Composite hash index

For more details, refer to the Hash Index Documentation.

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

  • SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
  • NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
  • Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
  • Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
# SMB share
- from: smb://fileserver/share/data.parquet
name: smb_data
params:
smb_username: ${secrets:SMB_USER}
smb_password: ${secrets:SMB_PASS}

# NFS export
- from: nfs://nfsserver/export/data.parquet
name: nfs_data

For more details, refer to the Data Connectors Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
- from: scylladb:my_keyspace.my_table
name: scylla_data
acceleration:
enabled: true
engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Developer Experience Improvements

  • Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
  • Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
  • Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
  • Improved Spicepod Schema: Improved JSON schema generation for better IDE support and validation
  • Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
  • Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
  • GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
  • NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters
  • Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
  • Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
  • Reliability: Initialize secrets before object stores in cluster executor mode
  • Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
  • Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
  • Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

  1. Generate certificates using spice cluster tls init and spice cluster tls add
  2. Update scheduler and executor startup commands with --node-mtls-* arguments
  3. For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

Old NameNew Name
--cluster-mode--role
--cluster-ca-certificate-file--node-mtls-ca-certificate-file
--cluster-certificate-file--node-mtls-certificate-file
--cluster-key-file--node-mtls-key-file
--cluster-address--node-bind-address
--cluster-advertise-address--node-advertise-address
--cluster-scheduler-url--scheduler-address

Removed CLI Arguments:

  • --cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0 image:

docker pull spiceai/spiceai:1.11.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

What's Changed

Changelog

Spice v1.11.0-rc.2 (Jan 22, 2026)

· 24 min read
Viktor Yershov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.2!

v1.11.0-rc.2 is the second release candidate for advanced test of v1.11. It brings Spice Cayenne to Beta status with acceleration snapshots support, a new ScyllaDB Data Connector, upgrades to DataFusion v51, Arrow 57.2, and iceberg-rust v0.8.0. It includes significant improvements to distributed query, caching, and observability.

What's New in v1.11.0-rc.2

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous stability improvements.

Improved Reliability:

  • Fixed timezone database issues in Docker images that caused acceleration panics
  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

Example configuration with snapshots:

datasets:
- from: s3://my-bucket/data.parquet
name: my_dataset
acceleration:
enabled: true
engine: cayenne
mode: file

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Spice has been upgraded to Apache Arrow Rust 57.2.0, bringing major performance improvements and new capabilities.

Arrow 57 Parquet Metadata Parsing Performance

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Feature Improvements in v1.11:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
- from: s3://my-bucket/data.parquet
name: my_dataset
acceleration:
enabled: true
engine: cayenne
mode: file
snapshots: enabled
snapshots_trigger: time_interval
snapshots_trigger_threshold: 1h
snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically. List, create, and restore snapshots directly from the command line or via HTTP.

For more details, refer to the Acceleration Snapshots Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
- from: scylladb:my_keyspace.my_table
name: scylla_data
acceleration:
enabled: true
engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Distributed Query Improvements

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: Azure and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Proper batch deletion handling

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
params:
url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously
  • Arrow Flight async support: Non-blocking query execution via Arrow Flight protocol

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Additional Improvements

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use ScyllaDB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0-rc.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:v1.11.0-rc.2 image:

docker pull spiceai/spiceai:v1.11.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

Changelog

Spice v1.10.2 (Dec 22, 2025)

· 5 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.10.2! 🔥

v1.10.2 introduces Tiered Caching Acceleration with Localpod for multi-layer acceleration architectures, Periodic Acceleration Snapshots with configurable intervals, DynamoDB JSON Nesting for column consolidation, and Kafka/Debezium Batching for faster data ingestion. This release also includes fixes for SQLite accelerator decimal/date handling and real-time status reporting for the /v1/datasets and /v1/models API endpoints.

What's New in v1.10.2

Tiered Caching with Localpod

Multi-Layer Acceleration Architecture: The Localpod connector now supports caching refresh mode, enabling tiered acceleration where a persistent cache (e.g., file-mode DuckDB) feeds a fast in-memory cache (e.g., Arrow, memory-mode DuckDB).

Key Features:

  • Automatic Cache Propagation: New cache entries automatically propagate from parent to child accelerators
  • Warm Startup: Child accelerators initialize from existing parent data on startup, eliminating cold-start latency
  • Flexible Tiering: Combine any accelerator engines (DuckDB, SQLite, Cayenne) across tiers

Example spicepod.yaml configuration:

datasets:
# Parent: persistent file-mode cache
- from: https://api.example.com
name: api_cache
acceleration:
enabled: true
refresh_mode: caching
engine: duckdb
mode: file

# Child: fast in-memory cache fed by parent
- from: localpod:api_cache
name: api_cache_memory
acceleration:
enabled: true
refresh_mode: caching
engine: arrow
mode: memory

For more details, refer to the Localpod Data Connector Documentation.

Periodic Acceleration Snapshots

Configurable Snapshot Intervals: A new snapshots_create_interval parameter enables periodic snapshot creation for accelerated datasets across all refresh modes. This provides better control over snapshot frequency and ensures consistent recovery points for accelerated data.

Example spicepod.yaml configuration:

datasets:
- from: s3://my-bucket/data.parquet
name: my_data
acceleration:
enabled: true
engine: duckdb
mode: file
refresh_mode: caching
snapshots: enabled
params:
snapshots_create_interval: 60s # Write a snapshot every 60 seconds

For more details, refer to the Data Acceleration Documentation.

DynamoDB JSON Nesting

Consolidate Columns into JSON: The DynamoDB Data Connector now supports consolidating columns into a single JSON column using the json_object: "*" metadata option. This is useful when only a few columns are needed as discrete fields while the rest can be accessed as nested JSON.

Example spicepod.yaml configuration:

datasets:
- from: dynamodb:my_table
name: my_table
columns:
- name: PK
- name: SK
- name: data_json
metadata:
json_object: '*' # Captures all other columns as JSON

Example Output: Given a DynamoDB table with columns PK, SK, name, email, and status, the resulting table schema consolidates all non-specified columns into the data_json column:

PKSKdata_json
pk_1sort_1{"name": "Alice", "email": "[email protected]", "status": "active"}
pk_2sort_2{"name": "Bob", "email": "[email protected]", "status": "inactive"}

For more details, refer to the DynamoDB JSON Nesting Documentation.

Kafka/Debezium Batching

Faster Data Ingestion: Configure message batching for Kafka and Debezium connectors to improve data ingestion throughput. Batching reduces processing overhead by grouping multiple messages together before insertion.

Key Features:

  • Configurable Batch Size: Control the maximum number of records per batch (default: 10,000)
  • Configurable Batch Duration: Set the maximum wait time before flushing a partial batch (default: 1s)

Example spicepod.yaml configuration:

datasets:
- from: debezium:kafka-server.public.my_table
name: my_table
params:
batch_max_size: 10000 # Max records per batch (default: 10000)
batch_max_duration: 1s # Max wait time per batch (default: 1s)

For more details, refer to the Kafka Data Connector Documentation and Debezium Data Connector Documentation.

Additional Improvements & Bug Fixes

  • Reliability: Fixed SQLite accelerator decimal and date type handling for improved data type accuracy.
  • Reliability: Fixed real-time status reporting for /v1/datasets and /v1/models API endpoints.
  • Reliability: Fixed Kafka warning when security.protocol is set to PLAINTEXT.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New Cayenne Data Accelerator Recipe: New recipe demonstrating how to accelerate a local copy of the taxi trips dataset using Cayenne as the data accelerator engine. See Cayenne Data Accelerator Recipe for details.

New Dataset Partitioning Recipe: New recipe demonstrating how to partition accelerated datasets to improve query performance. See Dataset Partitioning for details.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.2 image:

docker pull spiceai/spiceai:1.10.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Spice v1.10.0 (Dec 9, 2025)

· 18 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.10.0! ⚡

Spice v1.10.0 introduces a new Caching Acceleration Mode with stale-while-revalidate (SWR) semantics for disk-persisted, low-latency queries with background refresh. This release also adds the TinyLFU eviction policy for the SQL results cache, a preview of the DynamoDB Streams connector for real-time CDC, S3 location predicate pruning for faster partitioned queries, improved distributed query execution, and multiple security hardening improvements.

What's New in v1.10.0

Caching Acceleration Mode

Low-Latency Queries with Background Refresh: This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern. Queries return cached results immediately while data refreshes asynchronously in the background, eliminating query latency spikes during refresh cycles. Cached data persists to disk using DuckDB, SQLite, or Cayenne file modes.

Key Features:

  • Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background, reducing query latency
  • Disk Persistence: Cached results persist across restarts using DuckDB, SQLite, or Cayenne file modes
  • Configurable Refresh: Control refresh intervals with refresh_check_interval to balance freshness and source load

Recommendation: Use retention configuration with caching acceleration to ensure stale data is cleaned up over time.

Example spicepod.yaml configuration:

datasets:
- from: http://localhost:7400
name: cached_data
time_column: fetched_at
acceleration:
enabled: true
engine: duckdb
mode: file # Persist cache to disk
refresh_mode: caching
refresh_check_interval: 10m
retention_check_enabled: true
retention_period: 24h
retention_check_interval: 1h

For more details, refer to the Data Acceleration Documentation.

TinyLFU Cache Eviction Policy

Higher Cache Hit Rates for SQL Results Cache: A new TinyLFU cache eviction policy is now available for the SQL results cache. TinyLFU is a probabilistic cache admission policy that maintains higher hit rates than LRU while keeping memory usage predictable, making it ideal for workloads with varying query frequency patterns.

Example spicepod.yaml configuration:

runtime:
caching:
sql_results:
enabled: true
eviction_policy: tiny_lfu # default: lru

For more details, refer to the Caching Documentation and the Moka TinyLFU Documentation for details of the algorithm.

DynamoDB Streams Data Connector (Preview)

Real-Time Change Data Capture for DynamoDB: The DynamoDB connector now integrates with DynamoDB Streams for real-time change data capture (CDC). This enables continuous synchronization of DynamoDB table changes into Spice for real-time query, search, and LLM-inference.

Key Features:

  • Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables as they occur
  • Table Bootstrapping: Performs an initial full table scan before streaming changes, ensuring complete data consistency
  • Acceleration Integration: Works with refresh_mode: changes to incrementally update accelerated datasets

Note: DynamoDB Streams must be enabled on your DynamoDB table. This feature is in preview.

Example spicepod.yaml configuration:

datasets:
- from: dynamodb:my_table
name: orders_stream
acceleration:
enabled: true
refresh_mode: changes # Enable Streams capture

For more details, refer to the DynamoDB Connector Documentation.

OpenTelemetry Metrics Exporter

Spice can now push metrics to an OpenTelemetry collector, enabling integration with platforms such as Jaeger, New Relic, Honeycomb, and other OpenTelemetry-compatible backends.

Key Features:

  • Protocol Support: Supports the gRPC (default port 4317) protocol
  • Configurable Push Interval: Control how frequently metrics are pushed to the collector

Example spicepod.yaml configuration for gRPC:

runtime:
telemetry:
enabled: true
otel_exporter:
endpoint: 'localhost:4317'
push_interval: '30s'

For more details, refer to the Observability & Monitoring Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down location filter predicates to S3 listing operations. For partitioned datasets (e.g., year=2025/month=12/), Spice now skips listing irrelevant partitions entirely, significantly reducing query latency and S3 API costs.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling direct integration with AWS's managed table format for S3. Use standard SQL INSERT INTO to write data.

For more details, refer to the S3 Data Connector Documentation and Glue Data Connector Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

  • Fixed executor registration in cluster mode for more reliable distributed deployments
  • Improved hostname resolution for Flight server binding, enabling better executor discovery
  • Distributed accelerator registration: Data accelerators now properly register in distributed mode
  • Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

  • Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
  • Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
  • Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

  • SQL Identifier Quoting: Hardened SQL identifier quoting across all database connectors (PostgreSQL, MySQL, DuckDB, etc.) to prevent SQL injection attacks through table or column names
  • Token Redaction: Sensitive authentication tokens are now fully redacted in debug and error output, preventing accidental credential exposure in logs
  • Path Traversal Prevention: Fixed tar extraction operations to prevent directory traversal vulnerabilities when processing archived files
  • Input Sanitization: Added strict validation for top_n_sample order_by clause parsing to prevent injection attacks
  • Glue Credential Handling: Prevented automatic loading of AWS credentials from environment in Glue connector, ensuring explicit credential configuration

Developer Experience Improvements

  • Health probe metrics: Added health probe latency metrics for better observability
  • CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 82 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0 image:

docker pull spiceai/spiceai:1.10.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Spice v1.10.0-rc.1 (Dec 2, 2025)

· 11 min read
David Stancu
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.10.0-rc.1! ⚡

v1.10.0-rc1 is a release candidate for early testing of v1.10 features including an all new caching acceleration mode, tiny_lfu caching policy, a new DynamoDB Streams connector (Preview), improvements to the DynamoDB connector, faster distributed query execution, S3 connector improvements, and security hardening for v1.10.0-stable.

What's New in v1.10.0-rc1

Caching Acceleration Mode with SWR and TinyLFU

This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern using Data Accelerators such as DuckDB or Cayenne, enabling queries to return file-persisted cached results immediately while asynchronously refreshing data in the background. Combined with the new TinyLFU cache eviction policy, Spice can now maintain higher cache hit rates while keeping memory usage predictable.

Key Features:

  • Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background
  • Data Accelerator Support: Cached accelerators can persist data to disk using DuckDB, SQLite, or Cayenne file modes.
  • TinyLFU Cache Policy: Probabilistic cache admission policy that maintains high hit rates with minimal overhead
  • Predictable Memory Usage: Configurable memory limits with automatic eviction of less frequently used entries

Example Spicepod.yml configuration:

runtime:
caching:
sql_results:
enabled: true
eviction_policy: tiny_lfu # default lru

datasets:
- from: s3://my-bucket/data.parquet
name: cached_data
acceleration:
enabled: true
engine: duckdb
mode: file # Persist cache to disk
refresh_mode: caching
refresh_check_interval: 10m

For more details, refer to the Data Acceleration Documentation and Caching Documentation.

DynamoDB Streams Data Connector in Preview

DynamoDB Connector now integrates with DynamoDB Streams which enables real-time streaming with support for both table bootstrapping and continuous change data capture (CDC). This connector automatically detects changes in DynamoDB tables and streams them into Spice for real-time query, search, and LLM-inference.

Key Features:

  • Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables
  • Table Bootstrapping: Initial full table load before streaming changes

Example Spicepod.yml configuration:

datasets:
- from: dynamodb:my_table
name: orders_stream
acceleration:
enabled: true
refresh_mode: changes

For more details, refer to the DynamoDB Connector Documentation.

Cayenne Accelerator Enhancements

The Cayenne data accelerator now supports:

  • Sort Columns Configuration: Optimize inserts by pre-sorting data on specified columns for improved query performance

Example Spicepod.yml configuration:

datasets:
- from: s3://my-bucket/data.parquet
name: sorted_data
acceleration:
enabled: true
engine: cayenne
mode: file_create
params:
sort_columns: timestamp,region

For more details, refer to the Cayenne Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down predicates to S3 listing operations. This optimization is especially effective for partitioned datasets stored in S3.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling fast integration with AWS's table format for S3.

For more details, refer to the S3 Tables Data Connector Documentation and Glue Data Connection Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

  • Fixed executor registration in cluster mode for more reliable distributed deployments
  • Improved hostname resolution for Flight server binding, enabling better executor discovery
  • Distributed accelerator registration: Data accelerators now properly register in distributed mode
  • Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

  • Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
  • Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
  • Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

  • SQL identifier quoting: Hardened SQL identifier quoting across all connectors to prevent injection attacks
  • Token redaction: Sensitive tokens are now fully redacted in debug output to prevent credential leakage
  • Path traversal prevention: Fixed tar extraction to prevent path traversal vulnerabilities
  • Input sanitization: Added validation for top_n_sample order_by parsing
  • Improved credential handling: Improved credential management in Glue connector

Developer Experience Improvements

  • Health probe metrics: Added health probe latency metrics for better observability
  • CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates. The Spice Cookbook still offers 82+ recipes to help you prototype quickly.

Upgrading

To try v1.10.0-rc1, use one of the following methods:

CLI:

spice upgrade --version 1.10.0-rc1

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0-rc1 image:

docker pull spiceai/spiceai:1.10.0-rc1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.10.0-rc1

AWS Marketplace:

🎉 Spice is available in the AWS Marketplace.

What's Changed

Changelog

Spice v1.1.1 (Apr 7, 2025)

· 6 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.1.1! 📊

Spice v1.1.1 introduces several key updates, including a new Component Metrics System, improved Delta Data Connector performance, improved MCP tool descriptions, and expanded runtime results caching options. This release also adds detailed MySQL connection pool metrics for better observability. Component Metrics are Prometheus-compatible and accessible via the metrics endpoint.

Highlights v1.1.1

  • Component Metrics System: A new system for monitoring components, starting with MySQL connection pool metrics. These metrics provide insights into MySQL connection performance and can be selectively enabled in the dataset configuration. Metrics are exposed in Prometheus format via the metrics endpoint.

For more details, see the Component Metrics documentation.

  • Results Caching Enhancements: Added a cache_key_type option for runtime results caching. Options include:
    • plan (Default): Uses the query's logical plan as the cache key. Matches semantically equivalent queries but requires query parsing.
    • sql: Uses the raw SQL string as the cache key. Provides faster lookups but requires exact string matches. Use sql for predictable queries without dynamic functions like NOW().

Example spicepod.yaml configuration:

runtime:
results_cache:
enabled: true
cache_max_size: 128MiB
cache_key_type: sql # Use SQL for the results cache key
item_ttl: 1s

For more details, see the runtime configuration documentation.

  • Delta Data Connector: Improved scan performance for faster query performance.

  • MCP Tools: Improved descriptions for built-in MCP tools to improve usability.

  • MySQL Component Metrics: Added detailed metrics for monitoring MySQL connections, such as connection count and pool activity.

Example spicepod.yaml configuration:

datasets:
- from: mysql:my_table
name: my_dataset
metrics:
- name: connection_count
enabled: true
- name: connections_in_pool
enabled: true
- name: active_wait_requests
enabled: true
params:
mysql_host: localhost
mysql_tcp_port: 3306
mysql_user: root
mysql_pass: ${secrets:MYSQL_PASS}

For more details, see the MySQL Data Connector documentation.

  • spice.js SDK: The spice.js SDK has been updated to v2.0.1 and includes several important security updates.

New Contributors 🎉

Contributors

Breaking Changes

No breaking changes in this release.

Cookbook Updates

The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.1.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.1.1 image:

docker pull spiceai/spiceai:1.1.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

  • No major dependency changes.

Changelog

- fix: Testoperator DuckDB, SQLite, Postgres, Spicecloud by [@peasee](https://github.com/peasee) in [#5190](https://github.com/spiceai/spiceai/pull/5190)
- Update Helm Chart and SECURITY.md to v1.1.0 by [@lukekim](https://github.com/lukekim) in [#5223](https://github.com/spiceai/spiceai/pull/5223)
- Update version.txt to v1.1.1-unstable by [@lukekim](https://github.com/lukekim) in [#5224](https://github.com/spiceai/spiceai/pull/5224)
- Update Cargo.lock to v1.1.1-unstable by [@lukekim](https://github.com/lukekim) in [#5225](https://github.com/spiceai/spiceai/pull/5225)
- Add tests for `verify_schema_source_path` in `ListingTableConnector` by [@phillipleblanc](https://github.com/phillipleblanc) in [#5221](https://github.com/spiceai/spiceai/pull/5221)
- Reduce noise from debug logging by [@phillipleblanc](https://github.com/phillipleblanc) in [#5227](https://github.com/spiceai/spiceai/pull/5227)
- Improve `openai_test_chat_messages` integration test reliability by [@Sevenannn](https://github.com/Sevenannn) in [#5222](https://github.com/spiceai/spiceai/pull/5222)
- Verify the checkpoints existence before shutting down runtime in integration tests directly querying checkpoint by [@Sevenannn](https://github.com/Sevenannn) in [#5232](https://github.com/spiceai/spiceai/pull/5232)
- Fix CORS support for json content-type api by [@sgrebnov](https://github.com/sgrebnov) in [#5241](https://github.com/spiceai/spiceai/pull/5241)
- Fix ModelGradedScorer error: The 'metadata' parameter is only allowed when 'store' is enabled. by [@sgrebnov](https://github.com/sgrebnov) in [#5231](https://github.com/spiceai/spiceai/pull/5231)
- fix: Use `pulls-with-spice-action` and switch to `spiceai-macos` runners by [@peasee](https://github.com/peasee) in [#5238](https://github.com/spiceai/spiceai/pull/5238)
- Use v1.0.3 pulls with spice action by [@lukekim](https://github.com/lukekim) in [#5244](https://github.com/spiceai/spiceai/pull/5244)
- feat: Build ODBC binaries, run testoperator on ODBC by [@peasee](https://github.com/peasee) in [#5237](https://github.com/spiceai/spiceai/pull/5237)
- Bump timeout for several integration test runtime load_components & readiness check by [@Sevenannn](https://github.com/Sevenannn) in [#5229](https://github.com/spiceai/spiceai/pull/5229)
- Validate port is available before binding port for docker container in integration tests by [@Sevenannn](https://github.com/Sevenannn) in [#5248](https://github.com/spiceai/spiceai/pull/5248)
- Update datafusion-table-providers to fix the schema for PostgreSQL materialized views by [@ewgenius](https://github.com/ewgenius) in [#5259](https://github.com/spiceai/spiceai/pull/5259)
- Verify flight server is ready for flight integration tests by [@Sevenannn](https://github.com/Sevenannn) in [#5240](https://github.com/spiceai/spiceai/pull/5240)
- fix: Publish to MinIO inside of matrix on build_and_release by [@peasee](https://github.com/peasee) in [#5258](https://github.com/spiceai/spiceai/pull/5258)
- fix: TPCDS on zero results benchmarks by [@peasee](https://github.com/peasee) in [#5263](https://github.com/spiceai/spiceai/pull/5263)
- Use model as a judge scorer for Financebench by [@sgrebnov](https://github.com/sgrebnov) in [#5264](https://github.com/spiceai/spiceai/pull/5264)
- Fix FinanceBench llm scorer secret name by [@sgrebnov](https://github.com/sgrebnov) in [#5276](https://github.com/spiceai/spiceai/pull/5276)
- Implements support for `runtime.results_cache.cache_key_type` by [@phillipleblanc](https://github.com/phillipleblanc) in [#5265](https://github.com/spiceai/spiceai/pull/5265)
- fix: Testoperator MS SQL, query overrides, dispatcher by [@peasee](https://github.com/peasee) in [#5279](https://github.com/spiceai/spiceai/pull/5279)
- refactor: Delete old benchmarks by [@peasee](https://github.com/peasee) in [#5283](https://github.com/spiceai/spiceai/pull/5283)
- Imporve embedding column parsing performance test by [@Sevenannn](https://github.com/Sevenannn) in [#5268](https://github.com/spiceai/spiceai/pull/5268)
- Add Support for AWS Session Token in S3 Data Connector by [@kczimm](https://github.com/kczimm) in [#5243](https://github.com/spiceai/spiceai/pull/5243)
- Implement Component Metrics system + MySQL connection pool metrics by [@phillipleblanc](https://github.com/phillipleblanc) in [#5290](https://github.com/spiceai/spiceai/pull/5290)
- Add default descriptions to built-in MCP tools by [@lukekim](https://github.com/lukekim) in [#5293](https://github.com/spiceai/spiceai/pull/5293)
- fix: Vector search with cased columns by [@peasee](https://github.com/peasee) in [#5295](https://github.com/spiceai/spiceai/pull/5295)
- Run delta kernel scan in a blocking Tokio thread. by [@phillipleblanc](https://github.com/phillipleblanc) in [#5296](https://github.com/spiceai/spiceai/pull/5296)
- Expose the `mysql_pool_min` and `mysql_pool_max` connection pool parameters by [@phillipleblanc](https://github.com/phillipleblanc) in [#5297](https://github.com/spiceai/spiceai/pull/5297)
- use patched pdf-extract by [@kczimm](https://github.com/kczimm) in [#5270](https://github.com/spiceai/spiceai/pull/5270)

Full Changelog: v1.1.0...v1.1.1