
7 posts tagged with "iceberg"

Iceberg Catalog Connector related topics and usage


Spice v1.11.0 (Jan 28, 2026)

· 58 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-stable! ⚡

In Spice v1.11.0, Spice Cayenne reaches Beta status with acceleration snapshots, key-based deletion vectors, and Amazon S3 Express One Zone support. DataFusion has been upgraded to v51, along with Arrow v57.2 and iceberg-rust v0.8.0. v1.11 adds several DynamoDB and DynamoDB Streams improvements, such as JSON nesting, and significantly improves Distributed Query with active-active schedulers and mTLS for enterprise-grade high availability and secure cluster communication.

This release also adds new SMB, NFS, and ScyllaDB Data Connectors (Alpha), Prepared Statements with full SDK support (gospice, spice-rs, spice-dotnet, spice-java, spice.js, and spicepy), Google LLM Support for expanded AI inference capabilities, and significant improvements to caching, observability, and Hash Indexing for Arrow Acceleration.

What's New in v1.11.0

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous performance and stability improvements.

Key Enhancements:

  • Key-based Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. Key-based deletion vectors are more memory-efficient than positional vectors for sparse deletions.
  • S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.
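
A hypothetical configuration sketch for the S3 Express One Zone support above. It assumes cayenne_file_path (referenced under Improved Reliability below) is set as an acceleration parameter and uses AWS's directory-bucket naming convention; the exact parameter placement may differ:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      params:
        # Assumption: point Cayenne data files at an S3 Express One Zone
        # directory bucket for single-digit millisecond access latency
        cayenne_file_path: s3://my-bucket--use1-az4--x-s3/cayenne/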

Improved Reliability:

  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

For more details, refer to the Cayenne Documentation.

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

[Figure: DataFusion v51 ClickBench performance]

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;
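
Example DESCRIBE and named-argument usage (a sketch; my_udf and its parameters are hypothetical, shown only to illustrate the param => value syntax):

-- Returns the query's output schema without executing it
DESCRIBE SELECT a, count(*) AS cnt FROM t GROUP BY a;

-- PostgreSQL-style named arguments (hypothetical scalar function)
SELECT my_udf(threshold => 10, label => 'x');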

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Apache Arrow has been upgraded to v57.2, bringing major performance improvements and new capabilities.

[Figure: Arrow 57 Parquet metadata parsing performance]

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Improved batch deletion handling

Distributed Query Improvements

High Availability Clusters: Spice now supports running multiple active schedulers in an active/active configuration for production deployments. This eliminates the scheduler as a single point of failure and enables graceful handling of node failures.

  • Multiple schedulers run simultaneously, each capable of accepting queries
  • Schedulers coordinate via a shared S3-compatible object store
  • Executors discover all schedulers automatically
  • A load balancer distributes client queries across schedulers

Example HA configuration:

runtime:
  scheduler:
    state_location: s3://my-bucket/spice-cluster
    params:
      region: us-east-1

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: S3, ABFS, and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Features:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically.

CLI Commands:

# List all snapshots for a dataset
spice acceleration snapshots taxi_trips

# Get details of a specific snapshot
spice acceleration snapshot taxi_trips 3

# Set the current snapshot for rollback (requires runtime restart)
spice acceleration set-snapshot taxi_trips 2

HTTP API Endpoints:

Method | Endpoint | Description
GET | /v1/datasets/{dataset}/acceleration/snapshots | List all snapshots for a dataset
GET | /v1/datasets/{dataset}/acceleration/snapshots/{id} | Get details of a specific snapshot
POST | /v1/datasets/{dataset}/acceleration/snapshots/current | Set the current snapshot for rollback
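
For example, listing snapshots over HTTP (a sketch assuming the default HTTP port 8090 used elsewhere in this post):

# List all snapshots for the taxi_trips dataset
curl http://localhost:8090/v1/datasets/taxi_trips/acceleration/snapshots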

For more details, refer to the Acceleration Snapshots Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

  • Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol

SDK Support:

SDK | Support | Min Version | Method
gospice (Go) | ✅ Full | v8.0.0+ | SqlWithParams() with typed constructors (Int32Param, StringParam, TimestampParam, etc.)
spice-rs (Rust) | ✅ Full | v3.0.0+ | query_with_params() with RecordBatch parameters
spice-dotnet (.NET) | ✅ Full | v0.3.0+ | QueryWithParams() with typed parameter builders
spice-java (Java) | ✅ Full | v0.5.0+ | queryWithParams() with typed Param constructors (Param.int64(), Param.string(), etc.)
spice.js (JavaScript) | ✅ Full | v3.1.0+ | query() with parameterized query support
spicepy (Python) | ✅ Full | v3.1.0+ | query() with parameterized query support

Example (Go):

import "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters (distinct variable name so the block compiles)
    ArrowReader typedReader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
  • Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
  • Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

import java.math.BigDecimal;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    ArrowReader typedReader = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
  <groupId>ai.spice</groupId>
  <artifactId>spiceai</artifactId>
  <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

  • Google Chat Models: Access Google's Gemini models for chat completions
  • Google Embeddings: Generate embeddings using Google's text embedding models
  • Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}
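
For example, a chat completion against the gemini model defined above, via the OpenAI-compatible endpoint referenced earlier (a sketch assuming the default HTTP port 8090):

curl http://localhost:8090/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "gemini", "messages": [{"role": "user", "content": "Hello from Spice"}]}'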

For more details, refer to the Google LLM Documentation (see docs PR #1286).

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously
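
A hypothetical usage sketch; the endpoint path is from this post, but the request body shape and retrieval flow are assumptions:

# Submit a query asynchronously; results are retrieved later via the
# same /v1/queries endpoint (request body shape is an assumption)
curl -X POST http://localhost:8090/v1/queries \
  -H 'Content-Type: application/json' \
  -d '{"sql": "SELECT count(*) FROM taxi_trips"}'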

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.
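
For example, an OpenTelemetry Collector can push metrics to Spice's Flight port using the standard otlp exporter (a minimal sketch; spice-host and the TLS setting are placeholders to adjust for your deployment):

receivers:
  otlp:
    protocols:
      grpc:
exporters:
  otlp:
    endpoint: spice-host:50051 # Flight port, replacing the removed 50052
    tls:
      insecure: true # assumption: enable TLS as appropriate
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlp]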

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

  • Primary key hash index support
  • Secondary index support for non-primary key columns
  • Composite key support with proper null value handling

Example configuration:

datasets:
  - from: postgres:users
    name: users
    acceleration:
      enabled: true
      engine: arrow
      primary_key: user_id
      indexes:
        '(tenant_id, user_id)': unique # Composite hash index
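
With this configuration, an equality lookup on the composite key can be served by the hash index (the literal values are illustrative):

-- Equality predicates on both key columns use the composite hash index
SELECT * FROM users WHERE tenant_id = 42 AND user_id = 1001;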

For more details, refer to the Hash Index Documentation.

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

  • SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
  • NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
  • Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
  • Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Developer Experience Improvements

  • Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
  • Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
  • Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
  • Improved Spicepod Schema: Improved JSON schema generation for better IDE support and validation
  • Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
  • Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
  • GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
  • NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters
  • Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
  • Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
  • Reliability: Initialize secrets before object stores in cluster executor mode
  • Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
  • Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
  • Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

  1. Generate certificates using spice cluster tls init and spice cluster tls add
  2. Update scheduler and executor startup commands with --node-mtls-* arguments
  3. For development/testing, use --allow-insecure-connections to opt out of mTLS
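
A minimal sketch of the migration commands (certificate file names are illustrative assumptions; the commands and arguments are those listed in this section):

# 1. Generate cluster certificates
spice cluster tls init
spice cluster tls add

# 2. Start the scheduler with the renamed mTLS arguments
spiced --role scheduler \
  --node-mtls-ca-certificate-file ca.crt \
  --node-mtls-certificate-file scheduler.crt \
  --node-mtls-key-file scheduler.key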

Renamed CLI Arguments:

Old Name | New Name
--cluster-mode | --role
--cluster-ca-certificate-file | --node-mtls-ca-certificate-file
--cluster-certificate-file | --node-mtls-certificate-file
--cluster-key-file | --node-mtls-key-file
--cluster-address | --node-bind-address
--cluster-advertise-address | --node-advertise-address
--cluster-scheduler-url | --scheduler-address

Removed CLI Arguments:

  • --cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0 image:

docker pull spiceai/spiceai:1.11.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

What's Changed

Changelog

Spice v1.11.0-rc.2 (Jan 22, 2026)

· 24 min read
Viktor Yershov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.2!

v1.11.0-rc.2 is the second release candidate for advanced testing of v1.11. It brings Spice Cayenne to Beta status with acceleration snapshots support, adds a new ScyllaDB Data Connector, and upgrades to DataFusion v51, Arrow 57.2, and iceberg-rust v0.8.0. It also includes significant improvements to distributed query, caching, and observability.

What's New in v1.11.0-rc.2

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous stability improvements.

Improved Reliability:

  • Fixed timezone database issues in Docker images that caused acceleration panics
  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

Example configuration with snapshots:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

[Figure: DataFusion v51 ClickBench performance]

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Spice has been upgraded to Apache Arrow Rust 57.2.0, bringing major performance improvements and new capabilities.

[Figure: Arrow 57 Parquet metadata parsing performance]

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Feature Improvements in v1.11:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically. List, create, and restore snapshots directly from the command line or via HTTP.

For more details, refer to the Acceleration Snapshots Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Distributed Query Improvements

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: Azure and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Proper batch deletion handling

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously
  • Arrow Flight async support: Non-blocking query execution via Arrow Flight protocol

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Additional Improvements

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0-rc.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:v1.11.0-rc.2 image:

docker pull spiceai/spiceai:v1.11.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

Changelog

Spice v1.8.0 (Oct 6, 2025)

· 20 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.8.0! 🧊

Spice v1.8.0 delivers major advances in data writes, scalable vector search, and, now in preview, managed acceleration snapshots for fast cold starts. This release introduces write support for Iceberg tables using standard SQL INSERT INTO, partitioned S3 Vector indexes for petabyte-scale vector search, and a preview of the AI SQL function for direct LLM integration in SQL. Additional updates include reliability improvements and the v3.0.3 release of the Spice.js Node.js SDK.

What's New in v1.8.0

Iceberg Table Write Support (Preview)

Append Data to Iceberg Tables with SQL INSERT INTO: Spice now supports writing to Iceberg tables and catalogs using standard SQL INSERT INTO statements. This enables data ingestion, transformation, and pipeline use cases—no Spark or external writer required.

  • Append-only: Initial version targets appends; no overwrite or delete.
  • Schema validation: Inserted data must match the target table schema.
  • Secure by default: Writes are only enabled for datasets or catalogs explicitly marked with access: read_write.

Example Spicepod configuration:

catalogs:
  - from: iceberg:https://glue.ap-northeast-3.amazonaws.com/iceberg/v1/catalogs/111111/namespaces
    name: ice
    access: read_write

datasets:
  - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
    name: iceberg_table
    access: read_write

Example SQL usage:

-- Insert from another table
INSERT INTO iceberg_table
SELECT * FROM existing_table;

-- Insert with values
INSERT INTO iceberg_table (id, name, amount)
VALUES (1, 'John', 100.0), (2, 'Jane', 200.0);

-- Insert into catalog table
INSERT INTO ice.sales.transactions
VALUES (1001, '2025-01-15', 299.99, 'completed');

Note: Only Iceberg datasets and catalogs with access: read_write support writes. Internal Spice tables and other connectors remain read-only.

Learn more in the Iceberg Data Connector documentation.

Acceleration Snapshots for Fast Cold Starts (Preview)

Bootstrap Managed Accelerations from Object Storage: Spice now supports managed acceleration snapshots in preview, enabling datasets accelerated with file-based engines (DuckDB or SQLite) to bootstrap from a snapshot stored in object storage (such as S3) if the local acceleration file does not exist on startup. This dramatically reduces cold start times and enables ephemeral storage for accelerations with persistent recovery.

Key features:

  • Rapid readiness: Datasets can become ready in seconds by downloading a pre-built snapshot, skipping lengthy initial acceleration.
  • Hive-style partitioning: Snapshots are organized by month, day, and dataset for easy retention and management.
  • Flexible bootstrapping: Configurable fallback and retry behavior if a snapshot is missing or corrupted.

Example Spicepod configuration:

snapshots:
  enabled: true
  location: s3://some_bucket/some_folder/ # Folder for storing snapshots
  bootstrap_on_failure_behavior: warn # Options: warn, retry, fallback
  params:
    s3_auth: iam_role # All S3 dataset params accepted here

datasets:
  - from: s3://some_bucket/some_table/
    name: some_table
    params:
      file_format: parquet
      s3_auth: iam_role
    acceleration:
      enabled: true
      snapshots: enabled # Options: enabled, disabled, bootstrap_only, create_only
      engine: duckdb
      mode: file
      params:
        duckdb_file: /nvme/some_table.db

How it works:

  • On startup, if the acceleration file does not exist, Spice checks the snapshot location for the latest snapshot and downloads it.
  • Snapshots are stored as: s3://some_bucket/some_folder/month=2025-09/day=2025-09-30/dataset=some_table/some_table_<timestamp>.db
  • If no snapshot is found, a new acceleration file is created as usual.
  • Snapshots are written after each refresh (unless configured otherwise).

Supported snapshot modes:

  • enabled: Download and write snapshots.
  • bootstrap_only: Only download on startup, do not write new snapshots.
  • create_only: Only write snapshots, do not download on startup.
  • disabled: No snapshotting.

Note: This feature is only supported for file-based accelerations (DuckDB or SQLite) with dedicated files.

Why use acceleration snapshots?

  • Faster cold starts: Skip waiting for full acceleration on startup.
  • Ephemeral storage: Use fast local disks (e.g., NVMe) for acceleration, with persistent recovery from object storage.
  • Disaster recovery: Recover from federated source outages by bootstrapping from the latest snapshot.

Partitioned S3 Vector Indexes

Efficient, Scalable Vector Search with Partitioning: Spice now supports partitioning Amazon S3 Vector indexes and scatter-gather queries using a partition_by expression in the dataset vector engine configuration. Partitioned indexes enable faster ingestion, lower query latency, and scale to billions of vectors.

Example Spicepod configuration:

datasets:
  - name: reviews
    vectors:
      enabled: true
      engine: s3_vectors
      params:
        s3_vectors_bucket: my-bucket
        s3_vectors_index: base-embeddings
        partition_by:
          - 'bucket(50, PULocationID)'
    columns:
      - name: body
        embeddings:
          from: bedrock_titan
      - name: title
        embeddings:
          from: bedrock_titan
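
A query sketch against the partitioned index, using the vector_search table function shown elsewhere in these posts; filtering on the partitioned PULocationID column illustrates how scatter-gather can prune index partitions:

SELECT title, score
FROM vector_search(reviews, 'great airport pickup experience', 10)
WHERE PULocationID = 132
ORDER BY score DESC;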

See the Amazon S3 Vectors documentation for details.

AI SQL function for LLM Integration (Preview)

LLMs Directly In SQL: A new asynchronous ai SQL function enables direct calls to LLMs from SQL queries for text generation, translation, classification, and more. This feature is released in preview and supports both default and model-specific invocation.

Example Spicepod model configuration:

models:
  - name: gpt-4o
    from: openai:gpt-4o
    params:
      openai_api_key: ${secrets:openai_key}

Example SQL usage:

-- basic usage with default model
SELECT ai('hi, this prompt is directly from SQL.');
-- basic usage with specified model
SELECT ai('hi, this prompt is directly from SQL.', 'gpt-4o');
-- Using row data as input to the prompt
SELECT ai(concat_ws(' ', 'Categorize the zone', Zone, 'in a single word. Only return the word.')) AS category
FROM taxi_zones
LIMIT 10;

Learn more in the SQL Reference AI documentation.

Remote Endpoint Support for Spice CLI

Run CLI Commands Remotely: The Spice CLI now supports connecting to remote Spice instances, enabling you to run spice sql, spice search, and spice chat commands from your local machine against a remote spiced daemon or to Spice Cloud. Previously, these commands required running on the same machine as the runtime. Now, new flags allow remote execution:

  • --cloud: Connect to a Spice Cloud instance (requires --api-key).
  • --endpoint <endpoint>: Connect to a remote Spice instance via HTTP or Arrow Flight SQL (gRPC). Supports http://, https://, grpc://, or grpc+tls:// schemes.

Examples:

# Run SQL queries against a remote Spice instance
spice sql --endpoint http://remote-host:8090

# Use Spice Cloud for chat or search
spice chat --cloud --api-key <your-api-key>
spice search --cloud --api-key <your-api-key>

Supported CLI Commands:

  • spice sql --cloud / spice sql --endpoint <endpoint>
  • spice search --cloud / spice search --endpoint <endpoint>
  • spice chat --cloud / spice chat --endpoint <endpoint>

Additional Flags:

  • --headers: Pass custom HTTP headers to the remote endpoint.
  • --tls-root-certificate-file: Specify a root certificate for TLS verification.
  • --user-agent: Set a custom user agent for requests.

For more details, see the Spice CLI Command Reference.

Spice.js v3.0.3 SDK

Spice.js v3.0.3 Released: The official Spice.ai Node.js/JavaScript SDK has been updated to v3.0.3, bringing cross-platform support, new APIs, and improved reliability for both Node.js and browser environments.

  • Modern Query Methods: Use sql(), sqlJson(), and nsql() for flexible querying, streaming, and natural language to SQL.
  • Browser Support: SDK now works in browsers and web applications, automatically selecting the optimal transport (gRPC or HTTP).
  • Health Checks & Dataset Refresh: Easily monitor Spice runtime health and trigger dataset refreshes on demand.
  • Automatic HTTP Fallback: If gRPC/Flight is unavailable, the SDK falls back to HTTP automatically.
  • Migration Guidance: v3 requires Node.js 20+, uses camelCase parameters, and introduces a new package structure.

Example usage:

import { SpiceClient } from '@spiceai/spice'

const client = new SpiceClient(apiKey)
const table = await client.sql('SELECT * FROM my_table LIMIT 10')
console.table(table.toArray())

See Spice.js SDK documentation for full details, migration tips, and advanced usage.

Additional Improvements

  • Reliability: Improved logging, error handling, and network readiness checks across connectors (Iceberg, Databricks, etc.).
  • Vector search durability and scale: Refined logging, stricter default limits, safeguards against index-only scans and duplicate results, and always-accessible metadata for robust queryability at scale.
  • Cache behavior: Tightened cache logic for modification queries.
  • Full-Text Search: FTS metadata columns now usable in projections; max search results increased to 1000.
  • RRF Hybrid Search: Reciprocal Rank Fusion (RRF) UDTF enhancements for advanced hybrid search scenarios.

Contributors

Breaking Changes

This release introduces two breaking changes associated with the search observability and tooling.

Firstly, the document_similarity tool has been renamed to search, with the corresponding change to how these tool calls are traced:

## Old: v1.7.1
>> spice trace tool_use::document_similarity
>> curl -XPOST http://localhost:8090/v1/tools/document_similarity \
     -d '{
       "datasets": ["my_tbl"],
       "text": "Welcome to another Spice release"
     }'

## New: v1.8.0
>> spice trace tool_use::search
>> curl -XPOST http://localhost:8090/v1/tools/search \
     -d '{
       "datasets": ["my_tbl"],
       "text": "Welcome to another Spice release"
     }'

Secondly, the vector_search task in runtime.task_history has been renamed to search.

Cookbook Updates

The Spice Cookbook now includes 80 recipes to help you get started with Spice quickly and easily.


Upgrading

To upgrade to v1.8.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.8.0 image:

docker pull spiceai/spiceai:1.8.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Dependencies

  • iceberg-rust: Upgraded to v0.7.0-rc.1
  • mimalloc: Upgraded from 0.1.47 to 0.1.48
  • azure_core: Upgraded from 0.27.0 to 0.28.0
  • Jimver/cuda-toolkit: Upgraded from 0.2.27 to 0.2.28

Changelog

Spice v1.5.2 (Aug 11, 2025)

· 7 min read
Kevin Zimmerman
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.5.2! 🛠️

Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for Converse API (Nova) compatible models, AWS Redshift support using the PostgreSQL data connector, and Hadoop Catalog support for Iceberg tables, along with several bug fixes and improvements.

What's New in v1.5.2

Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.

Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.

Supported Model IDs:

  • amazon.nova-lite-v1:0
  • amazon.nova-micro-v1:0
  • amazon.nova-premier-v1:0
  • amazon.nova-pro-v1:0

Refer to the Amazon Bedrock documentation for details on available models and cross-region inference profiles.

Example Spicepod.yaml:

models:
  - from: bedrock:us.amazon.nova-lite-v1:0
    name: novash
    params:
      aws_region: us-east-1
      aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
      aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
      bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
      bedrock_guardrail_version: DRAFT
      bedrock_trace: enabled
      bedrock_temperature: 42

For more information, see the Amazon Bedrock Documentation.

AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.

To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.

Example Spicepod.yaml:

# Example datasets for Redshift TPCH tables
datasets:
- from: postgres:public.customer
name: customer
params:
pg_host: ${secrets:PG_HOST}
pg_port: 5439
pg_sslmode: prefer
pg_db: dev
pg_user: ${secrets:PG_USER}
pg_pass: ${secrets:PG_PASS}
- from: postgres:public.lineitem
name: lineitem
params:
pg_host: ${secrets:PG_HOST}
pg_port: 5439
pg_sslmode: prefer
pg_db: dev
pg_user: ${secrets:PG_USER}
pg_pass: ${secrets:PG_PASS}

Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.

Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.

Example Spicepod.yaml:

catalogs:
  - from: iceberg:file:///tmp/hadoop_warehouse/
    name: local_hadoop
  - from: iceberg:s3://my-bucket/hadoop_warehouse/
    name: s3_hadoop

# Example datasets
datasets:
  - from: iceberg:file:///data/hadoop_warehouse/test/my_table_1
    name: local_hadoop
  - from: iceberg:s3://my-bucket/hadoop_warehouse/test/my_table_2
    name: s3_hadoop

For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.

Parquet Reader: Optional Parquet Page Index: Fixed an issue where the Parquet reader, using arrow-rs and DataFusion, errored on files missing page indexes, despite the Parquet spec allowing optional indexes. The Spice team contributed optional page index support to arrow-rs (PR #6) and configurable handling in DataFusion (PR #93). A new runtime parameter, parquet_page_index, makes Parquet Page Indexes configurable in Spice:

runtime:
  params:
    parquet_page_index: required # Options: required, skip, auto

  • required: (Default) Errors if page indexes are absent.
  • skip: Ignores page indexes, potentially reducing query performance.
  • auto: Uses page indexes if available; skips otherwise.

This improves compatibility and query flexibility for Parquet datasets.

Contributors

Breaking Changes

Amazon S3 Vectors Vector Engine: Amazon S3 Vectors is currently a preview AWS service. A recent update to the Amazon S3 Vectors service API introduced a breaking change that affects the integration when projecting (selecting) the embedding column. This results in the following error:

Json error: whilst decoding field 'data': expected [ got null
Received only partial JSON payload from QueryVectors

The issue is expected to be resolved in the next release of Spice. A current workaround is to limit queries to non-embedding columns.

i.e. instead of:

SELECT url, title, score, body_embedding
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Remove the *_embedding column from the projection. E.g.

SELECT url, title, score
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

The issue and workaround also apply to SELECT * FROM vector_search(..). E.g.

SELECT *
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Cookbook Updates

The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.5.2 image:

docker pull spiceai/spiceai:1.5.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

No major dependency updates.

Changelog

  • fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
  • Update spicepod.schema.json (#6632) by @app/github-actions in #6632
  • Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
  • Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
  • Disable Metal precomplation in integration_llms.yml (#6649) by @Jeadie in #6649
  • fix: Hadoop integration test (#6660) by @peasee in #6660
  • feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
  • update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
  • feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
  • Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
  • Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
  • fix: Support include for Iceberg (#6663) by @peasee in #6663
  • feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
  • feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
  • fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
  • Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in #6673
  • fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
  • Fix AWS Auth issue (#6699) by @Advayp in #6699
  • Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
  • Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
  • Improve Flight REPL error messages (#6696) by @lukekim in #6696
  • Fixes from search tests (#6710) by @Jeadie in #6710

Spice v1.0.6 (Mar 17, 2025)

· 4 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0.6 ⚡

Spice v1.0.6 improves stability for DuckDB acceleration, improves the Iceberg Data/Catalog connectors when using AWS Glue, and fixes an issue with the ready_state: on_registration federation fallback when using DuckDB. In addition, redundant data refreshes on startup are avoided for accelerations with persistent data.

Highlights in v1.0.6

  • Iceberg Data/Catalog Connector Improvements: Improves Iceberg data & catalog connector reliability, including bug fixes for AWS Glue API rate-limiting and compatibility, REST API pagination support, explicit AWS credential handling, and support for AWS STS role assumption.

  • Fixes On-Registration Fallback when using DuckDB: Previously, when using DuckDB as a data accelerator and the ready_state: on_registration configuration, queries made during the initial data refresh did not properly fallback to the federated source. This is now fixed.

  • DuckDB downgraded for Stability: DuckDB has been downgraded to v1.1.3 due to a regression in memory handling tracked by duckdb/duckdb issue #16640. Once resolved and validated, Spice will re-upgrade to v1.2.x.

  • Expanded Integration Tests: Additional integration tests covering federated accelerator behavior and graceful shutdown processes have been added.

  • Optimized Data Refresh for Persistent Accelerations: Changed behavior in v1.0.6. When using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. This ensures efficient startup behavior by avoiding unnecessary refreshes. This logic applies only to full refreshes when no refresh interval is specified.

To maintain the previous behavior and always refresh on every startup, set:

acceleration:
  refresh_on_startup: always

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim
  • @Sevenannn

Breaking Changes

Starting from v1.0.6 when using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. To maintain the previous behavior and always refresh on every startup, set:

acceleration:
  refresh_on_startup: always

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.6, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.6 image:

docker pull spiceai/spiceai:1.0.6

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • Implement proper ready_state: on_registration for federation enabled accelerators by @phillipleblanc in #5019
  • Add indexes and primary keys mismatch detection for DuckDB Acceleration by @sgrebnov in #5045
  • Add comprehensive integration tests for the ready_state behavior by @phillipleblanc in #5042
  • Add test Spicepod for acceleration with constraints by @sgrebnov in #4891
  • Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in #4898
  • Add DuckDB graceful shutdown test to E2E CI tests by @sgrebnov in #5047
  • Update duckdb_append_with_pk_and_indexes.yaml (work for duckdb 1.1.x) by @sgrebnov in #5067
  • fix: Downgrade to DuckDB 1.1.3 by @peasee in #5055
  • fix: Acceleration federation integration test by @peasee in #5070
  • Improvements to Iceberg Catalog/Data Connector by @phillipleblanc in #5071
  • Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in #4809
  • fix: Spice.ai schema inference by @peasee in #4674
  • Add refresh_on_startup Spicepod configuration param by @phillipleblanc and @sgrebnov in #5086
  • Test restart behavior of DuckDB file acceleration against glue iceberg table by @Sevenannn #5075
  • Run Iceberg Data Connector - DuckDB File mode integration test by @Sevenannn #5069
  • Integration test for glue iceberg catalog by @Sevenannn #5077

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.5...v1.0.6

Spice v1.0.5 (Mar 11, 2025)

· 4 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0.5 🧊

Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, in addition to the existing Iceberg Catalog Connector. This new connector enables direct dataset creation and configuration for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.

Performance improvements include object-store optimized Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.

DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.

Additional updates include support for the Arrow Map type.

Highlights in v1.0.5

  • New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.

    Example usage in spicepod.yaml:

    datasets:
      - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
        name: my_table
        params:
          # Same as Iceberg Catalog Connector
        acceleration:
          enabled: true
    For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.

  • Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.

  • DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for map and list_reduce. Graceful shutdown of DuckDB has been improved for better stability across restarts.

  • Configurable DuckDB memory limit: Use the duckdb_memory_limit parameter to set the DuckDB acceleration memory limit:

    - from: spice.ai:path.to.my_dataset
      name: my_dataset
      acceleration:
        params:
          duckdb_memory_limit: '2GB'
        enabled: true
        engine: duckdb
        mode: file

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim

Breaking Changes

No Spice-specific breaking changes. Note that the DuckDB v1.2.0 upgrade includes upstream breaking changes to map and list_reduce; see the DuckDB v1.2.0 announcement for details.

Upgrading

To upgrade to v1.0.5, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.5 image:

docker pull spiceai/spiceai:1.0.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

  • DuckDB: Upgraded to v1.2.0 (#4842).

Changelog

  • fix: Update OpenAI model health check by @peasee in #4849
  • fix: Allow metrics endpoint setting in CLI by @peasee in #4939
  • DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
  • Introduce runtime shutdown state by @sgrebnov in #4917
  • Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
  • Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
  • DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
  • DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
  • Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
  • Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
  • Implement an Iceberg Data Connector by @phillipleblanc in #4941
  • Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
  • Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
  • Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
  • Add Iceberg dataset integration test by @phillipleblanc in #4950

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5

Spice v1.0.1 (Jan 27, 2025)

· 5 min read
Qianqian Liu
Software Engineer at Spice AI

Spice v1.0.1 focuses on an improved developer experience, with automatic CUDA GPU detection for local models, in addition to bug fixes. Notably, the Iceberg Catalog Connector now supports AWS Glue, including AWS Signature Version 4 (SigV4) authentication.

Highlights in v1.0.1

  • AWS Glue Support for Iceberg Catalog Connector: The Iceberg Catalog Connector now supports AWS Glue. Example spicepod.yaml configuration:

    - from: iceberg:https://glue.ap-northeast-2.amazonaws.com/iceberg/v1/catalogs/123456789012/namespaces
      name: glue
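
    The Glue endpoint URL above encodes the AWS region in the hostname (ap-northeast-2) and the AWS account ID in the path (the 123456789012 placeholder).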
  • spice upgrade CLI Command: The spice upgrade CLI command now handles more edge cases for a smoother upgrade experience.

  • GPU Acceleration Detection: The Spice CLI now automatically detects and enables CUDA GPU acceleration on supported NVIDIA GPUs, in addition to the existing Metal support on M-series macOS.

  • Python SDK: The Python SDK (spicepy) has been updated to v3.0.0, aligning the SDK version with the Runtime.

Breaking changes

No breaking changes.

Dependencies

No major dependency changes.

Cookbook

No new recipes.

Upgrading

To upgrade to v1.0.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.1 image:

docker pull spiceai/spiceai:1.0.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

Contributors

  • @Jeadie
  • @phillipleblanc
  • @ewgenius
  • @peasee
  • @Sevenannn
  • @sgrebnov
  • @lukekim

What's Changed

  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4459
  • docs: 1.0 release notes by @peasee in https://github.com/spiceai/spiceai/pull/4440
  • Create a release-only workflow that uses a previous run's artifacts by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4461
  • Add publish-only CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4462
  • Fix the CUDA release workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4463
  • docs: Update SECURITY.md for stable by @peasee in https://github.com/spiceai/spiceai/pull/4465
  • docs: Update endgame by @peasee in https://github.com/spiceai/spiceai/pull/4460
  • docs: Promote HF and File model components by @peasee in https://github.com/spiceai/spiceai/pull/4457
  • fix: E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/4466
  • Fix publish part of CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4467
  • Fix broken docs links in README by @ewgenius in https://github.com/spiceai/spiceai/pull/4468
  • Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4474
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4477
  • Add instruction to force-install CPU runtime to v1.0 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4469
  • feat: Add WIP testoperator dispatch workflow by @peasee in https://github.com/spiceai/spiceai/pull/4478
  • Fix Bug: invalid REPL cursor position on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4480
  • feat: Download latest spiced commit for testoperators by @peasee in https://github.com/spiceai/spiceai/pull/4483
  • Add compute engine image by @lukekim in https://github.com/spiceai/spiceai/pull/4486
  • fix: Testoperator git fetch depth by @peasee in https://github.com/spiceai/spiceai/pull/4484
  • feat: New spicepods, testoperator improvements, TPCDS Q1 fix by @peasee in https://github.com/spiceai/spiceai/pull/4475
  • Add 87 CUDA compatibility to build CI by @Jeadie in https://github.com/spiceai/spiceai/pull/4489
  • Use OpenAI golang client in `spice chat` by @Jeadie in https://github.com/spiceai/spiceai/pull/4491
  • Verify `search` and `chat` on Windows as part of AI installation tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4492
  • feat: Add testoperator dispatch command by @peasee in https://github.com/spiceai/spiceai/pull/4479
  • Run CUDA builds on non-GPU instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4496
  • Use upgraded spice cli when performing runtime upgrade in spice upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/4490
  • Revert "Use OpenAI golang client in `spice chat` (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4532
  • Make Anthropic rate limit error message friendlier by @sgrebnov in https://github.com/spiceai/spiceai/pull/4501
  • Update supported CUDA targets: add 87(cli), remove 75 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4509
  • Support AWS Glue for Iceberg catalog connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4517
  • Package CUDA runtime libraries into artifact for Windows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4497

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...v1.0.1

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.