Spice v1.3.0 (May 19, 2025)
Announcing the release of Spice v1.3.0! 🏎️
Spice v1.3.0 accelerates data and AI applications with significantly improved query performance, reliability, and expanded Databricks integration. New support for the Databricks SQL Statement Execution API enables direct SQL queries on Databricks SQL Warehouses, complementing Mosaic AI model serving and embeddings (introduced in v1.2.2) and existing Databricks catalog and dataset integrations. This release upgrades to DataFusion v46, optimizes results caching performance, and strengthens security with least-privilege sandboxing improvements.
What's New in v1.3.0
- Databricks SQL Statement Execution API Support: Added support for the Databricks SQL Statement Execution API, enabling direct SQL queries against Databricks SQL Warehouses for optimized performance in analytics and reporting workflows.
Example spicepod.yml configuration:
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_awesome_table
    params:
      mode: sql_warehouse
      databricks_endpoint: ${env:DATABRICKS_ENDPOINT}
      databricks_sql_warehouse_id: ${env:DATABRICKS_SQL_WAREHOUSE_ID}
      databricks_token: ${env:DATABRICKS_TOKEN}
For details, see the Databricks Data Connector documentation.
- Improved Results Cache Performance & Hashing Algorithm: Spice now supports an alternative results cache hashing algorithm, ahash, in addition to siphash, which remains the default. Configure it via:
runtime:
  results_cache:
    hashing_algorithm: ahash # or siphash
The hashing algorithm determines how cache keys are hashed before being stored, impacting both lookup speed and protection against potential DoS attacks.
Using ahash improves performance for large queries or query plans. Combined with results cache optimizations, it reduces 99th percentile request latency and increases total requests/second for queries with large result sets (100k+ cached rows). The following charts show performance tested against TPC-H Query #17 on a scale factor 5 dataset (30+ million rows, 5GB):
[Charts: Latency, Req/sec]
Note: ahash was not available in v1.2.2, so it is excluded from comparisons.
To learn more, refer to the Results Cache Hashing Algorithm documentation.
- SQL Query Performance: Optimized the critical SQL query path, reducing overhead and improving response times for simple queries by 10-20%.
- DuckDB Acceleration: Fixed a bug in the DuckDB acceleration engine causing query failures under high concurrency when querying datasets accelerated into multiple DuckDB files.
- Container Security: The container image now runs as a non-root user with enhanced sandboxing and includes only essential dependencies for a slimmer, more secure image.
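To picture what the results cache hashing_algorithm option controls, here is a minimal Python sketch of a cache whose key hashing is pluggable. It is an illustration only, not Spice's implementation. Python's standard library exposes neither ahash nor SipHash as a callable, so a keyed BLAKE2b digest and an unkeyed CRC32 checksum are used as stand-ins; they show where the hash function plugs in and say nothing about the actual speed or security properties of ahash or siphash. The names ResultsCache, keyed_hash, and fast_hash are hypothetical.

```python
import hashlib
import os
import zlib
from typing import Callable, Dict, Optional

# Random per-process key, so outside parties cannot precompute colliding inputs.
_SECRET_KEY = os.urandom(16)


def keyed_hash(data: bytes) -> int:
    """Keyed 64-bit hash (stand-in only; not siphash)."""
    digest = hashlib.blake2b(data, digest_size=8, key=_SECRET_KEY).digest()
    return int.from_bytes(digest, "big")


def fast_hash(data: bytes) -> int:
    """Unkeyed 32-bit checksum (stand-in only; not ahash)."""
    return zlib.crc32(data)


class ResultsCache:
    """Toy results cache mapping hashed query text to cached rows.

    A real cache must also handle hash collisions and eviction; this toy does not.
    """

    def __init__(self, hasher: Callable[[bytes], int]) -> None:
        self._hasher = hasher
        self._entries: Dict[int, list] = {}

    def _key(self, sql: str) -> int:
        # The cache key is hashed before being stored or looked up.
        return self._hasher(sql.encode("utf-8"))

    def get(self, sql: str) -> Optional[list]:
        return self._entries.get(self._key(sql))

    def put(self, sql: str, rows: list) -> None:
        self._entries[self._key(sql)] = rows


# Swapping the hasher changes lookup cost and collision behavior, not the API.
cache = ResultsCache(hasher=keyed_hash)
cache.put("SELECT 1", [(1,)])
hit = cache.get("SELECT 1")
miss = cache.get("SELECT 2")

fast_cache = ResultsCache(hasher=fast_hash)
fast_cache.put("SELECT 1", [(1,)])
fast_hit = fast_cache.get("SELECT 1")
```

The point of the design is that the hash function is the only piece that changes between configurations, which is why a single runtime setting can trade lookup speed against collision resistance.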
DataFusion v46 Highlights
Spice.ai is built on the DataFusion query engine. The v46 release brings:
- Faster Performance 🚀: DataFusion 46 introduces significant performance enhancements, including a 2x faster median() function for large datasets without grouping, 10–100% speed improvements in FIRST_VALUE and LAST_VALUE window functions by avoiding sorting, and a 40x faster uuid() function. Additional optimizations, such as a 50% faster repeat() string function, accelerated chr() and to_hex() functions, improved grouping algorithms, and Parquet row group pruning with NOT LIKE filters, further boost overall query efficiency.
- New range() Table Function: A new table-valued function range(start, stop, step) has been added to make it easy to generate integer sequences, similar to PostgreSQL's generate_series() or Spark's range().
Example:
SELECT * FROM range(1, 10, 2);
- UNION [ALL | DISTINCT] BY NAME Support: DataFusion now supports UNION BY NAME and UNION ALL BY NAME, which align columns by name instead of position. This matches functionality found in systems like Spark and DuckDB and simplifies combining heterogeneously ordered result sets.
Example:
SELECT col1, col2 FROM t1
UNION ALL BY NAME
SELECT col2, col1 FROM t2;
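The two SQL additions above, range() and UNION ALL BY NAME, can be modeled in plain Python to show the intended semantics. This is a sketch under stated assumptions, not DataFusion code: it assumes range() excludes the stop value (as Python's own range does) and that a column present on only one side is filled with NULL (None here), as DuckDB documents for its UNION BY NAME. The helper names range_values and union_all_by_name are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def range_values(start: int, stop: int, step: int = 1) -> List[int]:
    """Model of SELECT * FROM range(start, stop, step); stop assumed exclusive."""
    return list(range(start, stop, step))


def union_all_by_name(left: List[Row], right: List[Row]) -> List[Row]:
    """Concatenate rows, aligning columns by name rather than by position."""
    # Output columns: first-seen order across both inputs.
    columns: List[str] = []
    for row in left + right:
        for name in row:
            if name not in columns:
                columns.append(name)
    # Columns missing from a row become None (assumed NULL-filling behavior).
    return [{name: row.get(name) for name in columns} for row in left + right]


# Mirrors the SQL example: t2 lists the same columns in the opposite order.
t1 = [{"col1": 1, "col2": "a"}]
t2 = [{"col2": "b", "col1": 2}]

# A positional union would pair col1 with col2 for the second row.
positional = [tuple(row.values()) for row in t1 + t2]

# Aligning by name puts each value under the matching column.
by_name = [tuple(row.values()) for row in union_all_by_name(t1, t2)]

sequence = range_values(1, 10, 2)
```

In the positional case the second row arrives as ("b", 2), so values land under the wrong columns; aligning by name yields (2, "b") instead.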
See the DataFusion 46.0.0 release notes for details.
Spice.ai adopts the latest minus one DataFusion release for quality assurance and stability. The upgrade to DataFusion v47 is planned for Spice v1.4.0 in June.
Breaking Changes
No breaking changes.
Cookbook Updates
- Added Accelerated Views: Pre-calculate and materialize data derived from one or more underlying datasets.
The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.3.0, use one of the following methods:
CLI:
spice upgrade
Homebrew:
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.3.0 image:
docker pull spiceai/spiceai:1.3.0
For available tags, see DockerHub.
Helm:
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- DataFusion: Upgraded to v46
- Apache Arrow: Upgraded to v54.3.0
- delta_kernel: Upgraded to v0.10.0
Changelog
- update to 1.2.2 by @Jeadie in #5806
- Move sandboxing logic to Dockerfile by @phillipleblanc in #5808
- Add note to run installation health workflow after release is marked as official by @Sevenannn in #5797
- ROADMAP updates May 13, 2025 by @lukekim in #5809
- Update qa_analytics.csv by @kczimm in #5810
- post-release housekeeping by @Jeadie in #5811
- Fix flaky DataBricks M2M integration tests by @phillipleblanc in #5818
- Add DataFusion request context extension to http routes by @ewgenius in #5807
- Use Utf8 for partition columns by @phillipleblanc in #5820
- Use full path for location metadata column by @phillipleblanc in #5819
- Remove the DataFusion reference from the flight service and use the reference from the request context instead by @ewgenius in #5821
- Upgrade delta_kernel to 0.10 by @phillipleblanc in #5823
- fix: Update benchmark snapshots by @app/github-actions in #5827
- Update qa_analytics.csv by @kczimm in #5824
- fix: Update benchmark snapshots by @app/github-actions in #5826
- fix: Update benchmark snapshots by @app/github-actions in #5825
- Fix dispatch spicepod reference for file[parquet]-duckdb[file]-indexes and file[parquet]-duckdb[memory]-indexes by @phillipleblanc in #5837
- Fix spice run --http-endpoint in CLI by @Jeadie in #5812
- Prevent excessively copying RawCacheKey by @peasee in #5838
- Make DuckDB database attachments logic more robust by @sgrebnov in #5839
- Simplify Databricks U2M auth flow, by moving user auth to the request context by @ewgenius in #5842
- Update to new MCP crate by @Jeadie in #5758
- Disable the query tracker when task history is disabled by @peasee in #5852
- Set fsGroup on PodSpec to force volumes to be mounted with permission to docker image by @phillipleblanc in #5854
- Clarify Helm release steps by @phillipleblanc in #5855
- Avoid cloning cached results by @peasee in #5853
- Upgrade to DataFusion 46 by @phillipleblanc in #5543
- Update openapi.json by @app/github-actions in #5856
- Adapt to Arrow 54 changes in Dict IDs preserving (Arrow IPC) by @sgrebnov in #5866
- fix: Update benchmark snapshots by @app/github-actions in #5867
- Fix s3[parquet]-duckdb[file-many] benchmark Spicepod configuration by @sgrebnov in #5868
- fix: Update benchmark snapshots by @app/github-actions in #5869
- feat: Refactor caching, support hashing algorithms by @peasee in #5859
- Override health checks for Databricks models in U2M auth mode by @ewgenius in #5858
- Update trunk to 1.4.0-unstable by @phillipleblanc in #5878
- fix: Pass parameters to testoperator explain plan by @peasee in #5883
- Disallow schema updates for existing accelerated tables by @phillipleblanc in #5887
- Deferrable registration for Databricks U2M datasets by @ewgenius in #5860
See the full list of changes at: v1.2.2...v1.3.0