Spice v1.9.0-rc.1 (Nov 4, 2025)
This is the first release candidate for v1.9.0, which introduces Cayenne, a new high-performance data accelerator built on the Vortex columnar format that delivers DuckDB-comparable performance without scaling limitations. This release also upgrades to DataFusion v50 for improved query performance, expands search capabilities with full-text search on views and multi-column embeddings, includes significant DynamoDB and DuckDB accelerator improvements, and delivers security and reliability enhancements.
What's New in v1.9.0-rc.1
Cayenne Data Accelerator (Alpha)
Introducing Cayenne: SQL as an Acceleration Format. Cayenne is a new high-performance data accelerator that simplifies multi-file data acceleration by using an embedded database (SQLite) for metadata while storing data in the Vortex columnar format. It delivers query and ingestion performance comparable to or better than DuckDB's file-based acceleration, without DuckDB's memory overhead or the scaling challenges of a single DuckDB file.
Cayenne uses SQLite to manage acceleration metadata (schemas, snapshots, statistics, file tracking) through simple SQL transactions, while storing actual data in Vortex's compressed columnar format.
Key Features:
- SQLite + Vortex Architecture: All metadata is stored in SQLite tables with standard SQL transactions, while data lives in Vortex's compressed, chunked columnar format designed for zero-copy access and efficient scanning.
- Simplified Operations: No complex file hierarchies, no JSON/Avro metadata files, no separate catalog servers: just SQL tables and Vortex data files. The entire metadata schema is intentionally simple for maximum reliability.
- Fast Metadata Access: A single SQL query retrieves all metadata needed for query planning, with no multiple round trips to storage, no S3 throttling, and no reconstruction of metadata state from scattered files.
- Efficient Small Changes: Dramatically reduces small file proliferation. Snapshots are just rows in SQLite tables, not new files on disk. Supports millions of snapshots without performance degradation.
- High Concurrency: Changes consist of two steps: stage Vortex files (if any), then run a single SQL transaction. Much faster conflict resolution and support for many more concurrent updates than file-based formats.
- Advanced Data Lifecycle: Full ACID transactions, delete support, and retention SQL execution on refresh commit.
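The two-step commit described above (stage data files, then publish them with a single SQL transaction) can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the pattern: the `snapshots` and `data_files` tables, the `commit_snapshot` and `plan_files` helpers, and the placeholder `.vortex` files are assumptions made for this example, not Cayenne's actual schema or API.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical metadata schema, for illustration only.
SCHEMA = """
CREATE TABLE snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE data_files (
    path TEXT PRIMARY KEY,
    row_count INTEGER NOT NULL,
    snapshot_id INTEGER NOT NULL REFERENCES snapshots(id)
);
"""


def commit_snapshot(conn, staged):
    """Register already-staged data files under a new snapshot in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        snapshot_id = conn.execute("INSERT INTO snapshots DEFAULT VALUES").lastrowid
        conn.executemany(
            "INSERT INTO data_files (path, row_count, snapshot_id) VALUES (?, ?, ?)",
            [(str(path), rows, snapshot_id) for path, rows in staged],
        )
    return snapshot_id


def plan_files(conn):
    """A single query returns every file a planner would need to scan."""
    return conn.execute(
        "SELECT path, row_count FROM data_files ORDER BY path"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Step 1: stage data files. Empty placeholder files stand in for Vortex files.
workdir = Path(tempfile.mkdtemp())
staged = []
for name, rows in [("part-0.vortex", 1000), ("part-1.vortex", 250)]:
    path = workdir / name
    path.touch()
    staged.append((path, rows))

# Step 2: publish the staged files with one SQL transaction.
first = commit_snapshot(conn, staged)
files = plan_files(conn)
```

Because the snapshot is just a row, a failed commit leaves no partial metadata behind: the transaction rolls back and the staged files are simply never referenced.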
Example Spicepod.yml configuration:
datasets:
  - from: s3:my_table
    name: accelerated_data
    acceleration:
      enabled: true
      engine: cayenne
      retention:
        sql: DELETE FROM accelerated_data WHERE created_at < NOW() - INTERVAL '30 days'
Note: The Cayenne Data Accelerator is in Alpha and has limitations.
For more details, refer to the Cayenne Documentation, the Vortex project, and the DuckLake announcement that partly inspired this design.
DataFusion v50 Upgrade
Spice.ai is built on the DataFusion query engine. The v50 release brings significant performance improvements and enhanced reliability:
Performance Improvements:
- Dynamic Filter Pushdown: Enhanced dynamic filter pushdown for custom `ExecutionPlans`, ensuring filters propagate correctly through all physical operators for improved query performance.
- Partition Pruning: Expanded partition pruning support ensures that unnecessary partitions are skipped when filters are not used, reducing data scanning overhead and improving query execution times.
Bug Fixes & Reliability: Resolved issues with partition name validation and empty execution plans when vector index lists are empty. Fixed timestamp support for partition expressions, enabling better partitioning for time-series data.
See the Apache DataFusion 50.0.0 Release for more details.
DynamoDB Data Connector Improvements
Improved Query Performance: The DynamoDB Data Connector now includes improved filter handling for edge cases, parallel scan support for faster data ingestion, and better error handling for misconfigured queries. These improvements enable more reliable and performant access to DynamoDB data.
Example Spicepod.yml configuration:
datasets:
  - from: dynamodb:my_table
    name: ddb_data
    params:
      scan_segments: 10 # Defaults to `auto`, which calculates optimal segments based on the number of rows
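For background on what a segment count controls, DynamoDB's `Scan` operation accepts `Segment` and `TotalSegments` parameters so that several workers can each read a disjoint slice of a table, following `LastEvaluatedKey` to page through their slice. The sketch below shows that general pattern in Python. It is not the connector's implementation: `scan_segment`, `parallel_scan`, and the in-memory `fake_scan_page` stand-in are hypothetical names for this example. With boto3, the `scan_page` argument could be a DynamoDB client's `scan` method.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(scan_page, table, segment, total_segments):
    """Read every page of one segment, following LastEvaluatedKey."""
    items = []
    request = {"TableName": table, "Segment": segment, "TotalSegments": total_segments}
    while True:
        page = scan_page(**request)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        request["ExclusiveStartKey"] = last_key


def parallel_scan(scan_page, table, total_segments):
    """Scan all segments concurrently and concatenate their items."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, scan_page, table, segment, total_segments)
            for segment in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]


def fake_scan_page(TableName, Segment, TotalSegments, ExclusiveStartKey=None):
    """In-memory stand-in for a Scan call: 25 rows, two items per page.

    Real DynamoDB items use typed attribute values; plain dicts keep the demo short.
    """
    rows = [{"id": i} for i in range(25) if i % TotalSegments == Segment]
    start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["offset"]
    page = {"Items": rows[start:start + 2]}
    if start + 2 < len(rows):
        page["LastEvaluatedKey"] = {"offset": start + 2}
    return page


items = parallel_scan(fake_scan_page, "my_table", total_segments=10)
```

Items come back grouped by segment rather than in key order, so any consumer that needs ordering has to sort after the scan.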
Search & Embeddings Enhancements
Full-Text Search on Views: Full-text search indexes are now supported on views, enabling advanced search scenarios over pre-aggregated or transformed data. This extends the power of Spice's search capabilities beyond base datasets.
Multi-Column Embeddings on Views: Views now support embedding columns, enabling vector search and semantic retrieval on view data. This is useful for search over aggregated or joined datasets.
Vector Engines on Views: Vector search engines are now available for views, enabling similarity search over complex queries and transformations.
Example Spicepod.yml configuration:
views:
  - name: aggregated_reviews
    sql: SELECT review_id, review_text FROM reviews WHERE rating > 4
    embeddings:
      - column: review_text
        model: openai:text-embedding-3-small
DuckDB Accelerator Improvements
Parquet Buffering for Partitioned Writes: DuckDB partitioned writes in table mode now support Parquet buffering, reducing memory usage and improving write performance for large datasets.
Retention SQL on Refresh Commit: DuckDB accelerations now support running retention SQL on refresh commit, enabling automatic data cleanup and lifecycle management during refresh operations.
UTC Timezone for DuckDB: DuckDB now uses UTC as the default timezone, ensuring consistent behavior for time-based queries across different environments.
Example Spicepod.yml configuration:
datasets:
  - from: s3://my_bucket/large_table/
    name: partitioned_data
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      retention:
        sql: DELETE FROM partitioned_data WHERE event_time < NOW() - INTERVAL '7 days'
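As a rough illustration of what running retention SQL on refresh commit means, the sketch below appends refreshed rows and applies a retention `DELETE` inside the same transaction, so readers never see new rows alongside expired ones. It uses Python's built-in `sqlite3` as a stand-in for the accelerator, so the date arithmetic uses SQLite's `datetime()` rather than DuckDB's `NOW() - INTERVAL` syntax. The `commit_refresh` helper and the fixed reference time are assumptions for this example, not Spice internals.

```python
import sqlite3


def commit_refresh(conn, new_rows, retention_sql, params=()):
    """Append refreshed rows, then apply retention SQL in the same transaction."""
    with conn:  # both statements commit together, or neither does
        conn.executemany(
            "INSERT INTO partitioned_data (id, event_time) VALUES (?, ?)", new_rows
        )
        deleted = conn.execute(retention_sql, params).rowcount
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE partitioned_data (id INTEGER PRIMARY KEY, event_time TEXT NOT NULL)"
)
with conn:
    conn.executemany(
        "INSERT INTO partitioned_data (id, event_time) VALUES (?, ?)",
        [(1, "2025-10-01 00:00:00"), (2, "2025-11-01 00:00:00")],
    )

# A fixed "now" keeps the example deterministic; a real refresh would use the current time.
deleted = commit_refresh(
    conn,
    new_rows=[(3, "2025-11-03 12:00:00")],
    retention_sql="DELETE FROM partitioned_data WHERE event_time < datetime(?, '-7 days')",
    params=("2025-11-04 00:00:00",),
)
remaining = [row[0] for row in conn.execute("SELECT id FROM partitioned_data ORDER BY id")]
```

Only the row older than seven days relative to the reference time is removed; the freshly inserted row and the recent existing row survive the commit.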
Query Performance Optimizations
Optimized Prepared Statements: Prepared statement handling has been optimized for better performance with parameterized queries, reducing planning overhead and improving execution time for repeated queries.
Large RecordBatch Chunking: Large Arrow RecordBatch objects are now automatically chunked to control memory usage during query execution, preventing memory exhaustion for queries returning large result sets.
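The idea behind chunking can be shown with a small, dependency-free sketch: a large batch is split into consecutive slices of at most `max_rows` rows, so downstream consumers handle bounded pieces instead of one oversized batch. The `Batch` type here is a toy dictionary of column lists standing in for an Arrow RecordBatch, and `chunk_batch` and its `max_rows` parameter are hypothetical names for this example, not the runtime's actual API or chunk-size setting.

```python
from typing import Dict, Iterator, List

# Column name -> values. A toy stand-in for an Arrow RecordBatch.
Batch = Dict[str, List]


def chunk_batch(batch: Batch, max_rows: int) -> Iterator[Batch]:
    """Yield consecutive slices of `batch` with at most `max_rows` rows each."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    num_rows = len(next(iter(batch.values()), []))
    for start in range(0, num_rows, max_rows):
        # List slicing copies values; Arrow can slice without copying buffers.
        yield {name: values[start:start + max_rows] for name, values in batch.items()}


big = {"id": list(range(10)), "label": [f"row-{i}" for i in range(10)]}
chunks = list(chunk_batch(big, max_rows=4))
```

Because `chunk_batch` is a generator, only one slice needs to be materialized at a time when the caller streams the results.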
Security & Reliability Improvements
Enhanced HTTP Client Security: HTTP client usage across the runtime has been hardened with improved TLS validation, certificate pinning for critical endpoints, and better error handling for network failures.
ODBC Connector Improvements: Removed unwrap calls from the ODBC connector, improving error handling and reliability. Fixed secret handling and Kubernetes secret integration.
CLI Permissions Hardening: Tightened file permissions for the CLI and install script, ensuring secure defaults for configuration files and credentials.
Oracle Instant Client Pinning: Oracle Instant Client downloads are now pinned to specific SHAs, ensuring reproducible builds and preventing supply chain attacks.
Observability & Tracing
DataFusion Log Emission: The Spice runtime now emits DataFusion internal logs, providing deeper visibility into query planning and execution for debugging and performance analysis.
AI Completions Tracing: Fixed tracing so that ai_completions operations are correctly parented under sql_query traces, improving observability for AI-powered queries.
Git Data Connector (Alpha)
Version-Controlled Data Access: The new Git Data Connector (Alpha) enables querying datasets stored in Git repositories. This connector is ideal for use cases involving configuration files, documentation, or any data tracked in version control.
Example Spicepod.yml configuration:
datasets:
  - from: git:https://github.com/myorg/myrepo
    name: git_metrics
    params:
      file_format: csv
For more details, refer to the Git Data Connector Documentation.
Additional Improvements & Bug Fixes
- Reliability: Fixed refresh worker panics with recovery handling to prevent runtime crashes during acceleration refreshes.
- Reliability: Improved error messages for missing or invalid `spicepod.yaml` files, providing actionable feedback for misconfiguration.
- Reliability: Fixed DuckDB metadata pointer loading issues for snapshots.
- Performance: Ensured `ListingTable` partitions are pruned correctly when filters are not used.
- Reliability: Fixed vector dimension determination for partitioned indexes.
- Search: Fixed casing issues in Reciprocal Rank Fusion (RRF) for hybrid search queries.
- Search: Fixed search field handling as metadata for chunked search indexes.
- Validation: Added timestamp support for partition expressions.
- Validation: Fixed `regexp_match` function for DuckDB datasets.
- Validation: Fixed partition name validation for improved reliability.
Contributors
Breaking Changes
No breaking changes.
Cookbook Updates
No major cookbook updates.
The Spice Cookbook includes 81 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.9.0-rc.1, use one of the following methods:
CLI:
spice upgrade
Homebrew:
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.9.0-rc.1 image:
docker pull spiceai/spiceai:1.9.0-rc.1
For available tags, see DockerHub.
Helm:
helm repo update
helm upgrade spiceai spiceai/spiceai
AWS Marketplace:
Spice is now available in the AWS Marketplace!
What's Changed
Changelog
- Fix for search field as metadata for chunked search indexes by @Jeadie in #7429
- Bump object_store from 0.12.3 to 0.12.4 by @app/dependabot in #7433
- Properly respect disabling snapshots by @phillipleblanc in #7431
- Revert "Properly respect disabling snapshots" by @sgrebnov in #7439
- Revert "Disable snapshots by default" by @sgrebnov in #7438
- Add preview warning for write access mode by @sgrebnov in #7440
- fix: regexp_match for DuckDB datasets by @kczimm in #7443
- Add feature is currently in preview warning for snapshots by @sgrebnov in #7442
- [Logger] Also emit Datafusion logs by @mach-kernel in #7441
- add missing snapshot by @kczimm in #7446
- Fix tracing so that ai_completions are parented under sql_query by @lukekim in #7415
- Enable snapshot acceleration by default by @phillipleblanc in #7451
- Disable acceleration refresh metrics by @krinart in #7450
- Add v1.8 release notes by @phillipleblanc in #7430
- fix: partition name validation by @kczimm in #7452
- Fix lint error due to ignore without reasons by @krinart in #7454
- Add models and CUDA support to spiced install script by @lukekim in #7457
- Post-release 1.8 updates by @phillipleblanc in #7455
- Remove println in datafusion by @phillipleblanc in #7461
- Update end_game.md to notify once release is done by @sgrebnov in #7460
- Remove italics from snapshot logging by @phillipleblanc in #7463
- Update openapi.json by @app/github-actions in #7466
- Fix generate spicepod schema by @phillipleblanc in #7464
- Fix generate acknowledements by @phillipleblanc in #7465
- Update spicepod.schema.json by @app/github-actions in #7469
- fix: Ensure ListingTable partitions are pruned when filters are not used by @peasee in #7471
- Create `runtime-secrets` crate by @phillipleblanc in #7474
- Create `runtime-parameters` crate by @phillipleblanc in #7475
- Don't download the snapshot if the acceleration is present by @phillipleblanc in #7477
- Fix casing for keywords and additional columns by @Jeadie in #7770
- Bump actions/upload-artifact from 4 to 5 by @app/dependabot in #7750
- Bump criterion from 0.5.1 to 0.7.0 by @app/dependabot in #7740
- Bump rustls-native-certs from 0.8.1 to 0.8.2 by @app/dependabot in #7744
- Git Data Connector (Alpha) by @lukekim in #7772
- Pepper accelerator delete support by @lukekim in #7616
- Update Helm chart instructions for Helm in end_game.md by @sgrebnov in #7776
- Turso data accelerator by @lukekim in #7472
- Apply retention SQL filter to refresh fetch by @phillipleblanc in #7778
- Add Parquet buffering option for DuckDB partitioned writes (tables mode) by @sgrebnov in #7735
- fix: EmptyExec when list indexes is empty by @kczimm in #7784
- 1.8.3 post-release housekeeping by @mach-kernel in #7783
- feat: Upgrade to Datafusion v50 by @peasee in #7777
- fix: Replace vortex datafusion with public crate by @peasee in #7791
- Full-text search on views by @Jeadie in #7733
- Revert "Apply retention SQL filter to refresh fetch (#7778)" by @phillipleblanc in #7796
- fix: Add ingest duration and acceleration size metrics to testoperator by @peasee in #7792
- Set local timezone to UTC for DuckDB by @phillipleblanc in #7797
- add Timestamp support for partition expressions by @kczimm in #7803
- Fix trunk lint by @krinart in #7804
- Add missing mongodb params by @krinart in #7807
- Embedding columns on view components by @Jeadie in #7795
- Add Turso as a Pepper Catalog metastore by @lukekim in #7793
- Run retention_sql on refresh commit for DuckDB by @lukekim in #7785
- docs: Update datafusion upgrade checklist by @peasee in #7812
- Vector engines on views by @Jeadie in #7808
- Handle refresh worker panics and add recovery test by @phillipleblanc in #7815
- chunk large record batches to control memory usage by @kczimm in #7802
- fix: cannot determine vector dimension for partitioned indexes by @kczimm in #7818
- Upgrade to Turso v0.3 by @lukekim in #7821
- fix: Ensure custom *Exec ExecutionPlans push down dynamic filters by @peasee in #7811
- handle casing in RRF by @Jeadie in #7825
- Enable 'turso' for pepper acceleration by default by @sgrebnov in #7826
- Improved DynamoDB Data Connector by @krinart in #7715
- Initial support for llama.cpp as LLM inference backend by @lukekim in #7794
- Pepper: Implement retention SQL on refresh commit by @phillipleblanc in #7814
- Fix Dockerfiles for arm64 by @lukekim in #7834
- [DynamoDB] Handle filter edge-cases by @krinart in #7830
- [DynamoDB] Support parallelization for `Scan` request by @krinart in #7829
- Don't feature gate Pepper by @lukekim in #7832
- Fix llama.cpp static link by @lukekim in #7835
- fix: docker nightly builds by @kczimm in #7837
- Use GitHub-hosted macOS runner only for tag releases by @lukekim in #7836
- Fix Bug: DuckDB INTERNAL Error: Failed to load metadata pointer by @sgrebnov in #7839
- Fix docker arm64 build to use aegis in pure-rust mode by @lukekim in #7840
- Revert "Use GitHub-hosted macOS runner only for tag releases" by @lukekim in #7843
- Rename Pepper to Cayenne by @lukekim in #7844
- Tighten CLI permissions and install script by @lukekim in #7845
- Set mvcc for Cayenne Turso metastore by @lukekim in #7850
- Optimize Prepared Statements by @lukekim in #7859
- Remove unwrap from ODBC connector, fix secrets, and kuberenetes secre… by @lukekim in #7846
- Improve and secure HTTP client usage by @lukekim in #7847
- Pin Oracle Instant Client download to a SHA by @lukekim in #7851
- Improve experience for missing or invalid Spicepod.yaml by @lukekim in #7849
- chore: Fix PR linting by @peasee in #7865
- Revert FlightIPC issues by @Jeadie in #7870
- Improve error message by adding 'cayenne' to the list of valid accelerator engines by @sgrebnov in #7882
- fix: allow parameter index without dollar signs by @kczimm in #7887
- Temporary disable `supports_limit_pushdown` for `SchemaCastScanExec` by @sgrebnov in #7893
- Remove '.embeddings[].metadata' by @Jeadie in #7897





