Spice v1.9.0-rc.2 (Nov 11, 2025)
Announcing the release of Spice v1.9.0-rc.2! 🌶
This is the second release candidate for v1.9.0, which introduces Spice Cayenne, a new high-performance data accelerator built on the Vortex columnar format that delivers better-than-DuckDB performance without single-file scaling limitations, along with a preview of Multi-Node Distributed Query based on Apache Ballista. v1.9.0-rc.2 also upgrades to DataFusion v50 and DuckDB v1.4.1 for even higher query performance, expands search capabilities with full-text search on views and multi-column embeddings, delivers significant DynamoDB and DuckDB accelerator improvements, extends the HTTP Data Connector to support endpoints as tables, and includes many security and reliability improvements.
What's New in v1.9.0-rc.2​
Cayenne Data Accelerator (Beta)​
Introducing Cayenne, SQL as an acceleration format: a new high-performance Data Accelerator that simplifies multi-file data acceleration by using an embedded database (SQLite) for metadata while storing data in the Vortex columnar format, a Linux Foundation project. Cayenne delivers query and ingestion performance better than DuckDB's file-based acceleration, without DuckDB's memory overhead or the scaling challenges of single DuckDB files.
Cayenne uses SQLite to manage acceleration metadata (schemas, snapshots, statistics, file tracking) through simple SQL transactions, while storing data in Vortex's compressed columnar format. This architecture provides:
Key Features:
- SQLite + Vortex Architecture: All metadata is stored in SQLite tables with standard SQL transactions, while data lives in Vortex's compressed, chunked columnar format designed for zero-copy access and efficient scanning.
- Simplified Operations: No complex file hierarchies, no JSON/Avro metadata files, no separate catalog servers—just SQL tables and Vortex data files. The entire metadata schema is intentionally simple for maximum reliability.
- Fast Metadata Access: Single SQL query retrieves all metadata needed for query planning—no multiple round trips to storage, no S3 throttling, no reconstruction of metadata state from scattered files.
- Efficient Small Changes: Dramatically reduces small file proliferation. Snapshots are just rows in SQLite tables, not new files on disk. Supports millions of snapshots without performance degradation.
- High Concurrency: Changes consist of two steps: stage Vortex files (if any), then run a single SQL transaction. Much faster conflict resolution and support for many more concurrent updates than file-based formats.
- Advanced Data Lifecycle: Full ACID transactions, delete support, and retention SQL execution on refresh commit.
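The two-step commit described above (stage Vortex files, then run one SQL transaction) can be sketched with Python's built-in sqlite3. The schema, table names, and file paths below are illustrative only, not Cayenne's actual metadata layout:

```python
import sqlite3

# Illustrative only: a simplified metadata schema, not Cayenne's real one.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, row_count INTEGER)")
db.execute("CREATE TABLE snapshots (id INTEGER PRIMARY KEY, committed_at TEXT)")

def commit_refresh(staged_files):
    # Step 1 (already done by the caller): Vortex files are staged on disk.
    # Step 2: register the files and the new snapshot in ONE transaction,
    # so readers see either the old snapshot or the new one, never a mix.
    with db:  # sqlite3 context manager commits as a single atomic transaction
        db.executemany("INSERT INTO files VALUES (?, ?)", staged_files)
        db.execute("INSERT INTO snapshots (committed_at) VALUES (datetime('now'))")

commit_refresh([("part-0001.vortex", 1_000_000)])
print(db.execute("SELECT COUNT(*) FROM snapshots").fetchone()[0])  # -> 1
```

Because a snapshot is just a row insert, conflict resolution reduces to ordinary database transaction semantics rather than file-level compare-and-swap.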
Example Spicepod.yml configuration:
```yaml
datasets:
  - from: s3:my_table
    name: accelerated_data_30d
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      refresh_mode: append
      retention_sql: DELETE FROM accelerated_data_30d WHERE created_at < NOW() - INTERVAL '30 days'
```
Note: the Cayenne Data Accelerator is in Beta and has known limitations.
For more details, refer to the Cayenne Documentation, the Vortex project, and the DuckLake announcement that partly inspired this design.
Multi-Node Distributed Query (Preview)​
Apache Ballista Integration: Spice now supports distributed query execution based on Apache Ballista, enabling distributed queries across multiple executor nodes for improved performance on large datasets. This feature is in preview in v1.9.0-rc.2.
Architecture:
A distributed Spice cluster consists of:
- Scheduler: Responsible for distributed query planning and work queue management for the executor fleet
- Executors: One or more nodes responsible for running physical query plans
Getting Started:
Start a scheduler instance using an existing Spicepod. The scheduler is the only spiced instance that needs to be configured:
```bash
# Start scheduler (note the flight bind address override if you want it reachable outside localhost)
spiced --cluster-mode scheduler --flight 0.0.0.0:50051
```
Start one or more executors configured with the scheduler's flight URI:
```bash
# Start executor (automatically selects a free port if 50051 is taken)
spiced --cluster-mode executor --scheduler-url spiced://localhost:50051
```
Query Execution:
Queries run through the scheduler will now show a distributed_plan in EXPLAIN output, demonstrating how the query is distributed across executor nodes:
```sql
EXPLAIN SELECT count(id) FROM my_dataset;
```
Current Limitations:
- Accelerated datasets are currently not supported. This feature is designed for querying partitioned data lake formats (Parquet, Delta Lake, Iceberg, etc.)
- The feature is in preview and may have stability or performance limitations
- Specific acceleration support is planned for future releases
DataFusion v50 Upgrade​
Spice.ai is built on the Apache DataFusion query engine. The v50 release brings significant performance improvements and enhanced reliability:
Performance Improvements 🚀:
- Dynamic Filter Pushdown: Enhanced dynamic filter pushdown for custom ExecutionPlans, ensuring filters propagate correctly through all physical operators for improved query performance.
- Partition Pruning: Expanded partition pruning support ensures that unnecessary partitions are skipped when filters are not used, reducing data scanning overhead and improving query execution times.
- Apache Spark Compatible Functions: Added support for Spark-compatible functions including array, bit_get/bit_count, bitmap_count, crc32/sha1, date_add/date_sub, if, last_day, like/ilike, luhn_check, mod/pmod, next_day, parse_url, rint, and width_bucket.
Bug Fixes & Reliability: Resolved issues with partition name validation and empty execution plans when vector index lists are empty. Fixed timestamp support for partition expressions, enabling better partitioning for time-series data.
See the Apache DataFusion 50.0.0 Release for more details.
DuckDB v1.4.1 Upgrade and Accelerator Improvements​
DuckDB v1.4.1: DuckDB has been upgraded to v1.4.1, which includes several performance optimizations.
Composite ART Index Support: DuckDB in Spice now supports composite (multi-column) Adaptive Radix Tree (ART) indexes for accelerated table scans. When queries filter on multiple columns fully covered by a composite index, the optimizer automatically uses index scans instead of full table scans, delivering significant performance improvements for selective queries.
Example configuration:
```yaml
datasets:
  - from: file://data.parquet
    name: sales
    acceleration:
      enabled: true
      engine: duckdb
      indexes:
        '(region, product_id)': enabled
```
Performance example with composite index on 7.5M rows:
```sql
SELECT * FROM sales WHERE region = 'US' AND product_id = 12345;
-- Without index: 0.282s
-- With composite index (region, product_id): 0.037s
-- Performance improvement: 7.6x faster with composite index
```
DuckDB Intermediate Materialization: Queries with indexes now use intermediate materialization (WITH ... AS MATERIALIZED) to leverage faster index scans. Currently supported for non-federated queries (query_federation: disabled) against a single table with indexes only. When predicates cover more columns than the index, the optimizer rewrites queries to first materialize index-filtered results, then apply remaining predicates. This optimization can deliver significant performance improvements for selective queries.
Example configuration:
```yaml
datasets:
  - from: file://sales_data.parquet
    name: sales
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        query_federation: disabled # Currently required for intermediate materialization
      indexes:
        '(region, product_id)': enabled
```
Performance example:
```sql
-- Query with indexed columns (region, product_id) plus additional filter (amount)
SELECT * FROM sales
WHERE region = 'US' AND product_id = 12345 AND amount > 1000;
-- Optimized execution time: 0.031s (with intermediate materialization)
-- Standard execution time: 0.108s (without optimization)
-- Performance improvement: ~3.5x faster
```
The optimizer automatically rewrites the query to:
```sql
WITH _intermediate_materialize AS MATERIALIZED (
  SELECT * FROM sales WHERE region = 'US' AND product_id = 12345
)
SELECT * FROM _intermediate_materialize WHERE amount > 1000;
```
Parquet Buffering for Partitioned Writes: DuckDB partitioned writes in table mode now support Parquet buffering, reducing memory usage and improving write performance for large datasets.
Retention SQL on Refresh Commit: DuckDB accelerations now support running retention SQL on refresh commit, enabling automatic data cleanup and lifecycle management during refresh operations.
UTC Timezone for DuckDB: DuckDB now uses UTC as the default timezone, ensuring consistent behavior for time-based queries across different environments.
Example Spicepod.yml configuration:
```yaml
datasets:
  - from: s3://my_bucket/large_table/
    name: partitioned_data
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      retention:
        sql: DELETE FROM partitioned_data WHERE event_time < NOW() - INTERVAL '7 days'
```
HTTP Data Connector​
- Querying endpoints as tables: The HTTP/HTTPS Data Connectors now support querying HTTP endpoints directly as tables in SQL queries with dynamic filters. This feature transforms REST APIs into queryable data sources, making it easy to integrate external service data.
- Query HTTP endpoints that return structured data (JSON, CSV, etc.) as if they were database tables.
- Configurable retry logic, timeouts, and POST request support for more complex API interactions.
Example Spicepod.yml configuration:
```yaml
datasets:
  - from: https://api.tvmaze.com
    name: tvmaze
    params:
      file_format: json
      max_retries: 3
      client_timeout: 10s
```
Example SQL query:
```sql
SELECT request_path, request_query, content
FROM tvmaze
WHERE request_path = '/search/people' AND request_query = 'q=michael'
LIMIT 10;
```
If a request_body is supplied, it is sent to the endpoint as a POST request:
Example SQL query:
```sql
SELECT request_path, request_query, content
FROM tvmaze
WHERE request_path = '/search/people' AND request_query = 'q=michael' AND request_body = '{"name": "michael"}'
LIMIT 10;
```
HTTP endpoints can be accelerated using refresh_sql:
```yaml
datasets:
  - from: https://api.tvmaze.com
    name: tvmaze
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_sql: |
        SELECT request_path, request_query, content
        FROM tvmaze
        WHERE request_path = '/search/people'
          AND request_query IN ('q=michael', 'q=luke')
```
DynamoDB Data Connector Improvements​
Improved Query Performance: The DynamoDB Data Connector now includes improved filter handling for edge cases, parallel scan support for faster data ingestion, and better error handling for misconfigured queries. These improvements enable more reliable and performant access to DynamoDB data.
Example Spicepod.yml configuration:
```yaml
datasets:
  - from: dynamodb:my_table
    name: ddb_data
    params:
      scan_segments: 10 # Defaults to `auto`, which calculates optimal segments based on row count
```
S3 Versioning Support​
Atomic Range Reads for Versioned Files: Spice now supports S3 Versioning for all connectors using object-store (S3, Delta Lake, etc.), ensuring range reads over versioned files are atomically correct. When S3 versioning is enabled, Spice automatically tracks version IDs during file discovery and uses them for all subsequent range reads, preventing inconsistencies from concurrent file modifications.
Version tracking is automatic when S3 versioning is enabled on the bucket. Current limitations:
- Multi-file connections (e.g., partitioned datasets) do not yet support version tracking across all files
Search & Embeddings Enhancements​
Full-Text Search on Views: Full-text search indexes are now supported on views, enabling advanced search scenarios over pre-aggregated or transformed data. This extends the power of Spice's search capabilities beyond base datasets.
Multi-Column Embeddings on Views: Views now support embedding columns, enabling vector search and semantic retrieval on view data. This is useful for search over aggregated or joined datasets.
Vector Engines on Views: Vector search engines are now available for views, enabling similarity search over complex queries and transformations.
Example Spicepod.yml configuration:
```yaml
views:
  - name: aggregated_reviews
    sql: SELECT review_id, review_text FROM reviews WHERE rating > 4
    embeddings:
      - column: review_text
        model: openai:text-embedding-3-small
```
Dedicated Query Thread Pool (Now Enabled by Default)​
Dedicated Query Thread Pool: Query execution and accelerated refreshes now run on their own dedicated thread pool, separate from the HTTP server. This prevents heavy query workloads from slowing down API responses, keeping health checks fast and avoiding unnecessary Kubernetes pod restarts under load.
This feature was opt-in in previous releases and is now enabled by default in v1.9.0-rc.2. To disable it and revert to the previous behavior, add the following spicepod.yaml configuration:
```yaml
runtime:
  params:
    dedicated_thread_pool: none
```
Query Performance Optimizations​
Stale-While-Revalidate Cache Control: Query results now support "stale-while-revalidate" cache control, allowing stale cached data to be served immediately while asynchronously refreshing the cache entry in the background. This improves response times for frequently-accessed queries while maintaining data freshness. Requires cache key type to be set to "sql (raw)" for proper operation.
Optimized Prepared Statements: Prepared statement handling has been optimized for better performance with parameterized queries, reducing planning overhead and improving execution time for repeated queries.
Large RecordBatch Chunking: Large Arrow RecordBatch objects are now automatically chunked to control memory usage during query execution, preventing memory exhaustion for queries returning large result sets.
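The chunking idea can be sketched in plain Python. The chunk size and helper below are illustrative; Spice's actual implementation operates on Arrow RecordBatches internally:

```python
def chunk_offsets(total_rows: int, max_rows_per_chunk: int):
    """Yield (offset, length) slices that cover total_rows without
    ever materializing more than max_rows_per_chunk rows at once."""
    offset = 0
    while offset < total_rows:
        length = min(max_rows_per_chunk, total_rows - offset)
        yield (offset, length)
        offset += length

# A 10M-row batch split into at-most-4M-row slices:
print(list(chunk_offsets(10_000_000, 4_000_000)))
# -> [(0, 4000000), (4000000, 4000000), (8000000, 2000000)]
```

Each slice can then be processed and released before the next is materialized, which bounds peak memory for large result sets.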
Query Result Cache: Stale-While-Revalidate​
HTTP Cache-Control Support: The query result cache now supports the stale-while-revalidate Cache-Control directive, enabling faster response times by serving stale cached results immediately while asynchronously refreshing the cache in the background. This feature is particularly useful for applications that can tolerate slightly stale data in exchange for improved performance.
How it works:
When a cache entry is stale but within the stale-while-revalidate window, Spice will:
- Immediately return the stale cached result to the client
- Asynchronously re-execute the query in the background to refresh the cache
- Future requests will use the refreshed data
Configuration:
Use the Cache-Control HTTP header with the stale-while-revalidate directive:
```
Cache-Control: max-age=300, stale-while-revalidate=60
```
This configuration caches results for 5 minutes (300 seconds), and allows serving stale results for an additional 60 seconds while refreshing in the background.
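The staleness decision can be sketched as follows; this is a simplified model of the directive's semantics, not Spice's internal code:

```python
def cache_decision(age_s: float, max_age_s: float, swr_s: float) -> str:
    """Classify a cached entry per 'Cache-Control: max-age=<max_age>,
    stale-while-revalidate=<swr>' semantics."""
    if age_s <= max_age_s:
        return "fresh"        # serve from cache, no refresh needed
    if age_s <= max_age_s + swr_s:
        return "stale-serve"  # serve stale now, refresh in background
    return "expired"          # re-execute the query synchronously

# max-age=300, stale-while-revalidate=60:
print(cache_decision(120, 300, 60))  # -> fresh
print(cache_decision(330, 300, 60))  # -> stale-serve
print(cache_decision(400, 300, 60))  # -> expired
```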
Requirements:
- Must use plan or raw SQL cache keys (set `cache_key_type` to `sql` or `plan` in the results_caching configuration)
- Background revalidation re-executes queries through the normal query path
- Timestamp tracking automatically determines cache entry age for staleness checks
Example configuration via HTTP header:
```
POST /v1/sql
Cache-Control: max-age=600, stale-while-revalidate=120
X-Cache-Key-Type: sql
```
This feature improves application responsiveness while ensuring data freshness through background updates.
Security & Reliability Improvements​
Enhanced HTTP Client Security: HTTP client usage across the runtime has been hardened with improved TLS validation, certificate pinning for critical endpoints, and better error handling for network failures.
ODBC Connector Improvements: Removed unwrap calls from the ODBC connector, improving error handling and reliability. Fixed secret handling and Kubernetes secret integration.
CLI Permissions Hardening: Tightened file permissions for the CLI and install script, ensuring secure defaults for configuration files and credentials.
Oracle Instant Client Pinning: Oracle Instant Client downloads are now pinned to specific SHAs, ensuring reproducible builds and preventing supply chain attacks.
AWS Authentication Improvements​
Improved Credential Retry Logic: AWS SDK credential initialization has been significantly improved with more robust retry logic and better error handling. The system now automatically retries transient credential resolution failures using Fibonacci backoff, allowing Spice to tolerate extended AWS outages (up to ~48 hours) without manual intervention.
Key features:
- Automatic retry with backoff: Implements Fibonacci backoff for transient credential failures (network issues, temporary AWS service disruptions)
- Configurable retry limits: Supports up to 300 retry attempts with a maximum retry interval of 600 seconds
- Better error handling: Distinguishes between retryable errors (connector errors) and non-retryable errors (misconfiguration)
- Unauthenticated access support: Properly supports unauthenticated access to public S3 buckets without requiring credentials
- Improved error messages: Provides detailed logging with attempt numbers, retry intervals, and error context for better troubleshooting
The improvements ensure more reliable AWS service integration, particularly in environments with intermittent network connectivity or during AWS service degradations.
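The retry schedule can be sketched as follows. The 300-attempt limit and 600-second cap come from the release notes; the function itself is an illustrative model, not the SDK's code:

```python
def fibonacci_backoff(max_attempts: int = 300, cap_s: int = 600):
    """Yield wait times (seconds) following a Fibonacci sequence,
    capped at cap_s, for up to max_attempts retries."""
    a, b = 1, 1
    for _ in range(max_attempts):
        yield min(a, cap_s)
        a, b = b, a + b

waits = list(fibonacci_backoff())
print(waits[:8])   # -> [1, 1, 2, 3, 5, 8, 13, 21]
print(max(waits))  # -> 600 (interval never exceeds the cap)
```

Summed over 300 attempts, a capped schedule like this spans roughly two days of retries, which is how an outage window on the order of ~48 hours can be tolerated without manual intervention.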
Observability & Tracing​
DataFusion Log Emission: The Spice runtime now emits DataFusion internal logs, providing deeper visibility into query planning and execution for debugging and performance analysis.
AI Completions Tracing: Fixed tracing so that ai_completions operations are correctly parented under sql_query traces, improving observability for AI-powered queries.
Git Data Connector (Alpha)​
Version-Controlled Data Access: The new Git Data Connector (Alpha) enables querying datasets stored in Git repositories. This connector is ideal for use cases involving configuration files, documentation, or any data tracked in version control.
Example Spicepod.yml configuration:
datasets:
- from: git:https://github.com/myorg/myrepo
name: git_metrics
params:
file_format: csv
For more details, refer to the Git Data Connector Documentation.
Spice Java SDK 0.4.0​
The Spice Java SDK has been upgraded to v0.4.0 with support for a configurable Arrow memory limit: spice-java v0.4.0
```java
SpiceClient client = SpiceClient.builder()
    .withArrowMemoryLimitMB(1024) // 1GB limit
    .build();
```
CLI Improvements​
Install Specific Versions: The spice install command now supports installing specific versions of the Spice runtime and CLI. This enables easy version management, downgrading, or installation of specific releases for testing or compatibility requirements.
Usage:
```bash
# Install a specific version
spice install v1.8.3

# Install a specific version with AI flavor
spice install v1.8.3 ai

# Install latest version (existing behavior)
spice install
spice install ai
```
Note: Homebrew installations require manual version management via brew install spiceai/spiceai/spice@<version>.
Persistent Query History: The Spice CLI REPL (SQL, search, and chat interfaces) now persists command history to ~/.spice/query_history.txt, making your query history available across sessions. The history file is automatically created if it doesn't exist, with graceful fallback if the home directory cannot be determined.
New REPL Commands:
- `.clear` - Clear the screen using ANSI escape codes for a clean workspace
- `.clear history` - Clear and persist the query history, removing all stored commands
Tab Completion: Tab completion now includes suggestions based on your command history, making it faster to re-run or modify previous queries.
Example usage:
sql> SELECT * FROM my_table;
sql> .clear # Clears the screen
sql> .clear history # Clears command history
sql> # Use arrow keys or tab to access previous commands
Additional Improvements & Bug Fixes​
- Reliability: Fixed refresh worker panics with recovery handling to prevent runtime crashes during acceleration refreshes.
- Reliability: Improved error messages for missing or invalid `spicepod.yaml` files, providing actionable feedback for misconfiguration.
- Reliability: Fixed DuckDB metadata pointer loading issues for snapshots.
- Performance: Ensured `ListingTable` partitions are pruned correctly when filters are not used.
- Reliability: Fixed vector dimension determination for partitioned indexes.
- Search: Fixed casing issues in Reciprocal Rank Fusion (RRF) for hybrid search queries.
- Search: Fixed search field handling as metadata for chunked search indexes.
- Validation: Added timestamp support for partition expressions.
- Validation: Fixed the `regexp_match` function for DuckDB datasets.
- Validation: Fixed partition name validation for improved reliability.
Contributors​
Breaking Changes​
No breaking changes.
Cookbook Updates​
New HTTP Data Connector Recipe: A new recipe demonstrating how to query REST APIs and HTTP(S) endpoints. See HTTP Connector Recipe for details.
The Spice Cookbook includes 82 recipes to help you get started with Spice quickly and easily.
Upgrading​
To upgrade to v1.9.0-rc.2, use one of the following methods:
CLI:
```bash
spice upgrade
```
Homebrew:
```bash
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.9.0-rc.2 image:
```bash
docker pull spiceai/spiceai:1.9.0-rc.2
```
For available tags, see DockerHub.
Helm:
```bash
helm repo update
helm upgrade spiceai spiceai/spiceai
```
AWS Marketplace:
🎉 Spice is now available in the AWS Marketplace!
What's Changed​
Dependencies​
- DataFusion: Upgraded to v50
- Apache Arrow: Upgraded to v56
- DuckDB: Upgraded to v1.4.1
- Delta Kernel: Upgraded to v0.16.0
Changelog​
- Fix for search field as metadata for chunked search indexes by @Jeadie in #7429
- Bump object_store from 0.12.3 to 0.12.4 by @app/dependabot in #7433
- Properly respect disabling snapshots by @phillipleblanc in #7431
- Revert "Properly respect disabling snapshots" by @sgrebnov in #7439
- Revert "Disable snapshots by default" by @sgrebnov in #7438
- Add preview warning for write access mode by @sgrebnov in #7440
- fix: regexp_match for DuckDB datasets by @kczimm in #7443
- Add feature is currently in preview warning for snapshots by @sgrebnov in #7442
- [Logger] Also emit Datafusion logs by @mach-kernel in #7441
- add missing snapshot by @kczimm in #7446
- Fix tracing so that ai_completions are parented under sql_query by @lukekim in #7415
- Enable snapshot acceleration by default by @phillipleblanc in #7451
- Disable acceleration refresh metrics by @krinart in #7450
- Add v1.8 release notes by @phillipleblanc in #7430
- fix: partition name validation by @kczimm in #7452
- Fix lint error due to ignore without reasons by @krinart in #7454
- Add models and CUDA support to spiced install script by @lukekim in #7457
- Post-release 1.8 updates by @phillipleblanc in #7455
- Remove println in datafusion by @phillipleblanc in #7461
- Update end_game.md to notify once release is done by @sgrebnov in #7460
- Remove italics from snapshot logging by @phillipleblanc in #7463
- Update openapi.json by @app/github-actions in #7466
- Fix generate spicepod schema by @phillipleblanc in #7464
- Fix generate acknowledements by @phillipleblanc in #7465
- Update spicepod.schema.json by @app/github-actions in #7469
- fix: Ensure ListingTable partitions are pruned when filters are not used by @peasee in #7471
- Create `runtime-secrets` crate by @phillipleblanc in #7474
- Create `runtime-parameters` crate by @phillipleblanc in #7475
- Don't download the snapshot if the acceleration is present by @phillipleblanc in #7477
- Fix casing for keywords and additional columns by @Jeadie in #7770
- Bump actions/upload-artifact from 4 to 5 by @app/dependabot in #7750
- Bump criterion from 0.5.1 to 0.7.0 by @app/dependabot in #7740
- Bump rustls-native-certs from 0.8.1 to 0.8.2 by @app/dependabot in #7744
- Git Data Connector (Alpha) by @lukekim in #7772
- Pepper accelerator delete support by @lukekim in #7616
- Update Helm chart instructions for Helm in end_game.md by @sgrebnov in #7776
- Turso data accelerator by @lukekim in #7472
- Apply retention SQL filter to refresh fetch by @phillipleblanc in #7778
- Add Parquet buffering option for DuckDB partitioned writes (tables mode) by @sgrebnov in #7735
- fix: EmptyExec when list indexes is empty by @kczimm in #7784
- 1.8.3 post-release housekeeping by @mach-kernel in #7783
- feat: Upgrade to Datafusion v50 by @peasee in #7777
- fix: Replace vortex datafusion with public crate by @peasee in #7791
- Full-text search on views by @Jeadie in #7733
- Revert "Apply retention SQL filter to refresh fetch (#7778)" by @phillipleblanc in #7796
- fix: Add ingest duration and acceleration size metrics to testoperator by @peasee in #7792
- Set local timezone to UTC for DuckDB by @phillipleblanc in #7797
- add Timestamp support for partition expressions by @kczimm in #7803
- Fix trunk lint by @krinart in #7804
- Add missing mongodb params by @krinart in #7807
- Embedding columns on view components by @Jeadie in #7795
- Add Turso as a Pepper Catalog metastore by @lukekim in #7793
- Run retention_sql on refresh commit for DuckDB by @lukekim in #7785
- docs: Update datafusion upgrade checklist by @peasee in #7812
- Vector engines on views by @Jeadie in #7808
- Handle refresh worker panics and add recovery test by @phillipleblanc in #7815
- chunk large record batches to control memory usage by @kczimm in #7802
- fix: cannot determine vector dimension for partitioned indexes by @kczimm in #7818
- Upgrade to Turso v0.3 by @lukekim in #7821
- fix: Ensure custom *Exec ExecutionPlans push down dynamic filters by @peasee in #7811
- handle casing in RRF by @Jeadie in #7825
- Enable 'turso' for pepper acceleration by default by @sgrebnov in #7826
- Improved DynamoDB Data Connector by @krinart in #7715
- Initial support for llama.cpp as LLM inference backend by @lukekim in #7794
- Pepper: Implement retention SQL on refresh commit by @phillipleblanc in #7814
- Fix Dockerfiles for arm64 by @lukekim in #7834
- [DynamoDB] Handle filter edge-cases by @krinart in #7830
- [DynamoDB] Support parallelization for `Scan` request by @krinart in #7829
- Don't feature gate Pepper by @lukekim in #7832
- Fix llama.cpp static link by @lukekim in #7835
- fix: docker nightly builds by @kczimm in #7837
- Use GitHub-hosted macOS runner only for tag releases by @lukekim in #7836
- Fix Bug: DuckDB INTERNAL Error: Failed to load metadata pointer by @sgrebnov in #7839
- Fix docker arm64 build to use aegis in pure-rust mode by @lukekim in #7840
- Revert "Use GitHub-hosted macOS runner only for tag releases" by @lukekim in #7843
- Rename Pepper to Cayenne by @lukekim in #7844
- Tighten CLI permissions and install script by @lukekim in #7845
- Set mvcc for Cayenne Turso metastore by @lukekim in #7850
- Optimize Prepared Statements by @lukekim in #7859
- Remove unwrap from ODBC connector, fix secrets, and kuberenetes secre… by @lukekim in #7846
- Improve and secure HTTP client usage by @lukekim in #7847
- Pin Oracle Instant Client download to a SHA by @lukekim in #7851
- Improve experience for missing or invalid Spicepod.yaml by @lukekim in #7849
- chore: Fix PR linting by @peasee in #7865
- Revert FlightIPC issues by @Jeadie in #7870
- Improve error message by adding 'cayenne' to the list of valid accelerator engines by @sgrebnov in #7882
- fix: allow parameter index without dollar signs by @kczimm in #7887
- Temporarily disable `supports_limit_pushdown` for `SchemaCastScanExec` by @sgrebnov in #7893
- Remove '.embeddings[].metadata' by @Jeadie in #7897
- Optimize macOS and Windows builds by @lukekim in #7863
- fix: Kafka message delivery failed by @kczimm in #7883
- docs: Update component criteria by @peasee in #7891
- fix: Make integration run with no relevant changes, disable makefile targets on PR by @peasee in #7896
- Add Cayenne benchmark and concurrency tests and remove indexes for Turso MVCC by @lukekim in #7879
- Revert llama.cpp engine by @lukekim in #7898
- Make Cayenne snapshotting more robust by @sgrebnov in #7899
- Release notes v1.9.0-rc1 by @Jeadie in #7902
- Fix `dataset_acceleration_last_refresh_time_ms` unit to milliseconds in description by @ewgenius in #7901
- Fix lint error in record_explain_plan functionality by @sgrebnov in #7906
- Cleanup old snapshots after full refresh by @lukekim in #7908
- Cayenne deletion vector caching support by @lukekim in #7903
- Split filters into partition filters (for pruning) and data filters by @lukekim in #7889
- fix: Update benchmark snapshots by @app/github-actions in #7911
- fix: Update benchmark snapshots by @app/github-actions in #7912
- fix: Update benchmark snapshots by @app/github-actions in #7913
- Update spicepod.schema.json by @app/github-actions in #7916
- fix: Update benchmark snapshots by @app/github-actions in #7917
- Add Cayenne & Turso accelerators to E2E CI test matrix by @lukekim in #7922
- Make preview warnings consistent by @lukekim in #7921
- Filter and write optimizations by @lukekim in #7918
- fix: Set sccache region explicitly by @peasee in #7928
- fix: Enable integration test merge group checks by @peasee in #7927
- Update testoperator release branch from 1.8 to 1.9 by @peasee in #7926
- Update DuckDB to 1.4.1 with composite ART scans by @mach-kernel in #7884
- Don't build Windows on trunk pushes by @lukekim in #7931
- fix: Use correct minio secret in build binary push by @peasee in #7934
- Update test-framework workers to use dedicated Flight client by @sgrebnov in #7938
- Fix financebench, configure s3vectors for appropriate snapshotting by @Jeadie in #7935
- Don't try to initialize accelerator if it is disabled by @lukekim in #7932
- Add spark UDFs to Spice by @Jeadie in #7936
- Fix extra async_trait in cayenne metadata catalog by @phillipleblanc in #7942
- deps: Upgrade to Rust 1.90 by @peasee in #7941
- Add cayenne accelerator to README.md by @ewgenius in #7905
- fix: Update benchmark snapshots by @app/github-actions in #7948
- Run integration tests with `AWS_EC2_METADATA_DISABLED` flag by @sgrebnov in #7952
- Only retry credentials on ConnectorError by @kczimm in #7944
- fix: Improve join reordering by ensuring `JoinSelection` is applied by @peasee in #7828
- fix: Remove unused deps, consolidate workspace deps by @peasee in #7953
- bump async-openai commit by @kczimm in #7929
- deps: Use vortex fork by @peasee in #7954
- Enable tracing in delta lake integration tests by @sgrebnov in #7951
- Update datasets in S3 vectors test case by @Jeadie in #7956
- Add spiced metrics scraping to test operator by @lukekim in #7937
- Memoize S3 vectors ListIndex API call with configurable TTL by @kczimm in #7910
- Cayenne performance optimizations by @lukekim in #7907
- Setup HotFix issue template by @ewgenius in #7957
- Fix AWS SDK credential cache retry handling by @phillipleblanc in #7943
- Infer RRF `join_key` from `TableProvider::constraints` and implement `SearchQueryProvider::constraints` by @Jeadie in #7959
- [Optimizer]: DuckDB intermediate materialization (non-federated) by @mach-kernel in #7964
- 1.7.3 post-release housekeeping by @ewgenius in #7962
- Fix `digest_many` UDF for `ColumnarValue::Array` by @Jeadie in #7960
- Fix spiced metrics reporting as part of benchmark tests by @sgrebnov in #7967
- Avoid pushing down Spice specific UDFs to accelerators during federation by @Jeadie in #7909
- CLI file persisted history with `.clear` and `.clear history` commands by @lukekim in #7970
- ResultsCache Cache-Control `stale-while-revalidate` by @lukekim in #7963
- Use GetVectors API instead of returnData by @kczimm in #7083
- Make DuckDB intermediate materialization logic more robust by @sgrebnov in #7971
- [Cayenne] Configurable target Vortex file size by @lukekim in #7972
- fix: Update benchmark snapshots by @app/github-actions in #7974
- Bump github.com/klauspost/compress from 1.17.11 to 1.18.1 by @app/dependabot in #7872
- fix: Update benchmark snapshots by @app/github-actions in #7978
- fix: Update benchmark snapshots by @app/github-actions in #7982
- Run Integration tests on spiceai-dev-runners by @sgrebnov in #7985
- [CLI] Fix cursor issue due to flush by @lukekim in #7981
- fix: Support S3 versioning, Vortex dynamic filter pushdown by @peasee in #7984
- Make `cluster` a default feature by @lukekim in #7994
- Optimize DuckDB Intermediate Index Materialization for No-Index Case by @sgrebnov in #7998
- HTTP connector with dynamic filter support by @lukekim in #7969
- Revert federation 'can_execute_plan' by @Jeadie in #7999
- Fix stale caching by @lukekim in #7995
- Fix count(*) for http connector by @krinart in #8001
- [CLI] Install specific version by @lukekim in #8006
- Fix stale with revalidate request/response by @lukekim in #8005
- Fallback `RequestContext` for cluster queries by @Jeadie in #8007
- Use use_rustls_tls for Spice Cloud /connect by @lukekim in #8008
- Use delta-kernel-rs 0.16x + Parquet reader with object meta API changes by @mach-kernel in #8011
- fix: Update datafusion & arrow-rs with S3 versioning fix by @lukekim in #8012

