Amazon S3 Vectors with Spice

· 26 min read
Jack Eadie
Token Plumber at Spice AI

The latest Spice.ai Open Source release (v1.5.0) brings major improvements to search, including native support for Amazon S3 Vectors. Announced in public preview at AWS Summit New York 2025, Amazon S3 Vectors is a new S3 bucket type purpose-built for vector embeddings, with dedicated APIs for similarity search.

Spice AI was a day 1 launch partner for S3 Vectors, integrating it as a scalable vector index backend. In this post, we explore how S3 Vectors integrates into Spice.ai’s data, search, and AI-inference engine, how Spice manages indexing and lifecycle of embeddings for production vector search, and how this unlocks a powerful hybrid search experience. We’ll also put this in context with industry trends and compare Spice’s approach to other vector database solutions like Qdrant, Weaviate, Pinecone, and Turbopuffer.

Amazon S3 Vectors Overview

Amazon S3 Vectors extends S3 object storage with native support for storing and querying vectors at scale. As AWS describes, it is “designed to provide the same elasticity, scale, and durability as Amazon S3,” providing storage of billions of vectors and sub-second similarity queries. Crucially, S3 Vectors dramatically lowers the cost of vector search infrastructure – reducing upload, storage, and query costs by up to 90% compared to traditional solutions. It achieves this by separating storage from compute: vectors reside durably in S3, and queries execute on transient, on-demand resources, avoiding the need for always-on, memory-intensive vector database servers. In practice, S3 Vectors exposes two core operations:

  1. Upsert vectors – assign a vector (an array of floats) to a given key (identifier) and optionally store metadata alongside it.

  2. Vector similarity query – given a new query vector, efficiently find the stored vectors that are closest (e.g. minimal distance) to it, returning their keys (and scores).

This transforms S3 into a massively scalable vector index service. You can store embeddings at petabyte scale and perform similarity search with metrics like cosine or Euclidean distance via a simple API. It’s ideal for AI use cases like semantic search, recommendations, or Retrieval-Augmented Generation (RAG) where large volumes of embeddings need to be queried semantically. By leveraging S3’s pay-for-use storage and ephemeral compute, S3 Vectors can handle infrequent or large-scale queries much more cost-effectively than memory-bound databases, yet still deliver sub-second results.
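To make the two operations concrete, here is a minimal in-memory sketch in Python. It is a toy stand-in, not the S3 Vectors API: the `ToyVectorIndex` class and its `upsert` and `query` methods are hypothetical names chosen for illustration, and the query does an exact scan where a real index would use an approximate algorithm.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

class ToyVectorIndex:
    """Hypothetical in-memory index used only to illustrate the two operations."""

    def __init__(self):
        self._items = {}  # key -> (vector, metadata)

    def upsert(self, key, vector, metadata=None):
        # Insert the vector, or overwrite whatever is stored under `key`.
        self._items[key] = (list(vector), metadata or {})

    def query(self, vector, top_k):
        # Exact scan over every stored vector, closest (smallest distance) first.
        scored = [(key, cosine_distance(vector, stored))
                  for key, (stored, _) in self._items.items()]
        scored.sort(key=lambda pair: pair[1])
        return scored[:top_k]

index = ToyVectorIndex()
index.upsert("review-1", [1.0, 0.0], {"rating": 3})
index.upsert("review-2", [0.0, 1.0], {"rating": 5})
index.upsert("review-3", [0.9, 0.1], {"rating": 2})

nearest = index.query([1.0, 0.0], top_k=2)
print(nearest)  # keys and distances of the two closest vectors
```

The shape of the interface is the point: keys and optional metadata go in with each vector, and a query returns keys with distances, which is what lets a separate engine join the matches back to source records.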

Vector Search with Embeddings

Vector similarity search retrieves data by comparing items in a high-dimensional embedding space rather than by exact keywords. In a typical pipeline:

  • Data to vectors: We first convert each data item (text, image, etc.) into a numeric vector representation (embedding) using an ML model. For example, a customer review text might be turned into a 768-dimensional embedding that encodes its semantic content. Models like Amazon Titan Embeddings, OpenAI, or Hugging Face sentence transformers handle this step.

  • Index storage: These vectors are stored in a specialized index or database optimized for similarity search. This could be a dedicated vector database or, in our case, Amazon S3 Vectors acting as the index. Each vector is stored with an identifier (e.g. the primary key of the source record) and possibly metadata.

  • Query by vector: A search query (e.g. a phrase or image) is also converted into an embedding vector. The vector index is then queried to find the closest stored vectors by distance metric (cosine, Euclidean, dot product, etc.). The result is a set of IDs of the most similar items, often with a similarity score.

This process enables semantic search – results are returned based on meaning and similarity rather than exact text matches. It powers features like finding relevant documents by topic even if exact terms differ, recommendation systems (finding similar user behavior or content), and providing knowledge context to LLMs in RAG. With the Spice.ai Open Source integration, this whole lifecycle (embedding data, indexing vectors, querying) is managed by the Spice runtime and exposed via a familiar SQL or HTTP interface.

Amazon S3 Vectors in Spice.ai

Spice integration with Amazon S3 Vectors

Spice.ai is an open-source data, search and AI compute engine that supports vector search end-to-end. By integrating S3 Vectors as an index, Spice can embed data, store embeddings in S3, and perform similarity queries – all orchestrated through simple configuration and SQL queries. Let’s walk through how you enable and use this in Spice.

Configuring a Dataset with Embeddings

To use vector search, annotate your dataset schema to specify which column(s) to embed and with which model. Spice supports various embedding models (both local or hosted) via the embeddings section in the configuration. For example, suppose we have a customer reviews table and we want to enable semantic search over the review text (body column):

datasets:
  - from: oracle:"CUSTOMER_REVIEWS"
    name: reviews
    columns:
      - name: body
        embeddings:
          from: bedrock_titan # use an embedding model defined below

embeddings:
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: bedrock_titan
    params:
      aws_region: us-east-2
      dimensions: '256'

In this spicepod.yaml, we defined an embedding model bedrock_titan (in this case AWS's Titan text embedding model) and attached it to the body column. When the Spice runtime ingests the dataset, it will automatically generate a vector embedding for each row’s body text using that model. By default, Spice can either store these vectors in its acceleration layer or compute them on the fly. However, with S3 Vectors, we can offload them to an S3 Vectors index for scalable storage.

To use S3 Vectors, we simply enable the vector engine in the dataset config:

datasets:
  - from: oracle:"CUSTOMER_REVIEWS"
    name: reviews
    vectors:
      enabled: true
      engine: s3_vectors
      params:
        s3_vectors_bucket: my-s3-vector-bucket
    #... (rest of dataset definition as above)

This tells Spice to create or use an S3 Vectors index (in the specified S3 bucket) for storing the body embeddings. Spice manages the entire index lifecycle: it creates the vector index, handles inserting each vector with its primary key into S3, and knows how to query it. The embedding model and data source are as before – the only change is where the vectors are stored and queried. The benefit is that now our vectors reside in S3’s highly scalable storage, and we can leverage S3 Vectors’ efficient similarity search API.

Performing a Vector Search Query

Once configured, performing a semantic search is straightforward. Spice exposes both an HTTP endpoint and a SQL table-valued function for vector search. For example, using the HTTP API:

curl -X POST http://localhost:8090/v1/search \
  -H "Content-Type: application/json" \
  -d '{
    "datasets": ["reviews"],
    "text": "issues with same day shipping",
    "additional_columns": ["rating", "customer_id"],
    "where": "created_at >= now() - INTERVAL '\''7 days'\''",
    "limit": 2
  }'

This JSON query says: search the reviews dataset for items similar to the text "issues with same day shipping", and return the top 2 results, including their rating and customer id, filtered to reviews from the last 7 days. The Spice engine will embed the query text (using the same model as the index), perform a similarity lookup in the S3 Vectors index, filter by the WHERE clause, and return the results. A sample response might look like:

{
  "results": [
    {
      "matches": {
        "body": "Everything on the site made it seem like I’d get it the same day. Still waiting the next morning was a letdown."
      },
      "data": { "rating": 3, "customer_id": 6482 },
      "primary_key": { "review_id": 123 },
      "score": 0.82,
      "dataset": "reviews"
    },
    {
      "matches": {
        "body": "It was marked as arriving 'today' when I paid, but the delivery was pushed back without any explanation. Timing was kind of important for me."
      },
      "data": { "rating": 2, "customer_id": 3310 },
      "primary_key": { "review_id": 24 },
      "score": 0.76,
      "dataset": "reviews"
    }
  ],
  "duration_ms": 86
}

Each result includes the matching column snippet (body), the additional requested fields, the primary key, and a relevance score. In this case, both reviews are complaints about “same day” delivery issues, which the vector search found based on semantic similarity to the query. Note that the second result never mentions "same day" delivery, yet describes a similar issue to the first.
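The same request can be built from application code. The sketch below uses only the Python standard library and the fields shown in the curl example; the helper name `build_search_request` is our own, and the request is constructed but not sent, so it runs without a Spice instance. Building the body with `json.dumps` also sidesteps shell quoting around the `INTERVAL '7 days'` literal.

```python
import json
import urllib.request

def build_search_request(base_url, text, datasets, limit,
                         where=None, additional_columns=None):
    """Build (but do not send) a POST request for the /v1/search endpoint."""
    payload = {"datasets": datasets, "text": text, "limit": limit}
    if where is not None:
        payload["where"] = where
    if additional_columns:
        payload["additional_columns"] = additional_columns
    return urllib.request.Request(
        url=f"{base_url}/v1/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_search_request(
    "http://localhost:8090",
    text="issues with same day shipping",
    datasets=["reviews"],
    limit=2,
    where="created_at >= now() - INTERVAL '7 days'",
    additional_columns=["rating", "customer_id"],
)

# To send it against a running Spice instance:
#   with urllib.request.urlopen(request) as response:
#       results = json.load(response)["results"]
```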

Developers can also use SQL for the same operation. Spice provides a table function vector_search(dataset, query) that can be used in the FROM clause of a SQL query. For example, the above search could be expressed as:

SELECT review_id, rating, customer_id, body, score
FROM vector_search(reviews, 'issues with same day shipping')
WHERE created_at >= to_unixtime(now() - INTERVAL '7 days')
ORDER BY score DESC
LIMIT 2;

This would yield a result set (with columns like review_id, score, etc.) similar to the JSON above, which you can join or filter just like any other SQL table. This ability to treat vector search results as a subquery/table and combine them with standard SQL filtering is a powerful feature of Spice.ai’s integration – few other solutions let you natively mix vector similarity and relational queries so seamlessly.

Managing Embeddings Storage in Spice.ai

An important design question for any vector search system is where and how to store the embedding vectors. Before introducing S3 Vectors, Spice offered two approaches for managing vectors:

  1. Accelerator storage: Embed the data in advance and store the vectors alongside other cached data in a Data Accelerator (Spice’s high-performance materialization layer). This keeps vectors readily accessible in memory or fast storage.

  2. Just-in-time computation: Compute the necessary vectors on the fly during a query, rather than storing them persistently. For example, at query time, embed only the subset of rows that satisfy recent filters (e.g. all reviews in the last 7 days) and compare those to the query vector.

Both approaches have trade-offs. Pre-storing in an accelerator provides fast query responses but may not be feasible for very large datasets (which might not fit entirely, or affordably, in fast storage), and accelerators like DuckDB or SQLite aren’t optimized for similarity search over billion-scale vectors. Just-in-time embedding avoids extra storage but becomes prohibitively slow when computing embeddings over large data scans (and for each query), and it offers no efficient algorithm for finding similar neighbours.

Amazon S3 Vectors offers a compelling third option: the scalability of S3 with the efficient retrieval of vector index data structures. By configuring the dataset with engine: s3_vectors as shown earlier, Spice will offload the vector storage and similarity computations to S3 Vectors. This means you can handle very large embedding sets (millions or billions of items) without worrying about Spice’s memory or local disk limits, and still get fast similarity operations via S3’s API. In practice, when Spice ingests data, it will embed each row’s body and PUT it into the S3 Vector index (with the review_id as the key, and possibly some metadata). At query time, Spice calls S3 Vectors’ query API to retrieve the nearest neighbors for the embedded query. All of this is abstracted away; you simply query Spice and it orchestrates these steps.

The Spice runtime manages index creation, updates, and deletion. For instance, if new data comes in or old data is removed, Spice will synchronize those changes to the S3 vector index. Developers don’t need to directly interact with S3 – it’s configured once in YAML. This tight integration accelerates application development: your app can treat Spice like any other database, while behind the scenes Spice leverages S3’s elasticity for the heavy lifting.

Vector Index Usage in Query Execution

How does a vector index actually get used in Spice’s SQL query planner? To illustrate, consider the simplified SQL we used:

SELECT *
FROM vector_search(reviews, 'issues with same day shipping')
ORDER BY score DESC
LIMIT 5;

Logically, without a vector index, Spice would have to do the following at query time:

  1. Embed the query text 'issues with same day shipping' into a vector v.

  2. Retrieve or compute all candidate vectors for the searchable column (here every body embedding in the dataset). This could mean scanning every row, or at least every row matching the other filter predicates.

  3. Calculate the distance between the query vector v and each candidate vector, and compute a similarity score (e.g. score = 1 - distance).

  4. Sort all candidates by the score and take the top 5.

For large datasets, steps 2–4 would be extremely expensive (a brute-force scan through potentially millions of vectors for each search, then a full sort operation). A vector index avoids unnecessary recomputation of embeddings, reduces the number of distance calculations required, and returns candidate neighbors already in order.

With S3 Vectors, steps 2 and 3 are pushed down to the S3 service. The vector index can directly return the top K closest matches to v. Conceptually, S3 Vectors gives back an ordered list of primary keys with their similarity scores. For example, it might return something like: {(review_id=123, score=0.82), (review_id=24, score=0.76), ...} up to K results.

Spice then uses these results, logically as a temporary table (let’s call it vector_query_results), joined with the main reviews table to get the full records. In SQL pseudocode, Spice does something akin to:

-- The vector index returns the closest matches for a given query.
CREATE TEMP TABLE vector_query_results (
review_id BIGINT,
score FLOAT
);

-- Imagine this temp table is populated by an efficient vector retrieval operation in S3 Vectors for the query.

-- Now we join to retrieve full details
SELECT r.review_id, r.rating, r.customer_id, r.body, v.score
FROM vector_query_results v
JOIN reviews r ON r.review_id = v.review_id
ORDER BY v.score DESC
LIMIT 5;

This way, only the top few results (say 50 or 100 candidates) are processed in the database, rather than the entire dataset. The heavy work of narrowing down candidates occurs inside the vector index. Spice essentially treats vector_search(dataset, query) as a table-valued function that produces (id, score) pairs which are then joinable.

Handling Filters Efficiently

One consideration when using an external vector index is how to handle additional filter conditions (the WHERE clause). In our example, we had a filter created_at >= now() - 7 days. If we simply retrieve the top K results from the vector search and then apply the time filter, we might run into an issue: those top K might not include any recent items, even if there are relevant recent items slightly further down the similarity ranking. This is because S3 Vectors (like most ANN indexes) will return the top K most similar vectors globally, unaware of our date constraint.

If only a small fraction of the data meets the filter, a naive approach could drop most of the top results, leaving fewer than the desired number of final results. For example, imagine the vector index returns 100 nearest reviews overall, but only 5% of all reviews are from the last week – we’d expect only ~5 of those 100 to be recent, possibly fewer than the LIMIT. The query could end up with too few results not because they don’t exist, but because the index wasn’t filter-aware and we truncated the candidate list.

To solve this, S3 Vectors supports metadata filtering at query time. We can store certain fields as metadata with each vector and have the similarity search constrained to vectors where the metadata meets criteria. Spice.ai leverages this by allowing you to mark some dataset columns as “vector filterable”. In our YAML, we could do:

columns:
  - name: created_at
    metadata:
      vectors: filterable

By doing this, Spice's query planner will include the created_at value with each vector it upserts to S3, and it will push down the time filter into the S3 Vectors query. Under the hood, the S3 vector query will then return only nearest neighbors that also satisfy created_at >= now()-7d. This greatly improves both efficiency and result relevance. The query execution would conceptually become:

-- Vector query with filter returns a temp table including the metadata
CREATE TEMP TABLE vector_query_results (
review_id BIGINT,
score FLOAT,
created_at TIMESTAMP
);
-- vector_query_results is already filtered to last 7 days

SELECT r.review_id, r.rating, r.customer_id, r.body, v.score
FROM vector_query_results v
JOIN reviews r ON r.review_id = v.review_id
-- (no need for additional created_at filter here, it’s pre-filtered)
ORDER BY v.score DESC
LIMIT 5;

Now the index itself ensures that every returned review is from the last week, so if at least five matching reviews exist in that window, the query returns a full result set (i.e. it respects LIMIT 5).
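The difference between filtering after the top-K lookup and filtering inside the index is easy to see on synthetic data. In this Python sketch, the rows, scores, and dates are made up for illustration (5% of reviews are recent, as in the example above), and `post_filter` and `pushed_down_filter` are hypothetical helpers, not Spice or S3 Vectors functions.

```python
from datetime import datetime, timedelta

NOW = datetime(2025, 7, 28)
CUTOFF = NOW - timedelta(days=7)

# (review_id, similarity score, created_at); the score falls as review_id grows.
# Only every 20th review (5%) is from the last week.
rows = [
    (i, 1.0 - i / 1000, NOW - timedelta(days=1 if i % 20 == 0 else 30))
    for i in range(1, 1001)
]

def top_k(candidates, k):
    # Highest similarity score first.
    return sorted(candidates, key=lambda r: r[1], reverse=True)[:k]

def post_filter(rows, k, limit):
    # Take the global top K first, then drop rows that fail the date filter.
    recent = [r for r in top_k(rows, k) if r[2] >= CUTOFF]
    return recent[:limit]

def pushed_down_filter(rows, k, limit):
    # Apply the date filter inside the "index", then take the top K.
    recent = [r for r in rows if r[2] >= CUTOFF]
    return top_k(recent, k)[:limit]

naive = post_filter(rows, k=50, limit=5)
filtered = pushed_down_filter(rows, k=50, limit=5)

print([r[0] for r in naive])     # [20, 40]
print([r[0] for r in filtered])  # [20, 40, 60, 80, 100]
```

With K = 50, the post-filtered query returns only two rows even though fifty recent reviews exist, while the pushed-down version fills the LIMIT.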

Including Data to Avoid Joins

Another optimization Spice supports is storing additional, non-filterable columns in the vector index to entirely avoid the expensive table join back to the main table for certain queries. For example, we might mark rating, customer_id, or even the text body as non-filterable vector metadata. This means these fields are stored with the vector in S3, but not used for filtering (just for retrieval). In the Spice config, it would look like:

columns:
  - name: rating
    metadata:
      vectors: non-filterable
  - name: customer_id
    metadata:
      vectors: non-filterable
  - name: body
    metadata:
      vectors: non-filterable

With this setup, when Spice queries S3 Vectors, the vector index will return not only each match’s review_id and score, but also the stored rating, customer_id, and body values. Thus, the temporary vector_query_results table already has all the information needed to satisfy the query. We don’t even need to join against the reviews table unless we want some column that wasn’t stored. The query can be answered entirely from the index data:

SELECT review_id, rating, customer_id, body, score
FROM vector_query_results
ORDER BY score DESC
LIMIT 5;

This is particularly useful for read-heavy query workloads where hitting the main database adds latency. By storing the most commonly needed fields along with the vector, Spice’s vector search behaves like an index-only query (similar to covering indexes in relational databases). You trade a bit of extra storage in S3 (duplicating some fields, but still managed by Spice) for faster queries that bypass the heavier join.

This extends to WHERE conditions on non-filterable columns, or filter predicates unsupported by S3 Vectors. Spice's execution engine can apply these filters itself, still avoiding any expensive JOIN on the underlying table.

SELECT review_id, rating, customer_id, body, score
FROM vector_query_results
WHERE rating > 3 -- Filter applied in Spice, using non-filterable data from the vector index
ORDER BY score DESC
LIMIT 5;

It’s worth noting that you should choose carefully which fields to mark as metadata – too many or very large fields could increase index storage and query payload sizes. Spice gives you the flexibility to include just what you need for filtering and projection to optimize each use case.

Beyond Basic Vector Search in Spice

Many real-world search applications go beyond a single-vector similarity lookup. Spice.ai’s strength is that it’s a full database engine. You can compose more complex search workflows, including hybrid search (combining keyword/text search with vector search), multi-vector queries, re-ranking strategies, and more. Spice provides both an out-of-the-box hybrid search API and the ability to write custom SQL to implement advanced retrieval logic.

  • Multiple vector fields or multi-modal search: You might have vectors for different aspects of your data that you want to search across and combine. For example, an e-commerce product could have embeddings for both its description and its image, or a document could have a title and a body that should be searchable individually and together. Spice lets you run vector search on multiple columns and weight the importance of each. For instance, you might boost matches in the title higher than matches in the body.

  • Vector and full-text search: Similar to vector search, columns can have text indexes defined that enable full-text BM25 search. Text search can then be performed in SQL with the text_search(dataset, query) UDTF, and its results can be combined with vector_search results.

  • Hybrid vector + keyword search: Sometimes you want to ensure certain keywords are present while also using semantic similarity. Spice supports hybrid search natively: its default /v1/search HTTP API performs both full-text BM25 search and vector search, then merges the results using Reciprocal Rank Fusion (RRF). This gives a balanced result set that accounts for direct keyword matches as well as semantic similarity. The example below demonstrates how RRF can be implemented in SQL by combining ranks.

  • Two-phase retrieval (re-ranking): A common pattern is to use a fast first-pass retrieval (e.g. a keyword search) to get a larger candidate set, then apply a more expensive or precise ranking (e.g. vector search) to that subset to improve the ordering of the final results. With Spice, you can orchestrate this in SQL or in application code. For example, you could query a BM25 index for 100 candidates, then perform a vector search restricted to those candidate IDs as the second phase. Since Spice supports standard SQL constructs, you can express these multi-step plans with common table expressions (CTEs) and joins.
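As a rough illustration of the two-phase pattern in application code, the Python sketch below runs a cheap keyword pass and then re-ranks only the surviving candidates by vector similarity. Everything here is a simplification we are assuming for the example: `keyword_score` is a crude term-overlap count rather than BM25, the two-dimensional embeddings are invented, and `two_phase_search` is a hypothetical helper, not a Spice function.

```python
import math

def keyword_score(query, text):
    # Crude stand-in for BM25: number of distinct query terms found in the text.
    terms = set(query.lower().split())
    return sum(1 for word in set(text.lower().split()) if word in terms)

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def two_phase_search(query, query_vector, docs, first_pass_size, limit):
    """docs maps id -> (text, embedding). Returns ids, best match first."""
    # Phase 1: the cheap keyword pass selects a candidate set.
    by_keyword = sorted(docs, key=lambda d: keyword_score(query, docs[d][0]), reverse=True)
    candidates = by_keyword[:first_pass_size]
    # Phase 2: vector similarity is computed only for those candidates.
    reranked = sorted(candidates,
                      key=lambda d: cosine_similarity(query_vector, docs[d][1]),
                      reverse=True)
    return reranked[:limit]

docs = {
    1: ("shipping was late", [0.9, 0.1]),
    2: ("same day shipping never arrived", [1.0, 0.0]),
    3: ("great product quality", [0.0, 1.0]),
    4: ("shipping box damaged", [0.5, 0.5]),
}

top = two_phase_search("same day shipping", [1.0, 0.0], docs, first_pass_size=3, limit=2)
print(top)  # [2, 1]
```

Document 3 never reaches the second phase because it shares no terms with the query, so the more expensive similarity computation runs on three documents instead of four.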

To illustrate hybrid search, here’s a SQL snippet that uses the Reciprocal Rank Fusion (RRF) technique to merge vector and text search results for the same query (RRF is used, when needed, in the /v1/search HTTP API):

WITH
vector_results AS (
SELECT review_id, RANK() OVER (ORDER BY score DESC) AS vector_rank
FROM vector_search(reviews, 'issues with same day shipping')
),
text_results AS (
SELECT review_id, RANK() OVER (ORDER BY score DESC) AS text_rank
FROM text_search(reviews, 'issues with same day shipping')
)
SELECT
COALESCE(v.review_id, t.review_id) AS review_id,
-- RRF scoring: 1/(60+rank) from each source
(1.0 / (60 + COALESCE(v.vector_rank, 1000)) +
1.0 / (60 + COALESCE(t.text_rank, 1000))) AS fused_score
FROM vector_results v
FULL OUTER JOIN text_results t ON v.review_id = t.review_id
ORDER BY fused_score DESC
LIMIT 50;

This takes the vector similarity results and the text (BM25) results, assigns each candidate a rank based on its relative position rather than its raw score, and combines these ranks into an overall order. Spice’s primary-key semantics in SQL make this join on document ID straightforward.
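The same fusion can be sketched outside SQL. This Python version mirrors the query above, using the constant 60 and a fallback rank of 1000 for documents missing from one list; the function name and the sample rankings are illustrative. One simplification to note: it ranks by list position, so ties are not handled the way RANK() handles them.

```python
def reciprocal_rank_fusion(rankings, k=60, missing_rank=1000):
    """rankings: list of id lists, each ordered best first.

    Returns [(id, fused_score)] sorted by fused score, highest first.
    """
    ids = {doc_id for ranking in rankings for doc_id in ranking}
    fused = {}
    for doc_id in ids:
        score = 0.0
        for ranking in rankings:
            # 1-based rank in this list, or the fallback rank if absent.
            rank = ranking.index(doc_id) + 1 if doc_id in ranking else missing_rank
            score += 1.0 / (k + rank)
        fused[doc_id] = score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

vector_ranking = [123, 24, 7]   # ids ordered by vector similarity
text_ranking = [24, 99, 123]    # ids ordered by text (BM25) score

fused = reciprocal_rank_fusion([vector_ranking, text_ranking])
print([doc_id for doc_id, _ in fused])  # [24, 123, 99, 7]
```

Review 24 wins because it ranks near the top of both lists, even though review 123 is the single best vector match. That is the behaviour RRF is designed for: agreement across sources outweighs a top position in just one.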

For a multi-column vector search example, suppose our reviews dataset has both a title and body with embeddings, and we want to prioritize title matches higher. We could create a combined_score where the title is weighted twice as high as the body:

WITH
body_results AS (
SELECT review_id, score AS body_score
FROM vector_search(reviews, 'issues with same day shipping', col => 'body')
),
title_results AS (
SELECT review_id, score AS title_score
FROM vector_search(reviews, 'issues with same day shipping', col => 'title')
)
SELECT
COALESCE(body_results.review_id, title_results.review_id) AS review_id,
COALESCE(body_score, 0) + 2.0 * COALESCE(title_score, 0) AS combined_score
FROM body_results
FULL OUTER JOIN title_results ON body_results.review_id = title_results.review_id
ORDER BY combined_score DESC
LIMIT 5;
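For readers who prefer to see the arithmetic, the weighted combination above can be sketched in Python as follows. The score dictionaries are invented sample values and `combine_scores` is a hypothetical helper; a review missing from one result set contributes 0 for that column, matching the COALESCE in the query.

```python
def combine_scores(body_scores, title_scores, title_weight=2.0):
    """Each argument maps review_id -> similarity score for that column."""
    ids = set(body_scores) | set(title_scores)
    combined = {
        review_id: body_scores.get(review_id, 0.0)
        + title_weight * title_scores.get(review_id, 0.0)
        for review_id in ids
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

body_scores = {1: 0.8, 2: 0.6}
title_scores = {2: 0.5, 3: 0.3}

ranked = combine_scores(body_scores, title_scores)
print([review_id for review_id, _ in ranked])  # [2, 1, 3]
```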

These examples scratch the surface of what you can do by leveraging Spice’s SQL-based composition. The key point is that Spice isn’t just a vector database – it’s a hybrid engine that lets you combine vector search with other query logic (text search, filters, joins, aggregations, etc.) all in one place. This can significantly simplify building complex search and AI-driven applications.

(Note: Like most vector search systems, S3 Vectors uses an approximate nearest neighbor (ANN) algorithm under the hood for performance. This yields fast results that are probabilistically the closest, which is usually an acceptable trade-off in practice. Additionally, in our examples we focused on one embedding per row; production systems may use techniques like chunking text into multiple embeddings or adding external context, but the principles above remain the same.)

Industry Context and Comparisons

The rise of vector databases over the past few years (Pinecone, Qdrant, Weaviate, etc.) has been driven by the need to serve AI applications with semantic search at scale. Each solution takes a slightly different approach in architecture and trade-offs. Spice.ai’s integration with Amazon S3 Vectors represents a newer trend in this space: decoupling storage from compute for vector search, analogous to how data warehouses separated compute and storage in the past. Let’s compare this approach with some existing solutions:

  • Traditional Vector Databases (Qdrant, Weaviate, Pinecone): These systems typically run as dedicated services or clusters that handle both the storage of vectors (on disk or in-memory) and the computation of similarity search. For example, Qdrant (an open-source engine in Rust) allows either in-memory storage or on-disk storage (using RocksDB) for vectors and payloads. It’s optimized for high performance and offers features like filtering, quantization, and distributed clustering, but you generally need to provision servers/instances that will host all your data and indexes. Weaviate, another popular open-source vector DB, uses a Log-Structured Merge (LSM) tree based storage engine that persists data to disk and keeps indexes in memory. Weaviate supports hybrid search (it can combine keyword and vector queries) and offers a GraphQL API, with a managed cloud option priced mainly by data volume. Pinecone, a fully managed SaaS, also requires you to select a service tier or pod which has certain memory/CPU allocated for your index – essentially your data lives in Pinecone’s infrastructure, not in your AWS account. These solutions excel at low-latency search for high query throughput scenarios (since data is readily available in RAM or local SSD), but the cost can be high for large datasets. You pay for a lot of infrastructure to be running, even during idle times. In fact, prior to S3 Vectors, vector search engines often stored data in memory at ~$2/GB and needed multiple replicas on SSD, which is “the most expensive way to store data”, as Simon Eskildsen (Turbopuffer’s founder) noted. Some databases mitigate cost by compressing or offloading to disk, but still, maintaining say 100 million embeddings might require a sizable cluster of VMs or a costly cloud plan.

  • Spice.ai with Amazon S3 Vectors: This approach flips the script by storing vectors in cheap, durable object storage (S3) and loading/indexing them on demand. As discussed, S3 Vectors keeps the entire vector dataset in S3 at ~$0.02/GB storage, and only spins up transient compute (managed by AWS) to serve queries, meaning you aren’t paying for idle GPU or RAM time. AWS states this design can cut total costs by up to 90% while still giving sub-second performance on billions of vectors. It’s essentially a serverless vector search model – you don’t manage servers or even dedicated indices; you just use the API. Spice.ai’s integration means developers get this cost-efficiency without having to rebuild their application: they can use standard SQL and Spice will push down operations to S3 Vectors as appropriate. This decoupled storage/compute model is ideal for use cases where the data is huge but query volumes are moderate or bursty (e.g., an enterprise semantic search that is used a few times an hour, or a nightly ML batch job). It avoids the “monolithic database” scenario of having a large cluster running 24/7. However, one should note that if you need extremely high QPS (thousands of queries per second at ultra-low latency), a purely object-storage-based solution might not outperform a tuned in-memory vector DB – AWS positions S3 Vectors as complementary to higher-QPS solutions like OpenSearch for real-time needs.

  • Turbopuffer: Turbopuffer is a startup that, much like Spice with S3 Vectors, is built from first principles on object storage. It provides “serverless vector and full-text search… fast, 10× cheaper, and extremely scalable,” by leveraging S3 or similar object stores with smart caching. The philosophy is the same: use the durability and low cost of object storage for the bulk of data, and layer a cache (memory/SSD) in front for performance-critical portions. According to Turbopuffer’s founder, moving from memory/SSD-centric architectures to an object storage core can yield 100× cost savings for cold data and 6–20× for warm data, without sacrificing too much performance. Turbopuffer’s engine indexes data incrementally on S3 and uses caching to achieve similar latency to conventional search engines on hot data. The key difference is that Turbopuffer is a standalone search service (with its own API), whereas Spice uses AWS’s S3 Vectors service as the backend. Both approaches validate the industry trend toward disaggregated storage for search. Essentially, they are bringing the cloud data warehouse economics to vector search: store everything cheaply, compute on demand.

In summary, Spice.ai’s integration with S3 Vectors and similar efforts indicate a shift in vector search towards cost-efficient, scalable architectures that separate the concerns of storing massive vector sets and serving queries. Developers now have options: if you need blazing fast, real-time vector search with constant high traffic, dedicated compute infrastructure might be justified. But for many applications – enterprise search, AI assistants with a lot of knowledge but lower QPS, periodic analytics over embeddings – offloading to something like S3 Vectors can save enormously on cost while still delivering sub-second performance at huge scale. And with Spice.ai, you get the best of both worlds: the ease of a unified SQL engine that can do keyword + vector hybrid search on structured data, combined with the power of a cloud-native vector store. It simplifies your stack (no separate vector DB service to manage) and accelerates development since you can join and filter vector search results with your data immediately in one query.

Spice v1.5.1 (July 28, 2025)

· 5 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v1.5.1! 🔑

Spice v1.5.1 expands the GitHub data connector to include pull-request comments, adds configurable rate limiting for AWS Bedrock embedding models, expands partition pruning with inequality operators, and adds client-supplied cache keys for granular caching control in the HTTP and Arrow Flight SQL APIs.

What's New in v1.5.1

GitHub Data Connector Pull Request Comments: Configure GitHub pulls datasets to include comments.

Example Spicepod.yaml:

datasets:
  - from: github:github.com/spiceai/spiceai/pulls
    name: spiceai.pulls
    params:
      github_include_comments: all # 'review', 'discussion', or 'none'. Defaults to 'none'.
      github_max_comments_fetched: '25' # Defaults to 100
    # ...

For details, see the GitHub Data Connector documentation.

AWS Bedrock Embedding Models Invocation Control: Improved rate limiting control for AWS Bedrock embedding models with max_concurrent_invocations configuration.

embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      max_concurrent_invocations: '41'
      # ...

For details, see the AWS Bedrock Embeddings Model Provider documentation.
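
To illustrate what a cap on concurrent invocations does, here is a minimal Python sketch that uses a semaphore to bound in-flight calls. It is not Spice's implementation: the `embed_all` helper and the `fake_embed` stand-in are hypothetical names, and only the parameter name `max_concurrent_invocations` comes from the configuration above.

```python
import asyncio

async def embed_all(texts, embed_one, max_concurrent_invocations: int):
    """Run embed_one over texts with at most max_concurrent_invocations in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_invocations)

    async def guarded(text):
        # Each call waits here until a slot is free.
        async with semaphore:
            return await embed_one(text)

    # gather preserves input order in its results.
    return await asyncio.gather(*(guarded(t) for t in texts))

# Hypothetical stand-in for a real embedding call; it records peak concurrency.
stats = {"in_flight": 0, "peak": 0}

async def fake_embed(text):
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # simulate network latency
    stats["in_flight"] -= 1
    return [float(len(text))]

vectors = asyncio.run(embed_all(["a", "bb", "ccc", "dddd", "eeeee"], fake_embed, 2))
```

With a limit of 2, `stats["peak"]` never exceeds 2 even though five requests are submitted at once, which is the behaviour a provider-side rate limit needs.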

Improved Query Partitioning: Expanded partition pruning support with additional inequality operators (e.g. >, >=, <, <=).

For details, see the Query Partitioning documentation.
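
As a rough illustration of how pruning with inequality operators can work, the Python sketch below skips partitions whose value range cannot satisfy a predicate. The `Partition` type with min/max bounds and the `prune` helper are hypothetical and only show the idea; they are not how Spice represents partitions internally.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    name: str
    min_value: int  # smallest partition-column value stored in this partition
    max_value: int  # largest partition-column value stored in this partition

def may_match(p: Partition, op: str, literal: int) -> bool:
    """Return True if some value in [min_value, max_value] could satisfy `col <op> literal`."""
    if op == ">":
        return p.max_value > literal
    if op == ">=":
        return p.max_value >= literal
    if op == "<":
        return p.min_value < literal
    if op == "<=":
        return p.min_value <= literal
    if op == "=":
        return p.min_value <= literal <= p.max_value
    raise ValueError(f"unsupported operator: {op}")

def prune(partitions, op: str, literal: int):
    """Keep only the partitions that still need to be scanned."""
    return [p for p in partitions if may_match(p, op, literal)]

partitions = [
    Partition("p0", 0, 99),
    Partition("p1", 100, 199),
    Partition("p2", 200, 299),
]
to_scan = prune(partitions, ">=", 200)
```

Here `to_scan` contains only `p2`, so a query filtering on `col >= 200` would read one partition instead of three.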

Client-Supplied Cache Keys: Support for a new Spice-Cache-Key header/metadata-key in the HTTP and Arrow Flight SQL query APIs for fine-grained client-side caching control.

Example HTTP API usage:

$ curl -vvS -XPOST http://localhost:8090/v1/sql \
  -H"spice-cache-key: 1851400_20170216_north_america" \
  -d "select * from scihub_journals_accessed
  where user_id = '1851400'
  and date_trunc('DAY', timestamp) = '2017-02-16'
  and city = 'New York';"

Example Response:

< HTTP/1.1 200 OK
< content-type: application/json
< x-cache: Hit from spiceai
< results-cache-status: HIT
< vary: Spice-Cache-Key
< vary: origin, access-control-request-method, access-control-request-headers
< content-length: 604
< date: Wed, 23 Jul 2025 20:26:12 GMT
<
[{
    "timestamp": "2017-02-16 09:55:06",
    "doi": "10.1155/2012/650929",
    "ip_identifier": 1000856,
    "user_id": 1851400,
    "country": "United States",
    "city": "New York",
    "longitude": 40.7830603,
    "latitude": -73.9712488
  },
  ...
]

For details, see the Cache Control documentation.
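
For clients that do not shell out to curl, the same request can be sketched with Python's standard library. This is a minimal sketch assuming the runtime listens on http://localhost:8090 as in the curl example; `build_cached_query` is a hypothetical helper name, and the request is built but not sent.

```python
import urllib.request

def build_cached_query(sql: str, cache_key: str,
                       url: str = "http://localhost:8090/v1/sql") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/sql request carrying a client-supplied cache key."""
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),
        headers={"Spice-Cache-Key": cache_key},
        method="POST",
    )

req = build_cached_query(
    "select * from scihub_journals_accessed where user_id = '1851400'",
    "1851400_20170216_north_america",
)
# To send it against a running runtime: urllib.request.urlopen(req)
```

Because the key is chosen by the client, two differently worded queries that should share a cached result can be sent with the same key.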

Contributors

New Contributors

Breaking Changes

  • N/A

Cookbook Updates

No new recipes added in this release.

The Spice Cookbook includes 74 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.5.1 image:

docker pull spiceai/spiceai:1.5.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency updates.

Changelog

  • Fix refresh via Api when dataset is already accelerated and no refresh interval is set by @sgrebnov in #6549
  • Add support for custom GraphQL unnesting behavior by @Advayp in #6540
  • Regex Update to disallow hyphens dataset names by @varunguleriaCodes in #6383
  • Enforce max limit on comments fetched per PR by @Advayp in #6580
  • Fix accelerated refresh issue by @Advayp in #6590
  • Enable configurations of max invocations for Bedrock models by @Advayp in #6592
  • Client-supplied cache keys (Spice-Cache-Key) by @mach-kernel in #6579
  • Improved partition pruning by @kczimm in #6582
  • Fix retention filter when both retention_sql and period are set by @sgrebnov in #6595
  • Initial support for PR comments by @Advayp in #6569
  • chore: Update croner by @peasee in #6547
  • fix databricks streaming for Claude model by @peasee in #6601
  • Remove FullTextUDTFAnalyzerRule and move FTS code into search crate by @jeadie in #6596
  • Remove download of legacy sentence transformers config by @jeadie in #6605
  • re-add snapshot tests by @jeadie
  • Embedding column config to support client-specified vector sizes by @mach-kernel in #6610
  • Fix mismatch in columns for the GitHub PR table type by @Advayp in #6616
  • bump version to 1.5.1 by @phillipleblanc
  • fix issues with cherry-picking by @jeadie
  • Add integration tests for GitHub PRs with comments by @Advayp in #6581
  • Add view name to view creation errors by @lukekim in #6611
  • CDC: Compute embeddings on ingest by @mach-kernel in #6612

Spice v1.5.0 (July 21, 2025)

· 14 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.5.0! 🔍

Spice v1.5.0 brings major upgrades to search and retrieval. It introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and tantivy-powered full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.

What's New in v1.5.0

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle—ingesting data, creating embeddings with models like Amazon Titan or Cohere via AWS Bedrock, or others available on HuggingFace, and storing it in S3 Vector buckets.

Spice integration with Amazon S3 Vectors

Example Spicepod.yml configuration for S3 Vectors:

datasets:
  - from: s3://my_data_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id

Example SQL query using S3 Vectors:

SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score

For more details, refer to the S3 Vectors Documentation.
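
Conceptually, the query above ranks rows by similarity to the query embedding, keeps the top 10, and applies the price filter. The Python sketch below mimics those semantics on in-memory rows. It is illustrative only: the cosine metric, the toy two-dimensional embeddings, and the local `vector_search` function are assumptions, and in Spice the embedding and the search are performed by the configured model and S3 Vectors, which may also evaluate filters during the search.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(rows, query_vector, limit):
    """Return up to `limit` rows with a `score` column, most similar first."""
    scored = [dict(row, score=cosine_similarity(row["embedding"], query_vector)) for row in rows]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:limit]

# Toy rows; real embeddings have hundreds or thousands of dimensions.
rows = [
    {"id": 1, "price": 80, "embedding": [1.0, 0.0]},
    {"id": 2, "price": 150, "embedding": [0.9, 0.1]},
    {"id": 3, "price": 40, "embedding": [0.0, 1.0]},
]

# Equivalent of: SELECT * FROM vector_search(..., 10) WHERE price < 100
hits = [r for r in vector_search(rows, [1.0, 0.0], 10) if r["price"] < 100]
```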

SQL-integrated Search: Vector and BM25-scored full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

  • bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
  • truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
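
The two partitioning UDFs can be approximated in a few lines of Python to show what values they produce. This is a sketch of the described behaviour, not Spice's code: the hash function is not specified in these notes, so CRC32 is used here purely as an assumption, and actual bucket assignments in Spice will differ.

```python
import zlib

def truncate(width: int, value: int) -> int:
    """Align value down to the nearest lower multiple of width."""
    return (value // width) * width

def bucket(num_buckets: int, value) -> int:
    """Map value to one of num_buckets buckets via a hash (CRC32 is an assumption)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets

aligned = truncate(10, 101)           # 100, matching the example above
partition = bucket(100, "acct-42")    # some stable value in the range 0..99
```

The property that matters for pruning is determinism: the same `account_id` always lands in the same bucket, so an equality filter on `account_id` only needs to read one partition.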

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
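
Reciprocal rank fusion combines several ranked lists by summing 1 / (k + rank) for each document across the lists. The Python sketch below shows the technique on two per-column result lists. The constant k = 60 is a commonly used default and an assumption here, as are the document IDs; the value Spice uses is not stated in these notes.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k: int = 60):
    """Fuse ranked lists of document IDs; returns (doc_id, score) pairs, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical per-column results for the `title` and `body` embeddings.
title_hits = ["issue-7", "issue-3", "issue-9"]
body_hits = ["issue-3", "issue-1", "issue-7"]

fused = reciprocal_rank_fusion([title_hits, body_hits])
```

Documents that rank well in both columns (`issue-3`, `issue-7`) rise above those found by only one column, without requiring the two models' raw scores to be comparable.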

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger

See the Oracle Data Connector documentation.

GitHub Data Connector: The GitHub data connector supports query and acceleration of members, the users of an organization.

Example Spicepod.yml configuration:

datasets:
  - from: github:github.com/spiceai/members # General format: github.com/[org-name]/members
    name: spiceai.members
    params:
      # With GitHub Apps (recommended)
      github_client_id: ${secrets:GITHUB_SPICEHQ_CLIENT_ID}
      github_private_key: ${secrets:GITHUB_SPICEHQ_PRIVATE_KEY}
      github_installation_id: ${secrets:GITHUB_SPICEHQ_INSTALLATION_ID}
      # With GitHub Tokens
      # github_token: ${secrets:GITHUB_TOKEN}

See the GitHub Data Connector documentation.

Spice.ai Cloud Data Connector: Graduated to Stable.

spice-rs SDK Release: The Spice Rust SDK has been updated to v3.0.0. This release includes optimizations for the Spice client API, adds robust query retries, and adds custom metadata configurations for Spice queries.

Contributors

Breaking Changes

  • Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
  • Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.

Cookbook Updates

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0 or pull the v1.5.0 Docker image (spiceai/spiceai:1.5.0).

What's Changed

Dependencies

Changelog

  • fix: openai model endpoint (#6394) by @Sevenannn in #6394
  • Enable configuring otel endpoint from spice run (#6360) by @Advayp in #6360
  • Enable Oracle connector in default build configuration (#6395) by @sgrebnov in #6395
  • fix llm integraion test (#6398) by @Sevenannn in #6398
  • Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
  • v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
  • Fix model nsql integration tests (#6365) by @Sevenannn in #6365
  • Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
  • Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
  • Improve error messages (#6405) by @lukekim in #6405
  • build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
  • Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
  • Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
  • Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
  • Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
  • add supported types (#6409) by @kczimm in #6409
  • Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
  • Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
  • Provide error message when partition by expression changes (#6415) by @kczimm in #6415
  • Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
  • prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
  • Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
  • Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
  • Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
  • Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446
  • v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
  • Oracle: add automated TPC-H SF1 benchmark tests (#6449) by @sgrebnov in #6449
  • fix: Update benchmark snapshots (#6455) by @app/github-actions in #6455
  • Preserve ArrowError in arrow_tools::record_batch (#6454) by @mach-kernel in #6454
  • fix: Update benchmark snapshots (#6465) by @app/github-actions in #6465
  • Add option to preinstall Oracle ODPI-C library in Docker image (#6466) by @sgrebnov in #6466
  • Include Oracle connector (federated mode) in automated benchmarks (#6467) by @sgrebnov in #6467
  • Update crates/llms/src/bedrock/embed/mod.rs by @lukekim in #6468
  • v1.5.0-rc.3 release notes (#6474) by @lukekim in #6474
  • Add integration tests for S3 Vectors filters pushdown (#6469) by @sgrebnov in #6469
  • check for indexedtableprovider when finding tables to search on (#6478) by @Jeadie in #6478
  • Parse fully qualified table names in UDTFs (#6461) by @Jeadie in #6461
  • Add integration test for S3 Vectors to cover data update (overwrite) (#6480) by @sgrebnov in #6480
  • Add 'Run all tests' option for models tests and enable Bedrock tests (#6481) by @sgrebnov in #6481
  • Add support for a members table type for the GitHub Data Connector (#6464) by @Advayp in #6464
  • S3 vector data cannot be null (#6483) by @Jeadie in #6483
  • Don't infer FixedSizeList size during indexing vectors. (#6487) by @Jeadie in #6487
  • Add support for retention_sql acceleration param (#6488) by @sgrebnov in #6488
  • Make dataset refresh progress tracing less verbose (#6489) by @sgrebnov in #6489
  • Use RwLock on tantivy index in FullTextDatabaseIndex for update concurrency (#6490) by @Jeadie in #6490
  • Add tests for dataset retention logic and refactor retention code (#6495) by @sgrebnov in #6495
  • Upgade dependabot dependencies (#6497) by @phillipleblanc in #6497
  • Add periodic tracing of data loading progress during dataset refresh (#6499) by @sgrebnov in #6499
  • Promote Oracle Data Connector to Alpha (#6503) by @sgrebnov in #6503
  • Use AWS SDK to provide credentials for Iceberg connectors (#6498) by @phillipleblanc in #6498
  • Add integration tests for partitioning (#6463) by @kczimm in #6463
  • Use top-level table in full-text search JOIN ON (#6491) by @Jeadie in #6491
  • Use accelerated table in vector_search JOIN operations when appropriate (#6516) by @Jeadie in #6516
  • Fix 'additional_column' for quoted columns (fix for qualified columns broke it) (#6512) by @Jeadie in #6512
  • Also use AWS SDK for inferring credentials for S3/Delta/Databricks Delta data connectors (#6504) by @phillipleblanc in #6504
  • Add per-dataset availability monitor configuration (#6482) by @phillipleblanc in #6482
  • Suppress the warning from the AWS SDK if it can't load credentials (#6533) by @phillipleblanc in #6533
  • Change default value of check_availability from default to auto (#6534) by @lukekim in #6534
  • README.md improvements for v1.5.0 (#6539) by @lukekim in #6539
  • Temporary disable s3_vectors_basic (#6537) by @sgrebnov in #6537
  • Ensure binder errors show before query and other (#6374) by @suhuruli in #6374
  • Update spiceai/duckdb-rs -> DuckDB 1.3.2 + index fix (#6496) by @mach-kernel in #6496
  • Update table-providers to latest version with DuckDB fixes (#6535) by @phillipleblanc in #6535
  • S3: default to public access if no auth is provided (#6532) by @sgrebnov in #6532

Spice v1.4.0 (June 18, 2025)

· 19 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.4.0! ⚡

This release upgrades DataFusion to v47 and Arrow to v55 for faster queries, more efficient Parquet/CSV handling, and improved reliability. It introduces the AWS Glue Catalog and Data Connectors for native access to Glue-managed data on S3, and adds support for Databricks U2M OAuth for secure Databricks user authentication.

New cron-based dataset refreshes and worker schedules enable automated task management, while dataset and search results caching improvements further optimize query, search, and RAG performance.

What's New in v1.4.0

DataFusion v47 Highlights

Spice.ai is built on the DataFusion query engine. The v47 release brings:

Performance Improvements 🚀: This release delivers major query speedups through specialized GroupsAccumulator implementations for first_value, last_value, and min/max on Duration types, eliminating unnecessary sorting and computation. TopK operations are now up to 10x faster thanks to early exit optimizations, while sort performance is further enhanced by reusing row converters, removing redundant clones, and optimizing sort-preserving merge streams. Logical operations benefit from short-circuit evaluation for AND/OR, reducing overhead, and additional enhancements address high latency from sequential metadata fetching, improve int/string comparison efficiency, and simplify logical expressions for better execution.

Bug Fixes & Compatibility Improvements 🛠️: The release addresses issues with external sort, aggregation, and window functions, improves handling of NULL values and type casting in arrays and binary operations, and corrects problems with complex joins and nested window expressions. It also addresses SQL unparsing for subqueries, aliases, and UNION BY NAME.

See the Apache DataFusion 47.0.0 Changelog for details.

Arrow v55 Highlights

Arrow v55 delivers faster Parquet gzip compression, improved array concatenation, and better support for large files (4GB+) and modular encryption. Parquet metadata reads are now more efficient, with support for range requests and enhanced compatibility for INT96 timestamps and timezones. CSV parsing is more robust, with clearer error messages. These updates boost performance, compatibility, and reliability.

See the Arrow 55.0.0 Changelog and Arrow 55.1.0 Changelog for details.

Runtime Highlights

Search Result Caching: Spice now supports runtime caching for search results, improving performance for subsequent searches and chat completion requests that use the document_similarity LLM tool. Caching is configurable with options like maximum size, item TTL, eviction policy, and hashing algorithm.

Example spicepod.yml configuration:

runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128mb
      item_ttl: 5s
      eviction_policy: lru
      hashing_algorithm: siphash

For more information, refer to the Caching documentation.
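
To show how `item_ttl` and an `lru` eviction policy interact, here is a minimal Python sketch of a cache that expires entries after a time-to-live and evicts the least recently used entry when full. It is a simplification and not Spice's cache: the `TtlLruCache` class is hypothetical, and it bounds the number of items rather than bytes, unlike `max_size: 128mb`.

```python
import time
from collections import OrderedDict

class TtlLruCache:
    """Cache with per-item TTL and least-recently-used eviction (bounded by item count)."""

    def __init__(self, max_items: int, item_ttl_seconds: float, clock=time.monotonic):
        self._max_items = max_items
        self._ttl = item_ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value), oldest use first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_items:
            self._entries.popitem(last=False)  # evict least recently used

# A controllable clock makes the behaviour easy to follow.
now = [0.0]
cache = TtlLruCache(max_items=2, item_ttl_seconds=5.0, clock=lambda: now[0])
cache.put("query-a", ["doc-1"])
cache.put("query-b", ["doc-2"])
cache.get("query-a")             # touch query-a so query-b becomes the eviction candidate
cache.put("query-c", ["doc-3"])  # cache is full, so query-b is evicted
```

A short TTL such as 5 seconds keeps repeated searches within one chat completion fast while limiting how stale a cached result can be.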

AWS Glue Catalog Connector Alpha: Connect to AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV tables in S3.

Example spicepod.yml configuration:

catalogs:
  - from: glue
    name: my_glue_catalog
    params:
      glue_key: <your-access-key-id>
      glue_secret: <your-secret-access-key>
      glue_region: <your-region>
    include:
      - 'testdb.hive_*'
      - 'testdb.iceberg_*'

sql> show tables;
+-----------------+--------------+-------------------+------------+
| table_catalog   | table_schema | table_name        | table_type |
+-----------------+--------------+-------------------+------------+
| my_glue_catalog | testdb       | hive_table_001    | BASE TABLE |
| my_glue_catalog | testdb       | iceberg_table_001 | BASE TABLE |
| spice           | runtime      | task_history      | BASE TABLE |
+-----------------+--------------+-------------------+------------+

For more information, refer to the Glue Catalog Connector documentation.

AWS Glue Data Connector Alpha: Connect to specific tables in AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV in S3.

Example spicepod.yml configuration:

datasets:
  - from: glue:my_database.my_table
    name: my_table
    params:
      glue_auth: key
      glue_region: us-east-1
      glue_key: ${secrets:AWS_ACCESS_KEY_ID}
      glue_secret: ${secrets:AWS_SECRET_ACCESS_KEY}

For more information, refer to the Glue Data Connector documentation.

Databricks U2M OAuth: Spice now supports User-to-Machine (U2M) authentication for Databricks when called with a compatible client, such as the Spice Cloud Platform.

datasets:
  - from: databricks:spiceai_sandbox.default.messages
    name: messages
    params:
      databricks_endpoint: ${secrets:DATABRICKS_ENDPOINT}
      databricks_cluster_id: ${secrets:DATABRICKS_CLUSTER_ID}
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}

Dataset Refresh Schedules: Accelerated datasets now support a refresh_cron parameter, automatically refreshing the dataset on a defined cron schedule. Cron scheduled refreshes respect the global dataset_refresh_parallelism parameter.

Example spicepod.yml configuration:

datasets:
  - name: my_dataset
    from: s3://my-bucket/my_file.parquet
    acceleration:
      refresh_cron: 0 0 * * * # Daily refresh at midnight

For more information, refer to the Dataset Refresh Schedules documentation.

Worker Execution Schedules: Workers now support a cron parameter and will execute an LLM-prompt or SQL query automatically on the defined cron schedule, in conjunction with a provided params.prompt.

Example spicepod.yml configuration:

workers:
  - name: email_reporter
    models:
      - from: gpt-4o
    params:
      prompt: 'Inspect the latest emails, and generate a summary report for them. Post the summary report to the connected Teams channel'
    cron: 0 2 * * * # Daily at 2am

For more information, refer to the Worker Execution Schedules documentation.

SQL Worker Actions: Spice now supports workers with sql actions for automated SQL query execution on a cron schedule:

workers:
  - name: my_worker
    cron: 0 * * * *
    sql: 'SELECT * FROM lineitem'

For more information, refer to the Workers with a SQL action documentation.

Contributors

Breaking Changes

  • No breaking changes.

Cookbook Updates

The Spice Cookbook now includes 70 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.4.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.4.0 image:

docker pull spiceai/spiceai:1.4.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • Update trunk to 1.4.0-unstable (#5878) by @phillipleblanc in #5878
  • Update openapi.json (#5885) by @app/github-actions in #5885
  • feat: Testoperator reports benchmark failure summary (#5889) by @peasee in #5889
  • fix: Publish binaries to dev when platform option is all (#5905) by @peasee in #5905
  • feat: Print dispatch current test count of total (#5906) by @peasee in #5906
  • Include multiple duckdb files acceleration scenarios into testoperator dispatch (#5913) by @sgrebnov in #5913
  • feat: Support building testoperator on dev (#5915) by @peasee in #5915
  • Update spicepod.schema.json (#5927) by @app/github-actions in #5927
  • Update ROADMAP & SECURITY for 1.3.0 (#5926) by @phillipleblanc in #5926
  • docs: Update qa_analytics.csv (#5928) by @peasee in #5928
  • fix: Properly publish binaries to dev on push (#5931) by @peasee in #5931
  • Load request context extensions on every flight incoming call (#5916) by @ewgenius in #5916
  • Fix deferred loading for datasets with embeddings (#5932) by @ewgenius in #5932
  • Schedule AI benchmarks to run every Mon and Thu evening PST (#5940) by @sgrebnov in #5940
  • Fix explain plan snapshots for TPCDS queries Q36, Q70 & Q86 not being deterministic after DF 46 upgrade (#5942) by @phillipleblanc in #5942
  • chore: Upgrade to Rust 1.86 (#5945) by @peasee in #5945
  • Standardise HTTP settings across CLI (#5769) by @Jeadie in #5769
  • Fix deferred flag for Databricks SQL warehouse mode (#5958) by @ewgenius in #5958
  • Add deferred catalog loading (#5950) by @ewgenius in #5950
  • Refactor deferred_load using ComponentInitialization enum for better clarity (#5961) by @ewgenius in #5961
  • Post-release housekeeping (#5964) by @phillipleblanc in #5964
  • add LTO for release builds (#5709) by @kczimm in #5709
  • Fix dependabot/192 (#5976) by @Jeadie in #5976
  • Fix Test-to-SQL benchmark scheduled run (#5977) by @sgrebnov in #5977
  • Fix JSON to ScalarValue type conversion to match DataFusion behavior (#5979) by @sgrebnov in #5979
  • Add v1.3.1 release notes (#5978) by @lukekim in #5978
  • Regenerate nightly build workflow (#5995) by @ewgenius in #5995
  • Fix DataFusion dependency loading in Databricks request context extension (#5987) by @ewgenius in #5987
  • Update spicepod.schema.json (#6000) by @app/github-actions in #6000
  • feat: Run MySQL SF100 on dev runners (#5986) by @peasee in #5986
  • fix: Remove caching RwLock (#6001) by @peasee in #6001
  • 1.3.1 Post-release housekeeping (#6002) by @phillipleblanc in #6002
  • feat: Add initial scheduler crate (#5923) by @peasee in #5923
  • fix flight request context scope (#6004) by @ewgenius in #6004
  • fix: Ensure snapshots on different scale factors are retained (#6009) by @peasee in #6009
  • fix: Allow dev runners in dispatch files (#6011) by @peasee in #6011
  • refactor: Deprecate results_cache for caching.sql_results (#6008) by @peasee in #6008
  • Fix models benchmark results reporting (#6013) by @sgrebnov in #6013
  • fix: Run PR checks for tools/ changes (#6014) by @peasee in #6014
  • feat: Add a CronRequestChannel for scheduler (#6005) by @peasee in #6005
  • feat: Add refresh_cron acceleration parameter, start scheduler on table load (#6016) by @peasee in #6016
  • Update license check to allow dual license crates (#6021) by @sgrebnov in #6021
  • Initial worker concept (#5973) by @Jeadie in #5973
  • Don't fail if cargo-deny already installed (license check) (#6023) by @sgrebnov in #6023
  • Upgrade to DataFusion 47 and Arrow 55 (#5966) by @sgrebnov in #5966
  • Read Iceberg tables from Glue Catalog Connector (#5965) by @kczimm in #5965
  • Handle multiple highlights in v1/search UX (#5963) by @Jeadie in #5963
  • feat: Add cron scheduler configurations for workers (#6033) by @peasee in #6033
  • feat: Add search cache configuration and results wrapper (#6020) by @peasee in #6020
  • Fix GitHub Actions Ubuntu for more workflows (#6040) by @phillipleblanc in #6040
  • Fix Actions for testoperator dispatch manual (#6042) by @phillipleblanc in #6042
  • refactor: Remove worker type (#6039) by @peasee in #6039
  • feat: Support cron dataset refreshes (#6037) by @peasee in #6037
  • Upgrade datafusion-federation to 0.4.2 (#6022) by @phillipleblanc in #6022
  • Define SearchPipeline and use in runtime/vector_search.rs. (#6044) by @Jeadie in #6044
  • fix: Scheduler test when scheduler is running (#6051) by @peasee in #6051
  • doc: Spice Cloud Connector Limitation (#6035) by @Sevenannn in #6035
  • Add support for on_conflict:upsert for Arrow MemTable (#6059) by @sgrebnov in #6059
  • Enhance Arrow Flight DoPut operation tracing (#6053) by @sgrebnov in #6053
  • Update openapi.json (#6032) by @app/github-actions in #6032
  • Add tools enabled to MCP server capabilities (#6060) by @Jeadie in #6060
  • Upgrade to delta_kernel 0.11 (#6045) by @phillipleblanc in #6045
  • refactor: Replace refresh oneshot with notify (#6050) by @peasee in #6050
  • Enable Upsert OnConflictBehavior for runtime.task_history table (#6068) by @sgrebnov in #6068
  • feat: Add a workers integration test (#6069) by @peasee in #6069
  • Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
  • Update Models Benchmarks to report unsuccessful evals as errors (#6070) by @sgrebnov in #6070
  • Revert: fix: Use HTTPS ubuntu sources (#6082) by @Sevenannn in #6082
  • Add initial support for Spice Cloud Platform management (#6089) by @sgrebnov in #6089
  • Run spiceai cloud connector TPC tests using spice dev apps (#6049) by @Sevenannn in #6049
  • feat: Add SQL worker action (#6093) by @peasee in #6093
  • Post-release housekeeping (#6097) by @phillipleblanc in #6097
  • Fix search bench (#6091) by @Jeadie in #6091
  • fix: Update benchmark snapshots (#6094) by @app/github-actions in #6094
  • fix: Update benchmark snapshots (#6095) by @app/github-actions in #6095
  • Glue catalog connector for hive style parquet (#6054) by @kczimm in #6054
  • Update openapi.json (#6100) by @app/github-actions in #6100
  • Improve Flight Client DoPut / Publish error handling (#6105) by @sgrebnov in #6105
  • Define PostApplyCandidateGeneration to handle all filters & projections. (#6096) by @Jeadie in #6096
  • refactor: Update the tracing task names for scheduled tasks (#6101) by @peasee in #6101
  • task: Switch GH runners in PR and testoperator (#6052) by @peasee in #6052
  • feat: Connect search caching for HTTP and tools (#6108) by @peasee in #6108
  • test: Add multi-dataset cron test (#6102) by @peasee in #6102
  • Sanitize the ListingTableURL (#6110) by @phillipleblanc in #6110
  • Avoid partial writes by FlightTableWriter (#6104) by @sgrebnov in #6104
  • fix: Update the TPCDS postgres acceleration indexes (#6111) by @peasee in #6111
  • Make Glue Catalog refreshable (#6103) by @kczimm in #6103
  • Refactor Glue catalog to use a new Glue data connector (#6125) by @kczimm in #6125
  • Emit retry error on flight transient connection failure (#6123) by @Sevenannn in #6123
  • Update Flight DoPut implementation to send single final PutResult (#6124) by @sgrebnov in #6124
  • feat: Add metrics for search results cache (#6129) by @peasee in #6129
  • update MCP crate (#6130) by @Jeadie in #6130
  • feat: Add search cache status header, respect cache control (#6131) by @peasee in #6131
  • fix: Allow specifying individual caching blocks (#6133) by @peasee in #6133
  • Update openapi.json (#6132) by @app/github-actions in #6132
  • Add CSV support to Glue data connector (#6138) by @kczimm in #6138
  • Update Spice Cloud Platform management UX (#6140) by @sgrebnov in #6140
  • Add TPCH bench for Glue catalog (#6055) by @kczimm in #6055
  • Enforce max_tokens_per_request limit in OpenAI embedding logic (#6144) by @sgrebnov in #6144
  • Enable Spice Cloud Control Plane connect (management) for FinanceBench (#6147) by @sgrebnov in #6147
  • Add integration test for Spice Cloud Platform management (#6150) by @sgrebnov in #6150
  • fix: Invalidate search cache on refresh (#6137) by @peasee in #6137
  • fix: Prevent registering cron schedule with change stream accelerations (#6152) by @peasee in #6152
  • test: Add an append cron integration test (#6151) by @peasee in #6151
  • fix: Cache search results with no-cache directive (#6155) by @peasee in #6155
  • fix: Glue catalog dispatch runner type (#6157) by @peasee in #6157
  • Fix: Glue S3 location for directories and Iceberg credentials (#6174) by @kczimm in #6174
  • Support multiple columns in FTS (#6156) by @Jeadie in #6156
  • fix: Add --cache-control flag for search CLI (#6158) by @peasee in #6158
  • Add Glue data connector tpch bench test for parquet and csv (#6170) by @kczimm in #6170
  • fix: Apply results cache deprecation correctly (#6177) by @peasee in #6177
  • Fix regression in Parquet pushdown (#6178) by @phillipleblanc in #6178
  • Fix CUDA build (use candle-core 0.8.4 and cudarc v0.12) (#6181) by @sgrebnov in #6181
  • return empty stream if no external_links present (#6192) by @kczimm in #6192
  • Use arrow pretty print util instead of init dataframe / logical plan in display_records (#6191) by @Sevenannn in #6191
  • task: Enable additional TPCDS test scenarios in dispatcher (#6160) by @peasee in #6160
  • chore: Update dependencies (#6196) by @peasee in #6196
  • Fix FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol (#6197) by @sgrebnov in #6197
  • Use spice-rs in test operator and retry on connection reset error (#6136) by @Sevenannn in #6136
  • Fix load status metric description (#6219) by @phillipleblanc in #6219
  • Run extended tests on PRs against release branch, update glue_iceberg_integration_test_catalog test (#6204) by @Sevenannn in #6204
  • query schema for is_nullable (#6229) by @kczimm in #6229
  • fix: use the query error message when queries fail (#6228) by @kczimm in #6228
  • fix glue iceberg catalog integration test (#6249) by @Sevenannn in #6249
  • cache table providers in glue catalog (#6252) by @kczimm in #6252
  • fix: databricks sql_warehouse schema contains duplicate fields (#6255) by @phillipleblanc in #6255

Full Changelog: v1.3.2...v1.4.0

Spice v1.3.2 (June 2, 2025)

· 2 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.3.2! ❄️

Spice v1.3.2 is a patch release with fixes to the DuckDB data accelerator and Snowflake data connector.

Changes:

  • DuckDB Data Accelerator: Supports ORDER BY rand() for randomized result ordering and ORDER BY NULL for SQL compatibility.

  • Snowflake Data Connector: Adds support for the TIMESTAMP_NTZ(0) type, used for timestamps with seconds precision.
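
As a rough illustration of the Snowflake change: TIMESTAMP_NTZ(0) is a wall-clock timestamp with no time zone and zero fractional-second digits. The Python sketch below only mimics the shape of such a value with the standard library. It does not call Spice or Snowflake, the helper name `to_ntz_seconds` is hypothetical, and dropping (rather than rounding) the fractional part is an assumption made for simplicity.

```python
from datetime import datetime, timezone


def to_ntz_seconds(ts: datetime) -> datetime:
    """Hypothetical helper: approximate a TIMESTAMP_NTZ(0)-style value.

    Keeps the wall-clock fields, discards the time zone marker, and
    drops fractional seconds (an assumption; not Snowflake's cast rules).
    """
    return ts.replace(tzinfo=None, microsecond=0)


# A timestamp with microseconds and an explicit UTC zone.
source = datetime(2025, 6, 2, 12, 30, 45, 987654, tzinfo=timezone.utc)
value = to_ntz_seconds(source)

# The result has no tzinfo and no sub-second component.
print(value.isoformat(sep=" "))
```

The point of the sketch is only that a seconds-precision, zone-less timestamp carries no sub-second digits and no offset, which is the kind of value the connector now handles.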

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.3.2 image:

docker pull spiceai/spiceai:1.3.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

  • Handle Snowflake Timestamp NTZ with seconds precision (#6084) by @kczimm in #6084
  • Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.3.1...v1.3.2