10 posts tagged with "s3"

Amazon S3 storage service topics and usage


Spice v2.0.1 (Jun 17, 2026)

June 17, 2026 · 4 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Spice v2.0.1 is now available! 🛠️

Spice v2.0.1 is a patch release focused on reliability and performance. It speeds up Apache Iceberg reads and fixes bugs across AWS S3 and object-store datasets, data acceleration, distributed query, and authenticated access.

What's New in v2.0.1

Faster Iceberg Reads with Parallel File Scanning

The Apache Iceberg reader now scans data files in parallel (#11331), improving read throughput and latency for Iceberg tables that span many files.

AWS S3 & Object-Store Reliability

Three fixes improve S3 and object-store dataset behavior:

Refresh-skip restored (#11339): ETag/Version-based refresh-skip works reliably again, so unchanged S3 objects are no longer re-downloaded on every refresh.
Retry when source files are not yet available (#11342): an object-store dataset whose source files are not present at startup now retries and becomes ready once the data appears, instead of failing permanently.
Path-style addressing for dotted bucket names (#11347): on standard AWS, buckets whose names contain dots now default to path-style addressing, avoiding TLS wildcard certificate errors under virtual-hosted-style HTTPS.
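
The dotted-bucket fix can be illustrated with a minimal sketch. It is not the connector's actual code: the function names, the default region, and the URL templates are assumptions chosen for illustration. It shows why a wildcard certificate fails for a dotted bucket under virtual-hosted-style HTTPS (a wildcard covers exactly one DNS label) and how path-style addressing avoids the problem.

```python
def wildcard_cert_matches(hostname: str, pattern: str) -> bool:
    """Return True if a wildcard pattern such as '*.s3.us-east-1.amazonaws.com' covers hostname.

    A wildcard stands in for exactly one DNS label, so it never spans a dot.
    """
    host_labels = hostname.lower().split(".")
    pattern_labels = pattern.lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    return all(p == "*" or p == h for p, h in zip(pattern_labels, host_labels))


def s3_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Hypothetical helper: use path-style when the bucket name contains a dot."""
    if "." in bucket:
        # Path-style: the bucket is part of the path, so the host stays certificate-friendly.
        return f"https://s3.{region}.amazonaws.com/{bucket}/{key}"
    # Virtual-hosted-style: the bucket becomes the leftmost host label.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
```

With these helpers, the host my.bucket.s3.us-east-1.amazonaws.com has one label too many for the wildcard pattern, while s3_url("my.bucket", "data.parquet") keeps the bucket name out of the hostname.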

Data Acceleration & Distributed Query Fixes

Two fixes ensure accelerated datasets behave correctly in more configurations:

Acceleration endpoints (#11345): /v1/datasets/{name}/acceleration/refresh (and the related update-refresh-sql, partition-filters, and snapshots endpoints) now work for all accelerated datasets, fixing cases where some incorrectly reported "Table is not accelerated".
Distributed clusters (#11226): the distributed query coordinator now serves accelerated data from executors for all accelerated datasets, instead of falling back to reading from the source for some.
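
As a rough sketch of calling the refresh endpoint named above, the following builds the request with Python's standard library. Only the endpoint path comes from these notes. The base URL, the POST method, the empty JSON body, and the helper name are assumptions, so check them against the HTTP API reference for your deployment.

```python
import urllib.parse
import urllib.request


def build_refresh_request(
    dataset: str, base_url: str = "http://localhost:8090"
) -> urllib.request.Request:
    """Hypothetical helper: build a refresh request for an accelerated dataset."""
    # Percent-encode the dataset name so it is safe to place in the URL path.
    name = urllib.parse.quote(dataset, safe="")
    url = f"{base_url}/v1/datasets/{name}/acceleration/refresh"
    # Assumption: the endpoint accepts a POST with an empty JSON object as the body.
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

To send it against a running runtime, pass the result to urllib.request.urlopen and inspect the response status.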

Authenticated Query Fixes

With authentication enabled, queries now consistently run as the requesting user (#11253), so per-user behavior such as results caching is correctly scoped to each user.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.1 image:

docker pull spiceai/spiceai:2.0.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix(ci): key testoperator & validator artifacts by checked-out commit by @sgrebnov in #11281
chore(deps): fix cargo-deny advisory failures on release/2.0 by @phillipleblanc in #11333
chore(deps): bump iceberg-rust to parallel file scanning fork (release/2.0) by @phillipleblanc in #11331
fix(refresh): restore S3 ETag/Version refresh-skip behind provider wrappers by @phillipleblanc in #11339
fix(runtime): retry object-store dataset load when source files are not yet available by @phillipleblanc in #11342
feat(s3): default to path-style for dotted bucket names on standard AWS by @phillipleblanc in #11347
fix(runtime): resolve accelerated table through metadata-enrichment wrapper by @phillipleblanc in #11345
fix(cluster): distribute accelerated tables wrapped by metadata/index providers by @phillipleblanc in #11226
fix: scope request context across the managed query runtime by @phillipleblanc in #11253

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0...v2.0.1

Spice v1.11.5 (Apr 1, 2026)

April 1, 2026 · 4 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.5! 🛠️

Spice v1.11.5 is a patch release improving on_zero_results: use_source fallback performance, Delta Lake timestamp predicate data skipping, S3 Parquet read performance, PostgreSQL partitioned table support, Cayenne target file size handling, and preparing the CLI for v2.0 runtime upgrades.

What's New in v1.11.5

`on_zero_results: use_source` Fallback Performance Improvement

The on_zero_results: use_source fallback path now runs DataFusion's physical optimizer rules (SessionState::physical_optimizers()) on the federated scan plan before execution (#9927), enabling parallel file group scanning and other optimizations. Fallback queries are significantly faster on multi-core machines, particularly for file-based data sources like Delta Lake.

Delta Lake: Improved Data Skipping for `>=` Timestamp Predicates

Delta Lake table scans with >= timestamp filters now correctly prune files that do not match the predicate (#9932), improving query performance through more effective data skipping (file-level pruning).
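
File-level pruning for a >= predicate can be sketched as below. This is an illustrative model, not the Delta Lake connector's implementation: the FileStats type and its field names are assumptions standing in for the per-file min/max statistics a table log typically records.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileStats:
    """Hypothetical per-file statistics for a timestamp column."""
    path: str
    min_ts: datetime
    max_ts: datetime


def prune_for_gte(files, threshold):
    """Keep only files that may contain rows with ts >= threshold.

    A file can be skipped when even its largest timestamp is below the threshold.
    """
    return [f for f in files if f.max_ts >= threshold]
```

A file whose max_ts equals the threshold is kept, because the comparison is inclusive.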

PostgreSQL: Partitioned Tables Support

The PostgreSQL data connector now supports partitioned tables (#9997) for both federated and accelerated queries.

S3 Parquet Read Performance Improvement

Improved Parquet read performance from S3 and other object stores (#10064), particularly for tables with many columns. Column data ranges are now coalesced into fewer, larger requests instead of being fetched individually, reducing the number of HTTP round-trips.
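
The idea of range coalescing can be shown with a small sketch. It is not the runtime's code: the function name and the max_gap threshold are assumptions, and a real reader would also cap the size of each merged request.

```python
def coalesce_ranges(ranges, max_gap=1024 * 1024):
    """Merge (start, end) byte ranges (end exclusive) separated by at most max_gap bytes.

    Nearby column chunks are fetched in one request, at the cost of reading
    the small gap between them.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it instead of issuing a new request.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, with max_gap=100 the three ranges (0, 100), (150, 300), and (5000, 6000) collapse into two requests: (0, 300) and (5000, 6000).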

Cayenne: Ensure Target File Size is Respected

The Cayenne accelerator now correctly respects the configured target file size (#10071). Previously, Cayenne could produce many small, fragmented Vortex files; with this fix, files are written at the expected target size, improving storage efficiency and query performance.

CLI: Support for v2.0 Runtime Upgrades

The Spice CLI can now upgrade to v2.0 runtime versions. This enables upgrading to v2.0 release candidates and, once released, the v2.0 stable runtime.

spice upgrade v2.0.0-rc.1

Running spice upgrade without a version will upgrade to the latest stable version, including v2.0 once released.

Note: Native Windows runtime builds will no longer be provided in v2.0. Use WSL for local development instead.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.5, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.5 image:

docker pull spiceai/spiceai:1.11.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.5

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix(runtime): Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
fix(delta_lake): Fix data skipping for >= timestamp predicates by @sgrebnov in #9932
fix(PostgreSQL): Fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
fix(cli): Skip models variant download for v2+ in upgrade/install by @lukekim and @sgrebnov in #10052
perf(s3): Improve Parquet read performance by @sgrebnov in #10064
fix(cayenne): Ensure Cayenne respects target file size by @krinart in #10071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.4...v1.11.5

Spice v1.11.4 (Mar 12, 2026)

March 12, 2026 · 5 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.4! ⚡

Spice v1.11.4 is a patch release improving S3 metadata column query robustness and enabling on_zero_results: use_source for accelerated views.

What's New in v1.11.4

Accelerated Views: `on_zero_results: use_source` Support

Accelerated views now support the on_zero_results: use_source configuration (#9699). Previously, accelerated views only supported on_zero_results: return_empty, which returned an empty result set when the accelerated data contained no matching rows. With this change, views can fall back to querying the source data when the accelerated query returns zero results, matching the behavior already available for accelerated datasets.

Example configuration:

views:
  - name: sales_summary
    sql: |
      SELECT region, SUM(amount) as total
      FROM sales
      GROUP BY region
    acceleration:
      enabled: true
      on_zero_results: use_source

How the Fallback Works

When an accelerated view is configured with on_zero_results: use_source, the following happens at query time:

1. The accelerated store is queried first. The query runs against the view's accelerated data (e.g., Spice Cayenne, Arrow, DuckDB, or SQLite).
2. If the accelerated query returns zero rows, the runtime falls back to re-executing the view's SQL query against the datasets it references.
3. Referenced datasets are queried according to their own configuration. The view's SQL is re-executed against each referenced dataset as it is configured. This means:
   - If a referenced dataset is accelerated, the query hits that dataset's accelerated store, not the raw data source.
   - If a referenced dataset is accelerated with on_zero_results: use_source and its accelerated store also returns zero rows, it will independently fall back to its own federated data source (e.g., Postgres, S3, etc.).
   - If a referenced dataset is federated (not accelerated), the query goes directly to the data source.

This means the fallback can chain through multiple layers: first the view's acceleration, then each referenced dataset's acceleration, and finally the original data source — each layer independently applying its own on_zero_results behavior.

Example: Multi-layer fallback

datasets:
  - from: postgres:orders
    name: orders
    acceleration:
      enabled: true
      refresh_sql: "SELECT * FROM orders WHERE created_at > now() - interval '7 days'"
      on_zero_results: use_source  # Falls back to Postgres if accelerated data has no matches

views:
  - name: recent_orders_summary
    sql: |
      SELECT status, COUNT(*) as order_count
      FROM orders
      GROUP BY status
    acceleration:
      enabled: true
      on_zero_results: use_source  # Falls back to re-running the SQL against referenced datasets

In this example, a query like SELECT * FROM recent_orders_summary WHERE status = 'cancelled' follows this path:

1. Queries recent_orders_summary in the view's accelerated store (DuckDB/SQLite).
2. If zero rows are returned, re-executes SELECT status, COUNT(*) ... FROM orders GROUP BY status against the orders dataset.
3. Since orders is accelerated, the query hits the orders accelerated store.
4. If orders also returns zero rows (e.g., the refresh_sql excluded cancelled orders), it falls back to querying Postgres directly.
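
The path above can be modeled as a chain of layers, each applying its own on_zero_results setting. The sketch below is a deliberately simplified model and not the runtime's logic: the function name and tuple layout are assumptions, and each layer's rows are assumed to already be in the shape the query returns, whereas the runtime re-executes the view's SQL at each step.

```python
def query_layers(layers, predicate):
    """Query each layer in order and fall through only when it returns zero rows
    and is configured with on_zero_results 'use_source'.

    layers is a list of (name, rows, on_zero_results) tuples, ordered from the
    view's accelerated store down to the original data source.
    """
    last_name = None
    for name, rows, on_zero_results in layers:
        last_name = name
        result = [row for row in rows if predicate(row)]
        if result or on_zero_results != "use_source":
            return name, result
    return last_name, []


layers = [
    ("view acceleration", [{"status": "shipped", "order_count": 3}], "use_source"),
    ("orders acceleration", [{"status": "shipped", "order_count": 3}], "use_source"),
    ("postgres", [{"status": "shipped", "order_count": 3},
                  {"status": "cancelled", "order_count": 1}], None),
]
```

Calling query_layers(layers, lambda r: r["status"] == "cancelled") falls through both accelerated layers and is answered by the "postgres" layer, while a filter on "shipped" is answered by the first layer.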

S3 Data Connector: More Robust Metadata Column Handling

Improved the robustness of metadata column (location, last_modified, size) handling for S3 datasets. Building on the v1.11.3 release, this update addresses an additional edge case where the query optimizer's projection swap could cause an index out of bounds panic when metadata columns are referenced in projections with filters or scalar functions.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.4, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.4 image:

docker pull spiceai/spiceai:1.11.4

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.4

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix(s3): Make metadata column handling more robust by @sgrebnov in #9714
feat(views): Enable on_zero_results: use_source for accelerated views by @krinart in #9699

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.3...v1.11.4

Spice v1.11.3 (Mar 9, 2026)

March 9, 2026 · 3 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.11.3! 🛠️

Spice v1.11.3 is a patch release fixing schema consistency issues in the S3 and FlightSQL data connectors, improving CDC cache invalidation, and enhancing the HTTP data connector's error handling and response metadata.

What's New in v1.11.3

S3 Data Connector Fix

Fixed an issue where queries using metadata columns (location, last_modified, size) on S3 datasets produced "Input field name does not match with the projection expression" errors (#9647). This occurred when projecting metadata columns with filters or scalar functions (e.g., SELECT lower(location) FROM table WHERE location = '...'), and when the projection returned no matching files.

FlightSQL Schema Consistency

Fixed an issue where the Flight SQL JDBC driver returned "Unsupported ArrowType Utf8View" errors when performing ::TEXT type casts (#9253). The FlightSQL endpoint now maps view types (e.g., Utf8View, BinaryView) to their non-view equivalents, ensuring compatibility with JDBC and ODBC clients.

CDC Cache Invalidation

Fixed an issue where the SQL results cache was invalidated on every change stream poll, even when zero records were returned (#9472). This caused near-total cache miss rates for datasets using refresh_mode: changes (e.g., DynamoDB Streams), effectively rendering the cache useless. Cache invalidation now only occurs when a change batch contains actual data changes.
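
The corrected behavior amounts to a guard on the change batch. The sketch below is illustrative only. The ResultsCache class and on_change_poll function are hypothetical names, not the runtime's types.

```python
class ResultsCache:
    """Hypothetical stand-in for a SQL results cache."""

    def __init__(self):
        self.entries = {}
        self.invalidations = 0

    def invalidate(self):
        self.entries.clear()
        self.invalidations += 1


def on_change_poll(cache, batch):
    """Invalidate cached results only when the polled batch contains changes."""
    if batch:  # an empty poll leaves the cache untouched
        cache.invalidate()
```

A stream that is polled frequently but rarely changes then keeps its cached results between real updates.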

HTTP Data Connector Improvements

HTTP error responses (e.g., 5xx) are now excluded from the cache, preventing transient server errors from polluting cached results.
Added a response_headers column (Map type) to HTTP responses, providing access to response header metadata in query results.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.3, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.3 image:

docker pull spiceai/spiceai:1.11.3

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.3

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
Improve CDC cache invalidation by @krinart in #9651
Skip caching http error response + add response_headers by @krinart in #9670

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.2...v1.11.3

Spice v1.10.3 (Dec 29, 2025)

December 29, 2025 · 2 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.10.3! 🚀

v1.10.3 is a patch release with improved startup reliability, fixes for Azure BlobFS versioned containers, S3 custom endpoint query resolution, and a fix for the OpenAI Responses API.

What's New in v1.10.3

Additional Improvements & Bug Fixes

Reliability: Telemetry exporter initialization now runs asynchronously, preventing blocked startup in environments with network restrictions (e.g., Kubernetes with restrictive network policies).
Reliability: Fixed an issue where queries on Azure Blob containers with versioning enabled would fail with an "Azure does not support suffix range requests" error in distributed query mode.
Reliability: Fixed S3 location-based queries against custom S3 endpoints (e.g., MinIO, LocalStack). Queries with location predicates on datasets using s3_endpoint and s3_region parameters now correctly route to the configured endpoint instead of defaulting to AWS S3.
Reliability: Fixed "project index out of bounds" errors in the query optimizer when union children have mismatched schemas. The optimizer now validates schema compatibility before applying projection pushdown.
Reliability: Fixed an issue where the OpenAI Responses API (/v1/responses) was not working correctly.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.3, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.3 image:

docker pull spiceai/spiceai:1.10.3

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Upgrade to openai-async v0.32 by @lukekim in #8635
Fix issue with location predicate for custom S3 endpoints + regression integration test by @phillipleblanc in #8668
fix: Validate schema match before projection pushdown in UnionProjectionPushdownOptimizer by @phillipleblanc in #8669
Start the anonymous telemetry exporter asynchronously by @phillipleblanc in #8679
fix: Azure does not support suffix range requests by @phillipleblanc in #8685

Spice v1.5.0 (July 21, 2025)

July 22, 2025 · 14 min read

Evgenii Khramkov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.5.0! 🔍

Spice v1.5.0 brings major upgrades to search and retrieval. It introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and tantivy-powered full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.

What's New in v1.5.0

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle: ingesting data, creating embeddings with models like Amazon Titan or Cohere via AWS Bedrock (or others available on HuggingFace), and storing them in S3 Vector buckets.

Spice integration with Amazon S3 Vectors

Example Spicepod.yml configuration for S3 Vectors:

datasets:
  - from: s3://my_data_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id

Example SQL query using S3 Vectors:

SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score

For more details, refer to the S3 Vectors Documentation.

SQL-integrated Search: Vector and BM25-scored full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Read the DuckDB v1.2.0 announcement.
Read the DuckDB v1.3.0 announcement.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
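
To make the two partitioning UDFs concrete, here is a rough Python model of their described behavior. The truncate arithmetic follows the example given above (truncate(10, 101) = 100). The hash used by bucket is not specified in these notes, so CRC32 is a stand-in assumption and will not reproduce the runtime's actual bucket assignments.

```python
import zlib


def truncate(width, value):
    """Align an integer to the nearest lower multiple of width."""
    return value - (value % width)


def bucket(num_buckets, value):
    """Map a value to one of num_buckets partitions via a hash.

    Assumption: CRC32 over the value's string form stands in for the real hash.
    """
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets
```

Because the mapping is deterministic, an equality predicate on account_id resolves to a single bucket, which is what allows partitions to be pruned at query time.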

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
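
Reciprocal rank fusion, mentioned above as the aggregation step, can be sketched in a few lines. This is the textbook form of the technique, not necessarily the exact variant Spice uses. The constant k = 60 is a conventional default and an assumption here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list.

    Each id scores the sum of 1 / (k + rank) over the lists it appears in,
    where rank starts at 1. Higher fused scores come first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


title_hits = ["a", "b", "c"]  # hypothetical per-column results for title
body_hits = ["b", "d", "a"]   # hypothetical per-column results for body
fused = reciprocal_rank_fusion([title_hits, body_hits])
```

Ids that rank well in both columns rise to the top, so "b" and "a" come ahead of ids that appear in only one list.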

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger

See the Oracle Data Connector documentation.

GitHub Data Connector: The GitHub data connector supports query and acceleration of members, the users of an organization.

Example Spicepod.yml configuration:

datasets:
  - from: github:github.com/spiceai/members # General format: github.com/[org-name]/members
    name: spiceai.members
    params:
      # With GitHub Apps (recommended)
      github_client_id: ${secrets:GITHUB_SPICEHQ_CLIENT_ID}
      github_private_key: ${secrets:GITHUB_SPICEHQ_PRIVATE_KEY}
      github_installation_id: ${secrets:GITHUB_SPICEHQ_INSTALLATION_ID}
      # With GitHub Tokens
      # github_token: ${secrets:GITHUB_TOKEN}

See the GitHub Data Connector Documentation

Spice.ai Cloud Data Connector: Graduated to Stable.

spice-rs SDK Release: The Spice Rust SDK has been updated to v3.0.0. This release optimizes the Spice client API and adds robust query retries and custom metadata configuration for Spice queries.

Contributors

Breaking Changes

Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.

Cookbook Updates

Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0 or pull the v1.5.0 Docker image (spiceai/spiceai:1.5.0).

What's Changed

Dependencies

delta_kernel: Upgraded to v0.12.1
DuckDB: Upgraded from v1.1.3 to v1.3.2
iceberg-rust: Upgraded from v0.4.0 to v0.5.1

Changelog

fix: openai model endpoint (#6394) by @Sevenannn in #6394
Enable configuring otel endpoint from spice run (#6360) by @Advayp in #6360
Enable Oracle connector in default build configuration (#6395) by @sgrebnov in #6395
fix llm integraion test (#6398) by @Sevenannn in #6398
Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
Fix model nsql integration tests (#6365) by @Sevenannn in #6365
Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
Improve error messages (#6405) by @lukekim in #6405
build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
add supported types (#6409) by @kczimm in #6409
Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
Provide error message when partition by expression changes (#6415) by @kczimm in #6415
Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446
v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
Oracle: add automated TPC-H SF1 benchmark tests (#6449) by @sgrebnov in #6449
fix: Update benchmark snapshots (#6455) by @app/github-actions in #6455
Preserve ArrowError in arrow_tools::record_batch (#6454) by @mach-kernel in #6454
fix: Update benchmark snapshots (#6465) by @app/github-actions in #6465
Add option to preinstall Oracle ODPI-C library in Docker image (#6466) by @sgrebnov in #6466
Include Oracle connector (federated mode) in automated benchmarks (#6467) by @sgrebnov in #6467
Update crates/llms/src/bedrock/embed/mod.rs by @lukekim in #6468
v1.5.0-rc.3 release notes (#6474) by @lukekim in #6474
Add integration tests for S3 Vectors filters pushdown (#6469) by @sgrebnov in #6469
check for indexedtableprovider when finding tables to search on (#6478) by @Jeadie in #6478
Parse fully qualified table names in UDTFs (#6461) by @Jeadie in #6461
Add integration test for S3 Vectors to cover data update (overwrite) (#6480) by @sgrebnov in #6480
Add 'Run all tests' option for models tests and enable Bedrock tests (#6481) by @sgrebnov in #6481
Add support for a members table type for the GitHub Data Connector (#6464) by @Advayp in #6464
S3 vector data cannot be null (#6483) by @Jeadie in #6483
Don't infer FixedSizeList size during indexing vectors. (#6487) by @Jeadie in #6487
Add support for retention_sql acceleration param (#6488) by @sgrebnov in #6488
Make dataset refresh progress tracing less verbose (#6489) by @sgrebnov in #6489
Use RwLock on tantivy index in FullTextDatabaseIndex for update concurrency (#6490) by @Jeadie in #6490
Add tests for dataset retention logic and refactor retention code (#6495) by @sgrebnov in #6495
Upgade dependabot dependencies (#6497) by @phillipleblanc in #6497
Add periodic tracing of data loading progress during dataset refresh (#6499) by @sgrebnov in #6499
Promote Oracle Data Connector to Alpha (#6503) by @sgrebnov in #6503
Use AWS SDK to provide credentials for Iceberg connectors (#6498) by @phillipleblanc in #6498
Add integration tests for partitioning (#6463) by @kczimm in #6463
Use top-level table in full-text search JOIN ON (#6491) by @Jeadie in #6491
Use accelerated table in vector_search JOIN operations when appropriate (#6516) by @Jeadie in #6516
Fix 'additional_column' for quoted columns (fix for qualified columns broke it) (#6512) by @Jeadie in #6512
Also use AWS SDK for inferring credentials for S3/Delta/Databricks Delta data connectors (#6504) by @phillipleblanc in #6504
Add per-dataset availability monitor configuration (#6482) by @phillipleblanc in #6482
Suppress the warning from the AWS SDK if it can't load credentials (#6533) by @phillipleblanc in #6533
Change default value of check_availability from default to auto (#6534) by @lukekim in #6534
README.md improvements for v1.5.0 (#6539) by @lukekim in #6539
Temporary disable s3_vectors_basic (#6537) by @sgrebnov in #6537
Ensure binder errors show before query and other (#6374) by @suhuruli in #6374
Update spiceai/duckdb-rs -> DuckDB 1.3.2 + index fix (#6496) by @mach-kernel in #6496
Update table-providers to latest version with DuckDB fixes (#6535) by @phillipleblanc in #6535
S3: default to public access if no auth is provided (#6532) by @sgrebnov in #6532

Spice v1.0-stable (Jan 20, 2025)

January 20, 2025 · 11 min read

William Croxson

Senior Software Engineer at Spice AI

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Stable Data Connectors: The following data connectors have graduated to Stable:
- Delta Lake
- MySQL
- Dremio
- PostgreSQL
- Databricks (mode: delta_lake)
- DuckDB
- S3
Stable Data Accelerators: The following data accelerators have graduated to Stable:
- DuckDB
- Arrow
Unity Catalog Connector: Graduated to Stable.
Databricks (mode: spark_connect) Data Connector: Graduated to Beta.
Beta Catalog Connectors: The Iceberg and Databricks catalog connectors graduated to Beta.
OpenAI Model & Embeddings Provider: Graduated to Release Candidate (RC).
Alpha Model Providers: The Anthropic and xAI (Grok) model providers graduated to Alpha.

Breaking Changes

Default Runtime Version: The CLI now installs the GPU-accelerated, AI-capable Runtime by default (if supported) when running spice install or spice run. To force-install the non-GPU version, run spice install ai --cpu.
Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.
Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.
Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.
Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
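
As a rough illustration of the insecure-endpoint check, a dataset that points at a plain-HTTP, S3-compatible endpoint would need to opt in explicitly. The sketch below assumes a local endpoint at http://localhost:9000, a hypothetical bucket and dataset name, and that the endpoint is set through an s3_endpoint parameter. Only allow_http is named in these release notes, so confirm the other parameter names against the S3 connector documentation.

```yaml
datasets:
  # Hypothetical bucket and dataset name, for illustration only.
  - from: s3://my-bucket/path/
    name: my_dataset
    params:
      file_format: parquet
      # Assumed parameter name and endpoint: a plain-HTTP, S3-compatible endpoint.
      s3_endpoint: http://localhost:9000
      # Explicit opt-in; without it the HTTP endpoint is expected to be rejected.
      allow_http: true
```

Leaving allow_http unset keeps the safer default, so it is best enabled only for local development or trusted networks.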

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.0 image:

docker pull spiceai/spiceai:1.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

Contributors

@peasee
@ewgenius
@Jeadie
@Sevenannn
@lukekim
@phillipleblanc
@sgrebnov

What's Changed

- feat: Update load test criteria, testoperator updates by @peasee in <https://github.com/spiceai/spiceai/pull/4311>
- Update helm for v1.0.0-rc.5 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4313>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4318>
- Bump version to v1.0.0, update SECURITY.md by @ewgenius in <https://github.com/spiceai/spiceai/pull/4314>
- Initial criteria for models, embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4223>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4321>
- Add dremio param for running load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4315>
- Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4328>
- Handle failed query in load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4327>
- feat: Use load test hours for baseline query sets by @peasee in <https://github.com/spiceai/spiceai/pull/4334>
- Fix typo in 1.0.0-rc.5 release notes by @ewgenius in <https://github.com/spiceai/spiceai/pull/4329>
- feat: add testoperator data consistency by @peasee in <https://github.com/spiceai/spiceai/pull/4319>
- docs: Release DuckDB connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4335>
- Fix DocumentDB -> DynamoDB by @lukekim in <https://github.com/spiceai/spiceai/pull/4339>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4337>
- fix: Download hits.parquet from MinIO for benchmark by @peasee in <https://github.com/spiceai/spiceai/pull/4338>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4341>
- Remove evil averages by @lukekim in <https://github.com/spiceai/spiceai/pull/4343>
- Don't run builds on non-code changes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4344>
- Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4345>
- Update s3 tpcds spicepods by @ewgenius in <https://github.com/spiceai/spiceai/pull/4346>
- Explicitly set required scale factor for throughput and load tests by @ewgenius in <https://github.com/spiceai/spiceai/pull/4347>
- Fix s3 tpcds dataset name by @ewgenius in <https://github.com/spiceai/spiceai/pull/4348>
- Promote Iceberg Catalog Connector to Beta by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4350>
- Update s3 clickbench benchmark snapshots by @ewgenius in <https://github.com/spiceai/spiceai/pull/4351>
- fix: DuckDB clickbench on zero results by @peasee in <https://github.com/spiceai/spiceai/pull/4349>
- Add integration test with snapshots for databricks catalog connector by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4353>
- refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in <https://github.com/spiceai/spiceai/pull/4354>
- Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4297>
- docs: Update roadmap by @peasee in <https://github.com/spiceai/spiceai/pull/4364>
- feat: Release accelerators stable by @peasee in <https://github.com/spiceai/spiceai/pull/4361>
- Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4365>
- Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the `allow_http` parameter by @ewgenius in <https://github.com/spiceai/spiceai/pull/4363>
- Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4367>
- Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4369>
- Update Roadmap - Iceberg beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4373>
- Build CUDA binaries for Linux by @Jeadie in <https://github.com/spiceai/spiceai/pull/4320>
- Promote Nvidia NIM as Alpha by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4380>
- Promote xai to alpha by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4381>
- Update stable criteria for object store based connectors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4383>
- Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in <https://github.com/spiceai/spiceai/pull/4382>
- Promote S3 Data Connector to Stable by @ewgenius in <https://github.com/spiceai/spiceai/pull/4385>
- Download platform-supported CUDA binary version on Linux by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4356>
- Fix http consistency test workflow, add overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4387>
- feat: Add Postgres test spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/4388>
- Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in `crates/llms/tests`.  by @Jeadie in <https://github.com/spiceai/spiceai/pull/4377>
- Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4389>
- Fix http overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4390>
- Tweak model tests, fix embedding input by @ewgenius in <https://github.com/spiceai/spiceai/pull/4391>
- Promote Dremio to Stable quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4392>
- Add beta functionality tests for embedding models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4352>
- docs: Release postgres connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4398>
- Increase timeout for model response in E2E tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4399>
- Disable ident normalization (i.e. `SELECT MyColumn from table` works) by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4400>
- Preserve schema metadata by @ewgenius in <https://github.com/spiceai/spiceai/pull/4402>
- Make models integration tests tracing less verbose by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4403>
- Fix `cuda` feature build on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4404>
- Promote MySQL to Stable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4406>
- docs: Release Delta Lake and Unity catalog by @peasee in <https://github.com/spiceai/spiceai/pull/4405>
- Use `gpt-4o-mini` as a default model for openai provider by @ewgenius in <https://github.com/spiceai/spiceai/pull/4410>
- Fix streaming for Openai and Anthropic by @Jeadie in <https://github.com/spiceai/spiceai/pull/4409>
- Tweak model loading and missing tool errors messages by @ewgenius in <https://github.com/spiceai/spiceai/pull/4412>
- Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4407>
- Build Windows CUDA binaries as part of `build_and_release` workflow by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4386>
- Update docs link by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4416>
- feat: Add CPU models install escape hatch by @peasee in <https://github.com/spiceai/spiceai/pull/4419>
- Handle OpenAI API Errors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4417>
- Update spice cli to use `GH_TOKEN` or `GITHUB_TOKEN` env variables when calling releases api by @ewgenius in <https://github.com/spiceai/spiceai/pull/4175>
- Implement secure sandboxing for Docker image by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4411>
- Automatically install supported CUDA binary on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4420>
- Metrics for LLMs+ embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4418>
- Jeadie/25 01 17/beta perf by @Jeadie in <https://github.com/spiceai/spiceai/pull/4397>
- Pass GitHub token to all CI steps calling spice run by @ewgenius in <https://github.com/spiceai/spiceai/pull/4423>
- Run the models integration tests on PRs by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4421>
- Run CUDA builds in a separate workflow by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4430>
- Promote OpenAI models and embeddings providers to RC by @ewgenius in <https://github.com/spiceai/spiceai/pull/4432>
- Update link to retrieval-augmented generation (RAG) details by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4433>
- Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4436>
- Update quickstart traces to match current version by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4435>
- Update Supported Embeddings Providers Readme section by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4434>
- Local models can stream tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4429>
- fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in <https://github.com/spiceai/spiceai/pull/4442>
- Fix run query action by @ewgenius in <https://github.com/spiceai/spiceai/pull/4444>
- Default to AI-enabled runtime for `spice run`/`spice install` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4443>
- Change no spicepod.yaml log to warning by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4447>
- refactor: Update Catalog Connector error messages by @peasee in <https://github.com/spiceai/spiceai/pull/4441>
- Fix panic when converting OTel metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4449>
- refactor: Update model errors by @peasee in <https://github.com/spiceai/spiceai/pull/4446>
- Update spiceai/mistral.rs to silence metadata logs by @ewgenius in <https://github.com/spiceai/spiceai/pull/4452>
- fix xAI; don't use openai defaults by @Jeadie in <https://github.com/spiceai/spiceai/pull/4450>
- Improves the UX of using huggingface models by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4451>
- Add GH Workflow to test `spice ai` runtime installation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4448>
- fix: Use specific model errors where available by @peasee in <https://github.com/spiceai/spiceai/pull/4454>
- Detect and report unsupported embedding column type during dataset registration by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4456>
- Handle Errors by @Jeadie in <https://github.com/spiceai/spiceai/pull/4455>
- Catch and report negative openai_temperature error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4453>
- Clarify release check error message if it is caused by wrong GH token by @ewgenius in <https://github.com/spiceai/spiceai/pull/4458>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0>

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Twitter: @spice_ai
Slack: spiceai.org/slack
Telegram: Spice AI Discussion
Reddit: https://www.reddit.com/r/spiceai
Email: [email protected]

Spice v0.20-beta (Nov 4, 2024)

November 4, 2024 · 4 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.20-beta 🧩

Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVIDIA) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidate. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.

Highlights in v0.20.0-beta

Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.

Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.

Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVIDIA) for AI/ML workloads, including embeddings and local LLM inference.

For instructions on compiling a Metal or CUDA binary, see the Installation Docs.

Breaking Changes

The ODBC Data Connector now requires that ODBC drivers specified in connection strings be registered in the system ODBC driver manager.

Example invalid connection string:

DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master

Example valid connection string:

DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master

Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.
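
In a Spicepod, the valid form might look like the following sketch. The odbc: prefix, the odbc_connection_string parameter name, and the table and dataset names are assumptions for illustration. My ODBC Driver must match a driver name registered with the system ODBC driver manager.

```yaml
datasets:
  # Hypothetical table and dataset names.
  - from: odbc:my_table
    name: my_table
    params:
      # Reference the driver by its registered name, not by a path to a shared library.
      odbc_connection_string: "DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master"
```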

Contributors

@ewgenius
@peasee
@phillipleblanc
@sgrebnov
@Jeadie
@barracudarin
@Sevenannn

What's Changed

- Update Helm for v0.19.4-beta and add release notes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3310>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/3311>
- `metal` & `cuda` flags for spice by @Jeadie in <https://github.com/spiceai/spiceai/pull/3212>
- Promote postgres connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3305>
- docs: Update ROADMAP.md by @peasee in <https://github.com/spiceai/spiceai/pull/3322>
- feat: Enable federation for in-memory accelerators by @peasee in <https://github.com/spiceai/spiceai/pull/3325>
- fix: Only allow env files from the current dir by @peasee in <https://github.com/spiceai/spiceai/pull/3327>
- Always read TimezoneTZ from PostgreSQL as UTC by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3330>
- For multi-sink acceleration refreshes, ensure parent table completes before the children. by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3329>
- Update TPC-DS Q49 (Decimal to Float) to match SQLite's type system by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3323>
- Enable parquet pushdown in Spice by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3245>
- Use spice object_store fork to fix S3 ambiguous error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3304>
- Don't mix commented out queries for s3 connectors and accelerators by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3331>
- Allow only valid WHERE conditions in vector searches by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3335>
- fix: Allow only ODBC profiles by @peasee in <https://github.com/spiceai/spiceai/pull/3324>
- Track how many times an acceleration falls back during initialization by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3339>
- Anthropic model regex and fix tool parsing aggregation bug by @Jeadie in <https://github.com/spiceai/spiceai/pull/3334>
- Upgrade runtime along with CLI on `spice upgrade` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3341>
- Update upcoming Roadmap by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3343>
- fix: Prevent acceleration files outside of working directory by @peasee in <https://github.com/spiceai/spiceai/pull/3340>
- Document S3 connector limitations by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3333>
- Update Object Store Patch by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3361>
- Promote SQLite Data Accelerator to Beta by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3365>
- Promote S3 connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3362>
- Revert "fix: Only allow env files from the current dir" by @peasee in <https://github.com/spiceai/spiceai/pull/3368>
- docs: Fix typo for S3 release status in README.md by @peasee in <https://github.com/spiceai/spiceai/pull/3370>
- Include unnecessary columns pruning step during federated plan creation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3363>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta>


Spice v0.18-beta (Sep 16, 2024)

September 16, 2024 · 7 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v0.18-beta.

The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.

Highlights in v0.18.0-beta

Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint). The CLI also includes a new spice login sharepoint command to aid in local development and testing.

Example spicepod.yml:

datasets:
  - from: sharepoint:drive:Documents/path:/important_documents/
    name: important_documents
    params:
      sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
      sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
      sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}

See the Sharepoint Data Connector documentation.

AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the S3 Data Connector configures the authentication method used when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.

Example spicepod.yml:

datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: iam_role # Assume IAM role of instance

See the S3 Data Connector documentation.
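
For comparison, a sketch of the key method is shown below. It assumes the access key and secret are passed through s3_key and s3_secret parameters and stored under hypothetical secret names. These names are not stated in the release notes, so confirm them against the S3 Data Connector documentation.

```yaml
datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: key # Authenticate with an explicit access key pair
      # Assumed parameter and secret names, for illustration only.
      s3_key: ${secrets:SPICE_S3_KEY}
      s3_secret: ${secrets:SPICE_S3_SECRET}
```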

File Data Connector: Use from: file: to query files stored on locally accessible filesystems.

Example spicepod.yml:

datasets:
  - from: file://path/to/customer.parquet
    name: customer
    params:
      file_format: parquet

See the File Data Connector documentation.

Improved /ready API: The /ready endpoint now accounts for the initial data load of accelerated datasets in addition to component readiness, so readiness is reported only when data has loaded and can be queried successfully.

Breaking Changes

GitHub Data Connector: The data type for time-related columns has changed from Utf8 to Timestamp. To upgrade, change data type references to timestamp. For example, if using time_format:, change uses of time_format: ISO8601 to time_format: timestamp.
Ready API: The /ready API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.
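
For Kubernetes deployments, the stricter /ready behavior mainly affects readiness probes. A minimal sketch of a container readiness probe follows. The port (8090), the container name, and the timing values are assumptions to adjust per deployment; datasets with a long initial load may need a larger failureThreshold or a separate startup probe.

```yaml
# Fragment of a Pod spec; only the readiness probe is shown.
containers:
  - name: spiceai # hypothetical container name
    image: spiceai/spiceai
    readinessProbe:
      httpGet:
        path: /ready
        port: 8090 # assumed HTTP port of the runtime
      initialDelaySeconds: 5
      periodSeconds: 10
      # Allow time for accelerated datasets to finish their initial load.
      failureThreshold: 30
```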

Dependencies

No major dependency updates.

Contributors

@phillipleblanc
@Jeadie
@lukekim
@sgrebnov
@peasee
@eltociear
@Sevenannn
@ewgenius
@karifabri

New Contributors

@karifabri made their first contribution in https://github.com/spiceai/spiceai/pull/2601

What's Changed

- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
- Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
- Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
- Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
- Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
- spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
- Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
- fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
- Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
- Fix refresh API using `refresh_mode: append` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
- Tweak `/ready` to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
- Add `s3_auth` parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
- Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
- fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
- Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
- Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
- Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
- Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
- Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
- fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
- Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
- Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
- Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
- Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
- Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
- Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
- Improve `/ready` to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
- Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
- Fix `error decoding response body` GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
- GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
- docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
- Define GitHub `issues` data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
- Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
- Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
- Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
- Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
- Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
- Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
- Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
- Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
- Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
- Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
- Improve `spill_to_disk_and_rehydration` integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
- Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
- Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
- Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
- Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta


Spice.ai v0.10-alpha

March 27, 2024 · 2 min read

Phillip LeBlanc

Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.10-alpha! 🧙‍♂️

The Spice.ai v0.10-alpha release focused on additions and updates to improve stability, usability, and the overall Spice developer experience.

Highlights in v0.10-alpha

Public Bucket Support for S3 Data Connector: The S3 Data Connector now supports public buckets in addition to buckets requiring an access id and key.

JDBC-Client Connectivity: Improved connectivity for JDBC clients, like Tableau.

User Experience Improvements:

Friendlier error messages across the board to make debugging and development better.
Added a spice login postgres command, streamlining the process for connecting to PostgreSQL databases.
Added PostgreSQL connection verification and connection string support, enhancing usability for PostgreSQL users.

Grafana Dashboard: A standard Grafana dashboard is now available, making it easier to monitor Spice deployments.

Contributors

@phillipleblanc
@mitchdevenport
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim
@digadeesh

New in this release

Fixes: Gracefully handle Arrow Flight DoExchange connection resets
Adds: Grafana Dashboard
Adds: Flight SQL CommandGetTableTypes Command support (improves JDBC-client connectivity)
Adds: Friendlier error messages
Adds: spice login postgres command
Adds: PostgreSQL connection verification
Adds: PostgreSQL connection string support
Adds: Linux aarch64 build
Updates: Improves spice status with dataset metrics
Updates: CLI REPL improved show tables output
Updates: CLI REPL limit output to 500 rows
Updates: Improved README.md with architecture diagram updates
Updates: Improved CI run time
Updates: Use macOS hosted Actions runner

