12 posts tagged with "duckdb"

DuckDB database topics and usage

Spice v1.0.7 (Mar 26, 2025)

· 3 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.0.7 🏎️

Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.

Highlights in v1.0.7

  • DuckDB Memory Usage: Memory usage when using DuckDB has been significantly improved for data loads and refreshes through expanded use of zero-copy Arrow and multi-threaded data loading. When a duckdb_memory_limit is specified, disk spilling has been improved for larger-than-memory workloads. In addition, a new temp_directory runtime parameter supports storing temporary files in a location other than that of the DuckDB data file for higher throughput. For example, temp_directory could be set to a high-IOPS io2 EBS volume that is separate from the volume holding the duckdb_file_path.

    Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.

    For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.

  • Schema Inference Performance for Object-Store Data Connectors: Schema inference for object-store based data connectors is now faster, especially for large numbers of objects (1M+), because object listing and selection have been made more efficient.

Performance

When compared to previous versions, Spice v1.0.7 loads DuckDB accelerated datasets significantly faster. When using the TPCH lineitem dataset at Scale Factor 100 (600M rows):

Without Indexes

5x faster, 28% less memory usage.

| Version | Load Time | Peak Memory Usage |
|---------|-----------|-------------------|
| v1.0.6  | 16m 3s    | 32GB              |
| v1.0.7  | 3m 149ms  | 24.4GB            |

With Indexes

2.5x faster. Higher memory usage in v1.0.7 is due to better resource utilization to achieve faster load times. Use the duckdb_memory_limit parameter to control memory usage.

| Version | Load Time | Peak Memory Usage |
|---------|-----------|-------------------|
| v1.0.6  | 27m 9s    | 50GB              |
| v1.0.7  | 11m 30s   | 77GB              |
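
The headline ratios can be sanity-checked against the load times and memory figures reported for each version. The sketch below is a hypothetical helper (none of these function names are part of Spice) that converts the printed load times to seconds and computes the speedup and the change in peak memory. The headline figures were presumably derived from unrounded measurements, so values computed from the rounded numbers need not match them exactly.

```python
import re

# Seconds per unit for the duration strings used in the load-time figures.
UNIT_SECONDS = {"m": 60.0, "s": 1.0, "ms": 0.001}


def to_seconds(duration: str) -> float:
    """Convert a duration such as '16m 3s' or '3m 149ms' to seconds."""
    total = 0.0
    # 'ms' is listed before 'm' and 's' so that '149ms' is not read as minutes.
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)\s*(ms|m|s)", duration):
        total += float(value) * UNIT_SECONDS[unit]
    return total


def speedup(old: str, new: str) -> float:
    """How many times faster the new load time is than the old one."""
    return to_seconds(old) / to_seconds(new)


def memory_change_pct(old_gb: float, new_gb: float) -> float:
    """Percentage change in peak memory (negative means less memory)."""
    return (new_gb - old_gb) / old_gb * 100.0


if __name__ == "__main__":
    print(f"without indexes: {speedup('16m 3s', '3m 149ms'):.2f}x, "
          f"{memory_change_pct(32, 24.4):+.1f}% peak memory")
    print(f"with indexes:    {speedup('27m 9s', '11m 30s'):.2f}x, "
          f"{memory_change_pct(50, 77):+.1f}% peak memory")
```

With the values as printed, this gives roughly 5.3x and about 24% less memory without indexes, and roughly 2.4x with about 54% more memory with indexes.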

Documentation

  • DuckDB Data Accelerator: Has been expanded with additional resource usage guidance.
  • Memory: A new section for memory considerations has been added to the Reference section.

Contributors

  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

Breaking Changes

No breaking changes.

Upgrading

To upgrade to v1.0.7, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.7 image:

docker pull spiceai/spiceai:1.0.7

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

- fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
- Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
- Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
- Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
- Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
- Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
- Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
- Add `temp_directory` runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
- Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7

Spice v1.0.6 (Mar 17, 2025)

· 3 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0.6 ⚡

Spice v1.0.6 improves stability for DuckDB acceleration, improves the Iceberg Data and Catalog connectors when using AWS Glue, and fixes an issue with the ready_state: on_registration federation fallback when using DuckDB. In addition, redundant data refreshes on startup are avoided for accelerations with persistent data.

Highlights in v1.0.6

  • Iceberg Data/Catalog Connector Improvements: Improves Iceberg data & catalog connector reliability, including bug fixes for AWS Glue API rate-limiting and compatibility, REST API pagination support, explicit AWS credential handling, and support for AWS STS role assumption.

  • Fixes On-Registration Fallback when using DuckDB: Previously, when using DuckDB as a data accelerator with the ready_state: on_registration configuration, queries made during the initial data refresh did not properly fall back to the federated source. This is now fixed.

  • DuckDB downgraded for Stability: DuckDB has been downgraded to v1.1.3 due to a regression in memory handling tracked by duckdb/duckdb issue #16640. Once resolved and validated, Spice will re-upgrade to v1.2.x.

  • Expanded Integration Tests: Additional integration tests covering federated accelerator behavior and graceful shutdown processes have been added.

  • Optimized Data Refresh for Persistent Accelerations: Changed behavior in v1.0.6. When using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. This ensures efficient startup behavior by avoiding unnecessary refreshes. This logic applies only to full refreshes when no refresh interval is specified.

To maintain the previous behavior and always refresh on every startup, set:

acceleration:
  refresh_on_startup: always
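
The startup rule described above can be summarised as a small decision function. This is a sketch of the documented behavior only, not Spice's implementation: the function and argument names are hypothetical, and the fallback for cases the release notes do not cover (for example, when a refresh interval is set) is an assumption that the previous behavior of refreshing at startup still applies.

```python
from typing import Optional


def refreshes_at_startup(
    file_mode: bool,
    has_accelerated_data: bool,
    refresh_interval: Optional[str] = None,
    refresh_mode: str = "full",
    refresh_on_startup: Optional[str] = None,
) -> bool:
    """Return True if a dataset would be refreshed when the runtime starts."""
    # Explicit opt-in to the pre-v1.0.6 behavior.
    if refresh_on_startup == "always":
        return True
    # v1.0.6 rule: persistent (file-mode) acceleration, full refresh, and
    # no refresh interval -> refresh only when nothing was accelerated before.
    if file_mode and refresh_mode == "full" and refresh_interval is None:
        return not has_accelerated_data
    # Assumption: all other configurations keep refreshing at startup.
    return True


if __name__ == "__main__":
    # A restart with existing persisted data skips the redundant refresh...
    print(refreshes_at_startup(file_mode=True, has_accelerated_data=True))
    # ...unless refresh_on_startup: always is set.
    print(refreshes_at_startup(file_mode=True, has_accelerated_data=True,
                               refresh_on_startup="always"))
```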

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim
  • @Sevenannn

Breaking Changes

Starting from v1.0.6, when using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. To maintain the previous behavior and always refresh on every startup, set:

acceleration:
  refresh_on_startup: always

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.6, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.6 image:

docker pull spiceai/spiceai:1.0.6

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • Implement proper ready_state: on_registration for federation enabled accelerators by @phillipleblanc in #5019
  • Add indexes and primary keys mismatch detection for DuckDB Acceleration by @sgrebnov in #5045
  • Add comprehensive integration tests for the ready_state behavior by @phillipleblanc in #5042
  • Add test Spicepod for acceleration with constraints by @sgrebnov in #4891
  • Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in #4898
  • Add DuckDB graceful shutdown test to E2E CI tests by @sgrebnov in #5047
  • Update duckdb_append_with_pk_and_indexes.yaml (work for duckdb 1.1.x) by @sgrebnov in #5067
  • fix: Downgrade to DuckDB 1.1.3 by @peasee in #5055
  • fix: Acceleration federation integration test by @peasee in #5070
  • Improvements to Iceberg Catalog/Data Connector by @phillipleblanc in #5071
  • Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in #4809
  • fix: Spice.ai schema inference by @peasee in #4674
  • Add refresh_on_startup Spicepod configuration param by @phillipleblanc and @sgrebnov in #5086
  • Test restart behavior of DuckDB file acceleration against glue iceberg table by @Sevenannn in #5075
  • Run Iceberg Data Connector - DuckDB File mode integration test by @Sevenannn in #5069
  • Integration test for glue iceberg catalog by @Sevenannn in #5077

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.5...v1.0.6

Spice v1.0.5 (Mar 11, 2025)

· 3 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0.5 🧊

Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, which complements the existing Iceberg Catalog Connector. The new connector allows datasets to be created and configured directly for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.

Performance improvements include object-store optimized Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.
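
As a rough illustration of metadata-based pruning (not Spice's actual implementation), the sketch below decides which Parquet objects to read using only information available from an object listing: a Hive-style partition value parsed from the path, and the object's last-modified timestamp compared against the time of the previous append. The `ObjectMeta` type, the `date` partition key, and the watermark logic are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical listing entry: object path plus store-provided metadata."""
    path: str
    last_modified: datetime


def partition_value(path: str, key: str) -> Optional[str]:
    """Return the value of a Hive-style `key=value` path segment, if present."""
    prefix = key + "="
    for segment in path.split("/"):
        if segment.startswith(prefix):
            return segment[len(prefix):]
    return None


def prune(objects: List[ObjectMeta], watermark: datetime,
          partition_key: str, min_partition: str) -> List[ObjectMeta]:
    """Keep only objects that may contain rows newer than the last append."""
    kept = []
    for obj in objects:
        value = partition_value(obj.path, partition_key)
        # Partition pruning: the path alone rules the file out.
        if value is not None and value < min_partition:
            continue
        # Metadata pruning: unchanged since the last append, so skip it
        # without opening the file.
        if obj.last_modified <= watermark:
            continue
        kept.append(obj)
    return kept


if __name__ == "__main__":
    utc = timezone.utc
    listing = [
        ObjectMeta("data/date=2025-03-01/a.parquet", datetime(2025, 3, 1, tzinfo=utc)),
        ObjectMeta("data/date=2025-03-10/b.parquet", datetime(2025, 3, 10, tzinfo=utc)),
        ObjectMeta("data/date=2025-03-11/c.parquet", datetime(2025, 3, 11, tzinfo=utc)),
    ]
    last_append = datetime(2025, 3, 10, 12, tzinfo=utc)
    for obj in prune(listing, last_append, "date", "2025-03-05"):
        print(obj.path)
```

Both checks rely only on listing metadata, which is why they can reduce the number of files that need to be opened at all.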

DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.

Additional updates include support for the Arrow Map type.

Highlights in v1.0.5

  • New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.

    Example usage in spicepod.yaml:

    datasets:
      - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
        name: my_table
        params:
          # Same as Iceberg Catalog Connector
        acceleration:
          enabled: true
    For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.

  • Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.

  • DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for map and list_reduce. Graceful shutdown of DuckDB has been improved for better stability across restarts.

  • Configurable DuckDB memory limit: Use the duckdb_memory_limit parameter to set the DuckDB acceleration memory limit:

    - from: spice.ai:path.to.my_dataset
      name: my_dataset
      acceleration:
        params:
          duckdb_memory_limit: '2GB'
        enabled: true
        engine: duckdb
        mode: file

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim

Breaking Changes

Upgrading

To upgrade to v1.0.5, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.5 image:

docker pull spiceai/spiceai:1.0.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • fix: Update OpenAI model health check by @peasee in #4849
  • fix: Allow metrics endpoint setting in CLI by @peasee in #4939
  • DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
  • Introduce runtime shutdown state by @sgrebnov in #4917
  • Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
  • Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
  • DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
  • DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
  • Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
  • Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
  • Implement an Iceberg Data Connector by @phillipleblanc in #4941
  • Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
  • Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
  • Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
  • Add Iceberg dataset integration test by @phillipleblanc in #4950

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5

Spice v1.0-stable (Jan 20, 2025)

· 8 min read
William Croxson
Senior Software Engineer at Spice AI

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Breaking Changes

  • Default Runtime Version: The CLI will install the GPU accelerated AI-capable Runtime by default (if supported), when running spice install or spice run. To force-install the non-GPU version, run spice install ai --cpu.

  • Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.

  • Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.

  • Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.

  • Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
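
The insecure-endpoint rule above can be pictured as a simple scheme check. The following is a minimal sketch of that rule, assuming it is applied to the endpoint URL's scheme; the function name and error message are hypothetical and are not taken from the Spice codebase.

```python
from urllib.parse import urlparse


def validate_endpoint(endpoint: str, allow_http: bool = False) -> str:
    """Reject plain-HTTP endpoints unless allow_http is explicitly enabled."""
    scheme = urlparse(endpoint).scheme.lower()
    if scheme == "http" and not allow_http:
        raise ValueError(
            f"insecure endpoint {endpoint!r}: enable allow_http to permit HTTP"
        )
    return endpoint


if __name__ == "__main__":
    # HTTPS endpoints pass unchanged.
    print(validate_endpoint("https://s3.example.com"))
    # An HTTP endpoint (for example a local test server) needs the opt-in.
    print(validate_endpoint("http://localhost:9000", allow_http=True))
```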

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.0 image:

docker pull spiceai/spiceai:1.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

Contributors

  • @peasee
  • @ewgenius
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @phillipleblanc
  • @sgrebnov

What's Changed

- feat: Update load test criteria, testoperator updates by @peasee in <https://github.com/spiceai/spiceai/pull/4311>
- Update helm for v1.0.0-rc.5 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4313>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4318>
- Bump version to v1.0.0, update SECURITY.md by @ewgenius in <https://github.com/spiceai/spiceai/pull/4314>
- Initial criteria for models, embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4223>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4321>
- Add dremio param for running load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4315>
- Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4328>
- Handle failed query in load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4327>
- feat: Use load test hours for baseline query sets by @peasee in <https://github.com/spiceai/spiceai/pull/4334>
- Fix typo in 1.0.0-rc.5 release notes by @ewgenius in <https://github.com/spiceai/spiceai/pull/4329>
- feat: add testoperator data consistency by @peasee in <https://github.com/spiceai/spiceai/pull/4319>
- docs: Release DuckDB connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4335>
- Fix DocumentDB -> DynamoDB by @lukekim in <https://github.com/spiceai/spiceai/pull/4339>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4337>
- fix: Download hits.parquet from MinIO for benchmark by @peasee in <https://github.com/spiceai/spiceai/pull/4338>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4341>
- Remove evil averages by @lukekim in <https://github.com/spiceai/spiceai/pull/4343>
- Don't run builds on non-code changes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4344>
- Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4345>
- Update s3 tpcds spicepods by @ewgenius in <https://github.com/spiceai/spiceai/pull/4346>
- Explicitly set required scale factor for throughput and load tests by @ewgenius in <https://github.com/spiceai/spiceai/pull/4347>
- Fix s3 tpcds dataset name by @ewgenius in <https://github.com/spiceai/spiceai/pull/4348>
- Promote Iceberg Catalog Connector to Beta by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4350>
- Update s3 clickbench benchmark snapshots by @ewgenius in <https://github.com/spiceai/spiceai/pull/4351>
- fix: DuckDB clickbench on zero results by @peasee in <https://github.com/spiceai/spiceai/pull/4349>
- Add integration test with snapshots for databricks catalog connector by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4353>
- refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in <https://github.com/spiceai/spiceai/pull/4354>
- Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4297>
- docs: Update roadmap by @peasee in <https://github.com/spiceai/spiceai/pull/4364>
- feat: Release accelerators stable by @peasee in <https://github.com/spiceai/spiceai/pull/4361>
- Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4365>
- Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the `allow_http` parameter by @ewgenius in <https://github.com/spiceai/spiceai/pull/4363>
- Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4367>
- Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4369>
- Update Roadmap - Iceberg beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4373>
- Build CUDA binaries for Linux by @Jeadie in <https://github.com/spiceai/spiceai/pull/4320>
- Promote Nvidia NIM as Alpha by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4380>
- Promote xai to alpha by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4381>
- Update stable criteria for object store based connectors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4383>
- Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in <https://github.com/spiceai/spiceai/pull/4382>
- Promote S3 Data Connector to Stable by @ewgenius in <https://github.com/spiceai/spiceai/pull/4385>
- Download platform-supported CUDA binary version on Linux by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4356>
- Fix http consistency test workflow, add overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4387>
- feat: Add Postgres test spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/4388>
- Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in `crates/llms/tests`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4377>
- Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4389>
- Fix http overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4390>
- Tweak model tests, fix embedding input by @ewgenius in <https://github.com/spiceai/spiceai/pull/4391>
- Promote Dremio to Stable quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4392>
- Add beta functionality tests for embedding models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4352>
- docs: Release postgres connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4398>
- Increase timeout for model response in E2E tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4399>
- Disable ident normalization (i.e. `SELECT MyColumn from table` works) by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4400>
- Preserve schema metadata by @ewgenius in <https://github.com/spiceai/spiceai/pull/4402>
- Make models integration tests tracing less verbose by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4403>
- Fix `cuda` feature build on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4404>
- Promote MySQL to Stable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4406>
- docs: Release Delta Lake and Unity catalog by @peasee in <https://github.com/spiceai/spiceai/pull/4405>
- Use `gpt-4o-mini` as a default model for openai provider by @ewgenius in <https://github.com/spiceai/spiceai/pull/4410>
- Fix streaming for Openai and Anthropic by @Jeadie in <https://github.com/spiceai/spiceai/pull/4409>
- Tweak model loading and missing tool errors messages by @ewgenius in <https://github.com/spiceai/spiceai/pull/4412>
- Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4407>
- Build Windows CUDA binaries as part of `build_and_release` workflow by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4386>
- Update docs link by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4416>
- feat: Add CPU models install escape hatch by @peasee in <https://github.com/spiceai/spiceai/pull/4419>
- Handle OpenAI API Errors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4417>
- Update spice cli to use `GH_TOKEN` or `GITHUB_TOKEN` env variables when calling releases api by @ewgenius in <https://github.com/spiceai/spiceai/pull/4175>
- Implement secure sandboxing for Docker image by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4411>
- Automatically install supported CUDA binary on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4420>
- Metrics for LLMs+ embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4418>
- Jeadie/25 01 17/beta perf by @Jeadie in <https://github.com/spiceai/spiceai/pull/4397>
- Pass GitHub token to all CI steps calling spice run by @ewgenius in <https://github.com/spiceai/spiceai/pull/4423>
- Run the models integration tests on PRs by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4421>
- Run CUDA builds in a separate workflow by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4430>
- Promote OpenAI models and embeddings providers to RC by @ewgenius in <https://github.com/spiceai/spiceai/pull/4432>
- Update link to retrieval-augmented generation (RAG) details by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4433>
- Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4436>
- Update quickstart traces to match current version by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4435>
- Update Supported Embeddings Providers Readme section by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4434>
- Local models can stream tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4429>
- fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in <https://github.com/spiceai/spiceai/pull/4442>
- Fix run query action by @ewgenius in <https://github.com/spiceai/spiceai/pull/4444>
- Default to AI-enabled runtime for `spice run`/`spice install` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4443>
- Change no spicepod.yaml log to warning by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4447>
- refactor: Update Catalog Connector error messages by @peasee in <https://github.com/spiceai/spiceai/pull/4441>
- Fix panic when converting OTel metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4449>
- refactor: Update model errors by @peasee in <https://github.com/spiceai/spiceai/pull/4446>
- Update spiceai/mistral.rs to silence metadata logs by @ewgenius in <https://github.com/spiceai/spiceai/pull/4452>
- fix xAI; don't use openai defaults by @Jeadie in <https://github.com/spiceai/spiceai/pull/4450>
- Improves the UX of using huggingface models by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4451>
- Add GH Workflow to test `spice ai` runtime installation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4448>
- fix: Use specific model errors where available by @peasee in <https://github.com/spiceai/spiceai/pull/4454>
- Detect and report unsupported embedding column type during dataset registration by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4456>
- Handle Errors by @Jeadie in <https://github.com/spiceai/spiceai/pull/4455>
- Catch and report negative openai_temperature error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4453>
- Clarify release check error message if it is caused by wrong GH token by @ewgenius in <https://github.com/spiceai/spiceai/pull/4458>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0>

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v1.0-rc.3 (Dec 30, 2024)

· 7 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice v1.0-rc.3 🧊

Spice v1.0.0-rc.3 is the third release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness and includes new Iceberg Catalog APIs, DuckDB improvements, and a new Iceberg Catalog Connector.

Highlights in v1.0-rc.3

  • Iceberg Catalog APIs: Spice now functions as an Iceberg Catalog provider, implementing a core subset of the Iceberg Catalog APIs. This lets Iceberg Catalog clients natively discover datasets and schemas through Spice APIs.

    • GET /v1/namespaces - List all catalogs registered in Spice.

    • GET /v1/namespaces?parent=catalog - List schemas registered under a given catalog.

    • GET /v1/namespaces/:catalog_schema/tables - List tables registered under a given schema.

    • GET /v1/namespaces/:catalog_schema/tables/:table - Get the schema of a given table.

  • Iceberg Catalog Connector: The Iceberg Catalog Connector is a new integration to discover and query datasets from a remote Iceberg Catalog.

Example connecting to a remote Iceberg Catalog with tables stored in S3:

catalogs:
  - from: iceberg:https://my-iceberg-catalog.com/v1/namespaces
    name: ice
    params:
      iceberg_s3_access_key_id: ${secrets:ICEBERG_S3_ACCESS_KEY_ID}
      iceberg_s3_secret_access_key: ${secrets:ICEBERG_S3_SECRET_ACCESS_KEY}
      iceberg_s3_region: us-east-1

View the Iceberg Catalog Connector documentation for more details.

  • DuckDB Improvements: Added cosine_distance support for DuckDB-backed vector search, improved unnest nested type handling for array_element and lists, and optimized query performance.

  • SQLite Data Accelerator: Graduated to Release Candidate (RC).

  • File Data Accelerator: Graduated to Release Candidate (RC).
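
For reference, cosine distance is conventionally defined as one minus the cosine similarity of two vectors. The sketch below computes it in plain Python under that conventional definition; it illustrates the metric only and is not taken from Spice's DuckDB dialect.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(angle between a and b); 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Orthogonal vectors are at distance 1; opposite vectors at distance 2.
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
    print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))
```

In a vector search, smaller distances indicate embeddings that point in more similar directions, regardless of their magnitudes.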

Breaking changes

  • API: v1/datasets/sample has been removed because it is not particularly useful and can be replicated via SQL or via the tools endpoint POST v1/tools/:name.

Cookbook

  • New Language Model Evals Recipe showing how to measure the performance of a language model using LLM-as-Judge, configured entirely in the Spice runtime.

  • New Iceberg Catalog Recipe showing how to use Spice to query Iceberg tables from an Iceberg catalog.

Dependencies

  • OpenTelemetry: Upgraded from 0.26.0 to 0.27.1
  • Go: Upgraded from 1.22 to 1.23 (CLI)

Contributors

  • @sgrebnov
  • @phillipleblanc
  • @peasee
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @ewgenius

What's Changed

- Add CI configuration for search benchmark dataset access by @sgrebnov in https://github.com/spiceai/spiceai/pull/3888
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/3895
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3896
- chore: Update helm chart for RC.2 by @peasee in https://github.com/spiceai/spiceai/pull/3899
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3903
- chore: Update MacOS test release install to macos-13 by @peasee in https://github.com/spiceai/spiceai/pull/3901
- Add usage to `spice chat` and fix `v1/models?status=true`. by @Jeadie in https://github.com/spiceai/spiceai/pull/3898
- chore: Bump versions for rc3 by @peasee in https://github.com/spiceai/spiceai/pull/3902
- docs: Update endgame with a step to verify dependencies in release notes by @peasee in https://github.com/spiceai/spiceai/pull/3897
- Ensure eval dataset input and ouput of correct length by @Jeadie in https://github.com/spiceai/spiceai/pull/3900
- `spice add/connect/dataset configure` should update spicepod, not overwrite it & upgrade to Go 1.23 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3905
- Bump opentelemetry from 0.26.0 to 0.27.1 by @dependabot in https://github.com/spiceai/spiceai/pull/3879
- Ensure trace_id is overridden for prior written spans by @Jeadie in https://github.com/spiceai/spiceai/pull/3906
- add 'role': 'assistant' for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3910
- Run tpcds benchmark for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3924
- Update to reference cookbook instead of quickstarts/samples by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3928
- Fix/remove flaky integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3930
- Implement `/v1/iceberg/namespaces` & `/v1/iceberg/config` APIs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3923
- Add script for creating tpcds parquet files and spicepod for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3931
- Use `utoipa` to generate openapi.json and swagger for dev by @Jeadie in https://github.com/spiceai/spiceai/pull/3927
- `fuzzy_match`, `json_match`, `includes` scorer by @Jeadie in https://github.com/spiceai/spiceai/pull/3926
- Implement `/v1/iceberg/namespaces/:namespace` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3933
- Implement `GET /v1/iceberg/namespaces/:namespace/tables` API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3934
- Add custom Spice DuckDB dialect with cosine_distance support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3938
- Fix NSQL error: `all columns in a record batch must have the same length` by @sgrebnov in https://github.com/spiceai/spiceai/pull/3947
- Don't include tools use in hf test model by @Jeadie in https://github.com/spiceai/spiceai/pull/3955
- Implement `GET /v1/namespaces/{namespace}/tables/{table}` API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3940
- Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3967
- DuckDB: add support for nested types in Lists by @sgrebnov in https://github.com/spiceai/spiceai/pull/3961
- Add script to set up clickbench for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3945
- docs: Add connector stable criteria by @peasee in https://github.com/spiceai/spiceai/pull/3908
- Update Roadmp Dec 23, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/3978
- Improve CI testing for OpenAPI, new tool `spiceschema`, fix broken OpenAPI stuff. by @Jeadie in https://github.com/spiceai/spiceai/pull/3948
- remove `v1/datasets/sample` by @Jeadie in https://github.com/spiceai/spiceai/pull/3981
- feat: add SQLite ClickBench benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3975
- Remove feature 'llms/mistralrs' by @Jeadie in https://github.com/spiceai/spiceai/pull/3984
- Add support for 'params.spice_tools: nsql' by @Jeadie in https://github.com/spiceai/spiceai/pull/3985
- Fix integration tests - add missing `format` query parameter in /v1/status requests by @ewgenius in https://github.com/spiceai/spiceai/pull/3989
- Enhance AI tools sampling logic for robust handling of large fields by @sgrebnov in https://github.com/spiceai/spiceai/pull/3959
- Fix subquery federation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3991
- Fix unnest and add DuckDB support for `array_element` by @sgrebnov in https://github.com/spiceai/spiceai/pull/3995
- Add score value snapshotting to vector similarity search tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3996
- Use Llama-3.2-3B-Instruct for Hugging Face integration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/3992
- Simplify `construct_chunk_query_sql` for DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/3988
- Update TPCH and TPCDS benchmarks for spice.ai connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3982
- Correctly pass Hugging Face token in models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3997
- Fix: `on_zero_results` causes `TransactionContext Error: Catalog write-write conflict on create with "attachment_0"` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3998
- Add DuckDB acceleration to search benchmarks by @sgrebnov in https://github.com/spiceai/spiceai/pull/4000
- Enable Postgres write via non-default `postgres-write` feature flag by @sgrebnov in https://github.com/spiceai/spiceai/pull/4004
- Allow search benchmark to write test results by @sgrebnov in https://github.com/spiceai/spiceai/pull/4008
- Make Flight DoPut atomic and commit write only on successful stream completion by @sgrebnov in https://github.com/spiceai/spiceai/pull/4002
- Create a `CatalogConnector` abstraction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4003
- Fix `generate-openapi.yml` and add `.schema/openapi.json`. by @Jeadie in https://github.com/spiceai/spiceai/pull/3983
- Enable spice.ai tpcds bench workflow. Comment failing tpch queries. by @ewgenius in https://github.com/spiceai/spiceai/pull/4001
- feat: Add SQLite ClickBench overrides by @peasee in https://github.com/spiceai/spiceai/pull/4016
- Implement Iceberg Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4053
- feat: Datafusion updates for SQLite fixes and release by @peasee in https://github.com/spiceai/spiceai/pull/4054
- docs: Add accelerator stable release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4017
- Add dremio tpch / tpcds benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4063
- Update docs, and make PR to `spiceai/docs` for new `openapi.json`. by @Jeadie in https://github.com/spiceai/spiceai/pull/4019
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4065
- Fix dremio subquery rewrite by @Sevenannn in https://github.com/spiceai/spiceai/pull/4064
- Update generate-openapi.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4073
- docs: Add catalog criteria by @peasee in https://github.com/spiceai/spiceai/pull/4052
- fix `distinct_columns` in auto/nsql tool groups by @Jeadie in https://github.com/spiceai/spiceai/pull/4074
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4075
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4076
- Implement window_func_support_window_frame from DremioDialect by @Sevenannn in https://github.com/spiceai/spiceai/pull/4012
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4079
- Promote file connector to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4080
- Add Iceberg to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4085
- Fix '/v1/status' default format by @Jeadie in https://github.com/spiceai/spiceai/pull/4081

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.2...v1.0.0-rc.3

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.