Skip to main content

Spice v0.18.1-beta (Sep 23, 2024)

ยท 6 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.18.1-beta. ๐ŸŽ๏ธ

The v0.18.1-beta release continues to improve runtime performance and reliability. Performance for accelerated queries joining multiple datasets has been significantly improved with join push-down support. GraphQL, MySQL, and SharePoint data connectors have better reliability and error handling, and a new Microsoft SQL Server data connector has been introduced. Task History now has fine-grained configuration, including the ability to disable the feature entirely. A new spice search CLI command has been added, enabling development-time embeddings-based searches across datasets.

Highlights in v0.18.1-betaโ€‹

Join push-down for accelerations: Queries to the same accelerator will now push-down joins, significantly improving acceleration performance for queries joining multiple tables.

Microsoft SQL Server Data Connector: Use from: mssql: to access and accelerate Microsoft SQL Server datasets.

Example spicepod.yml:

datasets:
- from: mssql:path.to.my_dataset
name: my_dataset
params:
mssql_connection_string: ${secrets:mssql_connection_string}

See the Microsoft SQL Server Data Connector documentation.

Task History: Task History can be configured in the spicepod.yml, including the ability to include, or truncate outputs such as the results of a SQL query.

Example spicepod.yml:

runtime:
task_history:
enabled: true
captured_output: truncated
retention_period: 8h
retention_check_interval: 15m

See the Task History Spicepod reference for more information on possible values and behaviors.

Search CLI Command Use the spice search CLI command to perform embeddings-based searches across search configure datasets. Note: Search requires the ai feature to be installed.

Refresh on File Changes: File Data Connector data refreshes can be configured to be triggered when the source file is modified through a file system watcher. Enable the watcher by adding file_watcher: enabled to the acceleration parameters.

Example spicepod.yml:

datasets:
- from: file://path/to/my_file.csv
name: my_file
acceleration:
enabled: true
refresh_mode: full
params:
file_watcher: enabled

Breaking Changesโ€‹

The Query History table runtime.query_history has been deprecated and removed in favor of the Task History table runtime.task_history. The Task History table tracks tasks across all features such as SQL query, vector search, and AI completion in a unified table.

See the Task History documentation.

Dependenciesโ€‹

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @Sevenannn
  • @ewgenius
  • @slyons

New Contributorsโ€‹

What's Changedโ€‹

- Update Helm Chart for 0.18.0-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2711
- Use a single instance for all DuckDB accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2669
- Dependabot upgrades by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2715
- Use a single instance for all SQLite accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2720
- Prepare for v0.18.1-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2692
- For GraphQL, remove necessity of `json_pointer` and improve error messaging. by @Jeadie in https://github.com/spiceai/spiceai/pull/2713
- Postgres accelerator benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2652
- Trace query result while running benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2684
- Early check EmbeddingConnector if embedding models do not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/2717
- Move table creation for spice_sys_dataset_checkpoint to DatasetCheckpoint::try_new by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2732
- Don't load tools immediately by @Jeadie in https://github.com/spiceai/spiceai/pull/2731
- Renable accelerator federation on trunk by @Sevenannn in https://github.com/spiceai/spiceai/pull/2725
- Fixing Data Connectors link in README.md by @slyons in https://github.com/spiceai/spiceai/pull/2724
- Enable rehydration tests for DuckDB by @sgrebnov in https://github.com/spiceai/spiceai/pull/2729
- Check pageInfo is correct at initialisation of GraphQL connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2730
- Microsoft SQL Server data connector initial support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2741
- Add `spice search` CLI command by @lukekim in https://github.com/spiceai/spiceai/pull/2739
- Update threat model by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2738
- Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2744
- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2747
- feat: Add enabled config option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2758
- Remove v0.18.0-beta from the Roadmap by @sgrebnov in https://github.com/spiceai/spiceai/pull/2748
- Fix spark-connect to use native roots for TLS again by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2766
- Fix benchmark test - Install default crypto provider by @Sevenannn in https://github.com/spiceai/spiceai/pull/2752
- Resolve primary keys for datasets with catalog or schema by @Jeadie in https://github.com/spiceai/spiceai/pull/2749
- MSSQL: include table name in schema retrieval error by @sgrebnov in https://github.com/spiceai/spiceai/pull/2746
- File Format parsing for Document tables, support for docx + pdf by @Jeadie in https://github.com/spiceai/spiceai/pull/2740
- Add Document parsing to Sharepoint connector. by @Jeadie in https://github.com/spiceai/spiceai/pull/2760
- Execution plan with BinaryExpr predicates pushdown support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2768
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2772
- Support for standalone config parameters for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2773
- Utilize DataConnectorError for MySQL Data Connector Errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2759
- Add Score to search results by @lukekim in https://github.com/spiceai/spiceai/pull/2774
- Don't call GetComponentStatuses when --metrics not enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/2779
- Implement better error handling for spicepods by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2767
- Make integration tests more robust by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2782
- Query results streaming support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2781
- Update benchmark snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/2778
- For Sharepoint connector, if client_secret and auth_code are both provided, default to auth_code by @Jeadie in https://github.com/spiceai/spiceai/pull/2780
- Add modified pk/indexes scenario to rehydration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2743
- Run benchmarks on Wed, Fri, Sat, and Sun. by @lukekim in https://github.com/spiceai/spiceai/pull/2786
- Update PULL_REQUEST_TEMPLATE.md to include a section for Documentation by @slyons in https://github.com/spiceai/spiceai/pull/2785
- Add E2E test for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2788
- More types support for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2789
- feat: Add captured_output option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2783
- Add ability to refresh when file data connector detects changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2787
- Propagate MySQL invalid table name error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2776
- feat: Add retention options for task_history config by @peasee in https://github.com/spiceai/spiceai/pull/2784
- fix: Move task history check after query history creation by @peasee in https://github.com/spiceai/spiceai/pull/2793
- MS SQL connector should ignore all unsupported types by @sgrebnov in https://github.com/spiceai/spiceai/pull/2795
- Improve Sharepoint DX by @Jeadie in https://github.com/spiceai/spiceai/pull/2791
- Replace query history with task history by @peasee in https://github.com/spiceai/spiceai/pull/2792
- Fix datasets_health_monitor spice.runtime.task_history not found warning by @sgrebnov in https://github.com/spiceai/spiceai/pull/2805
- Upgrade macOS x86_64 test runner to macOS 13.6.9 Ventura by @sgrebnov in https://github.com/spiceai/spiceai/pull/2803
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2808
- Add mssql to the list of supported data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/28

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.18.0-beta...v0.18.1-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.18-beta (Sep 16, 2024)

ยท 6 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.18-beta.

The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.

Highlights in v0.18.0-betaโ€‹

Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint). The CLI also includes a new spice login sharepoint to aid in local development and testing.

Example spicepod.yml:

datasets:
- from: sharepoint:drive:Documents/path:/important_documents/
name: important_documents
params:
sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}

See the Sharepoint Data Connector documentation.

AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the s3 data connector to configure the authentication method to use when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.

Example spicepod.yml:

datasets:
- from: s3://my-bucket
name: bucket
params:
s3_auth: iam_role # Assume IAM role of instance

See the S3 Data Connector documentation.

File Data Connector Use from: file: to query files stored by locally accessible filesystems.

Example spicepod.yml:

datasets:
- from: file://path/to/customer.parquet
name: customer
params:
file_format: parquet

See the File Data Connector documentation.

Improved /ready Api Now includes the initial data load for accelerated datasets in addition to component readiness to ensure readiness is only reported when data has loaded and can be successfully queried.

Breaking Changesโ€‹

  • GitHub Data Connector: The data type for time-related columns has changed from Utf8 to Timestamp. To upgrade, data type references to timestamp. For example, if using time_format:, change uses of time_format: ISO8601 to time_format: timestamp.

  • Ready API: The /ready API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.

Dependenciesโ€‹

No major dependencies updates.

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @eltociear
  • @Sevenannn
  • @ewgenius
  • @karifabri

New Contributorsโ€‹

What's Changedโ€‹

- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
- Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
- Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
- Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
- Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
- spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
- Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
- fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
- Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
- Fix refresh API using `refresh_mode: append` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
- Tweak `/ready` to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
- Add `s3_auth` parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
- Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
- fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
- Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
- Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
- Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
- Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
- Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
- fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
- Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
- Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
- Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
- Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
- Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
- Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
- Improve `/ready` to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
- Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
- Fix `error decoding response body` GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
- GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
- docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
- Define GitHub `issues` data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
- Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
- Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
- Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
- Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
- Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
- Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
- Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
- Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
- Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
- Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
- Improve `spill_to_disk_and_rehydration` integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
- Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
- Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
- Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
- Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598) by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.17.4-beta (Sep 9, 2024)

ยท 4 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice v0.17.4-beta.

The v0.17.4-beta release adds compatibility, performance, and reliability improvements to the DuckDB and SQLite accelerators. The GitHub data connector adds a Stargazers table, Snowflake and Clickhouse data connectors have improved resiliency for empty tables, and core data processing and quality has been improved.

Highlights in v0.17.4-betaโ€‹

Improved benchmarking, testing, and robustness of data accelerators: Continued compatibility, performance, and reliability improvements for SQLite and DuckDB data accelerators and expanded performance and quality testing.

GitHub Stargazers: The GitHub Data Connector adds support for a /stargazers table making it easy to query GitHub Stargazers using SQL!

Breaking Changesโ€‹

None.

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @eltociear
  • @Sevenannn
  • @ewgenius

New Contributorsโ€‹

What's Changedโ€‹

- Change to sql lang by @ewgenius in https://github.com/spiceai/spiceai/pull/2484
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2487
- Bump rustyline from 13.0.0 to 14.0.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2473
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2490
- Native schema inference for snowflake (and support timezone_tz, better numeric support) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2493
- Add checks for GitHub quickstart and docs banner to endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2489
- Prepare for v0.18.0-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2488
- Add logo to README.md by @lukekim in https://github.com/spiceai/spiceai/pull/2497
- Add stargazers to GitHub data connector by @lukekim in https://github.com/spiceai/spiceai/pull/2502
- Enable federation for accelerated queries (sqlite and duckdb) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2511
- Load SQLite decimal extension by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2498
- fix: Support INTERVAL in SQLite by @peasee in https://github.com/spiceai/spiceai/pull/2513
- Add refresh jitter to refreshing dataset acceleration by @Jeadie in https://github.com/spiceai/spiceai/pull/2510
- Update to use DuckDB streaming by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2514
- Add more MySQL types in E2E testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2512
- Update tpc loading script to support automatic loading into postgres by @Sevenannn in https://github.com/spiceai/spiceai/pull/2509
- docs: update README.md by @eltociear in https://github.com/spiceai/spiceai/pull/2516
- Bump quinn-proto from 0.11.6 to 0.11.8 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2501
- Script for loading clickbench data into arrow / postgres and run clickbench queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2500
- Fix run query script to correctly record all errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2529
- Add support for DuckDB engine to setup-tpc-spicepod.bash by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2530
- Upgrade `datafusion` (fixes subquery alias table unparsing for SQLite) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2532
- Make dataset acceleration delay `period +- jitter` by @Jeadie in https://github.com/spiceai/spiceai/pull/2534
- Add refresh options to `POST /v1/datasets/:name/acceleration/refresh` by @Jeadie in https://github.com/spiceai/spiceai/pull/2515
- Add E2E for GitHub Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2505
- Add on-conflict integration test for file based and memory based sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2533
- Upgrade to Rust v.1.81.0 and fix resulting compile error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2539
- Remove unneeded `RwLock` from `EmbeddingModelStore` by @Jeadie in https://github.com/spiceai/spiceai/pull/2541
- Remove unneeded RwLock from LlmModelStore by @Jeadie in https://github.com/spiceai/spiceai/pull/2537
- Add sqlite to the setup tpc benchmark script by @Sevenannn in https://github.com/spiceai/spiceai/pull/2540
- Add sqlite to setup clickbench script by @Sevenannn in https://github.com/spiceai/spiceai/pull/2548
- Update version for v0.17.4-beta release by @ewgenius in https://github.com/spiceai/spiceai/pull/2563
- Sharepoint data connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2294
- Fix predicate/projection push-down for BytesProcessedNode by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2564
- fix out of order projections in sharepoint scans by @Jeadie in https://github.com/spiceai/spiceai/pull/2569
- Use Decimal instead of Float64 for SQLite Decimal columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2566
- Add snapshot tests for EXPLAIN plans in integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2570
- Set refresh.period from `refresh_data_window` by @ewgenius in https://github.com/spiceai/spiceai/pull/2578
- Add snapshot tests for EXPLAIN plans in benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2580
- Disable federation for accelerated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/2581
- Add manual refresh payload to 'spice refresh...' by @Jeadie in https://github.com/spiceai/spiceai/pull/2565
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2586

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.17.3-beta...v0.17.4-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.17.3-beta (Sep 2, 2024)

ยท 5 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v0.17.3-beta.

The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.

Highlights in v0.17.3-betaโ€‹

Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing of data accelerators, leading to more robust and reliable data accelerators.

GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.

datasets:
# Fetch all rust and golang files from spiceai/spiceai
- from: github:github.com/spiceai/spiceai/files/trunk
name: spiceai.files
params:
include: '**/*.rs; **/*.go'
github_token: ${secrets:GITHUB_TOKEN}

# Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
- from: github:github.com/spiceai/spiceai/issues
name: spiceai.issues
params:
github_token: ${secrets:GITHUB_TOKEN}

Breaking Changesโ€‹

None.

Upgrade Instructionsโ€‹

  • CLI: Run spice upgrade
  • Docker: docker pull spiceai/spiceai:latest
  • Container image tag: spiceai/spiceai:latest or spiceai/spiceai:0.17.3-beta

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @peasee
  • @sgrebnov
  • @Sevenannn
  • @lukekim
  • @dependabot
  • @ewgenius

What's Changedโ€‹

Dependenciesโ€‹

  • delta_kernel from 0.2.0 to 0.3.0.

Commitsโ€‹

- Prepare version for v0.17.3-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2388
- Add a basic Github Connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2365
- task: Re-enable federation by @peasee in https://github.com/spiceai/spiceai/pull/2389
- fix: Implement custom PartialEq for Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2390
- GitHub Data Connector `files` support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
- Add a `--force` flag to `spice install` to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
- Improve experience of using `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
- Fix view loading on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2398
- Add `include` param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
- Postgres integration test to cover on-conflict behavior by @Sevenannn in https://github.com/spiceai/spiceai/pull/2359
- Create dependabot.yml by @lukekim in https://github.com/spiceai/spiceai/pull/2399
- Add `content` column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
- Fix dependabot indentation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2402
- Bump docker/setup-buildx-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2403
- Bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2404
- Bump docker/login-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2405
- Bump yogevbd/enforce-label-action from 2.1.0 to 2.2.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2406
- Bump actions/checkout from 3 to 4 by @dependabot in https://github.com/spiceai/spiceai/pull/2407
- Bump go.uber.org/zap from 1.21.0 to 1.27.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2408
- Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2409
- Bump github.com/spf13/cobra from 1.6.0 to 1.8.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2412
- Bump chrono-tz from 0.8.6 to 0.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2413
- Bump tokio from 1.39.2 to 1.39.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2414
- Bump tokenizers from 0.19.1 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2415
- Bump serde from 1.0.207 to 1.0.209 by @dependabot in https://github.com/spiceai/spiceai/pull/2416
- Bump gopkg.in/natefinch/lumberjack.v2 from 2.0.0 to 2.2.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2410
- Bump ndarray from 0.15.6 to 0.16.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2417
- Bump golang.org/x/mod from 0.14.0 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2411
- Add correct labels to dependabot.yml by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2418
- Fix build break by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2430
- Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2431
- Bump github.com/stretchr/testify from 1.8.1 to 1.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2422
- Preserve timezone information in constructing expr by @Sevenannn in https://github.com/spiceai/spiceai/pull/2392
- Bump github.com/spf13/viper from 1.12.0 to 1.19.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2420
- Fix repeated base table data in acceleration with embeddings by @Sevenannn in https://github.com/spiceai/spiceai/pull/2401
- Fix tool calling with Groq (and potentially other tool-enabled models) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2435
- Remove candle from `crates/llms/src/chat/` by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
- fix: Only attach successfully initialized accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2433
- Support overriding OpenAI default values in a model param; add token usage telemetry to task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/2434
- Enable message chains and tool calls for local LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/2180
- DuckDB on-conflict integration test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2437
- Fix MySQL E2E tests and include MySQL acceleration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2441
- Use rtcontext for proper cloud/local context in `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
- Fix MySQL connector to respect the source column's decimal precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/2443
- Improve Github Data Connector tables schema by @sgrebnov in https://github.com/spiceai/spiceai/pull/2448
- Improve GitHub Connector error msg when invalid token or permissions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2449
- Proper error tracking across tracing spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2454
- task: Disable and update federation by @peasee in https://github.com/spiceai/spiceai/pull/2457
- GitHub connector: convert `labels` and `hashes` to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
- Bump `datafusion` version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
- Trim trailing `/` for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
- Add `accelerated_refresh` to `task_history` table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
- Add `assignees` and `labels` fields to github issues and github pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
- Native clickhouse schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2466
- List GitHub connector in readme by @ewgenius in https://github.com/spiceai/spiceai/pull/2468
- Fix LLMs health check; Add `updatedAt` field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
- Remove non existing updated_at from github.pulls dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/2475
- GitHub connector: add pulls labels and rm duplicate milestoneId and milestoneTitle for issues by @sgrebnov in https://github.com/spiceai/spiceai/pull/2477
- Bump delta_kernel from 0.2.0 to 0.3.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2472
- Add back GitHub connector Pull Request `updated_at` by @lukekim in https://github.com/spiceai/spiceai/pull/2479
- Update ROADMAP Sep 2, 2024. by @lukekim in https://github.com/spiceai/spiceai/pull/2478

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta>

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.17.2-beta (August 26, 2024)

ยท 6 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.17.2-beta ๐Ÿ„

The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging has also been improved along with several bugs.

Highlights in v0.17.2-betaโ€‹

Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.

Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.

Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.

To opt out of telemetry:

  1. Using the CLI flag:

    spice run -- --telemetry-enabled false
  2. Add configuration to spicepod.yaml:

    runtime:
    telemetry:
    enabled: false

Improved Benchmarking: A suite of performance benchmarking tests have been added to the project, helping to maintain and improve runtime performance; a top priority for the project.

Breaking Changesโ€‹

None.

Contributorsโ€‹

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @Sevenannn
  • @peasee
  • @ewgenius

What's Changedโ€‹

Dependenciesโ€‹

Commitsโ€‹

- Pin actions/upload-artifact to v4.3.4 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2200>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/2202>
- Update to next release version, `v0.17.2-beta` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2203>
- add accelerator beta criteria by @y-f-u in <https://github.com/spiceai/spiceai/pull/2201>
- update helm chart to 0.17.1-beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2205>
- add dockerignore to avoid copy target and test folder by @y-f-u in <https://github.com/spiceai/spiceai/pull/2206>
- add client timeout for deltalake connector by @y-f-u in <https://github.com/spiceai/spiceai/pull/2208>
- Upgrade tonic and opentelemetry-proto by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2223>
- Add index and resource tuning for postgres ghcr image to support postgres benchmark in sf1 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2196>
- Remove embedding columns from `retrieved_primary_keys` in v1/search by @Jeadie in <https://github.com/spiceai/spiceai/pull/2176>
- use file as db_path_param as the param prefix is trimmed by @y-f-u in <https://github.com/spiceai/spiceai/pull/2230>
- use file for sqlite db path param by @y-f-u in <https://github.com/spiceai/spiceai/pull/2231>
- docs: Clarify the global requirement for local_infile when loading TPCH by @peasee in <https://github.com/spiceai/spiceai/pull/2228>
- Revert pinning actions/upload-artifact@v4 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2232>
- Runtime tools to chat models by @Jeadie in <https://github.com/spiceai/spiceai/pull/2207>
- Create `runtime.task_history` table for queries, and embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/2191>
- chore: Update Databricks ODBC Bench to use TPCH SF1 by @peasee in <https://github.com/spiceai/spiceai/pull/2238>
- Replace `metrics-rs` with OpenTelemetry Metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2240>
- fix: Remove dead code by @peasee in <https://github.com/spiceai/spiceai/pull/2249>
- Improve tool quality and add vector search tool by @Jeadie in <https://github.com/spiceai/spiceai/pull/2250>
- fix missing partition cols in delta lake by @y-f-u in <https://github.com/spiceai/spiceai/pull/2253>
- download file from remote for delta testing by @y-f-u in <https://github.com/spiceai/spiceai/pull/2254>
- feat: Set SQLite DB path to .spice/data by @peasee in <https://github.com/spiceai/spiceai/pull/2242>
- Support tools for chat completions in streaming mode by @ewgenius in <https://github.com/spiceai/spiceai/pull/2255>
- Load component `description` field from spicepod.yaml and include in LLM context by @ewgenius in <https://github.com/spiceai/spiceai/pull/2261>
- Add parameter for `connection_pool_size` in the Postgres Data Connector by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2251>
- Add primary keys to response of `DocumentSimilarityTool` by @Jeadie in <https://github.com/spiceai/spiceai/pull/2263>
- run queries bash script by @y-f-u in <https://github.com/spiceai/spiceai/pull/2262>
- Run benchmark test on schedule by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2277>
- feat: Add a reference to originating App for a Dataset by @peasee in <https://github.com/spiceai/spiceai/pull/2283>
- Tool use & telemetry productionisation. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2286>
- Fix cron in benchmarks.yml by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2288>
- Upgrade to DataFusion v41 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2290>
- Chat completions adjustments and fixes by @ewgenius in <https://github.com/spiceai/spiceai/pull/2292>
- Define the new metrics Arrow schema based on Open Telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2295>
- OpenTelemetry Metrics Arrow exporter to `runtime.metrics` table by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2296>
- Calculate summary metrics from histograms for Prometheus endpoint by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2302>
- Add back Spice DF runtime_env during SessionContext construction by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2304>
- Add integration test for S3 data connector by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2305>
- Fix `secrets.inject_secrets` when secret not found. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2306>
- Intra-table federation query on duckdb accelerated table by @y-f-u in <https://github.com/spiceai/spiceai/pull/2299>
- Postgres federation on acceleration by @y-f-u in <https://github.com/spiceai/spiceai/pull/2309>
- sqlite intra table federation on acceleration by @y-f-u in <https://github.com/spiceai/spiceai/pull/2308>
- feat: Add `DataAccelerator::init()` for SQLite acceleration federation by @peasee in <https://github.com/spiceai/spiceai/pull/2293>
- Initial framework for collecting anonymous usage telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2310>
- Add gRPC action to trigger accelerated dataset refresh by @sgrebnov in <https://github.com/spiceai/spiceai/pull/2316>
- add `disable_query_push_down` option to acceleration settings by @y-f-u in <https://github.com/spiceai/spiceai/pull/2327>
- Remove `v1/assist` by @Jeadie in <https://github.com/spiceai/spiceai/pull/2312>
- bump table provider version to set the correct dialect for postgres writer by @y-f-u in <https://github.com/spiceai/spiceai/pull/2329>
- Send telemetry on startup by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2331>
- Calculate resource IDs for telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2332>
- Refactor `v1/search`: include WHERE condition, allow extra columns in projection. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2328>
- Add integration test for gRPC dataset refresh action by @sgrebnov in <https://github.com/spiceai/spiceai/pull/2330>
- Propagate errors through all `task_history` nested spans by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2337>
- Improve tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/2338>
- update duckdb rs version to support more types: interval/duration/etc by @y-f-u in <https://github.com/spiceai/spiceai/pull/2336>
- feat: Add DuckDB accelerator init, attach databases for federation by @peasee in <https://github.com/spiceai/spiceai/pull/2335>
- Add query telemetry metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2333>
- Add system prompts for LLMs; system prompts for tool using models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2342>
- Fix benchmark test to keep running when there's failed queries by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2347>
- Tools as a spicepod first class citizen. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2344>
- Add `bytes_processed` telemetry metric by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2343>
- fix misaligned columns from delta lake by @y-f-u in <https://github.com/spiceai/spiceai/pull/2356>
- Emit telemetry metrics to `runtime.metrics`/Prometheus as well by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2352>
- Use UTC timezone for telemetry timestamps by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2354>
- Fix MetricType deserialization by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2358>
- Add dataset details to tool using LLMs; early check tables in vector search by @Jeadie in <https://github.com/spiceai/spiceai/pull/2353>
- Bump datafusion-federation/datafusion-table-providers dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2360>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/2362>
- fix: Disable DuckDB and SQLite federation by @peasee in <https://github.com/spiceai/spiceai/pull/2371>
- Fix system prompt in ToolUsingChat, fix builtin registration by @Jeadie in <https://github.com/spiceai/spiceai/pull/2367>
- fix: Use --profile release for benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/2372>
- nql parameter 'use' -> 'model' by @Jeadie in <https://github.com/spiceai/spiceai/pull/2366>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta>

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.