Skip to main content

Spice v0.19.1-beta (Oct 14, 2024)

ยท 5 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice v0.19.1-beta ๐Ÿ”ฅ

Spice v0.19.1 brings further performance and stability improvements to data connectors, including improved query push-down for file-based connectors (s3, abfs, file, ftp, sftp) that use Hive-style partitioning.

Highlights in v0.19.1โ€‹

TPC-H and TPC-DS Coverage: Expanded coverage for TPC-H and TPC-DS benchmarking suites across accelerators and connectors.

GitHub Connector Array Filter: The GitHub connector now supports filter push down for the array_contains function in SQL queries using search query mode.

NSQL CLI Command: A new spice nsql CLI command has been added to easily query datasets with natural language from the command line.

Breaking changesโ€‹

None

Contributorsโ€‹

  • @peasee
  • @Sevenannn
  • @sgrebnov
  • @karifabri
  • @phillipleblanc
  • @lukekim
  • @Jeadie
  • @slyons

Dependenciesโ€‹

What's Changedโ€‹

- release: Update helm chart for v0.19.0-beta by @peasee in https://github.com/spiceai/spiceai/pull/3024
- Set fail-fast = true for benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2997
- release: Update next version and ROADMAP by @peasee in https://github.com/spiceai/spiceai/pull/3033
- Verify TPCH benchmark query results for Spark connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2993
- feat: Add x-spice-user-agent header to Spice REPL by @peasee in https://github.com/spiceai/spiceai/pull/2979
- Update to object store file formats documentation link by @karifabri in https://github.com/spiceai/spiceai/pull/3036
- Use teraswitch-runners for Linux x64 workflows + builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3042
- feat: Support array contains in GitHub pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2983
- Bump text-splitter from 0.16.1 to 0.17.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2987
- Revert integration tests back to hosted runner by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3046
- Tune Github runner resources to allow in memory TPCDS benchmark to run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3025
- fix: add winver by @peasee in https://github.com/spiceai/spiceai/pull/3054
- refactor: Use is modifier for checking GitHub state filter by @peasee in https://github.com/spiceai/spiceai/pull/3056
- Enable `merge_group` checks for PR workflows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3058
- Fix issues with merge group by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3059
- Validate in-memory arrow accelertion TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3044
- Fix rev parsing for PR checks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3060
- Use 'Accept' header for `/v1/sql/` and `/v1/nsql` by @Jeadie in https://github.com/spiceai/spiceai/pull/3032
- Verify Postgres acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3043
- Add NSQL CLI REPL command by @lukekim in https://github.com/spiceai/spiceai/pull/2856
- Preserve query results order and add TPCH benchmark results verification for duckdb:file mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3034
- Refactor benchmark to include MySQL tpcds bench, tweaks to makefile target for generating mysql tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2967
- Support runtime parameter for `sql_query_keep_partition_by_columns` & enable by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3065
- Document TPC-DS limitations: `EXCEPT`, `INTERSECT`, duplicate names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3069
- Adding ABFS benchmark by @slyons in https://github.com/spiceai/spiceai/pull/3062
- Add support for GitHub app installation auth for GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3063
- docs: Document stack overflow workaround, add helper script by @peasee in https://github.com/spiceai/spiceai/pull/3070
- Tune MySQL TPCDS image to allow for successful benchmark test run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3067
- Automatically infer partitions for hive-style partitioned files for object store based connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3073
- Support `hf_token` from params/secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/3071
- Inherit embedding columns from source, when available. by @Jeadie in https://github.com/spiceai/spiceai/pull/3045
- Validate identifiers for component names by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3079
- docs: Add workaround for TPC-DS Q97 in MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3080
- Document TPC-DS Postgres column alias in a CASE statement limitation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3083
- Update plan snapshots for TPC-H bench queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3088
- Update Datafusion crate to include recent unparsing fixes by @sgrebnov in https://github.com/spiceai/spiceai/pull/3089
- Sample SQL table data tool and API by @Jeadie in https://github.com/spiceai/spiceai/pull/3081
- chore: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3090
- Add `hive_infer_partitions` to remaining object store connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3086
- deps: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3093
- For local embedding models, return usage input tokens. by @Jeadie in https://github.com/spiceai/spiceai/pull/3095
- Update end_game.md with Accelerator/Connector criteria check by @slyons in https://github.com/spiceai/spiceai/pull/3092
- Update TPC-DS Q90 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3094
- docs: Add RC connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3026
- Update version to 0.19.1-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3101

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.19.0-beta...v0.19.1-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.19-beta (Oct 7, 2024)

ยท 7 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.19-beta ๐Ÿ“ฆ

Spice v0.19.0-beta brings performance improvements for accelerators and expanded TPC-DS coverage. A new Azure Blob Storage data connector has also been added.

Highlights in v0.19.0-betaโ€‹

Improved TPC-DS Coverage: Enhanced support for TPC-DS derived queries.

CLI SQL REPL: The CLI SQL REPL (spice sql) now supports multi-line editing and tab indentation. Note, a terminating semi-colon ';' is now required for each executed SQL block.

Azure Storage Data Connector: A new Azure Blob Storage data connector (abfs://) has been added, enabling federated SQL queries on files stored in Azure Blob-compatible endpoints, including Azure BlobFS (abfss://) and Azure Data Lake (adl://). Supported file formats can be specified using the file_format parameter.

Example spicepod.yml:

datasets:
- from: abfs://foocontainer/taxi_sample.csv
name: azure_test
params:
azure_account: spiceadls
azure_access_key: abc123==
file_format: csv

For a full list of supported files, see the Object Store File Formats documentation.

For more details, see the Azure Blob Storage Data Connector documentation.

Breaking Changesโ€‹

  • Spice.ai Data Connector: The key for the Spice.ai Cloud Platform Data Connector has changed from spiceai to spice.ai. To upgrade, change uses of from: spiceai: to from: spice.ai:.

  • GitHub Data Connector: Pull Requests column login has been renamed to author.

  • CLI SQL REPL: A terminating semi-colon ';' is now required for each executed SQL block.

  • Spicepod Hot-Reload: When running spiced directly, hot-reload of spicepod.yml configuration is now disabled. Run with spice run to use hot-reload.

Contributorsโ€‹

  • @sgrebnov
  • @Jeadie
  • @Sevenannn
  • @peasee
  • @ewgenius
  • @slyons
  • @phillipleblanc
  • @lukekim

Dependenciesโ€‹

What's Changedโ€‹

- Bump tonic from 0.12.2 to 0.12.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2880
- Verify benchmark query results using snapshot testing (s3 connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2902
- Fix `paths-ignore:` by @Jeadie in https://github.com/spiceai/spiceai/pull/2906
- Rename `spiceai` data connector to `spice.ai` by @sgrebnov in https://github.com/spiceai/spiceai/pull/2899
- Update ROADMAP.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2907
- Helm update for helm for 0.18.3-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2910
- Add tpcds queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2918
- Fix `paths-ignore` for docs. by @Jeadie in https://github.com/spiceai/spiceai/pull/2911
- feat: Support LIKE expressions in GitHub filter pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2903
- feat: Support date comparison pushdown in GitHub connector by @peasee in https://github.com/spiceai/spiceai/pull/2904
- Improve aggregation and union queries unparsing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2925
- Initialize file based accelerators on dataset reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2923
- Update spiceai/spiceai for next release by @Jeadie in https://github.com/spiceai/spiceai/pull/2928
- Verify TPC-H benchmark query results for arrow acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2927
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2912
- Use structured output for NSQL by @Jeadie in https://github.com/spiceai/spiceai/pull/2922
- Update TPC-DS queries to use supported date addition format by @sgrebnov in https://github.com/spiceai/spiceai/pull/2930
- Add busy_timeout accelerator param for Sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2855
- Use Cosine Similarity in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2932
- Add support for passing `x-spiceai-app-id` metadata in spiceai data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2934
- docs: update beta accelerator criteria by @peasee in https://github.com/spiceai/spiceai/pull/2905
- Azure Connector implementation by @slyons in https://github.com/spiceai/spiceai/pull/2926
- Local embedding model from relative paths by @Jeadie in https://github.com/spiceai/spiceai/pull/2908
- Add Markdown aware chunker when `params.file_format: md`. by @Jeadie in https://github.com/spiceai/spiceai/pull/2943
- 'spice version' without structured logging by @Jeadie in https://github.com/spiceai/spiceai/pull/2944
- Bump tempfile from 3.12.0 to 3.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2878
- feat: GraphQL commit query parameters by @peasee in https://github.com/spiceai/spiceai/pull/2945
- Update OpenAI client and use new request fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2951
- refactor: Rename GitHub pulls login to author by @peasee in https://github.com/spiceai/spiceai/pull/2954
- Run tpcds benchmarks for accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/2853
- Add spiced arg `--pods-watcher-enabled`. Watcher disabled by default for spiced. by @ewgenius in https://github.com/spiceai/spiceai/pull/2953
- Add error message when spicepod has embeddings or models without '--features models' by @Jeadie in https://github.com/spiceai/spiceai/pull/2952
- Adding multi-line editing and tab indentation to sql REPL by @slyons in https://github.com/spiceai/spiceai/pull/2949
- Update MySQL ghcr image to include tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2941
- Document DataFusion limitation: The context only support single SQL Statement, Date Arithmetic like date + 3 not supported by @Sevenannn in https://github.com/spiceai/spiceai/pull/2970
- Bump snafu from 0.8.4 to 0.8.5 by @dependabot in https://github.com/spiceai/spiceai/pull/2876
- Bump async-trait from 0.1.82 to 0.1.83 by @dependabot in https://github.com/spiceai/spiceai/pull/2879
- Bump async-graphql from 7.0.9 to 7.0.11 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2950
- Verify TPC-H benchmark query results for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2972
- Verify TPCH benchmark query results for Postgres by @sgrebnov in https://github.com/spiceai/spiceai/pull/2973
- Verify TPCH benchmark query results for sqlite acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2974
- Verify TPCH benchmark query results for duckdb (in-memory) acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2975
- Support for `mdx` file extensions to apply a markdown splitter by @ewgenius in https://github.com/spiceai/spiceai/pull/2977
- Don't assume first vector or content will be non-null/zero by @Jeadie in https://github.com/spiceai/spiceai/pull/2940
- use custom chunk sizers for HF, local and OpenAI models by @Jeadie in https://github.com/spiceai/spiceai/pull/2971
- Ensure we return N unique documents, not N unique chunks by @Jeadie in https://github.com/spiceai/spiceai/pull/2976
- Fix issues parsing `messages[*].tool_calls` for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/2957
- text -> SQL trait to customise per model. by @Jeadie in https://github.com/spiceai/spiceai/pull/2942
- Remove system message from ToolUsingChat. by @Jeadie in https://github.com/spiceai/spiceai/pull/2978
- Make logical plan to sql more robust (improve ORDER BY; support `round` for Postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2984
- Add connection_pool_size parameter for Postgres accelerator by @Sevenannn in https://github.com/spiceai/spiceai/pull/2969
- Fix dataset configure prompt by @sgrebnov in https://github.com/spiceai/spiceai/pull/2991
- Verify TPCH benchmark query results for Databricks(odbc) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2989
- Verify TPCH benchmark query results for Databricks (delta_lake) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2982
- Set log level for anonymous telemetry traces to `trace` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2995
- Improvements to issue templates by @lukekim in https://github.com/spiceai/spiceai/pull/2992
- `spice login` writes to `.env.local` if present by @slyons in https://github.com/spiceai/spiceai/pull/2996

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.18.3-beta...v0.19.0-beta>

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.18.3-beta (Sep 30, 2024)

ยท 5 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v0.18.3-beta ๐Ÿ› ๏ธ

The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.

Highlights in v0.18.3-betaโ€‹

GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector, which uses the GitHub Search API to enable faster and more efficient query of issues and pull requests when using filters.

Example spicepod.yml:

- from: github:github.com/spiceai/spiceai/issues/trunk
name: spiceai.issues
params:
github_query_mode: search # Use GitHub Search API
github_token: ${secrets:GITHUB_TOKEN}

Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.

Example command line:

spice -v
spice --very-verbose

spiced -vv
spiced --verbose

Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision for larger pieces of content.

Example spicepod.yml:

- name: support_tickets
embeddings:
- column: conversation_history
use: openai_embeddings
chunking:
enabled: true
target_chunk_size: 128
overlap_size: 16
trim_whitespace: true

For details, see the Search Documentation.

Dependenciesโ€‹

Contributorsโ€‹

  • @Sevenannn
  • @peasee
  • @Jeadie
  • @sgrebnov
  • @phillipleblanc
  • @ewgenius
  • @slyons

What's Changedโ€‹

- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
- refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
- Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
- Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
- Add verbosity flags for spiced, spice: `-v`, `-vv`, `--verbose`, `--very-verbose`. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
- Rename `spiceai` data connector to `spice.ai` by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
- Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
- Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
- Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
- chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
- fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
- fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
- Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
- refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
- docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
- Change `BytesProcessedRule` to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
- Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
- Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
- feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
- Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
- Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
- Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
- Revert "Rename `spiceai` data connector to `spice.ai`" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
- Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
- Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
- Fix `no process-level CryptoProvider available` when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
- Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
- Add `log/slog` to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
- feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
- Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
- fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
- Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.18.1-beta (Sep 23, 2024)

ยท 7 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.18.1-beta. ๐ŸŽ๏ธ

The v0.18.1-beta release continues to improve runtime performance and reliability. Performance for accelerated queries joining multiple datasets has been significantly improved with join push-down support. GraphQL, MySQL, and SharePoint data connectors have better reliability and error handling, and a new Microsoft SQL Server data connector has been introduced. Task History now has fine-grained configuration, including the ability to disable the feature entirely. A new spice search CLI command has been added, enabling development-time embeddings-based searches across datasets.

Highlights in v0.18.1-betaโ€‹

Join push-down for accelerations: Queries to the same accelerator will now push-down joins, significantly improving acceleration performance for queries joining multiple tables.

Microsoft SQL Server Data Connector: Use from: mssql: to access and accelerate Microsoft SQL Server datasets.

Example spicepod.yml:

datasets:
- from: mssql:path.to.my_dataset
name: my_dataset
params:
mssql_connection_string: ${secrets:mssql_connection_string}

See the Microsoft SQL Server Data Connector documentation.

Task History: Task History can be configured in the spicepod.yml, including the ability to include, or truncate outputs such as the results of a SQL query.

Example spicepod.yml:

runtime:
task_history:
enabled: true
captured_output: truncated
retention_period: 8h
retention_check_interval: 15m

See the Task History Spicepod reference for more information on possible values and behaviors.

Search CLI Command Use the spice search CLI command to perform embeddings-based searches across search configure datasets. Note: Search requires the ai feature to be installed.

Refresh on File Changes: File Data Connector data refreshes can be configured to be triggered when the source file is modified through a file system watcher. Enable the watcher by adding file_watcher: enabled to the acceleration parameters.

Example spicepod.yml:

datasets:
- from: file://path/to/my_file.csv
name: my_file
acceleration:
enabled: true
refresh_mode: full
params:
file_watcher: enabled

Breaking Changesโ€‹

The Query History table runtime.query_history has been deprecated and removed in favor of the Task History table runtime.task_history. The Task History table tracks tasks across all features such as SQL query, vector search, and AI completion in a unified table.

See the Task History documentation.

Dependenciesโ€‹

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @Sevenannn
  • @ewgenius
  • @slyons

New Contributorsโ€‹

What's Changedโ€‹

- Update Helm Chart for 0.18.0-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2711
- Use a single instance for all DuckDB accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2669
- Dependabot upgrades by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2715
- Use a single instance for all SQLite accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2720
- Prepare for v0.18.1-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2692
- For GraphQL, remove necessity of `json_pointer` and improve error messaging. by @Jeadie in https://github.com/spiceai/spiceai/pull/2713
- Postgres accelerator benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2652
- Trace query result while running benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2684
- Early check EmbeddingConnector if embedding models do not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/2717
- Move table creation for spice_sys_dataset_checkpoint to DatasetCheckpoint::try_new by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2732
- Don't load tools immediately by @Jeadie in https://github.com/spiceai/spiceai/pull/2731
- Renable accelerator federation on trunk by @Sevenannn in https://github.com/spiceai/spiceai/pull/2725
- Fixing Data Connectors link in README.md by @slyons in https://github.com/spiceai/spiceai/pull/2724
- Enable rehydration tests for DuckDB by @sgrebnov in https://github.com/spiceai/spiceai/pull/2729
- Check pageInfo is correct at initialisation of GraphQL connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2730
- Microsoft SQL Server data connector initial support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2741
- Add `spice search` CLI command by @lukekim in https://github.com/spiceai/spiceai/pull/2739
- Update threat model by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2738
- Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2744
- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2747
- feat: Add enabled config option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2758
- Remove v0.18.0-beta from the Roadmap by @sgrebnov in https://github.com/spiceai/spiceai/pull/2748
- Fix spark-connect to use native roots for TLS again by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2766
- Fix benchmark test - Install default crypto provider by @Sevenannn in https://github.com/spiceai/spiceai/pull/2752
- Resolve primary keys for datasets with catalog or schema by @Jeadie in https://github.com/spiceai/spiceai/pull/2749
- MSSQL: include table name in schema retrieval error by @sgrebnov in https://github.com/spiceai/spiceai/pull/2746
- File Format parsing for Document tables, support for docx + pdf by @Jeadie in https://github.com/spiceai/spiceai/pull/2740
- Add Document parsing to Sharepoint connector. by @Jeadie in https://github.com/spiceai/spiceai/pull/2760
- Execution plan with BinaryExpr predicates pushdown support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2768
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2772
- Support for standalone config parameters for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2773
- Utilize DataConnectorError for MySQL Data Connector Errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2759
- Add Score to search results by @lukekim in https://github.com/spiceai/spiceai/pull/2774
- Don't call GetComponentStatuses when --metrics not enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/2779
- Implement better error handling for spicepods by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2767
- Make integration tests more robust by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2782
- Query results streaming support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2781
- Update benchmark snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/2778
- For Sharepoint connector, if client_secret and auth_code are both provided, default to auth_code by @Jeadie in https://github.com/spiceai/spiceai/pull/2780
- Add modified pk/indexes scenario to rehydration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2743
- Run benchmarks on Wed, Fri, Sat, and Sun. by @lukekim in https://github.com/spiceai/spiceai/pull/2786
- Update PULL_REQUEST_TEMPLATE.md to include a section for Documentation by @slyons in https://github.com/spiceai/spiceai/pull/2785
- Add E2E test for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2788
- More types support for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2789
- feat: Add captured_output option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2783
- Add ability to refresh when file data connector detects changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2787
- Propagate MySQL invalid table name error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2776
- feat: Add retention options for task_history config by @peasee in https://github.com/spiceai/spiceai/pull/2784
- fix: Move task history check after query history creation by @peasee in https://github.com/spiceai/spiceai/pull/2793
- MS SQL connector should ignore all unsupported types by @sgrebnov in https://github.com/spiceai/spiceai/pull/2795
- Improve Sharepoint DX by @Jeadie in https://github.com/spiceai/spiceai/pull/2791
- Replace query history with task history by @peasee in https://github.com/spiceai/spiceai/pull/2792
- Fix datasets_health_monitor spice.runtime.task_history not found warning by @sgrebnov in https://github.com/spiceai/spiceai/pull/2805
- Upgrade macOS x86_64 test runner to macOS 13.6.9 Ventura by @sgrebnov in https://github.com/spiceai/spiceai/pull/2803
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2808
- Add mssql to the list of supported data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/28

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.18.0-beta...v0.18.1-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.18-beta (Sep 16, 2024)

ยท 7 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.18-beta.

The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.

Highlights in v0.18.0-betaโ€‹

Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint). The CLI also includes a new spice login sharepoint to aid in local development and testing.

Example spicepod.yml:

datasets:
- from: sharepoint:drive:Documents/path:/important_documents/
name: important_documents
params:
sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}

See the Sharepoint Data Connector documentation.

AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the s3 data connector to configure the authentication method to use when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.

Example spicepod.yml:

datasets:
- from: s3://my-bucket
name: bucket
params:
s3_auth: iam_role # Assume IAM role of instance

See the S3 Data Connector documentation.

File Data Connector Use from: file: to query files stored by locally accessible filesystems.

Example spicepod.yml:

datasets:
- from: file://path/to/customer.parquet
name: customer
params:
file_format: parquet

See the File Data Connector documentation.

Improved /ready Api Now includes the initial data load for accelerated datasets in addition to component readiness to ensure readiness is only reported when data has loaded and can be successfully queried.

Breaking Changesโ€‹

  • GitHub Data Connector: The data type for time-related columns has changed from Utf8 to Timestamp. To upgrade, data type references to timestamp. For example, if using time_format:, change uses of time_format: ISO8601 to time_format: timestamp.

  • Ready API: The /ready API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.

Dependenciesโ€‹

No major dependencies updates.

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @eltociear
  • @Sevenannn
  • @ewgenius
  • @karifabri

New Contributorsโ€‹

What's Changedโ€‹

- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
- Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
- Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
- Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
- Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
- spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
- Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
- fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
- Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
- Fix refresh API using `refresh_mode: append` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
- Tweak `/ready` to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
- Add `s3_auth` parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
- Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
- fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
- Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
- Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
- Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
- Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
- Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
- fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
- Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
- Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
- Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
- Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
- Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
- Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
- Improve `/ready` to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
- Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
- Fix `error decoding response body` GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
- GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
- docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
- Define GitHub `issues` data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
- Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
- Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
- Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
- Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
- Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
- Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
- Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
- Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
- Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
- Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
- Improve `spill_to_disk_and_rehydration` integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
- Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
- Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
- Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
- Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598) by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.17.4-beta (Sep 9, 2024)

ยท 5 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice v0.17.4-beta.

The v0.17.4-beta release adds compatibility, performance, and reliability improvements to the DuckDB and SQLite accelerators. The GitHub data connector adds a Stargazers table, Snowflake and Clickhouse data connectors have improved resiliency for empty tables, and core data processing and quality has been improved.

Highlights in v0.17.4-betaโ€‹

Improved benchmarking, testing, and robustness of data accelerators: Continued compatibility, performance, and reliability improvements for SQLite and DuckDB data accelerators and expanded performance and quality testing.

GitHub Stargazers: The GitHub Data Connector adds support for a /stargazers table making it easy to query GitHub Stargazers using SQL!

Breaking Changesโ€‹

None.

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @eltociear
  • @Sevenannn
  • @ewgenius

New Contributorsโ€‹

What's Changedโ€‹

- Change to sql lang by @ewgenius in https://github.com/spiceai/spiceai/pull/2484
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2487
- Bump rustyline from 13.0.0 to 14.0.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2473
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2490
- Native schema inference for snowflake (and support timezone_tz, better numeric support) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2493
- Add checks for GitHub quickstart and docs banner to endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2489
- Prepare for v0.18.0-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2488
- Add logo to README.md by @lukekim in https://github.com/spiceai/spiceai/pull/2497
- Add stargazers to GitHub data connector by @lukekim in https://github.com/spiceai/spiceai/pull/2502
- Enable federation for accelerated queries (sqlite and duckdb) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2511
- Load SQLite decimal extension by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2498
- fix: Support INTERVAL in SQLite by @peasee in https://github.com/spiceai/spiceai/pull/2513
- Add refresh jitter to refreshing dataset acceleration by @Jeadie in https://github.com/spiceai/spiceai/pull/2510
- Update to use DuckDB streaming by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2514
- Add more MySQL types in E2E testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2512
- Update tpc loading script to support automatic loading into postgres by @Sevenannn in https://github.com/spiceai/spiceai/pull/2509
- docs: update README.md by @eltociear in https://github.com/spiceai/spiceai/pull/2516
- Bump quinn-proto from 0.11.6 to 0.11.8 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2501
- Script for loading clickbench data into arrow / postgres and run clickbench queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2500
- Fix run query script to correctly record all errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2529
- Add support for DuckDB engine to setup-tpc-spicepod.bash by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2530
- Upgrade `datafusion` (fixes subquery alias table unparsing for SQLite) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2532
- Make dataset acceleration delay `period +- jitter` by @Jeadie in https://github.com/spiceai/spiceai/pull/2534
- Add refresh options to `POST /v1/datasets/:name/acceleration/refresh` by @Jeadie in https://github.com/spiceai/spiceai/pull/2515
- Add E2E for GitHub Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2505
- Add on-conflict integration test for file based and memory based sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2533
- Upgrade to Rust v.1.81.0 and fix resulting compile error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2539
- Remove unneeded `RwLock` from `EmbeddingModelStore` by @Jeadie in https://github.com/spiceai/spiceai/pull/2541
- Remove unneeded RwLock from LlmModelStore by @Jeadie in https://github.com/spiceai/spiceai/pull/2537
- Add sqlite to the setup tpc benchmark script by @Sevenannn in https://github.com/spiceai/spiceai/pull/2540
- Add sqlite to setup clickbench script by @Sevenannn in https://github.com/spiceai/spiceai/pull/2548
- Update version for v0.17.4-beta release by @ewgenius in https://github.com/spiceai/spiceai/pull/2563
- Sharepoint data connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2294
- Fix predicate/projection push-down for BytesProcessedNode by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2564
- fix out of order projections in sharepoint scans by @Jeadie in https://github.com/spiceai/spiceai/pull/2569
- Use Decimal instead of Float64 for SQLite Decimal columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2566
- Add snapshot tests for EXPLAIN plans in integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2570
- Set refresh.period from `refresh_data_window` by @ewgenius in https://github.com/spiceai/spiceai/pull/2578
- Add snapshot tests for EXPLAIN plans in benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2580
- Disable federation for accelerated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/2581
- Add manual refresh payload to 'spice refresh...' by @Jeadie in https://github.com/spiceai/spiceai/pull/2565
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2586

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.17.3-beta...v0.17.4-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.17.3-beta (Sep 2, 2024)

ยท 7 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v0.17.3-beta.

The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.

Highlights in v0.17.3-betaโ€‹

Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing of data accelerators, leading to more robust and reliable data accelerators.

GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.

datasets:
# Fetch all rust and golang files from spiceai/spiceai
- from: github:github.com/spiceai/spiceai/files/trunk
name: spiceai.files
params:
include: '**/*.rs; **/*.go'
github_token: ${secrets:GITHUB_TOKEN}

# Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
- from: github:github.com/spiceai/spiceai/issues
name: spiceai.issues
params:
github_token: ${secrets:GITHUB_TOKEN}

Breaking Changesโ€‹

None.

Upgrade Instructionsโ€‹

  • CLI: Run spice upgrade
  • Docker: docker pull spiceai/spiceai:latest
  • Container image tag: spiceai/spiceai:latest or spiceai/spiceai:0.17.3-beta

Contributorsโ€‹

  • @phillipleblanc
  • @Jeadie
  • @peasee
  • @sgrebnov
  • @Sevenannn
  • @lukekim
  • @dependabot
  • @ewgenius

What's Changedโ€‹

Dependenciesโ€‹

  • delta_kernel from 0.2.0 to 0.3.0.

Commitsโ€‹

- Prepare version for v0.17.3-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2388
- Add a basic Github Connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2365
- task: Re-enable federation by @peasee in https://github.com/spiceai/spiceai/pull/2389
- fix: Implement custom PartialEq for Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2390
- GitHub Data Connector `files` support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
- Add a `--force` flag to `spice install` to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
- Improve experience of using `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
- Fix view loading on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2398
- Add `include` param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
- Postgres integration test to cover on-conflict behavior by @Sevenannn in https://github.com/spiceai/spiceai/pull/2359
- Create dependabot.yml by @lukekim in https://github.com/spiceai/spiceai/pull/2399
- Add `content` column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
- Fix dependabot indentation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2402
- Bump docker/setup-buildx-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2403
- Bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2404
- Bump docker/login-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2405
- Bump yogevbd/enforce-label-action from 2.1.0 to 2.2.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2406
- Bump actions/checkout from 3 to 4 by @dependabot in https://github.com/spiceai/spiceai/pull/2407
- Bump go.uber.org/zap from 1.21.0 to 1.27.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2408
- Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2409
- Bump github.com/spf13/cobra from 1.6.0 to 1.8.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2412
- Bump chrono-tz from 0.8.6 to 0.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2413
- Bump tokio from 1.39.2 to 1.39.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2414
- Bump tokenizers from 0.19.1 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2415
- Bump serde from 1.0.207 to 1.0.209 by @dependabot in https://github.com/spiceai/spiceai/pull/2416
- Bump gopkg.in/natefinch/lumberjack.v2 from 2.0.0 to 2.2.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2410
- Bump ndarray from 0.15.6 to 0.16.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2417
- Bump golang.org/x/mod from 0.14.0 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2411
- Add correct labels to dependabot.yml by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2418
- Fix build break by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2430
- Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2431
- Bump github.com/stretchr/testify from 1.8.1 to 1.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2422
- Preserve timezone information in constructing expr by @Sevenannn in https://github.com/spiceai/spiceai/pull/2392
- Bump github.com/spf13/viper from 1.12.0 to 1.19.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2420
- Fix repeated base table data in acceleration with embeddings by @Sevenannn in https://github.com/spiceai/spiceai/pull/2401
- Fix tool calling with Groq (and potentially other tool-enabled models) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2435
- Remove candle from `crates/llms/src/chat/` by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
- fix: Only attach successfully initialized accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2433
- Support overriding OpenAI default values in a model param; add token usage telemetry to task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/2434
- Enable message chains and tool calls for local LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/2180
- DuckDB on-conflict integration test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2437
- Fix MySQL E2E tests and include MySQL acceleration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2441
- Use rtcontext for proper cloud/local context in `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
- Fix MySQL connector to respect the source column's decimal precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/2443
- Improve Github Data Connector tables schema by @sgrebnov in https://github.com/spiceai/spiceai/pull/2448
- Improve GitHub Connector error msg when invalid token or permissions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2449
- Proper error tracking across tracing spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2454
- task: Disable and update federation by @peasee in https://github.com/spiceai/spiceai/pull/2457
- GitHub connector: convert `labels` and `hashes` to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
- Bump `datafusion` version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
- Trim trailing `/` for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
- Add `accelerated_refresh` to `task_history` table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
- Add `assignees` and `labels` fields to github issues and github pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
- Native clickhouse schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2466
- List GitHub connector in readme by @ewgenius in https://github.com/spiceai/spiceai/pull/2468
- Fix LLMs health check; Add `updatedAt` field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
- Remove non existing updated_at from github.pulls dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/2475
- GitHub connector: add pulls labels and rm duplicate milestoneId and milestoneTitle for issues by @sgrebnov in https://github.com/spiceai/spiceai/pull/2477
- Bump delta_kernel from 0.2.0 to 0.3.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2472
- Add back GitHub connector Pull Request `updated_at` by @lukekim in https://github.com/spiceai/spiceai/pull/2479
- Update ROADMAP Sep 2, 2024. by @lukekim in https://github.com/spiceai/spiceai/pull/2478

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta>

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.17.2-beta (August 26, 2024)

ยท 8 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.17.2-beta ๐Ÿ„

The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging has also been improved along with several bugs.

Highlights in v0.17.2-betaโ€‹

Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.

Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.

Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.

To opt out of telemetry:

  1. Using the CLI flag:

    spice run -- --telemetry-enabled false
  2. Add configuration to spicepod.yaml:

    runtime:
    telemetry:
    enabled: false

Improved Benchmarking: A suite of performance benchmarking tests have been added to the project, helping to maintain and improve runtime performance; a top priority for the project.

Breaking Changesโ€‹

None.

Contributorsโ€‹

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @Sevenannn
  • @peasee
  • @ewgenius

What's Changedโ€‹

Dependenciesโ€‹

Commitsโ€‹

- Pin actions/upload-artifact to v4.3.4 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2200>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/2202>
- Update to next release version, `v0.17.2-beta` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2203>
- add accelerator beta criteria by @y-f-u in <https://github.com/spiceai/spiceai/pull/2201>
- update helm chart to 0.17.1-beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2205>
- add dockerignore to avoid copy target and test folder by @y-f-u in <https://github.com/spiceai/spiceai/pull/2206>
- add client timeout for deltalake connector by @y-f-u in <https://github.com/spiceai/spiceai/pull/2208>
- Upgrade tonic and opentelemetry-proto by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2223>
- Add index and resource tuning for postgres ghcr image to support postgres benchmark in sf1 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2196>
- Remove embedding columns from `retrieved_primary_keys` in v1/search by @Jeadie in <https://github.com/spiceai/spiceai/pull/2176>
- use file as db_path_param as the param prefix is trimmed by @y-f-u in <https://github.com/spiceai/spiceai/pull/2230>
- use file for sqlite db path param by @y-f-u in <https://github.com/spiceai/spiceai/pull/2231>
- docs: Clarify the global requirement for local_infile when loading TPCH by @peasee in <https://github.com/spiceai/spiceai/pull/2228>
- Revert pinning actions/upload-artifact@v4 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2232>
- Runtime tools to chat models by @Jeadie in <https://github.com/spiceai/spiceai/pull/2207>
- Create `runtime.task_history` table for queries, and embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/2191>
- chore: Update Databricks ODBC Bench to use TPCH SF1 by @peasee in <https://github.com/spiceai/spiceai/pull/2238>
- Replace `metrics-rs` with OpenTelemetry Metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2240>
- fix: Remove dead code by @peasee in <https://github.com/spiceai/spiceai/pull/2249>
- Improve tool quality and add vector search tool by @Jeadie in <https://github.com/spiceai/spiceai/pull/2250>
- fix missing partition cols in delta lake by @y-f-u in <https://github.com/spiceai/spiceai/pull/2253>
- download file from remote for delta testing by @y-f-u in <https://github.com/spiceai/spiceai/pull/2254>
- feat: Set SQLite DB path to .spice/data by @peasee in <https://github.com/spiceai/spiceai/pull/2242>
- Support tools for chat completions in streaming mode by @ewgenius in <https://github.com/spiceai/spiceai/pull/2255>
- Load component `description` field from spicepod.yaml and include in LLM context by @ewgenius in <https://github.com/spiceai/spiceai/pull/2261>
- Add parameter for `connection_pool_size` in the Postgres Data Connector by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2251>
- Add primary keys to response of `DocumentSimilarityTool` by @Jeadie in <https://github.com/spiceai/spiceai/pull/2263>
- run queries bash script by @y-f-u in <https://github.com/spiceai/spiceai/pull/2262>
- Run benchmark test on schedule by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2277>
- feat: Add a reference to originating App for a Dataset by @peasee in <https://github.com/spiceai/spiceai/pull/2283>
- Tool use & telemetry productionisation. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2286>
- Fix cron in benchmarks.yml by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2288>
- Upgrade to DataFusion v41 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2290>
- Chat completions adjustments and fixes by @ewgenius in <https://github.com/spiceai/spiceai/pull/2292>
- Define the new metrics Arrow schema based on Open Telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2295>
- OpenTelemetry Metrics Arrow exporter to `runtime.metrics` table by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2296>
- Calculate summary metrics from histograms for Prometheus endpoint by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2302>
- Add back Spice DF runtime_env during SessionContext construction by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2304>
- Add integration test for S3 data connector by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2305>
- Fix `secrets.inject_secrets` when secret not found. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2306>
- Intra-table federation query on duckdb accelerated table by @y-f-u in <https://github.com/spiceai/spiceai/pull/2299>
- Postgres federation on acceleration by @y-f-u in <https://github.com/spiceai/spiceai/pull/2309>
- sqlite intra table federation on acceleration by @y-f-u in <https://github.com/spiceai/spiceai/pull/2308>
- feat: Add `DataAccelerator::init()` for SQLite acceleration federation by @peasee in <https://github.com/spiceai/spiceai/pull/2293>
- Initial framework for collecting anonymous usage telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2310>
- Add gRPC action to trigger accelerated dataset refresh by @sgrebnov in <https://github.com/spiceai/spiceai/pull/2316>
- add `disable_query_push_down` option to acceleration settings by @y-f-u in <https://github.com/spiceai/spiceai/pull/2327>
- Remove `v1/assist` by @Jeadie in <https://github.com/spiceai/spiceai/pull/2312>
- bump table provider version to set the correct dialect for postgres writer by @y-f-u in <https://github.com/spiceai/spiceai/pull/2329>
- Send telemetry on startup by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2331>
- Calculate resource IDs for telemetry by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2332>
- Refactor `v1/search`: include WHERE condition, allow extra columns in projection. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2328>
- Add integration test for gRPC dataset refresh action by @sgrebnov in <https://github.com/spiceai/spiceai/pull/2330>
- Propagate errors through all `task_history` nested spans by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2337>
- Improve tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/2338>
- update duckdb rs version to support more types: interval/duration/etc by @y-f-u in <https://github.com/spiceai/spiceai/pull/2336>
- feat: Add DuckDB accelerator init, attach databases for federation by @peasee in <https://github.com/spiceai/spiceai/pull/2335>
- Add query telemetry metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2333>
- Add system prompts for LLMs; system prompts for tool using models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2342>
- Fix benchmark test to keep running when there's failed queries by @Sevenannn in <https://github.com/spiceai/spiceai/pull/2347>
- Tools as a spicepod first class citizen. by @Jeadie in <https://github.com/spiceai/spiceai/pull/2344>
- Add `bytes_processed` telemetry metric by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2343>
- fix misaligned columns from delta lake by @y-f-u in <https://github.com/spiceai/spiceai/pull/2356>
- Emit telemetry metrics to `runtime.metrics`/Prometheus as well by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2352>
- Use UTC timezone for telemetry timestamps by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2354>
- Fix MetricType deserialization by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2358>
- Add dataset details to tool using LLMs; early check tables in vector search by @Jeadie in <https://github.com/spiceai/spiceai/pull/2353>
- Bump datafusion-federation/datafusion-table-providers dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/2360>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/2362>
- fix: Disable DuckDB and SQLite federation by @peasee in <https://github.com/spiceai/spiceai/pull/2371>
- Fix system prompt in ToolUsingChat, fix builtin registration by @Jeadie in <https://github.com/spiceai/spiceai/pull/2367>
- fix: Use --profile release for benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/2372>
- nql parameter 'use' -> 'model' by @Jeadie in <https://github.com/spiceai/spiceai/pull/2366>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta>

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.17.1-beta (August 5, 2024)

ยท 5 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API and s3, ftp, sftp, http, https, and databricks data connectors have added support for a client_timeout parameter.

Highlights in v0.17.1-betaโ€‹

Flight API GetSchema: The GetSchema API is now supported by the Flight interface. The schema of a dataset can be retrieved using GetSchema with the PATH or CMD FlightDescriptor types. The CMD FlightDescriptor type is used to get the schema of an arbitrary SQL query as the CMD bytes. The PATH FlightDescriptor type is used to retrieve the schema of a dataset.

Client Timeout: A client_timeout parameter has been added for Data Connectors: ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.

datasets:
- from: ftp://remote-ftp-server.com/path/to/folder/
name: my_dataset
params:
file_format: csv
# Example client timeout
client_timeout: 30s
ftp_user: my-ftp-user
ftp_pass: ${secrets:my_ftp_password}

Breaking Changesโ€‹

TLS is now required to be explicitly enabled. Enable TLS on the command line using --tls-enabled true:

spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or in the spicepod.yml with enabled: true:

runtime:
tls:
# TLS explicitly enabled
enabled: true
certificate_file: /path/to/cert.pem
key_file: /path/to/key.pem

Contributorsโ€‹

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

What's Changedโ€‹

Dependenciesโ€‹

  • Rust: Upgraded from v1.79.0 to v1.80.0

Commitsโ€‹

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Spice v0.17-beta (July 29, 2024)

ยท 7 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the first beta release of Spice.ai OSS! ๐ŸŽ‰

The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, project will be to on improving performance and scaling to larger datasets.

This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.

Highlights in v0.17-betaโ€‹

  • Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.

Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:

spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or configure in the spicepod.yml:

runtime:
tls:
certificate_file: /path/to/cert.pem
key_file: /path/to/key.pem

Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.

  • spice install: Running the spice install CLI command will download and install the latest version of the runtime.
spice install
  • Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.

  • Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.

  • Secrets replacement within connection strings: Secrets are now replaced within connection strings:

datasets:
- from: mysql:my_table
name: my_table
params:
mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db

Breaking Changesโ€‹

The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the odbc feature:

cargo build --release --features odbc

To use the official Spice Docker image from DockerHub:

# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta

Contributorsโ€‹

  • @y-f-u
  • @peasee
  • @digadeesh
  • @phillipleblanc
  • @ewgenius
  • @sgrebnov
  • @Sevenannn
  • @lukekim

What's Changedโ€‹

Dependenciesโ€‹

Commitsโ€‹

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.