Skip to main content

8 posts tagged with "github"

GitHub related topics and usage

View All Tags

Spice v1.5.0 (July 21, 2025)

Β· 14 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.5.0! πŸ”

Spice v1.5.0 brings major upgrades to search and retrieval. It introduces native support for Amazon S3 Vectors, enabling petabyte scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and tantivy-powered full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, and the now-stable Spice.ai Cloud Data Connector, and the upgrade to DuckDB v1.3.2.

What's New in v1.5.0​

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycleβ€”ingesting data, creating embeddings with models like Amazon Titan or Cohere via AWS Bedrock, or others available on HuggingFace, and storing it in S3 Vector buckets.

Spice integration with Amazon S3 Vectors

Example Spicepod.yml configuration for S3 Vectors:

datasets:
- from: s3://my_data_bucket/data/
name: my_vectors
params:
file_format: parquet
acceleration:
enabled: true
vectors:
engine: s3_vectors
params:
s3_vectors_aws_region: us-east-2
s3_vectors_bucket: my-s3-vectors-bucket
columns:
- name: content
embeddings:
- from: bedrock_titan
row_id:
- id

Example SQL query using S3 Vectors:

SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score

For more details, refer to the S3 Vectors Documentation.

SQL-integrated Search: Vector and BM25-scored full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, "Cricket bats")
WHERE country_code="AUS"
LIMIT 3

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

SELECT review_id, review_text, review_date, score
FROM text_search(reviews, "Cricket bats")
LIMIT 3

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

  • bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
  • truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

datasets:
- from: s3://my_bucket/some_large_table/
name: my_table
params:
file_format: parquet
acceleration:
enabled: true
engine: duckdb
mode: file
partition_by: bucket(100, account_id) # Partition account_id into 100 buckets

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

datasets:
- from: github:github.com/spiceai/docs/pulls
name: spiceai.doc.pulls
params:
github_token: ${secrets:GITHUB_TOKEN}
acceleration:
enabled: true
refresh_mode: full
refresh_check_interval: 10s
columns:
- name: body
full_text_search:
enabled: true
row_id:
- id

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

views:
- name: my_view
sql: SELECT 1
acceleration:
enabled: true
refresh_cron: '0 * * * *' # Every hour

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

datasets:
- from: github:github.com/apache/datafusion/issues
name: datafusion.issues
params:
github_token: ${secrets:GITHUB_TOKEN}
columns:
- name: title
embeddings:
- from: hf_minilm
- name: body
embeddings:
- from: openai_embeddings

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

embeddings:
- from: bedrock:cohere.embed-english-v3
name: cohere-embeddings
params:
aws_region: us-east-1
input_type: search_document
truncate: END
- from: bedrock:amazon.titan-embed-text-v2:0
name: titan-embeddings
params:
aws_region: us-east-1
dimensions: '256'

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

datasets:
- from: oracle:"SH"."PRODUCTS"
name: products
params:
oracle_host: 127.0.0.1
oracle_username: scott
oracle_password: tiger

See the Oracle Data Connector documentation.

GitHub Data Connector: The GitHub data connector supports query and acceleration of members, the users of an organization.

Example Spicepod.yml configuration:

datasets:
- from: github:github.com/spiceai/members # General format: github.com/[org-name]/members
name: spiceai.members
params:
# With GitHub Apps (recommended)
github_client_id: ${secrets:GITHUB_SPICEHQ_CLIENT_ID}
github_private_key: ${secrets:GITHUB_SPICEHQ_PRIVATE_KEY}
github_installation_id: ${secrets:GITHUB_SPICEHQ_INSTALLATION_ID}
# With GitHub Tokens
# github_token: ${secrets:GITHUB_TOKEN}

See the GitHub Data Connector Documentation

Spice.ai Cloud Data Connector: Graduated to Stable.

spice-rs SDK Release: The Spice Rust SDK has updated to v3.0.0. This release includes optimizations for the Spice client API, adds robust query retries, and custom metadata configurations for spice queries.

Contributors​

Breaking Changes​

  • Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
  • Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.

Cookbook Updates​

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading​

To upgrade to v1.5.0, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0 or pull the v1.5.0 Docker image (spiceai/spiceai:1.5.0).

What's Changed​

Dependencies​

Changelog​

  • fix: openai model endpoint (#6394) by @Sevenannn in #6394
  • Enable configuring otel endpoint from spice run (#6360) by @Advayp in #6360
  • Enable Oracle connector in default build configuration (#6395) by @sgrebnov in #6395
  • fix llm integraion test (#6398) by @Sevenannn in #6398
  • Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
  • v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
  • Fix model nsql integration tests (#6365) by @Sevenannn in #6365
  • Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
  • Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
  • Improve error messages (#6405) by @lukekim in #6405
  • build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
  • Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
  • Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
  • Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
  • Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
  • add supported types (#6409) by @kczimm in #6409
  • Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
  • Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
  • Provide error message when partition by expression changes (#6415) by @kczimm in #6415
  • Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
  • prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
  • Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
  • Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
  • Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
  • Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446
  • v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
  • Oracle: add automated TPC-H SF1 benchmark tests (#6449) by @sgrebnov in #6449
  • fix: Update benchmark snapshots (#6455) by @app/github-actions in #6455
  • Preserve ArrowError in arrow_tools::record_batch (#6454) by @mach-kernel in #6454
  • fix: Update benchmark snapshots (#6465) by @app/github-actions in #6465
  • Add option to preinstall Oracle ODPI-C library in Docker image (#6466) by @sgrebnov in #6466
  • Include Oracle connector (federated mode) in automated benchmarks (#6467) by @sgrebnov in #6467
  • Update crates/llms/src/bedrock/embed/mod.rs by @lukekim in #6468
  • v1.5.0-rc.3 release notes (#6474) by @lukekim in #6474
  • Add integration tests for S3 Vectors filters pushdown (#6469) by @sgrebnov in #6469
  • check for indexedtableprovider when finding tables to search on (#6478) by @Jeadie in #6478
  • Parse fully qualified table names in UDTFs (#6461) by @Jeadie in #6461
  • Add integration test for S3 Vectors to cover data update (overwrite) (#6480) by @sgrebnov in #6480
  • Add 'Run all tests' option for models tests and enable Bedrock tests (#6481) by @sgrebnov in #6481
  • Add support for a members table type for the GitHub Data Connector (#6464) by @Advayp in #6464
  • S3 vector data cannot be null (#6483) by @Jeadie in #6483
  • Don't infer FixedSizeList size during indexing vectors. (#6487) by @Jeadie in #6487
  • Add support for retention_sql acceleration param (#6488) by @sgrebnov in #6488
  • Make dataset refresh progress tracing less verbose (#6489) by @sgrebnov in #6489
  • Use RwLock on tantivy index in FullTextDatabaseIndex for update concurrency (#6490) by @Jeadie in #6490
  • Add tests for dataset retention logic and refactor retention code (#6495) by @sgrebnov in #6495
  • Upgade dependabot dependencies (#6497) by @phillipleblanc in #6497
  • Add periodic tracing of data loading progress during dataset refresh (#6499) by @sgrebnov in #6499
  • Promote Oracle Data Connector to Alpha (#6503) by @sgrebnov in #6503
  • Use AWS SDK to provide credentials for Iceberg connectors (#6498) by @phillipleblanc in #6498
  • Add integration tests for partitioning (#6463) by @kczimm in #6463
  • Use top-level table in full-text search JOIN ON (#6491) by @Jeadie in #6491
  • Use accelerated table in vector_search JOIN operations when appropriate (#6516) by @Jeadie in #6516
  • Fix 'additional_column' for quoted columns (fix for qualified columns broke it) (#6512) by @Jeadie in #6512
  • Also use AWS SDK for inferring credentials for S3/Delta/Databricks Delta data connectors (#6504) by @phillipleblanc in #6504
  • Add per-dataset availability monitor configuration (#6482) by @phillipleblanc in #6482
  • Suppress the warning from the AWS SDK if it can't load credentials (#6533) by @phillipleblanc in #6533
  • Change default value of check_availability from default to auto (#6534) by @lukekim in #6534
  • README.md improvements for v1.5.0 (#6539) by @lukekim in #6539
  • Temporary disable s3_vectors_basic (#6537) by @sgrebnov in #6537
  • Ensure binder errors show before query and other (#6374) by @suhuruli in #6374
  • Update spiceai/duckdb-rs -> DuckDB 1.3.2 + index fix (#6496) by @mach-kernel in #6496
  • Update table-providers to latest version with DuckDB fixes (#6535) by @phillipleblanc in #6535
  • S3: default to public access if no auth is provided (#6532) by @sgrebnov in #6532

Spice v1.0-rc.4 (Jan 6, 2025)

Β· 5 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Happy New Year πŸŽ†!

Announcing the release of Spice v1.0-rc.4 🌟

Spice v1.0.0-rc.4 is the fourth release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness. In addition, xAI has been added as a model provider.

Highlights in v1.0-rc.4​

  • xAI Model Provider: Adds support for xAI hosted models.
models:
- from: xai:grok2-latest
name: xai
params:
xai_api_key: ${secrets:SPICE_XAI_API_KEY}
  • Spicepod Spec Version: Spicepod spec version v1 is now by default. v1beta1 will continue to work.
version: v1
kind: Spicepod
name: my_pod

Cookbook​

Dependencies​

No major dependency changes.

Contributors​

  • @lukekim
  • @phillipleblanc
  • @peasee
  • @karifabri
  • @sgrebnov
  • @Jeadie
  • @ewgenius

What's Changed​

- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4087>
- Update Helm chart for v1.0.0-rc.3 (v0.2.2) by @lukekim in <https://github.com/spiceai/spiceai/pull/4088>
- Rev version to v1.0.0-rc.4 by @lukekim in <https://github.com/spiceai/spiceai/pull/4090>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4089>
- Fix OpenAI Models Integration tests by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4084>
- fix: Update Postgres TPCDS and ClickBench queries by @peasee in <https://github.com/spiceai/spiceai/pull/4092>
- fix: Check Postgres acceleration schema on insert by @peasee in <https://github.com/spiceai/spiceai/pull/4094>
- Update v1.0.0-rc.3.md by @karifabri in <https://github.com/spiceai/spiceai/pull/4096>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4093>
- First-class TSV for file data connector by @lukekim in <https://github.com/spiceai/spiceai/pull/4098>
- Allow Flight DoPut only for write api-keys by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4010>
- Only create tables `eval.runs` and `eval.results` when an eval is defined by @Jeadie in <https://github.com/spiceai/spiceai/pull/4099>
- Update Copyright year to include 2025 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4100>
- feat: add postgres clickbench accelerator, release postgres accelerator by @peasee in <https://github.com/spiceai/spiceai/pull/4111>
- Add spice binaries with metal to releases; detect metal device in `spice install/upgrade`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4097>
- docs: Clarify connector release criteria by @peasee in <https://github.com/spiceai/spiceai/pull/4112>
- Update datafusion-federation to fix LIMIT with OFFSET handling in logical plan rewrite by @ewgenius in <https://github.com/spiceai/spiceai/pull/4115>
- Support Grok AI. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4113>
- Fix `spice chat` usage bar. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4119>
- Set unified max encoding and decoding message size for all flight client configurations across runtime by @ewgenius in <https://github.com/spiceai/spiceai/pull/4116>
- feat: Add the file connector as an appendable benchmark connector by @peasee in <https://github.com/spiceai/spiceai/pull/4120>
- Add `spice eval` command by @lukekim in <https://github.com/spiceai/spiceai/pull/4118>
- Support multi-level table nesting for Dremio by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4129>
- feat: run append TPCH benchmarks in workflow (Arrow, DuckDB) by @peasee in <https://github.com/spiceai/spiceai/pull/4131>
- Fix bug in Iceberg tables selecting a subset of columns by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4132>
- feat: Run append TPCDS benchmarks in workflow (Arrow, DuckDB) by @peasee in <https://github.com/spiceai/spiceai/pull/4141>
- Setup spice.ai clickbench by @ewgenius in <https://github.com/spiceai/spiceai/pull/4134>
- Data is streamed when reading from the GitHub connector (GraphQL tables) by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4142>
- Mark the GitHub Data Connector as Stable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4143>
- Fix table quoting for Databricks Spark connector by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4145>
- Extend flight compute context for spice.ai connector with org and app names, to fix federated queries from different spice.ai data sources by @ewgenius in <https://github.com/spiceai/spiceai/pull/4144>
- Enforce Flight DoPut policies: Rate Limiting, Read Timeout, and Max Records per Batch by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4117>
- Fix bug Changes in catalog.yaml would require saving in spicepod.yaml to apply by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4147>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4137>
- Add `test-framework` crate to contain all common benchmark, E2E, integration testing logic. by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4157>
- Fix `platform_option` variable in `build_and_release.yml`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4154>
- feat: Add Clickbench append benchmark for DuckDB and Arrow by @peasee in <https://github.com/spiceai/spiceai/pull/4160>
- Upload artifacts to Minio on build_and_release by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4159>
- feat: add on zero results benchmark by @peasee in <https://github.com/spiceai/spiceai/pull/4164>
- Update spice.ai connector tests by @ewgenius in <https://github.com/spiceai/spiceai/pull/4161>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v1.0.0-rc.3...v1.0.0-rc.4>
```text

## Resources

- [Getting started with Spice.ai](https://docs.spiceai.org/getting-started/)
- [Documentation](https://docs.spiceai.org/)

## Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

- Twitter: [@spice_ai](https://twitter.com/spice_ai)
- Discord: [https://discord.gg/kZnTfneP5u](https://discord.gg/kZnTfneP5u)
- Telegram: [Spice AI Discussion](https://t.me/spiceaichat)
- Reddit: [https://www.reddit.com/r/spiceai](https://www.reddit.com/r/spiceai)
- Email: [[email protected]](mailto:[email protected])

Spice v0.20-beta (Nov 4, 2024)

Β· 4 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.20-beta 🧩

Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVidia) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidates. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.

Highlights in v0.20.0-beta​

Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.

Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.

Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVidia) for AI/ML workloads including embeddings and local LLM inference.

For instructions on compiling a Meta or CUDA binary, see the Installation Docs.

Breaking Changes​

  • The ODBC Data Connector now requires ODBC drivers specified in connection strings are registered in the system ODBC driver manager.

Example invalid connection string:

DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master

Example valid connection string:

DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master

Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.

Contributors​

  • @ewgenius
  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @Jeadie
  • @barracudarin
  • @Sevenannn

What's Changed​

- Update Helm for v0.19.4-beta and add release notes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3310>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/3311>
- `metal` & `cuda` flags for spice by @Jeadie in <https://github.com/spiceai/spiceai/pull/3212>
- Promote postgres connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3305>
- docs: Update ROADMAP.md by @peasee in <https://github.com/spiceai/spiceai/pull/3322>
- feat: Enable federation for in-memory accelerators by @peasee in <https://github.com/spiceai/spiceai/pull/3325>
- fix: Only allow env files from the current dir by @peasee in <https://github.com/spiceai/spiceai/pull/3327>
- Always read TimezoneTZ from PostgreSQL as UTC by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3330>
- For multi-sink acceleration refreshes, ensure parent table completes before the children. by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3329>
- Update TPC-DS Q49 (Decimal to Float) to match SQLite's type system by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3323>
- Enable parquet pushdown in Spice by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3245>
- Use spice object_store fork to fix S3 ambiguous error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3304>
- Don't mix commented out queries for s3 connectors and accelerators by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3331>
- Allow only valid WHERE conditions in vector searches by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3335>
- fix: Allow only ODBC profiles by @peasee in <https://github.com/spiceai/spiceai/pull/3324>
- Track how many times an acceleration falls back during initialization by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3339>
- Anthropic model regex and fix tool parsing aggregation bug by @Jeadie in <https://github.com/spiceai/spiceai/pull/3334>
- Upgrade runtime along with CLI on `spice upgrade` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3341>
- Update upcoming Roadmap by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3343>
- fix: Prevent acceleration files outside of working directory by @peasee in <https://github.com/spiceai/spiceai/pull/3340>
- Document S3 connector limitations by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3333>
- Update Object Store Patch by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3361>
- Promote SQLite Data Accelerator to Beta by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3365>
- Promote S3 connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3362>
- Revert "fix: Only allow env files from the current dir" by @peasee in <https://github.com/spiceai/spiceai/pull/3368>
- docs: Fix typo for S3 release status in README.md by @peasee in <https://github.com/spiceai/spiceai/pull/3370>
- Include unnecessary columns pruning step during federated plan creation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3363>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta>

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.19.3-beta (Oct 28, 2024)

Β· 5 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v0.19.3-beta πŸ“ˆ

Spice v0.19.3-beta improves the performance and stability of data connectors and accelerators, including faster queries across multiple federated sources by optimizing how filters are applied. Anthropic has also been added as a LLM model provider.

Highlights in v0.19.3​

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, expanding TPC-DS coverage and correctness.

GitHub Data Connector Beta Milestone: The GitHub Data Connector has graduated to Beta after extensive testing, stability, and performance improvements.

Anthropic Models Provider: Anthropic has been added as an LLM provider, including support for streaming.

Example spicepod.yml:

models:
- from: anthropic:claude-3-5-sonnet-20240620
name: claude_3_5_sonnet
params:
anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }

Breaking changes​

None.

Contributors​

  • @Jeadie
  • @Sevenannn
  • @phillipleblanc
  • @peasee
  • @sgrebnov
  • @nlamirault
  • @barracudarin
  • @lukekim
  • @slyons

New Contributors​

What's Changed​

- Make Anthropic OpenAI compatible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3087
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3200
- Bump version to 1.0.0-rc.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3202
- Fix clickhouse schema inference for non-default database by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3201
- Update endgame template by @Sevenannn in https://github.com/spiceai/spiceai/pull/3198
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3197
- fix: dataset refresh defaults properties to None by @peasee in https://github.com/spiceai/spiceai/pull/3205
- Upgrade OTEL to v0.26 and make seconds based metrics reported precisely by @sgrebnov in https://github.com/spiceai/spiceai/pull/3203
- use `text_embedding_inference::Infer` for more complete embedding solution by @Jeadie in https://github.com/spiceai/spiceai/pull/3199
- Add S3 parquet file - arrow accelerator e2e test by @Sevenannn in https://github.com/spiceai/spiceai/pull/3154
- feat: Add script to setup clickbench on mysql by @peasee in https://github.com/spiceai/spiceai/pull/3176
- Update helm chart version to v0.19.2 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3210
- Add sample dataset option in `v1/nsql`. by @Jeadie in https://github.com/spiceai/spiceai/pull/3105
- Split spiced_docker build across architectures by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3206
- feat(helm): do not install demo dataset by default by @nlamirault in https://github.com/spiceai/spiceai/pull/3207
- Split integration test across build/run steps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3215
- feat(helm): Refactoring Kubernetes labels by @nlamirault in https://github.com/spiceai/spiceai/pull/3208
- Define 'tool_recursion_limit' for LLMs, and limit internal tool calling recursion. by @Jeadie in https://github.com/spiceai/spiceai/pull/3214
- Improve filters pushdown for federated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3183
- Implement native schema inference for PostgreSQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3209
- docs: Update release criteria by @peasee in https://github.com/spiceai/spiceai/pull/3219
- Run SQLite acceleration TPC-DS tests using smaller scale by @sgrebnov in https://github.com/spiceai/spiceai/pull/3227
- bind the serviceAccount if a name is given or if we're creating one by @barracudarin in https://github.com/spiceai/spiceai/pull/3228
- Only emit channel send error log when its not a closed channel error by @Jeadie in https://github.com/spiceai/spiceai/pull/3230
- Enable Parquet Exec filter pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3216
- Add snapshots for SQLite TPC-DS benchmark (file mode) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3234
- docs: Add SDK release checks to endgame by @peasee in https://github.com/spiceai/spiceai/pull/3256
- Implement `localpod` Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3249
- Revert "Enable Parquet Exec filter pushdown in Spice (#3216)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/3244
- refactor: Use existing action for detecting changes by @peasee in https://github.com/spiceai/spiceai/pull/3255
- feat: Add GitHub integration test by @peasee in https://github.com/spiceai/spiceai/pull/3226
- Add get_readiness tool to retrieve status of all registered components by @lukekim in https://github.com/spiceai/spiceai/pull/3035
- Improve CLI error output when REPL can't connect to the Flight endpoint by @slyons in https://github.com/spiceai/spiceai/pull/3188
- Fixing FTP link in Endgame by @slyons in https://github.com/spiceai/spiceai/pull/3267
- Update version to 0.19.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3269
- add service type and annotation customizations in https://github.com/spiceai/spiceai/pull/3268

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.19.2-beta...v0.19.3-beta

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.19.1-beta (Oct 14, 2024)

Β· 5 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice v0.19.1-beta πŸ”₯

Spice v0.19.1 brings further performance and stability improvements to data connectors, including improved query push-down for file-based connectors (s3, abfs, file, ftp, sftp) that use Hive-style partitioning.

Highlights in v0.19.1​

TPC-H and TPC-DS Coverage: Expanded coverage for TPC-H and TPC-DS benchmarking suites across accelerators and connectors.

GitHub Connector Array Filter: The GitHub connector now supports filter push down for the array_contains function in SQL queries using search query mode.

NSQL CLI Command: A new spice nsql CLI command has been added to easily query datasets with natural language from the command line.

Breaking changes​

None

Contributors​

  • @peasee
  • @Sevenannn
  • @sgrebnov
  • @karifabri
  • @phillipleblanc
  • @lukekim
  • @Jeadie
  • @slyons

Dependencies​

What's Changed​

- release: Update helm chart for v0.19.0-beta by @peasee in https://github.com/spiceai/spiceai/pull/3024
- Set fail-fast = true for benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2997
- release: Update next version and ROADMAP by @peasee in https://github.com/spiceai/spiceai/pull/3033
- Verify TPCH benchmark query results for Spark connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2993
- feat: Add x-spice-user-agent header to Spice REPL by @peasee in https://github.com/spiceai/spiceai/pull/2979
- Update to object store file formats documentation link by @karifabri in https://github.com/spiceai/spiceai/pull/3036
- Use teraswitch-runners for Linux x64 workflows + builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3042
- feat: Support array contains in GitHub pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2983
- Bump text-splitter from 0.16.1 to 0.17.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2987
- Revert integration tests back to hosted runner by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3046
- Tune Github runner resources to allow in memory TPCDS benchmark to run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3025
- fix: add winver by @peasee in https://github.com/spiceai/spiceai/pull/3054
- refactor: Use is modifier for checking GitHub state filter by @peasee in https://github.com/spiceai/spiceai/pull/3056
- Enable `merge_group` checks for PR workflows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3058
- Fix issues with merge group by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3059
- Validate in-memory arrow accelertion TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3044
- Fix rev parsing for PR checks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3060
- Use 'Accept' header for `/v1/sql/` and `/v1/nsql` by @Jeadie in https://github.com/spiceai/spiceai/pull/3032
- Verify Postgres acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3043
- Add NSQL CLI REPL command by @lukekim in https://github.com/spiceai/spiceai/pull/2856
- Preserve query results order and add TPCH benchmark results verification for duckdb:file mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3034
- Refactor benchmark to include MySQL tpcds bench, tweaks to makefile target for generating mysql tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2967
- Support runtime parameter for `sql_query_keep_partition_by_columns` & enable by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3065
- Document TPC-DS limitations: `EXCEPT`, `INTERSECT`, duplicate names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3069
- Adding ABFS benchmark by @slyons in https://github.com/spiceai/spiceai/pull/3062
- Add support for GitHub app installation auth for GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3063
- docs: Document stack overflow workaround, add helper script by @peasee in https://github.com/spiceai/spiceai/pull/3070
- Tune MySQL TPCDS image to allow for successful benchmark test run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3067
- Automatically infer partitions for hive-style partitioned files for object store based connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3073
- Support `hf_token` from params/secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/3071
- Inherit embedding columns from source, when available. by @Jeadie in https://github.com/spiceai/spiceai/pull/3045
- Validate identifiers for component names by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3079
- docs: Add workaround for TPC-DS Q97 in MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3080
- Document TPC-DS Postgres column alias in a CASE statement limitation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3083
- Update plan snapshots for TPC-H bench queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3088
- Update Datafusion crate to include recent unparsing fixes by @sgrebnov in https://github.com/spiceai/spiceai/pull/3089
- Sample SQL table data tool and API by @Jeadie in https://github.com/spiceai/spiceai/pull/3081
- chore: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3090
- Add `hive_infer_partitions` to remaining object store connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3086
- deps: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3093
- For local embedding models, return usage input tokens. by @Jeadie in https://github.com/spiceai/spiceai/pull/3095
- Update end_game.md with Accelerator/Connector criteria check by @slyons in https://github.com/spiceai/spiceai/pull/3092
- Update TPC-DS Q90 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3094
- docs: Add RC connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3026
- Update version to 0.19.1-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3101

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.19.0-beta...v0.19.1-beta

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.