6 posts tagged with "arrow"

Apache Arrow topics and usage

Spice v1.0-stable (Jan 20, 2025)

· 8 min read
William Croxson
Senior Software Engineer at Spice AI

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search, and AI inference have all graduated to stable status, along with key components across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Breaking Changes

  • Default Runtime Version: The CLI now installs the GPU-accelerated, AI-capable Runtime by default (if supported) when running spice install or spice run. To force-install the non-GPU version, run spice install ai --cpu.

  • Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.

  • Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.

  • Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.

  • Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
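The insecure-endpoint rule can be illustrated with a short Python sketch. This is not the connector's actual implementation; the function name `check_endpoint` and the error messages are hypothetical, and only the behavior (HTTP is rejected unless `allow_http` is enabled) comes from the note above.

```python
from urllib.parse import urlparse


def check_endpoint(endpoint: str, allow_http: bool = False) -> str:
    """Return the endpoint if it passes the insecure-endpoint rule, else raise.

    Hypothetical helper: mirrors the rule described in the release note,
    not the Spice runtime's own code.
    """
    scheme = urlparse(endpoint).scheme.lower()
    if scheme == "https":
        return endpoint
    if scheme == "http" and allow_http:
        # Plain HTTP is accepted only after an explicit opt-in.
        return endpoint
    if scheme == "http":
        raise ValueError(
            f"insecure endpoint {endpoint!r}: enable allow_http to permit HTTP"
        )
    raise ValueError(f"unsupported endpoint scheme in {endpoint!r}")
```

Under these assumptions, `check_endpoint("http://localhost:9000")` raises, while the same call with `allow_http=True` returns the endpoint unchanged. Existing Spicepods that point at a local HTTP object store are the ones most likely to be affected by this change.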

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.0 image:

docker pull spiceai/spiceai:1.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

Contributors

  • @peasee
  • @ewgenius
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @phillipleblanc
  • @sgrebnov

What's Changed

- feat: Update load test criteria, testoperator updates by @peasee in <https://github.com/spiceai/spiceai/pull/4311>
- Update helm for v1.0.0-rc.5 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4313>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4318>
- Bump version to v1.0.0, update SECURITY.md by @ewgenius in <https://github.com/spiceai/spiceai/pull/4314>
- Initial criteria for models, embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4223>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4321>
- Add dremio param for running load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4315>
- Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4328>
- Handle failed query in load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4327>
- feat: Use load test hours for baseline query sets by @peasee in <https://github.com/spiceai/spiceai/pull/4334>
- Fix typo in 1.0.0-rc.5 release notes by @ewgenius in <https://github.com/spiceai/spiceai/pull/4329>
- feat: add testoperator data consistency by @peasee in <https://github.com/spiceai/spiceai/pull/4319>
- docs: Release DuckDB connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4335>
- Fix DocumentDB -> DynamoDB by @lukekim in <https://github.com/spiceai/spiceai/pull/4339>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4337>
- fix: Download hits.parquet from MinIO for benchmark by @peasee in <https://github.com/spiceai/spiceai/pull/4338>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4341>
- Remove evil averages by @lukekim in <https://github.com/spiceai/spiceai/pull/4343>
- Don't run builds on non-code changes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4344>
- Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4345>
- Update s3 tpcds spicepods by @ewgenius in <https://github.com/spiceai/spiceai/pull/4346>
- Explicitly set required scale factor for throughput and load tests by @ewgenius in <https://github.com/spiceai/spiceai/pull/4347>
- Fix s3 tpcds dataset name by @ewgenius in <https://github.com/spiceai/spiceai/pull/4348>
- Promote Iceberg Catalog Connector to Beta by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4350>
- Update s3 clickbench benchmark snapshots by @ewgenius in <https://github.com/spiceai/spiceai/pull/4351>
- fix: DuckDB clickbench on zero results by @peasee in <https://github.com/spiceai/spiceai/pull/4349>
- Add integration test with snapshots for databricks catalog connector by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4353>
- refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in <https://github.com/spiceai/spiceai/pull/4354>
- Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4297>
- docs: Update roadmap by @peasee in <https://github.com/spiceai/spiceai/pull/4364>
- feat: Release accelerators stable by @peasee in <https://github.com/spiceai/spiceai/pull/4361>
- Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4365>
- Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the `allow_http` parameter by @ewgenius in <https://github.com/spiceai/spiceai/pull/4363>
- Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4367>
- Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4369>
- Update Roadmap - Iceberg beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4373>
- Build CUDA binaries for Linux by @Jeadie in <https://github.com/spiceai/spiceai/pull/4320>
- Promote Nvidia NIM as Alpha by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4380>
- Promote xai to alpha by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4381>
- Update stable criteria for object store based connectors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4383>
- Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in <https://github.com/spiceai/spiceai/pull/4382>
- Promote S3 Data Connector to Stable by @ewgenius in <https://github.com/spiceai/spiceai/pull/4385>
- Download platform-supported CUDA binary version on Linux by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4356>
- Fix http consistency test workflow, add overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4387>
- feat: Add Postgres test spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/4388>
- Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in `crates/llms/tests`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4377>
- Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4389>
- Fix http overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4390>
- Tweak model tests, fix embedding input by @ewgenius in <https://github.com/spiceai/spiceai/pull/4391>
- Promote Dremio to Stable quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4392>
- Add beta functionality tests for embedding models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4352>
- docs: Release postgres connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4398>
- Increase timeout for model response in E2E tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4399>
- Disable ident normalization (i.e. `SELECT MyColumn from table` works) by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4400>
- Preserve schema metadata by @ewgenius in <https://github.com/spiceai/spiceai/pull/4402>
- Make models integration tests tracing less verbose by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4403>
- Fix `cuda` feature build on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4404>
- Promote MySQL to Stable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4406>
- docs: Release Delta Lake and Unity catalog by @peasee in <https://github.com/spiceai/spiceai/pull/4405>
- Use `gpt-4o-mini` as a default model for openai provider by @ewgenius in <https://github.com/spiceai/spiceai/pull/4410>
- Fix streaming for Openai and Anthropic by @Jeadie in <https://github.com/spiceai/spiceai/pull/4409>
- Tweak model loading and missing tool errors messages by @ewgenius in <https://github.com/spiceai/spiceai/pull/4412>
- Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4407>
- Build Windows CUDA binaries as part of `build_and_release` workflow by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4386>
- Update docs link by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4416>
- feat: Add CPU models install escape hatch by @peasee in <https://github.com/spiceai/spiceai/pull/4419>
- Handle OpenAI API Errors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4417>
- Update spice cli to use `GH_TOKEN` or `GITHUB_TOKEN` env variables when calling releases api by @ewgenius in <https://github.com/spiceai/spiceai/pull/4175>
- Implement secure sandboxing for Docker image by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4411>
- Automatically install supported CUDA binary on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4420>
- Metrics for LLMs+ embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4418>
- Jeadie/25 01 17/beta perf by @Jeadie in <https://github.com/spiceai/spiceai/pull/4397>
- Pass GitHub token to all CI steps calling spice run by @ewgenius in <https://github.com/spiceai/spiceai/pull/4423>
- Run the models integration tests on PRs by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4421>
- Run CUDA builds in a separate workflow by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4430>
- Promote OpenAI models and embeddings providers to RC by @ewgenius in <https://github.com/spiceai/spiceai/pull/4432>
- Update link to retrieval-augmented generation (RAG) details by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4433>
- Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4436>
- Update quickstart traces to match current version by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4435>
- Update Supported Embeddings Providers Readme section by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4434>
- Local models can stream tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4429>
- fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in <https://github.com/spiceai/spiceai/pull/4442>
- Fix run query action by @ewgenius in <https://github.com/spiceai/spiceai/pull/4444>
- Default to AI-enabled runtime for `spice run`/`spice install` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4443>
- Change no spicepod.yaml log to warning by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4447>
- refactor: Update Catalog Connector error messages by @peasee in <https://github.com/spiceai/spiceai/pull/4441>
- Fix panic when converting OTel metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4449>
- refactor: Update model errors by @peasee in <https://github.com/spiceai/spiceai/pull/4446>
- Update spiceai/mistral.rs to silence metadata logs by @ewgenius in <https://github.com/spiceai/spiceai/pull/4452>
- fix xAI; don't use openai defaults by @Jeadie in <https://github.com/spiceai/spiceai/pull/4450>
- Improves the UX of using huggingface models by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4451>
- Add GH Workflow to test `spice ai` runtime installation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4448>
- fix: Use specific model errors where available by @peasee in <https://github.com/spiceai/spiceai/pull/4454>
- Detect and report unsupported embedding column type during dataset registration by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4456>
- Handle Errors by @Jeadie in <https://github.com/spiceai/spiceai/pull/4455>
- Catch and report negative openai_temperature error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4453>
- Clarify release check error message if it is caused by wrong GH token by @ewgenius in <https://github.com/spiceai/spiceai/pull/4458>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0>

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v1.0-rc.1 (Nov 27, 2024)

· 14 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v1.0-rc.1 🚀

Spice v1.0.0-rc.1 marks the release candidate for the first major version of Spice.ai OSS. This milestone includes key Connector and Accelerator graduations and bug fixes, positioning Spice for a stable and production-ready release.

Highlights in v1.0-rc.1

API Key Authentication: Spice now supports optional authentication for API endpoints via configurable API keys, for additional security and control over runtime access.

Example Spicepod.yml configuration:

runtime:
  auth:
    api-key:
      enabled: true
      keys:
        - ${ secrets:api_key } # Load from a secret store
        - my-api-key # Or specify directly

Usage:

  • HTTP API: Include the API key in the X-API-Key header.
  • Flight SQL: Use the API key in the Authorization header as a Bearer token.
  • Spice CLI: Provide the --api-key flag for CLI commands.

For more details on using API Key auth, refer to the API Auth documentation.
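As a rough client-side illustration of the usage rules above, the following Python sketch builds the headers for each transport. The base URL `http://localhost:8090` is an assumption (adjust it to wherever the runtime listens), and the helper names are hypothetical. The request is only constructed, not sent.

```python
import urllib.request

# Assumed local HTTP address; change this to match your runtime.
BASE_URL = "http://localhost:8090"


def http_headers(api_key: str) -> dict[str, str]:
    """Headers for the HTTP API: the key travels in X-API-Key."""
    return {"X-API-Key": api_key}


def flight_sql_headers(api_key: str) -> dict[str, str]:
    """Headers for Flight SQL: the key is sent as a Bearer token."""
    return {"Authorization": f"Bearer {api_key}"}


def build_datasets_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to /v1/datasets."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/datasets",
        headers=http_headers(api_key),
        method="GET",
    )
```

Passing the result of `build_datasets_request("my-api-key")` to `urllib.request.urlopen` would send it. In practice the key should come from an environment variable or secret store rather than being written inline.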

DuckDB Data Connector: Has graduated from Beta to Release Candidate.

Arrow and DuckDB Data Accelerators: Both have graduated from Beta to Release Candidate.

Debezium Kafka Integration: Spice now supports secure authentication and encryption options for Kafka connections when using Debezium for Change Data Capture (CDC). The previous limitation of PLAINTEXT protocol-only connections has been lifted. Spice now supports the following Kafka security configurations:

  • Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
  • SASL mechanisms: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512

Example Spicepod.yml configuration:

datasets:
  - from: debezium:my_kafka_topic_with_debezium_changes
    name: my_dataset
    params:
      kafka_security_protocol: SASL_SSL
      kafka_sasl_mechanism: SCRAM-SHA-512
      kafka_sasl_username: kafka
      kafka_sasl_password: ${secrets:kafka_sasl_password}
      kafka_ssl_ca_location: ./certs/kafka_ca_cert.pem
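A Spicepod's Kafka parameters can be sanity-checked before deployment against the supported values listed above. The sketch below is a hypothetical pre-flight helper, not part of Spice. It assumes the `params` mapping has already been loaded from YAML, and it uses `SASL_SSL` as the default protocol, as described under the breaking changes for this release.

```python
# Supported values as listed in the release notes.
SECURITY_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}
SASL_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"}


def validate_kafka_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Default taken from the v1.0-rc.1 breaking-change note.
    protocol = params.get("kafka_security_protocol", "SASL_SSL")
    if protocol not in SECURITY_PROTOCOLS:
        problems.append(f"unsupported kafka_security_protocol: {protocol}")

    # Only check the mechanism when one is given; its default is not assumed.
    mechanism = params.get("kafka_sasl_mechanism")
    if mechanism is not None and mechanism not in SASL_MECHANISMS:
        problems.append(f"unsupported kafka_sasl_mechanism: {mechanism}")

    return problems
```

For the example configuration above, `validate_kafka_params` returns an empty list.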

Breaking changes

Model Parameters: The params.spice_tools parameter has been replaced by params.tools. Backward compatibility is maintained for existing configurations using params.spice_tools.

Dataset Accelerator State: The ready_state parameter has been moved to the dataset level.

Ready Handler Response: The response body of the /v1/ready handler has been changed from Ready (uppercase) to ready (lowercase) for consistency and adherence to standards.
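Health checks that compare the response body exactly will need updating. One tolerant approach, sketched here in Python with a hypothetical helper name, is to compare case-insensitively so the same check works against runtimes before and after this change:

```python
def is_ready(body: str) -> bool:
    """Accept both the old "Ready" and the new "ready" response bodies."""
    return body.strip().lower() == "ready"
```

This keeps a single probe script usable during a rolling upgrade where old and new runtime versions coexist.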

Default Kafka Security for Debezium: The default value of the kafka_security_protocol parameter for Debezium datasets has changed from PLAINTEXT to SASL_SSL, improving security by default.

Metrics Name Updates: Adjustments have been made to specific metrics for improved observability and accuracy:

| Before | v1.0-rc.1 |
| --- | --- |
| catalogs_load_error | catalog_load_errors |
| catalogs_status | catalog_load_state |
| datasets_acceleration_append_duration_ms, datasets_acceleration_load_duration_ms | dataset_acceleration_refresh_duration_ms {mode: append/full} |
| datasets_acceleration_last_refresh_time | dataset_acceleration_last_refresh_time_ms |
| datasets_acceleration_refresh_error | dataset_acceleration_refresh_errors |
| datasets_count | dataset_active_count |
| datasets_load_error | dataset_load_errors |
| datasets_status | dataset_load_state |
| datasets_unavailable_time | dataset_unavailable_time_ms |
| embeddings_count | embeddings_active_count |
| embeddings_load_error | embeddings_load_errors |
| embeddings_status | embeddings_load_state |
| flight_do_action_duration_ms, flight_do_get_get_primary_keys_duration_ms, flight_do_get_get_catalogs_duration_ms, flight_do_get_get_schemas_duration_ms, flight_do_get_get_sql_info_duration_ms, flight_do_get_table_types_duration_ms, flight_do_get_get_tables_duration_ms, flight_do_get_prepared_statement_query_duration_ms, flight_do_get_simple_duration_ms, flight_do_get_statement_query_duration_ms, flight_do_put_duration_ms, flight_handshake_request_duration_ms, flight_list_actions_duration_ms, flight_get_flight_info_request_duration_ms | flight_request_duration_ms {method: method_name, command: command_name} |
| flight_do_action_requests, flight_do_exchange_data_updates_sent, flight_do_exchange_requests, flight_do_put_requests, flight_do_get_requests, flight_handshake_requests, flight_list_actions_requests, flight_list_flights_requests, flight_get_flight_info_requests, flight_get_schema_requests | flight_requests {method: method_name, command: command_name} |
| http_requests_duration_ms | http_request_duration_ms |
| models_count | model_active_count |
| models_load_duration_ms | model_load_duration_ms |
| models_load_error | model_load_errors |
| models_status | model_load_state |
| tool_count | tool_active_count |
| tool_load_error | tool_load_errors |
| tools_status | tool_load_state |
| query_count | query_executions |
| query_execution_duration | query_execution_duration_ms |
| results_cache_hit_count | results_cache_hits |
| results_cache_item_count | results_cache_items_count |
| results_cache_max_size | results_cache_max_size_bytes |
| results_cache_request_count | results_cache_requests |
| results_cache_size | results_cache_size_bytes |
| secrets_stores_load_duration_ms | secrets_store_load_duration_ms |
| bytes_processed | query_processed_bytes |
| bytes_returned | query_returned_bytes |
| spiced_runtime_flight_server_start | runtime_flight_server_started |
| spiced_runtime_http_server_start | runtime_http_server_started |
| views_load_error | view_load_errors |
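Dashboards and alert rules that reference the old names need to be updated. As a minimal sketch, the one-to-one renames can be applied mechanically with a small Python script. The mapping below covers only a subset of the renames listed above, and the function name is hypothetical.

```python
import re

# A subset of the one-to-one renames listed above; extend as needed.
RENAMES = {
    "datasets_count": "dataset_active_count",
    "datasets_status": "dataset_load_state",
    "models_count": "model_active_count",
    "query_count": "query_executions",
    "query_execution_duration": "query_execution_duration_ms",
    "results_cache_hit_count": "results_cache_hits",
    "bytes_processed": "query_processed_bytes",
    "http_requests_duration_ms": "http_request_duration_ms",
}

# Longest names first so the alternation never stops at a shorter prefix.
_PATTERN = re.compile(
    r"\b("
    + "|".join(sorted(map(re.escape, RENAMES), key=len, reverse=True))
    + r")\b"
)


def migrate_metric_names(text: str) -> str:
    """Rewrite old metric names found as whole words in a query string."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)
```

For example, `migrate_metric_names("sum(rate(query_count[5m]))")` yields `sum(rate(query_executions[5m]))`. Names are matched as whole words, so a metric name that carries an extra prefix or suffix is left untouched. The merged Flight and acceleration metrics are not simple renames: the old per-method names map to a single metric with labels, so queries against them need to be rewritten by hand.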

Contributors

  • @phillipleblanc
  • @sgrebnov
  • @Jeadie
  • @Sevenannn
  • @peasee
  • @slyons
  • @barracudarin
  • @lukekim
  • @ewgenius

What's changed

- Update to next release version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3372
- Update Helm chart to v0.20.0-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3373
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3375
- E2E: Add a test to confirm refreshing with custom `refresh-sql` via CLI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3374
- Fix regression in inferring embedding model vector size for non-default models by @Jeadie in https://github.com/spiceai/spiceai/pull/3376
- add AI quickstarts to endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/3378
- Remove need for `params.model_type` for most HF LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/3342
- Replace `query_duration_seconds` and `http_requests_duration_seconds` with `milliseconds` metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/3251
- Add `Extension<Runtime>` to HTTP routes to simplify tooling in NSQL. by @Jeadie in https://github.com/spiceai/spiceai/pull/3384
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3386
- Ensure hyperparameters are obeyed in recursive chat/completion calls. by @Jeadie in https://github.com/spiceai/spiceai/pull/3395
- fix: update odbc benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3394
- Implement traits & plumbing for pluggable HTTP Auth by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3397
- Add allow_http parameter for S3 data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3398
- Add column field to dataset spicepod component by @Jeadie in https://github.com/spiceai/spiceai/pull/3336
- feat: add duckdb connector benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3403
- Add integration tests for OpenAI NSQL functionality by @sgrebnov in https://github.com/spiceai/spiceai/pull/3402
- Implement optional api-key auth for the HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3405
- Add integration tests for Search API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3410
- HTTP APIs: list tools, call tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3404
- Implement optional api-key auth for the Flight/FlightSQL endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3412
- Adding semicolons to some TPCH queries to make sure they run on the CLI by @slyons in https://github.com/spiceai/spiceai/pull/3420
- Add GrpcAuth to protect the OpenTelemetry endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3417
- Support Kafka-native authentication and TLS connections for Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3419
- Add integration tests for Embeddings API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3416
- Support base64 embedding format by @Jeadie in https://github.com/spiceai/spiceai/pull/3418
- Give local models some love by @Jeadie in https://github.com/spiceai/spiceai/pull/3425
- Have views update on `--pods-watcher-enabled` by @Jeadie in https://github.com/spiceai/spiceai/pull/3428
- Simplify running models integration tests locally by @sgrebnov in https://github.com/spiceai/spiceai/pull/3424
- Make Debezium connector MySQL compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3432
- Store + load memory tooling, enable by @Jeadie in https://github.com/spiceai/spiceai/pull/3413
- Statically compile OpenSSL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3434
- Build macOS x64 on macos-14 (Sonoma) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3435
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3443
- Bump azure_core from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3436
- Add integration tests for chat completion API (HF and OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3433
- Run Clickbench with Spice Benchmark Binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3389
- Use `datatype_is_semantically_equal` in `verify_schema` by @Sevenannn in https://github.com/spiceai/spiceai/pull/3423
- Use spiceai-large-runners to build benchmark binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3446
- Skip reqwest_retry::middleware tracing in non verbose configuration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3445
- feat: Add invalid type action handling for DuckDB by @peasee in https://github.com/spiceai/spiceai/pull/3430
- Fix benchmark: Lock poisoning issue from INSTA by @Sevenannn in https://github.com/spiceai/spiceai/pull/3457
- docs: Release DuckDB Connector RC by @peasee in https://github.com/spiceai/spiceai/pull/3459
- DR: Code Pattern For Obtaining Milliseconds-Based Duration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3460
- Improve ClickBench setup script: avoid re-downloading test data every time by @sgrebnov in https://github.com/spiceai/spiceai/pull/3463
- Fix `TableReference` quoting for MySQL by @Jeadie in https://github.com/spiceai/spiceai/pull/3461
- Tool use and model name for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3458
- `params.tools`, not `params.spice_tools`. Allow backwards compatibility to `params.spice_tools`. by @Jeadie in https://github.com/spiceai/spiceai/pull/3473
- fix: Support DuckDB boolean list by @peasee in https://github.com/spiceai/spiceai/pull/3474
- Upgrade to DataFusion 43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3462
- Build explicit ODBC Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3476
- Promote Arrow acceleration to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/3478
- Update benchmark workflow to create PR for updating snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3479
- Update benchmark snapshots for spice.ai connector tpch by @github-actions in https://github.com/spiceai/spiceai/pull/3481
- Update setup-make action by @Sevenannn in https://github.com/spiceai/spiceai/pull/3488
- Option to return sql from `v1/nsql` by @Jeadie in https://github.com/spiceai/spiceai/pull/3487
- Adding scripts to run and monitor TPC-H/-DS queries at larger scale factors by @slyons in https://github.com/spiceai/spiceai/pull/3483
- Update Datafusion and Datafusion-Table-Providers patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3489
- docs: Update Accelerator RC to specify clickbench in all modes by @peasee in https://github.com/spiceai/spiceai/pull/3490
- Add logos and marks by @lukekim in https://github.com/spiceai/spiceai/pull/3485
- Updates to repo docs by @lukekim in https://github.com/spiceai/spiceai/pull/3486
- Change `document_similarity` to return markdown, not JSON. by @Jeadie in https://github.com/spiceai/spiceai/pull/3477
- Add support for creating embeddings for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3498
- Add vector search support for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3500
- Update `datafusion-table-providers` version by @Jeadie in https://github.com/spiceai/spiceai/pull/3503
- Update `text-embeddings-inference` and `mistral.rs` from downstream. by @Jeadie in https://github.com/spiceai/spiceai/pull/3505
- Fix snapshot update PR push in benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3484
- Run FederationAnalyzerRule before ResolveGroupingFunction rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3508
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3509
- docs: Release DuckDB accelerator RC by @peasee in https://github.com/spiceai/spiceai/pull/3512
- Upgrade datafusion-functions-json to 0.43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3511
- Update Datafusion Table Provider patch to fix MySQL refresh append mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3514
- Handle panics in HF API calls by @Jeadie in https://github.com/spiceai/spiceai/pull/3521
- Update Runtime metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3518
- Update Flight metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3515
- Update Results Cache metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3520
- Move `ready_state` to dataset level by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3526
- Add `--force` option to `spice upgrade` to force it to upgrade to the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3527
- Refactor runtime initialization into separate modules by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3531
- Update Anonymous telemetry metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3529
- Add Metrics naming principles and guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3516
- Update Dataset Acceleration metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3528
- Improve localpod startup to register immediately after its parent is registered by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3532
- AI/LLM integration tests: make tests more robust and verify more ai_tools by @sgrebnov in https://github.com/spiceai/spiceai/pull/3513
- Update dashboards to match new metrics names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3530
- Clarify source of prefixes for data component parameters. by @Jeadie in https://github.com/spiceai/spiceai/pull/3541
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3564
- Update Spice release process to support release branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3525
- fix: Validate the endpoint for ABFS and S3 by @peasee in https://github.com/spiceai/spiceai/pull/3565
- Vector Search: Default to datasets with embeddings only when none are specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3575
- Lowercase the ready handler response by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3577
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3579
- Improve `spice search` error handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/3571
- Load components in parallel, not concurrently by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3566
- fix: Make S3 auth parameter validation more robust: by @peasee in https://github.com/spiceai/spiceai/pull/3578
- fix: Infer if the specified file format is correct in object store by @peasee in https://github.com/spiceai/spiceai/pull/3580
- Add ability to configure CORS on the HTTP server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3581
- fix: Handle invalid S3 auth and region better by @peasee in https://github.com/spiceai/spiceai/pull/3582
- allow setting of replicaCount to a falsy-value by @barracudarin in https://github.com/spiceai/spiceai/pull/3586
- `spice search` to default to only datasets with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3588
- Run AI integration tests as part of CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3572
- Load datasets in parallel by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3585
- Run integration test on smaller runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3583
- Use folders for model component by @Jeadie in https://github.com/spiceai/spiceai/pull/3584
- Improve models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3592
- Change default task_history captured_output to `none` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3598
- Add timeout to `/v1/datasets` APIs when app is locked by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3601
- Properly drop the read lock on the runtime app in http.start by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3603
- Make integration tests more robust on fewer cores by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3604
- refactor: First pass data connector error messages update by @peasee in https://github.com/spiceai/spiceai/pull/3602
- Add log if no datasets are configured by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3605
- Upgrade to DuckDB 1.1.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3606
- Add E2E test for spice search and chat functionality (OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3599
- Use spiceai-runners for TPCH / TPCDS benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3507
- docs: Update error handling guide by @peasee in https://github.com/spiceai/spiceai/pull/3611
- Improve default description for sql tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3612
- Update metric name from `query_invocations` to `query_executions` by @sgrebnov in https://github.com/spiceai/spiceai/pull/3613
- Don't provide runtime tools to health check. by @Jeadie in https://github.com/spiceai/spiceai/pull/3615
- Sort vector search results based on similarity score by @sgrebnov in https://github.com/spiceai/spiceai/pull/3620
- Allow overriding runtime configuration with `--set-runtime` CLI flags by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3619
- Some bugs by @Jeadie in https://github.com/spiceai/spiceai/pull/3621
- Improve S3 errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3640
- Update Databricks, Delta Lake, DuckDB error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/3642
- docs: Add error message UX to beta connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3639
- feat: Make REPL identify it's waiting on a new line by @peasee in https://github.com/spiceai/spiceai/pull/3617
- Wrap Server-Sent-Events chat errors as OpenAI error events by @sgrebnov in https://github.com/spiceai/spiceai/pull/3641
- refactor: Update accelerated table errors, dataset health monitor errors by @peasee in https://github.com/spiceai/spiceai/pull/3614
- Extend `v1/datasets` api to indicate if dataset can be used in vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3644
- feat: Unnest DataFusion errors by @peasee in https://github.com/spiceai/spiceai/pull/3646
- feat: Add RateLimited DataConnectorError by @peasee in https://github.com/spiceai/spiceai/pull/3648
- Setup nightly docker release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3649
- Make LLM integration tests more extensible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3576
- feat: Update ODBC error messages by @peasee in https://github.com/spiceai/spiceai/pull/3651
- feat: Better tonic errors by @peasee in https://github.com/spiceai/spiceai/pull/3650
- Nightly release workflow fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/3652
- Fix missing ARM64 image for nightly publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3653
- Use GitHub GraphQL rate limiting responses to rate limit requests by @lukekim in https://github.com/spiceai/spiceai/pull/3610
- Fix typo in nightly release publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3654
- Handle GitHub rate-limiting for the Rest API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3656
- Adding custom User-Agent parameters to chat, nsql and flightrepl by @slyons in https://github.com/spiceai/spiceai/pull/3609
- Remove "nightly-" prefix from tag by @ewgenius in https://github.com/spiceai/spiceai/pull/3671
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3670
- `spice search` to warn if dataset is not ready and won't be included in search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3590
- Fix keyring secret store to try both prefixed & unprefixed secrets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3672
- Handle empty embeds by allowing for nulls by @Jeadie in https://github.com/spiceai/spiceai/pull/3600
- Improve github connector error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3677
- Update FlightSQL error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3676
- Update Datafusion Table Provider Patch to include error message improvements by @Sevenannn in https://github.com/spiceai/spiceai/pull/3678
- Integration tests for `llms` crate, with basic Anthropic test. by @Jeadie in https://github.com/spiceai/spiceai/pull/3647
- Allow E2E model tests to complete even if parallel platform tests failed by @sgrebnov in https://github.com/spiceai/spiceai/pull/3679
- Add Openai to llms testing by @Jeadie in https://github.com/spiceai/spiceai/pull/3680
- Fix .env in '.github/workflows/integration_llms.yml' by @Jeadie in https://github.com/spiceai/spiceai/pull/3686
- Improve error messages for spice ai connector, separate errors to different lines for DuckDB, Delta Lake, Databricks connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3643
- Add `microsoft/Phi-3-mini-4k-instruct` to llms crate testing, with `MODEL_SKIPLIST` & `MODEL_ALLOWLIST` by @Jeadie in https://github.com/spiceai/spiceai/pull/3690
- Add nightly label to spiced version in Cargo.toml by @ewgenius in https://github.com/spiceai/spiceai/pull/3691
- Disable HF in models integration tests (not supported) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3693
- Add log when CORS is enabled by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3695
- Fix nightly release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3698
- Correctly set nightly labels for both release and pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/3699
- Improve REPL error handling for multiline error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3692
- Determine support_filter_pushdown based on Accelerator federated reader & ZeroResultsAction by @Sevenannn in https://github.com/spiceai/spiceai/pull/3694
- Fix rdfkafak duplicated version by @Sevenannn in https://github.com/spiceai/spiceai/pull/3707
- feat: Render multiline errors better in REPL by @peasee in https://github.com/spiceai/spiceai/pull/3701
- refactor: Update UnableToAttachDataConnector error message by @peasee in https://github.com/spiceai/spiceai/pull/3706
- refactor: Update errors for Alpha connectors by @peasee in https://github.com/spiceai/spiceai/pull/3705
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3704
- Implement a RequestContext that automatically propagates request details to metric dimensions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3709
- Fix acceleration in append mode with refresh_sql specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3697
- Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3655
- Tokenizer for OpenAI embedding models for accurate chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/3519
- Update error message when dataset isn't configured with time_column in append refresh by @Sevenannn in https://github.com/spiceai/spiceai/pull/3703
- Add the missing winver dependency in runtime crate by @Sevenannn in https://github.com/spiceai/spiceai/pull/3711
- deps: Update table providers by @peasee in https://github.com/spiceai/spiceai/pull/3712
- Add special tokens in chunk sizer by @Jeadie in https://github.com/spiceai/spiceai/pull/3713
- Disable results cache for benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3715

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.20.0-beta...v1.0.0-rc.1

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.20-beta (Nov 4, 2024)

· 3 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.20-beta 🧩

Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVIDIA) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidates. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.
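
Column pruning means asking each federated source only for the columns a query actually references, which reduces the data transferred. The following is a minimal sketch of that idea in Python, not Spice's or DataFusion's implementation; the function names, table, and columns are hypothetical.

```python
from typing import Iterable, List


def prune_columns(source_columns: List[str], referenced: Iterable[str]) -> List[str]:
    """Keep only the source columns the query references, in source order."""
    wanted = set(referenced)
    return [column for column in source_columns if column in wanted]


def remote_scan_sql(table: str, source_columns: List[str], referenced: Iterable[str]) -> str:
    """Build the SQL sent to the remote source after pruning unused columns."""
    columns = prune_columns(source_columns, referenced)
    # A query such as COUNT(*) references no columns; project a constant instead.
    projection = ", ".join(f'"{column}"' for column in columns) or "1"
    return f'SELECT {projection} FROM "{table}"'


# Hypothetical table with four columns, of which the query uses two.
sql = remote_scan_sql("orders", ["id", "customer", "total", "notes"], ["total", "id"])
print(sql)
```

Without pruning, the remote scan in this sketch would project all four columns even though the query needs only two.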

Highlights in v0.20.0-beta​

Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.

Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.

Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVIDIA) for AI/ML workloads including embeddings and local LLM inference.

For instructions on compiling a Metal or CUDA binary, see the Installation Docs.

Breaking Changes​

  • The ODBC Data Connector now requires that ODBC drivers specified in connection strings be registered in the system ODBC driver manager. Connection strings that point directly at a driver library file are no longer accepted.

Example invalid connection string:

DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master

Example valid connection string:

DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master

Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.
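
To find connection strings affected by this change before upgrading, a pre-flight check along the following lines may help. This is a minimal sketch, not Spice's actual validation logic: the helper names are hypothetical, and the heuristic (a path separator or a shared-library suffix in the `DRIVER` value) is an assumption about what a file path looks like.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration; not part of Spice or any ODBC library.
_DRIVER_RE = re.compile(r"(?:^|;)\s*DRIVER\s*=\s*(\{[^}]*\}|[^;]*)", re.IGNORECASE)
_LIBRARY_SUFFIXES = (".so", ".dll", ".dylib")


def driver_value(connection_string: str) -> Optional[str]:
    """Return the DRIVER attribute without surrounding braces, or None if absent."""
    match = _DRIVER_RE.search(connection_string)
    if match is None:
        return None
    value = match.group(1).strip()
    if value.startswith("{") and value.endswith("}"):
        value = value[1:-1]
    return value.strip()


def looks_like_driver_path(connection_string: str) -> bool:
    """Heuristic: True when DRIVER appears to be a file path rather than a registered name."""
    value = driver_value(connection_string)
    if not value:
        return False
    has_separator = "/" in value or "\\" in value
    return has_separator or value.lower().endswith(_LIBRARY_SUFFIXES)


for conn in (
    "DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master",
    "DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master",
):
    status = "needs a registered driver name" if looks_like_driver_path(conn) else "ok"
    print(f"{conn} -> {status}")
```

The check only flags path-like values. It cannot confirm that a given name is registered in the driver manager, which still has to be verified on the host.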

Contributors​

  • @ewgenius
  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @Jeadie
  • @barracudarin
  • @Sevenannn

What's Changed​

- Update Helm for v0.19.4-beta and add release notes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3310>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/3311>
- `metal` & `cuda` flags for spice by @Jeadie in <https://github.com/spiceai/spiceai/pull/3212>
- Promote postgres connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3305>
- docs: Update ROADMAP.md by @peasee in <https://github.com/spiceai/spiceai/pull/3322>
- feat: Enable federation for in-memory accelerators by @peasee in <https://github.com/spiceai/spiceai/pull/3325>
- fix: Only allow env files from the current dir by @peasee in <https://github.com/spiceai/spiceai/pull/3327>
- Always read TimezoneTZ from PostgreSQL as UTC by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3330>
- For multi-sink acceleration refreshes, ensure parent table completes before the children. by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3329>
- Update TPC-DS Q49 (Decimal to Float) to match SQLite's type system by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3323>
- Enable parquet pushdown in Spice by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3245>
- Use spice object_store fork to fix S3 ambiguous error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3304>
- Don't mix commented out queries for s3 connectors and accelerators by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3331>
- Allow only valid WHERE conditions in vector searches by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3335>
- fix: Allow only ODBC profiles by @peasee in <https://github.com/spiceai/spiceai/pull/3324>
- Track how many times an acceleration falls back during initialization by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3339>
- Anthropic model regex and fix tool parsing aggregation bug by @Jeadie in <https://github.com/spiceai/spiceai/pull/3334>
- Upgrade runtime along with CLI on `spice upgrade` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3341>
- Update upcoming Roadmap by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/3343>
- fix: Prevent acceleration files outside of working directory by @peasee in <https://github.com/spiceai/spiceai/pull/3340>
- Document S3 connector limitations by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3333>
- Update Object Store Patch by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3361>
- Promote SQLite Data Accelerator to Beta by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3365>
- Promote S3 connector to RC quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/3362>
- Revert "fix: Only allow env files from the current dir" by @peasee in <https://github.com/spiceai/spiceai/pull/3368>
- docs: Fix typo for S3 release status in README.md by @peasee in <https://github.com/spiceai/spiceai/pull/3370>
- Include unnecessary columns pruning step during federated plan creation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/3363>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta>

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice.ai v0.7-alpha

· 2 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.7-alpha! 🏹

Spice v0.7-alpha is an all-new implementation of Spice written in Rust. The Spice v0.7 runtime provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.

Learn more and get started in minutes with the updated Quickstart in the repository README!

Highlights in v0.7-alpha​

DataFusion SQL Query Engine: Spice v0.7 leverages the Apache DataFusion query engine to provide very fast, high-quality SQL queries across one or more local or remote data sources.

Data tables can be locally accelerated in memory using Apache Arrow, or using DuckDB.

New in this release​

  • Adds runtime rewritten in Rust for high performance.
  • Adds Apache DataFusion SQL query engine.
  • Adds the Spice.ai platform as a data source.
  • Adds Dremio as a data source.
  • Adds OpenTelemetry (OTEL) collector.
  • Adds local data table acceleration.
  • Adds DuckDB file or in-memory as a data table acceleration engine.
  • Adds in-memory Apache Arrow as a data table acceleration engine.
  • Removes the built-in AI training engine; now cloud-based and provided by the Spice.ai platform.
  • Removes the built-in dashboard and web-interface; now cloud-based and provided by the Spice.ai platform.

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice.ai v0.6.1-alpha

· 2 min read
Luke Kim
Founder and CEO of Spice AI

Announcing the release of Spice.ai v0.6.1-alpha! 🌢

Building upon the Apache Arrow support in v0.6-alpha, Spice.ai now includes new Apache Arrow data processor and Apache Arrow Flight data connector components! Together, these create a high-performance bulk-data transport directly into the Spice.ai ML engine. Coupled with big data systems from the Apache Arrow ecosystem like Hive, Drill, Spark, Snowflake, and BigQuery, it's now easier than ever to combine big data with Spice.ai.

And we're also excited to announce the release of Spice.xyz! 🎉

Spice.xyz is data and AI infrastructure for web3. It's web3 data made easy. Insanely fast and purpose-designed for applications and ML.

Spice.xyz delivers data in Apache Arrow format, over high-performance Apache Arrow Flight APIs, to your application, notebook, ML pipeline, and, of course, through these new data components, to the Spice.ai runtime.

Read the announcement post at blog.spice.ai.


New in this release​

Now built with Go 1.18.

Dependency updates​

  • Updates to React 18
  • Updates to CRA 5
  • Updates to Glide DataGrid 4
  • Updates to SWR 1.2
  • Updates to TypeScript 4.6

Resources​

Community​

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved. We will also be starting a community call series soon!