Skip to main content

7 posts tagged with "datafusion"

DataFusion query engine related topics and usage

View All Tags

Spice v0.19.2-beta (Oct 21, 2024)

ยท 4 min read
Qianqian Liu
Software Engineer at Spice AI

Announcing the release of Spice v0.19.2-beta โšก

Spice v0.19.2-beta continues to improve performance and stability of data connectors and data accelerators, further expands TPC-DS coverage, and includes several bug fixes.

Highlights in v0.19.2โ€‹

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, improving TPC-DS query support and correctness.

TPC-DS Snapshots: Extended support for TPC-DS benchmarks with added snapshot tests for validating query plans and result accuracy.

PostgreSQL Accelerator Beta: Postgres Data Accelerator has been promoted to Beta Quality

Breaking changesโ€‹

  • The hive_infer_partitions parameter been changed to hive_partitioning_enabled, now defaults to false and must be explicitly enabled.

Contributorsโ€‹

  • @ewgenius
  • @sgrebnov
  • @slyons
  • @Jeadie
  • @Sevenannn
  • @phillipleblanc
  • @dependabot
  • @peasee

Dependenciesโ€‹

What's Changedโ€‹

- Update Helm chart for v0.19.1-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3106
- Add more TPC-DS snapshots for Postgres acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3107
- Bumping version to 1.0.0-rc.1 by @slyons in https://github.com/spiceai/spiceai/pull/3109
- New table sampling methods: sample_distinct_columns, random_sample, top_n_sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3108
- Add TPCDS snapshot tests for file-based and in-mem duckdb by @Sevenannn in https://github.com/spiceai/spiceai/pull/3115
- Add Postgres acceleration E2E test for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3110
- Update datafusion logical plan to avoid wrong group_by columns in aggregation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3111
- Warn if user tries to embed column that does not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/3120
- Changes for Rust version upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/3134
- Add `unnest` support for federated plans by @sgrebnov in https://github.com/spiceai/spiceai/pull/3133
- Don't `.clone()` unnecessarily by @Jeadie in https://github.com/spiceai/spiceai/pull/3128
- Fix Flight `get_schema` to construct logical plan and return that schema. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3131
- Bump clap from 4.5.19 to 4.5.20 by @dependabot in https://github.com/spiceai/spiceai/pull/3099
- Add GitHub Workflow to build `spice-postgres-tpcds-bench` image by @sgrebnov in https://github.com/spiceai/spiceai/pull/3140
- test: Add basic MySQL integration test by @peasee in https://github.com/spiceai/spiceai/pull/3143
- Bump datafusion-federation and datafusion-table-providers crates by @sgrebnov in https://github.com/spiceai/spiceai/pull/3148
- docs: Add MySQL limitation for division by zero by @peasee in https://github.com/spiceai/spiceai/pull/3144
- fix: Dataset refresh by @peasee in https://github.com/spiceai/spiceai/pull/3147
- Update arrow, duckdb, postgres accelerator tpcds snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/3145
- Add TPC-DS benchmarks for Postgres data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3149
- Update E2E test ci to include tests for accelerating Postgres into accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3137
- Add TPCDS Benchmark test and snapshots for S3 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3152
- [cli] Include 200 in acceptable response codes for `doRuntimeApiRequest` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3157
- Use `-build.{GIT_SHA}` for unreleased versions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3159
- Upgrade to Rust 1.82 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3158
- Disable `hive_infer_partitions` by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3160
- Upgrade to DuckDB 1.1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3161
- feat: Add MySQL TPCDS results snapshots and exclude workarounds by @peasee in https://github.com/spiceai/spiceai/pull/3165
- Fix task_history output for sql, add output to table_schema & list_datasets tool by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3166
- feat: Add ClickBench queries as separate files by @peasee in https://github.com/spiceai/spiceai/pull/3169
- Calculate embeddings in a separate blocking thread by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3170
- docs: Update ROADMAP.md and release criterias by @peasee in https://github.com/spiceai/spiceai/pull/3124
- Handle OpenTelemetry errors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3173
- Update version to 0.19.2-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3182

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.19.1-beta...v0.19.2-beta

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice.ai v0.7-alpha

ยท 2 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v0.7-alpha! ๐Ÿน

Spice v0.7-alpha is an all new implementation of Spice written in Rust. The Spice v0.7 runtime provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.

Learn more and get started in minutes with the updated Quickstart in the repository README!

Highlights in v0.7-alphaโ€‹

DataFusion SQL Query Engine: Spice v0.7 leverages the Apache DataFusion query engine to provide very fast, high quality SQL query across one or more local or remote data sources.

Data tables can be locally accelerated using Apache Arrow in-memory or by DuckDB.

New in this releaseโ€‹

  • Adds runtime rewritten in Rust for high-performance.
  • Adds Apache DataFusion SQL query engine.
  • Adds The Spice.ai platform as a data source.
  • Adds Dremio as a data source.
  • Adds OpenTelemetry (OTEL) collector.
  • Adds local data table acceleration.
  • Adds DuckDB file or in-memory as a data table acceleration engine.
  • Adds In-memory Apache Arrow as a data table acceleration engine.
  • Removes the built-in AI training engine; now cloud-based and provided by the Spice.ai platform.
  • Removes the built-in dashboard and web-interface; now cloud-based and provided by the Spice.ai platform.

Resourcesโ€‹

Communityโ€‹

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.