DuckDB Data Accelerator

To use DuckDB as the Data Accelerator, specify duckdb as the engine for acceleration.

datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb

Configuration

The DuckDB accelerator can be configured by providing the following params:

  • duckdb_file: The name for the file to back the DuckDB database. If the file does not exist, it will be created. Only applies if mode is file.

Configuration params are provided in the acceleration section of a dataset. Other common acceleration fields can also be configured for DuckDB; see datasets.

datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      mode: file
      params:
        duckdb_file: /my/chosen/location/duckdb.db

Limitations

  • The DuckDB accelerator does not support enum, dictionary, or map field types. For example:
    • Unsupported:
      • SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30])
  • The DuckDB accelerator does not support Decimal256 (76 digits), as it exceeds DuckDB's maximum Decimal width of 38 digits.
  • When the DuckDB accelerator is used with on_zero_results: use_source, filters on binary columns, like WHERE col_blob <> '', are not supported when the query falls back to the source connection. Cast the binary column to another data type instead, like WHERE CAST(col_blob AS TEXT) <> ''.
  • Updating a dataset with DuckDB acceleration while the Spice Runtime is running (hot-reload) disables query federation for the DuckDB accelerator until the Runtime is restarted.
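
The binary-column limitation above can be illustrated with a minimal sketch. It assumes that on_zero_results is set in the acceleration section alongside the other acceleration fields; col_blob is the illustrative column name used above, not a real column.

```yaml
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      # Assumption: on_zero_results sits with the other acceleration fields.
      on_zero_results: use_source

# With this configuration, a filter on a binary column is not supported
# when the query falls back to the source connection:
#   SELECT * FROM my_dataset WHERE col_blob <> ''
# Casting the column to another type first avoids the limitation:
#   SELECT * FROM my_dataset WHERE CAST(col_blob AS TEXT) <> ''
```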

Memory Considerations

When accelerating a dataset using mode: memory (the default), some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.

In-memory limitations can be mitigated by storing acceleration data on disk, which the duckdb and sqlite accelerators support by specifying mode: file.

Cookbook