DuckDB Data Accelerator
To use DuckDB as the Data Accelerator, specify duckdb as the engine for acceleration.
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
Configuration
The DuckDB accelerator can be configured by providing the following params:
- duckdb_file: The name of the file that backs the DuckDB database. If the file does not exist, it will be created. Only applies if mode is file.
Configuration params are provided in the acceleration section for a data store. Other common acceleration fields can also be configured for DuckDB; see datasets.
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      mode: file
      params:
        duckdb_file: /my/chosen/location/duckdb.db
Limitations
- The DuckDB accelerator does not support enum, dictionary, or map field types. For example:
  - Unsupported: SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30])
- The DuckDB accelerator does not support Decimal256 (76 digits), as it exceeds DuckDB's maximum Decimal width of 38 digits.
- Using the DuckDB accelerator with on_zero_results: use_source does not support filters on binary columns when the query ends up using the source connection, for example WHERE col_blob <> ''. Cast the binary column to another data type instead, for example WHERE CAST(col_blob AS TEXT) <> ''.
- Updating a dataset with DuckDB acceleration while the Spice Runtime is running (hot-reload) disables DuckDB accelerator query federation until the Runtime is restarted.
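The binary-column limitation above can be illustrated with a minimal sketch. It reuses the dataset from the earlier examples and the on_zero_results: use_source setting named in the limitation. The SELECT statements in the comments are illustrative only, and col_blob is the binary column from the limitation text, not a column known to exist in any particular dataset.

```yaml
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      # Setting named in the limitation above; with it, some queries
      # are answered by the source connection instead of the accelerator.
      on_zero_results: use_source

# Illustrative queries (assumed, not taken from the Spice documentation):
#
# Not supported when the query ends up using the source connection,
# because the filter is applied directly to a binary column:
#   SELECT * FROM my_dataset WHERE col_blob <> ''
#
# Workaround: cast the binary column to another data type first:
#   SELECT * FROM my_dataset WHERE CAST(col_blob AS TEXT) <> ''
```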
Memory Considerations
When accelerating a dataset using mode: memory (the default), some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
In-memory limitations can be mitigated by storing acceleration data on disk, which the duckdb and sqlite accelerators support by specifying mode: file.
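As a rough sketch of the two modes side by side, the configuration below accelerates one dataset in memory and another on disk. The dataset names small_dataset and large_dataset and their source paths are hypothetical placeholders; the engine, mode, and duckdb_file fields are the ones described on this page.

```yaml
datasets:
  # Hypothetical small dataset: kept in memory (mode: memory is the default,
  # written out here only to make the choice explicit).
  - from: spice.ai:path.to.small_dataset
    name: small_dataset
    acceleration:
      engine: duckdb
      mode: memory

  # Hypothetical larger dataset: stored on disk to reduce memory pressure.
  - from: spice.ai:path.to.large_dataset
    name: large_dataset
    acceleration:
      engine: duckdb
      mode: file
      params:
        # Created if it does not already exist.
        duckdb_file: /my/chosen/location/duckdb.db
```

File mode trades some query latency for a smaller memory footprint, so it tends to suit datasets that would not comfortably fit in memory alongside query and runtime overhead.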
Cookbook
- A cookbook recipe to configure DuckDB as a data accelerator in Spice: DuckDB Data Accelerator