Skip to main content

2 posts tagged with "bedrock"

Amazon Bedrock related topics and usage

View All Tags

Spice v1.9.1 (Nov 24, 2025)

ยท 7 min read
Viktor Yershov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.9.1!๐Ÿ”ฅ

v1.9.1 introduces Amazon Bedrock Nova 2 Multimodal embeddings support with high-dimensional vectors up to 3,072 dimensions and purpose-optimized embeddings for semantic search and retrieval operations, DynamoDB timestamp filter pushdown for more efficient append-mode acceleration with configurable time formatting, HTTP Data Connector health probe configuration for improved endpoint validation reliability, and Spice .NET SDK v0.2 with expanded .NET version support and updated gRPC libraries. This release focuses on bug fixes, stability, and performance improvements.

Amazon Bedrock Nova 2 Multimodal embeddingsโ€‹

Spice now supports the Amazon Nova 2 Multimodal embeddings models via the Bedrock models provider, enabling high-quality text embeddings for semantic search and vector similarity operations. The Nova embeddings model offers configurable dimensions and advanced features like truncation modes and embedding purpose optimization.

Key Features:

  • High-Dimensional Embeddings: Support for up to 3,072 dimensions for rich semantic representations
  • Configurable Truncation: Control how input text is truncated when exceeding token limits (START, END, or NONE)
  • Purpose Optimization: Optimize embeddings for specific use cases (GENERIC_INDEX, GENERIC_RETRIEVAL, or CLASSIFICATION)
  • Multimodal Model: Leverages Amazon's Nova 2 multimodal architecture for consistent embeddings across different content types

Example spicepod.yml configuration:

embeddings:
- from: bedrock:amazon.nova-2-multimodal-embeddings-v1:0
name: nova_embeddings
params:
dimensions: '3072' # Required: Output dimensions
truncation_mode: START # Optional: START, END, or NONE (default: NONE)
embedding_purpose: GENERIC_RETRIEVAL # Optional. GENERIC_INDEX is default

For more details on the embedding parameters and configuration options, refer to the Amazon Nova Embeddings Documentation and the Spice Embeddings Documentation.

DynamoDB Timestamp Filter Pushdownโ€‹

The DynamoDB Data Connector now supports timestamp filter pushdown, enabling more efficient append-mode acceleration refreshes by pushing timestamp filters directly to DynamoDB queries. Since DynamoDB stores timestamps as strings rather than native datetime types, this feature includes configurable timestamp formatting to ensure correct parsing and filtering.

Key Features:

  • Filters on timestamp columns are now pushed down to DynamoDB, reducing data transfer and improving query performance
  • Support for Go-style datetime formatting patterns to handle various timestamp string formats
  • Uses ISO 8601 format by default when no custom format is specified

Example spicepod.yml configuration:

datasets:
- from: dynamodb:sales
name: sales
time_column: created_at
time_format: timestamptz
params:
time_format: 2006-01-02T15:04:05.000Z07:00
acceleration:
enabled: true
engine: duckdb
refresh_mode: append

For more details, refer to the DynamoDB Data Connector Documentation.

HTTP Data Connector Health Probe Configurationโ€‹

The HTTP Data Connector now supports configurable health probe paths for endpoint validation. Instead of using a random non-existent path, the system can now validate endpoints using a user-specified path, improving flexibility and reliability for health checks.

Example spicepod.yml configuration:

datasets:
- from: https://api.tvmaze.com
name: tvmaze
params:
file_format: json
health_probe: /health-check

For more details, refer to the HTTP Data Connector Documentation.

Spice .NET SDK v0.2โ€‹

The Spice .NET SDK has been upgraded with expanded .NET version support, custom User-Agent configuration, and updated gRPC libraries: spice-dotnet v0.2.0. The SDK is available on NuGet.

Key Features:

  • Expanded .NET Support: Now supports .NET Standard 2.0, .NET Core 8.0, 9.0, and 10.0.
  • Custom User-Agent: Configure custom User-Agent headers for client identification and telemetry.
  • Updated gRPC Libraries: Upgraded gRPC dependencies and netstandard for improved performance and reliability

Upgrade Example:

dotnet add package SpiceAI --version 0.2.0

For more details, refer to the .NET SDK Documentation.

Additional Improvements & Bug Fixesโ€‹

  • Reliability: Fixed view loading to respect topological order, preventing dependency resolution errors.
  • Reliability: Migrated from deprecated trust_dns_resolver to hickory_resolver for improved DNS resolution reliability.
  • Security: Fixed arbitrary file access vulnerability during archive extraction ("Zip Slip") to prevent potential security exploits.
  • Distributed Query: Fixed object store initialization across scheduler/executor gap, improving reliability for distributed query execution.
  • Distributed Query: Optimized query routing by preventing runtime.* schema queries from being sent to the scheduler, improving performance for metadata queries.
  • Performance: Added Blake3 and xxHash support with xxh3_64 as the default caching hashing algorithm for improved cache and query performance.
  • Performance: Optimized default Zstd compression level to 6 for better balance between compression ratio and speed.
  • UX: Improved dataset loading output with clearer progress indicators and status messages.

Contributorsโ€‹

Breaking Changesโ€‹

No breaking changes.

Cookbook Updatesโ€‹

No major cookbook updates.

The Spice Cookbook includes 82 recipes to help you get started with Spice quickly and easily.

Upgradingโ€‹

To upgrade to v1.9.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.9.1 image:

docker pull spiceai/spiceai:1.9.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

๐ŸŽ‰ Spice is now available in the AWS Marketplace!

What's Changedโ€‹

Changelogโ€‹

  • fix integration tests: order by the query to make snapshots deterministic by @phillipleblanc in #8198
  • Add health probe override by @lukekim in #8236
  • Use Moka optionally_get_with for SWR single-in-flight semantics by @lukekim in #8231
  • fix: Arbitrary file access during archive extraction ("Zip Slip") by @phillipleblanc in #8242
  • Migrate trust_dns_resolver to hickory_resolver by @phillipleblanc in #8243
  • fix: Deny assert macros in non-test code by @peasee in #8223
  • Distributed query: Object store initialization across scheduler/executor gap, misc bugfixes & improvements by @mach-kernel in #8009
  • Add Blake3, enable xxHash, set xxh3_64 as default, add bench by @lukekim in #8157
  • Make cache zstd default compression level 6 by @lukekim in #8234
  • Use seed for xxh3 by @lukekim in #8232
  • DynamoDB Timestamp Filter Pushdown by @krinart in #8235
  • Add ready_wait for mongo-arrow benchmarks by @krinart in #8246
  • Add support for amazon.nova-2-multimodal-embeddings-v1:0 by @Jeadie in #8225
  • Improve the output of dataset loading by @lukekim in #8256
  • Load views in topological order by @lukekim in #8255
  • Distributed query: Do not send runtime.* schema queries to scheduler by @mach-kernel in #8271
  • Remove input length check for Nova model. by @Jeadie in #8270

Spice v1.5.2 (Aug 11, 2025)

ยท 7 min read
Kevin Zimmerman
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.5.2! ๐Ÿ› ๏ธ

Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for converse API (Nova) compatible models, AWS Redshift support using the Postgres data connector, and Hadoop Catalog Support for Iceberg tables along with several bug fixes and improvements.

What's New in v1.5.2โ€‹

Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.

Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.

Supported Model IDs:

  • amazon.nova-lite-v1:0
  • amazon.nova-micro-v1:0
  • amazon.nova-premier-v1:0
  • amazon.nova-pro-v1:0

Refer to the Amazon Bedrock documentation for details on available models and cross-region inference profiles.

Example Spicepod.yaml:

models:
- from: bedrock:us.amazon.nova-lite-v1:0
name: novash
params:
aws_region: us-east-1
aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
bedrock_guardrail_version: DRAFT
bedrock_trace: enabled
bedrock_temperature: 42

For more information, see the Amazon Bedrock Documentation.

AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.

To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.

Example Spicepod.yaml:

# Example datasets for Redshift TPCH tables
datasets:
- from: postgres:public.customer
name: customer
params:
pg_host: ${secrets:PG_HOST}
pg_port: 5439
pg_sslmode: prefer
pg_db: dev
pg_user: ${secrets:PG_USER}
pg_pass: ${secrets:PG_PASS}
- from: postgres:public.lineitem
name: lineitem
params:
pg_host: ${secrets:PG_HOST}
pg_port: 5439
pg_sslmode: prefer
pg_db: dev
pg_user: ${secrets:PG_USER}
pg_pass: ${secrets:PG_PASS}

Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.

Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.

Example Spicepod.yaml:

catalogs:
- from: iceberg:file:///tmp/hadoop_warehouse/
name: local_hadoop
- from: iceberg:s3://my-bucket/hadoop_warehouse/
name: s3_hadoop

# Example datasets
- from: iceberg:file:///data/hadoop_warehouse/test/my_table_1
name: local_hadoop
- from: iceberg:s3://my-bucket/hadoop_warehouse/test/my_table_2
name: s3_hadoop

For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.

Parquet Reader: Optional Parquet Page Index: Fixed an issue where the Parquet reader, using arrow-rs and DataFusion, errored on files missing page indexes, despite the Parquet spec allowing optional indexes. The Spice team contributed optional page index support to arrow-rs (PR #6) and configurable handling in DataFusion (PR #93). A new runtime parameter, parquet_page_index, makes Parquet Page Indexes configurable in Spice:

runtime:
params:
parquet_page_index: required # Options: required, skip, auto
  • required: (Default) Errors if page indexes are absent.
  • skip: Ignores page indexes, potentially reducing query performance.
  • auto: Uses page indexes if available; skips otherwise.

This improves compatibility and query flexibility for Parquet datasets.

Contributorsโ€‹

Breaking Changesโ€‹

Amazon S3 Vectors Vector Engine: Amazon S3 Vectors is currently a preview AWS service. A recent update to the Amazon S3 Vectors service API introduced a breaking change that affects the integration when projecting (selecting) the embedding column. This results in the following error:

Json error: whilst decoding field 'data': expected [ got nullReceived only partial JSON payload from QueryVectors

The issue is expected to be resolved in the next release of Spice. A current workaround is to limit queries to non-embedding columns.

i.e. instead of:

SELECT url, title, scored, body_embedding
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Remove the *_embedding column from the projection. E.g.

SELECT url, title, scored
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

This issue and workaround also applies to SELECT * FROM vector_search(..). E.g.

SELECT *
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Cookbook Updatesโ€‹

The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.

Upgradingโ€‹

To upgrade to v1.5.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.5.2 image:

docker pull spiceai/spiceai:1.5.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

๐ŸŽ‰ Spice is also now available in the AWS Marketplace!

What's Changedโ€‹

Dependenciesโ€‹

No major dependency updates.

Changelogโ€‹

  • fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
  • Update spicepod.schema.json (#6632) by @app/github-actions in #6632
  • Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
  • Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
  • Disable Metal precomplation in integration_llms.yml (#6649) by @Jeadie in #6649
  • fix: Hadoop integration test (#6660) by @peasee in #6660
  • feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
  • update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
  • feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
  • Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
  • Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
  • fix: Support include for Iceberg (#6663) by @peasee in #6663
  • feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
  • feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
  • fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
  • Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in [#6673](https://github.com/spiceai/spiceai/pull/6673
  • fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
  • Fix AWS Auth issue (#6699) by @Advayp in #6699
  • Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
  • Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
  • Improve Flight REPL error messages (#6696) by @lukekim in #6696
  • Fixes from search tests (#6710) by @Jeadie in #6710