DynamoDB Streams (Native CDC)
Stream every INSERT, UPDATE, and DELETE from an Amazon DynamoDB table directly into a Spice-accelerated dataset using DynamoDB Streams.
This is the recommended way to keep a Spice accelerator (DuckDB, SQLite, PostgreSQL, Cayenne) continuously in sync with a DynamoDB source: no Kafka, no Debezium, no Lambda required.
How it works
+--------------+  DynamoDB Streams   +-----------------+   ChangeBatch   +---------------+
|  DynamoDB    | ------------------> |  Spice runtime  | --------------> |  Accelerator  |
|  table       |   shard iterators   |  (dynamodb      |   (INSERT /     |  DuckDB /     |
|  + stream    |                     |   connector)    |    UPDATE /     |  SQLite /     |
|              |                     |                 |    DELETE)      |  Postgres /   |
+--------------+                     +-----------------+                 |  Cayenne      |
                                                                         +---------------+
On first start the connector:
- Bootstraps the accelerator with a full Scan of the source table so initial state is captured.
- Subscribes to each open stream shard and begins polling for records.
- Applies each INSERT/MODIFY/REMOVE event as a row-level change to the accelerator.
The dataset reports Ready once stream lag drops below ready_lag (default 2s).
On subsequent restarts, file-backed accelerators resume from the persisted shard checkpoint instead of re-scanning the source table.
Prerequisites
1. Enable Streams on the source table
Streams must be enabled with view type NEW_AND_OLD_IMAGES so Spice receives both the old and new image of every modified item.
aws dynamodb update-table \
  --table-name orders \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
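To confirm the setting took effect, the StreamSpecification block returned by DescribeTable can be inspected. The sketch below is a hypothetical helper (not part of Spice or the AWS CLI) that checks a parsed DescribeTable response for the two conditions this page requires; the sample response is trimmed to the relevant fields.

```python
REQUIRED_VIEW_TYPE = "NEW_AND_OLD_IMAGES"


def stream_problems(describe_table_response: dict) -> list[str]:
    """Return human-readable problems with the table's stream settings (empty if none)."""
    table = describe_table_response.get("Table", {})
    # StreamSpecification is absent when streams were never enabled.
    spec = table.get("StreamSpecification") or {}
    if not spec.get("StreamEnabled", False):
        return ["Streams are not enabled on the table"]
    view_type = spec.get("StreamViewType")
    if view_type != REQUIRED_VIEW_TYPE:
        return [f"StreamViewType is {view_type!r}, expected {REQUIRED_VIEW_TYPE!r}"]
    return []


# Trimmed example of a DescribeTable response for the orders table.
sample = {
    "Table": {
        "TableName": "orders",
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }
}

print(stream_problems(sample) or "stream settings look correct")
```

In practice the input would come from the JSON printed by `aws dynamodb describe-table --table-name orders`, loaded with `json.load`.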
2. IAM permissions
The credentials Spice uses need both table and stream actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:Scan",
        "dynamodb:DescribeStream",
        "dynamodb:GetShardIterator",
        "dynamodb:GetRecords",
        "dynamodb:ListStreams"
      ],
      "Resource": [
        "arn:aws:dynamodb:*:*:table/orders",
        "arn:aws:dynamodb:*:*:table/orders/stream/*"
      ]
    }
  ]
}
3. Acceleration must be enabled with refresh_mode: changes
Using DynamoDB Streams requires acceleration with refresh_mode: changes. Supported engines are duckdb, sqlite, postgres, and cayenne.
Minimal configuration
datasets:
  - from: dynamodb:orders
    name: orders_stream
    params:
      dynamodb_aws_region: us-east-1
      dynamodb_aws_access_key_id: ${secrets:aws_access_key_id}
      dynamodb_aws_secret_access_key: ${secrets:aws_secret_access_key}
    acceleration:
      enabled: true
      engine: duckdb
      mode: file # Persistence is recommended so restarts skip the initial Scan
      refresh_mode: changes
Tuning
datasets:
  - from: dynamodb:orders
    name: orders_stream
    params:
      scan_interval: 100ms # Poll DynamoDB Streams every 100 ms (default 0s)
      ready_lag: 1s # Report Ready when stream lag drops below 1s (default 2s)
      lag_exceeds_shard_retention_behavior: ready_before_load # Behavior on > 24h lag
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
      snapshots: enabled
      snapshots_trigger: stream_batches
      snapshots_trigger_threshold: 5 # Snapshot every 5 batch updates
- scan_interval: Polling frequency for new records. Lower values give lower latency at the cost of more GetRecords API calls.
- ready_lag: Stream lag threshold below which the dataset is reported Ready for queries. Defaults to 2s.
- lag_exceeds_shard_retention_behavior: What to do when stream lag exceeds the DynamoDB shard retention window (24h). One of error (default), ready_before_load, or ready_after_load. See the connector parameter reference for the full description.
- snapshots_trigger: stream_batches and snapshots_trigger_threshold let you trigger acceleration snapshots based on stream-batch counts rather than wall time. See Acceleration Snapshots.
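The latency versus API-call trade-off for scan_interval can be estimated with a back-of-the-envelope calculation. The sketch below assumes the connector issues one GetRecords call per active shard per polling interval; that is an assumption for illustration, not documented connector behavior. The duration parser is a simplified, hypothetical helper that only handles forms like those used on this page.

```python
from typing import Optional


def parse_duration_seconds(text: str) -> float:
    """Parse simple durations such as '100ms', '1s', '5m', '24h' into seconds."""
    units = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
    # 'ms' must be tried before 's' and 'm' so that '100ms' is not misread.
    for suffix in ("ms", "s", "m", "h"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * units[suffix]
    raise ValueError(f"unsupported duration: {text!r}")


def estimated_get_records_per_second(scan_interval: str, active_shards: int) -> Optional[float]:
    """Estimate GetRecords calls per second.

    ASSUMPTION: one GetRecords call per active shard per polling interval.
    Returns None for a zero interval, where the rate is bounded by request
    latency rather than by the interval and is not modelled here.
    """
    interval = parse_duration_seconds(scan_interval)
    if interval <= 0:
        return None
    return active_shards / interval


for setting in ("0s", "100ms", "1s"):
    print(setting, estimated_get_records_per_second(setting, active_shards=4))
```

The active shard count for the estimate can be read from the shards_active metric once metrics are enabled.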
Metrics
The connector exposes the following component metrics for monitoring streaming health:
| Metric | Type | Description |
|---|---|---|
| shards_active | Gauge | Current number of active shards in the stream |
| records_consumed_total | Counter | Total number of records consumed from the stream |
| lag_ms | Gauge | Current lag in milliseconds between stream watermark and now |
| errors_transient_total | Counter | Total number of transient errors encountered while polling from the stream |
| reinitializations_on_lag_exceeds_shard_retention_total | Counter | Total rebootstrap operations triggered due to expired shards |
These metrics are opt-in. See the DynamoDB Streams Metrics section of the connector docs for an example metrics: block and a sample Grafana dashboard.
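As a rough illustration of how lag_ms relates to ready_lag and the 24-hour shard retention window, the sketch below computes lag from a watermark timestamp and compares it against both thresholds. The function names are hypothetical, and how the connector derives its watermark is not specified here; the sketch simply takes it as a timestamp.

```python
from datetime import datetime, timedelta, timezone

SHARD_RETENTION = timedelta(hours=24)      # DynamoDB Streams shard retention
DEFAULT_READY_LAG = timedelta(seconds=2)   # ready_lag default from this page


def lag_ms(watermark: datetime, now: datetime) -> int:
    """Milliseconds between the stream watermark and now, never negative."""
    return max(0, (now - watermark) // timedelta(milliseconds=1))


def is_ready(lag: int, ready_lag: timedelta = DEFAULT_READY_LAG) -> bool:
    """Ready once lag drops below ready_lag."""
    return lag < ready_lag // timedelta(milliseconds=1)


def exceeds_retention(lag: int) -> bool:
    """True when lag is past the window in which shard records are retained."""
    return lag > SHARD_RETENTION // timedelta(milliseconds=1)


now = datetime(2024, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
for watermark in (
    now - timedelta(milliseconds=1500),
    now - timedelta(minutes=10),
    now - timedelta(hours=30),
):
    lag = lag_ms(watermark, now)
    print(lag, is_ready(lag), exceeds_retention(lag))
```

A sustained rise in lag_ms toward the retention limit is the condition under which lag_exceeds_shard_retention_behavior applies.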
Limitations
- DynamoDB Streams shards are retained for 24 hours. If Spice falls behind by more than that, the connector follows lag_exceeds_shard_retention_behavior (default: error).
- refresh_sql is not supported with DynamoDB Streams.
See also
- DynamoDB Data Connector: complete parameter reference, deployment guide, and supported AWS regions.
- DynamoDB Streams cookbook recipe.
- Refresh-mode reference for refresh_mode: changes.
