
DynamoDB Data Connector Deployment Guide

Production operating guide for the DynamoDB data connector covering IAM, DynamoDB Streams CDC, checkpointing, and lag handling.

Authentication & Secrets

DynamoDB authentication uses the standard AWS credential chain. Configure credentials with the following parameters, which mirror those of the S3 connector:

| Parameter | Description |
| --- | --- |
| dynamodb_aws_region | AWS region of the DynamoDB table. |
| dynamodb_aws_access_key_id | Explicit access key (optional; falls back to the credential chain when unset). |
| dynamodb_aws_secret_access_key | Explicit secret key (optional). |
| dynamodb_aws_session_token | Session token for temporary credentials (optional). |

For production on EKS/ECS, leave access-key parameters unset and rely on instance-profile, IRSA, or ECS task-role credentials. Grant the role dynamodb:Scan, dynamodb:Query, and dynamodb:DescribeTable on the table; for streams, additionally grant dynamodb:DescribeStream, dynamodb:GetShardIterator, dynamodb:GetRecords, and dynamodb:ListStreams.
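The permissions listed above can be captured in an IAM policy document. The following is a minimal sketch, not an official policy: the function name `build_connector_policy`, the example region, account ID, and table name are all hypothetical, and scoping the stream actions to the `table/<name>/stream/*` resource is an assumption that should be checked against your own account's requirements.

```python
import json


def build_connector_policy(region: str, account_id: str, table: str) -> dict:
    """Build an IAM policy document with the table and stream read actions.

    Hypothetical helper for illustration; resource scoping is an assumption.
    """
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed for full-table refresh and schema discovery.
                "Sid": "TableRead",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:Scan",
                    "dynamodb:Query",
                    "dynamodb:DescribeTable",
                ],
                "Resource": table_arn,
            },
            {
                # Needed only when consuming DynamoDB Streams (CDC).
                "Sid": "StreamRead",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeStream",
                    "dynamodb:GetShardIterator",
                    "dynamodb:GetRecords",
                    "dynamodb:ListStreams",
                ],
                "Resource": f"{table_arn}/stream/*",
            },
        ],
    }


if __name__ == "__main__":
    # Example values only; substitute your own region, account, and table.
    policy = build_connector_policy("us-east-1", "123456789012", "orders")
    print(json.dumps(policy, indent=2))
```

If the dataset does not use streams, the second statement can be dropped so the role carries only the table read actions.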

When not using IAM role authentication, source the access key, secret key, and session token from a secret store rather than writing them into configuration directly.

Resilience Controls

Streams and Checkpointing

The DynamoDB connector supports CDC via DynamoDB Streams with an accelerated dataset as the sink. Stream state is persisted as a checkpoint alongside the accelerator, allowing resumption after a restart.

| Parameter | Default | Description |
| --- | --- | --- |
| scan_interval | 0s | Interval between polls for new records in a DynamoDB stream. |
| ready_lag | 2s | Once lag falls below this threshold, the dataset is reported as Ready. |
| lag_exceeds_shard_retention_behavior | error | Behavior when stream lag exceeds shard retention (24h): error, ready_before_load, or ready_after_load. |

Shard Retention and Lag

DynamoDB Streams retain records for 24 hours. If Spice is offline longer than the retention window, the checkpoint becomes stale and the next stream open returns ShardNotFound. Behavior is controlled by lag_exceeds_shard_retention_behavior:

  • error (default): Mark the dataset Error. Requires operator intervention to re-bootstrap.
  • ready_before_load: Mark the dataset Ready immediately, then re-bootstrap the accelerated dataset in the background. Queries see stale data until the bootstrap completes.
  • ready_after_load: Re-bootstrap the accelerated dataset, then mark it Ready. Queries block or return an error until the bootstrap completes.

A checkpoint older than 18 hours is treated as near-expired and triggers the same recovery path even if the shard has not yet been dropped by DynamoDB.
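The restart decision described above can be modeled as a small function. This is an illustrative sketch of the documented rules only, not the connector's implementation: the function name `plan_restart` and the returned action labels are hypothetical, and treating a checkpoint of exactly 18 hours as still resumable is an assumption.

```python
from datetime import timedelta

# Thresholds taken from the text above.
NEAR_EXPIRED = timedelta(hours=18)   # checkpoint treated as near-expired
SHARD_RETENTION = timedelta(hours=24)  # DynamoDB Streams retention window

BEHAVIORS = ("error", "ready_before_load", "ready_after_load")


def plan_restart(checkpoint_age: timedelta, behavior: str = "error") -> str:
    """Return a hypothetical action label for a restart with the given checkpoint age."""
    if behavior not in BEHAVIORS:
        raise ValueError(f"unknown behavior: {behavior!r}")

    # A sufficiently fresh checkpoint resumes the stream where it left off.
    if checkpoint_age <= NEAR_EXPIRED:
        return "resume_from_checkpoint"

    # Older than 18h (including past the 24h retention): recovery path.
    if behavior == "error":
        return "mark_error"                   # operator must re-bootstrap
    if behavior == "ready_before_load":
        return "mark_ready_then_bootstrap"    # stale reads until bootstrap ends
    return "bootstrap_then_mark_ready"        # queries wait or fail until ready


if __name__ == "__main__":
    for hours in (2, 19, 30):
        for behavior in BEHAVIORS:
            print(hours, behavior, plan_restart(timedelta(hours=hours), behavior))
```

Note that a 19-hour-old checkpoint and a 30-hour-old checkpoint take the same branch, which reflects the near-expired rule stated above.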

Capacity & Sizing

  • Read capacity: Full-table scans consume provisioned read capacity. Use on-demand billing or reserve sufficient RCU for refresh windows to avoid throttling.
  • Stream throughput: DynamoDB Streams shards cap at 1000 records/sec and 2 MB/sec each. Wide or high-write tables automatically partition into more shards.
  • Checkpoint storage: Checkpoint records live in the acceleration engine and add roughly one row per stream shard. Negligible for sizing.
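For rough planning, the per-shard caps quoted above give a lower bound on how many shards a given write rate implies. The sketch below is back-of-the-envelope arithmetic under those stated caps; the function name `min_shards` and the input figures are hypothetical, and the real shard count is decided by DynamoDB's own partitioning, so treat the result as a floor rather than a prediction.

```python
import math

# Per-shard caps as stated in the sizing notes above.
MAX_RECORDS_PER_SEC = 1000
MAX_MB_PER_SEC = 2


def min_shards(records_per_sec: float, avg_record_kb: float) -> int:
    """Lower bound on shards needed to carry the given stream write rate."""
    mb_per_sec = records_per_sec * avg_record_kb / 1024
    by_records = math.ceil(records_per_sec / MAX_RECORDS_PER_SEC)
    by_bytes = math.ceil(mb_per_sec / MAX_MB_PER_SEC)
    # A stream always has at least one shard to read from.
    return max(1, by_records, by_bytes)


if __name__ == "__main__":
    # Many small records: bounded by the record-rate cap.
    print(min_shards(records_per_sec=2500, avg_record_kb=1))
    # Fewer but wide records: bounded by the byte-rate cap.
    print(min_shards(records_per_sec=500, avg_record_kb=16))
```

Because checkpoint storage adds roughly one row per shard, the same figure also bounds the checkpoint footprint from below.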

Metrics

The DynamoDB connector registers the following metrics:

| Metric | Type | Description |
| --- | --- | --- |
| shards_active | Gauge | Number of active DynamoDB Streams shards being consumed. |
| records_consumed_total | Counter | Total stream records consumed. |
| lag_ms | Gauge | Current lag behind the stream head in milliseconds (approximate, summed across shards). |
| errors_transient_total | Counter | Transient stream read errors (retried automatically). |
| reinitializations_on_lag_exceeds_shard_retention_total | Counter | Number of times the stream was reinitialized due to lag exceeding shard retention. |

Metrics are exposed with the dataset_dynamodb_ prefix. Monitor lag_ms together with errors_transient_total: a climbing lag with rising transient errors indicates the connector is falling behind and risks exceeding the 24-hour shard retention window.
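The combined signal described above (lag climbing while transient errors rise) can be expressed as a simple check over recent samples. This is a hedged sketch of one possible heuristic, not a built-in alert: the function name `falling_behind`, the sample format, and the three-sample window are assumptions, and the samples are expected to come from whatever system scrapes dataset_dynamodb_lag_ms and dataset_dynamodb_errors_transient_total.

```python
from typing import List, Tuple

# Each sample is (lag_ms, errors_transient_total), ordered oldest to newest.
Sample = Tuple[float, int]


def falling_behind(samples: List[Sample], window: int = 3) -> bool:
    """Return True when lag rose at every step and the error counter grew
    across the last `window` samples."""
    if len(samples) < window:
        return False  # not enough data to judge a trend

    recent = samples[-window:]
    lags = [lag for lag, _ in recent]
    errors = [err for _, err in recent]

    # Gauge: require a strict increase between consecutive samples.
    lag_climbing = all(later > earlier for earlier, later in zip(lags, lags[1:]))
    # Counter: any growth over the window means new transient errors occurred.
    errors_rising = errors[-1] > errors[0]
    return lag_climbing and errors_rising


if __name__ == "__main__":
    history = [(1_000, 0), (5_000, 2), (20_000, 5)]
    print(falling_behind(history))
```

Requiring both conditions avoids alerting on a short lag spike that clears by itself, or on retried errors that never affect lag.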

See Component Metrics for enabling and exporting metrics.

Task History

Stream polling and bootstrap operations emit spans that participate in task history under the enclosing accelerated_table_refresh and changes-stream tasks.

Known Limitations

  • Global Secondary Indexes: Not exposed as separate datasets. Query the base table and let DataFusion filter.
  • Conditional writes: DynamoDB conditional expressions (e.g., attribute_exists) are not supported in DML operations.
  • Cross-region streams: dynamodb_aws_region must match the region of the source table. Cross-region access requires resource policies and is not recommended.
  • Tables with streams disabled (StreamSpecification): CDC mode is unavailable; fall back to full-table refresh.
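Whether a table can be used in CDC mode can be checked ahead of deployment from the DescribeTable response. The sketch below is a minimal example that inspects a response dictionary shaped like the one returned by boto3's `describe_table`; the function name `stream_status` is hypothetical, and the sample responses are made up for illustration rather than captured from a real table.

```python
from typing import Optional, Tuple


def stream_status(describe_table_response: dict) -> Tuple[bool, Optional[str]]:
    """Return (stream_enabled, stream_view_type) from a DescribeTable response.

    StreamSpecification is absent when streams were never enabled,
    so missing keys are treated as "disabled".
    """
    table = describe_table_response.get("Table", {})
    spec = table.get("StreamSpecification") or {}
    enabled = bool(spec.get("StreamEnabled", False))
    view_type = spec.get("StreamViewType") if enabled else None
    return enabled, view_type


if __name__ == "__main__":
    # Illustrative responses; a real one would come from
    # boto3.client("dynamodb").describe_table(TableName="orders").
    with_stream = {
        "Table": {
            "TableName": "orders",
            "StreamSpecification": {
                "StreamEnabled": True,
                "StreamViewType": "NEW_AND_OLD_IMAGES",
            },
        }
    }
    without_stream = {"Table": {"TableName": "orders"}}

    print(stream_status(with_stream))
    print(stream_status(without_stream))
```

A result of `(False, None)` means the dataset should be configured for full-table refresh until streams are enabled on the table.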

Troubleshooting

| Symptom | Likely cause | Resolution |
| --- | --- | --- |
| Dataset stuck in Error after restart with stream enabled | Checkpoint is older than 18h or has exceeded the 24h retention window. | Set lag_exceeds_shard_retention_behavior: ready_after_load to auto-recover, or trigger a manual refresh. |
| ProvisionedThroughputExceededException | RCU exhausted during initial scan. | Switch to on-demand billing, raise RCU for the refresh window, or slow the refresh via acceleration settings. |
| TrimmedDataAccessException | Records trimmed from the stream before they could be processed. | Same recovery path as ShardNotFound: re-bootstrap. Reduce bootstrap duration via parallel segments if supported. |
| AccessDeniedException on DescribeStream | IAM role lacks stream permissions. | Add dynamodb:DescribeStream, dynamodb:GetShardIterator, dynamodb:GetRecords, and dynamodb:ListStreams to the role. |
| ResourceNotFoundException on stream start | Stream not enabled on the table. | Enable streams on the DynamoDB table (NEW_AND_OLD_IMAGES recommended). |