Skip to main content
Version: Next

PostgreSQL Data Connector Deployment Guide

Production operating guide for the PostgreSQL data connector covering authentication, connection pool sizing, TLS, metrics, and observability.

Authentication & Secrets​

The connector uses the native PostgreSQL wire protocol with username/password authentication.

ParameterDescription
pg_hostPostgreSQL server hostname.
pg_portTCP port (default 5432).
pg_dbDatabase name.
pg_userDatabase user.
pg_passPassword. Use ${secrets:...} to resolve from a configured secret store.
pg_connection_stringAlternative to the individual parameters.

Passwords must be sourced from a secret store in production. See Secret Stores for configuration options (environment variables, file, Kubernetes, AWS Secrets Manager, HashiCorp Vault).

TLS​

TLS is controlled via pg_sslmode:

ValueBehavior
disableNo TLS.
allowTry non-TLS first, retry with TLS if the server requires it.
preferTry TLS, fall back to plaintext. Not recommended for production.
requireRequire TLS; no server certificate verification.
verify-caRequire TLS and verify the CA chain.
verify-fullRequire TLS, verify CA chain, and verify server hostname.

For production, use verify-full with pg_sslrootcert pointing to the CA bundle file path.

Resilience Controls​

Connection Pool Sizing​

The connector maintains a per-dataset connection pool:

ParameterDefaultDescription
pg_connection_pool_min_idle1Minimum idle connections held by the pool.
connection_pool_size5Maximum connections the pool will open.

When pg_connection_pool_min_idle exceeds connection_pool_size, the pool silently caps idle connections at the pool size.

Size the pool to match concurrent query and refresh load for the dataset. The server's max_connections (default 100) is a shared budget across Spice datasets, other clients, and server-side background workers — plan accordingly, or front Postgres with PgBouncer.

Application Name​

The connector automatically sets application_name to the Spice.ai version string, which surfaces in pg_stat_activity.application_name. This value is not configurable.

Retry Behavior​

Transient query failures are not automatically retried at the connector layer. Dataset refresh retries are controlled by the acceleration refresh policy (see Data Refresh).

Capacity & Sizing​

  • Network: Postgres traffic is TCP. Sum connection_pool_size across all Spice datasets sharing the server when sizing max_connections.
  • Memory: Result sets are streamed in record batches; memory footprint for federated reads is bounded by DataFusion's batch size (8192 rows default).
  • Connection setup cost: TLS handshake and authentication add latency to cold connections. connection_pool_min_idle keeps a warm pool to absorb burst traffic.

Metrics​

The PostgreSQL connector exposes observable metrics for its replication pipeline. Enable them in the dataset's metrics section. See Component Metrics for general configuration.

Metric NameTypeDescription
replication_lag_msObservableGaugeReplication lag in milliseconds.
replication_lag_bytesObservableGaugeReplication lag in bytes.
replication_confirmed_flush_lsnObservableGaugeConfirmed flush LSN position.
replication_server_wal_end_lsnObservableGaugeServer WAL end LSN position.
replication_transactions_totalObservableCounterTotal transactions received via replication.
replication_inserts_totalObservableCounterTotal insert operations received.
replication_updates_totalObservableCounterTotal update operations received.
replication_deletes_totalObservableCounterTotal delete operations received.
replication_truncates_totalObservableCounterTotal truncate operations received.
replication_bootstrap_rows_totalObservableCounterTotal rows fetched during initial bootstrap.
replication_bootstrap_completeObservableGaugeBootstrap completion status.
replication_decode_errors_totalObservableCounterTotal WAL decode errors.
replication_schema_mismatch_errors_totalObservableCounterTotal schema mismatch errors during replication.
replication_recv_errors_totalObservableCounterTotal receive errors during replication.
replication_reconnects_totalObservableCounterTotal reconnection attempts.

Metric instruments are exposed with the prefix dataset_postgres_. Each instrument carries a name attribute set to the dataset name.

Task History​

PostgreSQL operations participate in Spice task history via the shared SQL data-connector spans. Queries executed against Postgres are captured as child spans of the enclosing sql_query or accelerated_table_refresh task.

Known Limitations​

  • Only TCP connections are supported. Unix sockets are not exposed through Spice configuration.
  • pg_sslmode: prefer silently downgrades to plaintext and is not recommended for production.
  • LISTEN/NOTIFY is not exposed. CDC is supported natively via logical replication (WAL streaming) — see the replication parameters in the connector docs.
  • Server-side cursors are used for federated reads; long-running queries hold a backend for their duration.

Troubleshooting​

SymptomLikely causeResolution
FATAL: password authentication failedIncorrect credentials.Verify credentials via the secret store; test with psql using the same credentials.
FATAL: too many clients alreadyPool size + other clients exceeds server max_connections.Reduce connection_pool_size or raise max_connections / front the server with PgBouncer.
Idle connections never exceed connection_pool_size despite a higher pg_connection_pool_min_idleThe pool silently caps min_idle at the pool size.Set pg_connection_pool_min_idle to connection_pool_size or lower for clarity.
Sustained active_wait_requests > 0Pool saturation.Increase connection_pool_size or reduce concurrent refreshes.
certificate verify failedpg_sslmode: verify-ca / verify-full with wrong CA or hostname.Verify pg_sslrootcert matches the server's issuing CA; with verify-full ensure hostname matches SAN.
Sessions lingering with the default app nameMultiple Spice instances share the same version-based name.The application_name is auto-set to the Spice.ai version and is not currently configurable.