PostgreSQL Data Connector Deployment Guide
Production operating guide for the PostgreSQL data connector covering authentication, connection pool sizing, TLS, metrics, and observability.
Authentication & Secrets​
The connector uses the native PostgreSQL wire protocol with username/password authentication.
| Parameter | Description |
|---|---|
pg_host | PostgreSQL server hostname. |
pg_port | TCP port (default 5432). |
pg_db | Database name. |
pg_user | Database user. |
pg_pass | Password. Use ${secrets:...} to resolve from a configured secret store. |
pg_connection_string | Alternative to the individual parameters. |
pg_application_name | Application identifier reported in pg_stat_activity. Defaults to the Spice.ai version. |
Passwords must be sourced from a secret store in production. See Secret Stores for configuration options (environment variables, file, Kubernetes, AWS Secrets Manager, HashiCorp Vault).
TLS​
TLS is controlled via pg_sslmode:
| Value | Behavior |
|---|---|
disable | No TLS. |
prefer | Try TLS, fall back to plaintext. Not recommended for production. |
require | Require TLS; no server certificate verification. |
verify-ca | Require TLS and verify the CA chain. |
verify-full | Require TLS, verify CA chain, and verify server hostname. |
For production, use verify-full with pg_sslrootcert pointing to the CA bundle.
Resilience Controls​
Connection Pool Sizing​
The connector maintains a per-dataset connection pool:
| Parameter | Default | Description |
|---|---|---|
connection_pool_min_idle | Tracks connection_pool_size with a floor of 1. | Minimum idle connections held by the pool. |
connection_pool_size | 10 | Maximum connections the pool will open. |
connection_pool_min_idle must be less than or equal to connection_pool_size; conflicting values are rejected as configuration errors at startup.
Size the pool to match concurrent query and refresh load for the dataset. The server's max_connections (default 100) is a shared budget across Spice datasets, other clients, and server-side background workers — plan accordingly, or front Postgres with PgBouncer.
Application Name​
pg_application_name defaults to the Spice.ai version string, which surfaces in pg_stat_activity.application_name. Override this to distinguish traffic from multiple Spice instances or environments.
Retry Behavior​
Transient query failures are not automatically retried at the connector layer. Dataset refresh retries are controlled by the acceleration refresh policy (see Data Refresh).
Capacity & Sizing​
- Network: Postgres traffic is TCP. Sum
connection_pool_sizeacross all Spice datasets sharing the server when sizingmax_connections. - Memory: Result sets are streamed in record batches; memory footprint for federated reads is bounded by DataFusion's batch size (8192 rows default).
- Connection setup cost: TLS handshake and authentication add latency to cold connections.
connection_pool_min_idlekeeps a warm pool to absorb burst traffic.
Metrics​
The PostgreSQL connector exposes observable metrics for its connection pool. Enable them in the dataset's metrics section. See Component Metrics for general configuration.
| Metric Name | Type | Description |
|---|---|---|
connection_count | ObservableGauge | Active connections to the database server. |
connections_in_pool | ObservableGauge | Idle connections sitting in the pool. |
active_wait_requests | ObservableGauge | Requests waiting for a connection (saturation signal). |
create_failed | Counter | Connections that failed to be created. |
discarded_excess_idle_connection | Counter | Connections closed because the pool already had enough idle connections. |
discarded_unestablished_connection | Counter | Connections closed because they could not be established. |
dirty_connection_return | Counter | Connections returned in a dirty state (open transaction, pending queries). |
Metric instruments are exposed with the prefix dataset_postgres_. Each instrument carries a name attribute set to the dataset name.
Task History​
PostgreSQL operations participate in Spice task history via the shared SQL data-connector spans. Queries executed against Postgres are captured as child spans of the enclosing sql_query or accelerated_table_refresh task.
Known Limitations​
- Only TCP connections are supported. Unix sockets are not exposed through Spice configuration.
pg_sslmode: prefersilently downgrades to plaintext and is not recommended for production.LISTEN/NOTIFYis not exposed; Postgres CDC is handled through Debezium rather than the Postgres connector directly.- Server-side cursors are used for federated reads; long-running queries hold a backend for their duration.
Troubleshooting​
| Symptom | Likely cause | Resolution |
|---|---|---|
FATAL: password authentication failed | Incorrect credentials. | Verify credentials via the secret store; test with psql using the same credentials. |
FATAL: too many clients already | Pool size + other clients exceeds server max_connections. | Reduce connection_pool_size or raise max_connections / front the server with PgBouncer. |
connection_pool_min_idle must be <= connection_pool_size at startup | Misconfiguration. | Correct the values so min_idle <= size. |
Sustained active_wait_requests > 0 | Pool saturation. | Increase connection_pool_size or reduce concurrent refreshes. |
certificate verify failed | pg_sslmode: verify-ca / verify-full with wrong CA or hostname. | Verify pg_sslrootcert matches the server's issuing CA; with verify-full ensure hostname matches SAN. |
| Sessions lingering with the default app name | Multiple Spice instances share the same name. | Set pg_application_name per deployment for clear pg_stat_activity attribution. |
