Skip to main content
Version: Next

PostgreSQL Data Connector

PostgreSQL is an advanced open-source relational database management system known for its reliability, extensibility, and support for SQL compliance.

The PostgreSQL Server Data Connector enables federated/accelerated SQL queries on data stored in PostgreSQL databases.

datasets:
- from: postgres:my_table
name: my_dataset
params: ...

Quickstart​

Connect to a local PostgreSQL database and accelerate a table for fast local queries:

version: v1
kind: Spicepod
name: pg_demo

datasets:
- from: postgres:public.customers
name: customers
params:
pg_host: localhost
pg_port: "5432"
pg_db: mydb
pg_user: spice_reader
pg_pass: ${secrets:PG_PASSWORD}
acceleration:
enabled: true
echo "PG_PASSWORD=your_password" > .env

Start Spice and query the data:

spice run
# In another terminal:
spice sql
sql> SELECT count(*) FROM customers;

Configuration​

from​

The from field takes the form postgres:my_table where my_table is the table identifer in the PostgreSQL server to read from.

The fully-qualified table name (database.schema.table) can also be used in the from field.

datasets:
- from: postgres:my_database.my_schema.my_table
name: my_dataset
params: ...
info

Unquoted identifiers are normalized to lowercase. To reference a table or schema with mixed-case characters, wrap each case-sensitive part in double quotes: postgres:my_schema."MixedCaseTable". See Identifier Case Sensitivity.

name​

The dataset name. This will be used as the table name within Spice.

Example:

datasets:
- from: postgres:my_database.my_schema.my_table
name: cool_dataset
params: ...
SELECT COUNT(*) FROM cool_dataset;
+----------+
| count(*) |
+----------+
| 6001215 |
+----------+

params​

The connection to PostgreSQL can be configured by providing the following params:

Parameter NameDescription
pg_connection_stringOptional. The connection string to use to connect to the PostgreSQL server. This can be used instead of providing individual connection parameters.
pg_hostThe hostname of the PostgreSQL server.
pg_portThe port of the PostgreSQL server.
pg_dbThe name of the database to connect to.
pg_userThe username to connect with.
pg_passThe password to connect with. Use the secret replacement syntax to load the password from a secret store, e.g. ${secrets:my_pg_pass}.
pg_sslmodeOptional. Specifies the SSL/TLS behavior for the connection, supported values:
  • verify-full: (default) This mode requires an SSL connection, a valid root certificate, and the server host name to match the one specified in the certificate.
  • verify-ca: This mode requires a TLS connection and a valid root certificate.
  • require: This mode requires a TLS connection.
  • prefer: This mode will try to establish a secure TLS connection if possible, but will connect insecurely if the server does not support TLS.
  • disable: This mode will not attempt to use a TLS connection, even if the server supports it.
pg_sslrootcertOptional parameter specifying the path to a custom PEM certificate that the connector will trust.
pg_connection_pool_min_idleOptional. The minimum number of idle connections to keep open in the pool. Default is 1.
connection_pool_sizeOptional. The maximum number of connections created in the connection pool. Default is 5.

Types​

The table below shows the PostgreSQL data types supported, along with the type mapping to Apache Arrow types in Spice.

PostgreSQL TypeArrow Type
int2Int16
int4Int32
int8Int64
moneyInt64
float4Float32
float8Float64
numericDecimal128
textUtf8
varcharUtf8
bpcharUtf8
uuidUtf8
byteaBinary
boolBoolean
jsonUtf8
timestampTimestamp(Nanosecond, None)
timestamptzTimestamp(Nanosecond, UTC)
dateDate32
timeTime64(Nanosecond)
intervalInterval(MonthDayNano)
pointFixedSizeList(Float64[2])
int2[]List(Int16)
int4[]List(Int32)
int8[]List(Int64)
float4[]List(Float32)
float8[]List(Float64)
text[]List(Utf8)
bool[]List(Boolean)
bytea[]List(Binary)
geometryBinary
geographyBinary
enumDictionary(Int8, Utf8)
Composite TypesStruct
info

The Postgres federated queries may result in unexpected result types due to the difference in DataFusion and Postgres size increase rules. Explicitly specify the expected output type of aggregation functions when writing queries involving Postgres tables in Spice. For example, rewrite SUM(int_col) into CAST (SUM(int_col) as BIGINT).

Examples​

Connecting using Username/Password​

datasets:
- from: postgres:my_database.my_schema.my_table
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass: ${secrets:my_pg_pass}

Connect using SSL​

datasets:
- from: postgres:my_database.my_schema.my_table
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass: ${secrets:my_pg_pass}
pg_sslmode: verify-ca
pg_sslrootcert: ./custom_cert.pem

Separate dataset/accelerator secrets​

Specify different secrets for a PostgreSQL source and acceleration:

datasets:
- from: postgres:my_schema.my_table
name: my_dataset
params:
pg_host: localhost
pg_port: 5432
pg_db: my_database
pg_user: my_user
pg_pass: ${secrets:pg1_pass}
acceleration:
engine: postgres
params:
pg_host: localhost
pg_port: 5433
pg_db: acceleration
pg_user: two_user_two_furious
pg_pass: ${secrets:pg2_pass}

Secrets​

Spice integrates with multiple secret stores to help manage sensitive data securely. For detailed information on supported secret stores, refer to the secret stores documentation. Additionally, learn how to use referenced secrets in component parameters by visiting the using referenced secrets guide.

Cookbook​