Skip to main content
Version: Next

ClickHouse Data Connector

ClickHouse is a fast, open-source columnar database management system designed for online analytical processing (OLAP) and real-time analytics. This connector enables federated SQL queries from a ClickHouse server.

datasets:
- from: clickhouse:my.dataset
name: my_dataset

Configuration​

from​

The from field for the ClickHouse connector takes the form of from:db.dataset where db.dataset is the path to the Dataset within ClickHouse. In the example above it would be my.dataset.

The clickhouse_db parameter is required when not using clickhouse_connection_string. When using a connection string without a database path, it defaults to the default database.

info

Unquoted identifiers are normalized to lowercase. To reference a table or database with mixed-case characters, wrap each case-sensitive part in double quotes: clickhouse:my_db."MixedCaseTable". See Identifier Case Sensitivity.

name​

The dataset name. This will be used as the table name within Spice.

datasets:
- from: clickhouse:my.dataset
name: cool_dataset
SELECT COUNT(*) FROM cool_dataset;
+----------+
| count(*) |
+----------+
| 6001215 |
+----------+

The dataset name cannot be a reserved keyword or any of the following keywords that are reserved by ClickHouse:

  • PREWHERE
  • SETTINGS
  • FORMAT

params​

The ClickHouse data connector can be configured by providing the following params:

Parameter NameDefinition
clickhouse_connection_stringThe connection string to use to connect to the ClickHouse server. This can be used instead of providing individual connection parameters.
clickhouse_hostThe hostname of the ClickHouse server.
clickhouse_tcp_portThe port of the ClickHouse server.
clickhouse_dbThe name of the database to connect to.
clickhouse_userThe username to connect with.
clickhouse_passThe password to connect with.
clickhouse_secureOptional. Specifies the SSL/TLS behavior for the connection, supported values:
  • true: (default) This mode requires an SSL connection. If a secure connection cannot be established, server will not connect.
  • false: This mode will not attempt to use an SSL connection, even if the server supports it.
connection_timeoutOptional. Specifies the connection timeout in milliseconds. Default is 10000 (10 seconds).

Types​

The table below shows the ClickHouse data types supported, along with the type mapping to Apache Arrow types in Spice.

ClickHouse TypeArrow Type
BoolBoolean
Int8Int8
Int16Int16
Int32Int32
Int64Int64
UInt8UInt8
UInt16UInt16
UInt32UInt32
UInt64UInt64
Float32Float32
Float64Float64
DecimalDecimal128 / Decimal256
StringUtf8
FixedStringUtf8
UUIDUtf8
DateDate32
Date32Date32
DateTimeTimestamp(Second, None)
Nullable(T)Mapped inner type T

Examples​

Connecting to localhost​

datasets:
- from: clickhouse:my.dataset
name: my_dataset
params:
clickhouse_host: localhost
clickhouse_tcp_port: 9000
clickhouse_db: my_database
clickhouse_user: my_user
clickhouse_pass: ${secrets:my_clickhouse_pass}
connection_timeout: 10000
clickhouse_secure: false

Specifying a connection timeout​

datasets:
- from: clickhouse:my.dataset
name: my_dataset
params:
clickhouse_connection_string: tcp://my_user:${secrets:my_clickhouse_pass}@localhost:9000/my_database
connection_timeout: 10000
clickhouse_secure: true

Using a connection string​

datasets:
- from: clickhouse:my.dataset
name: my_dataset
params:
clickhouse_connection_string: tcp://my_user:${secrets:my_clickhouse_pass}@localhost:9000/my_database?connection_timeout=10000&secure=true

Secrets​

Spice integrates with multiple secret stores to help manage sensitive data securely. For detailed information on supported secret stores, refer to the secret stores documentation. Additionally, learn how to use referenced secrets in component parameters by visiting the using referenced secrets guide.

Cookbook​