Unity Catalog Catalog Connector
Connect to a Unity Catalog as a catalog provider for federated SQL queries against Delta Lake tables.
Configuration
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    include:
      - "*.my_table"
    dataset_params:
      # delta_lake S3 parameters
      unity_catalog_aws_region: us-west-2
      unity_catalog_aws_access_key_id: ${secrets:aws_access_key_id}
      unity_catalog_aws_secret_access_key: ${secrets:aws_secret_access_key}
      unity_catalog_aws_endpoint: s3.us-west-2.amazonaws.com
from
The from field specifies the catalog provider. For Unity Catalog, use unity_catalog:<catalog_path>. The catalog_path is the URL of the getCatalog endpoint of the Unity Catalog API, formatted as https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>.
name
The name field specifies the name of the catalog in Spice. The schema hierarchy of the external catalog is preserved in Spice.
include
Use the include field to specify which tables to include from the catalog. The include field supports glob patterns to match multiple tables. For example, *.my_table_name includes every table named my_table_name in the catalog, from any schema. Multiple include patterns can be specified; they are OR'ed together, so a table is included if it matches any one of them.
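As a minimal sketch of combining several patterns, the fragment below reuses the host, catalog, and table names from the configuration example. The schema name my_schema and the second pattern are hypothetical and shown only to illustrate how patterns are OR'ed.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    include:
      # Matches my_table in any schema (pattern from the configuration example).
      - "*.my_table"
      # Hypothetical pattern: every table in a schema assumed to be named my_schema.
      - "my_schema.*"
```

With both patterns present, a table is included when it matches either one.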
params
The params field configures the connection to the Unity Catalog. The following parameters are supported:
- unity_catalog_token: The personal access token used to authenticate against the Unity Catalog API.
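A short sketch of supplying the token follows. It assumes the token is stored in a secret named unity_catalog_token (a hypothetical name) and reuses the ${secrets:...} syntax from the configuration example.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    params:
      # The secret name here is an assumption; use whichever secret holds your token.
      unity_catalog_token: ${secrets:unity_catalog_token}
```

Referencing a secret keeps the personal access token out of the configuration file itself.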
dataset_params
The dataset_params field configures the dataset-specific parameters for the catalog.
Unity Catalog object store parameters
AWS S3
- unity_catalog_aws_region: The AWS region for the S3 object store. For example, us-west-2.
- unity_catalog_aws_access_key_id: The access key ID for the S3 object store.
- unity_catalog_aws_secret_access_key: The secret access key for the S3 object store.
- unity_catalog_aws_endpoint: The endpoint for the S3 object store. For example, s3.us-west-2.amazonaws.com.
Azure Blob
One of the following auth values must be provided for Azure Blob: unity_catalog_azure_storage_account_key, unity_catalog_azure_storage_client_id and unity_catalog_azure_storage_client_secret, or unity_catalog_azure_storage_sas_key.
- unity_catalog_azure_storage_account_name: The Azure Storage account name.
- unity_catalog_azure_storage_account_key: The Azure Storage master key for accessing the storage account.
- unity_catalog_azure_storage_client_id: The service principal client ID for accessing the storage account.
- unity_catalog_azure_storage_client_secret: The service principal client secret for accessing the storage account.
- unity_catalog_azure_storage_sas_key: The shared access signature key for accessing the storage account.
- unity_catalog_azure_storage_endpoint: The endpoint for the Azure Blob storage account.
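The fragment below sketches one way the Azure Blob parameters might be combined, using the account key option. The storage account name and the secret name are hypothetical placeholders.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    dataset_params:
      # Hypothetical storage account name.
      unity_catalog_azure_storage_account_name: my_storage_account
      # Account key auth; the secret name is an assumption.
      unity_catalog_azure_storage_account_key: ${secrets:azure_storage_account_key}
```

The service principal pair or the SAS key could be supplied in place of the account key.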
Google Storage (GCS)
- unity_catalog_google_service_account: Filesystem path to the Google service account JSON key file.
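For GCS, a minimal sketch might look like the following. The key file path is a hypothetical placeholder and should point to wherever the service account JSON key is stored.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    dataset_params:
      # Hypothetical path to the service account JSON key file.
      unity_catalog_google_service_account: /path/to/service-account.json
```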
Limitations
- The Unity Catalog connector does not support reading Delta tables with the V2Checkpoint feature enabled. To use the connector with such tables, drop the V2Checkpoint feature by executing the following command:
  ALTER TABLE <table-name> DROP FEATURE v2Checkpoint [TRUNCATE HISTORY];
  For more details on dropping Delta table features, refer to the official documentation: Drop Delta table features