
Unity Catalog Catalog Connector Deployment Guide

Production operating guide for the Unity Catalog catalog connector — discovering Databricks Unity Catalog tables and federating them through Spice.

For Databricks-specific operational concerns (SQL Warehouse resilience, metrics, permissions flow as applied to Databricks workspaces), see the Databricks Deployment Guide — the Unity Catalog logic described there applies directly when the catalog connector targets a Databricks workspace.

Authentication & Secrets

| Parameter | Description |
| --- | --- |
| unity_catalog_token | Bearer token for the Unity Catalog API. Use ${secrets:...} from a secret store. |

The catalog URL must match the pattern https://<host>/api/2.1/unity-catalog/catalogs/<catalog_id> and is parsed into the endpoint and catalog identifier at startup. Mismatched URLs are rejected as configuration errors.
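The URL-splitting step above can be sketched as a simple pattern match. This is an illustrative Python sketch of the validation described, not the connector's actual (Rust) implementation:

```python
import re

# The catalog URL must match
# https://<host>/api/2.1/unity-catalog/catalogs/<catalog_id>.
CATALOG_URL = re.compile(
    r"^(?P<endpoint>https://[^/]+)/api/2\.1/unity-catalog/catalogs/(?P<catalog_id>[^/]+)$"
)

def parse_catalog_url(url: str) -> tuple[str, str]:
    """Split a UC catalog URL into (endpoint, catalog_id) or reject it."""
    m = CATALOG_URL.match(url)
    if m is None:
        raise ValueError(f"invalid Unity Catalog URL: {url}")
    return m.group("endpoint"), m.group("catalog_id")
```

A URL missing the /api/2.1/unity-catalog/catalogs/ path segment fails this match and is rejected at startup rather than at first query.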

The token is optional — when unset, the catalog connector issues unauthenticated requests, suitable for locally-hosted Unity Catalog deployments (OSS UC) with permissive access. For Databricks workspaces, the token is always required.

Secrets must be sourced from a secret store in production. Rotate tokens from the UC / Databricks console and update the secret store.
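A catalog entry in spicepod.yaml might look like the following. The host, catalog name, and secret key are placeholders; confirm the exact from: syntax against the Unity Catalog connector reference:

```yaml
catalogs:
  - from: unity_catalog:https://dbc-123.cloud.databricks.com/api/2.1/unity-catalog/catalogs/main
    name: uc_prod
    params:
      # Resolved from the configured secret store; never inline the token.
      unity_catalog_token: ${secrets:databricks_token}
```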

Resilience Controls

HTTP Retry Policy

The Unity Catalog client uses the shared resilient_http helper with these defaults:

  • Maximum retries: 3
  • Backoff: fibonacci
  • Retriable conditions: HTTP 408, 429, 5xx, and transient network errors (connect, timeout)
  • Respects Retry-After, retry-after-ms, x-retry-after-ms headers
  • Maximum backoff: 300 seconds

These are not exposed as user-tunable parameters on the Unity Catalog connector itself.
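The backoff behavior above can be sketched as follows. This is an illustrative Python model of the stated defaults (fibonacci spacing, 300-second cap, server retry hints taking precedence), not the resilient_http implementation itself:

```python
MAX_BACKOFF_S = 300.0

def fibonacci_backoff(base_s: float = 1.0):
    """Yield fibonacci-spaced delays (1, 1, 2, 3, 5, ...) x base, capped at 300s."""
    a, b = 1, 1
    while True:
        yield min(a * base_s, MAX_BACKOFF_S)
        a, b = b, a + b

def retry_delay(headers: dict[str, str], backoff_s: float) -> float:
    """Prefer a server-provided retry hint over the computed backoff, still capped."""
    if "retry-after-ms" in headers:
        return min(float(headers["retry-after-ms"]) / 1000.0, MAX_BACKOFF_S)
    if "x-retry-after-ms" in headers:
        return min(float(headers["x-retry-after-ms"]) / 1000.0, MAX_BACKOFF_S)
    if "Retry-After" in headers:
        return min(float(headers["Retry-After"]), MAX_BACKOFF_S)
    return min(backoff_s, MAX_BACKOFF_S)
```

The cap matters for 429 responses from heavily throttled workspaces: even a very large Retry-After value never stalls a request for more than five minutes.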

Discovery Concurrency

The connector fans out schema and table enumeration with bounded concurrency to avoid a thundering herd against the UC API:

  • Schema refresh: up to 5 concurrent requests (buffer_unordered(5))
  • Permission checks: up to 5 concurrent requests (buffer_unordered(5))

For catalogs with thousands of tables, initial discovery can take minutes while the connector respects these limits.
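The buffer_unordered(5) pattern is roughly equivalent to a semaphore-bounded fan-out. A hypothetical Python sketch of the same idea (enumerate_tables and list_tables are illustrative names, not connector APIs):

```python
import asyncio

DISCOVERY_CONCURRENCY = 5  # mirrors buffer_unordered(5)

async def enumerate_tables(schemas, list_tables):
    """Fan out table listing across schemas, at most 5 requests in flight."""
    sem = asyncio.Semaphore(DISCOVERY_CONCURRENCY)

    async def bounded(schema):
        async with sem:
            return await list_tables(schema)

    per_schema = await asyncio.gather(*(bounded(s) for s in schemas))
    return [table for tables in per_schema for table in tables]
```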

Table Type and Permission Handling

Table Type Filtering

| Table Type | Supported | Notes |
| --- | --- | --- |
| MANAGED | Yes | Standard Delta tables |
| EXTERNAL | Yes | Tables with external storage locations |
| FOREIGN | Yes | Lakehouse Federation foreign tables |
| MATERIALIZED_VIEW | Yes | Materialized views |
| VIEW | No | Skipped during discovery |
| STREAMING_TABLE | No | Skipped during discovery |

Unsupported table types are skipped during catalog discovery; referencing one directly returns an error.
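The filtering rule reduces to a set-membership check. An illustrative Python sketch (filter_discoverable is a hypothetical helper, not a connector API):

```python
SUPPORTED_TABLE_TYPES = {"MANAGED", "EXTERNAL", "FOREIGN", "MATERIALIZED_VIEW"}

def filter_discoverable(tables):
    """Keep only supported table types; input is (name, table_type) pairs."""
    return [name for name, table_type in tables if table_type in SUPPORTED_TABLE_TYPES]
```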

Effective Permissions

Before creating a table provider, the connector checks permissions via GET /api/2.1/unity-catalog/effective-permissions/table/{catalog.schema.table}. The following privileges grant read access:

  • SELECT
  • ALL_PRIVILEGES / ALL PRIVILEGES
  • OWNER / OWNERSHIP

Behavior:

  • Discovery: Tables without read permission are skipped.
  • Direct reference: An InsufficientPermissions error is returned.
  • Foreign tables: The precheck is skipped (requires_read_permission_validation = false) because Lakehouse Federation access can be valid when the UC effective-permissions endpoint does not report a table-level privilege. Access is still enforced by Databricks at query time.
  • Graceful degradation: If the UC API is unreachable or returns an error for the permissions endpoint, discovery proceeds with a warning — table providers are still created, and any per-query authorization failures surface at query time.
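The decision rules above can be summarized in a few lines. This Python sketch models the documented behavior (the function name and signature are illustrative, not the connector's API):

```python
READ_PRIVILEGES = {"SELECT", "ALL_PRIVILEGES", "ALL PRIVILEGES", "OWNER", "OWNERSHIP"}

def expose_during_discovery(privileges, table_type, permissions_api_ok=True):
    """Should discovery create a table provider for this table?"""
    if table_type == "FOREIGN":
        # requires_read_permission_validation = false; Databricks
        # enforces access at query time.
        return True
    if not permissions_api_ok:
        # Graceful degradation: keep the table, defer auth to query time.
        return True
    return bool(set(privileges) & READ_PRIVILEGES)
```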

Capacity & Sizing

  • Initial discovery: Scales with the number of schemas × tables. Bounded concurrency caps throughput; plan 5–30 minutes for catalogs with thousands of tables on a cold start.
  • Refresh: Catalog refresh re-enumerates schemas and tables at the configured interval. For very large catalogs, refresh less frequently (every few hours) unless schemas change rapidly.
  • Permission-check cost: One API call per table. The buffer of 5 caps concurrency.
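A rough lower bound on discovery time follows from these numbers. The 200 ms per-call latency below is an assumed figure for illustration; actual latency depends on the workspace:

```python
def discovery_estimate_s(num_tables, per_call_latency_s=0.2, concurrency=5):
    """Lower bound: one permission check per table, 5 calls in flight at once."""
    return num_tables * per_call_latency_s / concurrency
```

For 5,000 tables at 200 ms per call this gives about 200 seconds for the permission checks alone, before schema enumeration and metadata fetches.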

Metrics

The Unity Catalog connector does not currently register UC-specific OpenTelemetry metric instruments. When used via the Databricks connector, the shared SQL Warehouse and UC spans produce task-history records that can be aggregated for operational insight.

Monitor via:

  • Spice query execution metrics (query_duration_ms, query_processed_rows) from runtime.metrics.
  • Task-history spans listed below.
  • Databricks / UC workspace audit logs for API-level visibility.

See Component Metrics for general configuration.

Task History

Unity Catalog operations emit the following task history spans:

| Span | Input | Description |
| --- | --- | --- |
| uc_get_table | Fully-qualified table name | Fetch table metadata from Unity Catalog. |
| uc_get_catalog | Catalog ID | Fetch catalog metadata. |
| uc_list_schemas | Catalog ID | List schemas in a catalog. |
| uc_list_tables | catalog_id.schema_name | List tables in a schema. |
| uc_get_effective_permissions | Fully-qualified table name | Check effective permissions for a table. |

Known Limitations

  • VIEW and STREAMING_TABLE are skipped: Only queryable table types are exposed.
  • No UC write-back: The connector is read-only; writes to UC are not supported through Spice.
  • HTTP retry/concurrency parameters not exposed: The resilient-HTTP defaults (3 retries, fibonacci backoff, concurrency 5) are not currently user-tunable on the UC connector.
  • Graceful degradation on permission-endpoint failures: If UC effective-permissions is unreachable, Spice proceeds; authorization errors surface at query time rather than discovery time.

Troubleshooting

| Symptom | Likely cause | Resolution |
| --- | --- | --- |
| 401 Unauthorized on catalog list | Missing, expired, or wrong-workspace token. | Regenerate the token in UC / Databricks; update the secret store. |
| Table visible in UC but missing from the Spice catalog | Table type is VIEW / STREAMING_TABLE, or permission was denied. | Confirm the table type is supported and that the principal has SELECT (or equivalent). |
| InsufficientPermissions on direct table reference | Role lacks read privilege on the table. | Grant SELECT on the table in UC. |
| Slow catalog discovery on thousands of tables | Bounded concurrency plus one permission check per table. | Expected behavior; schedule discovery during low-traffic windows and cache via accelerated datasets. |
| Tables from a Lakehouse Federation source missing | FOREIGN precheck passed but Databricks denied access at query time. | Verify the Databricks workspace grants federation privileges to the principal. |