# GCP Integrations
Spice.ai integrates with Google Cloud Platform (GCP) for data federation, AI inference, embeddings, and authentication. This page consolidates GCP-compatible components and links to the relevant configuration guides.
## Data Connectors
Data connectors federate SQL queries across GCP data sources without data movement.
| Connector | Description | Documentation |
|---|---|---|
| BigQuery (via ADBC) | Query BigQuery tables using the BigQuery ADBC driver. Includes built-in SQL dialect support for federated queries. | ADBC Data Connector |
| Cloud Storage (S3-compat) | Query Parquet, CSV, and JSON objects in Cloud Storage using the S3 connector with HMAC keys against the GCS interoperability endpoint. | S3 Data Connector |
| Cloud SQL for PostgreSQL | Connect to Cloud SQL for PostgreSQL directly or through the Cloud SQL Auth Proxy. | PostgreSQL Data Connector |
| Cloud SQL for MySQL | Connect to Cloud SQL for MySQL directly or through the Cloud SQL Auth Proxy. | MySQL Data Connector |
| Cloud SQL for SQL Server | Connect to Cloud SQL for SQL Server. | MSSQL Data Connector |
| AlloyDB for PostgreSQL | Connect to AlloyDB using the PostgreSQL wire protocol. | PostgreSQL Data Connector |
| Apache Iceberg (GCS) | Query Iceberg tables stored in Cloud Storage with REST or Hive metadata. Native GCS authentication via service account credentials or OAuth tokens. | Iceberg Data Connector |
| Delta Lake (GCS) | Query Delta Lake tables stored in Cloud Storage. | Delta Lake Data Connector |
| GCP databases via ODBC | Connect through ODBC drivers for additional GCP-compatible data sources. | ODBC Data Connector |
### Example: BigQuery via ADBC

```yaml
datasets:
  - from: adbc:my_dataset.orders
    name: orders
    params:
      adbc_driver: bigquery
      adbc_uri: 'bigquery:///my-gcp-project'
      adbc_driver_options: |
        adbc.bigquery.sql.dataset_id=my_dataset
        adbc.bigquery.sql.auth_type=adbc.bigquery.sql.auth_type.json_credential_file
        adbc.bigquery.sql.auth_credentials=/var/run/secrets/gcp/key.json
```
When the runtime uses Workload Identity, omit `auth_type` and `auth_credentials`; the BigQuery driver picks up Application Default Credentials automatically.
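Under Workload Identity, the dataset definition above reduces to the following sketch (project, dataset, and table names are placeholders carried over from the full example):

```yaml
datasets:
  - from: adbc:my_dataset.orders
    name: orders
    params:
      adbc_driver: bigquery
      adbc_uri: 'bigquery:///my-gcp-project'
      # No auth_type or auth_credentials entries: the driver falls back to
      # the Application Default Credentials supplied by Workload Identity.
      adbc_driver_options: |
        adbc.bigquery.sql.dataset_id=my_dataset
```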
### Example: Cloud Storage via S3 connector

Cloud Storage exposes an S3-compatible interoperability endpoint. Generate an HMAC key tied to a service account, then point the S3 connector at `storage.googleapis.com`:

```yaml
datasets:
  - from: s3://my-bucket/path/to/data/
    name: events
    params:
      file_format: parquet
      s3_endpoint: https://storage.googleapis.com
      s3_auth: key
      s3_key: ${ secrets:GCS_HMAC_ACCESS_ID }
      s3_secret: ${ secrets:GCS_HMAC_SECRET }
```
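Repeated scans against the interoperability endpoint can be reduced by accelerating the dataset locally. A sketch of the same dataset with acceleration enabled; the `refresh_check_interval` value is illustrative, not a recommendation:

```yaml
datasets:
  - from: s3://my-bucket/path/to/data/
    name: events
    params:
      file_format: parquet
      s3_endpoint: https://storage.googleapis.com
      s3_auth: key
      s3_key: ${ secrets:GCS_HMAC_ACCESS_ID }
      s3_secret: ${ secrets:GCS_HMAC_SECRET }
    # Materialize the data locally and re-check GCS on an interval
    acceleration:
      enabled: true
      refresh_check_interval: 10m
```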
For Iceberg or Delta tables stored in GCS, use the connector's native GCS parameters instead, which support service-account credentials and ADC directly.
### Example: Cloud SQL for PostgreSQL

Run the Cloud SQL Auth Proxy as a sidecar (or as a local listener on 127.0.0.1 on Compute Engine) and connect over the loopback interface:

```yaml
datasets:
  - from: postgres:public.orders
    name: orders
    params:
      pg_host: 127.0.0.1
      pg_port: '5432'
      pg_db: app
      pg_user: ${ secrets:CLOUDSQL_USER }
      pg_pass: ${ secrets:CLOUDSQL_PASSWORD }
      # TLS terminates at the Auth Proxy, so the loopback hop needs no SSL
      pg_sslmode: disable
```
For PostgreSQL replication-based CDC, Cloud SQL requires the `cloudsql.logical_decoding` database flag set to `on`.
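Cloud SQL for MySQL follows the same Auth Proxy pattern over the loopback interface. A sketch assuming the MySQL connector's `mysql_*` parameter names and a database named `app`; verify the exact parameter names and SSL-mode values against the MySQL Data Connector reference:

```yaml
datasets:
  - from: mysql:orders
    name: orders_mysql
    params:
      mysql_host: 127.0.0.1
      # Default MySQL port exposed by the local Auth Proxy listener
      mysql_tcp_port: '3306'
      mysql_db: app
      mysql_user: ${ secrets:CLOUDSQL_MYSQL_USER }
      mysql_pass: ${ secrets:CLOUDSQL_MYSQL_PASSWORD }
```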
## AI Models (Google AI)
Spice integrates with Google AI Studio for chat completion and reasoning models, including the Gemini family.
| Provider | Supported Models | Documentation |
|---|---|---|
| Google AI | Gemini 2.0 and 2.5 (Flash and Pro variants) and other models from the Gemini API. | Google AI Models |
### Example: Gemini Chat Model

```yaml
models:
  - from: google:gemini-2.0-flash-exp
    name: gemini
    params:
      google_api_key: ${ secrets:GEMINI_API_KEY }
```
See Google AI Models for the full list of supported model names.
## Embeddings (Google AI)
Generate vector embeddings using Gemini embedding models for semantic search and retrieval-augmented generation (RAG).
| Provider | Supported Models | Documentation |
|---|---|---|
| Google AI | `text-embedding-004` and other embedding models from the Gemini API. | Google AI Embeddings |
### Example: Google AI Embeddings

```yaml
embeddings:
  - from: google:text-embedding-004
    name: gemini_embeddings
    params:
      google_api_key: ${ secrets:GEMINI_API_KEY }
```
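Once defined, an embedding model can be attached to a dataset column so the runtime generates vectors for semantic search. A sketch assuming a hypothetical `documents` dataset with a `body` text column; the bucket path is a placeholder:

```yaml
datasets:
  - from: s3://my-bucket/docs/
    name: documents
    params:
      file_format: parquet
    columns:
      - name: body
        # Embed this column using the gemini_embeddings model defined above
        embeddings:
          - from: gemini_embeddings
```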
## Snapshots and shared state

Snapshots and the distributed query state location can use Cloud Storage as the shared object store. Configure with the `gs://` scheme:

```yaml
snapshots:
  location: gs://my-bucket/spiceai/snapshots
```
When no explicit credentials are supplied, Spice reads `GOOGLE_APPLICATION_CREDENTIALS` and the Workload Identity-federated token, in that order.
## Authentication
All GCP integrations support the standard Application Default Credentials chain. When credentials are not explicitly configured, Spice attempts the following in order:
1. `GOOGLE_APPLICATION_CREDENTIALS` — path to a service account JSON key file.
2. Attached service account — the Compute Engine, Cloud Run, or GKE node default service account.
3. GKE Workload Identity — federated tokens for pods bound to a Google service account via the Kubernetes ServiceAccount. See Workload Identity for GKE.
4. `gcloud` CLI — cached credentials from `gcloud auth application-default login`.
5. Workload Identity Federation — federated identity for workloads running outside GCP (other clouds, on-premises, GitHub Actions). See Workload Identity Federation.
For a deployment-side overview of these mechanisms, see the Authentication section of the GCP deployment guide.
### IAM role bindings
Each principal must have the appropriate IAM role for the services it accesses:
| Service | Common role(s) |
|---|---|
| Cloud Storage | `roles/storage.objectViewer` or `roles/storage.objectAdmin` |
| BigQuery | `roles/bigquery.dataViewer` and `roles/bigquery.jobUser` |
| Cloud SQL | `roles/cloudsql.client` (proxy) plus database-level grants |
| Secret Manager | `roles/secretmanager.secretAccessor` |
| Artifact Registry | `roles/artifactregistry.reader` for image pulls |
| Cloud Logging/Monitoring | `roles/logging.logWriter`, `roles/monitoring.metricWriter` |
When a Spicepod connects to multiple GCP services, ensure roles are granted on every resource the runtime touches.
