Version: Next

GraphQL Data Connector Deployment Guide

Production operating guide for the GraphQL data connector covering authentication, pagination, and operational tuning.

Authentication & Secrets

Authentication is endpoint-specific. The connector supports arbitrary HTTP headers via graphql_auth_header:

| Parameter | Description |
| --- | --- |
| graphql_endpoint | GraphQL endpoint URL. |
| graphql_auth_header | Authorization header. Typically "Bearer ${secrets:api_token}". |
| graphql_query | The GraphQL query to execute. |
| graphql_json_pointer | RFC-6901 JSON pointer to the row collection inside the response (e.g. /data/repository/issues/nodes). |
| graphql_pagination_parameters | Cursor / page-size configuration for pagination (see the connector reference). |

In production, source tokens from a secret store (for example through a ${secrets:api_token} reference) rather than embedding them in the configuration.
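To illustrate how graphql_json_pointer selects the row collection, the following is a minimal sketch of RFC 6901 pointer resolution. It is not the connector's implementation; the function name resolve_json_pointer and the sample payload are hypothetical and exist only for this example.

```python
import json


def resolve_json_pointer(document, pointer):
    """Resolve an RFC 6901 JSON pointer against a parsed JSON document."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 escaping: "~1" decodes to "/", then "~0" decodes to "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            # Simplification: array tokens are treated as plain integer indexes.
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical response payload shaped like the example pointer in the table.
response = json.loads("""
{"data": {"repository": {"issues": {"nodes": [
    {"number": 1, "title": "First issue"},
    {"number": 2, "title": "Second issue"}
]}}}}
""")

rows = resolve_json_pointer(response, "/data/repository/issues/nodes")
```

If the pointer stops one level too early (for example at /data/repository/issues), the result is an object rather than the array of rows, which is the usual cause of missing or malformed rows.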

TLS

Use HTTPS endpoints in production. For endpoints that present self-signed certificates, add a trusted CA bundle to the trust store of the container or host OS.

Resilience Controls

Retry Behavior

HTTP-level retries follow the shared resilient_http policy: responses with status 408, 429, or 5xx, as well as transient network errors, are retried with Fibonacci backoff capped at 300 seconds. The connector honors the Retry-After, retry-after-ms, and x-retry-after-ms response headers.
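The retry policy can be pictured with a short sketch. This is an illustration only, not the resilient_http source: the 1-second base delay, the precedence between the three headers, and the helper names (fibonacci_backoff, is_retriable, retry_delay) are all assumptions made for the example.

```python
from itertools import islice

MAX_BACKOFF_SECONDS = 300.0  # cap stated in the retry policy


def fibonacci_backoff(base_seconds=1.0, cap_seconds=MAX_BACKOFF_SECONDS):
    """Yield delays of 1, 1, 2, 3, 5, ... times base_seconds, capped."""
    previous, current = 1, 1
    while True:
        yield min(previous * base_seconds, cap_seconds)
        previous, current = current, previous + current


def is_retriable(status):
    """408, 429, and any 5xx status are treated as retriable."""
    return status in (408, 429) or 500 <= status <= 599


def retry_delay(headers, fallback_seconds):
    """Prefer a server-provided hint over the computed backoff (assumed order)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("retry-after-ms", "x-retry-after-ms"):
        if name in lowered:
            return float(lowered[name]) / 1000.0
    if "retry-after" in lowered:
        try:
            return float(lowered["retry-after"])  # delta-seconds form
        except ValueError:
            pass  # the HTTP-date form is not handled in this sketch
    return fallback_seconds


first_delays = list(islice(fibonacci_backoff(), 8))
```

Under these assumptions the delay sequence grows slowly at first and reaches the 300-second cap on the fourteenth attempt, which is why a long upstream outage can stretch a refresh considerably.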

Pagination

The connector supports cursor-based pagination. Each page is a separate HTTP request, and an error on any page mid-sequence fails the entire refresh. Use graphql_json_pointer to select the row collection, and configure the pagination variables to match the cursor fields in the upstream schema.
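The control flow of a cursor-based refresh can be sketched as follows. The example assumes a Relay-style schema (first/after arguments, pageInfo.hasNextPage, pageInfo.endCursor); these field names, the fetch_all_rows helper, and the in-memory fake_fetch_page stand-in are assumptions for illustration and not part of the connector.

```python
def fetch_all_rows(fetch_page, page_size=2):
    """Collect rows from every page; any exception aborts the whole run."""
    rows, cursor = [], None
    while True:
        # One request per page; the cursor from the previous page drives the next.
        page = fetch_page(first=page_size, after=cursor)
        rows.extend(page["nodes"])
        if not page["pageInfo"]["hasNextPage"]:
            return rows
        cursor = page["pageInfo"]["endCursor"]


# In-memory stand-in for an upstream API, so the sketch runs without a network.
DATA = [{"id": i} for i in range(5)]


def fake_fetch_page(first, after):
    start = 0 if after is None else int(after)
    end = min(start + first, len(DATA))
    return {
        "nodes": DATA[start:end],
        "pageInfo": {"hasNextPage": end < len(DATA), "endCursor": str(end)},
    }


all_rows = fetch_all_rows(fake_fetch_page)
```

Because each request depends on the cursor returned by the previous one, pages cannot be fetched in parallel, and a failure on any page leaves no partial result to keep.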

Server Rate Limits

GraphQL APIs (GitHub, Shopify, etc.) typically enforce rate limits based on query cost rather than request count. When a query returns a cost or rate-limit error, the connector surfaces it immediately. To stay within budget, reduce the refresh frequency or narrow the query.

Capacity & Sizing

  • Throughput: Bounded by the upstream rate limit; typical GraphQL endpoints cap at hundreds to thousands of requests per minute.
  • Query cost: Design graphql_query to request only the fields you need; fewer nested fields mean a lower query cost.
  • Pagination depth: Large datasets requiring hundreds of pages extend refresh duration linearly; plan refresh intervals accordingly.
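As a rough planning aid, refresh duration can be estimated from the page count, per-request latency, and the upstream rate limit. The sketch below assumes pages are fetched strictly one after another and ignores retries; the function name and all of the numbers are hypothetical.

```python
import math


def estimate_refresh_seconds(total_rows, page_size, seconds_per_request,
                             requests_per_minute):
    """Return (page count, estimated lower bound on refresh duration in seconds)."""
    pages = math.ceil(total_rows / page_size)
    # Pages are fetched sequentially, so per-request latency adds up linearly.
    latency_bound = pages * seconds_per_request
    # The upstream rate limit sets a floor on how quickly pages can be issued.
    rate_limit_bound = pages * 60.0 / requests_per_minute
    return pages, max(latency_bound, rate_limit_bound)


# Hypothetical dataset: 50,000 rows, 100 rows per page, 0.4 s per request,
# and an upstream limit of 100 requests per minute.
pages, seconds = estimate_refresh_seconds(50_000, 100, 0.4, 100)
```

In this hypothetical case the refresh needs 500 pages and is limited by the rate limit (300 seconds) rather than by latency (200 seconds), so a refresh interval shorter than five minutes would not be sustainable.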

Metrics

The GraphQL connector does not register connector-specific instruments. Monitor via:

  • Spice query execution metrics (query_duration_ms, query_processed_rows, query_failures_total) from runtime.metrics.
  • HTTP response status distribution via the shared resilient_http instrumentation.
  • The upstream GraphQL provider's rate-limit dashboards.

See Component Metrics for general configuration.

Task History

GraphQL requests participate in task history through the HTTP client's span. Each page fetch is a child of the enclosing sql_query or accelerated_table_refresh task.

Known Limitations

  • Read-only: Only GraphQL queries (not mutations or subscriptions) are supported.
  • Single query per dataset: Each dataset is one GraphQL query. Multi-query datasets require separate dataset definitions.
  • Schema inference: The connector infers the schema from the first response; schemas with deeply nested optional fields may require an explicit dataset schema override.
  • Batching: GraphQL query batching (multiple operations in one HTTP request) is not exposed.
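The schema inference limitation is easiest to see with a toy example. The sketch below uses a deliberately simplified inference rule (field name mapped to the Python type of its first non-null value); it is an assumption for illustration and does not reproduce the connector's actual inference logic.

```python
def infer_schema(rows):
    """Map each field name to the type name of its first non-null value."""
    schema = {}
    for row in rows:
        for key, value in row.items():
            if value is not None and key not in schema:
                schema[key] = type(value).__name__
    return schema


# Hypothetical responses: the optional "assignee" object is null in the first
# refresh and populated in a later one.
first_refresh = [{"id": 1, "title": "A", "assignee": None}]
later_refresh = [{"id": 2, "title": "B", "assignee": {"login": "octo"}}]

schema_first = infer_schema(first_refresh)
schema_later = infer_schema(later_refresh)
```

Here the first refresh yields no type for assignee while the later one does, so the two inferred schemas differ. Declaring the schema explicitly removes the dependence on whichever response happens to arrive first.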

Troubleshooting

| Symptom | Likely cause | Resolution |
| --- | --- | --- |
| 401 Unauthorized | Wrong or expired token in graphql_auth_header. | Rotate the token; verify the header format (Bearer prefix, etc.). |
| Rows missing from the dataset | Wrong graphql_json_pointer. | Inspect the response payload; the JSON pointer must navigate to the array of rows. |
| Refresh fails mid-pagination | Rate limit or transient network failure. | The connector retries retriable errors automatically; if failures persist, reduce the refresh frequency or narrow the query. |
| Query cost exceeded | Query requests too many nested fields. | Simplify the query; fetch only required fields. |
| Inferred schema differs between refreshes | Optional fields appear or disappear in responses. | Provide an explicit dataset schema to lock down types. |