Version: Next

Functions (User-Defined Functions)

Functions extend Spice's SQL engine with custom scalar and table logic. Each entry in the top-level functions: block is registered as a callable SQL function and (for scalar functions, by default) as an LLM tool.

For an overview, examples, and execution-tier details, see Functions.

Registration is off by default

The functions: section is only honored when runtime.functions.enabled is set to true.

`functions`

The functions: section in your configuration declares one or more scalar or table functions.

Example:

runtime:
  functions:
    enabled: true

functions:
  - name: double_it
    from: sql
    description: Double a 64-bit integer.
    volatility: immutable
    signature:
      args:
        - { name: x, type: int64 }
      returns: int64
    body: 'x * 2'

`name`

A unique identifier for the function. The name is used to register the UDF in the SQL session and as the tool name when surfaced to LLMs.

`from`

Source URI selecting the execution tier:

sql — Inline SQL body executed in-process.
http://... / https://... — Remote endpoint invoked over HTTP + JSON.

Other schemes are rejected at startup. See Execution Tiers.

`enabled`

Optional. Defaults to true. Set to false to keep the declaration in the spicepod without registering the function — useful for staged rollouts. Disabled functions are also hidden from list_udfs() and GET /v1/functions.

`description`

Optional. Free-form description surfaced in list_udfs(), GET /v1/functions, and the LLM tool registry.

`kind`

Optional. Defaults to scalar.

Value	Description
`scalar`	(default) Returns a single value per row. Called as `SELECT my_fn(x) FROM ...`.
`table`	Returns multiple rows and columns. Called as `SELECT ... FROM my_fn(x)`. Always SQL-only (`as_tool` is ignored).

`volatility`

Optional. Defaults to volatile. Controls how the optimizer treats the function across calls.

Value	Description
`immutable`	Same inputs always yield the same output. May be constant-folded and cached.
`stable`	Stable within a single query but may change across queries. Cached per query.
`volatile`	(default) Unpredictable on every call. Never cached.

Choose the strongest level that's actually true. See Volatility.

`signature`

Required. Typed argument list and return type.

signature:
  args:
    - { name: lat1, type: float64 }
    - { name: lon1, type: float64 }
  returns: float64

`signature.args`

Optional. Positional argument list. Empty for niladic functions.

Each entry has:

name — Argument name. Used inside SQL body expressions and as the JSON key for remote tier requests.
type — Arrow type. Accepts Spicepod aliases (int64, utf8, list<int64>, decimal(38,10), timestamp(us, utc)) and Arrow display forms (Int64, List(Int64), etc.). See Types.

`signature.returns`

Required. For scalar functions, a single Arrow type string (e.g., int64, utf8). For table functions, a list of output column definitions:

# Scalar function
returns: int64

# Table function
returns:
  - { name: value, type: int64 }
  - { name: label, type: utf8 }

`body`

Inline SQL body. Required when from: sql (unless body_ref is set instead). For scalar functions, must be a single SQL expression referencing the function's arguments by name. For table functions, must be a single SELECT query; scalar arguments are available via a virtual args table.

body: |
  6371 * acos(
    cos(radians(lat1)) * cos(radians(lat2)) *
    cos(radians(lon2) - radians(lon1)) +
    sin(radians(lat1)) * sin(radians(lat2))
  )

The body can call any DataFusion built-in scalar function (math, string, datetime, JSON, regex, etc.).

Mutually exclusive with body_ref. Must not be set for non-SQL from: schemes.

`body_ref`

Path to a local filesystem file whose contents are the SQL body. Resolved relative to the runtime's current working directory at registration time.

body_ref: ./functions/shipping_class.sql

body_ref is not read from object stores when a spicepod is loaded remotely — use inline body: for portable remote spicepods.

Mutually exclusive with body.

`params`

Optional. Tier-specific parameters. Supports ${secrets:KEY} and ${env:KEY} interpolation at registration time.

For the Remote tier (from: http://... / https://...):

Parameter	Default	Description
`timeout`	`30s`	Per-call timeout. Plain integer seconds or `Ns` / `Nms` strings.
`batch_size`	`1024`	Maximum rows per HTTP request. Capped at `100 000`.
`batch_concurrency`	`4`	Maximum in-flight HTTP batches per invocation. Capped at `64`.
`auth_bearer`	unset	When set, adds `Authorization: Bearer <value>` to each request. Use `${secrets:...}`.

The SQL tier does not currently use params.

`metadata`

Optional. Free-form key/value pairs surfaced alongside the function definition. Useful for tagging, ownership, or custom metadata consumers.

metadata:
  owner: data-platform
  jira: DATA-1234

`as_tool`

Optional. Defaults to true for scalar functions. When true, the function is registered as an LLM tool with the same name and description and becomes callable from chat completions, POST /v1/tools/<name>, and the /v1/tools listing. Table functions (kind: table) are always SQL-only regardless of this setting.

Set to false to keep the function SQL-only:

functions:
  - name: internal_hash
    from: sql
    as_tool: false
    signature:
      args: [{ name: x, type: int64 }]
      returns: int64
    body: 'x * 2654435761'

`dependsOn`

Optional. Names of other Spicepod components that must be loaded before this function (e.g. a dataset the function queries internally).

`metrics`

Optional. Per-function metrics configuration. See Component Metrics.

Discovering registered functions

After startup, registered functions can be inspected:

From SQL — SELECT * FROM list_udfs() WHERE source = 'user';
From the HTTP API — GET /v1/functions returns a JSON array of user functions.

See Discovering registered functions.

functions​

name​

from​

enabled​

description​

kind​

volatility​

signature​

signature.args​

signature.returns​

body​

body_ref​

params​

metadata​

as_tool​

dependsOn​

metrics​

Discovering registered functions​