Skip to main content
Version: Next

JSON Functions and Operators

JSON support in Spice is based on datafusion-functions-json, which provides functions and operators to extract, query, and manipulate JSON data stored as strings. Advanced features for JSON creation, modification, or complex path expressions are not supported.

Limitations
  • JSON functions and operators are supported only during DataFusion (Arrow) execution.
  • Federated or accelerated sources (non-Arrow) may not support all JSON functions.

JSON Functions​

Enables extracting and manipulating data from JSON strings. Each function takes a JSON string as the first argument, followed by one or more keys or indices to specify the path.

json_contains​

Returns true if a JSON string contains a specific key at the specified path.

json_contains(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the key to check. Can be string keys for objects or integer indices for arrays.

Example​

> SELECT json_contains('{"a": 1, "b": 2}', 'a');
+-------------------------------------------------+
| json_contains(Utf8("{\"a\": 1, \"b\": 2}"),Utf8("a")) |
+-------------------------------------------------+
| true |
+-------------------------------------------------+

json_get​

Retrieves a value from a JSON string based on its path.

json_get(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the value. Can be string keys for objects or integer indices for arrays.

Example​

> SELECT json_get('{"a": 1, "b": 2}', 'a');
+----------------------------------------------+
| json_get(Utf8("{"a": 1, "b": 2}"),Utf8("a")) |
+----------------------------------------------+
| {int=1} |
+----------------------------------------------+

json_get_str​

Retrieves a string value from a JSON string based on its path.

json_get_str(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the string value.

Example​

> SELECT json_get_str('{"name": "John", "age": 30}', 'name');
+----------------------------------------------------------------+
| json_get_str(Utf8("{"name": "John", "age": 30}"),Utf8("name")) |
+----------------------------------------------------------------+
| John |
+----------------------------------------------------------------+

json_get_int​

Retrieves an integer value from a JSON string based on its path.

json_get_int(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the integer value.

Example​

> SELECT json_get_int('{"name": "John", "age": 30}', 'age');
+---------------------------------------------------------------+
| json_get_int(Utf8("{"name": "John", "age": 30}"),Utf8("age")) |
+---------------------------------------------------------------+
| 30 |
+---------------------------------------------------------------+

json_get_float​

Retrieves a float value from a JSON string based on its path.

json_get_float(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the float value.

Example​

> SELECT json_get_float('{"price": 19.99, "quantity": 2}', 'price');
+-----------------------------------------------------------------------+
| json_get_float(Utf8("{"price": 19.99, "quantity": 2}"),Utf8("price")) |
+-----------------------------------------------------------------------+
| 19.99 |
+-----------------------------------------------------------------------+

json_get_bool​

Retrieves a boolean value from a JSON string based on its path.

json_get_bool(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the boolean value.

Example​

> SELECT json_get_bool('{"active": true, "visible": false}', 'active');
+--------------------------------------------------------------------------+
| json_get_bool(Utf8("{"active": true, "visible": false}"),Utf8("active")) |
+--------------------------------------------------------------------------+
| true |
+--------------------------------------------------------------------------+

json_get_json​

Retrieves a nested JSON object or array as a raw JSON string from a JSON string based on its path.

json_get_json(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the nested JSON value.

Example​

> SELECT json_get_json('{"user": {"name": "John", "age": 30}}', 'user');
+---------------------------------------------------------------------------+
| json_get_json(Utf8("{"user": {"name": "John", "age": 30}}"),Utf8("user")) |
+---------------------------------------------------------------------------+
| {"name": "John", "age": 30} |
+---------------------------------------------------------------------------+

json_get_array​

Retrieves an arrow array from a JSON string based on its path.

json_get_array(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the array value.

Example​

> SELECT json_get_array('{"numbers": [1, 2, 3, 4]}', 'numbers');
+-------------------------------------------------------------------+
| json_get_array(Utf8("{"numbers": [1, 2, 3, 4]}"),Utf8("numbers")) |
+-------------------------------------------------------------------+
| [1, 2, 3, 4] |
+-------------------------------------------------------------------+

json_as_text​

Retrieves any value from a JSON string based on its path and represents it as a string. This is useful for converting JSON values to text format.

json_as_text(json_string, key1[, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Path to the value to convert to text.

Example​

> SELECT json_as_text('{"age": 30, "active": true}', 'age');
+---------------------------------------------------------------+
| json_as_text(Utf8("{"age": 30, "active": true}"),Utf8("age")) |
+---------------------------------------------------------------+
| 30 |
+---------------------------------------------------------------+

json_length​

Returns the length of a JSON string, array, or object. For objects, returns the number of key-value pairs. For arrays, returns the number of elements. For strings, returns the character count.

json_length(json_string[, key1, key2, ...])

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Optional path to a nested value. If omitted, returns the length of the root JSON value.

Example​

> SELECT json_length('{"a": 1, "b": 2, "c": 3}');
+-----------------------------------------------+
| json_length(Utf8("{"a": 1, "b": 2, "c": 3}")) |
+-----------------------------------------------+
| 3 |
+-----------------------------------------------+

> SELECT json_length('[1, 2, 3, 4, 5]');
+--------------------------------------+
| json_length(Utf8("[1, 2, 3, 4, 5]")) |
+--------------------------------------+
| 5 |
+--------------------------------------+

json_object_keys​

Returns the top-level keys of a JSON object as an array of strings. If a path is provided, returns the keys of the object at that path. Returns NULL if the value at the path is not an object.

json_object_keys(json_string[, key1, key2, ...])

Alias: json_keys.

Arguments​

  • json_string: String containing valid JSON data.
  • key1, key2, ...: Optional path to a nested object. If omitted, returns the keys of the root object.

Example​

> SELECT json_object_keys('{"a": 1, "b": 2, "c": 3}');
+-----------------------------------------------------+
| json_object_keys(Utf8("{"a": 1, "b": 2, "c": 3}")) |
+-----------------------------------------------------+
| [a, b, c] |
+-----------------------------------------------------+

> SELECT json_object_keys('{"user": {"name": "John", "age": 30}}', 'user');
+-----------------------------------------------------------------------------+
| json_object_keys(Utf8("{"user": {"name": "John", "age": 30}}"),Utf8("user")) |
+-----------------------------------------------------------------------------+
| [name, age] |
+-----------------------------------------------------------------------------+

JSON Operators​

->​

JSON access operator. Retrieves a value from a JSON string based on its path. This operator is an alias for json_get.

json_string -> key
json_string -> key1 -> key2

Arguments​

  • json_string: String containing valid JSON data.
  • key: Object key (string) or array index (integer).

Example​

> SELECT '{"user": {"name": "John", "age": 30}}' -> 'user' -> 'name';
+-------------------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' -> 'user' -> 'name' |
+-------------------------------------------------------------+
| {str=John} |
+-------------------------------------------------------------+

->>​

JSON access operator for text extraction. Retrieves any value from a JSON string and converts it to text. This operator is an alias for json_as_text.

json_string ->> key
json_string -> key1 ->> key2

Arguments​

  • json_string: String containing valid JSON data.
  • key: Object key (string) or array index (integer).

Example​

> SELECT '{"user": {"name": "John", "age": 30}}' -> 'user' ->> 'age';
+-------------------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' -> 'user' ->> 'age' |
+-------------------------------------------------------------+
| 30 |
+-------------------------------------------------------------+

?​

JSON containment operator. Returns true if a JSON string contains the specified key. This operator is an alias for json_contains.

json_string ? 'key'

Arguments​

  • json_string: String containing valid JSON data.
  • key: Key to check for existence.

Example​

> SELECT '{"user": {"name": "John", "age": 30}}' ? 'user';
+--------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' ? 'user' |
+--------------------------------------------------+
| true |
+--------------------------------------------------+

Usage Examples​

Nested Object Access​

> SELECT '{"inventory": {"stock": {"S": 12, "M": 20}}}' -> 'inventory' -> 'stock' ->> 'S' as size_s_stock;
+---------------+
| size_s_stock |
+---------------+
| 12 |
+---------------+

Array Access​

> SELECT '{"sizes": ["S", "M", "L", "XL"]}' -> 'sizes' ->> 0 as first_size;
+------------+
| first_size |
+------------+
| S |
+------------+

Conditional JSON Queries​

> SELECT name, properties ->> 'color' as color
FROM products
WHERE properties ? 'color' AND properties ->> 'color' IN ('black', 'white');

Using JSON Functions in Views​

JSON functions can be used in views to simplify access to nested JSON data:

CREATE VIEW products_with_color AS
SELECT
id,
name,
properties ->> 'color' as color,
json_get_int(properties, 'stock') as stock_count
FROM products;

JSON Table Functions (UDTFs)​

Spice includes table-valued functions for decomposing JSON structures into relational rows. Each function is available as both a UDTF (in the FROM clause with literal input) and a scalar UDF returning a list of structs (for per-row use with UNNEST).

flatten_json​

Walks an arbitrary JSON value and emits one row per reachable leaf.

flatten_json(input Utf8 [, options...]) -> TABLE(
path Utf8,
parent_path Utf8,
key Utf8,
value Utf8,
type Utf8 -- "object"|"array"|"string"|"number"|"integer"|"boolean"|"null"
)

Options (named arguments):

OptionTypeDefaultDescription
max_depthUInt64Maximum recursion depth.
max_rowsUInt1000000Per-document row cap.
max_bytesUInt8388608Input size limit (bytes).
path_styleUtf8"dot""dot" or "json-pointer".
include_internalBoolfalseAlso emit interior object/array rows.
array_wildcardBoolfalseCollapse array indices to [*] instead of [0], [1], etc.

UDTF example:

SELECT path, value, type
FROM flatten_json('{"user": {"name": "Alice", "scores": [95, 87]}}');
pathvaluetype
user.nameAlicestring
user.scores[0]95integer
user.scores[1]87integer

Scalar UDF example (per-row with UNNEST):

SELECT rows.path, rows.value, rows.type
FROM (SELECT UNNEST(flatten_json(body)) AS rows FROM documents);

flatten_json_properties​

Decomposes a JSON Schema document into one row per field, extracting metadata such as types, descriptions, required status, enums, and format.

flatten_json_properties(input Utf8 [, options...]) -> TABLE(
path Utf8,
parent_path Utf8,
name Utf8,
description Utf8,
type Utf8,
required Boolean,
format Utf8,
enum_values List<Utf8>,
metadata Utf8
)

Handles properties recursion, items.properties (arrays of objects), additionalProperties maps, allOf/oneOf/anyOf merging, and local $ref pointers with cycle detection.

Options (named arguments):

OptionTypeDefaultDescription
max_depthUInt32Maximum recursion depth.
max_rowsUInt100000Per-document row cap.
max_bytesUInt8388608Input size limit (bytes).
path_styleUtf8"dot""dot" or "json-pointer".
dialectUtf8"json-schema""json-schema" or "openapi" (metrics tagging).
include_internalBoolfalseAlso emit container rows (objects, arrays).
expand_mapsBoolfalseWalk into additionalProperties and emit child paths with a wildcard segment (e.g., parent.[*].child).
map_wildcardUtf8"[*]"Wildcard segment for map values when expand_maps is true.

Example:

SELECT path, type, required, description
FROM flatten_json_properties('{
"type": "object",
"properties": {
"name": {"type": "string", "description": "User name"},
"age": {"type": "integer"}
},
"required": ["name"]
}');
pathtyperequireddescription
namestringtrueUser name
ageintegerfalse

Expanding maps:

When a JSON Schema uses additionalProperties to describe map values, enable expand_maps to produce JSONPath-style paths:

SELECT path, type
FROM flatten_json_properties(
'{"type": "object", "additionalProperties": {"type": "object", "properties": {"id": {"type": "string"}, "primary": {"type": "boolean"}}}}',
expand_maps => true
);
pathtype
[*].idstring
[*].primaryboolean

Further Reading​