JSON Functions and Operators
JSON support in Spice is based on datafusion-functions-json, which provides functions and operators to extract, query, and manipulate JSON data stored as strings. Advanced features for JSON creation, modification, or complex path expressions are not supported.
- JSON functions and operators are supported only during DataFusion (Arrow) execution.
- Federated or accelerated sources (non-Arrow) may not support all JSON functions.
- JSON Functions
- JSON Operators
- Usage Examples
- Further Reading
JSON Functions​
Enables extracting and manipulating data from JSON strings. Each function takes a JSON string as the first argument, followed by one or more keys or indices to specify the path.
json_contains​
Returns true if a JSON string contains a specific key at the specified path.
json_contains(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the key to check. Can be string keys for objects or integer indices for arrays.
Example​
> SELECT json_contains('{"a": 1, "b": 2}', 'a');
+-------------------------------------------------+
| json_contains(Utf8("{\"a\": 1, \"b\": 2}"),Utf8("a")) |
+-------------------------------------------------+
| true |
+-------------------------------------------------+
json_get​
Retrieves a value from a JSON string based on its path.
json_get(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the value. Can be string keys for objects or integer indices for arrays.
Example​
> SELECT json_get('{"a": 1, "b": 2}', 'a');
+----------------------------------------------+
| json_get(Utf8("{"a": 1, "b": 2}"),Utf8("a")) |
+----------------------------------------------+
| {int=1} |
+----------------------------------------------+
json_get_str​
Retrieves a string value from a JSON string based on its path.
json_get_str(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the string value.
Example​
> SELECT json_get_str('{"name": "John", "age": 30}', 'name');
+----------------------------------------------------------------+
| json_get_str(Utf8("{"name": "John", "age": 30}"),Utf8("name")) |
+----------------------------------------------------------------+
| John |
+----------------------------------------------------------------+
json_get_int​
Retrieves an integer value from a JSON string based on its path.
json_get_int(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the integer value.
Example​
> SELECT json_get_int('{"name": "John", "age": 30}', 'age');
+---------------------------------------------------------------+
| json_get_int(Utf8("{"name": "John", "age": 30}"),Utf8("age")) |
+---------------------------------------------------------------+
| 30 |
+---------------------------------------------------------------+
json_get_float​
Retrieves a float value from a JSON string based on its path.
json_get_float(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the float value.
Example​
> SELECT json_get_float('{"price": 19.99, "quantity": 2}', 'price');
+-----------------------------------------------------------------------+
| json_get_float(Utf8("{"price": 19.99, "quantity": 2}"),Utf8("price")) |
+-----------------------------------------------------------------------+
| 19.99 |
+-----------------------------------------------------------------------+
json_get_bool​
Retrieves a boolean value from a JSON string based on its path.
json_get_bool(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the boolean value.
Example​
> SELECT json_get_bool('{"active": true, "visible": false}', 'active');
+--------------------------------------------------------------------------+
| json_get_bool(Utf8("{"active": true, "visible": false}"),Utf8("active")) |
+--------------------------------------------------------------------------+
| true |
+--------------------------------------------------------------------------+
json_get_json​
Retrieves a nested JSON object or array as a raw JSON string from a JSON string based on its path.
json_get_json(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the nested JSON value.
Example​
> SELECT json_get_json('{"user": {"name": "John", "age": 30}}', 'user');
+---------------------------------------------------------------------------+
| json_get_json(Utf8("{"user": {"name": "John", "age": 30}}"),Utf8("user")) |
+---------------------------------------------------------------------------+
| {"name": "John", "age": 30} |
+---------------------------------------------------------------------------+
json_get_array​
Retrieves an arrow array from a JSON string based on its path.
json_get_array(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the array value.
Example​
> SELECT json_get_array('{"numbers": [1, 2, 3, 4]}', 'numbers');
+-------------------------------------------------------------------+
| json_get_array(Utf8("{"numbers": [1, 2, 3, 4]}"),Utf8("numbers")) |
+-------------------------------------------------------------------+
| [1, 2, 3, 4] |
+-------------------------------------------------------------------+
json_as_text​
Retrieves any value from a JSON string based on its path and represents it as a string. This is useful for converting JSON values to text format.
json_as_text(json_string, key1[, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Path to the value to convert to text.
Example​
> SELECT json_as_text('{"age": 30, "active": true}', 'age');
+---------------------------------------------------------------+
| json_as_text(Utf8("{"age": 30, "active": true}"),Utf8("age")) |
+---------------------------------------------------------------+
| 30 |
+---------------------------------------------------------------+
json_length​
Returns the length of a JSON string, array, or object. For objects, returns the number of key-value pairs. For arrays, returns the number of elements. For strings, returns the character count.
json_length(json_string[, key1, key2, ...])
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Optional path to a nested value. If omitted, returns the length of the root JSON value.
Example​
> SELECT json_length('{"a": 1, "b": 2, "c": 3}');
+-----------------------------------------------+
| json_length(Utf8("{"a": 1, "b": 2, "c": 3}")) |
+-----------------------------------------------+
| 3 |
+-----------------------------------------------+
> SELECT json_length('[1, 2, 3, 4, 5]');
+--------------------------------------+
| json_length(Utf8("[1, 2, 3, 4, 5]")) |
+--------------------------------------+
| 5 |
+--------------------------------------+
json_object_keys​
Returns the top-level keys of a JSON object as an array of strings. If a path is provided, returns the keys of the object at that path. Returns NULL if the value at the path is not an object.
json_object_keys(json_string[, key1, key2, ...])
Alias: json_keys.
Arguments​
- json_string: String containing valid JSON data.
- key1, key2, ...: Optional path to a nested object. If omitted, returns the keys of the root object.
Example​
> SELECT json_object_keys('{"a": 1, "b": 2, "c": 3}');
+-----------------------------------------------------+
| json_object_keys(Utf8("{"a": 1, "b": 2, "c": 3}")) |
+-----------------------------------------------------+
| [a, b, c] |
+-----------------------------------------------------+
> SELECT json_object_keys('{"user": {"name": "John", "age": 30}}', 'user');
+-----------------------------------------------------------------------------+
| json_object_keys(Utf8("{"user": {"name": "John", "age": 30}}"),Utf8("user")) |
+-----------------------------------------------------------------------------+
| [name, age] |
+-----------------------------------------------------------------------------+
JSON Operators​
->​
JSON access operator. Retrieves a value from a JSON string based on its path. This operator is an alias for json_get.
json_string -> key
json_string -> key1 -> key2
Arguments​
- json_string: String containing valid JSON data.
- key: Object key (string) or array index (integer).
Example​
> SELECT '{"user": {"name": "John", "age": 30}}' -> 'user' -> 'name';
+-------------------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' -> 'user' -> 'name' |
+-------------------------------------------------------------+
| {str=John} |
+-------------------------------------------------------------+
->>​
JSON access operator for text extraction. Retrieves any value from a JSON string and converts it to text. This operator is an alias for json_as_text.
json_string ->> key
json_string -> key1 ->> key2
Arguments​
- json_string: String containing valid JSON data.
- key: Object key (string) or array index (integer).
Example​
> SELECT '{"user": {"name": "John", "age": 30}}' -> 'user' ->> 'age';
+-------------------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' -> 'user' ->> 'age' |
+-------------------------------------------------------------+
| 30 |
+-------------------------------------------------------------+
?​
JSON containment operator. Returns true if a JSON string contains the specified key. This operator is an alias for json_contains.
json_string ? 'key'
Arguments​
- json_string: String containing valid JSON data.
- key: Key to check for existence.
Example​
> SELECT '{"user": {"name": "John", "age": 30}}' ? 'user';
+--------------------------------------------------+
| '{"user": {"name": "John", "age": 30}}' ? 'user' |
+--------------------------------------------------+
| true |
+--------------------------------------------------+
Usage Examples​
Nested Object Access​
> SELECT '{"inventory": {"stock": {"S": 12, "M": 20}}}' -> 'inventory' -> 'stock' ->> 'S' as size_s_stock;
+---------------+
| size_s_stock |
+---------------+
| 12 |
+---------------+
Array Access​
> SELECT '{"sizes": ["S", "M", "L", "XL"]}' -> 'sizes' ->> 0 as first_size;
+------------+
| first_size |
+------------+
| S |
+------------+
Conditional JSON Queries​
> SELECT name, properties ->> 'color' as color
FROM products
WHERE properties ? 'color' AND properties ->> 'color' IN ('black', 'white');
Using JSON Functions in Views​
JSON functions can be used in views to simplify access to nested JSON data:
CREATE VIEW products_with_color AS
SELECT
id,
name,
properties ->> 'color' as color,
json_get_int(properties, 'stock') as stock_count
FROM products;
JSON Table Functions (UDTFs)​
Spice includes table-valued functions for decomposing JSON structures into relational rows. Each function is available as both a UDTF (in the FROM clause with literal input) and a scalar UDF returning a list of structs (for per-row use with UNNEST).
flatten_json​
Walks an arbitrary JSON value and emits one row per reachable leaf.
flatten_json(input Utf8 [, options...]) -> TABLE(
path Utf8,
parent_path Utf8,
key Utf8,
value Utf8,
type Utf8 -- "object"|"array"|"string"|"number"|"integer"|"boolean"|"null"
)
Options (named arguments):
| Option | Type | Default | Description |
|---|---|---|---|
max_depth | UInt | 64 | Maximum recursion depth. |
max_rows | UInt | 1000000 | Per-document row cap. |
max_bytes | UInt | 8388608 | Input size limit (bytes). |
path_style | Utf8 | "dot" | "dot" or "json-pointer". |
include_internal | Bool | false | Also emit interior object/array rows. |
array_wildcard | Bool | false | Collapse array indices to [*] instead of [0], [1], etc. |
UDTF example:
SELECT path, value, type
FROM flatten_json('{"user": {"name": "Alice", "scores": [95, 87]}}');
| path | value | type |
|---|---|---|
user.name | Alice | string |
user.scores[0] | 95 | integer |
user.scores[1] | 87 | integer |
Scalar UDF example (per-row with UNNEST):
SELECT rows.path, rows.value, rows.type
FROM (SELECT UNNEST(flatten_json(body)) AS rows FROM documents);
flatten_json_properties​
Decomposes a JSON Schema document into one row per field, extracting metadata such as types, descriptions, required status, enums, and format.
flatten_json_properties(input Utf8 [, options...]) -> TABLE(
path Utf8,
parent_path Utf8,
name Utf8,
description Utf8,
type Utf8,
required Boolean,
format Utf8,
enum_values List<Utf8>,
metadata Utf8
)
Handles properties recursion, items.properties (arrays of objects), additionalProperties maps, allOf/oneOf/anyOf merging, and local $ref pointers with cycle detection.
Options (named arguments):
| Option | Type | Default | Description |
|---|---|---|---|
max_depth | UInt | 32 | Maximum recursion depth. |
max_rows | UInt | 100000 | Per-document row cap. |
max_bytes | UInt | 8388608 | Input size limit (bytes). |
path_style | Utf8 | "dot" | "dot" or "json-pointer". |
dialect | Utf8 | "json-schema" | "json-schema" or "openapi" (metrics tagging). |
include_internal | Bool | false | Also emit container rows (objects, arrays). |
expand_maps | Bool | false | Walk into additionalProperties and emit child paths with a wildcard segment (e.g., parent.[*].child). |
map_wildcard | Utf8 | "[*]" | Wildcard segment for map values when expand_maps is true. |
Example:
SELECT path, type, required, description
FROM flatten_json_properties('{
"type": "object",
"properties": {
"name": {"type": "string", "description": "User name"},
"age": {"type": "integer"}
},
"required": ["name"]
}');
| path | type | required | description |
|---|---|---|---|
name | string | true | User name |
age | integer | false |
Expanding maps:
When a JSON Schema uses additionalProperties to describe map values, enable expand_maps to produce JSONPath-style paths:
SELECT path, type
FROM flatten_json_properties(
'{"type": "object", "additionalProperties": {"type": "object", "properties": {"id": {"type": "string"}, "primary": {"type": "boolean"}}}}',
expand_maps => true
);
| path | type |
|---|---|
[*].id | string |
[*].primary | boolean |
Further Reading​
- datafusion-functions-json - The underlying JSON manipulation library
- Spice JSON Cookbook - Spice Cookbook demonstrating how to work with JSON strings in Spice
