SELECT
Spice is built on Apache DataFusion and uses the PostgreSQL dialect, even when querying datasources with different SQL dialects.
SELECT syntax
The queries in Spice scan data from tables and return 0 or more rows. Please be aware that column names in queries are made lower-case, but not on the inferred schema. Accordingly, if you want to query against a capitalized field, make sure to use double quotes.
Spice supports the following syntax for queries:
[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expr [, ...]
[ FROM from_item [, ...] ]
[ JOIN join_item [, ...] ]
[ WHERE condition ]
[ GROUP BY grouping_element [, ...] ]
[ HAVING condition]
[ QUALIFY condition ]
[ UNION [ ALL | select ] ]
[ ORDER BY expression [ ASC | DESC ][, ...] ]
[ LIMIT count ]
[ EXCLUDE | EXCEPT ]
Window Functions (OVER Clause)
Window functions perform calculations across a set of rows related to the current row. Use the OVER clause to define the window:
SELECT
employee_id,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank,
SUM(salary) OVER (PARTITION BY dept_id) AS dept_total
FROM employees;
The OVER clause supports:
PARTITION BY: Divides rows into groupsORDER BY: Defines row ordering within each partition- Frame specifications:
ROWS BETWEEN ... AND ...
WITH clause
A with clause allows to give names for queries and reference them by name.
WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
SELECT a, b FROM x;
SELECT clause
The SELECT clause is used to select data from a database by defining the colummns it returns. Each select_expr in the
SELECT list can be an expression or wildcards.
Example:
SELECT a, b, a + b FROM table
The DISTINCT quantifier can be added to make the query return all distinct rows.
By default ALL will be used, which returns all the rows.
SELECT DISTINCT person, age FROM employees
FROM clause
The FROM clause is used to specify which table to select data from.
Example:
SELECT t.a FROM table AS t
WHERE clause
The WHERE clause is used define the conditions to filter the query results.
Example:
SELECT a FROM table WHERE a > 10
JOIN clause
Spice supports INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, NATURAL JOIN and CROSS JOIN.
The following examples are based on this table:
select * from x;
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
INNER JOIN
The keywords JOIN or INNER JOIN define a join that only shows rows where there is a match in both tables.
select * from x inner join x y ON x.column_1 = y.column_1;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | 1 | 2 |
+----------+----------+----------+----------+
LEFT OUTER JOIN
The keywords LEFT JOIN or LEFT OUTER JOIN define a join that includes all rows from the left table even if there
is not a match in the right table. When there is no match, null values are produced for the right side of the join.
select * from x left join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | | |
+----------+----------+----------+----------+
RIGHT OUTER JOIN
The keywords RIGHT JOIN or RIGHT OUTER JOIN define a join that includes all rows from the right table even if there
is not a match in the left table. When there is no match, null values are produced for the left side of the join.
select * from x right join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| | | 1 | 2 |
+----------+----------+----------+----------+
FULL OUTER JOIN
The keywords FULL JOIN or FULL OUTER JOIN define a join that is effectively a union of a LEFT OUTER JOIN and
RIGHT OUTER JOIN. It will show all rows from the left and right side of the join and will produce null values on
either side of the join where there is not a match.
select * from x full outer join x y ON x.column_1 = y.column_2;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | | |
| | | 1 | 2 |
+----------+----------+----------+----------+
NATURAL JOIN
A natural join defines an inner join based on common column names found between the input tables. When no common column names are found, it behaves like a cross join.
select * from x natural join x y;
+----------+----------+
| column_1 | column_2 |
+----------+----------+
| 1 | 2 |
+----------+----------+
CROSS JOIN
A cross join produces a cartesian product that matches every row in the left side of the join with every row in the right side of the join.
select * from x cross join x y;
+----------+----------+----------+----------+
| column_1 | column_2 | column_1 | column_2 |
+----------+----------+----------+----------+
| 1 | 2 | 1 | 2 |
+----------+----------+----------+----------+
GROUP BY clause
The GROUP BY clause groups together input rows that have the same value into summary rows.
GROUP BY is typically used with aggregrate functions (COUNT(), MAX(), SUM()), but if no aggregate functions are
included, the query with a GROUP BY clause is the same as SELECT DISTINCT.
Example:
SELECT a, b, MAX(c) FROM table GROUP BY a, b
Some aggregation functions accept optional ordering requirement, such as ARRAY_AGG. If a requirement is given,
aggregation is calculated in the order of the requirement.
Example:
SELECT a, b, ARRAY_AGG(c, ORDER BY d) FROM table GROUP BY a, b
GROUP BY ALL
Use GROUP BY ALL to group by every column in the SELECT list that isn’t inside an aggregate function. This keeps the column definitions in one place, simplifies the query, and prevents bugs by keeping the SELECT granularity aligned with the GROUP BY granularity (e.g., preventing unintended duplication).
Example:
SELECT a, b, MAX(c) FROM table GROUP BY ALL
HAVING clause
The HAVING clause can be used with GROUP BY to eliminate groups that don't satisfy the condition given.
Example:
SELECT a, b, MAX(c) FROM table GROUP BY a, b HAVING MAX(c) > 10
QUALIFY clause
The QUALIFY clause filters the results of window functions. It is evaluated after window functions are computed, similar to how HAVING filters results after GROUP BY.
Example:
SELECT
employee_id,
dept_id,
salary,
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rank
FROM employees
QUALIFY rank <= 3;
This query returns only the top 3 highest-paid employees in each department.
UNION clause
The UNION clause combines the results of two or more SELECT statements. By default UNION removes
duplicates. To include duplicates, use UNION ALL.
Example:
SELECT
a,
b,
c
FROM table1
UNION ALL
SELECT
a,
b,
c
FROM table2
ORDER BY clause
Orders the results by the referenced expression. By default it uses ascending order (ASC).
This order can be changed to descending by adding DESC after the order-by expressions.
Examples:
SELECT age, person FROM table ORDER BY age;
SELECT age, person FROM table ORDER BY age DESC;
SELECT age, person FROM table ORDER BY age, person DESC;
ORDER BY ALL
Order from left to right (by age, then by person) in ascending order:
SELECT age, person FROM table ORDER BY ALL;
LIMIT clause
Limits the number of rows to be a maximum of count rows. count should be a non-negative integer.
Example:
SELECT age, person FROM table
LIMIT 10
EXCLUDE and EXCEPT clause
Excluded named columns from query results.
Example selecting all columns except for age and person:
SELECT * EXCEPT(age, person)
FROM table;
SELECT * EXCLUDE(age, person)
FROM table;
Additional Example
SELECT name, age FROM employees WHERE age > 30 ORDER BY age DESC;
