zlacker

1. Cianti+06 2024-08-24 16:16:12
>>legran+(OP)
Here is one example from the PDF:

    FROM r JOIN s USING (id)
    |> WHERE r.c < 15
    |> AGGREGATE sum(r.e) AS s GROUP BY r.d
    |> WHERE s > 3
    |> ORDER BY d
    |> SELECT d, s, rank() OVER (order by d)
Can we call this SQL anymore after this? This re-ordering of things has been done by others too, like PRQL, but they didn't call it SQL. I do think it makes things more readable.
2. tmoert+Oa 2024-08-24 16:55:44
>>Cianti+06
The point of SQL pipe syntax is that there is no reordering. You read the query as a sequence of operations, and that's exactly how it's executed. (Semantically. Of course, the query engine is free to optimize the execution plan as long as the semantics are preserved.)

The pipe operator is a semantic execution barrier: everything before the `|>` is assumed to have executed and returned a table before what follows begins.

From the paper:

> Each pipe operator is a unary relational operation that takes one table as input and produces one table as output.
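
To illustrate the "one table in, one table out" reading, here is a rough sketch of the example from the top of the thread written as nested subqueries in conventional SQL, one nesting level per semantic stage. The aliases agg and filtered are my own names, not from the paper, and I have not run this against any particular engine:

    -- Hypothetical rewrite; agg and filtered are illustrative aliases.
    SELECT d, s, rank() OVER (ORDER BY d)    -- |> SELECT d, s, rank() OVER (...)
    FROM (
      SELECT d, s
      FROM (
        SELECT r.d AS d, sum(r.e) AS s       -- |> AGGREGATE sum(r.e) AS s GROUP BY r.d
        FROM r JOIN s USING (id)             -- FROM r JOIN s USING (id)
        WHERE r.c < 15                       -- |> WHERE r.c < 15
        GROUP BY r.d
      ) AS agg
      WHERE s > 3                            -- |> WHERE s > 3
    ) AS filtered
    ORDER BY d                               -- |> ORDER BY d

Each subquery plays the role of the table handed from one pipe step to the next, which is why the pipe form reads top to bottom while the nested form reads inside out.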

Vanilla SQL is actually more complex in this respect because you have, for example, at least 3 different keywords for filtering (WHERE, HAVING, QUALIFY) and everyone who reads your query needs to understand what each keyword implies regarding execution scheduling. (WHERE is before grouping, HAVING is after aggregates, and QUALIFY is after analytic window functions.)
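
A sketch of the difference, assuming a hypothetical table orders(region, customer_id, amount) and a dialect that accepts QUALIFY together with GROUP BY (BigQuery documents that evaluation order). In vanilla SQL the three filters need three keywords:

    -- orders and its columns are made up for illustration.
    SELECT region, customer_id, SUM(amount) AS total,
           RANK() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS rnk
    FROM orders
    WHERE amount > 0            -- filters rows before grouping
    GROUP BY region, customer_id
    HAVING SUM(amount) > 100    -- filters groups after aggregation
    QUALIFY rnk <= 3            -- filters rows after the window function

In pipe syntax the same stages would all use WHERE, and the position in the pipeline says when each filter applies:

    FROM orders
    |> WHERE amount > 0
    |> AGGREGATE SUM(amount) AS total GROUP BY region, customer_id
    |> WHERE total > 100
    |> EXTEND RANK() OVER (PARTITION BY region ORDER BY total DESC) AS rnk
    |> WHERE rnk <= 3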

3. aidos+Mz 2024-08-24 20:07:28
>>tmoert+Oa
Golly, QUALIFY, a new SQL operator I didn’t know existed. I tend not to do much with window functions and I would have reached for a CTE instead but it’s always nice to be humbled by finding something new in a language you thought you knew well.
4. mrbung+P21 2024-08-24 23:42:45
>>aidos+Mz
It's not common at all. It's a non-ANSI SQL clause that, afaik, was created by Teradata: syntactic sugar for filtering on window functions directly, without CTEs or temp tables, and especially useful for dedup. In most cases, at least; for example, you can't use QUALIFY in a query that is aggregating data, just as you can't use a window function when aggregating.
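
A minimal sketch of the dedup case, assuming a hypothetical table events(user_id, updated_at, payload) where you want only the latest row per user. With QUALIFY, in a dialect that supports it (DuckDB and Snowflake accept this form):

    -- events and its columns are made up for illustration.
    SELECT *
    FROM events
    QUALIFY ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) = 1

Without QUALIFY, the usual workaround is a CTE (or subquery) plus an outer WHERE:

    WITH ranked AS (
      SELECT *,
             ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
      FROM events
    )
    SELECT *
    FROM ranked
    WHERE rn = 1

Note that the CTE version also returns the helper column rn unless you list the columns explicitly.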

Other engines that implement it are direct competitors in that space: Snowflake, Databricks SQL, BigQuery, ClickHouse, and DuckDB (the only OSS implementation I know of). Point is: if you want to compete with Teradata and be a possible migration target, you want to implement QUALIFY.

Anecdote: I went from a company that had Teradata to another where I had to implement the whole data stack in GCP. I shed tears of joy when I found out BQ also had QUALIFY. And the intent was clear, as they also offered various Teradata migration services.
