zlacker

[return to "Pipe Syntax in SQL"]
1. Cianti+06[view] [source] 2024-08-24 16:16:12
>>legran+(OP)
Here is one example from the PDF:

    FROM r JOIN s USING (id)
    |> WHERE r.c < 15
    |> AGGREGATE sum(r.e) AS s GROUP BY r.d
    |> WHERE s > 3
    |> ORDER BY d
    |> SELECT d, s, rank() OVER (order by d)
Can we call this SQL anymore after this? This re-ordering of things has been done by others too, like PRQL, but they didn't call it SQL. I do think it makes things more readable.
◧◩
2. andy80+D7[view] [source] 2024-08-24 16:29:10
>>Cianti+06
The multiple uses of WHERE with different meanings is problematic for me. The second WHERE, filtering an aggregate, would be HAVING in standard SQL.

Not sure if this is an attempt to simplify things or an oversight, but favoring convenience (no need to remember multiple keywords) over explicitness (but the keywords have different meanings) tends to cause problems, in my observation.

◧◩◪
3. wvenab+Ne[view] [source] 2024-08-24 17:29:49
>>andy80+D7
> The second WHERE, filtering an aggregate, would be HAVING in standard SQL.

Only if you aren't using a subquery otherwise you would use WHERE even in plain SQL. Since the pipe operator is effectively creating subqueries the syntax is perfectly consistent with SQL.

◧◩◪◨
4. andy80+eq[view] [source] 2024-08-24 18:53:44
>>wvenab+Ne
Perhaps, however then you eliminate the use of WHERE/HAVING sum(r.e) > 3, so in case you forgot what the alias s means, you have to figure that part out before proceeding. Maybe I'm just used to the existing style but as stated earlier, seems this is reducing explicitness which IMO tends to lead to more bugs.
◧◩◪◨⬒
5. wvenab+fr[view] [source] 2024-08-24 19:01:58
>>andy80+eq
A lot of SQL engines don't support aliases in the HAVING clause and that can require duplication of potentially complex expressions which I find very bug-inducing. Removing duplication and using proper naming I think would be much better.

I will already use subqueries to avoid issues with HAVING.

◧◩◪◨⬒⬓
6. magica+OH[view] [source] 2024-08-24 21:18:15
>>wvenab+fr
> A lot of SQL engines don't support aliases in the HAVING clause

We're moving from SQLAnywhere to MSSQL, and boy, we're adding 2-5 levels of subqueries to most non-trivial queries due to issues like that. Super annoying.

I had one which went from 2 levels deep to 9... not pleasant. CTEs had some issues so couldn't use those either.

◧◩◪◨⬒⬓⬔
7. wvenab+nQ[view] [source] 2024-08-24 22:09:58
>>magica+OH
I'm surprised you had issues with CTEs -- MS SQL has one of the better CTE implementations. But I could see how it might take more than just trivial transformations to make efficient use of them.
◧◩◪◨⬒⬓⬔⧯
8. magica+kX[view] [source] 2024-08-24 23:02:47
>>wvenab+nQ
I don't recall all off the top of my head.

One issue, that I mentioned in a different comment, is that we have a lot of queries which are used transparently as sub-queries at runtime to get count first, in order to limit rows fetched. The code doing the "transparent" wrapping doesn't have a full SQL parser, so can't hoist the CTEs out.

One performance issue I do recall was that a lateral join of a CTE was much, much slower than just doing 5-6 sub-queries of the same table, selecting different columns or aggregates for each. Think selecting sum packages, sum net weight, sum gross weight, sum value for all items on an invoice.

There were other issues using plain joins, but I can't recall them right now.

◧◩◪◨⬒⬓⬔⧯▣
9. RaftPe+Mj2[view] [source] 2024-08-25 15:02:29
>>magica+kX
CTE's (at least in MS SQL land) are a syntax level operation, meaning CTE's get expanded to be as if you wrote the same subquery at each place a CTE was, which frequently impacts the optimizer and performance.

I like the idea of CTE's, but I typically use temp tables instead of CTE's to avoid optimizer issues.

[go to top]