Not sure if this is an attempt to simplify things or an oversight, but favoring convenience (no need to remember multiple keywords) over explicitness (but the keywords have different meanings) tends to cause problems, in my observation.
Only if you aren't using a subquery otherwise you would use WHERE even in plain SQL. Since the pipe operator is effectively creating subqueries the syntax is perfectly consistent with SQL.
I will already use subqueries to avoid issues with HAVING.
To me that feels as redundant as having WHERE vs HAVING, i.e. they do the same things, but at different points in the execution plan. It feels weird to need two separate keywords for that.
We're moving from SQLAnywhere to MSSQL, and boy, we're adding 2-5 levels of subqueries to most non-trivial queries due to issues like that. Super annoying.
I had one which went from 2 levels deep to 9... not pleasant. CTEs had some issues so couldn't use those either.
Personally I rarely use HAVING and instead use WHERE with subqueries for the following reasons:
1-I don't like repeating/duplicating a bunch of complex calcs, easier to just do WHERE in outer query on result
2-I typically have outer queries anyway for multiple reasons: break logic into reasonable chunks for humans, also for join+performance reasons (to give the optimizer a better chance at not getting confused)
One issue, that I mentioned in a different comment, is that we have a lot of queries which are used transparently as sub-queries at runtime to get count first, in order to limit rows fetched. The code doing the "transparent" wrapping doesn't have a full SQL parser, so can't hoist the CTEs out.
One performance issue I do recall was that a lateral join of a CTE was much, much slower than just doing 5-6 sub-queries of the same table, selecting different columns or aggregates for each. Think selecting sum packages, sum net weight, sum gross weight, sum value for all items on an invoice.
There were other issues using plain joins, but I can't recall them right now.
SELECT * FROM (SELECT * FROM ... GROUP BY ...) t WHERE ...
and they decided this was common enough that they would introduce a HAVING clause for this case SELECT * FROM ... GROUP BY ... HAVING ...
But the real issue is that in order to make operations in certain orders, SQL requires you to use subselects, which require restating a projection for no reason and a lot of syntactical ceremony. E.g. you must give the FROM item a name (t), but it's not required for disambiguation.Another common case is projecting before the filter. E.g. you want to reuse a complicated expression in the SELECT and WHERE clauses. Standard SQL requires you to repeat it or use a subselect since the WHERE clause is evaluated first.
You don't need a |> operator to make this possible. Your point is that there is a reason that SQL didn't just allow two WHERE clauses, one before and one after GROUP BY: to make it clearer syntactically.
Whereas the sort of proposal made by TFA is that if you think of the query as a sequence of steps to execute then you don't need the WHERE vs. HAVING clue because you can see whether a WHERE comes before or after GROUP BY in some query.
But the whole point of SQL is to _not have to_ think of how the query is to be implemented. Which I think brings us back to: it's better to have HAVING. But it's true also that it's better to allow arbitrary ordering of some clauses: there is no reason that FROM/JOIN, SELECT, ORDER BY / LIMIT have to be in the order that they are -- only WHERE vs. GROUP BY ordering matters, and _only_ if you insist on using WHERE for pre- and post-GROUP BY, but if you don't then all clauses can come in any order you like (though all table sources should come together, IMO).
So all in all I agree with you: keep HAVING.
I like the idea of CTE's, but I typically use temp tables instead of CTE's to avoid optimizer issues.