Skip to content

Support of HAVING without GROUP BY #208

@dey4ss

Description

@dey4ss

While going through the Public BI Benchmark, I found multiple queries with HAVING clauses but no GROUP BY columns (e.g., [1, 2, 3]). I'd like to discuss whether we want to support such statements or not.
Related to #186.

Consider the following query [1]:

SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok"
FROM "TrainsUK2_2" 
HAVING (COUNT(1) > 0)

According to DBFiddle [4], Postgres executes the query*, whereas we raise a syntax error due to seeing HAVING without GROUP BY columns. However, most of the HAVING criteria seem weird, as they consist of a a term always evaluating to true (see example above [1]) or false (see below [5]), assuming that the numbers passed th the aggregate functions do not refer to column IDs.

HAVING ((SUM(1) >= 30) AND (SUM(1) <= 100000))

If we want to adapt here, we could add a rule to the parser and change

opt_group : GROUP BY expr_list opt_having {
  $$ = new GroupByDescription();
  $$->columns = $3;
  $$->having = $4;
}
| /* empty */ { $$ = nullptr; };

to

opt_group : GROUP BY expr_list opt_having {
  $$ = new GroupByDescription();
  $$->columns = $3;
  $$->having = $4;
}
| HAVING expr {
  $$ = new GroupByDescription();
  $$->having = $2;
}
| /* empty */ { $$ = nullptr; };

Additionally, I noticed that the escaped column names in the example [1] do not work properly: "Number of Records" is not considered as one single identifier and "sum:Number of Records:ok" does not parse because sum is mistakenly treated as token for the SUM(...)aggregate function.

[1] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/TrainsUK2/queries/17.sql
[2] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/MLB/queries/103.sql
[3] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Wins/queries/10.sql
[4] https://www.db-fiddle.com/
[5] https://github.com/cwida/public_bi_benchmark/blob/master/benchmark/Hatred/queries/12.sql

* SQL for reproduction:

CREATE TABLE "TrainsUK2_2"("Number of Records" smallint NOT NULL);
INSERT INTO "TrainsUK2_2" VALUES(-1);

SELECT SUM(CAST("TrainsUK2_2"."Number of Records" AS BIGINT)) AS "sum:Number of Records:ok" FROM "TrainsUK2_2" HAVING (COUNT(1) > 0);

Result:

sum:Number of Records:ok
-1

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions