@nunogoncalves03 nunogoncalves03 commented Dec 18, 2025

Description

This PR implements correct handling of bare columns in GROUP BY queries that contain exactly one MIN or MAX aggregate, as described in #3444.

How does it work?

  • Detects GROUP BY queries that contain exactly one MIN or MAX aggregate
  • Tracks the current minimum or maximum value and updates bare columns only when a new minimum or maximum is found
  • Ensures that bare columns are copied from the row that produces the minimum or maximum value, rather than arbitrarily from the group

NULL handling:

  • When all values are NULL, bare columns are updated for each NULL row, so the last NULL row wins.
  • When there are ties involving non-NULL values, bare columns are updated only when a value is strictly less than (for MIN) or strictly greater than (for MAX) the current value. As a result, the first row that establishes the min/max wins.
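
The update rule described above can be sketched in a few lines of Python. This is only an illustration of the rule for a single group, not the code added by this PR: the function name `pick_bare_row`, the tuple-based row representation, and the use of `None` for SQL NULL are all assumptions made for the example.

```python
from typing import Any, Iterable, Optional, Tuple

Row = Tuple[Any, ...]


def pick_bare_row(rows: Iterable[Row], agg_index: int, use_max: bool = True) -> Optional[Row]:
    """Return the row whose bare columns would be reported for one group.

    Hypothetical helper: `agg_index` is the position of the column passed
    to MIN/MAX, and None stands in for SQL NULL.
    """
    best = None    # current min/max seen so far (None until a non-NULL value appears)
    chosen = None  # row whose bare columns are currently held
    for row in rows:
        value = row[agg_index]
        if value is None:
            # A NULL replaces the held row only while no non-NULL value has been seen,
            # so in an all-NULL group the last NULL row wins.
            update = best is None
        elif best is None:
            # The first non-NULL value always establishes the min/max.
            update = True
        else:
            # Strict comparison: on ties, the row that first set the value is kept.
            update = value > best if use_max else value < best
        if update:
            best = value
            chosen = row
    return chosen


rows = [(1, "a", "a"), (1, "b", "b"), (1, "c", "c")]
print(pick_bare_row(rows, 2, use_max=True))
print(pick_bare_row(rows, 2, use_max=False))
```

Rows are assumed to arrive in the order in which the group is scanned, so "first" and "last" refer to that processing order.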

Previous bytecode

turso> explain select a, b, max(c) from t group by a;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     47    0                    0   Start at 47
1     Null               0     10    0                    0   r[10]=NULL
2     SorterOpen         0     3     0     k(1,B)         0   cursor=0
3     Integer            0     6     0                    0   r[6]=0; clear group by abort flag
4     Null               0     7     0                    0   r[7]=NULL; initialize group by comparison registers to NULL
5     Gosub              15    43    0                    0   ; go to clear accumulator subroutine
6     OpenRead           2     2     0     k(4,B,B,B)     0   table=t, root=2, iDb=0
7     Rewind             2     14    0                    0   Rewind table t
8       Column           2     0     12                   0   r[12]=t.a
9       Column           2     1     13                   0   r[13]=t.b
10      Column           2     2     14                   0   r[14]=t.c
11      MakeRecord       12    3     11                   0   r[11]=mkrec(r[12..14])
12      SorterInsert     0     11    0     0              0   key=r[11]
13    Next               2     8     0                    0   
14    OpenPseudo         1     11    3                    0   3 columns in r[11]
15    SorterSort         0     31    0                    0   
16      SorterData       0     11    1                    0   r[11]=data
17      Column           1     0     16                   0   r[16]=pseudo.column 0
18      Compare          7     16    1     k(1, Binary)   0   r[7..7]==r[16..16]
19      Jump             20    24    20                   0   ; start new group if comparison is not equal
20      Gosub            4     35    0                    0   ; check if ended group had data, and output if so
21      Move             16    7     1                    0   r[7..7]=r[16..16]
22      IfPos            6     46    0                    0   r[6]>0 -> r[6]-=0, goto 46; check abort flag
23      Gosub            15    43    0                    0   ; goto clear accumulator subroutine
24      Column           1     2     17                   0   r[17]=pseudo.column 2
25      AggStep          0     17    10    max            0   accum=r[10] step(r[17])
26      If               5     29    0                    0   if r[5] goto 29; don't emit group columns if continuing existing group
27      Column           1     0     8                    0   r[8]=pseudo.column 0
28      Column           1     1     9                    0   r[9]=pseudo.column 1
29      Integer          1     5     0                    0   r[5]=1; indicate data in accumulator
30    SorterNext         0     16    0                    0   
31    Gosub              4     35    0                    0   ; emit row for final group
32    Goto               0     46    0                    0   ; group by finished
33    Integer            1     6     0                    0   r[6]=1
34    Return             4     0     0                    0   
35    IfPos              5     37    0                    0   r[5]>0 -> r[5]-=0, goto 37; output group by row subroutine start
36    Return             4     0     0                    0   
37    AggFinal           0     10    0     max            0   accum=r[10]
38    Copy               8     1     0                    0   r[1]=r[8]
39    Copy               9     2     0                    0   r[2]=r[9]
40    Copy               10    3     0                    0   r[3]=r[10]
41    ResultRow          1     3     0                    0   output=r[1..3]
42    Return             4     0     0                    0   
43    Null               0     8     10                   0   r[8..10]=NULL; clear accumulator subroutine start
44    Integer            0     5     0                    0   r[5]=0
45    Return             15    0     0                    0   
46    Halt               0     0     0                    0   
47    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
48    Goto               0     1     0                    0   

New bytecode

turso> explain select a, b, max(c) from t group by a;
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     66    0                    0   Start at 66
1     Null               0     12    0                    0   r[12]=NULL
2     SorterOpen         0     3     0     k(1,B)         0   cursor=0
3     Integer            0     6     0                    0   r[6]=0; clear group by abort flag
4     Null               0     7     0                    0   r[7]=NULL; initialize group by comparison registers to NULL
5     Gosub              17    60    0                    0   ; go to clear accumulator subroutine
6     OpenRead           2     2     0     k(4,B,B,B)     0   table=t, root=2, iDb=0
7     Rewind             2     14    0                    0   Rewind table t
8       Column           2     0     14                   0   r[14]=t.a
9       Column           2     1     15                   0   r[15]=t.b
10      Column           2     2     16                   0   r[16]=t.c
11      MakeRecord       14    3     13                   0   r[13]=mkrec(r[14..16])
12      SorterInsert     0     13    0     0              0   key=r[13]
13    Next               2     8     0                    0   
14    OpenPseudo         1     13    3                    0   3 columns in r[13]
15    SorterSort         0     48    0                    0   
16      SorterData       0     13    1                    0   r[13]=data
17      Column           1     0     18                   0   r[18]=pseudo.column 0
18      Compare          7     18    1     k(1, Binary)   0   r[7..7]==r[18..18]
19      Jump             20    24    20                   0   ; start new group if comparison is not equal
20      Gosub            4     52    0                    0   ; check if ended group had data, and output if so
21      Move             18    7     1                    0   r[7..7]=r[18..18]
22      IfPos            6     65    0                    0   r[6]>0 -> r[6]-=0, goto 65; check abort flag
23      Gosub            17    60    0                    0   ; goto clear accumulator subroutine
24      Column           1     2     19                   0   r[19]=pseudo.column 2
25      Integer          0     9     0                    0   r[9]=0
26      NotNull          19    31    0                    0   r[19]!=NULL -> goto 31
27      IsNull           8     29    0                    0   if (r[8]==NULL) goto 29
28      Goto             0     38    0                    0   
29      Integer          1     9     0                    0   r[9]=1
30      Goto             0     38    0                    0   
31      NotNull          8     34    0                    0   r[8]!=NULL -> goto 34
32      Integer          1     9     0                    0   r[9]=1
33      Goto             0     38    0                    0   
34      Copy             8     20    0                    0   r[20]=r[8]
35      Compare          19    20    1     k(1, Binary)   0   r[19..19]==r[20..20]
36      Jump             38    38    37                   0   
37      Integer          1     9     0                    0   r[9]=1
38      Column           1     2     21                   0   r[21]=pseudo.column 2
39      AggStep          0     21    12    max            0   accum=r[12] step(r[21])
40      IfPos            9     42    0                    0   r[9]>0 -> r[9]-=0, goto 42
41      Goto             0     46    0                    0   
42      Column           1     2     22                   0   r[22]=pseudo.column 2
43      Copy             22    8     0                    0   r[8]=r[22]
44      Column           1     0     10                   0   r[10]=pseudo.column 0
45      Column           1     1     11                   0   r[11]=pseudo.column 1
46      Integer          1     5     0                    0   r[5]=1; indicate data in accumulator
47    SorterNext         0     16    0                    0   
48    Gosub              4     52    0                    0   ; emit row for final group
49    Goto               0     65    0                    0   ; group by finished
50    Integer            1     6     0                    0   r[6]=1
51    Return             4     0     0                    0   
52    IfPos              5     54    0                    0   r[5]>0 -> r[5]-=0, goto 54; output group by row subroutine start
53    Return             4     0     0                    0   
54    AggFinal           0     12    0     max            0   accum=r[12]
55    Copy               10    1     0                    0   r[1]=r[10]
56    Copy               11    2     0                    0   r[2]=r[11]
57    Copy               12    3     0                    0   r[3]=r[12]
58    ResultRow          1     3     0                    0   output=r[1..3]
59    Return             4     0     0                    0   
60    Null               0     10    12                   0   r[10..12]=NULL; clear accumulator subroutine start
61    Null               0     8     0                    0   r[8]=NULL
62    Integer            0     9     0                    0   r[9]=0
63    Integer            0     5     0                    0   r[5]=0
64    Return             17    0     0                    0   
65    Halt               0     0     0                    0   
66    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
67    Goto               0     1     0                    0 

Motivation and context

Fixes #3444.

Currently, Turso does not have this special-case handling for bare columns:

turso> create table t(a,b,c);
turso> insert into t values (1, 'a', 'a'), (1, 'b', 'b'), (1, 'c', 'c');
turso> select a, b, max(c) from t group by a; -- b column is incorrect
┌───┬───┬───────────┐
│ a │ b │ max (t.c) │
├───┼───┼───────────┤
│ 1 │ a │ c         │
└───┴───┴───────────┘
turso> select a, b, min(c) from t group by a; -- this one happens to be correct
┌───┬───┬───────────┐
│ a │ b │ min (t.c) │
├───┼───┼───────────┤
│ 1 │ a │ a         │
└───┴───┴───────────┘

SQLite does:

sqlite> create table t(a,b,c);
sqlite> insert into t values (1, 'a', 'a'), (1, 'b', 'b'), (1, 'c', 'c');
sqlite> select a, b, max(c) from t group by a;
1|c|c
sqlite> select a, b, min(c) from t group by a;
1|a|a
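
The SQLite session above can also be reproduced from Python's standard `sqlite3` module, which may be handy as a reference when writing compatibility tests. This is a minimal sketch that assumes the bundled SQLite is recent enough to support multi-row `VALUES`; the variable names are illustrative.

```python
import sqlite3

# In-memory database with the same schema and rows as the example above.
conn = sqlite3.connect(":memory:")
conn.execute("create table t(a, b, c)")
conn.execute("insert into t values (1, 'a', 'a'), (1, 'b', 'b'), (1, 'c', 'c')")

# The bare column b is taken from the row that holds the max/min of c.
max_row = conn.execute("select a, b, max(c) from t group by a").fetchone()
min_row = conn.execute("select a, b, min(c) from t group by a").fetchone()
conn.close()

print(max_row)  # (1, 'c', 'c')
print(min_row)  # (1, 'a', 'a')
```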

Description of AI Usage

Claude Sonnet 4.5 was heavily used in this PR. I relied on it to understand the existing group-by flow, from query planning through bytecode generation, and then used it iteratively to generate most of the code until I reached the desired solution. It was also used to create some of the test cases.

@turso-bot turso-bot bot left a comment

Please review @pereman2

Development

Successfully merging this pull request may close these issues.

min() and max() should have an effect on bare columns

2 participants