Unable to `SET` empty strings to `NULL` with duckdb 1.2.0

I've recently come across a weird bug, in which I'm unable to `SET` empty strings to `NULL`. This issue only happens with duckdb 1.2.0, when using 1.1.3 I have no problems with this operation.

I've tried creating a small reproducible example, but didn't manage to do it. Instead, I've created a table with 4 million rows in which the issue seems to come up quite frequently (please note that sometimes it works as expected, although it seems to be a small minority of the cases), which is attached to this issue. This table has 2 columns, `cbo1994`, the one in which the problem appears, and `id_estab`, that holds 14-digit ids. If `id_estab` is missing, the issue doesn't come up - i.e. the problem seems to be related with this column, for some strange reason.

Another strange behavior is that the issue only comes up when using the full table, with 4 million rows. If I subset to use only the first 3 million, the code runs fine. The issue is not related to the last 1 million per se, however - I also tried to run the code using only the last million, but it worked fine.

Reprex:

```r
library(data.table)

tmpdb <- tempfile(fileext = ".duckdb")
con <- duckdb::dbConnect(duckdb::duckdb(), dbdir = tmpdb)

test <- readRDS("path/to/test.rds")
  
duckdb::dbWriteTable(con, "test", test, overwrite = TRUE)
query <- "UPDATE test SET cbo1994 = NULL WHERE cbo1994 = '';"
n_linhas_afetadas <- DBI::dbExecute(con, query)
dplyr::tbl(con, dplyr::sql("SELECT * FROM test"))
#> # Source:   SQL [?? x 2]
#> # Database: DuckDB v1.2.0 [dhersz@Windows Server x64:R 4.4.2/C:\Users\dhersz\AppData\Local\Temp\7\RtmpOEiIFH\filed0cc461710fc.duckdb]
#>    cbo1994 id_estab      
#>    <chr>   <chr>         
#>  1 "58990" 11111111111111
#>  2 "58330" 11111111111111
#>  3 "58330" 11111111111111
#>  4 "98540" 11111111111111
#>  5 "84350" 11111111111111
#>  6 "42190" 11111111111111
#>  7 ""      11111111111111
#>  8 "98590" 11111111111111
#>  9 "73290" 11111111111111
#> 10 "73290" 11111111111111
#> # ℹ more rows

test <- test[1:3000000,]
duckdb::dbWriteTable(con, "test", test, overwrite = TRUE)
query <- "UPDATE test SET cbo1994 = NULL WHERE cbo1994 = '';"
n_linhas_afetadas <- DBI::dbExecute(con, query)
dplyr::tbl(con, dplyr::sql("SELECT * FROM test"))
#> # Source:   SQL [?? x 2]
#> # Database: DuckDB v1.2.0 [dhersz@Windows Server x64:R 4.4.2/C:\Users\dhersz\AppData\Local\Temp\7\RtmpOEiIFH\filed0cc461710fc.duckdb]
#>    cbo1994 id_estab      
#>    <chr>   <chr>         
#>  1 58990   11111111111111
#>  2 58330   11111111111111
#>  3 58330   11111111111111
#>  4 98540   11111111111111
#>  5 84350   11111111111111
#>  6 42190   11111111111111
#>  7 <NA>    11111111111111
#>  8 98590   11111111111111
#>  9 73290   11111111111111
#> 10 73290   11111111111111
#> # ℹ more rows
```

The exact same code works as expected with duckdb 1.1.3:

```r
library(data.table)

tmpdb <- tempfile(fileext = ".duckdb")
con <- duckdb::dbConnect(duckdb::duckdb(), dbdir = tmpdb)

test <- readRDS("path/to/test.rds")
  
duckdb::dbWriteTable(con, "test", test, overwrite = TRUE)
query <- "UPDATE test SET cbo1994 = NULL WHERE cbo1994 = '';"
n_linhas_afetadas <- DBI::dbExecute(con, query)
dplyr::tbl(con, dplyr::sql("SELECT * FROM test"))
#> # Source:   SQL [?? x 2]
#> # Database: DuckDB v1.1.3 [dhersz@Windows Server x64:R 4.4.2/C:\Users\dhersz\AppData\Local\Temp\7\RtmpAFcMMe\filec71467c348db.duckdb]
#>    cbo1994 id_estab      
#>    <chr>   <chr>         
#>  1 58990   11111111111111
#>  2 58330   11111111111111
#>  3 58330   11111111111111
#>  4 98540   11111111111111
#>  5 84350   11111111111111
#>  6 42190   11111111111111
#>  7 <NA>    11111111111111
#>  8 98590   11111111111111
#>  9 73290   11111111111111
#> 10 73290   11111111111111
#> # ℹ more rows

test <- test[1:3000000,]
duckdb::dbWriteTable(con, "test", test, overwrite = TRUE)
query <- "UPDATE test SET cbo1994 = NULL WHERE cbo1994 = '';"
n_linhas_afetadas <- DBI::dbExecute(con, query)
dplyr::tbl(con, dplyr::sql("SELECT * FROM test"))
#> # Source:   SQL [?? x 2]
#> # Database: DuckDB v1.1.3 [dhersz@Windows Server x64:R 4.4.2/C:\Users\dhersz\AppData\Local\Temp\7\RtmpAFcMMe\filec71467c348db.duckdb]
#>    cbo1994 id_estab      
#>    <chr>   <chr>         
#>  1 58990   11111111111111
#>  2 58330   11111111111111
#>  3 58330   11111111111111
#>  4 98540   11111111111111
#>  5 84350   11111111111111
#>  6 42190   11111111111111
#>  7 <NA>    11111111111111
#>  8 98590   11111111111111
#>  9 73290   11111111111111
#> 10 73290   11111111111111
#> # ℹ more rows
```

Again, I've tried reproducing the behavior with a smaller dataset, but wasn't able to, so please see attached a zip file with the required data in `.rds` format.

Best, glad to help with anything needed!

[test.zip](https://github.com/user-attachments/files/19232998/test.zip)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Unable to `SET` empty strings to `NULL` with duckdb 1.2.0 #1082

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Unable to SET empty strings to NULL with duckdb 1.2.0 #1082

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Unable to `SET` empty strings to `NULL` with duckdb 1.2.0 #1082