to_duckdb() ignoring sorting #1323

@filipemsc

I am seeing unexpected behavior when calling to_duckdb() after sorting. When an Arrow table is sorted with arrange() and then passed to to_duckdb(), the sort order is ignored: the collected result comes back in the original row order.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(arrow)
#> The tzdb package is not installed. Timezones will not be available to Arrow compute functions.
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(duckdb)
#> Loading required package: DBI

conn <- dbConnect(
  duckdb()
)

data.frame(id = rep(c("A","B","C"), 4), value = sample(1:12)) |>
  as_arrow_table() |>
  arrange(id, -value) |>
  collect()
#> # A tibble: 12 × 2
#>    id    value
#>    <chr> <int>
#>  1 A        12
#>  2 A        10
#>  3 A         7
#>  4 A         3
#>  5 B        11
#>  6 B         9
#>  7 B         2
#>  8 B         1
#>  9 C         8
#> 10 C         6
#> 11 C         5
#> 12 C         4

data.frame(id = rep(c("A","B","C"), 4), value = sample(1:12)) |>
  as_arrow_table() |>
  arrange(id, -value) |>
  to_duckdb(conn) |>
  collect()
#> # A tibble: 12 × 2
#>    id    value
#>    <chr> <int>
#>  1 A         5
#>  2 B         8
#>  3 C         4
#>  4 A         9
#>  5 B         3
#>  6 C         1
#>  7 A        10
#>  8 B         7
#>  9 C         2
#> 10 A        12
#> 11 B        11
#> 12 C         6

Created on 2025-07-08 with reprex v2.1.1
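
A possible workaround, shown here only as a sketch and not as a fix for the underlying behavior: apply the sort after to_duckdb() instead of before it, so that arrange() is translated into an ORDER BY on the DuckDB side. This assumes dbplyr is installed (to_duckdb() needs it) and uses desc(value) in place of -value; the name `sorted` is only illustrative.

```r
library(dplyr)
library(arrow)
library(duckdb)

conn <- dbConnect(duckdb())

# Sort on the DuckDB side: arrange() after to_duckdb() becomes an ORDER BY
# in the generated SQL, so the order is applied when the query is collected.
sorted <- data.frame(id = rep(c("A", "B", "C"), 4), value = sample(1:12)) |>
  as_arrow_table() |>
  to_duckdb(conn) |>
  arrange(id, desc(value)) |>
  collect()

print(sorted)

dbDisconnect(conn, shutdown = TRUE)
```

With this ordering of the pipeline, the rows should come back grouped by id with value descending within each id, matching the first output above.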
