Closed
closest() only accepts inequality operators, but it would seem natural to allow == (or some other operator) for joining on the closest value, irrespective of whether that value is higher or lower.
Consider this minimal example (data taken from this SO question). This issue was also mentioned here. An inequality join with closest() works, but an equality join with closest() does not.
library(lubridate)
library(dplyr)
df1 <- data.frame(var2 = c("Dog", "Dog", "Cat"),
                  Date = dmy(c("01-01-2022", "02-01-2022", "07-12-2022")))
# var2 Date
# 1 Dog 2022-01-01
# 2 Dog 2022-01-02
# 3 Cat 2022-12-07
df2 <- data.frame(Date = dmy(c("07-01-2022","04-12-2022" , "10-12-2022")))
# Date
# 1 2022-01-07
# 2 2022-12-04
# 3 2022-12-10
df1 %>%
  inner_join(df2, join_by(closest(Date <= Date)))
# var2 Date.x Date.y
# 1 Dog 2022-01-01 2022-01-07
# 2 Dog 2022-01-02 2022-01-07
# 3 Cat 2022-12-07 2022-12-10
df1 %>%
  inner_join(df2, join_by(closest(Date == Date)))
# Error in `join_by()`:
# ! The expression used in `closest()` can't use `==`.
# ℹ Expression 1 is `closest(Date == Date)`.
# Run `rlang::last_error()` to see where the error occurred.
Instead, it'd be nice to have a simple option for either direction:
df1 %>%
  inner_join(df2, join_by(closest(Date == Date)), multiple = "all")
# var2 Date.x Date.y
# 1 Dog 2022-01-01 2022-01-07
# 2 Dog 2022-01-02 2022-01-07
# 3 Cat 2022-12-07 2022-12-04
# 4 Cat 2022-12-07 2022-12-10
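In the meantime, the "closest in either direction" result can probably be emulated by hand. The following is only a sketch of one possible workaround, not a proposal for how dplyr should implement it: it cross-joins the two tables, computes the absolute gap between the dates, and keeps the rows with the smallest gap for each row of df1 (ties are kept). The names row_id, gap and closest_either are made up for illustration, and cross_join() is assumed to be available (dplyr 1.1.0 or later).

```r
library(dplyr)

# Same data as above, built with base as.Date() so lubridate is not needed.
df1 <- data.frame(
  var2 = c("Dog", "Dog", "Cat"),
  Date = as.Date(c("2022-01-01", "2022-01-02", "2022-12-07"))
)
df2 <- data.frame(
  Date = as.Date(c("2022-01-07", "2022-12-04", "2022-12-10"))
)

closest_either <- df1 %>%
  mutate(row_id = row_number()) %>%                  # hypothetical helper: identifies each df1 row
  cross_join(df2, suffix = c(".x", ".y")) %>%        # every df1 row paired with every df2 row
  mutate(gap = abs(as.numeric(Date.y - Date.x))) %>% # absolute distance in days
  group_by(row_id) %>%
  filter(gap == min(gap)) %>%                        # keep the nearest match(es), ties included
  ungroup() %>%
  arrange(row_id, Date.y) %>%
  select(var2, Date.x, Date.y)

print(closest_either)
```

Because this builds the full cross product, it only scales to small tables, which is part of why built-in support in closest() would be preferable.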