Open
Labels
help wanted
Description
`all(!is.na(x))` requires checking all elements of `x`, but one could use `!anyNA(x)` instead for better performance (maybe harder to read, cf. https://lintr.r-lib.org/dev/reference/outer_negation_linter.html; edit: my bad, got confused).
Some benchmarks on integer and character vectors:
n <- 1e7
##### Integers
no_na <- sample(1:100, n, replace = TRUE)
some_na <- sample(c(1:100, NA), n, replace = TRUE)
only_na <- rep(NA, n) # note: logical NA; rep(NA_integer_, n) would give an integer vector
bench::mark(
all(!is.na(no_na)),
!anyNA(no_na)
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(no_na)) 38.96ms 40.7ms 23.6 76.3MB 23.6
#> 2 !anyNA(no_na) 2.72ms 2.8ms 346. 0B 0
bench::mark(
all(!is.na(some_na)),
!anyNA(some_na)
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(some_na)) 30.3ms 33.1ms 28.6 76.3MB 28.6
#> 2 !anyNA(some_na) 0 100ns 14724688. 0B 0
bench::mark(
all(!is.na(only_na)),
!anyNA(only_na)
)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(only_na)) 31.2ms 32ms 29.8 76.3MB 29.8
#> 2 !anyNA(only_na) 0 100ns 7926856. 0B 0
##### Strings
no_na <- sample(letters, n, replace = TRUE)
some_na <- sample(c(letters, NA), n, replace = TRUE)
only_na <- rep(NA, n) # note: logical NA; rep(NA_character_, n) would give a character vector
bench::mark(
all(!is.na(no_na)),
!anyNA(no_na)
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(no_na)) 33.18ms 34.1ms 29.0 76.3MB 20.7
#> 2 !anyNA(no_na) 4.17ms 5.16ms 182. 0B 0
bench::mark(
all(!is.na(some_na)),
!anyNA(some_na)
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(some_na)) 25.4ms 26.9ms 37.0 76.3MB 22.2
#> 2 !anyNA(some_na) 0 1ns 17097081. 0B 0
bench::mark(
all(!is.na(only_na)),
!anyNA(only_na)
)
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 all(!is.na(only_na)) 24.4ms 30.2ms 33.3 76.3MB 18.5
#> 2 !anyNA(only_na) 0 100ns 10787728. 0B 0
The two forms are also equivalent on length-0 input:
all(!is.na(character()))
#> [1] TRUE
!anyNA(character())
#> [1] TRUE
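As a rough sanity check beyond the empty case, the sketch below compares both forms on a handful of small atomic vectors. The `inputs` list and the `same` vector are illustrative names only, and the check covers atomic vectors, not lists or data frames.

```r
# A few atomic vectors with and without missing values (illustrative only).
inputs <- list(
  integer(),
  c(1L, 2L),
  c(1L, NA),
  NA,
  c("a", NA_character_),
  c(1.5, NaN)
)

# For each input, compare the result of the two spellings.
same <- vapply(
  inputs,
  function(x) identical(all(!is.na(x)), !anyNA(x)),
  logical(1)
)

# Errors if any input gives different results for the two forms.
stopifnot(all(same))
```

This only shows agreement on these inputs; it is not a proof of equivalence for every class with custom `is.na()` or `anyNA()` methods.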
Current behaviour of `lintr`:
lintr::lint("all(!is.na(x))\n")
#> ℹ No lints found.
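To make the target pattern concrete, here is a minimal base-R sketch of the check a linter would need to perform. It is not how `lintr` is implemented (lintr works on an XML parse tree); `is_all_not_is_na()` is a hypothetical helper that inspects a single quoted call.

```r
# Hypothetical helper, not part of lintr: TRUE if `e` is a call of the
# exact shape all(!is.na(<anything>)) with a single argument.
is_all_not_is_na <- function(e) {
  is.call(e) &&
    identical(e[[1]], as.name("all")) &&
    length(e) == 2 &&
    is.call(e[[2]]) &&
    identical(e[[2]][[1]], as.name("!")) &&
    is.call(e[[2]][[2]]) &&
    identical(e[[2]][[2]][[1]], as.name("is.na"))
}

is_all_not_is_na(quote(all(!is.na(x))))            # TRUE
is_all_not_is_na(quote(!anyNA(x)))                 # FALSE
is_all_not_is_na(parse(text = "all(is.na(x))")[[1]]) # FALSE
```

A real linter would also need to walk nested expressions and report source positions, which this sketch deliberately leaves out.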
This pattern has more than 10k matches on GitHub (although I rarely use the code search feature, so I can't judge how large that is): https://github.com/search?q=language%3AR+%22all%28%21is.na%28%22&type=code