dbplyr 1.4.0
Breaking changes
- Error: `con` must not be NULL: If you see this error, it probably means
  that you have forgotten to pass `con` down to a dbplyr function.
  Previously, dbplyr defaulted to using `simulate_dbi()`, which introduced
  subtle escaping bugs. (It's also possible I have forgotten to pass it
  somewhere that the dbplyr tests don't pick up, so if you can't figure it
  out, please let me know.)
- Subsetting (`[[`, `$`, and `[`) functions are no longer evaluated locally.
  This makes the translation more consistent and enables useful new idioms
  for modern databases (#200).
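
To illustrate the first point, here is a minimal sketch of passing a connection explicitly. It assumes that a simulated connection from `simulate_dbi()` stands in for a real DBI connection, and the input values are made up for illustration.

```r
library(dbplyr)

# Assumption: a simulated connection stands in for a real DBI connection.
con <- simulate_dbi()

# Pass `con` explicitly so that escaping follows the rules of that backend.
escaped_string <- escape("it's", con = con)
escaped_ident <- escape(ident("my col"), con = con)

# translate_sql() takes the connection in the same way.
translated <- translate_sql(x + 1, con = con)

print(escaped_string)
print(escaped_ident)
print(translated)
```

With a real database you would pass the connection returned by `DBI::dbConnect()` instead of the simulated one.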
New features
- MySQL/MariaDB (https://mariadb.com/kb/en/library/window-functions/)
  and SQLite (https://www.sqlite.org/windowfunctions.html) translations gain
  support for window functions, available in MariaDB 10.2, MySQL 8.0, and
  SQLite 3.25 (#191).
- Overall, dplyr generates many fewer subqueries:
  - Joins and semi-joins no longer add an unneeded subquery (#236). This is
    facilitated by the new `bare_identifier_ok` argument to `sql_render()`;
    the previous argument was called `root` and confused me.
  - Many sequences of `select()`, `rename()`, `mutate()`, and `transmute()`
    can be collapsed into a single query, instead of always generating a
    subquery (#213).
- New `vignette("sql")` describes some advantages of dbplyr over SQL (#205)
  and gives some advice about how to write literal SQL inside of dplyr,
  when you need to (#196).
- New `vignette("reprex")` gives some hints on creating reprexes that work
  anywhere (#117). This is supported by a new `tbl_memdb()` that
  matches the existing `tbl_lazy()`.
- All `..._join()` functions gain an `sql_on` argument that allows specifying
  arbitrary join predicates in SQL code (#146, @krlmlr).
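
A rough sketch of the `sql_on` argument follows. The table and column names are hypothetical, and it assumes that the `LHS` and `RHS` aliases refer to the left and right tables inside the predicate, as the join documentation describes. The tables are simulated, so no database is needed.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Hypothetical tables, backed by a simulated connection.
orders <- tbl_lazy(data.frame(order_day = 1:3), con = simulate_dbi())
promos <- tbl_lazy(data.frame(start_day = 2:4), con = simulate_dbi())

# Assumption: `LHS` and `RHS` name the left and right tables in `sql_on`.
joined <- left_join(orders, promos, sql_on = "LHS.order_day < RHS.start_day")

# Render the generated SQL as a string-like object and inspect it.
join_sql <- sql_render(joined)
print(join_sql)
```

Because the predicate is passed through as literal SQL, it is not translated or checked by dbplyr; any mistake in it will only surface when the database runs the query.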
SQL translations
- New translations for some lubridate functions: `today()`, `now()`,
  `year()`, `month()`, `day()`, `hour()`, `minute()`,
  `second()`, `quarter()`, `yday()` (@colearendt, @derekmorr). Also added
  new translation for `as.POSIXct()`.
- New translations for stringr functions: `str_c()`, `str_sub()`,
  `str_length()`, `str_to_upper()`, `str_to_lower()`, and `str_to_title()`
  (@colearendt). Non-translated stringr functions throw a clear error.
- New translations for bitwise operations: `bitwNot()`, `bitwAnd()`,
  `bitwOr()`, `bitwXor()`, `bitwShiftL()`, and `bitwShiftR()`. Unlike the
  base R functions, the translations do not coerce arguments to integers
  (@davidchall, #235).
- New translation for `x[y]` to `CASE WHEN y THEN x END`. This enables
  `sum(a[b == 0])` to work as you expect from R (#202). `y` needs to be
  a logical expression; if not, you will likely get a type error from your
  database.
- New translations for `x$y` and `x[["y"]]` to `x.y`, enabling you to index
  into nested fields in databases that provide them (#158).
- The `.data` and `.env` pronouns of tidy evaluation are correctly translated
  (#132).
- New translation for `median()` and `quantile()`. Works for all ANSI
  compliant databases (SQL Server, Postgres, MariaDB, Teradata) and has
  custom translations for Hive. Thanks to @edavidaja for researching the SQL
  variants! (#169)
- `na_if()` is correctly translated to `NULLIF()` (rather than `NULL_IF`)
  (#211).
- `n_distinct()` translation throws an error when given more than one
  argument (#101, #133).
- New default translations for `paste()`, `paste0()`, and the hyperbolic
  functions (these previously were only available for ODBC databases).
- Corrected translations of `pmin()` and `pmax()` to `LEAST()` and
  `GREATEST()` for ANSI compliant databases (#118), to `MIN()` and `MAX()`
  for SQLite, and to an error for SQL server.
- New translation for `switch()` to the simple form of `CASE WHEN` (#192).
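
A few of these translations can be inspected with `translate_sql()`. This is a minimal sketch that assumes a simulated connection; the column names `a`, `b`, and `x` are placeholders, and the exact SQL text may differ between backends and dbplyr versions.

```r
library(dbplyr)

# Assumption: a simulated connection stands in for a real database.
con <- simulate_dbi()

# `window = FALSE` asks for the aggregate (summarise-style) translation.
subset_sum <- translate_sql(sum(a[b == 0], na.rm = TRUE), con = con, window = FALSE)

# na_if() maps to NULLIF(), and switch() maps to a simple CASE expression.
null_if <- translate_sql(na_if(x, 0), con = con)
simple_case <- translate_sql(switch(x, a = 1L, b = 2L), con = con)

print(subset_sum)
print(null_if)
print(simple_case)
```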
SQL simulation
SQL simulation makes it possible to see what SQL dbplyr will generate from your R code, without having an active database connection, and is used for testing and generating reprexes.
- SQL simulation has been overhauled. It now works reliably, is better
  documented, and always uses ANSI escaping (i.e. `` ` `` for field
  names and `'` for strings).
- `tbl_lazy()` now actually puts a `dbplyr::src` in the `$src` field. This
  shouldn't affect any downstream code unless you were previously working
  around this weird difference between `tbl_lazy` and `tbl_sql` classes.
  It also includes the `src` class in its class, and when printed,
  shows the generated SQL (#111).
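
As a sketch of how simulation might be used in a reprex, the example below builds a lazy table on a simulated SQLite backend and renders the query. The data frame and its column names are hypothetical; only the column names and types matter, since nothing is sent to a database.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr)

# Hypothetical data; the values are never sent anywhere.
sales <- data.frame(region = c("north", "south"), amount = c(10, 20))

# tbl_lazy() wraps the data frame with a simulated backend, here SQLite.
lazy_sales <- tbl_lazy(sales, con = simulate_sqlite())

totals <- lazy_sales %>%
  group_by(region) %>%
  summarise(total = sum(amount, na.rm = TRUE))

# Render the SQL without connecting to a database.
totals_sql <- sql_render(totals)
print(totals_sql)
```

Swapping `simulate_sqlite()` for another `simulate_*()` function is one way to compare how different backends would translate the same pipeline.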
Database specific improvements
- MySQL/MariaDB
  - Translations also applied to connections via the odbc package
    (@colearendt, #238).
  - Basic support for regular expressions via `str_detect()` and
    `str_replace_all()` (@colearendt, #168).
  - Improved translation for `as.logical(x)` to `IF(x, TRUE, FALSE)`.
- Oracle
- Postgres
  - Basic support for regular expressions via `str_detect()` and
    `str_replace_all()` (@colearendt, #168).
- SQLite
  - `explain()` translation now generates `EXPLAIN QUERY PLAN` which
    generates a higher-level, more human friendly explanation.
- SQL server
  - Improved translation for `as.logical(x)` to `CAST(x as BIT)` (#250).
  - Translates `paste()`, `paste0()`, and `str_c()` to `+`.
  - `copy_to()` method applies temporary table name transformation
    earlier so that you can now overwrite temporary tables (#258).
  - `db_write_table()` method uses correct argument name for
    passing along field types (#251).
Minor improvements and bug fixes
- Aggregation functions only warn once per session about the use of
  `na.rm = TRUE` (#216).
- Table names generated by `random_table_name()` have the prefix
  "dbplyr_", which makes it easier to find them programmatically
  (@mattle24, #111).
- Functions that are only available in a windowed (`mutate()`) query now
  throw an error when called in an aggregate (`summarise()`) query (#129).
- `arrange()` understands the `.by_group` argument, making it possible
  to sort by groups if desired. The default is `FALSE` (#115).
- `distinct()` now handles computed variables like
  `distinct(df, y = x + y)` (#154).
- `escape()`, `sql_expr()` and `build_sql()` no longer accept `con = NULL` as
  a shortcut for `con = simulate_dbi()`. This made it too easy to forget to
  pass `con` along, introducing extremely subtle escaping bugs. `win_over()`
  gains a `con` argument for the same reason.
- New `escape_ansi()` always uses ANSI SQL 92 standard escaping (for use
  in examples and documentation).
- `mutate(df, x = NULL)` drops `x` from the output, just like when working
  with local data frames (#194).
- `partial_eval()` processes inlined functions (including rlang lambda
  functions). This makes dbplyr work with more forms of scoped verbs like
  `df %>% summarise_all(~ mean(.))`, `df %>% summarise_all(list(mean))`
  (#134).
- `sql_aggregate()` now takes an optional argument `f_r` for passing to
  `check_na_rm()`. This allows the warning to show the R function name rather
  than the SQL function name (@sverchkov, #153).
- `sql_infix()` gains a `pad` argument for the rare operator that doesn't
  need to be surrounded by spaces.
- `sql_prefix()` no longer turns SQL functions into uppercase, allowing for
  correct translation of case-sensitive SQL functions (#181, @mtoto).
- `summarise()` gives a clear error message if you refer to a variable
  created in that same `summarise()` (#114).
- New `sql_call2()` which is to `rlang::call2()` as `sql_expr()` is to
  `rlang::expr()`.
- `show_query()` and `explain()` use `cat()` rather than message.
- `union()`, `union_all()`, `setdiff()` and `intersect()` do a better job
  of matching columns across backends (#183).