dbplyr 2.0.0
(To become dbplyr 2.0.0)
dplyr 1.0.0 compatibility
-
across()is now translated into individual SQL statements (#480). -
rename()andselect()support dplyr 1.0.0 tidyselect syntax (apart from
predicate functions which can't easily work on computed queries) (#502). -
relocate()makes it easy to move columns (#494) andrename_with()makes
it easy to rename columns programmatically (#502). -
slice_min(),slice_max(), andslice_order()are now supported.
slice_head()andslice_tail()throw clear error messages (#394)
SQL generation
-
Documentation has been radically improved with new topics for each major
verb and each backend giving more details about the SQL translation. -
intersect(),union()andsetdiff()gain anallargument to add the
ALLargument (#414). -
Join functions gains a
na_matchesargument that allows you to control
whether or notNA(NULL) values match otherNAvalues. The default is
"never", which is the usual behaviour in databases. You can set
na_matches = "na"to match R's usual join behaviour (#180). Additional
arguments error (instead of being silently swallowed) (#382). -
Joins now only use aliases where needed to disambiguate columns; this should
make generated queries more readable. -
Subqueries no longer include an
ORDER BYclause. This is not part of the
SQL spec, and has very limited support across databases. Now such queries
generate a warning suggesting that you move yourarrange()call later in
the pipeline (#276). (There's one exception:ORDER BYis still generated
ifLIMITis present; this tends to affect the returns rows but not
necessarily their order). -
Subquery names are now scoped within the query. This makes query text
deterministic which helps some query optimisers/cachers (#336). -
sql_optimise()now can partially optimise a pipeline; due to an unfortunate
bug it previously gave up too easily. -
in_schema()quotes each input individually (#287) (usesql()to opt out
of quoting, if needed). AndDBI::Id()should work anywhere that
in_schema()does.
SQL translation
-
Experimental new SAP HANA backend (#233). Requires the latest version of odbc.
-
All backends:
-
You can now use
::in translations, so that (e.g.)dbplyr::n()is
translated tocount(*)(#207). -
[[can now also translate numeric indices (#520). -
%/%now generates a clear error message; previously it was translated to
/which is not correct (#108). -
n()is translated tocount(*)instead ofcount()(#343). -
sub_str()translation is more consistent in edge cases (@ianmcook). -
All
median()(@lorenzwalthert, #483),pmin(),pmax()(#479),sd()
andvar()functions have anna.rmargument that warns once when not
TRUE. This makes them consistent withmean()andsum(). -
substring()is now translated the same way assubstr()(#378).
-
-
blob vectors can now be used with
!!and
!!!operators, for example infilter()(@okhoma, #433) -
MySQL uses standard SQL for index creation.
-
MS SQL translation does better a distinguishing between bit and boolean
(#377, #318).ifandifelseonce again generateIIF, creating
simpler expressions.as.*()function usesTRY_CAST()instead
ofCAST()for version 11+ (2012+) (@DavidPatShuiFong, #380). -
odbc no longer translates
count(); this was an accidental inclusion. -
Oracle translation now depends on Oracle 12c, and uses a "row-limiting"
clause forhead(). It gains translations fortoday()andnow(), and
improvedas.Date()translation (@rlh1994, #267). -
PostgreSQL: new translations for lubridate period functions
years(),
months(),days(), andfloor_date()(@bkkkk, #333) and stringr functions
str_squish(),str_remove(), andstr_remove_all()(@shosaco). -
New RedShift translations when used with
RPostgres::Redshift(). -
SQLite gains translations for lubridate functions
today(),now(),
year(),month(),day(),hour(),minute(),second(),yday()
(#262), and correct translation formedian()(#357).
Extensibility
If you are the author of a dbplyr backend, please see vignette("backend-2") for details.
-
New
dbplyr_edition()generic allows you to opt-in to the 2nd edition of
the dbplyr API. -
db_write_table()now callsDBI::dbWriteTable()instead of nine generics
that formerly each did a small part:db_create_indexes(),db_begin(),
db_rollback(),db_commit(),db_list_tables(),drop_drop_table(),
db_has_table(),db_create_table(), anddb_data_types(). You can
now delete the methods for these generics.db_query_rows()is no longer used; it appears that it hasn't been used
for some time, so if you have a method, you can delete it. -
DBI::dbQuoteIdentifier()is now used instead ofsql_escape_ident()and
DBI::dbQuoteString()instead ofsql_escape_string(). -
A number of
db_*generics have been replaced with new SQL generation
generics:dplyr::db_analyze()->dbplyr::sql_table_analyze()dplyr::db_create_index()->dbplyr::sql_table_index()dplyr::db_explain()->dbplyr::sql_queriy_explain()dplyr::db_query_fields()->dbplyr::sql_query_fields()dplyr::db_save_query()->dbplyr::sql_query_save()
This makes them easier to test and is an important part of the process of
moving all database generics in dbplyr (#284). -
A number of other generics have been renamed to facilitate the move from
dplyr to dbplyr:dplyr::sql_select()->dbplyr::sql_query_select()dplyr::sql_join()->dbplyr::sql_query_join()dplyr::sql_semi_join()->dbplyr::sql_query_semi_join()dplyr::sql_set_op()->dbplyr::sql_query_set_op()dplyr::sql_subquery()->dbplyr::sql_query_wrap()dplyr::db_desc()->dbplyr::db_connection_describe()
-
New
db_temporary_table()generic makes it easier to work with databases
that require temporary tables to be specially named. -
New
sql_expr_matches()generic allows databases to use more efficient
alternatives when determine if two values "match" (i.e. like equality but
a pair ofNULLs will also match). For more details, see
https://modern-sql.com/feature/is-distinct-from -
New
sql_join_suffix()allows backends to control the default suffixes
used (#254).
Minor improvements and bug fixes
-
All old lazy eval shims have been removed. These have been deprecated for
some time. -
Date-time escaping methods for Athena and Presto have moved to the packages
where they belong. -
Attempting to embed a Shiny reactive in a query now gives a helpful error
(#439). -
copy_lahman()andcopy_nycflights13()(and hencenycflights13_sqlite())
and friends now return DBI connections rather than the now deprecated
src_dbi()(#440). -
copy_to()can nowoverwritewhen table is specified with schema (#489),
and gains anin_transactionargument used to optionally suppress the
transaction wrapper (#368). -
distinct()no longer duplicates column if grouped (#354). -
transmute()now correctly tracks variables it needs when creating
subqueries (#313). -
mutate()grouping variables no longer generates a downstream error (#396) -
mutate()correctly generates subqueries when you re-use the same variable
three or more times (#412). -
window_order()overrides ordering, rather than appending to it.