Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query extremely slow when fetching many rows #61

Closed
renkun-ken opened this issue Sep 28, 2017 · 3 comments
Closed

Query extremely slow when fetching many rows #61

renkun-ken opened this issue Sep 28, 2017 · 3 comments

Comments

@renkun-ken
Copy link
Contributor

renkun-ken commented Sep 28, 2017

DB Server: MySQL Server 5.7-15 on Fedora 24
Client: RMariaDB built against libmariadb-client-lgpl-dev on Ubuntu 16.04.3

When I query a table of more than 8M rows and 23 columns, dbReadTable() will hang forever, but querying with limit 100 will return in milliseconds or limit 100000 in 4 seconds while limit 1000000 in 400 seconds. To query the whole table, it will cost a ridiculous amount of time. But with v0.11-5 and former RMySQL, querying this very table will only cost less than a minute.

Following are tests of query the same table with different limit using each recent versions:

devtools::install_github("rstats-db/[email protected]"):

> library(DBI)
> con <- dbConnect(RMySQL::MySQL(), ...)
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100"))
   user  system elapsed 
  0.000   0.000   0.003 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000"))
   user  system elapsed 
  0.008   0.000   0.010 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 10000"))
   user  system elapsed 
  0.056   0.004   0.065 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100000"))
   user  system elapsed 
  0.696   0.008   0.721 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000000"))
   user  system elapsed 
  5.284   0.332   5.634 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES"))
   user  system elapsed 
 56.372   3.040  59.463 

devtools::install_github("rstats-db/[email protected]"):

> library(DBI)
> con <- dbConnect(RMariaDB::MariaDB(), ...)
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100"))
   user  system elapsed 
  0.004   0.000   0.004 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000"))
   user  system elapsed 
  0.004   0.000   0.010 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 10000"))
   user  system elapsed 
  0.052   0.008   0.062 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100000"))
   user  system elapsed 
  0.692   0.024   0.762 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000000"))
   user  system elapsed 
  5.788   0.224   6.103 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES"))
   user  system elapsed 
 59.632   2.532  62.270 

devtools::install_github("rstats-db/[email protected]"):

> library(DBI)
> con <- dbConnect(RMariaDB::MariaDB(), ...)
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100"))
   user  system elapsed 
  0.000   0.000   0.005 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000"))
   user  system elapsed 
  0.008   0.000   0.010 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 10000"))
   user  system elapsed 
  0.056   0.000   0.065 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100000"))
   user  system elapsed 
  0.680   0.024   0.756 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000000"))
   user  system elapsed 
  5.468   0.276   5.799 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES"))
   user  system elapsed 
 59.200   2.920  62.183 

devtools::install_github("rstats-db/[email protected]"):

> library(DBI)
> con <- dbConnect(RMariaDB::MariaDB(), ...)
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100"))
   user  system elapsed 
  0.000   0.004   0.005 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000"))
   user  system elapsed 
  0.008   0.000   0.016 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 10000"))
   user  system elapsed 
  0.068   0.004   0.108 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 100000"))
   user  system elapsed 
  4.744   0.040   4.820 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES limit 1000000"))
   user  system elapsed 
423.176   0.448 423.683 
> system.time(data <- dbGetQuery(con, "select * from ASHAREEODPRICES"))
<hangs for more than 10 minutes>
@renkun-ken renkun-ken changed the title Query hangs when fetching many rows Query extremely slow when fetching many rows Sep 28, 2017
@krlmlr krlmlr closed this as completed in 53e571d Sep 30, 2017
@krlmlr
Copy link
Member

krlmlr commented Sep 30, 2017

Thanks, this and #60 gave me enough reason to turn off result prefetching. This means that we can't reliably use Connector/C until the fix hits the repos, which may take a while. Please open a new issue if problems persist.

krlmlr added a commit that referenced this issue Oct 1, 2017
- Test almost all test cases of the DBI specification.
- Queries not supported by the prepared statement protocol can now be run via `dbExecute()` or `dbSendStatement()`, the function `mariadbExecQuery()` has been removed (#28).
- Avoid storing all results on the client due to instability with large datasets (#60, #61).
@renkun-ken
Copy link
Contributor Author

renkun-ken commented Oct 2, 2017

I test with the v0.11-9 against libmysqlclient-dev and the query performance is good now. Thanks!

@github-actions
Copy link

github-actions bot commented Dec 7, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants