Memory allocation in curl() #399

Closed
@jeroen

Description

See r-lib/httr2#704 (comment). TL;DR: reading from non-blocking connections allocates a lot of memory. The bench::mark() result shows that this memory is allocated by R in readBin().

The problem is that for non-blocking connections, readBin() can be called a million times without any data coming in. This results in high memory usage and busy waiting.

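stream_data <- function() {
  url <- "https://jeroen.github.io/data/nycflights13.json"
  con <- curl::curl(url)
  # Non-blocking mode: readBin() returns immediately, even with no data
  open(con, "rb", blocking = FALSE)
  on.exit(close(con))

  # Busy loop: polls as fast as possible until the download completes
  while (isIncomplete(con)) {
    readBin(con, raw(), 10 * 1024)
  }
}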

bm <- bench::mark(stream_data(), iterations = 1, filter_gc = FALSE)
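
One possible way to avoid the busy loop, sketched here as an illustration and not necessarily the fix that belongs in curl itself, is to pause briefly whenever a read comes back empty. The helper name `read_with_backoff` and its `idle_sleep` parameter are hypothetical. The demo uses a base R rawConnection() so that it runs without network access.

```r
# Hypothetical helper (not part of curl): drain a connection in chunks,
# sleeping briefly whenever a read returns no data instead of polling
# again immediately.
read_with_backoff <- function(con, chunk_size = 10 * 1024, idle_sleep = 0.01) {
  chunks <- list()
  repeat {
    buf <- readBin(con, raw(), chunk_size)
    if (length(buf) > 0) {
      # Data arrived: keep it and try again right away
      chunks[[length(chunks) + 1L]] <- buf
    } else if (isIncomplete(con)) {
      # No data yet but the transfer is still running: back off
      Sys.sleep(idle_sleep)
    } else {
      # No data and nothing pending: done
      break
    }
  }
  if (length(chunks) > 0) unlist(chunks) else raw()
}

# Offline demo with an in-memory connection
con <- rawConnection(as.raw(1:100))
out <- read_with_backoff(con, chunk_size = 16)
close(con)
```

With a curl connection opened via open(con, "rb", blocking = FALSE), the same helper would be called as read_with_backoff(con). The sleep interval trades latency for fewer empty readBin() calls, so the value of 0.01 seconds is only a starting assumption.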
