Releases: tidyverse/vroom

vroom 1.7.0

27 Jan 15:50

  • vroom.tidyverse.org is the new home of vroom's website, catching up to the much earlier move (April 2022) of vroom's GitHub repository from the r-lib organization to the tidyverse. The motivation for that move was to make it easier to transfer issues between vroom and readr, two closely connected packages.

  • The path parameter has been removed from vroom_write(). This parameter was deprecated in vroom 1.5.0 (2021-06-14) in favor of the file parameter (#575).

  • The function vroom_altrep_opts() and the argument vroom(altrep_opts =) have been removed. They were deprecated in favor of vroom_altrep() and altrep =, respectively, in v1.2.0 (2020-01-13). Also applies to vroom_fwf(altrep_opts =) and vroom_lines(altrep_opts =) (#575).

  • vroom() now supports reading from a remote file that uses any of the supported compression formats, by downloading to a temporary (compressed) file. This is a new feature for .bz2, .xz, and .zip and fixes .gz bugs arising from problematic behaviour of base::gzcon() (#400, #553, tidyverse/readr#1555, tidyverse/readr#1553).

  • mtcars.csv.tar.gz and mtcars-concatenated.csv.gz are 2 new example files that are handy internally, at least, for exercising code related to reading compressed files.

  • vroom now offers to install the archive package if it's needed to complete the user's request (tidyverse/readr#1334).

  • vroom takes the recommended approach for phasing out usage of the non-API entry points SETLENGTH, SET_TRUELENGTH, and ATTRIB (#582, #596).

  • Unclosed quotes (e.g., a,b,"c with no closing ") now trigger a warning, instead of silent data truncation. The affected row is also newly included in the returned data, which should facilitate troubleshooting (#484, tidyverse/readr#1539, tidyverse/readr#1491).

  • Columns specified as having type "number" (requested via col_number() or "number" or 'n') or "skip" (requested via col_skip() or "skip" or _ or -) now work in the case where 0 rows of data are parsed (#427, #540, #548).

  • vroom(), vroom_lines(), and vroom_fwf() now close and destroy (instead of leak) the connection in the case where opening the connection fails due to, e.g., a nonexistent URL (#488).

  • If there is insufficient space for the tempfile used when reading from a connection (affects delimited and fixed width parsing, from compressed files and URLs), that is now reported as an error and no longer segfaults (#544).

  • vroom(..., n_max = 0, col_names = c(...)) with a connection (compressed file, URL, raw connection) no longer produces a "negative length vectors are not allowed" error or crashes R (#539).

  • vroom_fwf(..., n_max = 0) with a connection no longer segfaults (#590).
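
For code that still uses the removed arguments, the migration is a rename. A minimal sketch, assuming vroom 1.5.0 or later is installed; the temporary file and the use of `mtcars` are illustrative only:

```r
library(vroom)

# Write with `file =`; the `path =` argument no longer exists in 1.7.0.
tmp <- tempfile(fileext = ".csv")
vroom_write(mtcars, file = tmp, delim = ",")

# Control ALTREP with `altrep =`, the replacement for `altrep_opts =`.
# FALSE materializes every column eagerly; vroom_altrep() allows finer control.
dat <- vroom(tmp, delim = ",", altrep = FALSE, show_col_types = FALSE)
dim(dat)

unlink(tmp)
```

The same `altrep =` argument applies to vroom_fwf() and vroom_lines().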

vroom 1.6.7

29 Nov 03:05

  • locale(encoding =) now warns, instead of erroring, when the encoding cannot be found in the return value of iconvlist(). This removes an unnecessary blocker on platforms like Alpine Linux, where that output doesn't reflect actual capabilities.

  • vroom no longer uses STDVEC_DATAPTR() and takes the recommended approach for phasing out usage of DATAPTR() (#561).

  • problems() works normally for vroom-produced objects, even if readr is attached (#534, #554).

  • problems() are no longer corrupted if the offending data frame is partially materialized, e.g. by viewing a subset, before calling problems() (#535).
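
As a rough illustration of the problems() workflow these fixes concern, the sketch below parses a deliberately malformed value and then retrieves the recorded problems. The inline data and column types are made up for the example; the namespace-qualified call is only one way to keep the call unambiguous when readr is attached too.

```r
library(vroom)

# Literal data wrapped in I(); "x" cannot be parsed as an integer.
dat <- vroom(
  I("a,b\n1,2\nx,4\n"),
  delim = ",",
  col_types = "ii",
  show_col_types = FALSE
)

# Namespace-qualify the call so it is unambiguous when readr is also attached.
probs <- vroom::problems(dat)
probs
```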

vroom 1.6.6

19 Sep 11:19

  • Fixed a bad URL in the README at CRAN's request.

vroom 1.6.5

05 Dec 23:51

  • Internal changes requested by CRAN around format specification (#524).

vroom 1.6.4

02 Oct 15:02

  • It is now possible (again?) to read from a list of connections (@bairdj, #514).

  • Internal change for compatibility with cpp11 >= 0.4.6 (@DavisVaughan, #512).

vroom 1.6.3

28 Apr 22:36

v1.6.3

Increment version number to 1.6.3

vroom 1.6.1

22 Jan 22:46

  • str() now works in a colorized context in the presence of a column of class integer64, i.e. parsed with col_big_integer() (@bart1, #477).

  • The embedded implementation of the Grisu algorithm for printing floating point numbers now uses snprintf() instead of sprintf() and likewise for vroom's own code (@jeroen, #480).

vroom 1.6.0

30 Sep 15:47

  • vroom(col_select=) now handles column selection by numeric position when id column is provided (#455).

  • vroom(id = "path", col_select = a:c) is treated like vroom(id = "path", col_select = c(path, a:c)). If an id column is provided, it is automatically included in the output (#416).

  • vroom_write(append = TRUE) does not modify an existing file when appending an empty data frame. In particular, it does not overwrite (delete) the existing contents of that file (tidyverse/readr#1408, #451).

  • vroom::problems() now defaults to .Last.value for its primary input, similar to how readr::problems() works (#443).

  • The warning that indicates the existence of parsing problems has been improved, which should make it easier for the user to follow-up (tidyverse/readr#1322).

  • vroom() reads more reliably from filepaths containing non-ascii characters, in a non-UTF-8 locale (#394, #438).

  • vroom_format() and vroom_write() only quote values that contain a delimiter, quote, or newline. Specifically values that are equal to the na string (or that start with it) are no longer quoted (#426).

  • Fixed segfault when reading in multiple files and the first file has only a header row of column names, but subsequent files have at least one row (#430).

  • Fixed segfault when vroom_format() is given an empty data frame (#425).

  • Fixed a segfault that could occur when the final field of the final line is missing and the file also does not end in a newline (#429).

  • Fixed recursive garbage collection error that could occur during vroom_write() when output_column() generates an ALTREP vector (#389).

  • vroom_progress() uses rlang::is_interactive() instead of base::interactive().

  • col_factor(levels = NULL) honors the na strings of vroom() and its own include_na argument, as described in the docs, and now reproduces the behaviour of readr's first edition parser (#396).
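
Two of the behaviours above, the append fix and the automatic inclusion of the id column, can be checked with a short script. This is a sketch that assumes vroom 1.6.0 or later; the file names and data are hypothetical.

```r
library(vroom)

dir <- tempfile("vroom-demo-")
dir.create(dir)
f1 <- file.path(dir, "one.csv")
f2 <- file.path(dir, "two.csv")
vroom_write(data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8), f1, delim = ",")
vroom_write(data.frame(a = 9L, b = 10L, c = 11L, d = 12L), f2, delim = ",")

# Appending a zero-row data frame should leave the file as it was.
before <- readLines(f1)
empty <- data.frame(a = integer(), b = integer(), c = integer(), d = integer())
vroom_write(empty, f1, delim = ",", append = TRUE)
after <- readLines(f1)
identical(before, after)

# The id column is kept even though col_select does not mention it.
out <- vroom(
  c(f1, f2),
  delim = ",",
  id = "path",
  col_select = a:c,
  altrep = FALSE,
  show_col_types = FALSE
)
names(out)

unlink(dir, recursive = TRUE)
```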

vroom 1.5.7

30 Nov 14:38

  • Jenny Bryan is now the official maintainer.

  • Fix uninitialized bool detected by CRAN's UBSAN check (#386)

  • Fix buffer overflow when trying to parse an integer field that is over 64 characters long (tidyverse/readr#1326)

  • Fix subset indexing when indexes span a file boundary multiple times (#383)

vroom 1.5.6

12 Nov 17:51

  • vroom(col_select=) now works if col_names = FALSE as intended (#381)

  • vroom(n_max=) now correctly handles cases when reading from a connection and the file does not end with a newline (tidyverse/readr#1321)

  • vroom() no longer issues a spurious warning when the parsing needs to be restarted due to the presence of embedded newlines (tidyverse/readr#1313)

  • Fix performance issue when materializing subsetted vectors (#378)

  • vroom_format() now uses the same internal multi-threaded code as vroom_write(), improving its performance in most cases (#377)

  • vroom_fwf() no longer omits the last line if it does not end with a newline (tidyverse/readr#1293)

  • Empty files or files with only a header line and no data no longer cause a crash if read with multiple files (tidyverse/readr#1297)

  • Files with a header but no contents, or an empty file if col_names = FALSE, no longer cause a hang when progress = TRUE (tidyverse/readr#1297)

  • Commented lines with comments at the end of lines no longer hang R (tidyverse/readr#1309)

  • Comment lines containing unpaired quotes are no longer treated as unterminated quotations (tidyverse/readr#1307)

  • Values with only an Inf or NaN prefix but additional data afterwards, like Inform, are no longer inappropriately guessed as doubles (tidyverse/readr#1319)

  • Time types now support %h format to denote hour durations greater than 24, like readr (tidyverse/readr#1312)
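
The %h format and the stricter Inf/NaN guessing can be tried with inline data. The sketch below assumes vroom 1.5.6 or later; the column names and values are invented for illustration.

```r
library(vroom)

# %h accepts hour counts above 23, which suits elapsed-time durations.
dur <- vroom(
  I("id,elapsed\n1,25:30:00\n"),
  delim = ",",
  col_types = cols(id = col_integer(), elapsed = col_time("%h:%M:%S")),
  altrep = FALSE
)
as.numeric(dur$elapsed)  # duration expressed in seconds

# "Inform" merely starts with "Inf", so the column is guessed as character.
txt <- vroom(
  I("id,word\n1,Inform\n"),
  delim = ",",
  altrep = FALSE,
  show_col_types = FALSE
)
class(txt$word)
```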