Description
I am trying to use git2r to extract individual files from repository history so that I can compare R objects across commits (for instance, comparing model performance across different versions of models saved as .rds files in the repo). In some cases I am extracting subdirectories, and so working through the git_tree recursively. I do not want to overwrite the working copies of these files, but to copy them to a location of my choosing.
I am able to use git2r::content() to read individual text-file blobs, which can then be written to files. However, it returns NA if the blob/file is binary. I would like to be able to either (a) return a raw vector of binary data from git2r::content(), or (b) copy the blobs to files directly without reading them in, perhaps by having a version of the C function blob_content_to_file exposed to the R API. The latter would be more efficient as it avoids the read-write cycle through R, though I think the former would be easiest to implement.
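For reference, here is a minimal sketch of what I am doing at the moment for text files. The helper name export_text_blobs and its arguments are my own and not part of git2r. It relies on lookup(), tree(), ls_tree(), is_binary() and content() as I understand them, and it assumes that the path column returned by ls_tree() is the containing directory with a trailing slash (empty for the root). Binary blobs such as .rds files are simply skipped, which is the gap this request is about.

```r
library(git2r)

# Hypothetical helper (not part of git2r): copy every text blob reachable
# from one commit into `dest`, leaving the working copy untouched.
export_text_blobs <- function(repo_path, commit_sha, dest) {
  repo <- repository(repo_path)
  commit <- lookup(repo, commit_sha)

  # List the commit's tree recursively and keep only file entries.
  entries <- ls_tree(tree(commit), repo = repo, recursive = TRUE)
  entries <- entries[entries$type == "blob", , drop = FALSE]

  for (i in seq_len(nrow(entries))) {
    blob <- lookup(repo, entries$sha[i])

    # content() cannot return binary data here, so skip those blobs.
    if (is_binary(blob)) next

    # Assumption: `path` is the directory part with a trailing slash.
    out <- file.path(dest, paste0(entries$path[i], entries$name[i]))
    dir.create(dirname(out), recursive = TRUE, showWarnings = FALSE)
    writeLines(content(blob), out)
  }
  invisible(dest)
}
```

With something like option (a), the is_binary() branch could write the raw vector with writeBin() instead of skipping the file.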
I may be able to implement the latter as a PR, but my C skills are limited. If I can and you are interested, would you prefer content() to return a raw vector for binary data, or for content() to stay type-stable and a separate content_raw() function be used for binary files?
Alternatively, there may be a way to do this with checkout() or another function that I've missed, but I've not figured it out.
Thanks for this excellent and long-maintained package!