
vcfR2tidy can't handle files with empty INFO #200

@alkaZeltser


First, thank you for a wonderful package!
Unfortunately, I have discovered a bug in an interesting edge case.
I have a VCF file whose INFO column is empty ("." in every row) and whose header accordingly contains no lines describing any INFO field (no ##INFO=<> lines).

This becomes a problem when calling the vcfR2tidy function. It generates the following error:
Error in strsplit(unlist(x), split = "=") : non-character argument

This error is resolved when a dummy INFO line is added to the @meta object.

I have traced the source to the following function, which parses the INFO and FORMAT header information and assumes that at least one line starting with ##INFO exists:

vcf_field_names <- function(x, tag = "INFO") {
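
To illustrate the kind of guard that would avoid this failure, here is a minimal sketch in base R. It is not vcfR's actual implementation: parse_tag_ids is a hypothetical name, and it only extracts the ID of each matching header line. The point is the early return when no line matches the tag, so later string handling never receives empty input.

```r
# Hypothetical sketch, not vcfR's code: extract the IDs of all header
# lines for a given tag, returning an empty vector when none exist.
parse_tag_ids <- function(meta, tag = "INFO") {
  # Keep only lines such as "##INFO=<ID=...>"
  hits <- grep(paste0("^##", tag, "="), meta, value = TRUE)

  # Guard: no matching header lines, so there is nothing to parse
  if (length(hits) == 0) {
    return(character(0))
  }

  # Pull the value that follows "<ID=" up to the next "," or ">"
  sub(".*<ID=([^,>]+).*", "\\1", hits)
}
```

With this guard, a header without any ##INFO lines yields an empty character vector instead of an error.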

Reproducible example:

library(vcfR)
data(vcfR_test)
## cause error ##
# remove all INFO lines in @meta object
INFO.meta.lines <- grepl("^##INFO", vcfR_test@meta)
vcfR_test@meta <- vcfR_test@meta[!INFO.meta.lines]
# remove all data from INFO column of @fix object
INFO.col.index <- 8
vcfR_test@fix[, INFO.col.index] <- rep(NA, nrow(vcfR_test@fix))

vcfR2tidy(vcfR_test)

Error in strsplit(unlist(x), split = "=") : non-character argument

## resolve error ##
dummy.INFO.line <- "##INFO=<ID=AF>"
vcfR_test@meta[length(vcfR_test@meta) + 1] <- dummy.INFO.line

vcfR2tidy(vcfR_test)

Extracting gt element GT
Extracting gt element GQ
Extracting gt element DP
Extracting gt element HQ
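
Until this is fixed, the workaround could be wrapped in a small helper that adds the dummy line only when it is needed. This is a sketch, and ensure_info_line is a hypothetical name, not a vcfR function. It operates on a plain character vector of meta lines, and the placeholder is the same dummy line used above.

```r
# Hypothetical helper (not part of vcfR): append a placeholder ##INFO line
# to a character vector of VCF meta lines only when none is present.
ensure_info_line <- function(meta, placeholder = "##INFO=<ID=AF>") {
  stopifnot(is.character(meta))
  if (!any(grepl("^##INFO", meta))) {
    meta <- c(meta, placeholder)
  }
  meta
}

# Intended use on a vcfR object before calling vcfR2tidy (assumes vcfR is loaded):
# vcfR_test@meta <- ensure_info_line(vcfR_test@meta)
```

Files that already declare INFO fields pass through unchanged, so the helper can be applied unconditionally.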
