Collapsing multisample BED file #292

Hello!
I have a BED file from GISTIC that I converted to the format accepted by AnnotSV. It contains these columns:

chr	chromStart	chromEnd	SVTYPE	Samples_ID	q.values	cn_confidency

Each sample is on its own line of the file, even when several samples share the same chromStart and chromEnd.

I have used:
AnnotSV -SvinputFile input..bed -annotationsDir /annotsv/AnnotSV_annotations/ -genomeBuild GRCh38 -samplesidBEDcol 5 -svtBEDcol 4 -outputFile output.tsv

The annotations work fine, but I have two suggestions for improvement:

First, my extra columns q.values and cn_confidency get renamed to user#1 and user#2. For two columns that's not a problem, but with many columns renaming them by hand becomes tedious. I think they should automatically keep their original names.
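As a stopgap, the original names can be put back with a small post-processing step. This is only a minimal sketch, not something AnnotSV provides: the function name restore_user_columns is hypothetical, and it assumes the output header labels the extra columns user#1, user#2, ... in the same order as they appear in the input BED.

```python
import re


def restore_user_columns(header, original_names):
    """Replace 'user#N' labels in an output header with the original names.

    Assumption: 'user#N' refers to the N-th extra column of the input BED,
    so it maps to original_names[N - 1]. Any other label is left untouched.
    """
    restored = []
    for name in header:
        match = re.fullmatch(r"user#(\d+)", name)
        if match and 1 <= int(match.group(1)) <= len(original_names):
            restored.append(original_names[int(match.group(1)) - 1])
        else:
            restored.append(name)
    return restored


# Illustrative header fragment, not the full AnnotSV output header.
header = ["SV_chrom", "SV_start", "SV_end", "user#1", "user#2"]
print(restore_user_columns(header, ["q.values", "cn_confidency"]))
```

The same function can be applied to the first line of output.tsv before the file is loaded elsewhere.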

Second, the CNVs are duplicated for each sample that has them, even if SV_start and SV_end are exactly the same. Is there a way to collapse all identical intervals by joining the sample IDs with commas? So if samples A, B and C have the same variant with the same type, Samples_ID would show "A,B,C". Maybe it could even merge different intervals according to a user-defined distance or overlap percentage?
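To make the requested behaviour concrete, here is a rough sketch of the collapsing step applied to the input BED rows before annotation. It is an illustration under stated assumptions, not AnnotSV functionality: the function name collapse_identical_intervals is hypothetical, rows are already split on tabs, the columns follow the order listed above (sample ID in the fifth column), and the extra columns of a collapsed row are simply taken from the first row seen for that interval. Merging by distance or overlap percentage is not covered.

```python
def collapse_identical_intervals(rows, sample_col=4, key_cols=(0, 1, 2, 3)):
    """Collapse rows that share chr, chromStart, chromEnd and SVTYPE.

    Sample IDs of collapsed rows are joined with commas, in order of first
    appearance. Assumption: the remaining columns (e.g. q.values) are taken
    from the first row of each group.
    """
    grouped = {}  # dicts keep insertion order, so output order follows input
    for row in rows:
        key = tuple(row[i] for i in key_cols)
        if key not in grouped:
            # Keep a copy of the first row as a template plus its sample list.
            grouped[key] = (list(row), [row[sample_col]])
        elif row[sample_col] not in grouped[key][1]:
            grouped[key][1].append(row[sample_col])

    collapsed = []
    for template, samples in grouped.values():
        template[sample_col] = ",".join(samples)
        collapsed.append(template)
    return collapsed


# Made-up rows for illustration only.
rows = [
    ["chr1", "100", "200", "DEL", "A", "0.01", "0.9"],
    ["chr1", "100", "200", "DEL", "B", "0.01", "0.9"],
    ["chr1", "100", "200", "DUP", "C", "0.02", "0.8"],
    ["chr1", "100", "200", "DEL", "C", "0.01", "0.9"],
]
for row in collapse_identical_intervals(rows):
    print("\t".join(row))
```

Grouping on the SV type as well as the coordinates keeps a deletion and a duplication over the same interval on separate lines.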
