2.3. Automated Pipeline: Starting with Raw Counts
Got some raw counts files from htseq-count, featureCounts, or STAR? Counts can be generated in many different ways (which features are counted, hg19 or hg38 alignment, gene naming, which genes are present, etc.), so it is preferable to use either the FASTQ/FASTA or BAM workflows.
But sometimes you just have some old hg19-aligned files kicking around. Perhaps they're hg38 and you just want to try it anyway. Provided ALLSorts receives all 20625 required genes and they were counted from hg19-aligned files, it will probably work OK. Outside of that, your results might be a bit sketchy. Still, you might find it useful, so why not?
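As a rough pre-flight check on the gene count mentioned above, the sketch below counts the feature rows in each counts file and warns when there are fewer than 20625. It assumes htseq-count-style files (tab-separated, one `id<TAB>count` row per feature, with summary rows prefixed by `__`); the function name `count_gene_rows` is made up for illustration and is not part of ALLSorts. Having enough rows does not prove that the specific required genes are present, so treat it only as a quick sanity test.

```shell
# Count feature rows in one htseq-count-style file (assumed layout: id<TAB>count).
# Summary rows such as __no_feature or __ambiguous are skipped.
count_gene_rows() {
  awk -F '\t' 'NF >= 2 && $1 !~ /^__/ { n++ } END { print n + 0 }' "$1"
}

required=20625   # number of required genes quoted in the text above

# Check every file passed on the command line, e.g. /path/to/counts/*.txt
for f in "$@"; do
  n=$(count_gene_rows "$f")
  if [ "$n" -lt "$required" ]; then
    echo "WARNING: $f has only $n feature rows (fewer than $required)" >&2
  fi
done
```

Saved as a script (the name is up to you), it could be called with the same glob you plan to hand to the pipeline. Files produced by featureCounts carry extra header lines and columns, so the row filter would need adjusting for them.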
IF YOU ALREADY HAVE A GENERATED COUNTS MATRIX, GO TO THE MANUAL EXECUTION SECTION.
Just follow the instructions at https://github.com/Oshlack/ALLSorts/wiki.
Ideally the counts will reflect the annotation in ftp://ftp.ensembl.org/pub/grch37/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.87.chr.gtf.gz. Don't worry about the Ensembl naming; the ALLSorts pipeline will convert it into gene names.
Ok, ALLSorts has been installed? Raw counts in the correct format?
ALLSorts can be run with the following command; note the parameter descriptions below:
bpipe -p results=$results -p strand=$strand -p type=$type $COUNTSDIR/counts.groovy $counts
Feel free to make these environment variables (I tend to) or just directly insert them into the command line snippet above.
$results = /path/to/desired/output
$type = "counts"
$strand = "yes", "no", or "reverse" # "no" and "reverse" will be the two most used (no = unstranded; stranded data is typically reverse)
$COUNTSDIR/counts.groovy = the path to counts.groovy in your ALLSorts clone, i.e. /your/allsorts/clone/path/tools/counts/counts.groovy
$counts = the path to your counts files. This can be as simple as /path/to/counts/*.txt.
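Putting the parameters above together, a minimal sketch of a wrapper might look like the following. All paths are the placeholders used in the descriptions above and need replacing with your own. The `DRY_RUN` switch and the strand check are additions for illustration only, not part of ALLSorts or bpipe. By default the script only prints the command it would run.

```shell
# Placeholder values; replace with your own paths.
results="/path/to/desired/output"
type="counts"
strand="no"   # "yes", "no", or "reverse"
COUNTSDIR="/your/allsorts/clone/path/tools/counts"
counts="/path/to/counts/*.txt"

# Reject strand values other than the three listed above.
case "$strand" in
  yes|no|reverse) ;;
  *) echo "strand must be yes, no, or reverse (got: $strand)" >&2; exit 1 ;;
esac

# The command as it would be typed by hand, kept as a string for display.
cmd="bpipe -p results=$results -p strand=$strand -p type=$type $COUNTSDIR/counts.groovy $counts"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: only show the command (hypothetical safety switch).
  echo "$cmd"
else
  # $counts is deliberately unquoted so the shell expands the glob into file names.
  bpipe -p results="$results" -p strand="$strand" -p type="$type" "$COUNTSDIR/counts.groovy" $counts
fi
```

Setting `DRY_RUN=0` in the environment makes the script run bpipe for real, assuming bpipe is on your `PATH`.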
We're still testing this functionality, please report any issues!