Subjunc: detecting exon-exon junctions and mapping RNA-seq reads

The Subjunc aligner is an RNA-seq read aligner that specializes in detecting exon-exon junctions and in fully aligning reads, exon-spanning reads in particular.

For gene expression analysis, the Subread aligner is recommended for mapping RNA-seq reads, although the Subjunc aligner can also be used. The main reason for this recommendation is that Subread is much faster than Subjunc, and gene expression analysis does not require the reads to be fully mapped. For other purposes, the Subjunc aligner should be used.

Download and installation

The Subjunc aligner is part of the Subread package. Please refer to the instructions there for the download and installation.

A quick start

Build an index for the reference genome (you may instead provide a single FASTA file containing all the reference sequences):
subread-buildindex -o my_index chr1.fa chr2.fa ...
Report uniquely mapped reads only (the default behavior). The mapping output includes a BAM file and the exon-exon junctions discovered from the data.
subjunc -T 5 -i my_index -r reads1.txt -o subjunc_results.bam
Report up to three alignments for each multi-mapping read:
subjunc --multiMapping -B 3 -T 5 -i my_index -r reads1.txt -o subjunc_results.bam
Detect indels of up to 16 bp:
subjunc -I 16 -i my_index -r reads1.txt -o subjunc_results.bam
Map paired-end reads and discover exon-exon junctions:
subjunc -d 50 -D 600 -i my_index -r reads1.txt -R reads2.txt -o subjunc_results.bam
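The steps above can be chained into one small script. The following is a minimal sketch, not an official workflow: it reuses only the options shown in the quick start, the file names (chr1.fa, chr2.fa, reads1.txt, reads2.txt) are placeholders, and the DRY_RUN variable and run() helper are hypothetical conveniences that are not part of the Subread package. By default the script only prints the commands, so they can be checked before anything is executed.

```shell
#!/bin/sh
# Sketch: chain the quick-start steps for one paired-end sample.
# Assumptions: placeholder file names; DRY_RUN and run() are
# illustrative helpers, not features of Subread or Subjunc.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = also execute them
INDEX="my_index"          # index base name, as in the quick start
THREADS=5                 # value passed to -T

run() {
    # Print the command line; execute it only when DRY_RUN is 0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: build the index from the reference FASTA files.
run subread-buildindex -o "$INDEX" chr1.fa chr2.fa

# Step 2: map paired-end reads and discover exon-exon junctions.
run subjunc -T "$THREADS" -d 50 -D 600 -i "$INDEX" \
    -r reads1.txt -R reads2.txt -o subjunc_results.bam
```

Setting DRY_RUN=0 in the environment makes the script run the printed commands, which requires subread-buildindex and subjunc to be on the PATH.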

Citation

Liao Y, Smyth GK and Shi W (2013). The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research, 41(10):e108

Users Guide

The Users Guide contains a comprehensive description of this program.

Get help

You may post your questions/suggestions at the Bioconductor support site.

Links

Subread: a general-purpose read aligner.

featureCounts: Summarizing reads to genomic features.

Rsubread: a Bioconductor R implementation of the Subread package.

A case study for analyzing RNA-seq data: using the Bioconductor packages Rsubread and limma to perform a complete analysis of RNA-seq data.

Subread package overview: a brief description of the Subread package.