Subread package: high-performance read alignment, quantification and mutation discovery

The Subread package comprises a suite of software programs for processing next-generation sequencing read data, including:

  • Subread: an accurate and efficient aligner for mapping both genomic DNA-seq reads and RNA-seq reads (for the purpose of expression analysis).
  • Subjunc: an RNA-seq aligner suitable for all purposes of RNA-seq analyses.
  • featureCounts: a highly efficient and accurate read summarization program.
  • exactSNP: a SNP caller that discovers SNPs by testing signals against local background noise.

These programs are also implemented in the Bioconductor R package Rsubread.

CHANGELOG AND NEWS

NEWS

The SEQC/MAQC III Consortium found the Subread-featureCounts-limma/voom pipeline to be one of the best-performing pipelines for the analysis of RNA-seq data. The study was published in the September 2014 issue of Nature Biotechnology: "A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium".

Release 1.4.6, 15 Oct 2014
  • The default number of maximum allowed mismatches in each reported alignment is changed to 3.
  • The minimum fraction of consensus subreads out of all extracted subreads, required for detecting candidate mapping locations, is changed to 0.3 in subjunc for the mapping of exonic reads (reads falling within exons).
  • Better support for the mapping of microRNA sequencing (miRNA-seq) reads. A full index with no gaps included can now be built to allow miRNA-seq reads to be mapped at the highest possible resolution. A new section is added to the Users Guide to describe how to map miRNA-seq reads using Subread.
  • Bug fixes.
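
The two thresholds changed in this release can be pictured with a minimal Python sketch. This is not Subread or Subjunc code: the function names are hypothetical, and whether the real comparisons are inclusive (`>=`, `<=`) is an assumption made here for illustration only.

```python
def is_candidate_location(votes, extracted_subreads, min_fraction=0.3):
    """Return True if enough extracted subreads agree on a mapping location.

    `votes` is the number of consensus subreads supporting the location.
    The inclusive comparison is an assumption, not taken from the source.
    """
    if extracted_subreads <= 0:
        raise ValueError("extracted_subreads must be positive")
    return votes / extracted_subreads >= min_fraction


def within_mismatch_limit(mismatches, max_mismatches=3):
    """Return True if an alignment has no more than the allowed mismatches."""
    return mismatches <= max_mismatches
```

Under these assumptions, a location supported by 4 of 10 extracted subreads would be kept as a candidate, while one supported by only 2 of 10 would not.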

Release 1.4.5-p1, 7 July 2014
  • Fixed a bug for reporting unmapped reads in SAM output.

Release 1.4.5, 12 June 2014

    New options in featureCounts:
    --readExtension5 <int> Reads are extended upstream by <int> bases from their 5' end.
    --readExtension3 <int> Reads are extended downstream by <int> bases from their 3' end.
    --read2pos <5:3> The read is reduced to its 5' most base or 3' most base. Read summarization is then performed based on the single base position which the read is reduced to.
    --minReadOverlap <int> Specify the minimum number of overlapped bases required for assigning a read to a feature. 1 by default. Negative values are permitted, indicating a gap being allowed between a read and a feature.
    --countSplitAlignmentsOnly If specified, only split alignments (CIGAR strings containing the letter 'N') will be counted. Example split alignments include exon-spanning reads in RNA-seq data.
    --ignoreDup If specified, reads marked as duplicates are not counted. Duplicate reads are identified using FLAG 0x400.
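
The behaviour described by these options can be illustrated with a small Python sketch. It is not featureCounts code: the `Read` class and all helper names are hypothetical, coordinates are assumed to be 1-based and inclusive, read extension is assumed to be strand-aware, and a negative overlap is assumed to equal the size of the gap in bases.

```python
from dataclasses import dataclass

DUPLICATE_FLAG = 0x400  # SAM FLAG bit for reads marked as duplicates


@dataclass
class Read:
    start: int       # 1-based leftmost mapped position
    end: int         # 1-based rightmost mapped position (inclusive)
    strand: str      # '+' or '-'
    flag: int = 0    # SAM FLAG field
    cigar: str = ""  # SAM CIGAR string


def extend(read, ext5=0, ext3=0):
    """Mimic --readExtension5/--readExtension3 (assumed strand-aware)."""
    if read.strand == "+":
        return max(1, read.start - ext5), read.end + ext3
    # On the reverse strand the 5' end is the rightmost position.
    return max(1, read.start - ext3), read.end + ext5


def reduce_to_end(read, which):
    """Mimic --read2pos: collapse the read to its 5'-most or 3'-most base."""
    five_prime_is_left = read.strand == "+"
    use_left = five_prime_is_left if which == 5 else not five_prime_is_left
    pos = read.start if use_left else read.end
    return pos, pos


def overlap(span, feature):
    """Overlapping bases between two inclusive intervals; <= 0 means a gap."""
    return min(span[1], feature[1]) - max(span[0], feature[0]) + 1


def is_counted(read, feature, min_overlap=1, split_only=False, ignore_dup=False):
    """Combine the duplicate, split-alignment and overlap checks.

    Simplification: the whole read span is used, introns included.
    """
    if ignore_dup and read.flag & DUPLICATE_FLAG:
        return False
    if split_only and "N" not in read.cigar:
        return False
    return overlap((read.start, read.end), feature) >= min_overlap
```

In this sketch a read at 1-10 and a feature at 13-20 have an overlap of -2, so the read is assigned only when the minimum overlap is set to -2 or lower.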

    A new option in subread-align/subjunc:
    -M <int> Specify the maximum number of mismatched bases allowed in the alignment. 10 by default.

    Other changes:
  • NM tags are added into read mapping output.
  • Range of MAPQ values is changed to [0,60).
  • MAPQ values for multi-mapping reads are set to 0.
  • NCBI RefSeq gene annotations for hg19, mm10 and mm9 are added to the package, making read summarization easier to perform.
  • Bug fixes.
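
As an illustration of how the NM tag and MAPQ value might be used downstream, here is a minimal Python sketch that reads them from one SAM alignment line. The function names and the example record are made up for illustration; only the standard SAM column layout (MAPQ in column 5, optional tags from column 12 onwards) is relied on.

```python
def parse_sam_line(line):
    """Extract FLAG, MAPQ, CIGAR and the NM tag from one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    record = {
        "qname": fields[0],
        "flag": int(fields[1]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
        "nm": None,  # stays None when the record carries no NM tag
    }
    # Optional fields follow the 11 mandatory columns as TAG:TYPE:VALUE.
    for tag in fields[11:]:
        name, value_type, value = tag.split(":", 2)
        if name == "NM" and value_type == "i":
            record["nm"] = int(value)
    return record


def passes_filters(record, min_mapq=1, max_nm=3):
    """Keep reads with MAPQ >= min_mapq and at most max_nm in the NM tag."""
    if record["mapq"] < min_mapq:
        return False
    return record["nm"] is None or record["nm"] <= max_nm


# Hypothetical record, not taken from real data.
example = "read1\t0\tchr1\t100\t40\t50M\t*\t0\t0\t*\t*\tNM:i:2"
parsed = parse_sam_line(example)
```

Because multi-mapping reads are reported with a MAPQ of 0, a threshold of `min_mapq=1` would drop them in this sketch, along with any other read that happens to have a MAPQ of 0.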

  • ChangeLog history

Download and Installation

  • Latest version v1.4.6
  • All the versions
  • Installation instructions

Mailing lists

  • Subread Users Group

Tutorials and Users Guide

  • A short tutorial on Subread
  • A short tutorial on Subjunc
  • A short tutorial on featureCounts
  • A short tutorial on exactSNP
  • A case study for analyzing RNA-seq data
  • Users Guide

Publications

  • Liao Y, Smyth GK and Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research, 41(10):e108, 2013.
  • Liao Y, Smyth GK and Shi W. featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features. Bioinformatics, 30(7):923-30, 2014.

Scientific publications citing our methods

  • Publications that cite Subread/Subjunc
  • Publications that cite featureCounts

Resources

  • Read count tables for published datasets

Links

  • Bioconductor R package Rsubread
  • Bioconductor R package seqc
  • WEHI Bioinformatics

Contact

Dr. Wei Shi (shi at wehi dot edu dot au) or
Dr. Yang Liao (liao at wehi dot edu dot au)