Openly accessible

Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods

Quinn, Thomas P., Crowley, Tamsyn M. and Richardson, Mark F. 2018, Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods, BMC bioinformatics, vol. 19, no. 1, doi: 10.1186/s12859-018-2261-8.

Attached Files
Name Description MIMEType Size Downloads
quinn-benchmarkingdiff-2018.pdf Published version application/pdf 3.17MB 72

Title Benchmarking differential expression analysis tools for RNA-Seq: normalization-based vs. log-ratio transformation-based methods
Author(s) Quinn, Thomas P. (ORCID iD: orcid.org/0000-0002-3698-8917)
Crowley, Tamsyn M. (ORCID iD: orcid.org/0000-0002-1650-0064)
Richardson, Mark F.
Journal name BMC bioinformatics
Volume number 19
Issue number 1
Article ID 274
Total pages 15
Publisher BioMed Central
Place of publication London, Eng.
Publication date 2018-07-18
ISSN 1471-2105
Keyword(s) CoDA
Compositional analysis
Compositional data
High-throughput sequencing analysis
RNA-Seq
Science & Technology
Life Sciences & Biomedicine
Biochemical Research Methods
Biotechnology & Applied Microbiology
Mathematical & Computational Biology
Biochemistry & Molecular Biology
TRANSCRIPT EXPRESSION
QUANTIFICATION
DATASETS
PACKAGE
READS
Summary BACKGROUND: Count data generated by next-generation sequencing assays do not measure absolute transcript abundances. Instead, the data are constrained to an arbitrary "library size" by the sequencing depth of the assay, and typically must be normalized prior to statistical analysis. The constrained nature of these data means one could alternatively use a log-ratio transformation in lieu of normalization, as often done when testing for differential abundance (DA) of operational taxonomic units (OTUs) in 16S rRNA data. Therefore, we benchmark how well the ALDEx2 package, a transformation-based DA tool, detects differential expression in high-throughput RNA-sequencing data (RNA-Seq), compared to conventional RNA-Seq methods such as edgeR and DESeq2.
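
The log-ratio idea in the background paragraph can be illustrated with a minimal sketch of the centered log-ratio (CLR) transformation, one common log-ratio transformation. This is an illustration only and not the ALDEx2 procedure itself; the function name `clr`, the `pseudocount` parameter, and the toy counts are assumptions made for this example. It shows that two samples with the same relative abundances but different library sizes map to the same transformed values.

```python
import math

def clr(counts, pseudocount=0.0):
    """Centered log-ratio transform of one sample's counts (hypothetical helper).

    Each value becomes log2(count) minus the mean of log2(counts), which is
    the log-ratio of the count to the sample's geometric mean.
    """
    shifted = [c + pseudocount for c in counts]
    if any(v <= 0 for v in shifted):
        raise ValueError("counts must be positive; use a pseudocount for zeros")
    logs = [math.log2(v) for v in shifted]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Toy data (assumed): same proportions, tenfold difference in library size.
shallow = [10, 20, 70]
deep = [100, 200, 700]

clr_shallow = clr(shallow)
clr_deep = clr(deep)
```

Because the library size cancels in each ratio, `clr_shallow` and `clr_deep` agree up to floating-point error. A nonzero pseudocount, needed when zeros are present, makes this invariance only approximate.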

RESULTS: To evaluate the performance of log-ratio transformation-based tools, we apply the ALDEx2 package to two simulated and two real RNA-Seq data sets. One of the latter was previously used to benchmark dozens of conventional RNA-Seq differential expression methods, enabling us to directly compare transformation-based approaches. We show that ALDEx2, widely used in metagenomics research, identifies differentially expressed genes (and transcripts) from RNA-Seq data with high precision and, given sufficient sample sizes, high recall too (regardless of the alignment and quantification procedure used). Although we show that the choice of log-ratio transformation can affect performance, ALDEx2 has high precision (i.e., few false positives) across all transformations. Finally, we present a novel, iterative log-ratio transformation (now implemented in ALDEx2) that further improves performance in simulations.
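
For readers unfamiliar with the two metrics named in the results, the following sketch shows how precision and recall could be computed for a set of genes called differentially expressed against a known truth set. The function name `precision_recall` and the gene identifiers are hypothetical, and the numbers are not taken from the paper.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for called vs. truly differential genes."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    # Precision: fraction of calls that are correct (few false positives -> high).
    precision = true_positives / len(called) if called else float("nan")
    # Recall: fraction of truly differential genes that were called.
    recall = true_positives / len(truth) if truth else float("nan")
    return precision, recall

# Hypothetical example: a conservative tool that calls only two of four true genes.
truth = {"geneA", "geneB", "geneC", "geneD"}
called = {"geneA", "geneB"}

precision, recall = precision_recall(called, truth)
```

In this toy case every call is correct but half of the true genes are missed, which is the high-precision, lower-recall pattern that the abstract associates with small sample sizes.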

CONCLUSIONS: Our results suggest that log-ratio transformation-based methods can work to measure differential expression from RNA-Seq data, provided that certain assumptions are met. Moreover, these methods have very high precision (i.e., few false positives) in simulations and perform well on real data too. With previously demonstrated applicability to 16S rRNA data, ALDEx2 can thus serve as a single tool for data from multiple sequencing modalities.
Language eng
DOI 10.1186/s12859-018-2261-8
Field of Research 06 Biological Sciences
08 Information And Computing Sciences
01 Mathematical Sciences
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©2018, The Authors
Free to Read? Yes
Use Rights Creative Commons Attribution licence
Persistent URL http://hdl.handle.net/10536/DRO/DU:30111887

Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Every reasonable effort has been made to ensure that permission has been obtained for items included in DRO. If you believe that your rights have been infringed by this repository, please contact drosupport@deakin.edu.au.

Created: Fri, 27 Jul 2018, 12:44:03 EST
