Cancer as a tissue anomaly: classifying tumor transcriptomes based only on healthy data

Quinn, Thomas P, Nguyen, Thin, Lee, Samuel C and Venkatesh, Svetha 2019, Cancer as a tissue anomaly: classifying tumor transcriptomes based only on healthy data, Frontiers in genetics, vol. 10, pp. 1-6, doi: 10.3389/fgene.2019.00599.

Title Cancer as a tissue anomaly: classifying tumor transcriptomes based only on healthy data
Author(s) Quinn, Thomas P
Nguyen, Thin
Lee, Samuel C
Venkatesh, Svetha
Journal name Frontiers in genetics
Volume number 10
Article ID 599
Start page 1
End page 6
Total pages 6
Publisher Frontiers Media
Place of publication Lausanne, Switzerland
Publication date 2019-07
ISSN 1664-8021
Keyword(s) Science & Technology
Life Sciences & Biomedicine
Genetics & Heredity
machine learning
anomaly detection
Summary Since the turn of the century, researchers have sought to diagnose cancer based on gene expression signatures measured from the blood or biopsy as biomarkers. This task, known as classification, is typically solved using a suite of algorithms that learn a mathematical rule capable of discriminating one group (“cases”) from another (“controls”). However, discriminatory methods can only identify cancerous samples that resemble those that the algorithm already saw during training. As such, discriminatory methods may be ill-suited for the classification of cancer: because the possibility space of cancer is definitively large, the existence of a one-of-a-kind gene expression signature is likely. Instead, we propose using an established surveillance method that detects anomalous samples based on their deviation from a learned normal steady-state structure. By transferring this method to transcriptomic data, we can create an anomaly detector for tissue transcriptomes, a “tissue detector,” that is capable of identifying cancer without ever seeing a single cancer example. As a proof-of-concept, we train a “tissue detector” on normal GTEx samples that can classify TCGA samples with >90% AUC for 3 out of 6 tissues. Importantly, we find that the classification accuracy is improved simply by adding more healthy samples. We conclude this report by emphasizing the conceptual advantages of anomaly detection and by highlighting future directions for this field of study.
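The idea in the summary (fit a model of "normal" on healthy samples only, then flag samples that deviate from it) can be illustrated with a minimal sketch. This is not the authors' method: the summary does not name the surveillance algorithm, so a simple per-gene z-score distance is swapped in here. The data are synthetic, and every function name and parameter (`fit_healthy_profile`, `anomaly_score`, `simulate`, the shift of 1.5) is a hypothetical choice made for illustration.

```python
import random
import statistics


def fit_healthy_profile(healthy):
    """Learn per-gene mean and standard deviation from healthy samples only."""
    genes = list(zip(*healthy))  # transpose: one tuple of values per gene
    means = [statistics.fmean(g) for g in genes]
    stds = [statistics.pstdev(g) or 1.0 for g in genes]  # guard against zero spread
    return means, stds


def anomaly_score(sample, profile):
    """Mean squared z-score: how far a sample sits from the healthy profile."""
    means, stds = profile
    return statistics.fmean(
        ((x - m) / s) ** 2 for x, m, s in zip(sample, means, stds)
    )


def auc(scores_pos, scores_neg):
    """Chance that a random positive outscores a random negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def simulate(n_samples, n_genes, shift, rng):
    """Hypothetical toy data: Gaussian expression values, optionally shifted."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]


rng = random.Random(0)
healthy_train = simulate(200, 50, 0.0, rng)  # fitting never sees a tumor sample
healthy_test = simulate(50, 50, 0.0, rng)
tumor_test = simulate(50, 50, 1.5, rng)      # toy "tumors" deviate from the healthy profile

profile = fit_healthy_profile(healthy_train)
tumor_scores = [anomaly_score(s, profile) for s in tumor_test]
healthy_scores = [anomaly_score(s, profile) for s in healthy_test]
print(f"AUC = {auc(tumor_scores, healthy_scores):.3f}")
```

Because the detector is fitted without any tumor examples, a tumor sample only needs to differ from the healthy profile to be flagged, and it does not need to resemble a previously seen tumor. Real transcriptomic data would also need normalization and a model of gene-gene structure, both of which this sketch omits.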
Language eng
DOI 10.3389/fgene.2019.00599
Indigenous content off
HERDC Research category C1 Refereed article in a scholarly journal
Copyright notice ©2019, Quinn, Nguyen, Lee and Venkatesh
Unless expressly stated otherwise, the copyright for items in DRO is owned by the author, with all rights reserved.

Created: Mon, 22 Jul 2019, 10:29:30 EST
