Open Access Article

A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data

Aradhita Mukherjee, Dibyendu Bikash Seal

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-8, Page no. 671-676, Aug-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i8.671676

Online published on Aug 31, 2018

Copyright © Aradhita Mukherjee, Dibyendu Bikash Seal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Aradhita Mukherjee, Dibyendu Bikash Seal, “A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.8, pp.671-676, 2018.

MLA Style Citation: Aradhita Mukherjee, Dibyendu Bikash Seal. "A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data." International Journal of Computer Sciences and Engineering 6.8 (2018): 671-676.

APA Style Citation: Aradhita Mukherjee, Dibyendu Bikash Seal (2018). A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data. International Journal of Computer Sciences and Engineering, 6(8), 671-676.

BibTex Style Citation:
@article{Mukherjee_2018,
author = {Aradhita Mukherjee and Dibyendu Bikash Seal},
title = {A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {8 2018},
volume = {6},
number = {8},
month = {8},
year = {2018},
issn = {2347-2693},
pages = {671-676},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2753},
doi = {10.26438/ijcse/v6i8.671676},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i8.671676
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=2753
TI  - A Deep Learning Model For Dimension Reduction And Multi-Class Classification Of Gene Expression Data
T2  - International Journal of Computer Sciences and Engineering
AU  - Mukherjee, Aradhita
AU  - Seal, Dibyendu Bikash
PY  - 2018
DA  - 2018/08/31
PB  - IJCSE, Indore, INDIA
SP  - 671
EP  - 676
IS  - 8
VL  - 6
SN  - 2347-2693
ER  -


Abstract

Gene expression analysis has been vital in cancer detection across the world. Genes that regulate cell growth undergo altered expression in cancer, which leads to various phenotypic traits. Gene expression profiling has been used extensively by researchers to identify tumours accurately and has thus enabled a better understanding of tumour biology. However, feature extraction and classification of gene expression datasets are challenging because of their high dimensionality and the non-linear relationships among the data. In this article, we develop a deep learning-based dimension reduction and multi-class classification model using a deep auto-encoder and a multi-layer perceptron (MLP). We train the auto-encoder to extract meaningful features from RNA-Seq data. These features are then used for supervised classification of tumour samples by the MLP. Our model (deepAE-MLP) showed better feature extraction and disease classification capabilities than the benchmark methods.
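
The two-stage idea in the abstract (an auto-encoder that compresses expression profiles, followed by an MLP that classifies the compressed features) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic rather than RNA-Seq, the auto-encoder has a single hidden layer rather than a deep stack, and the layer sizes, learning rates, epoch counts and the helper names `train_autoencoder` and `train_mlp` are hypothetical.

```python
import numpy as np


def train_autoencoder(X, n_code, epochs=500, lr=0.1, seed=0):
    """Fit a one-hidden-layer auto-encoder by full-batch gradient descent.

    Returns the encoder parameters and the per-epoch reconstruction MSE.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, n_code))
    b_enc = np.zeros(n_code)
    W_dec = rng.normal(scale=0.1, size=(n_code, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        code = np.tanh(X @ W_enc + b_enc)          # non-linear bottleneck
        err = code @ W_dec + b_dec - X             # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size               # dMSE / d(reconstruction)
        g_code = (g_out @ W_dec.T) * (1.0 - code ** 2)  # back through tanh
        W_dec -= lr * (code.T @ g_out)
        b_dec -= lr * g_out.sum(axis=0)
        W_enc -= lr * (X.T @ g_code)
        b_enc -= lr * g_code.sum(axis=0)
    return (W_enc, b_enc), losses


def train_mlp(Z, y, n_classes, n_hidden=16, epochs=500, lr=0.1, seed=1):
    """Fit a one-hidden-layer softmax MLP; returns a predict_proba function."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(Z.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                       # one-hot targets

    def forward(A):
        hid = np.maximum(A @ W1 + b1, 0.0)         # ReLU hidden layer
        logits = hid @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return hid, p / p.sum(axis=1, keepdims=True)

    for _ in range(epochs):
        hid, p = forward(Z)
        g_logits = (p - Y) / len(Z)                # softmax cross-entropy gradient
        g_hid = (g_logits @ W2.T) * (hid > 0)      # back through ReLU
        W2 -= lr * (hid.T @ g_logits)
        b2 -= lr * g_logits.sum(axis=0)
        W1 -= lr * (Z.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return lambda A: forward(A)[1]


# Synthetic stand-in for an expression matrix: 90 samples, 50 "genes", 3 classes.
rng = np.random.default_rng(42)
n_samples, n_genes, n_classes = 90, 50, 3
y = np.repeat(np.arange(n_classes), n_samples // n_classes)
centers = rng.normal(size=(n_classes, n_genes))
X = centers[y] + rng.normal(scale=0.5, size=(n_samples, n_genes))
X = (X - X.mean(axis=0)) / X.std(axis=0)           # standardise each feature

# Stage 1: unsupervised dimension reduction (50 features -> 8 codes).
(W_enc, b_enc), losses = train_autoencoder(X, n_code=8)
codes = np.tanh(X @ W_enc + b_enc)

# Stage 2: supervised multi-class classification on the learned codes.
predict_proba = train_mlp(codes, y, n_classes)
probs = predict_proba(codes)
pred = probs.argmax(axis=1)

print(f"reconstruction MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
print(f"training accuracy: {(pred == y).mean():.2f}")
```

The printed accuracy is measured on the training samples only, so it says nothing about generalisation; a real comparison against benchmark methods would need held-out data or cross-validation. The sketch separates the two stages deliberately: the auto-encoder never sees the class labels, which mirrors the unsupervised feature extraction followed by supervised classification described above.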

Keywords / Index Terms

Gene expression, Deep Learning, Auto-encoder, Multi-layer perceptron, Dimension Reduction, Multi-class Classification
