Open Access   Article

Cloud Computing in Bioinformatics: Solution to Big Data Challenge

Shahid Tufail1, M. Abdul Qadeer2

1 Dept. of Computer Engineering, Z. H. College of Engineering and Technology, (Aligarh Muslim University), Aligarh, India.
2 Dept. of Computer Engineering, Z. H. College of Engineering and Technology, (Aligarh Muslim University), Aligarh, India.


Section: Review Paper, Product Type: Journal Paper
Volume 5, Issue 9, pp. 232-236, Sep 2017


Online published on Sep 30, 2017

Copyright Shahid Tufail, M. Abdul Qadeer. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



IEEE Style Citation: Shahid Tufail, M. Abdul Qadeer, Cloud Computing in Bioinformatics: Solution to Big Data Challenge, International Journal of Computer Sciences and Engineering, Vol.5, Issue.9, pp.232-236, 2017.

MLA Style Citation: Shahid Tufail, M. Abdul Qadeer "Cloud Computing in Bioinformatics: Solution to Big Data Challenge." International Journal of Computer Sciences and Engineering 5.9 (2017): 232-236.

APA Style Citation: Shahid Tufail, M. Abdul Qadeer, (2017). Cloud Computing in Bioinformatics: Solution to Big Data Challenge. International Journal of Computer Sciences and Engineering, 5(9), 232-236.



The vast quantity of biological data accumulating from the widespread use of next- and third-generation sequencing techniques has made its management and handling an uphill task. Cloud computing offers a solution to the storage, processing, and analysis problems posed by such large volumes of biological data. The abstraction layer in cloud computing enables integrated access to data handling, storage, and virtualization. Herein, we review the various types of clouds, cloud-based service models in bioinformatics, and cloud computing platforms with parallel application tools. Lastly, we discuss how cloud-based platforms are being exploited for big data analysis in biology.

Keywords / Index Terms

Cloud computing, bioinformatics, big data, handling, challenge

