Open Access Article

A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis

K.S. Mehta, D.S. Mehta, V. Dahiya

Section: Review Paper, Product Type: Journal Paper
Volume 6, Issue 7, Pages 1136-1140, Jul-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.11361140

Online published on Jul 31, 2018

Copyright © K.S. Mehta, D.S. Mehta, V. Dahiya. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: K.S.Mehta, D.S.Mehta, V.Dahiya, “A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1136-1140, 2018.

MLA Style Citation: K.S.Mehta, D.S.Mehta, V.Dahiya "A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis." International Journal of Computer Sciences and Engineering 6.7 (2018): 1136-1140.

APA Style Citation: K.S.Mehta, D.S.Mehta, V.Dahiya (2018). A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis. International Journal of Computer Sciences and Engineering, 6(7), 1136-1140.

BibTex Style Citation:
@article{_2018,
author = {K.S. Mehta and D.S. Mehta and V. Dahiya},
title = {A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {7 2018},
volume = {6},
Issue = {7},
month = {7},
year = {2018},
issn = {2347-2693},
pages = {1136-1140},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2573},
doi = {https://doi.org/10.26438/ijcse/v6i7.11361140},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v6i7.11361140
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2573
TI - A Comparative Study of Computational Tools for Biological Sequence Cleaning and Analysis
T2 - International Journal of Computer Sciences and Engineering
AU - Mehta, K.S.
AU - Mehta, D.S.
AU - Dahiya, V.
PY - 2018
DA - 2018/07/31
PB - IJCSE, Indore, INDIA
SP - 1136
EP - 1140
IS - 7
VL - 6
SN - 2347-2693
ER -


Abstract

Next-generation sequencing (NGS) technology plays an increasingly prominent role in DNA and RNA sequencing by producing high-throughput sequences (HTS). The major challenge with HTS is the complexity and difficulty of data quality control (QC). Only high-quality data supports accurate diagnosis of disease, so the data to be analysed must be appropriate and correct. To meet this requirement, computer scientists have implemented the relevant algorithms as easy-to-use tools that are convenient for biological research. The raw sequences generated by NGS technologies are first cleaned and then passed on for clinical analysis. The cleaning step includes removal of short sequences and trimming of inappropriate headers. This paper compares some popular open-source tools used for cleaning the captured sequences.
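The cleaning step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or from any of the compared tools. The function names `parse_fastq` and `clean_reads`, the `min_length` threshold, and the reading of "trimming" as removal of a known leading tag from each read are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record as (header, sequence, quality); illustrative type alias.
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from 4-line FASTQ entries (hypothetical helper)."""
    block: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line and not block:
            continue  # skip blank lines between records
        block.append(line)
        if len(block) == 4:
            header, seq, plus, qual = block
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header[1:], seq, qual
            block = []


def clean_reads(
    records: Iterable[FastqRecord],
    min_length: int = 20,   # assumed threshold, not from the paper
    tag: str = "",          # assumed leading tag to strip, if any
) -> Iterator[FastqRecord]:
    """Strip an optional leading tag, then drop reads shorter than min_length."""
    for header, seq, qual in records:
        if tag and seq.startswith(tag):
            # Trim sequence and quality together so they stay aligned.
            seq, qual = seq[len(tag):], qual[len(tag):]
        if len(seq) >= min_length:
            yield header, seq, qual


if __name__ == "__main__":
    raw = ["@r1", "ACGTACGTAC", "+", "IIIIIIIIII",
           "@r2", "ACG", "+", "III"]
    for record in clean_reads(parse_fastq(raw), min_length=5):
        print(record)
```

Real trimmers such as those compared in the paper also handle adapter matching with mismatches, paired-end synchronisation, and quality-based trimming, none of which this sketch attempts.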

Keywords / Index Terms

Illumina, FASTQ, FASTA, tag removal, single end, paired end
