Open Access Article

A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques

G. Dutta, A. Mukherjee

Section: Research Paper, Product Type: Journal Paper
Volume 6, Issue 7, Page no. 291-295, Jul 2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.291295

Online published on Jul 31, 2018

Copyright © G. Dutta, A. Mukherjee. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: G. Dutta, A. Mukherjee, “A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.291-295, 2018.

MLA Style Citation: G. Dutta, A. Mukherjee "A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques." International Journal of Computer Sciences and Engineering 6.7 (2018): 291-295.

APA Style Citation: G. Dutta, A. Mukherjee, (2018). A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques. International Journal of Computer Sciences and Engineering, 6(7), 291-295.

BibTex Style Citation:
@article{Dutta_2018,
author = {G. Dutta and A. Mukherjee},
title = {A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {7 2018},
volume = {6},
Issue = {7},
month = {7},
year = {2018},
issn = {2347-2693},
pages = {291-295},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2431},
doi = {https://doi.org/10.26438/ijcse/v6i7.291295},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v6i7.291295
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2431
TI - A Novel Approaches to Study Different DNA Pattern Matching Algorithms over Two Compression Techniques
T2 - International Journal of Computer Sciences and Engineering
AU - G. Dutta
AU - A. Mukherjee
PY - 2018
DA - 2018/07/31
PB - IJCSE, Indore, INDIA
SP - 291
EP - 295
IS - 7
VL - 6
SN - 2347-2693
ER -

Abstract

Pattern matching is the task of locating a given pattern within a text stored in a database, and several algorithms exist for this purpose. In sequence analysis, DNA sequences associated with various diseases are stored in large databases, and pattern matching supports their retrieval and comparison. If a particular sequence is found to occur repeatedly, counting its occurrences can indicate the presence and intensity of a disease. Compression reduces the quantity of data used to represent content (e.g., images or video) without excessively reducing its quality. Data compression encodes information in fewer bits than an uncoded representation by means of specific encoding schemes, which reduces the number of bits needed to store the data on disk and makes large volumes of data easier to store. Different techniques are used for data compression. In this paper, a particular pattern is searched for in a compressed DNA sequence using the Brute-force, Boyer-Moore, and KMP string matching algorithms, and the performance of these algorithms is measured. The algorithms are executed over the compressed DNA sequence to measure their performance and to avoid wasteful comparisons. Two techniques, ¼th compression and Huffman compression, are used to compress the DNA sequences, and they are compared to determine which is better suited to pattern matching.
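The abstract does not spell out the ¼th compression scheme, so the following is only a minimal sketch under a stated assumption: each nucleotide (A, C, G, T) is stored as a 2-bit code, so four bases fit in one byte and the packed sequence takes roughly a quarter of its 8-bit-per-character size. The names `CODE`, `pack`, `unpack`, `failure_table`, and `kmp_search_packed` are hypothetical and not taken from the paper. The search is a standard KMP scan that reads the 2-bit codes from the packed bytes without first decompressing them.

```python
# Assumed 2-bit code per nucleotide (illustrative; the paper's mapping is not given).
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte; return (bytes, length)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(packed, n):
    """Recover the first n bases from the packed bytes."""
    bases = []
    for byte in packed:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:n])


def failure_table(codes):
    """KMP failure function: longest proper prefix that is also a suffix."""
    table = [0] * len(codes)
    k = 0
    for i in range(1, len(codes)):
        while k and codes[i] != codes[k]:
            k = table[k - 1]
        if codes[i] == codes[k]:
            k += 1
        table[i] = k
    return table


def kmp_search_packed(packed, n, pattern):
    """Return all start indices of pattern in the packed sequence of n bases."""
    if not pattern:
        return []
    codes = [CODE[b] for b in pattern]
    table = failure_table(codes)
    hits, k = [], 0
    for i in range(n):
        # Extract the i-th 2-bit symbol directly from the packed bytes.
        sym = (packed[i // 4] >> (6 - 2 * (i % 4))) & 0b11
        while k and sym != codes[k]:
            k = table[k - 1]
        if sym == codes[k]:
            k += 1
        if k == len(codes):
            hits.append(i - k + 1)
            k = table[k - 1]  # keep going so overlapping matches are found
    return hits
```

For example, `kmp_search_packed(*pack("ACGTACGTAC"), "GTAC")` returns `[2, 6]`, and the ten-base input occupies three bytes after packing. Counting the returned indices gives the number of occurrences of the pattern, which is the quantity the abstract relates to the presence and intensity of a disease.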

Keywords / Index Terms

Bioinformatics, DNA Pattern Matching, Brute-force, Boyer-Moore, KMP, ¼th compression, Huffman coding
