Open Access Article

Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure

G. Appa Rao1, G. Srinivas2, K. Venkata Rao3, P.V.G.D. Prasad Reddy4

  1. Department of CSE, GIT, GITAM, VISAKHAPATNAM, INDIA.
  2. Department of IT, ANITS, VISAKHAPATNAM, INDIA.
  3. Department of CSSE, Andhra University, VISAKHAPATNAM, INDIA.

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-4, Page no. 400-404, Apr-2018

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v6i4.400404

Online published on Apr 30, 2018

Copyright © G. Appa Rao, G. Srinivas, K. Venkata Rao, P.V.G.D. Prasad Reddy. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: G. Appa Rao, G. Srinivas, K.Venkata Rao, P.V.G.D. Prasad Reddy, “Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.4, pp.400-404, 2018.

MLA Style Citation: G. Appa Rao, G. Srinivas, K.Venkata Rao, P.V.G.D. Prasad Reddy "Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure." International Journal of Computer Sciences and Engineering 6.4 (2018): 400-404.

APA Style Citation: G. Appa Rao, G. Srinivas, K.Venkata Rao, P.V.G.D. Prasad Reddy, (2018). Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure. International Journal of Computer Sciences and Engineering, 6(4), 400-404.

BibTex Style Citation:
@article{Rao_2018,
author = {G. Appa Rao and G. Srinivas and K. Venkata Rao and P.V.G.D. Prasad Reddy},
title = {Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {4 2018},
volume = {6},
number = {4},
month = {4},
year = {2018},
issn = {2347-2693},
pages = {400-404},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=1909},
doi = {10.26438/ijcse/v6i4.400404},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i4.400404
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=1909
TI  - Characteristic mining of Mathematical Formulas from Document - A Comparative Study on Sequence Matcher and Levenshtein Distance procedure
T2  - International Journal of Computer Sciences and Engineering
AU  - G. Appa Rao
AU  - G. Srinivas
AU  - K. Venkata Rao
AU  - P.V.G.D. Prasad Reddy
PY  - 2018
DA  - 2018/04/30
PB  - IJCSE, Indore, INDIA
SP  - 400
EP  - 404
IS  - 4
VL  - 6
SN  - 2347-2693
ER  - 

Abstract

The key problem addressed here is how to identify the mathematics-related keywords in a given text file and store them in a single math text file, which contains only keywords related to mathematics. The math dataset is built from a large collection of tested documents and is stored in this math text file. The dataset is trained on a large number of text files, and it grows as it is trained on more text samples; in the end it contains only math-related keywords. The proposed approaches are evaluated on text containing individual formulas and text containing repeated formulas. The two approaches are Sequence Matcher and Levenshtein distance, both of which are used to check string similarity. Retrieval performance is evaluated on datasets of repeated formulas and of formulas that appear only once, and the time taken for retrieval is also measured.
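As a rough illustration of the two string-similarity measures named in the abstract, the sketch below scores a token against a small keyword list with each measure and times the lookup. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes "Sequence Matcher" refers to Python's `difflib.SequenceMatcher`, it normalises the Levenshtein distance by the longer string's length to obtain a similarity, and the names `MATH_KEYWORDS`, `best_match`, and `levenshtein_similarity` are hypothetical.

```python
import time
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b (row-by-row DP)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def sequence_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] from difflib.SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(token, keywords, similarity):
    """Return (keyword, score) for the keyword most similar to token."""
    return max(((kw, similarity(token, kw)) for kw in keywords),
               key=lambda pair: pair[1])


# Hypothetical stand-in for the math keyword file described in the abstract.
MATH_KEYWORDS = ["integral", "derivative", "matrix", "polynomial"]

if __name__ == "__main__":
    for name, fn in [("SequenceMatcher", sequence_similarity),
                     ("Levenshtein", levenshtein_similarity)]:
        start = time.perf_counter()
        keyword, score = best_match("integrl", MATH_KEYWORDS, fn)
        elapsed = time.perf_counter() - start
        print(f"{name}: {keyword} (score {score:.3f}, {elapsed * 1e6:.1f} µs)")
```

Both functions return a score between 0 and 1, so the same `best_match` helper can be reused to compare the two measures on retrieval quality and time.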

Keywords / Index Terms

Levenshtein distance, Sequence matcher
