A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining
R. Sandrilla, M. Savitha Devi
Section: Review Paper, Product Type: Journal Paper
Volume-6, Issue-7, Page no. 920-928, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.920928
Online published on Jul 31, 2018
Copyright © R. Sandrilla, M. Savitha Devi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: R.Sandrilla, M. Savitha Devi, “A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.920-928, 2018.
MLA Style Citation: R.Sandrilla, M. Savitha Devi "A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining." International Journal of Computer Sciences and Engineering 6.7 (2018): 920-928.
APA Style Citation: R.Sandrilla, M. Savitha Devi, (2018). A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining. International Journal of Computer Sciences and Engineering, 6(7), 920-928.
BibTex Style Citation:
@article{Devi_2018,
author = {R. Sandrilla and M. Savitha Devi},
title = {A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {7 2018},
volume = {6},
Issue = {7},
month = {7},
year = {2018},
issn = {2347-2693},
pages = {920-928},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2536},
doi = {https://doi.org/10.26438/ijcse/v6i7.920928},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v6i7.920928
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=2536
TI - A Study on Data Preprocessing Methods on Web Log Data in Web Usage Mining
T2 - International Journal of Computer Sciences and Engineering
AU - R.Sandrilla
AU - M. Savitha Devi
PY - 2018
DA - 2018/07/31
PB - IJCSE, Indore, INDIA
SP - 920
EP - 928
IS - 7
VL - 6
SN - 2347-2693
ER -
Abstract
Web usage mining is an extension of traditional data mining. As the amount of data on the web grows, so does the prominence of the internet, and the number of users is increasing rapidly. As a result, web data grows day by day, and extracting useful information from the WWW has become a challenge that leaves users disoriented. Web usage miners therefore need to discover new ways of finding the desired information and of easing access to the web. Consequently, web mining has become a popular research area concerned with mining both the data and the WWW itself. This paper surveys the data preprocessing techniques used by most researchers. Web log preparation is the first step of the web mining process for identifying user behavior, and it is regarded as the most important phase for ensuring the quality of the log data. The log files are gathered and preprocessed by removing unwanted or irrelevant information. A complete overview of data preprocessing may help recommend better techniques for finding user behavior and improving performance. The paper concludes with a glimpse of various web mining applications.
Keywords / Index Terms
Web usage mining, Web server log, Data Preprocessing, User identification, Session Identification
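The cleaning step described in the abstract, removing unwanted or irrelevant entries from the web server log, can be illustrated with a minimal sketch. It is not taken from the paper. It assumes the log is in Common Log Format, and the suffix list, the filtering rules (keep only successful GET requests for pages), and the names `clean_log`, `LOG_PATTERN`, `IRRELEVANT_SUFFIXES`, and `SAMPLE_LOG` are hypothetical choices made for illustration.

```python
import re

# Assumed input: Common Log Format lines,
#   host ident authuser [date] "method url protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical list of resource types treated as irrelevant to user behavior.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean_log(lines):
    """Return parsed entries that survive a simple cleaning pass."""
    kept = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # drop lines that do not parse
        entry = match.groupdict()
        path = entry["url"].split("?", 1)[0].lower()
        if entry["method"] != "GET":
            continue  # keep only page requests
        if not entry["status"].startswith("2"):
            continue  # drop failed or redirected requests
        if path.endswith(IRRELEVANT_SUFFIXES):
            continue  # drop embedded images, styles, scripts
        if path == "/robots.txt":
            continue  # drop the request crawlers typically make
        kept.append(entry)
    return kept


# Made-up sample records, used only to exercise the function.
SAMPLE_LOG = [
    '10.0.0.1 - - [31/Jul/2018:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.1 - - [31/Jul/2018:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 512',
    '10.0.0.2 - - [31/Jul/2018:10:00:05 +0000] "GET /missing.html HTTP/1.1" 404 0',
    '10.0.0.3 - - [31/Jul/2018:10:00:09 +0000] "GET /robots.txt HTTP/1.1" 200 68',
    '10.0.0.1 - - [31/Jul/2018:10:01:00 +0000] "POST /login HTTP/1.1" 200 20',
    '10.0.0.1 - - [31/Jul/2018:10:02:00 +0000] "GET /products.html?id=7 HTTP/1.1" 200 2310',
]

if __name__ == "__main__":
    for entry in clean_log(SAMPLE_LOG):
        print(entry["host"], entry["time"], entry["url"])
```

On the sample records, only the requests for `/index.html` and `/products.html?id=7` remain. The cleaned entries would then feed the later preprocessing stages named in the keywords, such as user identification and session identification.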