Modeling Customer’s Credit Worthiness using Enhanced Ensemble Model
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1466-1470, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.14661470
Abstract
Financial organizations such as banks assess the credit worthiness of borrowers before providing new loans. These assessments rest on the ‘probability of default’ of the potential borrower, which is derived from the customer’s credit score. Banks hold customer portfolios that are likely to come through the current crisis without much difficulty, yet it remains difficult to determine the risks arising from various factors. Most features in customer databases have little predictive effect on credit worthiness, so determining the features that do affect it is an important task before knowledge discovery. Predictive models based on Ensemble Methods (EM), a Machine Learning (ML) paradigm under Data Mining (DM), can determine the relevant features and a customer’s credit worthiness efficiently. In this paper, a number of prediction models proposed in the literature are surveyed. To deal with the major challenges in building predictive models, namely incomplete, noisy and very large data, this work proposes a predictive framework (classifier) for detecting credit worthiness. It employs a supervised attribute (feature) selection method that evaluates the worth of attribute subsets, filters, and an Enhanced Ensemble Method based on Boosting for precise convergence. The proposed Enhanced Ensemble based classifier is implemented in Weka 3.8.2. Its performance is then evaluated and compared against several models. The proposed framework selects a minimum number of attributes and performs better than the original base classifiers in terms of False Positive rate, Recall, True Positive rate and Prediction Rate. The model is also suitable for large datasets owing to the benefits of ensembles.
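To illustrate the boosting idea the abstract relies on, the following is a minimal sketch of AdaBoost with one-feature threshold rules (decision stumps). It is not the authors' Weka 3.8.2 implementation or their Enhanced Ensemble Method; the toy one-feature data, the function names and the choice of three rounds are assumptions made for illustration only.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign

def adaboost(X, y, rounds=3):
    """Train a weighted vote of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single threshold separates the two classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, row) for row in X]
```

On this toy set no single stump classifies every example correctly, but the weighted vote of three stumps does, which is the effect boosting is meant to provide over a single base classifier.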
Key-Words / Index Term
Artificial Intelligence, Boosting, Classification, Credit Risk, Credit Worthiness, Ensemble Methods, Machine Learning, Predictive Models, Weka
References
[1] Polikar, R. (2006). Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6(3), 21–45.
[2] Loris Nanni , Alessandra Lumini, “An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring”, Elsevier, Expert Systems with Applications 36 (2009) 3028–3033.
[3] Stuti Nathaniel, Anand Motwani, Arpit Saxena, "Cloud based Predictive Model for Detection of ‘Chronic Kidney Disease’ Risk", International Journal of Computer Sciences and Engineering, Vol.6, Issue.4, pp.185-188, 2018.
[4] Bhekisipho Twala, “Multiple classifier application to credit risk assessment”, Expert Systems with Applications 37 (2010) 3326–3336.
[5] Gang Wang, Jinxing Hao, Jian Mab, Hongbing Jiang, “A comparative assessment of ensemble learning for credit scoring”, Expert Systems with Applications 38 (2011) 223–230.
[6] Zhou, Z. H. (2009). Ensemble. In L. Liu & T. Özsu (Eds.), Encyclopedia of database systems. Berlin: Springer.
[7] Regina Esi Turkson, Edward Yeallakuor Baagyere, Gideon Evans Wenya, “A Machine Learning Approach for Predicting Bank Credit Worthiness”, ISBN: 978-1-4673-9187-0, IEEE 2016
[8] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[9] C.R.Durga devi, Dr.R.Manicka chezian, “A Relative Evaluation of the Performance of Ensemble Learning in Credit Scoring”, IEEE International Conference on Advances in Computer Applications (ICACA), 978-1-5090-3770-4/16, 2016
[10] Zhang, Z., “Research of credit risk of commercial bank’s personal loan based on CHAID decision tree”, Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC), 2011.
[11] http://www.cs.waikato.ac.nz/~ml/weka/index.html
[12] Weka, University of Waikato, Hamilton, New Zealand.
[13] Anand Motwani, Goldi Bajaj, Sushila Mohane, "Predictive Modelling for Credit Risk Detection using Ensemble Method", International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.863-867, 2018.
[14] Ling Kock Sheng, and Teh Ying Wah, “A comparative study of data mining techniques in predicting consumers’ credit card risk in banks”, African Journal of Business Management Vol. 5 (20), pp. 8307-8312, 16 September, 2011
[15] http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
Citation
M. Singh, G. K. Dixit, "Modeling Customer’s Credit Worthiness using Enhanced Ensemble Model," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1466-1470, 2018.
Predicting Credit Worthiness of Bank Customer with Machine Learning over Cloud
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1471-1477, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.14711477
Abstract
Using Machine Learning (ML) and data analytics in banking organizations is more than a trend; it has become essential for keeping up with market competition and reducing credit risk. In recent years, customers’ credit worthiness has become increasingly crucial for financial organizations. In the past, many credit risk models, which are essentially statistical tools, were used to infer the future probability of a customer defaulting. At the same time, the massive increase in the volume, variety and velocity of data generated by banking and business transactions poses a great computational and storage challenge for data analysis and intelligence tasks. The Cloud Computing (CC) paradigm evolved to address these challenges. Data and computation can now be distributed to a CC environment with minimal effort, and the CC paradigm has turned out to be a valuable alternative for speeding up ML platforms. This paper aims to build three machine learning models for predicting credit card payment default, and to assess their performance, on the Microsoft Azure Machine Learning platform. Finally, a predictive analytics framework for classifying and predicting payment default by credit holders is proposed. A large, real and recent credit card dataset obtained from the UCI repository is used to develop and test the model. The key focus of the work is the detection of credit worthiness, defined as the ‘probability of default’ on a loan or credit from financial organizations such as banks. The efficacy of the model is demonstrated against benchmark classifiers on the basis of prediction accuracy and other metrics. The work also demonstrates the use of Microsoft Azure, one of the foremost cloud environments for ML. The results attained by the proposed model are promising and have the potential to direct future research in the domain.
Key-Words / Index Term
Artificial Intelligence, Predictive Modelling, Cognitive Computing, Computer Vision, Credit Risk, Credit Worthiness, Data Classification, Financial Organization, Machine Learning, Microsoft Azure, Cloud Computing
References
[1] https://www.investopedia.com/terms/c/credit-worthiness.asp
[2] https://www.moodys.com/sites/products/ProductAttachments/CreditMonitor_brochure.pdf
[3] Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth, “From Data Mining to Knowledge Discovery in Databases”,American Association for Artificial Intelligence. All rights reserved. 0738-4602-1996.
[4] Sebastiaan Tesink,” Improving Intrusion Detection Systems through Machine Learning” Tilburg University March 2007.
[5] K. Tumer and N. C. Oza, “Decimated input ensembles for improved generalization,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN `99), pp. 3069–3074, Washington, DC, USA, July 1999.
[6] Belady C. “In the data center, power and cooling costs more than the it equipment it supports” 2007. URL: http://www.electronics-cooling.com/articles/2007/feb/a3/.
[7] National Institute of Standards and Technology (2011) NIST cloud computing reference architecture: Version 1. NIST Meeting Report
[8] P. Mell, T. Grance, “The NIST Definition of Cloud Computing, National Institute of Standards and Technology”, ver. 15, 9 July 2010.
[9] Regina Esi Turkson, Edward Yeallakuor Baagyere, Gideon Evans Wenya, “A Machine Learning Approach for Predicting Bank Credit Worthiness”, ISBN: 978-1-4673-9187-0, IEEE 2016
[10] Ling Kock Sheng, and Teh Ying Wah, “A comparative study of data mining techniques in predicting consumers’ credit card risk in banks”, African Journal of Business Management Vol. 5 (20), pp. 8307-8312, 16 September, 2011
[11] Loris Nanni , Alessandra Lumini, “An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring”, Elsevier, Expert Systems with Applications 36 (2009) 3028–3033.
[12] U Bhuvaneswari, P. James Daniel Paul, Siddhant Sahu, “Financial Risk Modelling in Vehicle Credit Portfolio”, 978-1-4799-4674-7/14/$31.00, IEEE 2014
[13] C.R.Durga devi, Dr.R.Manicka chezian, “A Relative Evaluation of the Performance of Ensemble Learning in Credit Scoring”, IEEE International Conference on Advances in Computer Applications (ICACA), 978-1-5090-3770-4/16, 2016
[14] David Opitz and Richard Maclin, “Popular Ensemble Methods: An Empirical Study”, Journal of artificial intelligence research 169-198, 1999.
[15] L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[16] Freund, Y., & Schapire, R. (1996), “Experiments with a new boosting algorithm”, in Proceedings of the thirteenth international conference on machine learning, Bari, Italy, pp. 148–156.
[17] Wolpert, D. H. (1992). Stacked generalization, Neural Networks, 5(2), 241–259.
[18] Gang Wang, Jinxing Hao, Jian Mab, Hongbing Jiang, “A comparative assessment of ensemble learning for credit scoring”, Expert Systems with Applications 38 (2011) 223–230.
[19] K. Tumer and N. C. Oza, “Decimated input ensembles for improved generalization,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN `99), pp. 3069–3074, Washington, DC, USA, July 1999.
[20] Anand Motwani, Goldi Bajaj, Sushila Mohane, "Predictive Modelling for Credit Risk Detection using Ensemble Method", International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.863-867, 2018.
[21] Sihem Khemakhem ,and Younés Boujelbènea, “Credit risk prediction: A comparative study between discriminant analysis and the neural network approach”, Accounting and Management Information Systems Vol. 14, No. 1, pp. 60-78, 2015
[22] https://aws.amazon.com/aml/
[23] https://cloud.google.com/products/machine-learning/
[24] https://cloud.google.com/ml-engine/
[25] https://azure.microsoft.com/en-in/services/machine-learning-studio/
[26] Weka, University of Waikato, Hamilton, New Zealand., http://www.cs.waikato.ac.nz/~ml/weka/index.html
[27] http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
Citation
A. Motwani, P. Chaurasiya, G. Bajaj, "Predicting Credit Worthiness of Bank Customer with Machine Learning over Cloud," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1471-1477, 2018.
Test case selection using multi-objective Evolutionary Algorithms
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1478-1484, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.14781484
Abstract
Regression testing is needed to ensure the correct behavior of software after a change. A number of meta-heuristic techniques have been used in the literature to automate test case selection. In this paper, the bat algorithm, cuckoo search and multi-objective binary genetic algorithms are discussed. The proposed multi-objective binary genetic algorithm is evaluated against test functions, and its performance is compared with that of the existing bat and cuckoo search algorithms. The comparison considers fault coverage and execution time. The dataset is extracted from the benchmark object named flex, which originates from the Software-artifact Infrastructure Repository (SIR). Results indicate that the multi-objective binary genetic algorithm performs better than the bat and cuckoo search algorithms.
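The two objectives named in the abstract, fault coverage and execution time, can be made concrete with a small sketch. The code below is not the paper's genetic algorithm: it only shows the binary encoding of a selection, the evaluation of both objectives, and the Pareto dominance test a multi-objective search would apply, using exhaustive enumeration instead of evolution. The fault sets and run times are invented for illustration.

```python
from itertools import product

# Hypothetical data: faults detected by each test case and its run time (s).
FAULTS = [{0, 1}, {1, 2}, {3}, {0, 3}]
TIMES = [4.0, 3.0, 2.0, 5.0]

def evaluate(bits):
    """Return (fault coverage, execution time) for a binary selection vector."""
    covered = set()
    for bit, faults in zip(bits, FAULTS):
        if bit:
            covered |= faults
    time = sum(t for bit, t in zip(bits, TIMES) if bit)
    return len(covered), time

def dominates(a, b):
    """a dominates b: coverage no lower, time no higher, and not identical."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_front(candidates):
    """Keep the selections that no other selection dominates."""
    scored = [(bits, evaluate(bits)) for bits in candidates]
    return [(bits, score) for bits, score in scored
            if not any(dominates(other, score) for _, other in scored)]

# Enumerate all 2^4 selections; a GA would sample this space instead.
front = pareto_front(product((0, 1), repeat=len(FAULTS)))
front_scores = sorted(score for _, score in front)
```

Each entry of `front` is a trade-off: for example, selecting the second and fourth test cases covers all four faults in 8 seconds, and no other selection covers as many faults in less time.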
Key-Words / Index Term
Software Testing, Regression Testing, Bat Algorithm, Cuckoo Search Algorithm, Software Maintenance
References
[1] S Nachiyappan, A Vimala devi and C B Selva Lakshmi, “An evolutionary algorithm for regression test suite reduction”, In Communication and Computational Intelligence (INCOCCI), 2010 International Conference, pp. 503-508, 2010, December, IEEE.
[2] G Rothermel, S Elbaum, A Kinneer and H Do, 2006. Software-artifact infrastructure repository. URL http://sir.unl.edu/portal.
[3] R.Y Nakamura, L.A Pereira, K A Costa, D Rodrigues, J.P Papa and X.S Yang, 2012,“BBA: a binary bat algorithm for feature selection”. In 2012, 25th SIBGRAPI Conference on Graphics Patterns and Images, pp. 291-297, 2012,August, IEEE.
[4] Zhaolu Guo, Xuezhi Yue, Kejun Zhang, Shenwen Wang and Zhijian Wu, “A Thermodynamical Selection-Based Discrete Differential Evolution for the 0-1 Knapsack Problem”, Entropy 2014, 16, 6263-6285; doi:10.3390/e16126263.
[5] Khan, K. and Sahai, A., 2012. A comparison of BA, GA, PSO, BP and LM for training feed forward neural networks in e-learning context. International Journal of Intelligent Systems and Applications ,4(7), p.23.
[6] Songwei Huang, Lifang He, Xu Si, Yuanyuan Zhang and Pengyu Hao, An Effective Krill Herd Algorithm for Numerical Optimization, International Journal of Hybrid Information Technology Vol. 9, No.7 (2016), pp. 127-138 http://dx.doi.org/10.14257/ijhit.2016.9.7.13.
[7] Usha Badhera, G.N Purohit, Debarupa Biswas, 2012 Test case prioritization algorithm based upon modified code coverage regression testing, International Journal of Software Engineering & Applications (IJSEA), Vol.3, No.6, November 2012.
[8] Luciano S. de Souza, Ricardo B.C. Prudêncio, Flavia de A. Barros, Eduardo H. da S. Aranha, 2013 Search based constrained test case selection using execution effort, Expert Systems with Applications 40 (2013) 4887–4896.
[9] Yanhong Feng, Ke Jia and Yichao He, An Improved Hybrid Encoding Cuckoo Search Algorithm for 0-1 Knapsack Problems, Hindawi Publishing Corporation Computational Intelligence and Neuroscience Volume 2014, Article ID 970456, 9 pages http://dx.doi.org/10.1155/2014/970456.
[10] Srivastava P.R, Sravya C, Ashima, Kamisetti S and Lakshmi M, 2012. Test sequence optimisation: an intelligent approach via cuckoo search. International Journal of Bio-Inspired Computation,4(3), pp.139-148.
[11] Nagar R, Kumar A, Singh G.P and Kumar S, 2015, February. Test case selection and prioritization using cuckoos search algorithm. In Futuristic Trends on Computational analysis and knowledge management(ABLAZE) 2015, International Conference on (pp. 283-288).IEEE.
[12] Yang, X.S., 2010. A new metaheuristic bat-inspired algorithm. In Nature Inspired Cooperative Strategies for Optimization (NICSO 2010) (pp. 65-74). Springer Berlin Heidelberg.
[13] Biswal, S., Barisal, A.K., Behera, A. and Prakash, T., 2013, April. Optimal power dispatch using BAT algorithm. In Energy Efficient Technologies for Sustainability (ICEETS) 2013, National Conference on (pp. 1018-1023). IEEE.
[14] Yang, X.S. and Deb, S., 2009, December. Cuckoo search via Lévy flights. In Nature and Biologically Inspired Computing, 2009 NaBIC 2009, World Congress on (pp. 210-214). IEEE.
[15] Deb, K., 2001, “Multi-Objective Optimization using Evolutionary Algorithms,” Wiley Chichester, UK.
[16] Abdullah Konak, David W. Coit , Alice E. Smith, Multi-objective optimization using genetic algorithms: A tutorial, Reliability Engineering & System Safety Volume 91, Issue 9, September 2006, Pages 992-1007, Elsevier.
[17] P. Sudheer Kumar Reddy* P. Anil Kumar G.N.S. Vaibhav, Application of BAT Algorithm for Optimal Power Dispatch, International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 2, Volume 2 (February 2015).
[18] Lingzhi Yi, Yue Liu, wenxin Yu, Genping Wang, Yongbo Sui, Adaptive Cuckoo Search Algorithm for the Speed Control System of Induction Motor, SCIREA Journal of Electrical Engineering http://www.scirea.org/journal/DEE February 21, 2017 Volume 2, Issue 1, February 2017.
[19] De Souza, L.S., Prudêncio, R.B. and Barros, F.D.A., MultiObjective Test Case Selection: A study of the influence of the Catfish effect on PSO based strategies.
[20] Ali, A., Nadeem, A., Iqbal, M.Z.Z. and Usman, M., 2007, December. Regression testing based on UML design models. In Dependable Computing, 2007. PRDC 2007. 13th Pacific Rim International Symposium on (pp. 85-88). IEEE.
[21] Gupta Nirmal Kumar and Rohil Mukesh Kumar "Improving GA based Automated Test Data Generation Technique for Object Oriented Software", IEEE International Advance Computing Conference (IACC), Ghaziabad, pp.249 – 253, 2013.
[22] Srivastava, P.R., Bidwai, A., Khan, A., Rathore, K., Sharma, R. and Yang, X.S., 2014. An empirical study of test effort estimation based on bat algorithm. International Journal of Bio-Inspired Computation,6(1), pp.57-70.
[23] Praveen Ranjan Srivastava, Chandolu Sravya, Ashima, Sai Kamisetti and Manogna Lakshmi, Test sequence optimisation: an intelligent approach via cuckoo search, Int. J. Bio-Inspired Computation, Vol. 4, No. 3, 2012.
[24] Ehsan Valian, Saeed Tavakoli, Shahram Mohanna, Atiyeh Haghi, Improved cuckoo search for reliability optimization problems, Computers & Industrial Engineering 64 (2013) 459–468, Elsevier.
[25] Xin-She Yang, Bat algorithm: literature review and applications, Int. J. Bio-Inspired Computation, Vol. 5, No. 3, pp. 141–149 (2013). DOI: 10.1504/IJBIC.2013.055093
[26] S. L. Yadav, M. Phogat, “A Review on Bat Algorithm”, International Journal of Computer Sciences and Engineering, Volume-5, Issue-7, E-ISSN: 2347-2693.
Citation
S. Raheja, R. Singh, "Test case selection using multi-objective Evolutionary Algorithms," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1478-1484, 2018.
The Role of Technology and Education in Financial Inclusion – A Data Mining Analysis in a Fuzzy Framework
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1485-1497, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.14851497
Abstract
Data mining is the exploration and analysis of large data sets in order to discover meaningful patterns and rules. Fuzzy Logic can be integrated with Data Mining techniques to incorporate human-like reasoning in pattern discovery. In this paper, Fuzzy Data Mining techniques are applied to a database created from a survey on Financial Inclusion. Popular Data Mining techniques such as clustering and association rule mining are used in a fuzzy framework at various stages of the analysis. Starting from the survey database, the paper proceeds through all the steps of Data Mining: pre-processing, attribute selection, segmenting quantitative values, clustering, and finally arriving at natural fuzzy association rules. Financial Inclusion is one of the key areas on which economists and governments concentrate for the eradication of poverty. The analysis leads to the conclusion that education and an introduction to Information Technology play the most significant role in Financial Inclusion.
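How a fuzzy association rule is scored can be sketched briefly. The example below assumes linear ramp membership functions, the minimum operator for combining memberships, and a five-row invented survey; none of these choices is taken from the paper, which does not state its membership functions in the abstract.

```python
def ramp(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

# Hypothetical survey rows: (years of education, inclusion score 0-10).
records = [(16, 9), (12, 6), (8, 2), (10, 4), (15, 8)]

def high_education(years):
    return ramp(years, 8, 16)

def high_inclusion(score):
    return ramp(score, 2, 10)

def fuzzy_support(memberships):
    """Fuzzy support: average membership over all records."""
    return sum(memberships) / len(memberships)

mu_a = [high_education(years) for years, _ in records]
mu_b = [high_inclusion(score) for _, score in records]

# Rule "education is high -> inclusion is high", combined with min.
support_a = fuzzy_support(mu_a)
support_ab = fuzzy_support([min(a, b) for a, b in zip(mu_a, mu_b)])
confidence = support_ab / support_a
```

Unlike crisp association rules, each record contributes a degree between 0 and 1 to the support, so a respondent with 12 years of education counts as half a member of "high education" instead of being forced on one side of a cut-off.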
Key-Words / Index Term
Financial inclusion, Fuzzy Data Mining, Fuzzy Logic, Fuzzy Clustering
References
[1]. ADB, (2007): “Low-Income Households’ Access to Financial Services”, International Experience, Measures for Improvement and the Future; Asian Development Bank
[2]. Kempson, E. (2006): “Policy Level Response to Financial Exclusion in Developed Economies: Lessons for Developing Countries”, Paper for Access to Finance: Building Inclusive Financial Systems, World Bank, Washington, May
[3]. Sen, Amartya, (2000): ‘Development as Freedom’, Anchor Books, New York, 2000.
[4]. Peachy, S. and A. Roe, (2004): “Access to Finance - What Does it Mean and How” Applications in the Social Sciences, SAGE Publications, London.
[5]. Kempson, E. (2006): “Policy Level Response to Financial Exclusion in Developed Economies: Lessons for Developing Countries”, Paper for Access to Finance: Building Inclusive Financial Systems, World Bank, Washington, May
[6]. H.M. Treasury, (2007): “Financial Inclusion: the Way Forward”, HM Treasury, UK, March
[7]. Mohan, R. (2006): “Agricultural Credit in India: Status, Issues and Future Agenda”, Economic and Political Weekly (March), pp.1013-23
[8]. Towards Faster and More Inclusive Growth The approach paper to the Eleventh Plan 2006, retrieved from, http://mhrd.gov.in/sites/upload_files/mhrd/files/apppap_11_1.pdf
[9]. Report of the Committee on Financial Inclusion in India (Chairman: C. Rangarajan)(2008), Government of India
[10]. Reasons for financial exclusion - Reserve Bank of India(2007)www.rbi.org.in › Speeches
[11]. W. H. Inmon: The data warehouse and data mining. Commun. ACM, vol. 39, pp. 49–50 (1996)
[12]. Pavel Berkhin, “Survey of Clustering Data Mining Techniques”, http://citeseer.ist.psu.edu/berkhin02survey.html
[13]. U. Fayyad and R. Uthurusamy, “Data mining and knowledge discovery in databases”, Commn. ACM, vol. 39, pp. 24–27, 1996.
[14]. Pujari, A. K., 2001, Data Mining Techniques, University Press, Hyderabad, India
[15]. K. Pal and P. Mitra(January 2002).: Data Mining in Soft Computing Framework: A Survey. IEEE transactions on neural networks, vol. 13, no. 1 ( January 2002)
[16]. G. J Klir, T A. Folger, Fuzzy Sets, Uncertainty and Information, Prentice Hall,1988
[17]. Wai-Ho Au, Keith C.C. Chan, “Classification with Degree of Membership: A Fuzzy Approach”, Proceedings IEEE International Conference on Data Mining (ICDM), 2001
[18]. U.M. Fayyad; G. Piatetsky-Shapiro; P. Smyth, 1996, "Advances in knowledge discovery and data mining", MIT Press
[19]. M.S. Aldenderfer; R.K. Blashfield, 1984, "Cluster analysis", volume 44 of Quantitative Applications in the Social Sciences, SAGE Publications, London.
[20]. Hegland, M. 2003. Data Mining – Challenges, Models, Methods and Algorithms. May 8.Scientific Literature Digital Library, citeseer.ist.psu.edu/context/68861/0.
[21]. G Raju, B Thomas, Fuzzy clustering methods in data mining: A comparative case analysis, Proceedings of International Conference on Advanced Computer Theory and Engineering, 2008. ICACTE `08, 2008, pp 489 – 493
[22]. E. Cox.: Fuzzy Modeling and Genetic Algorithms for Data Mining and Exploration. Elsevier, (2005)
[23]. Binu Thomas, G. Raju, and W. Sonam. "A modified fuzzy c-means algorithm for natural data exploration." World Academy of Science, Engineering and Technology, Volume 49, 2009 pp 478-481
[24]. M Muyeba, MS Khan, F Coenen,(2010) citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.9281&rep=rep1&type=pdf
[25]. Shu Yue, J , Tsang, E. ; Yeung, D. Mining fuzzy association rules with weighted items Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics ,2000 , pp 1906 - 1911 vol.3
[26]. Patel, Mohnish, Aasif Hasan, and Sushil Kumar. "A survey: Preventing discovering association rules for large data base." International Journal of Scientific Research in Computer Science and Engineering 1.2 (2013): 30-32.
[27]. Ghuse, Namrata, Pranali Pawar, and Amol Potgantwar. "An Improved Approch For Fraud Detection In Health Insurance Using Data Mining Techniques." health 11.12 (2017): 13.
Citation
Binu Thomas, "The Role of Technology and Education in Financial Inclusion – A Data Mining Analysis in a Fuzzy Framework," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1485-1497, 2018.
Hybrid Music Recommendation System Using Content-based Filtering and K-Mean Clustering Algorithm
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1498-1501, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.14981501
Abstract
Data is recognized as an important source for knowledge generation. Sometimes users are aware of their requirements, and sometimes they are not. Recommender systems are software tools that suggest items or predict customer preferences using prior user information. Recommendations can help to increase sales and improve user satisfaction. A music recommendation system can help users explore related music based on their preferences or on the internal similarity between tracks. A hybrid recommender system is usually developed by combining multiple recommendation techniques to boost the quality of recommendations. This paper combines content-based filtering with the K-mean clustering algorithm in a music recommendation system that suggests effective and relevant content.
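The combination described in the abstract can be sketched as follows: tracks are clustered by their content features with k-means, and recommendations for a seed track are drawn from its cluster and ranked by cosine similarity. This is an illustrative sketch under assumed inputs, not the authors' system; the track names, the three features (tempo, energy, acousticness) and the deterministic initialisation are all hypothetical.

```python
import math

# Hypothetical content features per track: (tempo, energy, acousticness).
TRACKS = {
    "calm_a": (0.20, 0.10, 0.90),
    "rock_a": (0.80, 0.90, 0.10),
    "calm_b": (0.30, 0.20, 0.80),
    "rock_b": (0.90, 0.80, 0.20),
    "calm_c": (0.25, 0.15, 0.85),
    "rock_c": (0.85, 0.95, 0.15),
}

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

names = list(TRACKS)
labels = dict(zip(names, kmeans([TRACKS[n] for n in names], k=2)))

def recommend(seed):
    """Tracks from the seed's cluster, most similar first."""
    peers = [n for n in names if n != seed and labels[n] == labels[seed]]
    return sorted(peers, key=lambda n: cosine(TRACKS[seed], TRACKS[n]),
                  reverse=True)

suggestions = recommend("calm_a")
```

Clustering first narrows the candidate set, so the similarity ranking only has to compare the seed against tracks that are already close in feature space.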
Key-Words / Index Term
Recommendation system; Content-based filtering; K-mean; Data Mining
References
[1]. Paulo Chiliguano, Gyorgy Fazekas “Hybrid Music Recommender Using Content-based And Social Information” published in IEEE ICASSP, 2016 pp 2618-2622
[2]. Milind Mathur, Ayush Kesarwani , “Selective Unsupervised Feature Learning with Convolutional Neural Network (S-CNN)” published in NCNHIT 2013
[3]. B Amini, R Ibrahim, MS Othman, MA Nematbakhsh . Expert Systems with Applications 42 (2), 913-928, 2015 . 14, 2015. Discovering the impact of knowledge in recommender systems: A comparative study International Journal of Computer Applications (0975–8887) 23 (4).
[4]. Jyotsna Chanda, “An Improved Web Page Recommendation System Using Partitioning and Web Usage Mining” Proceedings of the International Conference on Intelligent Processing, Security and Advanced Communication. Article No. 80.
[5]. International Journal of Scientific Research in Computer Sciences and Engineering (ISSN: 2320-7639) .
Citation
Karishma Mandloi, Amit Mittal, "Hybrid Music Recommendation System Using Content-based Filtering and K-Mean Clustering Algorithm," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1498-1501, 2018.
Page Ranking Algorithm for Ranking Web Pages
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1502-1505, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.15021505
Abstract
Billions of data items related to user queries are stored across web pages, and the volume grows every day. Query results often do not satisfy the user, and web pages sometimes present irrelevant or insufficient data. Page ranking algorithms address this problem. Page ranking is based on user queries, and the PageRank algorithm is widely used by search engines on the World Wide Web to order web pages by relevance. PageRank plays a major role in the process of web mining. For each user query, a search engine associates a rank list with the listed web pages. Pages with a higher rank are listed at the top, which helps the user obtain the most relevant and useful information in the minimum possible time. The page ranking algorithm presented here can display both the link and the content at the same time. Many algorithms are used for page ranking, such as the Google PageRank algorithm and the Hyperlink-Induced Topic Search (HITS) algorithm. With the presented algorithm, older, outdated web pages can easily be eliminated from the rank list. The rank of a page does not change on user clicks alone, because the most visited pages are not necessarily useful or satisfying to the user. The algorithm therefore considers both how often a page is visited and the total time users spend on it. The ranking of each page is updated regularly according to the current trend.
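For readers unfamiliar with the baseline the abstract builds on, the classic link-based PageRank can be sketched with a short power iteration. This is the standard formulation with a damping factor of 0.85, not the visit-count and time-spent variant the paper proposes; the four-page link graph is invented for illustration.

```python
def pagerank(links, damping=0.85, iters=50):
    """Power iteration over a dict mapping each page to its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page receives the same small "random jump" share.
        new = {p: (1 - damping) / n for p in pages}
        for page, outs in links.items():
            if outs:
                share = damping * rank[page] / len(outs)
                for target in outs:
                    new[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical link graph: D links out but nothing links to it.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

Page C ends up on top because three pages link to it, while D, which has no incoming links, keeps only the random-jump share of (1 - 0.85) / 4.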
Key-Words / Index Term
Data mining, Hyperlink-Induced Topic Search [HITS] algorithm, Page Rank Algorithm.
References
[1] S. Gupta, V. Jain, P. Bhadana, “Combined Approach for Page Ranking In Information Retrieval System Using Context and TF-IDF Weight”, International Journal of Computer Sciences and Engineering, Vol.2, Issue.6, pp.39-42, June-2014.
[2] Diefenbach D., Thalhammer A. (2018) PageRank and Generic Entity Summarization for RDF Knowledge Bases. In: Gangemi A. et al. (eds) The Semantic Web. ESWC 2018. Lecture Notes in Computer Science, vol 10843. Springer, Cham
[3] Thalhammer A., Rettinger A. (2016) PageRank on Wikipedia: Towards General Importance Scores for Entities. In: Sack H., Rizzo G., Steinmetz N., Mladenić D., Auer S., Lange C. (eds) The Semantic Web. ESWC 2016. Lecture Notes in Computer Science, vol 9989. Springer, Cham
[4] N. Duhan, A. K. Sharma and K. K. Bhatia, “Page Ranking Algorithms: A Survey”, Proceedings of the IEEE International Conference on Advance Computing, 2009.
[5] Kohlschütter C., Chirita PA., Nejdl W. (2006) Efficient Parallel Computation of PageRank. In: Lalmas M., MacFarlane A., Rüger S., Tombros A., Tsikrika T., Yavlinsky A. (eds) Advances in Information Retrieval. ECIR 2006. Lecture Notes in Computer Science, vol 3936. Springer, Berlin, Heidelberg
[6] R. Kosala, H. Blockeel, “Web Mining Research: A Survey”, SIGKDD Explorations, Newsletter of the ACM Special Interest Group on Knowledge Discovery and Data Mining Vol. 2, No. 1 pp 1-15, 2000.
[7] P Ravi Kumar, and Singh Ashutosh kumar, ”Web Structure Mining Exploring Hyperlinks and Algorithms for Information Retrieval”, American Journal of applied sciences, 7 (6) 840-845 2010.
[8] Wenpu Xing and Ali Ghorbani, “Weighted PageRank Algorithm”, Proceedings of the Second Annual Conference on Communication Networks and Services Research (CNSR ’04), IEEE, 2004.
[9] J. Kleinberg, “Authoritative Sources in a Hyper-Linked Environment”, Journal of the ACM 46(5), pp. 604-632,1999.
[10] Lages, J., Patt, A. & Shepelyansky, D. Eur. Phys. J. B (2016) 89: 69. https://doi.org/10.1140/epjb/e2016-60922-0, Springer Berlin Heidelberg.
[11] D. Cohn and H. Chang, "Learning to probabilistically identify Authoritative Documents". In Proceedings of 17th International Conf. on Machine Learning, pages 167-174. Morgan Kaufmann, San Francisco, CA, 2000.
Citation
V.Banu Priya, T.Meyyapan, SM Thamarai, "Page Ranking Algorithm for Ranking Web Pages," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1502-1505, 2018.
A Novel Methodology for Mining Frequent Itemsets from Temporal Dataset
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1506-1511, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.15061511
Abstract
Traditional data mining techniques identify frequent itemsets without considering the temporal dimension of the data. As a result, frequent itemset mining performs poorly on temporal data. The extended Apriori algorithm proposed in this work takes the time interval into account while identifying frequent itemsets. The main objective is to identify pattern sets in periodic intervals of temporal datasets. The proposed method is applied to datasets from the UCI data repository. Experimental results are tabulated and plotted, and they show an improvement over the traditional Apriori algorithm.
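The motivation for interval-aware mining can be shown with a small sketch: transactions are grouped by time interval and a standard Apriori pass is run inside each group. This is one simple way to account for time and is not the paper's extended Apriori algorithm; the quarterly intervals, the items and the 0.6 support threshold are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets whose support >= min_support."""
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items()
                 if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent

def frequent_by_interval(records, min_support):
    """Group (interval, items) records by interval and mine each group."""
    buckets = defaultdict(list)
    for interval, items in records:
        buckets[interval].append(frozenset(items))
    return {iv: apriori(ts, min_support) for iv, ts in buckets.items()}

# Hypothetical time-stamped transactions.
records = [
    ("Q1", ["milk", "bread"]), ("Q1", ["milk", "bread", "eggs"]),
    ("Q1", ["milk"]),
    ("Q2", ["tea", "sugar"]), ("Q2", ["tea", "sugar"]),
    ("Q2", ["milk", "tea"]),
]
per_interval = frequent_by_interval(records, min_support=0.6)
overall = apriori([frozenset(items) for _, items in records], 0.6)
```

Mined over the whole dataset, only {milk} reaches the threshold, whereas mining per interval recovers {milk, bread} in Q1 and {tea, sugar} in Q2, patterns that the time-blind pass dilutes.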
Key-Words / Index Term
Data mining, Apriori Algorithm, Frequent Itemsets, Temporal Data.
References
[1]. Mazaher Ghorbani and Masound Abessi, “A New Methodology for Mining Frequent Itemsets on Temporal Data,” IEEE Transactions on Engineering Management, vol. 64, issue 4, Nov. 2017.
[2]. Y. Xiao, Y. Tian, and Q. Zhao, “Optimizing frequent time-window selection for association rules mining in a temporal database using a variable neighbourhood search,” Comput. Oper.Res., vol. 52, pp. 241–250, 2014.
[3]. D. Nguyen, B. Vo, and B. Le, “CCAR: An efficient method for mining class association rules with itemset constraints,” Eng. Appl. Artif. Intell.,vol. 37, pp. 115–124, 2015.
[4]. C.-H. Lee, M.-S. Chen, and C.-R. Lin, “Progressive partition miner: An efficient algorithm for mining general temporal association rules,” IEEE Trans. Knowl. Data Eng., vol. 15, no. 4, pp. 1004–1017, Jul./Aug. 2003.
[5]. C.-H. Lee, J. C. Ou, and M.-S. Chen, “Progressive weighted miner: An efficient method for time-constraint mining,” in Advances in Knowledge Discovery and Data Mining. New York, NY, USA: Springer, 2003,pp. 449–460.
[6]. Temporal Association Rule Mining Based On T-Apriori Algorithm And Its Typical Application.
[7]. Efficient Algorithm for Mining Temporal Association Rule, vol 7, no 4, April 2007.
[8]. Temporal Association Rules in Mining Method, This paper to survey content.
[9]. R. Agrawal, T. Imieli´nski, and A. Swami, “Mining association rules between sets of items in large databases,” ACM SIGMOD Rec., vol. 22, no. 2, pp. 207–216, 1993.
[10]. B. Liu,W. Hsu, and Y. Ma, “Integrating classification and association rule mining,” in Proc. 4th Int. Conf. Knowl. Discovery Data Mining, 1998, pp. 80–86.
[11]. D. Nguyen, B. Vo, and B. Le, “CCAR: An efficient method for mining class association rules with itemset constraints,” Eng. Appl. Artif. Intell.,vol. 37, pp. 115–124, 2015.
[12]. X. Wu, C. Zhang, and S. Zhang, “Efficient mining of both positive and negative association rules,” ACM Trans. Inf. Syst., vol. 22, no. 3,pp. 381–405, 2004.
[13]. Y.-L. Chen, K. Tang, R.-J. Shen, and Y.-H. Hu, “Market basket analysis in a multiple store environment,” Dec. Support Syst., vol. 40, no. 2, pp. 339–354, 2005.
[14]. M. Shaheen, M. Shahbaz, and A. Guergachi, “Context based positive and negative spatio temporal association rule mining,” Knowl.-Based Syst.,vol. 37, pp. 261–273, 2013.
[15]. Y.-L. Chen and C.-H. Weng, “Mining fuzzy association rules from questionnaire data,” Knowl.-Based Syst., vol. 22, no. 1, pp. 46–56, 2009.
[16]. J. Han and Y. Fu, “Discovery of multiple-level association rules from large databases,” in Proc. 21st Int. Conf. Very Large Data Bases, 1995,pp. 420–431.
[17]. F. Benites and E. Sapozhnikova, “Hierarchical interestingness measures for association rules with generalization on both antecedent and consequent sides,” Pattern Recognit. Lett., vol. 65, pp. 197–203, 2015
[18]. Krutika. K .Jain , Anjali . B. Raut “Review paper on finding Association rule using Apriori Algorithm in Data mining for finding
Citation
B. Sowndarya, T. Meyyappan, SM Thamarai, "A Novel Methodology for Mining Frequent Itemsets from Temporal Dataset," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1506-1511, 2018.
Predicting Rating from Textual Reviews
Review Paper | Journal Paper
Vol.6 , Issue.7 , pp.1512-1516, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.15121516
Abstract
In recent years, review websites have flourished. They offer a great opportunity to share our views on the various products we purchase. However, we face an information overload problem. Mining valuable information from reviews to understand a user's preferences and make an accurate recommendation is crucial. Traditional recommender systems (RS) consider several factors, such as the user's purchase records, product category, and geographic location. In this work, we propose a sentiment-based rating prediction method (RPS) to improve prediction accuracy in recommender systems. First, we propose a social user sentiment measurement approach and calculate each user's sentiment on items/products. Second, we consider not only a user's own sentiment attributes but also interpersonal sentiment influence. Then, we consider item reputation, which can be inferred from the sentiment distributions of a user set that reflect users' comprehensive evaluation. Finally, we fuse three factors (user sentiment similarity, interpersonal sentiment influence, and item reputation similarity) into our recommender system to make an accurate rating prediction. We conduct a performance evaluation of the three sentiment factors on a real-world dataset collected from Yelp.
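Two of the building blocks named above, a per-review sentiment score and an item reputation derived from a sentiment distribution, can be sketched as follows. This is an illustrative assumption only, not the method of the paper: the toy lexicon, the negation rule and the function names are hypothetical, and a real system would rely on a curated sentiment dictionary.

```python
from collections import Counter

# Hypothetical toy lexicon; not taken from the paper.
LEXICON = {"great": 1, "good": 1, "tasty": 1, "bad": -1, "slow": -1, "rude": -1}
NEGATIONS = {"not", "never", "no"}

def review_sentiment(text):
    """Sum word polarities, flipping the word that directly follows a negation."""
    score, negate = 0, False
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(word, 0)
        score += -polarity if negate else polarity
        negate = False
    return score

def item_reputation(reviews):
    """Share of positive, neutral and negative reviews for one item."""
    labels = Counter()
    for text in reviews:
        s = review_sentiment(text)
        labels["positive" if s > 0 else "negative" if s < 0 else "neutral"] += 1
    total = sum(labels.values())
    if total == 0:
        return {}
    return {k: labels[k] / total for k in ("positive", "neutral", "negative")}

# Hypothetical reviews for a single item.
reputation = item_reputation(["Great food", "rude staff", "okay", "good and tasty"])
```

Two items could then be compared by the similarity of their reputation distributions, which is one plausible reading of "item reputation similarity".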
Key-Words / Index Term
Meta-Data, Rating prediction, yelp
Citation
Arshiya Begum, Ruksar Fatima, "Predicting Rating from Textual Reviews," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1512-1516, 2018.
Analysis on Human Behavior Traits Using Intelligence Palmistry
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1517-1520, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.15171520
Abstract
Palm prints can be useful for understanding human behavior. They mirror a person's inborn talents, potential and likings. If these are not recognized appropriately and in time, they may remain in the shadows throughout a person's life, leading to dislike and frustration from underperformance at work or dissatisfaction with one's occupation. Some children are born with above-average intelligence in specific areas. The proposed application extracts fingerprint patterns and correlates them with behavior and personality type, working on the basis of intelligence palmistry. Images of the human palm form the input to the system. The system then applies digital image processing and analysis techniques to the input images to identify certain features. Using a knowledge base of intelligence palmistry, it analyzes these features and predicts behavior traits.
Key-Words / Index Term
Intelligence Palmistry, Knowledge Base, Human Palm, Behavioral traits
References
[1] R. C. Gonzalez and R. E. Woods “Digital Image Processing”, 2nd edition, Pearson Education, 2004
[2] D. M. Shah “Decision Support system for Image Analysis” in journal of Advanced Research in Computer Engineering, 1(1-2) January-December 2007, pp 51-56.
[3] Dr. K. Lakshmi Kumari , Dr. P. V. S. S.Vijaya Babu and Dr. S. V. Kumar el at “Dermatoglyphics and Its Relation to Intelligence Levels of Young Students” in IOSR Journal of Dental and Medical Sciences (IOSR-JDMS)e-ISSN: 2279-0853, p-ISSN: 2279-861. Volume 13, Issue 5 Ver. II. (May. 2014), PP 01-03
[4] Dr. C. Venkatesh, Mr. G. Thirunavukkarasu, Ms. V. Kalaimagal,el at” Investigation of Human Behavior using Biometrics” IJSTE - International Journal of Science Technology & Engineering | Volume 1 | Issue 9 | March 2015
[5] Rohit R Prabhu, C.N. Ravikumar “A Novel Extended Biometric Approach for Human Character Recognition using Fingerprints” International Journal of Computer Applications (0975 – 8887) Volume 77– No.1, September 2013
[6] Hardik Pandit and Dipti Shah, “Decision Support System for Medical Palmistry” - in “Advances in Applied Research”, vol.2, July-December 2010, pp 173178.
[7] Maduguri Sudhir, E.V.Narayana, “Digital Image Processing in Medical Palmistry”, International Journal of Advanced Engineering and Global Technology I Vol-04, Issue-03, May 2016
Citation
N.H. Desai, D. B. Shah, "Analysis on Human Behavior Traits Using Intelligence Palmistry," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1517-1520, 2018.
Energy Efficient Host Overloading Detection Algorithm in Cloud Computing
Research Paper | Journal Paper
Vol.6 , Issue.7 , pp.1521-1525, Jul-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i7.15211525
Abstract
Cloud computing is one of the most popular technologies of the present generation. Energy efficiency is an important concern, because large data centers consume a lot of energy to run and to serve their customers. Energy-efficient algorithms and techniques are required to reduce carbon emissions. In this paper, we consolidate Virtual Machines (VMs) by detecting over-utilized hosts using pattern matching, and we reduce the number of migrations with a new approach based on Mode Absolute Deviation. The method analyzes historical CPU usage data to find the CPU usage pattern and derives dynamic threshold values for virtual machine migration. The work was carried out in CloudSim. The results are better than those of previous work [1]: the proposed method saves energy and reduces the number of migrations.
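A dynamic overload threshold derived from historical CPU usage can be sketched as below. The abstract does not define Mode Absolute Deviation, so this sketch assumes it is the median of the absolute deviations from the mode of the (binned) utilization history, and that the upper threshold takes the form 1 - s * deviation, a form commonly used with median absolute deviation. The bin width, the safety parameter `s` and all function names are hypothetical.

```python
from collections import Counter
from statistics import median

def mode_of_history(cpu_history, bin_width=0.05):
    """Most frequent utilization level after rounding into bins of bin_width."""
    bins = Counter(round(u / bin_width) for u in cpu_history)
    most_common_bin, _ = bins.most_common(1)[0]
    return most_common_bin * bin_width

def mode_absolute_deviation(cpu_history, bin_width=0.05):
    """Assumed definition: median of |u - mode| over the history."""
    m = mode_of_history(cpu_history, bin_width)
    return median([abs(u - m) for u in cpu_history])

def upper_threshold(cpu_history, s=2.5, bin_width=0.05):
    """Dynamic threshold: stable hosts (small deviation) may run closer to 100%."""
    return 1.0 - s * mode_absolute_deviation(cpu_history, bin_width)

def is_overloaded(cpu_history, current_utilization, s=2.5):
    """Flag the host for VM migration when utilization exceeds the threshold."""
    return current_utilization > upper_threshold(cpu_history, s)

# Hypothetical utilization history (fractions of total CPU capacity).
history = [0.50, 0.52, 0.48, 0.51, 0.90, 0.49, 0.50]
threshold = upper_threshold(history)
```

Because the deviation is measured around the most common load level, a single spike such as 0.90 barely moves the threshold. That is one way such a scheme could avoid triggering unnecessary migrations.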
Key-Words / Index Term
Energy Efficient, host overloading, VM Consolidation, VM Migration, Mode, Cloud Computing
References
[1] O. Sharma and H. Saini, “VM Consolidation for Cloud Data Center Using Median Based Threshold Approach,” Procedia Comput. Sci., vol. 89, pp. 27–33, 2016.
[2] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the art of virtualization,” ACM SIGOPS Oper. Syst. Rev., vol. 37, no. 5, p. 164, 2003.
[3] T. Guerout, T. Monteil, G. Da Costa, R. Neves Calheiros, R. Buyya, and M. Alexandru, “Energy-aware simulation with DVFS,” Simul. Model. Pract. Theory, vol. 39, pp. 76–91, 2013.
[4] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield, “Live migration of virtual machines,” NSDI’05 Proc. 2nd Conf. Symp. Networked Syst. Des. Implement., no. Vmm, pp. 273–286, 2005.
[5] W. Voorsluys, J. Broberg, S. Venugopal, and R. Buyya, “Cost of virtual machine live migration in clouds: A performance evaluation,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 5931 LNCS, pp. 254–265, 2009.
[6] E. Pinheiro and R. Bianchini, “Load balancing and unbalancing for power and performance in cluster-based systems,” … Syst. Low Power, pp. 1–8, 2001.
[7] J. S. Chase, D. C. Anderson, P. N. Thakar, A. M. Vahdat, and R. P. Doyle, “Managing energy and server resources in hosting centers,” ACM SIGOPS Oper. Syst. Rev., vol. 35, no. 5, p. 103, 2001.
[8] D. Kusic, J. O. Kephart, J. E. Hanson, N. Kandasamy, and G. Jiang, “Power and Performance Management of Virtualized Computing Environments Via Lookahead Control,” 2008 Int. Conf. Auton. Comput., pp. 3–12, 2008.
[9] J. Luo, X. Li, and M. Chen, “Hybrid shuffled frog leaping algorithm for energy-efficient dynamic consolidation of virtual machines in cloud data centers,” Expert Syst. Appl., vol. 41, no. 13, pp. 5804–5816, 2014.
[10] M. Forsman, A. Glad, L. Lundberg, and D. Ilie, “Algorithms for automated live migration of virtual machines,” J. Syst. Softw., vol. 101, pp. 110–126, 2015.
[11] W. Song, Z. Xiao, Q. Chen, and H. Luo, “Adaptive resource provisioning for the cloud using online bin packing - Wagner,” Comput. IEEE Trans., vol. X, no. X, pp. 1–14, 2013.
[12] G. Han, W. Que, G. Jia, and L. Shu, “An efficient virtual machine consolidation scheme for multimedia cloud computing,” Sensors (Switzerland), vol. 16, no. 2, pp. 1–17, 2016.
[13] R. Buyya, A. Beloglazov, and J. Abawajy, “Energy-Efficient Management of Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and Open Challenges,” Cloud Computing and Distributed Systems (CLOUDS) Laboratory, Department of Computer Science and Software Engineering, The Univ. Melbourne, Aust., no. Vm, pp. 1–12, 2010.
[14] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource allocation heuristics for efficient management of data centers for Cloud computing,” Futur. Gener. Comput. Syst., 2012.
[15] A. Beloglazov and R. Buyya, “Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in Cloud data centers,” Concurr. Comput. Pract. Exp., vol. 24, no. 13, pp. 1397–1420, 2012.
[16] A. Khajeh-Hosseini, D. Greenwood, J. Smith, and I. Sommerville, “The Cloud Adoption Toolkit: supporting cloud adoption decisions in the enterprise,” Softw. - Pract. Exp., vol. 43, no. 4, pp. 447–465, 2012.
[17] K. Park and V. S. Pai, “CoMon: a mostly-scalable monitoring system for PlanetLab,” ACM SIGOPS Oper. Syst. Rev., vol. 40, no. 1, pp. 65–74, 2006.
Citation
N. Kumar, R. Kumar, "Energy Efficient Host Overloading Detection Algorithm in Cloud Computing," International Journal of Computer Sciences and Engineering, Vol.6, Issue.7, pp.1521-1525, 2018.