Movie Recommendation Model Using Stochastic Gradient Descent For Collaborative Filtering In Social Media Mining
Nowadays, many people want to watch TV shows or series anytime and anywhere. In recent years, online TV has experienced exponential growth, and Netflix is one of the companies that entered the online streaming market. Many existing movie recommendation approaches learn a user ranking model from user feedback with respect to the movie’s content. Unfortunately, this approach suffers from the sparsity problem inherent in SMR data. Collaborative filtering (CF) is the workhorse of recommender engines since it can perform feature learning on its own, meaning it learns for itself what features to use. CF can be split into memory-based collaborative filtering and model-based collaborative filtering. Here we compare results from memory-based CF, model-based CF, and a third approach that uses stochastic gradient descent for collaborative filtering. We propose a movie recommender system based on the stochastic gradient descent algorithm. The proposed system uses the MovieLens dataset, one of the most common datasets used to implement and test recommender engines. It contains 100,000 movie ratings from 943 users on a selection of 1,682 movies. We evaluate the results using the Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
Movie Recommendation System, Memory-Based Collaborative Filtering, Model-Based Collaborative Filtering, Stochastic Gradient Descent
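The stochastic gradient descent approach named in the abstract above can be illustrated with a minimal sketch. It assumes the common matrix-factorization formulation (one latent vector per user and per item, updated one rating at a time); the abstract does not confirm that the authors use exactly this variant. The toy ratings, the function names, and the hyperparameters (k, lr, reg, epochs) are all hypothetical and are not taken from the paper or from MovieLens.

```python
import math
import random

# Hypothetical toy data: (user_index, item_index, rating on a 1-5 scale).
TRAIN = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 4),
]
TEST = [(1, 1, 3), (2, 2, 5)]


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def train_sgd(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Fit latent factors by SGD on observed ratings (assumed MF variant)."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit ratings in random order each epoch
        for u, i, r in samples:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(P, Q, ratings):
    """Root Mean Squared Error over (user, item, rating) triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


def mae(P, Q, ratings):
    """Mean Absolute Error over (user, item, rating) triples."""
    return sum(abs(r - predict(P, Q, u, i)) for u, i, r in ratings) / len(ratings)


P, Q = train_sgd(TRAIN, n_users=4, n_items=4)
print("train RMSE:", rmse(P, Q, TRAIN), "train MAE:", mae(P, Q, TRAIN))
print("test  RMSE:", rmse(P, Q, TEST), "test  MAE:", mae(P, Q, TEST))
```

Because only observed ratings are visited, the cost of one epoch grows with the number of ratings rather than with the size of the full user-item matrix, which is why this style of update is a natural fit for sparse rating data.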
Social Media Mining: Retrieving, Preprocessing, Storing and Analyzing Bone Cancer Related Tweets Using R
Social media provides an easily accessible platform for users to share information. Mining social media has the potential to extract actionable patterns that can benefit businesses, users, and consumers. Social media data are vast, noisy, unstructured, and dynamic in nature, and thus novel challenges arise. This paper deals with social media mining: we retrieved tweets, preprocessed them, and stored them in a CSV file in order to compare them with a cancer-related ontology created using Protégé. The preprocessed cancer-related tweets were also analyzed using R.
Social media, Mining, Preprocessing, CSV file, Ontology, Tweets, R
High Utility Text and Data Mining Methods
With the continuous growth in the volume of text data, automated extraction of implicit, previously unknown, and potentially useful information becomes more necessary to properly utilize this vast source of knowledge. Text mining corresponds to the extension of the data mining approach to textual data and is concerned with various tasks, such as extraction of information implicitly contained in collections of documents, or similarity-based structuring. This paper provides the reader with a very brief introduction to some of the theory and methods of text data mining. The intent is to introduce some of the current text mining methods employed within this discipline.
Text Mining, Text Mining Methods, Text Processing, Document clustering
Data Mining Approaches to Predict the Factors that Affect the Agriculture Growth using Stochastic Model
In recent times, there has been an increasing demand for efficient data mining strategies for agricultural prediction. Data mining is a tool for making effective predictions based on stochastic modeling concepts. This paper proposes an efficient approach for identifying the factors that affect agricultural growth, using data such as rainfall, groundwater, and temperature, by adopting stochastic modeling and data mining approaches. A novel model is proposed to predict the factors affecting agricultural growth using a stochastic model, and numerical illustrations and the resulting expected estimates are used to assess the strength of the proposed approach.
Data Mining, Agricultural production, Rainfall, Groundwater, Temperature, Stochastic model
Earthquake Prediction using SVM based Time Predictable Technique
As with so many natural phenomena, earthquakes are the product of what scientists call "complex systems," or systems that are more than the sum of their parts. In the truest sense, and not just proverbially, precise prediction of earthquakes has long been a matter of life and death for the inhabitants of earthquake-prone areas, and it has occupied forecasters and scientists, from Nostradamus to Dr. Vladimir Keilis-Borok, for the last few centuries. Though experts still do not know many of the details of the physical processes involved or how to predict these events, several prediction and chaos theories have been put forth with varying degrees of success. Despite the inherent complexity of such a system, research continues. The time-predictable model of earthquake prediction is based on the theory that earthquakes in fault zones are caused by the constant build-up and release of strain in the Earth's crust. This model has become a standard tool for hazard prediction in many earthquake-prone regions; it is therefore not surprising that scientists in the United States and other Pacific Rim countries, such as Japan and New Zealand, routinely use this technique for long-range hazard assessments when adequate data are available.
Earthquakes; time-predictable model; forecasters
Versatile Distributed Computing Taxonomy
According to the NIST definition of cloud computing, it has five characteristics: on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service, while mobile computing focuses on device mobility and context awareness, considering networking and mobile resource/data access. Mobile cloud computing is usually regarded as building on cloud computing and mobile computing; however, it has some unique features, such as service offloading, migration, and composition. Mobile cloud computing advances mobile computing technologies and uses unified elastic resources of varied clouds and network technologies. This work gives an overview of various important concepts that are closely related to mobile cloud computing and illustrates their relations through real-world examples.
Data collection, Measurement sensor, Radiocommunication, Distributed system, Network protocol, Energy consumption, Taxonomy
Architecture for Automated Data Quality Checking in Big Data Migration Process
Data gathered from different sources often have serious quality issues. The volume of information in digital libraries is increasing, and most systems may be affected by replicas (duplicate records). Data cleaning is an important process that removes replicas through de-duplication. It consists of parsing, data transformation, duplicate elimination, and statistical methods. Removing repeated documents is one of its most challenging stages. Data cleaning deals with detecting and removing errors, filling in omitted values, and smoothing noisy data to improve data quality. De-duplication is a key function in integrating data from various sources. It is the process of identifying all records in a data set that refer to the same real-world entity. This paper introduces a methodology for automated data quality checking with a de-duplication algorithm.
Data Quality, Data Cleansing, De-Duplication
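The idea of identifying records that refer to the same real-world entity can be illustrated with a minimal sketch. It assumes the simplest possible strategy, exact matching after text normalization, which is not necessarily the de-duplication algorithm the paper proposes. The record fields, the sample records, and the function names are hypothetical.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


def deduplicate(records, key_fields):
    """Keep the first record per normalized key; return (unique, duplicates)."""
    seen = {}
    duplicates = []
    for rec in records:
        key = tuple(normalize(str(rec.get(f, ""))) for f in key_fields)
        if key in seen:
            duplicates.append(rec)  # same entity as an earlier record
        else:
            seen[key] = rec
    return list(seen.values()), duplicates


# Hypothetical digital-library records with formatting differences.
records = [
    {"title": "Data Cleaning: A Survey", "author": "A. Smith"},
    {"title": "data cleaning  a survey", "author": "a smith"},
    {"title": "Record Linkage", "author": "B. Jones"},
]

unique, duplicates = deduplicate(records, key_fields=("title", "author"))
print(len(unique), "unique,", len(duplicates), "duplicate")
```

Exact matching on a normalized key only catches replicas that differ in case, punctuation, or spacing; records with typos or abbreviations would need an approximate similarity measure instead.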
Parameter-Free Algorithm for Mining Rare Association Rules
This paper presents a parameter-free grammar-guided genetic programming algorithm for mining rare association rules. The algorithm uses a context-free grammar to represent individuals, encoding the solutions in a tree shape that conforms to the grammar, so they are more expressive and flexible. The algorithm introduced here has the advantages of using evolutionary algorithms for mining rare association rules, and it also solves the problem of tuning the large number of parameters these algorithms require. Its main feature is the small number of parameters required, which lets non-expert users discover rare association rules easily. We compare our approach to existing evolutionary and exhaustive search algorithms, obtaining important results and overcoming the drawbacks of both. The experimental stage shows that this approach discovers infrequent and reliable rules without parameter tuning.
Genetic Programming, Association Rules, Free Parameters, Data Mining
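What "infrequent and reliable" means for a rule can be made concrete with the standard support and confidence measures. The sketch below only computes these two measures on a hypothetical transaction set; it is not the grammar-guided genetic programming algorithm of the paper, and the transactions and function names are illustrative assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"caviar", "champagne"},
    {"bread", "milk"},
    {"milk"},
    {"bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

# A rare but reliable rule: low support, high confidence.
rare_sup = support({"caviar", "champagne"}, transactions)
rare_conf = confidence({"caviar"}, {"champagne"}, transactions)

# A frequent but less reliable rule, for contrast.
freq_sup = support({"bread", "milk"}, transactions)
freq_conf = confidence({"bread"}, {"milk"}, transactions)

print("caviar -> champagne:", rare_sup, rare_conf)
print("bread -> milk:", freq_sup, freq_conf)
```

A conventional frequent-pattern miner with a minimum support threshold would discard the first rule despite its high confidence, which is the gap that rare association rule mining targets.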
Prediction of Data Ware House Model using Dynamic Function Point Analysis
This paper presents an approach for estimating Data Warehouse (DWH)/Data Mart projects using Function Point Analysis, with a focus on ETL development in the enterprise. The motivation is that the burden of maintaining the composite, enterprise data model of a data warehouse is not directly recognized. Often a development team’s “hidden” efforts in delivering the support architecture of a data warehouse are compared unfavorably with more traditional, and highly visible, user functionality. Unlike traditional database systems, a data warehouse uses other software systems as data sources and does not create new information, so its content is generally more static in nature. Applying Function Point Analysis to data warehouse applications is therefore more tedious, as data warehousing has peculiarities of its own compared to traditional OLTP (Online Transactional Processing) applications. Data Warehouses/Data Marts are a special type of application with particular characteristics: users only use the software system for queries and report generation and not for data update, development is based on existing data from other systems without generating new information, and the development process differs from that of traditional OLTP software systems. It is necessary, therefore, to adapt (rather than exactly follow) the size measurement approach defined for traditional OLTP systems so that it considers the specific characteristics of Data Warehouses/Data Marts and generates more accurate estimations. The proposed approach helps in estimating Data Warehouse/Data Mart projects using Function Point Analysis, especially for ETL operations, in a more traditional and systematic way.
FPA - Function Point Analysis, OLAP - Online Analytical Processing, ETL - Extraction, Transformation, Loading, OLTP - Online Transactional Processing, DWH - Data Warehouse
Self Organized Wireless Sensor Network Model for Military Decentralized Applications
Developments in integrated circuit design technology are expected to make the mass production of sensor devices relatively inexpensive, and hence large sensor networks are likely to become common. A cluster-based scheme is proposed for organizing such networks. The proposed scheme extends the First Input High Energy (FIHE) clustering algorithm and enables multi-hop transmissions among the clusters by incorporating the selection of cooperative sending and receiving nodes. We propose a sensor network architecture based on the cluster-tree multi-hop model with optimized cluster head election, together with a corresponding node design method, to meet tactical requirements. Earlier systems provide such networks for transmitting information but have no security mechanisms to protect the transmitted information: attackers may enter the network without any authentication, attack the network, and access the data or services they want. With the proposed WSN architecture, one can easily design a sensor network for military use in remote, large-scale environments.
Military sensor networks, Architecture, Design, Self-organization, Cluster head election
