Please use this identifier to cite or link to this item:
|Data Mining Approach to Discover Training Needs
|Research Subject Categories::TECHNOLOGY
|The aim of this thesis is to apply data mining techniques to human resource data to discover the training needs of employees in an organization. Mining a human resource database can reveal associations among data items and hidden patterns. Some of the discovered association rules can be used to predict the training needs that are most suitable for employees, rather than what the employees asked for. In this thesis, several well-known classification and class association rule mining algorithms are evaluated to find the best classification algorithm and the best class association rule mining algorithm for the given employee dataset. The selected classification algorithm can be used to predict a single choice directly; in contrast, the selected class association rule mining algorithm can be used to predict several choices of training needs and rank them. A new technique for selecting the best rules for a given dataset and ranking the selected rules is also proposed. C4.5 is selected as the classification algorithm. The Class Association Rule (CAR) mining algorithm, a specialization of the well-known Apriori algorithm, is selected as the association rule mining algorithm. Employing both techniques, classification and association rule mining, leads to better predictions of training needs. WEKA (Waikato Environment for Knowledge Analysis) is an open source suite of machine learning algorithms, which is used as the data mining component of another open source tool, the ETL tool KETTLE. WEKA is used to evaluate several algorithms and to select two algorithms for implementation. Cross-validation is used to evaluate the algorithms using only the training data. The selected algorithms are implemented as a custom application using the WEKA API. The success and error rates of the predictions are evaluated using census data of government school teachers.
This new technique should be tested thoroughly with all attributes of the dataset discussed. Because the system is flexible enough to be used with any subject domain, it can also make predictions in domains other than human resources. The success rate in other domains should be analyzed, and extensions to the system should be proposed.
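As an illustration of how class association rules can yield several ranked training-need predictions, the following is a minimal Python sketch. It is not the thesis implementation, which uses the WEKA API. The toy records, the item and label names, the thresholds, and the function names `mine_cars` and `rank_training_needs` are all hypothetical. Candidate rules are enumerated by brute force rather than with Apriori's level-wise pruning, and the ranking by confidence and then support is an assumption, since the abstract does not describe the proposed ranking technique.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: (attribute items, class label = training need).
RECORDS = [
    ({"role=teacher", "subject=maths", "exp=low"}, "pedagogy"),
    ({"role=teacher", "subject=maths", "exp=high"}, "ict"),
    ({"role=teacher", "subject=science", "exp=low"}, "pedagogy"),
    ({"role=principal", "subject=none", "exp=high"}, "leadership"),
    ({"role=teacher", "subject=science", "exp=high"}, "ict"),
    ({"role=teacher", "subject=maths", "exp=low"}, "pedagogy"),
]


def mine_cars(records, min_support=0.2, min_confidence=0.5, max_len=2):
    """Return rules as (antecedent, label, support, confidence) tuples.

    Brute-force enumeration of antecedents up to max_len items; a real
    Apriori-based miner would prune infrequent candidates level by level.
    """
    n = len(records)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for items, label in records:
        for size in range(1, max_len + 1):
            for combo in combinations(sorted(items), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n                          # P(antecedent and label)
        confidence = count / antecedent_counts[combo]  # P(label | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(combo), label, support, confidence))
    return rules


def rank_training_needs(rules, employee_items):
    """Rank labels by their best matching rule (confidence, then support).

    This ranking criterion is an assumption made for illustration.
    """
    best = {}
    for antecedent, label, support, confidence in rules:
        if antecedent <= employee_items:  # every rule item holds for the employee
            score = (confidence, support)
            if score > best.get(label, (0.0, 0.0)):
                best[label] = score
    return sorted(best, key=lambda label: best[label], reverse=True)


if __name__ == "__main__":
    rules = mine_cars(RECORDS)
    employee = {"role=teacher", "subject=maths", "exp=high"}
    print(rank_training_needs(rules, employee))
```

In contrast to a classifier, which returns a single label, this rule-based ranking can return zero, one, or several candidate training needs for an employee, ordered by the strength of the matching rules.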
|Appears in Collections:
|Master of Computer Science - 2013
Items in UCSC Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.