TOUR-STYLE PREFERENCE ANALYSIS BASED ON MACHINE LEARNING TECHNIQUES: A Case Study on Nanjing Residents
ZHANG Chen1, ZHANG Shu-fu2, TAO Zhuo-min2
1. Department of Land Resources and Tourism Sciences, Nanjing University, Nanjing 210093, China;
2. School of Geographic Science, Nanjing Normal University, Nanjing 210046, China
Abstract People's preferences for tour styles are of practical significance for the development of tourism. In this paper, tour-style preferences are investigated systematically by drawing on advances from both the psychology and the machine learning research communities. Two tour styles, the team tour and the self-help tour, are considered in order to characterize potential tourists. A questionnaire is designed around demographic characteristics and personal values, and the data are extracted from the responses, with each variable corresponding to a specific answer in the questionnaire. A machine learning algorithm named C4.5-rule PANE is then employed for data analysis. This algorithm works in a twice-learning style. In the first learning stage, it learns a neural network ensemble from the training data, then generates virtual examples and classifies them with the learned ensemble. In the second learning stage, the virtual examples, labeled by the neural network ensemble, are used to enlarge the original training set, and C4.5 decision rules are learned from the augmented data set. The learned model is expressed as decision rules (e.g., "if a and b then c", where a and b are Boolean expressions over certain variables and c is the concept class to be predicted), which can be easily understood by the data analyst. The twice-learning procedure thus produces a model that combines strong predictive ability for potential tourists with good comprehensibility. These advantages enable accurate modeling of the nonlinear mapping from the variables characterizing potential tourists to the target concept (i.e., the tour-style preference), as well as explicit analysis of that mapping through the comprehensible model. This empirical approach is applied to data extracted from questionnaires completed by 305 Nanjing residents, and interesting results are obtained.
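The twice-learning procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation of C4.5-rule PANE. Bagged single-neuron perceptrons stand in for the neural network ensemble, and a depth-limited tree using plain information gain stands in for C4.5 (which uses gain ratio and pruning). The toy data, feature names, and all function names are hypothetical.

```python
import math
import random

def train_perceptron(data, rng, epochs=20, lr=0.1):
    """Train one single-neuron network (a simplified stand-in for a full neural net)."""
    n = len(data[0][0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]  # last entry is the bias
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0
            for i in range(n):
                w[i] += lr * (y - pred) * x[i]
            w[-1] += lr * (y - pred)
    return w

def ensemble_predict(ensemble, x):
    """Majority vote over the ensemble members."""
    votes = sum(1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0
                for w in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0

def first_stage(train, rng, n_members=5, n_virtual=50):
    """Stage 1: learn an ensemble, then generate and label virtual examples."""
    ensemble = []
    for _ in range(n_members):
        boot = [rng.choice(train) for _ in train]  # bootstrap sample
        ensemble.append(train_perceptron(boot, rng))
    n = len(train[0][0])
    lo = [min(x[i] for x, _ in train) for i in range(n)]
    hi = [max(x[i] for x, _ in train) for i in range(n)]
    virtual = []
    for _ in range(n_virtual):
        x = tuple(rng.uniform(lo[i], hi[i]) for i in range(n))
        virtual.append((x, ensemble_predict(ensemble, x)))
    return ensemble, virtual

def entropy(rows):
    p = sum(y for _, y in rows) / len(rows)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def best_split(rows):
    """Return (gain, feature, threshold) with the highest information gain, or None."""
    best, base = None, entropy(rows)
    for i in range(len(rows[0][0])):
        for t in sorted({x[i] for x, _ in rows}):
            left = [r for r in rows if r[0][i] <= t]
            right = [r for r in rows if r[0][i] > t]
            if not left or not right:
                continue
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(rows)
            if best is None or gain > best[0]:
                best = (gain, i, t)
    return best

def learn_rules(rows, conds=(), depth=2):
    """Stage 2: return rules as (conditions, label); a condition is (feature, op, threshold)."""
    majority = 1 if 2 * sum(y for _, y in rows) >= len(rows) else 0
    split = best_split(rows) if depth > 0 else None
    if split is None or split[0] <= 0:
        return [(conds, majority)]
    _, i, t = split
    left = [r for r in rows if r[0][i] <= t]
    right = [r for r in rows if r[0][i] > t]
    return (learn_rules(left, conds + ((i, "<=", t),), depth - 1)
            + learn_rules(right, conds + ((i, ">", t),), depth - 1))

def rule_predict(rules, x):
    """Apply the first rule whose conditions all hold (the rules partition the space)."""
    for conds, label in rules:
        if all((x[i] <= t) if op == "<=" else (x[i] > t) for i, op, t in conds):
            return label
    raise ValueError("no rule fired")

def twice_learning(train, seed=0, n_virtual=50):
    rng = random.Random(seed)
    _, virtual = first_stage(train, rng, n_virtual=n_virtual)
    augmented = list(train) + virtual  # original data enlarged by virtual examples
    return learn_rules(augmented), augmented

# Hypothetical toy data: (scaled age, independence score) -> 1 = self-help tour, 0 = team tour
TRAIN = [((0.2, 0.9), 1), ((0.3, 0.8), 1), ((0.4, 0.7), 1),
         ((0.7, 0.2), 0), ((0.8, 0.3), 0), ((0.9, 0.1), 0)]
rules, augmented = twice_learning(TRAIN)
for conds, label in rules:
    body = " and ".join(f"x[{i}] {op} {t:.2f}" for i, op, t in conds) or "true"
    print(f"if {body} then {'self-help' if label else 'team'}")
```

Each printed line has the "if a and b then c" form mentioned in the abstract. The rules are learned from the augmented set rather than from the original data alone, which is the point of the second stage.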
Received: 01 July 2008