Abstract
K-Nearest-Neighbor (KNN), an important classification method based on the closest training examples, has been widely used in data mining due to its simplicity, effectiveness, and robustness. However, the class probability estimation, the neighborhood size, and the type of distance function used by KNN may affect its classification accuracy. Many researchers have focused on improving the accuracy of KNN via distance-weighted, attribute-weighted, and dynamically selected methods, among others. In this paper, we first review some improved KNN algorithms in the three categories mentioned above. Then, we single out an improved algorithm called dynamic k-nearest-neighbor with distance and attribute weighted, or simply DKNDAW. In DKNDAW, we combine the dynamically selected, distance-weighted, and attribute-weighted methods. We experimentally tested our new algorithm in the Weka system, using all 36 standard UCI data sets downloaded from the main Weka website. In our experiments, we compared it to KNN, WAKNN, KNNDW, KNNDAW, and DKNN. The experimental results show that DKNDAW significantly outperforms all five of these algorithms in terms of classification accuracy.
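The abstract does not give the formulas behind DKNDAW, so the following is only a minimal sketch of how the three ideas it names (attribute weighting, distance weighting, and dynamic selection of the neighborhood size) could fit together. Everything concrete here is an assumption rather than the authors' method: the function names, the caller-supplied `attr_weights`, the inverse-square vote weighting, and the use of leave-one-out accuracy to choose `k` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def weighted_distance(a, b, attr_weights):
    """Euclidean distance with each squared difference scaled by an attribute weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, attr_weights)))


def predict(train, query, k, attr_weights):
    """Distance-weighted vote among the k nearest training examples.

    `train` is a list of (features, label) pairs.
    """
    neighbors = sorted(
        ((weighted_distance(x, query, attr_weights), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        # Assumed weighting: closer neighbors get larger votes (inverse square).
        votes[label] += 1.0 / (dist ** 2 + 1e-9)
    return max(votes, key=votes.get)


def select_k(train, candidate_ks, attr_weights):
    """Pick the candidate k with the best leave-one-out accuracy on `train`.

    This is one assumed way to select k dynamically; ties keep the earlier candidate.
    """
    best_k, best_correct = candidate_ks[0], -1
    for k in candidate_ks:
        correct = 0
        for i, (x, label) in enumerate(train):
            rest = train[:i] + train[i + 1:]
            if predict(rest, x, k, attr_weights) == label:
                correct += 1
        if correct > best_correct:
            best_k, best_correct = k, correct
    return best_k
```

A caller would first run `select_k(train, [1, 3, 5], attr_weights)` on the training data and then pass the chosen `k` to `predict` for each new instance. Setting an entry of `attr_weights` to zero removes that attribute from the distance entirely, which is the simplest way to see what attribute weighting does.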
Original language | English |
---|---|
Title of host publication | 2010 International Conference on Electronics and Information Engineering |
Subtitle of host publication | ICEIE 2010, Proceedings |
Place of Publication | Chengdu, China |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | V1-356-V1-360 |
Number of pages | 5 |
Volume | 1 |
ISBN (Electronic) | 9781424476817 |
ISBN (Print) | 9781424476800 |
Publication status | Published - 1 Aug 2010 |
Externally published | Yes |
Event | 2010 International Conference on Electronics and Information Engineering, ICEIE 2010 - Kyoto, Japan. Duration: 1 Aug 2010 → 3 Aug 2010 |
Conference
Conference | 2010 International Conference on Electronics and Information Engineering, ICEIE 2010 |
---|---|
Country/Territory | Japan |
City | Kyoto |
Period | 1/08/10 → 3/08/10 |
Keywords
- data mining
- learning (artificial intelligence)
- pattern classification
- pattern clustering
- statistical analysis
- k-nearest neighbor method
- closest training
- probability estimation
- distance weighted classification
- attribute weighted classification
- mixed dynamic method
- Weka system
- neighborhood size
- Classification algorithms
- Heuristic algorithms
- Training
- Accuracy
- Training data
- Nearest neighbor searches
- Mathematical model
- dynamic
- k-nearest-neighbor
- distance weighted
- attribute weighted
- classification accuracy