A Survey of kNN Algorithm

Authors

  • Jingwen Sun, Information Engineering College, Panzhihua University of Technology, Sichuan, China
  • Weixing Du
  • Niancai Shi

DOI:

https://doi.org/10.18063/ieac.v1i1.770

Keywords:

kNN algorithm, k nearest neighbor algorithm, Machine learning, Text classification

Abstract

The kNN algorithm is a well-known pattern recognition method and one of the best text classification algorithms. It is also one of the simplest classification algorithms in machine learning. In this paper, we summarize the kNN algorithm and the related literature; introduce its idea, principle, implementation steps, and implementation code in detail; and analyze its advantages and disadvantages along with various improvement schemes. The paper also reviews the development of the kNN algorithm and the important papers published on it. Finally, it introduces applications of the kNN algorithm, with an emphasis on its implementation in text classification.
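
For orientation, the basic kNN classification rule mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the implementation code from the paper: the function name `knn_predict`, the choice of Euclidean distance, the unweighted majority vote, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Euclidean distance from the query to every training point (assumed metric).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors; equal distances keep their input order.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Unweighted majority vote over the neighbor labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration: two well-separated clusters in the plane.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
```

Calling `knn_predict(points, labels, (1.1, 1.0), k=3)` assigns the query to the cluster it lies in. For text classification, the points would typically be document feature vectors, and a different distance or similarity measure may be more suitable.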

Published

2018-05-10

Issue

Section

Articles