Training Data Selection in Large Margin Nearest Neighbor Method for Classification Problems
By: Yamazaki, Fumihiro; Sakamoto, Shunsuke; Mikawa, Kenta; Goto, Masayuki
Type: Article from Proceedings
In collection: The 14th Asia Pacific Industrial Engineering and Management Systems Conference (APIEMS), 3-6 December 2013, Cebu, Philippines, pages 1-6.
Topics: Distance Metric Learning; Mahalanobis Distance; Semidefinite Programming; Large Margin Nearest Neighbor
Fulltext: 1037.pdf (446.55 KB)
Abstract
With the development of information technology, many studies have pointed out the importance of knowledge discovery from very large data sets. This paper focuses on the classification of multi-dimensional data in the vector space model. Classification means predicting the category of a new input by using a classifier learned from a training data set with true category labels. In most classification problems, especially those using the vector space model, selecting an appropriate distance measure is essential to many learning algorithms, such as k-Nearest Neighbor (kNN). Several algorithms that learn an appropriate distance measure have therefore been proposed; this approach is called metric learning.
In this paper, we focus on Large Margin Nearest Neighbor (LMNN), a well-known metric learning approach with good classification accuracy. LMNN learns a Mahalanobis distance adapted to kNN classification from a labeled training data set. The metric is optimized with an objective function so that the k nearest neighbors of each data point (the set of k data points selected during the kNN process) always belong to the same category, while data from different categories are separated by a large margin. Compared with other metric learning approaches, LMNN has the desirable property that it gives relatively more weight to boundary data between categories. However, because LMNN uses all of the given training data to learn the metric, the learned metric may still not take sufficient account of the data around the boundaries between categories. This paper therefore proposes a new method, based on training data selection, that can improve the classification accuracy of LMNN. Specifically, we propose an algorithm that effectively selects boundary data from the given training data set for LMNN. If appropriate training data can be selected in advance from the given data set, it becomes possible to learn a metric that improves classification accuracy.
Finally, we verify the effectiveness of the proposed method through a simulation experiment with UCI data sets.
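As a rough illustration of the ideas in the abstract above, the following Python sketch shows a squared Mahalanobis distance, kNN prediction under such a metric, and one possible way to pick out boundary data. It is not the authors' algorithm: the abstract does not state the selection rule, so `select_boundary` uses a hypothetical heuristic (keep a point if any of its k nearest neighbors has a different label), and the function names, the toy data, and the use of NumPy are all assumptions made for illustration. The metric matrix `M` is simply passed in; actually learning it, as LMNN does, is outside the scope of this sketch.

```python
import numpy as np


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    d = x - y
    return float(d @ M @ d)


def _sq_dists_to(X, x, M):
    """Squared Mahalanobis distances from every row of X to the point x."""
    diffs = X - x
    return np.einsum("ij,jk,ik->i", diffs, M, diffs)


def knn_predict(X_train, y_train, x, M, k=3):
    """Majority vote among the k nearest training points under the metric M."""
    d2 = _sq_dists_to(X_train, x, M)
    nearest = np.argsort(d2, kind="stable")[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]


def select_boundary(X, y, M, k=3):
    """Hypothetical selection rule (an assumption, not taken from the paper):
    keep the indices of points that have at least one differently labeled
    point among their k nearest neighbors."""
    keep = []
    for i in range(len(X)):
        d2 = _sq_dists_to(X, X[i], M)
        d2[i] = np.inf  # exclude the point itself
        nearest = np.argsort(d2, kind="stable")[:k]
        if np.any(y[nearest] != y[i]):
            keep.append(i)
    return np.array(keep, dtype=int)


# Toy data (made up for illustration): two classes laid out along one axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.4, 0.0],
              [4.0, 0.0], [5.0, 0.0], [6.0, 0.0], [7.0, 0.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

M = np.eye(2)  # identity metric, i.e. plain squared Euclidean distance
boundary_idx = select_boundary(X, y, M, k=1)
print(boundary_idx, knn_predict(X, y, np.array([2.9, 0.0]), M, k=3))
```

On this toy data, with the identity metric and k = 1, the heuristic keeps only the two points on either side of the class boundary. In a pipeline like the one the abstract describes, a subset chosen in some such way would then be passed to the metric learner in place of the full training set.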