A k-nearest neighbor multilabel ranking algorithm with application to content-based image retrieval
Abstract
Multilabel ranking is an important machine learning task with many applications, such as content-based image retrieval (CBIR). However, when the number of labels is large, traditional algorithms are either infeasible or perform poorly. In this paper, we propose a simple yet effective multilabel ranking algorithm based on the k-nearest neighbor paradigm. The proposed algorithm ranks labels according to their probabilities of association with a query sample, estimated from the neighboring samples around it. Unlike traditional approaches, we take only positive samples into consideration and determine the model parameters by directly optimizing ranking loss measures. We evaluated the proposed algorithm on four popular multilabel datasets, where it performs as well as or better than other instance-based learning algorithms. When applied to a CBIR system with a dataset of 1 million samples and over 190 thousand labels, which is much larger than any multilabel dataset used earlier, the proposed algorithm clearly outperforms the competing algorithms.
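The general idea of ranking labels from the positive labels of nearby samples can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the scoring formula, so the function name `rank_labels`, the Euclidean distance, and the fixed rank-based weights `1 / (rank + 1)` are all assumptions made for illustration. In the paper the model parameters are learned by optimizing ranking loss measures, which this sketch does not attempt.

```python
import math
from collections import defaultdict


def rank_labels(query, samples, k=3):
    """Rank labels for `query` from its k nearest neighbors (illustrative only).

    `samples` is a list of (feature_vector, set_of_positive_labels) pairs.
    Only labels that a neighbor actually carries receive a score; labels
    absent from all neighbors are never enumerated.
    """
    # Keep the k samples closest to the query (Euclidean distance, assumed).
    neighbors = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]

    scores = defaultdict(float)
    weight_sum = 0.0
    for rank, (_, labels) in enumerate(neighbors):
        # Hypothetical fixed weights: closer neighbors count more.
        weight = 1.0 / (rank + 1)
        weight_sum += weight
        for label in labels:
            scores[label] += weight

    if weight_sum == 0.0:
        return []

    # Each score is the weighted fraction of neighbors carrying the label,
    # so it lies in [0, 1]. Sort by descending score, then by label name.
    ranked = [(label, s / weight_sum) for label, s in scores.items()]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Toy data, invented for this example.
toy_samples = [
    ([0.0, 0.0], {"cat", "animal"}),
    ([0.0, 1.0], {"cat"}),
    ([5.0, 5.0], {"car"}),
    ([6.0, 5.0], {"car", "vehicle"}),
]
toy_ranking = rank_labels([0.0, 0.2], toy_samples, k=3)
for label, score in toy_ranking:
    print(f"{label}: {score:.3f}")
```

Because only the labels attached to the k neighbors are scored, the cost per query depends on the neighbors' label sets rather than on the total number of labels, which is one reason a positive-only scheme of this kind can remain tractable when the label space is very large.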