The Customized-Queries Approach to CBIR using EM


This paper makes two contributions. The first is the ``customized-queries'' approach (CQA) to content-based image retrieval. The second is an algorithm called FSSEM that performs feature selection and clustering simultaneously. The customized-queries approach first classifies a query using the features that best differentiate the major classes, and then customizes the query to the chosen class by using the features that best distinguish the images within that class. The approach is motivated by the observation that the features most effective at discriminating among images from different classes may not be the most effective for retrieving visually similar images within a class. This occurs in domains where not all pairs of images within one class are equally similar visually, i.e., where subclasses exist. Because we are not given subclass labels, we must find the subclasses and, at the same time, the features that best discriminate among them. We use FSSEM to find these features. We apply the approach to content-based retrieval of high-resolution tomographic images of patients with lung disease and show that it radically improves retrieval precision over the traditional approach, which performs retrieval using a single feature vector.
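The two-stage retrieval described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `customized_query`, the 1-nearest-neighbour classifier used in the first stage, the Euclidean distance, and the toy data are all assumptions made for illustration. The per-class feature subsets, which FSSEM would supply, are simply given by hand here.

```python
import math

def _dist(a, b, feats):
    """Euclidean distance restricted to the feature indices in `feats`."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in feats))

def customized_query(query, database, class_feats, within_feats, k=3):
    """Hypothetical two-stage retrieval sketch.

    database:     list of (feature_vector, class_label) pairs.
    class_feats:  feature indices assumed to separate the major classes.
    within_feats: dict mapping class_label to the feature indices assumed
                  to separate the subclasses inside that class.
    """
    # Stage 1: classify the query (1-NN is a stand-in classifier, an assumption).
    _, label = min(database, key=lambda item: _dist(query, item[0], class_feats))
    # Stage 2: rank only images of the predicted class, using that class's features.
    feats = within_feats[label]
    candidates = [vec for vec, lab in database if lab == label]
    candidates.sort(key=lambda vec: _dist(query, vec, feats))
    return label, candidates[:k]

# Toy data (made up): feature 0 separates classes A and B,
# feature 1 matters within A, feature 2 matters within B.
database = [
    ([0.0, 1.0, 5.0], "A"),
    ([0.1, 9.0, 5.1], "A"),
    ([0.2, 1.2, 9.0], "A"),
    ([5.0, 1.0, 5.0], "B"),
    ([5.1, 2.0, 1.0], "B"),
]
class_feats = [0]
within_feats = {"A": [1], "B": [2]}

label, top = customized_query([0.02, 1.05, 7.0], database,
                              class_feats, within_feats, k=2)
print(label, top)
```

In this toy setup the query is assigned to class A using feature 0 alone, and the ranking within A then depends only on feature 1, so feature 2 (which would matter for class B) does not affect the result.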