This paper explores the problem of feature subset selection for unsupervised
learning within the wrapper framework. In particular, we examine feature
subset selection wrapped around expectation-maximization (EM) clustering
with order identification (identifying the number of clusters in the data).
We investigate two different performance criteria for evaluating candidate
feature subsets: scatter separability and maximum likelihood. When the
``true'' number of clusters k is unknown, our experiments on simulated
Gaussian data and real data sets show that incorporating the search for
k within the feature selection procedure yields higher ``class'' accuracy
than fixing k to the number of classes. There are two reasons: 1) the
``true'' number of Gaussian components is not necessarily equal to the
number of classes and 2) clustering with different feature subsets can
result in different numbers of ``true'' clusters. Our empirical evaluation
shows that feature selection reduces the number of features and improves
clustering performance with respect to the chosen performance criteria.
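
To make the scatter separability criterion concrete, the following is a minimal sketch rather than the implementation used in the experiments. It assumes the common trace(S_w^{-1} S_b) form of the criterion, where S_w is the within-cluster scatter and S_b the between-cluster scatter, and it assumes hard cluster assignments are already available (for example, from an EM clustering run on the candidate subset). The function name `scatter_separability` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def scatter_separability(X, labels):
    """Return trace(S_w^{-1} S_b) for data X under a hard cluster assignment.

    Assumption: this is one common form of the scatter separability
    criterion; larger values indicate more compact, better separated clusters.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape
    overall_mean = X.mean(axis=0)

    S_w = np.zeros((d, d))  # within-cluster scatter
    S_b = np.zeros((d, d))  # between-cluster scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        pi_c = len(Xc) / n                      # cluster proportion
        mu_c = Xc.mean(axis=0)                  # cluster mean
        centered = Xc - mu_c
        S_w += pi_c * (centered.T @ centered) / len(Xc)
        diff = (mu_c - overall_mean)[:, None]
        S_b += pi_c * (diff @ diff.T)

    # The pseudo-inverse guards against a singular within-cluster scatter.
    return float(np.trace(np.linalg.pinv(S_w) @ S_b))


# Toy data (hypothetical): feature 0 separates the two clusters,
# feature 1 has the same distribution in both clusters.
X = np.array([
    [0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 0.0], [11.0, 1.0], [10.0, 1.0], [11.0, 0.0],
])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

score_f0 = scatter_separability(X[:, [0]], labels)  # candidate subset {0}
score_f1 = scatter_separability(X[:, [1]], labels)  # candidate subset {1}
print(score_f0, score_f1)
```

In a wrapper search, a score of this kind would be recomputed for each candidate feature subset after re-clustering on that subset. Here the labels are held fixed only to keep the sketch short, and the subset containing feature 0 scores far higher than the subset containing only the uninformative feature 1.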