As digital technology develops, a large amount of visual data (images and videos) is captured and shared every day on social media. Most of this data lacks organization, and tags or labels are rarely available. Extensive efforts have been made in both industry and academia to represent unlabeled visual data for different purposes. For instance, Apple launched a face clustering application in iOS 10 on the iPhone to help users organize their photos by human identity. However, it has drawn many criticisms, including poor clustering performance on occluded faces, non-frontal faces, and other difficult cases. The fundamental problem behind this challenging face clustering task is how to robustly represent photos under different variations, such as head pose, lighting conditions, occlusions, or even larger corruptions. In this dissertation, we focus on solving robust visual data representation problems using unsupervised subspace learning.
According to the number of input data views (modalities), unsupervised subspace clustering methods are usually divided into two categories: single-view subspace clustering and multi-view subspace clustering. In this dissertation, both the single-view and the multi-view cases are discussed.
Specifically, in single-view subspace clustering, we propose a novel graph-based method, Ensemble Subspace Segmentation under Block-wise constraints (ESSB), which unifies least squares regression and a locality preserving graph regularizer in an ensemble learning framework. A “divide-and-conquer” strategy is applied to the features, resulting in an efficient ESSB framework for high-dimensional data. For large-scale data, we propose Fast Regression Coding (FRC), which optimizes regression codes and simultaneously trains a non-linear function to approximate them. Using FRC, we develop an efficient Regression Coding Clustering (RCC) framework for large-scale clustering, consisting of sampling, FRC, and clustering. In addition, we provide a theoretical guarantee that the non-linear function has a first-order approximation ability and a group effect. The theorem shows that the codes can readily be used to construct a similarity graph that is easy to partition.
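To make the regression-coding idea concrete, the following is a minimal sketch of plain least squares regression coding for subspace clustering, not the ESSB or FRC method itself: it omits the locality preserving graph regularizer, the block-wise ensemble over features, and the non-linear approximating function. The function names, the regularization weight `lam`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def lsr_codes(X, lam=0.1):
    """Ridge-regularized self-representation.

    X has shape (d, n) with samples as columns. Solves
    min_Z ||X - X Z||_F^2 + lam * ||Z||_F^2, whose closed form is
    Z = (X^T X + lam I)^{-1} X^T X.
    """
    n = X.shape[1]
    G = X.T @ X  # Gram matrix between samples
    return np.linalg.solve(G + lam * np.eye(n), G)

def affinity_from_codes(Z):
    """Symmetric, non-negative similarity graph built from the codes."""
    A = np.abs(Z)
    return 0.5 * (A + A.T)

# Toy data (hypothetical): five points in each of two orthogonal 2-D subspaces of R^6.
rng = np.random.default_rng(0)
basis = np.eye(6)
X1 = basis[:, :2] @ rng.standard_normal((2, 5))
X2 = basis[:, 2:4] @ rng.standard_normal((2, 5))
X = np.hstack([X1, X2])

W = affinity_from_codes(lsr_codes(X))
```

In this idealized orthogonal setting the affinity `W` is block diagonal, so any graph partitioning step, such as spectral clustering, separates the two groups. Real data is noisy and the subspaces are rarely orthogonal, which is where additional regularizers become relevant.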
In multi-view subspace clustering, we first present a deep matrix factorization framework for the traditional multi-view clustering problem, where semi-nonnegative matrix factorization is adopted to learn the hierarchical semantics of multi-view data in a layer-wise fashion. To maximize the mutual information from each view, we enforce the non-negative representation of each view in the final layer to be the same. Beyond this, we consider a more practical and challenging scenario in which view information is missing for random samples. A novel robust graph regularized method is proposed to handle the incomplete data by projecting the original, incomplete data into a new, complete latent space. Finally, as a new application, we define two types of outliers in the multi-view learning setting: attribute-type and class-type. By representing the multi-view data with latent coefficients and sample-specific errors, the proposed consensus regularized multi-view outlier detection method is able to detect both types of anomalies simultaneously.
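As a point of reference for the building block named above, here is a minimal sketch of single-view, single-layer semi-nonnegative matrix factorization using the standard multiplicative updates. It is not the deep multi-view framework: there is no layer stacking and no shared final-layer representation across views. The function name `semi_nmf`, its parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Factor X (d, n) as Z @ H with H >= 0 and Z unconstrained in sign."""
    rng = np.random.default_rng(seed)
    H = rng.random((k, X.shape[1])) + 0.1  # strictly positive start
    pos = lambda A: (np.abs(A) + A) / 2    # element-wise positive part
    neg = lambda A: (np.abs(A) - A) / 2    # element-wise negative part
    errors = []
    for _ in range(n_iter):
        # Least squares update of the basis for the current H.
        Z = X @ np.linalg.pinv(H)
        ZtX, ZtZ = Z.T @ X, Z.T @ Z
        # Multiplicative update that keeps H non-negative.
        H = H * np.sqrt((pos(ZtX) + neg(ZtZ) @ H) /
                        (neg(ZtX) + pos(ZtZ) @ H + eps))
        errors.append(np.linalg.norm(X - Z @ H))
    return Z, H, errors

# Toy mixed-sign data (hypothetical): 8 features, 30 samples, 3 latent factors.
X = np.random.default_rng(1).standard_normal((8, 30))
Z, H, errors = semi_nmf(X, k=3)
```

Because only `H` is constrained to be non-negative, the input data may contain negative values, and the columns of `H` can be read as soft cluster indicators for the samples.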
- Professor Yun Raymond Fu (Advisor)
- Professor Sarah Ostadabbas
- Professor Ehsan Elhamifar