In the era of social media, more and more social characteristics are conveyed through multimedia, i.e., images, videos, audio, and webpages with rich media content. With the emergence and popularity of the Internet, collecting multimedia data has never been easier. However, much of the data available online lacks organization, and its tags or labels are sparse or only loosely organized with respect to the problems we are interested in.
The basic question is how to learn discriminant features or representations given very limited or poor training data for social media analytics.
In this thesis, we focus on popular social media data such as face, object, and digit images, and study the problems of social media analytics along two lines: (1) developing efficient and effective machine learning tools for limited or poor training data by considering the structure of the data from different domains, and (2) applying existing or newly developed machine learning tools to novel social media problems, e.g., kinship verification and family photo understanding. These two lines are detailed as follows.
For knowledge-based machine learning algorithms, labels or tags are critical in training a discriminative model. However, labeling data is not an easy task, because the data are either too costly to obtain or too expensive to hand-label. For that reason, researchers use labeled, yet relevant, data from different databases to facilitate the learning process. This is exactly the setting of transfer learning, which studies how to transfer knowledge gained from existing, well-established data (source) to a new problem (target). To this end, we propose two novel methods that align the structure of the source and target data in a learned subspace to mitigate the domain divergence, yielding robust, scalable, and discriminant features for learning tasks. The first method utilizes a low-rank regularizer to guide the discriminant subspace learning. The second method takes advantage of unlabeled source data for efficient large-scale transfer learning.
The transfer learning approaches above are well suited to social media analytics problems such as kinship verification. A critical observation is that faces of parents captured when they were young resemble their children's faces more closely than images captured when they are old. Therefore, we can readily apply the proposed transfer learning methods to the kinship verification problem mentioned above, where the kin relation between young parent and child is the source problem, while that between old parent and child is the target. The promising research outcomes can be extended to real-world applications such as family album management, image retrieval and annotation, and missing children search.
Advisor: Professor Yun Fu
Professor Yun Fu
Professor Jennifer Dy
Professor Yizhou Sun