Online Activity Understanding and Labeling in Natural Videos
The first approach we propose is based on an ensemble of SVM classifiers. Given a set of unlabeled data, we select the most informative queries to be labeled by a human annotator. We then train new SVMs and add them to the ensemble with updated weights.

In the second approach, we take advantage of the contextual relationships among the activities and objects in a video sequence. We encode contextual information using a conditional random field (CRF) and present a novel active learning algorithm that utilizes both the entropy and the mutual information of the activity nodes.

In the third approach, we further reduce human effort by making early predictions of the activity labels and providing dynamic suggestions to the annotator. State-of-the-art approaches do not scale with the growing number of video categories, nor do they account for the long viewing time required for video labeling. Our proposed framework uses label propagation and an LSTM-based recurrent neural network to effectively select informative queries and provide early suggestions to the annotator.

In the fourth approach, we propose a continuous activity learning framework that intricately ties together deep hybrid feature models and active learning. This allows us to automatically select the most suitable features and to take advantage of incoming unlabeled instances to improve the existing model incrementally.

Finally, we propose a method for temporally segmenting meaningful activities that requires very limited supervision. Perceiving these activities in a long video sequence is a challenging problem due to the ambiguous definition of meaningfulness as well as clutter in the scene. We approach this problem by learning a generative model of regular motion patterns. We propose two methods built upon autoencoders, chosen for their ability to work with unlabeled data.
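The query-selection step of the first approach can be made concrete with a small sketch. Assuming scikit-learn-style SVMs, one standard informativeness measure for an ensemble is vote entropy; the function names and the accuracy-based weighting below are illustrative assumptions, not the dissertation's exact formulation.

```python
import numpy as np
from sklearn.svm import SVC

def vote_entropy(ensemble, X_pool):
    """Entropy of the ensemble's label votes for each instance in the pool."""
    votes = np.stack([clf.predict(X_pool) for clf in ensemble])  # (n_clf, n_pool)
    n_clf = votes.shape[0]
    entropies = np.empty(votes.shape[1])
    for j in range(votes.shape[1]):
        _, counts = np.unique(votes[:, j], return_counts=True)
        p = counts / n_clf
        entropies[j] = -(p * np.log(p)).sum()
    return entropies

def select_queries(ensemble, X_pool, k=10):
    """Indices of the k highest-disagreement instances to send to the annotator."""
    return np.argsort(-vote_entropy(ensemble, X_pool))[:k]

def extend_ensemble(ensemble, X_labeled, y_labeled):
    """Train a new SVM on the labeled data gathered so far and add it to the
    ensemble; the weights here are each member's accuracy on that same set
    (an illustrative choice, not necessarily the dissertation's)."""
    ensemble.append(SVC(kernel="rbf").fit(X_labeled, y_labeled))
    weights = [clf.score(X_labeled, y_labeled) for clf in ensemble]
    return ensemble, weights
```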
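For the second approach, the CRF-based criterion combines the entropy of an activity node's marginal belief with the mutual information it shares with neighboring nodes. The sketch below assumes the marginal and pairwise beliefs have already been computed (e.g., by belief propagation); the weighted combination with `lam` is a hypothetical illustration of how the two quantities might be traded off.

```python
import numpy as np

def node_entropy(p):
    """H(X) of an activity node's marginal belief p over activity labels."""
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

def mutual_information(p_xy):
    """I(X;Y) from a pairwise joint belief table p_xy[i, j] = P(X=i, Y=j)."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    ratio = np.clip(p_xy, 1e-12, 1.0) / np.clip(p_x * p_y, 1e-12, None)
    return (p_xy * np.log(ratio)).sum()

def informativeness(p_node, joint_beliefs, lam=0.5):
    """Score a candidate query node by its own uncertainty plus the
    information it shares with its CRF neighbors (joint_beliefs is a
    list of pairwise joint tables, one per incident edge)."""
    return node_entropy(p_node) + lam * sum(mutual_information(j) for j in joint_beliefs)
```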
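The early-suggestion idea in the third approach can be sketched as an LSTM that emits a label posterior after every frame; once the running confidence crosses a threshold, the predicted label is suggested to the annotator before the clip finishes playing. The model sizes and the confidence threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EarlyPredictor(nn.Module):
    def __init__(self, feat_dim=512, hidden=128, n_classes=20):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, frames):                   # frames: (1, T, feat_dim)
        h, _ = self.lstm(frames)
        return self.head(h).softmax(dim=-1)      # label posterior per frame

def suggest_early(model, frames, threshold=0.9):
    """Return (frame_index, label) for the first confident prediction,
    or None if the annotator must watch the whole clip."""
    with torch.no_grad():
        probs = model(frames)[0]                 # (T, n_classes)
    conf, labels = probs.max(dim=-1)
    for t in range(conf.shape[0]):
        if conf[t] >= threshold:
            return t, int(labels[t])
    return None
```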
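Finally, the autoencoder-based regularity model of the last approach reduces to training a reconstruction objective on unlabeled motion features and scoring new frames by reconstruction error: frames that reconstruct poorly deviate from the learned regular motion patterns. The architecture and feature dimensions below are illustrative assumptions; the dissertation proposes two specific autoencoder variants.

```python
import torch
import torch.nn as nn

class MotionAutoencoder(nn.Module):
    """Learns to reconstruct features of regular motion from unlabeled video."""
    def __init__(self, dim=512, code=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, code))
        self.decoder = nn.Sequential(nn.Linear(code, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def regularity_score(model, features):
    """Per-frame reconstruction error; frames with low error follow the
    regular motion patterns the model was trained on."""
    with torch.no_grad():
        recon = model(features)
    return ((features - recon) ** 2).mean(dim=1)

# Training simply minimizes reconstruction loss on unlabeled frames, e.g.:
#   loss = nn.functional.mse_loss(model(batch), batch)
```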