Sahoo, Suraj Prakash (2021) Human Action Recognition Based on Analysis of Video Sequences. PhD thesis.
From OAIster®, provided by the OCLC Cooperative.
Human actions are defined as the coordinated movement of different body parts in a meaningful way that describes different aspects of human behaviour. Recognizing human actions through computer vision is a trending research area, with applications in both indoor and outdoor environments. Human action recognition (HAR) has broad applications in surveillance, patient behaviour monitoring, video retrieval, sports video analysis, human-computer interaction, etc. However, processing action videos is a challenging and complex task. This motivates the development of a good HAR algorithm with better video representation, feature extraction, and classification capabilities to recognize different action classes effectively.

In this regard, a semi-supervised tree and 3D local feature based HAR paradigm (sST-3DF) is developed first. Here, a motion history image (MHI) based interest point refinement is proposed to remove noisy interest points. Histogram of oriented gradients (HOG) and histogram of optical flow (HOF) techniques are extended from the spatial to the spatio-temporal domain to preserve temporal information. These local features are used to build the trees of a random forest. During tree building, semi-supervised learning is proposed for better splitting of data points at each node. To recognize an action in a video, the mutual information between all the extracted interest points and each trained class is estimated by passing the points through the random forest.

Next, a two-stream sequential network is developed to leverage sequential and shape information for more efficient recognition of human actions. In this technique, a deep bi-directional long short-term memory (DBiLSTM) network is constructed to model the temporal relationship between action frames through sequential learning. Action information in each frame is extracted using a pre-trained convolutional neural network (CNN). During the shape learning, the knowledge of shape i
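To make the MHI-based refinement step concrete, the following is a minimal sketch (not the thesis implementation) of the standard motion history image update: pixels where frame-to-frame motion exceeds a threshold are stamped with the timestamp value `tau`, while static pixels decay toward zero. Interest points that fall on zero-valued MHI regions can then be discarded as noise. The parameter names `tau`, `delta`, and `threshold` are illustrative assumptions.

```python
import numpy as np

def update_mhi(mhi, frame_diff, tau=15.0, delta=1.0, threshold=30):
    """One MHI update step: moving pixels are set to tau,
    static pixels decay by delta (floored at 0)."""
    motion = frame_diff > threshold
    return np.where(motion, tau, np.maximum(mhi - delta, 0.0))

# Toy usage: two 4x4 frames with motion at one pixel.
prev = np.zeros((4, 4), dtype=np.float32)
curr = np.zeros((4, 4), dtype=np.float32)
curr[0, 0] = 255.0
mhi = np.zeros((4, 4), dtype=np.float32)
mhi = update_mhi(mhi, np.abs(curr - prev))
# mhi is now nonzero only where motion occurred; an interest point
# refinement pass would keep only points landing on nonzero MHI cells.
```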
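One simple way to extend a spatial orientation histogram into the spatio-temporal domain, as the abstract describes for HOG/HOF, is to compute a per-frame gradient orientation histogram and concatenate the histograms over time so temporal order is preserved. This is a hedged sketch under that assumption; the bin count and normalization are illustrative, not the thesis's exact descriptor.

```python
import numpy as np

def st_hog(volume, n_bins=8):
    """Spatio-temporal HOG sketch: magnitude-weighted orientation
    histogram per frame, concatenated along the time axis."""
    hists = []
    for frame in volume:  # iterate over time
        gy, gx = np.gradient(frame.astype(np.float64))
        mag = np.hypot(gx, gy)
        ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
        bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
        h = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
        hists.append(h / (h.sum() + 1e-8))  # L1-normalize each frame
    return np.concatenate(hists)

vol = np.random.rand(5, 16, 16)  # 5 frames of 16x16 pixels
desc = st_hog(vol)
# descriptor length = n_frames * n_bins = 40
```

An HOF variant would replace the image gradients with optical-flow vectors between consecutive frames but keep the same binning and concatenation scheme.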
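The recognition step of the sST-3DF pipeline, where every interest point is passed through the trained forest and per-class evidence is accumulated, can be sketched as a voting scheme. The accumulated vote count here is a simple stand-in for the mutual-information estimate described in the abstract; the toy "trees" below are hypothetical rule functions, not learned models.

```python
from collections import Counter

def classify_video(point_descriptors, forest, classes):
    """Pass each interest point through every tree, accumulate
    per-class votes, and return the highest-scoring class."""
    scores = Counter()
    for desc in point_descriptors:
        for tree in forest:
            scores[tree(desc)] += 1
    return max(classes, key=lambda c: scores[c])

# Toy forest: each "tree" maps a 2-D descriptor to an action label.
forest = [
    lambda d: "walk" if sum(d) > 1.0 else "wave",
    lambda d: "walk" if d[0] > 0.5 else "wave",
]
points = [[0.9, 0.9], [0.1, 0.2], [0.8, 0.4]]
label = classify_video(points, forest, ["walk", "wave"])  # → "walk"
```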