We present a novel hierarchical model for human activity recognition. In contrast with approaches that successively recognize actions and activities, our approach jointly models actions and activities in a unified framework, and their labels are simultaneously predicted. The model is embedded with a latent layer that is able to capture a richer class of contextual information in both state-state and observation-state pairs. Although loops are present in the model, the model has an overall linear-chain structure, where the exact inference is tractable. Therefore, the model is very efficient in both inference and learning. The parameters of the graphical model are learned with a structured support vector machine. A data-driven approach is used to initialize the latent variables; therefore, no manual labeling for the latent states is required. The experimental results from using two benchmark datasets show that our model outperforms the state-of-the-art approach, and our model is computationally more efficient.
Document (PDF)
Niet bekend