dc.description.abstract |
Computer vision-based human action recognition is a highly active research area with applications in security, surveillance, assisted living, and entertainment. This thesis presents a new system for computer vision-based recognition of human actions that takes videos as input. The approach is invariant to the location of the action, zoom level, the appearance of the person, partial occlusions (including self-occlusions), and some viewpoint changes. It is also robust against variations in temporal length. Keypoints are tracked through time, and the trajectories of the tracked keypoints are used to interpret the human action in the video. Features are then extracted from the videos: a group of features for describing a trajectory is proposed, and trajectories are clustered using these trajectory features. The clustered trajectories are used to describe an image sequence; each image sequence descriptor is the normalized histogram of the trajectory clusters. At the final stage, the proposed system uses the image sequence descriptors in a supervised learning approach. An application based on the proposed method has been developed and applied to various datasets. A new multimodal dataset called WeCare, which focuses on elderly care systems, is introduced. The main objective of the dataset is to detect human falls; to this end, it also includes other actions that can be confused with falling. The proposed approach is evaluated on two datasets: the KTH Human Action Dataset and the URADL Dataset. The proposed technique performs comparably to methods in the literature, with 87.25 per cent accuracy on the KTH dataset and 88 per cent accuracy on the URADL dataset. It achieves 98.75 per cent accuracy on the WeCare dataset. |
|