Abstract:
Being able to understand and recognize actions is a crucial precondition for social integration: it enables the members of a social community to interact with each other and with their environment. Over the course of a person's cognitive development, the ability to detect and recognize actions improves gradually over a long period. First, we learn to recognize simple and explicit actions that have little ambiguity, such as waving, eating, or walking. As this ability matures, we learn to recognize subtler actions, such as smiling, resting, or reading. In more complex cases, these actions may overlap or combine into a more integral action, such as riding a bike. Similarly, for a robot to interact naturally and seamlessly with people and to make plans appropriate to the state of its environment, it must understand the actions of the people around it. For this, a robot needs a powerful action recognition module that can operate under real-world conditions, where a variety of distortions are present. Action recognition is the task of observing the sequential progression of a person's movements and matching segments of this sequence to previously defined action classes, each labeled with the action type that defines it. In many cases, the task of action recognition comes together with the task of action detection. Action detection is the task of extracting, from a usually long observation, the segments that contain some action. Humans often perform action recognition and detection naturally, subconsciously recognizing a person's actions without much effort. This ability makes social interaction much smoother. For computers, however, this is a very challenging problem, since many actions contain complex segments that may have very similar appearances and temporal progressions. Even subtle differences can put an action into an entirely different class.