Abstract:
Attentive robots, inspired by human vision, require visual systems with a fovea-periphery distinction and saccadic motion capability. Consequently, each frame in the incoming image sequence is nonuniformly sampled, and consecutive saccadic images are temporally redundant. In this thesis, we propose a novel video coding and streaming algorithm for low-bandwidth networks that exploits these two features simultaneously. Our experimental results show improved video streaming in applications such as robotic teleoperation. Furthermore, we present a complete framework for foveating to the most interesting region of the scene using attention criteria. The construction of the attention function can vary depending on the set of visual primitives used. In our case, we show the feasibility of using Cartesian and non-Cartesian filters for human-face videos. Since the algorithm is predicated on the Gaussian-like resolution of the human visual system and is extremely simple to integrate with standard coding schemes, it can also be used in applications such as video-enabled cellular phones.