Abstract:
Person re-identification (re-id) is a challenging task, particularly when people appear in varied poses and parts of the body are occluded or missing from the view. In this thesis, we propose a part-based person re-id method to cope with these challenges. The proposed method does not require the input images to be aligned and works even when only part of the body is visible. A pose estimator is used to detect and extract body parts from the input images. Each extracted body part is then processed separately by its associated deep learning model. Given two images of the same body part extracted from two different raw images, each model outputs the probability that they belong to the same person. In the fusion step, the outputs of the body part models are combined into a final probability value. We investigated three fusion methods and showed empirically that averaging the outputs of the available body part models performs best among the three. To develop and evaluate the proposed method, we collected the Robot Cafe dataset, which abounds with the challenges mentioned above. For this purpose, we developed an annotation tool for quickly and easily selecting and extracting person images from videos. The Robot Cafe dataset contains 10,969 images of 93 persons. Experiments are conducted on the Robot Cafe and CUHK03 datasets. The experimental results show that our method is more robust to missing body parts and large pose changes than some of the previous studies.
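
The averaging fusion step described above can be illustrated with a minimal sketch. It assumes that each body part model returns a probability in [0, 1] and that a body part missing from either image is represented as `None`. The function name `fuse_by_averaging`, the part names, and the scores are hypothetical and are not taken from the thesis.

```python
from typing import Dict, Optional


def fuse_by_averaging(part_probs: Dict[str, Optional[float]]) -> Optional[float]:
    """Average the same-person probabilities of the available body parts.

    A value of None marks a body part that could not be extracted from at
    least one of the two images (an assumed convention for this sketch).
    Returns None when no body part is available at all.
    """
    available = [p for p in part_probs.values() if p is not None]
    if not available:
        return None
    return sum(available) / len(available)


# Hypothetical part names and scores, for illustration only.
scores = {"head": 0.75, "torso": 0.5, "legs": None}
fused = fuse_by_averaging(scores)  # 0.625: the missing "legs" entry is ignored
```

Because only the available parts enter the average, a pair of images in which some parts are occluded still receives a score, which is consistent with the robustness to missing body parts claimed above.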