Abstract:
In this thesis, we address people detection in cluttered scenes. An image is searched for people by sliding a detection window over it and converting the content of each window to a feature vector. A dense feature representation of the detection window is obtained by dividing the window into overlapping blocks and extracting local features from each block. These block features are concatenated to form the feature vector of the detection window. Feature vectors are extracted from windows that contain people (positive samples) and windows that do not (negative samples), and are used to train a linear SVM classifier. We study several types of features for people detection. First, we perform people detection using Histogram of Oriented Gradients (HOG) features with various combinations of HOG extraction parameters, such as block size, gradient operator, number of HOG bins and normalization method. In addition to HOG features, we study other features such as Gabor energies, block orientation vectors, skin color, projection profiles and cluster distances. To increase the performance and reliability of the detection algorithm, various fusion techniques are applied at the data, feature and decision levels. For example, fusing HOG-based detector scores with Gabor-based detector scores yields improved detection scores. In addition, detectors of the same type are varied by changing their parameters, and the detection scores of these detectors are combined. The performance of the algorithms is measured under different parameters and configurations, and the results are compared using Detection Error Tradeoff (DET) plots.
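
As a rough illustration of the sliding-window pipeline summarized above, the following Python sketch may help. It is a simplified stand-in under stated assumptions, not the configuration used in the thesis: it relies on NumPy, and the 128x64 window, 16x16 blocks with a stride of 8 pixels, 9 unsigned orientation bins, L2 normalization, and all function names are illustrative choices. Each block is described by a single magnitude-weighted orientation histogram, which is coarser than a full HOG descriptor. The weight vector and bias are assumed to come from a linear SVM trained elsewhere.

```python
import numpy as np

def block_histogram(mag, ang, n_bins=9):
    """Magnitude-weighted orientation histogram of one block, L2-normalized."""
    # Map each unsigned angle in [0, pi) to a bin index; clip guards rounding at pi.
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (np.linalg.norm(hist) + 1e-6)

def window_features(window, block=16, stride=8, n_bins=9):
    """Concatenate the histograms of all overlapping blocks in a window."""
    gy, gx = np.gradient(window.astype(float))  # central-difference gradients
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation
    h, w = window.shape
    feats = []
    for y in range(0, h - block + 1, stride):
        for x in range(0, w - block + 1, stride):
            feats.append(block_histogram(mag[y:y + block, x:x + block],
                                         ang[y:y + block, x:x + block],
                                         n_bins))
    return np.concatenate(feats)

def sliding_windows(image, win_h=128, win_w=64, step=8):
    """Yield (row, col, window) for every window position in the image."""
    h, w = image.shape
    for y in range(0, h - win_h + 1, step):
        for x in range(0, w - win_w + 1, step):
            yield y, x, image[y:y + win_h, x:x + win_w]

def detect(image, weights, bias, threshold=0.0):
    """Score each window with a linear classifier; keep those above threshold."""
    hits = []
    for y, x, win in sliding_windows(image):
        score = float(window_features(win) @ weights + bias)
        if score > threshold:
            hits.append((y, x, score))
    return hits
```

With these assumed settings, a 128x64 window contains 15 x 7 overlapping blocks, so `window_features` returns a vector of 105 x 9 = 945 values, and `weights` must have the same length. Scanning at multiple image scales and merging overlapping detections are omitted from the sketch.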