Multi-person head segmentation in low resolution crowd scenes using convolutional encoder-decoder framework
Abstract
Person head detection in crowded scenes is challenging when facial features are absent, resolution is low, and viewing angles are unfavorable. Motion blur, out-of-focus blur, and headwear of varying shapes exacerbate the problem. As a result, existing head and face detection algorithms exhibit high failure rates. We propose a multi-person head segmentation algorithm for crowded environments based on a convolutional encoder-decoder network trained on head probability heatmaps. The network learns to assign high probability to head pixels and low probability to non-head pixels in an input image. The image is first downsampled in the encoder blocks and then upsampled in the decoder blocks to capture multiresolution information. The information lost through downsampling is compensated by copy links, which copy data directly from the encoder blocks to the decoder blocks. In contrast to traditional sliding-window detectors, all heads and faces in an image patch are detected simultaneously. Compared with existing state-of-the-art methods, the proposed algorithm demonstrates excellent performance on a challenging spectator crowd dataset.
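The role of the copy links can be illustrated with a toy sketch. It is not the network described in the abstract: it has no convolutions or learned weights. It assumes 2x2 average pooling as a stand-in for an encoder block and nearest-neighbour upsampling as a stand-in for a decoder block, and the function names (`downsample`, `upsample`, `decode_with_copy_link`) are hypothetical. It shows only that a single bright pixel is smeared by the downsample/upsample round trip, while a copy link keeps the full-resolution value available to the decoder.

```python
# Toy illustration only: no convolutions, no learned weights.
# Pooling and upsampling choices are assumptions, not the paper's layers.

def downsample(img):
    """2x2 average pooling (stand-in for one encoder block)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

def upsample(img):
    """2x nearest-neighbour upsampling (stand-in for one decoder block)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def decode_with_copy_link(encoder_feat, coarse):
    """Pair each upsampled value with the copied encoder value at that pixel."""
    up = upsample(coarse)
    return [[(e, u) for e, u in zip(enc_row, up_row)]
            for enc_row, up_row in zip(encoder_feat, up)]

# 4x4 input containing a single bright "head" pixel at row 1, column 2.
image = [[0.0] * 4 for _ in range(4)]
image[1][2] = 1.0

coarse = downsample(image)                       # 2x2 map, detail averaged away
without_link = upsample(coarse)                  # 4x4 map, pixel smeared over a 2x2 block
with_link = decode_with_copy_link(image, coarse) # 4x4 map of (copied, upsampled) pairs
```

In a real encoder-decoder the paired values would be feature channels fed to further convolutions; here they are kept as tuples so the preserved full-resolution value and the blurred coarse value can be compared side by side.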