IEEE Transactions on Pattern Analysis and Machine Intelligence

Publication date: 2014-11-01
Volume: 36 Pages: 2131 - 2143
Publisher: IEEE Computer Society

Author:

Dantone, Matthias ; Gall, Juergen ; Leistner, Christian ; Van Gool, Luc

Keywords:

Human pose estimation; fashion; random forest; regression; classification; PSI_VISICS; Science & Technology; Technology; Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Computer Science; Engineering; PICTORIAL STRUCTURES; FLEXIBLE MIXTURES; OBJECT DETECTION; TREE MODELS; CONSTRAINTS; Algorithms; Databases, Factual; Humans; Image Processing, Computer-Assisted; Joints; Posture; 0801 Artificial Intelligence and Image Processing; 0806 Information Systems; 0906 Electrical and Electronic Engineering; Artificial Intelligence & Image Processing; 4603 Computer vision and multimedia computation; 4611 Machine learning

Abstract:

In this work, we address the problem of estimating 2D human pose from still images. Articulated body pose estimation is challenging due to the large variation in body poses and in the appearance of the different body parts. Recent methods that rely on the pictorial structure framework have proven very successful at this task. They model the body part appearances using discriminatively trained, independent part templates and the spatial relations of the body parts using a tree model. Within such a framework, we address the problem of obtaining better part templates that can handle very high variation in appearance. To this end, we introduce parts-dependent body joint regressors, which are random forests that operate over two layers. While the first layer acts as an independent body part classifier, the second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This helps to overcome typical ambiguities of tree structures, such as self-similarities of legs and arms. In addition, we introduce a novel data set termed FashionPose that contains over 7,000 images with challenging variation in body part appearance due to a wide range of dressing styles. In the experiments, we demonstrate that the proposed parts-dependent joint regressors outperform independent classifiers or regressors. The method also performs better than or on par with the state of the art in terms of accuracy, while running at a couple of frames per second.
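
The pictorial structure framework named in the abstract combines a per-part appearance score with pairwise spatial terms arranged in a tree. The following is a minimal sketch of inference in such a tree model, not the authors' implementation. It assumes that each part comes with a short list of candidate locations and appearance scores (which, in the paper's setting, would be derived from the joint regressors), and it assumes a quadratic penalty on the deviation from an expected parent-to-child offset. The function name `infer_pose`, its parameters, and the toy numbers are all hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def infer_pose(
    parent: Dict[str, str],              # child part -> parent part (root has no entry)
    candidates: Dict[str, List[Point]],  # candidate (x, y) locations per part
    unary: Dict[str, List[float]],       # appearance score per candidate, higher is better
    offset: Dict[str, Point],            # assumed expected (child - parent) displacement
    weight: float = 1.0,                 # assumed strength of the spatial penalty
) -> Dict[str, Point]:
    """Max-sum dynamic programming over a tree of body parts."""
    parts = list(candidates)
    root = next(p for p in parts if p not in parent)
    children = {p: [c for c in parts if parent.get(c) == p] for p in parts}
    best_child = {}  # (child, parent candidate index) -> best child candidate index

    def pairwise(child: str, pc: Point, pp: Point) -> float:
        # Quadratic deformation cost relative to the expected offset.
        dx = pc[0] - pp[0] - offset[child][0]
        dy = pc[1] - pp[1] - offset[child][1]
        return -weight * (dx * dx + dy * dy)

    def upward(part: str) -> List[float]:
        # Score of each candidate of `part`, including its best subtree.
        scores = list(unary[part])
        for c in children[part]:
            c_scores = upward(c)
            for i, pp in enumerate(candidates[part]):
                options = [
                    c_scores[j] + pairwise(c, pc, pp)
                    for j, pc in enumerate(candidates[c])
                ]
                j_best = max(range(len(options)), key=options.__getitem__)
                best_child[(c, i)] = j_best
                scores[i] += options[j_best]
        return scores

    pose: Dict[str, Point] = {}

    def downward(part: str, idx: int) -> None:
        # Backtrack from the root, following the stored argmax choices.
        pose[part] = candidates[part][idx]
        for c in children[part]:
            downward(c, best_child[(c, idx)])

    root_scores = upward(root)
    downward(root, max(range(len(root_scores)), key=root_scores.__getitem__))
    return pose


# Toy arm chain: the elbow's appearance score alone favors (20, 0),
# but the spatial term makes (0, 10) the jointly best choice.
pose = infer_pose(
    parent={"elbow": "shoulder", "wrist": "elbow"},
    candidates={
        "shoulder": [(0, 0)],
        "elbow": [(0, 10), (20, 0)],
        "wrist": [(0, 20), (30, 0)],
    },
    unary={"shoulder": [1.0], "elbow": [0.4, 0.6], "wrist": [0.5, 0.5]},
    offset={"elbow": (0, 10), "wrist": (0, 10)},
    weight=0.01,
)
```

Because each part depends only on its parent, the best configuration is found exactly in time linear in the number of parts and quadratic in the number of candidates per part. The same independence is what makes tree models prone to the left/right ambiguities the abstract mentions.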