Mooney Face Image Processing in Deep Convolutional Neural Networks Compared to Humans

Publication date: 2022-01-01

Author:

Zeman, Astrid
Leers, Tim
de Beeck, Hans Op

Abstract:

Deep Convolutional Neural Networks (CNNs) are criticised for their reliance on local shape features and texture rather than global shape. We test whether CNNs can process global shape information in the absence of local shape cues and texture by measuring their performance on Mooney stimuli, which are face images thresholded to binary values. More specifically, we assess whether CNNs classify these abstract stimuli as face-like, and whether they exhibit the face inversion effect (FIE), where upright stimuli are classified positively at a higher rate than inverted stimuli. We tested two standard networks, one trained for general object recognition (CaffeNet) and another trained specifically for facial recognition (DeepFace). We found that both networks perform perceptual completion and exhibit the FIE, which is present over all levels of specificity. After matching the false positive rate of the CNNs to that of humans, we found that both networks performed closer to the human average (85.73% for upright, 57.25% for inverted) in both conditions (61.31% and 62.70% for upright, 48.61% and 42.26% for inverted, for CaffeNet and DeepFace respectively). Rank-order correlations between CNNs and humans across individual stimuli are significant in both the upright and inverted conditions, indicating a relationship in image difficulty between human observers and the models. We conclude that, in spite of the texture and local shape bias of CNNs, which makes their performance distinct from that of humans, they are still able to process object images holistically.
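The two analysis steps named in the abstract, matching a network's false positive rate to a target value and computing a rank-order correlation across stimuli, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the rule that a score strictly above the threshold counts as a "face" response, the use of Spearman's coefficient as the rank-order measure, and all numbers in the demo are assumptions made purely for illustration.

```python
from statistics import mean


def threshold_for_fpr(nonface_scores, target_fpr):
    """Return a threshold t such that at most target_fpr of the
    non-face scores lie strictly above t (assumed decision rule)."""
    ordered = sorted(nonface_scores, reverse=True)
    # Number of false positives we are allowed to make.
    allowed = int(target_fpr * len(ordered) + 1e-9)
    if allowed >= len(ordered):
        return float("-inf")
    return ordered[allowed]


def positive_rate(scores, threshold):
    """Fraction of stimuli classified as face-like at this threshold."""
    return mean(1.0 if s > threshold else 0.0 for s in scores)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy "face" scores, invented for this sketch only.
    nonface = [0.9, 0.8, 0.3, 0.2, 0.1]
    upright = [0.95, 0.85, 0.7, 0.4]
    inverted = [0.9, 0.6, 0.5, 0.2]

    t = threshold_for_fpr(nonface, target_fpr=0.2)
    print("upright rate:", positive_rate(upright, t))
    print("inverted rate:", positive_rate(inverted, t))

    # Hypothetical per-stimulus human accuracy vs. model score.
    human = [0.9, 0.7, 0.6, 0.3]
    print("rank correlation:", spearman(human, upright))
```

In this sketch the threshold is fixed on non-face stimuli first and then applied unchanged to upright and inverted stimuli, so that a difference between the two positive rates reflects orientation rather than a shift in response criterion.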