Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015)
Location: Lisbon, Portugal. Date: 17-21 September 2015.
This paper addresses bilingual lexicon induction using image-based features. Using features from a convolutional neural network (CNN), we obtain state-of-the-art performance on a standard dataset, a 79% relative improvement over previous work that uses bags of visual words based on SIFT features. We also compare the CNN image-based approach with state-of-the-art linguistic approaches to bilingual lexicon induction; on another standard dataset, it outperforms them for one of three language pairs. Furthermore, we shed new light on which type of visual similarity metric to use for genuine similarity versus relatedness tasks, and we experiment with using multiple layers from the same network in an attempt to improve performance.
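To make the task concrete, the following is a minimal sketch of image-based bilingual lexicon induction, not the paper's implementation. It assumes each word is already represented by a small set of image feature vectors (standing in for CNN features), and it ranks candidate translations by visual similarity. The function names, the two similarity metrics (`sim_mean`, `sim_avg_max`), and the toy English-Spanish data are all hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sim_mean(images_a, images_b):
    """Assumed metric 1: cosine between the averaged image vectors."""
    return cosine(mean_vector(images_a), mean_vector(images_b))


def sim_avg_max(images_a, images_b):
    """Assumed metric 2: for each image of word A, take its best match
    among the images of word B, then average those best scores."""
    best = [max(cosine(a, b) for b in images_b) for a in images_a]
    return sum(best) / len(best)


def induce_lexicon(source, target, sim=sim_mean):
    """Map each source word to the visually most similar target word."""
    return {
        src_word: max(target, key=lambda tgt: sim(src_images, target[tgt]))
        for src_word, src_images in source.items()
    }


# Toy 3-dimensional "image features" (hypothetical, not real CNN output).
source = {
    "dog": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "car": [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]],
}
target = {
    "perro": [[1.0, 0.0, 0.1], [0.7, 0.3, 0.0]],
    "coche": [[0.1, 0.1, 1.0], [0.0, 0.2, 0.7]],
}

lexicon_mean = induce_lexicon(source, target, sim=sim_mean)
lexicon_avg_max = induce_lexicon(source, target, sim=sim_avg_max)
```

On this toy data both metrics pair "dog" with "perro" and "car" with "coche". Passing the similarity function as a parameter makes it easy to compare metrics on the same features, which is the kind of comparison the abstract alludes to.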