Digital Scholarship in the Humanities

Publication date: 2020-04-01
Volume: 35 16
Publisher: Oxford University Press (OUP)

Author:

Keersmaekers, Alek

Keywords:

Arts & Humanities, Social Sciences, Humanities, Multidisciplinary, Linguistics, Arts & Humanities - Other Topics

Abstract:

This article describes a first attempt to annotate the full Greek papyrus corpus automatically for linguistic information. It gives an overview of existing work on Ancient Greek and analyzes the typical problems encountered when applying natural language processing techniques to (1) a historical corpus of (2) a highly inflectional language (as opposed to the more analytic present-day English). It then offers solutions to these problems, testing several different approaches. The focus is on part-of-speech/morphological tagging and lemmatization; some syntactic parsing experiments are also briefly discussed. The conclusion discusses the strengths and shortcomings of the examined techniques and suggests possible ways to further improve tagging and parsing accuracy.