Title: Children’s oral reading corpus (CHOREC) : description and assessment of annotator agreement
Authors: Cleuren, Leen ×
Duchateau, Jacques
Ghesquière, Pol
Van hamme, Hugo #
Issue Date: 2008
Publisher: European Language Resources Association (ELRA)
Host Document: LREC 2008 Proceedings pages:1-8
Conference: International conference on language resources and evaluation - LREC 2008 edition:6 location:Marrakech, Morocco date:28-30 May 2008
Abstract: Within the scope of the SPACE project, the CHildren’s Oral REading Corpus (CHOREC) is developed. This database contains recorded, transcribed and annotated read speech (42 GB or 130 hours) of 400 Dutch-speaking elementary school children with or without reading difficulties. Analyses of inter- and intra-annotator agreement are carried out to investigate the consistency with which reading errors are detected, orthographic and phonetic transcriptions are made, and reading errors and reading strategies are labeled. Percentage agreement scores and kappa values both show that agreement between annotations, and therefore the quality of the annotations, is high. Taking all double and triple annotations together (for 10% and 30% of the corpus, respectively), percentage agreement varies between 86.4% and 98.6%, whereas kappa varies between 0.72 and 0.97, depending on the annotation tier being assessed. School type and reading type seem to account for systematic differences in percentage agreement, but these differences disappear when kappa values, which correct for chance agreement, are calculated. To conclude, an analysis is given of the annotation differences with respect to the ’*s’ label (i.e. a label used to annotate indistinguishable spelling behaviour), phoneme labels, and reading strategy and error labels.
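Illustration (not part of the original record): the abstract contrasts raw percentage agreement with kappa, which corrects for chance agreement. The minimal sketch below assumes Cohen's kappa for two annotators; the record does not state which kappa variant the authors used, and the function names and the "ok"/"error" labels are hypothetical, not CHOREC data.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two annotators assigned the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Assumption: two annotators, nominal labels. The paper may have used a
    different kappa variant.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    p_e = sum(counts_a[lab] * counts_b[lab]
              for lab in counts_a.keys() | counts_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical word-level labels from two annotators (not real corpus data).
ann1 = ["ok", "ok", "ok", "ok", "ok", "ok", "ok", "ok", "error", "error"]
ann2 = ["ok", "ok", "ok", "ok", "ok", "ok", "ok", "error", "error", "ok"]

print(percent_agreement(ann1, ann2))
print(cohen_kappa(ann1, ann2))
```

With these made-up labels, percentage agreement is 0.8 while kappa is about 0.375, because most of the raw agreement is expected by chance when one label dominates. This is the kind of effect that can make differences in percentage agreement shrink once kappa is used.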
Description: Cleuren L., Duchateau J., Ghesquière P., Van hamme H., ''Children’s oral reading corpus (CHOREC) : description and assessment of annotator agreement'', Proceedings 6th international conference on language resources and evaluation - LREC 2008, 8 pp., May 28-30, 2008, Marrakech, Morocco.
ISBN: 2-9517408-4-0
Publication status: published
KU Leuven publication type: IC
Appears in Collections:Parenting and Special Education
ESAT - PSI, Processing Speech and Images
× corresponding author
# (joint) last author

Files in This Item:
File: 2751.pdf
Status: Published
Size: 234 Kb
Format: Adobe PDF


