Adriane Boyd, Markus Dickinson, and Detmar Meurers
Proceedings of the Sixth Workshop on Treebanks and Linguistic Theories (TLT 2007). Bergen, Norway.
While error detection approaches have been developed for various types of corpus annotation, so far only limited attention has been paid to the recall of those methods. We show how the recall of the so-called variation $n$-gram method can be increased by examining comparable part-of-speech tag sequences instead of the recurring strings themselves. To guide the search for erroneous annotation and to distinguish errors with high precision, we also develop new context reliability indicators.
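As a rough illustration of the idea stated in the abstract, the following Python sketch compares occurrences of the same word by their surrounding context and reports those whose annotation varies. It is not the authors' implementation. The token layout `(word, pos, label)`, the one-token context window, the function name `find_variations`, and the toy corpus are all assumptions made for this example. Keying the context on words reproduces the stricter string-based comparison. Keying it on part-of-speech tags makes more occurrences comparable, which is how recall can increase.

```python
from collections import defaultdict

# Each token is a hypothetical (word, pos, label) triple; "label" stands for
# whatever annotation is being checked for inconsistencies.
WORD, POS, LABEL = 0, 1, 2


def find_variations(tokens, context_field):
    """Return contexts in which the same word carries more than one label.

    context_field selects what the one-token left/right context is keyed on:
    WORD for the surrounding strings, POS for the surrounding tags.
    """
    labels_by_context = defaultdict(set)
    for i in range(1, len(tokens) - 1):
        left, nucleus, right = tokens[i - 1], tokens[i], tokens[i + 1]
        key = (left[context_field], nucleus[WORD], right[context_field])
        labels_by_context[key].add(nucleus[LABEL])
    # Keep only contexts where the annotation of the nucleus varies.
    return {k: v for k, v in labels_by_context.items() if len(v) > 1}


# Toy corpus (invented data): "can" occurs twice with different labels.
corpus = [
    ("the", "DT", "O"), ("can", "NN", "NP"), ("fell", "VBD", "O"),
    ("a", "DT", "O"), ("can", "NN", "VP"), ("broke", "VBD", "O"),
]

by_word = find_variations(corpus, WORD)  # word contexts differ: nothing found
by_pos = find_variations(corpus, POS)    # tag contexts match: variation found
print(by_word)
print(by_pos)
```

In this toy data the two occurrences of "can" have different neighbouring words but the same neighbouring tags, so only the tag-based comparison flags them. A flagged variation is a candidate for inspection, not necessarily an error. Deciding which contexts are reliable enough is the role of the context reliability indicators mentioned in the abstract, which this sketch does not model.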
Electronically available file formats:
- .pdf (146,116 bytes)
BibTeX entry:
@InProceedings{boyd-et-al:07b,
  author    = {Adriane Boyd and Markus Dickinson and Detmar Meurers},
  title     = {Increasing the Recall of Corpus Annotation Error Detection},
  booktitle = {Proceedings of the Sixth Workshop on Treebanks and Linguistic Theories (TLT 2007)},
  address   = {Bergen, Norway},
  url       = {http://purl.org/dm/papers/boyd-et-al-07b.html},
  year      = {2007}
}