Julia Hancke, Sowmya Vajjala and Detmar Meurers
Proceedings of COLING 2012, the 24th Int. Conference on Computational Linguistics.
We investigate the problem of reading level assessment for German texts on a newly compiled corpus of freely available difficult and easy articles, targeted at adult and child readers respectively. We adapt a wide range of syntactic, lexical and language model features from previous research on English and combine them with new features that make use of the rich morphology of German. We show that readability classification for German based on these features is highly successful, reaching 89.7% accuracy, with the new morphological features making an important contribution.
BibTeX entry:
@InProceedings{hancke-vajjala-meurers-12,
author = {Hancke, Julia and Vajjala, Sowmya and Meurers, Detmar},
title = {Readability Classification for {G}erman using Lexical,
Syntactic, and Morphological Features},
booktitle = {Proceedings of the 24th International Conference on
Computational Linguistics (COLING 2012)},
year = {2012},
address = {Mumbai, India},
pages = {1063--1080},
url = {http://purl.org/dm/papers/hancke-vajjala-meurers-12.html},
pdf = {http://www.aclweb.org/anthology/C12-1065}
}