Proceedings of the RDWG Online Symposium on Easy-to-Read on the Web
Date: 3 December 2012
The WAI-NOT environment is a platform that allows people with cognitive disabilities to communicate online using pictographs instead of text. It supports two pictograph sets. Users can enter messages using their pictograph set, text, or both. These messages are encoded as text and sent to the receiver, where they are decoded into the target pictograph set wherever possible. There are two problems with this approach:
1. The decoding is purely string-based and no disambiguation takes place, which occasionally leads to wrong pictograph generation; and
2. The current string-based overlap between the two sets is too small to be of practical use.
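Both problems can be illustrated with a minimal sketch of string-based decoding. The two pictograph sets, their file names, and the functions `decode` and `overlap` below are hypothetical toy stand-ins, not the actual WAI-NOT data or code; English words are used only for readability.

```python
# Hypothetical toy pictograph sets: surface string -> pictograph file.
# These are illustrative assumptions, not the real WAI-NOT sets.
SET_A = {"dog": "a/dog.png", "eat": "a/eat.png", "bank": "a/bank_money.png"}
SET_B = {"dog": "b/dog.png", "bank": "b/bank_river.png", "house": "b/house.png"}


def decode(text, target_set):
    """Purely string-based decoding: exact match, else keep the word as text."""
    return [target_set.get(token, token) for token in text.lower().split()]


def overlap(set_a, set_b):
    """Share of all labels that occur in both sets (a simple Jaccard ratio)."""
    labels_a, labels_b = set(set_a), set(set_b)
    return len(labels_a & labels_b) / len(labels_a | labels_b)


message = "dog eat bank"          # text encoding of a message built with SET_A
decoded = decode(message, SET_B)  # receiver uses SET_B
shared = overlap(SET_A, SET_B)
```

In this toy setting, "bank" meant the financial institution for the sender but is rendered with the river-bank pictograph for the receiver (problem 1), and "eat" has no string match in the target set and stays plain text (problem 2).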
We have collected a corpus of 200K words of e-mail messages sent with WAI-NOT, and we show how simple NLP techniques such as part-of-speech tagging and lemmatisation can improve the conversion of messages from one pictograph set to the other. A relative improvement of more than 45% was reached on unseen data.
Furthermore, we discuss how, in the next phase, we will use word-sense disambiguation and linking to Cornetto, a lexical-semantic database, to further improve the results.
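As a rough illustration of why part-of-speech tagging and lemmatisation help, the sketch below looks up pictographs by (lemma, tag) pairs instead of raw strings. The lemma table, the tag names, the target set, and the function `convert` are all hypothetical; the input is assumed to be already tagged, and no claim is made that this mirrors the system described here.

```python
# Hypothetical lemma table: (word form, POS tag) -> lemma.
LEMMAS = {
    ("dogs", "NOUN"): "dog",
    ("ate", "VERB"): "eat",
    ("flies", "NOUN"): "fly",
    ("flies", "VERB"): "fly",
}

# Hypothetical target set keyed by (lemma, POS tag) rather than by surface string.
TARGET = {
    ("dog", "NOUN"): "b/dog.png",
    ("eat", "VERB"): "b/eat.png",
    ("fly", "NOUN"): "b/fly_insect.png",
    ("fly", "VERB"): "b/fly_plane.png",
}


def convert(tagged_tokens):
    """Map (word, tag) pairs to pictographs via their lemma; keep text on a miss."""
    result = []
    for word, tag in tagged_tokens:
        word = word.lower()
        lemma = LEMMAS.get((word, tag), word)  # unknown forms act as their own lemma
        result.append(TARGET.get((lemma, tag), word))
    return result


pictos = convert([("Dogs", "NOUN"), ("ate", "VERB"), ("flies", "NOUN")])
```

Inflected forms such as "dogs" and "ate" now reach a pictograph through their lemma, which widens the usable overlap, and the tag separates the noun and verb readings of "flies". Forms with no entry fall back to plain text.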