The presented framework predicts the native language of a reader when reading English. Tested on native speakers of Portuguese, Spanish, Japanese, Mandarin Chinese, as well as English, the system relies on an eye tracker camera to record gaze location of participants as they read a small set of free-form sentences. Using the gaze recording, the system extracts a set of linguistically motivated features to characterize gaze patterns and then implements a machine learning algorithm on those features to predict native language of a reader.
Moreover, the system can reliably distinguish different languages as well as the difference between native and non-native English speakers. For substantially different languages like Japanese and Spanish, the system differentiates between the languages with over 90% accuracy. For similar languages like Portuguese and Spanish, the system distinguishes with above chance accuracy.