Predicting Native Language from Gaze

This case is related to MIT technology: #19689

A system that predicts a reader's native language from gaze can benefit forensic, advertising, and educational applications.

Researchers

Yevgeni Berzak / Boris Katz

Departments: Computer Science & Artificial Intelligence Lab
Technology Areas: Artificial Intelligence (AI) and Machine Learning (ML) / Communication Systems: Optical / Computer Science: Bioinformatics, Networking & Signals / Sensing & Imaging: Optical Sensing

  • predicting native language from gaze
    United States of America | Granted | 11,003,852

Technology

The framework predicts a reader's native language from their eye movements while they read English. It was tested on native speakers of Portuguese, Spanish, Japanese, and Mandarin Chinese, as well as English. The system uses an eye-tracking camera to record participants' gaze locations as they read a small set of free-form sentences. From the gaze recording, it extracts a set of linguistically motivated features that characterize gaze patterns, then applies a machine learning algorithm to those features to predict the reader's native language.
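The pipeline described above (gaze recording, feature extraction, classification) can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: the `Fixation` record, the three features (mean fixation duration, regression rate, skip rate), the nearest-centroid classifier, and the placeholder labels `L1-A` and `L1-B` are all assumptions chosen for brevity, and the toy fixations are made up.

```python
from dataclasses import dataclass
from math import dist
from statistics import mean


@dataclass
class Fixation:
    """One fixation from a (hypothetical) eye-tracker export."""
    word_index: int      # index of the fixated word in the sentence
    duration_ms: float   # how long the gaze rested on that word


def gaze_features(fixations, n_words):
    """Summarize one reading trial as a small feature vector.

    The features are illustrative stand-ins for the linguistically
    motivated features mentioned in the text.
    """
    moves = list(zip(fixations, fixations[1:]))
    # A regression is a move back to an earlier word in the sentence.
    regressions = sum(1 for a, b in moves if b.word_index < a.word_index)
    fixated_words = {f.word_index for f in fixations}
    return (
        mean(f.duration_ms for f in fixations) / 100.0,  # mean duration, in 100 ms units
        regressions / max(len(moves), 1),                # regression rate
        1.0 - len(fixated_words) / n_words,              # share of words never fixated
    )


def train_centroids(examples):
    """Average the feature vectors of each label: label -> centroid."""
    by_label = {}
    for vector, label in examples:
        by_label.setdefault(label, []).append(vector)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Toy, made-up trials over a four-word sentence.
trial_a = [Fixation(0, 200), Fixation(1, 220), Fixation(2, 210), Fixation(3, 190)]
trial_b = [Fixation(0, 300), Fixation(2, 320), Fixation(1, 310), Fixation(3, 330)]
centroids = train_centroids([
    (gaze_features(trial_a, 4), "L1-A"),
    (gaze_features(trial_b, 4), "L1-B"),
])

new_trial = [Fixation(0, 210), Fixation(1, 205), Fixation(3, 215)]
predicted = predict(centroids, gaze_features(new_trial, 4))
```

A real system would train on many sentences per reader and would likely use a richer feature set and a stronger classifier; the sketch only shows how the three stages connect.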

Moreover, the system can reliably distinguish between native languages, as well as between native and non-native English speakers. For substantially different languages such as Japanese and Spanish, it differentiates between the two with over 90% accuracy. For similar languages such as Portuguese and Spanish, it distinguishes between them with above-chance accuracy.
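To make "above-chance accuracy" concrete, one common way to check it for a two-language comparison is an exact one-sided binomial test against a 50% chance level. The sketch below is an assumption about how such a check could be done, not the evaluation used by the researchers, and the counts in it are invented for illustration.

```python
from math import comb


def accuracy(n_correct, n_total):
    """Fraction of readers whose native language was predicted correctly."""
    return n_correct / n_total


def p_value_above_chance(n_correct, n_total, chance=0.5):
    """Exact one-sided binomial p-value.

    Probability of getting at least n_correct predictions right out of
    n_total if every prediction were a guess with success rate `chance`.
    """
    return sum(
        comb(n_total, k) * chance**k * (1 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Invented counts: 14 of 20 readers classified correctly in a binary task.
acc = accuracy(14, 20)
p = p_value_above_chance(14, 20)
```

A small p-value suggests the classifier is doing better than guessing; with more than two candidate languages, the chance level would be set accordingly (for example, 1/5 for five equally represented languages).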

Problems Addressed

Current frameworks for understanding cross-linguistic influence in multilingualism derive primarily from studies of language production. Such studies examine cross-linguistic influence, or how a first language affects second language processing, in writing produced by an individual. Analysis that derives from this work on written text provides key insights for native language identification and other features of language acquisition in learners. However, insights gained from studies that focus on language production offer an incomplete account of cross-linguistic influence in language processing.

This system presents a novel framework for studying cross-linguistic influence in language comprehension. The end-to-end system predicts the native language of a reader based on their eye-movement patterns when reading free-form English. This analysis of eye-movement patterns during reading provides a basis for further inquiry into cross-linguistic influence in language comprehension. Combined with evidence from language production, this line of investigation can play a key role in advancing the linguistic theory of multilingualism.

Advantages

  • Machine learning approach: performance can improve with more data
  • Feature extraction and prediction occur in seconds
  • Applicable to other languages beyond English

Publications

Berzak, Yevgeni, Chie Nakamura, Suzanne Flynn, and Boris Katz. "Predicting Native Language from Gaze." In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 541–551. Vancouver, Canada: Association for Computational Linguistics, 2017.

License this technology

Interested in this technology? Connect with our experienced licensing team to initiate the process.
