Translating Text to, Merging, and Optimizing Graphical User Interface Tasks


This technology is a method that enables multiple users to collaborate on automating computer tasks. In the process, it builds a database of solutions for important computer tasks.

Problem Addressed

As computers have become faster and more capable, they have also become much more complicated. This complexity can prevent non-expert users from performing day-to-day tasks such as removing a virus or configuring a home router, leaving them to spend large amounts of time with online technical help or trudging through tutorials that often fail to help them. Expert-scripted programs that automatically perform a given task for a user do exist, but producing them is often slow and expensive, and the repository of covered tasks is limited. The inventors have developed WikiDo, a system that allows lay users to automate computer tasks and contribute to a database of solutions simply by performing the task.

Technology

WikiDo builds a database of computer task automations based on crowd-sourced help from a community of Internet users. Users contribute simply by performing a set of instructions, which are translated from a natural language (English) to a sequence of GUI actions. For example, the sentence “click on OK” will be translated to a GUI command LEFT_CLICK and an object BUTTON: OK on which to perform that command. As the commands are performed, the system aggregates traces from multiple users into a canonical sequence of GUI actions that, when executed on a user's machine, can automatically perform the corresponding task. When users search the WikiDo database for a task they do not know how to perform, they can read a text version of the GUI actions, use the solution as a tutorial that walks them through the task step by step, or allow the solution to run automatically on their computer to perform the desired task.
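The “click on OK” example can be sketched in a few lines of Python. This is only an illustration of the input and output of the translation step, not WikiDo's actual translator: the `GuiAction` class, the `translate` function, and the single regular-expression rule are hypothetical names and assumptions introduced here, and a real system would need a far richer language model than one pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GuiAction:
    """One GUI step: a command plus the object it acts on (hypothetical type)."""
    command: str      # e.g. "LEFT_CLICK"
    target_type: str  # e.g. "BUTTON"
    target_name: str  # e.g. "OK"

    def __str__(self) -> str:
        return f"{self.command} {self.target_type}: {self.target_name}"


# Assumption: a single hand-written rule stands in for the real translator.
# It accepts phrases such as "click on OK" or "Click the Cancel button."
_CLICK_RULE = re.compile(
    r"^(?:left[- ])?click (?:on )?(?:the )?(.+?)(?: button)?$",
    re.IGNORECASE,
)


def translate(instruction: str) -> Optional[GuiAction]:
    """Map one English instruction to a GuiAction, or None if unrecognized."""
    text = instruction.strip().rstrip(".")
    match = _CLICK_RULE.match(text)
    if match is None:
        return None  # a real system would ask a human or try another rule
    return GuiAction("LEFT_CLICK", "BUTTON", match.group(1))


if __name__ == "__main__":
    print(translate("click on OK"))
```

Returning `None` for unrecognized sentences keeps the sketch honest about its limits: anything the rule does not cover is left for a human to perform rather than guessed.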

WikiDo achieves high-accuracy annotations for documents by merging multiple action sequences, which filters out the idiosyncrasies and mistakes of individual solutions. In addition, a classifier predicts which steps are likely to be misinterpreted, and the system requests human intervention to perform those steps properly. This process can be repeated iteratively until the translation is believed to be correct.
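One simple way to picture the merging step is a per-step majority vote over the traces. The sketch below is an assumption made for illustration, not the method WikiDo uses: the function name `merge_traces` and the `min_agreement` parameter are hypothetical, traces are assumed to be already aligned step by step, and a plain agreement threshold stands in for the classifier that decides which steps need human intervention.

```python
from collections import Counter
from itertools import zip_longest
from typing import List, Sequence, Tuple


def merge_traces(
    traces: Sequence[Sequence[str]],
    min_agreement: float = 0.6,  # hypothetical threshold, not from the source
) -> Tuple[List[str], List[int]]:
    """Merge aligned action traces into one canonical sequence.

    Returns the canonical actions and the indices of steps where too few
    users agreed, which are candidates for human review.
    """
    canonical: List[str] = []
    needs_review: List[int] = []
    # zip_longest pads shorter traces with None so every step is visited.
    for index, votes_at_step in enumerate(zip_longest(*traces)):
        counts = Counter(v for v in votes_at_step if v is not None)
        action, votes = counts.most_common(1)[0]
        canonical.append(action)
        # Agreement is measured against all traces, so missing steps
        # count as disagreement.
        if votes / len(traces) < min_agreement:
            needs_review.append(index)
    return canonical, needs_review


if __name__ == "__main__":
    traces = [
        ["LEFT_CLICK BUTTON: Start", "LEFT_CLICK BUTTON: Settings", "LEFT_CLICK BUTTON: OK"],
        ["LEFT_CLICK BUTTON: Start", "LEFT_CLICK BUTTON: Settings", "LEFT_CLICK BUTTON: Apply"],
        ["LEFT_CLICK BUTTON: Start", "LEFT_CLICK BUTTON: Help", "LEFT_CLICK BUTTON: Cancel"],
    ]
    canonical, needs_review = merge_traces(traces)
    print(canonical[:2], needs_review)
```

In this example the stray "Help" click in the third trace is outvoted, while the final step, where all three users disagree, is flagged for review. Position-wise voting does not handle inserted or skipped steps; a fuller treatment would align the sequences before voting.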

Advantages

  • WikiDo achieves nearly expert-level accuracy at a fraction of the cost
  • Database of available solutions grows as users solve new tasks
  • Tasks can be performed in a variety of environments and configurations