M. D. Dunlop and J. Levine, Multidimensional Pareto optimization of touchscreen keyboards for speed, familiarity and improved spell checking, Proceedings of CHI 2012, May 2012. Honourable Mention - PDF Full Paper - ACM Page
Full video coming soon.
The video contains materials that are all available for use under Creative Commons agreements. See SATH Video Creative Commons Info for sources used.
See SATH Demos for Android and Browser based demos of the SATH keyboard.
The SATH keyboards are based on analysis of English corpora. The key data used is:
- bigrams.txt - a list of bigrams sorted by frequency, most common first; the number is the count of occurrences of that two-letter pair in our corpus. Bigrams represent letter-pair transitions in English: for example, E<space> occurs 978 658 times. With 29 497 638 bigram occurrences in total, the probability that a transition is from E to space is 978 658 / 29 497 638 = 0.0332 (roughly 3%). Many researchers argue that placing common bigrams close together on a keyboard layout increases typing speed.
- badgrams.txt - a list of badgrams sorted by frequency, most common first; the number is the frequency of words with that badgram in our corpus. A badgram is a pair of letters where substituting one for the other often yields another valid word, e.g. confusing A and E often produces a valid word (typing "and" when "end" was intended). Our paper argues that separating common badgrams on a keyboard layout increases the chance of a spell checker correcting words accurately.
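
The two measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function names, the in-memory dictionaries and the tiny word list are all assumptions made for the example, and the on-disk format of bigrams.txt and badgrams.txt is not assumed here. The badgram count follows one reading of the description above (sum the frequency of every word that becomes another valid word when one letter of the pair is substituted for the other).

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def bigram_probabilities(counts):
    """Turn raw bigram counts into transition probabilities.

    `counts` maps a two-character string to its number of occurrences.
    (Hypothetical helper; the input structure is an assumption.)
    """
    total = sum(counts.values())
    return {bigram: n / total for bigram, n in counts.items()}


def badgram_counts(word_freqs):
    """Score each unordered letter pair by the total frequency of words
    that turn into another valid word when one letter of the pair is
    substituted for the other.

    `word_freqs` maps a lowercase word to its corpus frequency.
    (Hypothetical helper; one possible reading of the badgram definition.)
    """
    words = set(word_freqs)
    scores = Counter()
    for word, freq in word_freqs.items():
        pairs_hit = set()  # count each word once per letter pair
        for i, ch in enumerate(word):
            for other in ALPHABET:
                if other == ch:
                    continue
                candidate = word[:i] + other + word[i + 1:]
                if candidate in words:
                    pairs_hit.add(frozenset((ch, other)))
        for pair in pairs_hit:
            scores[pair] += freq
    return scores


# The E<space> figure quoted above, with all other bigrams lumped together.
bigram_demo = {"e ": 978_658, "(all others)": 29_497_638 - 978_658}
p_e_space = bigram_probabilities(bigram_demo)["e "]

# A made-up word list: "and"/"end" differ only by A/E, "cat"/"cot" by A/O.
word_demo = {"and": 100, "end": 40, "cat": 10, "cot": 5, "dog": 7}
badgram_demo = badgram_counts(word_demo)
```

In this toy data the A/E pair collects the frequencies of both "and" and "end", while "dog" contributes to no pair because no single substitution turns it into another word in the list.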