Towards a Multimodal Interface for Mathematical Manipulation

At the conclusion of my January 2010 UROP with Prof. Randall Davis in CSAIL, I wrote a paper detailing some of my work. The main thrust of the work was finding pitch contours in spoken mathematical expressions.

Abstract

This paper presents elements of a possible symmetric multimodal user interface for manipulating mathematical expressions. We will discuss multimodal user interfaces, what subset of mathematics is tractable, previous work in this area, novel ideas for advancing such interfaces, and a direction for future work.

Download the paper: (PDF)

Download the source code: avg-img.py, conv.lisp

Here is an example of the output of conv.lisp using MBROLA voices:

Here is an example of the pitch contour of someone uttering “a(x² + 2).” (WAV)