Abstract
Several systems have been developed that allow mathematical expressions to be spoken and navigated. This paper describes studies involving the latest revision of the most widely used system: MathPlayer 4. This version includes features to allow navigation of mathematical expressions. Students with blindness or low vision used NVDA + MathPlayer to read Microsoft Word documents with math problems in them. The results were compared with the same students reading similar documents using their favorite modality (braille or large print). The results showed that speech augmented with navigation resulted in similar comprehension rates compared to when students used their preferred modality. This is an important finding because electronic documents are often available in situations where braille or large print documents are not.
1 Background
Several software aids exist for speaking mathematical expressions in web pages and elsewhere (e.g., MathPlayer, JAWS, Safari + VoiceOver, ChromeVox). A spoken expression is comprehensible when it is short. For most people, working memory is limited to around 7 words [1], and may be shorter when dealing with mathematics due to the density of its notation. This makes comprehension of larger expressions difficult via speech alone. One obvious solution is to allow users to navigate expressions so they can rehear parts and better understand the structure of the expression.
Several systems have implemented some form of navigation, including the earliest systems for speaking math: Aster [2] and MathTalk [3]. Aster used a strict tree-based model of navigation. MathTalk and subsequent systems rejected that as too complicated and used a tree only for two-dimensional notations such as fractions and roots. Subsequent research efforts including MathGenie [4] and AudioMath [5] also supported navigation. Currently available math-to-speech systems include MathPlayer [6], ChattyInfty [7], ChromeVox [8], Safari + VoiceOver, and JAWS; all support navigation. ChromeVox, Safari + VoiceOver, and JAWS navigate math similarly to MathPlayer’s simple mode (see below); ChattyInfty’s navigation is similar to MathPlayer’s character mode.
In collaboration with the Educational Testing Service (ETS) as part of an IES grant, Design Science added the ability to move around/navigate expressions to MathPlayer. Both NVDA and Window-Eyes make use of MathPlayer to generate math speech, with several other assistive technology companies looking into using MathPlayer. The MathPlayer navigation work includes many capabilities not found in prior work; it is discussed in the next section.
Only MathTalk and MathGenie have published user studies and for both of them, studies were done with sighted users. The IES study is the first to use blind and low vision students to compare comprehension and usability of speech versus braille and large print for mathematical expressions. The findings are discussed in the remainder of this paper.
2 Implementation
Navigation was added to MathPlayer for a navigation study and modified somewhat for the MathPlayer 4 release based on feedback from the study. Navigation in MathPlayer is performed via keyboard commands. Features include:
-
Moving/Zooming: This is the basic mode of navigating. Three modes of moving around an expression are supported (see below). Arrow keys are used to move left/right and to zoom in/out of expressions.
-
Descriptions/Overviews: Users can choose between hearing the expression read to them or hearing a description (overview) of the expression (e.g., “fraction plus something plus 1”). Overviews can be set as a default when moving around or can be heard via key commands.
-
Place markers: 10 place markers are supported. At any point, users can set, move to, or hear what is at the place marker. This is particularly useful for cancelling fractions, marking coefficients for systems of equations, etc.
-
Where am I: the ability to recall context without moving (e.g., “x + 1 inside of the fraction with numerator x + 1 and denominator x squared minus one”). The ability to get more and more context along with the ability to get the entire context is provided.
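The “where am I” behaviour can be pictured as walking up the parent links of an expression tree, adding one level of context at a time. The following is a minimal Python sketch under that assumption, not MathPlayer's implementation: the `Node` class, its pre-rendered `spoken` strings, and the `where_am_i` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical expression-tree node with a pre-rendered spoken form."""
    spoken: str
    parent: Optional["Node"] = None


def where_am_i(node: Node, levels: Optional[int] = None) -> str:
    """Describe `node` plus up to `levels` enclosing contexts (all if None)."""
    parts: List[str] = [node.spoken]
    current = node.parent
    # Climb towards the root, collecting one enclosing context per step.
    while current is not None and (levels is None or len(parts) <= levels):
        parts.append(current.spoken)
        current = current.parent
    return " inside of ".join(parts)


# Hand-built example mirroring the fraction quoted in the text.
fraction = Node("the fraction with numerator x + 1 and denominator x squared minus one")
numerator = Node("x + 1", parent=fraction)

print(where_am_i(numerator, levels=0))  # current location only
print(where_am_i(numerator))            # current location plus the entire context
```

Calling the function with a growing `levels` value corresponds to asking for more and more context without moving the navigation position.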
A unique aspect of MathPlayer’s navigation is the ability to navigate in different modes: character, simple, and enhanced. To illustrate the differences, this sample expression is used:
$$ 2\sqrt{x^{2} - 4} + 3a\sqrt{x + 1} $$
-
Character/Word: navigate the leaves of the tree. E.g., moving to the right by typing the right arrow key in the above expression, results in the user hearing “2”, “inside square root, in base, x”, “in exponent, 2”, “out of exponent, minus”, etc. Character and Word mode differ only for multi-digit numbers such as 128 and multi-character identifiers/operators such as “sin”.
-
Simple: navigates by word except for 2D notations such as fractions and exponents. For these, the entire 2D notation is spoken. Users zoom into and out of the notation to hear parts of it. This is the common model that is implemented in many systems such as Safari, ChromeVox, and JAWS. In simple mode, moving to the right in the above expression, the user hears “2”, “times the square root of x squared minus 4”, “plus”, “3”, “a”, “times the square root of x plus 1”.
-
Enhanced: infers what the expression tree is for the math and moving left/right uses that structure. E.g., in the example above, one would hear “2 times the square root of x squared minus 4”, “plus”, “3 a times the square root of x plus 1”.
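The difference between the three modes can be illustrated with a small tree walk. The sketch below is an assumption-laden illustration, not MathPlayer's algorithm: the `Group` class, the `kind` labels, and the three `*_stops` functions are hypothetical, the tree is built by hand rather than inferred, and plain text stands in for speech. The expression is the one implied by the spoken examples above, 2√(x² − 4) + 3a√(x + 1).

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Group:
    """Hypothetical tree node: kind is "row", "term", "sqrt", or "power"."""
    kind: str
    children: List["Item"]


Item = Union[str, Group]
TWO_D = {"sqrt", "power"}  # notations treated as two-dimensional in this sketch


def text(item: Item) -> str:
    """Render a node as plain text (a stand-in for generated speech)."""
    if isinstance(item, str):
        return item
    if item.kind == "power":
        base, exponent = item.children
        return f"{text(base)}^{text(exponent)}"
    inner = " ".join(text(child) for child in item.children)
    return f"sqrt({inner})" if item.kind == "sqrt" else inner


def character_stops(item: Item) -> List[str]:
    """Character mode: left/right visits every leaf of the tree."""
    if isinstance(item, str):
        return [item]
    return [leaf for child in item.children for leaf in character_stops(child)]


def simple_stops(item: Item) -> List[str]:
    """Simple mode: visit leaves, but treat each 2D notation as one stop."""
    if isinstance(item, str) or item.kind in TWO_D:
        return [text(item)]
    return [stop for child in item.children for stop in simple_stops(child)]


def enhanced_stops(item: Item) -> List[str]:
    """Enhanced mode: visit the direct children of the current tree node."""
    return [text(child) for child in item.children]


radical_1 = Group("sqrt", [Group("row", [Group("power", ["x", "2"]), "-", "4"])])
radical_2 = Group("sqrt", [Group("row", ["x", "+", "1"])])
expr = Group("row", [
    Group("term", ["2", radical_1]),
    "+",
    Group("term", ["3", "a", radical_2]),
])

print(character_stops(expr))  # 11 stops, one per leaf
print(simple_stops(expr))     # 6 stops
print(enhanced_stops(expr))   # 3 stops
```

The stop counts line up with the quoted speech: six stops in simple mode and three in enhanced mode. Zooming in would amount to calling the same function on the child currently in focus.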
Another unique aspect of MathPlayer’s navigation is “auto zoom in”/“auto zoom out”. A description can be found in [6]. Several power users (those who read at very high TTS speeds) requested that auto zoom out be turned off. These users said that they commonly “bang” multiple times on the arrow key and want the end of a structure to act as a wall that stops them. No student in the IES study requested this. The ability to turn off auto zoom out was added to the final release of MathPlayer. “Shift arrow” will auto zoom out even if it is turned off. This provides a way to avoid having to “back out” (zoom out) of a nested 2D notation.
3 Study Results
The IES grant consisted of MathPlayer development along with four feedback studies and a final pilot study covering all aspects of the grant. The four feedback studies looked at a new speech style (ClearSpeak [9]), various forms of prosody and lexical cues to resolve speech ambiguities, navigation, and authoring documents (aimed at teachers). After making changes based on the studies, these features were evaluated in a final pilot study [10]. This paper discusses the navigation study and the pilot study.
IRB approval of the studies was obtained and all participants signed consent forms. As thanks for participating in the study, the students received gift cards in amounts ranging between $25 and $125 depending upon the length of the study.
3.1 Navigation Study
The initial navigation study involved 20 students with blindness or low vision in classes ranging from algebra 1 to pre-calculus. Each participant read through an interactive tutorial to learn and practice MathPlayer’s navigation features. Based upon their experiences from the tutorial, the study asked about each navigation feature: “how easy/hard was it to learn…” and “how likely are you to use…”. The students found it easy to learn most features. On a scale ranging from 0 to 3, with 3 being “very easy,” the mean was between 2.44 and 2.79. Three features were viewed as less likely to be used:
-
Describe/Overview (1.78)
-
Placemarkers (2.21)
-
Where am I (2.28)
Describe/Overview mode was the least developed feature in MathPlayer, so it came as no surprise to us that it was the least liked feature. There are two problems with Describe/Overview that we were aware of:
-
More effort needed to be spent determining the amount of detail to provide. E.g., the expression
is read as “something plus fraction plus 1”. It would probably be better to read it as “x squared plus one over something plus 1”. That is only slightly longer, but it provides much more detail (see Note 1).
-
We debated using the words “term”, “factor”, “exponent”, etc., instead of “something” in expressions. Ultimately, we used the generic word “something” because the semantics of the expression aren’t fully known and we felt that using a wrong word might be misleading. One student suggested using “term”, etc., when asked what they would like to see changed; most students had no suggestions for improvement.
There were two things about place markers that confused some students. As implemented (for simplicity of implementation), place markers are local to each expression: they can only reference the current expression and disappear when the expression being navigated is exited. A couple of students didn’t seem to realize this and asked for a method to clear the place markers. One student asked for more than 10 place markers (place markers are currently bound to keys 0–9 for simplicity).
There were two comments about “where am I”: one person wanted it to go from the bigger to the smaller (whole context then current location) and one person wanted an indication of how deeply nested they were. The rest either had no comment or thought it was fine the way it was.
In the final pilot (see next section), students were again asked about specific navigation features and how they helped their understanding and solving math problems in the pilot. Table 1 (below) shows the responses from the pilot study (one student didn’t answer this question). As can be seen, the results are similar to those found in the navigation study. Several questions tried to get information about how the students liked the three navigation modes. Students’ answers varied widely as to their preferred mode, although many students said they made use of all three modes and found each useful for different situations.
3.2 Final Pilot
The final pilot involved 21 students, 17 of whom had also participated in the navigation study. They were given two similar documents: a Word document with math problems (accessible via TTS + NVDA + MathPlayer + MathType; see Note 2) and a braille, regular print for CCTV, or large print document based upon their preferences or previous usage. Students were divided randomly into two groups. Each group received paired documents in different orders (speech first or last), with each document containing 16 questions (32 total). This allowed a comparison between our speech-based solution and the student’s preferred non-electronic format.
Prior to the experiment, students familiarized themselves with MathPlayer by going through a tutorial. On the day of their study participation, they practiced with two problems to make sure they remained comfortable with the system. Each part of the pilot began with a sample problem and answer followed by problems the student should solve. Here are a few examples:
-
1.
How many zeroes are there to the right of the decimal point in the number \( 3.0000001 \)?
-
2.
The following questions are based on the polynomial
$$ 12x^{6} + 18x^{2} + 35x^{7} + 5x^{15} + 45 + 16x^{12} $$
-
(a)
How many terms does the polynomial have?
-
(b)
What is the coefficient of \( x^{2} \)?
-
3.
Simplify the expression \( 4 + 3x - 2 + 8y - 2x - 3y + 5 - 4y + 10x \)
-
4.
What is the value of the expression \( 3\left( {\left( {6 + 5} \right) - \left( {8 - 4} \right)} \right) - 2 \)?
-
5.
Simplify the algebraic fraction \( \frac{{\left( {x + 1} \right)\left( {2x - 3} \right)}}{{\left( {2x + 1} \right)\left( {x + 1} \right)\left( {2x + 3} \right)}} \).
-
(a)
What is the numerator of the simplified fraction?
-
(b)
What is the denominator of the simplified fraction?
Net scores were computed for the paired (spoken and other format) problems as follows:
-
0: student answered the spoken question and its non-spoken clone either both correctly or both incorrectly
-
1: student correctly answered the spoken question but not its non-spoken clone
-
−1: student incorrectly answered spoken question but correctly answered its non-spoken clone
The average net score per question was 0.125 (Std. Dev. 2.73). This indicates that the students’ performance using speech was similar to their performance using their usual format (a slight, insignificant bias towards speech). In other words, despite less familiarity with the speech solution, students performed comparably to the familiar but more costly printed solution.
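The scoring rule can be written out in a few lines. The following Python sketch is illustrative only: the response data are invented, the names `net_score` and `question_net` are hypothetical, and summing the per-student scores for each question before averaging across questions is an assumption about how the per-question figure was aggregated.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One entry per student: (spoken answer correct, non-spoken clone correct).
Pair = Tuple[bool, bool]


def net_score(spoken_correct: bool, other_correct: bool) -> int:
    """Return 1 if only the spoken answer is correct, -1 if only the clone is, else 0."""
    return int(spoken_correct) - int(other_correct)


def question_net(pairs: List[Pair]) -> int:
    """Sum the net scores of all students for one paired question (assumed aggregation)."""
    return sum(net_score(spoken, other) for spoken, other in pairs)


# Invented responses for three students on two paired questions.
responses: Dict[str, List[Pair]] = {
    "Q1": [(True, True), (True, False), (False, False)],
    "Q2": [(False, True), (True, True), (False, True)],
}

per_question = {name: question_net(pairs) for name, pairs in responses.items()}
print(per_question)                       # {'Q1': 1, 'Q2': -2}
print(mean(list(per_question.values())))  # -0.5
```

A positive value favours speech and a negative value favours the student's usual format, so a mean close to zero indicates comparable performance in the two formats.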
Table 2 (below) shows that most students performed similarly on the two formats independent of the question with two exceptions: question 3.2 (example 3 above, much worse with speech) and question 4.3 (example 4 above, much better with speech).
Table 3 shows the data per user along with their favorite modality for accessing math. The maximum net difference for a user was just 2, showing that speech is a viable option among all users in the study independent of their preferred format. Despite the students’ similar performance across formats, on a feedback question, students expressed a small preference for their usual format. We looked at the results for those who answered that they would always or would usually prefer their usual method. The data showed that their math scores were slightly higher for speech. Also, the time they spent on the problems in each method didn’t correlate with their preference.
Students were asked at the end of each document how easy or difficult it was to understand the math in the document. Almost all of the students said understanding the speech was “somewhat easy”, compared to “very easy” for their preferred format.
Notes
- 1.
This is not ambiguous because “over” is only used when the denominator is simple.
- 2.
Nemeth refreshable braille is also supported by NVDA + MathPlayer, but the study did not allow students to use this feature.
References
Miller, G.A.: The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev. 63(2), 81 (1956)
Raman, T.V.: Audio system for technical readings. PhD thesis, Ithaca, NY, USA. UMI Order No. GAX95-11869 (1994)
Edwards, A.D., Harling, P.A., Stevens, R.D.: Access to mathematics for visually disabled students through multimodal interaction. Hum.-Comput. Interact. 12, 47–92 (1997). doi:10.1207/s15327051hci1201&2_3
Gillan, D.J., Barraza, P., Karshmer, A.I., Pazuchanics, S.: Cognitive analysis of equation reading: application to the development of the math genie. In: Miesenberger, K., Klaus, J., Zagler, W.L., Burger, D. (eds.) ICCHP 2004. LNCS, vol. 3118, pp. 630–637. Springer, Heidelberg (2004)
Ferreira, H., Freitas, D.: AudioMath-using MathML for speaking mathematics. Presented at XML: Aplicacoes e Tecnologias Associadas, Braga, Portugal (2005)
Soiffer, N.: Browser-independent accessible math. In: Proceedings of the 12th Web for All Conference (W4A 2015), Article 28, 3 p. ACM, New York, NY, USA (2015). doi:10.1145/2745555.2746678
Sorge, V., Chen, C., Raman, T.V., Tseng, D.: Towards making mathematics a first class citizen in general screen readers. In: Proceedings of 11th Web for All Conference (W4A 2014), Article 40, 10 p. (2014). doi:10.1145/2596695.2596700
Science Accessibility Net. ChattyInfty, the Version 3 Series Manual (2014). http://www.sciaccess.net/en/ChattyInfty/ChattyInfty3_Eng_Manual.pdf
Frankel, L., Brownstein, B., Soiffer, N.: Navigable, customizable TTS for algebra. J. Technol. Persons Disabil. 1 [22] (2013) http://scholarworks.csun.edu/handle/10211.3/121942
Frankel, L., Brownstein, B., Soiffer, N.: Expanding Audio Access to Mathematics Expressions by Students with Visual Impairments via MathML. To be published in ETS Research Report Series
Acknowledgements
The study portion of the grant was carried out by Lois Frankel and Beth Brownstein at ETS. I am very thankful for their expertise in designing the study questions and hard work in getting IRB approval, lining up the students, and evaluating the results. Some tutorial material and student recruitment was carried out by Stephen Noble.
I had many late night discussions with Sina Bahram about how navigation should work. Many ideas were discussed and rejected until we came to the current design. The design was very much a joint effort.
The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R324A110355 to the Educational Testing Service and Design Science. The opinions expressed are those of the author and do not represent views of the Institute or the U.S. Department of Education.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Soiffer, N. (2016). A Study of Speech Versus Braille and Large Print of Mathematical Expressions. In: Miesenberger, K., Bühler, C., Penaz, P. (eds) Computers Helping People with Special Needs. ICCHP 2016. Lecture Notes in Computer Science(), vol 9758. Springer, Cham. https://doi.org/10.1007/978-3-319-41264-1_8
DOI: https://doi.org/10.1007/978-3-319-41264-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41263-4
Online ISBN: 978-3-319-41264-1
eBook Packages: Computer Science, Computer Science (R0)