Abstract
This paper describes a new web-based system specially adapted to the education of Czech pupils with visual impairment. The system integrates speech and language technologies with a web framework in lower secondary education, especially in mathematics and physics subjects. A new interface utilizes text-to-speech (TTS) synthesis for online automatic reading of educational texts. The interface provides several TTS voices, synthesized data caching, and automatic processing of formulas in mathematics and physics. The system was designed to enable teachers to create and manage teaching materials. It also enables the pupils to view and listen to the read forms of these documents online. A school for pupils with visual impairment participated in the development and implementation of the system. After one year of using the system daily, the user experience and evaluation data were collected. The results indicate a positive reception and frequent use of the system as well as a preference over classical educational materials.
1 Introduction
Modern computers are beneficial for many types of impairments or disabilities. There are various assistive technologies [2, 34, 35], e.g. for people with cognitive disabilities [10, 25], mobility impairment [8], or hearing loss [15]. For individuals diagnosed with a print disabilityFootnote 1 [16, 24, 37], the transition from textbooks on tape to electronic texts read by computers is considered one of the most effective methods of the last decade [3, 23]. However, this does not always hold for the automatic conversion of mathematics to voice by computers. The creation and use of accessible materials is often discussed; however, the problem remains unsolved in several schools. Teachers cope daily with several issues arising from teaching pupils with combined disabilities (e.g. visual impairment and dyslexia or speech defects). The challenge worsens in science subjects, such as mathematics and physics, for which accessible materials in a given language are noticeably lacking. It is also very challenging for computers to automatically read technical texts, such as mathematical formulas. Thus, it is difficult to develop a standard interface for the creation and presentation of science educational materials for pupils with visual impairments.
Visual impairment is designated by measuring visual acuity (VA). Other diagnosable characteristics are the range of visual field, depth perception, contrast sensitivity, colour discrimination, accommodation, adaptation, ophthalmogyric activity, and ability to localize a subject and follow it in motion. Based on the ophthalmology examination, the visual impairment of the person is classified into different categories as specified by the World Health Organization classification: (1) mild vision loss (near-normal vision), (2) moderate low vision, (3) severe low vision, (4) profound low vision, (5) near-total blindness, and (6) total blindness (no light perception)Footnote 2 [46].
Pupils with visual impairment educated in lower secondary schools in the Czech Republic fall into a wide range of these categories (3–6). The pupils are taught together in classrooms of about ten pupils. During mathematics lessons, the pupils routinely use low vision aids such as video magnifiers, large displays with enlarged text font, or Braille code display. Only some pupils classified as totally blind learn and use Braille or Nemeth Braille code.Footnote 3 An alternative method for these pupils is specialized computer hardware and software, such as screen readers, that enable them to read e-text using a computer [4].
Today, people with visual impairment already use standard screen readers using text-to-speech (TTS) technology (see [42]) that requires localization for a given language. A screen reader is a computer application designed to automatically ‘read’ text on a screen without human intervention. However, the ‘non-textual’ content, such as mathematical formulas, images, and graphical schemes, is skipped or garbled by standard screen readers, which lack suitable text pre-processing modules. On the other hand, there are also offline digital textbooks and materials that correctly combine written texts and images with human voice (also known as hybrid books). The creation of such textbooks for the learning of mathematics is a long and laborious process [3, 5]. For example, the voice of a human is recorded along with the corresponding textual content.
1.1 Tasks in automatic formula reading
Providing an appropriate and flexible interface for preparing new educational materials for daily use, with mathematical formulas read automatically, involves two essential steps.
The first step involves the correct transcription of the symbolic notations of formulas and their subsequent decoding into corresponding grammatically correct word forms. One issue in the design is obtaining an adequate formula editor accessible to teachers. The authors (teachers) usually lack skills for the typesetting of mathematical equations; they are not web programmers and do not know how to generate well-structured web pages with mathematical formalisms.
The second important step involves the application of TTS. Current TTS systems are optimized on the corpora utilized during system design. As the spoken formulas are usually not included in these corpora, problems with synthetic speech quality can arise when conventional TTS systems are employed to read technical documents. Thus, the intelligibility and naturalness of the generated voice can be questionable.
1.2 Supporting pupils with visual impairment in the learning of mathematics
In the context of mathematics education, Stoeger [41] defines four main problems: (1) access to the mathematical literature (books, teaching materials, articles, etc.); (2) preparation of teaching materials (school exercises, notes); (3) navigation in mathematical formulas, and (4) the process of learning mathematics (calculations, formal manipulation of expressions, problem-solving).
This paper contributes to the four problems mentioned above. We propose a new web-based systemFootnote 4 that provides the following: (1) easy access for pupils with visual impairment; (2) an interface for preparing educational texts including mathematical formulas; (3) navigation through the prepared audio-visual web content, and (4) a qualitative evaluation of the system with selected topics in mathematics and physics.
The system is designed for the teachers to prepare, manage, and administrate new educational materials using the online back-end interface. The teacher can convert, check, and immediately publish the prepared topic (lesson) to the web page with voice automatically supplemented by an integrated TTS (currently in Czech, English, and German). The educational material is available online to the pupils via the front-end interface of the website in both voice and visual modes.
The web page generated by the system is accessed via standard screen readers.Footnote 5 The educational material in this audio-visual form can also be integrated into other educational interfaces.
In general, the variety of the subjects prepared by the system is unlimited. However, the novelty of the proposed system is the capability to decode mathematical formulas.Footnote 6 Therefore, the system was qualitatively evaluated using mathematics and physics for pupils with visual impairment.
2 Current technologies and automatic reading of mathematical text
Several technologies and standards have been developed to improve the availability of textbooks for pupils with a print disability. Nevertheless, very few technologies currently provide a framework to make mathematical educational materials accessible in the required form (school textbooks, training notes, exercises). Mathematical equations are presented in several formats and codes, such as Mathematical markup language (MathML), NIMAS, DAISY, and Nemeth [22, 27, 28].
MathMLFootnote 7 is a computer XML-based format for describing mathematical formalism on the web [33]. However, current screen-readers are unable to read the MathML tags properly as they read the content alone and ignore the structure.
NIMAS is the file format for developing printed textbooks. By default, math content is provided in NIMAS file sets as images.
DAISY is another standard for producing accessible and navigable multimedia documents in the form of a synchronized audio/text-book. In addition, the MathDaisy add-in converts the equations to MathML and saves the document as a DAISY digital talking book (DTB). DTBs are used in eBook readers and the DAISY DTBs are read by players such as gh-PLAYER [18].
AudioMath developed at Porto University is another mathematics reader that uses MathML [12].
The Nemeth Braille code is used for the linear coding of mathematical and scientific notations using standard 6-point Braille cells, and it is unsuitable for automatic reading.
The problem of reading mathematics directly from a TeX source document was addressed, for instance, in the audio system for technical readings (AsTeR) [31]. For the world wide web, MathPlayer [38] provides a plugin to the web browser which displays, highlights, and reads mathematical expressions on the website. Accessibility is feasible here by right-clicking on an equation and choosing a ‘to render aurally’ command or using a screen reader that reads the entire web page and invokes MathPlayer to speak the mathematical formula in a structured (and sometimes customized) way. It should be noted that mathematical formalisms are often articulated in an ambiguous form.
The current technology falls short because it does not incorporate appropriate rules on how to describe mathematical expressions due to their naturally nonlinear structure. Unifying the translation process was also one of the aims of the MathSpeak project [18]. Furthermore, a prototype of the MathML reader called MathGenie was designed to provide an unambiguous verbal presentation of nontrivial mathematical formulas [20].
MathTalk is another assistive technology based on MathML, and it helps users with visual impairments create mathematical formulas by voice commands, display of information on the screen, and conversion of the information into braille [39, 40]. For the Czech language, the Lambda math-editor, developed at the University of York, was accommodated at Masaryk University [43] as a support system for editing mathematical formulas by blind users (creators) using Braille and audio output. Lambda also provides a compact and linear 8-dot Braille math-code [11]. Converting MathML to Braille is also possible using Math2Braille [9].
For the creation and subsequent reading of mathematical formulas, it is possible to use TeX or MathML code directly or to use an editor such as MathTypeFootnote 8 [13]. There are more general editors and converters of varying quality and different functionalities. Some applications are extensions or add-ons of internet browsers, e.g. Firemath,Footnote 9 Amaya,Footnote 10 Bluegriffon,Footnote 11 Tex4ht.Footnote 12 Expressions can also be graphically created using a word processor. The formulas are often stored as graphic image files in formats such as PNG or SVG, with or without alternative text. Some applications can convert expressions into speech, but they are mostly for the English language and of limited functionality.
Besides the conversion of text to speech, there are further specific requirements for application processing and reading of mathematical expressions for pupils with visual impairment. These requirements include font size, text colour, and background colour. The language as well as volume and speed control also play a vital role in the TTS conversion. Other technologies for creating accessible text for mathematics use images that are supported or annotated in ways that are more accessible to people with visual impairment for example, the provision of haptic feedback or a verbal description of important images such as tables and graphs. There are several specialized software tools to provide accessible images, such as creating tactile representations of graphs [19], using MathTrax, automatically generating image descriptions [29], or creating a unified version of the figure or image in real time [32].
The required feature of the tools is adequate support for the learning of mathematics. Nevertheless, in many cases, it is up to the teacher to modify classical mathematical learning materials into an accessible format, which can be time-consuming and financially demanding [5]. These tools also require special skills to typeset mathematical formulas or web programming. Our proposed system provides an accessible editor for the teacher, from which a readable format is automatically generated. To the best of our knowledge, such a complex system for a mathematical text is unavailable.
3 Overview of proposed system
The developed system is a web application based on a client-server architecture. It runs on an Apache HTTP server with a MySQL database system [26]. The core of the system is based on Symfony 1.4Footnote 13 [30], which is an open-source web application framework. The client-side of the system consists of two parts: front-end and back-end. The front-end serves as a public interface for selecting, displaying, and reading documents arranged in topics. The back-end is an administrative interface, where the documents are created and modified. The server side of the system provides text pre-processing and TTS synthesis. A schematic of the system can be seen in Fig. 1, and a detailed description of the system can be found in [14].
3.1 Client-side
The topic administrator, e.g. a teacher, has direct access to the document through the back-end, where he or she can create and modify documents using the incorporated WYSIWYG text editor (TinyMCE). A screenshot of the back-end is shown in Fig. 2.
To unify the visual style of the content, the topic administrator can use templates to clarify the meaning of particular document fragments. Currently, five templates are supported in the system, namely Definition, Important, Note, Example, and Solution. For example, the Important template highlights crucial information to which the students should pay more attention. Different synthetic voices can be assigned to each template. Changing the voice while listening to an entire document improves attention compared to listening in a single voice.
The system supports two ways of inserting or editing mathematical formulas in the document. A simple formula with a linear structure, e.g. \(y=x+1\), can be written and stored as plain text (a so-called ‘inline formula’). Additionally, the graphical editor WIRISFootnote 14 is incorporated for more complex mathematical expressions. This editor provides the MathML representation of the formula, which is used to derive its word-level transcription (see Sect. 3.3).
All documents are available through the public web interface (front-end) (see screenshot in Fig. 3). The document is read continuously from the beginning to the end. This process can be automatically interrupted at predefined points in the document.
While the document is being read, the students can use a graphical navigation panel with six control buttons: right arrow to play, square to stop, double arrow to rewind (next/previous sentence or formula), and triple arrow to quickly navigate to the next/previous chapter. Furthermore, the buttons are contrastingly coloured. The students can also use keyboard shortcuts or jump into any point of the document by clicking anywhere in the text.
3.2 Server-side
Before displaying and reading the document, the HTML source code is automatically processed. First, the parts of the text, including the templates and the formulas, are extracted and normalized (see Sect. 3.3). Subsequently, the texts are sent to the Web TTS server which is responsible for the conversion of texts to audio (see Sect. 3.5). All audio files are stored in a cache to avoid re-synthesizing already synthesized texts.
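The caching step can be illustrated with a minimal sketch. It assumes that a cache entry is keyed by a hash of the voice name and the normalized text; the description above does not specify the key or the storage, so the class `AudioCache`, the placeholder `fake_tts`, and the voice names are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class AudioCache:
    """In-memory stand-in for a server-side cache of synthesized audio."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize
        self._store: Dict[str, bytes] = {}
        self.misses = 0  # number of times the synthesizer had to be called

    @staticmethod
    def key(voice: str, text: str) -> str:
        # The same voice and the same text always give the same key.
        return hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()

    def get_audio(self, voice: str, text: str) -> bytes:
        k = self.key(voice, text)
        if k not in self._store:
            # Cache miss: synthesize once and remember the result.
            self.misses += 1
            self._store[k] = self._synthesize(voice, text)
        return self._store[k]


def fake_tts(voice: str, text: str) -> bytes:
    # Hypothetical placeholder for a request to the TTS server.
    return f"<{voice}:{text}>".encode("utf-8")


cache = AudioCache(fake_tts)
first = cache.get_audio("voice_a", "x plus tři")
second = cache.get_audio("voice_a", "x plus tři")  # served from the cache
```

Including the voice in the key matters here because different voices can be assigned to different templates, so the same sentence may legitimately be synthesized more than once.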
3.3 New technique for reading mathematics
The developed system should handle documents containing numerous mathematical expressions such as formulas, notations, and symbols. Generally, reading formulas is a highly complicated task, especially if there is no limitation in the complexity of the equation structure. Moreover, Czech is an inflective language; thus, all operands in the formula should be converted into the correct grammatical form (which can differ in various mathematical contexts).
3.3.1 Automatic conversion of ‘inline formulas’
Formulas with a simple linear structure can be represented by a text string (‘inline formulas’) which is usually a sequence of operators and operands read in the order they are written in. All operands in the formula are inflected into the correct grammatical form determined by the previous operator. We define a transcription rule for each operator, which contains a transcription of the operator and grammatical form for the following operand (case, number, gender, and cardinal/ordinal form). For the inflexion of operands, the method described in [47] was utilized.
The current version of the system supports only the basic operators and operand types in the text representation. These include addition, subtraction, multiplication, division, brackets, superscript (power), subscript, numbers, variables and physical units. An example is presented in Table 1. The formulas having other operators or a more complex structure are represented using MathML.
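The rule mechanism for inline formulas can be sketched as follows. This is only an illustration under simplifying assumptions: the rule table `RULES`, the tiny inflection table `FORMS` (covering just the numeral 3), and the function names are hypothetical, and the lookup table stands in for the inflexion method of [47].

```python
import re

# Toy inflection table: (numeral, grammatical form) -> Czech word form.
FORMS = {
    ("3", "nominative"): "tři",
    ("3", "instrumental"): "třemi",
    ("3", "ordinal"): "třetí",
}

# operator -> (spoken form, grammatical form required of the following operand)
RULES = {
    "=": ("rovná se", "nominative"),
    "+": ("plus", "nominative"),
    "-": ("minus", "nominative"),
    "*": ("krát", "nominative"),
    "/": ("děleno", "instrumental"),
    "^": ("na", "ordinal"),
}

TOKEN = re.compile(r"\d+|[A-Za-z]+|[=+\-*/^]")


def inflect(operand, form):
    # Operands without an entry (e.g. variables) are read unchanged.
    return FORMS.get((operand, form), operand)


def transcribe(formula):
    words = []
    form = "nominative"  # form of the first operand
    for token in TOKEN.findall(formula):
        if token in RULES:
            spoken, form = RULES[token]  # operator sets the form of the next operand
            words.append(spoken)
        else:
            words.append(inflect(token, form))
            form = "nominative"
    return " ".join(words)
```

With these toy tables, `transcribe("x/3")` yields "x děleno třemi" and `transcribe("x^3")` yields "x na třetí", which shows how the preceding operator selects the form of the same numeral.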
3.3.2 Automatic conversion of formulas represented by MathML
MathML is an XML application for describing mathematical notation by capturing both the structure and content of the formula. It can represent mathematical formulas of almost any structure and complexity. Moreover, the standard notation can be easily extended with new elements. For example, we defined a new type of operand for labelling physical units.
The transcription of formulas represented by MathML can be divided into several steps:
Decomposition of a MathML code,
Selection of suitable transcription rules for the operators, and
Transcription of the operator and inflexion of the related operands.
For each mathematical operation, several transcription rules can be defined. The rules differ in their activation conditions (e.g. mathematical context, various values, or types of operands). For most operators, we consider one basic rule and several additional rules for exceptional cases.
The transcription rule consists of a text template defining the constant part of the final transcription, a type of the resulting expression describing the relation to a higher level of the formula, and a corresponding grammatical form for each operand. An example of a formula with its MathML representation and transcription is shown in Table 2.
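The three steps can be sketched in a few lines of Python. The sketch is not the system's implementation: it uses English templates without any inflexion, it handles only a handful of presentation MathML elements, and the rule list `RULES` with its activation conditions is a hypothetical simplification of the rule structure described above.

```python
import xml.etree.ElementTree as ET


def local(tag):
    # Drop an XML namespace prefix such as "{http://www.w3.org/1998/Math/MathML}".
    return tag.rsplit("}", 1)[-1]


OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}

# Each rule: (element name, activation condition on the operands, text template).
# The first matching rule wins, so exceptional cases precede the basic rule.
RULES = [
    ("msup", lambda kids: kids[1] == "2", "{0} squared"),
    ("msup", lambda kids: True, "{0} to the power of {1}"),
    ("mfrac", lambda kids: True, "fraction {0} over {1} end of fraction"),
]


def transcribe(node):
    name = local(node.tag)
    if name in ("mi", "mn"):
        return (node.text or "").strip()
    if name == "mo":
        symbol = (node.text or "").strip()
        return OPERATORS.get(symbol, symbol)
    # Step 1: decompose the element into its transcribed operands.
    kids = [transcribe(child) for child in node]
    # Steps 2 and 3: select a rule and fill in its template.
    for rule_name, condition, template in RULES:
        if rule_name == name and condition(kids):
            return template.format(*kids)
    # math, mrow and unknown containers: read the children left to right.
    return " ".join(kids)


MATHML = """
<math>
  <mi>y</mi><mo>=</mo>
  <mfrac>
    <mrow><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn></mrow>
    <msup><mi>x</mi><mn>3</mn></msup>
  </mfrac>
</math>
"""
spoken = transcribe(ET.fromstring(MATHML.strip()))
```

The explicit "end of fraction" marker is one possible way to keep the linear reading unambiguous; a Czech version would additionally attach a grammatical form to each operand slot in the template.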
Example 1
An illustrative example of transcription rules for two operators in YAML notation—power and fraction:
3.4 Final text processing
After the conversion of mathematical formulas to a text, an analysis and processing of the remaining document content is the next important step preceding the speech generation. This process can be divided into several actions shown in Fig. 4.
3.4.1 Text filtering
The texts entering the pre-processing are parsed from an HTML-formatted source and may contain some unwanted ‘garbage’ characters, e.g. HTML tags, entity characters, quotation marks, etc. These characters must be removed or replaced before further processing.
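A filtering step of this kind might look as follows. The sketch assumes that tags are replaced by spaces, entities are decoded, and double quotation marks are dropped; the exact set of filtered characters is not given above, so the patterns are illustrative only.

```python
import html
import re

TAG = re.compile(r"<[^>]+>")    # any HTML tag
QUOTES = re.compile("[\"„“”]")  # straight and typographic double quotes
SPACES = re.compile(r"\s+")     # runs of whitespace, including no-break spaces


def filter_text(source):
    text = TAG.sub(" ", source)   # remove tags first ...
    text = html.unescape(text)    # ... then decode entities such as &gt;
    text = QUOTES.sub("", text)
    return SPACES.sub(" ", text).strip()
```

Decoding entities only after the tags are removed prevents an escaped `&lt;` in the text from being mistaken for the start of a tag.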
3.4.2 Text normalization
The text normalization detects any ‘non-standard word’ (e.g. digit, date, abbreviation) in the input text and converts it to a grammatically correct ‘full-word’ form.
The determination of the grammatically correct form is one of the most challenging tasks for all inflective languages (e.g. Czech) as a single word can have many different forms depending on the syntax and the meaning of the sentence. For example, the phrase ‘2 ženy’ (2 women) is to be converted to ‘dvě ženy’ (two women) after the text normalization. However, it can have other forms depending on the context, e.g. ‘bez dvou žen’ (without two women), ‘se dvěma ženami’ (with two women), ‘ke dvěma ženám’ (towards two women).
An extensive semantic and syntactic analysis is required to assign a word the correct form. The development of such an analysis is still ongoing; thus, an estimator (the TnT tagger [7]) is currently used to find the correct form with some probability. This efficient statistical part-of-speech tagger was trained on a large Czech corpus that had previously been annotated with morphological tags.
Two examples of the text normalization are shown in Fig. 4. The numeral ‘3’ is converted to the correct form of ‘tři’ (three), whereas the ordinal number ‘1.’ is converted to ‘první’ (first).
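The dependence of the word form on context can be made concrete with a toy sketch restricted to the numeral 2 and the phrases quoted above. It picks the case from the preceding preposition, which is a deliberately naive rule-based stand-in for the statistical tagger; the tables `CASE_AFTER` and `TWO` and the function name are hypothetical.

```python
# Case required by a preceding preposition (toy subset).
CASE_AFTER = {
    "bez": "genitive",
    "s": "instrumental",
    "se": "instrumental",
    "k": "dative",
    "ke": "dative",
}

# Feminine forms of the numeral 2, taken from the example phrases.
TWO = {
    "nominative": "dvě",
    "genitive": "dvou",
    "dative": "dvěma",
    "instrumental": "dvěma",
}


def normalize(text):
    out = []
    previous = ""
    for word in text.split():
        if word == "2":
            # No known preposition in front: fall back to the nominative.
            case = CASE_AFTER.get(previous.lower(), "nominative")
            out.append(TWO[case])
        else:
            out.append(word)
        previous = word
    return " ".join(out)
```

A real normalizer also has to resolve gender and number from the following noun, which is why a tagger trained on annotated text is used instead of fixed rules.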
3.4.3 Word substitutions
In the input text, words with a non-standard pronunciation (e.g. foreign words, names, or proper nouns) may occur. These words cannot be transcribed using standard Czech phonetic transcription rules mentioned in Sect. 3.4.5; thus, they require special processing. Therefore, we used a ‘dictionary-like’ system in which a single word can be replaced with a corresponding ‘phonetic-friendly’ transcription, and this can be correctly processed during the following phonetic transcription. In Fig. 4, the proper noun ‘Newtonovy’ (Newton’s) is substituted by a Czech phonetic-friendly transcription ‘ňůtnovy’.
Support for non-standard word pronunciation was also integrated into the system’s back-end. The editor can mark a word as a ‘pronunciation exception’ and assign its proper pronunciation.
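A dictionary-like substitution of this kind can be sketched as below. The single entry comes from the example above; the case-insensitive lookup and the names `EXCEPTIONS` and `substitute` are assumptions made for illustration.

```python
import re

# Pronunciation exceptions: lower-cased word -> phonetic-friendly spelling.
EXCEPTIONS = {"newtonovy": "ňůtnovy"}

WORD = re.compile(r"\w+")


def substitute(text, exceptions=EXCEPTIONS):
    def replace(match):
        word = match.group(0)
        # Words without an entry are passed through unchanged.
        return exceptions.get(word.lower(), word)

    return WORD.sub(replace, text)
```

Entries marked by the editor in the back-end could simply be added to the same dictionary.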
3.4.4 Phrasification and prosodic description
In addition to the phonetic transcription, each input text is described in terms of prosodic symbols. In Slavic languages (and in other Indo-European languages), prosody can be viewed as supplementing the phonetic information with other linguistic aspects, such as sentence modality (e.g. declarative sentences vs. yes/no questions), emotions, styles, or general expressiveness and speaker attitude. Thus, prosody helps listeners understand the meaning of the transmitted message. Prosody also helps in the division of longer utterances into sentences, sentences into shorter phrases, and phrases into words.
3.4.5 Phonetic transcription
During the phonetic transcription, an orthographic form of the input text is converted to a phoneme sequence. This process is rule-based in our system as the conversion is almost always unambiguous in the Czech language. The pronunciation exceptions, e.g. foreign words, are handled as described in Sect. 3.4.3.
3.4.6 Phonetic filtering
After the phonetic transcription, the phoneme sequence might still contain some characters that are not supported by the speech synthesis engine. Currently, all unsupported characters are omitted.
In addition, some phonetic substitutions can also be made in this step. For instance, some phonetic nuances could be discarded, i.e. symbols representing phonetic subclasses can be replaced by symbols representing a more general phonetic class. In Fig. 4, a syllabic voiced alveolar trill [r=] is replaced by its basic non-syllabic version [r]. Similarly, the unvoiced and voiced alveolar fricative trills are merged as both represent a similar phone.
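The two operations of this step, substitution and omission, can be sketched as follows. Only the mapping from [r=] to [r] is taken from the text; the supported inventory `SUPPORTED` is a made-up subset and the function name is hypothetical.

```python
# Replace symbols of a phonetic subclass by their more general class.
SUBSTITUTIONS = {"r=": "r"}

# Hypothetical inventory of symbols the synthesis engine accepts.
SUPPORTED = {"a", "e", "i", "o", "u", "k", "p", "r", "s", "t"}


def filter_phonemes(phonemes):
    result = []
    for symbol in phonemes:
        symbol = SUBSTITUTIONS.get(symbol, symbol)
        if symbol in SUPPORTED:
            result.append(symbol)
        # Unsupported symbols are silently omitted.
    return result
```

For the input `["k", "r=", "k", "#"]`, the syllabic trill is mapped to `"r"` and the unsupported `"#"` is dropped.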
3.5 Text-to-speech
To make the content of the website accessible for students with visual impairment, TTS technology was used. The primary task of any TTS system is to convert an arbitrary input plain text to a speech signal which should correctly reflect the content of the text. For our application, the unit-selection-based TTS system ARTIC [44] was adapted. It produces high-quality, natural-sounding speech and provides several Czech male and female voices, which are assigned to particular templates (see Sect. 3.1). For other languages, ARTIC can be replaced by another TTS system as the communication protocol is easy to adapt, e.g. we used MaryTTSFootnote 15 [36] and CereProcFootnote 16 for German and British English to support the teaching of foreign languages.
4 Evaluation methods
4.1 Participants
The participants of the study were 41 lower secondary pupils (14 girls and 27 boys) of the sixth, seventh, and eighth grades (aged 12 to 14) and three teachers of the primary school for pupils with visual impairment in Pilsen, Czech Republic. This school educates pupils from all over the Pilsen and Karlovy Vary regions. The distribution of the classified visual impairments combined with other disabilities of the pupils in the study is summarised in Table 3.
4.2 Materials
Twenty selected topics of mathematics and physics were used to evaluate the system. The topics partially cover the curriculum of the lower secondary school (see Table 4) and were created in the back-end of the system by the teachers of the pupils in the study. Each topic consists of an explanation of the subject matter of one lesson including examples and exercises. The topic substitutes the pupils’ notes from the school lesson and helps them with individual preparation.
The teachers selected these topics because pupils find them more difficult; they usually require more effort to master. The contents of each topic were selected to allow independent home preparation with an emphasis on exercise.
4.3 Procedure
The study lasted one school year. The pupils were shown each topic in the system for at least one school hour. To avoid organizational complications, the prerequisites for using the system in the classroom included a digital projector and a notebook, or an interactive whiteboard. Thus, it was possible to enlarge the text on the screen or interactive whiteboard, which proved beneficial to the pupils with severe visual impairment. The pupils had full access to their standard teaching aids.
During the study period, lessons were delivered with minimal changes. If the lesson was covered by some topic in our system, the teacher mentioned this with a brief overview on the interactive whiteboard as the focus was on home preparation. The pupils were given the homework from the exercise part of the topic. They could check for the solution in the system and receive immediate feedback. In case there was a problem, the solution guided them in sufficiently understanding the example. For a better insight into the topic, the pupils could repeat the explanation of the subject matter. Thus, the pupils were able to work at their own pace and independently.
Generally, the pupils used the tool mainly for home repetition, supplementing misunderstood material, and practicing. During the study period, the teacher dedicated two afternoons to teaching the pupils how to operate the system.
Each use of the system in a given lesson was recorded in time-sheets, together with the positive or negative response obtained from the pupils. After one year of use, we qualitatively evaluated the system through questionnaires administered to the pupils and teachers. The questionnaire items were mostly scaled to allow a finer distinction of answers. When the pupils were filling in the questionnaires, an individual approach with adult assistance was adopted to ensure that they understood the questions and answered correctly.
5 Results
The inquiry, partly realized using a questionnaire and an interview, was focused on several monitored areas:
1. How and how often was the product used?
2. In what areas did the use of the product show a positive effect?
3. How was the quality of the reading voice assessed?
The results obtained by the evaluation of each of the questions above are summarized in the following subsections.
5.1 The ways and frequency of using the system
For the pupils, results were collected only from questionnaires. The frequency of using the system was collected on a 1-to-6 response scale (see Table 5). The results are distributed among all interval levels, and most answers fall into the more frequent use interval, i.e. at least once a week.
When asked about a specific type of usage, the pupils chose from three options (see Table 6). The result shows the importance of both acoustic and visual modality (64% of respondents). The most frequent response was the item, ‘I listen to a computer voice, and I follow everything on monitor depending on need and fatigue’. Approximately 2% of the pupils ‘only listened to a computer voice’.
For the question, ‘Can you use the system after school?’, responses were obtained in the frequency shown in Table 7. For the item, ‘Are your parents familiar with the use of the system for automatic reading textbooks?’ 88% of the respondents answered ‘yes’. When asked whether people around them were interested in the system, the pupils indicated that the most curious were mothers (63%) and friends (43%). The interest, however, was characterized as ‘little’. On the other hand, 85% of the grandparents were not interested at all.
5.2 The effect of the system
The results of this evaluation were collected from questionnaires administered to the pupils and teachers. The responses of the pupils to the question, ‘Is it a useful tool in understanding difficult topics in mathematics or physics?’ show mostly affirmative acceptance. Precisely, 51% of the respondents answered ‘certainly yes’ and 44% answered ‘rather yes’. For the question, ‘What subject and topics were most beneficial’, 73% of all responses point to mathematics, and most answered topics were ‘Fractions’ (27%), ‘Unit conversion’ (24%), and ‘Linear equations’ (24%).
For the overall evaluation item ‘Working with the application is for me...’, the frequency of ‘yes’ responses is shown in Table 8. The next question compared working with the system to other educational tools, such as learning from exercise books and preparing from textbooks. The responses show a preference for our system: 55% of the respondents answered ‘definitely yes’ and 37% ‘somewhat agree’. However, 67% of the respondents do not favour our system over learning with friends or parents.
The second result is from the questionnaires filled by the teachers and their discussions with the authors of this paper. The aim was to clarify whether the system has an influence on the academic achievement of the pupils in the given subjects.
To the first question, ‘Does the system help pupils in their home preparation for the subject and why?’, all teachers answered ‘definitely yes’ on a four-point scale. The second question was ‘Do the pupils achieve better results with this special teaching aid and why?’. Two answers were ‘probably yes’ from teachers of mathematics and one answer was ‘probably no’ from a teacher of physics. The responses and comments are summarized in Table 9.
5.3 Reading voice of the texts
In the next item, the voice that reads the texts and mathematical problems was evaluated. Respondents had several options to choose from. To make it simpler, the results of ‘definitely yes’ and ‘probably yes’ were merged into ‘yes’ and the ‘definitely not’ and ‘probably not’ into ‘no’. These main results alone are shown in Table 10.
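The merging of the four response levels into two can be written down explicitly. The sketch below uses a made-up list of four answers purely to show the arithmetic; it does not reproduce the study data, and the names `MERGE` and `merge_responses` are hypothetical.

```python
from collections import Counter

# Four-level answers collapsed into two classes.
MERGE = {
    "definitely yes": "yes",
    "probably yes": "yes",
    "probably not": "no",
    "definitely not": "no",
}


def merge_responses(responses):
    counts = Counter(MERGE[r] for r in responses)
    total = sum(counts.values())
    # Share of each merged class in whole percent.
    return {label: round(100 * counts[label] / total) for label in ("yes", "no")}


# Invented sample, not taken from the questionnaires.
sample = ["definitely yes", "probably yes", "probably yes", "probably not"]
shares = merge_responses(sample)
```

For this invented sample, three of the four answers fall into the merged ‘yes’ class, i.e. 75%.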
6 Discussion
The main purpose of the proposed web-based system is for pupils to prepare for lessons after school. Eighty-eight percent of the pupils could access the educational material online through an internet connection. During the pupils’ home preparation, the parents were interested in the system, especially the mothers. For the question, ‘How often and how was the product used?’, the responses were distributed among all interval levels, and 53% of the answers fell into the more frequent use interval, i.e. at least once a week.
While evaluating the synthetic voices (conventional TTS system for Czech), intelligibility and pleasantness were praised by the pupils, but the voice tended to sound less natural and rather monotonous. The results are especially valuable for further technical adjustments in which it would be appropriate to improve the naturalness (and remove the monotony) of a synthetic voice employed to read technical documents.
The system was assessed mostly positively: more than 85% of the pupils reported a positive effect of the system in the criteria ‘simplification of preparation for school’ and ‘clarification of subject matter’. This is consistent with earlier results comparing mathematical TTS software with printed text [1], in which pre- and post-tests of secondary students with visual impairments showed increasing accuracy. In our study, there was 95% positive acceptance in the criterion of understanding a topic that the pupil found difficult.
The flexibility of the application allowed the pupils to work individually according to their state of vision and their current level of fatigue (indicated by 64% of the pupils). For pupils with severe visual impairment, the results show the importance of the visual modality (e.g. a graphically rendered mathematical formula) combined with synchronously rendered voice. In this study, 30 of the 41 pupils were classified as having severe visual impairment. These findings are consistent with the results of [21], whose authors warn against relying only on listening to digital text.
Another factor affecting the acceptance of the system is the age of the pupils and the complexity of the subject matter. In a study of older high-school students with visual impairment in an Algebra 1 course [6], the authors emphasised a preference for classical text materials and a general resistance to new technology. In contrast, a different study [17] of junior high school students with visual impairment who were trained on the authors’ system reported an effective improvement in mathematics. The pupils in our study preferred working with the system to working with paper text, books, and textbooks. This finding can be explained by the higher ‘didactic friendliness’ of the system, which may result from (1) a continual introduction of the system to the pupils in the classroom, (2) a pre-algebra course containing elementary mathematics, and (3) the more positive attitude of younger students towards electronic texts.
From the teachers’ perspective, the pupils’ responses to the system were monitored continuously throughout the evaluation year. According to the teachers’ responses and comments, the system fulfilled its main purposes: to help pupils in individual preparation, to review difficult parts of the curriculum, and to substitute for notes from the lesson. An improvement in the pupils’ proficiency was indicated only in mathematics, but all teachers agreed that the system improved the pupils’ access to the curriculum and that the pupils perceived the system positively. In addition, the teachers appreciated the possibility of explaining the subject matter in another way using the system.
7 Conclusion
We present a new web-based system developed specifically to give Czech pupils with visual impairment access to educational materials through automatic reading. The system enables teachers to prepare and process arbitrary topics, with a focus on technical documents that contain mathematics and physics formulas (at the lower secondary school level). The system converts the content to speech automatically, and the implemented solution provides a method for reading formulas in various mathematical contexts and in the correct grammatical forms, which are very important for inflectional languages such as Czech.
In general, the system consists of a client side and a server side. The client side comprises two interfaces (front-end and back-end). The front-end is a public interface that gives the users (pupils) full access to the services via graphics and voice. Because pre-synthesized text is held in a cache on the server side, the selected documents are read immediately and synchronized with graphic highlighting. The back-end is an administrative interface in which the documents are created and modified. The server side of the system is modular, implements several web services, and provides automatic processing for the client side.
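The role of the server-side cache can be illustrated by a minimal Python sketch. It is not the implementation used in the system: the class SpeechCache, the key scheme (a SHA-256 hash of voice and text), and the stand-in synthesizer fake_tts are all assumptions introduced here only to show why a previously synthesized text can be served immediately.

```python
import hashlib

class SpeechCache:
    """Hypothetical in-memory cache of synthesized audio, keyed by voice and text."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable (text, voice) -> bytes
        self._store = {}
        self.misses = 0                # number of times synthesis was actually run

    @staticmethod
    def _key(text, voice):
        # The same text read by a different voice must map to a different entry.
        return hashlib.sha256(f"{voice}\x00{text}".encode("utf-8")).hexdigest()

    def get(self, text, voice):
        key = self._key(text, voice)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._synthesize(text, voice)
        return self._store[key]

def fake_tts(text, voice):
    # Stand-in for a real TTS engine; returns placeholder bytes, not audio.
    return f"<{voice}>{text}".encode("utf-8")

cache = SpeechCache(fake_tts)
first = cache.get("a squared plus b squared", "voice_a")
second = cache.get("a squared plus b squared", "voice_a")  # served from the cache
```

In this sketch the second request does not invoke the synthesizer again, which is the property that allows a selected document to be read without delay. A production cache would additionally need persistence and invalidation when a document is edited in the back-end.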
The system was experimentally evaluated by 41 pupils and three teachers at a school for pupils with visual impairments. The responses indicate a positive contribution of the system, especially for difficult topics, and the pupils preferred the system over paper textbooks. The system is most frequently used in a multi-modal form that combines auditory and visual perception.
Notes
Learning, visual, or physical disability prevents gaining information from printed material in the standard way.
Categories (1) and (2) are also termed low vision; in the USA, the visually impaired in categories (3) to (6) are considered legally blind.
Linear encoding of mathematical and scientific formulas in a single row.
Available at http://ucebnice.zcu.cz/.
Simple HTML-source code including the mathematics formulas as graphics with alternative text.
The new version of the system provides an extension for the special needs of subjects such as chemistry or grammar. The system is a result of two European Social Fund (ESF) projects: SAMOČET CZ.1.07/1.2.31/02.0019.
References
Alajarmeh N, Pontelli E (2012) A non-visual electronic workspace for learning algebra. In: Miesenberger K, Karshmer A, Penaz P, Zagler W (eds) Computers helping people with special needs. Springer, Berlin, pp 158–165
Alper S, Raharinirina S (2006) Assistive technology for individuals with disabilities: a review and synthesis of the literature. J Spec Educ Technol 21(1):47–56
Argyropoulos V, Paveli A, Nikolaraizi M (2018) The role of daisy digital talking books in the education of individuals with blindness: a pilot study. Educ Inf Technol 24:693–709
Bencharef O (2018) An assistive technology for braille users to support mathematical learning: a semantic retrieval system. Symmetry. https://doi.org/10.3390/sym10110547
Bouck EC, Meyer NK (2012) eText, mathematics, and students with visual impairments: "What teachers need to know". Teach Except Child 45(2):42–49
Bouck EC, Weng PL, Satsangi R (2016) Digital versus traditional: secondary students with visual impairments’ perceptions of a digital algebra textbook. J Vis Impair Blind 110(1):41–52
Brants T (2000) TnT: a statistical part-of-speech tagger. In: Proceedings of the 6th conference on applied natural language processing (ANLC’00), Seattle, Washington, pp 224–231
Cowan RE, Fregly BJ, Boninger ML, Chan L, Rodgers MM, Reinkensmeyer DJ (2012) Recent trends in assistive technology for mobility. J NeuroEng Rehab 9(20):20
Crombie D, Lenoir R, McKenzie N, Barker A (2004) math2braille: Opening access to mathematics. In: Computers helping people with special needs, vol 3118, lecture notes in computer science, Springer, Berlin, pp 670–677
Dawe M (2006) Desperately seeking simplicity: How young adults with cognitive disabilities and their families adopt assistive technologies. In: Proceedings of the SIGCHI conference on human factors in computing systems (CHI’06) Montreal, Canada, pp 1143–1152
Edwards ADN, McCartney H, Fogarolo F (2006) Lambda: a multimodal approach to making mathematics accessible to blind students. In: Proceedings of the 8th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2006), Portland, Oregon, pp 48–54
Ferreira HF (2011) AudioMath: speaking mathematics with MathML. In: Second European workshop on MathML and scientific e-contents, Kuopio, Finland, pp 55–62
Foster KR (2001) MathType 5 with MathML for the WWW. IEEE Spectr 38(12):64
Grůber M, Matoušek J, Hanzlíček Z, Krňoul Z, Zajíc Z (2016) ARET – automatic reading of educational texts for visually impaired students. In: Interspeech, pp 383–384
Hersh MA, Johnson MA (eds) (2003) Assistive technology for the hearing-impaired, deaf and deafblind. Springer, London
Hersh MA, Johnson MA (eds) (2008) Assistive technology for visually impaired and blind people. Springer, London
Huang PH, Chiu MC, Hwang SL, Wang JL (2015) Investigating e-learning accessibility for visually-impaired students: an experimental study. Int J Eng Educ 21(1):495–504
Isaacson M, Srinivasan S, Lloyd LL (2010) Development of an algorithm for improving quality and information processing capacity of MathSpeak synthetic speech renderings. Disabil Rehabilit Assist Technol 5(2):83–93
Jayant C (2006) A survey of math accessibility for blind persons and an investigation on text/math separation. In: Technical report, University of Washington, Seattle, Washington
Karshmer A et al (2004) UMA: a system for universal mathematics accessibility. In: Proceedings of the 6th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2004), Atlanta, pp 55–62
Klingenberg OG, Holkesvik AH, Augestad LB (2020) Digital learning in mathematics for students with severe visual impairment: a systematic review. Br J Vis Impair 38(1):38–57. https://doi.org/10.1177/0264619619876975
Leas D, Persoon E, Soiffer N, Zacherle M (2008) Daisy 3: a standard for accessible multimedia books. IEEE Multimed 15(4):28–37
Lewis P, Noble S, Soiffer N (2010) Using accessible math textbooks with students who have learning disabilities. In: Proceedings of the 12th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2010), ACM, Orlando, pp 139–146
Lewis RB (1998) Assistive technology and learning disabilities: today’s realities and tomorrow’s promises. J Learn Disabil 31(1):16–26
Lopresti EF, Mihailidis A, Kirsch N (2004) Assistive technology for cognitive rehabilitation: State of the art. Neuropsychol Rehabil 14(1–2):5–39
Matoušek J et al (2011) Web-based system for automatic reading of technical documents for vision impaired students. In: Text, speech and dialogue, vol 6836. Lecture notes in artificial intelligence. Springer, Berlin, pp 364–371
McCracken RE, Nemeth A, Roberts H (1972) The Nemeth Braille code for mathematics and science notation 1972 revision. American Printing House for the Blind, Louisville
Miner R (2005) The importance of MathML to mathematics communication. Not AMS 52(5):532–538
Moskovitch Y, Walker BN (2010) Evaluating text descriptions of mathematical graphs. In: Proceedings of the 12th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2010), Orlando, Florida, pp 259–260
Potencier F (2009) The symfony reference guide. Sensio SA
Raman T (1994) Audio system for technical readings. Ph.D. thesis, Cornell University
Ramloll R et al (2000) Constructing sonified haptic line graphs for the blind student: first steps. In: Proceedings of the 4th international ACM conference on assistive technologies (ASSETS 2000), Arlington, Virginia, pp 17–25
Sandhu P (2009) The MathML Handbook. Charles River Media
Scherer MJ (2004) Connecting to learn: educational and assistive technology for people with disabilities. American Psychological Association, Washington
Scherer MJ, Craddock G (2002) Matching person & technology (MPT) assessment process. Technol Disabil 14(3):125–131
Schröder M, Charfuelan M, Pammi S, Steiner I (2011) Open source voice creation toolkit for the MARY TTS platform. In: Proceedings of the 12th annual conference of the international speech communication association (Interspeech 2011), Florence, pp 3253–3256
Sears A, Young M (2002) Physical disabilities and computing technologies: an analysis of impairments. In: Jacko JA, Sears A (eds) The human–computer interaction handbook: fundamentals, evolving technologies and emerging applications. Lawrence Erlbaum Associates, New Jersey, pp 482–503
Soiffer N (2007) MathPlayer v2.1: web-based math accessibility. In: Proceedings of the 9th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2007), Kuopio, pp 257–258
Stevens R, Edwards A (1994) Mathtalk: the design of an interface for reading algebra using speech. In: Computers for handicapped persons, vol 860. Lecture notes in computer science. Springer, Berlin, pp 313–320
Stevens R, Edwards A (1994) Mathtalk: usable access to mathematics. Inf Technol Disabil J 1(4)
Stoeger B, Batusic M, Miesenberger K, Haindl P (2006) Supporting blind students in navigation and manipulation of mathematical expressions: Basic requirements and strategies. In: Computers helping people with special needs, vol 4061. Lecture notes in computer science. Springer, Berlin, pp 1235–1242
Taylor P (2009) Text-to-speech synthesis. Cambridge University Press, Cambridge
Teiresiás-MUNI-Brno (2008) Draft of the Czech 8 dot braille code standard. http://www.teiresias.muni.cz/czbraille8. Accessed on 09 Jan 2015
Tihelka D et al (2018) Current state of text-to-speech system ARTIC: a decade of research on the field of speech technologies. In: Text, speech and dialogue, vol 11107. Lecture notes in computer science. Springer, Berlin, pp 369–378
Wells J (1997) SAMPA computer readable phonetic alphabet. In: Gibbon D, Moore R, Winski R (eds) Handbook of standards and resources for spoken language systems. Mouton de Gruyter, Berlin
World Health Organization: International statistical classification of diseases and related health problems 10th revision (2003)
Zelinka J, Kanis J, Müller L (2005) Automatic transcription of numerals in inflectional languages. In: Text, speech and dialogue, vol 3658. Lecture notes in artificial intelligence. Springer, Berlin, pp 326–333
Acknowledgements
This research was supported by the European Social Fund and the State Budget of the Czech Republic, project No. CZ.1.07/1.2.00/08.0021, and by the Ministry of Education, Youth and Sports of the Czech Republic, project No. LO1506. We thank the Primary School for Pupils with Visual Impairment in Pilsen, Czech Republic, for help with the implementation and evaluation of the system. The WIRIS editor was incorporated into the system courtesy of “Maths for More”, a mathematical software company based in Barcelona, Spain.
Cite this article
Matoušek, J., Krňoul, Z., Campr, M. et al. Speech and web-based technology to enhance education for pupils with visual impairment. J Multimodal User Interfaces 14, 219–230 (2020). https://doi.org/10.1007/s12193-020-00323-1
DOI: https://doi.org/10.1007/s12193-020-00323-1