Speech and web-based technology to enhance education for pupils with visual impairment

Matoušek, Jindřich; Krňoul, Zdeněk; Campr, Michal; Zajíc, Zbyněk; Hanzlíček, Zdeněk; Grůber, Martin; Kocurová, Marie

doi:10.1007/s12193-020-00323-1

Original Paper
Published: 19 April 2020

Volume 14, pages 219–230 (2020)

Journal on Multimodal User Interfaces

Abstract

This paper describes a new web-based system specially adapted to the education of Czech pupils with visual impairment. The system integrates speech and language technologies with a web framework in lower secondary education, especially in mathematics and physics subjects. A new interface uses text-to-speech (TTS) synthesis for online automatic reading of educational texts. The interface provides several TTS voices, synthesized data caching, and automatic processing of formulas in mathematics and physics. The system was designed to enable teachers to create and manage teaching materials. It also enables the pupils to view these documents and listen to their read forms online. A school for pupils with visual impairment participated in the development and implementation of the system. After one year of using the system daily, the user experience and evaluation data were collected. The results indicate a positive reception and frequent use of the system as well as a preference over classical educational materials.

1 Introduction

Modern computers are beneficial for many types of impairments or disabilities. There are various assistive technologies [2, 34, 35], e.g. for people with cognitive disabilities [10, 25], mobility impairment [8], or hearing loss [15]. For individuals diagnosed with a print disability^{Footnote 1} [16, 24, 37], the transition from textbooks available on tape to electronic texts read by computers is considered one of the most effective methods of the last decade [3, 23]. However, this is not always true of the automatic conversion of mathematics to voice by computers. The creation and use of accessible materials is often discussed; however, the problem remains unsolved in several schools. Teachers cope daily with several issues arising from teaching pupils with combined disabilities (e.g. visual impairment and dyslexia or speech defects). The challenge worsens in science subjects, such as mathematics and physics, for which accessible materials in a given language are noticeably lacking. It is also very challenging for computers to automatically read technical texts, such as mathematical formulas. Thus, it is difficult to develop a standard interface for the creation and presentation of science educational materials for pupils with visual impairments.

Visual impairment is designated by measuring visual acuity (VA). Other diagnosable characteristics are the range of visual field, depth perception, contrast sensitivity, colour discrimination, accommodation, adaptation, ophthalmogyric activity, and ability to localize a subject and follow it in motion. Based on the ophthalmology examination, the visual impairment of the person is classified into different categories as specified by the World Health Organization classification: (1) mild vision loss (near-normal vision), (2) moderate low vision, (3) severe low vision, (4) profound low vision, (5) near-total blindness, and (6) total blindness (no light perception)^{Footnote 2} [46].

Pupils with visual impairment educated in lower secondary schools in the Czech Republic fall into a wide range of these categories (3–6). The pupils are taught together in classrooms of about ten pupils. During mathematics lessons, the pupils routinely use low vision aids such as video magnifiers, large displays with enlarged text font, or Braille code display. Only some pupils classified as totally blind learn and use Braille or Nemeth Braille code.^{Footnote 3} An alternative method for these pupils is specialized computer hardware and software, such as screen readers, that enable them to read e-text using a computer [4].

Today, people with visual impairment already use standard screen readers using text-to-speech (TTS) technology (see [42]) that requires localization for a given language. A screen reader involves a computer application designed to automatically ‘read’ text on a screen without human intervention. However, the ‘non-textual’ content, such as mathematical formulas, images, and graphical schemes, is skipped or garbled by standard screen readers which lack support for correct text pre-processing modules. On the other hand, there are also offline digital textbooks and materials that correctly combine written texts and images with human voice (also known as hybrid books). The creation of such textbooks for the learning of mathematics is a long and laborious process [3, 5]. For example, the voice of a human is recorded along with the corresponding textual content.

1.1 Tasks in automatic formula reading

Two essential steps are involved in providing an appropriate and flexible interface for preparing new educational materials for daily use which are available through the automatic reading of mathematical formulas.

The first step involves the correct transcription of the symbolic notations of formulas and their subsequent decoding into corresponding grammatically correct word forms. One issue in the design is obtaining an adequate formula editor accessible to teachers. The authors (teachers) usually lack the skills to typeset mathematical equations; they are not web programmers and do not know how to generate well-structured web pages with mathematical formalisms.

The second important step involves the application of TTS. Current TTS systems are optimized on the corpora utilized during system design. As the spoken formulas are usually not included in these corpora, problems with synthetic speech quality can arise when conventional TTS systems are employed to read technical documents. Thus, the intelligibility and naturalness of the generated voice can be questionable.

1.2 Supporting pupils with visual impairment in the learning of mathematics

In the context of mathematics education, Stoeger [41] defines four main problems: (1) access to the mathematical literature (books, teaching materials, articles, etc.); (2) preparation of teaching materials (school exercises, notes); (3) navigation in mathematical formulas, and (4) the process of learning mathematics (calculations, formal manipulation of expressions, problem-solving).

This paper contributes to the four problems mentioned above. We propose a new web-based system^{Footnote 4} that provides the following: (1) an easy access for pupils with visual impairment; (2) an interface for preparing educational texts including mathematical formulas; (3) navigation through the prepared audio-visual web content, and (4) a qualitative evaluation of the system with selected topics in mathematics and physics.

The system is designed for the teachers to prepare, manage, and administrate new educational materials using the online back-end interface. The teacher can convert, check, and immediately publish the prepared topic (lesson) to the web page with voice automatically supplemented by an integrated TTS (currently in Czech, English, and German). The educational material is available online to the pupils via the front-end interface of the website in both voice and visual modes.

The web page generated by the system is accessed via standard screen readers.^{Footnote 5} The educational material in this audio-visual form can also be integrated into other educational interfaces.

In general, the variety of the subjects prepared by the system is unlimited. However, the novelty of the proposed system is the capability to decode mathematical formulas.^{Footnote 6} Therefore, the system was qualitatively evaluated using mathematics and physics for pupils with visual impairment.

2 Current technologies and automatic reading of mathematical text

Several technologies and standards have been developed to improve the availability of textbooks for pupils with a print disability. Nevertheless, very few technologies currently provide a framework to make mathematical educational materials accessible in the required form (school textbooks, training notes, exercises). Mathematical equations are presented in several formats and codes, such as Mathematical markup language (MathML), NIMAS, DAISY, and Nemeth [22, 27, 28].

MathML^{Footnote 7} is a computer XML-based format for describing mathematical formalism on the web [33]. However, current screen-readers are unable to read the MathML tags properly as they read the content alone and ignore the structure.

NIMAS is the file format for developing printed textbooks. By default, math content is provided in NIMAS file sets as images.

DAISY is another standard for producing accessible and navigable multimedia documents in the form of a synchronized audio/text-book. In addition, the MathDaisy add-in converts the equations to MathML and saves the document as a DAISY digital talking book (DTB). DTBs are used in eBook readers and the DAISY DTBs are read by players such as gh-PLAYER [18].

AudioMath developed at Porto University is another mathematics reader that uses MathML [12].

The Nemeth Braille code is used for the linear coding of mathematical and scientific notations using standard 6-point Braille cells, and it is unsuitable for automatic reading.

The problem of reading mathematics directly from a TeX source document was addressed, for instance, in the audio system for technical readings (AsTeR) [31]. For the world wide web, MathPlayer [38] provides a plugin to the web browser which displays, highlights, and reads mathematical expressions on the website. Accessibility is feasible here by right-clicking on an equation and choosing a ‘to render aurally’ command or using a screen reader that reads the entire web page and invokes MathPlayer to speak the mathematical formula in a structured (and sometimes customized) way. It should be noted that mathematical formalisms are often articulated in an ambiguous form.

The current technology falls short because it does not incorporate appropriate rules on how to describe mathematical expressions due to their naturally nonlinear structure. Unifying the translation process was also one of the aims of the MathSpeak project [18]. Furthermore, a prototype of the MathML reader called MathGenie was designed to provide an unambiguous verbal presentation of nontrivial mathematical formulas [20].

MathTalk is another assistive technology based on MathML, and it helps users with visual impairments create mathematical formulas by voice commands, display of information on the screen, and conversion of the information into braille [39, 40]. For the Czech language, the Lambda math-editor, developed at the University of York, was accommodated at Masaryk University [43] as a support system for editing mathematical formulas by blind users (creators) using Braille and audio output. Lambda also provides a compact and linear 8-dot Braille math-code [11]. Converting MathML to Braille is also possible using Math2Braille [9].

For the creation and subsequent reading of mathematical formulas, it is possible to use TeX or MathML code directly or to use an editor such as MathType^{Footnote 8} [13]. There are more general editors and converters of varying quality and different functionalities. Some applications are extensions or add-ons of internet browsers, e.g. Firemath,^{Footnote 9} Amaya,^{Footnote 10} Bluegriffon,^{Footnote 11} Tex4ht.^{Footnote 12} Expressions can also be graphically created using a word processor. The formulas are often stored as graphic image files in formats such as PNG or SVG, with or without alternative text. Some applications can convert expressions into speech, but they are mostly for the English language and of limited functionality.

Besides the conversion of text to speech, there are further specific requirements for application processing and reading of mathematical expressions for pupils with visual impairment. These requirements include font size, text colour, and background colour. The language as well as volume and speed control also play a vital role in the TTS conversion. Other technologies for creating accessible text for mathematics use images that are supported or annotated in ways that are more accessible to people with visual impairment, for example through haptic feedback or a verbal description of important images such as tables and graphs. There are several specialized software tools to provide accessible images, such as creating tactile representations of graphs [19], using MathTrax, automatically generating image descriptions [29], or creating a unified version of the figure or image in real time [32].

The required feature of the tools is adequate support for the learning of mathematics. Nevertheless, in many cases, it is up to the teacher to modify classical mathematical learning materials into an accessible format, which can be time-consuming and financially demanding [5]. These tools also require special skills to typeset mathematical formulas or web programming. Our proposed system provides an accessible editor for the teacher, from which a readable format is automatically generated. To the best of our knowledge, such a complex system for a mathematical text is unavailable.

3 Overview of proposed system

The developed system is a web application which is based on a client-server architecture and runs on Apache HTTP server with MySQL database system [26]. The core of the system is based on Symfony 1.4^{Footnote 13} [30] which is an open-source web application framework. The client-side of the system consists of two parts: front-end and back-end. The front-end serves as a public interface for selecting, displaying, and reading documents arranged in topics. The back-end is an administrative interface, where the documents are created and modified. The server side of the system provides a text pre-processing and TTS synthesis. A schematic of the system can be seen in Fig. 1, and a detailed description of the system can be found in [14].

3.1 Client-side

The topic administrator, e.g. a teacher, has direct access to the document through the back-end, where he or she can create and modify documents using the incorporated WYSIWYG text editor (TinyMCE). A screenshot of the back-end is shown in Fig. 2.

To unify the visual style of the content, the topic administrator can use templates to clarify the meaning of particular document fragments. Currently, five templates are supported in the system, namely Definition, Important, Note, Example, and Solution. For example, the Important template highlights crucial information to which the students should pay more attention. Different synthetic voices can be assigned to each template. Changing the voice while listening to an entire document improves attention compared to listening in a single voice.

The system supports two ways of inserting or editing mathematical formulas in the document. A simple formula with a linear structure, e.g. \(y=x+1\), can be written and stored as plain text (a so-called ‘inline formula’). Additionally, the graphic editor WIRIS^{Footnote 14} is incorporated for more complex mathematical expressions. This editor provides the MathML representation of the formula, which is used to derive its word-level transcription (see Sect. 3.3).

All documents are available through the public web interface (front-end) (see screenshot in Fig. 3). The document is read continuously from the beginning to the end. This process can be automatically interrupted at predefined points in the document.

While the document is being read, the students can use a graphical navigation panel with six control buttons: right arrow to play, square to stop, double arrow to rewind (next/previous sentence or formula), and triple arrow to quickly navigate to the next/previous chapter. Furthermore, the buttons are contrastingly coloured. The students can also use keyboard shortcuts or jump into any point of the document by clicking anywhere in the text.

3.2 Server-side

Before displaying and reading the document, the HTML source code is automatically processed. First, the parts of the text, including the templates and the formulas, are extracted and normalized (see Sect. 3.3). Subsequently, the texts are sent to the Web TTS server which is responsible for the conversion of texts to audio (see Sect. 3.5). All audio files are stored in a cache to avoid re-synthesizing already synthesized texts.
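The caching step can be illustrated with a minimal Python sketch. It is not the system's implementation: the class `SynthesisCache`, the function `fake_tts`, and the choice of a SHA-256 key over voice and text are assumptions made for illustration only. The idea it shows is that a text fragment is sent for synthesis only when no audio for the same text and voice has been stored yet.

```python
import hashlib


class SynthesisCache:
    """In-memory stand-in for the audio cache (hypothetical class)."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable: (text, voice) -> bytes
        self._store = {}               # cache key -> audio bytes
        self.misses = 0                # number of real synthesis calls

    @staticmethod
    def key(text, voice):
        # The same text read by a different voice needs its own entry.
        payload = f"{voice}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_audio(self, text, voice):
        k = self.key(text, voice)
        if k not in self._store:
            # Cache miss: synthesize once and remember the result.
            self.misses += 1
            self._store[k] = self._synthesize(text, voice)
        return self._store[k]


def fake_tts(text, voice):
    # Placeholder for the request to the Web TTS server (assumption).
    return f"<audio:{voice}:{text}>".encode("utf-8")
```

Keying on the voice as well as the text matters here because different templates may be read by different voices, so the same sentence can legitimately have several audio versions.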

3.3 New technique for reading mathematics

The developed system should handle documents containing numerous mathematical expressions such as formulas, notations, and symbols. Generally, reading formulas is a highly complicated task, especially if there is no limitation in the complexity of the equation structure. Moreover, Czech is an inflective language; thus, all operands in the formula should be converted into the correct grammatical form (which can differ in various mathematical contexts).

3.3.1 Automatic conversion of ‘inline formulas’

Formulas with a simple linear structure can be represented by a text string (‘inline formulas’) which is usually a sequence of operators and operands read in the order they are written in. All operands in the formula are inflected into the correct grammatical form determined by the previous operator. We define a transcription rule for each operator, which contains a transcription of the operator and grammatical form for the following operand (case, number, gender, and cardinal/ordinal form). For the inflexion of operands, the method described in [47] was utilized.

The current version of the system supports only the basic operators and operand types in the text representation. These include addition, subtraction, multiplication, division, brackets, superscript (power), subscript, numbers, variables and physical units. An example is presented in Table 1. The formulas having other operators or a more complex structure are represented using MathML.
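The operator-driven inflexion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule table `RULES`, the tiny inflexion table `FORMS`, the form labels, and the function `transcribe_inline` are hypothetical, cover only a handful of operators and the numerals 2 and 3, and stand in for the full inflexion method of [47].

```python
import re

# Toy inflexion table (assumption): (numeral, grammatical form) -> Czech word.
FORMS = {
    ("2", "card_nom"): "dva", ("3", "card_nom"): "tři",
    ("2", "card_ins"): "dvěma", ("3", "card_ins"): "třemi",
    ("2", "ord_acc_f"): "druhou", ("3", "ord_acc_f"): "třetí",
}

# One rule per operator: (spoken operator, form required of the NEXT operand).
RULES = {
    "+": ("plus", "card_nom"),
    "/": ("děleno", "card_ins"),   # "divided by" takes the instrumental case
    "^": ("na", "ord_acc_f"),      # "to the" takes a feminine ordinal
    "=": ("rovná se", "card_nom"),
}


def transcribe_inline(formula):
    """Read a linear formula left to right, inflecting each operand."""
    tokens = re.findall(r"\d+|[a-zA-Z]|[+/^=]", formula)
    words = []
    form = "card_nom"  # the first operand has no preceding operator
    for tok in tokens:
        if tok in RULES:
            word, form = RULES[tok]
            words.append(word)
        else:
            # Variables and unknown numerals are passed through unchanged.
            words.append(FORMS.get((tok, form), tok))
            form = "card_nom"
    return " ".join(words)
```

With these toy tables, `transcribe_inline("x^2+3")` yields `x na druhou plus tři`, and `transcribe_inline("y=x/2")` yields `y rovná se x děleno dvěma`: the same numeral 2 is spoken differently depending on the operator that precedes it.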

Table 1 An example of inline formula transcription (in Czech). For cardinal form, the operand determines the grammatical number and gender

3.3.2 Automatic conversion of formulas represented by MathML

MathML is an XML application for describing mathematical notation by capturing both the structure and content of the formula. It can represent mathematical formulas of almost any structure and complexity. Moreover, the standard notation can be easily extended with new elements. For example, we defined a new type of operand for labelling physical units.

The transcription of formulas represented by MathML can be divided into several steps:

Decomposition of a MathML code,
Selection of suitable transcription rules for the operators, and
Transcription of the operator and inflexion of the related operands.

For each mathematical operation, several transcription rules can be defined. The rules differ in their activation conditions (e.g. mathematical context, various values, or types of operands). For most operators, we consider one basic rule and several additional rules for exceptional cases.

The transcription rule consists of a text template defining the constant part of the final transcription, a type of the resulting expression describing the relation to a higher level of the formula, and a corresponding grammatical form for each operand. An example of a formula with its MathML representation and transcription is shown in Table 2.
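The three steps (decomposition, rule selection, transcription) can be sketched as a small recursive walk over presentation MathML. The sketch below is an assumption-laden illustration, not the system's rule set: the `RULES` table, its English text templates, and the activation conditions are hypothetical, and grammatical inflexion of operands is omitted entirely.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: for each MathML element, an ordered list of
# (activation condition on the transcribed operands, text template).
# The last entry acts as the basic rule; earlier ones are special cases.
RULES = {
    "msup": [
        (lambda ops: ops[1] == "2", "{0} squared"),
        (lambda ops: True, "{0} to the power of {1}"),
    ],
    "mfrac": [
        (lambda ops: True, "{0} over {1}"),
    ],
}

SPOKEN_OPERATORS = {"+": "plus", "-": "minus", "=": "equals"}


def local_name(tag):
    # Drop an XML namespace prefix such as "{http://...}mfrac".
    return tag.rsplit("}", 1)[-1]


def transcribe(node):
    """Decompose the MathML tree and transcribe it bottom-up."""
    tag = local_name(node.tag)
    operands = [transcribe(child) for child in node]
    if tag in RULES:
        for condition, template in RULES[tag]:
            if condition(operands):
                return template.format(*operands)
    if tag in ("mi", "mn", "mo"):
        text = (node.text or "").strip()
        return SPOKEN_OPERATORS.get(text, text)
    # Containers such as <math> and <mrow> just join their children.
    return " ".join(operands)
```

For example, `transcribe(ET.fromstring("<math><msup><mi>x</mi><mn>2</mn></msup></math>"))` returns `x squared`, because the special-case rule fires before the basic one. A real rule would additionally carry the type of the resulting expression and the grammatical form of each operand, as described above.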

Example 1

An illustrative example of transcription rules for two operators in YAML notation—power and fraction:

Table 2 Transcription of a mathematical formula represented by MathML (in Czech)

3.4 Final text processing

After the conversion of mathematical formulas to a text, an analysis and processing of the remaining document content is the next important step preceding the speech generation. This process can be divided into several actions shown in Fig. 4.

3.4.1 Text filtering

The texts entering the pre-processing are parsed from an HTML-formatted source and may contain some unwanted ‘garbage’ characters, e.g. HTML tags, entity characters, quotation marks, etc. These characters must be removed or replaced before further processing.
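A filtering step of this kind might look as follows in Python. This is a sketch under assumptions: the function name `filter_text` and the exact set of removed quotation marks are illustrative, and a regular expression is used for tag stripping only because the input fragments are assumed to be short and well-formed.

```python
import html
import re


def filter_text(fragment):
    """Strip HTML tags, decode entities, and drop quotation marks."""
    # Replace tags with a space so words on both sides stay separated.
    text = re.sub(r"<[^>]+>", " ", fragment)
    # Decode entity characters such as &nbsp; or &#39;.
    text = html.unescape(text)
    # Remove quotation marks (illustrative selection).
    text = re.sub(r"[\"“”„«»]", "", text)
    # Collapse whitespace (including non-breaking spaces).
    text = re.sub(r"\s+", " ", text)
    # Remove the space that tag stripping may leave before punctuation.
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    return text.strip()
```

For instance, `filter_text("<p>Newton&#39;s &bdquo;law&ldquo;&nbsp;is <b>important</b>.</p>")` returns `Newton's law is important.`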

3.4.2 Text normalization

The text normalization detects any ‘non-standard word’ (e.g. digit, date, abbreviation) in the input text and converts it to a grammatically correct ‘full-word’ form.

The determination of the grammatically correct form is one of the most challenging tasks for all inflective languages (e.g. Czech), as a single word can have many different forms depending on the syntax and the meaning of the sentence. For example, the phrase ‘2 ženy’ (2 women) is to be converted to ‘dvě ženy’ (two women) after the text normalization. However, it can have other forms depending on the context, e.g. ‘bez dvou žen’ (without two women), ‘se dvěma ženami’ (with two women), ‘ke dvěma ženám’ (towards two women).

An extensive semantic and syntactic analysis is required to assign the correct form to a word. The development of such an analysis is still ongoing; thus, an estimator (the TnT tagger [7]) is currently used to find the correct form with some probability. This very efficient statistical part-of-speech tagger was trained on a large Czech corpus annotated with morphological tags.

Two examples of the text normalization are shown in Fig. 4. The numeral ‘3’ is converted to the correct form of ‘tři’ (three), whereas the ordinal number ‘1.’ is converted to ‘první’ (first).

3.4.3 Word substitutions

In the input text, words with a non-standard pronunciation (e.g. foreign words, names, or proper nouns) may occur. These words cannot be transcribed using standard Czech phonetic transcription rules mentioned in Sect. 3.4.5; thus, they require special processing. Therefore, we used a ‘dictionary-like’ system in which a single word can be replaced with a corresponding ‘phonetic-friendly’ transcription, and this can be correctly processed during the following phonetic transcription. In Fig. 4, the proper noun ‘Newtonovy’ (Newton’s) is substituted by a Czech phonetic-friendly transcription ‘ňůtnovy’.

Support for non-standard word pronunciation was also integrated into the system’s back-end. The editor can mark a word as a ‘pronunciation exception’ and assign its proper pronunciation.
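The dictionary-like substitution can be sketched as a whole-word lookup. In the Python sketch below, the dictionary `EXCEPTIONS` holds only the single example given above, and the function name `substitute` and the case-insensitive matching are assumptions for illustration.

```python
import re

# Pronunciation exceptions: lower-cased word -> phonetic-friendly spelling.
# Only the example from the text is included here.
EXCEPTIONS = {"newtonovy": "ňůtnovy"}


def substitute(text, exceptions=EXCEPTIONS):
    """Replace whole words that have a pronunciation exception."""
    def replace_word(match):
        word = match.group(0)
        # Words without an entry are returned unchanged.
        return exceptions.get(word.lower(), word)

    # \w+ matches whole words, including Czech accented letters.
    return re.sub(r"\w+", replace_word, text)
```

Matching whole words rather than substrings keeps an exception from being applied inside a longer, unrelated word.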

3.4.4 Phrasification and prosodic description

In addition to the phonetic transcription, each input text is described in terms of prosodic symbols. In Slavic languages (as in other Indo-European languages), prosody can be viewed as supplementing the phonetic information with other linguistic aspects, such as sentence modality (e.g. declarative sentences vs. yes/no questions), emotions, styles, or general expressiveness and speaker attitude. Thus, prosody helps listeners understand the meaning of the transmitted message. Prosody also helps in the division of longer utterances into sentences, sentences into shorter phrases, and phrases into words.

3.4.5 Phonetic transcription

During the phonetic transcription, an orthographic form of the input text is converted to a phoneme sequence. This process is rule-based in our system as the conversion is almost always unambiguous in the Czech language. The pronunciation exceptions, e.g. foreign words, are handled as described in Sect. 3.4.3.

3.4.6 Phonetic filtering

After the phonetic transcription, the phoneme sequence might still contain some characters that are not supported by the speech synthesis engine. Currently, all unsupported characters are omitted.

In addition, some phonetic substitutions can also be made in this step. For instance, some phonetic nuances could be discarded, i.e. symbols representing phonetic subclasses can be replaced by symbols representing a more general phonetic class. In Fig. 4, a syllabic voiced alveolar trill [r=] is replaced by its basic non-syllabic version [r]. Similarly, unvoiced and voiced alveolar fricative trills were merged as both represent a similar phone.
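Both operations, merging subclasses and omitting unsupported symbols, can be sketched as a single pass over the phoneme sequence. In this Python illustration the inventory `SUPPORTED` is deliberately tiny and hypothetical, `MERGE` contains only the [r=] to [r] substitution mentioned above, and the function name `filter_phonemes` is an assumption.

```python
# Illustrative subset of symbols the synthesis engine accepts (assumption).
SUPPORTED = {"a", "e", "i", "p", "r", "s", "t"}

# Subclass symbol -> more general class symbol (example from the text).
MERGE = {"r=": "r"}


def filter_phonemes(phonemes):
    """Merge phonetic subclasses, then omit unsupported symbols."""
    result = []
    for symbol in phonemes:
        symbol = MERGE.get(symbol, symbol)
        if symbol in SUPPORTED:
            result.append(symbol)
        # Unsupported symbols are silently dropped.
    return result
```

Applying the merge before the support check means a subclass symbol survives as its general class instead of being dropped.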

3.5 Text-to-speech

To make the content of the website accessible for students with visual impairment, TTS technology was used. The primary task of any TTS system is to convert an arbitrary input plain text to a speech signal which should correctly reflect the content of the text. For our application, the unit-selection-based TTS system ARTIC [44] was adapted. It produces high-quality, natural-sounding speech and manages several Czech male and female voices, which are assigned to particular templates (see Sect. 3.1). For other languages, ARTIC can be replaced by another TTS system as the communication protocol is easy to adapt, e.g. we used MaryTTS^{Footnote 15} [36] and CereProc^{Footnote 16} for German and British English to support the teaching of foreign languages.

4 Evaluation methods

4.1 Participants

The participants of the study were 41 lower secondary pupils (14 girls and 27 boys) of the sixth, seventh, and eighth grades (aged 12 to 14) and three teachers of the primary school for pupils with visual impairment in Pilsen, Czech Republic. This school educates pupils from all over the Pilsen and Karlovy Vary regions. The distribution of the classified visual impairments combined with other disabilities of the pupils in the study is summarised in Table 3.

Table 3 The number of classified visual impairments of the pupils in the study (S = severe, P = profound, B = near-total) combined with one or more other disabilities (SLD = specific learning disability, ID = intellectual disability, PD = physical disability, SD = speech defects)

4.2 Materials

Twenty selected topics of mathematics and physics were used to evaluate the system. The topics partially cover the curriculum of the lower secondary school (see Table 4) and were created in the back-end of the system by the teachers of the pupils in the study. Each topic consists of an explanation of the subject matter of one lesson including examples and exercises. The topic substitutes the pupils’ notes from the school lesson and helps them with individual preparation.

These topics were selected by the teachers because pupils find them more difficult; they usually require more effort to master. The contents of each topic were selected to allow independent home preparation with an emphasis on exercises.

Table 4 Selected topics of mathematics and physics used for evaluation of the system

Table 5 Frequency of using the system

Table 6 Specific type of use

Table 7 After-school program use

Table 8 Working with the application is for me...

Table 9 Questions, responses, and comments of the teachers

4.3 Procedure

The study lasted one school year. The pupils were shown each topic in the system for at least one school hour. The prerequisites for using the system in the classroom included a digital projector and a notebook, or an interactive whiteboard, to avoid organizational complications. Thus, it was possible to enlarge the text on the screen or interactive whiteboard, which proved beneficial to the pupils with severe visual impairment. The pupils had full access to their standard teaching aids.

During the study period, lessons were delivered with minimal changes. If the lesson was covered by some topic in our system, the teacher mentioned this with a brief overview on the interactive whiteboard as the focus was on home preparation. The pupils were given the homework from the exercise part of the topic. They could check for the solution in the system and receive immediate feedback. In case there was a problem, the solution guided them in sufficiently understanding the example. For a better insight into the topic, the pupils could repeat the explanation of the subject matter. Thus, the pupils were able to work at their own pace and independently.

Generally, the pupils used the tool mainly for home repetition, supplementing misunderstood material, and practicing. During the study period, the teacher dedicated two afternoons to teaching the pupils how to operate the system.

Each use of the system in a given lesson was recorded in time-sheets, together with the positive or negative attitude expressed by the pupils. After one year of use, we qualitatively evaluated the system through questionnaires administered to the pupils and teachers. The questionnaire items were mostly scaled to allow a finer distinction of answers. When the pupils were filling in the questionnaires, an individual approach with adult assistance was adopted to ensure that they understood the questions and answered correctly.

5 Results

The inquiry, partly realized using a questionnaire and an interview, was focused on several monitored areas:

1. How and how often was the product used?
2. In what areas did the use of the product show a positive effect?
3. How was the quality of the reading voice assessed?

The results obtained by the evaluation of each of the questions above are summarized in the following subsections.

5.1 The ways and frequency of using the system

For the pupils, results were collected only from questionnaires. The frequency of using the system was collected on a 1-to-6 response scale (see Table 5). As observed, the results are distributed among all interval levels. Most answers fall in the more frequent range, i.e. at least once a week.

When asked about a specific type of usage, the pupils chose from three options (see Table 6). The result shows the importance of both the acoustic and the visual modality (64% of respondents). The most frequent response was the item, ‘I listen to a computer voice, and I follow everything on the monitor depending on need and fatigue’. Approximately 2% of the pupils ‘only listened to a computer voice’.

For the question, ‘Can you use the system after school?’, responses were obtained in the frequency shown in Table 7. For the item, ‘Are your parents familiar with the use of the system for automatic reading textbooks?’ 88% of the respondents answered ‘yes’. Asked whether people around them were interested in the system, respondents answered that the most curious were mothers (63%) and friends (43%). The interest, however, was characterized as ‘little’. On the other hand, 85% of the grandparents were not interested at all.

5.2 The effect of the system

The results of this evaluation were collected from questionnaires administered to the pupils and teachers. The responses of the pupils to the question, ‘Is it a useful tool in understanding difficult topics in mathematics or physics?’ show mostly affirmative acceptance. Specifically, 51% of the respondents answered ‘certainly yes’ and 44% answered ‘rather yes’. For the question, ‘What subject and topics were most beneficial’, 73% of all responses point to mathematics, and the most frequently named topics were ‘Fractions’ (27%), ‘Unit conversion’ (24%), and ‘Linear equations’ (24%).

Table 10 Voice of the reader texts

For the overall item ‘Working with the application is for me...’, the frequency of ‘yes’ responses is shown in Table 8. The next question compared working with the system to other educational tools, such as learning from exercise books and preparing from textbooks. The responses show a preference for our system: 55% of the respondents answered ‘definitely yes’ and 37% answered ‘somewhat agree’. However, 67% of the respondents did not favour our system over learning with friends or parents.

The second set of results comes from the questionnaires filled in by the teachers and from their discussions with the authors of this paper. The aim was to clarify whether the system has an influence on the academic achievement of the pupils in the given subjects.

For the first question, ‘Does the system help pupils in their home preparation for the subject and why?’, all teachers answered ‘definitely yes’ on a four-point scale. The second question was ‘Do the pupils achieve better results with this special teaching aid and why?’. Two answers were ‘probably yes’ from teachers of mathematics and one answer was ‘probably no’ from a teacher of physics. The responses and comments are summarized in Table 9.

5.3 Reading voice of the texts

In the next item, the voice that reads the texts and mathematical problems was evaluated. Respondents had several options to choose from. To make it simpler, the results of ‘definitely yes’ and ‘probably yes’ were merged into ‘yes’ and the ‘definitely not’ and ‘probably not’ into ‘no’. These main results alone are shown in Table 10.
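
The merging step described above can be illustrated with a minimal sketch. The mapping, the function name `merge_responses`, and the sample answers are assumptions made for illustration only; they are not the authors' processing script, and the answers are not data from the study.

```python
from collections import Counter

# Hypothetical mapping from the four response options to the merged categories.
MERGE = {
    "definitely yes": "yes",
    "probably yes": "yes",
    "probably not": "no",
    "definitely not": "no",
}


def merge_responses(responses):
    """Collapse four-level answers into 'yes'/'no' and return percentage shares."""
    counts = Counter(MERGE[r] for r in responses)
    total = sum(counts.values())
    if total == 0:
        # No answers collected: report zero shares instead of dividing by zero.
        return {"yes": 0.0, "no": 0.0}
    return {label: 100.0 * counts[label] / total for label in ("yes", "no")}


# Made-up answers for illustration only; not data from the study.
example = ["definitely yes", "probably yes", "probably not", "definitely yes"]
shares = merge_responses(example)
print(shares)
```

Merging in this way trades detail for readability: the strength of agreement is lost, but the direction of each answer is kept.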

6 Discussion

The main purpose of the proposed web-based system is to help pupils prepare for lessons after school. Eighty-eight percent of the pupils could access the educational material online through an internet connection. During the pupils’ home preparation, the parents were interested in the system, especially the mothers. For the question, ‘How often and how was the product used?’, the responses were distributed among all interval levels, and 53% of the answers fell in the more frequent range, that is, at least once a week.

While evaluating the synthetic voices (conventional TTS system for Czech), intelligibility and pleasantness were praised by the pupils, but the voice tended to sound less natural and rather monotonous. The results are especially valuable for further technical adjustments in which it would be appropriate to improve the naturalness (and remove the monotony) of a synthetic voice employed to read technical documents.

The system was assessed mostly positively: more than 85% of the pupils reported a positive effect of the system for the criteria ‘simplification of preparation for school’ and ‘clarification of subject matter’. This is consistent with earlier results comparing mathematical TTS software with printed text [1], in which pre- and post-tests of secondary students with visual impairments showed increasing accuracy. In our study, 95% of the responses were positive for the criterion of understanding a topic that the pupil found difficult.

The flexibility of the application allowed the pupils to work individually according to their state of vision and to the current situation as affected by fatigue (indicated by 64% of the pupils). For pupils with severe visual impairment, the results indicate the importance of the visual modality (e.g. a graphically rendered mathematical formula) combined with a synchronously rendered voice. In this study, a total of 30 of 41 pupils were classified with severe visual impairment. These findings are consistent with the results from [21], whose authors warn against relying on listening to digital text alone.

Another factor affecting the acceptance of the system is the age of the pupils and the complexity of the subject matter. In a study of older high-school students with visual impairment in an Algebra 1 course [6], the authors emphasised a preference for classical text materials and a general resistance to new technology. In contrast, a different study [17] of junior high school students with visual impairment who were trained on their system reported an effective improvement in mathematics. The pupils in our study preferred working with the system to working with paper text, books, and textbooks. This finding can be explained by the higher ‘didactic friendliness’ of the system, which may result from (1) a continual introduction of the system to the pupils in the classroom, (2) a pre-algebra course containing elementary mathematics, and (3) the better attitude of young students to electronic texts.

From the teachers’ perspective, the pupils’ responses to the system were monitored continuously throughout the evaluation year. According to the teachers’ responses and comments, the system fulfilled its main purposes: to help pupils in individual preparation, to repeat the difficult curriculum, and to substitute the notes from the lesson. An improvement in the pupils’ proficiency was indicated in mathematics, and all teachers agreed that the system improved the pupils’ access to the curriculum and that the pupils perceived it positively. In addition, the teachers appreciated the possibility of explaining the subject matter in another way using the system.

7 Conclusion

We present a new web-based system specially developed to facilitate access to educational materials by automatic reading, for Czech pupils with visual impairment. The system enables teachers to prepare and process arbitrary topics focusing on technical documents that contain mathematics and physics formulas (at the lower secondary school level). The system converts the content automatically to speech, and the implemented solution provides a method for reading formulas in various mathematical contexts and correct grammatical forms that are very important for inflective languages, such as Czech.
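
To illustrate why grammatical forms matter when reading formulas in an inflective language, the following minimal sketch reads simple fractions in Czech. It is an illustration under stated assumptions, not the authors' conversion method: the function `read_fraction` and both lookup tables are hypothetical, and only numerators 1 to 10 and denominators 2 to 5 are covered. The denominator noun takes the nominative singular after 1, the nominative plural after 2 to 4, and the genitive plural from 5 upwards.

```python
# Feminine cardinal numerals (nominative) for numerators 1..10.
NUMERALS = {
    1: "jedna", 2: "dvě", 3: "tři", 4: "čtyři", 5: "pět",
    6: "šest", 7: "sedm", 8: "osm", 9: "devět", 10: "deset",
}

# Denominator nouns as (nominative singular, nominative plural, genitive plural).
DENOMINATORS = {
    2: ("polovina", "poloviny", "polovin"),
    3: ("třetina", "třetiny", "třetin"),
    4: ("čtvrtina", "čtvrtiny", "čtvrtin"),
    5: ("pětina", "pětiny", "pětin"),
}


def read_fraction(numerator, denominator):
    """Return a spoken Czech form of numerator/denominator (limited coverage)."""
    if numerator not in NUMERALS or denominator not in DENOMINATORS:
        raise ValueError("fraction outside the range covered by this sketch")
    singular, plural, genitive_plural = DENOMINATORS[denominator]
    if numerator == 1:
        noun = singular          # 1 -> nominative singular
    elif numerator <= 4:
        noun = plural            # 2..4 -> nominative plural
    else:
        noun = genitive_plural   # 5 and above -> genitive plural
    return f"{NUMERALS[numerator]} {noun}"


print(read_fraction(1, 2))  # jedna polovina
print(read_fraction(2, 3))  # dvě třetiny
print(read_fraction(5, 4))  # pět čtvrtin
```

A real system also has to inflect the whole phrase according to its role in the sentence, which this sketch does not attempt.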

In general, the system consists of a client side and a server side. The client side is composed of two types of interfaces (front-end and back-end). The front-end is a public interface that gives the users (pupils) full access to the services via graphics and voice. Because pre-synthesized text is kept in a cache on the server side, the selected documents are read immediately and synchronized with graphic highlighting. The back-end is an administrative interface where the documents are created and modified. The server side of the system is modular, implements several web services, and provides automatic processing for the client side.
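
The caching of synthesized speech can be sketched as follows. This is a minimal in-memory illustration, assuming a cache keyed by the text and the selected voice; the class `SynthesisCache`, the stand-in function `fake_tts`, and the voice names are hypothetical and do not describe the system's actual server code.

```python
import hashlib


class SynthesisCache:
    """In-memory cache of synthesized audio keyed by (voice, text)."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # callable(text, voice) -> bytes
        self._store = {}
        self.misses = 0  # number of times the synthesizer was actually called

    @staticmethod
    def _key(text, voice):
        # Hash voice and text together so each voice has its own cache entry.
        payload = f"{voice}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_audio(self, text, voice):
        key = self._key(text, voice)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._synthesize(text, voice)
        return self._store[key]


def fake_tts(text, voice):
    # Stand-in for a real TTS engine; returns placeholder bytes.
    return f"<{voice}:{text}>".encode("utf-8")


cache = SynthesisCache(fake_tts)
first = cache.get_audio("x + 1 = 2", "voice_a")
second = cache.get_audio("x + 1 = 2", "voice_a")  # served from the cache
other = cache.get_audio("x + 1 = 2", "voice_b")   # different voice, new entry
print(cache.misses)
```

With such a scheme, a document that has already been synthesized can be played back without waiting for the synthesizer, at the cost of storing one audio entry per text and voice.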

The system was experimentally evaluated by 41 pupils and three teachers of a school for pupils with visual impairments. The responses indicate the positive contribution of the system, especially for the difficult topics, and the pupils preferred the system over paper textbooks. The most frequent usage of the system is in a multi-modal form combining auditory perception with visual perception.

Notes

Learning, visual, or physical disability prevents gaining information from printed material in the standard way.
Categories (1) and (2) are also termed low vision; in the USA, the visually impaired in categories (3) to (6) are considered legally blind.
Encoding of mathematical and scientific formulas linearly in the row.
Available at http://ucebnice.zcu.cz/.
Simple HTML-source code including the mathematics formulas as graphics with alternative text.
The new version of the system provides an extension for the special needs of subjects such as chemistry or grammar. The system is a result of two European Social Fund (ESF) projects: SAMOČET CZ.1.07/1.2.31/02.0019.
http://www.w3.org/TR/MathML3.
http://www.dessci.com/en/products/mathtype.
http://www.firemath.info.
http://www.w3.org/Amaya/.
http://www.bluegriffon.com/.
http://tug.org/applications/tex4ht/mn.html.
www.symfony-project.org.
http://www.wiris.com/en/editor.
http://mary.dfki.de.
https://www.cereproc.com.

References

Alajarmeh N, Pontelli E (2012) A non-visual electronic workspace for learning algebra. In: Miesenberger K, Karshmer A, Penaz P, Zagler W (eds) Computers helping people with special needs. Springer, Berlin, pp 158–165
Alper S, Raharinirina S (2006) Assistive technology for individuals with disabilities: a review and synthesis of the literature. J Spec Educ Technol 21(1):47–56
Argyropoulos V, Paveli A, Nikolaraizi M (2018) The role of daisy digital talking books in the education of individuals with blindness: a pilot study. Educ Inf Technol 24:693–709
Bencharef O (2018) An assistive technology for braille users to support mathematical learning: a semantic retrieval system. Symmetry. https://doi.org/10.3390/sym10110547
Bouck EC, Meyer NK (2012) eText, mathematics, and students with visual impairments: "What teachers need to know". Teach Except Child 45(2):42–49
Bouck EC, Weng PL, Satsangi R (2016) Digital versus traditional: secondary students with visual impairments’ perceptions of a digital algebra textbook. J Vis Impair Blind 110(1):41–52
Brants T (2000) TnT: a statistical part-of-speech tagger. In: Proceedings of the 6th conference on applied natural language processing (ANLC’00), Seattle, Washington, pp 224–231
Cowan RE, Fregly BJ, Boninger ML, Chan L, Rodgers MM, Reinkensmeyer DJ (2012) Recent trends in assistive technology for mobility. J NeuroEng Rehab 9(20):20
Crombie D, Lenoir R, McKenzie N, Barker A (2004) math2braille: Opening access to mathematics. In: Computers helping people with special needs, vol 3118, lecture notes in computer science, Springer, Berlin, pp 670–677
Dawe M (2006) Desperately seeking simplicity: How young adults with cognitive disabilities and their families adopt assistive technologies. In: Proceedings of the SIGCHI conference on human factors in computing systems (CHI’06) Montreal, Canada, pp 1143–1152
Edwards ADN, McCartney H, Fogarolo F (2006) Lambda: a multimodal approach to making mathematics accessible to blind students. In: Proceedings of the 8th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2006), Portland, Oregon, pp 48–54
Ferreira HF (2011) AudioMath: speaking mathematics with MathML. In: Second European workshop on MathML and scientific e-contents, Kuopio, Finland, pp 55–62
Foster KR (2001) MathType 5 with MathML for the WWW. IEEE Spectr 38(12):64
Grůber M, Matoušek J, Hanzlíček Z, Krňoul Z, Zajíc Z (2016) ARET – automatic reading of educational texts for visually impaired students. In: Interspeech, pp 383–384
Hersh MA, Johnson MA (eds) (2003) Assistive technology for the hearing-impaired, deaf and deafblind. Springer, London
Hersh MA, Johnson MA (eds) (2008) Assistive technology for visually impaired and blind people. Springer, London
Huang PH, Chiu MC, Hwang SL, Wang JL (2015) Investigating e-learning accessibility for visually-impaired students: an experimental study. Int J Eng Educ 21(1):495–504
Isaacson M, Srinivasan S, Lloyd LL (2010) Development of an algorithm for improving quality and information processing capacity of MathSpeak synthetic speech renderings. Disabil Rehabilit Assist Technol 5(2):83–93
Jayant C (2006) A survey of math accessibility for blind persons and an investigation on text/math separation. In: Technical report, University of Washington, Seattle, Washington
Karshmer A, et al. (2004) UMA: a system for universal mathematics accessibility. In: Proceedings of the 6th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2004), Atlanta, pp 55–62
Klingenberg OG, Holkesvik AH, Augestad LB (2020) Digital learning in mathematics for students with severe visual impairment: a systematic review. Br J Vis Impair 38(1):38–57. https://doi.org/10.1177/0264619619876975
Leas D, Persoon E, Soiffer N, Zacherle M (2008) Daisy 3: a standard for accessible multimedia books. IEEE Multimed 15(4):28–37
Lewis P, Noble S, Soiffer N (2010) Using accessible math textbooks with students who have learning disabilities. In: Proceedings of the 12th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2010), ACM, Orlando, pp 139–146
Lewis RB (1998) Assistive technology and learning disabilities: today’s realities and tomorrow’s promises. J Learn Disabi 31(1):16–26
Lopresti EF, Mihailidis A, Kirsch N (2004) Assistive technology for cognitive rehabilitation: State of the art. Neuropsychol Rehabil 14(1–2):5–39
Matoušek J et al (2011) Web-based system for automatic reading of technical documents for vision impaired students. In: Text, speech and dialogue, vol 6836. Lecture notes in artificial intelligence. Springer, Berlin, pp 364–371
McCracken RE, Nemeth A, Roberts H (1972) The Nemeth Braille code for mathematics and science notation 1972 revision. American Printing House for the Blind, Louisville
Miner R (2005) The importance of MathML to mathematics communication. Not AMS 52(5):532–538
Moskovitch Y, Walker BN (2010) Evaluating text descriptions of mathematical graphs. In: Proceedings of the 12th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2010), Orlando, Florida, pp 259–260
Potencier F (2009) The symfony reference guide. Sensio SA
Raman T (1994) Audio system for technical readings. Ph.D. thesis, Cornell University
Ramloll R. et al. (2000) Constructing sonified haptic line graphs for the blind student: first steps. In: Proceedings of the 4th international ACM conference on assistive technologies (ASSETS 2000), Arlington, Virginia, pp 17–25
Sandhu P (2009) The MathML Handbook. Charles River Media
Scherer MJ (2004) Connecting to learn: educational and assistive technology for people with disabilities. American Psychological Association, Washington
Scherer MJ, Craddock G (2002) Matching person & technology (MPT) assessment process. Technol Disabil 14(3):125–131
Schröder M, Charfuelan M, Pammi S, Steiner I (2011) Open source voice creation toolkit for the MARY TTS platform. In: Proceedings of the 12th annual conference of the international speech communication association (Interspeech 2011), Florence, pp 3253–3256
Sears A, Young M (2002) Physical disabilities and computing technologies: an analysis of impairments. In: Jacko JA, Sears A (eds) The human–computer interaction handbook: fundamentals, evolving technologies and emerging applications. Lawrence Erlbaum Associates, New Jersey, pp 482–503
Soiffer N (2007) MathPlayer v2.1: web-based math accessibility. In: Proceedings of the 9th international ACM SIGACCESS conference on computers and accessibility (ASSETS 2007), Kuopio, pp 257–258
Stevens R, Edwards A (1994) Mathtalk: the design of an interface for reading algebra using speech. In: Computers for handicapped persons, vol 860. Lecture notes in computer science. Springer, Berlin, pp 313–320
Stevens R, Edwards A (1994) Mathtalk: usable access to mathematics. Inf Technol Disabil J 1(4)
Stoeger B, Batusic M, Miesenberger K, Haindl P (2006) Supporting blind students in navigation and manipulation of mathematical expressions: Basic requirements and strategies. In: Computers helping people with special needs, vol 4061. Lecture notes in computer science. Springer, Berlin, pp 1235–1242
Taylor P (2009) Text-to-speech synthesis. Cambridge University Press, Cambridge
Teiresiás-MUNI-Brno: Draft of the Czech 8 dot braille code standard. http://www.teiresias.muni.cz/czbraille8 (2008). Accessed on 09 Jan 2015
Tihelka D et al (2018) Current state of text-to-speech system ARTIC: a decade of research on the field of speech technologies. In: Text, speech and dialogue, vol 11107. Lecture notes in computer science. Springer, Berlin, pp 369–378
Wells J (1997) SAMPA computer readable phonetic alphabet. In: Gibbon D, Moore R, Winski R (eds) Handbook of standards and resources for spoken language systems. Mouton de Gruyter, Berlin
World Health Organization: International statistical classification of diseases and related health problems 10th revision (2003)
Zelinka J, Kanis J, Müller L (2005) Automatic transcription of numerals in inflectional languages. In: Text, speech and dialogue, vol 3658. Lecture notes in artificial intelligence. Springer, Berlin, pp 326–333

Acknowledgements

This research was supported by the European Social Fund and the State Budget of the Czech Republic project No. CZ.1.07/1.2.00/08.0021, and by the Ministry of Education, Youth and Sports of the Czech Republic project No. LO1506. We thank the Primary school for pupils with visual impairment in Pilsen, Czech Republic, for help with the implementation and evaluation of the system. The WIRIS editor was incorporated into the system courtesy of “Maths for More”, a mathematical software company based in Barcelona, Spain.

Author information

Authors and Affiliations

NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitní 8, 306 14, Pilsen, Czech Republic
Michal Campr, Zbyněk Zajíc, Zdeněk Hanzlíček & Martin Grůber
Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Univerzitní 8, 306 14, Pilsen, Czech Republic
Jindřich Matoušek & Zdeněk Krňoul
Department of Pedagogy, Faculty of Education, University of West Bohemia, Sedlákova 38, 306 14, Pilsen, Czech Republic
Marie Kocurová

Corresponding author

Correspondence to Zdeněk Krňoul.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Matoušek, J., Krňoul, Z., Campr, M. et al. Speech and web-based technology to enhance education for pupils with visual impairment. J Multimodal User Interfaces 14, 219–230 (2020). https://doi.org/10.1007/s12193-020-00323-1

Received: 30 April 2019
Accepted: 27 March 2020
Published: 19 April 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s12193-020-00323-1
