1 Introduction

Modern computers are beneficial for people with many types of impairments or disabilities. Various assistive technologies exist [2, 34, 35], e.g. for people with cognitive disabilities [10, 25], mobility impairment [8], or hearing loss [15]. For individuals diagnosed with a print disability [16, 24, 37], the transition from textbooks on audio tape to electronic texts read by computers is considered one of the most effective changes of the last decade [3, 23]. However, this does not always hold for the automatic conversion of mathematics to voice by computers. The creation and use of accessible materials is often discussed; however, the problem remains unsolved in several schools. Teachers cope daily with issues arising from teaching pupils with combined disabilities (e.g. visual impairment and dyslexia or speech defects). The challenge is greater in science subjects, such as mathematics and physics, for which accessible materials in a given language are noticeably lacking. It is also very challenging for computers to automatically read technical texts, such as mathematical formulas. Thus, it is difficult to develop a standard interface for the creation and presentation of science educational materials for pupils with visual impairments.

Visual impairment is determined primarily by measuring visual acuity (VA). Other diagnosable characteristics are the range of the visual field, depth perception, contrast sensitivity, colour discrimination, accommodation, adaptation, ophthalmogyric activity, and the ability to localize a subject and follow it in motion. Based on the ophthalmological examination, a person's visual impairment is classified into one of the categories specified by the World Health Organization: (1) mild vision loss (near-normal vision), (2) moderate low vision, (3) severe low vision, (4) profound low vision, (5) near-total blindness, and (6) total blindness (no light perception) [46].

Pupils with visual impairment educated in lower secondary schools in the Czech Republic span a wide range of these categories (3–6). The pupils are taught together in classrooms of about ten pupils. During mathematics lessons, the pupils routinely use low vision aids such as video magnifiers, large displays with enlarged fonts, or Braille displays. Only some pupils classified as totally blind learn and use Braille or the Nemeth Braille code. An alternative for these pupils is specialized computer hardware and software, such as screen readers, that enable them to read e-text using a computer [4].

Today, people with visual impairment already use standard screen readers based on text-to-speech (TTS) technology (see [42]), which requires localization for a given language. A screen reader is a computer application designed to automatically ‘read’ text on a screen without human intervention. However, ‘non-textual’ content, such as mathematical formulas, images, and graphical schemes, is skipped or garbled by standard screen readers, which lack the text pre-processing modules needed to handle it. On the other hand, there are also offline digital textbooks and materials that correctly combine written texts and images with a human voice (also known as hybrid books). The creation of such textbooks for the learning of mathematics is a long and laborious process [3, 5]; for example, a human voice must be recorded along with the corresponding textual content.

1.1 Tasks in automatic formula reading

Providing an appropriate and flexible interface for preparing new educational materials for daily use, made available through the automatic reading of mathematical formulas, involves two essential steps.

The first step involves the correct transcription of the symbolic notation of formulas and its subsequent decoding into grammatically correct word forms. One design issue is providing an adequate formula editor that is accessible to teachers. The authors (teachers) usually lack the skills to typeset mathematical equations; they are not web programmers, and they do not know how to generate well-structured web pages with mathematical formalisms.

The second important step involves the application of TTS. Current TTS systems are optimized on the corpora utilized during system design. As the spoken formulas are usually not included in these corpora, problems with synthetic speech quality can arise when conventional TTS systems are employed to read technical documents. Thus, the intelligibility and naturalness of the generated voice can be questionable.

1.2 Supporting pupils with visual impairment in the learning of mathematics

In the context of mathematics education, Stoeger [41] defines four main problems: (1) access to the mathematical literature (books, teaching materials, articles, etc.); (2) preparation of teaching materials (school exercises, notes); (3) navigation in mathematical formulas, and (4) the process of learning mathematics (calculations, formal manipulation of expressions, problem-solving).

This paper contributes to all four problems mentioned above. We propose a new web-based system that provides the following: (1) easy access for pupils with visual impairment; (2) an interface for preparing educational texts including mathematical formulas; (3) navigation through the prepared audio-visual web content, and (4) a qualitative evaluation of the system with selected topics in mathematics and physics.

The system is designed for the teachers to prepare, manage, and administrate new educational materials using the online back-end interface. The teacher can convert, check, and immediately publish the prepared topic (lesson) to the web page with voice automatically supplemented by an integrated TTS (currently in Czech, English, and German). The educational material is available online to the pupils via the front-end interface of the website in both voice and visual modes.

The web page generated by the system is accessible via standard screen readers. The educational material in this audio-visual form can also be integrated into other educational interfaces.

In general, the range of subjects that can be prepared with the system is unlimited. However, the novelty of the proposed system is its capability to decode mathematical formulas. Therefore, the system was qualitatively evaluated using mathematics and physics topics for pupils with visual impairment.

2 Current technologies and automatic reading of mathematical text

Several technologies and standards have been developed to improve the availability of textbooks for pupils with a print disability. Nevertheless, very few technologies currently provide a framework to make mathematical educational materials accessible in the required form (school textbooks, training notes, exercises). Mathematical equations are presented in several formats and codes, such as Mathematical Markup Language (MathML), NIMAS, DAISY, and Nemeth [22, 27, 28].

MathML is an XML-based format for describing mathematical formalism on the web [33]. However, current screen readers are unable to read MathML properly, as they read the content alone and ignore the structure.

NIMAS is the file format for developing printed textbooks. By default, math content is provided in NIMAS file sets as images.

DAISY is another standard for producing accessible and navigable multimedia documents in the form of a synchronized audio/text-book. In addition, the MathDaisy add-in converts the equations to MathML and saves the document as a DAISY digital talking book (DTB). DTBs are used in eBook readers and the DAISY DTBs are read by players such as gh-PLAYER [18].

AudioMath developed at Porto University is another mathematics reader that uses MathML [12].

The Nemeth Braille code is used for the linear coding of mathematical and scientific notations using standard 6-point Braille cells, and it is unsuitable for automatic reading.

The problem of reading mathematics directly from a TeX source document was addressed, for instance, in the audio system for technical readings (AsTeR) [31]. For the world wide web, MathPlayer [38] provides a plugin to the web browser which displays, highlights, and reads mathematical expressions on the website. Accessibility is feasible here by right-clicking on an equation and choosing a ‘to render aurally’ command or using a screen reader that reads the entire web page and invokes MathPlayer to speak the mathematical formula in a structured (and sometimes customized) way. It should be noted that mathematical formalisms are often articulated in an ambiguous form.

The current technology falls short because it does not incorporate appropriate rules on how to describe mathematical expressions due to their naturally nonlinear structure. Unifying the translation process was also one of the aims of the MathSpeak project [18]. Furthermore, a prototype of the MathML reader called MathGenie was designed to provide an unambiguous verbal presentation of nontrivial mathematical formulas [20].

MathTalk is another assistive technology based on MathML, and it helps users with visual impairments create mathematical formulas by voice commands, display of information on the screen, and conversion of the information into braille [39, 40]. For the Czech language, the Lambda math-editor, developed at the University of York, was accommodated at Masaryk University [43] as a support system for editing mathematical formulas by blind users (creators) using Braille and audio output. Lambda also provides a compact and linear 8-dot Braille math-code [11]. Converting MathML to Braille is also possible using Math2Braille [9].

For the creation and subsequent reading of mathematical formulas, it is possible to use TeX or MathML code directly or to use an editor such as MathType [13]. There are more general editors and converters of varying quality and different functionalities. Some applications are extensions or add-ons of internet browsers, e.g. Firemath, Amaya, Bluegriffon, and Tex4ht. Expressions can also be graphically created using a word processor. The formulas are often stored as graphic image files in formats such as PNG or SVG, with or without alternative text. Some applications can convert expressions into speech, but they are mostly for the English language and of limited functionality.

Besides the conversion of text to speech, there are further specific requirements for applications that process and read mathematical expressions for pupils with visual impairment. These requirements include font size, text colour, and background colour. The language as well as volume and speed control also play a vital role in the TTS conversion. Other technologies for creating accessible text for mathematics use images that are supported or annotated in ways that are more accessible to people with visual impairment, for example, through haptic feedback or a verbal description of important images such as tables and graphs. There are several specialized software tools to provide accessible images, such as creating tactile representations of graphs [19], using MathTrax, automatically generating image descriptions [29], or creating a unified version of the figure or image in real time [32].

The required feature of the tools is adequate support for the learning of mathematics. Nevertheless, in many cases, it is up to the teacher to modify classical mathematical learning materials into an accessible format, which can be time-consuming and financially demanding [5]. These tools also require special skills to typeset mathematical formulas or web programming. Our proposed system provides an accessible editor for the teacher, from which a readable format is automatically generated. To the best of our knowledge, such a complex system for a mathematical text is unavailable.

3 Overview of proposed system

The developed system is a web application based on a client-server architecture; it runs on the Apache HTTP server with a MySQL database [26]. The core of the system is built on Symfony 1.4 [30], an open-source web application framework. The client side of the system consists of two parts: front-end and back-end. The front-end serves as a public interface for selecting, displaying, and reading documents arranged in topics. The back-end is an administrative interface, where the documents are created and modified. The server side of the system provides text pre-processing and TTS synthesis. A schematic of the system can be seen in Fig. 1, and a detailed description of the system can be found in [14].

Fig. 1 The scheme of the client-server architecture of the system

3.1 Client-side

The topic administrator, e.g. a teacher, has direct access to the document through the back-end, where he or she can create and modify documents using the incorporated WYSIWYG text editor (TinyMCE). A screenshot of the back-end is shown in Fig. 2.

Fig. 2 Back-end interface: a mathematical exercise with various types of formulas and the WIRIS editor

To unify the visual style of the content, the topic administrator can use templates to clarify the meaning of particular document fragments. Currently, five templates are supported in the system, namely Definition, Important, Note, Example, and Solution. For example, the Important template highlights crucial information to which the students should pay more attention. Different synthetic voices can be assigned to each template. Changing the voice while listening to an entire document improves attention compared to listening in a single voice.

The system supports two ways of inserting or editing mathematical formulas in the document. A simple formula with a linear structure, e.g. \(y=x+1\), can be written and stored as plain text (a so-called ‘inline formula’). For more complex mathematical expressions, the graphical editor WIRIS is incorporated. This editor provides the MathML representation of the formula, which is used to derive its word-level transcription (see Sect. 3.3).

All documents are available through the public web interface (front-end) (see screenshot in Fig. 3). The document is read continuously from the beginning to the end. This process can be automatically interrupted at predefined points in the document.

Fig. 3 Front-end interface with an example of the document. The part currently being read is highlighted in yellow. The navigation panel is on the right

While the document is being read, the students can use a graphical navigation panel with six control buttons: right arrow to play, square to stop, double arrow to rewind (next/previous sentence or formula), and triple arrow to quickly navigate to the next/previous chapter. Furthermore, the buttons are contrastingly coloured. The students can also use keyboard shortcuts or jump into any point of the document by clicking anywhere in the text.

3.2 Server-side

Before displaying and reading the document, the HTML source code is automatically processed. First, the parts of the text, including the templates and the formulas, are extracted and normalized (see Sect. 3.3). Subsequently, the texts are sent to the Web TTS server which is responsible for the conversion of texts to audio (see Sect. 3.5). All audio files are stored in a cache to avoid re-synthesizing already synthesized texts.
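The caching step described above can be illustrated with a minimal sketch. It assumes a file-based cache keyed by a hash of the voice and the normalized text; the names `cache_key`, `get_audio`, and the `synthesize` callback are hypothetical stand-ins for the call to the Web TTS server and are not taken from the described system.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(text: str, voice: str) -> str:
    """Stable key: the same text read by the same voice maps to the same file."""
    return hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()


def get_audio(text: str, voice: str, cache_dir: Path,
              synthesize: Callable[[str, str], bytes]) -> bytes:
    """Return audio for (text, voice); call the TTS back-end only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(text, voice)}.wav"
    if path.exists():
        return path.read_bytes()          # already synthesized earlier
    audio = synthesize(text, voice)       # hypothetical request to the TTS server
    path.write_bytes(audio)
    return audio
```

Including the voice in the key matters here because different synthetic voices can be assigned to different templates, so the same sentence may legitimately exist in several audio versions.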

3.3 New technique for reading mathematics

The developed system should handle documents containing numerous mathematical expressions such as formulas, notations, and symbols. Generally, reading formulas is a highly complicated task, especially if there is no limitation in the complexity of the equation structure. Moreover, Czech is an inflective language; thus, all operands in the formula should be converted into the correct grammatical form (which can differ in various mathematical contexts).

3.3.1 Automatic conversion of ‘inline formulas’

Formulas with a simple linear structure can be represented by a text string (‘inline formulas’) which is usually a sequence of operators and operands read in the order they are written in. All operands in the formula are inflected into the correct grammatical form determined by the previous operator. We define a transcription rule for each operator, which contains a transcription of the operator and grammatical form for the following operand (case, number, gender, and cardinal/ordinal form). For the inflexion of operands, the method described in [47] was utilized.
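As a rough illustration of this operator-driven inflexion, the following sketch reads a linear formula left to right and lets each operator dictate the grammatical case of the operand that follows it. The rule table and the two-entry lexicon are assumptions made for the example (they are not the rules of the described system), and only the grammatical case is modelled, not number, gender, or ordinal form.

```python
import re

# Hypothetical rule table: operator -> (spoken form, case required of the NEXT operand).
RULES = {
    "+": ("plus", "nom"),
    "-": ("minus", "nom"),
    "/": ("děleno", "ins"),      # 'divided by' governs the instrumental case
    "=": ("rovná se", "nom"),
}

# Toy inflexion lexicon; a real system would use a morphological generator.
LEXICON = {
    "2": {"nom": "dvě", "ins": "dvěma"},
    "3": {"nom": "tři", "ins": "třemi"},
}


def transcribe_inline(formula: str) -> str:
    """Read operators and operands in written order, inflecting each operand."""
    tokens = re.findall(r"\d+|[A-Za-z]+|[+\-/=]", formula)
    words, case = [], "nom"
    for tok in tokens:
        if tok in RULES:
            spoken, case = RULES[tok]     # remember the case for the next operand
            words.append(spoken)
        else:
            forms = LEXICON.get(tok)
            words.append(forms[case] if forms else tok)  # variables pass through
            case = "nom"
    return " ".join(words)


print(transcribe_inline("x = 3 / 2"))
```

The same numeral ‘2’ is rendered as ‘dvě’ after ‘plus’ but as ‘dvěma’ after ‘děleno’, which is the effect the transcription rules are meant to achieve.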

The current version of the system supports only the basic operators and operand types in the text representation. These include addition, subtraction, multiplication, division, brackets, superscript (power), subscript, numbers, variables and physical units. An example is presented in Table 1. The formulas having other operators or a more complex structure are represented using MathML.

Table 1 An example of inline formula transcription (in Czech). For cardinal form, the operand determines the grammatical number and gender

3.3.2 Automatic conversion of formulas represented by MathML

MathML is an XML application for describing mathematical notation by capturing both the structure and content of the formula. It can represent mathematical formulas of almost any structure and complexity. Moreover, the standard notation can be easily extended with new elements. For example, we defined a new type of operand for labelling physical units.

The transcription of formulas represented by MathML can be divided into several steps:

  • Decomposition of a MathML code,

  • Selection of suitable transcription rules for the operators, and

  • Transcription of the operator and inflexion of the related operands.

For each mathematical operation, several transcription rules can be defined. The rules differ in their activation conditions (e.g. mathematical context, various values, or types of operands). For most operators, we consider one basic rule and several additional rules for exceptional cases.

The transcription rule consists of a text template defining the constant part of the final transcription, a type of the resulting expression describing the relation to a higher level of the formula, and a corresponding grammatical form for each operand. An example of a formula with its MathML representation and transcription is shown in Table 2.
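The three steps above (decomposition, rule selection, transcription) can be sketched as a recursive walk over the MathML tree. This is a simplified illustration in English under stated assumptions: the rule table, the activation conditions, and the templates are hypothetical, and the grammatical form of operands and the type of the resulting expression, which the described rules also carry, are left out.

```python
import xml.etree.ElementTree as ET


def local(node):
    """Tag name without an XML namespace prefix such as '{http://...}msup'."""
    return node.tag.split("}")[-1]


def exponent_is(value):
    """Activation condition: the second operand is the literal number `value`."""
    return lambda kids: local(kids[1]) == "mn" and (kids[1].text or "").strip() == value


def always(kids):
    return True


# Hypothetical rules: exceptional cases first, the basic rule last.
RULES = {
    "msup": [(exponent_is("2"), "{0} squared"),
             (always, "{0} to the power of {1}")],
    "mfrac": [(always, "fraction {0} over {1} end fraction")],
}
OPERATOR_WORDS = {"+": "plus", "-": "minus", "=": "equals"}


def speak(node):
    tag, kids = local(node), list(node)
    if tag in ("mi", "mn", "mo"):                 # leaves: identifiers, numbers, operators
        text = (node.text or "").strip()
        return OPERATOR_WORDS.get(text, text) if tag == "mo" else text
    if tag in ("math", "mrow"):                   # grouping elements: read children in order
        return " ".join(speak(k) for k in kids)
    if tag in RULES:
        operands = [speak(k) for k in kids]       # decomposition into sub-expressions
        for condition, template in RULES[tag]:    # first rule whose condition holds
            if condition(kids):
                return template.format(*operands)
    raise ValueError(f"no transcription rule for <{tag}>")


def transcribe(mathml: str) -> str:
    return speak(ET.fromstring(mathml))
```

For instance, `transcribe("<math><msup><mi>x</mi><mn>2</mn></msup></math>")` selects the exceptional rule and yields "x squared", whereas an exponent of 3 falls back to the basic rule.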

Example 1

An illustrative example of transcription rules for two operators in YAML notation—power and fraction:

[Listing a: YAML transcription rules for the power and fraction operators; not reproduced here]

Table 2 Transcription of a mathematical formula represented by MathML (in Czech)

3.4 Final text processing

After the conversion of mathematical formulas to a text, an analysis and processing of the remaining document content is the next important step preceding the speech generation. This process can be divided into several actions shown in Fig. 4.

Fig. 4 A block diagram of the text processing. The English translation of the Czech sample text is: There are 3 Newton’s physical laws, the 1. law is called “The law of motion”. The phonetic strings are written in SAMPA notation [45]

3.4.1 Text filtering

The texts entering the pre-processing are parsed from an HTML-formatted source and may contain some unwanted ‘garbage’ characters, e.g. HTML tags, entity characters, quotation marks, etc. These characters must be removed or replaced before further processing.
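A minimal sketch of such a filter is given below. It assumes that removing tags with a regular expression is sufficient for the editor's output and that double quotation marks are simply dropped; the function name `filter_text` is hypothetical, and a production system would more likely rely on a proper HTML parser.

```python
import html
import re


def filter_text(fragment: str) -> str:
    """Strip markup and 'garbage' characters from an HTML fragment."""
    text = re.sub(r"<[^>]+>", " ", fragment)     # drop HTML tags
    text = html.unescape(text)                    # &nbsp;, &quot;, ... -> characters
    text = text.replace("\u00a0", " ")            # non-breaking space -> plain space
    text = re.sub(r'["“”„«»]', "", text)          # remove double quotation marks
    return re.sub(r"\s+", " ", text).strip()      # collapse runs of whitespace


print(filter_text("<p>The&nbsp;1. law is called &quot;The law of motion&quot;.</p>"))
```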

3.4.2 Text normalization

The text normalization detects any ‘non-standard word’ (e.g. digit, date, abbreviation) in the input text and converts it to a grammatically correct ‘full-word’ form.

Determining the grammatically correct form is one of the most challenging tasks for all inflective languages (e.g. Czech), as a single word can have many different forms depending on the syntax and the meaning of the sentence. For example, the phrase ‘2 ženy’ (2 women) is to be converted to ‘dvě ženy’ (two women) after the text normalization. However, it can have other forms depending on the context, e.g. ‘bez dvou žen’ (without two women), ‘se dvěma ženami’ (with two women), ‘ke dvěma ženám’ (towards two women).

An extensive semantic and syntactic analysis is required to assign the correct form to a word. The development of such an analysis is still ongoing; thus, an estimator (the TnT tagger [7]) is currently used to find the correct form with some probability. This efficient statistical part-of-speech tagger was trained on a large Czech corpus annotated with morphological tags.

Two examples of the text normalization are shown in Fig. 4. The numeral ‘3’ is converted to the correct form of ‘tři’ (three), whereas the ordinal number ‘1.’ is converted to ‘první’ (first).
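The distinction between the two numerals can be sketched as follows. This is a deliberately simplified stand-in: it uses fixed lookup tables instead of the tagger-based selection of the grammatical form described above, and it assumes that a number followed by a period and a lowercase word is an ordinal. The tables and the function name `normalize` are hypothetical.

```python
# Toy lookup tables (nominative forms only); the described system instead
# selects the grammatical form with a statistical tagger.
CARDINALS = {"3": "tři"}
ORDINALS = {"1": "první"}


def normalize(text: str) -> str:
    """Expand digits to words, treating 'N.' before a lowercase word as an ordinal."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok.endswith(".") and tok[:-1].isdigit() and nxt[:1].islower():
            out.append(ORDINALS.get(tok[:-1], tok))   # ordinal, e.g. '1.' -> 'první'
        elif tok.isdigit():
            out.append(CARDINALS.get(tok, tok))       # cardinal, e.g. '3' -> 'tři'
        else:
            out.append(tok)
    return " ".join(out)


print(normalize("3 zákony a 1. zákon"))
```

Numbers missing from the tables are left unchanged, which makes the limits of the lookup approach visible rather than hiding them.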

3.4.3 Word substitutions

In the input text, words with a non-standard pronunciation (e.g. foreign words, names, or proper nouns) may occur. These words cannot be transcribed using standard Czech phonetic transcription rules mentioned in Sect. 3.4.5; thus, they require special processing. Therefore, we used a ‘dictionary-like’ system in which a single word can be replaced with a corresponding ‘phonetic-friendly’ transcription, and this can be correctly processed during the following phonetic transcription. In Fig. 4, the proper noun ‘Newtonovy’ (Newton’s) is substituted by a Czech phonetic-friendly transcription ‘ňůtnovy’.

Support for non-standard word pronunciation was also integrated into the system’s back-end. The editor can mark a word as a ‘pronunciation exception’ and assign its proper pronunciation.
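A dictionary-like substitution of this kind can be sketched in a few lines. The single entry is the example from Fig. 4; the whole-word matching strategy and the function name `substitute` are assumptions made for the illustration.

```python
import re

# Pronunciation exceptions: orthographic word -> 'phonetic-friendly' spelling.
EXCEPTIONS = {"Newtonovy": "ňůtnovy"}


def substitute(text: str, exceptions=EXCEPTIONS) -> str:
    """Replace whole words listed in the exception dictionary; keep all others."""
    return re.sub(r"\w+", lambda m: exceptions.get(m.group(0), m.group(0)), text)


print(substitute("Newtonovy zákony"))
```

Matching whole words avoids rewriting a listed string when it merely occurs inside a longer word.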

3.4.4 Phrasification and prosodic description

In addition to the phonetic transcription, each input text is described in terms of prosodic symbols. In Slavic languages (as in other Indo-European languages), prosody can be viewed as supplementing the phonetic information with other linguistic aspects, such as sentence modality (e.g. declarative sentences vs. yes/no questions), emotions, styles, or general expressiveness and speaker attitude. Thus, prosody helps listeners understand the meaning of the transmitted message. Prosody also helps in the division of longer utterances into sentences, sentences into shorter phrases, and phrases into words.

3.4.5 Phonetic transcription

During the phonetic transcription, an orthographic form of the input text is converted to a phoneme sequence. This process is rule-based in our system as the conversion is almost always unambiguous in the Czech language. The pronunciation exceptions, e.g. foreign words, are handled as described in Sect. 3.4.3.

3.4.6 Phonetic filtering

After the phonetic transcription, the phoneme sequence might still contain some characters that are not supported by the speech synthesis engine. Currently, all unsupported characters are omitted.

In addition, some phonetic substitutions can also be made in this step. For instance, some phonetic nuances can be discarded, i.e. symbols representing phonetic subclasses can be replaced by symbols representing a more general phonetic class. In Fig. 4, a syllabic voiced alveolar trill [r=] is replaced by its basic non-syllabic version [r]. Similarly, the unvoiced and voiced alveolar fricative trills (the two variants of the Czech ‘ř’) were merged, as both represent a similar phone.
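Both operations of this step, substitution by a more general class and omission of unsupported symbols, can be sketched as a single pass over the phoneme sequence. Only the [r=] to [r] mapping comes from the text; the set of supported symbols and the function name `filter_phonemes` are hypothetical.

```python
# Substitutions towards a more general phonetic class (the r= example is from Fig. 4).
SUBSTITUTIONS = {"r=": "r"}

# Hypothetical inventory accepted by the synthesis engine (truncated for the example).
SUPPORTED = {"a", "e", "i", "o", "u", "k", "l", "n", "p", "r", "s", "t", "v"}


def filter_phonemes(phonemes):
    """Generalize phonetic subclasses, then omit symbols the engine does not support."""
    out = []
    for ph in phonemes:
        ph = SUBSTITUTIONS.get(ph, ph)
        if ph in SUPPORTED:
            out.append(ph)
    return out


print(filter_phonemes(["p", "r=", "s", "t", "#"]))
```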

3.5 Text-to-speech

To make the content of the website accessible for students with visual impairment, TTS technology was used. The primary task of any TTS system is to convert an arbitrary input plain text to a speech signal which should correctly reflect the content of the text. For our application, the unit-selection-based TTS system ARTIC [44] was adapted. It produces high-quality, natural-sounding speech and offers several Czech male and female voices, which are assigned to particular templates (see Sect. 3.1). For other languages, ARTIC can be replaced by another TTS system, as the communication protocol is easy to adapt; e.g. we used MaryTTS [36] and CereProc for German and British English to support the teaching of foreign languages.

4 Evaluation methods

4.1 Participants

The participants of the study were 41 lower secondary pupils (14 girls and 27 boys) of the sixth, seventh, and eighth grades (aged 12 to 14) and three teachers of the primary school for pupils with visual impairment in Pilsen, Czech Republic. This school educates pupils from all over the Pilsen and Karlovy Vary regions. The distribution of the classified visual impairments combined with other disabilities of the pupils in the study is summarised in Table 3.

Table 3 The number of classified visual impairments of the pupils in the study (S = severe, P = profound, B = near-total) combined with one or more other disabilities (SLD = specific learning disability, ID = intellectual disability, PD = physical disability, SD = speech defects)

4.2 Materials

Twenty selected topics of mathematics and physics were used to evaluate the system. The topics partially cover the curriculum of the lower secondary school (see Table 4) and were created in the back-end of the system by the teachers of the pupils in the study. Each topic consists of an explanation of the subject matter of one lesson including examples and exercises. The topic substitutes the pupils’ notes from the school lesson and helps them with individual preparation.

The teachers selected these topics because pupils find them more difficult; they usually require more effort to master. The contents of each topic were chosen to allow independent home preparation with an emphasis on exercises.

Table 4 Selected topics of mathematics and physics used for evaluation of the system
Table 5 Frequency of using the system
Table 6 Specific type of use
Table 7 After-school program use
Table 8 Working with the application is for me...
Table 9 Questions, responses, and comments of the teachers

4.3 Procedure

The study lasted one school year. The pupils were shown each topic in the system for at least one school hour. To avoid organizational complications, the prerequisites for using the system in the classroom were a digital projector and a notebook, or an interactive whiteboard. Thus, it was possible to enlarge the text on the screen or interactive whiteboard, which proved beneficial to the pupils with severe visual impairment. The pupils had full access to their standard teaching aids.

During the study period, lessons were delivered with minimal changes. If the lesson was covered by some topic in our system, the teacher mentioned this with a brief overview on the interactive whiteboard as the focus was on home preparation. The pupils were given the homework from the exercise part of the topic. They could check for the solution in the system and receive immediate feedback. In case there was a problem, the solution guided them in sufficiently understanding the example. For a better insight into the topic, the pupils could repeat the explanation of the subject matter. Thus, the pupils were able to work at their own pace and independently.

Generally, the pupils used the tool mainly for home repetition, for catching up on material they had not understood, and for practice. During the study period, the teacher devoted two afternoons to teaching the pupils how to operate the system.

Each use of the system in a given lesson was recorded in time-sheets, together with the positive or negative reaction of the pupils. After one year of use, we qualitatively evaluated the system through questionnaires administered to the pupils and teachers. The questionnaire items were mostly scaled to allow a finer distinction of answers. When the pupils were filling in the questionnaires, an individual approach with adult assistance was adopted to ensure that they understood the questions and answered correctly.

5 Results

The inquiry, partly realized using a questionnaire and an interview, was focused on several monitored areas:

  1. How and how often was the product used?

  2. In what areas did the use of the product show a positive effect?

  3. How was the quality of the reading voice assessed?

The results obtained by the evaluation of each of the questions above are summarized in the following subsections.

5.1 The ways and frequency of using the system

For the pupils, results were collected only from questionnaires. The frequency of using the system was collected on a 1-to-6 response scale (see Table 5). As observed, the results are distributed among all interval levels. Most answers are found in a more frequent use interval—at least once a week.

When asked about a specific type of usage, the pupils chose from three options (see Table 6). The result shows the importance of both the acoustic and the visual modality (64% of respondents). The most frequent response was the item, ‘I listen to a computer voice, and I follow everything on monitor depending on need and fatigue’. Approximately 2% of the pupils ‘only listened to a computer voice’.

For the question, ‘Can you use the system after school?’, the frequency of responses is shown in Table 7. For the item, ‘Are your parents familiar with the use of the system for automatic reading textbooks?’, 88% of the respondents answered ‘yes’. When asked who around them was interested in the system, respondents most often named the mother (63%) and friends (43%). This interest, however, was characterized as ‘little’. On the other hand, 85% of the grandparents were not interested at all.

5.2 The effect of the system

The results of this evaluation were collected from questionnaires administered to the pupils and teachers. The responses of the pupils to the question, ‘Is it a useful tool in understanding difficult topics in mathematics or physics?’ were mostly affirmative: 51% of the respondents answered ‘certainly yes’ and 44% answered ‘rather yes’. For the question, ‘What subject and topics were most beneficial’, 73% of all responses pointed to mathematics, and the most frequently named topics were ‘Fractions’ (27%), ‘Unit conversion’ (24%), and ‘Linear equations’ (24%).

Table 10 Voice of the reader texts

For the overall item ‘Working with the application is for me...’, the frequency of ‘yes’ responses is shown in Table 8. The next question compared working with the system to other ways of learning, such as learning from exercise books and preparing from textbooks. The responses show a preference for our system: 55% of the respondents answered ‘definitely yes’ and 37% answered ‘somewhat agree’. However, 67% of the respondents do not favour our system over learning with friends or parents.

The second result is from the questionnaires filled by the teachers and their discussions with the authors of this paper. The aim was to clarify whether the system has an influence on the academic achievement of the pupils in the given subjects.

To the first question, ‘Does the system help pupils in their home preparation for the subject and why?’, all teachers answered ‘definitely yes’ on a four-point scale. The second question was ‘Do the pupils achieve better results with this special teaching aid and why?’. Two answers were ‘probably yes’ from teachers of mathematics and one answer was ‘probably no’ from a teacher of physics. The responses and comments are summarized in Table 9.

5.3 Reading voice of the texts

In the next item, the voice that reads the texts and mathematical problems was evaluated. Respondents had several options to choose from. To make it simpler, the results of ‘definitely yes’ and ‘probably yes’ were merged into ‘yes’ and the ‘definitely not’ and ‘probably not’ into ‘no’. These main results alone are shown in Table 10.

6 Discussion

The main purpose of the proposed web-based system is to let pupils prepare for lessons after school. Eighty-eight percent of the pupils could access the educational material online through an internet connection. During the pupils’ home preparation, the parents were interested in the system, especially the mothers. For the question, ‘How often and how was the product used?’, the responses were distributed among all interval levels, and 53% of the answers fell into a more frequent use interval (at least once a week).

In the evaluation of the synthetic voices (a conventional TTS system for Czech), the pupils praised their intelligibility and pleasantness, but the voice tended to sound less natural and rather monotonous. These results are especially valuable for further technical adjustments, which should improve the naturalness (and reduce the monotony) of a synthetic voice employed to read technical documents.

The system was assessed mostly positively: more than 85% of the pupils reported a positive effect of the system in the criteria ‘simplification of preparation for school’ and ‘clarification of subject matter’. This is consistent with earlier results comparing mathematical TTS software with printed text [1]. In that study, pre- and post-tests of secondary students with visual impairments showed increasing accuracy. In our study, there was 95% positive acceptance in the criterion of understanding a topic that the pupil found difficult.

The flexibility of the application allowed the pupils to work individually according to their state of vision and their current situation as affected by fatigue (indicated by 64% of the pupils). For pupils with severe visual impairment, the results prove the importance of the visual modality (e.g. a graphically rendered mathematical formula) combined with a synchronously rendered voice. In this study, a total of 30 of 41 pupils were classified with severe visual impairment. These findings are consistent with the results from [21], whose authors warn against relying solely on listening to digital text.

Another factor affecting the system acceptance is the age of the pupils and the complexity of the subject matter. In a study of older high-school students with visual impairment in an Algebra 1 course [6], the authors emphasised a preference for classical text materials and a general resistance to new technology. In contrast, a different study [17] of junior high school students with visual impairment trained on their system reported an effective improvement in mathematics. The pupils in our study preferred working with the system to working with paper text, books, and textbooks. This finding can be explained by the higher ‘didactic friendliness’ of the system, which may stem from (1) a continual introduction of the system to the pupils in the classroom, (2) a pre-algebra course containing elementary mathematics, and (3) the better attitude of young students to electronic texts.

From the teachers’ perspective, the pupils’ responses to the system were monitored continuously throughout the evaluation year. According to the teachers’ responses and comments, the system fulfilled its main purposes: to help pupils in individual preparation, to revise difficult parts of the curriculum, and to substitute for notes from the lesson. An improvement in the pupils’ proficiency was indicated only in mathematics, but all teachers agreed that the system improved pupils’ access to the curriculum and that the pupils perceived the system positively. In addition, the teachers appreciated the possibility of explaining the subject matter in another way using the system.

7 Conclusion

We present a new web-based system specially developed to facilitate access to educational materials through automatic reading for Czech pupils with visual impairment. The system enables teachers to prepare and process arbitrary topics, focusing on technical documents that contain mathematics and physics formulas (at the lower secondary school level). The system converts the content automatically to speech, and the implemented solution provides a method for reading formulas in various mathematical contexts and in correct grammatical forms, which are very important for inflective languages such as Czech.

In general, the system consists of a client side and a server side. The client side is composed of two types of interfaces (front-end and back-end). The front-end is a public interface that gives the users (pupils) full access to the services via graphics and voice. Because the pre-synthesized text is held in a cache on the server side, the selected documents are read immediately and synchronized with graphic highlighting. The back-end is an administrative interface where the documents are created and modified. The server side of the system is modular, implements several web services, and provides automatic processing for the client side.
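The role of the server-side cache can be illustrated with a minimal Python sketch. It is not the implementation described in this paper: the class `SpeechCache`, its methods, the hash-based key, and the stand-in synthesizer `fake_tts` are all hypothetical, and an in-memory dictionary replaces whatever storage the real server uses. The sketch only shows why a document saved in the back-end can be read immediately in the front-end: synthesis happens once, ahead of time, and later requests are served from the cache.

```python
import hashlib


class SpeechCache:
    """In-memory stand-in for a server-side cache of pre-synthesized audio."""

    def __init__(self, synthesize):
        # Hypothetical TTS callable mapping text (str) to audio (bytes).
        self._synthesize = synthesize
        self._store = {}

    @staticmethod
    def _key(text):
        # Identical text always maps to the same cache entry.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def presynthesize(self, text):
        """Synthesize ahead of time, e.g. when a document is saved."""
        self._store[self._key(text)] = self._synthesize(text)

    def get(self, text):
        """Return cached audio; synthesize only on a cache miss."""
        key = self._key(text)
        if key not in self._store:
            self._store[key] = self._synthesize(text)
        return self._store[key]


calls = []  # records every call to the stand-in synthesizer


def fake_tts(text):
    """Stand-in for a real TTS engine; returns the text as bytes."""
    calls.append(text)
    return text.encode("utf-8")


cache = SpeechCache(fake_tts)
cache.presynthesize("x plus y")  # back-end: document saved
audio = cache.get("x plus y")    # front-end: served without new synthesis
```

In this sketch the second request does not invoke the synthesizer again, which is the property that makes immediate playback possible; synchronizing the audio with graphic highlighting would require timing information that is outside the scope of the example.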

The system was experimentally evaluated by 41 pupils and three teachers of a school for pupils with visual impairments. The responses indicate the positive contribution of the system, especially for the difficult topics, and the pupils preferred the system over paper textbooks. The most frequent usage of the system is in a multi-modal form combining auditory perception with visual perception.