1 Introduction

Information and communication technology (ICT) is spreading in Japanese public elementary schools, where every pupil may use an ICT device individually and simultaneously. In such classes a few teachers, in most cases a single teacher, must teach all the pupils. A class in a Japanese public elementary school has at most 40 pupils. The median class size is about 32 in Utsunomiya, Japan.

The use of ICT devices brings both benefits and problems. Each individual device takes many measurements of reading activities, which is a large benefit of individual ICT devices. However, the large amount of measured data is itself a problem for a teacher: the data must be processed before they can be used effectively in class, and their volume makes processing difficult. Many of the problems in using an ICT device are easy ones. Even so, one teacher cannot handle all of the problems and benefits that arise when every pupil uses an ICT device individually and simultaneously.

We cover the easy problems with the ICT device itself. In Japan, an ordinary class includes about 32 pupils, and about 20 % of the pupils have some problem in using ICT devices. We cover 80 % of these problems with the ICT device itself. The teacher then needs to attend to only about two pupils, whose problems are not covered by the ICT device. We also cover part of the teacher’s work with the ICT device itself: the device processes the measured data about reading activities, and the teacher can concentrate on utilizing the processed results.

To treat the problems caused by the use of ICT devices and to help both pupil and teacher, an ICT system must recognize the user’s activities. The Japanese text presentation system was proposed for helping pupils with or without a reading difficulty [1]. In the Japanese text presentation system, the activities of a user are key touches, eye movements, and read aloud voices. Figure 1 shows the relation between teacher participations and our new supports for decreasing them. ICT devices may generate a large amount of new data. For instance, the Japanese text presentation system measures and records the precise operations. These new data are useful for understanding a pupil’s reading profile, but they also create a new workload for a teacher.

Fig. 1. Teacher’s participations and our new supports for decreasing them.

A pupil may walk away from the ICT system. Our system has no arms, so it cannot keep a pupil in front of it; a teacher, however, can handle this kind of problem. Many pupils use the ICT system well, but many simple problems prevent others from using it well. The new workload may discourage a teacher from using ICT devices. Our goal is to decrease the work that teachers need to do to introduce ICT devices in a class. A teacher needs to participate both while the ICT devices are in use and after they have been used. As shown in Fig. 1, we decrease both kinds of participation with reading activity estimation and reading profile estimation. ICT devices introduced in a class must be useful for pupils, and they must also be welcomed by teachers.

In Japan, a pupil whose reading ability is delayed by two years is said to have a reading difficulty. In some ordinary Japanese public elementary schools, about 20 % of pupils have a light reading difficulty. There are also pupils with a heavy reading difficulty, who attend special support education classes or schools.

Reading ability is the most important ability for learning in a school. Almost all materials are textbooks. Multimedia materials have recently been increasing gradually, but text still plays an important role in them. Pupils with a reading difficulty therefore have a large handicap in all subjects: even a pupil with sufficient intelligence has difficulty learning every subject if he or she has a reading difficulty. A method for helping pupils with a reading difficulty is therefore important.

This paper proposes a method to recognize and analyze the activities of a user of the Japanese text presentation system, which helps pupils with or without reading difficulties to read Japanese texts. The resulting system decreases the work of the teachers who help the pupils.

There are many pupils with a reading difficulty in Japanese elementary schools, and the difficulties are of many kinds. The first and largest is reading Japanese characters. Japanese text is composed of hiragana (phonetic characters), katakana (another type of phonetic character), kanji (semantic characters) and other characters. During elementary school, pupils learn 48 hiragana characters, 48 katakana characters and 1008 kanji characters. Almost all pupils learn hiragana and katakana easily. However, the huge number of kanji is difficult to learn for some pupils in ordinary classes [2].

The second is the difficulty of recognizing sentence structures. In Japanese sentences, there is no spacing between words. To ease the difficulty of reading kanji characters, we can replace kanji characters with hiragana characters. Alternatively, we can write the hiragana characters that represent the pronunciation of the kanji characters at the side of the kanji characters.

Readers recognize the words that make up a Japanese text with the help of kanji. A large number of words start with a kanji character, so in a mixture of hiragana, katakana and kanji a reader can recognize the chunk of characters that constitutes a word.

If we replace kanji characters with hiragana characters, only a sequence of hiragana characters remains. In a long sequence of hiragana, it is difficult to recognize the chunks of characters that constitute words. Writing hiragana characters at the side of kanji characters does not cause this kind of problem.

In elementary school, pupils learn hiragana and katakana first. In the first stage of elementary school, Japanese textbooks put a space between words so that the sentence structure is easier to understand. However, normal Japanese texts have no space between words.

Every pupil has these two difficulties at first. Over a long school life, pupils acquire the skill to overcome them. Nevertheless, these two difficulties are large barriers to reading and understanding Japanese sentences.

No infant has any knowledge of Japanese characters, and every pupil starts with only a little knowledge of the huge number of kanji characters. Pupils then learn hiragana, katakana and kanji characters over their long elementary-school life.

In Japanese elementary schools, a reading difficulty means a two-year delay in reading ability. A few pupils with dyslexia learn in special support education classes or schools. However, there are many pupils with reading difficulties in ordinary elementary schools. Some pupils, of course, have difficulty remembering kanji characters, although most pupils remember kanji characters gradually. Pupils with a tendency toward a learning disability, however, have difficulty reading Japanese sentences even when they can remember the kanji characters. In that case, they may have dyslexia.

There may be many causes of difficulty in reading Japanese texts. We do not discuss the causes; we pay attention only to methods for easing the difficulties. In this paper we call these difficulties “reading difficulty”.

Research on teachers shows that pupils with a tendency toward ADHD (attention deficit hyperactivity disorder) have difficulty following the characters sequentially and recognizing grammatical structures [4]. There are, of course, many types of reading difficulties and many causes. The resulting reading difficulties nevertheless show similar symptoms: difficulty in following the characters sequentially, in recognizing grammatical structures, and in reading kanji characters.

We have developed a visual text presentation system for persons with a reading difficulty in Windows environments. The system records every operation of a user. With the recorded operations, we assess the difficulty of the user.

The Japanese text presentation system was proposed and implemented for pupils with reading difficulties [2]. The system provides multi-level high-lighting and keeps a precise record of the operations. With the operational record, we can assess reading abilities and difficulties on an objective basis and obtain a reading profile.

This paper proposes a method to recognize the activities of the user from the read aloud voices, in order to decrease the work of the teachers who help pupils with or without reading difficulties.

First, this paper proposes the Japanese text presentation system with user activity recognition based on read aloud voices. We then discuss in detail the plan for a Japanese text presentation system that recognizes reading activities from read aloud voices in an ordinary Japanese class room. Next, we discuss a method to analyze the reading profile of a user from the reading patterns. We then describe the implementation of the system and show the experimental results. Finally, we conclude this work.

2 Japanese Text Presentation System with Recognition of Read Aloud Voices

The Japanese text presentation system records all the operations of a user and moves the high-lighted part of a text in response to the user’s key input. However, from the key operations alone we cannot recognize the reading activities of a user precisely. For instance, a user may simply type the proper key at a proper interval without reading at all. To recognize a reading activity and help the user, the system needs to observe the reading activity with a more direct method. When using the system, the user reads Japanese sentences aloud, so one direct observation method is to measure the read aloud voice. Eye movement is also important for understanding a reading activity, but eye movements give no information about the result of reading. We therefore start from the analysis of read aloud voices. The read aloud voice is the result of the reading activity itself, so we can evaluate the performance of the reading activity directly.

When using the Japanese text presentation system, the user moves to the next high-lighted part with a key input. The system records the key operations with precise times. From this record, we can measure the time taken to read each high-lighted part.

When the system is operated properly, the resulting information is important for understanding the reading profile of a user. To confirm that the Japanese text presentation system is operated properly, we use the read aloud voice.

2.1 User’s Activity About Reading

The usage of the Japanese text presentation system is simple, as shown in Fig. 2. A user reads a high-lighted part of a text and then types a key to move the high-lighted part. In this simple process, the user looks at the display, follows the text, recognizes the characters, understands the high-lighted chunk of characters, reads it aloud and types a key. Figure 3 shows the flow of reading aloud in more detail. The Japanese text presentation system cannot make a user look at the display. However, it helps the user find the proper sentence on the display by high-lighting that sentence and weakly masking the other sentences. High-lighting also helps the user find a chunk of characters. We cannot observe the process of understanding, but we can observe the read aloud actions and eye movements, as well as the user’s facial expressions and body movements. For guiding and helping the user of the Japanese text presentation system, the eye movement and the read aloud voice are important.

Fig. 2. Operations on Japanese text presentation system.

Fig. 3. Precise activities to read a text aloud.

The read aloud activity is a direct result of reading, so we target it first. By recognizing read aloud activities, we can assess the reading ability directly.

2.2 Reading Activity Measurement Based on the Reading Aloud Voices

For the assessment, we use the relation between the reading time and the length of the high-lighted part. There are several measures of the length of a high-lighted part of a text: one is the number of characters, and another is the number of phonemes. Our preliminary experiments showed a clear relation between the reading time and the number of characters in the high-lighted part, so we use the number of characters to measure the length of a text. Japanese texts contain kanji characters, hiragana characters and so on, so the number of phonemes per character varies. Nevertheless, the number of characters shows a better relation to the reading time.

Japanese text materials differ according to the target age of the readers. Materials for older pupils include more kanji characters, and a single kanji character can represent a word that would otherwise be written with many hiragana characters. Older pupils also read faster than younger pupils. As a result, the relation between the number of characters and the reading time of a material stays constant.

Without reading difficulties, there is a linear relation between the reading time and the length of the high-lighted part. In real reading, however, there are many mis-operations and reading difficulties. Figure 4 shows an example of the relation between the reading time and the length of the high-lighted part. Some points lie on a linear function, and others are outliers.

Fig. 4. Relation between reading time and the length.

We identify the outlier points from the reading time per character of the high-lighted part, using a simple threshold. We take the reading time per character without reading difficulties to be between 0.1 s and 0.3 s. Figure 5 plots the pairs of length and reading time of the high-lighted parts that remain after filtering out the outliers with this threshold.

Fig. 5. Relation between reading time and the length without outlier data.
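The threshold filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the tuple layout and the choice to treat both bounds as inclusive are assumptions; only the 0.1 s and 0.3 s limits come from the text.

```python
from typing import List, Tuple

# Bounds on reading time per character, taken from the text (in seconds).
MIN_S_PER_CHAR = 0.1
MAX_S_PER_CHAR = 0.3

Point = Tuple[int, float]  # (length in characters, reading time in seconds)


def split_by_threshold(points: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Split samples into (normal, outliers) by reading time per character.

    Treating the bounds as inclusive is an assumption of this sketch.
    """
    normal: List[Point] = []
    outliers: List[Point] = []
    for length, seconds in points:
        if length <= 0:
            # An empty high-lighted part cannot be rated; treat it as an outlier.
            outliers.append((length, seconds))
            continue
        per_char = seconds / length
        if MIN_S_PER_CHAR <= per_char <= MAX_S_PER_CHAR:
            normal.append((length, seconds))
        else:
            outliers.append((length, seconds))
    return normal, outliers
```

Points that land in the second list are the candidates for either a reading difficulty or an operation error.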

The outlier points may represent a reading difficulty or an operation error. To understand the reading activity, it is important to distinguish reading difficulties from operation errors. The key operations alone give no information for distinguishing them.

With read aloud voices, we can easily recognize that a reading activity is taking place. However, it is difficult to recognize the relation between the read aloud voice and the high-lighted part of a text. The observed voice may be only the pupil talking to himself or herself, or it may be a correct reading of the high-lighted part. The read aloud voice may also include mispronunciations of the high-lighted part. Moreover, a pupil is not an announcer, and pupils’ pronunciation is not clear.

General speech recognition is now powerful. With the power of cloud services, our smart phones recognize our speech well. However, dictation of long sentences is still difficult, and speech recognition makes some errors on a long sentence.

In Japanese elementary schools, the Internet connection is more or less restricted for security reasons. In this environment, powerful cloud-based speech recognition cannot work, and we must use a weaker speech recognition system that works without an Internet connection. The Japanese text presentation system must therefore recognize the reading activity of a user from error-prone speech recognition results.

In Fig. 5, the pairs of length and reading time clearly follow a linear function. With a reading difficulty, the pupil needs much more reading time. As a result, the high-lighted parts that the user has difficulty reading are plotted in the region above the linear function.

The points plotted above the normal linear function indicate reading difficulties, and the corresponding parts of the text show the kinds of reading difficulties.

3 Analysis of Reading Profile from Reading Activities

If the reading activities of a pupil are well controlled, we obtain a record of those activities, and with the record we can analyze the reading profile of the pupil. Our former works show a linear relation between the reading time and the length of a high-lighted part when there are no reading difficulties. With reading difficulties, however, it is difficult to find this linear relation clearly. Almost all pupils have some small problems in reading a sentence, and in that case the clarity of the linear relation is lost.

3.1 Reading Profile Based on the Linear Relation Between the Reading Time and the Length of a High-Lighted Part

The Japanese text presentation system records the key operations of a user. A key operation is an explicit indication that the user is moving on to read the next part of a text. The system records the key operation, the high-lighted part of the text, and the time of the key operation. From these records, we obtain the length of each high-lighted part and the time taken to read it. If there are no reading difficulties, there is a clear linear relation between the length of the high-lighted part and the time to read it. In practice, however, there are a few reading difficulties, and the linear relation may not be clear.
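As an illustration of how such records yield (length, reading time) pairs, the following sketch assumes a hypothetical log format: each entry holds the time of a key operation and the part that becomes high-lighted after it, with an empty string marking the end of the session. The class and function names are not taken from the actual system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class KeyEvent:
    time_s: float  # time of the key operation, in seconds
    chunk: str     # part high-lighted after this key operation ("" at session end)


def reading_samples(events: List[KeyEvent]) -> List[Tuple[int, float]]:
    """Return (number of characters, reading time in seconds) per high-lighted part.

    The reading time of a part is the interval between the key operation that
    high-lighted it and the next key operation.
    """
    samples: List[Tuple[int, float]] = []
    for current, following in zip(events, events[1:]):
        if current.chunk:
            samples.append((len(current.chunk), following.time_s - current.time_s))
    return samples
```

The length is counted in characters, following the choice made in Sect. 2.2.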

3.2 Finding Linear Relation from Recorded Key Operations

Our former works show that a delay of 1.5 s from the normal reading time indicates a reading difficulty. However, the normal reading time is not known beforehand. There are several patterns in reading profiles: some readers take a long time before starting to read aloud, while others start quickly but read more slowly.

To find the linear relation that represents the profile of reading activities, we must find it from records that include reading difficulties, so we need to exclude the data that represent reading difficulties. We assume that an ordinary class has no pupil with heavy reading difficulties, because such pupils attend special support education classes or schools. Under this assumption, the records from an ordinary class consist mostly of normal reading activities, and we can distinguish the reading difficulties in the record.

The record contains a sufficient number of normal reading activities. From the record, we calculate a linear approximation of the distribution of the recorded data. This first approximation represents a mixture of normal reading activities and reading difficulties, but the normal reading activities dominate. We decide that the data whose reading time is more than 1.5 s longer than the approximation represent reading difficulties, and we exclude them from the recorded data. We repeat this process until no more data are excluded. The remaining data should then include only normal reading activities, and from them we can estimate the reading profile of a user. The resulting reading profile represents the features of the reading activities without reading difficulties, and it helps a teacher to understand a pupil’s type of reading activity.
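The iterative procedure can be sketched as below. The sketch assumes an ordinary least-squares line as the linear approximation and excludes only points that lie more than 1.5 s above the fitted line; both choices, and all names, are assumptions of this illustration rather than details confirmed by the text.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (length in characters, reading time in seconds)


def fit_line(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all lengths are equal; the line is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_profile(points: List[Point], delay_s: float = 1.5):
    """Refit and drop slow outliers until no point is excluded.

    Returns (slope, intercept, excluded_points).
    """
    remaining = list(points)
    excluded: List[Point] = []
    while True:
        slope, intercept = fit_line(remaining)
        keep = [(x, y) for x, y in remaining if y - (slope * x + intercept) <= delay_s]
        drop = [(x, y) for x, y in remaining if y - (slope * x + intercept) > delay_s]
        # Stop when nothing is excluded, or when too few points would remain to refit.
        if not drop or len(keep) < 2:
            return slope, intercept, excluded
        excluded.extend(drop)
        remaining = keep
```

The slope and intercept of the final fit serve as the reading profile, and the number of excluded points counts the suspected reading difficulties.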

3.3 Profiles of Reading Activities

For each pupil, we have a reading profile represented by the linear relation. A linear relation is described by two parameters: the slope of the line and its vertical position (intercept). The slope represents the speed of reading, and the vertical position represents the lead time before the pupil starts to read aloud. For a reading session, we have these two parameters and the number of reading difficulties that were excluded from the original record, that is, three parameters per session. It is difficult to represent three parameters in a 2-dimensional space, but we need to show them on a single sheet, so we use a bubble chart. The position of a bubble represents the two parameters of the linear relation that describe the reading profile, and the size of the bubble represents the number of reading difficulties.

The bubble chart represents the reading profile and the reading difficulty on a single sheet, which helps a teacher to understand the reading ability and reading type of a pupil. By plotting a number of pupils on one sheet, we obtain a landscape of the reading activities and difficulties in a class.
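A small sketch of how the three parameters might be mapped to bubbles is given below. Which parameter goes on which axis, and the scaling of the bubble area, are assumptions made for illustration; the text only fixes that position encodes the line and size encodes the number of difficulties.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ReadingProfile:
    pupil: str
    seconds_per_char: float  # slope of the fitted line (reading speed)
    lead_time_s: float       # intercept (time before starting to read aloud)
    difficulty_count: int    # number of records excluded as reading difficulties


def to_bubbles(
    profiles: List[ReadingProfile],
    base_area: float = 20.0,            # hypothetical area for zero difficulties
    area_per_difficulty: float = 30.0,  # hypothetical growth per difficulty
) -> List[Tuple[float, float, float, str]]:
    """Return (x, y, area, label) for each pupil's bubble."""
    return [
        (
            p.seconds_per_char,
            p.lead_time_s,
            base_area + area_per_difficulty * p.difficulty_count,
            p.pupil,
        )
        for p in profiles
    ]
```

The resulting tuples can be handed to any plotting tool; for example, Matplotlib's `Axes.scatter` accepts marker areas through its `s` parameter.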

4 Implementation of Japanese Text Presentation System with Recognition of User Activity

4.1 ICT Environments

An ordinary personal computer has a microphone that can capture the user’s read aloud voice. The basic function of the Japanese text presentation system is to present Japanese text properly, so as to ease the reading difficulties of a user without any stress. The system must move the high-lighted part without delay after a key input.

Speech recognition needs some processing time, and key operations and reading aloud are asynchronous activities. The system therefore runs the task that handles key operations and the task that handles voice recognition simultaneously.

The assessment process needs a large contribution from teachers. While a pupil reads with the Japanese text presentation system, teachers monitor the reading process. Afterwards, teachers examine the operational records. This assessment yields an objective estimate of the reading difficulties of the user. However, there is little difference in the teacher’s contribution between the assessment using the Japanese text presentation system and the classical assessment methods.

During reading, a pupil may read a part that is not high-lighted, or may pronounce a word incorrectly. These events leave no mark in the operational record. The observing teachers guide the pupil toward proper operation of the Japanese text presentation system and also record the incorrect pronunciations.

The Japanese text presentation system tries to help every pupil with reading difficulties in an ordinary class room. In Japanese elementary schools, there are a few pupils with reading difficulties, and the teacher must run the class for the majority of pupils who have none. Teachers need day-by-day assessments of reading difficulties in order to evaluate how well their teaching eases the reading difficulties of a pupil. With the present Japanese text presentation system, teachers can assess reading difficulties, but the system requires much work from them. To enable day-by-day assessment of reading difficulty, we must decrease the teachers’ contribution to the assessment.

4.2 Class Room

There are many problems in utilizing ICT in Japanese elementary schools [4]. The problems are listed in Table 1. To solve them, the proposed text presentation system treats only electronic text. In Japan, a law requires that electronically readable versions of textbooks be prepared [3]. There are also many documents accessible through the Internet. The proposed system takes no paper document as input.

Table 1. Problems about ICT usability in a special aid school in Japan.

Many pupils may remember the full text of materials that are used many times, such as textbooks. Such memorized materials cannot be used for evaluating the reading performance of a pupil, and reading them does not help to strengthen the pupil’s reading ability.

In ordinary class rooms, many pupils use the Japanese text presentation system simultaneously. In Japan, a class in a public elementary school has about 30 pupils and one teacher. We estimate that, with the teacher’s instructions, about 80 % of the pupils work with the Japanese text presentation system properly. The remaining 20 % of the pupils, about 6 pupils, need help to use the system properly, and it is difficult for one teacher to support 6 pupils simultaneously. Our new system will itself support 80 % of the pupils who have problems using the system. Then only about 4 % of the pupils in a class, that is, one or two pupils, still need help, and a teacher can support them. In this case, all of the pupils in a class work properly with the Japanese text presentation system. The system does not need to support all the pupils completely; 80 % support is enough in an ordinary class.
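The arithmetic behind this estimate can be checked with a short sketch. The function name and parameter names are illustrative; the percentages and class sizes are the ones quoted in the text.

```python
from typing import Tuple


def pupils_needing_teacher(
    class_size: int,
    need_help_pct: float = 20.0,     # share of pupils who need help with the system
    system_cover_pct: float = 80.0,  # share of those pupils the system itself supports
) -> Tuple[float, float]:
    """Return (pupils who need help, pupils still left for the teacher)."""
    need_help = class_size * need_help_pct / 100
    left_for_teacher = need_help * (100 - system_cover_pct) / 100
    return need_help, left_for_teacher


# A class of about 30 pupils: 6 need help, about 1.2 remain for the teacher.
print(pupils_needing_teacher(30))
# The median class of about 32 pupils: 6.4 need help, about 1.28 remain.
print(pupils_needing_teacher(32))
```

In both cases the teacher is left with one or two pupils, which is 20 % of 20 %, or 4 % of the class.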

In an ordinary school, teachers are busy with their day-by-day work, and they need much time to lead and teach pupils. A new ICT device must be welcomed not only by pupils but also by teachers. To be welcomed by teachers, a new ICT device must not increase their work.

4.3 System Design

The proposed system has the features listed in Table 2. We restrict the functions of the proposed system: the proposed Japanese text presentation system has only two new functions. One writes hiragana characters at the side of kanji characters, and the other analyzes the user’s voice. With these new functions, the new Japanese text presentation system makes it easy to estimate the user’s reading difficulties, as discussed in the previous section. The teachers around a pupil with reading difficulties need objective measurements of the pupil’s reading performance. For pupils without reading difficulties, the objective measurements show the progress of the user. For this purpose, the proposed system provides an operation logging function. The operation logs describe the reading speed for each meaningful chunk of characters.

Table 2. The plan for covering the problems.

The proposed Japanese text presentation system makes it possible to use one-time materials for measuring the performance of a pupil. Real-time presentation generation makes it possible to use any new plain-text material at any time with a personalized presentation.

Real-time presentation generation also makes it possible to adapt the presentation to each pupil’s particular reading difficulties. DAISY (Digital Accessible Information System) has no function for adapting to each pupil.

To adapt to the variety of pupils’ ages and grades of disability, the presentation system has a function that replaces kanji characters not yet studied with hiragana characters. The phonetic hiragana characters are the first characters that pupils study, so there is little difficulty in reading hiragana.

To ease the difficulty of kanji characters, the new system has another function that adds the hiragana characters representing the pronunciation of kanji characters at the side of those kanji characters. This presentation helps users to recognize the relation between kanji characters and their pronunciation.
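The two presentation functions can be illustrated with a toy sketch. It assumes the text has already been split into (surface, reading) word pairs, for example by a morphological analyzer; the token list, the set of studied kanji and the parenthesized plain-text rendering are all hypothetical stand-ins, since the actual system draws the hiragana beside the kanji on screen.

```python
from typing import Iterable, List, Set, Tuple

Token = Tuple[str, str]  # (surface form, hiragana reading)


def is_kanji(ch: str) -> bool:
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def render(tokens: Iterable[Token], learned_kanji: Set[str], mode: str) -> str:
    """Render tokens in one of two modes.

    mode="replace": words containing a kanji not yet studied become hiragana.
    mode="ruby":    every word containing kanji is followed by its reading.
    """
    out: List[str] = []
    for surface, reading in tokens:
        kanji = [c for c in surface if is_kanji(c)]
        if not kanji:
            out.append(surface)
        elif mode == "replace":
            studied = all(c in learned_kanji for c in kanji)
            out.append(surface if studied else reading)
        elif mode == "ruby":
            out.append(f"{surface}({reading})")
        else:
            raise ValueError(f"unknown mode: {mode}")
    return "".join(out)
```

Replacing whole words rather than single characters keeps each word readable as a unit; whether the actual system replaces per word or per character is not stated in the text.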

The operations on the presentation system carry information about the user. The proposed system logs every operation with its time. This log represents the fluency of the reader.

The new system has a function that analyzes the user’s read aloud voice. From the read aloud voice, the new system estimates the pronunciation, and from the estimated pronunciation it estimates the reading activity of the user. With the estimated reading activity, the new system can change the presentation and guide the user toward proper use of the system. With the recorded voice, the teacher may check the pronunciation afterward.

The new system has the features listed in Table 2. The first, second and third rows are newly added features. They decrease the teacher’s work in using the Japanese text presentation system. For the Japanese text presentation system to be widely used, it must not need large-scale contributions from teachers. The network problem is important in Japanese schools: Internet access is heavily restricted, so cloud-based implementations cannot work. The proposed system must work without Internet access.

4.4 System Implementation

Language and Library. We implement the new Japanese text presentation system in Python. The new system uses Julius and MeCab. Julius is a Japanese speech recognition system [6]. MeCab is a morphological analyzer for Japanese sentences [7]. Python interfaces exist for both Julius and MeCab, and our Python-based system integrates them. For Japanese text presentation, the system uses Pyglet [8]. Pyglet provides an object-oriented programming interface for developing games and other visually rich applications. With Pyglet functions, the new system can display any collection of display formats.

Multiprocessing. The new Japanese text presentation system has two major processes. One process presents the Japanese text, and the other estimates the user’s activities. Because text presentation is separated from activity estimation, the text presentation runs independently of the time-consuming speech recognition. This implementation keeps the display of texts responsive. Figure 6 shows the basic structure of the new Japanese text presentation system. The dashed box indicates the range of the current implementation; guidance generation is left for future work. Figure 7 shows the outline of the new Japanese text presentation system.

Fig. 6. Japanese text presentation system with recognition of user’s activity and user’s profile.

Fig. 7. Outline of the new Japanese text presentation system.
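The separation of presentation from recognition can be sketched as below. To keep the example short and portable, a thread and `queue.Queue` stand in for the two processes; `multiprocessing.Process` and `multiprocessing.Queue` offer a similar interface. The `recognize` callable is a hypothetical placeholder for the speech recognizer, and the session loop is a simplification of real key handling.

```python
import queue
import threading
from typing import Callable, Dict, List, Optional, Tuple


def recognition_worker(
    audio_q: "queue.Queue[Optional[Tuple[int, str]]]",
    result_q: "queue.Queue[Tuple[int, str]]",
    recognize: Callable[[str], str],
) -> None:
    """Consume recorded audio and publish recognition results until None arrives."""
    while True:
        item = audio_q.get()
        if item is None:
            break
        index, audio = item
        result_q.put((index, recognize(audio)))


def run_session(
    chunks: List[Tuple[str, str]], recognize: Callable[[str], str]
) -> Tuple[List[str], Dict[int, str]]:
    """chunks holds (text to show, recorded audio placeholder) per key input."""
    audio_q: "queue.Queue[Optional[Tuple[int, str]]]" = queue.Queue()
    result_q: "queue.Queue[Tuple[int, str]]" = queue.Queue()
    worker = threading.Thread(
        target=recognition_worker, args=(audio_q, result_q, recognize)
    )
    worker.start()

    shown: List[str] = []
    for index, (text, audio) in enumerate(chunks):
        shown.append(text)           # the display advances at once on key input
        audio_q.put((index, audio))  # recognition is handed off, never awaited here

    audio_q.put(None)  # signal the end of the session
    worker.join()

    results: Dict[int, str] = {}
    while not result_q.empty():
        index, recognized = result_q.get()
        results[index] = recognized
    return shown, results
```

Because the presentation loop only enqueues audio, a slow recognizer delays the results but never the movement of the high-lighted part.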

Phoneme Recognition. The Japanese speech recognition system Julius can recognize speech well with proper preparation. However, when many pupils use the system simultaneously without proper preparation, Julius cannot show its good performance, and there are many recognition errors. The new system must therefore estimate the user’s activity from error-prone speech recognition results. The new system treats similar sounds as equivalent. It recognizes which part of a text the user is reading aloud; the correctness of the reading is not evaluated. From the part being read aloud, the system recognizes whether the user is using the system properly or not.

The Japanese speech recognizer Julius recognizes chunks of voice. There are many errors in the recognized results, so the new system uses only the phonemes.

The new system evaluates the number of phonemes recognized, which is robust in noisy environments. Using this number, the new system estimates the correspondence between the phonemes of a high-lighted part of the text and the phonemes recognized from the voice. The new system then evaluates the difference between the phonemes of the high-lighted part and the recognized phonemes using the Levenshtein distance [9]. With the Levenshtein distance, the new system estimates how well the read aloud voice matches the high-lighted part of the text. In the implementation, an insertion and a deletion each have an edit cost of 2. A normal substitution has a cost of 4, and a substitution between nearly identical phonemes has a cost of 2. For instance, ‘shi’ and ‘hi’ are nearly the same in Japanese. With a threshold, the new system decides whether or not the read aloud voice is a proper pronunciation of the high-lighted part of the text. Figure 8 shows the detailed flow of phoneme analysis.
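The weighted edit distance described above can be sketched as follows. The costs (2 for insertion and deletion, 4 for a normal substitution, 2 for a substitution between similar sounds) and the ‘shi’/‘hi’ pair come from the text; the contents of the similarity table beyond that pair, the unit of comparison and the threshold value are assumptions of this illustration.

```python
from typing import List, Sequence

INDEL_COST = 2        # insertion or deletion (from the text)
SUBST_COST = 4        # normal substitution (from the text)
NEAR_SUBST_COST = 2   # substitution between nearly identical sounds (from the text)

# Pairs treated as nearly identical; only the 'shi'/'hi' example is from the text.
SIMILAR = {frozenset(("shi", "hi"))}


def subst_cost(a: str, b: str) -> int:
    if a == b:
        return 0
    if frozenset((a, b)) in SIMILAR:
        return NEAR_SUBST_COST
    return SUBST_COST


def weighted_distance(expected: Sequence[str], heard: Sequence[str]) -> int:
    """Weighted Levenshtein distance between two sound sequences."""
    prev: List[int] = [j * INDEL_COST for j in range(len(heard) + 1)]
    for i, e in enumerate(expected, 1):
        cur = [i * INDEL_COST]
        for j, h in enumerate(heard, 1):
            cur.append(
                min(
                    prev[j] + INDEL_COST,           # delete e
                    cur[j - 1] + INDEL_COST,        # insert h
                    prev[j - 1] + subst_cost(e, h)  # match or substitute
                )
            )
        prev = cur
    return prev[-1]


def is_proper_reading(
    expected: Sequence[str], heard: Sequence[str], max_cost_per_unit: float = 1.0
) -> bool:
    """Accept when the distance stays under a length-scaled threshold.

    The threshold value is hypothetical; the text does not state it.
    """
    if not expected:
        return not heard
    return weighted_distance(expected, heard) <= max_cost_per_unit * len(expected)
```

Scaling the threshold by the length of the expected sequence keeps the decision comparable between short and long high-lighted parts.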

Fig. 8. Process to estimate user’s reading profile.

User’s Reading Profile Recognition. To make a user’s profile from the recorded data, we must remove the outliers. However, in the original recorded data, it is difficult to distinguish normal reading activity from outlier activity. As discussed in Sect. 3, our target pupils are normal pupils. A normal pupil has some reading difficulties but can read a large part of a text without a large problem. So, we start from recorded data that include a large number of normal reading activities and a relatively small number of reading difficulties. We estimate the linear approximation of the recorded data, decide the outliers based on it, and remove them. The resulting data contain fewer reading difficulties. We then estimate the linear approximation again and remove the outliers again. We repeat this process until no outliers remain. This process is shown in Fig. 8.
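The iterative procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the fit is assumed to be an ordinary least-squares line, a data point is assumed to be an outlier when it lies more than 1 s (1000 ms) from the fitted line, following the band used in the experiments of Sect. 5.3, and the names `fit_line` and `reading_profile` are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all lengths are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reading_profile(points, band_ms=1000.0):
    """Fit, drop points outside the band, and refit until no outlier remains.

    points: (length in characters, reading time in ms) pairs.
    Returns (slope, intercept, removed_points).
    """
    kept = list(points)
    removed = []
    while True:
        slope, intercept = fit_line(kept)
        inliers, outliers = [], []
        for x, y in kept:
            residual = y - (slope * x + intercept)
            (outliers if abs(residual) > band_ms else inliers).append((x, y))
        if not outliers:
            return slope, intercept, removed
        removed.extend(outliers)
        kept = inliers


# Synthetic data for illustration only: ten points on y = 200x + 500
# and one slow reading at length 5.
sample = [(x, 200.0 * x + 500.0) for x in range(1, 11)] + [(5, 6000.0)]
slope, intercept, removed = reading_profile(sample)
```

In this synthetic example the slow reading is removed in the first pass and the second fit recovers the underlying line. The number of removed points can then serve as the count of reading difficulties in a session.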

5 Experiments

We aim to help the user by means of the Japanese text presentation system itself. For this purpose, we implement the estimation of the reading activity from the user’s read-aloud voice, which the new system records. The new Japanese text presentation system includes the original Japanese text presentation system. It adds a function to estimate the reading activity from the user’s read-aloud voice and a function to make users’ profiles for assessing the users’ reading difficulties.

5.1 Text Presentation Varieties

The new Japanese text presentation system offers many more varieties of text presentation. The new function displays hiragana characters at the side of kanji characters. In Japan, writing hiragana characters at the side of kanji characters is a popular way to ease the difficulty of reading kanji characters.

There are many methods for placing hiragana characters at the side of kanji characters. Our implementation centers the hiragana characters on the kanji word. Figure 9 shows an example of a Japanese text presented with hiragana characters written at the side of kanji characters. The current sentence is high-lighted, and the current part of the sentence is high-lighted in a different format. Other parts of the text are not high-lighted. Thus, there are three levels of presentation in Fig. 9.

Fig. 9. Presentation example.

5.2 Read Aloud Voice Recognition

Eight students in our laboratory took part in the experiments. Three of them are not native speakers of Japanese. It is easy to measure the strength of a user’s voice in experimental environments. From the voice of a single person, it is difficult to evaluate the precise pronunciation. However, it is easy to evaluate the strength of the voice.

In a normal classroom, there are many sounds other than the voice of the user. In such an environment, it is not easy to separate the user’s voice from other voices and noises. We use the recorded voice so that teachers can check the pronunciations.

In the experiment, the new Japanese text presentation system judges about 80 % of the voices to be correct pronunciations of the high-lighted parts. This result depends on the threshold, so it can be tuned. Figure 10 shows part of the recognition results. In Fig. 10, ‘mukashimukashi’ is the phoneme sequence of the first part of the text in Fig. 9. The phonemes of the text and the phonemes of the voice are the same in Fig. 10.

Fig. 10. Phoneme analysis logs.

Table 3 shows examples of voice recognition results that contain errors. In Table 3, the column ‘Text’ gives the correct phonemes of the text, and the column ‘Voice’ gives the phonemes recognized from the voice reading the text.

Table 3. Error examples in voice recognition.

In the first row, a long vowel, represented as ‘:’, is not recognized. In the second row, a long vowel is also not recognized, and a gap between words is not properly recognized. In the fifth row, two phonemes are not recognized properly. In the sixth row, a phoneme ‘ri’ is inserted in the voice recognition result.

Errors like the one in the first row are recovered with the help of the Levenshtein distance. Errors like those in the second row are difficult to recover at this stage.

5.3 Estimation of User’s Activity

Table 4 shows the results of analyzing the users’ activities with the phoneme analysis. The subjects are male and range from 22 to 27 years old. Subjects D, G and H are not native speakers of Japanese, but they can read, write, and speak Japanese well. The other subjects are Japanese. Subjects D, G and H need more reading time than the Japanese subjects.

Table 4. Reading time of all subjects.

Table 4 shows the results for the 8 subjects. The correctness in the table is the rate at which the system correctly decides whether or not a part was read aloud correctly. In Table 4, subjects D, G and H need more silence time than the other subjects. With the utterance analysis, we obtain much more precise information for understanding the user’s reading activity.

Figure 11 shows the relation between the reading time and the utterance time. In the graph, the unit of the vertical axis is the second. The reading activity varies among the subjects. In the graph, an increase in the silence time causes an increase in the reading time. Subject ‘A’ needs little silence time. Subject ‘H’ needs a large silence time. With a long silence time, the reading speed decreases.

Fig. 11. Reading time and utterance time (Color figure online).

Reading Profile. For each session, the record of reading activities gives the distribution of the length of the high-lighted parts and the reading time of those parts. Figure 12 shows an example of the distribution for a subject with good reading ability. In Fig. 12, the center line is the linear approximation of the distribution of the data. The linear approximation is given by (1).

Fig. 12. Distribution of a reading time and the length of a high-lighted part.

$$ Y = 336.16\,X + 256.93 \tag{1} $$

In (1), Y is the reading time in milliseconds and X is the length of a high-lighted part in characters. The upper line is 1 s above the linear approximation, and the lower line is 1 s below it. In the example, three data points lie above the upper line. Removing these three points gives the distribution shown in Fig. 13. In Fig. 13, all data fall between the upper and lower lines. The linear approximation is given by (2).

Fig. 13. Distribution of a reading time and the length of a high-lighted part with outlier removal.

$$ Y = 273.56\,X + 612.58 \tag{2} $$

Comparing (1) and (2), the slant of the line decreases by about 19 %, from 336.16 to 273.56 ms per character. The vertical position of (2) is about 0.36 s higher than that of (1) (612.58 ms versus 256.93 ms). The reading activity described by (2) is the ideal reading activity for the subject.

This linear approximation represents the profile of a subject with two parameters: the slant and the vertical position of the line.
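As a small worked example, the two fitted lines (1) and (2) can be held as two-parameter profiles and compared numerically. The coefficients are taken from (1) and (2). The class name `ReadingProfile`, its field names, and the 10-character part used for illustration are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingProfile:
    """A reading profile as the two parameters of the fitted line."""
    slant_ms_per_char: float  # slope: additional reading time per character
    position_ms: float        # intercept: vertical position of the line

    def reading_time_ms(self, length_chars: float) -> float:
        """Reading time predicted by the line for a part of the given length."""
        return self.slant_ms_per_char * length_chars + self.position_ms


before = ReadingProfile(336.16, 256.93)  # Eq. (1), all data
after = ReadingProfile(273.56, 612.58)   # Eq. (2), outliers removed

# Relative change of the slant and shift of the vertical position.
slant_change = (after.slant_ms_per_char - before.slant_ms_per_char) / before.slant_ms_per_char
position_shift_ms = after.position_ms - before.position_ms

# Predicted reading times for a hypothetical 10-character part.
time_before_ms = before.reading_time_ms(10)
time_after_ms = after.reading_time_ms(10)
```

With these coefficients the slant changes by about -18.6 % and the vertical position shifts by about 356 ms. For the 10-character part, the predicted reading time drops from about 3.62 s to about 3.35 s.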

5.4 Total Reading Profiles in a Class

Table 5 shows the calculated reading profiles and reading difficulties of pupils in the 4th school year of one school. The slant is given in ms per character, and the position is given in ms. For sessions with no removed data, the values before and after the removal are identical. Removing the outliers decreases the slant of the linear approximation in many cases. The pair of parameters of the linear approximation represents the reading profile of a subject. However, from this table alone, it is difficult to understand the distribution of the reading activities and reading profiles of the pupils in a class.

Table 5. Profiles of pupils of 4th school years.

From Table 5 alone, it is difficult to understand the users’ profiles. However, we can find a tendency by comparing the sessions with and without outliers. The removed outliers represent reading difficulties. Before the outlier removal, the sessions with and without outliers have similar averages. The averages of the slant and the position are 192 ms/character and 1714 ms for the sessions with reading difficulties, and 197 ms/character and 1603 ms for the sessions without. After removing the outliers, the averages for the sessions with reading difficulties become 176 ms/character and 1844 ms. Simple averaging thus shows no apparent difference between the reading profiles with and without reading difficulties, whereas removing the outliers reveals a difference. With reading difficulties, a pupil reads 30 % faster than the pupils without a reading difficulty.

Figure 14 is a bubble chart of the data in Table 5. In Fig. 14, the size of a bubble represents the number of reading difficulties, which equals the number of removed data in Table 5. The horizontal axis represents the slant of the linear approximation in ms per character. The vertical axis represents the vertical position of the linear approximation in ms. We can find two groups of reading profiles. One type starts slowly and reads fast. The other type starts a little faster and reads slowly. With Fig. 14, we can easily understand where each pupil’s reading ability lies within the class.

Fig. 14. Reading profiles of pupils of 4th school year.

6 Conclusion

The proposed new Japanese text presentation system estimates the user’s detailed reading activities and the users’ reading profiles from the recorded reading activity logs. The user’s reading activity includes not only the user’s key operations but also the read-aloud voice. We confirmed the performance through the experiments.

Using the estimated reading activity, we can estimate the user’s state and help the user by means of the ICT device itself. The estimated reading profiles help a teacher understand the pupils’ reading abilities and reading types, and thus understand the pupils well. The experiments confirm the performance and reveal that the reading profiles can be categorized into two types. One type starts to read slowly and then reads fast. The other type starts to read fast and then reads slowly.

The new Japanese text presentation system can be used simultaneously by the pupils in a classroom. In a normal classroom, a teacher has many pupils, including ones with reading difficulties.

The proposed system decreases the work of a teacher when the Japanese text presentation system is used simultaneously and individually in a class. All pupils in a class can use the Japanese text presentation system properly with the help of the teacher and the system itself. A teacher does not need to check every record of the users’ reading activities, because the system detects the points where reading difficulties occur. This makes the Japanese text presentation system easier to use in normal classrooms. The users’ reading profiles help a teacher understand the pupils’ reading behaviors.

To understand the reading profiles, we need many more experiments and discussions with teachers. More precise records of the users’ activities help us understand the reading activity precisely with less work for the teacher. We will add a user guidance function in discussion with teachers.

We must discuss the sequence of silence times and utterance times. We must also discuss the two types of reading profiles. These discussions will lead us to a more precise understanding of reading activity and reading profiles.