1 Introduction

Studies in Europe and the USA show that around 10–15% of the population is either deaf or hard-of-hearing (World Health Organization 2021). In the United States, one in eight people aged 12 years or older has hearing loss in both ears (Lin et al. 2011), and this percentage is expected to increase as the population ages (National Institute on Deafness and Other Communication Disorders 2021).

Although both deaf and hard-of-hearing people face barriers to accessing spoken language, they form separate groups with different perspectives and priorities. Those who are hard-of-hearing typically start life hearing and experience hearing loss as they age. They use a spoken language, such as Greek or French, as their preferred, or first, language.

On the other hand, those who identify as deaf often use a sign language, such as Greek Sign Language or French Sign Language, as their preferred language. Both groups require better access to the information contained in speech, but those who identify as deaf also require better access to written forms of spoken language.

The impact of hearing loss can be profound. Communication with others is difficult, and it has worsened recently due to preventive measures against COVID-19 such as masking and social distancing. This can lead to a feeling of social isolation (Vas 2017). Communication barriers also limit access to education and job opportunities, resulting in an estimated 60 billion dollars in lost productivity from unemployment and premature retirement in Europe and the US (World Health Organization 2021).

Both groups could potentially benefit from specially designed information and communication technology (ICT), which holds great promise in supporting communication needs. Today's hearing devices and assistive listening technologies are powerful miniaturized computing systems, and they increasingly offer options for coupling and connectivity with modern communication devices to expand access to audible and written forms of spoken language. This promises to contribute to better health and a better standard of living (Davis and Hoffman 2019).

However, even the most sophisticated technology may be of little use if it does not fit a person's individual usage requirements. ICT can supplement acoustic modalities with visual ones, presenting pictures or written forms of speech on a screen and allowing individuals to extend both their general knowledge and their use of language without relying on sound. ICT also has the potential to support translation from acoustic and written forms of spoken language into the preferred sign language of a deaf user. Additionally, using ICT for collaborative activities can encourage a group to improve their use of language and their understanding of concepts as they plan and carry out their work.

A wide variety of technologies show promising potential: smart hearing instruments, adaptive and user-controlled hearing systems, machine learning-based hearing systems that individualize the listening experience, algorithms for improving acoustics, and other cutting-edge technology that can assist people with listening, speaking, reading, and writing. Further, avatar-based systems have the potential to enhance communication for those whose preferred language is signed and for whom a spoken language is a second language.

The following sections examine the challenges discussed by researchers working to improve language accessibility for both the deaf and hard-of-hearing communities.

2 Live Captioning

Live captions, words displayed on the screen in real time, transcribe speech into a visually perceptible representation of language. They give all viewers, including those who are deaf or hard-of-hearing as well as non-native speakers, a visual medium for following audible information. Traditionally, live captions are created by a human captioner who manually transcribes the content or respeaks it into automatic speech recognition (ASR) software. The quality of live captioning is therefore affected by the delay of the human's response in listening and transcribing live speech and, consequently, suffers a higher error rate due to transcribing under pressure. A study by Karam et al. (2022) demonstrates that this pressure imposes a higher perceived mental workload on captioners, especially when live-captioning fast-paced sports, talk shows, and weather broadcasts.

To help overcome this challenge, ASR is now widely used. With the rise of artificial intelligence (AI) technology in the last decade, the quality of ASR has increased rapidly. A recent comparison of commercial software reported up to 94% accuracy for live conversations; however, this rate was only achieved with recordings of excellent audio quality (Piskorek et al. 2022). Since roughly 98% accuracy is needed, manual editing of automatic captions is still necessary. Piskorek et al. (2022) present a possible solution: collaboratively corrected AI-generated live captions. In their study they compared the accuracy rates of the AI-generated and user-corrected captions using the Word Error Rate (WER) metric, with promising results for collaborative work on captions.
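
To make the metric concrete: WER counts the word substitutions (S), deletions (D), and insertions (I) needed to turn a hypothesis transcript into the reference, normalized by the reference length N, i.e. WER = (S + D + I) / N, so 94% accuracy corresponds roughly to a WER of 0.06. The following minimal Python sketch (an illustration, not the tooling used in the cited studies) computes WER by dynamic programming:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = minimum edit operations to turn hyp[:j] into ref[:i]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # all reference words missing: i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # all hypothesis words extra: j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + sub,  # match / substitution
                          d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1)        # insertion
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substituted word in a five-word reference gives
# WER = 0.2, i.e. 80% accuracy.
print(wer("the quick brown fox jumps", "the quick brown fox jumped"))
```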

The WER metric used in Piskorek's study is widely employed in the development of caption metrics. However, there is a question of whether this metric is actually appropriate for measuring caption quality. Current research focuses on human-centered metrics, and Wells et al. (2022) conducted a user study comparing two caption metrics, the traditional WER and the human-centric Automated-Caption Evaluation (ACE), for their suitability in evaluating caption quality in live television. Interestingly, they found that ACE was more sensitive to large accumulations of errors than WER and penalized those errors more than human participants would. However, the difference in performance between WER and ACE was not statistically significant. There is therefore still a need to explore the usability and optimization of human-centric caption metrics for measuring caption quality.

Another interesting discussion regarding captions concerns their placement on screen. Olson et al. (2022) investigated the preferences of deaf and hard-of-hearing users and found that these audiences prefer captions to appear to the right of a speaker's head, especially in Web Real-Time Communication (WebRTC) environments, combined with a texting format that allows real-time text communication in a letter-by-letter mode.
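
Letter-by-letter (real-time) text differs from conventional chat in that each character is transmitted as it is typed, rather than after the whole message is complete. The loopback sketch below illustrates the idea over a WebRTC data channel; the aiortc library and the in-process signaling are assumptions made for illustration, not the setup used by Olson et al. (2022):

```python
# Minimal loopback sketch of letter-by-letter real-time text over a
# WebRTC data channel, using the aiortc library (an illustrative choice).
import asyncio
from aiortc import RTCPeerConnection

async def main():
    sender, receiver = RTCPeerConnection(), RTCPeerConnection()
    channel = sender.createDataChannel("rtt")
    delivered = asyncio.Event()

    @channel.on("open")
    def on_open():
        # Real-time text: send every character as it is "typed",
        # instead of buffering the whole message as ordinary chat would.
        for char in "Hello!":
            channel.send(char)

    @receiver.on("datachannel")
    def on_datachannel(ch):
        received = []

        @ch.on("message")
        def on_message(char):
            received.append(char)
            print("".join(received))  # caption view updates per keystroke
            if "".join(received) == "Hello!":
                delivered.set()

    # In-process signaling: exchange SDP offer/answer directly.
    await sender.setLocalDescription(await sender.createOffer())
    await receiver.setRemoteDescription(sender.localDescription)
    await receiver.setLocalDescription(await receiver.createAnswer())
    await sender.setRemoteDescription(receiver.localDescription)

    await delivered.wait()
    await sender.close()
    await receiver.close()

asyncio.run(main())
```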

In another application of captions, Suzuki et al. (2022) investigated the use of see-through live captions as part of a guided tour in a museum, a typical environment where deaf and hard-of-hearing people need assistance. The study showed that the proposed see-through captions were well received; however, there were issues related to the design of the system itself, such as its limited applicability for sign language users, who were not fully accommodated by this text-based system. According to the researchers, it will be necessary to consider an interface using sign language, such as sign language avatars.

3 Sign Language Avatars

For their entire lives, deaf sign language users struggle with a barrier to information. The barrier is not simply one of sound but one of language. Sign languages are independent, fully fledged linguistic systems in their own right and thus differ from spoken languages. For a person using a sign language as a first, or preferred, language, even the written form of a spoken language poses a barrier, because the spoken language is not their preferred language. Reliable access to spoken language requires the services of certified sign language interpreters. Unfortunately, qualified interpreters are in very short supply and can be difficult to find. Without sufficient professional interpretation services, deaf sign language users' participation in education and general social life is severely hampered, especially when members of the hearing majority are not competent in sign language.

Avatars that display sign languages have the potential to improve accessibility for deaf sign language users when used as part of an automatic translation system. Such technology could remove the need to find an interpreter in situations that would be simple if both parties shared a common language. However, the use of animated avatars instead of professional human interpreters is viewed very critically from several perspectives, as noted in the study by Krausneker and Schügerl (2022). They surveyed opinions and perceptions and discussed possible long-term effects of using sign language avatars. The study showed that deaf and hearing participants were critical of sign language avatars but not categorically opposed to them. In principle, having quick and easy access to information would be a welcome development, but current technology is not yet equal to the task.

4 Environmental Sound Recognition

Deaf and hard-of-hearing people usually cannot recognize the sounds of daily life, such as running water, door chimes, home alarms, and other sudden or occasional sounds. These are called environmental sounds, and the warnings they carry must instead be communicated by family, friends, hearing assistance dogs, or special supporting products. A study carried out by Furukawa et al. (2022) investigates how to convert environmental sounds into visual information or vibrations and present them to end users. Using support vector machines, the researchers achieved an average discrimination rate of 83.3% across six types of sound (from car horns to natural background noises).
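
As a rough illustration of this kind of pipeline, the sketch below classifies short audio clips with a support vector machine. The MFCC features, the librosa/scikit-learn stack, and the six example labels are assumptions made for illustration; they are not necessarily the exact setup of Furukawa et al. (2022):

```python
# Minimal sketch of SVM-based environmental sound classification.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical sound classes, echoing the examples in the text.
LABELS = ["car_horn", "door_chime", "running_water",
          "home_alarm", "speech", "background_noise"]

def extract_features(wav_path: str) -> np.ndarray:
    """Summarize a clip as the mean and std of its MFCCs (fixed-length vector)."""
    signal, sr = librosa.load(wav_path, sr=16_000, mono=True)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_classifier(paths: list[str], labels: list[str]):
    """Fit an RBF-kernel SVM on labeled example clips."""
    X = np.stack([extract_features(p) for p in paths])
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0)).fit(X, labels)

# Usage (assuming labeled training clips exist on disk):
# clf = train_classifier(train_paths, train_labels)
# print(clf.predict([extract_features("doorbell_recording.wav")]))
```

A predicted label could then trigger the visual or vibration alert appropriate to that sound type.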

5 Assistive Technology

The COVID-19 pandemic made it evident that assistive, mobile, and other contemporary accessible technologies have become increasingly important in the everyday lives of deaf and hard-of-hearing people. This target group already has difficulties with access to and active participation in communication, education, and social engagement, and during the pandemic the shift to online communication and assistive technology brought additional challenges. A study by Halbach (2022) shows that deaf and hard-of-hearing people have been dependent on their hearing instruments and assistive technologies. According to the study, approximately 75% of users were satisfied or very satisfied with their assistive technologies, which primarily means hearing aids. Others, especially the elderly, were less satisfied because they have difficulty using the devices in their everyday lives. The survey identified technical solutions for a number of situations; however, the author also uncovered several areas where technology can be improved. This picture of technology as an enabler with weaknesses was confirmed by interviews.

It is therefore important to determine which technologies are currently available for communicating, listening to speech and audio devices, sound recognition, informing, sensing, alerting, and learning. A study by Kožuh et al. (2022) provides a comprehensive list of assistive technologies and a list of mobile technologies, together with a description of newer technologies such as WebRTC, ASR, and the Metaverse. It is evident that technologies for communication and language acquisition are among the most important assistive technologies.

6 Conclusion

The richly diverse papers in this special thematic session cover a wide range of exciting innovations and provide thoughtful analyses that lend a clear-eyed perspective on the current state of the art. Although the technologies differ quite remarkably from one another, they are all designed to improve accessibility: to late-breaking broadcast news, to classroom lectures, to guided tours, to the surrounding environment, and even to entirely different languages. Innovations of these sorts will make possible future gains in productivity through higher educational achievement, better health care, and improved job opportunities.

Entwined and intrinsic to the theme of accessibility is the centrality of the user and user needs. Sensitivity to the usability of assistive technology, as demonstrated in these papers, is essential for continued innovation and progress toward the goal of equal accessibility for all.