Introduction

The information society offers a collection of new tools to access a wealth of information. Personal computers, mobile devices and web technologies have changed the way people read the news, shop, and communicate.

One would hope that such technological advances would benefit all people equally. Unfortunately, despite the early hopes placed in new technology, there are accessibility and usability issues that still hinder access to technology and information for both mainstream users and users with disabilities. Universal access for all people has sadly lagged behind technical advancement, leaving many technologies difficult or, in some cases, impossible to use for people with disabilities. This paper presents some of the tools and techniques that address this digital divide.

As the field of access to information is extensive, this paper presents a broad survey of research regarding the presentation of information to people with visual disabilities, addressing in particular those problems that are prominent in the research community. While an attempt is made to include research which is representative of multinational initiatives, this review, inevitably, has a bias towards the English-speaking world. A subsequent review of additional sources from the non-English-speaking research community, as well as a large bibliographic database, is available through the authors [94]. It is important to note that this survey is intended as a starting point for those interested in pursuing research on accessible information for people with visual disabilities; it is not a definitive volume of all research in the field. In particular, this paper does not address the input modality of interfaces for accessing diagrams, mathematics or other forms of information.

In this paper the term people with visual disabilities is used to refer to the full range of people who have visual disabilities. This includes people who are blind, who have little or no functional vision, and people who have low vision.

This paper begins with a discussion of the presentation alternatives available to people with visual disabilities. These alternatives are audio presentation, discussed in Sect. 2, and tactile presentation, covered in Sect. 3. Following the discussion of these technologies, the paper examines how they are applied to different types of content, addressing the presentation of textual information, mathematics and graphics in Sects. 4, 5 and 6, respectively.

Section 7 considers online sources of information which include web and multimedia documents. The review concludes with a discussion of haptic technology and areas for future exploration.

Audio media

This section discusses how information can be conveyed to a user through sound, starting with non-speech sounds and then proceeding to synthesized speech.

The term auditory icon refers to the use of real-world sounds to communicate the interaction of a user with objects in a scene. Originally proposed by Gaver [78], these sounds are usually related to the task being performed and the object with which the user is interacting. For example, the Trash bin icon in a graphical user interface indicates visually when there are documents which have yet to be cleared from the system. When a user chooses to empty the Trash, he/she hears an auditory icon of papers being shuffled out of a rubbish bin.

As these sounds are digital representations of their real-world counterparts, the parameters of the sounds can be adjusted in order to indicate the identity of the object being manipulated, such as its relative size and the action being performed on it. While it is possible to adjust many parameters of the sound, such as the pitch and tempo of the icon being played, it is hypothesized that nomic mappings of sounds to tasks, in general, are better than metaphorical mappings [78].

In contrast, earcons are abstract musical melodies that are symbolic of tasks or objects. As an example, a scale increasing in pitch could map to the opening of a file, while a decreasing scale could represent closing a file. Further information regarding the design and use of earcons can be found in [19, 82, 113, 121, 153, 192, 193].
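
To make the distinction concrete, the following minimal sketch generates two earcons as rising and falling sine-tone scales and writes them to WAV files; the mapping of an ascending scale to "open" and a descending scale to "close" follows the example above, while the specific frequencies, note lengths and file names are arbitrary illustrative choices rather than recommendations from the earcon literature [19].

```python
import numpy as np
import wave

SAMPLE_RATE = 44100

def tone(freq, duration=0.15, rate=SAMPLE_RATE):
    """Generate one sine tone as 16-bit samples with a short fade to avoid clicks."""
    t = np.linspace(0, duration, int(rate * duration), endpoint=False)
    envelope = np.minimum(1.0, np.minimum(t, duration - t) * 50)
    return (0.5 * envelope * np.sin(2 * np.pi * freq * t) * 32767).astype(np.int16)

def earcon(ascending=True):
    """Abstract melody: rising pitch for 'open a file', falling pitch for 'close a file'."""
    freqs = [440, 494, 554, 587, 659]          # five steps of a rough major scale
    if not ascending:
        freqs = list(reversed(freqs))
    return np.concatenate([tone(f) for f in freqs])

def save(samples, filename):
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)      # 16-bit samples
        f.setframerate(SAMPLE_RATE)
        f.writeframes(samples.tobytes())

save(earcon(ascending=True), "open_file_earcon.wav")
save(earcon(ascending=False), "close_file_earcon.wav")
```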

The use of speech for communicating information, and in particular text, cannot be ignored when designing interfaces for people with visual disabilities. While the applications of text to speech technology are discussed in Sect. 4, the following are some of the usability concerns of which designers must be aware:

  • Technology. Hardware synthesizers, on average, provide better sound production and more accurate speech synthesis. However, these devices have the drawback of being costly to purchase as well as taking up additional workspace. Software synthesizers can also be expensive to purchase, but open source initiatives are resulting in some low cost or free alternatives. These software synthesizers also take advantage of existing sound card hardware available on most home computing workstations.

  • Speed. The average speed of screen readers is approximately 2.8 times slower than the average listening speed of a user with a visual disability [8]. In order to compensate for varying comprehension rates, any application using text to speech technology should provide an accessible means of adjusting the speed of the speech output.

  • Voicing. The majority of speech synthesis systems provide a range of voices, from low male to high childlike. Much like speed, the voice used to vocalize text must be a customizable option for the user (a minimal illustration of adjusting both follows this list).
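
The following sketch illustrates the Speed and Voicing points using the open source pyttsx3 library, one example of the free software synthesizers mentioned above; the chosen rate multiplier and voice index stand in for what would be user-controlled preferences in a real application.

```python
import pyttsx3

engine = pyttsx3.init()

# Speed: expose the speaking rate (words per minute) as a user-adjustable setting.
default_rate = engine.getProperty("rate")
engine.setProperty("rate", int(default_rate * 1.5))   # a faster, user-chosen preference

# Voicing: let the user pick from whatever voices the installed synthesizer offers.
voices = engine.getProperty("voices")
if len(voices) > 1:
    engine.setProperty("voice", voices[1].id)

engine.say("This sentence is spoken at the user's preferred rate and voice.")
engine.runAndWait()
```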

The use of three-dimensional (3D) sound interfaces is likely to become more common as the cost of sound hardware decreases. A 3D sound system must produce a signal which matches the transformation of a sound from its point of origin to its arrival in the ear canal. This signal varies based on the point of origin and its position relative to the head. In the case of a sound originating on the left side of the head, the sound wave reaches the left ear first, unfiltered by the head, whereas the right ear receives an altered signal, caused by the wave being shadowed by the head [25, 76].

As a result of this complicated set of factors, sound systems have a collection of head-related transfer functions (HRTFs), which are numbers representing the time delay, amplitude and tonal transformation of sounds from various points around the head. These functions are used to alter a sound signal being sent to the ear in order to give the illusion that it has come from a point in 3D space. The HRTF information itself is recorded through a series of tone experiments with microphones placed in the ear canal of either a manikin or a specific person.
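
In practice, the spatialization step amounts to convolving the source signal with the measured head-related impulse responses (the time-domain counterpart of the HRTF) for the left and right ears. The sketch below illustrates this; the impulse-response arrays are invented placeholders rather than real measurements from a manikin or listener.

```python
import numpy as np

def spatialize(mono_signal, hrir_left, hrir_right):
    """Render a mono signal as a stereo pair that appears to come from
    the direction at which the two impulse responses were measured."""
    left = np.convolve(mono_signal, hrir_left)
    right = np.convolve(mono_signal, hrir_right)
    n = max(len(left), len(right))
    stereo = np.zeros((n, 2))
    stereo[:len(left), 0] = left
    stereo[:len(right), 1] = right
    return stereo

# Hypothetical responses for a source on the listener's left: the right-ear
# response is delayed (interaural time difference) and attenuated (head shadow).
hrir_left = np.array([1.0, 0.3, 0.1])
hrir_right = np.array([0.0, 0.0, 0.0, 0.0, 0.4, 0.15, 0.05])

signal = np.sin(2 * np.pi * 440 * np.arange(0, 0.5, 1 / 44100))
stereo = spatialize(signal, hrir_left, hrir_right)
```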

In terms of hardware, 3D sound applications can be created using either headphones or loudspeakers. In the case of loudspeakers, these can be placed in a traditional stereo configuration, a sound wall consisting of a bank of speakers (e.g., as seen in the work by Donker et al. [51]), or with multiple surrounding speakers. For headphones, HRTF production for a single subject is relatively easy, with information being projected directly into the appropriate ear. Loudspeakers have the additional problem of crosstalk which can be described as sound waves intended for one ear arriving at the other. These extra signals disrupt the localization effects for the user. In order to counteract these signals, crosstalk filters can be added to the signal, cancelling out the unwanted sound waves and preventing them from reaching the wrong ear.

For examples of applications of 3D sound the reader is referred to the table browsing interface by Raman [163], memory enhancement techniques by Sánchez et al. [175] and cognitive map formation work by Ohuchi et al. [144]. The use of 3D sound in presenting graphical user interfaces to the blind was also investigated in the GUIB project [42, 43, 57].

Tactile media

The sense of touch can play an important role in presenting information to people with visual disabilities. However, the production of tactile documents has lagged behind print for the sighted.

The following technologies all produce what can be defined as offline documents. Each of them can be read, and in some cases authored, away from a desktop computer. Examples of such print documents are maps, calendars and textbooks.

Technology and techniques for ad hoc production

When working with an individual student, it is often beneficial to be able to generate tactile documents in an ad hoc manner, as the need arises. Such documents may consist of tactile graphics displayed in a 2D space, with various materials providing depth or texture to the graphic.

There are several examples in [67] of variable height pictures: static pictures that provide depth to a graphic by reproducing contours or raised areas, created by attaching felt or other materials to a background and using fasteners such as stick pins to identify landmarks of interest. Alternatively, ink which dries to a raised surface can be used. There are several prefabricated kits designed to assist in building such pictures, such as the Chang Tactual Diagram Kit, which provides felt shapes and lines to apply to a background, and the Tactile Diagram Starter’s Kit [31, 196].

Tactile-experience pictures, as discussed in [224], are graphics, primarily used by children, which are created with wood, sandpaper and other materials with distinct tactile sensations. Build-up displays, by comparison, consist of several very thin layers of paper placed on top of each other to produce a contoured surface. With build-up displays, household materials like string, wire and drawing pins can then be used to draw attention to landmarks of interest in the tactile scene. Finally, for fast, immediate generation of tactile documents, a raised line drawing board, where a plastic stylus is run over a plastic film, can be used to produce raised lines.

While these types of tools are very useful, they are not suitable for generating mass production graphics. In order to provide documents in large quantities, one must turn to traditional embossing techniques, thermoform materials, swell paper, or computer presentation through tactile displays.

Embossing

Embossing in the context of this paper will refer to the printing of raised dots within a small distance of each other to create 2D structures. The dots are produced by embossing printers such as those listed in [10], or through heat transfer copying as discussed in [224]. These dots are usually the same distance apart as the standard Braille character, which is approximately 2.5 mm, permitting the easy generation of Braille text intermixed with other graphical elements. However, there are examples, such as the TIGER embosser [67, 70, 71], which provide more finely spaced dots for the production of near continuous raised lines and surfaces. A large list of embossers is available through the Royal National Institute of the Blind (RNIB) [168].

Microcapsule paper

Microcapsule paper consists of polyethylene paper with a polystyrene microcapsule layer coating one side. These capsules expand when heated, raising areas of the paper, thus giving the medium its colloquial name of swell paper. Documents are produced by applying graphic elements to the paper with a dark-colored ink pen, or through standard printing techniques. The microcapsule paper is then placed in a tactile image enhancer which heats the paper, expanding the capsules; the darker sections absorb more heat, resulting in areas raised higher than the lighter areas of the paper.

Alternatively, a pen with a heated tip can be used to draw freehand on microcapsule paper. This can be useful in certain situations, such as classroom interaction. However, many teachers and parents shy away from such devices, as there is a chance of burns occurring through skin contact with the heated tip [224].

Thermoforming

Thermoforming (vacuum forming) is the process of generating a tactile document from pre-tooled dies. A large metal die is molded into the shape of the document, which can include Braille and printed text, line graphics and multi-tiered graphics. The mold is placed under a PVC sheet and heated, which causes the sheet to form over the mold. When the material cools, the sheet hardens around the mold and can be removed, creating a replica of the document. This process can be repeated as many times as desired [224].

Limitations of offline tactile documents

While all of the above technologies are in use by people with visual disabilities, they have several disadvantages:

  1. Size: due to the need for features large enough to be recognized through the fingertip, these documents are always substantially larger than standard print documents. As a result, they can be bulky and awkward to transport. In the case of multilevel vacuum form documents, stacking of materials may simply be impossible, resulting in storage problems.

  2. Information loss: often a lack of space or resolution in the tactile medium results in the loss of fine detail. This loss of information could result in the misinterpretation of data, or in the reader becoming confused during unguided navigation tasks.

  3. Cost: while more mundane, this is still a serious problem for the community. While many ink-based printers are now (2009) less than 100 USD, the average embossing printer or tactile developing system costs several thousand dollars. This, coupled with the cost of the production media (e.g., microcapsule sheets), which ranges between 1.50 USD and 5 USD a sheet, limits the availability of such offline media.

  4. Immutability: changes to documents are inevitable. If a document needs to be updated, the only way to incorporate changes into a tactile document is to regenerate either a portion or the whole of the document. This is not only costly, but it also makes it virtually impossible to ensure that documents are up to date.

Online document production

Clearly, many of the problems with tactile offline media derive from the fact that, by their very nature, they cannot adjust to the needs of an individual user. In order to compensate for this, research has attempted to provide access to online documents. Devices for the display of these types of online data are varied in their capabilities. A selection is reviewed below.

There are several examples of tablet displays which convey audio information associated with tactile documents. A tactile overlay is placed on top of the display area and the user explores the surface with his/her fingers. As pressure is placed on the display, the user’s finger location is transmitted to a computer, where the coordinate information is decoded into speech or other audio information. These types of displays can address the information loss problem found in offline documents through audio annotations; however, the cost, reproduction and size issues are often not addressed. Examples of these devices are the NOMAD tablet [88], the Talking Tactile Tablet [105, 106] and the IVEO system [217] produced by ViewPlus Technologies, which is intended for use with the Tiger embosser and associated software.

Truly dynamic displays, which can be refreshed in a matter of seconds such that a user can page through a document, are rarer and, in general, more difficult and expensive to produce. This is no more evident than in the case of the ill-fated optical to tactile converter (Optacon) [27, 41, 60, 77, 80, 127–129, 155–157, 174, 176]. Originally created in 1966, the Optacon was used by a large number of people with visual disabilities despite its substantial cost. The Optacon could scan almost any surface, including computer screens, and produce a tactile image on a small surface of 144 vibrating pins. This type of display gave access to all types of printed materials, from printed books to everyday items such as coins and receipts. The Optacon was also successful as a research tool in many diverse projects, including: early electronic image processing [96, 97]; spatial cognition development [9]; interactive Braille output [226]; tutoring systems [54]; tactile exploration experiments [34, 108, 112]; and virtual textures [86]. Unfortunately, despite proposals for a new Optacon device as late as 1994 [133], the device was discontinued in 1996, leaving a large void in the community which has yet to be filled by a comparable device. In her 1998 open letter to the community regarding the fate of the Optacon, Barbara Kent Stein, who was, at the time, the First Vice President of the National Federation of the Blind of Illinois, stated [188]:

“Surely there is another approach to the whole problem, one that does not depend on speech at all. Why not develop a device to enable blind people to read the screen tactually? Why not turn visual graphics into tactile images?”

There are further examples of dynamic displays, such as: pin displays similar to the DMD 120060 [131] and the NIST pin display [169]; the wave-based displays discussed in [138]; portable displays [199, 229]; and many more, as listed in the extensive review by Vidal-Verdú and Hafez. Most notable is VideoTIM, which provides functionality similar to that of the Optacon [1]. However, all of these devices, and several more like them, are largely only available in experimental settings and have yet to be produced at a low cost for the end user.

Text transcription

Text documents are perhaps the most common type of document and, therefore, the most important. As a result, it is not surprising that substantial efforts have been committed to rendering text for people with visual disabilities. The first electronic text transcription proposals appeared in the early 1970s [104, 182–184, 186].

Audio presentation of text

Audio presentation of text can either be speech recordings, such as those found in most audio books and some digital talking books, or synthesized speech produced by text to speech technology, which can be combined with screen readers for access to computer text.

Audio book formats have been present in mainstream media for more than 50 years, with novels, textbooks and other printed materials being read into recordings by authors or celebrity readers. Indeed, there are still major initiatives to distribute such materials to blind populations throughout the world, with many thousands of recordings being produced every year [39]. These books were long available on various versions of analog tape [101], which could be navigated by rewinding and fast forwarding. This provided simple navigation but, due to the sequential nature of such recordings, readers faced significant challenges when they wanted to review specific sections of a document.

With digital media, books moved to CDs and portable digital music players [122, 143, 172, 223]. While many of the sequential navigation problems remained, digital media made it possible to add chapter and section markers to assist in navigation through the text [101].

Digital talking books (DTBs) and the process of their standardization are overseen by the Digital Accessible Information Systems (DAISY) Consortium, a non-profit organization started by leaders of international libraries for blind and other print disabled readers. The DAISY Standard 2.0 has been developed through an iterative process (as detailed in [101]) and now includes markup standards based on World Wide Web Consortium (W3C) languages such as SMIL, thus permitting the synchronization of audio presentation with the visual presentation of the document (e.g., audio description of video). Examples of technology for reading DTBs can be found in [44, 45, 101, 102, 134, 135].

In place of recordings, text to speech systems can be used to read text aloud automatically. Indeed, there are several early examples of monotone speech generation being used to convey information, either to equipment operators or in early telephony applications [79, 114, 146]. The best-known example in the area of accessibility is perhaps the Kurzweil reading machine, which performed basic rendering of text to speech from scanned printed documents [3, 98, 104, 130]. However, it is also recognized that a simple transcription of character representation to speech is insufficient for understanding [145]. Without sufficient prosodic cues for perceiving and evaluating the context of the information being presented, long streams of speech can be difficult for the listener to understand.

Today, while there is still work to do on prosodic processing, text to speech systems are available in many Latin, Germanic and other languages, including English, German, French [200, 201], Italian [47, 48], Japanese [222] and Chinese [116, 117]. While it is well understood how to prepare such a system, there are still documents which remain unavailable to people with visual disabilities. Developing nations where funding for transcription projects is limited, countries with multiple official languages, and languages with small speaker populations all remain problematic for providing text to speech output [50, 148, 179].

In order to take advantage of text to speech technology, documents must be transformed into an electronic form. This can, of course, be done by direct entry, as in the case of word processors. Alternatively, paper documents can be scanned into electronic form and optical character recognition (OCR) can be used to retrieve the character information. While OCR is a fairly mature technology, with examples of use since the early 1970s [2, 15, 33, 46, 49, 59, 137, 160, 180, 195, 216, 227, 240], poor scanning results, defects in the paper documents, or handwritten notes can all still result in errors in text recognition.
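
As a sketch of such a pipeline, the following combines the pytesseract binding to the Tesseract OCR engine with the pyttsx3 synthesizer used earlier; the image file name is hypothetical, and a production system would add the error handling needed for the recognition problems just described.

```python
from PIL import Image
import pytesseract
import pyttsx3

def read_scanned_page_aloud(image_path):
    """Recognize the text in a scanned page image and speak it."""
    text = pytesseract.image_to_string(Image.open(image_path))
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text   # the recognized text can also feed Braille output or review

read_scanned_page_aloud("scanned_page.png")   # hypothetical scanned document
```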

Enhanced visual presentation of text

For low-vision users, text access can be accomplished through the use of increased font sizes, which require either large screens or screen magnification technology. One resource for information on screen magnification is the recent article by Blenkhorn et al. [20], where several architectures and design factors for screen magnifiers are discussed.

In addition to screen magnifiers, some people with low vision or particular color-vision deficiencies require alternative color contrast for text against backgrounds. Recent work in the BenToWeb project [17] demonstrates that the color contrast calculation for cathode ray tube television sets is still the most accurate at describing the perception of color contrast for the general user population. However, this work does not necessarily account for the preferences of an individual, and as such the ability to adjust the color of both text and document background is a requirement for accessibility.
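
For illustration, the contrast measure most commonly applied in practice, the relative-luminance contrast ratio from the W3C's WCAG guidelines (not the CRT-based model evaluated in BenToWeb [17]), can be computed as follows; it is the kind of check a customizable text/background color feature would perform.

```python
def relative_luminance(rgb):
    """WCAG 2.0 relative luminance for an (R, G, B) tuple with components in 0-255."""
    def channel(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, ranging from 1:1 to 21:1."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background gives the maximum ratio of 21:1.
print(contrast_ratio((0, 0, 0), (255, 255, 255)))
```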

Tactile presentation of text

In place of auditory presentation, touch can be used to present documents. For reading text there are a variety of Braille codes, the original code having been created by Louis Braille in 1829, which are used for the transcription of text documents into tactile form [21]. These codes consist of characters of either six or eight dots, arranged in two columns of three or four dots. The dots are approximately 2.5 mm apart; however, the optimal spacing of the dots is more subtle and is an issue of much discussion whenever a new device is designed. Aside from text, there are several related Braille-style codes that are used to translate a variety of materials such as music [58], flow charts [38], computer symbols [37], chemistry notation [32] and mathematics [141] into sequential strings of similar Braille characters. The following discussion applies to all of these codes.

Historically, focus has been on the transcription of printed documents into Braille. Transcribers have been available since the first use of Braille for manual transcription. With the introduction of electronic computers, there was an initiative to alleviate some of the manual transcription problems by facilitating the entry of text into computers and having the computer perform automatic transcription of that text into Braille [52, 83, 187, 228].
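
A minimal sketch of such automatic transcription, restricted to uncontracted (Grade 1) letters and producing Unicode Braille cells, is shown below; real transcription software must also handle contractions, capitalization signs, punctuation and the number mode discussed in Sect. 5.

```python
# Dot numbers for the letters a-z in literary Braille.
LETTER_DOTS = {
    'a': (1,), 'b': (1, 2), 'c': (1, 4), 'd': (1, 4, 5), 'e': (1, 5),
    'f': (1, 2, 4), 'g': (1, 2, 4, 5), 'h': (1, 2, 5), 'i': (2, 4),
    'j': (2, 4, 5), 'k': (1, 3), 'l': (1, 2, 3), 'm': (1, 3, 4),
    'n': (1, 3, 4, 5), 'o': (1, 3, 5), 'p': (1, 2, 3, 4),
    'q': (1, 2, 3, 4, 5), 'r': (1, 2, 3, 5), 's': (2, 3, 4),
    't': (2, 3, 4, 5), 'u': (1, 3, 6), 'v': (1, 2, 3, 6),
    'w': (2, 4, 5, 6), 'x': (1, 3, 4, 6), 'y': (1, 3, 4, 5, 6),
    'z': (1, 3, 5, 6),
}

def cell(dots):
    """Map a set of dot numbers (1-6) onto a Unicode Braille pattern character."""
    code = 0x2800
    for d in dots:
        code |= 1 << (d - 1)
    return chr(code)

def to_grade1_braille(text):
    out = []
    for ch in text.lower():
        if ch in LETTER_DOTS:
            out.append(cell(LETTER_DOTS[ch]))
        elif ch == ' ':
            out.append(chr(0x2800))      # blank cell
        # Digits, punctuation and contractions are omitted in this sketch.
    return ''.join(out)

print(to_grade1_braille("braille"))   # ⠃⠗⠁⠊⠇⠇⠑
```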

Now, automatic transcription of text into Braille is fairly commonplace, with Braille output being produced through embossing machines. Many such devices are available, with several different resolutions developed over the years [11, 28, 29, 71, 75, 87, 123, 170, 221, 231]. A recent survey of Braille embossers is found in [10]. The RNIB also maintains a list of embossers available on the market today [168].

There is a more immediate form of Braille transcription and presentation which is available at a relatively low cost. Braille display terminals are small portable terminals that present either 20, 40 or 80 characters to a blind reader through a set of refreshable Braille cells. Many of these displays now come with Braille note-taking interfaces consisting of seven keys that can be used for navigating documents and recording Braille characters.

Non-standard text layout

With all of the discussed technologies for rendering and transcribing text, it would seem that this problem is, for the most part, solved. However, there are still some significant challenges that need to be addressed in the research community.

The vast majority of these challenges result from the 2D layout of text, which provides context for how to read the information. For example, a document may contain layout information indicating section headers, spacing marking paragraph breaks and, in some places, lists of information which are indented to indicate their importance to the whole document. Sighted readers are able to take in all of this information through an overview process facilitated by the visual sense. On the other hand, if a sighted reader were to read the document one line at a time, seeing only between 20 and 80 characters on each line, the reading experience would be very different, and would be analogous to reading a document through a single-line refreshable Braille display. Indeed, a sighted reader can see at one time approximately 50 times more of a document than a blind reader perceives when using a single-line Braille display. This difference in perception gives the sighted user the advantage that the text can be placed in context within the document as it is being read, without the increased load on working memory caused by viewing a document one line at a time.

Moving away from ordinary text documents such as novels and newspapers, there are several other types of text information which provide significant presentation difficulties when translated into sequential form. An example is the computer pseudocode presented in Algorithm 1.

Algorithm 1: Insertion sort (pseudocode)
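
The original Algorithm 1 figure is not reproduced here; the following is a representative reconstruction of insertion-sort pseudocode of the kind under discussion, written in Python for concreteness. The variable names are illustrative, and the nested indentation carries exactly the semantic grouping that a purely linear, line-by-line reading loses.

```python
def insertion_sort(A):
    # The nesting below carries meaning: the while loop belongs to the body
    # of the for loop, and the final assignment closes each pass.
    for i in range(1, len(A)):
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j = j - 1
        A[j + 1] = key
    return A
```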

This simple algorithm for an insertion sort poses several problems for a screen reader, due to the combination of text with mathematical symbols (which will be discussed further in Sect. 5) arranged in semantic groupings. Specifically, there is significant meaning contained in the indentation of the code that must be communicated to the blind individual. Added to this is the non-standard notation present in the user-defined variable names; as a result, this algorithm would be communicated very poorly through a screen reader or other assistive technology. While there has been some research on reading and presenting source code [181, 218, 232], tools to process non-standard notation are uncommon.

Similar problems can occur in the processing of tabular data by speech synthesis. Despite the fact that table data presentation is a heavily researched area, in particular due to the prominence of tables in the layout of hypermedia documents (as discussed in Sect. 7), there is no universally agreed upon solution. This is partially due to the large number of variations in table use and structure; however, it also depends on the intentions of the user when perusing table data [164, 230].
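
One common strategy, sketched below purely as an illustration rather than an agreed solution, is to linearize each cell together with its row label and column header so that the listener retains context; the table contents are invented.

```python
def speak_table(column_headers, rows, speak=print):
    """Linearize a table for speech: every cell is announced together with its
    row label and column header so the listener keeps the context."""
    for row in rows:
        row_label, *cells = row
        for header, value in zip(column_headers[1:], cells):
            speak(f"{row_label}, {header}: {value}")

headers = ["City", "Population", "Area (square kilometres)"]
data = [
    ["Toronto", "2.9 million", "630"],
    ["Montreal", "1.8 million", "431"],
]
speak_table(headers, data)
# Toronto, Population: 2.9 million
# Toronto, Area (square kilometres): 630
# ...
```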

Mathematics presentation

Many of the solutions used for the presentation of text tend not to work well with mathematics. The reasons for this come from the very structure of mathematical notation or, more appropriately, from the plethora of audio descriptions a fragment of mathematical notation can take. For example, consider the following formula:

$$ x = \frac{-b + \sqrt{b^{2} - 4ac}}{2a}. $$

How should a screen reader vocalize such a formula? Should the numerator be read first or the denominator? Regarding the terms under the square root sign, is 4ac a product of two or three terms? The ambiguity resulting from the perception of the mathematical notation without understanding the intention of the author, when combined with the problem of vocalizing the notation in a predictable way, makes audio presentation extremely difficult. Of course, the alternative to audio presentation is to generate a tactile representation of the mathematical notation which can be explored with the hands. This section discusses the many approaches for both audio and tactile mathematics presentation, each of which has its own benefits.

Audio presentation of mathematics

Many of the earliest approaches towards the presentation of mathematics drew on the availability of audio hardware to generate speech interpretations of the notation. This type of approach suffered from the same problems as the highly structured text reviewed in Sect. 4. Experimental evidence indicates that internalizing the structure of mathematical notation can be very difficult when it is presented through audio [191]. This may be attributed to the increased cognitive load on the reader, who must perceive and understand the notation while also navigating through a mathematical document to detect terms of interest.

Raman attempted to solve these types of navigation problems based on his own experience with university level computer science research papers. The Aster project introduced a customizable profile to specify how a user would prefer to read a document. Through a collection of user defined rules, the following options are available [164, 165]:

  • browse the entire document;

  • skip sections entirely;

  • retrieve summaries of technical areas of the document;

  • mark areas for recall;

  • retrieve simplified or descriptive audio output on mathematical formulae;

  • recognize patterns for specialized context renderings.

Whereas Raman’s [164] work focused on providing access to advanced technical documents, the work by Edwards and Stevens on the Mathtalk system was intended for high-school level and early undergraduate work in mathematics, in particular complex algebra [191].

Edwards and Stevens recognized that the key advantage of the visual sense is that readers do not need to remember all the information presented at one time. The fundamental difference between sighted and blind mathematicians is that sighted mathematicians use paper to record progress and to recall previously encountered mathematical symbols. In this way, the sighted mathematician can focus on the comprehension of what the symbols mean, as opposed to the sequence of presentation. The Mathtalk system allows audio browsing of algebraic equations through an active reading process, with the user participating in the display and review of mathematical notation [191].

The work by Edwards and Stevens on mathematics for people with visual disabilities is extensive; the following are general design recommendations from that work:

  1. Lexical cues can provide a means of breaking up algebra into unambiguous representations (a small illustrative sketch follows this list).

  2. Prosody of speech, when used as an equivalent of the typographic rules for formatting algebra in print, can provide a better means of understanding equation structure than lexical cues.

  3. A method of navigating the text at all levels must be provided. The user must be able to step through sections of the text to gain a preview of a complete document, and skip over objects which are not pertinent to the reading task. These rules must be extensible by the user on a situational basis, allowing the rules to change while the document is read.

  4. The user must be able to navigate through a formula, with the application providing the ability to identify sections of the formula through audio cues and either read the contents of a section or skip over it entirely.

  5. Blind users will use various reading strategies for mathematics. These strategies must be taken into consideration when the mathematics interface is designed [190].
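
As a small illustration of recommendations 1 and 4, the following sketch renders the quadratic formula from the beginning of this section with explicit begin/end lexical cues by walking a small expression tree; the tree encoding and the exact wording are assumptions made for illustration, not the Mathtalk rules themselves.

```python
# Each node is (operator, children) or a plain string leaf.
QUADRATIC = ("fraction",
             [("sum", ["negative b",
                       ("square root", [("difference", ["b squared", "4 a c"])])]),
              "2 a"])

def speak(node):
    """Produce an unambiguous spoken form using begin/end lexical cues."""
    if isinstance(node, str):
        return node
    op, children = node
    if op == "fraction":
        return (f"begin fraction, {speak(children[0])}, "
                f"over, {speak(children[1])}, end fraction")
    if op == "square root":
        return f"begin square root, {speak(children[0])}, end square root"
    if op == "sum":
        return " plus ".join(speak(c) for c in children)
    if op == "difference":
        return " minus ".join(speak(c) for c in children)
    return " ".join(speak(c) for c in children)

print("x equals " + speak(QUADRATIC))
# x equals begin fraction, negative b plus begin square root, b squared minus
# 4 a c, end square root, over, 2 a, end fraction
```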

Recently, Gillan and Karshmer completed a large study on how people process mathematics when it is presented through audio and through print. Their results corroborate the design principles above [100].

Furthermore, recent work in the European project Linear Access to Mathematics for Braille Device and Audio-synthesis (LAMBDA) uses this principle of active recording in the design of an interface combining tactile and audio user interaction with mathematics [177].

Tactile presentation of mathematics

Currently, two options for the tactile presentation of mathematics are regularly used, namely Braille codes and the Dots-Plus system.

Tactile codes

The Nemeth code was developed in 1968 and is the standard code for the tactile presentation of mathematics in North America. This standard uses context symbols to switch between the literary context and the mathematics context. The code was designed primarily as a transcription language, and while it is recommended that individuals doing transcription have technical knowledge of the mathematical material, it is intended that anyone who knows the Nemeth code can transcribe a written mathematical document directly [141].

A second commonly used mathematics code is the Marburg code, used primarily in the European Union. In comparison to the Nemeth code, which represents the syntax of mathematics, the Marburg notation combines content with presentation information. Through the use of prefix indicators for identifying parts of a formula, and the spacing of characters and delimiter marks for displaying 2D mathematics in a linear form, the Marburg code is capable of representing the majority of mathematics through 64 symbols [14].

There remains some debate regarding the effectiveness of a “number mode” presentation style, similar to that of the Marburg code and the literary Braille code, due to the perceived “clumsiness” of complex mathematics presented in this manner. The alternative is to assign a unique symbol to each digit, as is done in the Nemeth code once it enters the mathematics context, and in the GS code proposed by Gardner and Salinas [74]. Several other codes for mathematics have been proposed and are or have been in use, such as the Halifax code [85], as well as Russian and French codes.

The history of translating mathematics into Braille codes is not as extensive as that of translating text; however, there are several options documented in the literature. One such attempt at providing a system to translate ad hoc mathematics documents for students is the work by Dr. Fred Lytle, a chemistry professor at Purdue University. While teaching blind students in his chemistry class, he had been told that it was impossible to generate Nemeth code automatically from a document specification, due to the context problems associated with such a transcription. Lytle prepared a mathematics to Nemeth code translation program as a macro set for WordPerfect 7 on Windows personal computers [119].

More recent work on the transcription of mathematics into tactile codes has been carried out through the Universal Mathematics Access Project, spearheaded by the University of South Florida Lakeland and the University of New Mexico. The overall goal of this project is to provide a Universal Math Converter which will convert from traditional mathematics authoring languages, such as TeX, OpenMath and MathML, into either Nemeth or Marburg Braille [147].

Recent work at the University of Western Ontario has focused on the complete translation of technical documents containing both simple text and mathematics into their Braille equivalents from a plain TeX source. This translation is then presented to the user through a document browser on a refreshable pin display [56, 95].

DotsPlus

Standard literary Braille has low usage, with estimates of use ranging from 10 to 20% of all blind/low-vision readers knowing the code. Mathematics Braille codes have even lower usage. Particular reasons for this lack of use are the need to remember complex symbol combinations and the contextual overloading of symbols.

DotsPlus Braille attempts to address this problem by combining graphical characters with specialized numerical Braille symbols. According to the description available through the Science Access Project (SAP), DotsPlus addresses some of the problems associated with traditional mathematical Braille, specifically:

Translation. Mathematics is translated into tactile form through DotsPlus Braille font characters, which are substituted for their print equivalents.

Numbers. DotsPlus avoids the use of the number mode described above, instead using single-cell Braille numbers; an additional dot is added to each of the characters that represent the digits in literary Braille number mode.

Exotic symbols. Exotic symbols such as summation and integration symbols are represented as direct tactile translations of their visual equivalents.

Combined with the TIGER Braille printers, the DotsPlus system is an alternative for readers at all levels of mathematics. Those who have lost their sight later in life can use their residual visual memory to process the outlines of exotic symbols. Further information regarding DotsPlus and the Tiger embosser can be found in [67, 69, 70, 72].

Graphics presentation

Diagrams are critical for the process of collecting, organizing and interpreting data, as well as for the exchange of information within office or education environments. Inaccessible graphics are a barrier that must be addressed for these settings to be inclusive for people with visual disabilities [107].

Taxonomies of graphics

There are a variety of types of pictures throughout media. Graphics can be divided into two very broad categories based on their presentation formats [94]. First, there are graphics which are representations of real-world phenomena. These graphics, referred to as pictures, require precision in the placement of their graphical elements in order to duplicate the features of their real-world equivalents. By comparison, diagrams are mappings of real-world ideas to abstract representations. In diagrams it is much easier to separate the meaning of the diagram from its presentation. Indeed, in many cases, a diverse collection of diagrams can represent a single set of data, as, for example, in histograms and pie charts.

The two categories of graphics mentioned above can be further broken down into a classification based on the intended use of the final graphic document. Under the category of pictures, there are photographs, navigational maps and structure diagrams such as architectural plans and medical sketches.

Diagrams are a larger category, as humans tend to use many kinds of diagrams to organize data and to ease interpretation tasks. Sub-categories of diagrams include statistical charts including bar charts, histograms, pie charts and graph diagrams (i.e., a collection of labeled nodes and edges, including trees and modeling language diagrams).

Way, in his treatise on tactile graphics [224], further distinguishes between these types of graphics, pointing out that time can play a part in understanding graphical content. Static graphics are those which, once complete, will not change, like a photograph or a sketch, while dynamic graphics are likely to change over time, such as software modeling diagrams. While such a classification system is reasonable, it is perhaps not flexible enough to encompass all graphical documents. Consider, for example, a set of architectural blueprints for an office building: the blueprints will change a great deal during the initial design phase for the building; however, once construction has started, the blueprints are unlikely to change. Later, when the building requires renovation, the blueprints will be examined and changed according to the needs of the clients. This type of punctuated change occurs in many types of graphics, including architectures for buildings and software, city maps (as new roads are built) and circuit diagrams. This implies that there is an iterative life cycle for graphics, in which a graphic has its requirements specified and is authored and, after an arbitrary amount of time, the requirements are changed and the graphic updated.

While the above taxonomies provide an understanding of how to interpret graphics, they do not provide hints regarding how audio/tactile diagrams should be presented to a reader with a visual disability. Recent conferences on tactile graphics have shown that this is a question which still defies a precise answer, as there is no consensus on what makes a “good” or “meaningful” tactile graphic. Clearly, no single solution can address all of these different types of graphics. As a result, a large number of projects have approached graphics presentation through audio, tactile/haptic or multimodal presentation.

Audio presentation of graphics

In their recent survey of audio presentation of diagrams, Brown et al. [24] identified several design principles which are required for non-visual diagram access. These principles also apply to tactile and multimodal presentations of diagrams.

  • Overview. In much the same way that Stevens [191] advocates providing navigation from the general to the specific in mathematics equations, diagrams must provide an external reference for the reader such that organizational information does not need to be completely internalized. This is difficult to achieve in highly detailed tactile pictures, where resolution and workspace size are limited, as well as in audio, due to its inherently sequential nature.

  • Search. A facility to search for specific pieces of information or types of information is essential for providing an understanding of diagram contents [16].

  • Recognition. The search mechanism provides access to the explicit information in a diagram, such as the nodes or edges of a graph; however, there should also be a means of providing access to implicit features of interest within the diagram. The example used in [24] is locating and describing cycles within a simple graph (a sketch of such an analysis follows this list).

  • Representational constraints. Many approaches emphasize the use of a diagram form which is similar to the printed form used by sighted people, in order to support collaboration between mainstream readers and readers with visual disabilities. Similar results can be seen in work on the presentation of tree diagrams [6], histograms [132] and technical diagrams [158]. However, as Challis et al. [30] observed in their work on music presentation, simple visual to tactile translations do not always result in a workable document, and thus any graphic design must be tempered with user evaluation.
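
As an illustration of the Recognition point above, the following sketch uses a depth-first search to locate a cycle in a simple graph given as an adjacency list and to announce it as text; the graph and the spoken wording are invented for illustration.

```python
def find_cycle(graph):
    """Depth-first search for a cycle in an undirected graph given as an
    adjacency list; returns the cycle as a list of nodes, or None."""
    visited, parent = set(), {}

    def dfs(node, prev):
        visited.add(node)
        for neighbour in graph[node]:
            if neighbour == prev:
                continue
            if neighbour in visited:          # back edge: reconstruct the cycle
                cycle, current = [neighbour], node
                while current != neighbour:
                    cycle.append(current)
                    current = parent[current]
                cycle.append(neighbour)
                return cycle
            parent[neighbour] = node
            result = dfs(neighbour, node)
            if result:
                return result
        return None

    for start in graph:
        if start not in visited:
            parent[start] = None
            result = dfs(start, None)
            if result:
                return result
    return None

diagram = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B", "D"], "D": ["B", "C"]}
cycle = find_cycle(diagram)
if cycle:
    print("Cycle found: " + ", ".join(cycle))   # Cycle found: B, D, C, B
```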

These guidelines provide a good start for the design of tactile diagrams; however, the resulting diagram will be further constrained by the medium in which it is presented. Further examples and guidelines on how to read various diagram types have recently been the focus of the Technical Drawings Understanding for the Blind (TeDUB) project, and an initial document has been published for study materials [40]. This work is significant in that the researchers approached experts and users regarding what type of information is to be conveyed through particular types of diagrams, such as those of software architecture. This type of user engagement produced a deeper understanding of the intentions of authors and readers, in order to produce sensible interactions that not only allow the user to perceive aspects of the diagram, but also aid in the navigation and comprehension of the diagram’s meaning.

This type of engagement with the target audience, from the perspectives of both the author and the reader, is essential for future improvement of the audio presentation of graphics.

Tactile presentation of graphics

When trying to represent pictures, the constraints on presentation are fixed; the tactile picture duplicates the visual scene as closely as possible. For example, a change in the location or size of a feature in a tactile map could lead to a misunderstanding of the layout of a room, or of the distance between two cities.

For the tactile rendering of photographs, the most notable example of such a system is the work by Way on the TACTICS project [225]. This system uses image processing techniques to emphasize boundary areas and height differences in order to generate a static tactile picture which can then be explored by a blind person.

Tactile maps are one of the most heavily researched areas regarding pictures. In particular, Ungar et al. [18, 202–215] conducted several research projects regarding the exploration and encoding of information on tactile maps by blind individuals.

A conference on tactile graphics [162] shows that there is no consensus among the research and user communities regarding which guidelines should be considered as standard. It may be that such a debate is the result of existing standards being too constrained in their representations of information. This over-specification results in graphics which are accessible to a very narrowly defined user group, but inaccessible to others with only minor differences in accessibility requirements. Examples of problems which may arise from over-specific standards are as follows:

  • Low-vision users prefer to take advantage of their residual sight and, as a result, diagrams prepared without enlarged fonts or without extreme contrast will be less accessible.

  • The age of sight loss can play a role in graphic interpretation, as residual visual memory can help late-blind individuals interpret diagrams with which they are familiar, such as the math symbols discussed in the Dots-Plus project [68].

  • Different cultural backgrounds lead to different expectations for diagram presentation. For example, a recent project in Japan produced a set of guidelines for tactile graphics [65]. Although it is clear from this work that there are differences between this and other user groups, it is not clear what is different about Japanese users which makes North American or European tactile graphics guidelines not applicable.

  • Tactile sensitivity may be low with some users; as a result, more space between tactile features may be needed.

Due to these problems, guidelines must either be very precisely specified for a particular user group (see the research done by Jacko et al. [89–91] in their work on elderly low-vision adults and the specification of visual profiles), or very general guidelines must be specified.

An investigation of existing tactile graphic standards has resulted in the following collection of tactile features, which seem to be accepted by the research and user communities.

  1. Tactile symbols should be simple [30, 36]. In this case, simple refers to the amount of time needed to comprehend a specific symbol. For example, a star symbol requires more time than a circle, due to the need to count points.

  2. Consistent mapping of tactile shapes to concepts is necessary to enhance comprehension [30, 106].

  3. A minimal number of tactile symbols should be used, to reduce the cognitive load on the reader (preferably fewer than 15 symbols [106]).

  4. Tactile symbol design should relate to the information being represented. Overly abstract symbols will require frequent consultation, by the reader, of either a legend or an expert reader [106]. This relates to the problem of presentation consistency between sighted and non-sighted formats, as those symbols which are shared will be more recognizable by people who became blind later in life.

  5. Diagrams should avoid disconnected components with excess white space between them. Large amounts of empty space lead to disorientation of the reader [30, 36]. However, this must be tempered with the knowledge that objects cannot be spaced too closely together, as they will become indistinguishable from one another.

  6. Consistent line type and line size are important factors when attempting to have the user follow a specific path. For those with low vision, highly contrasting colors are also important in this task [36, 106].

  7. Braille labels should be kept to a minimum, due to the large amount of space required for them [36].

Multimodal presentation of graphics

A common trend in tactile graphics in recent years is to combine audio output with tactile pictures to aid in navigation and comprehension. This is usually accomplished by placing a static printout on a touch-sensitive pad which transmits finger positions to a computer so that it can play associated sounds, such as those discussed in Sect. 2. These systems have the advantage of being able to communicate layers of both speech and non-speech sounds to the reader in conjunction with tactile exploration. Examples of such documents can be found in the work on the Talking Tactile Tablet by Touchgraphics Incorporated and the IVEO Tablet by ViewPlus Technologies [105, 106, 217]. One of the first such systems was proposed in detail in [158].

There remain several open problems regarding this type of technology. First, complications arise from the interruption of audio information through repeated interaction with the touch pad. This interruption can lead to a “stuttering” effect in the audio playback. Second, there is a problem associated with the detection of the finger positions on the display. Even with the low resolution of such displays, it is difficult to determine the exact location of the fingers of the user. This can result in inaccurate reporting of the audio annotations, or the user receiving no audio information at all.
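
A simplified sketch of the coordinate-to-annotation lookup that such tablet systems perform is given below; the annotated regions and the speak callback are hypothetical, and the minimum-interval check is one simple way to suppress the "stuttering" caused by the repeated triggering described above.

```python
import time

# Hypothetical annotated regions of a tactile overlay: (x1, y1, x2, y2) -> label.
REGIONS = {
    (0, 0, 100, 100): "Kitchen",
    (100, 0, 200, 100): "Hallway",
    (0, 100, 200, 200): "Meeting room",
}

_last = {"label": None, "time": 0.0}

def annotation_for_touch(x, y, speak, min_interval=1.0):
    """Look up the region under the finger and speak its label, suppressing
    rapid repeats so the audio is not restarted on every pressure reading."""
    for (x1, y1, x2, y2), label in REGIONS.items():
        if x1 <= x < x2 and y1 <= y < y2:
            now = time.time()
            if label != _last["label"] or now - _last["time"] > min_interval:
                _last["label"], _last["time"] = label, now
                speak(label)
            return label
    return None

annotation_for_touch(40, 30, speak=print)    # Kitchen
annotation_for_touch(42, 31, speak=print)    # repeat within the interval is suppressed
```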

The World Wide Web

The Internet and the World Wide Web (WWW) have changed the way people interact with information. In the last 10 years, information on just about every subject imaginable has been made available for users to download and peruse at their leisure. The web originally held a great deal of potential for helping to eliminate the gap in access to information between those with sight and those without. Web pages that originally consisted of mostly text and few graphics promised to be a resource that would be accessible through screen reading technology and Braille displays. However, as connection speeds increased, users began to demand more variety in their media. When new technologies were designed to provide graphics, games and more on the web, they were produced so quickly that they rarely considered accessibility factors. Instead, companies focused on providing more content, faster, in what seems to be a continuing downhill trend in accessibility. With the majority of new web sites consisting of extensive non-text, visual content without a descriptive audio or tactile counterpart, much of the content is unavailable to people who are blind.

There are several thematic areas in the literature on how to make web documents more accessible for people with visual disabilities, including guidelines governing how web sites are created, specialized browser design for disabled users, navigation and presentation strategies, and the semantic deconstruction of web page content.

Identifying the problems

When examining web pages, it is easy to see that they suffer from many of the same problems present in other types of documents, such as text in non-standard layout and graphics embedded in documents without any kind of alternative access to the information.

These problems are augmented by the very nature of web pages and web sites. First, the documents are intended to be viewed online, with little thought given to how a hard copy of a web page should look or be generated. Therefore, it is difficult to automatically generate an accessible hard copy of the document for a sighted or blind person. With the addition of animated graphics and multimedia applications, automatic rendering is not a viable option.

The following sub-sections summarize the features in web pages that are significant barriers to access.

Navigation by hyperlinks

One significant problem with web pages is the method of navigation between individual pages. In a book, the method of moving to new content is obvious: the reader turns the page. In the case of web pages, hyperlinks attached to text within the actual document are used to move from page to page to access content. There have been several efforts to understand how to present such links in alternative ways to people with visual disabilities, including Raman’s work on aural stylesheets for web pages [163] and the work of Petrie et al. [136] on the use of earcons and auditory icons in presenting navigation information.

Frames

Frames are a navigational challenge to anything but a point-and-click interface, due to the need to direct window focus to a specific frame in order to navigate the links contained within it. Even though there are features which can be used to make frames more accessible, such as the use of frame titles and the noframes XHTML element, these features are seldom used or, if used, are improperly composed (e.g., a frame title such as “Top Frame”) [53]. In order to address these challenges, it is important to inform web developers of alternatives such as using the div XHTML element for layout.

Tables

As observed by Raman [163], tables are used in two significantly different ways within web pages. The first is to organize relational data. In this case, tables produce many of the same problems that are observed in [26] regarding the navigation of structured information. In particular, users are unable to orient themselves within the broader space in order to understand the information in context. In such cases, tables must be marked up appropriately with summary data, as well as row and column headings which can be read by a screen reader. The second use for tables is the layout of text and graphics; this use should be avoided in favor of the now-standard div element.

Graphical content

Web pages can contain informative images [152], which convey information related to the web page content, as well as decorative images which are completely unrelated to the actual content (e.g., advertisements, bullet graphics). Additionally, navigation links can be attached to graphics; these may be difficult to navigate, as they seldom have descriptive text associated with them [53, 154].

A recent study based on interviews with print disabled users indicated that the guidelines for providing alternative text and long description text are still insufficient. The following were cited in the interviews as problems [152]:

  1. There was consensus among users that not all images require descriptions. In particular, those images which are used for spacing and filler should have empty strings provided in place of descriptions, in order to facilitate skipping over these images.

  2. The WCAG guidelines state that a minimum of two to three words should be used to describe all images. For informative images that do require a description, this is insufficient to convey what the graphic is supposed to communicate.

  3. Descriptions should augment the information that is already contained in the body or caption text.

An investigation of over 100 web pages based on the results from these interviews showed that 71% of the informative images contained descriptive comments, while only 10% of the decorative images had associated descriptions. Obviously this is far below the goal of all informative images having such descriptions [152].

Application content

Most recently, high-speed Internet access has provided the means of introducing full-fledged applications into web sites. These applications, which were initially relatively limited in functionality, are now dominant in web site production. Tools like Flash and Java Servlets provide the web developer with a great deal of flexibility, but, at the same time, even more care is necessary to ensure that content remains accessible to disabled users. While there has been previous work examining simple web applications, such as Java applets [53], to date there has not been an extensive study focused on the accessibility of such web applications.

Standards, guidelines and legislation

There have been several attempts to provide guidance and regulatory controls for the World Wide Web. While these standards are well regarded in the web development community and are endorsed by governmental agencies, they are seldom followed by industry, either due to lack of training or, worse, apathy towards such a small segment of the population.

The most widely known standards and guidelines come from the World Wide Web Consortium (W3C) [61] through the Web Accessibility Initiative (WAI). This organization provides a forum for researchers and other collaborators to contribute to the creation of guidelines for governing how information is to be presented on the web. There are several groups addressing a number of topics, including: Web Content Accessibility Guidelines (WCAG), Scalable Vector Graphics (SVG), Authoring Tool Accessibility, and markup languages (among others).

These guidelines are freely available, and involvement in the groups is encouraged for anyone who has the expertise and interest to participate. This open format draws significant contributions from both public and private organizations interested in accessible content on the web.

Colwell and Petrie examined WCAG 1.0 [35, 150]. Their studies looked at both the readability of the guidelines themselves and their use in developing new web pages. These experiments showed that there were several problems with navigation of the standards document, which caused developers to make mistakes while creating web pages. It was also found that some of the guidelines did not produce optimal results in creating accessible content.

In a recent study, Freitas et al. [61] observed that many of the problems with generating accessible content come from the lack of support available to developers. Developers often do not have time to read and learn such a large set of guidelines. This, coupled with the fact that many developers use drag-and-drop web page authoring tools which often do not support accessibility, leads to inaccessible content. Fortunately, it appears that some of these problems could be mitigated as software companies begin to include accessibility standards and validation technology in their products [142].

One problem underlying developers’ failure to know and understand their responsibilities to the disabled community is the sheer number of regulations which govern Internet content. There are not only international guidelines, as published by the W3C, but also local government regulations. There is also the question of international jurisdiction: should a developer from Canada respect the content guidelines of Brazil? The answer is not clear. It may be that several policies from multiple nations overlap in terms of content, but an extensive study, and perhaps an international treaty agreement, is required to ensure that the global Internet remains accessible to the disabled around the world. For references on international legislation and policy the reader is directed to [219].

There are some tools to help developers provide accessible content without an explicit knowledge of the regulations. These tools come in the form of checklists and validation programs. The WPASET checklist is a means of evaluating the design of a web site through a subjective set of questions about how accessible the developer has made a web page. As mentioned in [150], this checklist format lends itself to a certain amount of bias, because developers evaluate their own work; such a checklist might be more valuable if completed by an external referee. The validation programs, for the most part, do not perform their functions very well. Indeed, as discussed in [150], these automatic checking tools often fail to help the developer identify problems. This could be attributed to a lack of understanding of the initial guidelines by the designers of the validating programs, or to the inability of these programs to check qualities of the content rather than its markup. For example, an automatic checking tool can only detect whether an alternative text tag is present; it cannot check whether the content of that tag is correct.
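
The following minimal sketch (hypothetical, not modelled on any particular validation program) illustrates this limitation: it can flag an image whose alt attribute is missing, but it has no basis for judging whether an alt text that is present actually describes the image.

```python
# Hypothetical sketch of the purely automatic part of an accessibility check:
# presence of the alt attribute can be verified; its quality cannot.
from html.parser import HTMLParser

class AltChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        if "alt" not in attrs:
            self.problems.append(f"missing alt: {attrs.get('src', '?')}")
        # An alt text such as "logo.png" passes this check even though it
        # tells the reader nothing; judging its correctness needs a human.

checker = AltChecker()
checker.feed('<p><img src="chart.png"><img src="logo.png" alt="logo.png"></p>')
print(checker.problems)   # ['missing alt: chart.png']
```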

The most recent development in this area is the completion of the Benchmarking Tools and Methods for the Web (BenToWeb) project, an EU initiative working with the World Wide Web Consortium (W3C) Web Accessibility Initiative (WAI) on the development of evaluation and validation techniques and tools. This project has produced a large number of results [17] on a wide variety of accessibility issues. These results include:

  • An evaluation of color contrast equations regarding how well they describe users’ perceptions (a sketch of one such equation is given after this list).

  • An evaluation of factors affecting navigational consistency between web pages.

  • An extensive survey regarding awareness and knowledge of website owners and designers.

  • A comparison of the accessibility and usability of validation tools.

  • The creation of a test suite for validating test tools and methods regarding their correctness and completeness in testing WCAG guidelines.
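
As an illustration of the first item in this list, the following sketch computes the contrast ratio defined in WCAG 2.0 between two sRGB colors. It is offered only as an example of the kind of equation such evaluations compare against users’ perceptions, not necessarily one of the equations studied by BenToWeb.

```python
# WCAG 2.0 contrast ratio between two sRGB colors (components 0-255).
def channel(c):
    # Expand the sRGB gamma encoding to obtain a linear component.
    c = c / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb):
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))   # 21.0 for black on white
```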

Browsing solutions

Several authors have looked at the needs of people with visual disabilities and their access to web documents. Compiled here are recommendations, based on interviews and usability studies, for presenting web content to people with visual disabilities. For references on web browsers designed specifically for people with visual disabilities, the reader is directed to Raman’s work [163], to the ACCESS project and their DAHNI browser [154], the Brookestalk project [239], the SMART web browsing system by Truillet et al. [198] and the Home Page Reader by Asakawa [7]. A review of early web browser systems can also be found in [198].

There must be a means of providing an overview of the document, giving users the opportunity to understand whether the document contains the information they require, or contains links to other documents in which they might be interested. Several projects by Zajicek et al. [237] have provided evidence that features like headings can be used to build conceptual overviews for users with visual disabilities, while hyperlinks typically do not provide an appropriate overview, as they represent other documents rather than the current one. Also tested in [238] was the use of keywords, which proved to provide some context on the contents of a page; however, tri-grams (three consecutive words) caused difficulty in understanding a document, and an abridged text format was not very successful in communicating a page’s purpose.

For navigation, interfaces should include the ability to re-read sections of the document at various grammatical levels, allowing the user to review paragraphs, sentences or single words. This navigation must also provide a means of gathering related features for review. This includes keywords, headings (which are specified in the W3C Web Content Accessibility Guidelines (WCAG) as requirements for accessibility) and link structures. In addition, providing a means of traversing back along a known path to a previous location in a document or of returning to a known location is suggested for orientation of the reader [154].
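
A minimal sketch of the re-reading requirement is given below; it assumes plain text with blank-line paragraph breaks and sentence-final punctuation, both simplifications of real web content, and is not drawn from any of the cited browsers.

```python
# Segment a plain-text document at three grammatical levels so that a
# reader can re-read a paragraph, a sentence or a single word on demand.
import re

def units(text, level):
    if level == "paragraph":
        return [p.strip() for p in text.split("\n\n") if p.strip()]
    if level == "sentence":
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if level == "word":
        return text.split()
    raise ValueError(level)

doc = "Opening hours are listed below. The museum is closed on Mondays.\n\nTickets are free."
print(units(doc, "sentence")[1])   # 'The museum is closed on Mondays.'
```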

The majority of the solutions designed to date have extremely simple interface controls. In the case of the Brookestalk browser [239], functionality is mapped away from graphical buttons to the function keys on the keyboard of a personal computer. In the case of the Home Page Reader designed by Asakawa [7], all functionality (almost 100 features) was mapped to the keypad. Each of these implementations overloads the mappings on the interface buttons, and it is surprising that participants not only learned the interface but excelled at using it.

The ACCESS project designed a completely new web browsing interface that maps common web browsing functions onto a series of tactile buttons, which was shown to be effective in communicating the intent of the interface to the users who were included in the usability tests. Coupled with these tactile interfaces are auditory interfaces which provide both speech feedback (for text and link names) and non-speech sounds for indicating events that occur in the environment. In fact, the work in [136] shows that such non-speech sounds are a boon to helping the user navigate the complex interfaces required for web browsing.

Web page analysis solutions

Web pages, due to their markup languages, are rich in structural information. The syntax of HTML and of more advanced markup languages, such as XML, provides information about how pages are structured. It is therefore not surprising that much research has been devoted to leveraging this information to make the presented content more accessible to the disabled reader. The earliest such work, the aural cascading style sheets of Raman [163], later adopted by the WAI, uses the syntax of cascading style sheet files to define audio presentation details.

Alternatively, web pages can be restructured to provide different views depending on the navigation methods of the user or the task which is to be performed at a given time. For example, if the user wishes to review the contents of the page, it may be worthwhile to provide a table of contents based on the heading tags provided in the current XHTML specifications (2008).
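
As a rough illustration (a simplification that assumes well-formed heading markup, not a description of any published system), such a table of contents can be built by collecting the h1 to h6 elements in document order:

```python
import re

def toc(html):
    # Return (level, title) pairs for every heading element, in document order.
    return [(int(m.group(1)), re.sub(r"<[^>]+>", "", m.group(2)).strip())
            for m in re.finditer(r"<h([1-6])[^>]*>(.*?)</h\1>", html, re.S | re.I)]

print(toc("<h1>Visiting the museum</h1><p>...</p><h2>Opening hours</h2><h2>Tickets</h2>"))
# [(1, 'Visiting the museum'), (2, 'Opening hours'), (2, 'Tickets')]
```

The resulting list can then be spoken, or rendered as a nested set of links, before the body of the page is read.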

This type of approach is purely syntactic. Looking at the structure of web pages, it is easy to see that sections of a page serve very specific purposes. For example, it is common to have a navigation pane on the left-hand side of the page for easy access to links within the site. This grouping of content provides a certain amount of information regarding the use of such links. This information is apparent to sighted users through color, position and other sight-dependent attributes. The same information should be encoded overtly for a blind user, but to do this the intention of the content must be interpreted, which can be difficult, given the non-conformity of web pages. An example of an attempt to harness this semantic information is presented in [64], where semantic groupings permit the addition of an information-rich table of contents.

Finally, Pontelli et al. [159] propose a semantic representation of HTML structure (specifically for tables) in a graph format. This graph is hierarchical in nature, with the different levels of the structure representing progressively more granular views of the data contained within the structure. The first tier in the hierarchy represents the table itself, the second tier represents the rows, and the third tier represents the data contained within the cells. Combined with this representation is a domain-specific language (DSL), which is used to specify navigation through the links of the table and the hierarchy levels themselves. The intent of this system is to provide standard viewing annotations defined by the syntax definitions, but also to include separate DSL descriptions that help govern the navigation of the user through the data. Additionally, there may be opportunities for learning techniques to predict future viewing from a user’s former behavior.
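
The sketch below is purely illustrative and is not the representation or DSL of Pontelli et al.; it shows only the three-tier hierarchy they describe (table, rows, cells), together with a cursor that moves between tiers and between siblings, which is the style of navigation their DSL is intended to govern.

```python
# Three-tier hierarchy for a table (tier 1: table, tier 2: rows, tier 3: cells)
# with a cursor that can descend, ascend and report the current focus.
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def table_graph(rows):
    return Node("table",
                [Node(f"row {i}", [Node(cell) for cell in row])
                 for i, row in enumerate(rows, start=1)])

class Cursor:
    def __init__(self, root):
        self.path = [root]              # path from the table down to the current focus

    def down(self, index=0):
        self.path.append(self.path[-1].children[index])

    def up(self):
        if len(self.path) > 1:
            self.path.pop()

    def focus(self):
        return self.path[-1].label

g = table_graph([["Day", "Opens"], ["Monday", "9:00"]])
c = Cursor(g)
c.down(1)          # move to the second row
c.down(0)          # move to its first cell
print(c.focus())   # 'Monday'
```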

Multimedia presentation

Multimedia content is becoming more common on the web. The combination of text, graphics, video and audio presents great challenges in customization and personalization of contents for individuals with disabilities.

The MultiReader project had the goal of investigating and understanding the problems associated with navigating multimedia documents by both mainstream and disabled user groups. Since its inception, this project has provided not only guidelines on how multimedia information should be presented in order to optimize the comprehension of the reader, but also the design, implementation and testing of a prototypical application which applies the results of their user studies. The MultiReader application went through several iterations and its architecture serves not only as an example of how complex the problem of navigating multimedia is, but also as an indicator of how far commercial applications must evolve before they are truly accessible to all users.

The original set of requirements for the application was obtained through focus groups including mainstream users, users with visual disabilities, users with hearing disabilities and people with specific learning difficulties such as dyslexia. The requirements from the interview sessions were refined through iterative testing of the MultiReader prototype to produce a set of access requirements for each of these user groups [151].

Haptics

The newest technology for information access by people with visual disabilities is that of haptic technology. As this field is extensive in its breadth, only a selection of results is presented here.

In this section the broad term haptic sensation is used to refer to the detection of external stimuli such as kinesthetic forces, vibration and temperature through the skin. Haptic interfaces or simply haptics is the general term used in the research and corporate communities to describe the interfaces which can connect the haptic sensory system with a virtual environment, be it a traditional 2D interface or a virtual reality environment. While haptics has been often used for teleoperation of equipment in dangerous environments, robotics and game system controllers, this section focuses on its applications in providing interfaces to virtual worlds for people with visual disabilities. For a more complete discussion of applications of haptics in its more common uses, the reader is referred to [22, 194].

There are several devices that have been used to provide varying levels of vibration to a user who is blind. Most commonly, low-cost solutions available to researchers and users alike are used, such as haptic joysticks, gamepads or mice. These devices cost approximately $100 USD and can be purchased through local distributors (Footnote 6), and are thus more likely to be accepted by the community of people with visual disabilities.

In 2006, a new style of haptic mouse was introduced. The VTPlayer mouse is quite different from the haptic mice discussed above: instead of providing haptic feedback through vibration, it has two tactile cells on which the user’s fingers rest. This device holds some promise for novel interaction styles, such as the work by Brewster et al., which used a two-handed interaction style with the VTPlayer mouse in one hand and a pen-and-tablet input device in the other. This apparatus was then used to allow the user to explore a 2D scene with the pen hand, while tactile information was delivered to the mouse hand [220].

These tactile devices are suitable for some applications, but the range of haptic feedback is, with few exceptions, limited to varying levels of vibration and two degrees of freedom in movement. In order to achieve a more fine-grained haptic sensation, one must turn to high-end electronic devices such as the Phantom (Footnote 7) single-point haptic touch system by Sensable Technologies, or the Cybergrasp (Footnote 8) 5-point haptic system by the Immersion Corporation. Indeed, recent studies [233] have shown that, in activities in the absence of sight, the higher sensory feedback from devices such as these can improve performance in haptic interaction tasks in comparison to off-the-shelf technology.

There have been extensive projects testing the application of advanced haptic technology. However, since most of these results come from fields such as teleoperation and surgical medicine, it is difficult to generalize from these sight-dependent tasks to haptic exploration by a user who is blind. For this reason, there has been an increasing number of research results describing interaction techniques suitable for the blind user.

A large number of results regarding the use of haptics have been contributed by Lederman et al. [110] whose early work focussed on comparisons between sensory systems in estimating properties of materials, such as texture. Studies such as this one emphasize touch as a first class member of the human senses, being able to distinguish many properties, in this case texture, independent of vision.

Lederman et al. provide definitions of different types of exploratory procedures (EP) that can be employed by an individual for haptic identification of properties such as texture, hardness, temperature, weight, volume, global shape and exact shape. These exploratory procedures have been observed in laboratory settings with subjects interacting with 2D and 3D objects, such as those described in [167] (Footnote 9). These procedures are: lateral motion, pressure, static contact, unsupported holding, enclosure and contour following [103]. A discussion of how these exploratory procedures relate to haptic interface design can be found in [111].

There are many other results which have been identified as providing starting points for a set of guidelines for haptic interaction. These include:

  1. Haptic exploration with a single-point device results in slower recognition times and more misattributions than real-world exploration tasks [92].

  2. There is debate regarding complex scenes perceived by blind users. Some results indicate that complex scenes are extremely difficult to identify through haptics alone [93]. However, the study by Magnusson et al. [120] indicates that blind users were able to identify complex scenes reliably.

  3. For path finding, grooves are better than bumps, since users tend to “fall off” bumps [63].

  4. Roughness of textures is perceived to increase as groove width decreases [149].

  5. Internal exploration of an object (i.e., within the object boundaries) results in the object size being perceived as larger than with external exploration [149].

  6. Multiple points of virtual contact result in better size estimates of virtual objects [124, 125].

  7. Navigation through a virtual space can be done with only auditory cues [115].

Example applications that were specifically designed for people with visual disabilities include:

  • A haptic museum where virtual art pieces can be touched and explored [126].

  • Multimodal exploration of graphs for comprehension of data [23, 62, 63, 171, 233, 234, 236].

  • A non-visual molecule browser [23].

  • Navigation in unknown environments [185].

  • Software for exploration of mathematical relations [178].

  • World Wide Web exploration [139].

  • Exploration of virtual scenes (HOMERE) [109].

It is hoped that, as haptic technology becomes more affordable for both research facilities and home users, applications presenting realistic, natural, 3D haptic interaction will be achievable in the near future.

Conclusions

This paper has reviewed several examples of presenting media to people with visual disabilities. It first discussed the alternatives for making visually based information available to people with visual disabilities, providing in particular a survey of the types of sound and tactile based presentation options that are available.

Second, the problems which are encountered when attempting to translate and present textual information, mathematics and graphics were identified. The trend towards moving information to online sources, such as multimedia and web documents, was also examined, discussing some of the unique challenges associated with these types of documents.

All of these issues have been explored by both users and researchers, resulting in a variety of approaches, all moving toward one common goal: the equality of access to information for people with visual disabilities. While all successful approaches have their merits, very few have made their way into mainstream use by the population. The following general statements can be made regarding future access technologies:

  • Interface requirements need to be abstracted away from specific applications. Specific applications provide a means of testing the effectiveness of interface theories and designs. These applications have very specific human and technological factors which make them successful in achieving their goals. These factors need to be generalized in such a way that future research and commercial systems can include them in new applications.

  • Multimodal interfaces must continue to be brought to mainstream applications. While any type of feedback is of benefit to a user, material translated for people with visual disabilities should use both audio and tactile output. While audio output is certainly easier to produce, it is too serial to communicate all information effectively. It is clear from previous examples of technology, such as the Optacon and the IVEO tablet, that devices and applications which include tactile feedback are more readily accepted by the user community.

  • Further work on automatic transcoding is required. There are several examples of transcoding for each type of media discussed in this paper. Transcoding, if it can be accomplished without the aid of a human assistant, provides independence for users with a visual disability in controlling their access to information. It is also essential that the process be examined from the view of the document as a whole, so that one tool can render all of the information contained in a single document.

  • Further automatic and semi-automatic testing tools need to be developed. While there are many tools that provide automatic testing for accessibility on the web and in other domains, these tools address only a small subset of accessibility problems.

  • Awareness of universal access must be increased. Tools for transcoding and verification of material will continue to be ineffective if those who are in greatest need of them are unaware of their existence. Media transcribers, developers and students all must be informed of the challenges which exist for those with disabilities, so that they can look for specific tools.

  • Involvement of the target user group must be sought at all levels of design, implementation and testing. There is a need to include the target user group at all levels of the research process. There are several examples in the literature where tools have been designed without the input of the users, and then tested without participation of that community. However, certain techniques for acquiring testing data produce accessibility concerns of their own. An example of this is the use of time diaries for people with visual disabilities, which were shown to have their own set of unique problems [4]. It is important for researchers to be aware of such problems, so that development and testing plans can be adjusted appropriately.

In summary, research regarding accessibility of information for people with visual disabilities is extensive. However, with the rapid pace at which technology changes, it is important for researchers and developers as a community to abstract solutions away from specific technologies, so that accessibility of all information presentation can be achieved in the future.