
1 Introduction

All people should be able to experience museums to engage with art, culture, and history. However, there are barriers for people with visual impairments (VIs), including a lack of museums with accessibility accommodations [10, 16]. While the number of museums with accessibility accommodations is increasing, people with VIs still have to plan their visits to guarantee an accessible experience [23, 25, 28, 29].

Technology efforts aim to make museum spaces accessible, such as smartphone apps with art descriptions (e.g., [11, 19]). Bluetooth beacons can sense a person’s location [3, 26, 30] or depth cameras can sense one’s distance [24] to play relevant artwork descriptions. However, these experiences require experts to compose the artwork descriptions. Audio guides are expensive to implement, in both money and staff time. Audio guides that do exist might not conform to best practices as outlined by the Art Beyond Sight (ABS) Accessibility Guidelines [6] because they assume that the person can see the art. For example, an audio description might focus solely on the artist’s biography and give no information about what the art looks like. There are accessible audio tours, but only for a limited set of museums. There are unanswered questions about how to curate these descriptions from people other than curators, who are already occupied – for example, laypeople who are already visiting the museum.

Our research investigated how to curate audio guide-worthy content without the need for expert composition by guiding laypeople in composing accessible artwork descriptions. Our multidisciplinary team with Human-Computer Interaction (HCI) researchers and an Associate Curator at an art museum used the ABS Accessibility Guidelines [6] as a framework. We created four tasks (or short text assignments) inspired by HCI research (Baseline Approach) and four tasks inspired by the established ABS Accessibility Guidelines (ABS Approach). ABS is from Art Education for the Blind, which leads a multidisciplinary collaborative of sighted and blind professionals and advisors [7]. ABS Guidelines were developed from theory and research by sighted and blind scholars, professionals, and artists [8]. Our work is the first step toward curating accessible descriptions based on these guidelines.

We included different stakeholders in our research. To understand the feasibility of using artwork descriptions written by museum patrons, we analyzed artwork descriptions written by Mechanical Turk workers (MTurkers). Four docents evaluated the artwork descriptions per the ABS Accessibility Guidelines. 31 people with VIs evaluated the sets of contributions from both approaches in terms of how well they understood each artwork’s contents. Both people with VIs and docents rated the descriptions from the ABS Approach higher than those from the Baseline Approach, in understandability and per the ABS Guidelines, respectively. People with VIs appreciated the ABS Approach descriptions because they highlighted prominent elements, described the layout of the artwork, and made the artwork come alive. The ABS Approach shows potential: by having patrons respond to tasks, work by museum employees is reduced from composition to vetting. We show the feasibility of gathering accessible descriptions of artworks through a multidisciplinary process. We make three contributions.

  1. We describe the multidisciplinary team’s design process of the ABS Approach.

  2. We designed and developed tasks through collaborative work between experts in art and HCI to ensure the crowdsourced descriptions were accessible to people with VIs.

  3. We conducted an empirical study that compares ratings from 31 people with VIs and four docents of artwork descriptions from MTurkers. The second set of descriptions (from the ABS Approach) had higher ratings of accessibility from docents and understandability from people with VIs. They had vivid details and orientation information.

2 Background and Related Work

2.1 Gathering Textual Contributions in Museum Spaces

HCI research has investigated museumgoer engagement by gathering textual responses to artworks. Alelis et al. [4] studied visitor engagement through a task that requested emotional responses to artworks, and found that people were motivated to find personal connections with the artwork. Clarke et al. [15] deployed MyRun, a “participatory platform” with 13 touchscreens as a part of a 3-month exhibition about a famous half marathon. MyRun asked visitors to share stories about the half marathon and collected ~13,000 contributions. Cosley et al. [17] deployed MobiTags, a mobile system to improve visitor interaction with exhibits. They allowed users to view and “place” tags on objects throughout an exhibit. They found that people used the tags to form impressions of objects and as navigational tools. Cosley et al. [18] deployed ArtLinks, a standalone computer with keyboard, mouse, and display at a museum exhibit to foster social awareness and reflections. ArtLinks asked users to provide words and short phrases while reflecting on an artwork. Participants liked the social aspects of the interaction and being part of the museum system. Though this research engaged people and collected artwork information, it is unclear whether the descriptions are accessible. Our Baseline Approach is based on this prior research.

2.2 Art Beyond Sight

Art Beyond Sight (ABS) is from Art Education for the Blind, which leads a multidisciplinary collaborative of sighted and blind professionals and advisors. ABS is a collaborative of community-based groups, museums, schools, advocacy groups, and people with VIs. Art Education for the Blind was founded in 1987, and its mission is to “make art, art history, and visual culture accessible to people who are blind or visually impaired.” It creates accessible art programming and materials, advances knowledge in the field, and encourages visually impaired artists. It also partners with museums around the world to increase accessibility [7].

The ABS Guidelines [6] are 16 guidelines spanning from museum information, context of the artwork within the museum, explaining the artworks, to allowing people to interact with the artworks. They were developed from theory and research by sighted and blind scholars, professionals, and artists [8]. We did not focus on “Standard Information” because we provided it (e.g., artist, title, see Fig. 1). We excluded “Focus on the Style” and “Provide Information on the Historical and Social Context” because the associate museum curator determined these guidelines require formal education (e.g., brushwork, history of the artwork). Because the study was electronic, it would be inappropriate to have people “Describe the Importance of the Technique or Medium” because it would involve seeing the art in person. Finally, our study is museum agnostic, so we excluded “Indicate Where the Curators Have Installed a Work” because the artworks were presented as an image in a survey.

We excluded another 4 guidelines because we restricted artwork descriptions to text: “Incorporate Sound in Creative Ways,” “Allow People to Touch Artworks,” “Alternative Touchable Materials,” and “Tactile Illustrations of Artworks.” We worked with the remaining 7 guidelines, which laypeople can address:

  1. General Overview: Subject, Form, and Color: A general overview of the painting’s subject, form, and color is given by presenting visual information in a sequence.

  2. Orient the Viewer with Directions: The viewer is oriented with directions using specific and concrete information on the location of objects or figures in the image.

  3. Use Specific Words: The description uses specific words and includes clear and precise language that can be taken literally.

  4. Provide Vivid Details: Vivid details of different parts of the painting are provided.

  5. Refer to Other Senses as Analogues for Vision: Visual experiences are translated into other senses.

  6. Explain Intangible Concepts with Analogies: Difficult to describe visual phenomena are explained by using analogies that compare the phenomena to objects or experiences from everyone’s common experience.

  7. Encourage Understanding through Reenactment: Instructions to mimic a depicted figure’s pose are given.

2.3 Crowdsourced Descriptions of Images for People with VIs

Crowdsourcing is a low-cost solution to quickly source verbal descriptions of images for people with VIs. Several research efforts that involve VizWiz [12, 13, 14] enable people with VIs to receive information about their environment through crowdsourced responses to photo questions about objects. Burton et al. [14] explored subjective responses to fashion accessibility for people with VIs. This study used volunteers, none of whom were fashion experts. People with VIs trusted the volunteers’ responses. UCap is an Android application that allows blind consumers to request product descriptions based on images [20]. While these works are successful at obtaining visual information, our contribution is developing tasks with a museum expert to curate art-related accessible information.

2.4 Art and Museum Accessibility for People with VIs

People with VIs do not experience the same level of access to museums as sighted people. While they want to experience museums, planning trips is time-consuming [25] and there is limited availability of quality accessible materials [9]. While The Museum of Modern Art [23] and Smithsonian American Art Museum [28, 29] offer accessible tours, people must make appointments or attend on a bimonthly schedule.

Further, several museums do not provide audio descriptions or accessible information on their website. VocalEye’s “State of Museum Access Report of 2018” [16] studied museum accessibility across the United Kingdom and found that most museums fail to offer adequate online information about accessible services; for example, only 3% of museum websites mentioned “audio-descriptive guides,” or audio guides with accessible information. As of April 2020, the American Council of the Blind listed ~100 museums, parks, and exhibits with audio description across the US [10].

For public art institutions, there is a lack of funding to implement these solutions. Free platforms could meet accessibility needs, but there is a risk of platforms monetizing or cutting access to content. It is hard to predict long-term costs, complicating budgets. Another barrier is staff time and training, where museums stretch curators with other responsibilities (e.g., fundraising, teaching).

Several research efforts use technology to make museum spaces accessible. Rector et al. [24] created and deployed Eyes-Free Art, which allowed people with VIs to independently explore, immerse in, and engage with art. The researchers behind NavCog [3, 26] and the creators of the Andy Warhol Museum’s Out Loud audio guide app [30] used Bluetooth beacons to supply people with VIs with navigation instructions paired with audio descriptions. The Museum of Contemporary Art Chicago developed Coyote, open-source software to curate accessible descriptions of artwork [11, 19]. There are opportunities for technological solutions that do not require experts to compose the content.

Bartolome et al. [21] created a multimodal guide for people with VIs to touch a tactile representation of artwork and give voice commands to hear audio descriptions. Ahmetovic et al. [2] developed MusA, an augmented reality application for people with low vision to frame museum artwork with their smartphone and play a description in “chapters” with visual highlights. Ahmetovic et al. [1] created a touchscreen exploration of artwork to hear attributes or a hierarchical description based on their finger’s location. Our research expands upon these works by creating a scalable approach to gather accessible artwork descriptions. Laypeople can participate via mobile device, reducing expert work and cost to implement in a museum space.

3 Artworks Used in Both Approaches

We chose eight two-dimensional artworks from a public domain collection from the University of Iowa Museum of Art ranging in medium, date of origin, and region of origin. Ranging across four centuries, three continents, and multiple complex intersections of movements, styles, and subjects, these works reflect a comprehensive selection of artworks. Further, the artworks did not include violence, nudity, or sexually explicit content. We present each artwork and caption information in Footnote 1.

4 Baseline Approach Descriptions

We created four baseline tasks inspired by prior HCI research [4, 15, 17, 18] that engaged patrons in the museum space to see whether these already result in accessible descriptions. We deployed these tasks to MTurk. Then, we had four docents rate the responses using the ABS Accessibility Guidelines. Because the docent ratings were low, we waited to engage with people with VIs until we could curate better artwork descriptions. We iterated on the design of our questions so the ABS Approach would not be redundant with the Baseline Approach. Below we present the task designs, method for curating the artwork descriptions, and docent ratings.

Fig. 1.

Our eight selected artworks and captions from top left to bottom right: 1) Agnes Weinrich, Still Life (Sun Flowers), 1921–1926, Oil on canvas, Gift of Henry W. Starker 1973.185; 2) George Henry Yewell, Courtyard and Water Gate, Moret, France, 1856–1861, Oil on canvas, 12 ½ × 9 in. (31.75 × 22.86 cm), Gift of Oscar Coast 1927.21; 3) Aubrey Vincent Beardsley, Isolde, from The Studio, VI, 1896, Chrom-lithograph, 11 1/8 × 7 ½ in. (28.26 × 19.05 cm), Gift of Kenneth J. Oberembt 1983.59; 4) Robert Havell, Great Blue Heron (Ardea Herodias. Male) (after a drawing by J. J. Audubon), 1834, Engraving and aquatint, 38 × 25 ¼ in. (96.52 × 64.14 cm), Estate of Ann U. Morse 2007.56; 5) Kobayashi Kiyochika, Tokyo! Ryogoku Hyappongu Akatsuki No Zu (Dawn by the Hundred Pilings at Ryogoku in Tokyo), July 1879, Woodblock, 9 3/8 × 13 5/8 in. (23.81 × 34.61 cm), Gift of Owen and Leone Elliott 1968.212; 6) Pieter Bruegel, Spes (Hope), plate 2 from The Seven Theological and Cardinal Virtues, published by Hieronymous Cock, c. 1559, Engraving on paper, 8 7/8 × 11 ½ in. (22.54 × 29.21 cm), Museum purchase 1976.16; 7) Maurice Brazil Prendergast, Springtime, 1896–1897, Watercolor and pencil on paper, 9 ½ × 10 ¼ in. (24.13 × 26.04 cm), Gift of Frank Eyerly 1963.1; 8) Lil Picard, Waves, 1957, Oil on Canvas, 36 ¼ × 32 in. (92.08 × 81.28 cm), Lil Picard Collection 2012.209.

4.1 Task Designs

We created four baseline tasks (Baseline Approach) inspired by prior works [4, 15, 17, 18] (Table 1), though the prior works contain more than only crowdsourced descriptions; said works were deployed in physical spaces with in-person interactions. To ensure our layperson contributions resembled the prior works, we replicated each prior work’s task in both content and mode of input (i.e., mobile (Footnote 2), computer (Footnote 3)). We created four smaller tasks because we wanted to simulate a person’s ability to visit artwork for varying lengths of time. People could make a single contribution, or if they chose to engage with artwork for a longer period, they could do multiple tasks. We intentionally did not use the ABS Guidelines in the Baseline (BL) Approach because we wanted to assess the potential of prior HCI approaches to soliciting accessible descriptions.

4.2 Method for Curating Artwork Descriptions

To collect artwork descriptions, we created Qualtrics surveys that had an artwork image, caption information (see Fig. 1’s caption), and a task. In Baseline Approach, each artwork had four tasks, so we had 32 surveys. We collected survey responses through Amazon’s Mechanical Turk [5], with five MTurkers completing each survey (for redundancy). We informed MTurkers that these descriptions were for people with VIs but did not ask them to follow the ABS Guidelines. We compensated MTurkers for BL_Words/Phrases: $0.60/task, BL_Tags: $0.75/task, BL_Emotions: $0.50/task, and BL_Story: $1.25/task. Based on average completion times (below), the average hourly rates would amount to $16.06, $22.31, $23.08, and $19.74, respectively.

We collected these 160 artwork descriptions from 132 MTurkers (demographic information in Table 2). The MTurkers completed a mean of 1.21 tasks, with 112 MTurkers completing 1 task, 14 MTurkers completing 2 tasks, 4 MTurkers completing 3 tasks, and 2 MTurkers completing 4 tasks. We did not filter for colorblindness because museumgoers with colorblindness could provide artwork descriptions. The mean(SD) duration in seconds for MTurkers was BL_Words/Phrases = 134.5(157), BL_Tags = 121(96.5), BL_Emotions = 78(50.3), and BL_Story = 228(350.5).
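The hourly rates quoted above follow directly from the per-task payment and the mean completion time. As a minimal sketch of that arithmetic (the function and dictionary names are ours, chosen for illustration; only the payments and mean durations come from the text):

```python
# Per-task payment (USD) and mean completion time (seconds), as reported
# in the text for the four Baseline Approach tasks.
BASELINE_TASKS = {
    "BL_Words/Phrases": (0.60, 134.5),
    "BL_Tags": (0.75, 121.0),
    "BL_Emotions": (0.50, 78.0),
    "BL_Story": (1.25, 228.0),
}


def hourly_rate(pay_usd, mean_seconds):
    """Convert a per-task payment and a mean duration into dollars per hour."""
    return pay_usd * 3600.0 / mean_seconds


# Hypothetical helper output: task name -> hourly rate rounded to cents.
rates = {
    name: round(hourly_rate(pay, seconds), 2)
    for name, (pay, seconds) in BASELINE_TASKS.items()
}

for name, rate in rates.items():
    print(f"{name}: ${rate:.2f}/hour")
```

Because the rates are computed from mean durations, they describe an average worker; the large standard deviations reported above mean individual hourly earnings varied widely.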

Table 1. Baseline (BL) Approach tasks by name, device, and content.
Table 2. Demographic information for each task in Baseline Approach. All demographics are uncertain due to the anonymity on MTurk. Native/bilingual (NB): “has complete fluency in the language, including breadth of vocabulary and idiom, colloquialisms, and pertinent cultural references.” Full professional (FP): “makes only quite rare and minor errors of pronunciation and grammar” and “can handle informal interpreting of the language.” Professional Working (PW): “has a general vocabulary which is broad enough that he or she rarely has to search for a word.” Limited Working (LW): “can usually handle elementary constructions quite accurately but does not have thorough or confident control of the grammar.”

4.3 Docent Ratings

To measure the extent to which MTurkers’ descriptions follow the ABS Accessibility Guidelines, we recruited four docents (all female, ages 30–62) to rate them. We recruited docents from the Association of Academic Art Museums and Galleries professional listserv and a volunteer docent list compiled by a university museum. We screened for visual disorders. No docents considered themselves artists. Their experience ranged from 6–28 years, with three as educators. We did not assess docent experience with the ABS Guidelines, but the docents were qualified to learn and apply the guidelines during our study. First, the target audience of the ABS Accessibility Guidelines includes museum staff [8], so the information is approachable to docents. Second, each guideline is brief, with one paragraph of description followed by an example.

To avoid participant fatigue, we had each docent rate half the descriptions. To mitigate ordering effects, we randomly ordered the artwork descriptions and had the docents begin from different starting points, such that two docents responded to each description. We collected 2240 ratings.
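One way to realize this assignment is to shuffle the descriptions once, treat the shuffled list as circular, and give each docent a different starting point. The sketch below is an assumption about how such a scheme could work, not the study’s actual procedure; `assign_descriptions`, the docent labels, and the fixed seed are hypothetical, and it assumes the number of descriptions divides evenly by the number of docents.

```python
import random
from collections import Counter

GUIDELINES_PER_DESCRIPTION = 7  # the seven ABS guidelines used in the study


def assign_descriptions(description_ids, docents, seed=0):
    """Give each docent half of a shuffled, circular list of descriptions.

    Docents start at evenly spaced offsets, so with four docents every
    description is covered by exactly two of them.
    """
    rng = random.Random(seed)
    order = list(description_ids)
    rng.shuffle(order)

    n = len(order)
    per_docent = n // 2          # each docent rates half the descriptions
    step = n // len(docents)     # spacing between starting points

    assignments = {}
    for k, docent in enumerate(docents):
        start = k * step
        assignments[docent] = [order[(start + i) % n] for i in range(per_docent)]
    return assignments


assignments = assign_descriptions(range(160), ["D1", "D2", "D3", "D4"])

# How many docents rate each description, and how many ratings result.
coverage = Counter(d for rated in assignments.values() for d in rated)
total_ratings = sum(len(r) for r in assignments.values()) * GUIDELINES_PER_DESCRIPTION
```

With 160 descriptions and four docents, each docent receives 80 descriptions, and rating each on seven guidelines gives the 2240 ratings reported above.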

We met each docent via Zoom Video Communications so that they could review the ABS Accessibility Guidelines before starting the survey. After verbal consent, the docent read the ABS Accessibility Guidelines [6]. Then, they rated each artwork description using each of the seven ABS Accessibility Guidelines on a 5-point Likert scale from “Strongly Disagree” to “Strongly Agree.” We encouraged the docent to take breaks during the survey. Not including training time, docents completed the ratings in 00:42:29, 00:29:47, 01:08:01, and 00:50:25. Due to the synchronous format and length of the sessions, we compensated each docent $20. The docents rated the Baseline Approach artwork descriptions low; only 0.27% of the ratings were at least a 4, where 5 is the best possible score.

5 ABS Approach Descriptions

5.1 Task Designs

Since the artwork descriptions from the Baseline Approach were inaccessible, our team designed four tasks to better fulfill the ABS Accessibility Guidelines (ABS Approach, Table 3).

Table 3. ABS Approach names, targeted guidelines, and content. All tasks were mobile. ABS_Reenact applied to artworks 3 & 5–7.

The authors who are HCI researchers were a graduate student and advisor in Accessibility. The advisor worked directly with people with VIs for 8 years with prior experience in artwork accessibility. The associate curator was an art professional with 8 years of experience in public institutions, including art museums. Their research includes access to the arts. When we designed ABS Approach, we had 3 considerations.

  1. We experienced a design tension between the collaborators. The authors who identify as HCI researchers wanted creative responses from laypeople. However, the associate curator’s concern was that laypeople are more likely to give inaccurate information. Thus, in the ABS_Literal task, we told the MTurkers to exclude emotion and opinion. In the ABS_Senses task, we allowed creative responses.

  2. The research team studied the few MTurker contributions that scored at least 4/5 compared to the contributions with lower scores. Clear language, facts, and inclusion of absolute or relative positions of elements resulted in more accessible descriptions, in line with the ABS Accessibility Guidelines. Therefore, in all tasks except for ABS_Senses, MTurkers could draw an outline around the elements they discussed, which we converted to descriptions that included position (Sect. 5.2).

  3. We determined that unambiguous language in our task prompts could help MTurkers better answer the questions. Therefore, we included “subject,” “aspect,” and “element” so that MTurkers could respond depending on the targeted guideline and level of abstractness of an artwork. The ABS_General and ABS_Literal tasks use “element” because we wanted MTurkers to select objects and regions, regardless of whether they are literal or abstract. ABS_Reenact uses “subject” or “aspect” to cue selections that have a human form.

5.2 Curating Artwork Descriptions

We followed the same procedures as in the Baseline Approach. ABS_Reenact applied only to the half of the artworks that depict a person or people. Therefore, instead of 32 surveys as in the Baseline Approach, we had 28. The ABS_General, ABS_Literal, and ABS_Reenact tasks involved tapping to place dots around the element, subject, or aspect of the artwork that a person would describe. To implement these dots, we used the Qualtrics Heatmap question. We collected the coordinates of each dot and programmatically converted them to region descriptions, which we used to augment the artwork descriptions.
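The text does not spell out how dot coordinates were turned into region descriptions. As an illustrative sketch only, one simple approach is to take the centroid of the dots and place it on a 3 × 3 grid over the image. Everything below is an assumption made for illustration: the function name, the label wording, the pixel coordinate system with the origin at the top left, and the grid itself.

```python
# Hypothetical 3 x 3 grid of region prefixes, indexed as LABELS[row][col],
# with row 0 at the top of the image and col 0 at the left.
LABELS = [
    ["On the top left", "On the top center", "On the top right"],
    ["On the left side", "In the center", "On the right side"],
    ["On the bottom left", "On the bottom center", "On the bottom right"],
]


def region_label(dots, width, height):
    """Map a set of (x, y) dots to a coarse region prefix.

    Assumes pixel coordinates with (0, 0) at the top-left corner of an
    image that is `width` pixels wide and `height` pixels tall.
    """
    if not dots:
        raise ValueError("at least one dot is required")

    # Centroid of the dots the worker placed around the element.
    cx = sum(x for x, _ in dots) / len(dots)
    cy = sum(y for _, y in dots) / len(dots)

    # Which third of the image the centroid falls in; clamp the far edge.
    col = min(int(3 * cx / width), 2)
    row = min(int(3 * cy / height), 2)
    return LABELS[row][col]


def augment(description, dots, width, height):
    """Prefix a worker's description with its region label."""
    return f"{region_label(dots, width, height)}: {description}"
```

For example, `augment("There is a woman lying down in a grass field.", [(10, 290), (30, 280), (20, 270)], 300, 300)` yields a description beginning with “On the bottom left:”. A centroid discards the extent of the outlined region, so a fuller implementation might also consider the bounding box of the dots.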

We compensated workers $1.25 for each ABS Approach task, the same as for BL_Story. Based on average completion times (below), the average hourly rates would amount to $19.27, $17.54, $15.52, and $28.57 for ABS_General, ABS_Literal, ABS_Reenact, and ABS_Senses, respectively. With 28 surveys for the ABS Approach, we collected 140 artwork descriptions from 98 MTurkers (demographic information in Table 4). MTurkers completed a mean of 1.43 tasks, with 73 MTurkers completing 1 task, 14 completing 2 tasks, 8 completing 3 tasks, and 1 MTurker each completing 4, 5, and 6 tasks. There was an overlap of 11 MTurkers between the two approaches. The mean(SD) duration in seconds for MTurkers was ABS_General = 233.5(140.5), ABS_Literal = 256.5(224.5), ABS_Reenact = 290(276.8), and ABS_Senses = 157.5(133).

Table 4. Demographic information for each task in ABS Approach. ABS_Reenact only applied to half the paintings, and therefore has approximately half the workers.

5.3 Docent Ratings and Comparison to Baseline Approach

We had the same four docents, compensated $20 again, rate the new descriptions. We chose a within-subjects study to compare within each docent. While expectation bias is possible, a 3-month break between docents rating the first descriptions (May-Jun. 2019) and second descriptions (Aug.-Sep. 2019) made it unlikely that they would recall their prior ratings. ABS_Reenact applied to only half the artworks, so there were 1960 ratings. Not including training time, docents completed the ratings in 00:43:34, 00:37:56, 01:49:35, and 01:04:42. The docents rated 6.61% of ABS Approach artwork descriptions 4/5 or above.

Table 5. The guideline number and statistical tests for Task-Artwork and Task. All statistics have p < 0.001, where p values were multiplied by 28.

Comparing between all tasks, we note the Task-Artwork interaction had a statistically significant effect on docent ratings, but Artwork did not have a statistically significant effect. Therefore, we focus on Task, which influenced docent ratings for all guidelines (Table 5). ABS Approach outperformed Baseline Approach in terms of the ABS Accessibility Guidelines. The Appendix has tables showing pairwise differences. We describe three high-level findings below. The descriptions highlighted in the findings were chosen based on the highest ratings from docents.

Responses to ABS_General and ABS_Literal are More Well-rounded than Other Tasks.

Docents rated artwork descriptions from the ABS_General and ABS_Literal tasks higher than other tasks. ABS_General and ABS_Literal were rated higher than all tasks for three guidelines (#2, #3, and #7). ABS_General and ABS_Literal were rated higher than Baseline tasks for three guidelines (#9, #10, and #11). Finally, ABS_Literal was rated higher than all tasks and ABS_General was rated higher than Baseline tasks for one guideline (#6). For instance, in artwork 7, both docents rated the following MTurker’s response a 4/5. The MTurker described the relative locations and orientations of the people and elements:

“On the bottom left: There is a woman lying down in a grass field wearing a black dress. Her left arm is tucked underneath her body, propping her up [off] the ground. She is also wearing a black hat that has a white ribbon wrapped around it. You cannot see her face because she is looking towards a city, so you are seeing [t]he back [of] her.

On the right side: There is a woman wearing a white dress with polka dots in the same field as the other woman. She is standing instead of lying down. The dress has long sleeves, and her hair is styled. She is also wearing a small black hat with a red ribbon. Her hair is a light brown color, and her skin is fair.

On the bottom center: In between these two women is a little girl that is seated. She is wearing a red-orange dress with a white hat on. You cannot see her face because she is turned away from you.”

In contrast, docents rated the following, briefer description of artwork 7 as 1/5 and 2/5:

“On the right side: The lady standing on the left side of the foreground.

On the bottom center: The crowd in the background.

On the bottom: Lady sitting on the right side of the foreground”.

ABS_Senses Strong in Refer to Other Senses as Analogues for Vision and Addressed Other Guidelines.

Docents rated the artwork descriptions from the ABS_Senses task higher than those from all other tasks for the Refer to Other Senses as Analogues for Vision guideline. For example, artwork 6 had two contributions rated by both docents as 4/5: “I can smell and taste the salty ocean air all around me. I feel my feet rest firm[ly] on the hard[-]stone ground of the platform I am standing on. I hear the tumultuous waves crashing towards me and the chaos of men on wooden boats that seem to be capsizing. I can feel the gritty stone walls of the tower beside me. I smell the stink and hear the groans of prisoners, laborers, and beggars around me.”

“It smells of human sweat and dirt mixed with rusted metals. You can taste the salty sea air as your feet stomp along on the pier. The sound of the waves crashing does little to mask the hustle and bustle of the town nearby.”

Further, ABS_Senses responses were rated higher than all Baseline Approach tasks for two guidelines (#7 and #11). ABS_Senses responses were rated higher than BL_Words/Phrases for two guidelines (#6 and #10).

ABS_Reenact Best Fulfills the Encourage Understanding through Reenactment guideline.

ABS_Reenact performed better than all other tasks on this guideline because it enabled clear descriptions when people were in the artwork. Both docents rated this description as a 5/5 for artwork 3: “Bend your back forward slightly. Then bring your hands to your face as if you're holding a bowl of warm soup to take a sip.” In contrast, the docents rated this description of artwork 3 as 1/5 and 2/5: “The wine curtain that serves as a wall. The human lays close to the curtain and remains standing.”

6 People with Visual Impairments’ Ratings of and Comments on Artwork Descriptions

Once we had artwork descriptions that better met the ABS Accessibility Guidelines, we gathered ratings and justifications on the understandability of artwork descriptions from people with VIs. We conducted an unsupervised Qualtrics survey because participants were answering questions based on their own opinions; unlike with the docents, no initial review of guidelines was needed. 31 people with VIs (9 males, 22 females), ages 19–68 (mean(SD) = 40.2(15)), filled out the survey; we label them P1–P31. Five were artists (with 2–45 years of experience), 23 were not artists, and three did not specify. No one considered themselves a museum employee. Thirteen participants were totally blind from birth, and another eight were totally blind from 10–56 years (mean(SD) = 23(14.31)). Two were legally blind from birth, and another four were legally blind from 1.5–29 years (mean(SD) = 13.63(11.73)). Two had low vision since birth, and another had low vision for ten years. Finally, one had a degenerative condition since childhood and cannot discern details. We wanted to pay people with VIs at the same rate as docents, so we used $5 Amazon gift cards, predicting the surveys would take < 30 min.

First, our survey listed the purpose of the study, which was “… to determine if written descriptions of artwork provided by sighted people are useful to people who have a visual impairment.” After agreeing to the study, people with VIs rated descriptions for the 8 artworks. Our survey had two pages per artwork. These pages were presented in a random order to offset the learning effect; specifically, half the artworks (i.e., 3–6) showed the collection of Baseline Approach descriptions first, and half the artworks (i.e., 1–2, 7–8) showed the ABS Approach descriptions first. For completeness, we wanted people with VIs to evaluate all descriptions, so they were shown regardless of redundant or contrary content. The survey did not mention that different pages pertained to different approaches. On each page, we presented the artwork, its metadata, and the collection of descriptions from either the Baseline Approach or ABS Approach (Footnote 4). The metadata included artist, title, year, medium, dimensions, and how the artwork was acquired; refer to Fig. 1’s caption for this information. We asked: “On a scale of 1 to 5, where 1 is Strongly Disagree to 5 is Strongly Agree, rate how much you agree with the following statement: I am able to understand most elements or objects of this artwork from the provided descriptions.”

Then, we asked them to “Write one sentence explaining the rating that you selected in the previous question.” Survey completion times ranged from a minimum of 00:07:44 to a maximum of 1 day + 19:16:41, with a median of 01:19:41; these times may have included interruptions, though we cannot know.

To assess the difference between ratings from people with VIs for the Baseline versus ABS approaches while controlling for differences in artwork and participant demographics, we used a Linear Mixed Model. Artwork and approach were repeated variables. Artwork, approach, and artwork * approach were fixed effects, while participant, age, gender, and level of vision were random effects. Whether the person was an artist was considered a redundant covariate and therefore not included in the analysis. We found that approach influenced “I am able to understand most elements or objects of this artwork…” (F(1) = 82.8, p < .001). Participants gave higher ratings when responding to descriptions from the ABS Approach (mean(SD) = 3.98(1.05)) than to descriptions from the Baseline Approach (mean(SD) = 3.25(1.24)).
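For readers less familiar with this kind of model, its core structure can be written as follows. This is a simplified sketch under our own notation, showing only a random intercept per participant; the additional random effects named above (age, gender, level of vision) and the repeated-measures covariance structure are omitted, and the exact specification used in the analysis may differ.

```latex
% y_{ijk}: rating by participant i for artwork j under approach k
% \alpha_j: fixed effect of artwork j (j = 1, ..., 8)
% \beta_k: fixed effect of approach k (Baseline or ABS)
% (\alpha\beta)_{jk}: fixed artwork-by-approach interaction
% u_i: random intercept for participant i (i = 1, ..., 31)
y_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + u_i + \varepsilon_{ijk},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

In this notation, the reported effect of approach corresponds to a test of whether the \beta_k terms differ between the two approaches.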

We conducted a qualitative analysis of the optional justifications that people gave. We coded based on the ABS Accessibility Guidelines, the tasks themselves, or other comments from people with VIs. We iterated on our codebook for the first 60% of responses, and two researchers coded the remaining 40% of responses, dividing them in half.

6.1 Qualitative Findings for Baseline Approach

Overall, people with VIs had more negative (155) than positive (132) comments about the Baseline Approach descriptions. There were multiple reasons for criticism. First, 46 comments related to the descriptions not being vivid. P19 commented on artwork 4’s description: “… I have no idea on what the bird is doing or what the scenery looks like outside the fact that there seems to be some kind of lake involved.” The description for artwork 2 had P2 asking follow-up questions: “How tall is the building; from what angle do we view it? No people?” Second, participants spoke to flaws relating to the General Overview guideline, with participants not understanding what was occurring in the artwork (n = 38). For instance, with artwork 4, P13 stated that they needed “… more physical descriptions about what is actually happening.” Third, P13 noted contradictions between the descriptions: “The [descriptions] varied so much that it was hard to tell what was actually going on.”

BL_Story Best of Baseline Approach.

Out of the 132 positive comments about Baseline Approach’s descriptions, 51 were related to the BL_Story descriptions. For instance, P22 reflected on an MTurker’s story for artwork 1: “I loved the point of view of the person who said it was a painting of vibrant flowers against a dull background, reflecting the title, Still Life. I could distinguish the contradiction between the vibrant life of the flowers and the dullness beyond.”

Participants appreciated the descriptions from BL_Story for reasons including that “the stories make the artwork come alive” (P16, artwork 2) or that P24 “was able to experience this picture through the stories” (artwork 6). In addition, the artwork descriptions used specific words that added detail: “I love the description of the meadow and grass, and I can picture a warm spring breeze as families play at the park; I also love how one individual used the descriptive word energizing to describe the weather” (P22, artwork 7).

Participants did have negative comments about BL_Story descriptions (n = 18). Stories lacked specific information about the layout and location of objects or figures in the artworks. P23 said, “I need more description about what is happening in each part of the painting and in the painting as a whole not composed in a story.”

Other Baseline Tasks Less Useful.

Participants made only 11 positive statements about BL_Words/Phrases. Words or phrases could be helpful, for example: “Strong words like peaceful and waves bring me back to laying down by the ocean.” (P29, artwork 8). Overall, however, the short descriptions left participants confused, with 14 negative comments. Participants felt that the words or phrases “are not very descriptive” (P17).

Participants had mixed feelings toward descriptions from BL_Tags, with 17 positive comments and 16 negative comments. Some participants noted that the tags were understandable. For instance, for artwork 6, P7 said: “I found the tags useful, at least in terms of getting a sense of what colors and what some of the elements it might have been.” Other participants said that the tags were not helpful. More specifically, P17 felt that the “tags are very short and generic.” Further, P9 “found the tags confusing because there was a lot of repetition.”

Participants had the fewest positive comments about emotions (n = 7). People could see the benefit; for instance, on artwork 5, P16 said, “Basic elements are understandable, the emotions and stories are helpful in relating what I would see and feel.” However, participants gave 19 negative comments, which described the emotions as confusing. MTurkers could choose multiple emotions, and this led to contradictions: “Emotional responses are confusing with repeated tags and things like ‘angry, happy, sad’ all on the same line, followed by happy elsewhere. What is this supposed to mean?” (P9, artwork 5).

6.2 Qualitative Findings for ABS Approach

The positive comments given about descriptions from the ABS Approach (n = 179) outweighed the negative (n = 86). Unlike Baseline Approach, people had more positive comments related to the general overview (#2), orient (#3), and vivid (#7) guidelines. For instance, P7 made a comment related to general overview: “The descriptions of elements, the pose descriptions[,] and the sensory evocations all work together to give me a really good sense of what this is an image of and what emotions it evokes.”

There were 29 comments related to the helpfulness of orienting the reader with specific layout descriptions. For instance, P17 said artwork 5’s descriptions “did a good job with describing the positioning of the different elements of the painting.” Further, P13 spoke to how the layout helped them visualize the artwork: “The details at the beginning were very helpful, as I was able to understand the layout of the painting, which helped me visualize how a sighted person would see it.”

Speaking to the vivid guideline, P22 stated that they were “… able to identify each individual part of the painting and imagine the sun reflecting off the water, with room for imagination too” (artwork 2). With that said, the vivid guideline had the most negative comments (n = 29). The descriptions also could leave participants asking follow-up questions. For example, while P13 knew aspects of artwork 2, they also asked questions: “I understand that there is an arch and a window that might be rundown, along with a building that may be older, but don't know if there is grass, gravel on the ground, if there is a staircase leading down, or if the window is part of the archway.”

ABS_General.

Participants spoke positively about the descriptions from ABS_General (n = 26) far more often than negatively (n = 2). P17 stated that for artwork 6 the descriptions “did a good job of indicating the most prominent features of the painting.” Further, for artwork 5, P7 said: “The description of elements was incredibly useful, giving me concrete items to imagine and the sentence descriptions were also really useful here emphasizing the importance of the sun in the image.” The two negative comments stated that the descriptions from ABS_General were less useful than the ABS_Literal or ABS_Reenact descriptions.

ABS_Literal.

Participants gave the most positive comments about ABS_Literal descriptions (n = 40), as opposed to only 3 negative comments. The three negative comments wished for more details about the artwork or mentioned the ABS_Reenact descriptions being more useful. P2 said the descriptions from ABS_Literal were “essential for [artwork 7].” Regarding artwork 3, P2 said that “the literal descriptions were the most comprehensive.” More specifically, for artwork 3, P17 said: “I liked the description of what the woman was wearing and how she was positioned.”

ABS_Reenact.

Participants appreciated the descriptions of reenacting a depicted figure, with 14 positive comments versus 3 negative comments (this task pertained to only half the artworks). Participants understood the value of the reenactments when people were in the artwork, with P2 saying about artwork 6, “The reenactments were most helpful since the woman is the focal part of the picture.” P19 commented that they “…really like the description of the painting th[at] highlighted the woman, her features, and her pose against the background.” P7 further appreciated the instructions to hold the posture themselves: “Very clear descriptions of the figures in this piece of artwork, and the instructions as to how to simulate those figures postures were really useful in helping me understand what was presented.”

The negative comments were about artworks 3 and 5, where people with VIs felt the descriptions from ABS_Reenact were unhelpful. P24 said, “I thought that the [reenact] descriptions detracted from the description, in this case,” and P16 said, “the descriptions of body movements are interesting, but not as helpful as actual story scenarios.”

ABS_Senses.

Participants wrote 21 positive and 5 negative comments about senses. P24 spoke to artwork 1 “coming alive”: “The statements about the senses made the description more alive, and they agreed with each other enough so that I felt as though I understood the basic meaning of the picture.” P22 wrote about artwork 4: “I loved how some of the descriptions discussed using one's sense of hearing to enjoy the scenery in the artwork, and I was easily able to follow what was happening.”

Three of the negative comments described the ABS_Senses descriptions as unnecessary: “Description of senses experienced is not necessary, [a] blind viewer can fill in with their own emotional interpretation” (P17).

6.3 Suggestions for Improvement

Participants gave 53 suggestions for improvement across both approaches. Of these, 26 asked specific follow-up questions about the artworks, and another 16 asked for more details. P18 suggested a better ordering of the descriptions: “Why can the description not be given in an order that is more comprehensive instead of skipping all over and back again? Too confusing! But the thoughts about the setting are fantastic!”

7 Discussion

7.1 Limitations

While we took careful measures to curate layperson artwork descriptions informed by Art Beyond Sight guidelines, we acknowledge that our study has limitations. It is possible that the level of pay had an influence on MTurker responses, but we opted to be fair in terms of predicted time per task; we did not want to financially disadvantage workers who got longer, more time-consuming tasks. There were MTurkers who completed multiple tasks within one approach, and multiple tasks across approaches. Our goal was to understand the potential quality of descriptions in a museum setting; it is within the realm of possibility that patrons give multiple descriptions for multiple artworks, along with others who make a chance encounter supplying only one description. While people could get better with practice, we did not give feedback on the content of their descriptions.

We chose an online study to assess the potential of this approach before deploying to a museum. These results may differ from in-person settings: MTurkers were paid (patrons would not be paid), MTurkers may be less invested in the artwork than patrons, and MTurkers experienced artworks online rather than in person. These differences may have unequally affected the tasks with respect to ratings from docents and people with VIs. For example, MTurkers may have composed better stories because they had to think more about the artwork than when labeling with tags. Future work is needed to confirm these findings in in-person settings.

We did not ask about docents’ familiarity with the ABS Guidelines. However, these guidelines are meant for docents; we gave the docents time at the beginning of the session to familiarize themselves with the guidelines, and we reminded them of the definition in every question. Docent ratings were low overall: only 0.27% of Baseline Approach scores were at least 4/5 (5 being the best), compared with 6.61% of ABS Approach scores.
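
As a small worked illustration of the “share of scores of at least 4/5” statistic, the sketch below computes that percentage for a list of ratings. The function name and the example ratings are made up for illustration and are not the study’s data.

```python
def share_at_least(scores, threshold=4):
    """Return the percentage of scores >= threshold, rounded to 2 decimals."""
    if not scores:
        raise ValueError("scores must be non-empty")
    hits = sum(1 for score in scores if score >= threshold)
    return round(100 * hits / len(scores), 2)


# Made-up ratings on a 1-5 scale (not from the study).
example_scores = [1, 2, 2, 3, 4, 1, 5, 2, 3, 1]

if __name__ == "__main__":
    print(share_at_least(example_scores))  # 2 of 10 scores are >= 4, so 20.0
```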

We did not ask about the familiarity of people with VIs with the artworks, but these artworks were randomly selected to represent a wide range of ages and styles. While our goal was to assess their understanding of the artwork based on MTurker contributions, we did not give a ground truth description. Despite this limitation, we still observed a statistically significant difference in their responses. Finally, we could not determine how much active time people with VIs spent on the survey because we did not account for breaks.

7.2 Lessons Learned and Resulting Guidelines

Our goal was to develop and evaluate a novel approach for laypeople to generate accessible descriptions for people with VIs. Quantitatively, descriptions generated by the ABS Approach were rated more highly and received more positive than negative comments, while the reverse was true for the Baseline Approach. Qualitatively, with the Baseline Approach, descriptions had insufficient details, while ABS_General and ABS_Literal led to helpful information about layout and orientation. The ABS_Literal descriptions made the artworks more vivid, which was not achieved by the Baseline Approach. ABS_Reenact gave a new dimension that otherwise may have been missed by contributors, encouraging the descriptions to be more specific, particularly about human figures in the artwork. Finally, a positive aspect that arose in both approaches was that BL_Story and ABS_Senses made the artwork come alive.

While MTurkers spent longer completing ABS Approach tasks than Baseline Approach tasks, people with VIs gave the ABS Approach descriptions higher scores in terms of understandability, and docents rated those descriptions higher per the ABS Accessibility Guidelines. We confirm a tradeoff between the time needed to complete each task and the accessibility of the description. One hypothesis is that MTurkers did not have to supply responses that were as comprehensive for tasks from the Baseline Approach (except for BL_Story). Further, taking creative liberty is not acceptable for audiences looking for facts. This was a tension while we designed the ABS Approach tasks: the museum is a trustworthy institution that does not want to risk losing patron trust due to inaccurate descriptions [27]. People with VIs said the BL_Story was “silly” or the BL_Words/Phrases and BL_Tags were “not useful.” We recommend that artwork descriptions be curated via tasks grounded in ABS Guidelines. Further, it is important to disclose to patrons that descriptions were collected from other museumgoers. We raise further questions: How do we allow patrons to answer questions if they are quickly passing an artwork? How do we allow creativity while gathering factual descriptions?

Further, we uncovered that two of our ABS Approach tasks better fulfilled ABS Guidelines than the Baseline Approach, while the other two were more focused on a single guideline. This coverage is beneficial because it allowed MTurkers to focus on one concept at a time, and combining the statements helped with understandability. Therefore, we recommend multiple tasks, where some approach the artwork from a high level and others approach it from a low level.

7.3 Future Work: Moving from Online to Museum

While these results show the potential of improving artwork descriptions for people with VIs in museums, there are opportunities for future research. First, there is an opportunity to vote on the best descriptions via a collaboration between patrons and museum curators. There could be incentives for patrons, including virtual awards for popular descriptions (much like incentives for Google Local Guides for Google Maps [31]). A curator and/or accessibility expert could then do a final proofread of the top-voted descriptions before they become publicly available. This shifts the museum employee’s effort from creation to vetting.
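
One way to picture the proposed vote-then-vet workflow is the minimal sketch below. It is purely illustrative of this future-work idea: the `Description` class, its fields, and the shortlist size are assumptions, not part of any implemented system.

```python
from dataclasses import dataclass


@dataclass
class Description:
    """A patron-contributed description (hypothetical data model)."""
    text: str
    votes: int = 0
    approved: bool = False  # set by a curator or accessibility expert


def shortlist(descriptions, k=3):
    """Return the k most-voted descriptions for curator review.

    sorted() is stable, so descriptions with equal votes keep submission order.
    """
    return sorted(descriptions, key=lambda d: d.votes, reverse=True)[:k]


def publishable(descriptions, k=3):
    """Return shortlisted descriptions that a curator has approved."""
    return [d for d in shortlist(descriptions, k) if d.approved]


if __name__ == "__main__":
    pool = [
        Description("A woman seated by a window.", votes=5),
        Description("Bright flowers in a vase.", votes=2),
        Description("Sunlight falls across the floor.", votes=5),
    ]
    pool[0].approved = True  # curator vets only the shortlist
    print([d.text for d in publishable(pool, k=2)])
```

The design choice here is that curators review only the shortlist, which mirrors the shift in staff effort from writing descriptions to vetting them.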

Second, our designed tasks are virtual, so one should deploy them in a museum. Patrons might notice different details—the size, texture, and finer aspects of the material reality of art objects; these are hard to convey through digital images. While we used the term “painting” for the online tasks, which is different from “artwork,” MTurkers and docents had the same experience: viewing a 2D image. Our caption information described the medium and method, but people did not experience it personally. Further, people in museums are in a formal setting, answering questions about artwork physically in front of them, so they are less anonymous. These factors can influence the statements we receive.

Third, a risk is that deploying this technology to the museum could make accessibility an afterthought. Curators should collaborate with patrons to make the descriptions. Crowdsourcing works toward another goal of museums: teaching audiences to look deeply at art (e.g., [22]). By creating experiences that guide novice visitors through the process of visually analyzing artworks, we achieve this pedagogical goal and aid people with VIs. Our work shows that scaffolding this task is difficult but possible. Future research must learn about the types of descriptions gathered in the museum and measure their accessibility compared to descriptions gathered online.

Fourth, we should explore how to present statements from patrons. For our survey of people with VIs, we included all MTurker artwork descriptions from each approach. However, we found that user contributions differed in content and quality and people with VIs did not always prefer the statement ordering. There are opportunities to explore how to effectively present these statements. User interfaces could present statements in order from most to least prominent elements, regions in clock notation, or moving from general descriptions to detailed descriptions about specific elements.
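
As one hedged illustration of the “general to detailed” ordering mentioned above, the sketch below sorts statements by the task that produced them. The particular task order and the (task, text) tuple format are assumptions made for illustration; other orderings, such as by prominence or clock-notation region, would need different keys.

```python
# Assumed general-to-detailed ordering of the ABS Approach tasks (hypothetical).
TASK_ORDER = ["ABS_General", "ABS_Literal", "ABS_Reenact", "ABS_Senses"]


def order_statements(statements):
    """Sort (task, text) pairs from general to detailed.

    Statements from unknown tasks go last; ties keep their original order
    because sorted() is stable.
    """
    rank = {task: index for index, task in enumerate(TASK_ORDER)}
    return sorted(statements, key=lambda s: rank.get(s[0], len(TASK_ORDER)))


if __name__ == "__main__":
    mixed = [
        ("ABS_Senses", "You might hear waves."),
        ("ABS_General", "A seascape at sunset."),
        ("ABS_Literal", "A small boat sits at the lower left."),
    ]
    for task, text in order_statements(mixed):
        print(task, "-", text)
```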

Finally, there are open questions for the experience of interacting with the written descriptions. A system could play statements through bone-conduction headphones serially or based on user choice. We could present descriptions via a proxemic interface where the user hears more detailed descriptions as they move toward or spend longer with the artwork [24]. This interaction could be physical (via user position) or phone-based using VoiceOver selections.
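
A proxemic interface of this kind could be sketched as a mapping from the user’s distance and dwell time to a level of descriptive detail. The thresholds, level names, and function below are all hypothetical; real values would need to come from in-museum testing.

```python
# Hypothetical thresholds; not derived from the study.
FAR_M = 3.0      # beyond this distance, play only the overview
NEAR_M = 1.5     # within this distance, play detailed descriptions
DWELL_S = 30.0   # lingering this long raises the detail by one level

LEVELS = ["overview", "layout", "details"]


def detail_level(distance_m: float, dwell_s: float) -> str:
    """Pick a description level from distance (meters) and dwell time (seconds)."""
    if distance_m < 0 or dwell_s < 0:
        raise ValueError("distance and dwell time must be non-negative")
    if distance_m > FAR_M:
        level = 0
    elif distance_m > NEAR_M:
        level = 1
    else:
        level = 2
    if dwell_s >= DWELL_S:
        level = min(level + 1, len(LEVELS) - 1)
    return LEVELS[level]


if __name__ == "__main__":
    print(detail_level(4.0, 0.0))   # far away, just arrived
    print(detail_level(1.0, 10.0))  # close to the artwork
```

The same mapping could be driven by phone-based selections instead of physical position by replacing the distance input with an explicit user choice.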

8 Conclusion

We designed and implemented tasks to help laypeople compose more accessible descriptions of artwork than prior HCI research. Through our framework of using the ABS Accessibility Guidelines and our multidisciplinary team, we were able to curate descriptions from MTurkers that 31 people with VIs and 4 docents rated higher than the descriptions from Baseline tasks. Integrating poses, senses, and orientation with the descriptions of elements allowed people to visualize the artworks and brought the artworks to life. We hope our work will help researchers who are interested in accessible art exploration and who want to curate artwork descriptions at a larger scale from laypeople.