Introduction

We interact not only with humans but also with animals. Two of the most common nonhuman animals that interact with us are domestic dogs (Canis familiaris) and cats (Felis catus). Archeological evidence shows that dogs have the longest history of domestication (Clutton-Brock 1995), with mitochondrial DNA analysis revealing their ancestors to be wolves (Vilà et al. 1997). Recent studies suggest that dogs have acquired a superior cognitive ability to communicate with humans during their evolution. Dogs have been shown to possess the ability to understand cues presented by humans (e.g., pointing gestures or gazes) to locate hidden food (Miklósi et al. 1998; Hare and Tomasello 1999; Soproni et al. 2001, 2002). They can also use the bodies, heads, and often the eyes as cues to determine the attentional state of humans (Call et al. 2003; Gácsi et al. 2004; Virányi et al. 2004; Schwab and Huber 2006). Dogs can also discriminate their owners from unfamiliar persons (Topál et al. 1998) and match their owners’ voices and faces (Adachi et al. 2007).

The social ability of domestic cats, compared to that of domestic dogs, has not been explored thoroughly in the context of human interaction. The reason may lie in the history of cat domestication and the sociality of the cat’s ancestors. A decade ago, it was thought that cat domestication began about 4,000 years ago (Serpell 2000), though more recent evidence from deposits in Cyprus dates the earliest cat–human association to 9,500 years ago (Vigne et al. 2004). Genetic analyses suggest that the ancestors of cats were Felis silvestris, a solitary species, and that cats were domesticated in the Near East, probably coincident with the development of agriculture (Driscoll et al. 2007). Cats domesticated themselves in ancient times, either to prey on rodents that were attracted to humans’ stocks of grains and cereals (Clutton-Brock 1988) or as scavengers of human waste food (Todd 1978; Driscoll et al. 2009). Thus, artificial selection was not conducted consciously, and the original domestic cat was a product of natural selection (Clutton-Brock 1988; Driscoll et al. 2009). Compared with dogs, which are considered to have been domesticated about 15,000 years ago (Savolainen et al. 2002), cats have a shorter history of domestication and a relative lack of early artificial selection. This has resulted in less attention being paid to domestic cats in the context of human interaction.

Nevertheless, domestic cats have developed features that differ from those of their wild ancestors. Since they depend on the food provided (intentionally or unintentionally) by humans, they occasionally form social groups because of population density in the habitat. To maintain these social groups, they have developed intraspecific communication not found in other solitary felids (Bradshaw and Cameron-Beaumont 2000). Moreover, they have also developed traits related to human interaction. For example, their meowing, one of the communicative behaviors of domestic cats, is perceived by humans as being more pleasant than that of African wild cats (Felis silvestris lybica; Nicastro 2004). Some cats show solicitation purring, which is exhibited at feeding time when the cats are actively soliciting food from their owners and is perceived by humans as more urgent and less pleasant than nonsolicitation purring (McComb et al. 2009). Like dogs, domestic cats also have the ability to use human pointing as a cue to locate hidden food (Miklósi et al. 2005). They react differently to unfamiliar and familiar humans (Collard 1967; Casey and Bradshaw 2008). These results indicate that the social abilities of domestic cats are not confined to conspecifics but are also applicable to humans.

A social ability widely seen in a number of species is differentiation between conspecifics by using individual differences in vocalizations. For example, zebra finches (Taeniopygia guttata castanotis) recognize mates on the basis of their calls (Vignal et al. 2004, 2008); bottlenose dolphins (Tursiops truncatus) use whistles for mother–infant recognition (Sayigh et al. 1999); mother vervet monkeys (Cercopithecus aethiops) can distinguish their own offspring’s screams from those of others (Cheney and Seyfarth 1980); and female African elephants (Loxodonta africana) can distinguish the calls of female family and bond group members from those of female outsiders (McComb et al. 2000). Similarly, some domestic animals are known to be able to recognize individual humans through voice. For example, horses can match the forms and voices of familiar handlers when a handler is presented together with a stranger (Proops and McComb 2012). Dogs can match their owners’ voices and faces (Adachi et al. 2007). As mentioned above, cats can distinguish familiar humans from unfamiliar ones. However, it remains to be seen whether this distinction can be made using vocal cues.

In this experiment, we investigate whether domestic cats can recognize their owners’ voices and distinguish them from other human voices. We felt that this question would be best answered by using animals that have already developed a relationship with humans. Consequently, almost all of the cats used here were kept in ordinary homes. The experiment was conducted in their homes with a habituation–dishabituation method. Visiting owners’ homes allowed us to observe the cats’ natural behaviors which might be disrupted in a laboratory because of vigilance against a novel place. Further, the habituation–dishabituation method enabled us to measure their reactions during a one-time visit, as no extensive training was required. Three strangers’ voices, followed by the owner’s voice and another stranger’s voice, were presented serially to the cats. If the cats habituated to the strangers’ voices and dishabituated to the owners’, a rebound of response to the presentation of the owner’s voice should be observed.

Methods

Subjects

Twenty domestic cats (12 males and 8 females) participated in this study. Of these, 19 were indoor cats, which were kept in 12 families comprising four male and eight female owners, while one was an outdoor cat, which was kept on a university campus by a male owner. The experimental procedure for the outdoor cat did not differ significantly from that for the other cats; therefore, we combined the data obtained from this cat with those of the others. Subjects were 14 mongrels, two American Shorthairs, a Russian Blue, a Maine Coon, a Somali, and a Norwegian Forest Cat. Their ages ranged from 1 to 11 years, and the mean age was 6.05 years (SE = 0.67). The indoor cats began to live with their owners some months after birth at the latest and were spayed or neutered. The cats were not subjected to food deprivation during the study period. We asked the owners whether the cats were group fed and how frequently visitors came to the home (more than once, once, or less than once a month). This information is presented in the Electronic Supplementary Material.

Apparatus and stimuli

Five sound stimuli were prepared for each subject. One contained the voice of the owner calling the subject’s name, and the other four stimuli contained the voices of four different same-sex strangers calling out the subject’s name. Each owner was instructed to call out the cats’ names as they normally would. In addition, if they usually called their cats by their nicknames instead of their real names, the nicknames were used instead. Strangers were asked to call the cats’ names in the same manner as the owners. Thus, phonological elements were identical between the owners’ and strangers’ calls. Eleven men and 15 women participated as strangers in this experiment. They were randomly assigned to same-sex owners. We recorded the calls with a handheld digital audio recorder (ZOOM H2 Handy Recorder) in WAV format. The sampling rate was 44,100 Hz and the sampling resolution was 16-bit. The sound stimuli were adjusted to the same volume level by using sound editing software (Adobe Soundbooth CS4).

The number of stimuli used was determined in a pretest. The experimenter’s cat, which was not involved in the experiment, was exposed to four strangers’ voices calling out its name with a 30-s interstimulus interval (ISI). Its response was evaluated using a behavioral score identical to the one used in the actual experiment, which is described below. The decrease in score after the presentation of the third stimulus was attributed to habituation, leading us to conclude that the serial presentation of the voices of strangers 1–3 was sufficient to induce habituation.

During the experiment, the handheld recorder was used to present the stimuli through a speaker (Sony SRS-Z100), which was hidden from the subject. The distance between the subjects and the speaker was about 3 m. The volumes of the voices were approximately 65 dB at 3 m from the speaker. A video camera (Sanyo DNX-CA9), placed in front of the subjects, recorded their reactions during the playback of the stimuli.

Procedure

Experiments were conducted from August 2009 to March 2010 in the owners’ homes or on the university campus depending on the locations of the subjects. We used the habituation–dishabituation procedure. The experimenter waited until the subjects were calm before beginning the experiment. The stimuli were then played serially with a 30-s ISI. During the presentation, the owners were out of the subjects’ sight. The order of the presentation was stranger 1, stranger 2, stranger 3, owner, and stranger 4. Subjects’ responses to the stimuli were expected to decrease because of habituation during the presentation of the voices of strangers 1 through 3. If the subjects could discriminate their owners’ voices from those of the strangers, the subjects’ responses should increase again when presented with their owners’ voices because of dishabituation. The experiment lasted around 3 min. Two cats moved around throughout the experiment; they were subsequently rated as nonhabituated, so their data were not used for the analysis of owner’s voice discrimination. Three cats, including one nonhabituated cat, showed displacement once during an ISI. The other 15 cats sat or lay down in one location but stayed awake.

Behavioral analysis

Recorded videos of subjects’ responses were clipped with Adobe Premiere CS4 for each stimulus presentation, from 5 s before stimulus onset to 10 s after stimulus offset. Voices calling out the cats’ names in the clips were masked with pure tones for the purpose of blind evaluation. In total, 100 clips were created (20 subjects × 5 stimuli).
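As an illustration only, the playback order and the clip windows described above can be laid out on a timeline. This is a minimal sketch, not the authors' procedure: it assumes that the ISI runs from the offset of one call to the onset of the next, and the call durations passed in are hypothetical because the text does not report them. The names `Trial` and `build_schedule` are invented for this example.

```python
from dataclasses import dataclass

# Presentation order and timing constants taken from the text.
ORDER = ("stranger 1", "stranger 2", "stranger 3", "owner", "stranger 4")
ISI_S = 30.0        # interstimulus interval in seconds
PRE_CLIP_S = 5.0    # each clip starts 5 s before stimulus onset
POST_CLIP_S = 10.0  # each clip ends 10 s after stimulus offset


@dataclass(frozen=True)
class Trial:
    label: str
    onset: float
    offset: float
    clip_start: float
    clip_end: float


def build_schedule(durations, first_onset=PRE_CLIP_S):
    """Return one Trial per stimulus, in presentation order.

    Assumption: the ISI is measured from the offset of one call to the
    onset of the next. `durations` (seconds per call) is hypothetical
    input; call lengths are not reported in the text.
    """
    if len(durations) != len(ORDER):
        raise ValueError("expected one duration per stimulus")
    trials = []
    onset = first_onset
    for label, duration in zip(ORDER, durations):
        offset = onset + duration
        trials.append(
            Trial(label, onset, offset, onset - PRE_CLIP_S, offset + POST_CLIP_S)
        )
        onset = offset + ISI_S
    return trials
```

Calling `build_schedule([2.0] * 5)` with hypothetical 2-s calls returns five trials whose clip windows can be passed to any video-cutting tool.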

We conducted two kinds of analyses to investigate the subjects’ response styles and response magnitudes. The first analysis was conducted to describe response style. One of the experimenters observed the clips of each subject in random order and classified the subjects’ responses to the stimuli into six categories: ear moving, head moving, pupil dilating, vocalizing, tail moving, and displacement. Each category is described in Table 1. These categories include orienting responses (ear moving and head moving; Olmstead and Villablanca 1980) and communicative responses (vocalizing and tail moving; Bradshaw and Cameron-Beaumont 2000). Each category was scored separately as 0 (absent) or 1 (present) for each clip to determine the proportion of subjects showing each response in each presentation trial. The six category scores were then summed to give a total score for each clip, which was used to examine the correlation between the number of categories occurring simultaneously and the response magnitude rated by blind raters (described below). To check for reliability, the other experimenter observed a random selection of half of the clips and scored the subjects’ behaviors. The indices of concordance were 0.78 for ear moving, 0.80 for head moving, 0.96 for pupil dilating, 1.00 for vocalizing, 1.00 for tail moving, and 0.98 for displacement (κ = 0.72, P < 0.0001 for overall observation).
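The scoring scheme above lends itself to a short sketch. The code below shows how a total score per clip and two agreement measures between observers could be computed from 0/1 category scores. It is illustrative only: the text does not say how the per-category indices of concordance were calculated, so treating them as simple proportions of agreement is an assumption, and the function names are invented for this example.

```python
CATEGORIES = ("ear moving", "head moving", "pupil dilating",
              "vocalizing", "tail moving", "displacement")


def total_score(clip_scores):
    """Sum the six 0/1 category scores for one clip (range 0 to 6)."""
    return sum(clip_scores[category] for category in CATEGORIES)


def percent_agreement(a, b):
    """Proportion of clips on which two observers gave the same 0/1 score."""
    if len(a) != len(b) or not a:
        raise ValueError("need two equally long, non-empty score lists")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two observers' binary (0/1) scores."""
    p_observed = percent_agreement(a, b)
    p_a = sum(a) / len(a)  # proportion scored "present" by observer A
    p_b = sum(b) / len(b)  # proportion scored "present" by observer B
    p_chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters for rarely observed categories such as vocalizing, where two observers can agree on most clips simply by scoring "absent".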

Table 1 Descriptions of categories for behavioral scores

The second analysis was conducted to examine the response magnitude. Ten blind raters (five men and five women, mean age = 25.5 years) scored the subjects’ responses from the 100 clips in a random order within each subject. The raters were instructed to compare the subjects’ behaviors before and after the presentation of each stimulus and rate the magnitude of the subjects’ responses to the stimuli from 0 (no response) to 3 (marked response). Kendall’s coefficient of concordance showed significant moderate concordance among the raters (W = 0.55, df = 99, P < 0.0001).
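For readers who want to see how a concordance coefficient of this kind is obtained, the following is a minimal sketch of tie-corrected Kendall's W and its chi-square approximation. It is not the authors' code, the software they used is not stated, and all function names are illustrative. With 10 raters and 100 clips, the approximation gives 100 − 1 = 99 degrees of freedom, which matches the df reported above.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions.

    Returns (ranks, tie_term), where tie_term is the sum of t**3 - t
    over all groups of t tied values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        size = end - start + 1
        tie_term += size ** 3 - size
        start = end + 1
    return ranks, tie_term


def kendalls_w(ratings):
    """Tie-corrected Kendall's W; ratings[j][i] is rater j's score for clip i."""
    m, n = len(ratings), len(ratings[0])
    rank_sums = [0.0] * n
    total_ties = 0
    for row in ratings:
        ranks, tie_term = average_ranks(row)
        total_ties += tie_term
        for i, rank in enumerate(ranks):
            rank_sums[i] += rank
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    denominator = m ** 2 * (n ** 3 - n) - m * total_ties
    if denominator == 0:
        raise ValueError("W is undefined when every rater scores all clips equally")
    return 12 * s / denominator


def concordance_chi_square(w, n_raters, n_items):
    """Chi-square approximation for W: m * (n - 1) * W with n - 1 df."""
    return n_raters * (n_items - 1) * w, n_items - 1
```

Because the 0 to 3 rating scale produces many tied scores across 100 clips, the tie correction in the denominator is the relevant design choice here; without it, W would be underestimated.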

Results

Behavioral score

Figure 1a shows a summary of subjects’ response styles to the stimuli as scored by the experimenter. More than half of the subjects responded to voice stimuli by moving their heads. About 30 % of the subjects also responded by moving their ears. Fewer than 20 % of the cats demonstrated vocalization and tail movement. This trend did not differ between the strangers’ voices and the owners’ voices. The total scores (Fig. 1b) were moderately correlated with the average response magnitudes evaluated by the raters, which are reported in the next section (Spearman’s rank correlation, ρ = 0.63, P < 0.0001). Thus, the raters’ evaluations of the response magnitudes may have partly depended on the number of simultaneously occurring responses of the subjects.
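The rank correlation reported above can be illustrated with a short sketch. Spearman's ρ is the Pearson correlation of the ranks, with tied values sharing their average rank; ties are likely here because total scores only take integer values from 0 to 6. The code is a generic implementation under that definition, not the authors' analysis script, and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    return [
        sum(u < v for u in values) + (sum(u == v for u in values) + 1) / 2
        for v in values
    ]


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_rank = (len(x) + 1) / 2  # mean of the ranks 1..n, with or without ties
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_rank) ** 2 for a in rank_x)
    var_y = sum((b - mean_rank) ** 2 for b in rank_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)
```

In this setting, `x` would hold the total behavioral score of each clip and `y` the mean magnitude assigned to the same clip by the raters.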

Fig. 1 (a) Behaviors observed in response to voice stimuli and the percentage of individuals that expressed them. (b) Mean total scores calculated using the behavioral score for all subjects. Error bars indicate SEs

Owner’s voice discrimination

The raters’ evaluation revealed that 15 out of the 20 subjects decreased average response magnitude from stranger 1 to stranger 3. These subjects were considered to have successfully habituated to the sound stimuli calling their names. This frequency was significantly higher than chance (two-tailed binomial test, P < 0.05), confirming that the number of stimuli presented was sufficient for habituation in the present experiment. Then, 11 out of the 15 habituated cats increased their response magnitudes from stranger 3 to owner. Group-level analysis revealed a significant increase in response magnitude in the 15 habituated subjects (two-tailed Wilcoxon signed rank test: V = 22, N = 15, P = 0.03, Fig. 2). Thus, habituated cats dishabituated when they heard their owners’ voices calling their names. Rehabituation from owner to stranger 4 was not observed (V = 69, N = 15, P = 0.62, Fig. 2).
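The reported statistics can be checked against exact null distributions with a short sketch. The first function computes an exact two-tailed binomial probability for 15 habituated cats out of 20 at a chance level of 0.5. The second computes an exact two-tailed probability for a signed-rank statistic V, assuming no tied and no zero differences. That assumption may not hold for the averaged ratings used here, and the text does not state which software or tie handling was used, so the values are a plausibility check rather than a reproduction. Function names are illustrative.

```python
from math import comb


def binomial_two_tailed_p(successes, n, p=0.5):
    """Exact two-tailed binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null hypothesis.
    """
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance so equally likely outcomes are not lost to rounding.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-7)))


def wilcoxon_exact_two_tailed_p(v, n):
    """Exact two-tailed p-value for the signed-rank statistic V.

    Assumes n nonzero differences with no ties, so that under the null
    hypothesis every subset of the ranks 1..n is equally likely to be
    the set of positive ranks.
    """
    total = n * (n + 1) // 2
    if not 0 <= v <= total:
        raise ValueError("V must lie between 0 and n(n+1)/2")
    # counts[s] = number of subsets of {1..n} whose ranks sum to s
    counts = [1] + [0] * total
    for rank in range(1, n + 1):
        for s in range(total, rank - 1, -1):
            counts[s] += counts[s - rank]
    lower = sum(counts[: v + 1])  # subsets with rank sum <= V
    upper = sum(counts[v:])       # subsets with rank sum >= V
    return min(1.0, 2 * min(lower, upper) / 2 ** n)
```

Under these assumptions, `binomial_two_tailed_p(15, 20)` is about 0.041, consistent with the reported P < 0.05, and `wilcoxon_exact_two_tailed_p(22, 15)` falls between 0.02 and 0.05, in line with the reported P = 0.03.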

Fig. 2 Mean magnitudes of responses to each voice in habituated subjects. Error bars indicate SEs

Discussion

We serially presented three strangers’ voices to the cats, followed by their owners’ voices. Among the 20 tested animals, 15 demonstrated decreased response magnitudes to the voices of strangers 1–3. These subjects’ responses then increased when presented with their owners’ voices. This suggests that domestic cats are able to recognize individual humans, who are not conspecifics, through vocal communication as well as through face-to-face interaction (Collard 1967; Casey and Bradshaw 2008). The present results did not show rehabituation from owner to stranger 4, as has been shown in previous research employing the habituation–dishabituation paradigm (e.g., Rendall et al. 1996; Reby et al. 2001; Charlton et al. 2007). However, learning studies have shown that dishabituation induces rebound of responses to habituated stimuli after the presentation of novel ones (e.g., Thompson and Spencer 1966; Pinsker et al. 1979), which indicates that one-trial rehabituation is not a necessary condition for the demonstration of successful discrimination.

It remains unclear which feature(s) of human voice serve(s) as a cue for distinguishing their owners’ voices. In this study, we used the subjects’ names as stimuli to elicit responses from the cats. As pet owners often have idiosyncratic ways of calling a pet’s name, these idiosyncrasies might have induced strong responses on their own. To counter this possibility, we instructed owners and strangers to call the cats’ names naturally, assigned same-sex strangers to the owners, and asked the strangers to call the cats’ names in ways that matched the phonological elements in the owners’ calls. Future studies can investigate the acoustic feature(s) necessary for owner voice recognition in cats.

The analysis of behavior categories revealed that the cats responded to human voices mainly through orienting behavior (ear movement and head movement), but not through communicative behavior (vocalization and tail movement). This tendency did not change even when they were called by their owners (Fig. 1). Although cats utilize specific purring for solicitation (McComb et al. 2009), these results indicate that cats do not actively respond with communicative behavior to owners who are calling them from out of sight, even though they can distinguish their owners’ voices. This cat–owner relationship contrasts with the dog–owner relationship. Dogs are known to not only understand social cues from humans, such as pointing gestures (e.g., Hare et al. 2002) and human facial expressions (Nagasawa et al. 2011), but also to send social cues of eye contact to their owners (Miklósi et al. 2005; Nagasawa et al. 2009a; Passalacqua et al. 2011). The response style of cats shown in the present experiment might be one of the factors that lead people to believe that cats are calm, lazy, unfriendly, and not affectionate (Serpell 1996) or less cooperative and sympathetic (Podberscek and Gosling 2000) than dogs.

The communication style of cats is very different from that of dogs, as mentioned above. In fact, Serpell (1996) has shown that dogs are perceived by owners as being more affectionate than cats. However, dog owners and cat owners did not differ significantly in their reported attachment levels to their pets (Serpell 1996). This finding may reflect a difference in expectations between cat owners and dog owners. One questionnaire study revealed that the more affection dog owners had toward their dogs, the more frequently they tended to have physical contact with them. However, no such relationship was observed among cat owners (Ota et al. 2005). Thus, the behavioral aspects of cats that cause their owners to become attached to them are still undetermined.

Generally, owners report that cats have a special relationship with them, indicating that cats might be able to establish an attachment to their owners. Dogs, on the other hand, show attachment behavior to their owners explicitly. They behave differently with their owners compared to strangers, as observed in the Strange Situation test (Topál et al. 1998). They also exhibit emotional responses when reunited with their owners after separation (Nagasawa et al. 2009b). Such attachment-related responses have not yet been demonstrated in cats. Historically speaking, cats, unlike dogs, have not been domesticated to obey humans’ orders. Rather, they seem to take the initiative in human–cat interaction (Turner 1991). These historical and behavioral differences complicate the application of the experimental paradigms used to study dogs to the study of cats. The development of appropriate methodology would shed light on social cognitive ability in cats.