Introduction

The use of fMRI provides an exciting area of cognition research with domestic dogs, particularly in regard to interspecific affiliation and socio-cognitive processes. Recent imaging studies have identified general regions of activation in the temporal cortex and caudate of the dog brain for human faces (Cuaya et al. 2016; Dilks, Cook et al., 2015). Further, we have identified separate regions of the dog brain implicated in processing human and dog faces, respectively, analogous to the fusiform gyrus and superior temporal sulcus in the human brain (Thompkins et al. 2018). These areas are sensitive to face stimuli but are not face selective, as other findings have indicated that faces are not processed in specialized regions of the dog brain, in contrast to conspecific face processing by humans (Bunford et al. 2020; Szabó et al. 2020). These findings both inform the suitability of fMRI for investigating socio-cognitive processes in the domestic dog and call for further study of variability and specialized regions of interest.

To explore the effects of familiarity on dogs’ processing of human faces, we look to similar research that has been conducted in humans. Stoeckel et al. (2014) used fMRI to compare responsiveness to images of children and dogs. Participants completed behavioral measures for assessment of attachment to their children and dogs, after which they viewed images of their own child, their own dog, and unfamiliar dogs and children in the scanner and were asked to score them according to valence and arousal. Attachment measures indicated that 93% of participants were extremely attached to their companion dog, considering him or her a family member. Indeed, functional data revealed overlapping regions of brain activation in the human owners’ brain including those associated with reward, emotion, and affiliation, namely the amygdala, hippocampus, and fusiform gyrus. Two contrasts did reveal significant differences between familiar conditions. Images of one’s child led to the activation of the substantia nigra/ventral tegmental area (implicated in reward and affiliation) whereas this pattern of activation was not seen with images of one’s dog. And although the amygdala was activated by both conditions, images of one’s dog led to greater activation of the fusiform gyrus than did one’s child. Stoeckel et al. (2014) note that this may be due to the lack of language-based affiliation with dogs, as human–dog interaction may be more dependent on face perception to pick up on emotion, gaze direction, and identity.

In regard to behavior, studies have shown that dogs can discriminate between human faces expressing different emotional content (Müller et al. 2015) as well as between familiar and unfamiliar human faces (Adachi et al. 2007). As described herein, the effect of familiarity was characterized using an unsolvable task (e.g., Horn et al. 2012; Lazarowski et al. 2020; Marshall-Pescini et al. 2009). In this task, a dog is presented with a scenario in which he or she is unable to access a treat or toy that is beyond some sort of barrier. Generally, behaviors directed toward the human in this situation are considered indicative of the dog drawing on its human partner for assistance in the task.

We selected the unsolvable task as a complementary out-of-scanner measure due to evidence that it may reveal behavioral biases in dog populations (e.g., Horn et al. 2012; Marshall-Pescini et al. 2009). Here, we sought to parse out behavioral tendencies to seek the assistance of a familiar person in a potentially stressful situation. To measure bias in the unsolvable task, data were analyzed in terms of frequency and duration of social and communicative behaviors directed towards the familiar versus unfamiliar person. An additional benefit of this task is the prevention of within-session training effects, as the task utilizes comparatively fewer trials than other social measures and does not introduce any response-specific reward contingencies.

The study population (working odor detection dogs in training) was unique in that these dogs are trained to focus on independent performance (Rooney et al. 2004). However, because the dogs cooperate with and attend to the trainer during training, a preference for the familiar trainer could be expected to emerge when they face an unsolvable task. This preference may be especially strong during the training phase, when guidance by the trainer is more pronounced. We hypothesized that dogs would seek assistance from their familiar human more than from an unfamiliar human. That is, we expected dogs to make a greater number of attempts to engage the familiar human and to spend more time doing so.

In the following experiment, the neural basis of familiarity and emotion processing of human faces in the dog brain was measured by presenting still images and videos of familiar and unfamiliar faces varying in emotional expression (as determined by valence scores) while dogs underwent awake fMRI. Brain activity while passively viewing the visual stimuli was correlated with a familiarity bias score derived from the out-of-scanner unsolvable task. Based on the previous research, we hypothesized that dogs would demonstrate reliable activations in response to human faces in accord with both familiarity and emotional valence. Following from human research, regions of interest included analogous regions to those associated with facial familiarity and facial emotion processing in humans. Further, non-human primate work suggested that we might find differential activation in the hippocampus that was mediated by familiarity (Sliwa et al. 2014), as well as differential activation in the amygdala that was mediated by emotional valence (Hadj-Bouziane et al. 2012). We hypothesized that correlations between behavior and neural activity would be representative of mediation by familiarity for both measures.

Methods

Subjects

Thirty-seven dogs (age: M = 2.03 years; breed: Belgian Malinois = 6, German Shepherd = 1, Labrador retrievers = 28, Labrador retriever-German Wirehaired Pointer mix = 1, Springer Spaniel = 1; sex: female = 17, male = 20) procured and trained by iK9, LLC for detection tasks participated in the study. All dogs were between 6 months and 3 years of age. All dogs remained awake for imaging, for which they were trained to lie in a prone position on the scanner bed with the head inserted into a human knee coil. Positive reinforcement was provided to keep dogs as still as possible and to desensitize them to the scanner environment. Ethical approval for this study was obtained from the Auburn University Institutional Animal Care and Use Committee (Protocol 2016–2942) and all methods were performed in accordance with their guidelines and regulations.

Visual fMRI task

Stimuli

The stimuli included in the still images set varied by dog (due to varying familiarity with individual trainers), but each stimulus set included 24 images. Unfamiliar humans were individuals with no history of working with the dogs on a regular basis and did not have a history of giving the dogs commands and/or rewards. In contrast, the familiar humans were defined as the trainers who regularly interacted with, cared for, and conducted training with the dogs. The trainers of the dogs typically engaged each dog in both preparatory training for fMRI and detection related training. The combined interaction time was 30–60 min a day, 4 days per week for at least 3 months at the time of scanning. Within each set, there were four positive familiar images, four neutral familiar images, four negative familiar images, four positive unfamiliar images, four neutral unfamiliar images, and four negative unfamiliar images. Familiar images included positive, neutral, and negative expressions from each of four individuals. The unfamiliar stimulus set included more than four individuals, as individual unfamiliar stimuli were included according to closest valence match by condition to the familiar stimuli. (See Thompkins et al. 2018, for stimulus development and validation). Likewise, the stimuli included in the video set varied according to dog group (trainer familiarity variance), but each set included 24 videos. Within each set, there were four positive familiar videos, four neutral familiar videos, four negative familiar videos, four positive unfamiliar videos, four neutral unfamiliar videos, and four negative unfamiliar videos.

We ensured that the characteristics of the stimuli in the spatial and spectral domains were not significantly different across conditions. While still images are classically used in the literature, we wanted to ensure attention, and videos provided greater opportunity for stimulus salience across conditions. Further, videos, being multimodal in nature, may be processed differently than images. Therefore, we wanted to test whether the neural response to facial familiarity and facial emotions correlates with behavioral measures of the dog–human bond, separately for images and videos. We retained the sounds in the videos (actors saying “good dog” and “bad dog”) because we felt that doing so would be more ecologically valid with respect to the daily interactions that the dogs have with humans.

Still images

The still images condition consisted of familiar faces (trainers) and faces of unfamiliar individuals. Within these conditions, the models demonstrated positive expressions, neutral expressions, and negative expressions. Models were encouraged to display as much emotion as possible for each photo. Images were captured using a Canon Rebel XT 8-megapixel DSLR camera and were edited and processed in Aperture (https://support.apple.com/aperture). Images were cropped to 600 × 600 pixels framed around the face and neck and were saved as JPEG files. A sample still image set is provided in Fig. 1 showing the mean valence and standard error of the mean for each picture.

Fig. 1

A sample of a still image stimulus set. Images of unfamiliar humans were matched to images of familiar humans according to emotional valence score. Valence values ranged from −5 (most negative) to +5 (most positive). Mean valence values and SEMs are included. The final stimulus set consisted of eight positive images, eight neutral images, and eight negative images

Videos

The video condition consisted of familiar and unfamiliar individuals displaying positive, neutral, and negative emotions. In the positive condition, models said, “Good dog!” repeatedly in an excited tone and with a great deal of positive expression. In the neutral condition, we avoided the use of potential ‘trigger words’ and asked the models to repeat, “We’re gonna do this. We’re gonna do that.” They did this in a monotone voice with no emotional expression. In the negative condition, models said, “Bad dog!” repeatedly in a forceful tone and with an angry expression. Videos were captured using a GoPro Hero 3 camera at 30 frames per second and were edited and processed in QuickTime for Mac. Videos were adjusted to 1024 × 768 pixels framed around the face and neck and were saved as AVI files.

Procedure

Awake fMRI training

Training was conducted according to our group’s previously-established protocols (see Jia et al. 2014, 2016, for details). For dogs to lie motionless and awake while unrestrained in the scanner, progressive positive-reinforcement training was implemented (Strassberg et al. 2019). Training progressed from basic behavioral shaping using the clicker/treat and target stick methods, through off-site mock scanner training, and finally to training in the scanner environment. Clicker training involves the pairing of a food reward with a “click” to create a marker for appropriate behavior. In early training, the appropriate behavior of touching the snout to a target stick was rewarded. Clicks and treats were presented at a rapid rate (e.g., every 2 s as long as the desired behavior was maintained) and this time span gradually increased until a dog maintained the appropriate behavior for several minutes. The use of a target stick ensured appropriate positioning of the dog in the scanner. In the functional imaging experiment, the appropriate behavior was defined as lying motionless in the prone position with his/her head in the coil for three to five minutes.

Clicker and treat training were conducted, along with scanner audio acclimation, in the mock scanner (Fig. 2a) until the dog demonstrated criteria performance. The dogs then entered MRI suite acclimation training, wherein they were first allowed to adjust to the sights and sounds of the scanner environment by walking around the suite and climbing onto the patient table. When the dog demonstrated ease in the scanner room, clicker and treat training were reintroduced inside the scanner (Fig. 2b). When the dog again reached the appropriate behavior criterion (3–5 min lying motionless with head in the coil), he/she was deemed ready for the experimental data acquisition (Fig. 2c).

Fig. 2

a The mock coil used in pre-scanner training. Dogs were trained to lie motionless and awake while unrestrained with the aid of clicker/treat training. A high-fidelity audio recording (CD) of scanner noise was played at increasing sound levels to acclimate the dog to the environment. b Transitional training was conducted in the MRI suite to further acclimate the dogs to the MR environment in preparation for scanning. c, d Elements of the experimental setup for imaging awake dogs. c A black Labrador retriever inserting its head into the human knee coil and staying still. d The system for tracking head motion using an external camera and videography. The MR-compatible projector screen was attached to the end of the bore during imaging and is not shown here as it would block the view of the dog and the coil

Scan parameters

The setup comprised a 3 T Siemens Verio scanner, a human knee coil adapted as a dog head coil, a projector system to present visual stimuli, and an external infrared camera used to track head motion in dogs and retrospectively correct for motion artifacts in the data (Figs. 2d, 3). The 3 T Siemens Verio scanner has a 70-cm bore, as opposed to the standard 60-cm bore of most scanners, which allowed more room for the trainers to monitor the dogs while data were acquired. Functional data were obtained using an EPI sequence with the following parameters: repetition time (TR) = 1000 ms, echo time (TE) = 29 ms, field of view (FOV) = 192 × 192 mm², flip angle (FA) = 90°, in-plane resolution = 3 × 3 mm, in-plane matrix = 64 × 64, and whole-brain coverage. Anatomical data were obtained for registration purposes using an MPRAGE sequence with the following parameters: TR = 1550 ms, TE = 2.64 ms, voxel size = 0.792 × 0.792 × 1 mm³, FA = 9°, in-plane matrix = 192 × 192, FOV = 152 × 152 mm², number of slices = 104.

Fig. 3

An illustration of the experimental setup involving the MR-compatible eye-tracker. a Front view showing the screen, dog head coil and eye-tracker. b Rear view showing the screen, eye-tracker and an awake and unrestrained dog inside the dog head coil

Scanning

During scanning sessions, each dog completed four runs of randomized order, including two runs of images and two runs of videos. Each run totaled 140 s and included either 12 stimuli (human faces only) or 24 stimuli (human and dog faces). For dogs that viewed dog face stimuli, runs included eight randomly-distributed dog face stimuli (four familiar and four unfamiliar). Findings for dog face stimuli have been reported separately (Thompkins et al. 2018). Stimuli were presented as a rapid event-related design via projector screen for five seconds, after which a blank screen was presented for a variable 3- to 11-s inter-stimulus interval (ISI) before moving to the next image. The ISI duration for each trial was optimized using OPTSEQ software (https://surfer.nmr.mgh.harvard.edu/optseq/). The stimuli were presented closely enough in time that their hemodynamic responses would overlap. This required that the onset times of the events be jittered to remove the overlap from the estimate of the hemodynamic response. This design is highly resistant to habituation and expectation because the dog does not know when the next stimulus will appear or which stimulus type it will be. It is also more efficient than fixed-interval event-related design because more stimuli can be presented within a given scanning interval at the cost of assuming that the overlap in the hemodynamic responses will be linear (referred to as “stochastic designs” in SPM).
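The jittered timing described above can be illustrated with a minimal sketch. It assumes integer ISIs drawn uniformly from 3 to 11 s, which is a simplification: OPTSEQ searches for schedules that optimize estimation efficiency, and this sketch neither does that nor constrains the run to 140 s. The function name, parameters, and seed are hypothetical.

```python
import random


def jittered_onsets(n_stimuli=12, stim_dur_s=5.0, isi_range_s=(3, 11), seed=0):
    """Return stimulus onset times (s) for one run with a jittered ISI.

    Assumption: ISIs are integer seconds sampled uniformly at random.
    This is NOT the OPTSEQ optimization used in the study.
    """
    rng = random.Random(seed)
    onsets = []
    t = 0.0
    for _ in range(n_stimuli):
        onsets.append(t)
        isi = rng.randint(*isi_range_s)  # inclusive on both ends
        t += stim_dur_s + isi            # stimulus, then blank screen
    return onsets


onsets = jittered_onsets()
gaps = [b - a for a, b in zip(onsets, onsets[1:])]
print(onsets)
print("onset-to-onset gaps (s):", gaps)
```

Because each stimulus lasts 5 s and each ISI lasts 3 to 11 s, every onset-to-onset gap in this sketch falls between 8 and 16 s, so the dog cannot predict when the next stimulus will appear.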

Attention scoring

Several precautions were taken to ensure that only trials in which the dog attended to the stimulus were analyzed. Attention was judged by two or more raters via simultaneous video recording of stimulus presentation and the dog’s eye. For each trial, if the dog’s eye was visibly open, then the rater assigned a score of “yes”. If the dog’s eye was closed or not open enough for the pupil to be visible, then the rater assigned a score of “no”. Inter-rater reliability was assessed for each trial, and only trials with 100% inter-rater agreement of attentiveness were retained for data analysis. The inattentive trials were modeled as baseline/null trials during analysis.
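The retention rule can be sketched as follows. The sketch assumes that a trial counts as attended only when every rater scored it “yes”, and that any other trial is treated as a baseline/null trial; how trials with rater disagreement were handled is not stated explicitly, so that part is an assumption. The function name and the ratings are hypothetical.

```python
def classify_trials(ratings_per_trial):
    """Label each trial from its per-rater 'yes'/'no' attention scores.

    Assumption: only unanimous 'yes' trials are retained as attended;
    all others are modeled as baseline/null trials.
    """
    labels = []
    for ratings in ratings_per_trial:
        if all(r == "yes" for r in ratings):
            labels.append("attended")
        else:
            labels.append("baseline")
    return labels


# Hypothetical scores from two raters for four trials
ratings = [["yes", "yes"], ["yes", "no"], ["no", "no"], ["yes", "yes"]]
labels = classify_trials(ratings)
print(labels)
```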

Unsolvable task

The unsolvable task was characterized by the familiarity of human models and the accessibility of the treat or toy. The unfamiliar human was defined as a research assistant that did not work with the dog on a regular basis and did not have a history of giving the dog commands and/or rewards. In contrast, the familiar human was defined as the trainer who regularly interacted with, cared for, and/or conducted training with the dog. Accessibility of the treat or toy was defined by the trial condition. During solvable trials, the dog was able to access the reward within the apparatus. During unsolvable trials, the apparatus was locked and the dog could not access the reward.

Apparatus

The apparatus (Fig. 4) was constructed of a plywood base (26″ × 20″ × 1.5″), upon which the lid to a Sterilite 2.5-qt (7 3/8″ × 5 5/8″ × 6″) storage container was mounted upside down. The container could then be placed upside down on the lid to conceal a treat (Purina Moist and Meaty pellet) or the dog’s toy. If the container was left unlocked, then the dog had easy access to the treat/toy during solvable trials (nose-poke pressure was enough to knock the container off the lid). If the container was locked, the dog was unable to access the treat or toy.

Fig. 4

The testing arena shows dog position in front of the apparatus and human positions to the left and right (counterbalanced). The apparatus was unlocked during solvable trials and locked during unsolvable trials

Design

A four-person experimental team conducted the unsolvable task. Experimenter 1 organized and set up task trials and recorded session information. Experimenter 2 handled the dogs. In addition, there was a familiar and unfamiliar human that served as stimuli. Each session consisted of four solvable trials followed by four unsolvable trials.

Procedure

An acclimation period was allowed before each session began. During this time, the dog was monitored and allowed to roam until he/she became visibly comfortable in the testing room. Stress indicators (panting, whining, etc.) were assessed and if such indicators were absent after 5 min, the dog was cleared to begin pre-training.

Each experimental session was preceded by pre-training. A series of demonstration trials were given to establish that manipulation of the apparatus resulted in a treat reward. That is, dogs were shown that the apparatus could be knocked over to reveal a reward and the dog was gradually trained (through continued visual demonstration, vocal encouragement, and praise upon completion) to knock the apparatus over on his/her own. Prior to each demonstration, Experimenter 2 brought the dog into the room and held him/her by the collar until Experimenter 1 gave the signal to release the dog. Once the dog reliably approached and knocked over the barrier to reveal the reward, the experimental session began.

To begin a trial, Experimenter 2 brought the dog into the training/testing area. The familiar human stood at his/her designated task position with head forward and the unfamiliar human stood at his/her analogous (mirrored) task position. When the dog was positioned appropriately at the starting point, Experimenter 1 said, “okay” and Experimenter 2 released the dog. The dog was given 15 s to interact with the apparatus. However, the trial was marked as complete when the dog obtained all of the treat reward(s) in the solvable condition or when the dog had diverted his/her attention from the apparatus for more than 15 s. Unsolvable trials were continued for 15 s, regardless of whether the dog interacted with the apparatus. Each trial was separated by a 30-s inter-trial interval (ITI), during which the dog was removed from the arena.

Each session was videotaped and later coded by two or more researchers and/or research assistants. Behaviors of interest included barking at, pawing at, sitting near, jumping on, gazing toward, and approaching the familiar and unfamiliar person (Passalacqua et al. 2013; Marshall-Pescini et al. 2009), which were combined into an aggregate score, as well as nonspecific behaviors such as barking or targeting (e.g., staring at) the apparatus. Aggregate scores were calculated in terms of both frequency and duration of those behaviors. That is, each individual occurrence of a behavior was given a value of ‘1’ and summed, and the duration of each behavior in seconds was recorded and summed.

Frequency and duration familiarity bias measures for the unsolvable task were calculated by subtracting the aggregate score for the unfamiliar person from the aggregate score for the familiar person. That is, for each dog, the frequency and duration scores for the unfamiliar person were subtracted from the corresponding scores for the familiar person, yielding the final frequency and duration bias scores. Thus, positive values indicate a bias toward the familiar person, and negative values indicate a bias toward the unfamiliar person. Aggregation of behavioral instances and durations was done to avoid false positives and multiple correlational tests. The data were coded using three categories of behaviors: behaviors directed at the familiar person, behaviors directed at the unfamiliar person, and nonspecific communicative behaviors.
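The bias computation can be sketched as follows. The sketch assumes that each coded behavior is stored as a (target, duration in seconds) pair and that nonspecific behaviors do not enter the bias scores; the data layout, function name, and example events are hypothetical.

```python
def familiarity_bias(events):
    """Return (frequency_bias, duration_bias) as familiar minus unfamiliar.

    events: list of (target, duration_s) tuples, where target is
    'familiar', 'unfamiliar', or 'nonspecific' (hypothetical layout).
    """
    freq = {"familiar": 0, "unfamiliar": 0}
    dur = {"familiar": 0.0, "unfamiliar": 0.0}
    for target, duration_s in events:
        if target in freq:  # assumption: nonspecific behaviors are excluded
            freq[target] += 1           # each occurrence counts as 1
            dur[target] += duration_s   # durations are summed in seconds
    frequency_bias = freq["familiar"] - freq["unfamiliar"]
    duration_bias = dur["familiar"] - dur["unfamiliar"]
    return frequency_bias, duration_bias


# Hypothetical coded behaviors for one dog
events = [("familiar", 3.0), ("familiar", 2.5),
          ("unfamiliar", 1.0), ("nonspecific", 4.0)]
frequency_bias, duration_bias = familiarity_bias(events)
print(frequency_bias, duration_bias)
```

A positive value in either score indicates a bias toward the familiar person, and a negative value indicates a bias toward the unfamiliar person.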

Correlation of neural and behavioral data

Familiarity bias scores for dogs with usable fMRI data were correlated with t values of voxel clusters activated in the fMRI task for the familiar vs. unfamiliar contrast as well as positive/negative vs. neutral emotions across the dog sample. A significant correlation would indicate that the familiarity bias displayed in the unsolvable task is supported by neural processes in the dog brain underlying judgments regarding familiarity of and emotions in human faces.
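A minimal sketch of such a correlation is given below, using the Pearson coefficient and the usual t statistic with n − 2 degrees of freedom for testing it. That the analysis used exactly this test is an assumption, and the bias scores and t values shown are hypothetical placeholders rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical duration bias scores and cluster t values for five dogs
bias = [-2.0, 0.0, 1.0, 3.0, 4.0]
cluster_t = [0.5, 1.0, 1.5, 2.0, 3.0]
r = pearson_r(bias, cluster_t)
print(round(r, 3), round(t_statistic(r, len(bias)), 3))
```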

Results

Visual fMRI task

Data retention

The imaging study began with 37 dogs (age: M = 2.03 years; breed: Belgian Malinois = 6, German Shepherd = 1, Labrador retrievers = 28, Labrador retriever-German Wirehaired Pointer mix = 1, Springer Spaniel = 1; sex: female = 17, male = 20). Data from several dogs had to be discarded for one or both of the following reasons: (i) excessive head motion and (ii) an insufficient number of attended trials to perform the analysis. A run-wise frame-wise displacement threshold of 0.9 mm was used, as suggested by previous studies (Siegel et al. 2014). Attendance to ≥ 75% of the trials was deemed sufficient. This condition was easily met in most circumstances, since > 80% of the trials were attended on average (Table 1). After these exclusions, we had usable data from 22 dogs for image runs and 25 dogs for video runs. Counts and percentages of stimuli attended to during still image and video runs are shown in Table 1.
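The two retention criteria can be sketched as a simple filter. The sketch assumes that head motion is summarized as a single frame-wise displacement value per run (for example, a mean) and that both thresholds are inclusive; neither detail is specified above, and the function name and example values are hypothetical.

```python
FD_THRESHOLD_MM = 0.9          # run-wise frame-wise displacement limit
MIN_ATTENDED_FRACTION = 0.75   # at least 75% of trials attended


def run_is_usable(run_fd_mm, n_attended, n_trials):
    """Return True if a run passes both the motion and attention criteria.

    Assumption: run_fd_mm is one summary value per run, and a run at
    exactly 0.9 mm or exactly 75% attendance is retained.
    """
    low_motion = run_fd_mm <= FD_THRESHOLD_MM
    enough_attention = n_attended / n_trials >= MIN_ATTENDED_FRACTION
    return low_motion and enough_attention


# Hypothetical runs: (FD in mm, attended trials, total trials)
runs = [(0.4, 10, 12), (1.2, 12, 12), (0.4, 8, 12)]
usable = [run_is_usable(*run) for run in runs]
print(usable)
```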

Table 1 Counts and percentages of stimuli attended to during still image and video runs

Image processing

Data processing was conducted using SPM12 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/, Functional Imaging Lab, The Wellcome Trust Centre for NeuroImaging, The Institute of Neurology at University College London). All usable data were run through standard preprocessing steps, including realignment to the first functional image, spatial normalization to our own custom dog template, and spatial smoothing (discussed in Jia et al. 2014, 2016). Following preprocessing, a general linear model (GLM) was used to determine voxels that were activated by effects of interest. In the GLM, in addition to regressors representing effects of interest, we used time and dispersion derivatives to account for the variability of the hemodynamic response function, as well as six motion-related regressors derived from rigid body registration and two motion regressors obtained from the external motion tracking device (see Jia et al. 2014 for details on how the two motion regressors from the external camera were obtained). The first level of analysis involved GLMs for individual dogs. t tests were used to determine voxels within individual dogs that were significantly (p < 0.05, false discovery rate corrected) more active when viewing familiar versus unfamiliar human face stimuli, positive versus neutral face stimuli, and negative versus neutral face stimuli.

Contrasting positive/negative facial emotions with neutral faces provides a binary estimate of whether regions in the dog brain show greater activation in response to emotionally valent faces than to neutral faces. However, we also wanted to investigate whether voxels that were significantly activated by emotions were sensitive to changes in emotional valence on a continuous scale. Such sensitivity would indicate a tighter coupling between the variable being modulated in the input (emotional valence in our case) and the extent of the underlying neural activation. To do so, we used parametric regressors in the GLM, which modulated the primary emotion regressors by values in the range [−5, 5], the range of valence scores that a given stimulus can be assigned. A corrected threshold of p < 0.05 was used, and correction for multiple comparisons was performed by cluster-size thresholding with AlphaSim (https://afni.nimh.nih.gov/pub/dist/doc/program_help/AlphaSim.html) as in Jia et al. 2014. A cluster threshold of 25 voxels was used, and the number of voxels in each cluster can be seen in Table 2.
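The difference between the binary and the parametric regressors can be illustrated with a minimal sketch. It builds unconvolved regressors only: convolution with the hemodynamic response function and the derivative terms are omitted. Mean-centering the valence scores is a common convention that is assumed here rather than taken from the text, and the onsets, valence scores, and function name are hypothetical.

```python
def build_regressors(onsets_s, valences, n_scans=140, stim_dur_s=5, tr_s=1.0):
    """Return (main, modulated) unconvolved regressors, one value per scan.

    main: 1 during each emotional stimulus, 0 elsewhere (binary effect).
    modulated: mean-centered valence during each stimulus (assumed
    convention), 0 elsewhere (continuous effect).
    """
    mean_valence = sum(valences) / len(valences)
    main = [0.0] * n_scans
    modulated = [0.0] * n_scans
    for onset, valence in zip(onsets_s, valences):
        start = int(round(onset / tr_s))
        stop = min(start + int(round(stim_dur_s / tr_s)), n_scans)
        for i in range(start, stop):
            main[i] = 1.0
            modulated[i] = valence - mean_valence
    return main, modulated


# Hypothetical onsets (s) and valence scores on the [-5, 5] scale
onsets_s = [0, 12, 25, 40]
valences = [4.0, -3.0, 1.0, 2.0]
main, modulated = build_regressors(onsets_s, valences)
print(sum(main), sum(modulated))
```

A voxel that loads on the main regressor responds more to emotional than to neutral faces overall, whereas a voxel that loads on the modulated regressor responds in proportion to the valence score of each stimulus.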

Table 2 Voxels in activated clusters

Next, second-level group analyses were conducted for each of the individual dog level contrasts described above to determine voxels that were significant at the group level. Significant areas of activation were identified in the caudate, hippocampus, and amygdala for still image and video presentations. Contrasts focused on familiarity (familiar > unfamiliar) and emotional content (positive > neutral and negative > neutral). Figure 5 shows the identified regions of activation in the dog brain for the still images (5a) and videos (5b) for the familiarity contrast. Differential activations mediated by familiarity were identified in the caudate for both images and videos and the amygdala for videos. Figure 6 shows the identified regions of activations for the still images (6a, c) and videos (6b, d) for the emotional content contrasts. Differential activations mediated by emotional valence were identified in the caudate (positive and negative images; negative videos), amygdala (positive and negative videos), and hippocampus (negative images; positive videos).

Fig. 5

Activation maps for a still image and b video conditions. Three orthogonal views (L to R: sagittal, axial, coronal) are shown for each subfigure. A color map is used for activation intensity as represented by t value, with warmer colors corresponding to higher t values. For image presentations, caudate activation was revealed for familiar versus unfamiliar faces. For video presentations, caudate and amygdala activation was shown for familiar versus unfamiliar faces

Fig. 6

Left: activation maps for positive > neutral a still image and b video conditions. Right: activation maps for negative > neutral c still image and d video conditions. Three orthogonal views (L to R: sagittal, axial, coronal) are shown for each subfigure. A color map is used for activation intensity as represented by t-value, with warmer colors corresponding to higher t values. Differential activations mediated by emotional valence were identified in the caudate (positive and negative images; negative videos), amygdala (positive and negative videos), and hippocampus (negative images; positive videos)

Parametric modulation by emotional content was investigated to decipher whether dog brain regions were sensitive to changes in emotional valence on a continuous scale (see Fig. 7). In both the caudate and hippocampus, voxels were identified which demonstrated (i) greater activation (i.e. a binary response) for emotional expression (negative, positive) than neutral, and (ii) were parametrically modulated by emotional valence scores (i.e. a continuous response). Voxels showing a binary response are interpreted as regions that on the whole activate more for emotionally valent faces as opposed to neutral faces. However, voxels that parametrically modulate with emotional valence show a response that continuously covaries with the magnitude of emotional valence. In some cases, the same voxels satisfied both (i) and (ii). In some cases, different voxels within the same region satisfied either (i) or (ii) independently. The explicit overlap of conditions was revealed only in the caudate. This may be due to finer spatial specialization in the amygdala (Bzdok et al. 2013) and hippocampus (Robinson et al. 2015), which are heterogeneous with multiple functional zones. Comparatively, the caudate is known to possess only two functional zones—dorsal and ventral, associated with motion/cognition and reward/affect, respectively (Choi et al. 2012; Huang et al. 2017)—and we found activation mostly in affect-related regions of the caudate.

Fig. 7

Results of positive vs. neutral and negative vs. neutral contrasts for a still images and b videos juxtaposed with regions parametrically modulated by emotions. Coronal view is presented in left columns and sagittal view is presented in right columns. Explicit overlap of conditions within same voxels (yellow) was revealed only in the caudate, although different voxels within the same regions were activated by both emotion contrasts and parametric modulation of emotions

Unsolvable task

Data for 28 dogs were obtained. Results of the unsolvable task were analyzed to uncover communicative behaviors made toward familiar versus unfamiliar humans. Data were grouped by subject, trial condition (solvable/unsolvable), and response type (unfamiliar/familiar/nonspecific) as factors.

Trial duration was mediated by the solvability of the task. Trial duration was consistent across the four solvable trials (M = 4.95 s, SD = 2.78), as confirmed by a one-way repeated-measures ANOVA over trials (1, 2, 3, 4), F(3, 81) = 1.93, p = 0.13. Trial duration across the last four trials (unsolvable) was the 15-s maximum time allowed. Data were further broken out into aggregate scores for frequencies and durations of the behaviors of interest.

Figure 8a shows average frequencies of the behaviors of interest by condition and direction of behavior (familiar person, unfamiliar person). To confirm that performance was stable over trials, a series of one-way repeated-measures ANOVAs was conducted: solvable trials (familiar: F(3, 81) = 0.49, p = 0.69; unfamiliar: F(3, 81) = 1.35, p = 0.26) and unsolvable trials (familiar: F(3, 81) = 1.49, p = 0.22; unfamiliar: F(3, 81) = 2.09, p = 0.11). A two-way repeated-measures ANOVA on behavior frequencies with familiarity (familiar, unfamiliar) and solvability (solvable, unsolvable) as factors revealed a main effect of solvability, F(1, 27) = 31.29, p < 0.01, ηp² = 0.54, but no effect of familiarity, F(1, 27) = 2.91, p = 0.10. There were no subject effects or interactions, F(1, 27) = 0.97, p = 0.33.

Fig. 8

Frequency (a) and duration (b) of communicative behaviors in the unsolvable task grouped by solvability and direction toward the familiar or unfamiliar model. Error bars represent SEMs.

Figure 8b shows the average duration (in seconds) of communicative behaviors as grouped by condition and direction of behavior (familiar person, unfamiliar person). Dogs spent more time engaged in communicative behavior directed toward the familiar than the unfamiliar person. Trials 1 through 4 were solvable and yielded shorter durations of communicative behaviors than did trials 5 through 8, during which the task was unsolvable. Across solvable and unsolvable trials, durations of communicative behaviors were greater for familiar individuals. Trial stability was confirmed by one-way repeated-measures ANOVAs for solvable trials (familiar: F(3, 81) = 0.72, p = 0.54; unfamiliar: F(3, 81) = 1.35, p = 0.54) and unsolvable trials (familiar: F(3, 81) = 1.14, p = 0.34; unfamiliar: F(3, 81) = 2.67, p = 0.05). A two-way repeated-measures ANOVA with familiarity (familiar, unfamiliar) and solvability (solvable, unsolvable) as factors on communicative behavior durations revealed main effects of both familiarity, F(1, 27) = 5.38, p < 0.05, ηp² = 0.17, and solvability, F(1, 27) = 26.33, p < 0.01, ηp² = 0.49. There were no subject effects or interactions, F(1, 27) = 3.00, p = 0.09.
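As a reader-side consistency check, the degrees of freedom and partial eta squared values reported for the unsolvable task can be recovered from the sample size and the F ratios alone. The sketch below is illustrative only and is not the analysis code used in the study; the function names `rm_anova_df` and `partial_eta_squared` are hypothetical, and it assumes a fully within-subject design with 28 dogs, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error).

```python
def rm_anova_df(n_subjects, n_levels):
    """Return (effect df, error df) for a one-way repeated-measures ANOVA."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    weighted_f = f_value * df_effect
    return weighted_f / (weighted_f + df_error)


# Reported design: 28 dogs, 4 trials per stability test, 2 levels per factor.
print(rm_anova_df(28, 4))  # df for the trial-stability ANOVAs
print(rm_anova_df(28, 2))  # df for each two-level main effect

# F ratios taken from the two-way ANOVAs reported in the text.
reported = [
    ("frequency, solvability", 31.29),
    ("duration, familiarity", 5.38),
    ("duration, solvability", 26.33),
]
for label, f_value in reported:
    print(label, round(partial_eta_squared(f_value, 1, 27), 2))
```

Under these assumptions the sketch returns F(3, 81) and F(1, 27) as the expected degrees of freedom and effect sizes of 0.54, 0.17, and 0.49, which agree with the reported values.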

Correlation of neural and behavioral data

Bias scores for the unsolvable task (N = 28) ranged from −2 to 4 (M = 0.43, SD = 1.55) for frequencies and from −8 to 16 (M = 2.10, SD = 5.30) for durations. Scores for each dog can be seen in Table 3. These scores were used for correlation with neural data.

Table 3 Familiarity bias scores for the unsolvable task

Sixteen dogs (out of the 22 dogs in the fMRI cohort) had usable data in both the fMRI and behavioral tasks. Correlational tests were run between brain activations in the still image and video tasks with the duration bias measure and the results are shown in Fig. 9. For familiar versus unfamiliar face still images, significant correlations were found in the amygdala (r = 0.50, p = 0.04), caudate (r = 0.67, p = 0.004), and hippocampus (r = 0.59, p = 0.01). For positive versus neutral faces, a significant correlation was shown in the hippocampus (r = 0.58, p = 0.01). For familiar versus unfamiliar face videos, significant correlations were found in the amygdala (r = 0.74, p = 0.001), caudate (r = 0.64, p = 0.007), and hippocampus (r = 0.62, p = 0.01).
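To give a sense of how strong a correlation must be to reach significance with only 16 dogs, each reported r can be converted to a t statistic with n − 2 = 14 degrees of freedom, t = r√(n − 2) / √(1 − r²). The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function name `pearson_t` is hypothetical, the r values are the rounded figures from the text, the threshold 2.145 is the standard two-tailed critical t for df = 14 at α = 0.05, and no multiple-comparison correction is modeled.

```python
import math

# Standard two-tailed critical value of t for df = 14 at alpha = 0.05.
T_CRIT_DF14 = 2.145


def pearson_t(r, n):
    """Convert a Pearson correlation r from n paired observations to (t, df)."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df


# Rounded correlations reported for the duration bias measure (n = 16 dogs).
reported_r = {
    "still images, amygdala": 0.50,
    "still images, caudate": 0.67,
    "still images, hippocampus": 0.59,
    "positive vs. neutral, hippocampus": 0.58,
    "videos, amygdala": 0.74,
    "videos, caudate": 0.64,
    "videos, hippocampus": 0.62,
}

for label, r in reported_r.items():
    t, df = pearson_t(r, 16)
    print(f"{label}: r = {r:.2f}, t({df}) = {t:.2f}, "
          f"exceeds critical value: {t > T_CRIT_DF14}")
```

With 14 degrees of freedom, a correlation of about 0.50 sits just above the conventional threshold, so the weakest reported effect (amygdala, still images) should be read as marginal relative to the caudate and video results.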

Fig. 9

Familiarity bias for duration scores from the unsolvable task correlated with fMRI activation from the still images and video tasks. Voxel-wise correlation yielded the amygdala, caudate, and hippocampus as areas of significant correlation (p < 0.05 corrected).

Discussion

The current study was developed to investigate the behavioral and neural indices of familiarity and emotion as they may be related to the dog–human social bond. This bond has been shaped by domestication over several thousands of years, and canine social cognition provides a rich avenue for research with dogs. The subsets of this field in which we were most interested were those that assess differential attention and behavior mediated by a dog’s history and relationship with a particular human being. This area of behavior and cognitive processing is particularly relevant to working dogs, as their human trainers serve as both a companion and an instructor. The experiments presented herein targeted the relationship between dog and trainer in a working dog population.

The data obtained in the imaging portion of the study revealed differential activations in the caudate for familiar images and videos, positive images, and negative images and videos; the hippocampus for positive videos and negative images; and the amygdala for familiar, positive, and negative videos. In the unsolvable task, a familiarity bias emerged when analyzing duration of communicative behaviors, but not frequency. Correlation of the unsolvable task duration bias scores with neural data revealed several significant regions of interest, including the caudate, amygdala, and hippocampus.

Activation of the amygdala may be expected in investigations of familiarity, as this region is widely implicated in emotion and arousal across species (Phelps and LeDoux 2005). Activation of the hippocampus follows past human and non-human studies of activation by emotional content (Iidaka et al. 2003) and familiar faces (Sliwa et al. 2014). The significance of the caudate in these results may be tied to the opportunity for command and reward as mediated by familiarity of a human. This is especially important for the working dog population used in this study, as heightened attention to a trainer is imperative for learning as it bears upon the receipt of commands and rewards. The relationship between caudate activation and working ability has also been suggested to underlie motivation and success in other working dog populations (Cook et al. 2014). The results found in the unsolvable task experiment suggest that when given the option of interacting with a familiar or unfamiliar person, dogs may more reliably approach familiar individuals. However, the question remains as to whether the duration of affiliative behaviors or frequency of affiliative behaviors is more valid for the assessment of these biases. Finally, the bio-behavioral correlations found here follow past human and non-human primate research as previously discussed (Phelps and LeDoux 2005; Iidaka et al. 2003; Sliwa et al. 2014; Cook et al. 2014). In sum, this study provided behavioral evidence for familiarity preference, regardless of task solvability, in working dogs as well as a bio-behavioral index of familiarity preference when correlated with neural data.

This research also adds to what is known about the mechanisms of facial familiarity and facial emotion processing in domestic dogs and provides the first familiarity-based comparisons in this area of interest. Investigations of facial familiarity and emotion processing in dogs have focused on human faces in opposition to inanimate and non-social content. Though familiarity has not previously been assessed, Dilks et al. (2015) and Cuaya et al. (2016) identified face processing areas in the temporal cortex and caudate. Thompkins et al. (2018) reported adjacent and separate areas in the dog temporal cortex for human (Human Face Area: HFA) and dog faces (Dog Face Area: DFA). The HFA was not modulated by the familiarity of human faces but was modulated by valence. In the present study, we localized the processing of familiar and emotional human faces to the hippocampus, amygdala, and caudate. Importantly, while hippocampal and amygdala activations may be stronger in our research due to stimulus emphasis on familiarity and emotion, the caudate, commonly referred to as the reward center, has been consistently implicated in face processing by dogs and highlights the social relevance of human face stimuli.

A unique feature of this study was the ability to correlate neural and behavioral data within the same subject set. By drawing a tie between the social phenomenon of familiarity preference in-scanner via face presentation and out-of-scanner via the unsolvable task, questions of validity and applicability of dog fMRI may be explored. With either method alone, there is much to be desired in terms of final conclusions and translations across brain function and behavior. Due to the detrimental effects of in-scanner motion, behavioral responding is severely limited for the domestic dog. Whereas humans may use a mechanism such as a button box for behavioral assessment in fMRI, it is as yet unrealistic to plan and implement analogous response mechanisms for dogs. As such, replications of in-scanner processes of interest outside of the scanner offer the greatest opportunity for valid bio-behavioral conclusions in dog research.

This research utilized a multi-method approach, merging behavioral and neuroimaging avenues of investigation to explore the neural processing of familiar faces and emotional expressions. Simultaneous acquisition of behavioral and neural data allowed us to correlate findings to uncover potential profiles of successful working dogs. In all, the hypotheses of the current research were supported. Hippocampus and amygdala activation appears to be mediated by both familiarity and emotional valence. We also found that familiarity bias in a behavioral task correlates with the magnitude of differential activation to familiar and emotionally salient faces in the amygdala, caudate, and hippocampus of the dog brain. Future analyses focusing on functional/structural connectivity and connectivity fingerprints across humans and dogs will allow for a novel characterization of the relationship between dog and human brains, and can function as an important step in yielding insight regarding the phylogeny and ontogeny of social abilities (Ramaihgari et al. 2018; Robinson et al. 2016; Kyathanahally et al. 2015; Mars et al. 2013, 2016, 2018; Thompkins et al. 2016).

Together these findings provide evidence for a network in the domestic dog brain that is sensitive to familiarity and emotional content within human faces. Consistent activations in the hippocampus, amygdala, and caudate provide evidence for similarities in familiarity and emotion processing networks between humans and dogs, as well as build upon what has been evidenced previously in the canine cognition behavioral literature. These findings point toward an ancient neural system that may be phylogenetically shared across humans (Haxby et al. 2000), non-human primates (Hadj-Bouziane et al. 2012; Sliwa et al. 2014), and dogs (Thompkins et al. 2018), though caution should be taken to evaluate the extent to which these analogies apply (Bunford et al. 2020; Szabó et al. 2020).

Limitations and future directions

Given the area of investigation for this study, including a set of familiar and unfamiliar domestic dog stimuli varied by valence would have provided compelling data for comparison. However, we could not manipulate and record positive, neutral, and negative valence in dogs. We were also unable to match human stimuli by sex due to restrictions of the familiar individuals (selection dictated by familiarity to the dogs) and valence-scored stimuli (selection dictated by positive, neutral, and negative valence scores). Further, we could not evaluate the dogs’ perceived valence of the presented stimuli. The latter could be accomplished in future research by the implementation of a behavioral discrimination task conducted with the dog subjects to assess the perception of stimuli. Finally, the sample of dogs used in this study received specialized training in odor detection, which may preclude generalization to the general population of domestic dogs. We also recognize that studying dogs viewing human faces addresses only one side of the dog–human bond. Studying humans viewing dog faces (e.g., Bunford et al. 2020) is critically important for investigating the other side of the human–dog bond.

Conclusions

For the first time, subject-specific stimulus sets were used to investigate the mediation of neural activation in dogs by familiarity and emotional valence of faces. The use of subject-specific stimuli provides the opportunity for tailor-made research and study within targeted, goal-oriented directives. The results demonstrate a neural mechanism allowing for cross-species identity, familiarity, and emotional recognition. This study was also the first to use the same subject-specific familiar individual in and out of the scanner, providing a unique opportunity to correlate neural and behavioral data within the same subject set. By drawing a tie between the social phenomenon of familiarity preference in-scanner via face presentation and out-of-scanner via the unsolvable task, questions of validity and applicability of dog fMRI may be explored. In addition to the vast expansion of what is known about canine cognition and its analogies to human cognition, refinement of ideal bio-behavioral profiles may well serve to inform the breeding, selection, and training practices of working dog institutions around the globe.