Introduction

Primates sleep from 8.8 to 17 h per night (Nunn and Samson 2018), with the need for sleep impacting many aspects of primate behavior, ecology, and evolution (Anderson 1998, 2000). The study of sleep architecture in nonhuman primates is also critical for comparative analyses that investigate the function and evolution of sleep along the human lineage (Samson and Nunn 2015). Remarkably, however, sleep architecture has been studied in fewer than 6% of 504 (Estrada et al. 2017) identified extant primate species (Anderson 1998; McNamara et al. 2008; Nunn et al. 2010; Samson and Nunn 2015). Given the great interest in understanding primate sleep, this gap in our knowledge likely reflects that these studies are expensive, sometimes invasive, and extremely time-consuming to conduct (Balzamo et al. 1998; Cruz-Aguilar et al. 2015).

The current “gold standard” for measuring sleep is electroencephalography (EEG), which is the measurement of electrical activity in different parts of the brain via a titanium head post, which must be surgically implanted on the primate’s skull to work effectively. Because of the inherent difficulty in measuring sleep in primates—and the ethical concerns involved with performing highly invasive surgery on rare and endangered primates—the methods for studying primate sleep are in transition alongside technological innovations in the field. Historically, the first systematic studies of great ape sleep were conducted with polysomnography (multimodal sleep studies) on chimpanzees in the 1960s and early 1970s (Balzamo et al. 1972; Freemon et al. 1969; Kripke and Bert 1968; McNew et al. 1971, 1972). Some researchers have questioned the quality of the data generated in these studies, given that they used procedures that provoked significant stress that likely interfered with sleep, such as using physical restraints and social isolation of the animal (Kantha and Suzuki 2006b).

With the development of non-invasive methods of observing nocturnal behavior among apes (Morimura et al. 2012; Mizuno et al. 2006; Videan 2006) and monkeys (Zhdanova et al. 2002; Ancoli-Israel et al. 2003), EEG has given way to a multifaceted approach including actigraphy (Bray et al. 2017; Samson et al. 2015; Samson et al. 2018) and high-sensitivity video recordings as the preferred method of measuring primate sleep (Muñoz-Delgado et al. 1995; Balzamo et al. 1998). Actigraphy can accurately detect sleep vs. wake states when compared with EEG (Ancoli-Israel et al. 2003; Johnson et al. 2007; Samson et al. 2016) and overall has the advantage of being less invasive and more cost-efficient than EEG. Actigraphy logs data from an accelerometer and is used to measure aspects of sleep duration and quality, while infrared videography uses behavioral data to score sleep stages based on postural changes, eye movements, and distal phasic muscle twitches. Actigraphy uses algorithms to determine the sleep state in each epoch, rather than observer interpretation of video. Because actigraphy is an objective, standardized measure that does not require observer scoring, it is often considered more reliable and repeatable than videography assessed by human judges.

All methods of quantifying primate sleep have pros and cons. With actigraphy, subjects are fitted with a collar that tracks movements via a data logging sensor (Barrett et al. 2009; Bray et al. 2017; Samson and Shumaker 2013), making data collection relatively easy once the device is on the animal. However, actigraphy is not useable in great apes, given that the application of sensors to the body elicits a strong auto-grooming instinct (Samson and Shumaker 2013). In addition, many zoos have strict policies about anesthetizing animals or capturing animals for non-medical purposes; this is important because animal handling is required for fitting animals with actigraphic devices at the outset of a study, and then again for removing the device when the study is over. Another drawback of actigraphy is that it cannot distinguish between rapid eye movement (REM) and non-rapid eye movement (NREM) sleep states, the two major subtypes of mammalian sleep, enabling only the distinction between awake and asleep. Finally, although technological advances have decreased the cost of actigraphy and improved the size and durability of devices, the high cost of the devices limits their use to captive environments, where recovery of the devices is more likely.

Given these limitations, infrared videography is an appealing non-invasive method to analyze sleep in nonhuman primates, and the only option in some species, such as great apes. Infrared videography has been used to identify sleep states in several primate species, including Pan troglodytes, Eulemur coronatus, Eulemur flavifrons, Eulemur mongoz, Lemur catta, Propithecus coquereli, Varecia rubra, and Varecia variegata (Mizuno et al. 2006; Samson et al. 2015). By observing the movement of the chest, a coder can infer the breathing rate of the subject, and from those data, make inferences about the subject’s sleep state. While irregular breaths characterize awake states, subjects in NREM sleep generally breathe deeply and rhythmically. REM sleep is identified by a sudden shift to rapid, irregular breathing following a period of NREM sleep, often accompanied by sudden movements in the extremities or eyes; these movements are known as distal phasic muscle twitches. Infrared videography analysis requires a researcher to observe and record behaviors such as these from the videography data. Thus, the camera must have a consistent view of the subject, again resulting in greater application of these methods to captive as compared to wild settings. Even when a good camera shot is possible, key movements, such as distal phasic muscle twitches, are challenging to identify in small-bodied subjects, and the subtlety of some physiological changes makes it difficult for observers to agree in their coding of sleep states.

Importantly, previous work (and some of the last studies to use EEG in primates) demonstrated a strong association between sleep states scored using EEG and video recordings in rhesus monkeys (Balzamo et al. 1998). Subject behavior was analyzed manually and scored into three distinct states (wakefulness, NREM, and REM) in an analysis of both videographically scored and EEG-scored minute epochs. Correlation coefficients for REM sleep (r = 0.987), for NREM sleep (r = 0.996), and wakefulness (r = 0.999) were significant. This research clearly illustrates the validity of videography in scoring sleep stages.

This pilot study evaluates a novel approach that applies Eulerian enhancement of video images, which has been successful in other contexts (Wu et al. 2012), to the study of sleep in nonhuman primates. This method processes video to amplify changes in pixel-level characteristics from predefined areas of digital video sequences, enabling a viewer to detect these changes more easily, and thus overcoming some of the challenges just noted with implementing videography to study sleep. Specifically, the original video is processed so that subtle temporal and spatial changes are magnified in the output. This technology has been applied to a diverse range of problems, including determination of pulse from slight variations in skin color on the face of a subject (Balakrishnan et al. 2013) and extraction of sound from silent video using originally imperceptible vibrations of an object (Davis et al. 2014). Its usefulness in determining sleep stages in nonhuman primates has yet to be systematically explored.

We used a combination of actigraphic monitors, infrared videography, and converted Eulerian videography footage to analyze sleep with each of these technologies. Our objective was to investigate the utility and consistency of these technologies to quantify sleep in lemurs. We studied captive individuals of four lemur species (Eulemur coronatus, Lemur catta, Propithecus coquereli, and Varecia rubra) at the Duke Lemur Center. By first comparing infrared videography and actigraphy, and then examining whether processing the video with Eulerian code improves interobserver congruence in scoring, we aimed to explore whether Eulerian videography offers a potential improvement in future researchers’ ability to measure primate sleep. We hypothesized that wake/sleep state could be characterized by both infrared videography and actigraphic collars, thus predicting a high level of agreement between the state assigned using actigraphy collars and when using infrared videography. We also investigated whether Eulerian videography improves scoring of sleep in videographic scoring, focusing on capturing slight movements of the chest that correspond to sleep states. We therefore predicted that agreement among two scorers would be stronger in observations with Eulerian videography, as compared to use of unmodified infrared videography. Lastly, we predicted a higher agreement between observer-scored Eulerian wake/sleep designations and actigraphy data, as compared to the agreement between observer-scored infrared videography sleep/wake designations and actigraphy.

Methods

All animal use and methods were approved by the Duke University Institutional Animal Care and Use Committee and the DLC Research Committee. The research also adhered to legal requirements and to the American Society of Primatologists (ASP) Principles for the Ethical Treatment of Non-Human Primates.

Subjects

Research was conducted at the Duke Lemur Center (DLC) in Durham, North Carolina. All lemurs in this study were housed socially in male–female pairs for the duration of the study. Indoor enclosures were of identical size (3.05 m × 2.29 m × 2.13 m) and allowed exposure to natural light and controlled temperature. Videography and actigraphy data were collected from April to August 2015. Subjects included six individuals (2 Eulemur coronatus, 1 Lemur catta, 1 Propithecus coquereli, and 2 Varecia rubra), four males and two females, each of which was recorded for two nights. This research complied with protocols approved by the Duke Lemur Center Research Committee and the Duke Institutional Animal Care and Use Committee (IACUC) and adhered to United States legal requirements and American Society of Primatologists Principles for the Ethical Treatment of Primates.

In sum, 12 videos beginning in the evening at 18:00 h and ending at 06:00 h the following morning were captured. Each raw 12-h video was analyzed in 1-min epochs, which amounted to 720 epochs per video. Three videos were rejected for analysis due to poor video quality, nighttime interference by a conspecific, or a subject moving out of frame line of sight for greater than 50% of the epochs in a given night, resulting in 6480 epochs of data (equivalent to nine 12-h nights of video) for analysis (Eulemur = 1440, Lemur = 1440, Propithecus = 720, Varecia = 2880). Because of technical challenges encountered with generating high-quality video in the Propithecus enclosure, we were only able to generate one full night where line of sight captured the entire sleep period without loss of data.
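The epoch accounting and the out-of-frame rejection criterion described above can be checked with a short Python sketch. The function name `reject_night` and the per-genus night counts (obtained here by dividing the reported epoch totals by 720) are illustrative and do not come from the study's own code. The other two rejection criteria, poor video quality and interference by a conspecific, were judged by the researchers and are not modeled.

```python
# 12-h recording scored in 1-min epochs gives 720 epochs per night.
EPOCHS_PER_NIGHT = 12 * 60


def reject_night(out_of_frame_epochs: int, total_epochs: int = EPOCHS_PER_NIGHT) -> bool:
    """Return True when the subject is out of frame for more than 50% of epochs."""
    return out_of_frame_epochs / total_epochs > 0.5


# Nights retained per genus, derived from the epoch totals reported in the text
# (for example, Varecia = 2880 epochs / 720 epochs per night = 4 nights).
nights_retained = {"Eulemur": 2, "Lemur": 2, "Propithecus": 1, "Varecia": 4}

epochs_retained = {
    genus: nights * EPOCHS_PER_NIGHT for genus, nights in nights_retained.items()
}
total_epochs_recorded = 12 * EPOCHS_PER_NIGHT
total_epochs_retained = sum(epochs_retained.values())
```

Under this rule, a night with 253 of 720 epochs out of frame (about 35%) is retained, whereas a night with 361 or more out-of-frame epochs is rejected.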

Actigraphy data collection

Nighttime sleep quotas were generated using continuously recorded MotionWatch 8 (CamNtech) tri-axial accelerometers. These actigraphic sensors are lightweight (7 g) and attached to standard nylon pet collars that were placed on the animals by trained DLC staff. Animals were monitored to ensure no adverse reactions to the collar. We found that subjects habituated to the collars within 2 h, and all subjects wore the collars throughout the study. To control for temperature and light conditions, data were collected only while the animals were in indoor housing. Recent advances in scoring algorithms have increased accuracy in detecting wake–sleep states and total sleep times (Stone and Ancoli-Israel 2017). We followed protocols used in previous primate sleep studies (Andersen et al. 2013; Barrett et al. 2009; Kantha and Suzuki 2006a, b; Zhdanova et al. 2002), including previous research using these devices on multiple lemur species, including those in this study (Eulemur coronatus, Eulemur flavifrons, Eulemur mongoz, Lemur catta, Propithecus coquereli, Varecia rubra, Varecia variegata), at the Duke Lemur Center (Bray et al. 2017; Samson et al. 2018; Samson et al. 2019). We used the operational definition of sleep in actigraphy as the absence of any force in any direction during the measuring period (Campbell and Tobler 1984). The sensor sampled movement once a second at 50 Hz, assigning an activity value per 1-min epoch. Each epoch was then assigned a ‘sleep’ or ‘awake’ designation using CamNtech MotionWatch 8 software.
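As a rough illustration of how per-epoch activity values can be turned into sleep/awake designations, the Python sketch below applies a simple threshold rule. It is not the algorithm implemented in the CamNtech MotionWatch 8 software, whose details are not given here; the function names and the `threshold` parameter are hypothetical. With the default threshold of 0, an epoch is labeled as sleep only when no activity was registered, which mirrors the operational definition of sleep as the absence of any force in any direction.

```python
from typing import List


def score_epochs(activity_counts: List[int], threshold: int = 0) -> List[str]:
    """Label each 1-min epoch 'sleep' if its activity value is at or below threshold.

    The threshold is a hypothetical parameter for illustration only.
    """
    return ["sleep" if count <= threshold else "awake" for count in activity_counts]


def total_sleep_minutes(labels: List[str]) -> int:
    """Count the 1-min epochs labeled 'sleep'."""
    return sum(1 for label in labels if label == "sleep")


# Made-up activity values for five consecutive epochs (not study data).
example_labels = score_epochs([0, 0, 12, 3, 0])
example_sleep = total_sleep_minutes(example_labels)
```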

Infrared videography

Infrared videography recordings were collected concurrently with actigraphy data. On nights of data collection, an AXIS P3364-LVE network camera with built-in infrared capacity was mounted inside the enclosure to ensure line of sight on the sleep site. The camera captured video data beginning in the evening at 18:00 h and ending at 06:00 h the following morning. Each raw 12-h video was coded at an 8× speed and scored by 1-min epoch for varying sleep states. For each epoch, subject behavior was assigned to one of five categories: awake, quiet awake, NREM sleep, REM sleep, or unknown. Subjects were classified as awake when eye movement and/or continuous gross body movement occurred in the given epoch. Conversely, subjects were classified as asleep in NREM when extended non-movement and rhythmic breathing were observed. REM sleep was characterized by involuntary muscle twitching (i.e., distal phasic muscle twitches) or prolonged absence of movement combined with rapid, irregular breathing. Unknown classifications were assigned when the subject moved out of the frame or into a position where video could not be scored.

The training protocol was based on instructions from previous work outlining how to differentiate behavioral signatures from sleep architecture states (Samson and Shumaker 2013). Figure 1 offers a visual representation of determination of sleep state (Kripke et al. 1968; Mizuno et al. 2006; Samson and Shumaker 2013; Weitzman et al. 1965). Following the conclusion of the videography analysis, each epoch was assigned to either the ‘wake’ or ‘sleep’ category so that the data could be compared to the output of the actigraphic monitor. In total across all species, we recorded 8640 epochs of data (equivalent to twelve 12-h nights of video). Three videos were rejected for analysis due to extremely poor video quality, nighttime interference by a conspecific, or a lemur out of frame for greater than 50% of the epochs in a given night, resulting in 6480 epochs of data (equivalent to nine 12-h nights of video) for analysis.
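The collapse from the five videographic categories to a binary wake/sleep label can be sketched in Python as follows. The mapping of "awake" and "quiet awake" to wake and of "NREM" and "REM" to sleep follows the category names; how "unknown" epochs were handled is not stated, so dropping them from the paired comparison is an assumption made here for illustration. All names in the block are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: 'unknown' epochs carry no wake/sleep label and are skipped.
BINARY_STATE = {
    "awake": "wake",
    "quiet awake": "wake",
    "NREM": "sleep",
    "REM": "sleep",
    "unknown": None,
}


def collapse(states: List[str]) -> List[Optional[str]]:
    """Map each five-category video score to 'wake', 'sleep', or None."""
    return [BINARY_STATE[state] for state in states]


def paired_epochs(video: List[str], actigraphy: List[str]) -> List[Tuple[str, str]]:
    """Pair video and actigraphy labels epoch by epoch, skipping unscored video epochs."""
    return [
        (video_label, acti_label)
        for video_label, acti_label in zip(collapse(video), actigraphy)
        if video_label is not None
    ]


# Made-up scores for five epochs (not study data).
pairs = paired_epochs(
    ["awake", "quiet awake", "NREM", "REM", "unknown"],
    ["awake", "awake", "sleep", "sleep", "sleep"],
)
```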

Fig. 1 Sleep state determination characterization. The classification of sleep stages, validated by EEG and used in previous work with primates, can be performed using videographic behavioral coding. Image reproduced from Samson and Shumaker (2013)

Eulerian videography

One night of sleep for an L. catta subject was selected for pilot testing with this novel analysis method based on optimal video clarity throughout the entire 12-h video recording period. The videography had previously been scored in the original infrared format by two researchers working independently (EM and DRS). The infrared videography then underwent Eulerian processing using MATLAB code that spatially and temporally magnified input video to produce a hyper-magnified output (Wu et al. 2012). The MATLAB source code was obtained from the inventors of the Eulerian algorithm via the website of the Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (Wu et al. 2012). This code was used to process the 12-h sleep recording, creating one full night of spatially magnified Eulerian videography (see video sample for side-by-side comparison of infrared and Eulerian video here: https://youtu.be/rVDRxazwP1M). The Eulerian videography was watched and scored again by both researchers; this was done more than 14 days after the original had been coded, to help ensure that details of the first coding did not influence the later coding. The epochs of this Eulerian videography were analyzed with a protocol identical to the infrared videography analysis outlined above, testing for both inter- and intra-observer association and for association between actigraphy and Eulerian videography. There were 720 1-min epochs of data analyzed (267 epochs recorded as “awake” and 453 epochs recorded as “asleep,” of which 354 epochs were classified as NREM and 99 were classified as REM). We determined the efficacy of Eulerian videography processing to improve sleep-architecture scoring accuracy by comparing the level of agreement between both independent observers for the normal infrared videography condition and that of Eulerian-enhanced videography. These coding methods (i.e., two independent scorers assessing the same video) are comparable to those used in previous work to investigate muscle fasciculations in humans using Eulerian videography (Van Hillegondsberg et al. 2017).
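To convey the core idea behind Eulerian magnification, the Python sketch below amplifies per-pixel intensity changes within a chosen temporal frequency band. It is a heavily simplified illustration and not the MATLAB code used in this study: it applies only an ideal temporal band-pass filter and omits the spatial pyramid decomposition described by Wu et al. (2012). The function name, the frequency band, and the amplification factor `alpha` are assumptions chosen for the example.

```python
import numpy as np


def magnify(frames: np.ndarray, fps: float, low_hz: float, high_hz: float,
            alpha: float) -> np.ndarray:
    """Amplify intensity changes whose temporal frequency lies in [low_hz, high_hz].

    frames has shape (n_frames, height, width) and holds grayscale intensities.
    """
    frames = frames.astype(np.float64)
    n_frames = frames.shape[0]
    # Per-pixel spectrum along the time axis.
    spectrum = np.fft.rfft(frames, axis=0)
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / fps)
    # Ideal band-pass: zero every frequency bin outside the band of interest.
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0
    band = np.fft.irfft(spectrum, n=n_frames, axis=0)
    # Add the amplified band-limited signal back onto the original video.
    return frames + alpha * band


# Synthetic one-pixel "video": baseline 100 with a 0.5 Hz oscillation of amplitude 1.
fps = 10.0
t = np.arange(100) / fps
video = (100.0 + np.sin(2 * np.pi * 0.5 * t)).reshape(-1, 1, 1)
magnified = magnify(video, fps=fps, low_hz=0.2, high_hz=1.0, alpha=10.0)
```

In this synthetic case the oscillation amplitude grows from 1 to 11 while the baseline intensity is unchanged, which is the kind of exaggeration of subtle chest movement that makes breathing easier for an observer to see.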

Statistical analysis

To test the prediction of a high level of agreement between the wake/sleep state assigned using actigraphy and the state assigned using infrared videography, we scored the binary wake/sleep data from actigraphy, standard infrared videography, and Eulerian videography, and ran agreement tests between infrared videography and actigraphy data for each corresponding epoch. Nine videos equivalent to 6480 1-min epochs, distributed across all four species of lemur studied, were compared with the associated actigraphy data. Cramer’s V (Field 2013) was calculated in R (R Core Team 2016) for each data pairing. Cramer’s V is one of three commonly used measures (along with the phi statistic and the contingency coefficient) of association between categorical variables. Inferences for association are based on standard procedures, as described in Table 1 (Field 2013). Verbal descriptions are used to denote classes of association. The term ‘redundant’ is used to denote a Cramer’s V coefficient between 0.5 and 0.99.
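For readers unfamiliar with the statistic, Cramer's V for two sequences of categorical labels can be computed as in the Python sketch below. The analysis in this study was run in R; this block is an independent illustration, assumes no continuity correction, and uses hypothetical function names. For a 2 × 2 table such as wake/sleep versus wake/sleep, Cramer's V equals the absolute value of the phi statistic.

```python
from collections import Counter
from math import sqrt
from typing import Sequence


def cramers_v(x: Sequence[str], y: Sequence[str]) -> float:
    """Cramer's V between two equally long label sequences (no continuity correction)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    row_totals = Counter(x)
    col_totals = Counter(y)
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        raise ValueError("each variable needs at least two observed categories")
    observed = Counter(zip(x, y))
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(row, col)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def is_redundant(v: float) -> bool:
    """Apply the 0.5 threshold used in the text for a 'redundant' association."""
    return v >= 0.5


# Made-up epoch pairs (not study data): 80 agreements and 20 disagreements.
pairs = (
    [("sleep", "sleep")] * 40 + [("sleep", "wake")] * 10
    + [("wake", "sleep")] * 10 + [("wake", "wake")] * 40
)
video_labels = [p[0] for p in pairs]
acti_labels = [p[1] for p in pairs]
v_example = cramers_v(video_labels, acti_labels)
```

For this made-up table every expected count is 25, the chi-square statistic is 36, and V is the square root of 36/100, that is 0.6, which would fall in the 'redundant' class.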

Table 1 Summary of inter-researcher association when using Eulerian videography as compared to infrared videography to classify sleep states

To test the prediction that agreement between the two scorers would be stronger with Eulerian videography than with unmodified infrared videography, two scorers analyzed the 720 epochs (one 12-h night) of infrared videography and then, more than 2 weeks later, the same video after Eulerian transformation. Cramer’s V was calculated for the researchers’ infrared and Eulerian data sets to test for agreement. The association between the two researchers’ scores from the infrared videography was then compared against the association between their scores from the Eulerian-processed videography. This comparison determined whether Eulerian video processing allowed observers to classify the sleep state of a lemur more consistently. The Eulerian scores were further analyzed for agreement between more detailed sleep states. The ratio of NREM and REM sleep epochs in the independent scorers’ assessments was explored as a more in-depth way to assess the reliability of Eulerian videography versus infrared videography.

To test the prediction that there would be a higher agreement between observer-scored Eulerian wake/sleep designations and actigraphy data, as compared to the agreement between observer-scored infrared videography sleep/wake designations and actigraphy, Cramer’s V was calculated for each researcher’s Eulerian-scored data against the actigraphy data, and these statistics were averaged to produce a composite measure of association. Each researcher’s original infrared video-scored data were then compared with the actigraphy data in the same way, and these values were also averaged to produce a composite measure. The difference between the infrared/actigraphy and Eulerian/actigraphy association measures was then calculated.
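The composite comparison described above amounts to averaging two coefficients per condition and taking the difference, as in the Python sketch below. The function names are hypothetical, and the four input values are placeholders chosen for easy arithmetic; they are not the coefficients obtained in this study.

```python
from statistics import mean
from typing import Tuple


def composite(v_scorer_1: float, v_scorer_2: float) -> float:
    """Average the two scorers' Cramer's V values against actigraphy."""
    return mean([v_scorer_1, v_scorer_2])


def composite_difference(eulerian: Tuple[float, float],
                         infrared: Tuple[float, float]) -> float:
    """Eulerian/actigraphy composite minus infrared/actigraphy composite."""
    return composite(*eulerian) - composite(*infrared)


# Placeholder values for illustration only (not study results).
difference = composite_difference(eulerian=(0.8, 0.9), infrared=(0.3, 0.4))
```

A positive difference indicates that the Eulerian-scored data agree more closely with actigraphy than the infrared-scored data do.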

Results

We confirmed that actigraphy assesses the sleep/wake state of a lemur as reliably as infrared videography, with most comparisons revealing that the data produced by the two methods were “redundant,” i.e., measuring the same phenomenon (Field 2013). Specifically, eight of the nine analyses of actigraphy versus infrared videography returned a Cramer’s V statistic that is considered redundant (> 0.50) (Fig. 2). The ninth video, in which the subject moved out of line of sight for 253 of 720 epochs, returned a Cramer’s V of 0.422, which is below the threshold for data that are considered redundant. Nevertheless, the high proportion of redundant pairs indicated that actigraphy and infrared videography analyses produced consistent results.

Fig. 2 Infrared versus actigraphy classification: level of agreement (Cramer’s V). The heavy black line represents the redundancy threshold. Colors code different species. Bars above the black line indicate strong evidence of association between the two measures of scored sleep states. Each bar represents one 12-h night of data

Analyses also showed that Eulerian videography processing increases interobserver association levels (Table 1), supporting the prediction that agreement between the two scorers would be stronger with Eulerian videography than with standard infrared videography. The inter-researcher association for sleep versus awake designations in infrared footage was high (Cramer’s V = 0.675; redundant association). When Eulerian videography was used to make the same designations, the association value was 0.730, a 7.11% increase. The association level for the determination of NREM sleep state rose 47%, from 0.43 when infrared videography was scored to 0.631 with Eulerian videography. REM sleep state congruence also increased with Eulerian videography, with Cramer’s V rising from 0.364 to 0.489, an increase of 34%. As the REM sleep state is the most difficult to distinguish, this increase in agreement between the two scorers’ observations demonstrates potential applicability for future studies of primate sleep classification.
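The percentage increases reported for the NREM and REM comparisons are relative changes with respect to the infrared value. A minimal Python sketch of that calculation, using a hypothetical helper name, is given below.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100.0


# Cramer's V values reported above for infrared versus Eulerian scoring.
nrem_increase = percent_increase(0.43, 0.631)
rem_increase = percent_increase(0.364, 0.489)
```

Rounded to whole percentages, these give the 47% (NREM) and 34% (REM) increases stated in the text.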

We also found that Eulerian-scored data agreed more closely with the actigraphy data than did infrared-scored data. Specifically, the composite Cramer’s V between Eulerian videography and actigraphy was 0.5145 higher than the composite Cramer’s V between infrared videography and actigraphy, a 152.83% improvement (Table 1). Thus, results obtained from Eulerian videography were more strongly correlated with actigraphy output than were results from infrared videography.

Discussion

We investigated the congruence among three different methods of studying primate sleep, including a new method, Eulerian videography, which uses computational methods to enhance pixel-level changes in infrared video recordings. We found that actigraphy and infrared videography produced “redundant” results in 8 of 9 videos (88.9%), leading us to conclude that videographic scoring is effective for these four species of lemurs in a captive setting. We also found that Eulerian enhancement of video improves the congruence of manual scoring of sleep architecture among multiple observers. Thus, the use of Eulerian-processed videography offers the potential to improve the accuracy of sleep state classification in lemurs as compared to infrared video. It does so by magnifying movements of the diaphragm, distal phasic muscle twitches, eye movement, and general gross body movements to deliver a more easily interpreted epoch-by-epoch score.

Although EEG analysis of sleep would have provided a more definitive test of the scoring of REM and NREM sleep and thus strengthened this pilot study, this was not an option in our primate population: as with most other facilities in which non-model system primates are studied, invasive research is not allowed at the Duke Lemur Center, due to ethical concerns and disruption of other behavioral research. The use of EEG for these species would have required surgical insertion of electrodes, along with some level of restraint and social isolation to prevent disturbance of protruding cables. Less invasive devices would be forcibly removed by animals due to their grooming instinct or would have required extreme restraint to limit animal access to the devices.

By first showing that infrared videography and actigraphic data produce strongly correlated classifications, and then showing that observer congruence in scoring videography improves following Eulerian processing, our study demonstrated that Eulerian videography offers promise for improving the study of primate sleep. This finding was further reinforced by the result that output from Eulerian videography was more strongly correlated with actigraphy results than was output from infrared videography.

Future research could more directly compare Eulerian videography to EEG, which may be especially important for applying this to new species. Although previous work illustrated a significant relationship between videography and EEG (Balzamo et al. 1998), we recognize a weakness in that this has been demonstrated in only one species of primate (Macaca mulatta). Although we have no reason to expect any systematic biases to emerge as the method is generalized across species, more research is needed to more fully apply this approach across species, at least in the remaining settings where EEG can be performed.

Application of this new technology could help to fill a critical gap in our understanding of sleep in primates, including in humans. Applied across many primate species, this could increase primate species representation in our comparative datasets (McNamara et al. 2008; Nunn and Samson 2018; Samson and Nunn 2015). Whenever possible, it is important to validate the methods in the species of interest. Here, we were able to compare videography to actigraphy. Actigraphy is not possible in all primate species, given that some primates will not tolerate collars due to small body size or a strong grooming instinct that results in removal of collars or stressful conditions for the animals that would be unethical and interfere with sleep. Similarly, the setting may restrict use of actigraphy (and possibly videography), including many wild primates, for which recapture is not possible, ethical, or certain. “Drop-off” collars may help in some of these circumstances, provided the collars can be relocated reliably via telemetry or other means. In addition, devices could be placed in bedding to measure movements for scoring sleep.

Increases in inter-researcher agreement across all sleep states indicate that further studies of the Eulerian videography method would be worthwhile and promising. As with the other methods, however, Eulerian videography has some limitations and costs. As compared to EEG, videography captures sleep states through behavioral signatures, rather than directly through measurement of brain activity. The degree to which those behavioral signatures covary with brain activity across species will determine the generality of the approach. Compared with actigraphy, Eulerian videography is computationally costly: processing a 12-h video, as is necessary to study a night of sleep, requires several days of computer time. Resultant files are very large, making transfer of files between researchers a challenge. Like actigraphy, Eulerian videography (and video in general) is also probably most applicable to analysis of primate sleep in captivity, as the subject must remain in the view of the camera for the duration of the sleep period. Setting up cameras in the wild would be challenging, disruptive to sleep, and potentially dangerous for the scientists conducting the research, representing a limitation of this approach.

Several challenges also remain with manual classification of sleep architecture in primates. Regardless of the amount of training and experience that a researcher has received in sleep classification, the method is subject to the interpretation of the researcher watching the videography and is therefore subject to variations, such as the researcher’s experience, available technology (screen quality), training, and attentiveness. For example, the ninth video had a time period where the subject moved out of line of sight. While this did not meet the preestablished threshold for rejecting a night altogether (a lemur out of frame for greater than 50% of the epochs in a given night), it likely played a role in the much lower Cramer’s V correlation compared to all other nights of data analysis. Therefore, any use of videography to generate sleep data needs to take into consideration the practicality of recording in enclosures. Automating this process would be a significant innovation in videography. For example, a recent study applied machine-learning techniques to the application of Eulerian videography in humans (Deng et al. 2018). Exploring the application of machine-learning algorithms to classify nonhuman primate sleep behaviors (breathing movements, distal phasic muscle twitches, eye movements, gross body movements) is thus also a promising possibility.

Primates spend approximately half of their lives at sleeping sites (Anderson 1998), and, thus, research on sleep is important for enriching our understanding of primate behavior, ecology, and evolution. Understanding sleep is also important for research on primate cognition. For example, research on humans has shown that receiving less sleep has negative effects on normal cognitive and physiological regulation (Durmer and Dinges 2005; McCoy and Strecker 2011). Researchers have begun to explore sleep effects on cognitive processes in nonhuman primate populations, primarily in orangutans, bonobos, and chimpanzees (Martin-Ordas and Call 2011; Samson and Nunn 2015; Shumaker et al. 2014). For example, recent work demonstrated that normal, non-interrupted sleep improved memory consolidation in five species of lemurs, including the species in this study (Samson et al. 2019). The approach developed here could enable investigation of specific stages of sleep in relation to cognitive performance, and in a wider array of primates. More generally, appropriately classifying sleep into stages and depth has important implications for our evolutionary understanding of cognitive and physiological functions in primates.

In summary, methodological improvements to measuring sleep can increase the understanding of primate behavior, ecology, and evolution. We found that application of Eulerian videography can improve congruence among individual scientists in the scoring of sleep. Given that previous research has validated the use of videography for staging sleep (Balzamo et al. 1998; Samson and Shumaker 2013), our findings thus open the possibility to score sleep stages in a much wider array of primates, which is important given that EEG is no longer possible in most circumstances of primate captivity. As a greater number of primate sleep architectures are obtained, future research will have greater power to comparatively investigate the function and evolution of sleep among mammals, primates, and the human lineage.