1 Introduction

Laboratory rats and mice are used in several fields of biomedical research, including the study of animal behaviour in the neurosciences [47]. Their usefulness is particularly important in translational medicine, since the efficacy of candidate drugs for different human brain diseases can be assessed in animal models [3, 19]. Cognitive states can be evaluated in paradigms such as the open field test, the Morris water maze, and the elevated plus-maze, to name only a few [66]. However, the most widely used behavioural test to evaluate candidate drugs for pathologies that involve motor processes, natural exploration, or anxiety is the open field test [30, 58].

In the beginning, the Open Field Maze (OFM) was used in a rudimentary way: the rat was placed in a box divided into sections by grids, and the researcher quantified classic parameters such as the number of quadrants visited and grooming by direct observation [15, 48, 68]. Writing down these annotations was repetitive, since it required watching the test video multiple times; the resulting fatigue led to high variability in the results. Reducing this labour by involving more personnel may be attractive, but it is expensive.

Currently, commercial solutions use computerised video-analysis systems or infrared beam grids to track the rat or to measure the time it spends in specific areas of the test arena [17, 60, 64]. However, most commercial systems, as well as the research proposals found in the literature, require controlled conditions such as a fixed camera position, constant lighting, or high contrast between the rodent and the background. These demanding requirements make it difficult to define and quantify the behaviours of interest for a given experiment; likewise, they do not allow the researcher to adapt to changing experimental needs.

Nevertheless, beyond environmental conditions, previous work has shown that rat tracking and some (albeit limited) behaviour classification are possible using traditional computer vision techniques based on geometrical features [6]. Motivated by the latter, we propose a novel and robust system based on Deep Learning (DL), a recently successful artificial intelligence technique, to monitor the locomotive behaviour of rats in real time in the open field maze test. This system aims to reduce the researcher's manual annotation effort by automatically creating ethograms, a top-view plot of the rat's position in the maze, and a heatmap highlighting the most visited locations. These data are obtained by analysing images recorded with an inexpensive camera placed above the maze (see Fig. 1).

Fig. 1

Our approach can detect the rat in the maze while simultaneously classifying its behaviour, using an inexpensive camera placed in a top view above the maze

The rest of this paper is organised as follows: Section 2 reviews related work underpinning our experimental design; Section 3 describes our DL-based automatic monitoring method for the open field maze, the animals used in our experiments, the behaviours of interest, and the hardware design and configuration; Section 4 presents our experimental results; Section 5 discusses what has been achieved with the proposed system. Finally, our conclusions and future perspectives are outlined in Section 6.

2 Related work

Designing a system for rat detection and behaviour classification involves computer vision challenges such as object recognition. Object recognition is an essential task that requires knowledge of the scene and impacts many applications, such as autonomous navigation, pedestrian detection, facial expression recognition, and human activity recognition, to name a few. Recognition generally requires some pre-processing of the image, followed by feature extraction and classification.

Depending on the nature of the image, a variety of pre-processing steps can be applied, such as colour normalisation, deblurring, and brightness and contrast correction. For challenging images, such as underwater images, the work in [36] proposes Contrast-Limited Adaptive Histogram Equalisation (CLAHE) and percentile methodologies to enhance them.

After pre-processing, several works in the literature propose visual descriptors as feature extractors combined with classifiers (such as k-Nearest Neighbours or random forests) to recognise objects in the image. The work in [33] uses a combination of Shi-Tomasi, Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) extractors followed by a random forest classifier to obtain an accuracy of 86.4% on the 10-class Wang dataset. Other works propose schemes that combine SIFT and Oriented FAST and Rotated BRIEF (ORB) extractors with classifiers such as BayesNet or k-NN [34] for content-based image retrieval, obtaining a precision rate of 88.9% on the Wang dataset. The work in [35] increased precision up to 99.53% on the Corel dataset using decision trees, random forests, and multi-layer perceptrons.

Efforts to outperform the state of the art (SOTA) have intensified with the growth of Deep Learning (DL) in recent years. The work in [32] presents a complete review of SOTA methods for 2D object recognition, comparing the performance of feature-based methods against DL-based approaches and showing that DL can score better on some datasets. DL methods have proved robust to image changes such as uncontrolled light conditions and dynamic objects in the environment, to mention some. DL has also achieved high scores on challenging multi-class datasets such as ImageNet; this is the case of CoAtNet [9] (currently the top-1 SOTA accuracy for image classification), which scores 90.88% on the 1000-class ImageNet dataset using a combination of convolutional layers and an attention model (Transformer).

In addition to image classification, object detection in images is another challenging problem, and several works have been proposed to achieve the best detection results. Some of the most relevant are the You Only Look Once (YOLO) network [53] and the Single Shot Detector (SSD) network [41], which offer reliable detection performance in terms of AP (average precision). Although recent works such as SwinV2-G [42], an architecture based on Swin Transformers, substantially outperform the AP on the Common Objects in Context (COCO) dataset, the SSD offers promising predictions with fewer classes and has the advantage of fast inference due to its compact architecture.

Additionally, works in the literature report approaches to problems such as hand gesture recognition [28] that use temporal information. Networks handle temporal information (in the form of data sequences) using long short-term memory (LSTM) modules or gated recurrent units (GRU), which take advantage of temporal data to provide feedback to the input. Other work uses temporal information for facial expression recognition (FER) [51] in the same way, as a sequence of stacked images, but with 3D convolutions instead of 2D convolutional layers to handle the image sequence. Not all works use recurrent modules or 3D convolutions to handle temporal information. In the context of autonomous drone racing (ADR), [55] shows that presenting temporal information as a mosaic image provides the information needed to learn flight commands successfully. Furthermore, the work in [5] shows that temporal information can be handled as a sequence of grey-scale images with only 2D convolutions to estimate the camera pose w.r.t. an object with low error in the ADR context.

Within the domain of neurosciences, where detecting a rodent in an apparatus such as the OFM is an essential task, the literature contains many efforts that use both classical computer vision algorithms and deep learning-based algorithms for rat detection and behaviour classification; moreover, some works add special devices to aid detection. Several works detect a rat in a test using traditional computer vision [4, 11, 13, 16, 37, 63, 69, 71]. However, these works do not classify any behaviour; furthermore, they require controlled lighting conditions and high contrast between the animal and the scenario [7, 40].

Marker-based strategies, such as painting the animal or attaching lights, are used in [14, 49, 56] to detect the rodent efficiently. Another invasive option is surgical implants [21, 22, 44, 59]. To enhance rodent detection and behaviour classification, the approaches in [70, 71] use additional hardware (sensors) to classify specific behaviours of the rodent.

Using depth/infrared cameras, the works in [7, 8, 20, 45, 50, 67] can identify the rodent's position and orientation, and can even analyse more than one rodent; however, they classify no behaviour other than rearing. With the growth of Deep Learning, several works propose different convolutional neural networks (CNNs) to detect and classify rodent behaviours in various scenarios.

Detection systems are the most conventional systems, applicable to many types of scenarios. For this purpose, the works in [1, 10, 11, 23, 43] use CNN architectures reported in the literature to detect the rodent; the most commonly used network in these works is YOLO [53].

The authors in [11] adapt the YOLO network to detect one to three rats in a test box; they also apply an Extended Kalman Filter to correct missed detections, achieving a high detection accuracy of 95%. Despite its good detection, the proposed system cannot classify any behaviour.

The approach presented in [10] reported a detection precision of 90% across three different scenarios, but it requires controlled illumination.

The comprehensive work presented in [24] performs both rat detection and behaviour classification under constant illumination; it can classify five behaviours, although grooming is not among them.

Furthermore, several works focus their attention on rodent behaviour identification for different types of tests.

The work in [54] fine-tunes AlexNet [31] to identify five behaviours associated with the Object Location Memory test. Higaki et al. [18] use a CNN to classify behaviours in the Morris water maze test.

For specific behaviours, the work in [12] uses an extensive dataset (over 2 million images) to classify rodent grooming, while [39] proposes feeding the network with a stack of optical-flow images. Scratching is another behaviour classified by the DL-based approach in [29], where a sequence of 21 images feeds the network. Another work centres its efforts on segmenting rats in thermal images with a CNN [46].

One of the most relevant proposals is the work in [65], in which the authors propose a system that classifies nine behaviours (including grooming) with an average precision of 65%. Nevertheless, achieving this result requires controlled conditions; without controlled lighting, their results fall below 60%.

In addition to the proposals described above, there are commercial and free systems that work with different apparatuses and offer various tools.

EthoVision XT is a complete system on the market; it can detect different animals in many tests and classify behaviours in the home cage. Its principal disadvantage is its high price, which makes it inaccessible to many researchers.

Another capable system that can detect animals in several apparatuses is the ANY-Maze software, which works on six different scenes. Despite being a solid solution for animal tracking, ANY-Maze does not offer behaviour classification, and its price may also be high for many researchers.

In sum, works that perform detection and tracking mostly do not classify behaviours, whereas works that classify behaviours provide neither detection nor any additional information from the test. None of these systems provides visual information that helps researchers interpret the data obtained.

3 Methodology

Simultaneous rodent detection and behaviour classification can be challenging; for this reason, our methodology is divided into two main tasks: rodent detection and behaviour classification (see Fig. 2). This section gives a general overview of the open field test and its setting, describes the network architecture used in each step of the methodology, explains the dataset generation, and details the configuration needed to train the networks.

Fig. 2

General overview of our proposed system, which performs two main tasks connected by three modules. For the first task, a model trained with the Single Shot Detector (SSD) network detects the rodent in the frame. The Data Processing Module (DPM) uses the SSD's output to generate a sequence that feeds the Rat Behaviours Network, which performs the second task of behaviour classification. The latter's output is fed back to the DPM to generate all the output data shown on the right

3.1 Animals

Five male Wistar rats weighing 250-300 g were obtained from the Bioterio Claude Bernard of the Benemérita Universidad Autónoma de Puebla (BUAP). Animals were housed under temperature- and humidity-controlled conditions in the vivarium of the Laboratorio de Neurofarmacología-BUAP, with a 12-12 h light-dark cycle and free access to food and water. All procedures followed the Guide for the Care and Use of Laboratory Animals of the Mexican Council for Animal Care, NOM-062-ZOO-1999. We also obtained the approval of the Use of Laboratory Animals and Ethics Committee of BUAP.

3.2 Open field test

The open field maze was used to determine the spontaneous motor activity of the rats. The apparatus consists of a wooden box measuring 1.2 m x 1.2 m x 1.2 m, with the arena divided into nine quadrants of 40 cm x 40 cm each. The test consists of placing the rat in the central quadrant of the arena and letting it explore for 15 min. A camera placed on a tripod above the open field maze video-records the spontaneous exploratory movements of the rat over the whole arena. After the test, the rat is removed and returned to the laboratory vivarium.

3.3 Rodent detection

The first task in our methodology is rodent detection. To detect the rat during the test, we extracted each frame from the video and used it as input for detection with the Single Shot Detector (SSD) network [41]. The SSD can identify multiple objects in an image, delimiting the area that contains each object. Since we only need to identify one object (a rodent), we selected a reduced version of the SSD named SSD7, which has only seven convolutional layers as its base network. This reduced architecture allows detection at a faster frame rate, making our approach more efficient in computational terms.
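As a minimal sketch, this detection step reduces to a frame-by-frame prediction loop. The file names are hypothetical, and we assume the saved model already includes a decoding layer that returns boxes as (class, confidence, corner coordinates); the actual decoding depends on the SSD7 implementation used:

```python
import cv2
import numpy as np
from tensorflow import keras

# Hypothetical file names; we assume the saved SSD7 model already decodes
# its raw output into (class_id, confidence, xmin, ymin, xmax, ymax) rows.
model = keras.models.load_model("ssd7_rat.h5", compile=False)
cap = cv2.VideoCapture("open_field_video.mp4")

boxes = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Resize to the network's fixed input size and add a batch dimension.
    x = cv2.resize(frame, (300, 300))[np.newaxis].astype(np.float32)
    detections = model.predict(x)[0]
    # Keep the most confident detection: there is only one rat in the arena.
    best = max(detections, key=lambda d: d[1])
    boxes.append(best[2:6])
cap.release()
```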

3.4 Behaviours classification

For the second task, we propose an additional CNN to predict rodent behaviours. We decided on a second CNN after a preliminary test in which we trained the SSD network to detect the rodent and predict its behaviour simultaneously; unfortunately, we did not obtain satisfactory results. Thus, we designed a compact network based on inception modules [62] to predict rodent behaviours only.

The network takes as input a stack of six consecutive grey-scale images; the stack passes through a combination of convolutional layers and one inception module, which extract the features needed to predict behaviours through a multi-layer perceptron whose output layer has four neurons (one per behaviour). Figure 3 shows the architecture of the proposed network.

Fig. 3

BehavioursNet architecture based on an inception module. This is a small architecture used to train a model to classify the rat’s behaviours in real-time

The stacked input consists of cropped images containing the rat; we use a sequence to give the CNN more information about the motion of each behaviour. The sequence is essential for classifying grooming and for differentiating walking from resting, behaviours that differ mainly in the movement performed.

To reduce the amount of input data, we use grey-scale images instead of RGB images, because we focus on extracting motion features rather than complex texture features.

We did not change the input size proposed in [62]; the size of each image is 224 x 224.
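The following is a minimal Keras sketch of such an inception-based classifier. The number of layers and the filter sizes are illustrative assumptions (the actual architecture is the one shown in Fig. 3), but the input shape (a 224 x 224 stack of six grey-scale frames) and the four-way softmax output follow the description above:

```python
from tensorflow import keras
from tensorflow.keras import layers

def inception_module(x, f1, f3, f5, fp):
    """Inception block in the spirit of [62]: parallel 1x1, 3x3 and 5x5
    convolutions plus a pooled branch, concatenated along channels."""
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(x)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(fp, 1, padding="same", activation="relu")(bp)
    return layers.Concatenate()([b1, b3, b5, bp])

# Input: six consecutive grey-scale frames stacked along the channel axis.
inp = layers.Input(shape=(224, 224, 6))
x = layers.Conv2D(32, 7, strides=2, activation="relu")(inp)
x = layers.MaxPooling2D(3, strides=2)(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = inception_module(x, 32, 64, 16, 16)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation="relu")(x)
# Four outputs: walking, rearing, resting, grooming.
out = layers.Dense(4, activation="softmax")(x)
model = keras.Model(inp, out, name="BehavioursNet_sketch")
```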

3.5 Data processing module

Each step of our methodology requires preparing the input frames and processing each network's output. The Data Processing Module (DPM) performs this pre-processing and post-processing of the data in our system.

First, the DPM takes the output of the SSD network and uses the detected bounding boxes to crop the frame, converting it to grey-scale. The cropped frames are used to build a sequence of six consecutive frames (a stack of images) that serves as input for the Rat Behaviours Classification network.
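A minimal sketch of this buffering step, assuming OpenCV for cropping and colour conversion (the resize target matches the 224 x 224 input discussed above; all names are illustrative):

```python
import collections
import cv2
import numpy as np

SEQ_LEN = 6
buffer = collections.deque(maxlen=SEQ_LEN)  # sliding window of crops

def push_detection(frame, box):
    """Crop the detected rat, convert it to grey-scale, resize, and keep a
    rolling window of the last six crops (sizes are illustrative)."""
    xmin, ymin, xmax, ymax = [int(v) for v in box]
    crop = frame[ymin:ymax, xmin:xmax]
    grey = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    buffer.append(cv2.resize(grey, (224, 224)))
    if len(buffer) == SEQ_LEN:
        # Stack along the channel axis -> (224, 224, 6), batch of one.
        return np.stack(buffer, axis=-1)[np.newaxis].astype(np.float32)
    return None  # not enough frames yet
```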

The output from the second network enables the DPM to generate all the graphics associated with the test, i.e., video with detection, ethogram, detection plot, and heatmap plot for most visited areas.

The SSD output is used to generate the rat's detection plot. The DPM plots each detection on the x- and y-axes, keeping the same origin as in the image, i.e., the top-left corner with coordinates (0,0). In addition to the detection plot, the DPM stores all the detection centres. At the end of the video processing, some regions will have accumulated more detections, depending on the rat's behaviour, indicating the rat's preference; this information is plotted as a heatmap.
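As a sketch, accumulating the stored centres into a 2D histogram is enough to produce such a heatmap; the bin count and colour map below are our assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_heatmap(centres, width, height, bins=9):
    """centres: list of (x, y) detection centres in image coordinates.
    Accumulates them into a 2D histogram; darker cells = more visits."""
    xs = [c[0] for c in centres]
    ys = [c[1] for c in centres]
    grid, _, _ = np.histogram2d(ys, xs, bins=bins,
                                range=[[0, height], [0, width]])
    # Flip the y extent so the origin stays at the top-left image corner.
    plt.imshow(grid, cmap="Blues", extent=[0, width, height, 0])
    plt.colorbar(label="visits")
    plt.title("Most visited zones")
    plt.savefig("heatmap.png")
```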

To analyse the behaviours performed in the test, the DPM creates an ethogram showing the four behaviours, each with a different colour. The prediction for every frame is stored in a vector and plotted at the end as a timeline.
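A minimal sketch of such a timeline plot (the colour assignment follows the convention used in Section 4; the frame rate value is an assumption):

```python
import matplotlib.pyplot as plt

COLOURS = {"rearing": "tab:blue", "walking": "tab:orange",
           "grooming": "tab:purple", "resting": "tab:green"}

def plot_ethogram(behaviours, fps=23.0):
    """behaviours: one predicted label per frame. Draws each frame as a
    thin coloured segment on a single timeline."""
    fig, ax = plt.subplots(figsize=(10, 1.5))
    for i, b in enumerate(behaviours):
        ax.axvspan(i / fps, (i + 1) / fps, color=COLOURS[b])
    ax.set_yticks([])
    ax.set_xlabel("time (s)")
    fig.savefig("ethogram.png")
```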

3.6 Dataset generation

As described in previous sections, our methodology consists of two main tasks, rodent detection and behaviour classification, so it was necessary to prepare a dataset for each task. Inspired by the work in [6], we implemented a system that detects the rat's position using a filtering algorithm (a Kalman filter). The detected image points were used as labels, with the complete image as the training example for the SSD network. We have five recorded videos with an average length of 15 minutes; we used Video 2 for dataset creation, yielding about 27 thousand frames with their respective bounding-box labels. For detection, we used RGB frames, as shown in Fig. 4a).

Fig. 4

Dataset examples for the training of the SSD, used for the rat’s detection task in the image: a) Top view images captured with an inexpensive camera; b) Cropped images from the images in a) obtained with our automatic detection labelling system based on stochastic filtering

The bounding boxes were also used to crop the rat from the image. Behaviour labels had to be assigned manually; given the number of frames per video, labelling every frame is arduous. Instead, we set a label only at the start of each behaviour, reducing the manual labels to about 150 per video. With these marks, we used the detection system to generate the labels for every frame within each time range.

Figure 4b) shows an example of the cropped images generated and used to train the behaviour classification network. Because the rat is more active in the first minutes of the test and then tends to rest, the classes were unbalanced. To prevent biased classification, we took the number of labels of the least frequent behaviour as a threshold and kept only that number of labelled frames for each behaviour in the training set.

Thus, we obtained a semi-automatic labelling system for bounding boxes and behaviours, combining the manual marks with the automatic detection system for the rat's position. A sketch of this label propagation and class balancing is shown below.
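The two steps reduce to the following minimal sketch, under our own naming assumptions (marks are the roughly 150 hand-set behaviour onsets per video):

```python
import collections
import random

def propagate_labels(marks, n_frames):
    """marks: sorted list of (start_frame, behaviour) pairs set by hand at
    the onset of each behaviour. Every frame inherits the most recent mark,
    turning sparse labels into per-frame labels."""
    labels, j = [], 0
    for f in range(n_frames):
        while j + 1 < len(marks) and f >= marks[j + 1][0]:
            j += 1
        labels.append(marks[j][1])
    return labels

def balance(frames, labels):
    """Undersample every class to the size of the rarest one."""
    by_class = collections.defaultdict(list)
    for fr, lb in zip(frames, labels):
        by_class[lb].append(fr)
    n = min(len(v) for v in by_class.values())
    return {lb: random.sample(v, n) for lb, v in by_class.items()}
```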

3.7 Training process

We used Python 2.7 with the Keras API 2.2.4 and the TensorFlow 1.14 framework as the backend to implement our networks. We trained for 100 epochs with a batch size of 64, using the Adam optimiser with a learning rate of 0.001 for both the SSD and the Rat Behaviour networks.

We used the loss described in [41] for the SSD network and a categorical cross-entropy loss for the behaviour classification network.
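For the classifier, this training setup reduces to the following Keras sketch (the model and the data arrays are assumed to be defined as in the previous sections; the SSD is trained analogously but with its own multibox loss from [41]):

```python
from tensorflow import keras

# Training configuration as reported: Adam, learning rate 0.001,
# batch size 64, 100 epochs, categorical cross-entropy loss.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train,            # assumed: stacks and one-hot labels
          batch_size=64, epochs=100,
          validation_data=(x_val, y_val))
```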

Table 1 summarises the setup of the parameters used for training both networks.

Table 1 Parameters used to train the system for the rat’s detection and classification

The detection dataset consisted of 27 thousand labelled frames; however, we took only 8 thousand images for training, split into 80% for training and the remaining 20% for validation. For the Rat Behaviour Network, the dataset contained approximately 12 thousand images, split into 8952 images for training and 2984 for validation. We wrote the DPM, the CNNs, and the entire system in Python.

4 Results

This section presents the results obtained for each module of our two-step system and the results produced by the DPM.

4.1 SSD network

As described in Section 3.7, we used a small dataset for SSD training. This dataset was enough for the SSD to learn a model that detects the rat in all the video frames without losing it. Figure 5a) shows an example of the rat's detection in the video: the bounding box containing the rat and its centre, with the corresponding label and the confidence of the detection. We generated this sequence only to illustrate the SSD detection; we remind the reader that our system also performs a second task, behaviour classification.

Fig. 5

Examples of our system's output for the two main tasks: a) Detection of the rat in the image, indicated with a green bounding box, showing the confidence level obtained by the SSD network; b) Behaviour classification obtained after passing cropped images from the detection task to the BehavioursNet architecture; c) Failure cases where the SSD was trained to detect and classify behaviours simultaneously. These images show that the network cannot detect the rat correctly; hence the behaviour classification is also incorrect

4.2 Rat behaviour classification network

The network at this step predicts one of four possible behaviours: walking, rearing, resting, and grooming. This prediction is performed for every frame in the video. To show the network's output, we attach the behaviour classification to the cropped image; an example is shown in Fig. 5b).

Additionally, as argued in Section 3.4, we also tested the SSD network for behaviour classification, but the network could not perform both tasks correctly. Figure 5c) shows that when we combine detection with classification in the same CNN, the network does not correctly detect the bounding box that contains the rat.

4.3 Plots and data generated by the DPM

As emphasised in the methodology section, the Data Processing Module is the essential module for pre- and post-processing the data and generates the statistics for the system output. These outputs are the video, the ethogram, the detection plot, the heatmap plot, and the count of visited cells.

By combining the outputs of the SSD and the Rat Behaviours Classification network, the system creates a video in which the detected rat (indicated by a green bounding box) and its classified behaviour are shown.

Because the classification runs frame by frame, we have an annotation of the rat's behaviour for each instant of time. We compared the system's classification against the ground truth frame by frame and calculated the classification precision for each behaviour; these results are presented in Table 2.
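Such a frame-by-frame evaluation can be reproduced in a few lines; as a sketch, assuming y_true and y_pred hold one behaviour label per frame:

```python
from sklearn.metrics import classification_report

# Per-class precision, recall and F-score, as summarised in Table 2;
# y_true and y_pred are assumed to be defined (one label per frame).
print(classification_report(
    y_true, y_pred,
    labels=["walking", "rearing", "resting", "grooming"],
    digits=3))
```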

Table 2 Metrics for our system evaluation: Accuracy, Precision, Recall and the F-score. Note that walking is the behaviour with the highest score

With the predictions generated by the network, the DPM created an ethogram for the video. The ethogram shows the behaviour at each instant: blue indicates rearing, orange walking, purple grooming, and green resting. Figure 6 shows the ground-truth ethogram corresponding to video 3. Note that in the first half of the ethogram the rat explores the box, walking and taking some time to groom itself; resting is then the predominant behaviour from the second half of the test onwards.

Fig. 6

Ethograms corresponding to video 3 used in our experiments: a) Ethogram generated with the ground truth showing that in the first minutes, the rat walks to explore the box, and it rears many times in tandem with a long period of grooming. After half the video, the rat decreases its activity and rests for a long time; b) Ethogram obtained with the behaviour classification obtained with our BehavioursNet architecture. Note that rearing and grooming are the behaviours with more misclassification. However, this ethogram produced automatically with our system may be enough to detect unusual behaviour

Although the precision for the behaviour classification is not high, we can still use the information in the ethogram to interpret the general behaviour during the entire test.

Moreover, since the centre of the rat is estimated in each frame throughout the test video, the DPM generates a detection plot using these centre estimations (see Fig. 7). The figure shows the rat's detections for video 3; the system generates a detection plot for every video.

Fig. 7

A plot of the rat's position in video 3, used in our experiments, showing the ground truth in red and the pixel positions detected by our system in green. Our system tracks the ground truth closely, with a low average error of 6 pixels. Note that the error is not significant compared with the size of the rat in pixels

We compared the detection points generated by the CNN against the ground truth. Figure 7 shows the comparison: the red points represent the ground truth and the green points the SSD output. As observed, the estimated points are close to the ground-truth ones; the global mean error (Euclidean distance) between ground-truth and estimated detection points is 6.34 pixels. Per axis, the RMSE is 3.8 pixels on the x-axis and 6.1 pixels on the y-axis. These errors are low enough to detect the rat and correctly generate a cropped image containing it. In addition to the detection plot, the DPM produced a heatmap indicating the most visited zones of the test box: lighter colours indicate less frequently visited areas, while the most visited zones are painted darker, as shown in Fig. 8. The heatmap, also from video 3, shows that the rat preferred the top-right and bottom-right boxes. To estimate the time our system requires to process each result, we measured the processing time between frames in all the experiments, obtaining an average of 42 ms (\(\sim 23\) Hz).
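The reported error figures correspond to the following computation; as a sketch, assuming per-frame arrays of ground-truth and estimated centres:

```python
import numpy as np

gt = np.asarray(gt_centres, dtype=float)    # ground-truth (x, y) per frame
est = np.asarray(est_centres, dtype=float)  # SSD-estimated (x, y) per frame

# Global mean Euclidean distance between estimated and true centres.
mean_err = np.mean(np.linalg.norm(est - gt, axis=1))  # reported: 6.34 px

# Per-axis root-mean-square error.
rmse = np.sqrt(np.mean((est - gt) ** 2, axis=0))      # ~3.8 px (x), 6.1 px (y)
```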

Fig. 8

Heatmap plot generated for video 3 used in our experiments. For this test, in the first minutes, the rat explores the whole box, walking and rearing in some places. Then, for the second half of the video, the rat reduces its activity, resting most of the time as the environment becomes familiar to it

In addition to the detailed description of the video 3 results, Fig. 9 presents the plots generated by our system for each video (including the one used for training). We can observe that rats tend to explore and stay in corners, while their behaviours differ in each case.

Fig. 9

Example of the plots generated by our system for all videos in our experiments. Notice from the detection and heatmap plots that rats prefer to explore the left corners more. Besides this, the behaviours differ for each rat in the test, as can be seen in the ethogram plots

Section 2 presented a review of related work developed in the last decade. Based on this review, Table 3 compares the most closely related works of recent years. The Score column (column 2) shows that the precision or accuracy obtained by our system is comparable with the proposals reported in the SOTA. Nevertheless, our proposal does not need controlled conditions, such as particular lighting or high contrast, to achieve these results. Our system also offers a variety of visual results that help provide a broader view of what happens in the open field test.

Table 3 Comparison with the most closely related works in the literature. The Score column shows the result reported by each work; some works report only precision (P) and others only accuracy (Acc)

5 Discussion

This work aimed to develop a Deep Learning-based two-step methodology to track and detect a rat in the arena of the open field maze and subsequently classify the animal's behaviour. Since the psychomotor process is heavily researched in the neurosciences [2, 52, 57], and experimenters have struggled with the precision of the data obtained both by software and by themselves, we designed our system to generate an ethogram of the rat's behaviours in the analysed video; this allows the researcher to evaluate highly relevant behavioural parameters depending on the study's objective. The results of the present work show that the detection performed with the Single Shot Detector network is efficient and enables the system to automatically perceive locomotor behaviour in freely moving rats in the open field maze model. Likewise, we compared traditional computer vision algorithms against the approach proposed in this work, showing that our system can simultaneously detect the animal and classify its behaviour, something not achieved by those traditional methods.

Behaviour classification has been, and remains, a complex challenge. Detection of the subjects is possible, as described in our related work section, but there is still room for improvement in the behaviour classification task [61]. In this regard, we have proposed an approach based on the SSD network and our novel CNN architecture called BehavioursNet, using the former for detection and the latter for behaviour classification. In our experience, attempting both tasks with a single CNN performs poorly; see, for instance, Fig. 5c), where the SSD trained to detect the rat and classify its behaviour did not even detect the rat when tested on the images.

According to our experiments with the SSD network, its architecture tries to classify small regions of the image as the object, later joining all the regions (anchor boxes). Some of the rat's behaviours look similar: in resting and grooming the rat's shape appears alike, and when the rat is walking, its elongated shape can be confused with rearing if the image is rotated. Providing only one image to the network may therefore not carry enough information to classify behaviours, causing the network to fail in detection as well. On the other hand, a network that processes more than one image receives more information, which can improve classification, at the cost of some false positives among behaviours that look similar. Thus, our approach both detects the rat and classifies its behaviour, as shown in Figs. 5a) and 5b).

Behaviour tests require the researcher's constant observation in real time and, later, review of the video recordings, which is tiring and error-prone. The automatic behaviour classification provided by our system can facilitate the locomotor study of experimental subjects such as the rats shown in Fig. 5. The detection of the rat in the image, as shown in Fig. 8, is useful for analysing the rat's activity, reflecting any condition derived from a drug or pathology. The frame-by-frame behaviour classification and its report in the ethogram can speed up the behavioural analysis and evaluation for various pathologies, including Parkinson's disease (PD) and anxiety. All the data generated in real time by the system permits the user to skip observation time, paying attention only to those time slots with relevant motion activity or behaviour classification.

Table 2 shows the evaluations of each behaviour’s classification: rearing, walking, grooming, and resting. The values indicate a high score for accuracy in both rearing and grooming; however, particularly for grooming, the score decreases in precision, recall and F-score (Table 2). This situation is caused by the similarity of grooming and resting behaviours when seen from a top view. From this perspective, essential body parts of the rat, such as the paws, are not visible, which may be crucial to classifying grooming.

Figure 6a) shows an ethogram produced with ground-truth data, and Fig. 6b) shows the ethogram obtained from our network's classification; note that in the first half of the latter, the classification closely resembles the ground truth. Despite the margin of error between grooming and resting, such ethograms, interpreted with our network, can be helpful when studying anxiety processes. This is advantageous for researchers, who will not need to spend a significant part of their time corroborating the ethogram data against observations taken in real time.

Grooming is an innate behaviour in rats related to the hygiene of the animal and to other physiological processes such as thermoregulation, socialisation, and excitement [26]. However, in highly anxious animals, hyperactivity and increased grooming are commonly observed [25, 27]. In addition, thigmotaxis, related to the amount of time the experimental subject remains adjacent to the maze wall, is a typical behaviour observed in the open field maze [38]. In contrast, when anxiolytic drugs are evaluated, these activities and behaviours diminish. Therefore, when assessing the four behaviours with our system, their measurements could support the user's interpretation of a specific pathology, even with a small margin of error between grooming and resting. Additionally, our system could be useful when a large number of videos from various experimental groups of rodents in a given project need to be analysed.

As Table 3 shows, the accuracy and precision obtained by our system are comparable with those found in the literature. Nevertheless, our system has an advantage over those with greater accuracy or precision, since they only track the rat or detect one or more behaviours excluding grooming, a significant behaviour in many experiments, as described earlier. Our proposal can track the rat and detect behaviours, including grooming. Furthermore, the work presented in [65] scores an AP of 65% under constant light conditions, dropping below 50% when the illumination changes; in these cases, our work is not only comparable with the highest precision but also maintains it under changing lighting conditions. In the additional material, it can be seen that when researchers observe the rat, their shadows cause variations in the light; additionally, there are slight variations in light due to the physical placement of the lights.

Finally, we emphasise again that our system processes data with two CNN architectures, a reduced version of the SSD (SSD7) followed by our small BehavioursNet, and still performs at an average time of 42 ms (\(\sim \)23 Hz). This makes it possible to use standard, inexpensive cameras recording at 15 to 25 fps without requiring much more time than the duration of the video itself. The possibility of processing videos in real time benefits the user by reducing the time needed for analyses when experimenting with the effects of a drug. Moreover, viewing all the graphs generated by our system together gives a quick overview of each rat's activity during the test, allowing us to observe differences in activity and behaviour between trials. Therefore, the automatic generation of these results offers researchers the opportunity to spend less time watching the recordings, focusing only on those videos whose ethograms and trajectory plots exhibit distinctive data worth analysing more carefully.

6 Conclusions

This work has described a system for automatically detecting a rat in an open field maze while simultaneously classifying its behaviours. We have shown that Deep Learning techniques such as convolutional neural networks can perform these tasks efficiently at an average frequency of 23 Hz. Moreover, despite the difficulties of using top-view images, it is possible to classify behaviours with a precision and recall of around 60%, comparable with the works reported in the literature, with the advantage of not requiring special setups or controlled environments. The results achieved with our proposal are promising, considering that they were obtained with a low-resolution, inexpensive video camera and a budget PC.

For future work, we will explore 3D data to enhance the classification of similar behaviours.