1 Introduction

Zebrafish is a small tropical freshwater fish native to South Asia, known for its external fertilization, transparent embryos, and short life cycle. Its genome was fully sequenced in 2013 [1], and about 70% of human genes have at least one zebrafish counterpart. Zebrafish offers numerous advantages as a model organism: (1) it is small and has a high survival rate, (2) it is less expensive to maintain than mice, (3) it can produce hundreds of eggs per week, providing a large number of embryos for research, (4) up to 84% of genes related to human diseases have corresponding genes in zebrafish, and (5) as a vertebrate, it has major organ systems similar to those of humans. Zebrafish has been widely used as a model organism in laboratory research for many years [2], with applications including drug safety, behavioral genetics, ecotoxicology, circadian rhythms [3,4,5,6,7], and many others. Moreover, zebrafish can regenerate heart muscle, which makes it a valuable tool for studying heart-related diseases.

1.1 Zebrafish Applied to Biomedicine

Zebrafish (Danio rerio) has rapidly emerged as a promising tool for disease modeling and drug discovery in biomedical research [2, 8]. Its genome is comparatively simple and easy to manipulate, and the fish grows quickly, which makes it useful for studying disease mechanisms, evaluating treatments, and screening drugs [9,10,11]. In recent years, the use of zebrafish in cancer research, neuroscience [12, 13], and cardiology has steadily increased. Zebrafish has also been used in ecotoxicology [14] to assess the impact of chemicals on organisms and thereby help protect public health.

Because zebrafish is so widely used as a model organism in scientific and medical research [15, 16], objective, impartial, and reproducible methods of data collection and analysis are necessary to ensure the credibility and accuracy of experimental results [17]. Such methods reduce subjectivity, bias, and experimental uncertainty, and help scientists interpret their results. They also make it easier for other scientists to verify and replicate experiments, which further strengthens the reliability of research outcomes in zebrafish studies and in other scientific fields.

1.2 Deep Learning Applied to Zebrafish Research

Combining deep learning techniques with zebrafish imaging can provide more accurate tracking and analysis of zebrafish behavior and movement trajectories. In recent years, many research teams have applied deep learning to zebrafish image analysis [18,19,20], for example by using convolutional neural networks to detect zebrafish movement and track trajectories [21,22,23]. With these techniques, valuable features can be extracted from large amounts of zebrafish image data, and accurate models can be built to predict zebrafish behavior and study movement trajectories [24,25,26]. These findings deepen our understanding of zebrafish movement and behavior and provide foundational information for biological and medical research. Deep learning can also support more advanced image analysis tools that improve the automated analysis of zebrafish images and open up more possibilities for studying zebrafish behavior and physiology [12].

1.3 Introduction to YOLOv7

Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao successively introduced YOLOv4, Scaled-YOLOv4, and YOLOR in 2020–2021, and released YOLOv7 [27] in July 2022. YOLOv7 outperforms all known object detectors in both speed and accuracy within the range of 5 FPS to 160 FPS, and has the highest accuracy (56.8% AP) among real-time object detectors running at 30 FPS or higher on a V100 GPU. The YOLOv7-E6 object detector (56 FPS V100, 55.9% AP) outperforms the transformer-based detector SWIN-L Cascade-Mask R-CNN (9.2 FPS A100, 53.9% AP) by 509% in speed and 2% AP in accuracy, and the convolution-based detector ConvNeXt-XL Cascade-Mask R-CNN (8.6 FPS A100, 55.2% AP) by 551% in speed and 0.7% AP in accuracy. Moreover, YOLOv7 surpasses YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, DETR, Deformable DETR, DINO-5scale-R50, ViT-Adapter-B, and many other object detectors in speed and accuracy. The authors trained YOLOv7 from scratch on the MS COCO dataset alone, without any other datasets or pre-trained weights (Fig. 1).

Fig. 1. Comparison with other object detectors; the methods proposed in [27] achieve state-of-the-art performance.
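
The speed and accuracy comparisons quoted above can be checked with a few lines of arithmetic. The sketch below assumes the speed percentages are relative FPS gains, (new - reference) / reference, which reproduces the 509% and 551% figures. The function and variable names are illustrative only, and because the FPS values were measured on different GPUs (V100 vs. A100), the comparison is approximate.

```python
def relative_speedup_percent(fps_new: float, fps_ref: float) -> float:
    """Percentage by which fps_new exceeds fps_ref (assumed definition)."""
    return (fps_new - fps_ref) / fps_ref * 100.0


# FPS and AP values as quoted in the text for each detector.
yolov7_e6_fps, yolov7_e6_ap = 56.0, 55.9
swin_l_fps, swin_l_ap = 9.2, 53.9
convnext_xl_fps, convnext_xl_ap = 8.6, 55.2

speedup_vs_swin = relative_speedup_percent(yolov7_e6_fps, swin_l_fps)
speedup_vs_convnext = relative_speedup_percent(yolov7_e6_fps, convnext_xl_fps)

# Accuracy gains are plain differences in AP (percentage points).
ap_gain_vs_swin = yolov7_e6_ap - swin_l_ap
ap_gain_vs_convnext = yolov7_e6_ap - convnext_xl_ap

print(f"vs SWIN-L:      +{speedup_vs_swin:.0f}% speed, +{ap_gain_vs_swin:.1f} AP")
print(f"vs ConvNeXt-XL: +{speedup_vs_convnext:.0f}% speed, +{ap_gain_vs_convnext:.1f} AP")
```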

YOLOv7 reduces the parameter count and computation cost of state-of-the-art real-time object detection by about 40% and 50%, respectively. It mainly optimizes the model architecture and the training process. For the architecture, it proposes "extend" and "compound scaling" methods that make effective use of parameters and computation. For the training process, the authors had already introduced the "bag-of-freebies" concept in YOLOv4, which increases accuracy at the cost of additional training effort without increasing inference cost. In YOLOv7, they use re-parameterization techniques to replace the original modules and a dynamic label assignment strategy to assign labels more efficiently to the different output layers (Fig. 2).

Fig. 2. Overall network architecture of YOLOv7, which consists of three parts: input, backbone, and head. Unlike YOLOv5, YOLOv7 merges the neck and head layers into a single head layer, although the function is unchanged. As in YOLOv5, the backbone extracts features and the head produces the predictions [27].

2 Method

Fig. 3. Experimental flow chart of this study.

In this study, self-recorded videos were first converted to images and labeled. The labeled data were then used to train a convolutional neural network model, and a small number of test samples were used to confirm the accuracy and usability of the model. Videos containing the behavior to be detected were then fed to the model, and the behavior category was determined from the detected position of the zebrafish over time. Finally, the results were displayed in a self-designed user interface (Fig. 3).

2.1 Data Collection and Processing

We recorded our own video footage to build the dataset, accumulating a total of 1 h 59 min 52 s of material. The videos were then edited with PotPlayer, and the regions of interest were annotated with the LabelImg software (Fig. 4).

Fig. 4. Screenshot of the LabelImg annotation software in use.
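
To illustrate what a bounding-box annotation looks like once exported for YOLO training, the sketch below converts a pixel-coordinate box into the normalized `class x_center y_center width height` text line used by the YOLO label format. This is an assumption about the export format: the text does not state which LabelImg output format was used, and the function name, class id, box, and image size are hypothetical.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    All four output values are normalized to the image size, so they lie in [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: one zebrafish (class 0) in a 1920x1080 frame.
line = to_yolo_line(0, (480, 270, 960, 540), 1920, 1080)
print(line)
```

Each image would then have one text file with one such line per labeled fish.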

2.2 Model Training and Behavioral Judgment

After preprocessing, the images were divided into training, validation, and test sets. Of the six initial weight files provided with YOLOv7, YOLOv7.pt was used as the starting weights for training the model. After the position of the zebrafish was detected, the behavior category was determined by comparing the current position with the previous position.
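
A minimal sketch of the dataset split is given below. The split ratios are not stated in the text, so the 70/20/10 proportions, the fixed random seed, the function name, and the file names are all assumptions chosen for illustration; the 2000-image total matches the larger training run reported in the Results.

```python
import random


def split_dataset(items, train_pct=70, val_pct=20, seed=0):
    """Shuffle items and split them into train/validation/test lists.

    Percentages are integers (assumed 70/20/10); the remainder goes to the test set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the labeled frames.
frames = [f"frame_{i:04d}.jpg" for i in range(2000)]
train, val, test = split_dataset(frames)
print(len(train), len(val), len(test))
```

Shuffling before splitting keeps consecutive, near-identical video frames from all landing in the same subset, although frames from one video may still appear in more than one subset.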

In this experiment, zebrafish behavior was classified into three categories: normal movement, bottom-dwelling, and stationary. Behavior was evaluated in timing units of 1.5 s. If the detected coordinates fell within the bottom 20% of the tank, that is, within 4 cm of the tank bottom, the behavior was defined as bottom-dwelling. If the coordinates did not change within 1.5 s, the behavior was considered stationary. All other cases were treated as normal movement.
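
The rule above can be sketched as a small function. Several details here are assumptions rather than part of the described method: positions are taken as box centers normalized to the tank region with y increasing downward (image convention), one detection is kept per 1.5 s unit, a small tolerance decides whether the fish "did not move", and the stationary check takes priority over the bottom-dwelling check. All names are hypothetical.

```python
from dataclasses import dataclass

WINDOW_S = 1.5          # timing unit from the text
BOTTOM_FRACTION = 0.20  # bottom 20% of the tank (4 cm of 20 cm)
MOVE_TOLERANCE = 0.01   # assumed: displacement below this counts as "not moved"


@dataclass
class Detection:
    t: float  # time in seconds
    x: float  # normalized 0..1, left to right
    y: float  # normalized 0..1, top to bottom


def classify_window(prev: Detection, curr: Detection) -> str:
    """Label one 1.5 s unit by comparing the current position with the previous one."""
    dx, dy = curr.x - prev.x, curr.y - prev.y
    moved = (dx * dx + dy * dy) ** 0.5 > MOVE_TOLERANCE
    if not moved:
        return "stationary"          # assumed to take priority
    if curr.y >= 1.0 - BOTTOM_FRACTION:
        return "bottom-dwelling"
    return "normal movement"


def classify_track(track):
    """Label each consecutive pair of detections sampled WINDOW_S apart."""
    return [classify_window(a, b) for a, b in zip(track, track[1:])]


# Hypothetical track: swims, dives to the bottom, then stays still.
track = [
    Detection(0.0, 0.20, 0.30),
    Detection(1.5, 0.45, 0.35),
    Detection(3.0, 0.50, 0.90),
    Detection(4.5, 0.50, 0.90),
]
labels = classify_track(track)
print(labels)
```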

2.3 Experimental Environment

All deep learning models were trained and evaluated on a machine equipped with an AMD Ryzen 5 5600X CPU, 32 GB of RAM, and an NVIDIA GeForce RTX 3070 Ti GPU, running Windows 10 x64. Python 3.9.13 was used for table creation and data encoding, and LabelImg was used to label the dataset. Network construction, training, and validation were all run in an Anaconda3 virtual environment. All videos required for the experiments were shot with an iPhone 12, and the fish tank measured 30 × 16 × 20 cm (length × width × height).

2.4 Evaluation Indicators

To evaluate the network, precision and recall (also known as sensitivity or true positive rate, TPR) were used as the evaluation metrics. They are calculated as follows, where $T_{p}$, $F_{p}$, and $F_{n}$ denote the numbers of true positives, false positives, and false negatives, respectively:

$$ {\text{Precision}} = \frac{{T_{p} }}{{T_{p} + F_{p} }} $$
$$ {\text{Recall }}\left( {TPR} \right) = \frac{{T_{p} }}{{T_{p} + F_{n} }} $$
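
As a worked illustration of the two formulas, the sketch below computes precision and recall from raw counts. The counts are made-up numbers chosen for easy arithmetic, not results from this study, and the function names are illustrative.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: Tp / (Tp + Fp)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are found: Tp / (Tp + Fn)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed fish.
tp, fp, fn = 90, 10, 30
p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

Returning 0.0 when the denominator is zero is a convention adopted here to avoid a division error; other conventions are possible.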

3 Results

The first round of training used a total of 1200 labeled images. Nine models were trained with different parameters, and two of them were selected for detection testing. However, their accuracy and recall on the test videos were not satisfactory. The amount of training data was therefore increased to 2000 images, after which both accuracy and recall reached 97% (Fig. 5). In the confusion matrix, the true positive (TP) rate also reached 98% (Fig. 6). Furthermore, in videos processed at one frame per second, accuracy and recall reached 100%, and the behavior categories were determined automatically (Fig. 7).

Fig. 5. After the amount of training data was increased, the desired accuracy and recall rates were achieved.

Fig. 6. After the amount of training data was increased, the model detected zebrafish with a success rate of 98%.

Fig. 7. Location detection and behavior recognition on an input video; the table on the left is the behavior recognition record.

4 Conclusion

This study combines object position detection with a time axis to recognize zebrafish behavior, which indicates considerable potential for automated trajectory tracking. With this approach, the movement trajectories of zebrafish can be tracked accurately, yielding data that help us understand zebrafish behavior and ecological habits and that can support scientific research in both laboratory and field settings. The method therefore has broad application prospects and can contribute to the development of zebrafish research and related fields.