Main

Animal behavior studies are fundamental in ecology, ethology, ecotoxicology, neuroscience and many other fields1,2. These studies can be performed in a wide variety of ways, ranging from observational tests in natural conditions to experimental trials in a laboratory environment. To compare laboratory experiments performed by different research groups and with different organisms, it is important that these experiments are implemented under carefully controlled conditions and that they use standardized and repeatable protocols. Therefore, these experiments often use model organisms that have been widely studied, such as zebrafish or rodents, and take place in well-defined environments, so-called ‘arenas’. The size of the arena and the number of animals in it should also be carefully considered and controlled so as not to bias the study’s outcome.

Standardized tests can measure an organism’s activity in different arenas (Fig. 1a). For example, open arenas and plus-mazes are common in anxiety or motivation studies3, T-maze or Y-maze test arenas are commonly used for memory and spatial learning studies4 and three-chambered arenas are used to measure social approach or fear responses5. Other standard tests include water mazes6, elevated mazes7, arenas with light-dark transitions8 and arenas with thermal gradients or hot plates9. Stimuli or distractors can also be added to these arenas to measure behavioral changes in different conditions10. Although these arenas have different layouts, the way the data (video files) are acquired is similar (Fig. 1b).

Fig. 1: Overview of typical setups used in tracking experiments.
figure 1

a, Examples of different arena configurations commonly used for insects, rodents and fish. b, Experiments in laboratory conditions should be carefully designed for front or back illumination depending on the type of organism under study and the arena used. In addition, consider using diffuse illumination if details in the appearance of the animals are important, or backlight illumination to achieve high contrast for motion tracking. Make sure that no reflections from the surrounding environment or the lamp are visible in the recordings.

Regardless of the arena design or organism used, most behavioral studies use video recording and analysis techniques in which the video frame rate, field of view and resolution are adjusted to capture the organism’s motion with the required temporal and spatial resolution11. These experiments can produce a vast amount of image data, which can be very time consuming to analyze, especially if manual annotation is used. In addition, manual annotation can introduce human errors and biases, which can reduce the accuracy of the results12. Therefore, objective and automatic approaches for animal analysis are needed. Software and algorithms that can track and analyze the position of the organism under study are critical for efficient research.

In general, these software packages can help to solve two main issues: detecting the position of animals via tracking algorithms or detecting the positions of the animals’ parts (so-called ‘pose estimation’). Pose estimation typically requires a prior tracking step. Therefore, because tracking is the main focus of this work, we refer only to the tracking software and the relevant tracking stages of pose estimation software. We explicitly exclude pose estimation software such as DeepPoseKit and DeepLabCut that are based on the extraction of images with distinct postures and the manual annotation of body parts to train machine learning models13,14,15. We think that these techniques are more oriented toward the detection of the behavior of a single animal at a fine scale and are not as directly relevant for general tracking applications. In addition, they are sufficiently different in their theoretical approach to warrant a separate analysis in another review.

Tracking applications are available as free or commercial tools. Commercial software usually offer more features and flexibility, especially regarding input video formats and statistical outputs. However, many researchers cannot afford these expensive tools, and the algorithms used by these software are often not transparent to the user. Thus, we focus this analysis on free tracking software. Several new programs, and new versions of existing ones, are published each year. This continuous release of new software makes it challenging to select the appropriate software for a particular experiment or to understand each option’s limitations and differences. For example, many applications use the same pipeline and the same processing techniques, but individual steps in the algorithm can be a limiting factor for some types of experiments. In addition, some programs are limited and specific in what they can do and what they can analyze.

With these considerations, we conducted our analysis using a systematic search of tracking applications on Google Scholar and ResearchGate, with software published from 2008 until 2020 as the inclusion criterion. We based our analysis primarily on the descriptions of the software in their respective papers and the published results from their authors. In addition, we tested some of the analyzed software to gather information that was not available from these sources.

In this review, we compare 28 tracking applications in a comprehensive and accessible way to help guide researchers within the field of behavior studies. In contrast to previous reviews that focused on specific research fields and organisms16,17, explored only specific features such as social interactions18, did not include an in-depth discussion19 or included only a handful of software18,19, we point out the weaknesses of the applications at every step of the pipeline, identifying the problems that behavioral researchers might face when choosing a specific software. In addition, we provide a detailed taxonomy of the main algorithms and their limitations and show the emerging opportunities in animal-tracking software development.

General pipeline of a tracking program

To understand how a tracking application operates, it is essential to decompose the pipeline involved in the tracking process. Starting from video recording and ending in the analysis of animal trajectories, all the tracking applications that we review here use a similar sequence of algorithms. We show the general workflow of these tracking applications (Fig. 2) and describe each step of the pipeline below. Note that some applications do not include all the steps shown in the figure. For example, some applications lack the calibration procedure (Fig. 2b) or the possibility of analyzing trajectories (Fig. 2d).

Fig. 2: Illustration showing the general workflow of a tracking program: how an image frame is processed from video to analysis.
figure 2

a, Image acquisition. b, Calibration. c, Detection. d, Trajectory generation. e, Data analysis. id., identity. See text for full description of the steps.

Image acquisition

This step defines how video images are obtained. There are two main approaches: offline, in which a sequence of images feeds the algorithm from a video file, or online, in which a live camera streams each image. An advantage of the offline mode is that complex algorithms can use extensive computations and access future and past frames to process a given image. The online method uses real-time computations, which requires less storage capacity but increases the computational requirements, because each frame needs to be analyzed at a speed matching the camera frame rate to avoid losing data. As a result, the online method is currently suitable only for relatively simple algorithms.
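As a minimal illustration, the sketch below shows the two acquisition modes in Python using OpenCV; the video file name and the camera index are placeholder assumptions.

```python
# Minimal sketch of the two acquisition modes with OpenCV.
# 'experiment.avi' and camera index 0 are placeholders, not fixed conventions.
import cv2

def frames(source):
    """Yield frames from a video file (offline) or a live camera (online)."""
    cap = cv2.VideoCapture(source)  # accepts a file path or a camera index
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of file or dropped camera stream
                break
            yield frame
    finally:
        cap.release()

# Offline: the whole file is available, so frames can be buffered or revisited.
offline_frames = list(frames("experiment.avi"))

# Online: each frame must be processed faster than the camera delivers it,
# otherwise data are lost; only simple per-frame algorithms can keep up.
for frame in frames(0):
    break  # per-frame processing would go here
```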

Calibration

This step determines the relationship between the camera’s natural units (pixels) and the real-world units. A calibration algorithm that can correct distortions in the images caused by misalignment, projection errors and the camera lens is valuable.

Detection

This step consists of finding the spatial position of the animals in each image of a video sequence. Detection algorithms can include different steps, such as background subtraction to remove stationary features from the environment, segmentation to separate the objects that represent potential animals and a filtering step to remove false detections.

Trajectory generation

This step associates the detected animals in each image with the tracked individuals. In this step, a set of trajectories is generated, each defined by an animal’s positions in the subsequent images of the video. Trajectory generation is a trivial task for a single animal in an open arena without occlusions or reflections. However, in arenas with uneven illumination conditions, multiple targets, possible occlusions or complex backgrounds, trajectory generation requires identity (id.) preservation algorithms to track animals when an occlusion or a crossing occurs. In general, trajectory generation is a very complex task that usually requires manual correction, especially with a large number of animals.

Data analysis

Analysis involves extracting information from each trajectory to obtain the behavioral data required by the experiment. This extracted data consists of statistics related to movement velocity, distance, animal orientation, rate of explored territory or activity rate. Data analysis is usually a post-processing step done after all other tracking tasks are completed.
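As a minimal sketch of this post-processing step, the Python example below computes a few such statistics from a single trajectory, assuming positions have already been calibrated to real-world units; the speed threshold is an illustrative assumption.

```python
# Minimal sketch: basic movement statistics from one calibrated trajectory.
import numpy as np

def movement_stats(xy, fps, speed_threshold=1.0):
    """xy: (n_frames, 2) array of positions in, e.g., mm; fps: frame rate."""
    steps = np.linalg.norm(np.diff(xy, axis=0), axis=1)  # per-frame displacement
    speed = steps * fps                                  # instantaneous speed
    return {
        "total_distance": steps.sum(),
        "mean_speed": speed.mean(),
        "time_moving_s": (speed > speed_threshold).sum() / fps,
    }

xy = np.cumsum(np.random.randn(1000, 2), axis=0)  # synthetic trajectory
print(movement_stats(xy, fps=25))
```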

Results and discussion

To compare how the 28 selected free programs handle the different steps in the tracking pipeline, we analyze and evaluate their main characteristics and functionalities for each step. Because there are no public datasets or standard metrics, we cannot make a direct quantitative comparison of their performance. In addition, these programs have been developed to use different video files, are optimized for specific arenas and are aimed at tracking different animals. Therefore, we point out their most important limitations, study the benefits and drawbacks of the techniques and algorithms, evaluate the areas that require further research and, finally, comment on the most important missing features of the software. We list these features in two tables that follow the tracking pipeline structure (Figs. 3 and 4) and discuss each step below.

Fig. 3: Tracking software comparison: last update, platform, video acquisition, calibration and detection.
figure 3

Red: functionality or characteristic not present or lacking when compared to the best software for the evaluated characteristic. Light gray: limited functionality or characteristic when compared to the best software for the evaluated characteristic. *User-friendliness: +, bad (programming or high-level domain-specific knowledge is required, or the software needs to be compiled, has complex installation requirements or is difficult to use); ++, OK (requires relevant domain-specific knowledge to install and run; it is suitable for laboratories or academia); +++, good (the software is easy to install and run, and it is suitable for domestic use). CNN, convolutional neural network; GPU, graphics processing unit; MCR, MATLAB Compiler Runtime; NA, not available; OS, operating system.

Fig. 4: Tracking software comparison: trajectory generation, data analysis and extra features.
figure 4

Red: functionality or characteristic not present or lacking when compared to the best software for the evaluated characteristic. Light gray: limited functionality or characteristic when compared to the best software for the evaluated characteristic. *Multiple animals: Minimum–maximum number of animals that can be tracked; in parentheses, the time the software is able to preserve their identity according to the results published by their authors.

Video acquisition

Behavioral experiments are often long and require a relatively large sample size; thus, they usually generate a vast amount of video data. The quality of the video data and, therefore, the results extracted from it are heavily dependent on experimental conditions such as illumination, camera position, camera resolution, optical quality and background color. The influence of these parameters on behavioral results has rarely been studied, and most researchers are not familiar with the variables that need to be considered or how to adjust them20.

It is also important to note that parameters such as resolution, codec configuration or frame rate can dramatically affect the computational cost of tracking algorithms. It is recommended that each animal cover at least 50 pixels in a video21,22 and that the frame rate be high enough that the animal’s position overlaps between two consecutive frames. However, most tracking applications are limited by the maximum resolution that they can handle. For example, EthoWatcher23,24 is restricted to a maximum video resolution of 320 × 240 pixels, and MouseMove25 can handle resolutions only up to 640 × 480 pixels22. Thus, these applications are somewhat limited in the field of view and the number of organisms that they can track if one wants to fulfill the requirements presented above. To our knowledge, only anTraX26, Automated Planar Tracking27 and ToxTrac22,28,29 have been successfully tested with resolutions higher than 1,920 × 1,080. Other programs such as Idtracker21,30 and Idtracker.ai31,32 or Ctrax33,34 are, in our experience, also able to handle videos with such resolutions.
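The two rules of thumb above can be checked with simple arithmetic before recording, as sketched below in Python; all numbers (arena size, camera resolution, body size and maximum speed) are illustrative assumptions.

```python
# Back-of-the-envelope check of the recording requirements described above.
arena_mm = 300.0          # arena side length (assumed)
image_px = 1920           # camera resolution along the same side (assumed)
body_mm = (8.0, 3.0)      # animal length and width (assumed)

mm_per_px = arena_mm / image_px
body_area_px = (body_mm[0] / mm_per_px) * (body_mm[1] / mm_per_px)
print(f"~{body_area_px:.0f} pixels per animal (recommended: >= 50)")

# Overlap rule: between consecutive frames the animal should move less than
# its own body length so that its positions in frames t and t+1 overlap.
max_speed_mm_s = 100.0    # assumed maximum speed
min_fps = max_speed_mm_s / body_mm[0]
print(f"minimum frame rate: ~{min_fps:.1f} fps")
```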

The frame rate or the number of frames in a video file is also a computational limiting factor. Because Idtracker21,30 and Idtracker.ai31,32 require high computation times for each frame, even at moderate resolutions, analyzing large datasets with these programs can be very time consuming. Finally, ABC Tracker35,36 can handle only videos shorter than 10 min, restricting the use of this software to short-duration experiments37.

The most important limitation of video acquisition is the processing speed and memory required to run tracking algorithms. This limitation is currently the bottleneck in behavioral experiments. The ability to process high-resolution, high-frame-rate videos in real time (online analysis) would revolutionize behavioral experiments by markedly decreasing the analysis timescale and reducing the need for video data storage. Currently, real-time tracking is possible only with simple algorithms that do not work with complex backgrounds and multiple targets. Of the 28 applications, BioSense38,39, EthoWatcher23,24, MARGO40,41, Multi-Worm Tracker42,43 and SwisTrack44,45 offer both online and offline video acquisition modes, whereas the other programs operate only in offline mode.

Calibration

Camera calibration is a process that allows the user to obtain measurements in real-world coordinates. In addition, calibration can also include removing image distortion and perspective errors, which occur when imaging a 2D surface with a fixed camera; pixel distances do not increase linearly from the center of the image to its edges. Although calibration is a critical feature of animal-tracking software to obtain reliable data, only 9 of the 28 analyzed programs offer a calibration function (see Fig. 3, ‘Calibration’ column).

Calibration techniques are commonly based on the pinhole camera model46, which is used to solve the equations that describe the projection of a point in the real world onto the image plane through the lens of an ideal camera. This model takes into account not only the pixel scale but also the rotation of the camera with respect to the arena, and it also allows the estimation and removal of lens distortion. This technique requires solving a complex equation system and using a calibration pattern. Only SwisTrack44,45, MARGO40,41 and ToxTrac22,28,29 implement this technique.
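A minimal sketch of this workflow in Python with OpenCV, using a chessboard calibration pattern, is shown below; the pattern size and image file names are placeholder assumptions.

```python
# Minimal sketch of pinhole-model calibration from chessboard images.
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the chessboard (assumed)
# Real-world coordinates of the pattern corners (z = 0 plane), in square units.
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for fname in glob.glob("calibration_*.png"):  # placeholder file names
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Solves the projection equations for the camera matrix and lens distortion.
_, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

undistorted = cv2.undistort(gray, K, dist)  # distortion-free image
```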

Most calibration techniques do not take advantage of the pinhole model and use a simple scale transformation instead. A scale transformation converts image coordinates to world coordinates by multiplying them by a constant factor. The programs Animapp47, BEMOVI48,49, BioSense38,39, EthoWatcher23,24, MARGO40,41, MouseMove25, SwisTrack44,45 and ToxTrac22,28,29 use this approach. This technique is more straightforward for the user but far less flexible and accurate because image distortion is not taken into account. Only SwisTrack44,45, ToxTrac22,28,29 and MARGO40,41 offer both calibration systems.

The lack of calibration options in available animal-tracking software is surprising. We believe that this issue illustrates the deep gap between the considerations of the academic community that develops tracking software and the actual laboratory needs of software users.

Detection

The detection step consists of finding the animals of interest in the images. In Fig. 3, we divide the detection step into three different sections.

Background subtraction

Background subtraction algorithms aim to remove features of the environment that can interfere with animal detection. Background subtraction is a key feature when recording animals in a natural setting with dynamic lighting conditions or in aquatic environments where images are changed by reflections, shadows and other artifacts. The two most common types of background subtraction techniques are those based on static images and those based on dynamic models.

Static techniques commonly use a reference frame or video of the background without animals, such as in EthoWatcher23,24, Pathtrackr50,51, MARGO40,41, MouseMove25, SpectralTL52 and SwisTrack44,45, or estimate a background by averaging the frames of the video, such as in anTraX26, BioTracker53, Idtracker21,30 and Idtracker.ai31,32. Static background techniques are easy to implement and are effective when detecting stationary or moving animals, provided that the background objects and illumination do not change during the experiment. Otherwise, these techniques should not be used.
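A minimal sketch of the frame-averaging variant, using a per-pixel median over sampled grayscale frames, is shown below; the threshold value is an illustrative assumption.

```python
# Minimal sketch of a static background model estimated from sampled frames.
import numpy as np

def estimate_background(sampled_frames):
    """sampled_frames: list of (H, W) uint8 grayscale frames from the video."""
    stack = np.stack(sampled_frames, axis=0)
    return np.median(stack, axis=0).astype(np.uint8)  # per-pixel median

def foreground_mask(frame, background, threshold=30):
    """Pixels that differ enough from the background are marked as animals."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8) * 255
```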

Dynamic techniques use moving or decaying average models, such as in Multi-Worm Tracker42,43 and SwisTrack44,45, or Gaussian mixture models, such as in BioSense38,39, Ctrax33,34, ToxTrac22,28,29 and UMATracker54,55. Dynamic techniques can account for illumination changes or other gradual changes in the background. This factor is important when running long-duration experiments in which sunlight is used as the light source. However, these techniques are not reliable when detecting animals that remain static during a substantial part of the experiment.
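As an illustration of a Gaussian mixture background model, OpenCV provides a standard implementation (MOG2); the parameter values below are assumptions, not recommendations from any of the reviewed programs.

```python
# Minimal sketch of a dynamic (Gaussian mixture) background model with OpenCV.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500,         # number of frames used to adapt the model
    varThreshold=16,     # sensitivity of the foreground decision
    detectShadows=False)

# Inside the per-frame loop:
#     mask = subtractor.apply(frame)
# Moving animals stay in the foreground mask, but an animal that freezes is
# gradually absorbed into the background model, the weakness noted above.
```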

The only program that approaches background subtraction with an innovative technique is Mouse Tracking56,57, which uses a deep learning algorithm to separate background pixels from foreground pixels. This strategy is more robust than the other techniques, but it is also complex and requires a massive amount of training data, which makes the method very computationally heavy.

From our experience, there is room for improvement in background subtraction algorithms given that only a few studies have addressed situations with low and/or changing contrast within the background.

Segmentation

Segmentation is performed immediately after background subtraction and usually consists of a technique aimed at separating potential animals in the image and a filtering step that removes possible false positives. The most common segmentation technique is based on so-called ‘thresholding’. Thresholding is a simple segmentation method that uses a reference value to separate image regions of different brightness. Thresholding is based on contrast and requires the animals to appear as bright objects on a dark background or as dark objects on a bright background. The main advantage of thresholding is that it is very computationally efficient. However, it is prone to false positives and false negatives in non-uniform images. Most of the analyzed software use a variation of this technique.
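A minimal sketch of thresholding followed by a size filter is shown below in Python with OpenCV, assuming dark animals on a bright, uniform background; the threshold and area limits are illustrative assumptions.

```python
# Minimal sketch of threshold-based segmentation with a size filter.
import cv2

def segment(gray, thresh=60, min_area=50, max_area=5000):
    """gray: (H, W) uint8 image; returns centroids of candidate animals."""
    # Pixels darker than `thresh` become foreground (value 255).
    _, binary = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY_INV)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    detections = []
    for i in range(1, n):  # label 0 is the background
        area = stats[i, cv2.CC_STAT_AREA]
        if min_area <= area <= max_area:  # drop noise and oversized blobs
            detections.append(tuple(centroids[i]))
    return detections
```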

To increase the sensitivity and robustness of detection, a few techniques locate specific animal features in the image. This approach constrains the thresholding step: it can be applied only to animals with certain body shapes and therefore cannot be used for general tracking. Examples of software using this strategy are Ctrax33,34 and Mouse Tracking56,57, which use an ellipse-fitting strategy to search for elliptically shaped objects. Automated Planar Tracking27 and Fish CnnTracker58, on the other hand, rely on locating the specific shape of the fish head to improve detection.

The only applications that provide some innovations in animal detection are ABC Tracker35,36, Multi-Animal Tracker59,60 and Pathtrackr50,51. These programs use a system that requires the user to mark each animal’s location in a few frames of the video and use a machine learning technique to locate each animal in the remaining frames. This approach provides a more robust detection system than other algorithms. ABC Tracker35,36 successfully uses this approach by applying support vector machines and obtains robust results with a user-friendly experience.
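To make the strategy concrete, the hypothetical sketch below trains a support vector machine on a few ‘user-marked’ example patches and then classifies new patches; this is only an illustration of the general idea, not the actual implementation of any of these programs, and the feature choice (an intensity histogram) is our assumption.

```python
# Hypothetical sketch: SVM-based detection from a few user-marked examples.
import numpy as np
from sklearn.svm import SVC

def patch_features(patch):
    """Trivial feature vector: a normalized intensity histogram."""
    hist, _ = np.histogram(patch, bins=16, range=(0, 255))
    return hist / max(hist.sum(), 1)

# Stand-ins for patches cropped around user clicks (dark animals) and
# random background locations (bright); real patches come from the video.
rng = np.random.default_rng(0)
animal_patches = [rng.integers(0, 80, (16, 16)) for _ in range(20)]
background_patches = [rng.integers(150, 255, (16, 16)) for _ in range(20)]

X = np.array([patch_features(p) for p in animal_patches + background_patches])
y = np.array([1] * 20 + [0] * 20)  # 1 = animal, 0 = background
clf = SVC(kernel="rbf").fit(X, y)

# Score a new candidate patch from a later frame:
print(clf.predict([patch_features(rng.integers(0, 80, (16, 16)))]))  # -> [1]
```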

Species

Most of the software studied are versatile and can be used for different species. Of the 28 tested applications, 16 can be used for tracking any type of animal, whereas 11 are designed for a specific animal type, and 1 requires expanding the software with specific animal models (see column ‘Species’ in Fig. 3).

Trajectory generation

The challenge of trajectory generation is to associate potential targets with previous trajectories (i.e., to associate a set of detections with a group of animals whose trajectories are known up to the current frame). The most common technique to solve this problem uses the Kalman filter, which is a prediction-correction technique61. With this technique, one can estimate an animal’s position in the next frame on the basis of its previous known positions by assuming a constant-speed or a constant-acceleration model. Then, in the next frame, the predicted positions are compared with the actual detections using a Hungarian optimization technique62. The Kalman filter is very efficient computationally, and most tracking programs that we are aware of have implemented a variation of this algorithm or use similar techniques (e.g., particle filters63).
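As a simplified sketch of this predict-then-assign scheme in Python, the example below uses a constant-velocity prediction (standing in for the full Kalman prediction step, without covariance bookkeeping) and SciPy’s Hungarian solver.

```python
# Simplified predict-then-assign step: constant-velocity prediction plus
# Hungarian assignment of predicted positions to new detections.
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(tracks, detections):
    """tracks: list of (position, velocity) 2D arrays; detections: (m, 2)."""
    predicted = np.array([pos + vel for pos, vel in tracks])  # constant speed
    cost = np.linalg.norm(predicted[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # Hungarian optimization
    return list(zip(rows, cols))  # track i is matched to detection j

tracks = [(np.array([0.0, 0.0]), np.array([1.0, 0.0])),
          (np.array([10.0, 0.0]), np.array([-1.0, 0.0]))]
detections = np.array([[9.2, 0.1], [1.1, -0.2]])
print(assign(tracks, detections))  # -> [(0, 1), (1, 0)]
```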

However, the Kalman filter and other similar techniques are not reliable in animal-tracking scenarios with occlusions or multiple interacting animals. The reason for this limitation is that these techniques cannot maintain object identities, because they use only spatial information to match the trajectories. A typical example of a situation in which these techniques fail is when two animals cross paths and change direction after the collision. When this happens, the algorithm loses track of the animals for a brief moment and then searches for their new positions, assuming that they continued moving in the same direction. As a consequence, the animals’ identities are switched.

Preserving the identity of multiple individuals (id. preservation) after an occlusion is currently the main limitation in the trajectory-generation step. The complexity of this problem is illustrated in a 2014 study21, in which Pérez-Escudero and colleagues analyzed a scenario with multiple interacting animals. In this scenario, when correctly solving 99% of all crossings, only 11% of the animals were correctly identified after 2 min of tracking, owing to error propagation22. In summary, preserving the identity is complex and computationally expensive, and only a few offline tracking applications offer major contributions to this field.

Motr64,65 uses an id. preservation strategy that relies on marking the individuals with visually distinctive markers that can be easily identified automatically. This technique is reliable and allows the tracking of animals for long periods of time and in multiple sessions. However, many modern techniques try to avoid placing markers or sensors on the animals, because the markers can be impractical and sometimes interfere with the experiment by affecting animal behavior66,67. Programs such as Automated Planar Tracking27, Tracktor19,68 and UMATracker54,55 offer id. preservation algorithms that, in our opinion, have only marginally improved the basic Kalman strategy and are not robust or reliable in some specific situations (Fig. 4).

Idtracker21,30 and ToxTrac22,28,29 use a strategy based on probabilistic texture analysis to assess the similarity of animals between collisions. Idtracker21,30 was one of the first applications to seriously approach this issue when tracking multiple targets; it uses a complex algorithm based on a Bayesian analysis with a similarity metric to compare the objects’ texture. ToxTrac22,28,29, on the other hand, uses a combination of a similarity analysis and a Hungarian algorithm to manage the identity preservation of multiple targets. This technique builds on top of a very fast tracking algorithm that can handle simultaneous tracking in multiple arenas, resulting in one of the most flexible free tracking tools for trajectory generation.
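The sketch below illustrates the general idea of texture-based identity matching after a crossing; the histogram-correlation signature is a stand-in of our own, not the Bayesian intensity-and-contrast metric these programs actually use.

```python
# Illustrative sketch of re-identifying animals after a crossing by comparing
# texture signatures; the real programs use more elaborate similarity metrics.
import numpy as np

def texture_signature(patch, bins=32):
    """Signature of an animal's appearance: a normalized intensity histogram."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 255), density=True)
    return hist

def best_match(reference_signatures, new_patch):
    """Return the stored identity whose signature best matches the new patch."""
    sig = texture_signature(new_patch)
    scores = [np.dot(sig, ref) / (np.linalg.norm(sig) * np.linalg.norm(ref) + 1e-9)
              for ref in reference_signatures]  # cosine similarity
    return int(np.argmax(scores))
```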

Traditional probabilistic texture analysis is not capable of tracking many targets or of doing so for very long periods, but the approach is very useful in short experiments with small groups of animals, for which body shape and appearance do not change much in comparison to their position or posture. On the basis of the results reported for ToxTrac22,28,29 and Idtracker21,30, we recommend using these programs with groups of up to five animals in videos no longer than 20 min. Between these two techniques, especially for users with limited computational power, we think that ToxTrac22,28,29 is the better alternative because it requires substantially less processing time19,28.

Idtracker.ai31,32 and Fish CnnTracker58 use an approach based on deep learning models called ‘convolutional neural networks’ (CNNs). CNNs are optimized for image classification tasks and are currently among the most powerful image classification techniques, outperforming those based on traditional probabilistic texture analysis. The current drawbacks of these models are that they require specific training data and cannot be used in real-time applications, even with optimized hardware such as graphics processing units (GPUs). Although techniques such as transfer learning can mitigate some of these issues, none of the software in the list use this approach.

Idtracker.ai31,32, which is an extension of Idtracker21,30, combines two different deep learning algorithms: one to detect occlusions and one to identify targets through classification analysis. Idtracker21,30 and Idtracker.ai31,32 offer the most solid identity-preservation techniques available. However, the computational time required to analyze a standard experiment can be as high as 1 h per frame without expensive GPU computing hardware, making these tools impractical to run on a standard laboratory computer. Fish CnnTracker58 offers a less flexible approach that is suitable only for fish tracking and is also less accurate28 than Idtracker.ai31,32. Idtracker.ai31,32 achieves the best accuracy in id. preservation to date: in a set of experiments in which Romero-Ferrero et al.31,32 manually reviewed a significant sample of the crossings of individual animals, Idtracker.ai was able to track up to 100 zebrafish and fruit flies for 10 min or 4 mice for 1 h31. However, in our opinion, the required computation times make Idtracker.ai31,32 unsuitable for most common scenarios: many behavioral laboratories do not have a high-end computer with a state-of-the-art GPU, and even with a modern GPU, processing a 1-h video with Idtracker.ai31,32 can take >1 d of computation in extreme situations32. Despite this limitation, it is important to note that Idtracker.ai31,32 is the best of the reviewed software for analyzing experiments with large groups of unmarked animals.

ABC Tracker35,36 offers a novel approach that uses forward and backward particle filtering. A particle filter is a selection-prediction-measurement solution of similar complexity to the Kalman filter. It uses a set of samples, called particles, to estimate the internal states of dynamical systems from partial observations with random perturbations. This filter is also easy to parallelize and can be more accurate than a Kalman filter69. In our opinion, the main novelty of ABC Tracker35,36 is that it combines this prediction scheme with an algorithm that tracks stationary objects using a local search strategy; in practice, this combination solves most animal interactions when the scenario does not involve complex occlusions in 3D space. ABC Tracker35,36 can currently track up to 30 animals in videos of <10 min37, obtaining very good results in these situations. We also find that ABC Tracker35,36 has the most intuitive interface for analyzing experiments with multiple individuals.
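For illustration, a minimal bootstrap particle filter for a single animal’s 2D position is sketched below; the motion and observation noise values are assumptions.

```python
# Minimal bootstrap particle filter: predict, weight by the detection, resample.
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(particles, detection, motion_std=2.0, obs_std=3.0):
    """particles: (n, 2) array of position hypotheses for one animal."""
    # Prediction: perturb each particle with random motion.
    particles = particles + rng.normal(0, motion_std, particles.shape)
    # Measurement: weight each particle by the likelihood of the detection.
    d2 = np.sum((particles - detection) ** 2, axis=1)
    weights = np.exp(-d2 / (2 * obs_std ** 2))
    weights /= weights.sum()
    # Resampling: keep particles in proportion to their weights.
    keep = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[keep]

particles = rng.normal([0.0, 0.0], 5.0, (500, 2))
for detection in (np.array([1.0, 0.5]), np.array([2.0, 1.0])):
    particles = particle_filter_step(particles, detection)
print(particles.mean(axis=0))  # state estimate, close to the last detection
```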

Finally, anTraX26 combines the use of color tags to mark individuals, CNNs and a graph-based approach. According to the authors, these features should allow the tracking of dozens of marked individuals for hours, if not days. However, given the use of tags and the lack of validation results for specific times, it is unclear how robust the algorithm is compared to other techniques.

In summary, id. preservation algorithms for multiple interacting animals are still insufficient when looking at the computational performance and accuracy that are required for behavioral tests and analysis. So far, id. preservation has not been accomplished when running multiple arenas simultaneously, with animal sizes smaller than 50 pixels or with online image-acquisition techniques.

Data analysis

Nowadays, automatic behavior recognition can be achieved using annotated video datasets to train machine learning classifiers70. Kabra et al.71 proposed an automatic animal behavior annotator that led to the creation of individual and social behavior classifiers for organisms such as mice and larval flies. Robie et al.18 proposed a similar strategy to detect patterns of social interactions. Using these techniques, some researchers have studied complex behaviors such as mating and feeding in mice72 and behavioral responses of larval fish to chemicals73. Despite these developments, only a minority of the analyzed software include functions for complex behavior recognition, and these are always limited to specific behaviors in particular scenarios. For example, Multi-Animal Tracker59,60 detects pirouette movements in worms, Ctrax33,34 detects touch and chase social behaviors in flies, QTrack12,74 also detects specific courtship behaviors in flies and MouseMove25 can quantify unilateral locomotor deficits in mice.

In general, tracking applications are limited to movement, orientation and zone-exploration metrics. Furthermore, only a fraction of the software assessed in this review provides a useful array of these metrics (i.e., BioSense38,39, MouseMove25, Multi-Animal Tracker59,60, Multi-Worm Tracker42,43 and ToxTrac22,28,29). These software provide advanced toolkits that allow non-programmers to analyze parameters such as movement, time spent in selected areas, changes in direction and time spent moving. Importantly, these software packages can provide both individual and population metrics.
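As a minimal sketch of one such metric, the example below computes the time spent in a user-defined rectangular zone for several animals, at both the individual and population level; the zone and frame rate are illustrative assumptions.

```python
# Minimal sketch of a zone-exploration metric from calibrated trajectories.
import numpy as np

def time_in_zone(xy, zone, fps):
    """xy: (n, 2) positions; zone: (xmin, ymin, xmax, ymax) in the same units."""
    xmin, ymin, xmax, ymax = zone
    inside = ((xy[:, 0] >= xmin) & (xy[:, 0] <= xmax) &
              (xy[:, 1] >= ymin) & (xy[:, 1] <= ymax))
    return inside.sum() / fps  # seconds spent inside the zone

trajectories = [np.cumsum(np.random.randn(500, 2), axis=0) for _ in range(3)]
per_animal = [time_in_zone(t, (-5, -5, 5, 5), fps=25) for t in trajectories]
print(per_animal)                         # individual metric
print(sum(per_animal) / len(per_animal))  # population (mean) metric
```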

We conclude that more work is needed to integrate behavior recognition in tracking software, and we believe that algorithms for automatic detection of stress and other complex behaviors would represent a true innovation if included in tracking tools.

Extra features

Some tracking applications implement extra features to facilitate user experience or to add versatility. The most useful extra feature, from our point of view, is the ability to analyze video files that have been split into multiple files. This feature is explicitly supported by Idtracker21,30, Idtracker.ai31,32 and ToxTrac22,28,29. Another important feature is the possibility of processing a batch of video files using the same camera configuration, allowing the user to adjust the parameters only once for a set of experiments. Only Animapp47, Ctrax33,34, Idtracker21,30, MARGO40,41, MouseMove25 and ToxTrac22,28,29 implement this functionality.

The possibility of controlling hardware peripherals such as external sensors, lights or temperature sources is a useful feature implemented in MARGO40,41 and RAT75,76. This feature allows the user to measure parameters such as reaction times to stimuli, to plan long-term tests without supervision and to modify stimuli according to observed behaviors, thereby creating a closed control loop.

UMATracker54,55 and BioTracker53 implement a modular approach to facilitate the development and integration of new processing modules. We think that this modularity can help the community integrate new features into the software. However, taking full advantage of this feature requires a constant level of support and commitment that may not be realistic.

Finally, we highlight that Animapp47 includes an Android application that directly analyzes images using a smartphone camera. With the increased computational power of smartphones, we believe that this type of program can be useful for simple studies performed in a field environment.

Conclusions

Current tracking software need to balance robustness, accuracy and processing speed. As a rule of thumb, higher robustness and accuracy require complex algorithms that reduce processing speed. Therefore, online processing programs use simple algorithms that increase efficiency to reach real-time performance, but with the trade-off of less robustness and accuracy. Offline software can take advantage of more complex processing algorithms but require a computational capacity that is not achievable for all users or suitable for every experiment. We tried to highlight these differences and the contributions of each application to the animal-tracking field. (BioTrack77,78 and FIMTrack79,80 are not discussed specifically in the text but are included in Figs. 3 and 4.)

Our assessment shows that all applications share more or less the same pipeline and that very few of them offer a unique or revolutionary approach, with the use of CNNs for id. preservation being the most relevant new contribution. However, our biggest concern is the lack of usability of recent software. Of the 28 tested programs, only four (Idtracker21,30, Idtracker.ai31,32, ABC Tracker35,36 and ToxTrac22,28,29) provide innovative algorithms, useful features and user-friendly interfaces. We believe that the main reason for this lack of usability is a gap between software design and the software’s intended use in a laboratory. Most software packages are not easy to use, require tuning of several complex parameters for each experiment and do not include important features such as calibration options. In addition, most programs do not offer data analysis tools beyond the most basic ones and cannot extract valuable behavioral metrics. In our opinion, developers of tracking tools must change their paradigm from creating programs that can be published to creating programs that are useful and easy to use.

Finally, we want to draw attention to the lack of complete, open and well-labeled datasets that provide a standard reference for validation and accuracy testing. Such datasets would give researchers an objective tool for a quantitative comparison of tracking programs.