Introduction

Machine learning methods are increasingly popular in almost all fields of the life sciences [1]. This is in part due to the advances in deep learning. Deep learning is a form of feature learning with multiple layers of representations. Each layer performs a nonlinear transformation of the previous layer, starting with the raw data, eventually leading to learning of very complex representations [2]. This becomes very useful for the analysis of large datasets as the algorithm does not require an initial feature extraction step and learns from raw data. Applications of this method have exceeded human performance in image recognition, object detection, speech translation, and natural language understanding [2,3,4,5,6]. They are also increasingly being used in image analysis in many fields of medicine [7,8,9].

In the motor neurorehabilitation field, one major challenge is the measurement of movement quality [10,11,12,13]. The current standard clinical methods consist of simple ordinal grading scales (for example, giving 0 points for no performance, 1 for partial performance, and 2 for “normal” performance of the task) [14, 15]. This is problematic in many ways. For example, even a simple reaching task can be performed in an infinite number of ways, and thus yields a rich dataset that needs to be analyzed properly. Kinematic analysis of movement has the potential to give very fine-grained information about the quality of the movement [16••]. Thus, movement kinematics should be measured to assess the quality of movements in neurorehabilitation [17••].

Quantification of Motor Behavior

What makes a series of movements during a task “good” or “normal”? Is it its speed and smoothness, or whether it is “successful” or not? How can the quality of the movement be quantified? Most experienced clinicians can distinguish a “high-quality” movement from a “low-quality” movement (similarly, an experienced sports coach or music teacher can also recognize high-quality movements). A detailed verbal and/or pictorial description of abnormal movements can be useful in identifying critical patterns of movements at various stages of a disease [18, 19]. However, even that kind of detailed description may not be objective and can miss the elements of movement that might be important but are not easily detected.

The current standard clinical motor impairment measurement tools are mostly based on scoring movements on an ordinal scale according to whether a task is completed fully, completed only partially, or not accomplished at all [14, 15]. While these tools can be standardized, validated, and reliable, and allow statistical analysis, such scales are crude and cannot adequately capture movement quality [11, 16], with the implication that they cannot distinguish between “compensatory” mechanisms and a return to more natural movement patterns [13].

For example, during a skilled reach, healthy subjects can sit back in a chair, elevate their arm, extend it, and then grasp with their hand. After stroke, patients often move their trunk up and forwards to overcome their limitations in arm control (i.e., compensation). It is important to distinguish between compensation and true recovery, which can best be done via kinematic analysis [16••] of the motor behavior, as it is “true recovery” that will most accurately reflect biological repair mechanisms [20]. Kinematic analysis provides the most quantitative detail about the quality of movements and can detect compensatory actions [16••]. However, performing kinematic analysis can be challenging. Recent advances in machine learning hold promise to make kinematic analysis more feasible at the bedside.

Machine Learning Methods

Machine learning is the study of algorithms that can learn from data without the need for overt directives or specification, but rather do so by detecting patterns inherent to the data. As defined by Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” [21]. The most commonly used machine learning algorithms can be divided into three categories: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the model is given a set of matched inputs and outputs (the training dataset) and, by iterative optimization of an objective function, learns a function that can predict the output for new input data (the test dataset). In unsupervised learning, the goal is to identify the patterns or structure in a given dataset. There are no input-output pairs, only a single dataset. Based on the similarity metric used, the algorithm identifies commonalities between data points and groups them according to how similar they are. In the third type of algorithm, reinforcement learning, an agent takes actions to maximize a notion of cumulative reward. Machine learning methods can be very useful in detecting inherent structure in complex datasets.
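As a concrete illustration of unsupervised learning, the following minimal sketch groups data points by similarity using k-means clustering with Euclidean distance. K-means is chosen here only for illustration and is not a method named in the text; the toy data (a peak speed and a movement time for six reaches) and the starting centroids are hypothetical.

```python
import math

def kmeans(points, centroids, n_iter=20):
    """Plain k-means: alternate between assigning points and updating centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical toy data: (peak speed in m/s, movement time in s) for six reaches.
reaches = [(1.0, 0.8), (1.1, 0.9), (0.9, 0.85), (0.4, 2.0), (0.5, 2.2), (0.45, 1.9)]

# No labels are supplied; the two groups emerge from similarity alone.
labels, centroids = kmeans(reaches, centroids=[reaches[0], reaches[3]])
print(labels)
```

In this toy case the fast, short reaches and the slow, long reaches end up in separate clusters. In practice the number of clusters and the similarity metric are modeling choices that have to be justified for the data at hand.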

Deep Learning Methods

A subset of machine learning algorithms based on the use of artificial neural networks is called deep learning [2, 22••]. Artificial neural networks consist of layers of nodes (neurons), where each neuron is a mathematical function that takes a weighted sum of the outputs of the nodes in the previous layer, applies a nonlinear function to it, and passes the result to the nodes of the next layer. The input data are provided to the first layer, and the final layer produces the output. There can be many layers (hence “deep”) and many nodes in each layer. When given a dataset consisting of matched input-output data, the algorithm uses iterative optimization to adjust the weights of the connections between nodes until the network learns the input-output relationship (that is, until it predicts the output accurately when given a new input). The performance of deep neural networks improves as the amount of training data increases, so a large set of labeled data is required initially [2]. Training also becomes much faster when graphics processing units (GPUs) are used [4].
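The layer-by-layer computation described above can be sketched in a few lines. This is a minimal illustration of the forward pass only: the weights are hypothetical fixed numbers rather than learned values, tanh is an arbitrary choice of nonlinearity, and the iterative optimization of the weights (training) is not shown.

```python
import math

def dense_layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of the previous layer's
    outputs, adds a bias, and applies a nonlinearity (tanh here)."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the input through every layer in turn and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical weights: one row of weights per node, one bias per node.
layers = [
    ([[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]], [0.0, 0.1, -0.1]),  # 2 inputs -> 3 hidden nodes
    ([[0.7, -0.5, 0.2]], [0.05]),                                # 3 hidden nodes -> 1 output
]

output = forward([1.0, 2.0], layers)
print(output)
```

Training would consist of repeatedly adjusting the numbers in `layers` so that `forward` reproduces the labeled outputs of the training set more and more closely.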

A subtype of neural network architecture, the convolutional neural network (CNN), is particularly well suited to image analysis [23]. CNNs are a special type of artificial neural network that combines three architectural ideas: local receptive fields, shared weights, and sub-sampling (spatial or temporal) [23]. This is very useful in the processing of images. When given a training dataset with input “raw” images and pre-labeled outputs, CNNs perform extraordinarily well in image classification and object detection [2, 24]. Thus, when the labeled outputs of an image are, for example, the joints of the arm, the network can be trained to detect them. In fact, several such network architectures have been developed to detect human body postures [25,26,27,28,29,30,31,32,33,34,35,36]. All of these algorithms, using slightly different network architectures, detect joint positions in two-dimensional (2D) images. To obtain three-dimensional (3D) positions, a multi-camera setup is usually required [37••, 38] (there are, however, algorithms that can estimate 3D poses directly from single 2D videos [39]).
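The three architectural ideas can be made concrete with a toy example. In the sketch below, one small kernel is slid over every local patch of an image (local receptive fields with shared weights), and the resulting feature map is then reduced by max pooling (spatial sub-sampling). The image and the kernel are hypothetical, and the kernel is hand-picked rather than learned, as it would be in a real CNN.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: the same kernel (shared weights) is applied
    to every local patch (local receptive field) of the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Sub-sampling: keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 image with a vertical edge between columns 1 and 2 (0-indexed).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right increases in intensity

feature_map = conv2d(image, edge_kernel)  # 4x4 map, nonzero only at the edge
pooled = max_pool(feature_map)            # 2x2 map after sub-sampling
print(feature_map)
print(pooled)
```

The feature map is nonzero only where the kernel overlaps the edge, and pooling keeps that response while halving the resolution. Pose-estimation networks stack many such learned layers to map an image to joint locations.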

Obtaining 3D Kinematics of Human Upper Extremity Movements Using Deep Learning

Although ordinal scales and 2D kinematics can capture differences between normal and impaired movements [14, 15, 40], natural motor behaviors are performed in 3D and so are best evaluated in 3D [10, 17]. Kinematic analysis in 3D, however, has proven challenging to date. It has mostly been performed by placing reflective markers on select joints (bony structures), by using robotics, or by using wearable sensors [40,41,42]. These approaches are not practical for routine use: they are mostly expensive, time-consuming, and challenging to calibrate or standardize. Thus, there is a need for an inexpensive, user-friendly, marker-less solution. This is where deep learning tools come to the fore: marker-less detection of 3D joint positions during naturalistic behaviors for clinical use has been shown to be possible [37••]. The authors first created a portable two-camera stereo system to record the movements. Then, using one of the available network architectures to detect 2D human joint positions, they analyzed the videos and created 3D models of the subjects performing movements [37••]. This kind of system provides a desirable solution for clinical 3D motion capture. Such systems will almost certainly improve over time as analysis methods become more sophisticated. However, reliability and validity studies are still needed before these systems can be used as standard in clinical trials and clinical settings.

Why Machine Learning Is Needed for 3D Kinematic Analysis

There are two main ways in which machine learning methods can help with the 3D kinematic analysis of human movements. The first is in obtaining the 3D data. As described in the previous section, obtaining 3D kinematic data can be challenging. However, it is important to analyze movements in the most naturalistic way possible, that is, with 3D marker-less kinematics, which requires no external devices or markers attached to the body. Deep learning offers great opportunities for this purpose. Advanced methods for image recognition and object detection can be used to detect the joint positions of human subjects in standard video recordings, and multiple camera views can be combined to obtain the 3D positions of joints [37••].
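How two camera views yield a 3D position can be illustrated with the simplest possible geometry. The sketch below assumes an idealized, rectified stereo pair (two identical pinhole cameras, side by side, with parallel optical axes), in which depth follows from the horizontal disparity of a joint between the two images. This is a textbook simplification and not necessarily the procedure used in [37••]; the focal length, baseline, principal point, and pixel coordinates are hypothetical, and real systems additionally require camera calibration.

```python
def triangulate_rectified(left_px, right_px, focal_px, baseline_m, principal_point):
    """3D point (meters, left-camera frame) from matched pixel coordinates of one
    joint in an idealized rectified stereo pair (assumed setup, see lead-in)."""
    (xl, yl), (xr, _) = left_px, right_px
    cx, cy = principal_point
    disparity = xl - xr  # horizontal shift of the joint between the two views
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth is inversely proportional to disparity
    x = (xl - cx) * z / focal_px           # back-project the pixel to metric coordinates
    y = (yl - cy) * z / focal_px
    return (x, y, z)

# Hypothetical values: 2D wrist detections from a pose-estimation network.
wrist_left = (740.0, 400.0)
wrist_right = (660.0, 400.0)

wrist_3d = triangulate_rectified(
    wrist_left, wrist_right,
    focal_px=800.0, baseline_m=0.2, principal_point=(640.0, 360.0),
)
print(wrist_3d)
```

Repeating this for every detected joint in every video frame gives a 3D trajectory per joint. Because depth is inversely proportional to disparity, small errors in 2D detection translate into larger depth errors for joints far from the cameras.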

The second way in which machine learning methods can be very useful is in the analysis of the 3D kinematic data. Machine learning methods are instrumental in recognizing the patterns inherent to the data, which is very important in 3D kinematic analysis. For example, identifying different patterns of movement is a challenging task even in normal subjects, because movements are complex and highly variable. Moreover, this variability increases after a nervous system injury, especially during the recovery phase. It is important to identify movement patterns that are unique to certain types of injuries and to compare these directly with those of healthy individuals. This would also be critically important in assessing whether a patient’s movements have returned to a “normal” pattern or have evolved into a pattern that is, mathematically, far from normal. Using machine learning to identify different patterns of movement can therefore be very important in assessing movement quality.

Analyzing 3D Kinematic Data: Feature Engineering

Obtaining 3D kinematics is the first challenge in measuring the quality of movements. Once the data are obtained, the next, and likely more important, challenge is to determine how they should be analyzed. The main question is: how does one assess the “quality” of a movement? The conventional approach to kinematic data analysis is to obtain pre-specified kinematic parameters such as peak and average velocities, joint angles, trajectory smoothness, and endpoint error (see a recent systematic review of more than 150 kinematic parameters [43•]). Moreover, one may apply unsupervised machine learning techniques [44•] to these numerous kinematic parameters to identify patterns. Here, the goal would be to find the set of parameters that explains the most variability. This approach can be useful in distinguishing one type of movement (e.g., one with certain joint angles) from another (in which these angles are different). A limitation of this approach is that one can end up with a hard-to-summarize list of kinematic variables, some of which may be important for assessing movement quality while others may reflect compensation. For example, in a reaching task, when arm movement is impaired (with limited shoulder and elbow extension), a compensatory mechanism is first to abduct the shoulder and then to flex the trunk in order to get the hand closer to the target. In this example, if only these two parameters (shoulder abduction and trunk flexion angles) are measured, they will probably show differences between normal and impaired subjects, but they merely indicate a compensatory response and are thus only indirectly related to movement quality.
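A few of the pre-specified parameters mentioned above can be computed directly from 3D joint positions. The sketch below is a minimal illustration that assumes a wrist trajectory sampled at a fixed time step; the function names, the toy coordinates, and the choice of parameters are hypothetical, and smoothness measures are omitted for brevity.

```python
import math

def speeds(trajectory, dt):
    """Speed (m/s) between consecutive 3D samples taken dt seconds apart."""
    return [math.dist(p, q) / dt for p, q in zip(trajectory, trajectory[1:])]

def joint_angle(a, b, c):
    """Angle in degrees at point b formed by a-b-c (e.g., shoulder-elbow-wrist)."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    cos_t = sum(ui * vi for ui, vi in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))  # clamp rounding error

def reach_features(trajectory, target, dt):
    """A small set of conventional kinematic parameters for one reach."""
    s = speeds(trajectory, dt)
    return {
        "peak_speed": max(s),
        "mean_speed": sum(s) / len(s),
        "path_length": sum(v * dt for v in s),
        "endpoint_error": math.dist(trajectory[-1], target),
    }

# Hypothetical wrist positions (meters) sampled every 0.1 s, and a target position.
wrist = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0), (0.4, 0.0, 0.0)]
features = reach_features(wrist, target=(0.4, 0.03, 0.04), dt=0.1)

# Hypothetical shoulder, elbow, and wrist positions for a single frame.
elbow_angle = joint_angle((0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.25, 0.0))
print(features, elbow_angle)
```

Each additional parameter adds another entry to the list, which is exactly how the hard-to-summarize variable sets described above arise.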

Analyzing 3D Kinematic Data: a Holistic Approach

An alternative to the conventional kinematic analysis discussed above is to study the movement as a whole, for example, the trajectory of the end effector during a reach. Impaired reaching after stroke can be compared to a reference population of normal reaches. The main advantage of this approach is that it examines the movement trajectory in its entirety and makes no assumptions about particular kinematic features. This approach first generates a distribution of “normal” movements and then provides a scalar measure of the distance of a given abnormal movement from this reference distribution. This kind of approach has been successfully applied to 2D kinematic data using functional principal component analysis [45•, 46•, 47•], but it now needs to be applied to 3D trajectories. Based on the distances between different trajectories (i.e., movements), one may use unsupervised machine learning techniques to identify clusters of trajectories that are similar to each other and distinct from the trajectories in other clusters. Eventually, this type of analysis may yield clusters such as “normal movements”, “low-quality” movements in a patient, “movements with compensatory actions”, and “truly recovered movements” that are similar to “normal movements”.
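The idea of a scalar distance from a reference distribution can be sketched as follows. Note that this is a deliberately simplified stand-in and not the functional principal component analysis used in the cited 2D studies [45•, 46•, 47•]: it computes the mean and standard deviation of each coordinate at each time sample across the normal reaches, and then takes the root-mean-square z-score of a new trajectory. It assumes that all trajectories have been time-normalized to the same number of samples; the data and function names are hypothetical.

```python
import math
from statistics import mean, stdev

def flatten(trajectory):
    """Turn a list of (x, y, z) samples into one flat list of numbers."""
    return [coord for point in trajectory for coord in point]

def fit_reference(normal_trajectories, min_sd=1e-6):
    """Per-sample mean and standard deviation across the reference reaches."""
    columns = list(zip(*(flatten(t) for t in normal_trajectories)))
    means = [mean(c) for c in columns]
    sds = [max(stdev(c), min_sd) for c in columns]  # floor avoids division by zero
    return means, sds

def distance_from_reference(trajectory, reference):
    """Root-mean-square z-score of a trajectory relative to the reference."""
    means, sds = reference
    z = [(x - m) / s for x, m, s in zip(flatten(trajectory), means, sds)]
    return math.sqrt(sum(v * v for v in z) / len(z))

# Hypothetical reference population: one base reach shifted by small offsets.
base = [(0.0, 0.0, 0.0), (0.2, 0.05, 0.0), (0.4, 0.0, 0.0)]
normal_reaches = [[(x + d, y + d, z + d) for x, y, z in base] for d in (-0.01, 0.0, 0.01)]
reference = fit_reference(normal_reaches)

# Hypothetical impaired reach: the same path displaced by 5 cm along y throughout.
impaired = [(x, y + 0.05, z) for x, y, z in base]

d_typical = distance_from_reference(base, reference)
d_impaired = distance_from_reference(impaired, reference)
print(d_typical, d_impaired)
```

A typical reach scores close to zero, whereas the displaced reach scores well above it. The same pairwise-distance idea, applied between individual trajectories, would provide the input for the clustering step described above.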

Perspectives for the Future

Machine learning algorithms hold strong promise for both the generation and the analysis of 3D kinematic data. 3D kinematic analysis is important for assessing the quality of movements in neurorehabilitation [17••]. Obtaining 3D kinematics can be challenging; however, doing so in the most natural way, i.e., without markers, is the ideal method. Recent advances in deep learning techniques now make this possible [37••]. Furthermore, machine learning algorithms such as unsupervised learning methods can help identify the patterns of movement that patients exhibit throughout their disease course. This kind of analysis likely has the greatest potential for clinical movement science. For example, patients can first be recorded while performing certain tasks. An automated analysis would then identify the pattern of their movements and generate a score or report describing the current clinical state of the patient. This may range from the extent of recovery after a stroke versus the degree to which the patient relies on compensatory strategies, to how advanced a disease (such as Parkinson’s disease, Huntington’s disease, certain genetic ataxias, or other movement disorders) is. Alternatively, this kind of analysis can provide guidance on whether a treatment is effective and, if so, how much it has benefited the patient. Assessing the motor behavioral state of patients with this method is akin to revealing brain structure with magnetic resonance imaging.

Computational models can provide great insight into how the movements are executed [48,49,50,51,52]. These theories could also be very useful in explaining how the movements would be affected after a neural injury. Combining detailed motor behavior analysis with neural data might be helpful in formulating such theories [53,54,55]. The movements are created by coordinated work of the cerebral cortex, subcortical circuitry, spinal cord, nerves, muscles, skeletal structures, and joints. Thus, a theory should be able to explain how this “machinery” works at the algorithmic level to generate the movements. Advanced machine learning algorithms can help identify patterns of movements, which when combined with the neural data, can help form new theories on motor control.

Conclusions

Kinematic analysis, preferably in 3D, is important for measuring the quality of movements. Machine learning algorithms are useful tools for obtaining 3D movement kinematics and then analyzing them. These analyses can help distinguish compensatory actions from behavioral restitution.