1 Introduction

Softness is an important source of information when interacting with remote or virtual environments (VE) via a haptic human-machine interface. For example, in telesurgery, where the surgeon operates a human-machine interface that transmits his/her actions to a robot performing actions inside the human body, tissue softness can indicate a healthy or non-healthy condition (De Gersem 2005). Humans have no dedicated sense for perceiving softness; instead, inferring an object’s compliance haptically requires the combination and integration of information from different sensory sources such as positional cues, force cues, and tactile information (see Chap. 5 for a deeper analysis of the mechanisms involved in this process). For many technical systems, including the above-mentioned telesurgery setups, tactile cues are not conveyed to the human operator, limiting the information available to infer softness to movement and force. In direct interaction with a physical object, the gain and temporal relation of movement and force are determined by the object’s mechanical impedance. A telepresence or VE system can alter the impedance by, e.g., time delay in the communication channel (Rank et al. 2010a; Ohnishi and Mochizuki 2007; Pressman et al. 2007; Nisky et al. 2008; Hirche and Buss 2007; Rank et al. 2010; Hirche and Buss 2012), which has been found to make participants underestimate stiffness under various circumstances (see also Chaps. 5 and 9). Determining the limits for distortions caused by the technical system that do not affect the operator’s percept is crucial to ensure a realistic interaction experience.

In the past, perceptual discrimination limits have often been characterised using psychophysical measures such as the just noticeable difference (JND) (Gescheider 1985; Weber 1834), allowing a distinction between perceivable and unperceivable differences in a physical quantity such as a force, length, or impedance by mapping each difference to a proportion in perceptual responses. By simplifying the characterisation of the perceptual system to such a static mapping, valuable information about the time-series characteristics of the environment interaction is lost. Temporal features in the interaction force and movement have, however, been shown to significantly influence our perception of haptic properties such as hardness (Lawrence et al. 2000) and mass (Baud-Bovy and Scocchia 2009). Perceptual phenomena such as the haptic masking effects found in Rank et al. (2012) can presumably only be understood by looking at the temporal characteristics of the interaction. In softness perception, the amplitude of probing movements was also found to influence human perceptual performance (Tan et al. 1995), a factor that is not accounted for in a softness JND measure. To the authors’ best knowledge, no conclusive mechanism capturing the combination of movement and force to perceive softness has yet been established.

We propose the use of dynamic haptic perception models, which use differential equations to combine movement and force information, instead of static perception models such as the JND. In this way, the impact of interaction characteristics on the perceptual judgment can be explicitly modelled. Looking at softness perception from a system theoretic point of view, we propose three plausible mechanisms which are capable of discriminating between different soft environments. The detection thresholds predicted by these models vary with the specific interaction movement with the environment. Based on the results from three psychophysical experiments, a dynamic state observer model is identified as a better predictor than either a comparison of identified time delay values or a verification against an internal inverse model of the body and environment.

Theoretical model candidates from system theory, predicting perception thresholds for temporal misalignment between limb movement and force feedback, are introduced in Sect. 8.2. Experimental data from three psychophysical experiments on the perception of time delay in soft, damped and inertial environments are presented in Sect. 8.4, and predictions from the parameterised models are discussed. The chapter ends with a conclusion on the impact of the results on the design of telepresence and VE systems.

2 Perception Model Representations

Perceiving softness generally requires a combination of force and movement cues into a unified percept. Accounting for human perception characteristics in the design, control and evaluation of systems for human-machine interaction such as telepresence or VE systems requires the formulation of quantitative perception models capturing haptic discrimination abilities. The models proposed here are built upon the assumption of an existing decision criterion \(\delta \). This measure is used to determine which of two response alternatives to choose and can be found in well-established perceptual modelling techniques, e.g. signal detection theory (Macmillan and Creelman 2005) and diffusion models (Ratcliff 1978; Pleskac and Busemeyer 2010).

The perceptual output \(y_{p}\) at a given response time \(t_{r}\) is determined by

$$\begin{aligned} y_p(t_{r})= {\left\{ \begin{array}{ll} \text {``different'' } &{} \text {if } \exists t \in [0,\, t_{r}]:\,|\delta (\cdot )|>\varepsilon ,\\ \text {``same'' } &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$
(8.1)

where \(\varepsilon \) is referred to as a decision threshold and the stimulus onset time is set to \(t=0\).

Remark 1 This formulation of the perceptual process accounts for the fact that in the context of human-robot interaction such as telerobotics, a perceptual decision may be held back rather than reported as soon as it has been made. In contrast to Ratcliff (1978) and Pleskac and Busemeyer (2010), the formulation of perception models in Eq. (8.1) thus accounts for all decisions made between the stimulus onset and time \(t_{r}\).
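As a minimal sketch of the decision rule in Eq. (8.1), the following Python function returns "different" once the decision criterion has exceeded the threshold at any time up to the response time. The function name `perceptual_output`, the sampled representation of \(\delta \), and the example signal are illustrative assumptions and not part of the original model description.

```python
import numpy as np

def perceptual_output(delta, eps, t, t_r):
    """Eq. (8.1): report "different" if |delta| exceeded eps anywhere in [0, t_r]."""
    delta = np.asarray(delta, dtype=float)
    t = np.asarray(t, dtype=float)
    window = (t >= 0.0) & (t <= t_r)      # stimulus onset is t = 0
    exceeded = np.any(np.abs(delta[window]) > eps)
    return "different" if exceeded else "same"

# Hypothetical decision criterion that peaks at t = 0.5 s (illustration only).
t = np.linspace(0.0, 2.0, 201)
delta = 0.5 * np.sin(np.pi * t)

early = perceptual_output(delta, eps=0.4, t=t, t_r=0.1)   # criterion still small
late = perceptual_output(delta, eps=0.4, t=t, t_r=1.0)    # peak lies inside [0, t_r]
```

Because the rule looks at the whole interval, a threshold crossing that has already passed still determines the response at a later \(t_{r}\), which is the property stated in Remark 1.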

In most existing computational haptic perception models, \(\delta (\cdot )\) is a static function of the sensory input. As an example from softness perception, a static perception model for discriminating two environments with stiffness coefficients \(k_{1}\) and \(k_{2}\) could be formulated by setting \(\delta (\cdot ) = k_{1}-k_{2}\) and setting the threshold value to the JND for stiffness \(\varepsilon =\text {JND}_{k}\). As a consequence, temporal aspects of the interaction such as movement speed, frequency, or interaction duration remain unmodelled. Instead, we use a dynamic modelling approach to capture the decision criterion. We will limit our considerations to ordinary differential equations.

In the following, three perception modelling candidates for the decision criterion \(\delta (\cdot )\) in (8.1) are proposed. The main inspiration for these models is drawn from considerations of how one would approach the detection of differences in a haptic environment from a system theoretic point of view. Support for the mechanism candidates in terms of neurophysiological and psychophysical evidence is also reported.

2.1 Sensorimotor Control Model

The different modelling approaches are discussed using a simplified dynamic model of the human motor apparatus considering only one arm, which is a common simplification throughout the literature (Gil et al. 2004; Yokokohji and Yoshikawa 1994). The state vector \(\mathbf {x}_{h}\) consists of the hand position \(x_{h}\) and velocity \(\dot{x}_{h}\). A block diagram of the arm, controlled to follow a specific state trajectory, is depicted in Fig.  8.1. Note that we make the modelling variables’ dependency on time only implicit in favour of a clear presentation. The control mechanism \(\Phi _{\textit{con}}(\mathbf {x}_{h},\mathbf {x}_{\textit{des}})\) determines the forces which must be applied to the limb to follow a desired state trajectory \(\mathbf {x}_{\textit{des}}\). The arm with its mechanical properties \(\varvec{\Psi }_{\textit{body}}(\mathbf {x}_{h},\dot{\mathbf {x}}_{h},f_{\textit{res}})\), linearly approximated by a mass-damper system

$$\begin{aligned} \ddot{x}_{h} = -\frac{1}{m_{h}}(f_{\textit{res}}-d_{h}\dot{x}_{h}) \end{aligned}$$

with human-like parameters (\(m_h=2\) kg, \(d_h=2\) Ns/m from Yokokohji and Yoshikawa (1994)) is in contact with the environment. The environment dynamics are contained in \(\Phi _{\textit{env}}(\mathbf {x}_{h},\dot{\mathbf {x}}_{h})\) and react to the state \(\mathbf {x}_{h}\) with a force \(f_{h}\). This feedback acts back on the limb and influences the force moving the limb.

Physiologically, humans are equipped with multiple haptic sensors (Hale and Stanney 2004), and we will focus on sensors for the muscle force \(f_{m}\), limb position \(x_{h}\) and velocity \(\dot{x}_{h}\). Dynamics and noise in the sensory estimates are not considered explicitly, but implicitly respected in the choice of perceptual thresholds \(\varepsilon \ne 0\).

Fig. 8.1 The human arm is abstracted as a state-controlled single joint

2.2 Feature Comparison

A straightforward way of discriminating between two soft haptic environments is comparing their characteristic parameters \(\theta \). Such parameters include the stiffness coefficient, or, in case a telepresence system including delayed communication is involved, the time delay between movement and force feedback. To be able to compare the two environments on a parameter basis, a system identification technique suitable to capture this specific property must be used, leading to estimates \(\hat{\theta }_{1}\), \(\hat{\theta }_{2}\). Time delay between movement and force could well be identified using an estimate of the covariance between a position input and a force output signal (Ljung 1999). Acknowledging the fundamental assumption of a decision criterion and threshold for perceptual mechanisms in Eq. (8.1), we propose

$$\begin{aligned} y_p(t_{r}) = {\left\{ \begin{array}{ll} \text {``different'' } &{} \text {if } |\hat{\theta }_{1}-\hat{\theta }_{2}|>\theta _{thresh}\\ \text {``same'' } &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(8.2)

In studies on monkeys, correlation techniques as a normalised form of covariance methods have been found to be good at explaining brain activity in specific brain regions associated with perception, if the animal attends to a certain visual stimulus (Niebur and Koch 1994). This could be taken as evidence for the existence of a neural substrate for performing correlations efficiently in the brain. Correlation mechanisms can furthermore explain humans’ performance in detecting temporal differences in audio-visual signals (Fujisaki and Nishida 2005).
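The feature comparison of Eq. (8.2) can be sketched for the time delay parameter as follows. The delay is estimated as the lag that maximises the correlation coefficient between position and force, which is one possible correlation technique and not necessarily the estimator used by the authors. The function names, the two-sinusoid test movement, the spring constant and the threshold value are all assumptions made for illustration.

```python
import numpy as np

def estimate_delay(position, force, dt, max_lag):
    """Return the lag (in seconds) that maximises the normalised correlation."""
    n = len(position)
    lags = np.arange(max_lag + 1)
    # Correlation coefficient between the force advanced by k samples and the position.
    corr = [np.corrcoef(force[k:], position[:n - k])[0, 1] for k in lags]
    return lags[int(np.argmax(corr))] * dt

def feature_comparison(theta_1, theta_2, theta_thresh):
    """Eq. (8.2): compare two parameter estimates against a threshold."""
    return "different" if abs(theta_1 - theta_2) > theta_thresh else "same"

def move(tau):
    """Hypothetical exploration movement in metres (two superimposed sinusoids)."""
    return 0.1 * (np.sin(2 * np.pi * 0.7 * tau) + 0.5 * np.sin(2 * np.pi * 1.9 * tau))

dt = 0.001                                   # sampling interval in s (assumed)
t = np.arange(0.0, 5.0, dt)
k_e = 100.0                                  # spring constant in N/m (assumed)

x = move(t)
f_reference = k_e * move(t)                  # undelayed spring
f_comparison = k_e * move(t - 0.040)         # spring force delayed by 40 ms

T1 = estimate_delay(x, f_reference, dt, max_lag=200)
T2 = estimate_delay(x, f_comparison, dt, max_lag=200)
answer = feature_comparison(T1, T2, theta_thresh=0.020)  # threshold is hypothetical
```

In this noise-free setting the estimate does not depend on movement amplitude or stiffness, which mirrors the constant threshold that the feature comparison model predicts.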

Remark 2 The classical JND measure is defined in the dimension of the physical quantity under consideration, that is, the haptic environment property \(\theta \) (Weber 1834; Jones and Hunter 1990). In that sense, classical perception models are contained in the feature comparison model proposed here, and the predictions from the feature comparison model are seen as a baseline for the other dynamic prediction models.

2.3 Inverse Model Verification

An alternative approach to judge whether two soft environments have the same or different properties is the use of a model verification technique. In system identification, verification is a standard procedure to check whether an identified system has good generalisation capabilities (Åström and Eykhoff 1971). First, a haptic environment model is built by exploring one stimulus and identifying its parameters by using, e.g., a covariance method as proposed in Sect. 8.2.2. Second, during the exploration of another haptic environment, sensory information is compared to a prediction of the sensory output, given the previously built internal representation of the environment dynamics. If prediction and sensory evidence match, the environments are considered the same. If there is a mismatch between the prediction and feedback, the two environments are classified as different. Diverse verification methods are utilised in various technical applications, differing in the criterion which is taken into consideration for classification.

One possibility for a perception model as proposed in Eq. (8.1) can be formulated based on the force required to move along a specific trajectory. The model

$$\begin{aligned} y_p(t_{r}) = {\left\{ \begin{array}{ll} \text {``different'' } &{} \text {if } \exists t\in [0,\,t_{r}]:\Delta f_{m}(t)>\Delta f_{thresh}\\ \text {``same'' } &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$
(8.3)

is based on the force difference \(\Delta f_{m}(t) = |\hat{f}_{m}(t)-f_{m}(t)|\), where \(f_{m}(t)\) is the effective force from all muscles acting on the limb and \(\hat{f}_{m}(t)\) is an estimate of the expected force given the previously identified haptic environment. The decision threshold is denoted \(\Delta f_{thresh}\) in this model. The main difference from the feature comparison model proposed in Sect. 8.2.2 is that the quantity used to dissociate a target from a reference environment is not the experimentally varied variable, e.g. stiffness or the communication time delay in a teleoperation system, but the deviation in force between the two conditions.

In addition to Eq. (8.3), a perception model based on Weber’s Law is proposed, respecting the fact that force discrimination levels have been found to depend linearly on the force level (Tan et al. 1994). A difference between two soft environments can be perceived if the fraction of force error and force magnitude exceeds the Weber fraction \(w\):

$$\begin{aligned} y_p(t_{r}) = {\left\{ \begin{array}{ll} \text {``different'' } &{} \text {if } \exists t\in [0,t_{r}]:\;{\Delta f_{m}(t)}/{f_{m}(t)}>w\\ \text {``same'' } &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(8.4)

Reconstructing the motor action from a measurement of the state \(\mathbf {x}_h(t)\) requires a dynamic model containing the body and the environment impedance. In motor control literature, a model predicting motor actions (force) from an observation of the body state \(\mathbf {x}_{h}\) (movement and position) is referred to as an inverse model. There is experimental evidence for the usage of inverse dynamic models in sensorimotor control by predicting the motor actions from the sensed state of the body (Kawato 1999; Shidara et al. 1993). Similarly, an inverse model \(\hat{f}_{m,\textit{res}}=\Phi _{\textit{inv}}(\mathbf {x}_{h})\) capturing dynamics of the arm, sensors and the environment can potentially play a role in perception as well. A stiffness estimation method on the basis of maximum force comparisons between conditions (Tan et al. 1995; Pressman et al. 2007) can be seen as a representative of a perception model using inverse dynamics. Model verifications are closely related to the prediction error method (PEM) which utilises the error between model predictions and sensory information to enhance identification results. This is a well-established technique in system identification (Ljung 1999) and a PEM algorithm has been found to explain the anticipatory perception of sensory events in a plausible way (Szirtes et al. 2005).

2.4 State Observer Model Verification

As an alternative to the exerted muscle force \(f_{m}(t)\) as a decision criterion for distinguishing two soft haptic environments, perceptual judgments can be based on the body state \(\mathbf {x}_{h}(t)\). In the proposed model of the arm in Fig. 8.1, consisting of one limb performing a unidirectional movement, \(\mathbf {x}_{h}(t)\) consists of the limb position \(x_{h}(t)\) and velocity \(\dot{x}_{h}(t)\). The resulting haptic perception model is given by

$$\begin{aligned} y_p(t_{r}) = {\left\{ \begin{array}{ll} \text {``different'' } &{} \text {if } \exists t\in [0, t_{r}]:\;|\hat{\mathbf {x}}_h(t)-\mathbf {x}_h(t)|>\Delta \mathbf {x}_{thresh}\\ \text {``same'' } &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$
(8.5)

where \(\hat{\mathbf {x}}_{h}(t)\) is a prediction of the body state, given previously experienced environment dynamics.

A state observer can predict the body state from observations of the motor input and sensory measurements, utilising a forward model of the body and environment dynamics. A state observer with a linear dynamic model is depicted in Fig. 8.2. The estimated dynamics of the limb and environment are contained in the state function  \({\hat{\varvec{\Psi }}}_{\textit{body/env}}(\hat{\mathbf {x}}_{h},\dot{\hat{\mathbf {x}}}_{h},f_{m})\). Generally, an output function is required to transform states into measurable outputs; however, since humans possess sensors for both position and velocity, no transformation is required here. Comparing the predictions to the actual sensory observations leads to a prediction error which is weighted with a matrix function \(K(\mathbf {x}_{h}-\hat{\mathbf {x}}_{h})\) and used to correct future estimates of the body states. In the following, we only consider linear body and environment models and simplify \(K(\cdot )\) to a linear matrix multiplication \(K(\cdot )=K\). In case \({\hat{\varvec{\Psi }}}_{\textit{body/env}}(\hat{\mathbf {x}}_{h},f_{m})\) captures the body and environment characteristics exactly and the initial state estimate \(\hat{\mathbf {x}}_{h}(0)\) is correct, the state estimate over time \(\hat{\mathbf {x}}_h(t)\) equals the real state \(\mathbf {x}_h(t)\). If the internal prediction model deviates from the real dynamics because the environment in the second stimulus differs from the comparison condition, the estimated state differs from the real state.

Fig. 8.2 A block diagram of a state observer

In the case of white noise affecting the output measurement and states, the noise-optimal choice for \(K\) is the Kalman Gain. This choice turns the observer into a stationary Kalman filter. Kalman filters have been found to describe sensorimotor control processes well in various situations such as the estimation of hand position (Beers et al. 1999) or posture (Kuo 1995). This is a motivation to consider such a structure as a candidate for perceptual processes as well.

3 Model-Guided Experimental Design

A percept of a soft environment can be corrupted in various ways. On the one hand, differences in the stiffness coefficient alter the force feedback magnitude under constant exploration movement; on the other hand, temporal distortions such as time delay between movement and force feedback are capable of completely changing the impression of the environment. Although time delay in haptic feedback is not a natural phenomenon in everyday-life haptic interactions, it is a problem in the operation of telepresence systems over large distances (Peer et al. 2008), e.g., space (Sheridan 1993). We focus on the investigation of distortions in the haptic combination process due to temporal faults for two reasons. First, while it is known that time delay between movement and force has a direct impact on the displayed softness (Hirche et al. 2005; Hirche and Buss 2012), the perception of time delay in haptic interaction with an environment is not yet sufficiently understood, although such knowledge is helpful to provide guidelines and specifications for haptic telepresence systems. Second, time delays are well suited to dissociate between the three perception model candidates, as will be detailed in the following.

Experimental data from three published experiments on time delay detection in force feedback is used to evaluate the prediction capabilities of the proposed perception model candidates. In Rank et al. (2010), soft environments are explored with sinusoidal exploration movements. Amplitude and frequency as well as the stiffness coefficient are varied. The applicability of the models beyond soft environments is also examined by testing their capability to predict perceptual thresholds in damped and inertial environments, using data from Rank et al. (2010a).

3.1 Model-Guided Stimulus Selection

The prediction of perceptual thresholds based on the models introduced in Sects. 8.2.2–8.2.4 depends on a multitude of factors, e.g., the interaction movement speed, frequency, and amplitude. Given this high-dimensional parameter space, a fully crossed experimental design with conditions sampled over a range of stimuli is inappropriate. Instead, we choose a model-based selection of experimental stimuli based on predictions for the discrimination threshold of time delay in force feedback from the environment using a linear spring with spring constant \(k_{e}\). Without loss of generality, the equilibrium point of the spring is set to the position \(x_h=0\). The predicted perception limits of time delay on the basis of the matched filter model and the state observer model depend on the interaction movement \(x_{h}(t)\) with the haptic environment. A sinusoidal movement

$$\begin{aligned} x_{h}(t) = A\sin (\omega t) \end{aligned}$$
(8.6)

with amplitude \(A\) and frequency \(\omega \) is chosen as the interaction pattern since it is easy to understand and perform for participants in a psychophysical experiment. The predictions following from the choice of environment and interaction movement are discussed below.

The force feedback from a soft environment with time delay \(T_d\) is expressed as

$$\begin{aligned} f_{h}(t) = k_{e}x_h(t-T_d). \end{aligned}$$
(8.7)

Respecting the dynamical model of the human arm in contact with the environment illustrated in Fig. 8.1, the overall motor action that is required to move the limb in contact with the environment is

$$\begin{aligned} f_{m}(t) = m_h\ddot{x}_h(t) + d_h\dot{x}_h(t) + k_{e}x_h(t-T_d). \end{aligned}$$
(8.8)

Without loss of generality, we consider the case that the non-delayed soft environment is explored first. The delayed feedback is perceived second, and the sensory evidence from this exploration is compared to predictions from the undelayed stiffness. In addition, we assume that humans have good knowledge of their body dynamics (inertia \(m_{h}\) and damping \(d_{h}\)), and that the estimate \(\hat{k}_{e}\) of the environment stiffness coefficient \(k_{e}\) obtained from the non-delayed stimulus exploration is sufficiently accurate.

The inverse model verification is based on a comparison between the sensory observation of the resulting muscular force \(f_{m}(t)\) and the predicted force feedback \(\hat{f}_{m}(t)\). Consequently, \(\hat{f}_{m}(t)\) is determined by

$$\begin{aligned} \hat{f}_{m}(t) = m_h\ddot{x}_h(t) + d_h\dot{x}_h(t) + \hat{k}_{e}x_h(t). \end{aligned}$$
(8.9)

Setting \(\hat{k}_{e}\approx k_{e}\) and substituting \(x_h(t)\) from Eq. (8.6), the error between model prediction and sensory feedback follows in agreement with Eq. (8.3) as

$$\begin{aligned} \Delta f_{m}(t) = |k_{e}A(\sin (\omega t)- \sin (\omega (t-T_d)))|. \end{aligned}$$
(8.10)

Model verification using a state observer relies on a prediction of the body state

$$\begin{aligned} \hat{\mathbf {x}}_h(t) = \begin{bmatrix} \hat{x}_h(t)&\dot{\hat{x}}_h(t) \end{bmatrix}^T, \end{aligned}$$
(8.11)

utilising a forward model of body and non-delayed environment dynamics. The state prediction is the solution of the set of differential equations, expressed in matrix form as

$$\begin{aligned} \begin{bmatrix} \hat{\dot{x}}_h(t) \\ \hat{\ddot{x}}_h(t) \end{bmatrix} = \begin{bmatrix} 0&1\\ -\frac{k_{e}}{m_h}&-\frac{d_h}{m_h} \end{bmatrix} \begin{bmatrix} \hat{x}_h(t) \\ \hat{\dot{x}}_h(t) \end{bmatrix} + \begin{bmatrix} 0\\ \frac{1}{m_h} \end{bmatrix}\, f_{m,res}(t) + \begin{bmatrix} k_{11}&k_{12} \\ k_{21}&k_{22} \end{bmatrix} \left( \begin{bmatrix} x_h(t) \\ \dot{x}_h(t) \end{bmatrix} -\begin{bmatrix} \hat{x}_h(t) \\ \hat{\dot{x}}_h(t) \end{bmatrix} \right) . \end{aligned}$$
(8.12)

For a difference to be detectable, the discrepancy in the decision variable must exceed a threshold. To determine the time delay between movement and force feedback at which this happens, the maximum deviation between prediction and sensory observation must be computed. For the inverse model, the discrepancy is at its maximum at time \(\frac{1}{2}T_d\) after the zero-crossings of the predicted (non-delayed) force reference, which is expressed by

$$\begin{aligned} \Delta f_{m,max} = \Delta f_{m}(t)|_{t=\frac{1}{2}T_d} = 2k_{e}A\sin \left( \frac{1}{2}\omega T_d\right) \approx k_{e}A\omega T_d. \end{aligned}$$
(8.13)

The last step in the calculation holds for small values of \(\omega T_d\), which is a valid assumption for the practically relevant range of time delays in telepresence applications and the movement frequencies considered in the experiments.
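The step from Eq. (8.10) to Eq. (8.13) can be made explicit with the sum-to-product identity for sines. The following worked derivation assumes the sinusoidal movement of Eq. (8.6) and \(0\le \omega T_d\le 2\pi \); it introduces no quantities beyond those already defined.

$$\begin{aligned} \sin (\omega t)-\sin (\omega (t-T_d))&= 2\cos \left( \omega \left( t-\tfrac{1}{2}T_d\right) \right) \sin \left( \tfrac{1}{2}\omega T_d\right) ,\\ \Delta f_{m}(t)&= 2k_{e}A\left| \cos \left( \omega \left( t-\tfrac{1}{2}T_d\right) \right) \right| \sin \left( \tfrac{1}{2}\omega T_d\right) ,\\ \Delta f_{m,max}&= 2k_{e}A\sin \left( \tfrac{1}{2}\omega T_d\right) \quad \text {at } t=\tfrac{1}{2}T_d+\tfrac{n\pi }{\omega },\; n\in \mathbb {Z},\\&\approx k_{e}A\omega T_d \quad \text {for } \omega T_d\ll 1. \end{aligned}$$

The cosine factor reaches magnitude one exactly \(\frac{1}{2}T_d\) after each zero-crossing of \(\sin (\omega t)\), which is where the maximum is attained.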

Similarly, the state observation error can be computed by solving Eq. (8.12) for the specific interaction movement from Eq. (8.6) and the motor action from (8.8). In contrast to the solution for the maximum force error in Eq. (8.13), the maximum state error depends on the entries of the feedback matrix \(K\). These values are unknown.

Thus, the experimental conditions are optimized for the inverse model, and the prediction capabilities of the state observer model are tested post-hoc with a feedback matrix \(K\) that is identified based on experimental data.
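To illustrate how a state observation error arises, the following sketch integrates the observer of Eq. (8.12) with an explicit Euler scheme while the arm follows the sinusoid of Eq. (8.6) against a delayed spring. The feedback matrix `K`, the movement and stiffness values, the step size, and the assumption that the sinusoid has been running since before \(t=0\) are all hypothetical choices for illustration; only \(m_h\) and \(d_h\) are taken from the text.

```python
import numpy as np

def max_state_error(T_d, K, A=0.1, omega=2 * np.pi, k_e=100.0,
                    m_h=2.0, d_h=2.0, dt=2e-4, t_end=3.0):
    """Largest |x_h - x_hat| per state component for a spring delayed by T_d."""
    # Forward model of arm plus undelayed spring; the second row follows from
    # solving Eq. (8.9) for the acceleration.
    A_mat = np.array([[0.0, 1.0], [-k_e / m_h, -d_h / m_h]])
    B = np.array([0.0, 1.0 / m_h])
    x_hat = np.array([0.0, A * omega])       # correct initial state estimate
    worst = np.zeros(2)
    for t in np.arange(0.0, t_end, dt):
        # Measured state: position and velocity of the sinusoid, Eq. (8.6).
        x = np.array([A * np.sin(omega * t), A * omega * np.cos(omega * t)])
        acc = -A * omega**2 * np.sin(omega * t)
        # Motor action needed against the delayed spring, Eq. (8.8).
        f_m = m_h * acc + d_h * x[1] + k_e * A * np.sin(omega * (t - T_d))
        worst = np.maximum(worst, np.abs(x - x_hat))
        # Explicit Euler step of the observer, Eq. (8.12).
        x_hat = x_hat + dt * (A_mat @ x_hat + B * f_m + K @ (x - x_hat))
    return worst

K = np.diag([20.0, 20.0])                    # hypothetical feedback matrix
err_same = max_state_error(T_d=0.0, K=K)     # environment matches the forward model
err_delayed = max_state_error(T_d=0.050, K=K)
```

With a matching environment the remaining error is only the discretisation error of the Euler scheme, whereas a delayed spring produces a clearly larger error in both position and velocity. Changing `K` changes the size of that error, which is why the predicted threshold cannot be computed before `K` is identified.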

Keeping the time delay \(T_d\) at a constant level, the maximum force error as the prediction criterion for time delay detection is higher with a greater amplitude \(A\) and/or a higher movement frequency \(\omega \). Conversely, the time delay needed to exceed a hypothesised perception threshold on the force error is smaller with larger \(A\) and/or higher \(\omega \). Notably, the maximum force error as introduced in Eq. (8.13) depends on the product of \(A\) and \(\omega \), predicting that choosing values of \(A\) and \(\omega \) such that their product is constant (\(A{\omega } = \text {const.}\)) results in the same detection threshold. For testing the influence of movement amplitude, frequency and their product, a systematic experimental design with three levels for \(A\), three levels for \(\omega \) and three levels of \(A\omega \) as depicted in Fig. 8.3 is chosen.
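The prediction that equal products \(A\omega \) give equal detection thresholds can be checked numerically by inverting Eq. (8.13). In the sketch below, the function names and all numerical values (stiffness, force threshold, movement parameters) are hypothetical and chosen only for illustration.

```python
import numpy as np

def max_force_error(T_d, k_e, A, omega):
    """Exact maximum force error of Eq. (8.13)."""
    return 2.0 * k_e * A * np.sin(0.5 * omega * T_d)

def delay_threshold(df_thresh, k_e, A, omega):
    """Time delay at which the maximum force error reaches df_thresh."""
    return 2.0 / omega * np.arcsin(df_thresh / (2.0 * k_e * A))

k_e, df_thresh = 100.0, 1.0        # N/m and N, both hypothetical
pairs = [(0.10, 2 * np.pi * 0.5),  # (A in m, omega in rad/s), same product A * omega
         (0.05, 2 * np.pi * 1.0)]
thresholds = [delay_threshold(df_thresh, k_e, A, w) for A, w in pairs]

# Small-angle approximation of Eq. (8.13): T_d = df_thresh / (k_e * A * omega).
small_angle = df_thresh / (k_e * pairs[0][0] * pairs[0][1])
```

Both movement patterns yield nearly identical delay thresholds, and both lie close to the small-angle value, as long as \(\omega T_d\) stays small.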

Fig. 8.3 Six pairs of movement amplitudes and frequencies were chosen in such a way that \(\omega \), \(A\) and their product \(A\omega \) have three different levels respectively

Another factor in the computation of the maximum force error according to Eq. (8.13) is the stiffness coefficient \(k_{e}\). The perception model predicts a lower time delay detection threshold in the case where stiffness is higher.

In addition to a soft environment, the prediction capabilities of these models in damping and inertia are explored in order to test a generalisation to other experimental conditions as well. Stimuli with a damping \(d_{e}\), and an inertia \(m_{e}\) satisfy

$$\begin{aligned} \left. \frac{\Delta f_{m,max}}{f_{m}(t)|_{\Delta f_{m}(t)=\Delta f_{m,max}}}\right| _{d_{e}} = \left. \frac{\Delta f_{m,max}}{f_{m}(t)|_{\Delta f_{m}(t)=\Delta f_{m,max}}}\right| _{m_{e}}, \end{aligned}$$

such that the Weber fraction is equal in both conditions, resulting in a constant time delay detection threshold in the case of a perception criterion based on Weber’s Law.

4 Experimental Investigations

Experimental data from three studies is analysed here. From Rank et al. (2010a), time delay detection thresholds for sinusoidal movements with parameters as depicted in Fig. 8.3 are taken. In addition, detection thresholds for three levels of stiffness under two different movement patterns are taken from Rank et al. (2010a). Third, the time delay detection thresholds obtained for stiffness are compared to those in damping and inertia environments while keeping the interaction movement constant. This data is reported in Rank et al. (2010). A summary of all experimental conditions and the detection thresholds found in the experiments is provided in Table 8.1. Notably, we also report measurements of participants’ mean amplitude \(\hat{A}\) and frequency \(\hat{\omega }\) of their interaction movement since these have been found to differ from the experimental instructions.

Table 8.1 Mean detection thresholds (DT) and standard error (SE) of time delay-induced alterations of soft environments depend on the specific interaction movement and the composition of the environment

4.1 Results

Four substantial findings can be drawn from the experimental results in Rank et al. (2010a):

  1. The detection thresholds for time delay-induced environment alterations are negatively correlated with movement frequency and movement amplitude.

  2. Movement amplitude and frequency influence the detection threshold separately.

  3. Within the range of experimental conditions, stiffness does not affect perceptual discrimination abilities of time delay in force feedback.

  4. A change in the environment due to time delay is detected most easily in force feedback from a damper, followed by force feedback from a soft environment. Inertia exhibits the largest detection thresholds.

In order to investigate which perception model candidate is best suited to modelling this observed behaviour, parameters for each model are identified and predictions for the detection thresholds are obtained.

4.2 Model Predictions

Since experimental methods and the group of participants are not homogeneous over the different experiments, we fit mean detection thresholds individually for each experiment. To compare the prediction quality between models, the mean squared error (MSE) is computed. In the following, the individual identification procedures and the prediction results are discussed in detail.

4.2.1 Feature Comparison Model

Humans may perceive time delay in a haptic environment per se and compare individual estimates obtained from haptic exploration of the standard and comparison environment. The correlation techniques discussed in Sect. 8.2.2 are indeed well-suited to infer a time delay between movement and force feedback. While an uncertainty in time delay detection performance due to noise in the biological system could lead to a detection threshold different from zero, there is no apparent reason why the uncertainty about the time delay should change with input amplitude, frequency, magnitude, or the type of environment. The predicted time delay detection threshold based on this method is thus constant over conditions. Identification of the only free parameter in this model is achieved by solving

$$\begin{aligned} \mathop {\text {arg min}}\limits _{DT^\theta } \frac{1}{N_{\textit{cond}}} \sum _{i=1}^{N_{\textit{cond}}} (DT_{i}-DT^\theta )^2 \end{aligned}$$
(8.14)

where \(N_{\textit{cond}}\) is the number of conditions in the respective experiment, and \(DT^\theta \) is the (constant) time delay detection threshold. The solution to this optimisation problem is the mean time delay detection threshold over all conditions within one experiment. Predictions from this perception model result in an MSE of 127.34 ms\(^2\).
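That the minimiser of Eq. (8.14) is the mean follows from setting the derivative of the cost with respect to \(DT^\theta \) to zero:

$$\begin{aligned} \frac{\partial }{\partial DT^\theta }\,\frac{1}{N_{\textit{cond}}}\sum _{i=1}^{N_{\textit{cond}}}(DT_{i}-DT^\theta )^2&= -\frac{2}{N_{\textit{cond}}}\sum _{i=1}^{N_{\textit{cond}}}(DT_{i}-DT^\theta ) = 0\\ \Rightarrow \quad DT^\theta&= \frac{1}{N_{\textit{cond}}}\sum _{i=1}^{N_{\textit{cond}}}DT_{i}. \end{aligned}$$

At this minimum, the MSE equals the variance of the measured thresholds \(DT_{i}\) across conditions, so the baseline error directly reflects how much the thresholds vary between conditions.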

4.2.2 Inverse Model Verification

The parameterisation of this model, given the experimental results in Table 8.1 is the result of a nonlinear constrained optimisation problem

$$\begin{aligned}&\mathop {\text {arg min}}\limits _{DT_{i}^{f},\Delta f_{thresh}} \frac{1}{N_{cond}}\sum _{i=1}^{N_{cond}}(DT_{i}-DT_{i}^f)^2 \\ s.t.&\max \Delta f_{m,i}(t) = \max |f_{m,i}(t)-\hat{f}_{m,i}(t)| = \Delta f_{thresh} \, \forall i\in [1,N_{cond}] \nonumber \end{aligned}$$
(8.15)

where \(\Delta f_{thresh}\) is the (constant) detection threshold for the difference between the delayed and non-delayed exerted force and \(DT_{i}^{f}\) the corresponding time delay value causing \(\Delta f_{thresh}\). The predicted motor action on the basis of the measured state \(\mathbf {x}_h(t)\) is computed for each individual experimental condition, indexed by \(i\), and denoted \(\hat{f}_{m,i}(t)\). A numeric optimisation algorithm based on the interior-point method is used to find the optimal parameterisation fitting all experimental conditions (Byrd et al. 1999). Using the dynamic inverse model to explain average detection thresholds for time delay perception results in lower prediction errors (96.7 ms\(^2\)) compared to the feature comparison model prediction. The mean force difference thresholds for the experiments are 1.4 N for the first, 1.2 N for the second, and 1.7 N for the third experiment.
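The structure of the fit in Eq. (8.15) can be sketched as follows. For an ideal sinusoidal movement, the constraint can be resolved in closed form via Eq. (8.13), which leaves a one-dimensional search over \(\Delta f_{thresh}\). The sketch replaces the interior-point algorithm with a plain grid search and uses synthetic thresholds generated from the model itself; it does not reproduce the values of Table 8.1 or the measured movements used in the original analysis. All names and numbers are illustrative assumptions.

```python
import numpy as np

def predicted_dt(df_thresh, k_e, A, omega):
    """Delay at which the maximum force error of Eq. (8.13) equals df_thresh."""
    return 2.0 / omega * np.arcsin(df_thresh / (2.0 * k_e * A))

def fit_force_threshold(measured, k_e, A, omega, candidates):
    """Grid search for the force threshold with the smallest MSE, cf. Eq. (8.15)."""
    mse = [np.mean((measured - predicted_dt(c, k_e, A, omega)) ** 2)
           for c in candidates]
    best = int(np.argmin(mse))
    return candidates[best], mse[best]

# Synthetic conditions, NOT the values of Table 8.1.
k_e = np.array([100.0, 100.0, 200.0])          # N/m
A = np.array([0.05, 0.10, 0.05])               # m
omega = 2 * np.pi * np.array([1.0, 0.5, 1.0])  # rad/s
measured = predicted_dt(1.5, k_e, A, omega)    # generated with a 1.5 N threshold

candidates = np.linspace(0.1, 5.0, 491)        # force thresholds in 0.01 N steps
df_fit, mse_fit = fit_force_threshold(measured, k_e, A, omega, candidates)
```

Because the synthetic data were generated by the model, the search recovers the generating threshold with a vanishing error; with real data the residual MSE is the quantity used to compare models.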

Force difference perception for experiments with slowly-changing forces is known to follow Weber’s Law (Tan et al. 1994). The Weber fraction of \(\Delta f_h(t)\) could thus be a good model to explain the detection thresholds of time delay as well. The optimisation problem to be solved is similar to Eq. (8.15), namely

$$\begin{aligned}&\mathop {\text {arg min}}\limits _{DT_{i}^w,w} \sum _{i=1}^{N_{cond}}(DT_{i}-DT_{i}^w)^2\\ s.t.&\max \frac{\Delta f_{m,i}(t)}{f_{m,i}(t)} = w \, \forall i\in [1,N_{cond}]\nonumber \end{aligned}$$
(8.16)

with \(w\) the Weber fraction. The model fit for the experiment with different stiffness levels is good, with an MSE of only 4.5 ms\(^2\), but the model performs poorly for all other conditions, yielding a total MSE of 127.7 ms\(^2\). Thus, this model performs no better than the feature comparison model, which serves as the baseline predictor.

4.2.3 State Observer Model Verification

In contrast to the matched filter perception model, the state observer model utilizes an estimate of the body state for the decision about the environment time delay. The difference between the observed state and the actual state depends heavily on the choice of the feedback matrix \(K\), as discussed in Sect. 8.2.4. The model predicts perception limits based on a threshold in the state estimation error. The state \(\mathbf {x}_h(t)\) consists of two components, namely the limb position \(x_h(t)\) and velocity \(\dot{x}_h(t)\). While deviations between the observed state and the measured state could in principle be based on a generic threshold on both position and velocity, individual models with a threshold on either \(x_{h}\) or \(\dot{x}_{h}\) are considered here:

$$\begin{aligned}&\mathop {\text {arg min}}\limits _{DT_{i}^{x_1},\Delta x_{h,thresh},K} \frac{1}{N_{cond}} \sum _{i=1}^{N_{cond}}(DT_{i}-DT_{i}^{x_1})^2\\ s.t.&\max \Delta x_h(t) = \max |x_h(t)-\hat{x}_h(t)| = \Delta x_{h,thresh} \, \forall i\in [1,N_{cond}] \nonumber \end{aligned}$$
(8.17)

and

$$\begin{aligned}&\mathop {\text {arg min}}\limits _{DT_{i}^{x_2},\Delta \dot{x}_{h,thresh},K} \frac{1}{N_{cond}} \sum _{i=1}^{N_{cond}}(DT_{i}-DT_{i}^{x_2})^2 \\ s.t.&\max \Delta \dot{x}_h(t) = \max |\dot{x}_h(t)-\hat{\dot{x}}_h(t)| = \Delta \dot{x}_{h,thresh} \, \forall i\in [1,N_{cond}]. \nonumber \end{aligned}$$
(8.18)

The problems formulated in (8.17) and (8.18) have five free parameters to be optimised. Due to the comparably low number of experimental conditions available for model fitting and the fact that the optimisation problem may be non-convex, the solution can depend on the chosen initial values. Suitable values are found from an initial grid search procedure, that is, a simulation of the state space observer model for different feedback matrices \(K\). Observation errors \(\Delta x_h(t)\) and \(\Delta \dot{x}_{h}(t)\) are computed for every candidate of \(K\), and the values resulting in the lowest variance of the state error between all conditions of each experiment are taken as initial values for the optimisation problems stated in Eqs. (8.17) and (8.18). Only one feedback matrix \(K\) is fit for all experiments to keep the number of variables computationally tractable and to reduce the problem of overfitting. However, we do allow for different threshold values \(\Delta x_{h,thresh}\), \(\Delta \dot{x}_{h,thresh}\) in the three experiments to account for the differences in experimental methods. As a result, the state observers with feedback matrices
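The grid search for initial values can be sketched as follows. This is a rough illustration under strong assumptions and not the implementation used for the results reported here: the limb and environment are modelled as a hypothetical mass-spring-damper system whose spring force acts on a delayed position, the observer uses the non-delayed model, the candidate feedback matrices are restricted to diagonal form for brevity, and all numeric values (`M`, `D`, `K_ENV`, the conditions, the grid) are made up.

```python
import itertools

import numpy as np

# Hypothetical plant parameters: mass (kg), damping (Ns/m), stiffness (N/m).
M, D, K_ENV = 1.0, 10.0, 100.0
DT_SIM = 1e-3  # Euler integration step in s

A = np.array([[0.0, 1.0], [-K_ENV / M, -D / M]])  # non-delayed model
B = np.array([0.0, 1.0 / M])


def max_observer_error(gain, delay_s, amp_n, freq_hz, duration_s=2.0):
    """Return max |x - x_hat| for position and velocity in one condition."""
    n = int(duration_s / DT_SIM)
    lag = int(round(delay_s / DT_SIM))
    x, x_hat = np.zeros(2), np.zeros(2)
    pos_history = np.zeros(n + 1)
    worst = np.zeros(2)
    for i in range(n):
        u = amp_n * np.sin(2 * np.pi * freq_hz * i * DT_SIM)
        pos_delayed = pos_history[i - lag] if i >= lag else 0.0
        # Simulated "true" system: the spring force acts on the delayed position.
        acc = (u - D * x[1] - K_ENV * pos_delayed) / M
        x_next = x + DT_SIM * np.array([x[1], acc])
        # Observer: non-delayed model plus correction through the feedback matrix.
        x_hat = x_hat + DT_SIM * (A @ x_hat + B * u + gain @ (x - x_hat))
        x = x_next
        pos_history[i + 1] = x[0]
        worst = np.maximum(worst, np.abs(x - x_hat))
    return worst


# Hypothetical conditions: (detection threshold in s, force amplitude in N, frequency in Hz).
conditions = [(0.036, 5.0, 0.5), (0.030, 5.0, 1.0), (0.024, 10.0, 1.0)]


def error_variance(gain, component):
    """Variance over conditions of the peak observation error (0: position, 1: velocity)."""
    errs = [max_observer_error(gain, *c)[component] for c in conditions]
    return float(np.var(errs))


grid = [1.0, 5.0, 20.0]
candidates = [np.diag([k1, k2]) for k1, k2 in itertools.product(grid, grid)]

# The candidate with the most similar peak error across conditions becomes the start value.
k_init = min(candidates, key=lambda g: error_variance(g, component=0))
print(k_init)
```

The selection criterion reflects the idea that a constant threshold on the observation error can only explain the measured thresholds if the peak error at each measured threshold is similar across conditions. The resulting matrix would then be handed to the constrained optimiser as a starting point.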

$$\begin{aligned} K_{x_{h}} = \begin{bmatrix} 11.8&36.3\\ 33.3&31.1 \end{bmatrix}, \text { and } K_{\dot{x}_{h}} = \begin{bmatrix} 0&9.8\\ 9.4&11.4 \end{bmatrix} \end{aligned}$$
(8.19)

for predictions based on \(x_h\) and \(\dot{x}_h\), respectively, give predictions with the lowest mean squared error. Threshold values for the position-based observer are 0.10, 0.02, and 0.07 m. Velocity thresholds are 0.15, 0.04 and 0.07 \(\frac{\text {m}}{\text {s}}\). The MSE values are 98.3 ms\(^2\) for the state observer using the position error as decision variable, and 85.7 ms\(^2\) for the velocity-based threshold. Predictions from all models in all experimental conditions are compared in Fig. 8.4.

Fig. 8.4 Prediction errors, grouped by experimental condition (1–15, see Table 8.1). Prediction errors are high in environments other than soft ones (14–15)

4.3 Discussion

Comparing the predictions from all models introduced in Sects. 8.2.2–8.2.4 leads to the conclusion that the state observer model with a detection mechanism on the observation error in limb velocity is most successful in capturing the observed perceptual behaviour. While in the first experiment, conditions with comparable maximum force errors would lead to similar detection thresholds, the inverse model verification method would predict a decreasing detection threshold for an increase in stiffness. However, the second experiment fails to show such behaviour. In general, all dynamic perception models except the inverse model verification model using a threshold based on Weber’s Law outperform the static feature comparison model.

The state observer verification model is most successful in predicting detection thresholds for time-delay induced changes in the environmental characteristics, but it also has the most degrees of freedom. Claiming the superiority of this model over its alternatives is thus admittedly difficult. Model selection measures such as the Akaike information criterion fail here due to the inhomogeneity of the dataset with respect to participants and methods. However, considering the technical application motivating the perceptual modelling, valuable predictions can still be drawn for the practically relevant set of movement stimuli and haptic environments presented here.

An analysis of the prediction errors in the individual experimental conditions reveals that all proposed models capture the time delay detection thresholds with a significantly lower MSE for the soft environments compared to inertia and damping (Welch’s t-test, \(t(0.14)=15.7\), \(p<0.001\)). One reason for this lack of generality could be our implicit assumption of an internal representation of the environment that can generate a noise-free and temporally accurate prediction serving as the reference for the actual sensory feedback. It is known that time perception can be easily disturbed by many factors, including attention to the stimulus and the frequency of events occurring (Grondin 2010). The difference between the soft, damped and inertial stimuli used in the studies described lies in the relative phase between the position and force signals, and thus in their characteristic temporal relation to each other. Modelling temporal uncertainties and noise on the perceptual signals during the exploration may bring further insights into the mechanisms involved in the combination of movement and force into a coherent percept of haptic environments.

So far, all effects found have been attributed to the time delay introduced between position and force feedback. However, with a regular exploration strategy at a fixed frequency, a time delay is indistinguishable from a non-linear spring, similar to Leib et al. (2010). The detection threshold could thus just as well be a measure of non-linearity in the environment characteristics rather than of actual delay. Further studies are required to dissociate these possibilities.

5 Implications for Telepresence Systems

Time delay is a critical issue for haptic telepresence systems operating over long distances (Peer et al. 2008; Hirche and Buss 2012; Sheridan 1993). Challenges to be dealt with include technical issues such as system instability and, on the side of the human operator, impaired perception of the environment’s haptic properties, especially softness (Hirche and Buss 2007, 2012). High-fidelity telepresence systems must aim for a high degree of transparency, meaning that the operator cannot distinguish whether he/she interacts with the environment directly or by means of the technical system. Towards this ultimate goal, our findings provide valuable insights for the design and control of telepresence systems that allow an unaltered perception of a remotely explored softness. First of all, the operator’s movement must be taken into consideration to evaluate whether a time delay in the communication channel affects softness perception or not. A haptic task which requires only slow movements can tolerate longer delays in the feedback than a highly dynamic task requiring movements with a high frequency. Not only the task can limit the amplitude and movement frequency, but also the haptic interface. A smaller workspace on the one hand, and high friction or uncompensated inertia on the other hand, can influence the detection thresholds. The workspace dimensions of the local haptic interface determine the maximum movement amplitude, so that detection thresholds increase when the workspace is small. With larger inertia and damping of the local haptic interface, the achievable human movement frequency decreases, resulting in a higher detection threshold for time delay.

The finding that a scaling of the stiffness coefficient within the investigated range does not influence the sensitivity of temporal perception is interesting for a specific teleoperation application, namely micromanipulation. In this area, small forces arising in a micro-scale environment must be augmented for the user to provide a perceptible haptic impression (Ando et al. 2001). For the case of delayed haptic feedback, our finding suggests that the scaling factor can be chosen irrespective of haptic latency. Note, however, that we only validated this hypothesis for a limited range of stiffnesses. In extreme scenarios, such as stiff contact with a rigid object, an infinitesimally small time delay may result in an unstable system, which completely changes the characteristics of the system. The human operator may then be able to infer the time delay from increasing oscillations in the force feedback.

Although none of the current model candidates are capable of entirely predicting thresholds for time delay detection in force feedback, such a dynamic model would have direct application in the design of communication algorithms and haptic rendering systems. The greatest benefit of these models lies in the possibility to consider the influence of interaction movements on the perceptual threshold explicitly. In this way, more accurate predictions of whether a time delay in the haptic feedback is perceivable or not can be utilised during the execution of a task, and appropriate measures can be taken, for example in communication Quality-of-Service control algorithms. We take this as a motivation to work further towards this ultimate goal.

6 Conclusions and Open Problems

Humans do not possess a dedicated sensor for haptic environment properties such as stiffness, damping, or inertia. Instead, temporal and magnitude information from movement and force feedback must be combined to infer such measures. System theoretic perception models capable of combining these information sources have been proposed in this chapter. We tested the ability of all model candidates to predict time delay detection thresholds in force feedback. Taking together the results of six psychophysical experiments on time delay perception thresholds, a dynamic state observer model has been identified as the model that best captures human discrimination performance when movement and force feedback are temporally misaligned.

Although all model candidates have been tested for a number of different movements, the movement pattern was so far restricted to sinusoids of different amplitudes and frequencies. For a more general applicability to haptic telepresence systems, other movements must be considered as well. Ultimately, perceptual responses for time-delayed feedback from arbitrary voluntary explorations shall be predictable. Furthermore, the time delay perception thresholds in the third experiment, which considered stiff, damped and inertial environments, have not been captured well by any of the models proposed so far. Alternative models with other decision criteria could further improve the prediction performance. Together with a dynamic perception model for the influence of magnitude information on the combination of movement and force, conclusions about perception mechanisms for abstract environments containing arbitrary combinations of stiffness, damping and inertia could eventually be drawn.