Introduction

The perception of a stimulus is complex and constitutes a multidimensional construct (Taneja et al., 2019). What elicits a given perception in one individual may not necessarily evoke the same in another. Therefore, it can be challenging to establish a standardised stimulus that produces a targeted perception, a factor important when investigating conditions such as nerve damage (Rolke et al., 2006). It is not surprising that a number of somatosensory investigations exist to assess such a problem (Pigg, Baad-Hansen, Svensson, Drangsholt, & List, 2010; Rolke et al., 2006). However, within these investigatory protocols the main focus tends to be on altered mechanical and thermal sensitivity, with particular interest in stimuli considered painful or unpleasant (Svensson et al., 2011; Poort, van Neck, & van der Wal, 2009; Rolke et al., 2006). A percept that is regularly overlooked is that of pleasantness.

A recent discovery is that the coding of a pleasant sensation seems to be via a neuronal network known as the C-Tactile (CT) afferent system (Löken, Wessberg, Morrison, McGlone, & Olausson, 2009). These are unmyelinated, low-threshold afferents found in the hairy skin of humans (Vallbo, Olausson, Wessberg, & Norrsell, 1993, 1999). Since their discovery there has been a plethora of research into the contribution of the CT afferents to social touch and pleasant sensations (Walker, Trotter, Woods & McGlone, 2017; Morrison, Björnsdotter & Olausson, 2011).

It has been established that the optimal activation of CT afferents is a slowly stroking stimulation (1–10 cm/s) (Nordin, 1990). This gentle stroking is considered as pleasant and often seen in affiliative interactions (Olausson, Wessberg, McGlone, & Vallbo, 2010; Gallace & Spence, 2010). Furthermore, the psychophysical measure of pleasantness correlates with the firing of CT afferents, linking this system to such a percept (Löken et al., 2009). However, other peripheral afferents will be contributing, as a stroking touch can be pleasant in regions lacking CT afferents, i.e. glabrous skin (Löken, Evert, & Wessberg, 2011; McGlone et al., 2012).

The slow conduction of CT afferents renders them suboptimal for sensory discrimination (Morrison, Loken & Olausson, 2010). Nonetheless, a slow conduction velocity does not preclude a role in affective touch (McGlone, Wessberg, & Olausson, 2014). Consistently, functional magnetic resonance imaging studies identify CT signals to be processed in emotion-related areas of the brain, such as the insular cortex, and less in the primary somatosensory cortex (Olausson et al., 2002, 2008; Morrison et al., 2011) which is known to play a key role in discriminative touch (Olausson et al., 2008; McGlone et al., 2014; Case, Laubacher et al., 2016).

It is clear that stimuli optimal for activation of CT afferents are considered pleasant, but the stimulus properties associated with pleasant sensations are poorly defined. Hence, the aim was to conduct a systematic review of the stimuli and methods used to generate a pleasant sensation, with particular focus on CT-psychophysical studies. This would aid in identifying the stimulus factors that are most relevant in producing a pleasant sensation. An additional aim was to undertake a meta-analysis to provide a pooled estimate of the difference in pleasantness ratings for CT-optimal (3 cm/s) and CT-suboptimal (30 cm/s) stroking touch (Löken et al., 2009; Vallbo et al., 1993; Vallbo, Olausson, & Wessberg, 1999). It is anticipated that the findings can guide studies that require a pleasant tactile sensation, or be used in neurosensory assessment protocols, allowing for a more complete examination of somatosensory function.

Materials and methods

The systematic review was performed in accordance with the PRISMA guidelines (Moher, Liberati, Tetzlaff & Altman, 2009) and registered with the PROSPERO database (Number: CRD42017058867).

Search strategy

A literature search of articles published in English from 1974 to June 2018 was conducted through the Cochrane Library, Embase, Medline, PubMed, Scopus and Web of Science, as well as a web search (via Google Scholar). The search was performed on the same day and had been tailored to the relevant database by author PT and an experienced librarian. English search terms were used, and there were no limits placed (Table 1).

Table 1 Search strategy used for identifying titles in PubMed

Eligibility assessment

The eligibility criteria consisted of articles of any study type, involving healthy adults (≥ 18 years), and thus excluding studies with conditions/diseases that may affect the perception of a pleasant sensation. If studies consisted of patients and a control (healthy) group, the control group was included only if independent results on pleasantness were presented. In addition, if a study consisted of multiple experiments, then only the relevant experiment/part was included. Eligibility also required the use of a defined stimulus (with a description) and a numerical rating for pleasantness. Articles were excluded if they attempted to define pleasantness by descriptive terms only, or used an active form of assessment (perceived by the participant touching a material, rather than being touched). Studies assessing pleasantness via a combination of touch with other stimuli (e.g. visual), or in which the experimental set-up may potentially influence the perceived pleasantness (e.g. within an MRI scanner, etc.), were excluded. To be included in the meta-analysis, the above-mentioned criteria had to be met and individual studies needed to compare stroke velocities, with similar experimental set-ups.

Data collection and analysis

Initially, two of the authors (PT and LBH) independently screened over half of the titles and abstracts based on the inclusion criteria listed above. The basic percentage inter-rater agreement was assessed to be 94%, and Cohen’s Kappa statistic was 0.87, indicating a near perfect agreement. Hence, PT screened the remaining titles and abstracts with any concerns discussed with LBH. Full text articles were obtained for those that appeared to address the review topic, or for those abstracts that were not clear. Finally, the reference lists of all included articles were searched to identify any studies that may have been missed. Any disagreements were resolved by discussion. PT performed data extraction, and relevant items included authors, year of publication, number of participants, gender, blinding and randomisation, region assessed, type of instrument used for stimulus delivery, velocity and force of the stimuli used, the interstimulus interval, and the rating scale used.
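
The two agreement statistics reported above can be illustrated with a minimal sketch. The screening decisions below are hypothetical and are not the review's data; the function names are chosen for illustration only.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category
    p_e = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)

# Hypothetical include/exclude decisions for ten abstracts (not the review's data)
pt = ["include", "exclude", "exclude", "include", "exclude",
      "exclude", "include", "exclude", "exclude", "exclude"]
lbh = ["include", "exclude", "exclude", "include", "exclude",
       "include", "include", "exclude", "exclude", "exclude"]

agreement = percent_agreement(pt, lbh)
kappa = cohens_kappa(pt, lbh)
```

Kappa is lower than the raw agreement because it discounts the agreement expected by chance from each rater's marginal include/exclude proportions.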

For the meta-analysis, effect sizes in the form of standardised mean differences were computed. This method adjusted for the different pleasantness rating scales utilised across studies and allowed a direct comparison of the velocities (Morris & DeShon, 2002). Hence, descriptive data (mean, standard deviations (SD) and sample sizes) were collected. When studies presented with data that allowed more than one effect size to be computed (multiplicity), the one that allowed the maximum comparability between studies was selected, for example, based on details of the experimental set-up, duration of stimulus delivery, etc. Both reviewers (PT and LBH) agreed on this approach as selection was not based on any magnitude or direction of effect (López‐López, Page, Lipsey & Higgins, 2018). In the instance that effect sizes were equally suitable, and if they were not significantly different from one another, the average was taken (López-López et al., 2018). In this way, each study would contribute only 1 effect size.

Furthermore, as studies were likely to be repeated measures (investigating multiple velocities on the same participants), a correlation coefficient between the investigated velocities was needed for variance computation (Morris & DeShon, 2002; Dunlap, Cortina, Vaslow & Burke, 1996; Borenstein, Hedges, Higgins & Rothstein, 2011). If all data but the correlation coefficient were available, the decision was to use the average correlation from those studies that stated it or had sufficient data available for it to be computed (Morris & DeShon, 2002; Dunlap et al., 1996; Borenstein et al., 2011). Corresponding authors were contacted by email if any data were missing. If no response was received, the authors were contacted again after 2 weeks; the average time to receive the information was 8 weeks.

Quality assessment in individual studies

Both reviewers (LBH and PT) independently performed the quality assessment. An established quality assessment checklist was not utilised as it was the experimental technique for eliciting a pleasant sensation that was being evaluated, and not the overall clinical study. Hence, it was decided that criteria based on common checklists, as well as those factors considered as important in the investigation of CT afferents, should be devised. The assessment was at the methodological level and consisted of a 14-item checklist generally categorised into: participants selected, randomisation and blinding within the experimental set-up, consideration of confounding factors and statistical analysis used (“Appendix” section). All included studies were assessed against the checklist, and quality was categorised as a percentage with low (≤ 33%), moderate (≥ 34 and ≤ 66%) and high (≥ 67%) methodological quality.
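
The banding of checklist scores can be expressed as a short sketch. Rounding the percentage to a whole number before banding is an assumption made here so that the stated cut-offs (≤ 33, 34 to 66, ≥ 67) leave no gaps; the function name is hypothetical.

```python
def quality_category(items_met, items_total=14):
    """Map a checklist score to the low / moderate / high bands."""
    if not 0 <= items_met <= items_total:
        raise ValueError("items_met must lie between 0 and items_total")
    # Assumption: the percentage is rounded to a whole number before banding
    percent = round(100 * items_met / items_total)
    if percent <= 33:
        return "low"
    if percent <= 66:
        return "moderate"
    return "high"

# Hypothetical scores on the 14-item checklist
bands = [quality_category(n) for n in (4, 9, 10)]
```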

Data synthesis and analysis

The effect sizes for each study were calculated in the form of standardised mean differences. This was achieved using the Cohen’s d via the pretest–posttest design method, using the pooled within-group SD (Borenstein et al., 2011). The calculations were done in a way that a negative effect size represented increased pleasantness ratings in the 3 cm/s group relative to 30 cm/s, whereas a positive effect size represented the converse (Faraone, 2008). The magnitude of the effect size was interpreted as small = 0.2, medium = 0.5 and large = 0.8 (Cohen, 1992).
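
A minimal sketch of one common way to compute a standardised mean difference and its variance for a repeated-measures comparison is given below. It follows the general approach described in Borenstein et al. (2011), in which the SD of the difference scores is rescaled by the correlation r between conditions; whether this matches the exact computation used in this review is an assumption. All input values are hypothetical, apart from r = 0.48, which is the correlation reported later in the text.

```python
import math

def paired_smd(mean_pre, sd_pre, mean_post, sd_post, n, r):
    """Standardised mean difference (post minus pre) and its variance for paired scores."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    # SD of the difference scores, from the two condition SDs and their correlation
    sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    # Rescale to the within-condition SD metric
    sd_within = sd_diff / math.sqrt(2 * (1 - r))
    d = (mean_post - mean_pre) / sd_within
    # Variance of d for a paired design with n participants
    var_d = (1 / n + d ** 2 / (2 * n)) * 2 * (1 - r)
    return d, var_d

# Hypothetical ratings: 3 cm/s as "pre", 30 cm/s as "post", 20 participants
d, var_d = paired_smd(mean_pre=6.0, sd_pre=2.0, mean_post=4.8, sd_post=2.0, n=20, r=0.48)
```

With this sign convention a negative d means that the 30 cm/s stroke was rated as less pleasant than the 3 cm/s stroke, in line with the interpretation stated above.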

The meta-analytical computations were performed using Stata software (StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC). A random effects model was fitted as this model has been identified to account for the distribution of an effect over populations (Israel & Richter, 2011). Two tests investigated heterogeneity: the Cochran Q test, in which p < 0.10 (see below for justification of the significance level) signified statistically significant heterogeneity between studies, and the I² statistic, which identifies the percentage of variation between study estimates that results from heterogeneity rather than chance (Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003). An I² value of 0%, 50% and 75% indicated low, moderate or high heterogeneity, respectively (Higgins, Thompson, Deeks & Altman, 2003).
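
For readers without access to Stata, the pooling and heterogeneity statistics can be sketched in a few lines. The between-study variance estimator used by the review is not stated; the DerSimonian-Laird estimator below is therefore an assumption, and the two input studies are hypothetical.

```python
import math

def random_effects_meta(effects, variances, z=1.96):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I-squared."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "ci": (pooled - z * se, pooled + z * se),
            "q": q, "df": df, "tau2": tau2, "i2": i2}

# Two hypothetical studies (effect size, variance), not the review's data
result = random_effects_meta([-0.2, -1.0], [0.04, 0.04])
```

When Q does not exceed its degrees of freedom, tau2 is truncated to zero and the random-effects estimate coincides with the fixed-effect estimate.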

Sensitivity analysis was used to investigate the robustness of different factors on the pooled effect estimate. Firstly, the impact on the overall effect estimate by different statistical methods (fixed and random effects model) was undertaken. Secondly, the “leave-one-out” analysis was performed to investigate whether removing any individual study could influence the overall pooled estimate.
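
The leave-one-out procedure can be sketched as follows. For brevity the default pooling function is a simple inverse-variance (fixed-effect) estimator, which is an assumption of this sketch; any pooling function with the same signature, such as a random-effects estimator, could be passed instead. The effect sizes are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted mean of the study effect sizes."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances, pool=fixed_effect):
    """Re-pool the estimate with each study removed in turn."""
    estimates = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        estimates.append(pool(rest_effects, rest_variances))
    return estimates

# Three hypothetical studies with equal variances (not the review's data)
loo = leave_one_out([-0.4, -0.6, -0.8], [0.05, 0.05, 0.05])
```

A narrow spread between the smallest and largest re-pooled estimates suggests that no single study drives the overall result.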

A hand and robot stimulation subgroup analysis was planned to assess any influence on effect size heterogeneity with a 0.10 cut-off significance level to mitigate the small number of included studies and, therefore, problems with subgroup tests for heterogeneity (Sedgwick, 2013). Finally, to assess for publication bias, inspection of funnel plots and the Egger’s regression test were planned (Peters, Sutton, Jones, Abrams & Rushton, 2006).

Results

Study selection

Following title and abstract screening, 96 full text studies were reviewed (Fig. 1). Seventeen studies met the inclusion criteria, and the reference lists were analysed identifying one further study. In total 18 studies were included for the qualitative synthesis.

Fig. 1 PRISMA flow chart showing the process of selecting studies

From these, 8 studies did not meet the inclusion criteria for the meta-analysis due to the experimental set-up or velocities tested (Bennett, Bolling, Anderson, Pelphrey & Kaiser, 2014; Essick et al., 2010; Etzi, Spence & Gallace, 2014; Gordon et al., 2013; Hua et al., 2008; Huisman, Frederiks, van Erp & Heylen, 2016; Tsalamlal, Ouarti, Martin & Ammi, 2013; Voos, Pelphrey & Kaiser, 2013). Seven of the remaining ten studies did not provide sufficient data, and therefore corresponding authors were contacted (Ackerley, Carlsson, Wester, Olausson & Backlund Wasling, 2014; Etzi, Carta & Gallace, 2018; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong, Bendas, Etzi, Olausson & Croy, 2017; Löken et al., 2011). Replies with data were received for 2 of the studies (Ackerley et al., 2014; Etzi et al., 2018), but not for the other 5 studies, leading to their exclusion (Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011). Therefore, a total of 5 studies were included in the meta-analysis (Ackerley et al., 2014; Etzi et al., 2018; Pawling, Cannon, McGlone & Walker, 2017; Triscoli, Olausson, Sailer, Ignell & Croy, 2013; Triscoli, Ackerley & Sailer, 2014).

Quality assessment

The quality assessment of the methods undertaken was rated as high in eight complete studies (Ackerley et al., 2014; Essick et al., 2010; Etzi et al., 2018; Jönsson et al., 2015, 2017; Triscoli et al., 2013, 2014; Tsalamlal et al., 2013) and experiments 1 and 2 in Löken et al. (2011). A satisfactory score was achieved by four complete studies (Huisman et al., 2016; Kass-Iliyya et al., 2016; Luong et al., 2017; Pawling et al., 2017), experiment 3 in Löken et al. (2011), and experiment 2 in Etzi et al. (2014). Four studies had their described methodology rated as weak (Bennett et al., 2014; Gordon et al., 2013; Hua et al., 2008; Voos et al., 2013). The quality assessment score was not a factor for inclusion, and therefore no studies were excluded on their quality rating.

Qualitative synthesis (systematic review)

Study characteristics

The participant and study method characteristics were evaluated and are summarised in Tables 2 and 3. Of the included studies, seven comprised more than one experiment (Essick et al., 2010; Etzi et al., 2014; Jönsson et al., 2015, 2017; Löken et al., 2011; Luong et al., 2017; Triscoli et al., 2014). Two studies provided data prior to the main experiment that required MRI scanning (Gordon et al., 2013; Voos et al., 2013), and one study obtained data from the healthy controls that were compared with patients with Parkinson’s disease (Kass-Iliyya et al., 2016).

Table 2 Study design and participant demographics from included studies
Table 3 Characteristics of the methods utilised for pleasantness assessment within included studies

The included studies (where k = number of studies) were published between 2008 and 2018, with the number of participants (as defined by the inclusion criteria) ranging from 8 (Löken et al., 2011) to 80 (Jönsson et al., 2017). Six studies had a greater number of female participants (k = 6) (Bennett et al., 2014; Etzi et al., 2014; Gordon et al., 2013; Kass-Iliyya et al., 2016; Pawling et al., 2017; Voos et al., 2013), three studies had more male participants (k = 3) (Huisman et al., 2016; Triscoli et al., 2013; Tsalamlal et al., 2013), and three studies had an equal gender distribution (k = 3) (Ackerley et al., 2014; Etzi et al., 2018; Hua et al., 2008). Those that contained multiple experiments had either a greater proportion of females, or an equal gender distribution (Essick et al., 2010; Jönsson et al., 2015, 2017; Luong et al., 2017; Löken et al., 2011; Triscoli et al., 2014). In all but four studies, a visual analogue scale (VAS) was used to capture pleasantness ratings (k = 14) (Ackerley et al., 2014; Essick et al., 2010; Etzi et al., 2014, 2018; Hua et al., 2008; Huisman et al., 2016; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Triscoli et al., 2013, 2014).

Application

There was an equal distribution in the mode of application of the tactile stimulus. Eight studies utilised a robot (Ackerley et al., 2014; Essick et al., 2010; Huisman et al., 2016; Jönsson et al., 2015, 2017; Luong et al., 2017; Tsalamlal et al., 2013; Triscoli et al., 2014), most often a rotary tactile stimulator (RTS) (k = 6) (Ackerley et al., 2014; Essick et al., 2010; Jönsson et al., 2015; Jönsson et al., 2017; Luong et al., 2017; Triscoli et al., 2014), and eight studies had a human deliver the stimulus (Bennett et al., 2014; Etzi et al., 2014, 2018; Gordon et al., 2013; Hua et al., 2008; Kass-Iliyya et al., 2016; Pawling et al., 2017; Voos et al., 2013). In two studies, both an experimenter (female) and a robot (RTS) delivered the stimuli (Löken et al., 2011; Triscoli et al., 2013).

A number of different materials were used to contact the skin in the desired region. Most often a goat hair brush was utilised (k = 9) (Ackerley et al., 2014; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Triscoli et al., 2013, 2014) with a width that ranged from 20 to 70 mm.

The time taken between stimulations (interstimulus interval) ranged from 80 to 30 s (Etzi et al., 2018; Löken et al., 2011), with 10 and 15 s most frequently utilised (k = 4) (Ackerley et al., 2014; Kass-Iliyya et al., 2016; Luong et al., 2017; Triscoli et al., 2013).

Various approaches were employed to conceal the stimulation site from the participant. For example, visual input was prevented by using glasses blocking peripheral vision (k = 5) (Ackerley et al., 2014; Jönsson et al., 2015, 2017; Triscoli et al., 2013, 2014) or a curtain (k = 2) (Essick et al., 2010; Löken et al., 2009). Eight studies did not state if any measures were taken to blind the participant from testing. Furthermore, audible noises were reduced with earplugs (k = 2) (Essick et al., 2010; Löken et al., 2009) or headphones (k = 6) (Etzi et al., 2014; Jönsson et al., 2015, 2017; Triscoli et al., 2013, 2014; Tsalamlal et al., 2013).

Velocity

The velocity of stimulus application was the most common factor randomised and fell within (and including) 0.3–32 cm/s (k = 10) (Ackerley et al., 2014; Huisman et al., 2016; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Triscoli et al., 2013, 2014). Tsalamlal et al. (2013) utilised air as the stimulus and described velocity as low movement (0.6 rad/s) and high movement (12 rad/s). Fourteen studies investigated more than one velocity for stimulus application with velocities of 0.3, 1, 3, 10 and 30 cm/s most often used (Ackerley et al., 2014; Essick et al., 2010; Etzi et al., 2018; Huisman et al., 2016; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Tsalamlal et al., 2013; Triscoli et al., 2013, 2014; Voos et al., 2013). In 9 of these studies, a significant main effect of velocity on pleasantness ratings (all p < 0.05) was found (Ackerley et al., 2014; Essick et al., 2010; Huisman et al., 2016; Jönsson et al., 2015, 2017; Luong et al., 2017; Triscoli et al., 2013, 2014; Tsalamlal et al., 2013). These comprised studies utilising a robot (RTS) with a goat hair brush (k = 5) (Ackerley et al., 2014; Jönsson et al., 2015, 2017; Luong et al., 2017; Triscoli et al., 2014), and a robot applying multiple fabrics (k = 2) (Essick et al., 2010; Huisman et al., 2016). Triscoli et al. (2013) had both a human and robot administer a goat hair brush, and Tsalamlal et al. (2013) utilised air in their study. The remaining studies did not report if velocity had a main effect (Etzi et al., 2018; Kass-Iliyya et al., 2016; Löken et al., 2011; Pawling et al., 2017; Voos et al., 2013).

A stroking velocity of 3 cm/s compared to slower [0.3 cm/s, P ≤ 0.001 (Triscoli et al., 2013, 2014); 0.5 cm/s, p < 0.05 (Huisman et al., 2016)] and faster velocities (30 cm/s, P ≤ 0.035) (Ackerley et al., 2014; Etzi et al., 2018; Pawling et al., 2017; Triscoli et al., 2013, 2014) was rated on average as most pleasant (Kass-Iliyya et al., 2016; Löken et al., 2011). Triscoli et al. found that there was no significant difference in mean pleasantness scores when stroked at velocities of 0.3 or 30 cm/s [P > 0.05 (Triscoli et al., 2013) and P = 0.632 (Triscoli et al., 2014)]. Another study found that 8 cm/s was more pleasant than 32 cm/s (P = 0.003) (Voos et al., 2013) as was low air jet movement velocity (0.6 rad/s) versus static and high velocity (12 rad/s) (p < 0.05) (Tsalamlal et al., 2013).

The velocity-pleasantness profile was consistently identified to be best fit by a quadratic model (Ackerley et al., 2014; Huisman et al., 2016; Jönsson et al., 2015, 2017; Luong et al., 2017; Löken et al., 2011). When velocity against pleasantness ratings was plotted, the resultant profile was an inverse U-shape with peak ratings in the range of 1–3 cm/s.
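
How a quadratic model yields an inverse U-shape and a peak velocity can be shown with a small least-squares sketch. Fitting against log10 of velocity is an assumption of this sketch, not something stated in this review, and the ratings are synthetic values generated from a known curve rather than data from any included study.

```python
import math

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Power sums that form the 3x3 normal-equation system
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [u - factor * v for u, v in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

velocities = [0.3, 1.0, 3.0, 10.0, 30.0]  # cm/s
log_v = [math.log10(v) for v in velocities]
# Synthetic ratings from a known inverse-U curve (illustration only)
ratings = [2.0 + 1.0 * x - 2.0 * x ** 2 for x in log_v]

a, b, c = fit_quadratic(log_v, ratings)
# A negative c gives an inverse U; the vertex of the parabola is the peak
peak_velocity = 10 ** (-b / (2 * c))
```

For this synthetic curve the vertex lies at a log velocity of 0.25, that is, a peak of about 1.8 cm/s.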

The duration of stroking (multiple strokes over time) at the same velocity was also found to contribute to the pleasantness experienced. Triscoli et al. (2014) found that for stroking at 3 cm/s there was a significant decrease in participant pleasantness ratings over time (40 trials of 5 back and forth strokes, P = 0.042). This was not observed when stroking at 0.3 cm/s (P = 0.290) or 30 cm/s (P = 0.617). In addition, the pleasantness ratings at the end of the session, when further trials at 3 cm/s were performed, showed a decline between two groups of healthy participants (40 vs. 120 trials, P = 0.003). Such a decline over time was not seen at a velocity of 30 cm/s (40 vs. 267 trials, P = 0.349). In contrast, Etzi et al. (2018) identified that only fast stroking (30 cm/s) was rated as less pleasant over time (9 s vs. 60 s, P = 0.02) compared to slower stroking (3 cm/s).

Force

The force of stimulus delivery ranged from 0.19 N to 400 grams (corresponding to 3.92 N) (k = 10) (Ackerley et al., 2014; Essick et al., 2010; Hua et al., 2008; Jönsson et al., 2015, 2017; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Triscoli et al., 2013, 2014) with a force of 0.4 N most often used (k = 7) (Ackerley et al., 2014; Jönsson et al., 2015, 2017; Luong et al., 2017; Löken et al., 2011; Triscoli et al., 2013, 2014). Seven studies did not control or did not state the force applied (Bennett et al., 2014; Etzi et al., 2014, 2018; Gordon et al., 2013; Huisman et al., 2016; Kass-Iliyya et al., 2016; Voos et al., 2013). Use of an air jet with a flow rate of 7.5 and 50 l/min was described by Tsalamlal et al. (2013). However, they did not identify what force levels these flow rates exerted onto the desired region.
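
The conversion from grams to newtons quoted above can be checked with a one-line calculation. The sketch assumes that the reported 400 grams refers to gram-force under standard gravity; the function name is hypothetical.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2, conventional standard value

def grams_force_to_newtons(grams):
    """Convert a load in grams (gram-force) to newtons."""
    return grams / 1000 * STANDARD_GRAVITY

# Upper end of the force range reported across studies
max_force_n = round(grams_force_to_newtons(400), 2)
```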

Essick et al. (2010) was the only included study that investigated the effect of different forces. They found that as force increased the pleasantness scores decreased (experiment 1 and 2, p < 0.001). Tsalamlal et al. (2013) measured the flow rate of air onto the skin surface. The study found that low intensity flow (7.5 l/min) was rated significantly more pleasant than the high intensity flow (50 l/min) (continuous and discontinuous flows, p < 0.0001). A similar observation was also identified by Huisman et al. (2016), with low intensity stimuli, delivered by a robot rated more pleasant than high (p < 0.05).

Site

The most common site for stimulus application was the forearm (k = 16) (Ackerley et al., 2014; Essick et al., 2010; Etzi et al., 2014, 2018; Gordon et al., 2013; Huisman et al., 2016; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017; Triscoli et al., 2013, 2014; Tsalamlal et al., 2013; Voos et al., 2013). Eleven studies applied a stimulus to more than one site (Ackerley et al., 2014; Bennett et al., 2014; Essick et al., 2010; Etzi et al., 2014; Gordon et al., 2013; Jönsson et al., 2015, 2017; Kass-Iliyya et al., 2016; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017) (see Table 2), with an interaction with perceived pleasantness investigated in 8 studies (Ackerley et al., 2014; Bennett et al., 2014; Essick et al., 2010; Etzi et al., 2014; Gordon et al., 2013; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017). The most common comparison was between the forearm and the palm (k = 8) (Ackerley et al., 2014; Bennett et al., 2014; Essick et al., 2010; Etzi et al., 2014; Gordon et al., 2013; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017), with 5 studies concluding that applying the stimulus to the arm was rated as significantly more pleasant (p < 0.05, irrespective of velocity, material or mode of application) (Bennett et al., 2014; Essick et al., 2010; Etzi et al., 2014; Gordon et al., 2013; Löken et al., 2011). Two studies found no significant interaction between sites (Luong et al., 2017; Pawling et al., 2017) and 1 study found that pleasantness ratings were significantly lower when stroking the arm compared to the palm (velocity of 30 cm/s, P = 0.028) (Ackerley et al., 2014).

Gender

Seven studies considered the effect of gender on perceived pleasantness (Ackerley et al., 2014; Essick et al., 2010; Etzi et al., 2018; Jönsson et al., 2015, 2017; Löken et al., 2011; Triscoli et al., 2013). Jönsson et al. (2017) found that women rated pleasantness to touch significantly higher than men (combined study 1 and 2: P < 0.001). Essick et al. (2010) identified significant interactions of gender with body site (experiment 2: p < 0.001), material (experiment 1 and 2, p < 0.002), body site by material (experiment 2: p < 0.001) and velocity (experiment 1: p < 0.001 but not for experiment 2: P = 0.26). Furthermore, they also identified that pleasantness ratings differed between males and females for increasing force (experiment 2: p < 0.001). Within this experiment, both genders had a significant decrease in pleasantness ratings with increasing force (p < 0.05); however, the decrease was greater for males (p < 0.05). The remaining five studies found no significant effect of gender on pleasantness ratings (Ackerley et al., 2014; Etzi et al., 2018; Löken et al., 2011; Jönsson et al., 2015; Triscoli et al., 2013).

Quantitative synthesis (meta-analysis)

To reduce heterogeneity in the meta-analysis, studies that utilised the same material for stimulus delivery on the same site were included. Hence, quantitative synthesis was possible for velocities of 3 and 30 cm/s in the region of the forearm using a brush stimulus. A velocity of 3 cm/s (pleasant and CT favourable) was treated as the pretest score and 30 cm/s (less pleasant and non-CT favourable) as the posttest score. One study provided the raw data of a single stroke at 3 cm/s and 30 cm/s on the forearm for which a pretest–posttest correlation of 0.48 was calculated (Pawling et al., 2017). This estimate was generalised to the remaining studies to allow the computation of the variance of the effect estimate (Morris & DeShon, 2002; Dunlap et al., 1996; Borenstein et al., 2011).

Five studies undertook comparisons of stroking at 3 cm/s and 30 cm/s with similar experimental set-ups for which effect sizes were calculated (Ackerley et al., 2014; Etzi et al., 2018; Pawling et al., 2017; Triscoli et al., 2013, 2014). The overall standardised effect size estimate for velocity was − 0.59 (95% confidence interval [CI] − 0.89, − 0.29; p < 0.001) which is considered a medium effect size (Fig. 2) (Cohen, 1992). This supports that a stroke velocity of 30 cm/s is rated as less pleasant than a stroke at 3 cm/s using a brush stimulus on the forearm, complementing the findings from the systematic review. The I² statistic indicated little variability between studies (18.9%), and the Cochran Q test determined that there was no statistically significant heterogeneity present (Q [df = 4] = 4.94, P = 0.29). Consequently, a subgroup analysis was not performed.
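
The reported heterogeneity statistics can be cross-checked from Q and its degrees of freedom alone. The sketch below uses the standard definition of I-squared as (Q − df)/Q and a closed-form chi-square tail probability that holds only for even degrees of freedom, which covers df = 4; the function names are hypothetical.

```python
import math

def i_squared(q, df):
    """I-squared as a percentage, truncated at zero."""
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Values reported in the text
i2 = i_squared(4.94, 4)
p_value = chi2_upper_tail_even_df(4.94, 4)
```

With the rounded Q of 4.94 this gives an I-squared of about 19.0% and P of about 0.29, close to the reported 18.9% and 0.29; the small gap in I-squared is consistent with Q having been rounded for reporting.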

Fig. 2 Meta-analysis forest plot providing pooled estimate of perceived pleasantness

Sensitivity analysis showed little change in the overall effect when using a fixed effects model (− 0.57, CI − 0.83, − 0.57; p < 0.001), and the leave-one-out analysis had minimal effect (min = 0.50, max = 0.68, all P values < 0.001).

An assessment of publication bias was not performed due to inadequate numbers of included studies to properly assess a funnel plot or more advanced regression-based assessments (Higgins & Green, 2011).

Discussion

To the authors’ knowledge, this is the first systematic review performed in order to identify the optimum psychophysical stimulus to elicit a pleasant sensation. Furthermore, it is the first meta-analysis that has been conducted to establish an estimate of effect between a pleasant (CT favourable) and less pleasant (non-CT favourable) velocity.

The qualitative synthesis found that stroking at a velocity of 3 cm/s was repeatedly rated as significantly more pleasant than slower [0.3 cm/s (Triscoli et al., 2013, 2014)] or faster [30 cm/s (Ackerley et al., 2014; Etzi et al., 2018; Pawling et al., 2017; Triscoli et al., 2013, 2014)] velocities, which in turn were rated as pleasant as each other. This is not surprising as studies have established that a velocity of 3 cm/s is CT favourable, in turn giving rise to the characteristic inverse U-shape profile seen within several studies in this review (Ackerley et al., 2014; Huisman et al., 2016; Jönsson et al., 2015; Luong et al., 2017; Löken et al., 2011). The meta-analysis reinforced these findings as 30 cm/s was found to be less pleasant relative to 3 cm/s from the pooled data. However, the available data were limited, restricting the conclusions that could be drawn. Nonetheless, velocity plays a fundamental role in the perception of a pleasant sensation and provides an ideal variable to randomise and investigate within the methodology of experimental studies involving tactile pleasantness.

The ability to maintain a constant velocity when a stimulus is delivered by a human experimenter may be a technical limitation. In addition, the force of application presents the same problem. Lighter forces are rated as most pleasant (Essick et al., 2010) and maintaining a light force, particularly through a dynamic movement, may inevitably create fluctuations. Methods incorporated to overcome these potential issues included using auditory signals (Etzi et al., 2018); visual aids (e.g. moving bar on a screen) (Etzi et al., 2014; Kass-Iliyya et al., 2016; Löken et al., 2011; Pawling et al., 2017); copying the stroke from an adjacent robot (Triscoli et al., 2013); or visual control of the bend of the brush bristles (Triscoli et al., 2013). A need for stimulus control may justify the use of a robot as the mode of stimulus delivery. Nonetheless, previous research has found that there is no difference in perceived pleasantness when a stimulus is delivered by a human or robot (Triscoli et al., 2013). However, the small number of studies included in the meta-analysis did not allow this factor to be investigated by a subgroup analysis.

The disadvantages of a robotic device would be the associated costs, as well as limitations on its ability to access and deliver a stimulus to certain regions of the body. On the other hand, the use of a robot could reduce any potential confounding caused by the gender of the experimenter (Essick et al., 2010).

Within pain sensitivity studies, an interaction between gender and psychophysical responses has been demonstrated, whereby males behave differently in response to a painful stimulation in the presence of a female, compared to male, experimenter (Heslin et al., 1983). The studies included in this review did not investigate the interaction between the experimenters’ and participants’ gender on perceived pleasantness ratings. Instead, the gender of the participant was investigated, with the majority of studies finding no significant effect of gender on pleasantness ratings (Ackerley et al., 2014; Etzi et al., 2018; Löken et al., 2011; Jönsson et al., 2015; Triscoli et al., 2013). However, the effect of gender may lie within more complex patterns, e.g. interactions with body site, material, or velocity (Essick et al., 2010).

Whether the stimulus was delivered by hand or by a robot, methodologies varied in the material used to apply it. Three studies assessed pleasantness ratings for different materials (Essick et al., 2010; Etzi et al., 2014; Hua et al., 2008) and were consistent with each other, in that the softest materials evaluated were rated as most pleasant. This may explain why a goat hair brush was most often used (Ackerley et al., 2014; Case, Čeko et al., 2016; Case, Laubacher et al., 2016; Croy et al., 2016; Ellingsen et al., 2013; Jönsson et al., 2015; Kass-Iliyya et al., 2016; Liljencrantz et al., 2013, 2014; Löken et al., 2011; Sailer et al., 2016; Triscoli et al., 2013, 2014; Trotter et al., 2016), as it allows the uniform application of a soft material with the ideal characteristics of a light force and velocity (Vallbo et al., 1999). Therefore, a brush would serve as an optimum “prototypical pleasant stimulus” as it may “remain pleasant irrespective of how it is moved across the skin, even if not delivered with the optimal stimulus parameters” (Guest & Essick, 2016). A brush was also a suitable stimulus for studies on the palm. Although the glabrous skin does not contain CT afferents, a brush stroke still elicits a pleasant sensation (Ackerley et al., 2014; Gordon et al., 2013; Luong et al., 2017; Löken et al., 2011; Pawling et al., 2017). This is hypothesised to arise from learnt behaviour through top-down mechanisms as well as from emotional memory circuits (Löken et al., 2011; McGlone et al., 2012). It should also not be overlooked that there is likely an important contributory effect from the Aβ pathway (McGlone et al., 2014), which may also explain why studies presented contrasting findings: optimal and suboptimal CT velocities were equally pleasant in some (3 cm/s vs. 30 cm/s) (Triscoli et al., 2013, 2014), yet significantly different in others (8 cm/s vs. 32 cm/s) (Voos et al., 2013).

The number of strokes delivered also played a role in the perceived pleasantness. Long-lasting stroking at a pleasant and CT favourable velocity caused a small reduction in the pleasantness ratings (Triscoli et al., 2014). This, in part, may have resulted from a property of CT afferents described as fatigue, where the firing of the afferents is reduced upon repeated stimulation (Vallbo et al., 1999; Bessou et al., 1971; Morrison, 2012). Post-activation depression of CT afferents has been identified, in which they reduce their responsiveness to succeeding stimuli following an initial touch (Vallbo et al., 1999; Iggo, 1960). It was found that a resting period of greater duration allowed for better recovery of CT afferents, but full recovery could take several minutes (Iggo, 1960; Iggo & Kornhuber, 1977). Within the methodologies reviewed, the interstimulus interval ranged from 80 to 30 s. It would be expected that repeated stimulations with short intervals could thus reduce the pleasantness response, as the CT afferents are fatigued and have not had sufficient time to recover. Studies varied in the number of strokes delivered, and the fatigue effect may present a source of heterogeneity as it cannot be assumed to be equal across studies.

The VAS was the scale most often used within the included studies. It is known to allow evaluation of an individual’s experience of a phenomenon of interest (Wewers & Lowe, 1990) and, as seen in pain studies, it provides a method to transfer a sensation to a measurable dimension and a reliable way to assess what the patient actually feels (Ohnhaus & Adler, 1975). In addition, the VAS is one of the most common rating scales used in pain assessment, with proven validity and reliability (Ferreira-Valente et al., 2011; Price et al., 1983; Hawker et al., 2011). However, the VAS is not without disadvantages. The data obtained are not always normally distributed, and the entirety of the scale is not always utilised (Williamson & Hoggart, 2005).

The systematic review and meta-analysis have limitations. The search strategy was restricted to publications in English due to resource limitations. Furthermore, the inclusion criteria were strict and led to the exclusion of studies that may be considered fundamental in CT afferent/pleasantness research (Ackerley et al., 2014; Croy et al., 2016; Moher et al., 2009; Morrison et al., 2011). The quality assessment tool was also designed specifically for this review, and its items were not weighted.

Within the meta-analysis, the pretest–posttest correlation needed to compute the effect estimate was derived from one study and generalised to the others. This could result in a less precise pooled estimate, as the correlation may vary with study characteristics such as the duration of stroking and the number of trials. However, with limited data this method was considered the most appropriate (Morris & DeShon, 2002; Dunlap et al., 1996).
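The dependence of a repeated-measures effect estimate on the assumed correlation can be illustrated with a minimal sketch. It uses the standard relation between the standard deviation of paired differences and the correlation between the two conditions, SD_D = sqrt(SD_a^2 + SD_b^2 - 2 r SD_a SD_b), and divides the mean difference by SD_D. The function names and all numerical values are hypothetical and are not taken from the included studies; the formula actually used in the review may differ.

```python
import math


def sd_of_differences(sd_a: float, sd_b: float, r: float) -> float:
    """SD of paired differences, given both condition SDs and their correlation r."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b)


def standardised_mean_change(mean_a: float, mean_b: float,
                             sd_a: float, sd_b: float, r: float) -> float:
    """Mean difference divided by the SD of the paired differences.

    Undefined (division by zero) when r = 1 and both SDs are equal.
    """
    return (mean_a - mean_b) / sd_of_differences(sd_a, sd_b, r)


if __name__ == "__main__":
    # Hypothetical pleasantness ratings for two stroking velocities.
    mean_slow, sd_slow = 6.0, 2.0
    mean_fast, sd_fast = 4.5, 2.0
    # The same summary data give different effect estimates
    # depending on the correlation that is assumed.
    for r in (0.3, 0.5, 0.7):
        d = standardised_mean_change(mean_slow, mean_fast, sd_slow, sd_fast, r)
        print(f"assumed r = {r:.1f} -> standardised mean change = {d:.2f}")
```

In this sketch a larger assumed correlation yields a smaller SD of differences and therefore a larger standardised effect, which is why a correlation borrowed from a single study affects every study to which it is applied.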

Effect size multiplicity was dealt with by a reductionist approach, in order to resolve multiple effect estimates from a single study. As a result, the exploration of heterogeneity within studies was restricted (López-López et al., 2018).

Although the result of Cochran’s Q test was not statistically significant, this does not necessarily indicate homogeneity. The power of the test to detect true heterogeneity is low when only a few studies are included (Higgins et al., 2003). The small number of studies also precluded a subgroup analysis, as well as an evaluation of publication bias.
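For illustration, the Q statistic and the related I² index can be computed from study effect estimates and their variances as sketched below, using inverse-variance (fixed-effect) weights. The function name and the input values are hypothetical and do not reproduce the data of this review.

```python
from typing import List, Tuple


def heterogeneity(effects: List[float],
                  variances: List[float]) -> Tuple[float, float, int, float]:
    """Return (pooled effect, Q, degrees of freedom, I² in percent).

    Q is the weighted sum of squared deviations of each study effect
    from the inverse-variance pooled effect.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of Q exceeding its expected value under homogeneity (df),
    # truncated at zero.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, df, i_squared


if __name__ == "__main__":
    # Hypothetical effect sizes and variances for three studies.
    pooled, q, df, i2 = heterogeneity([0.8, 1.1, 0.5], [0.04, 0.09, 0.06])
    print(f"pooled = {pooled:.2f}, Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```

Under homogeneity Q is referred to a chi-square distribution with k - 1 degrees of freedom, where k is the number of studies. With few studies the test has little power, so a non-significant Q is weak evidence of homogeneity.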

Conclusion

This review focused on passively received stimuli that were assessed for pleasantness. It showed that methodological factors such as texture, velocity, force, and the duration of continuous stroking are likely to play key roles and should be carefully standardised. The stimulus most likely to be rated as pleasant is a brush stroked at a velocity of approximately 3 cm/s with a light force, which is also the optimal type of stimulus for activating CT afferents. Having taken some of these factors into account, the meta-analysis confirmed that a stroking velocity of 30 cm/s was rated as significantly less pleasant than 3 cm/s.

The standardised stimulations identified within the included studies showed that the assessment of pleasantness bears many parallels to the assessment of pain and other somatosensory modalities.