Introduction

Hyperhidrosis (HH) is characterized by excessive focal or generalized sweating [1]. The overall prevalence of primary HH ranges from 0.9 to 20.6%, with a prevalence of primary axillar HH of 1.0–12.9%, primary palmar or plantar HH of 0.6–11.2%, and primary generalized HH of 2.2–6.1% [2,3,4,5,6,7,8,9,10,11,12]. The pathophysiology of primary HH remains incompletely understood [13]. Research has identified a hereditary component in the transmission of primary HH. Other studies have observed both more ganglion cells and larger sympathetic ganglia in individuals with primary HH than in control individuals, as well as increased acetylcholine and nicotinic receptor subunit expression [13]. A group of experts developed diagnostic criteria for focal primary HH, which are used in research along with other symptom-based definitions [1, 14]. In clinical practice, physicians mainly diagnose HH from a composite of patient medical history, physical examination, and absence of underlying sweat-inducing comorbidity [1]. Paraclinical testing can further substantiate the HH diagnosis, although it has been argued that the intermittency of sweating can limit the value of such methods [1, 15,16,17]. The most widely used focal sweat measurement method is gravimetry, while other techniques, including evaporation measurements and staining tests, are also described in the literature [15, 16, 18]. Unfortunately, sweat rates in individuals with HH vary considerably between studies and often poorly reflect self-reported HH sweating [14, 19,20,21]. In addition, patients can assess the impact of living with HH with different patient-reported outcome measures (PROM), which may be particularly relevant for the management of HH. However, there is no consensus on which PROM to prefer, and they are used interchangeably between studies.

The use of validated and consensus-endorsed diagnostic methods, sweat quantification tests, and PROM would allow for inter-study comparison and reliable evaluation of treatments [22,23,24]. The objective of this review is therefore to summarize the existing literature on HH diagnostic criteria, focal sweat measurement methods, and PROM of HH. We also assess the methodological quality of diagnostic accuracy studies about focal sweat measurement methods.

Materials and methods

Literature review

We searched Cochrane Library, Embase, and PubMed. We examined reference lists of included original studies and reviews for additional publications. Two authors (LT and MASH) independently screened the eligible literature for inclusion and conducted the full-text assessment. Disagreements were resolved by discussion within the author group. Decisions on inclusion and exclusion, together with reasons for exclusion, were documented in an Excel spreadsheet by one author (MASH). The review was registered at PROSPERO, id: CRD42020155565.

Inclusion criteria

  1. Study population must have HH.
  2. Study must contain at least one set of diagnostic criteria for HH, one focal sweat measurement method, or one PROM for individuals with HH.
  3. The aim of the study must be to develop or validate HH diagnostic criteria, focal sweat measurement methods, or PROM for individuals with HH.

Exclusion criteria

  1. Study populations of fewer than five participants.

Reviews that developed diagnostic criteria for HH but did not fulfill inclusion criterion 1 were considered eligible. No restrictions on language or study design were applied. Full-length and abstract publications were eligible for inclusion.

Search strategy

An information specialist was consulted in the design of the search strategy. See Online Resource 1 for the search strategy and Fig. 1 for the study selection process. We employed the highly sensitive filter for systematic reviews on PROM for Embase and PubMed [25,26,27,28].

Fig. 1 Flow diagram of how the studies were selected

Data items

Data on criteria for diagnosing hyperhidrosis and assessing severity of hyperhidrosis was collected. For focal sweat measurement methods, data on anatomical location of HH, study population size, number of included females and males, and results of focal sweat measurement methods were collected both for study participants with hyperhidrosis and for control individuals. Results of statistical analyses comparing focal sweat measurement results in individuals with hyperhidrosis and in control individuals were also collected. For studies on PROM, data on measurement properties was collected. All data was extracted by one author (MASH) and recorded in an Excel spreadsheet.

Risk of bias assessment

Risk of bias of individual diagnostic accuracy studies about focal sweat measurement methods was evaluated using the Quality Assessment of Diagnostic Accuracy Studies–2 (QUADAS-2) [29]. QUADAS-2 comprises 11 signaling questions divided between the domains: patient selection, index test, reference standard, and flow and timing. Each item has the response options Yes, No, or Unclear. Patient selection describes the patient inclusion and provides details on the included patients. Index test provides information on the execution and interpretation of the index test. Reference standard describes the execution and interpretation of the reference standard. Flow and timing provides information on patients who did not undergo reference standard or index testing and on the time and interventions between the index test and reference standard [29]. Risk of bias assessments were performed by one author (MASH).

Synthesis of results

This study is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The data is presented narratively and quantitatively.

Results

Diagnostic criteria and severity of disease

We have identified two sets of diagnostic criteria for HH and one scale to assess severity of HH sweating [1, 14, 30, 31].

Diagnostic criteria

Experts have developed criteria for diagnosing focal primary HH based on a review of English language literature (Table 1) [1]. Before primary HH can be diagnosed, secondary HH needs to be excluded. The main causes of secondary HH include cardiopulmonary disease, infections, cancer, endocrine disease, and neurologic disease, as well as various medications and substance misuse [1]. Regional or focal causes of secondary HH include stroke, neurological tumors, and peripheral nerve injury [1]. There are additional rare causes of secondary HH, such as Frey syndrome or eccrine nevus, that should be investigated depending on the findings from the patient history or physical examination [1]. The medical history should focus on the diagnostic criteria of focal primary HH (Table 1) as well as on causes of secondary HH, including medications. The physical examination should focus on objective signs of sweating and on signs of diseases that can cause secondary HH. Examples of symptoms that can suggest secondary HH include fever and palpitations [1]. Supplementary laboratory testing may be necessary [1]. These criteria were examined in a retrospective chart review of 415 patients and compared to patient medical history, laboratory findings, and diagnostic imaging. Six months of HH combined with at least four of the following criteria could differentiate between primary and secondary HH with a sensitivity of 99% and a specificity of 82%: HH in the axillae, face, palms, or soles; bilateral and symmetrical symptoms; impaired daily activities; occurring at least weekly; onset < 25 years of age; positive family history; or cessation while asleep [31].

Table 1 Diagnostic criteria for focal primary hyperhidrosis by Hornberger et al. [1]

A cross-sectional study of 253 students defined HH based on a study-specific diagnostic question, information on anatomical location of sweating, and sweating intensity on a visual analogue scale. The results were then compared to gravimetry measurements. In total, 18 individuals sweated above the study’s diagnostic cutoff for palmar HH (20 mg/min/m2) and 41 sweated above the cutoff for axillary HH (50 mg/min/m2). Sensitivity and specificity of the diagnostic question were 0.98 and 1.00 for axillar HH and 0.89 and 1.00 for palmar HH, respectively [14].
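
Sensitivity and specificity such as those reported above are derived from a two-by-two table that compares a test result with a reference standard. The following Python sketch is a minimal illustration only; the function name and the counts are hypothetical and are not taken from the cited studies.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp/fn: individuals with the condition who test positive/negative.
    tn/fp: individuals without the condition who test negative/positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one individual with and one without the condition")
    sensitivity = tp / (tp + fn)  # proportion of true cases detected
    specificity = tn / (tn + fp)  # proportion of non-cases correctly ruled out
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from any cited study).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=45, fp=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A specificity of 1.00, as reported for the diagnostic question, corresponds to a table without false positives.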

Severity of HH sweating

One overall expert-based method was identified for assessment of HH severity. Axillar, palmar, and plantar HH was classified as mild, moderate, or severe based on sweat stains and symptoms (Table 2) [30].

Table 2 Severity of hyperhidrosis by Wohlrab et al. [30]

Sweat measurement methods

We have identified four focal sweat measurement methods. These include gravimetry, transepidermal water loss (TEWL), Minor’s iodine starch test, and the HH Area and Severity Index (HASI) [14, 15, 18,19,20,21, 24, 32,33,34,35,36]. Data extracted on gravimetry, TEWL, and HASI is presented in Tables 3 and 4. Strengths and limitations of gravimetry, TEWL, Minor’s iodine starch test, and HASI are summarized in Table 5.

Table 3 Gravimetry in individuals with and without hyperhidrosis [18, 20, 21, 32]
Table 4 Transepidermal water loss and Hyperhidrosis Area and Severity Index in individuals with and without hyperhidrosis [19, 24, 33, 34, 36]
Table 5 Strengths and limitations of sweat measurement tests

Gravimetry

Gravimetry is a method that quantitatively measures sweat production. First, the patient is allowed to rest for about 15 min in a sitting position at room temperature [18]. Then, the skin of the axilla is cleaned before a filter paper, which absorbs sweat, is placed in the axilla for 1–5 min. Some researchers cover the filter paper with plastic to prevent sweat evaporation. The weight difference of the filter paper before and after gravimetry, as measured on a high-precision scale, equals the quantity of sweat produced [18, 20]. The gravimetry recording can be performed three times to find the median value [14]. It is important to maintain a room temperature of about 22–25 °C during the resting phase and gravimetry measurements. In addition to the axillae, a common anatomical location for gravimetry is the palms; other anatomical sites are rarely used, as described below [20]. We identified one cohort and two case–control diagnostic accuracy studies that used gravimetry as an index test in individuals with suspected or known HH (Table 3) [18, 20, 21]. We also identified one cross-sectional study that investigated gravimetry in individuals with HH (Table 3) [32]. The test–retest reliability of gravimetry, performed 3 months apart in 229 HH patients after thoracoscopic sympathectomy, was 0.66, 0.79, 0.81, and 0.82 for abdomino-lumbar, axillar, palmar, and facial sweating, respectively [20]. Another study of 253 individuals found a test–retest intraclass correlation of gravimetry after 14 days of 0.91 (p < 0.0001) [14].
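
The arithmetic of a gravimetry recording can be sketched as follows. This is a minimal Python illustration under stated assumptions: expressing the result as a rate in mg per minute (weight difference divided by recording time) is our assumption, and the function name and weights are hypothetical rather than taken from the included studies.

```python
from statistics import median


def sweat_rate_mg_per_min(weight_before_mg: float,
                          weight_after_mg: float,
                          duration_min: float) -> float:
    """Sweat rate from one recording: filter paper weight gain per minute."""
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    return (weight_after_mg - weight_before_mg) / duration_min


# Three hypothetical recordings: (weight before, weight after, minutes).
recordings = [(500.0, 650.0, 5.0), (500.0, 600.0, 5.0), (500.0, 700.0, 5.0)]
rates = [sweat_rate_mg_per_min(*r) for r in recordings]

# The median of repeated recordings is less sensitive to a single outlier.
median_rate = median(rates)
print(f"median sweat rate: {median_rate:.1f} mg/min")
```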

Water evaporation

Transepidermal water loss quantifies focal water evaporation from the skin, which combines the evaporation of both sweat and insensible perspiration from the epithelium. First, the patient is allowed to rest for 10–30 min at room temperature [19, 24, 34]. Then, the TEWL measuring device is placed on the skin surface, and measurements are continued until a steady state of TEWL is reached, which usually takes less than 90 s [19, 24, 33, 34, 37]. The measurement can be performed three times to find the mean or median value [34]. It is important that the room temperature is kept between 20 and 25 °C during the resting phase and the TEWL measurements. For 12 h prior to the TEWL measurement, patients should not perform physical exercise or apply hygiene products to the skin that is to be examined [19]. We have identified one retrospective chart review and three case–control diagnostic accuracy studies that employed TEWL as an index test in individuals with known HH (Table 4) [19, 24, 33, 34]. We did not assess the methodological quality of the study by Andrade et al., as it was published as an abstract and therefore did not contain enough detail [33].

Minor’s iodine starch test

Minor’s iodine starch test qualitatively identifies the hyperhidrotic skin area [35, 38]. First, the skin area is cleaned and dried and then covered in 1–5% iodine solution. After the iodine solution has dried, the iodine-covered portion of the skin is sprinkled with starch powder [35, 39]. As the sweat reacts with the mixture of iodine and starch, the mixture gradually becomes dark. After 10–15 min, inspection of the skin can determine the location of the sweating [40]. Traditionally, Minor’s iodine starch test has been used to qualitatively assess axillar sweating [17]. We have identified an interventional study that investigated the correlation between Minor’s iodine starch test and the Dermatology Life Quality Index (DLQI) before and after axillary Botox treatment [35]. In 19 patients, the Spearman correlation between Minor’s iodine starch test and DLQI was 0.44 (p = 0.06) before Botox treatment, 0.83 (p < 0.0001) 1 week after Botox treatment, and 0.58 (p = 0.03) 9 months after Botox treatment [35].

Hyperhidrosis Area and Severity Index

In HASI, a gravimetry recording, as described above, is conducted for 10 min. Then a Minor’s iodine starch test is conducted. The skin area that is colored by the reaction between sweat, iodine, and starch is covered with a grid paper. By combining the sweat rate of the gravimetry and the area that is colored by Minor’s iodine starch test as defined by the grid paper, the overall sweat production in mg/cm2 per minute can be calculated [15]. We have identified one developmental study and one case–control diagnostic accuracy study of 198 participants that examined the HASI (Table 4) [15, 36]. The HASI was correlated to body surface area (r = 0.89; p = 0.004) [36].
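
One way to read the HASI calculation is sketched below in Python. It is a hedged illustration, not the published algorithm: we assume that the colored area is obtained by counting grid squares of known size and that the index is the sweat mass divided by area and recording time. All names and values are hypothetical.

```python
def colored_area_cm2(n_colored_squares: int, square_area_cm2: float) -> float:
    """Area stained by the iodine starch reaction, estimated from grid paper."""
    return n_colored_squares * square_area_cm2


def sweat_per_area_per_min(sweat_mass_mg: float,
                           area_cm2: float,
                           duration_min: float) -> float:
    """Sweat production in mg/cm2 per minute."""
    if area_cm2 <= 0 or duration_min <= 0:
        raise ValueError("area and duration must be positive")
    return sweat_mass_mg / (area_cm2 * duration_min)


# Hypothetical example: 200 mg of sweat collected over 10 min,
# with 40 colored grid squares of 0.25 cm2 each (10 cm2 in total).
area = colored_area_cm2(n_colored_squares=40, square_area_cm2=0.25)
index = sweat_per_area_per_min(sweat_mass_mg=200.0, area_cm2=area, duration_min=10.0)
print(f"{index:.2f} mg/cm2 per minute")
```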

There are many other ways to objectively measure sweat production that are outside the scope of this study. See Online Resource 2 for details on these.

Patient-reported outcome measures

We have identified 15 PROM developed for HH [3, 4, 41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56]. The PROM are presented below, while PROM development and validation studies and PROM measurement properties, as defined by the Consensus-Based Standards for the Selection of Health Measurement Instruments (COSMIN), are summarized in Table 6 [57]. Strengths and limitations of PROM are summarized in Table 7.

Table 6 Overview of hyperhidrosis patient-reported outcome measures and published studies on their measurement properties [4, 41,42,43,44,45,46,47,48,49,50,51,52, 54, 55]
Table 7 Strengths and limitations of patient-reported outcome measures

Hyperhidrosis Disease Severity Scale

The Hyperhidrosis Disease Severity Scale (HDSS) assesses tolerability and impact on everyday life from HH based on one item [17]. We have not encountered a study describing the development process of the HDSS. Kowalski et al. have assessed the psychometric properties of the HDSS, and the results were published as an abstract [41]. The HDSS has been translated into Portuguese and assessed for construct validity and reliability [54]. The HDSS determines severity of HH based on a score of 1 to 4. A score of 1 indicates mild HH, 2 indicates moderate HH, and 3 or 4 indicates severe HH [17]. See reference for the complete HDSS questionnaire [17].
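
The score-to-severity mapping described above can be expressed as a small lookup. The Python sketch below is illustrative only; the function name is hypothetical, and the categories simply restate the mapping given in the text.

```python
def hdss_severity(score: int) -> str:
    """Map an HDSS score (1 to 4) to the severity category described in the text."""
    if score == 1:
        return "mild"
    if score == 2:
        return "moderate"
    if score in (3, 4):
        return "severe"
    raise ValueError("HDSS score must be an integer from 1 to 4")


for s in (1, 2, 3, 4):
    print(s, hdss_severity(s))
```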

Hyperhidrosis Quality of Life Index

The Hyperhidrosis Quality of Life Index (HidroQOL©) is a quality of life measure designed for clinical and research settings [56]. Two studies have described the development and initial validation process [55, 56]. The HidroQOL© consists of 18 items divided between the two domains daily life activities and psychosocial life [56]. Daily life activities includes items on clothing, physical activities, hobbies, work, and holidays. Psychosocial life includes items on nervousness, embarrassment, frustration, expression of affection, health, other people’s reaction, leaving sweat marks, meeting people, public speaking, appearance, and sex life. The responses of each item (i.e. very much, a little, no, not at all) can be summed to create scores of each of the two domains as well as an overall HidroQOL© score. See reference for the complete HidroQOL© questionnaire [56].

Health-related quality of life by Kuo et al.

Kuo et al. have developed a PROM to measure health-related quality of life in patients with HH [42]. It consists of 29 items divided between the five domains functional, psychological, social, affective, and physical [42]. Functional defines capacity and one’s performance. Psychological describes emotions. Social reflects one’s capacity for socializing with others. Affective describes relations with others. Physical reflects the bodily functions heartbeat, breathing, and insomnia. The response to each item ranges from the least disturbance to the most disturbance on a five-level Likert scale. The entire questionnaire takes 8–10 min to complete [42].

Quality of life by Amir et al.

Amir et al. have developed a questionnaire to assess quality of life in individuals with HH [43]. It contains 35 items divided between the five domains functional, social, interpersonal, emotional self, and emotional other [43]. Functional covers functional impairments and includes writing, driving, and sports. Social includes situations such as handshaking, dancing, and friendships. Interpersonal reflects the relationship with the partner and includes intimate contact. Emotional self reflects one’s own perception of HH, and emotional other reflects how one perceives others’ opinions of HH. The response to each item ranges from strongly agree to strongly disagree on a seven-level Likert scale. The questionnaire can be accessed from the authors [43].

Quality of life by De Campos et al.

The PROM developed by Amir et al. has been further refined and assessed for psychometric properties by De Campos et al. [4, 43,44,45]. It contains 20 items divided between the five domains functional, social, personal, emotional, and special condition [4, 45]. These are similar to the corresponding domains published by Kuo et al. [42]. Additionally, the special condition domain covers various aspects such as tenseness, public speaking, shoe wear, clothing, and problems at school. The response options to each item range from one to five points, which equal excellent to very poor. The overall score can be calculated by summing the points of all items. See reference for the complete questionnaire [45].

Hyperhidrosis impact questionnaire

Teale et al. have developed the hyperhidrosis impact questionnaire (HHIQ) to measure the influence primary HH has on daily lives and to examine the effect of anti-HH treatments [46]. The HHIQ contains 41 items for baseline assessments and 10 items for follow-up assessments, which are divided between four sections [46]. The sections are disease and treatment background; direct impact on medical and non-medical resource utilization; indirect impact on employment and productivity; and intangible impacts on emotional status, limitations in daily living and leisure activities, and treatment satisfaction. The results were published as an abstract.

Keller scale

Keller et al. have developed two questionnaires for self-diagnosing HH and then validated them against physical examination and sweat measurements [47]. The first questionnaire consists of 15 items that cover symptoms of HH. Item responses range from 0 to 10 points, which equal mild to severe disease [47]. The second questionnaire consists of four parts including 10 items on demographics, 25 items on sweating, 21 items on medical history, and 29 items on family history [47]. Each part of the second questionnaire has several additional sub-items, and each part has different response options. See reference for the complete first and second questionnaires [47].

Illness Intrusiveness Rating Scale

Cinà et al. have validated the pre-existing Illness Intrusiveness Rating Scale (IIRS) to assess the burden of HH [48]. They have also developed and validated 11 new items. The IIRS consists of the domains health, diet, work, active and passive recreation, financial situation, relationship with spouse, sex life, family and other social relations, self-expression, self-improvement, religious expression, and community involvement [48]. The 11 new items cover severity of living with HH [48]. All item responses range from not very much to very much, which equal 1 to 7 on a Likert scale. See reference for the 11 new items [48].

The hyperhidrosis disease severity measure – axillary

The hyperhidrosis disease severity measure – axillary (HDSM-Ax) has been developed to assess the severity of primary axillar HH in clinical research [49]. The HDSM-Ax consists of 11 items that inquire into the following: frequency of wet clothes; frequency of sweating for no reason; severity of sweating while nervous, stressed, or anxious; severity of wet clothes from underarm sweating; severity of underarm wetness; severity of sweating during exercise; severity of unmanageable sweating; severity of sweating while cool; desire to change clothes because of underarm sweating; and desire to wipe sweat from armpits. Each item response ranges from zero to four points, and by summing all item response points, an overall score from 0 to 44 points is calculated, which equals no sweating to worst possible sweating. See reference for the complete questionnaire [49].
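
The summation described above (11 items, each scored 0 to 4, giving a total of 0 to 44) can be sketched in Python as follows. The function name and the example responses are hypothetical, and the sketch follows the scoring as summarized in this review rather than the instrument's official scoring manual.

```python
N_ITEMS = 11          # number of HDSM-Ax items as described in the text
MIN_POINTS, MAX_POINTS = 0, 4


def hdsm_ax_total(responses: list[int]) -> int:
    """Sum the item responses after checking their number and range."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses, got {len(responses)}")
    for r in responses:
        if not MIN_POINTS <= r <= MAX_POINTS:
            raise ValueError("each item response must be between 0 and 4")
    return sum(responses)


# Hypothetical set of responses for illustration only.
example = [2, 3, 4, 1, 0, 2, 3, 2, 1, 4, 2]
print(hdsm_ax_total(example))  # total on the 0 to 44 scale
```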

Axillary sweating daily diary

The axillary sweating daily diary (ASDD), the ASDD-children for children aged 9–15 years, the weekly impact, and the patient global impression of change (PGIC) have been developed to evaluate the severity of HH in clinical studies [50]. The ASDD has four items that evaluate the presence and severity of axillary sweating, impact on activities, and degree of bother created by HH. The ASDD-children has two items that evaluate the presence and severity of axillary sweating [50]. The weekly impact has six items that inquire into whether axillary sweating has caused the patient to change their shirt, caused them to shower, affected their confidence, caused embarrassment, affected interaction with others, or limited participation in activities. The PGIC has one item that evaluates degree of sweating before and after treatments. The different PROM have the item response options Yes or No, or scales from 0 to 4, 1 to 7, or 0 to 10. See reference for the complete ASDD, ASDD-children, weekly impact, and PGIC questionnaires [50].

Subjective Self-Evaluation Scale

The Self-Evaluation Scale (SES) has been developed to subjectively assess the degree of sweating. The SES has one item and was validated against objective sweat evaporation measurements [51]. The response option to the item ranges on a scale from 0 to 10, which equals no sweating to worst imaginable degree of sweating.

Swartling Hyperhidrosis Index

The Swartling Hyperhidrosis Index (SHI) has been developed to assess physical, psychosocial, and consequence-related aspects of HH [52, 53]. The SHI consists of the ten domains hygiene, social contact, self-esteem, impact on clothing, physical contact, physical activity, pattern of movement, practical impact, misinterpretation, and somatic impact.

Hyperhidrosis severity of quantitative observation

Fujimoto et al. have developed a questionnaire to determine daily interference from HH. The questionnaire also contains items on severity of sweating (i.e. HH severity of quantitative observation), treatments, demographics, medical history, familial dispositions, and the use of hygiene products [3]. The quantity of sweating is assessed on a three-level scale as mild, moderate, or severe. See reference for items of HH severity of quantitative observation [3].

Discussion

This study reviewed the diagnostic criteria, focal sweat measurement methods, and PROM for HH. Guidelines recommend that HH be diagnosed based on patient medical history, examination, and exclusion of concurrent disease [1]. Neither focal sweat measurement methods nor laboratory sampling has sufficiently high diagnostic value. Several studies have found a mismatch between patient-reported HH sweating and focal sweat measurement results [14, 21, 32, 34]. We speculate that this is because of the unpredictability of sweating in combination with the limited time interval that is allocated to focal sweat measurements.

Gravimetry is instrumental in selecting patients for clinical studies and for evaluating the effect of HH treatments [16, 58]. As outlined above, however, the intermittent nature of sweating can limit the diagnostic value of gravimetry [17]. In support of this, the included studies report a substantial variation in sweating both in individuals with and in individuals without HH. Transepidermal water loss may also be limited by the unpredictability of sweating and by variations in ambient humidity and skin tone [19, 34]. Despite these potential limitations, the included studies reported a higher TEWL in axillar and palmoplantar measurements in individuals with HH than in control individuals [19, 24, 33, 34]. We speculate that the moist environment created by HH may induce skin maceration and water evaporation and consequently increased TEWL [59].

Sweat measuring techniques outside the scope of this study, such as the ventilated capsule technique or the sudomotor axon reflex test, may have diagnostic potential for HH. Currently, these methods lack important validation studies for diagnosing HH, and it therefore remains uncertain whether they are subject to the unpredictability of sweating or other limitations. In addition to Minor’s iodine starch test, other staining techniques include application of the compounds quinizarin or alizarin on the skin [60, 61]. Tests with these compounds are ideally conducted in cabinets that can reduce external inter-patient differences [60, 61]. However, this more burdensome testing may explain why these techniques are not as widely used as Minor’s iodine starch test and also why they have not been previously validated for HH. In any case, because of potential allergy towards iodine, it is important to have alternative staining compounds or solutions available. In current guidelines, neither quantitative nor qualitative HH sweat tests hold diagnostic value. However, development and validation of new techniques may hold the potential to diagnose HH in the future.

In the diagnostic accuracy studies on sweat measurements, patient selection may have been limited by the case–control design of most studies and by nonconsecutive patient enrollment, which may overestimate the index tests’ diagnostic characteristics [62]. The interpretation of sweat measurement results may have been influenced by the researchers’ a priori knowledge of HH status in patients and control individuals. The results of the focal sweat measurement methods may have been subject to non-identical conditions and procedures. Different conditions, including temperature and humidity, can influence sweat gland activity and thus the results of sweat measuring tests [63]. Although it is challenging to reproduce identical inter-study conditions when conducting sweat measurement tests, it may be necessary in order to eliminate the uncertainty that variations in humidity, temperature, or other external conditions introduce. For the reference standard, most studies merely stated that the included individuals had HH. Hence, it remains uncertain how the diagnosis was arrived at and whether the included studies used matching diagnostic criteria. However, as the HH patients were included from hospital clinics, we assume that HH was diagnosed by physicians based on guidelines, unless otherwise specified. In the flow of patients, each study applied a reference standard to all the included patients, but no author disclosed the time from diagnosing HH to index testing. However, due to the chronicity of HH, the probability of HH having resolved in time for the index testing was likely negligible.

The most frequently used PROM in individuals with HH is the HDSS [50]. The HDSS is used to identify individuals with moderate to severe HH and thus identify individuals who are eligible for clinical trials and who are candidates for Botox treatment [17, 64]. Furthermore, the HDSS is often employed to determine construct validity of other PROM [49, 50, 54, 56]. The brevity of the HDSS, with one item, is appealing for its efficiency. However, the HDSS is designed to assess tolerability and impact on daily activities in a single question, and therefore it does not allow for assessment of the two concepts separately [50]. We have not found studies that examine content validity of the HDSS. Content validity refers to how well the PROM measures all aspects of the construct it intends to measure and is therefore considered the most important measurement property [27]. The absence of studies assessing the content validity of the HDSS means that it remains uncertain whether the HDSS adequately reflects tolerability and impact on daily activities from HH.

In the current review, we have systematically developed a search strategy under guidance from an information specialist, employed highly sensitive PROM search filters designed for systematic reviews, and assessed the risk of bias in studies on focal sweat measurement methods. There are limitations that need to be addressed. We have included grey literature such as conference abstracts, which are succinct and therefore may not provide a high degree of detail. Furthermore, we have not evaluated the methodological quality of studies on HH diagnostic criteria or PROM, as it was outside the scope of this review.

Conclusions

The current algorithm for diagnosing primary focal HH needs more stringent validation in larger cohorts. The pertinent literature on focal sweat quantification is mostly based on a few papers about gravimetry and TEWL. Additional methodologically sound research that assesses the test characteristics of focal sweat measurement methods in large study populations is warranted. Some of the most frequently used PROM for HH lack important validation data, and no consensus on their use exists to date. The use of a validated and consensus-endorsed PROM would allow for inter-study comparison and more reliable evaluation of treatments. A potential solution is to develop a core outcome set that can standardize the outcomes in all clinical trials.

Availability of data and material

Data sharing is not applicable to this article as no new data were created or analyzed in this study.