Introduction

Premenstrual emotional, cognitive, and physical symptoms affect millions of women during their reproductive years, with most symptoms occurring during the final premenstrual phase and subsiding a few days after menses begins (Pearlstein, Yonkers, Fayyad, & Gillespie, 2005). A mild form of this premenstrual syndrome (PMS) is common, occurring in approximately 75% of women of reproductive age (Campbell, Peterkin, O’Grady, & Sanson-Fisher, 1997). Premenstrual dysphoric disorder (PMDD; American Psychiatric Association (APA), 1994) or late luteal phase dysphoric disorder (LLPDD; APA, 1987), a severe form of PMS, is considered to affect at least 3–8% of women of reproductive age (Halbreich, Bornstein, Pearlstein, & Kahn, 2003).

The defining characteristic of both PMS and PMDD is the cyclic pattern of symptoms, which must be confirmed by prospective daily self-ratings of symptoms over two consecutive menstrual cycles. The level of symptom severity must also be high enough that it interferes with functioning in work, family, or social relationships (American College of Obstetricians and Gynecologists (ACOG), 2000a; APA, 1994). The main difference between PMS and PMDD lies in the number, severity, duration, and quality of symptoms. Whereas for PMS only one symptom out of a list of physical and emotional symptoms has to be present over three prior cycles (ACOG, 2000a), for PMDD at least five symptoms have to be confirmed in the majority of cycles over the preceding 12 months (APA, 1994). Furthermore, at least one of these has to be an affective symptom (depressed mood, anxiety/tension, affective lability, or anger/irritability). Finally, for PMDD it is also a requirement that the premenstrual symptoms are not merely an exacerbation of the symptoms of some other mental disorder (APA, 1994).

Currently there is no consensus on what causes premenstrual symptoms. Biological, psychological, environmental, and social factors are all thought to play a part and seem to interact with each other. Biological features such as hormonal imbalance, abnormal neurotransmitter responses as well as genetic vulnerabilities have all been discussed as etiological factors. Multidimensional approaches assume that along with biological factors, there are also psychological, environmental, and social aspects. Theoretical justification for multifactorial assumptions comes for example from a bio-psychosocial approach, which focuses on the appraisal of premenstrual symptoms resulting in a series of different idiosyncratic vicious circles that are fuelled by anxious and depressive reactions, due to focusing attention on physical and emotional correlates of hormonal changes and increased striving for control (Blake, Salkovskis, Gath, Day, & Garrod, 1998).

Experts of the ACOG suggest pharmacotherapy as a first-line intervention for PMDD (2000b), especially selective serotonin reuptake inhibitors (SSRIs) and other serotonergic antidepressants (e.g., clomipramine) (Bhatia & Bhatia, 2002; Cunningham, Yonkers, O’Brien, & Eriksson, 2009). The anxiolytic drug alprazolam is recommended only as a second-line drug due to its potential for drug dependence (Bhatia & Bhatia, 2002). In view of the serious side effects, hormonal therapies are considered only in cases of limited responses to antidepressants or anxiolytic drugs (Bhatia & Bhatia, 2002). There is no strong evidence to support the use of oral contraceptives for the treatment of PMS (Cunningham et al., 2009).

In contrast to the unidimensional pharmacological approach, a multidimensional perspective of PMS suggests that interventions should take place within a multidisciplinary framework where pharmacological and psychotherapeutic treatments complement each other. Such a multidisciplinary concept could combine psychotropic therapy, psychoeducational approaches focusing on lifestyle changes (Taylor, 1999), and cognitive-behavioral interventions in particular (Blake et al., 1998).

Although a multi-disciplinary approach to PMS takes on greater significance, an extensive literature search did not reveal any conjoint meta-analysis on the efficacy of psychotherapeutic and psychotropic treatment approaches or on the efficacy of combined interventions. Only separate meta-analyses of psychotherapeutic or pharmacological treatments were obtained. Due to differences in types of effect size indices and types of outcome measures, the comparability of the results of these meta-analyses is limited. For psychopharmacological treatments of PMS, three meta-analyses were found that focus on the efficacy of SSRIs (Brown, O’Brien, Marjoribanks, & Wyatt, 2009; Halbreich, 2008; Shah et al., 2008). All three analyses indicated SSRIs as an effective treatment for PMDD, although the author of one of these meta-analyses emphasized that the response rate to SSRIs was only about 40% (Halbreich, 2008). Regarding psychotherapeutic interventions for PMS, only one meta-analysis was found (Busse, Montori, Krasnik, Patelis-Siotis, & Guyatt, 2008). The authors identify cognitive-behavioral interventions in particular as an effective treatment option.

In the current meta-analysis we aimed to provide a quantitative review of controlled trials of psychotherapeutic interventions and psychopharmacotherapy, or combined treatments for PMS and PMDD. The effects obtained by these trials were analyzed separately for different outcomes.

Methods

Search Procedure and Study Selection

A multiple-phase search process was conducted. First, a computerized search using MEDLINE, PsycINFO, Cochrane Library CENTRAL, ClinicalTrials.gov register, MetaRegister of Controlled Trials, and ProQuest Digital Dissertations was carried out. Symptom-specific keywords (e.g., premenstrual, premenstrual syndrome, PMS, LLPDD, PMDD) and intervention-specific search strategies were used (e.g., psychologic* OR cogniti* OR behaviour* OR behavior*; antidepressant* OR tranquilizer OR anxiolytic OR benzodiazepine*). In the next phase, previous reviews were manually screened. Lastly, relevant institutions were contacted as well as several experts on PMS. All relevant studies published up until March 2010 were collected. There were no restrictions regarding the language of the article.

The search process revealed 212 potentially relevant references. In many cases different references could be assigned to one study (for example as an abstract of a paper presentation, as original study report or as a protocol of a trial registration). Therefore, of the 212 references, only 155 potentially relevant studies could be extracted (see Fig. 1). Of those, only the studies fulfilling the following criteria were included in the meta-analysis: (a) psychotherapy (no restrictions regarding type, mode, setting, or duration of the treatment) or psychopharmacotherapy (antidepressants, neuroleptics, mood stabilizers, anxiolytics, tranquilizers, phytopharmaca) or a combination of both interventions was examined; (b) treated individuals were aged between 18 and 45 years; (c) participants were characterized by moderate to severe PMS or fulfilled the diagnostic criteria of PMDD or LLPDD (diagnosis confirmed by a prospective symptom diary or by an interview conducted by a mental health professional or a gynecologist); (d) comorbid affective disorders were an exclusion criterion in the study; (e) participants had not been non-responders in a previous study; (f) a prospective, controlled design (randomized or non-randomized) was applied; only control groups registered on a wait list, receiving no treatment, or receiving a placebo treatment were considered for effect size calculations; (g) each treatment group included at least ten patients; (h) prospective or retrospective and self- or clinician-rated outcome measures of defined psychological outcomes (see below) were administered.

Fig. 1 Process of selection of trials

The status of fulfilling the inclusion criteria was discussed for each study in regular sessions by the three authors and disagreements were resolved by consensus. A flow chart of the study selection process is presented in Fig. 1. Of the 155 potentially relevant studies, only 23 studies fulfilled the inclusion criteria. Reasons for exclusion are summarized in Fig. 1. Most of the studies were excluded as they did not fulfill the criteria in relation to study design or statistical information. Typical reasons of exclusion associated with study design were for example that the study was a case report, the sample size was too small, the study did not involve a controlled study design, or the research question did not focus on evaluating the efficacy of a specific treatment.

Regarding the criteria for statistical information, in some of the excluded studies the relevant statistical values could not be calculated as the data did not fulfill the preconditions of statistical analyses. Therefore, often only medians and interquartile ranges were reported. Furthermore, a lot of studies were based on a crossover design where participants were used as their own controls. Separate results for the group receiving a medication first and the group receiving the placebo first were often not reported. The number of studies that were excluded due to the criteria of participants (e.g., sample included adolescents or patients fulfilled criteria of comorbid affective disorders), criteria of outcome (e.g., only global measures of premenstrual distress were used), or criteria of intervention (e.g., evaluation of the efficacy of hormones, vitamins, or sleep deprivation) was rather small.

Data Extraction and Assessment of Study Quality

For each study, participant- and intervention-specific information, methodological aspects, and the data needed to calculate the effect sizes were coded using a standardized coding scheme (the coding manual can be requested from the corresponding author). Quality of the studies was assessed using the Jadad scale (Jadad et al., 1996). This instrument contains three items that assess whether the trial was described as randomized, whether it was described as double-blind, and whether dropouts were described. The second item is problematic to rate for studies evaluating psychotherapy, as double-blind designs are not really suitable for psychological treatments. For this reason, we replaced the second item with a rating of the quality of the diagnostic procedure, as differentiation between PMS and other medical or psychiatric conditions is important for appropriate treatment (Freeman, 2003; Halbreich et al., 2007). We coded one point only if the PMS diagnosis and the assessment for comorbid disorders were based on a prospective screening phase and a structured clinical interview. We drew a random sample of 25% of the included studies for coding by two independent raters who had received training in the use of the coding scheme (Lipsey & Wilson, 2001). The inter-rater reliability for the items of the coding scheme ranged between κ = .57 and κ = 1.00 as well as r = .64 and r = 1.00, and for the items of the quality scale between r = .76 and r = .87. Due to the inadequate internal consistency of the Jadad scale (Cronbach’s α = −.09), we evaluated the three items separately. For the effect sizes, inter-rater reliability was also sufficient (r = .79). Disagreements between the raters were resolved by consensus and, if necessary, the coding manual was corrected.
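As an illustration of the agreement index reported for categorical coding items, the following minimal sketch computes an unweighted Cohen's kappa for two raters. The text does not state which variant of κ was used, so the unweighted two-rater form is an assumption, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters coding the same items."""
    n = len(ratings_a)
    # Proportion of items on which both raters chose the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for one binary item (1 = criterion met, 0 = not met).
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 0, 0, 0]
print(cohens_kappa(rater_1, rater_2))
```

Kappa is undefined when chance agreement equals 1 (both raters use a single category throughout); the sketch does not handle that case.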

Effect Size Calculation

We aggregated the effect sizes separately for the primary outcome “mood” (e.g., the subscale “negative affect” of the Menstrual Distress Questionnaire; Moos, 1986), and the following secondary outcomes: “behavioral changes and reactions of autonomic nervous system (ANS)” (e.g., the subscale “food cravings” of the Daily Symptom Report; Freeman, DeRubeis, & Rickels, 1996); “physical symptoms” such as pain or water retention (e.g., the subscale “breast pain” of the Daily Rating Form; Endicott & Halbreich, 1982); and “functional impairment” (e.g., the Sheehan Disability Scale; Sheehan, 1983). Whenever a study reported several measures for the same outcome, we coded the effect size for each measure separately and calculated a mean effect size (Lipsey & Wilson, 2001). Where there were groups of different treatment conditions in a study with a common control condition, each trial arm was entered separately and the number of participants in control conditions was divided equally between the arms.
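The two bookkeeping rules in this paragraph can be sketched as follows. The function names and the numbers are hypothetical, and the handling of a control group that does not divide evenly is an assumption, since the text only states that it was divided equally.

```python
from statistics import mean


def mean_effect_size(effect_sizes):
    """Average several effect sizes that measure the same outcome in one study."""
    return mean(effect_sizes)


def split_control_group(n_control, n_arms):
    """Divide a shared control group as evenly as possible across treatment arms."""
    base, remainder = divmod(n_control, n_arms)
    # Assumption: when the split is uneven, the first `remainder` arms
    # receive one extra control participant.
    return [base + 1 if i < remainder else base for i in range(n_arms)]


# Hypothetical values, not data from any included study.
print(mean_effect_size([0.40, 0.60]))  # two mood measures from one study
print(split_control_group(30, 2))      # one placebo group, two drug arms
```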

Between-group effect sizes were calculated separately for psychotherapeutic interventions, SSRIs or other serotonergic drugs, and other psychotropic drugs, as well as separately for each outcome according to the method described by Hedges and Olkin (1985). We extended the formula to account for pre-treatment differences between the treatment and control groups (Leichsenring, Rabung, & Leibing, 2004), because even with randomization there are often considerable baseline differences between the groups. To correct for the bias caused by small sample sizes, we applied a simple correction formula (Hedges, 1981). In order to examine the stability of the between-group effects, we calculated effect sizes separately for assessments directly after therapy and at follow-up, a maximum of 12 months post-treatment. For prospective measures we included only the score during the luteal phase. For all measures we used the outcome score of the last treatment cycle, with the exception of one study that reported only a summarized score over all treatment cycles (Steiner et al., 1995).
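The exact formula is not reproduced in the text, so the following is only a sketch of one common way to implement such a calculation: the standardized post-treatment difference minus the standardized baseline difference, multiplied by the Hedges (1981) small-sample factor. The sign convention (lower symptom scores are better, positive values favor treatment), the `GroupStats` container, and the example numbers are all assumptions made for illustration.

```python
import math
from typing import NamedTuple


class GroupStats(NamedTuple):
    """Summary statistics for one study arm (hypothetical container)."""
    mean_pre: float
    sd_pre: float
    mean_post: float
    sd_post: float
    n: int


def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Pooled within-group standard deviation."""
    return math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2))


def small_sample_factor(n_1, n_2):
    """Approximate bias correction of Hedges (1981)."""
    return 1.0 - 3.0 / (4.0 * (n_1 + n_2) - 9.0)


def between_group_d(treated, control):
    """Baseline-adjusted, bias-corrected standardized mean difference."""
    d_post = (control.mean_post - treated.mean_post) / pooled_sd(
        treated.sd_post, treated.n, control.sd_post, control.n)
    d_pre = (control.mean_pre - treated.mean_pre) / pooled_sd(
        treated.sd_pre, treated.n, control.sd_pre, control.n)
    # Subtracting d_pre removes the advantage (or handicap) present at baseline.
    return (d_post - d_pre) * small_sample_factor(treated.n, control.n)


# Hypothetical symptom scores: the treated arm starts slightly worse.
treated = GroupStats(mean_pre=52.0, sd_pre=10.0, mean_post=40.0, sd_post=10.0, n=20)
control = GroupStats(mean_pre=50.0, sd_pre=10.0, mean_post=45.0, sd_post=10.0, n=20)
print(round(between_group_d(treated, control), 3))
```

Without the baseline term, the same data would yield an uncorrected difference of 0.5 rather than 0.7, which illustrates why the adjustment matters when groups differ before treatment.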

In order to estimate the uncontrolled effect of SSRI and psychotherapeutic interventions, within-group effect sizes of intra-individual pre-post comparisons were calculated separately for each outcome using a statistic described by Becker (1988). The single study effect sizes were calculated using Microsoft Office Excel® (2003).
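A minimal sketch of a within-group effect size in the spirit of Becker (1988) is given below: the pre-post change is expressed in units of the pretest standard deviation and multiplied by a small-sample correction with n − 1 degrees of freedom. The sign convention (symptom reduction yields a positive value) and the example numbers are assumptions.

```python
def within_group_d(mean_pre, mean_post, sd_pre, n):
    """Standardized mean change for one group, scaled by the pretest SD."""
    raw = (mean_pre - mean_post) / sd_pre
    # Small-sample bias correction with n - 1 degrees of freedom.
    correction = 1.0 - 3.0 / (4.0 * (n - 1) - 1.0)
    return raw * correction


# Hypothetical values: a 10-point symptom reduction with a pretest SD of 10.
print(within_group_d(mean_pre=50.0, mean_post=40.0, sd_pre=10.0, n=20))
```

Because this statistic has no control group, it also captures placebo response and spontaneous change, so it is expected to exceed the corresponding between-group effect.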

Integration of Effect Sizes

Prior to the data aggregation we excluded extreme values deviating more than three standard deviations from the mean of all effect sizes (Lipsey & Wilson, 2001). In a further step, effect sizes were weighted corresponding to assumptions of a fixed effect model (Hedges & Olkin, 1985). Weighted effect sizes were then aggregated to a mean effect size with a 95% confidence interval. If the Q-statistic indicated significant heterogeneity, effect sizes were recalculated on the basis of a random effects model (Hedges & Olkin, 1985). For the calculation of pooled effect sizes we applied SPSS® macros developed by Wilson (2005, available from the web address: http://mason.gmu.edu/~dwilson/ma.html). For interpretation of the magnitude of the effect sizes the convention established by Cohen (1977) was used. According to this an effect size of .20–.30 is defined as a small effect, an effect size of around .50 as a medium effect, and an effect size of .80 to infinity as a large effect.
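The aggregation steps described above can be sketched as follows. This is not the SPSS macro code; it is a plain-Python illustration that assumes the usual large-sample variance of a standardized mean difference and a method-of-moments estimate of the between-study variance for the random effects model. All function names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def d_variance(d, n_t, n_c):
    """Large-sample variance of a standardized mean difference."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))


def drop_extreme(effects, threshold=3.0):
    """Keep effect sizes within `threshold` SDs of the unweighted mean."""
    m, s = mean(effects), stdev(effects)
    return [d for d in effects if abs(d - m) <= threshold * s]


def pool(effects, variances, tau2=0.0):
    """Inverse-variance weighted mean, 95% CI, and Q.

    With tau2 = 0 this is the fixed effect model, and Q is the
    heterogeneity statistic (compare with chi-square, k - 1 df).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    d_plus = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    q = sum(w * (d - d_plus) ** 2 for w, d in zip(weights, effects))
    return d_plus, (d_plus - 1.96 * se, d_plus + 1.96 * se), q


def tau_squared(effects, variances):
    """Method-of-moments between-study variance (assumed estimator)."""
    w = [1.0 / v for v in variances]
    _, _, q = pool(effects, variances)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    if c <= 0:
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)


# Hypothetical effect sizes and variances for two studies.
effects, variances = [0.2, 0.8], [0.04, 0.04]
fixed_mean, fixed_ci, q = pool(effects, variances)
tau2 = tau_squared(effects, variances)
random_mean, random_ci, _ = pool(effects, variances, tau2)
print(fixed_mean, fixed_ci, q)
print(random_mean, random_ci, tau2)
```

In this toy example the pooled mean is the same under both models, but the random effects interval is wider and now includes zero. The same mechanism can turn a significant fixed effect estimate into a non-significant one after recalculation.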

Publication Bias

There is a risk that meta-analysis may over-represent published literature that is biased toward studies showing statistically significant findings (Lipsey & Wilson, 1993). We tried to reduce this bias by carrying out a thorough search for fugitive studies. In addition, we assessed for associations between effect sizes and their sample sizes by inspecting funnel plots and performing a file drawer analysis (Orwin, 1983).
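For orientation, Orwin's fail-safe N can be written as a one-line function. It gives the number of unretrieved studies with a given mean effect (zero by default) that would be needed to pull the observed mean effect down to a chosen criterion value. The inputs below are hypothetical and are not the values of the present analysis.

```python
def orwin_fail_safe_n(k, mean_d, criterion_d, missing_d=0.0):
    """Number of additional studies needed to reduce mean_d to criterion_d.

    k           -- number of effect sizes in the meta-analysis
    mean_d      -- observed mean effect size
    criterion_d -- effect size regarded as negligible
    missing_d   -- assumed mean effect of the unretrieved studies
    """
    return k * (mean_d - criterion_d) / (criterion_d - missing_d)


# Hypothetical example: 10 studies with a mean effect of 0.5.
print(orwin_fail_safe_n(k=10, mean_d=0.5, criterion_d=0.1))
```

The result depends strongly on the chosen criterion: halving `criterion_d` more than doubles the fail-safe N, so the criterion should always be reported alongside it.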

Missing Data

For studies that did not report the necessary means and standard deviations, we calculated algebraically equivalent effect sizes from t- or F-values, or exact probability levels (Glass, McGaw, & Smith, 1981). The authors of five of the potentially relevant studies (see Fig. 1) in which relevant statistical values were not reported were contacted. Unfortunately, none responded and the missing data were not provided. For the remaining studies with insufficient statistical information, authors could not be contacted because the studies were old (conducted in the 1990s or earlier), so contact details were out of date and current contact information could not be obtained.
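The standard conversions for two independent groups can be sketched as follows. They assume an independent-samples t-test or a one-way F-test with one numerator degree of freedom; because F carries no sign, the direction of the effect has to be supplied separately. Conversion from an exact p-value requires the inverse t distribution and is omitted here. Function names and values are illustrative.

```python
import math


def d_from_t(t, n_1, n_2):
    """Standardized mean difference from an independent-samples t-value."""
    return t * math.sqrt(1.0 / n_1 + 1.0 / n_2)


def d_from_f(f, n_1, n_2, sign=1):
    """Standardized mean difference from a two-group F-value (1 numerator df).

    `sign` must be set from the reported direction of the group difference.
    """
    return sign * math.sqrt(f * (n_1 + n_2) / (n_1 * n_2))


# For two groups F equals t squared, so both routes give the same value.
print(d_from_t(2.0, 20, 20))
print(d_from_f(4.0, 20, 20))
```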

Results

Characteristics of Included Studies

Twenty-three studies fulfilled the inclusion criteria. Of those, one study (Stone, Pearlstein, & Brown, 1991) had to be excluded due to extremely deviating effect sizes. The main characteristics of the remaining 22 included studies are summarized in Tables 1 and 2. Three of the studies examining psychotherapy (50%) and four of the psychopharmacological studies (25%) were not included in previous meta-analyses. Nine of the studies examining psychopharmacological interventions (56%) were published between 2000 and 2009, but none of the psychotherapeutic studies were within this time range. Sixteen studies came from either the USA or Canada, two from the UK, two from Sweden, one from Australia, and one study was from Yugoslavia.

Table 1 Characteristics of included studies of psychotherapeutic interventions
Table 2 Characteristics of included studies of psychopharmacotherapy

The studies included a total of 173 participants receiving psychotherapy and 1,656 receiving psychopharmacotherapy (both sample sizes include dropouts). On average, participants of psychotherapeutic studies received six sessions during a mean period of 40 days or 1.4 cycles (range: 28–84 days or 1–3 cycles). Only cognitive-behavioral or pure cognitive interventions were examined in the included studies. The mean duration of psychopharmacotherapy was 83 days or 2.9 cycles (range: 28–168 days or 1–6 cycles). An average of 79% of individuals treated with medication and 61% of individuals in the control groups experienced at least one intervention-related side effect (most frequent side effects: insomnia, fatigue, nausea, dry mouth, and headache). Only 9 of the 16 included pharmacological studies provided information about tolerability of the medication. The treatment modalities varied considerably between the included studies: nine of the studies used continuous dosing, five used dosing only during the luteal phase, and one used symptom-onset dosing. In one study continuous and luteal dosing were compared. The mean age of the treated participants of all included studies was 35.9 years and the mean duration of PMS symptoms was 7.9 years.

In 59% of all included studies, medical concomitant treatments (of non-PMS related conditions that can have an impact on premenstrual symptoms) were stopped or remained stable during the treatment period. A screening phase of PMS symptoms prior to the start of the treatment was implemented in all of the studies examining the efficacy of psychotropic drugs, whereas only four out of the six included psychotherapeutic studies implemented such a screening phase. The stability of treatment effects was assessed in five of the six psychotherapeutic studies but none of the psychopharmacological studies. The follow-up period in the psychotherapeutic studies ranged between 1 and 12 months. The psychotherapeutic studies revealed lower scores in all of the three items of the modified Jadad scale (item 1: M = 1.00, SD = 1.10, range: 0–2; item 2: M = 0.33, SD = 0.52, range: 0–1; item 3: M = 0.83, SD = 0.41, range: 0–1) in comparison to the pharmacological studies (item 1: M = 1.29, SD = 0.99; item 2: M = 0.57, SD = 0.51; item 3: M = 1.00, SD = 0.00).

Effect Sizes and Effect Stability of Psychotherapy

Table 3 summarizes all between-group effect sizes and test statistics as well as heterogeneity indices, separately for the different types of outcomes. If the standard effect size calculation (based on a fixed effect model) revealed a significant heterogeneity index, effect sizes were recalculated according to a random effects model (see footnotes in Table 3).

Table 3 Weighted mean effect sizes and heterogeneity indices separately for type of therapy and outcome

At post-assessment, effect sizes based on between-group contrasts were small and non-significant for behavioral changes or ANS reactions and for physical symptoms. For mood and functional impairment, moderate effects were found. When the medium effect size for the outcome “mood” (d+ = 0.69; 95% CI 0.40–0.97) was recalculated under a random effects model, the effect became non-significant (see Table 3). Effect sizes based on follow-up assessments could only be calculated for behavioral changes or ANS reactions, d+ = 0.60, 95% CI 0.30–1.16, as well as mood measurements, d+ = 0.46, 95% CI −0.09–1.01. They indicate stability of the treatment effects.

In addition, intra-individual pre-post effect sizes for psychotherapeutic interventions were calculated. For mood, a significant small effect size, d+ = 0.42, 95% CI 0.19–0.56, was found. For physical symptoms, behavioral changes and ANS reactions, or functional impairment, effect sizes were non-significant and ranged from small to medium (range: d+ = 0.17–0.51).

Effect Sizes and Effect Stability of Psychopharmacotherapy

For SSRIs and serotonergic antidepressants, small, significant between-group effects on mood, behavioral changes and ANS reactions, and physical symptoms were found. For functional impairment the effect was of medium size. In contrast, other psychotropic drugs (e.g., alprazolam, desipramine, bupropion, Hypericum perforatum) were predominantly associated with only very small, or small, non-significant effect sizes at post-assessment for almost all outcomes (see Table 3). Due to the lack of follow-up assessments in pharmacological studies the stability of the treatment effects could not be analyzed.

Finally, intra-individual pre-post effect sizes were also calculated for SSRIs. With the exception of physical symptoms (d+ = 0.73, 95% CI 0.62–0.84), only large, significant effect sizes were obtained, ranging between d+ = 0.94 and d+ = 1.38.

Effect Sizes and Effect Stability of a Combined Treatment Including Psychotherapeutic and Psychotropic Interventions

Only one study was found which examined the efficacy of a cognitive-behavioral therapy in direct comparison with an antidepressant, as well as a combination of the psychotherapeutic and psychotropic treatment (Hunter, Ussher, Browne et al., 2002; Hunter, Ussher, Cariss et al., 2002). It had to be excluded from our meta-analysis as it did not include a placebo control group. Regarding the total score on the premenstrual symptom diary, the results show that both separate and combined interventions are effective and do not differ significantly in efficacy. Furthermore, differential treatment effects of both interventions were obtained. Whereas the antidepressant (fluoxetine) had a greater impact upon anxiety symptoms, the cognitive-behavioral intervention was related to an increased use of cognitive and behavioral coping strategies and a shift from a biomedical to a bio-psychosocial causal attribution of premenstrual symptoms. At follow-up the cognitive-behavioral therapy showed better maintenance of treatment effects compared with fluoxetine. Results did not support an additional effect for combined treatment.

Publication Bias

Only two unpublished psychotherapeutic studies (see Table 1), but no unpublished pharmacological studies, could be included in our meta-analysis. In order to check for existence of a publication bias in pharmacological studies we examined this potential problem graphically with a funnel plot of effect sizes for the primary outcome “mood”, at post-assessment. Figure 2 displays the funnel plot for pharmacological studies; the distribution of effect sizes assumes the typical shape of a funnel plot. Using a funnel plot to check for publication bias in the psychotherapeutic studies did not make sense due to the small number of studies.

Fig. 2 Funnel plot of study effect sizes for the primary outcome “mood” (based on contrasts between a group treated with a psychotropic drug and placebo group at post-treatment) with an overall weighted mean effect size of d+ = 0.37 (95% CI 0.28–0.46), displayed as a function of sample size of treatment group

In addition, a file drawer analysis for the primary outcome “mood” resulted in a fail-safe N of 36 studies for psychotherapy and a fail-safe N of 50 for psychopharmacotherapy. This means that 36 (for psychotherapy) or 50 (for psychopharmacotherapy) studies with zero effects would be necessary to reduce the observed weighted mean effect sizes of d+ = 0.69 and of d+ = 0.38 to a very small effect size (dc = 0.01). Results demonstrate that although the effect sizes are rather small they seem to be relatively robust.

Discussion

In this paper we presented the results of a meta-analysis examining the efficacy of psychotherapeutic and psychotropic interventions, or a combination of both for moderate to severe premenstrual symptoms. The psychotherapeutic studies included only examined cognitive-behavioral interventions, not any other forms of psychotherapy. Analyses revealed for psychotherapeutic interventions, in comparison to a wait list or non-treated group, predominantly small to medium effects. With the exception of functional impairment, the effect sizes were all non-significant. For mood and behavioral or ANS reactions, these effects remained stable at follow-up assessment. For serotonergic antidepressants in comparison to a placebo group, predominantly small and medium, significant effects on the different outcomes were revealed. The stability of the effects could not be checked due to missing follow-up assessments. For other psychotropic drugs (e.g., alprazolam, desipramine) only small or very small, non-significant effects were revealed. The direct comparison of the efficacy of both interventions was not possible as the only obtained study examining the efficacy of the combination of psychotherapeutic and psychotropic treatment had to be excluded from the meta-analytical calculations.

The effect sizes for cognitive-behavioral interventions are comparable to those of a meta-analysis by Busse et al. (2008). They also found a small, non-significant effect size for behavioral symptoms and a medium, significant effect for mood variables. The magnitude of the effects was not satisfactory. Apart from functional impairment, the confidence intervals for all outcomes include zero. This may of course be due to the small sample size; the effects therefore need to be determined with a larger sample. Regarding physical symptoms in particular, it is known from other therapies that they are difficult to treat. For example, a meta-analysis on the effectiveness of psychotherapy for hypochondriasis also found only a small between-group effect for physical symptoms, d+ = 0.41 (Thomson & Page, 2007). The results may reflect the fact that psychotherapy does not primarily focus on “healing” these symptoms but rather on improving methods of coping with them.

In relation to the influence of cognitive-behavioral interventions on mood and behavioral changes, it has to be considered that PMS comprises a broad spectrum of different mood and behavioral symptoms. The included studies implemented either a relatively strict program including many different cognitive-behavioral interventions or used one specific intervention (e.g., cognitive restructuring; Morse, 1999). Due to the broad spectrum of symptoms it may be important to tailor the therapy modules more specifically, corresponding to the individual needs of the patients.

Regarding the efficacy of psychotropic drugs, the group of SSRIs and serotonergic antidepressants seems to be the most effective. These results coincide with the findings of other meta-analyses on the efficacy of psychotropic drugs for PMS symptoms. For example, a review of SSRIs for PMS by Brown et al. (2009) reported small, significant effects for behavioral changes (d+ = 0.42), physical symptoms (d+ = 0.34), and functional impairment (d+ = 0.27).

For SSRIs in particular, effect sizes based on within-group comparisons between pre- and post-assessment are larger than the effect sizes based on between-group contrasts. This result implies that it is important to consider placebo effects for examination of the efficacy of psychotropic drugs for PMS. There are already some studies that do show a large placebo-response for women with severe PMS. In a study by Freeman and Rickels (1999) for example, 20% of women treated with a placebo showed sustained improvement and another 42% showed partial improvement. A study by Van Ree, Schagen Van Leeuwen, Koppeschaar, and Te Felde (2005) showed a decrease of PMS symptoms in 91% of placebo-treated women.

The most important limitation of the current meta-analysis was the limited comparability of the efficacy of psychotherapy and serotonergic antidepressants, due to the different control groups used in the psychological and pharmacological studies. Furthermore, the efficacy of a combination of both treatments could not be examined due to the lack of studies. A further critical limitation is the small number of psychotherapeutic studies in comparison to pharmacological studies. This small sample size may explain the non-significance of almost all weighted mean effect sizes of the psychological interventions.

An additional problem was that the psychotherapeutic studies contain certain methodological shortcomings. One of the studies was a non-randomized controlled trial (see Table 1). There were only three studies in which the treatment was administered by a professional in mental health (psychologist or psychiatrist). Only two of the psychotherapeutic studies implemented a structured clinical interview for the exclusion of comorbid disorders. All of these mentioned aspects lower the quality of the included psychotherapeutic studies, which in turn can lead to inaccurate estimations of effect sizes (Cuijpers, Van Straten, Bohlmeijer, Hollon, & Andersson, 2010).

Regarding the pharmacological studies, it was also problematic that studies with a mixture of treatment modalities were included, as the other therapies may contribute to treatment efficacy, side effects, patient dropout, and patient preference issues. Unfortunately the number of studies was too small to conduct subgroup analyses in order to examine the effect of such treatment modalities. Nonetheless, the results show that homogeneity of data is relatively high for most of the outcomes and therefore the differences in treatment modalities do not seem to produce significant heterogeneity.

The inclusion of studies with women who do not fulfill the DSM criteria of PMDD or LLPDD is a further critical point. Unfortunately the number of studies is too small to conduct subgroup analyses comparing the efficacy of psychotherapy or psychopharmacotherapy between patients fulfilling criteria of PMDD and patients with PMS. However, the criteria for PMDD/LLPDD are very strict and epidemiological studies show that women failing to have the requisite number of PMDD symptoms nevertheless experience severe functional impairment (Angst, Sellaro, Merikangas, & Endicott, 2001; Wittchen, Becker, Lieb, & Krause, 2002). Unfortunately the proportions of women with PMS or with PMDD/LLPDD differ considerably between the psychotherapeutic and pharmacological studies. Whereas only 2 of the 6 psychological studies included a sample of patients with LLPDD, 11 of the 16 psychopharmacological studies included women diagnosed with PMDD/LLPDD. These differences in the severity of premenstrual symptoms can also limit the comparability of the treatment effects of psychotherapy and psychotropic drugs.

A further limitation was that only four of the six psychological studies carried out a prospective screening of premenstrual symptoms. In the remaining two studies it is therefore not clear if patients really fulfilled the criteria of PMS. Lastly, the inclusion of studies that do not control for concomitant treatments is one other important limitation.

From the results of the current meta-analysis and the shortcomings of the included studies, several important conclusions for future research in this area can be drawn. Firstly, future studies should be based on the use of high-quality diagnostic procedures including a prospective, premenstrual symptom diary for at least 2 months. This is especially important for drawing conclusions about the impact of the severity of premenstrual symptoms on the efficacy of psychological and pharmacological treatments. Secondly, in relation to psychotherapy we recommend that future research focus on examining specific therapeutic elements or their standardized combination. The group of patients suffering from PMS is very heterogeneous. Due to the broad spectrum of symptoms it might be important to tailor the therapy individually, corresponding to the individual needs of the patients. This kind of individually tailored therapy is difficult to research in randomized controlled trials. Future research should therefore also focus on studies in natural, less standardized settings. Thirdly, for psychotropic treatments in particular, we recommend more follow-up assessments. Finally, future research on the combination of psychological and psychotropic interventions seems to be very important.

For clinical practice, our meta-analysis indicates that for moderate to severe PMS, the effectiveness of both psychotherapy and serotonergic drugs is not satisfactory. SSRIs are of course important and the only widespread treatment available to date that has shown some efficacy in treating moderate to severe PMS. Nonetheless, the recommendation of SSRIs as a unidimensional first-line treatment should be re-considered in the future. Even though only one study was found where the efficacy of a psychotherapeutic and a psychotropic intervention were directly compared, this study demonstrates that psychotherapy is associated with a better long-term maintenance of treatment effects (Hunter, Ussher, Cariss et al., 2002). This can be due to processes such as developing biopsychosocial causal attributions of premenstrual symptoms, or functional coping strategies, which can be activated in patients during psychotherapy and could help with the long-term maintenance of treatment effects.

Although SSRIs seem to have short-term effects, the results of the current meta-analysis demonstrate that long-term efficacy cannot be validated at this time. In the study by Hunter, Ussher, Cariss et al. (2002) the stability of treatment effects of fluoxetine was low in contrast to CBT. This low effect stability could also be affected by the treatment preferences of patients. Treatment preference could be lower for a medication than for a psychotherapeutic intervention due to the high rates of side effects shown in the current meta-analysis. Taking everything discussed so far into consideration, a multidisciplinary and individually tailored treatment concept for PMS, where pharmacological and psychotherapeutic treatments are combined in a reciprocally supporting way, could be a valuable option.