Introduction

The movement towards a ‘24/7’ economy has brought about a growing demand for shift work (Presser 2003). Shift work, also known as non-standard work schedules, refers to a range of working time arrangements. To date, there appears to be no single, agreed-upon definition of the specific employment conditions that constitute ‘shift work’. As a consequence, survey items used to record respondents’ work schedules vary across research studies. Generally, the term ‘shift work’ implies that a considerable proportion of work hours fall outside the typical 8/9 am to 4/5 pm, Monday to Friday schedule (Li et al. 2014; Presser 2003). Examples of types of shift work include regular morning, afternoon, evening or night shifts, rotating shifts, irregular shifts and split shifts. Often people who work during weekends are considered to be shift workers as well (Australian Bureau of Statistics, ABS 2012). Importantly, shift work is not the same as ‘overtime’ or ‘extended hours’ (Fenwick and Tausig 2004; Presser 2003). Shift workers work at night or on weekends because it is part of their prescribed work schedule, while overtime is often undertaken by choice (e.g., to catch up on job tasks).

Shift work is now very common in most countries. In Australia, data from the Working Time Arrangements Survey shows that in 2012, 16% of employees regularly undertook shift work in their main job (ABS 2012). The proportion is similar to UK data reporting an average of 15% across 1999–2009 (Office for National Statistics 2011). In 2010, data from the US National Health Interview Survey showed that 29% of employees did not work a regular day shift (Alterman et al. 2013). The high prevalence of shift work has led to concerns about its potential impacts on large-scale population health.

To date, a considerable body of research has investigated the effects of shift work on physical health, and the link between the two is now well established (Kecklund and Axelsson 2016). In contrast, shift work and mental health have been studied less comprehensively (Vogel et al. 2012). Mental health problems have been acknowledged as a major public health issue. The 2007 National Survey of Mental Health and Wellbeing showed that in Australia, the prevalence of any mental disorder in a 12-month period was 20%, with anxiety and affective disorders being the most common (14.4%; 6.2%) (Slade et al. 2009). Similarly, results from the 2002 National Comorbidity Survey Replication (NCS-R) in the US showed that in a 12-month period an affective/mood disorder was experienced by 9.5% of the population and an anxiety disorder was experienced by 18.1% (Kessler et al. 2005). The significant prevalence of mental health problems and associated social and economic burden (Whiteford et al. 2013) makes it vital to explore and identify the potential drivers, including non-standard work schedules.

A number of studies provide evidence of an association between shift work and poor mental health. Whilst many of these studies are small, qualitative studies focused on specific occupations, there is a growing number of population-based studies that investigate the association across employment types. For example, Niedhammer et al. (2015) (n = 26,883 men and 20,079 women) found that having an unpredictable work schedule was significantly related to higher levels of anxiety and depression. A longitudinal study conducted by Bara and Arber (2009) using data from the British Household Panel Survey (n = 4549 men and 5216 women) found that shift work had a negative long-term impact on mental health, which varied according to the duration of exposure, type of shift work and gender. These brief examples demonstrate that individual study results may vary depending on the measure of shift work used, and that a systematic review would assist with summarising this complex research literature. In addition, while several reviews have been conducted on this topic, they have either been about general health more broadly [with only limited information about mental health (e.g., Costa 1996; Vogel et al. 2012)], or have focused on specific occupations, specific shift work types and/or specific mental health symptoms (Angerer et al. 2017; Tahghighi et al. 2017).

The aim of this systematic review was to provide a comprehensive summary (i.e., reference document) of the population-based research that has examined the association between shift work and mental health. While shift work is often measured as a broad binary variable, i.e., ‘Do you undertake shift work?’ with an either ‘yes’ or ‘no’ response, given there are different types of shift work schedules (i.e., night work, weekend work, rotating rosters, irregular/unpredictable work, etc.), the review sought to draw attention to the research findings in each of these schedule types. The review also aimed to provide an indication of study quality for each shift work schedule investigated, and where comparable data were available, summarise the findings available using a meta-analysis. The review focused on minimising sample and occupation-specific bias, including only large-scale studies (n > 100) that were not occupation or organization specific.

Method

Search strategy

The review was guided by the PRISMA reporting process (Moher et al. 2009). In December 2016 (with an updated search in December 2017), four databases (PubMed, PsycINFO, Web of Science and SCOPUS) were searched to source relevant articles reporting on the statistical association between shift work and mental health. Generally, the search used the “title and abstract” field and was limited to articles in English. Terms were entered to represent shift work and mental health: (post-traumatic stress OR PTSD OR mental health OR mental ill* OR mental dis* OR psychological health OR psychological dis* OR psychiatric OR distress OR mood dis* OR depress* OR anxiety OR panic OR phobi* OR obsessive–compulsive OR OCD OR GAD) AND (shift work OR night work OR evening work OR night shift OR evening shift OR weekend work OR work schedule OR work schedules OR (nonstandard OR non-standard AND (schedule OR schedules OR work OR employment OR job)) OR (irregular AND (schedule OR schedules OR work OR employment OR job)) OR (non-traditional AND (schedule OR schedules OR work OR employment OR job))). These terms were chosen after reviewing the language used in several key articles on this topic (Li et al. 2014; Llena-Nozal 2009). Initially, there were 3547 results. 1408 of the results were identified as duplicates or correspondences, and so were excluded. Therefore, 2139 unique articles were identified.

Study selection

The study selection process is described in the flow chart in Fig. 1. First, titles and abstracts of articles were screened for eligibility. Articles were excluded if they were clearly irrelevant or ineligible based on the following criteria: not human research, not about shift work or mental health, not empirical studies, small samples (n < 100), or did not have comparison groups. 1869 articles were deleted at this stage, leaving 270 potentially relevant articles. Second, the full texts of these 270 articles were obtained. Articles were again excluded if no full-text version was available, or they were found to be irrelevant or ineligible (as described above), had no validated measurements for mental health, or did not report the statistical association between shift work and mental health. Occupational-specific or organizational-specific studies, and studies based on convenience samples, were also excluded at this stage. The final number of studies eligible for inclusion based on this first search was 31.

Fig. 1 Study selection and exclusion process

An updated search, which replicated the first, was conducted in December 2017 to source any relevant articles published since the first search. After deleting duplicates, 284 unique articles were identified as potentially relevant. After reviewing these articles, one additional study was included. One other relevant article was also found as part of an unrelated search. Therefore, the final number of studies included in the review was 33 (22 cross-sectional analyses, 10 longitudinal analyses and 1 study that had both cross-sectional and longitudinal analyses).
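The article counts reported for the two searches can be tallied in a few lines. The following is a minimal sketch intended only as an arithmetic check; the variable names are illustrative and do not come from the review, and only the numbers are taken from the text.

```python
# Tally of the article counts reported in the text (first search plus update).
# Variable names are illustrative; only the numbers come from the review.

initial_results = 3547        # records returned by the first search
duplicates = 1408             # duplicates or correspondences removed
unique_articles = initial_results - duplicates

excluded_on_abstract = 1869   # removed at title/abstract screening
full_text_reviewed = unique_articles - excluded_on_abstract

included_first_search = 31    # eligible after full-text review
included_updated_search = 1   # added by the December 2017 update
included_other_source = 1     # found as part of an unrelated search

total_included = (included_first_search
                  + included_updated_search
                  + included_other_source)

# Breakdown of the final set by study design, as reported.
by_design = {"cross-sectional": 22, "longitudinal": 10, "both": 1}

print(unique_articles, full_text_reviewed, total_included)
```

The design breakdown sums to the same total as the three inclusion routes, so the reported figures are internally consistent.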

Data extraction

A standardised coding sheet was used to extract relevant data from the articles. The data to be extracted included the data source, author names, publication year, sample characteristics, study design, shift work measures, mental health measures, the variables/covariates adjusted for, and statistical estimates of the association between shift work and mental health. Data extraction was undertaken independently by two of the authors (YZ and CP). Any discrepancies were resolved through discussion to obtain a consensus.

Study quality was rated based on the Newcastle–Ottawa quality assessment scale (Wells et al. 2009). Each study was awarded one star (or in some instances a half star) for each criterion it fulfilled. To be specific: (1) representativeness of sample: one star for a representative sample of the general population; half star if the sample was reported to be random or somewhat representative of a specific population; no star if the representativeness was not reported. (2) Outcome measure: one star if it was based on a clinical interview or diagnosis from a medical professional or psychologist; no star for all other assessment types (all measures were validated self-assessment tools at minimum). (3) Covariate adjustment: one star if the study adjusted for both demographic and work-related covariates; half star if it adjusted for demographic covariates only; no star when no covariates were included. (4) Study design: all longitudinal studies were given one extra star. (5) Outcome at baseline: one extra star was given to those longitudinal studies that considered participants’ mental health problems at baseline, or examined the changes in mental health during the study period. Overall, cross-sectional studies could be awarded a total of 0–3 stars and longitudinal studies could be awarded a total of 1–5 stars. The quality of the shift work measure adopted was not rated, because all studies relied on participants’ self-report of their work schedules. In addition, attrition rates were not considered as all longitudinal studies reported high follow-up rates.
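The five criteria above can be expressed as a simple scoring function. The following is a minimal sketch of the rubric as described in the text, not the coding sheet actually used; the `Study` class, its field names and the category labels are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # All field names and category labels are hypothetical.
    representativeness: str  # "general", "specific" or "unreported"
    clinical_outcome: bool   # clinical interview or professional diagnosis
    covariates: str          # "demographic+work", "demographic" or "none"
    longitudinal: bool       # longitudinal design
    baseline_outcome: bool   # baseline mental health considered


def quality_stars(study: Study) -> float:
    """Return the star rating implied by the five criteria in the text."""
    stars = 0.0
    # (1) Representativeness of sample.
    stars += {"general": 1.0, "specific": 0.5, "unreported": 0.0}[
        study.representativeness
    ]
    # (2) Outcome measure.
    stars += 1.0 if study.clinical_outcome else 0.0
    # (3) Covariate adjustment.
    stars += {"demographic+work": 1.0, "demographic": 0.5, "none": 0.0}[
        study.covariates
    ]
    # (4) Study design and (5) outcome at baseline apply to longitudinal
    # studies only.
    if study.longitudinal:
        stars += 1.0
        if study.baseline_outcome:
            stars += 1.0
    return stars


# Two made-up studies, used only to exercise the function.
example_cross_sectional = Study("general", False, "demographic+work", False, False)
example_longitudinal = Study("specific", False, "demographic", True, True)
print(quality_stars(example_cross_sectional), quality_stars(example_longitudinal))
```

Under this encoding a cross-sectional study scores between 0 and 3 and a longitudinal study between 1 and 5, matching the ranges stated above.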

Study findings were grouped into tables based on the type/measure of shift work adopted to provide an overview of the literature for each shift work type, as well as the associations between each type of shift work and mental health.

Meta-analysis

A meta-analysis was considered for each type/category of shift work, to pool the findings from the longitudinal studies included. However, studies often did not include essential or comparable information (such as sample size, p value, and effect size) and therefore a meta-analysis was not possible. There was, however, sufficient information to confidently meta-analyse the findings from four of the five longitudinal studies with a broad binary measure of shift work. In these studies, the necessary information was extracted and the results for each study were converted into log odds ratios and then finally odds ratios. One longitudinal study (Llena-Nozal 2009) was omitted as it reported multiple regression coefficients, which we were not able to transform to comparable effect sizes (Aloe and Becker 2012). Meta-analyses were conducted in ‘R’ and a random effects model was used to account for variation between studies. ‘RevMan 5’ was also used to create the final forest plot displaying the results.
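To illustrate the general approach, the sketch below pools odds ratios on the log scale under a random effects model. It is not the analysis code used in the review (which was run in R): it is written in Python, it assumes the DerSimonian–Laird estimator of between-study variance (the text does not name the estimator used), it assumes each study reports a Wald-type 95% confidence interval, and the input values are hypothetical.

```python
import math


def pool_random_effects(odds_ratios, ci_lowers, ci_uppers, z_crit=1.96):
    """Pool odds ratios with a DerSimonian-Laird random effects model."""
    # Work on the log scale; recover each standard error from the CI width.
    y = [math.log(o) for o in odds_ratios]
    se = [(math.log(u) - math.log(l)) / (2 * z_crit)
          for l, u in zip(ci_lowers, ci_uppers)]
    v = [s ** 2 for s in se]
    k = len(y)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance (tau^2), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    pooled_se = math.sqrt(1 / sum(w_star))

    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z_crit * pooled_se),
               math.exp(pooled + z_crit * pooled_se)),
        "tau2": tau2,
    }


# Hypothetical inputs for four studies; these are NOT the review's data.
result = pool_random_effects(
    odds_ratios=[1.10, 1.45, 1.25, 1.60],
    ci_lowers=[0.85, 1.05, 0.90, 1.10],
    ci_uppers=[1.42, 2.00, 1.74, 2.33],
)
print(result)
```

When the studies agree closely, the estimated between-study variance is zero and the result coincides with the fixed-effect estimate; greater heterogeneity widens the pooled confidence interval.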

Results

Study characteristics

Study populations

Table 1 presents information about each study’s population, sample sizes and compositions, as well as measurements of shift work and mental health. Llena-Nozal (2009) was the only study using data from multiple countries. This study analysed datasets from four nationally representative samples: The British Household Panel Survey (BHPS); The Household, Income, Labour Dynamics in Australia (HILDA) Survey; The National Population Health Survey (NPHS)—Canada; and The Swiss Household Panel (SHP). Thirteen other studies were also based on nationally representative samples, including the Longitudinal Study of Australian Children (LSAC) (Cooklin et al. 2015); the HILDA Survey (Dockery et al. 2009); the Hungarostudy 2002 in Hungary (Kopp et al. 2008); the third Korean Working Conditions Survey (KWCS) (Lee et al. 2015; Park et al. 2016); the Canadian component of an international study—GENACIS (Haines III et al. 2008); the Canadian National Longitudinal Survey of Children and Youth (Strazdins et al. 2006); NPHS in Canada (Marchand et al. 2015; Shields 2002); the Early Childhood Longitudinal Survey (ECLS) in US (Rosenbaum and Morett 2009); the US National Health and Nutrition Examination Survey (NHANES) (Wirth et al. 2017); the National Longitudinal Survey of Youth, 1979 in US (Kleiner and Pavalko 2010) and the BHPS (Bara and Arber, 2009).

Table 1 Descriptive characteristics for each study included in the systematic review

Another seven studies reported using random samples from either a national (Niedhammer et al. 2015; Takahashi et al. 2011; Vallieres et al. 2014) or a regional population (Bildt and Michelsen 2002; Carlson et al. 2011; Joyce et al. 2013; Marchand et al. 2016). In addition, three studies conducted by Marchand et al. (2003a, b, 2005) reported their sample as representative of a regional population and both Perry-Jenkins et al. (2007) and Shepherd-Banigan et al. (2016) reported their samples as representative of new parents/families within specific regions. Specific details about the sample selection were unclear in seven studies (De Raeve et al. 2007; Driesen et al. 2011; Grzywacz et al. 2016; Kawabe et al. 2015; Takada et al. 2009; Takusari et al. 2011; van de Ven et al. 2016).

Sample sizes and compositions

The sample sizes of the 33 studies ranged from 132 to more than 50,000. Twenty-six studies (79%) were identified as having samples larger than 1000 participants [although the specific sample sizes were unclear in Llena-Nozal (2009) and Rosenbaum and Morett (2009)]. The seven studies (21%) with fewer than 1000 participants typically focused on new parents. Five studies adopted single-gender samples: two all male (Cooklin et al. 2015; De Raeve et al. 2007) and three all female (Carlson et al. 2011; Grzywacz et al. 2016; Shepherd-Banigan et al. 2016). The remaining studies included both females and males, although some samples were male dominant (> 90% male) (van de Ven et al. 2016) or female dominant (> 90% female) (Strazdins et al. 2006).

Several studies applied additional selection criteria consistent with the specific aims of the studies. For example, nine studies were conducted only among parents. Participants were also occasionally excluded due to other reasons, such as health issues (Kawabe et al. 2015; Marchand et al. 2015; Takada et al. 2009), and employment characteristics (Lee et al. 2015; Llena-Nozal 2009). The vast majority of studies included both shift workers and non-shift workers, except van de Ven et al. (2016) (where all participants were shift workers, and differences were tested based on different characteristics of shift work).

Shift work measure

The measurement of shift work varied between studies as described clearly in Table 1. Twelve studies included a broad binary indicator of shift work, with work schedules classified into two groups: shift work and non-shift work. However, definitions and explanations of what was ‘shift work’ differed somewhat even between these twelve studies. In three studies, the term ‘shift work’ was not defined beyond a simple ‘shift work’ or ‘non-shift work’ dichotomy (Llena-Nozal 2009; Takusari et al. 2011; Park et al. 2016). Joyce et al. (2013) focused on fly-in fly-out (FIFO) workers, and the concept of ‘shift work’ was defined as distinct from fly-in fly-out work. In Wirth et al. (2017), shift work was defined as a combination of night/evening shifts and rotating shifts. In the remaining seven studies, a regular daytime schedule was identified as a standard schedule (non-shift work), while all other schedules were identified as shift work.

Twelve studies measured night or evening shifts. In most cases, working night/evening shifts was compared with day shifts, but in Kawabe et al. (2015) and Takusari et al. (2011) day-to-night shifts and day/night mixed shifts were also included. Tables 1 and 4 show that seven studies considered weekend work, although two of these studies did not recognize weekend work explicitly as shift work (Shields 2002; Takada et al. 2009).

Work schedule irregularity typically refers to the unpredictability of work schedules (Marchand et al. 2015, 2016) and was measured in thirteen studies. Irregular schedules were defined in a number of ways, including split shifts, being on call, varied patterns of shifts or other irregular schedules (Bara and Arber 2009; Llena-Nozal 2009; Marchand et al. 2003b; Shepherd-Banigan et al. 2016; Shields 2002). In some studies, working rotating or alternate shifts was also described as an ‘irregular schedule’ (Kleiner and Pavalko 2010; Llena-Nozal 2009; Marchand et al. 2003b), although some scholars may disagree, arguing that rotating shifts are predictable (Niedhammer et al. 2015).

Rotating shifts were measured in five studies (Niedhammer et al. 2015; Perry-Jenkins et al. 2007; Shields 2002; Vallieres et al. 2014; Wirth et al. 2017). In these studies, rotating shifts were contrasted with fixed permanent shift schedules. Some ‘other’ characteristics of shift work were also considered. In Kawabe et al. (2015) ‘shift work’ was categorised as one of four work types (i.e., different from fixed night work, day-to-night work and daytime work), with little further information offered. In contrast, van de Ven et al. (2016) focussed on detailed characteristics of work schedules—consecutive shifts, early starts, consecutive working days, direction of rotation, weekends off and recovery days. Three other studies (Perry-Jenkins et al. 2007; Rosenbaum and Morett 2009; Strazdins et al. 2006) measured couples’ joint work schedules.

Mental health measure

Despite the variety of terms entered into the database searches, the measures used to assess mental health outcomes were relatively limited. The majority of the studies focused on general mental health (psychological distress) and/or depression. Validated psychometric scales were used in all but two studies that relied on self-report of diagnoses from doctors (Driesen et al. 2011; Joyce et al. 2013).

In terms of general mental health, four studies (Cooklin et al. 2015; Llena-Nozal 2009; Marchand et al. 2015; Shields 2002) reported using the Kessler-6 which assesses nonspecific psychological distress. Several studies used the Short-Form General Health Survey, either SF-36 or SF-12 (Carlson et al. 2011; Dockery et al. 2009; Kawabe et al. 2015; Kleiner and Pavalko 2010; Llena-Nozal 2009; Vallieres et al. 2014; van de Ven et al. 2016). Five studies adopted 12-item General Health Questionnaire (Bara and Arber 2009; Bildt and Michelsen 2002; De Raeve et al. 2007; Llena-Nozal 2009; Takusari et al. 2011). In addition, three studies using data from the same survey (Marchand et al. 2003a, b, 2005) measured psychiatric symptoms using the Ilfeld Psychiatric Symptoms Index.

In terms of depression, the Centre for Epidemiological Studies Depression Scale (CES-D) was the dominant measure (Grzywacz et al. 2016; Kleiner and Pavalko 2010; Perry-Jenkins et al. 2007; Rosenbaum and Morett, 2009; Shepherd-Banigan et al. 2016; Strazdins et al. 2006; Takada et al. 2009; Takahashi et al. 2011), followed by the Beck Depression Inventory (BDI) and WHO Wellbeing Scale. The former was used in two Canadian studies (Marchand et al. 2016; Vallieres et al. 2014), while the latter was used in two Korean studies which share the same dataset (Lee et al. 2015; Park et al. 2016). In addition, Kopp et al. (2008) reported using both of these scales. In Bildt and Michelsen (2002), sub-clinical depression was defined as a high value on the Nottingham life-quality questionnaire. The other two depression scales adopted were the Composite International Diagnostic Interview (CIDI)-Short form (Haines et al. 2008) and the 9-item module from the Patient Health Questionnaire (PHQ-9) (Wirth et al. 2017). Niedhammer et al. (2015) used the Hospital Anxiety and Depression Scale to measure depression and anxiety symptoms. One further study measured anxiety using the State-Trait Anxiety Inventory (Vallieres et al. 2014).

Study quality

The number of stars each study was awarded for study quality varied greatly—ranging from 0.5 to 3 for cross-sectional research and from 1.5 to 4 for longitudinal research. The median level was 1.5 for cross-sectional studies and 3.5 for longitudinal studies. This shows that compared to cross-sectional studies, a larger proportion of longitudinal studies demonstrated higher methodological quality (regardless of the extra star given for their longitudinal status).

Associations between shift work and mental health

Tables 2, 3, 4, 5, 6 and 7 summarise the results of all the studies, with similar types/schedules of shift work grouped together. Out of the 33 studies identified, 22 found shift work was significantly associated (statistically) with mental health using at least one measure.

Table 2 Broad binary indicators of shift work—associations with mental health
Table 3 Night shift—associations with mental health
Table 4 Weekends/holidays work—associations with mental health
Table 5 Irregular (unpredictable) schedules—associations with mental health
Table 6 Other characteristics of work schedules—associations with mental health
Table 7 Joint work schedules—associations with mental health

Broad binary indicator of shift work

Overall, based on studies adopting a broad binary indicator of shift work, shift workers had more mental health problems than non-shift workers. As indicated in Table 2, four of the seven cross-sectional studies and three of the five longitudinal studies that measured shift work with a broad binary indicator found significant associations between working shifts and poor mental health. Furthermore, of the six studies that rated two stars or above, five showed significant results. The only exception was De Raeve et al. (2007), who examined the effect of transitions between day work and shift work on psychological distress. The results of four of the longitudinal studies were synthesized in a meta-analysis [the exception was Llena-Nozal (2009); see the Meta-analysis section above for further detail]. The results supported the conclusion that shift work is significantly associated with greater mental health problems (OR 1.32, 95% CI [1.01, 1.73]), such that those who reported being in shift work had 32% higher odds of experiencing depression and/or psychological distress than those who reported not being in shift work (see the forest plot in Fig. 2 for further details). In addition, data from the studies showed a symmetrical funnel-shaped distribution (i.e., funnel plot), indicating that publication bias was unlikely.
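As a rough check on the pooled estimate, the standard error of the log odds ratio can be recovered from the reported confidence interval. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log scale; this is an assumption, since the review does not state how the interval was computed, and the variable names are illustrative.

```python
import math

# Pooled estimate and 95% CI as reported in the text.
or_pooled, ci_low, ci_high = 1.32, 1.01, 1.73

# Assumption: Wald-type interval, symmetric on the log scale.
log_or = math.log(or_pooled)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# z statistic and two-sided p value from the standard normal distribution.
z = log_or / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(se, 3), round(z, 2), round(p_two_sided, 3))
```

Under these assumptions the z statistic is roughly 2.0 and the two-sided p value falls just under 0.05, which is consistent with a lower confidence limit only slightly above 1. An odds ratio of 1.32 describes 32% higher odds; it approximates a 32% higher risk only when the outcome is uncommon.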

Fig. 2 The odds ratios of longitudinal studies with a broad binary measure of shift work and pooled meta-analysis results

In terms of potential gender differences, eight of the twelve studies analysed the results for males and females separately. Three of these studies, all longitudinal studies rated 3.5 or 4 stars (Bildt and Michelsen 2002; Driesen et al. 2011; Llena-Nozal 2009), reported gender differences with significant effects found only for females. In the remaining studies, the effects were significant either for both females and males (Dockery et al. 2009; Park et al. 2016), or for neither (Haines et al. 2008; Takusari et al. 2011; Wirth et al. 2017). Overall, the results provide some evidence that females are more vulnerable to shift work, but they are far from conclusive.

Night or evening work as shift work

There was no strong evidence of a consistent, significant association between night/evening work and mental health, although several high-quality studies have explored this association (Table 3). Out of the six cross-sectional studies, two showed significant associations: Kawabe et al. (2015) found day-to-night work (but not fixed night work) was associated with poor mental health, and Wirth et al. (2017) found working at nights or evenings was associated with mild depression. Of the six longitudinal studies, three showed significant results. Shields (2002) found that working evening shifts at baseline was associated with an increase in psychological distress 2 years later; however, after another 2 years, this association no longer remained. Bara and Arber (2009) found that men who had been working night shifts for more than 4 years were more likely to have poor mental health. Perry-Jenkins et al. (2007) indicated that fathers and mothers who worked evening/night shifts during the transition to parenthood had significantly higher levels of depressive symptoms. Notably, although few in number, the three longitudinal studies with significant results were of high quality compared to those studies with no association.

Weekend work as shift work

There was little evidence supporting an association between weekend work and mental health. Only two out of the seven studies included found significant associations with mental health (Table 4). The two studies (Lee et al. 2015; Takada et al. 2009) were both cross-sectional studies with relatively low study quality ratings (two stars and one star, respectively). Overall, the studies reporting data on the association with weekend work received relatively low quality scores, with only one longitudinal study receiving four stars.

Shift work as irregular/unpredictable schedules

Interestingly, the majority of studies that examined the association between irregular/unpredictable schedules and poorer mental health found some evidence of an association. Out of seven cross-sectional studies, six showed significant effects. However, when looking at the findings for the longitudinal studies, there were three (out of six) high-quality studies (with 3–4 stars) that found no evidence of an association (i.e., Kleiner and Pavalko 2010; Shepherd-Banigan et al. 2016; Shields 2002).

In terms of potential gender differences, of the nine studies that showed significant results, four studied females and males separately and only Takusari et al. (2011) found the relationship was significant in both females and males. Niedhammer et al. (2015) found a significant association among males only, while in Bara and Arber (2009), the relationship was only significant among females who had been working varied shifts for more than 4 years. Llena-Nozal (2009) found only men who changed their schedules from regular to “other irregular schedules” (distinct from rotating/split shift and irregular schedule or on call, but not further defined) experienced significantly higher distress. Grzywacz et al. (2016) focused on new mothers and found women working more irregular schedules across the child’s first year of life had higher depressive symptoms. On balance, there are no consistent findings as to the influence of gender.

‘Other’ types and contexts of shift work

Other types of shift work, not previously covered, were rarely found to be related to mental health problems. As shown in Table 6, several studies examined the association between rotating shifts and mental health (Niedhammer et al. 2015; Perry-Jenkins et al. 2007; Shields 2002; Vallieres et al. 2014; Wirth et al. 2017). None of these studies found evidence for a significant association. In addition, van de Ven et al. (2016), which assessed a range of shift schedule characteristics, found no impacts of any of these characteristics on mental health. Kawabe et al. (2015) found that compared to daytime workers, shift workers had significantly worse mental health. However, this study received only one star, and “shift work” was not clearly defined in the study.

Joint work schedules of couples

No conclusions can be drawn as to the effect of couples’ joint work schedules, as only three studies were in this category and their findings all differed. The three studies all focused on parents. In cross-sectional research by Rosenbaum and Morett (2009), both parents had significantly higher depression levels when one or both of them worked a non-day shift. In Strazdins et al. (2006), by contrast, parents had significantly higher depressive symptoms only when they and their partners both worked non-standard schedules. However, the cross-spouse effect (i.e., the impact of one spouse’s shift work on the other spouse’s depression levels) was not found in Perry-Jenkins et al. (2007), the only longitudinal study examining couples’ work schedules.

Discussion

To our knowledge, this is the first systematic review dedicated to summarising the research investigating the association between shift work and mental health that has concurrently drawn attention to different shift work schedule types. All studies recruited from a wide range of occupations and organizations and at least 26 used randomly selected, population-based samples of workers. Therefore, this review provides a useful summary of findings relevant to the general working population. The majority of studies also controlled (or adjusted) for potential covariates (occupational and non-occupational).

Overall, the review demonstrates reasonable evidence in the existing literature that shift work is associated with poorer mental health across occupation types when measured by a broad binary indicator, particularly for women. However, there is a lack of consistency and less evidence when studies are categorised into specific shift work types. The evidence in terms of night work and mental health is mixed. It appears that night work is more likely to be associated with mental health problems after a certain period of time (Shields 2002; Bara and Arber 2009) or during sensitive periods, such as the transition to parenthood (Perry-Jenkins et al. 2007). The findings from the longitudinal studies accord with a previous review (Angerer et al. 2017), which found evidence of an elevated risk of depression after several years of night shift work in occupations outside the health sector (i.e., general population). There was some evidence that working irregular schedules was associated with poor mental health. This finding may indicate it is particularly difficult for individuals who work irregular schedules to adapt and maintain homeostasis in life. The unpredictable nature of work schedules for these individuals may mean that their circadian rhythms, eating habits, as well as family and social lives, are continually disrupted, resulting in loss of control. Lack of perceived control has long been established as relevant to negative emotional states, particularly anxiety (Bara and Arber 2009; Fenwick and Tausig 2004; Gallagher et al. 2014; Olsen and Dahl 2010). However, there was little evidence of a link between mental health and other types of shift work, such as weekend shifts and rotating shifts.

The systematic review showed mixed findings in terms of gender differences. Seventeen studies examined women and men separately: 11 of them reported similarities between genders, while one (Niedhammer et al. 2015) only reported a significant impact among men, and three (Bildt and Michelsen 2002; Driesen et al. 2011; Takada et al. 2009) only among women. In two studies that adopted multiple measures of shift work type (Bara and Arber 2009; Llena-Nozal 2009), there was no clear pattern as to which schedule (i.e., irregular work or night work) was more likely to have an impact on women or men. Of the five single-gender studies, only Grzywacz et al. (2016) found a significant association between mothers’ working irregular schedules and depressive symptoms. While overall, the balance of the extant research suggests that shift work has a greater association with women’s mental health, no conclusions can be drawn without further formal testing. In addition, we are unable to say whether specific types of shift work schedules (e.g., night work) have specific or increased effects on women’s as opposed to men’s mental health.

Study limitations

The results should be interpreted in light of the review’s limitations. Primarily, this systematic review highlights the complexities in comparing and combining research studies about shift work. The inconsistency, and sometimes ambiguity, of the exact nature of shift work in each study limited our ability to compare between studies. To reflect (and in some way address) this, we have described the studies (and study measures adopted) as accurately as possible in Table 1. It should also be noted that while there was no evidence of publication bias in the longitudinal studies included in the meta-analysis, there may be undetected publication bias in cross-sectional research and studies examining other shift work types (e.g., irregular work, night work, weekend work), such that studies with negative findings are less likely to be published. In addition, given the studies focused on active workers, the ‘healthy worker effect’ could be a source of selection bias. The healthy worker effect typically means that relatively healthy individuals are more likely to gain and remain employed in good quality work than those with health problems (Pearce et al. 2007). Thus, the negative impacts of shift work may be underestimated as unhealthy workers leave their jobs (and are no longer captured in research). A final limitation is that given the focus on broad, population-based research, this systematic review provides little detail on the shift workers at most risk (e.g., young or older people, working parents, immigrant workers, and particular occupation categories).

Conclusions

This systematic review provides some evidence that shift work, in particular irregular shift work, is associated with poorer mental health. While there was also some evidence that ongoing night work may impact adversely on mental health, there was insufficient evidence to support a link with weekend work and other shift work types (including regular rotating shifts and joint couple work schedules). Shift work is often variable, making it difficult to measure and classify, conduct meta-analyses and draw general conclusions. Despite this, there is a need for continued investigation adopting clear and consistent measures. Although a considerable body of anecdotal evidence suggests that shift work is a stressful work arrangement, rigorous studies are still needed to provide robust evidence and quantify the impacts at the population level. This evidence is necessary to better understand the difficulties confronting shift workers and their families, and to identify where support is needed to protect workers’ mental health.