Introduction

Csikszentmihalyi (1975) began his research on flow with the rather simple question of why people are often highly committed to activities without obvious external rewards. Other researchers at that time also tried to understand the reasons for such “intrinsically” motivated behavior (McReynolds 1971; Berlyne 1960; DeCharms 1968; Deci and Ryan 1980; Hebb 1955; White 1959). In interview studies, Csikszentmihalyi found that such activities share a common aspect, which he labeled “flow state” or “flow experience”. According to Csikszentmihalyi (1975) and Rheinberg (2008), flow state can be characterized by the following components: (1) A balance between perception of one’s skills and the perception of difficulty of the activity (task demand). In this state of balance, one feels both optimally challenged and confident that everything is under control. (2) The activity has coherence, contains no contradictory demands, and provides clear, unambiguous feedback. (3) The activity seems to be guided by an inner logic. (4) A high degree of concentration on the activity due to undivided attention to a limited stimulus field. (5) A change in one’s experience of time. (6) The self and the activity are not separated, leading to a merging of the self and the activity and the loss of self-consciousness (Footnote 1).

As can be seen from the components, the flow state has a strong functional aspect, in that individuals experiencing flow are highly concentrated and optimally challenged while being in control of the action. This functional state has positive valence and explains why people are highly committed to tasks lacking external rewards. Csikszentmihalyi and LeFevre (1989) even called the flow experience the “optimal experience”. This holds true to an even greater degree when taking into account later descriptions, which include happiness as part of flow: “Flow is defined as a psychological state in which the person feels simultaneously cognitively efficient, motivated, and happy” (Moneta and Csikszentmihalyi 1996, p. 277).

An early, similar description of the flow state can be found in Woodworth (1918, p. 69f; cf. Rheinberg 2008), who paid special attention to the absorption of adults and children in an activity and regarded this absorption as being of particular motivational interest. Recent support that flow is a psychologically meaningful state is reported in neurological work clearly indicating that brain structures related to self-reflective introspection were inhibited when task demand was high (Goldberg et al. 2006). The authors conclude that: “Thus, the common idiom ‘losing yourself in the act’ receives here a clear neurophysiological underpinning” (p. 330).

After the qualitative description of flow by Csikszentmihalyi, he and others started to study daily experience with the quantitatively based experience sampling method (ESM; Csikszentmihalyi et al. 1977). The ESM captures participants’ immediate conscious experience via self-reports in response to electronic signals at random times throughout each day. This seems an especially suitable approach for measuring flow, because flow is characterized by a loss of self-consciousness and retrospective reports are biased (in retrospect, the affect of the flow experience was remembered more positively; Aellig 2004). In the self-report forms, perceived skills and challenge were measured with single items, and participants were also asked about concentration. In addition to these two components of flow, affect and the wish to do the activity were assessed.

Instead of measuring all components in these studies, flow was defined operationally according to the flow model by Csikszentmihalyi (1975). This model proposes that flow occurs when the actor perceives a balance between the challenge of the activity and his or her own skill (see left-hand side of Fig. 1). Due to theoretically inconsistent results, this model was reformulated by Csikszentmihalyi and Csikszentmihalyi (1988). The revised model proposes that flow is experienced only when challenge and skill are both high. While this model is sometimes referred to as the “four channel model”, we refer to it as the quadrant model (see right-hand side of Fig. 1) (Footnote 2).

Fig. 1 Original flow model (left-hand side; Csikszentmihalyi 1975) and reformulated quadrant model of flow (right-hand side; Csikszentmihalyi and Csikszentmihalyi 1988)
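The operational definition implied by the quadrant model can be sketched in a few lines of code. The sketch below is illustrative only: it assumes that “high” means a rating above the person’s own mean across signals (a common convention, but not one stated in the text above), the function name `classify_quadrant` is hypothetical, and the labels “anxiety” and “apathy” follow the usual four-channel terminology rather than this paper.

```python
from statistics import mean

def classify_quadrant(challenge, skill, challenge_mean, skill_mean):
    """Return the quadrant label for one experience-sampling report.

    Assumption: a rating strictly above the person's own mean counts
    as "high"; ratings equal to the mean count as "low".
    """
    high_challenge = challenge > challenge_mean
    high_skill = skill > skill_mean
    if high_challenge and high_skill:
        return "flow"
    if high_challenge:
        return "anxiety"
    if high_skill:
        return "boredom/relaxation"
    return "apathy"

# Hypothetical (challenge, skill) ratings from one participant
reports = [(7, 8), (8, 3), (2, 7), (3, 2)]
c_mean = mean(c for c, _ in reports)
s_mean = mean(s for _, s in reports)
labels = [classify_quadrant(c, s, c_mean, s_mean) for c, s in reports]
print(labels)
```

Note that this procedure assigns “flow” from two single-item ratings alone, without measuring any of the other components.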

There are several problems with the operational definition of flow according to the flow models. Even if flow is indeed characterized by the perceived balance between challenge and skills, this does not necessarily mean that flow is always experienced when this balance is present. In addition, persons differ in the extent to which challenge and skills are related to each other (Pfister 2002). Ellis et al. (1994) further point out that little has been done to examine the construct validity of the indicators of flow; instead, the ESM data are considered to be ecologically valid. In summary, it would be desirable to measure all components of flow and to further examine the external validity of the flow concept. This problem has been recognized and instruments to measure all components of flow have been developed for the areas of sports (Jackson and Eklund 2002) and computer activity (Remy 2000).

An additional problem might be seen in the fact that instead of being asked about the perceived difficulty of the task, the person has to indicate the perceived challenge. Challenge already compounds perceived difficulty and skill (an easy task, for example, could be highly challenging given a lack of skill). Pfister (2002) also regarded this as a problematic issue and empirically compared the operational definition of challenge-skill with difficulty-skill balance, but found no differences. Whether there was a balance of challenge-skill or of difficulty-skill, the participants reported similar experiences, and one could therefore argue that it makes no (empirical) difference whether one asks about challenge or difficulties. Future research should tackle this problem. For example, task difficulty could be manipulated and skills could be objectively measured and then related to subjective experience of challenge and difficulty (e.g. Keller and Bless 2008).

Studies conducted thus far with flow indicators were able to find support for the flow models. In line with the expectations of the quadrant model, affect, concentration, and the wish to do the activity were high in the flow quadrant (Csikszentmihalyi and LeFevre 1989; Schallberger and Pfister 2001). However, the differences between the flow quadrant and the boredom quadrant were not found in all studies (e.g. Clarke and Haworth 1994; Csikszentmihalyi and Csikszentmihalyi 1988; Ellis et al. 1994). This finding led to the renaming of the “boredom quadrant” as the “relaxation quadrant” (Csikszentmihalyi 1997, p. 152). Ellis et al. (1994), Moneta and Csikszentmihalyi (1996, 1999), and Pfister (2002) also support the claim that the interaction of challenge and skill influences flow indicators, but the empirical effect sizes were small. The results also indicate that situations in which individual skill exceeded task challenge led to positive affect and concentration (this would correspond to boredom/relaxation in the quadrant model).

One possible reason for the unsatisfactory support for the flow model is that it might be only applicable under certain circumstances or for certain kinds of activity. We argue that for activities perceived as unimportant and as having no further important consequences (activities with low importance), the balance between difficulty and skill should lead to flow experiences. If the task is considered to have very important consequences, flow should only be experienced when skill exceeds difficulty. The rationale for this is that in the case of highly important consequences, the threat of potential failure will hinder the experience of flow. However, if skill is higher than difficulty, a person feels more comfortable and this should make flow more likely. This would explain why flow indicators were high in the “flow quadrant” as well as in the “boredom quadrant/relaxation quadrant” (i.e. when individual skill exceeded task challenge).

The second reason for the unsatisfactory support for the flow models has been discussed since the beginning of the research on flow. It has been argued that some people are more likely to experience flow and are more likely to experience it in challenging activities. Csikszentmihalyi (1975, 1990) has described such persons as having an autotelic personality. Empirical evidence reported by Moneta and Csikszentmihalyi (1996) also points to individual differences. They found that the balance of challenge and skill does not go hand in hand with high values in the flow indicators such as high concentration for all individuals (cf. Pfister 2002). When looking at the achievement motivation research, the individual differences could easily be explained by differences in the achievement motive: The assumption that some people experience balance as positive and some as negative forms the core of the risk-taking model of Atkinson (1957; see Brunstein and Heckhausen 2008). According to this model, highly achievement-motivated individuals prefer tasks of medium difficulty (i.e. tasks in which the balance of difficulty and skill is present). In contrast to this hope of success aspect of the achievement motive, individuals with a strong motive of fear of failure even avoid tasks of medium difficulty. The assumption that the achievement motive moderates the effects of the balance seems even more plausible considering that “the flow model may be more applicable to social contexts and activities where achievement plays a dominant role…” (Moneta and Csikszentmihalyi 1996, p. 303). In flow research, first support for the moderating role of the achievement motive was presented by Eisenberger et al. (2005) and Schüler (2007; see also Clarke and Haworth 1994).

In empirical studies testing the risk-taking model, the achievement motive (hope of success) was measured with the projective measure of the Thematic Apperception Test (TAT, McClelland et al. 1953). Fear of failure was measured with the Test Anxiety Questionnaire (TAQ; cf. Brunstein and Heckhausen 2008). According to the contemporary understanding, the TAT measures the “need achievement” or the “implicit achievement motive” and the TAQ the “self-attributed need achievement” or the “explicit achievement motive” (McClelland et al. 1989; Brunstein and Heckhausen 2008). For implicit and explicit motives, hope of success and fear of failure could be differentiated. The research of the risk-taking model therefore captured the implicit motive of hope of success and the explicit motive of fear of failure. Both personality aspects influence whether or not individuals prefer a balance of challenge and skill.

Since the beginning of the flow research, it has been expected that flow is related to performance, and several studies have indeed reported this relationship. On a conceptual basis, flow should be associated with better performance for two reasons. First, flow is a highly functional state which should in itself foster performance. Second, individuals experiencing flow are more motivated to carry out further (learning) activities, and in order to experience flow again, they will set themselves more challenging tasks. Thus, flow could be seen as a motivating force for excellence. Although several studies document the relationship between flow and performance (Nakamura and Csikszentmihalyi 2005), some of them share the aforementioned methodological problems, making this evidence less convincing (Csikszentmihalyi 1988; Mayers 1978, unpublished; Nakamura 1988). Others were correlational studies or did not control for basic or prior performance (Jackson et al. 2001; Puca and Schmalt 1999; Schüler 2007). Therefore, it could be argued that flow is related to higher performance, but does not necessarily cause it. Many activities require higher expertise in order to get into the smooth performance state typical of flow. Thus, it is likely that individuals with higher ability have higher flow values (expertise effect; Rheinberg 2008). This would mean that the correlation between flow and performance arises simply because expertise leads to more flow, instead of flow fostering performance, as was argued above. To resolve this empirically, it would be helpful to control for differences in expertise as well as ability in order to ascertain whether flow will actually lead to better performance.

The present research

To avoid one central problem of quantitative flow research, we measured all components of flow in the studies reported here. To empirically evaluate the flow model, we also measured perceived difficulty and skill. In addition, we assessed the subjective balance between challenge and skill by asking whether the demands of the task are too low, just right or too high. This was carried out in response to the findings indicating that the relationship between challenge/difficulty and skill varies greatly among individuals (Pfister 2002). Furthermore, individuals may be able to report this perceived balance more accurately than the two quite abstract variables of difficulty and skill (Ellis et al. 1994). Moreover, the combination of two variables leads to an unreliable measure due to the combination of measurement errors (McClelland and Judd 1993).

As we argued above, the importance of the activity should influence whether the balance of difficulty and skill will lead to flow. In all studies, we measured the perceived importance of the activity. Thus, our first hypothesis is that for a low perceived importance, balance will lead to flow; otherwise, flow will be experienced when skill exceeds difficulty. We also compare activities with objectively different importance, expecting results analogous to the perceived importance.

As a second potential moderator, we discussed the achievement motive. Seeking to replicate the findings of the risk-taking model with flow as the dependent variable, our second hypothesis is that when balance is present, flow will be more intense for individuals high in the implicit achievement motive (hope of success) and less intense for individuals high in explicit fear of failure. The latter should feel threatened when confronted with balance, which has a negative impact on flow. We had no expectations regarding the implicit motive of fear of failure and the explicit achievement motive of hope of success.

Finally, we studied the relationships between flow and performance. We expect in our third hypothesis that flow will be related to performance even when prior performance and ability are controlled for. The test also seeks to validate the concept of flow and the flow measure employed.

We conducted three studies. In the first study, we tested all three hypotheses. The other two studies were less complex and did not include the achievement motive measure. Here, we focused on testing hypotheses 1 and 3 further. Finally, we conducted a meta-analysis of all studies to test hypothesis 1 by comparing activities with objectively different importance in one analysis.

Study 1: Flow during learning for an obligatory course in statistics

Basic statistics is an obligatory part of studying psychology in Germany. Psychology students must pass a final statistics exam at the end of their first semester in order to continue studying psychology. Therefore, this exam is very important.

Method

Participants

In total, 273 participants took part in the study, which was conducted at the University of Potsdam and the Technical University of Berlin during two consecutive years (first year N = 71 and 73, second year 63 and 66). Seven participants were not measured for the implicit achievement motive and 11 participants dropped out before flow was measured. These participants were excluded from the analysis (for a detailed description of the dropouts, see Engeser 2005). Of the remaining 246 participants, 197 were women and 49 were men. Their ages ranged from 18 to 54 years, with a mean of M = 22.4 (SD = 4.73). A total of 22 participants did not participate in the final exam. Their missing values were estimated with the Expectation Maximization Method in SPSS (Verleye et al. 1998). Participants obtained course credit for participation.

Procedure

The longitudinal study started at the beginning of the winter semester and ended with the final exam at the end of the semester. The study was part of a larger project attempting to explain learning activities and performance in statistics (Engeser 2005). At the first assessment, age, gender, math grades in school, prior knowledge, and implicit and explicit achievement motives were measured. One week before the exam, participants were asked to work on a statistical task they would have typically worked on to prepare for the final exam. They were also instructed to set an alarm clock to ring ten minutes after they had started the task. At this point they should fill out the flow measure. Finally, participants consented that their scores on the exam could be obtained from the teachers of the statistics course.

Measures

The prior knowledge relevant for the statistics course was measured with the Questionnaire of Probability Theory by Nachtigal and Wolf (2001). The questionnaire contains two parallel forms with seven different topics from probability theory. Each topic is measured with two items. Three of the most difficult items were not used because we wanted to avoid the students feeling frustrated.

The implicit achievement motive was assessed by presenting participants with five pictures and having them write an imaginative story about each picture (TAT or Picture Story Exercise, PSE; Pang and Schultheiss 2005). The stimulus pictures were “architect at a desk”, “two women in lab coats in a laboratory”, “trapeze artists”, “two men (‘inventors’) in a workshop”, and “gymnast on balance beam” (Smith 1992). In the study of the first year, the first picture was “boy with vague operation scene in background” (McClelland et al. 1953). The use of different pictures was due to cooperation with other researchers. The instruction was based on Atkinson (1958). Stories were later coded for motivational imagery by two trained scorers using Heckhausen’s (1963) scoring manual for “hope of success” and “fear of failure” (Footnote 3). In line with the terminology of McClelland et al. (1989), the implicit measure of hope of success was labeled “need hope of success” (nHS) and that of fear of failure “need fear of failure” (nFF). The interrater correlation was r > 0.94 for nHS and nFF. On average, participants wrote 453 (SD = 107) words, containing M = 6.46 (SD = 3.04) images related to hope of success and M = 2.95 (SD = 2.24) images related to fear of failure. We adjusted for protocol length by multiplying by 1000 and dividing by word count. The different picture stimuli were corrected for by z-standardizing the motive values within each year. The correlation between nHS and nFF was r = 0.15 (p = 0.02).
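The two corrections described above (adjusting for protocol length and standardizing within each year) can be illustrated with a small sketch. The data are hypothetical, the function names are our own, and the use of the sample standard deviation (n − 1) is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev

def per_1000_words(image_count, word_count):
    """Motive images per 1000 words of story text."""
    return image_count * 1000 / word_count

def zscore_within_groups(values, groups):
    """z-standardize each value using the mean and SD of its own group."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    # Assumption: sample SD (n - 1 in the denominator)
    stats = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(value - stats[g][0]) / stats[g][1]
            for value, g in zip(values, groups)]

# Hypothetical raw data: (hope-of-success images, words written, study year)
protocols = [(6, 450, 1), (9, 520, 1), (4, 380, 1),
             (7, 470, 2), (3, 300, 2), (8, 610, 2)]
densities = [per_1000_words(n, w) for n, w, _ in protocols]
years = [year for _, _, year in protocols]
nhs_z = zscore_within_groups(densities, years)
```

Applied to the sample averages reported above, 6.46 images in 453 words corresponds to roughly 14.3 images per 1000 words.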

The explicit or self-attributed need of achievement was measured with the German version (Dahme et al. 1993) of the Achievement Motives Scale (AMS; Gjesme and Nygard 1970, unpublished). This scale measures “hope for success” (sanHS) and “fear of failure” (sanFF; Heckhausen et al. 1985). Both scales consist of 15 items to be answered on a 4-point scale, ranging from (1) strongly disagree to (4) strongly agree. The AMS is widely used in Scandinavia and Germany and has been established as a reliable and valid instrument (e.g. Dahme et al. 1993; Rand 1987). The consistency of sanHS was α = 0.82 and the consistency of sanFF was α = 0.91. The mean of sanHS was M = 3.06 (SD = 0.36) and the mean of sanFF M = 2.12 (SD = 0.50). The correlation between sanHS and sanFF was r = −0.44 (p < 0.01). The explicit achievement motive (sanHS and sanFF) did not significantly correlate with the implicit motive (nHS and nFF); rs < |0.11|, ps > 0.11.

Flow was measured with the Flow Short Scale (Rheinberg et al. 2003). This scale measures all components of flow experience with ten items and was used to measure flow during all activities (7-point scale; see Appendix). The scale also contains three additional items to measure the perceived importance (“Something important to me is at stake here”, “I won’t make any mistakes here”, and “I am worried about failing”). The experienced difficulty of the task, perceived skill and perceived balance were measured on a 9-point scale (see Appendix). The Flow Short Scale has been validated and successfully used in various applications ranging from experimental and correlational studies (see Rheinberg et al. 2003; Schüler 2007) to the experience-sampling method (Rheinberg et al. 2007). The factor structure previously reported for the Flow Short Scale parallels the one found in this study (rotated principal factor analysis). An investigation of the scree plot and the application of the parallel analysis method (Zwick and Velicer 1986) indicated a two-factor solution (eigenvalues: 5.86, 2.24, 1.00, 0.76, 0.59) with items for flow and perceived importance falling on separate factors. The internal consistencies were α = 0.92 for the flow score and α = 0.76 for importance, and the two were virtually uncorrelated (r = −0.03, p = 0.65). We use the mean values of the two factors throughout this paper. If three factors were extracted, the flow items fell into factors named “fluency of performance” (items 2, 4, 5, 7, 8, 9) and “absorption by activity” (items 1, 3, 6, 10). The internal consistencies were α = 0.93 and α = 0.78, respectively, and the mean values according to these two factors correlate at r = 0.65 (p < 0.01).
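For readers unfamiliar with the internal consistency coefficient reported here, the following sketch computes Cronbach’s α from item-level data using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The ratings are hypothetical and the function name `cronbach_alpha` is our own; it is not the authors’ analysis script.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondents' scores."""
    k = len(item_scores)
    # Sum score per respondent across all items
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical ratings: 3 items x 5 respondents on a 7-point scale
items = [
    [5, 6, 3, 7, 4],
    [4, 6, 2, 7, 5],
    [5, 7, 3, 6, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

The coefficient rises as the items covary more strongly relative to their individual variances, which is why a homogeneous ten-item flow score can reach values above 0.90.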

The mean level of flow was M = 4.60 (SD = 1.16) and the mean for perceived importance was M = 3.45 (SD = 1.44). Compared to scores attained with various activities and across various studies (Rheinberg 2004), the flow score lies slightly below the overall mean (T = 47), and importance is slightly above the mean (T = 55). The mean level for difficulty was M = 5.18 (SD = 1.79); for skill, M = 4.68 (SD = 1.71); and for perceived balance, M = 5.42 (SD = 1.32).
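The T values mentioned above can be read as distances from the norm mean. The sketch below assumes the conventional T metric (mean 50, SD 10 in the reference sample); the norm means and standard deviations themselves are not given in the text, so `norm_mean` and `norm_sd` are placeholders.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T value (mean 50, SD 10 in the norm sample)."""
    return 50 + 10 * (raw - norm_mean) / norm_sd

def z_from_t(t_value):
    """Distance from the norm mean, in SD units, implied by a T value."""
    return (t_value - 50) / 10

print(z_from_t(47))  # flow in Study 1, relative to the norm mean
print(z_from_t(55))  # importance in Study 1, relative to the norm mean
```

Under this assumption, T = 47 places the flow score about 0.3 standard deviations below the norm mean and T = 55 places importance about 0.5 standard deviations above it.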

The content and difficulty of the final exam were similar between universities and consecutive years. The scores of the final exams were z-standardized within each year and university to eliminate scaling differences (for details on how we ensured that the exams were comparable, see Engeser 2005).

Results

We first conducted a regression analysis on flow, with difficulty, skill, and the interaction term of both variables as predictors (difficulty and skill were centered before the interaction term was calculated). There was a marginally significant main effect for difficulty, β = −0.11, t(244) = −1.86, p = 0.07, and a significant main effect for skill, β = 0.59, t(243) = 10.44, p < 0.01. The interaction of difficulty and skill was not significant, β = 0.03, t(242) = 0.56, p = 0.58. This indicates that flow depends on skill, and on difficulty (marginally significant), but not on the interaction between difficulty and skill. Thus, neither the channel model nor the quadrant model was empirically supported, and difficulty even had a negative influence on flow. This also contradicts existing empirical results, which found weak but reliable interaction effects with flow indicators (Moneta and Csikszentmihalyi 1996, 1999; Pfister 2002). On the other hand, the results are in accordance with empirical findings showing positive experiences for the boredom/relaxation quadrant (Csikszentmihalyi and Csikszentmihalyi 1988) and are in line with our reasoning for the first hypothesis.
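The structure of this analysis (center both predictors, form their product, regress flow on all three terms) can be sketched as follows. This is a minimal illustration with made-up ratings, not the authors’ analysis: it uses NumPy’s least-squares solver, returns unstandardized coefficients rather than the standardized β values reported above, and the function name `fit_moderated_regression` is hypothetical.

```python
import numpy as np

def fit_moderated_regression(difficulty, skill, flow):
    """OLS of flow on centered difficulty, centered skill, and their product.

    Returns [intercept, b_difficulty, b_skill, b_interaction]
    as unstandardized coefficients.
    """
    d = np.asarray(difficulty, dtype=float)
    s = np.asarray(skill, dtype=float)
    y = np.asarray(flow, dtype=float)
    d_c = d - d.mean()          # center before forming the product term
    s_c = s - s.mean()
    X = np.column_stack([np.ones_like(d_c), d_c, s_c, d_c * s_c])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

# Hypothetical 9-point ratings; flow is constructed from known coefficients
difficulty = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 3], dtype=float)
skill = np.array([2, 9, 4, 7, 1, 8, 3, 6, 5, 5], dtype=float)
d_c = difficulty - difficulty.mean()
s_c = skill - skill.mean()
flow = 4.5 - 0.1 * d_c + 0.6 * s_c + 0.0 * d_c * s_c  # no interaction built in
coefs = fit_moderated_regression(difficulty, skill, flow)
print(np.round(coefs, 3))
```

Centering leaves the interaction coefficient unchanged but makes the two main-effect coefficients interpretable as effects at the mean of the other predictor. Standardized coefficients could be obtained by z-standardizing the outcome and predictors before fitting.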

Next, we present the descriptive results with the direct measure of balance. Table 1 presents the mean values of flow for each value of the measure of balance (the number of participants is given in brackets). The results indicate that flow was more intense when demand was low or just right. When the demand was too high (i.e. if difficulty exceeds skill), flow was less intense (Footnote 4). In order to go beyond descriptive analysis, a regression analysis was conducted. Balance and squared balance were used as predictors (balance was centered before being squared). We found a reliable main effect for balance, β = −0.45, t(244) = −8.24, p < 0.01, and a reliable quadratic relationship, β = −0.23, t(243) = −4.14, p < 0.01. The significant negative quadratic relationship lends support to the flow model, but the linear relationship is stronger still (the strong linear relationship was expected for the highly instrumental activity of learning statistics).

Table 1 Flow values (number of cases) for the direct measure of balance

We then tested whether the perceived importance of the activity moderates the relationship between balance and flow. Once again, all variables were centered before calculating the interaction terms. There was a main effect for balance, β = −0.48, t(244) = −8.79, p < 0.01, and no reliable main effect of importance, β = −0.01, t(243) = −0.18, p = 0.85. The quadratic balance term was also significant, β = −0.27, t(242) = −4.84, p < 0.01. The interaction of importance and balance was not significant, β = 0.07, t(241) = 1.22, p = 0.23. Most importantly, the interaction of quadratic balance and importance was significant, β = 0.19, t(240) = 3.08, p < 0.01. Values for one standard deviation above the mean, the mean itself and one standard deviation below the mean were used to illustrate this result. As can be seen in Fig. 2a, the quadratic relationship between balance and flow can only be found for low perceived importance. This result is fully in line with our expectation according to the first hypothesis that the perceived importance moderates the relationship between balance and flow; the lower the perceived importance, the stronger the quadratic relationship between balance and flow.
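The way such an interaction is illustrated at one standard deviation below the mean, at the mean, and one standard deviation above the mean follows from regrouping the regression equation by the moderator. With B as centered balance and M as centered importance, flow = b0 + b1·B + b2·B² + b3·M + b4·B·M + b5·B²·M, so at a fixed M the curvature of the balance-flow relation is b2 + b5·M. The sketch below evaluates this; the coefficients and the importance SD are hypothetical unstandardized values chosen for illustration, not the standardized β values reported above.

```python
def conditional_curve(b, moderator_value):
    """Intercept, linear and quadratic coefficient of balance at one moderator value.

    b holds the coefficients of
    flow = b0 + b1*B + b2*B**2 + b3*M + b4*B*M + b5*B**2*M
    with B = centered balance and M = centered importance.
    """
    b0, b1, b2, b3, b4, b5 = b
    m = moderator_value
    return (b0 + b3 * m, b1 + b4 * m, b2 + b5 * m)

def predicted_flow(b, balance_c, moderator_value):
    """Predicted flow at a centered balance value and a moderator value."""
    intercept, linear, quadratic = conditional_curve(b, moderator_value)
    return intercept + linear * balance_c + quadratic * balance_c ** 2

# Hypothetical unstandardized coefficients and importance SD (not from the paper)
coefficients = (4.8, -0.40, -0.15, 0.0, 0.05, 0.10)
importance_sd = 1.4

for label, m in [("-1 SD", -importance_sd), ("mean", 0.0), ("+1 SD", importance_sd)]:
    _, _, curvature = conditional_curve(coefficients, m)
    print(label, round(curvature, 2))
```

With a negative quadratic coefficient and a positive quadratic-by-importance coefficient, as in these made-up values, the inverted-U shape is most pronounced at low importance and flattens as importance increases, which is the qualitative pattern described for Fig. 2a.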

Fig. 2 Interaction of perceived importance and balance on flow (Studies 1, 2, and 3)

For our second hypothesis, we tested whether hope of success for the implicit achievement motive (nHS) and fear of failure for the explicit achievement motive (sanFF) moderate the relationship between perceived balance and flow. Separate regression analyses for the nHS and sanFF achievement motives revealed that both aspects of the achievement motive are moderators. The analysis showed a main effect for nHS, β = 0.21, t(244) = 3.46, p < 0.01, and for balance, β = −0.49, t(243) = −8.87, p < 0.01. The quadratic balance term was also significant, β = −0.17, t(242) = −3.08, p < 0.01. The interaction of nHS and balance was only marginally significant, β = 0.10, t(241) = 1.78, p = 0.08. The interaction of quadratic balance and nHS was significant, β = −0.16, t(240) = −2.50, p = 0.01. As can be seen in Fig. 3, the quadratic relationship for balance only held for people with higher values of nHS, supporting our expectation (also, the generally strong linear relationship beyond the moderation of the achievement motive is still present).

Fig. 3 Interaction of hope of success of the implicit achievement motive (nHS) and fear of failure of the explicit achievement motive (sanFF) with balance on flow

The analogous regression analysis with sanFF yielded a significant main effect for sanFF, β = −0.13, t(244) = −2.07, p = 0.04, and main effects for balance and quadratic balance, β = −0.49, t(243) = −8.36, p < 0.01 and β = −0.25, t(242) = −4.31, p < 0.01. The interaction of sanFF and balance was not significant, β = 0.03, t(241) = 0.53, p = 0.60. The interaction of quadratic balance and sanFF was significant, β = 0.16, t(240) = 2.37, p = 0.02. In Fig. 3 it can be seen that the quadratic relationship for balance only held for people with lower values of fear of failure, as we expected. Both moderation effects of nHS and sanFF have been derived from the risk-taking model. The effects paralleling the risk-taking model also hold when the resultant achievement motive is considered (sanFF subtracted from nHS, as is customary in the research tradition of the risk-taking model). Furthermore, although we had formed no hypotheses, we conducted the same analyses with fear of failure of the implicit motive (nFF) and hope of success of the explicit motive (sanHS). Results revealed no moderation of the quadratic relationship of perceived balance (ps > 0.41).

Finally, we tested our assumption that flow is related to academic performance when basic abilities and prior knowledge are controlled for. In order to control for basic or prior skill, math grades and prior knowledge were included in a hierarchical regression analysis. Age had a substantial influence on the performance on the final exam, so we also included it as a predictor in the regression analysis. Table 2 shows the results of the regression analysis. Age and math grades significantly influenced performance on the final exam. Prior knowledge only showed a marginally significant influence. Flow explained an additional 4% of the variance of the final exam results. Thus, flow can be seen as a predictor of performance rather than just being part of high performance. In total, 28% of the variance is explained by all predictors.
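The logic of the hierarchical regression (enter the control variables first, then add flow and inspect the gain in explained variance) can be sketched as below. The data are simulated, all effect sizes are arbitrary, and the helper `r_squared` is our own; the sketch only shows how an increment in R² such as the 4% reported above is computed, not how the authors’ values were obtained.

```python
import numpy as np

def r_squared(predictors, outcome):
    """R^2 of an OLS regression of outcome on predictors (intercept added)."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coefs
    total = outcome - outcome.mean()
    return 1 - (residuals @ residuals) / (total @ total)

# Simulated data (all values and effect sizes hypothetical)
rng = np.random.default_rng(0)
n = 200
age = rng.normal(22, 4, n)
math_grade = rng.normal(0, 1, n)
prior_knowledge = 0.4 * math_grade + rng.normal(0, 1, n)
flow = 0.3 * math_grade + rng.normal(0, 1, n)
exam = (-0.05 * age + 0.4 * math_grade + 0.1 * prior_knowledge
        + 0.25 * flow + rng.normal(0, 1, n))

# Step 1: control variables only; Step 2: controls plus flow
controls = np.column_stack([age, math_grade, prior_knowledge])
r2_step1 = r_squared(controls, exam)
r2_step2 = r_squared(np.column_stack([controls, flow]), exam)
delta_r2 = r2_step2 - r2_step1   # variance explained by flow beyond the controls
print(round(r2_step1, 3), round(r2_step2, 3), round(delta_r2, 3))
```

Because flow is entered last, the increment reflects only the variance in exam performance that flow shares with the outcome over and above age, math grades, and prior knowledge.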

Table 2 Predicting final exam performance with hierarchical regression including flow (study 1, statistics course)

Discussion

To avoid a central problem of quantitative flow research, flow was measured in its components in this study. With this measure, it was revealed that flow depends on difficulty and skill and not, as predicted by both flow models, on the interaction between these two variables. On the other hand, analyses with the additional direct measure of the balance between difficulty and skill validated one aspect of the flow model, namely that flow decreases when task demand is too high. The finding that flow is still high when the task demand is too low is in accordance with our expectations: for highly important activities, individuals experience flow even if skill exceeds difficulty. Analyses looking at the perceived importance point in the same direction. The importance moderates the influence of balance on flow in the hypothesized way. Also as expected, when demand is “just right” (i.e. in tasks of medium challenge), flow is higher for individuals high in the implicit achievement motive “hope of success”. The reverse pattern holds true for the explicit achievement motive of “fear of failure”. This pattern of results of the components of the implicit and explicit achievement motive is exactly what was expected from the risk-taking model. Furthermore, flow was related to performance on the final exam.

Taking these results into consideration, it can be argued that the reliance of much of the research on flow merely on difficulty and skill level is not completely justified. Flow should be measured, and not inferred when difficulty/challenge matches skill (on high levels). This is even more important when bearing in mind that the achievement motive moderates how balance affects the experience of flow, at least when learning statistics. Taking into account also the results of other studies (Eisenberger et al. 2005; Schüler 2007), we can conclude that the flow model is more applicable for some individuals and less so for others.

To find further support for our first hypothesis, the next study was conducted with an activity of very low importance, in contrast to the first study. In this case, we expect flow to be low when the activity is either not demanding enough or too demanding. We again tested the hypothesis that flow relates to performance when prior performance is controlled for.

Study 2: Flow during a computer game

We chose the computer game Pac-Man due to its friendly nature and because the difficulty levels are easy to manipulate. Participants were told that we wanted to evaluate feelings and thoughts while playing computer games and that performance in the game itself was of no consequence.

Method

Participants

Sixty participants took part in this study. The mean age was M = 22.6 (SD = 4.22) with a range from 14 to 49; 48 of the participants were women. The participants were either paid or received course credit.

Procedures

After receiving instructions, the participants played three preliminary rounds lasting two minutes each in order to get used to the game and to provide a baseline measure of playing ability. They then played four rounds of five minutes each and were asked to fill out the Flow Short Scale after each round. The first and third rounds were set at a medium difficulty level, providing a challenging situation for most of the participants. The second round was very difficult and the fourth round was very easy. Only the results regarding our hypotheses of the two rounds played at medium difficulty are reported here (for ease of presentation, these two rounds are labeled first and second time measure; only the mean values of flow for the very difficult and very easy rounds are given). After the final round, participants were thanked for their participation and debriefed.

Measures

Flow was again measured with the Flow Short Scale. In this study, only subjectively perceived balance was measured. The internal consistency of the Flow Short Scale for the two measures was α = 0.87 and α = 0.87, and for the perceived importance α = 0.63 and α = 0.85. Flow and importance were only weakly and not significantly correlated (r = −0.12, p = 0.37 and r = 0.06, p = 0.65). The mean level of flow was M = 4.68 (SD = 1.18) for Time 1 and M = 5.21 (SD = 1.03) for Time 2 (for the very difficult and very easy rounds, the means were M = 3.08, SD = 0.69 and M = 3.83, SD = 0.92). For perceived importance, the mean level was M = 1.65 (SD = 0.86) and M = 1.43 (SD = 0.83). The values for importance are considerably lower than in the first study, supporting our reasoning that the importance of the computer game is lower than that of the statistics exam in the first study. Compared to values attained from various activities (Rheinberg 2004), the flow values here are around the overall mean (T values were 48 and 52) and importance values are well below the mean (T values were 44 and 42). The mean levels for perceived balance were M = 5.27 (SD = 1.76) and M = 5.03 (SD = 1.48).

Pac-Man, created in 1980, was one of the first computer games. The player has to maneuver Pac-Man, a yellow circle with a mouth, through a maze while eating small dots and being hunted by ghosts. Eating power pellets gives Pac-Man the temporary ability to eat the ghosts himself and gain additional points. The mean for the baseline was M = 168 (SD = 42.5). The points for the final rounds were M = 378 (SD = 169) and M = 423 (SD = 173).

Results

Table 1 presents the mean values of flow for the direct measure of balance. The results indicate that flow is more intense when demand is just right and less intense otherwise. Thus, for computer games without serious consequences (i.e. low importance), the flow model seems to fit the data.

To go beyond descriptive analysis, balance and squared balance were used as predictors in a regression analysis. For the Time 1 measure, we found a reliable main effect for perceived balance, β = −0.30, t(58) = −2.90, p < 0.01, and an even stronger quadratic relationship, β = −0.54, t(57) = −5.28, p < 0.01. For the Time 2 measure of flow, the linear relationship between balance and flow was not significant, β = 0.14, t(58) = 1.37, p = 0.17, but a strong quadratic relationship was found, β = −0.68, t(57) = −6.63, p < 0.01. This is in support of our first hypothesis that for activities with low importance, a quadratic relationship will be found according to the flow model.
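The logic of this quadratic analysis can be sketched in a few lines. The sketch below is an illustration on synthetic data and not the analysis script used in this study: the helper name `standardized_betas`, the 1 to 9 response scale for balance, the simulated effect sizes, and the simultaneous entry of both terms (the degrees of freedom above suggest stepwise entry) are all assumptions made for the example.

```python
import numpy as np

def standardized_betas(y, *predictors):
    """Return standardized OLS coefficients (betas) for the given predictors."""
    X = np.column_stack(predictors).astype(float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    design = np.column_stack([np.ones(len(yz)), Xz])
    coef, *_ = np.linalg.lstsq(design, yz, rcond=None)
    return coef[1:]  # drop the intercept

# Synthetic data (assumption): balance on a 1-9 scale, 5 = "just right",
# and flow following an inverted U around the midpoint plus noise.
rng = np.random.default_rng(0)
balance = rng.integers(1, 10, size=60).astype(float)
flow = 5.0 - 0.15 * (balance - 5.0) ** 2 + rng.normal(0.0, 0.5, size=60)

centered = balance - balance.mean()  # center before squaring
beta_linear, beta_quadratic = standardized_betas(flow, centered, centered ** 2)
print(f"linear beta = {beta_linear:.2f}, quadratic beta = {beta_quadratic:.2f}")
```

A negative coefficient for the squared term corresponds to the inverted U predicted by the flow model: flow peaks near the point of balance and drops toward both extremes.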

Next, we tested whether the perceived importance of the activity moderates the relationship between balance and flow. All variables were centered before calculating the interaction terms. For the first measure, there was a main effect of balance and of importance, β = −0.24, t(244) = 2.45, p = 0.02 and β = −0.39, t(243) = −3.45, p < 0.01. The quadratic balance term was also significant, β = −0.43, t(242) = −4.20, p < 0.01. The interaction of importance and balance was not significant, β = 0.08, t(241) = 0.78, p = 0.44. Most importantly, the interaction of quadratic balance and importance was significant, β = 0.37, t(240) = −2.94, p < 0.01. Values of importance one standard deviation above the mean, at the mean, and one standard deviation below the mean were used to illustrate this result. As can be seen in Fig. 2b, the quadratic relationship between balance and flow is stronger the lower the perceived importance.
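A moderation analysis of this kind, including the illustration at one standard deviation below the mean, at the mean, and one standard deviation above the mean of importance, could look roughly as follows. This is a hedged sketch on synthetic data: the function names `fit_moderated_quadratic` and `curvature_at`, the assumed 1 to 7 range for importance, and the simulated coefficients are hypothetical and do not reproduce the values reported above.

```python
import numpy as np

def fit_moderated_quadratic(flow, balance, importance):
    """OLS of flow on centered balance, its square, importance, and both products."""
    b = balance - balance.mean()
    m = importance - importance.mean()
    X = np.column_stack([np.ones_like(b), b, b ** 2, m, b * m, b ** 2 * m])
    coef, *_ = np.linalg.lstsq(X, flow, rcond=None)
    return coef  # order: intercept, b, b^2, m, b*m, b^2*m

def curvature_at(coef, importance_sd, k):
    """Quadratic balance coefficient when importance lies k SDs from its mean."""
    return coef[2] + coef[5] * k * importance_sd

# Synthetic data (assumption): the inverted U flattens as importance rises.
rng = np.random.default_rng(1)
n = 200
balance = rng.integers(1, 10, size=n).astype(float)
importance = rng.uniform(1.0, 7.0, size=n)
true_curvature = -0.15 + 0.04 * (importance - 4.0)
flow = 5.0 + true_curvature * (balance - 5.0) ** 2 + rng.normal(0.0, 0.5, size=n)

coef = fit_moderated_quadratic(flow, balance, importance)
sd = importance.std(ddof=1)
for k in (-1, 0, 1):
    print(f"importance at {k:+d} SD: curvature = {curvature_at(coef, sd, k):.3f}")
```

A positive coefficient for the product of squared balance and importance means that the curvature becomes less negative as importance increases, which is the pattern described for the first measure.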

For the second measure, the main effects of balance and of importance were not significant, β = 0.18, t(244) = 1.66, p = 0.10 and β = −0.02, t(243) = −0.15, p = 0.89. The quadratic balance term was significant, β = −0.69, t(242) = −6.09, p < 0.01. Neither the interaction of importance and balance, β = 0.14, t(241) = 1.12, p = 0.24, nor the interaction of quadratic balance and importance, β = 0.01, t(240) = 0.07, p = 0.94, was significant. Thus, for the second measure, importance does not reliably moderate the strong quadratic relationship.

Finally, we tested our assumption that flow relates to performance. Performance baseline measures in Pac-Man served to control for baseline performance, and flow Time 1 and Time 2 were summed to form a single predictor. There was a main effect for baseline, β = 0.52, t(58) = 3.85, p < 0.01. This baseline measure explains 51% of the variance of the performance. Flow explained an additional 3%, but this effect is only marginally significant, β = 0.27, t(57) = 1.98, p = 0.052.
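The two-step logic of this test (baseline first, flow second, with the gain in explained variance attributed to flow) can be illustrated as follows. The sketch uses synthetic data and a hypothetical helper `r_squared`; only the baseline mean and standard deviation are taken from the text, and all other numbers are assumptions chosen for illustration.

```python
import numpy as np

def r_squared(y, *predictors):
    """Proportion of variance in y explained by an OLS model with an intercept."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return 1.0 - residuals.var() / y.var()

# Synthetic data (assumption): performance depends on baseline and summed flow.
rng = np.random.default_rng(2)
n = 60
baseline = rng.normal(168.0, 42.5, size=n)   # baseline M and SD as reported
flow_sum = rng.normal(9.9, 1.9, size=n)      # hypothetical summed flow scores
performance = 2.0 * baseline + 25.0 * flow_sum + rng.normal(0.0, 120.0, size=n)

r2_step1 = r_squared(performance, baseline)            # step 1: baseline only
r2_step2 = r_squared(performance, baseline, flow_sum)  # step 2: add flow
delta_r2 = r2_step2 - r2_step1                         # increment due to flow
print(f"R2 step 1 = {r2_step1:.2f}, delta R2 = {delta_r2:.2f}")
```

Because the second model contains the first, the increment cannot be negative; whether it is reliably larger than zero is what the significance test of the flow coefficient addresses.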

Discussion

As expected for an activity with low importance, a quadratic relationship of balance and flow was found: Flow was high when balance was present and low when the demand was too high or too low. The individual measure of perceived importance also moderated the relationship as expected for the first measurement point. Only when the perceived importance was low could the quadratic relationship be found. For the second measure, no reliable moderation of the perceived importance was found. The expectation that flow relates to performance beyond ability could not be supported, as its influence beyond the baseline measure was only marginally significant.

For the second measure, the perceived importance was low, and indeed lower than for the first measure. This might explain the fact that importance did not act as a moderator here. The absence of a linear trend and a stronger quadratic relationship for the second measure also lends credence to this explanation: when there is no (or little) perceived importance, only the quadratic relationships are found and the flow model is warranted for these situations. Perceived importance has to be at a minimum level in order for its effect to be apparent (at least statistically).

Regarding the relationship between flow and performance, we argued that flow leads to better performance for two reasons: (1) a better functional state is achieved during flow and (2) there is a higher motivation to perform the activity again. Only the first reason applies to this study, because the experimental situation was standardized and thus did not allow for additional practice. In learning statistics, this second reason could have played a major role. This might also be the case in our third study, in which we examined the activity of learning French. Therefore, we expect that flow will be a predictor of performance again. Regarding our first hypothesis, we expect the relationship between balance and flow to again be moderated by the importance of the activity and the perceived importance.

Study 3: Flow during learning in a voluntary French course

French courses are offered by the university to regular students who want to improve their language skills. Although these courses are not a regular part of the studies, students receive a certificate which could be useful in applying for scholarships and jobs. The importance of learning French could therefore be considered to be greater than that of playing Pac-Man, but less than that of learning for the (obligatory) statistics exam.

Method

Participants

Sixty-one participants took part in the study. The study was conducted at the language center of the University of Potsdam. The mean age of the participants was M = 22.6 (SD = 2.04) with a range from 19 to 28; 35 of the participants were women. Thirteen participants (seven of them women) did not take the final exam. Due to the high dropout rate, these values were not replaced and these participants were excluded from the analysis concerning performance. Every participant took part without being paid or receiving course credit.

Procedures

The longitudinal study started at the beginning of the winter term and ended with the final exam at the end of the semester. Before the course started, the language center conducted a placement or ability test to allocate the participants to the appropriate course level. The course was taught every week for two hours. Flow was measured after 60 min of class time at two points: one during the first half of the semester, and one during the second half. At the first point, age and gender were also measured. At the end of the semester, every student received a mark for his or her performance.

Measures

In the ability test the participants could earn a maximum of 100 points. The scores ranged from 31 to 76, with a mean level of M = 54.4 (SD = 12.0). Students earning less than 55 points were allocated to the level 1 course, while all others were placed in the level 2 course. For the analysis conducted below, baseline ability was z-standardized within each ability level.
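Standardizing within each course level removes mean differences between the two levels, so that a score expresses ability relative to the participant's own course. A minimal sketch of this step is given below; the function name `zscore_within`, the example scores, and the use of the sample standard deviation are assumptions, while the cutoff of 55 points follows the description above.

```python
from statistics import mean, stdev

def zscore_within(scores, levels):
    """z-standardize each score relative to the mean and SD of its own level."""
    groups = {}
    for score, level in zip(scores, levels):
        groups.setdefault(level, []).append(score)
    stats = {level: (mean(vals), stdev(vals)) for level, vals in groups.items()}
    return [(s - stats[lv][0]) / stats[lv][1] for s, lv in zip(scores, levels)]

# Hypothetical placement scores; fewer than 55 points means level 1.
scores = [31, 40, 48, 54, 55, 60, 68, 76]
levels = [1 if s < 55 else 2 for s in scores]
z = zscore_within(scores, levels)
print([round(value, 2) for value in z])
```

After this transformation, each level has a mean of zero and a standard deviation of one, so the ability measure can be used as a single predictor across both courses.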

Flow was again measured with the Flow Short Scale. As in study 2, only the subjectively perceived balance was measured. In this study, the internal consistency of flow was α = 0.87 for both times and α = 0.87 and α = 0.88 for perceived importance. Flow and importance were only weakly and not significantly correlated (r = −0.20, p = 0.13 and r = −0.11, p = 0.39). The mean level of flow was M = 4.12 (SD = 1.10) for Time 1, and M = 4.04 (SD = 1.07) for Time 2. For importance, the mean level was M = 2.45 (SD = 1.46) and M = 2.43 (SD = 1.33). Compared to values attained in various activities (Rheinberg 2004), the flow values are below the overall mean (T values are 43 and 44), while values for importance are slightly below the mean (Ts = 48) and in-between those for statistics and Pac-Man. The mean level for perceived balance was M = 5.34 (SD = 1.41) and M = 5.26 (SD = 1.44).

The final marks are based on oral participation (one third) and on the results of the final exam (two thirds). The marks ranged from 1.5 to 4.3, with a mean level of M = 2.73 (SD = 0.70; here, lower marks indicate better performance). For the analysis conducted below, the marks were reversed and z-standardized within each ability level.

Results

On a descriptive level, Table 1 shows that for Times 1 and 2, flow was more intense when demand was just right, but still relatively high when demand was too low (i.e. when skill exceeds difficulty). If demand was perceived as being too high (i.e. if difficulty exceeds skill), flow was less intense. To go beyond descriptive analysis, regression analyses were conducted. For Time 1, a reliable main effect of perceived balance, β = −0.29, t(59) = −2.48, p = 0.02, and a reliable quadratic relationship, β = −0.40, t(58) = −3.39, p < 0.01, were found. For Time 2, we found no reliable main effect of perceived balance, β = −0.18, t(59) = −1.33, p = 0.19, and no reliable quadratic relationship, β = −0.15, t(58) = −1.07, p = 0.29. Thus, the moderate linear and quadratic relationship is in line with our first hypothesis only for the first measure. For Time 2, no reliable effect of perceived balance could be found.

Next, we tested whether the perceived importance of the activity moderates the relationship of balance and flow. Again, all variables were centered before calculating the interaction terms. For the first measure, there was a main effect of balance, β = −0.45, t(244) = 2.88, p < 0.01, and no reliable main effect of importance, β = −0.11, t(243) = −0.74, p = 0.47. The quadratic balance term was also significant, β = −0.45, t(242) = −3.10, p < 0.01. The interaction of importance and balance was not significant, β = −0.10, t(241) = 0.64, p = 0.52. Most importantly, the interaction of quadratic balance and importance was significant, β = 0.39, t(240) = 2.20, p = 0.032. Values for one standard deviation above the mean, the mean itself and one standard deviation below the mean were used to illustrate this result. As can be seen in Fig. 2c, the quadratic relationship between balance and flow is only found for low perceived importance. This result is fully in line with our expectation in the first hypothesis that the importance moderates the relationship between balance and flow. For the second measure, no reliable effects could be found (ps > 0.20). Thus, we were able to support our hypothesis with the first but not the second measure of flow.

Finally, we tested whether final marks were dependent on flow when controlling for language ability as measured before the course. We therefore conducted a regression analysis with the ability test as one predictor and flow Times 1 and 2 summed to form a single predictor. There was a main effect of basic ability, β = 0.48, t(46) = 3.87, p < 0.01. This measure explains 26% of the variance of the final marks. Flow explained an additional 7% and this effect was significant, β = 0.28, t(45) = 2.24, p = 0.03.

Discussion

As expected for an activity with medium importance, balance showed both a linear and a substantial quadratic relationship with flow. The pattern of this relationship could be seen as lying in between learning statistics and playing Pac-Man, which were of especially high and low importance, respectively. However, this only holds true for the first measure of flow; for the second measure, balance had no reliable effect on flow. Our expectation regarding perceived importance was likewise confirmed only for the first measure. The assumption that flow relates to performance beyond basic ability was again supported for this learning activity, as was the case for learning statistics.

The fact that no reliable effects were found for the second measure might be explained by the generally low flow values. When an activity has low overall flow values, flow might not even be experienced when there is balance. Here, flow might be hindered by other aspects or by special circumstances (e.g. instruction method or tensions between students). However, this is only a tentative explanation. We were not able to validate this reasoning with data as we did not measure such aspects. Future research should therefore be more sensitive to variables that possibly further restrict the flow model.

Thus far, the comparison of the three studies has been made on a solely descriptive basis. To more substantially support the claim that the activity moderates the relationship between balance and flow, we compared all three studies in one analysis.

Meta-analysis: A direct comparison of the three studies

To realize the direct comparison between all three studies in one analysis, two effect-coded variables for study were used as predictors along with their interactions with balance and squared balance. If the interaction between balance and an effect-coded variable reaches significance, the linear relationship of balance with flow differs between the studies. If the interaction with squared balance is significant, the quadratic relationship between balance and flow differs between the studies. Before computing the interactions, balance was z-standardized within each study, and only the first measures of the second and third study were included (we excluded the second measures to ensure independence). The linear and quadratic relationships of balance were significant, β = −0.34, t(365) = −6.26, p < 0.01 and β = −0.41, t(364) = −6.96, p < 0.01. The first effect-coded variable representing the statistics course as compared to the entire sample was not significant, β = 0.01, t(363) = 0.11, p = 0.91. The interaction with balance was marginally significant, β = −0.10, t(362) = −1.77, p = 0.078, and the interaction with squared balance was significant, β = 0.20, t(361) = 2.93, p < 0.01. The implication is that for the statistics course (in comparison to the whole sample), the linear relation between balance and flow was marginally stronger and the quadratic relationship was significantly weaker. The second effect-coded variable, representing Pac-Man as compared to the entire sample, was significant, β = 0.17, t(360) = 3.01, p < 0.01. This means that flow was higher for playing Pac-Man. The interaction with balance was not significant, β = 0.02, t(359) = 0.52, p = 0.61, but the interaction with squared balance was significant, β = −0.14, t(358) = −2.34, p = 0.02. This indicates that for Pac-Man, the linear relationship did not differ, but the quadratic relationship was stronger. Thus, the differences in the relationship between balance and flow across the three activities can be considered reliable. In this respect, our first hypothesis, in which we reasoned that the importance of the activity moderates the effect of perceived balance on flow, is supported beyond descriptive analysis.
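For readers less familiar with effect coding, the following sketch shows how two effect-coded variables represent three studies and why each coefficient compares one study with the (unweighted) mean of all studies. The flow values are invented, the names `EFFECT_CODES` and `effect_code` are hypothetical, and the choice of the French course as the group coded −1 on both variables is an assumption inferred from the order of the coded variables described above.

```python
import numpy as np

# Effect codes (code_1, code_2); the study coded -1 on both is an assumption.
EFFECT_CODES = {
    "statistics": (1, 0),
    "pacman": (0, 1),
    "french": (-1, -1),
}

def effect_code(studies):
    """Map study labels to a two-column matrix of effect codes."""
    return np.array([EFFECT_CODES[s] for s in studies], dtype=float)

# Invented flow values, three per study, purely for illustration.
studies = ["statistics"] * 3 + ["pacman"] * 3 + ["french"] * 3
flow = np.array([4.0, 4.2, 4.4, 4.8, 5.0, 5.2, 3.9, 4.1, 4.3])

codes = effect_code(studies)
X = np.column_stack([np.ones(len(flow)), codes])
intercept, b_statistics, b_pacman = np.linalg.lstsq(X, flow, rcond=None)[0]
print(f"mean of study means = {intercept:.3f}")
print(f"statistics vs. overall = {b_statistics:+.3f}")
print(f"Pac-Man vs. overall = {b_pacman:+.3f}")
```

In the full model, each code column would additionally be multiplied by standardized balance and by squared balance; the coefficients of these products then indicate whether the linear or quadratic relationship in a given study departs from the average relationship across studies.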

General discussion

In all three studies, we measured flow in all its components and empirically examined how the balance of difficulty and skill influences flow. We hypothesized that the influence of balance on flow would be moderated by the perceived importance of an activity and the achievement motive. Both hypotheses were empirically supported, as was the hypothesized influence of flow on performance.

In the highly important activity of learning statistics, flow was still high when the demand was low. For the less important activity of playing the computer game Pac-Man, flow was highest when balance was present and low when the demand was too low or too high. Learning French was located in between statistics and Pac-Man. There was a moderate linear and quadratic relationship between balance and flow (in statistics, the linear relationship was predominant, and in Pac-Man the quadratic relationship was predominant). This was precisely the result that we had expected.

The activities compared here also differ in various further characteristics other than importance. Therefore, possible alternative explanations could account for the moderating role. Nevertheless, we see importance as the crucial aspect because the moderating role of perceived importance showed analogous results. However, the results should be replicated in experimental settings in which everything but importance is kept equal. This would give our reasoning an even more solid empirical base.

It should also be pointed out that perceived importance was measured with items that include worries about mistakes and failure. Therefore, the importance of an activity itself might only be a moderator when worries are aroused due to the perceived importance. With our importance measure, we therefore captured the possible threat posed by important activities. Experimental studies could best address this problem by separately varying both aspects in order to shed more light on this important issue.

Other studies (e.g. Ellis et al. 1994; Moneta and Csikszentmihalyi 1996, 1999; Pfister 2002) found a weak but reliable interaction between challenge and skill, but we did not find this in our first study. Besides the fact that these studies did not measure flow in its components, the high importance of learning statistics could explain the different results in our study. According to our hypotheses, the interaction of difficulty and skill would be expected for Pac-Man, but here we measured only the perceived balance (and not difficulty and skill separately). Due to the strong quadratic relationship of balance for Pac-Man, one could assume that difficulty and skill interact. Based on this assumption, the mixture of various degrees of importance in ESM studies would result in a weak interaction effect of challenge/difficulty and skill. To shed more light on this, future research should address the importance of the activity in ESM studies.

The fact that most other studies measured perceived challenge, while we measured perceived difficulty in our first study, might also explain why we did not find a reliable effect of the interaction of difficulty and skill on flow (it seems conceptually clearer to use difficulty as it seems to be less confounded with skill). But taking into account that Pfister (2002) found no empirical evidence that asking about challenge and/or difficulty affects flow differently, this alternative explanation is rather unlikely. Nevertheless, a clarification with respect to challenge and difficulty in future research seems necessary, particularly as flow research still relies heavily on the balance issue.

The finding that the achievement motive moderates the relationship between balance and flow was part of the first study. Analogous to the risk-taking model of Atkinson (1957) and Brunstein and Heckhausen (2008), both aspects of the achievement motive moderate the effect of balance on flow. Individuals high in the implicit achievement motive of hope of success experience more flow when the demand is perceived as just right (i.e. during a task of medium challenge). Individuals high in explicit fear of failure experience less flow in this regard. The fact that other personal variables also moderate the relationship between balance and flow was shown by Keller and Bless (2008) for the action versus state orientation.

Flow while preparing for a statistics exam or learning French is associated with performance at the end of the semester, even when controlling for ability. For the computer game Pac-Man, the relationship was less strong and only marginally significant. This can easily be explained: flow should foster performance because it is a highly functional state (e.g. high concentration); in addition, flow can be expected to foster performance due to its rewarding nature. Thus, if more flow is experienced, further engagement in an activity should be more frequent, which should foster performance. For Pac-Man, the long-term effect of more frequent engagement could not be accounted for, and this might be the reason why the relationship is weaker here. Future research should consider the functional and rewarding aspects when studying flow and performance. It might even be possible to sequentially separate flow and performance in order to study the causal relationship in greater depth.

Examining flow research in the light of our results, the following conclusions can be drawn: (1) The flow state, as conceptualized from qualitative interviews by Csikszentmihalyi (1975) and measured by the Flow Short Scale, predicts performance. (2) The strong reliance on the skill-challenge balance needs to be questioned. The effect of balance depends at least on the (perceived) importance of the activity and the individual achievement motive. The aspect of “autotelic personality” has long been discussed as a moderator (Csikszentmihalyi 1975) and the achievement motive might be one part of this personality type. The fact that variables other than the importance of an activity (and not the person him/herself) determine flow has recently been demonstrated for goals (Rheinberg et al. 2007; see also Abuhamdeh et al. 2005). (3) Future research should probably not (operationally) define flow by only one component (the skill-challenge balance) but instead measure flow in its multidimensionality. Ideally, flow would be measured “online” via unobtrusive physiologically based indicators or via reliable and observable aspects of behavior or expression. Such measures are not yet available and should form the subject of future investigations.

Flow research has begun to provide an understanding of the reasons for intrinsic motivation. Experiencing flow is one reason for engaging in activities even without any (obvious) external rewards. The present research also applies the flow concept to activities that are not considered to be solely intrinsically motivated, which has been the case from the very beginnings of flow research. By studying flow in daily experience (see experience sampling method in the introduction), it was expected that flow could potentially be experienced in any activity (e.g. depending on the challenge and skill ratio). Csikszentmihalyi and LeFevre (1989) even found more flow in activities at work (see also Rheinberg et al. 2007). When studying motivation for different (daily) activities, it is also clear that motivation can rarely be understood as completely intrinsic or completely extrinsic.

When we study flow, we are also studying the absence of flow (i.e. low levels of flow). For example, we found lower mean levels of flow in the highly important activity of learning statistics compared to playing a computer game. On average, individuals would therefore be less inclined to learn statistics. Or to put it another way, they are less intrinsically motivated in this respect. This finding is also in accordance with the contemporary conception of intrinsic motivation: High instrumentality tasks or ego-threatening conditions will hinder intrinsic motivation (e.g. Deci and Ryan 2000; Elliot and Harackiewicz 1996). On the other hand, external demand or ego-threatening conditions may even foster flow if the personal skill is high compared to the task difficulty. This parallels the finding that fear can lead to higher performance for easy tasks (e.g. Mueller 1992); likewise, in goal-setting theory, the strongest effects of external standards on performance were found for easy tasks (i.e. when skill exceeds difficulty; Locke and Latham 2002).