1 Introduction

Artificial Intelligence (AI) has been on the rise for several years [1, 2]. Yet, with the recent emergence of generalized transformer models like ChatGPT, a significant shift in the performance of AI systems has occurred [3]. In academia, prestigious journals like Nature and Science have already highlighted the rise of human-AI collaborative projects [3, 4]. Other powerful generative tools for image (e.g. DALL-E), audio (e.g. Speechify), and video generation (e.g. Synthesia), as well as for coding and software application development (e.g. Uizard), have also been developed and affect how people produce innovative media and knowledge. While some research has focused on technical capabilities and limitations [5, 6] and on socio-economic impacts on the workforce [1, 3], little attention has been paid to the user experience of working with these powerful AI tools. However, as the popularity of ChatGPT has demonstrated (it gained over 100 million users in less than two months), it is crucial to examine how users interact with these tools and the experiences they elicit.

Anticipating a future where people interact with AI tools rather than being replaced by them, it is pertinent to explore how it feels to work with them [7]. This study proposes to focus on one possible user experience outcome: the elicitation of positive psychological experiences [8]. Specifically, we aim to investigate how AI tools may promote experiences of mastery and well-being by facilitating flow, a state of optimal experience characterized by a high level of task engagement and satisfaction [9]. The flow experience has been an emerging topic in IS research, and particularly in NeuroIS research [10,11,12,13,14,15,16]. Since flow occurs when skills meet task demands (and when both are present at high levels) [17], and since cognitive load has been identified as a proxy for this skill-demand balance [18], we propose that generative AI tools could regulate cognitive load to flow-conducive levels, for example by providing initial solutions to challenging tasks (lowering task demands) or by shifting users' work from monotonous tasks to more challenging ones (increasing task demands).

In this article, we propose a study design using NeuroIS methodology to investigate these patterns in a knowledge work task scenario. By using wearable EEG recordings, we also aim to gain objective insights into the dynamics of cognitive load and flow, especially during early project phases. As EEG is well established for monitoring correlates of cognitive load [19] with high temporal resolution, this approach also allows for theoretical extensions of load-flow dynamics in real time. In the following sections, we detail the theoretical foundations of flow and cognitive load that we use to develop a corresponding study design. Ultimately, this research could lead to neuro-adaptive recommendation agents that suggest AI invocation when undesirable load levels are detected and that ideally facilitate positive experiences when doing so. Before this happens, however, additional questions need to be answered, which are discussed in the outlook section of this article.

2 Related Work

Flow and Related IS Work. The concept of flow encompasses nine dimensions that include the perception of: (1) a balance between task demand and an individual's skill, (2) clear goals, (3) unambiguous feedback, as well as (4) effortless concentration, (5) merging of action and awareness, (6) loss of self-consciousness, (7) control, (8) distortion of time, and (9) intrinsic reward [9]. Research has shown that flow can be experienced in any task that requires active engagement and fulfills the three pre-conditions (dimensions 1 to 3). Among these, the balance between perceived demand and skill is particularly noteworthy, as it is often manipulated in flow experiments [17]. The concept of flow has been extensively studied in business and IS research, with several studies linking it to improved job performance, creativity, and enjoyment [20,21,22]. Additionally, it has been associated with technology-enabled team building [23] as well as technology adoption and use [11, 16], highlighting its potential impact on desirable outcomes in the context of IS use and work. The relevance of flow experiences in IS use scenarios and knowledge work has also been recognized in NeuroIS research [10]. While much of the previous research has focused on identifying neurophysiological correlates of flow experiences [14, 15], recent studies have highlighted the connection between flow and cognitive load, which serves as the foundation for this research proposal.

Flow and Cognitive Load. In general, cognitive processing is composed of two primary components: a relatively limited working memory and a much larger long-term memory [24]. When engaging in a task, some degree of working memory must be allocated to that task, a process referred to as cognitive load [24, 25]. Previous work has found that cognitive load and flow experiences are linked through an inverted U-shaped relationship [18, 26]. This has been documented both for self-reports and for well-known EEG correlates of cognitive load, such as changes in frontal Theta and in posterior Alpha and Beta frequency band power [19]. The explanation for this relationship is primarily rooted in the flow pre-condition of demand-skill balance [18]. When a task is neither too demanding nor too easy, the efficient and automated task processing that characterizes flow can occur, likely because processes that are detrimental to the primary task are downregulated (e.g. self-referential attention, conflict monitoring, or mind-wandering [27]). Beyond this general integration of the two theories, little has been reported about their cognitive and temporal dynamics. These dynamics are of interest because (1) scholars have reported that flow often emerges sporadically and chaotically [28], and (2) an understanding of the temporal dynamics would provide a highly valuable foundation for the development of flow-facilitating technology (e.g. by understanding when and how to invoke feedback, see [12]). Therefore, in investigating the impact of human-AI collaboration on positive experiences, we propose using NeuroIS methods to study these temporal dynamics to enable theoretical contributions beyond observing their outcomes.
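As an illustration only, an inverted U-shaped relationship between cognitive load and flow can be written as a quadratic model. The symbols below (F for reported flow intensity, L for cognitive load, and the coefficients) are assumptions introduced for this sketch and are not taken from [18, 26]:

```latex
F = \beta_0 + \beta_1 L + \beta_2 L^2 + \varepsilon,
\qquad \beta_1 > 0,\ \beta_2 < 0,
\qquad L^{*} = -\frac{\beta_1}{2\beta_2}
```

Under these assumptions, flow is highest at the intermediate load level L* and declines when load moves away from it in either direction. This matches the intuition that both underload and overload are detrimental to flow.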

3 Study Proposal

Study Concept, Hypotheses and Procedure. Our proposed study investigates the impact of generative AI tools, such as ChatGPT, on positive experiences in a knowledge work setting. To achieve this, we build on previous NeuroIS research that has examined the emergence of flow experiences in knowledge work tasks like scientific writing [15, 29]. One notable finding from this work was that cognitive load peaked during the start of the writing stage and then decreased after a few minutes [29]. This aligns with flow and writing research that suggests the start of a writing session may require more effort to structure the task than later stages [9, 30]. Based on this observation, we propose that the early stages of a writing project are particularly demanding and that this moment presents an opportunity to use an AI tool to reduce task demands. Thus, we derive the following hypotheses:

  • H1: Cognitive load increases during early stages of a complex writing task.

  • H2: Using generative AI reduces cognitive load during early stages of a complex writing task.

  • H3: The reduction of cognitive load in an early writing stage increases the intensity of flow experiences.

Furthermore, because it is known in the flow literature that task skill (here: topic knowledge) and task importance (here: topic relevance) significantly influence when and how flow is experienced [31], we expect these two factors to moderate the effects proposed in H2 and H3:

  • H4a: The more important a topic is to a writer, the greater the AI support's load reduction and flow intensification.

  • H4b: The more knowledge a writer has of a topic, the weaker the AI support's load reduction and flow intensification.

To account for individual differences in EEG data and skill levels, we propose a fully within-subject experimental design that includes (1) a standardized text copying task of varying difficulty, and (2) drafting two extended abstracts on a standardized topic, once with and once without the support of a generative AI tool. Figure 1 illustrates the proposed experimental structure, including the procedure, tasks, and questionnaires.

Fig. 1. Procedure of the proposed experiment. The block diagram shows the following stages: preparation, lead reference stage, writing stage with and without ChatGPT on relevant or irrelevant topics assigned randomly, final survey, sensor removal, payout, and debriefing.

Tasks and Manipulations. The text copying task (see, e.g. [32]) is proposed as a baseline measure of cognitive load that resembles common knowledge work. It consists of a set of pre-selected text segments of varied lengths that are presented to participants for 10 s per segment. By varying the length of each segment that has to be typed, low, medium, and high levels of cognitive load can be elicited, which can later serve as a reference for the self-report and EEG data (see, e.g. [18]).

For the abstract writing stage, participants will work on two standardized topics that are identified in an early stage of the experiment. Specifically, as we expect topic relevance and expertise to moderate the load and flow changes induced by generative AI tools (H4a and H4b), participants are asked to indicate this relevance and expertise for a pre-defined range of four topics at the start of the experiment. These four topics will cover recent developments in the IS discipline. To maximize contrasts, participants will be assigned to work on the two topics with the highest and the lowest rated relevance and expertise, which are randomly assigned to the AI support and no-support conditions. Further, to compare general ability levels for a writing task, we will include the writing practice index (WPI) used in related work on creative writing [33]. The order of the two writing assignments will be randomized to counter any unwanted carryover effects.
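The topic selection and randomization described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `assign_topics`, the condition labels, the topic names, and the rule of summing relevance and expertise ratings into one score are all hypothetical.

```python
import random


def assign_topics(ratings, rng):
    """Pick the highest- and lowest-rated topic and randomize condition and order.

    ratings: dict mapping topic name -> (relevance, expertise) self-ratings.
    rng: a random.Random instance, passed in so that the plan is reproducible.
    """
    # Assumption: relevance and expertise are combined by a simple sum.
    score = {topic: rel + exp for topic, (rel, exp) in ratings.items()}
    high = max(score, key=score.get)
    low = min(score, key=score.get)
    if high == low:
        raise ValueError("ratings do not distinguish between topics")

    # Randomly map the two topics to the AI support / no support conditions.
    conditions = ["ai_support", "no_support"]
    rng.shuffle(conditions)
    plan = [
        {"topic": high, "rating": "highest", "condition": conditions[0]},
        {"topic": low, "rating": "lowest", "condition": conditions[1]},
    ]

    # Randomize the order of the two writing assignments (carryover control).
    rng.shuffle(plan)
    return plan


if __name__ == "__main__":
    # Hypothetical ratings for four placeholder topics.
    example = {
        "topic_a": (6, 5),
        "topic_b": (2, 1),
        "topic_c": (4, 4),
        "topic_d": (3, 5),
    }
    for step in assign_topics(example, random.Random(42)):
        print(step)
```

Passing the random number generator in explicitly keeps the assignment reproducible per participant, which can help when the randomization has to be documented.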

The writing process itself is divided into two phases. First, in the ideation phase, participants will have time to draft a storyline for their abstract without the support of any AI tool. They will be asked to focus on generating a story in bullet points. We consider this phase necessary because it will serve to test the hypotheses that cognitive load increases substantially in the early stages of a complex writing task (H1), and that the subsequent use of a supportive generative AI tool reduces cognitive load and thereby intensifies the flow experience (H2 and H3). In addition, the use of the generative AI tool is more comparable across participants because it is not used for ideation, but rather for implementation of the writing. Second, in the work phase, participants are asked to complete the writing of their text based on their previous ideas. In one of the two rounds, they are asked to use ChatGPT to produce their extended abstract. In both phases, participants must work for a minimum of 3 min and a maximum of 15 min. We offer this variable time span to account for individual differences in the time it takes participants to produce a satisfactory solution. After each ideation and work phase, participants are asked to fill out surveys about their load and flow experience levels, similar to previous work [15, 34, 35].

To complete the experiment, participants will undergo resting baseline measurements for 30 s with eyes open (fixating on a cross) and for 30 s while watching a video of fish swimming in the ocean as a more natural resting stage stimulus [36]. Overall, we expect the experiment to last 60 min, including the time for instruction and set-up of the instrumentation. Participants will be remunerated with a flat fee following recommendations for flow research [37].

Measures. The NASA Task Load Index (six items by [38]) and the short flow scale (ten items by [31]) will be used as main reports at each task interruption or conclusion. In addition, at each interruption, we will include items to assess the task difficulty (one item by [31]) and other facets of the affective experience during the task (a five-item stress measure by [39] and two pictorial items for emotional arousal and valence [40]). After each task (the three main stages), we will also ask about the task importance (three items by [31]) to check whether task motivation might have influenced the experiences. Finally, at the end of the experiment, participants will be asked about their intention to use the AI in similar situations again (three-item TAM construct by [41], for the AI use condition), or whether they would like to use a generative AI for such tasks (for the non-AI use condition, after a description of ChatGPT and its capabilities, again adapting the three TAM construct items).
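Scoring the two main self-report measures at each interruption might look like the sketch below. It rests on assumptions that the text does not state: both scales are aggregated as unweighted item means, the six load items are answered on a 0 to 100 range, and the ten flow items on a 1 to 7 range. The function names are hypothetical.

```python
from statistics import mean


def scale_mean(responses, n_items, low, high):
    """Return the unweighted mean of one questionnaire after basic checks."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(responses)}")
    if any(not low <= r <= high for r in responses):
        raise ValueError(f"responses must lie between {low} and {high}")
    return mean(responses)


def score_interruption(tlx_items, flow_items):
    """Aggregate the six load items and the ten flow items of one survey."""
    return {
        # Assumption: unweighted mean of the six task load items (0-100).
        "load": scale_mean(tlx_items, 6, 0, 100),
        # Assumption: mean of the ten short flow scale items (1-7).
        "flow": scale_mean(flow_items, 10, 1, 7),
    }


if __name__ == "__main__":
    # Hypothetical responses from one participant after one work phase.
    print(score_interruption([40, 55, 30, 60, 45, 50], [5, 6, 4, 5, 6, 5, 4, 6, 5, 5]))
```

Validating item counts and ranges before averaging guards against silently scoring incomplete surveys.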

To continuously monitor cognitive load levels, we will use a wearable EEG system with dry electrodes on the top of the head (importantly, including an electrode at Cz). Specifically, we will use a system that resembles headphones and could be used in everyday life (see, e.g. [18, 42] and Fig. 2). We opt for such a system because we believe it is important that NeuroIS scholars engage with such everyday life systems to advance the development of adaptive systems. As cognitive load effects are large and well observable over frontal and central midline positions [19], we also believe such a system to be an appropriate choice in terms of rigor. We expect to see the classic increase in Theta frequency band power and decrease in Alpha frequency band power as cognitive load increases, especially over the C3, Cz, and C4 positions.
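The expected pattern (Theta power up, Alpha power down under higher load) can be illustrated with a small band power computation. The sketch below is not the analysis pipeline of this study: it uses a plain periodogram from the standard library so that it runs anywhere, whereas a real analysis would more likely rely on Welch's method from a signal processing library after artifact handling. The band limits (Theta 4 to 8 Hz, Alpha 8 to 13 Hz), the Theta/Alpha ratio as a load index, the sampling rate, and the synthetic signals are all assumptions.

```python
import cmath
import math


def band_power(signal, fs, low, high):
    """Mean periodogram power of `signal` for frequencies in [low, high) Hz."""
    n = len(signal)
    offset = sum(signal) / n
    centered = [x - offset for x in signal]  # remove the DC component
    powers = []
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if low <= freq < high:
            # Discrete Fourier coefficient at frequency bin k.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centered)
            )
            powers.append(abs(coeff) ** 2 / n)
    if not powers:
        raise ValueError("no frequency bins fall inside the requested band")
    return sum(powers) / len(powers)


def theta_alpha_ratio(signal, fs):
    """Assumed load index: Theta (4-8 Hz) power divided by Alpha (8-13 Hz) power."""
    return band_power(signal, fs, 4.0, 8.0) / band_power(signal, fs, 8.0, 13.0)


def synthetic_eeg(theta_amp, alpha_amp, fs=128, seconds=2):
    """Toy single-channel signal: a 6 Hz (Theta) plus a 10 Hz (Alpha) sine wave."""
    return [
        theta_amp * math.sin(2 * math.pi * 6 * i / fs)
        + alpha_amp * math.sin(2 * math.pi * 10 * i / fs)
        for i in range(fs * seconds)
    ]


if __name__ == "__main__":
    low_load = synthetic_eeg(theta_amp=1.0, alpha_amp=2.0)
    high_load = synthetic_eeg(theta_amp=2.0, alpha_amp=1.0)
    print("low load ratio:", theta_alpha_ratio(low_load, 128))
    print("high load ratio:", theta_alpha_ratio(high_load, 128))
```

On these toy signals the ratio is larger for the simulated high-load segment, because Theta amplitude was raised and Alpha amplitude lowered by construction. Real EEG would additionally require windowing and per-participant baselining, for example against the text copying reference task.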

Fig. 2. Dry-electrode wearable EEG system for cognitive load monitoring (see [18]). Left: a photo of the dry-electrode wearable EEG system. Right: a person wearing the system while sitting in front of a computer monitor.

4 Discussion and Outlook

In this work, we propose a study that investigates the impact of generative AI tools on cognitive load and flow experience. We believe that understanding the effects of emerging technologies on user experiences is of paramount importance, especially since the integration of AI tools has the potential to enhance productivity and enjoyment in the future of work through human-AI collaboration. Our proposed study design incorporates wearable EEG systems to observe cognitive dynamics with high temporal resolution, which can contribute to the theoretical integration and extension of load and flow theory. We hope that our findings will provide a novel foundation for explaining how cognitive load patterns contribute to the emergence of flow experiences and how they are regulated by AI tools. It is important to note that our approach is based on the assumption that AI tools provide useful solutions that can be easily integrated into primary tasks. If this assumption does not hold, the use of AI tools may have the opposite effect, increasing load and creating more stressful experiences. This possibility highlights the importance of studying human-AI interaction in knowledge work scenarios, which are anticipated to occur at increasing rates now that generalized transformer models have become widely available. As IS scholars, we have an opportunity here to lead the way in studying and supporting the impact of AI on people, organizations, and society as a whole.