1 Introduction

Working memory (WM) refers to the limited-capacity network (4 to 8 items or stimuli per unit time) that holds information in mind for several seconds during cognitive activity (Baddeley and Hitch 1974). Working memory is an essential component of cognition, and music is an equally pervasive part of entertainment; listening to music is one of the most widely adopted secondary activities. Research has shown that background music can improve linguistic information processing (Angel et al. 2010), shield the listener from ambient noise, and increase task attention (Hargreaves and North 1997). However, how music and memory interact at a physiological level, and how that interaction affects task performance, is not yet well understood. Baddeley and Hitch's working memory model (1974) divided the working memory system into two subsystems: (1) the visuospatial sketchpad and (2) the phonological loop. Visual information is processed and retained in the sketchpad, and sound information in the phonological loop. Cocchini et al. (2001) and Salamé and Baddeley (1982) suggested that auditory stimuli can interfere with the phonological loop because acoustic information is processed there. Salamé and Baddeley (1982) further postulated that verbal information is maintained and rehearsed in the phonological loop. Listening to music during a primary task would therefore compete for phonological loop resources, corrupting memory and degrading working memory performance. Thus, the current study hypothesized as follows:

  • H1: Background music will negatively impact working memory performance

However, gaining deeper insight into working memory function and performance under musical interference requires understanding the neural underpinnings through electrophysiological measurement. Gevins et al. (1998) established the reliability of EEG as a tool for analyzing working memory tasks. Therefore, this study used EEG to study working memory.

As Oberauer et al. (2004) suggested, WM capacity can be tested using tasks that require both memory and processing. Hence, this study used the backward digit span (BDS) task, whose prevalence in clinical psychology (Ramsay and Reynolds 1995) made it an appropriate choice. In BDS, subjects must encode the digits, process them to invert their order, and rehearse them in memory for efficient recollection. The task consisted of three stages: encoding, maintenance, and recall of digits. The maintenance period was provided to engage the rehearsal function of working memory; working memory performance relied on effective recall, which in turn depended on continuous rehearsal of the digits. The theta, alpha, and beta frequency bands were studied. The paper is divided into four sections. Section 2 describes the experimental methodology and the acquisition of behavioral parameters and EEG signals, followed by results and inference in Sect. 3. Discussion and future directions are provided in Sect. 4.

2 Methodology

Six participants (26 ± 2.19 years old) with no history of mental illness were recruited for the study. All participants gave their informed consent. The study had two conditions: (1) with-music and (2) no-music. Participants selected the music themselves. They sat comfortably in a normally lit, silent room, and during the with-music condition the music was played through headphones at 60 dB. The Backward Wechsler Digit Span (BWDS) test was used in the study. All subjects participated in both conditions in a counterbalanced order to reduce bias. The digit stimuli were presented on a laptop using Paradigm Stimulus Presentation software (Perception Research Systems 2016). Behavioral parameters, namely accuracy and typing duration, were collected along with the electroencephalogram (EEG). Typing duration was defined as the period from the entry of the first digit to the entry of the last digit. Accuracy was the percentage of correct responses for each digit sequence.
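The two behavioral measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring used by the presentation software: the function names, the all-or-nothing scoring of each trial, the millisecond timestamps, and the example trials are all assumptions made for the sketch.

```python
def typing_duration(key_times):
    """Time from the entry of the first digit to the entry of the last digit."""
    return key_times[-1] - key_times[0]


def is_correct(presented, typed):
    """Assumed scoring rule: a response is correct only if it is the exact reverse."""
    return list(typed) == list(reversed(presented))


def accuracy_percent(trials):
    """Percentage of correct trials for one sequence length."""
    correct = sum(is_correct(presented, typed) for presented, typed in trials)
    return 100.0 * correct / len(trials)


# Hypothetical trials for a three-digit sequence: (presented, typed).
trials = [
    ([4, 9, 2], [2, 9, 4]),
    ([7, 1, 5], [5, 1, 7]),
    ([3, 8, 6], [6, 3, 8]),  # order error, scored as incorrect
    ([2, 5, 9], [9, 5, 2]),
    ([1, 6, 4], [4, 6, 1]),
]
print(accuracy_percent(trials))        # 80.0 (4 of 5 trials correct)
print(typing_duration([0, 410, 935]))  # 935 (hypothetical timestamps in ms)
```

With five trials per sequence length, accuracy under this scoring rule can only take values in steps of 20%.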

EEG was collected using an Emotiv 14-channel headset, and the Paradigm software recorded typing duration and accuracy. The EEG dataset was divided into two groups: with-music and no-music conditions. A bandpass filter of 0.5–30 Hz was applied to the dataset. Independent component analysis (ICA) was used to remove muscle artifacts, eye movements, and blinks. Each dataset was segmented into three epochs (encoding, maintenance, and recall) for each digit sequence length, and theta, alpha, and beta spectral powers were calculated for each epoch in both conditions. Two subjects were removed from the analysis because of excessive signal contamination. All statistical analyses were performed in SPSS version 20.

2.1 Working Memory Task

The task had three stages: encoding, maintenance, and recall. The encoding stage consisted of silent memorization of the displayed digits. The maintenance stage consisted of holding the memorized digits in memory, and during the recall stage subjects typed the memorized digits in reverse order. Sequences ranged from 3 to 7 digits, with five trials per sequence length. In each trial, the digits were displayed one at a time at the center of the computer screen for 750 ms each, followed by a black screen for 25 ms. The maintenance stage was a black screen lasting 4 s, followed by the recall stage. The methodology used in the experiment is shown in Fig. 1.

Fig. 1. Experimental methodology
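The trial timing described in this section can be laid out as a simple event schedule. The sketch below uses the stated durations (750 ms per digit, 25 ms black screen, 4 s maintenance); it assumes that the 25 ms black screen follows every digit, including the last, and that recall is self-paced. The function and constant names are hypothetical.

```python
DIGIT_MS = 750                  # each digit stays on screen
BLANK_MS = 25                   # black screen after each digit (assumed after every digit)
MAINTENANCE_MS = 4000           # black screen before recall
SEQUENCE_LENGTHS = range(3, 8)  # sequences of 3 to 7 digits
TRIALS_PER_LENGTH = 5


def trial_schedule(n_digits):
    """Return (event, onset_ms, duration_ms) tuples for one trial."""
    events = []
    t = 0
    for i in range(n_digits):
        events.append((f"digit_{i + 1}", t, DIGIT_MS))
        t += DIGIT_MS
        events.append(("blank", t, BLANK_MS))
        t += BLANK_MS
    events.append(("maintenance", t, MAINTENANCE_MS))
    t += MAINTENANCE_MS
    events.append(("recall", t, None))  # assumed self-paced, so no fixed duration
    return events


total_trials = len(SEQUENCE_LENGTHS) * TRIALS_PER_LENGTH
print(total_trials)  # 25 trials per condition
for n in SEQUENCE_LENGTHS:
    recall_onset = trial_schedule(n)[-1][1]
    print(n, recall_onset)  # sequence length and recall onset in ms
```

Under these assumptions the encoding stage lasts 775 ms per digit, so the encoding epoch grows with sequence length while the maintenance epoch stays fixed at 4 s.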

3 Results

3.1 EEG Analysis

Spectral values were obtained for three frequency bands, namely theta (3–7.9 Hz), alpha (8–12.9 Hz), and beta (13–30 Hz), at all 14 sensor locations. Spectral values at the prefrontal, frontal, and parietal channels were averaged to obtain global theta, alpha, and beta spectral values. A two-way repeated-measures ANOVA with factors Digits (5) × Stages (3), performed separately for the two conditions, did not yield significant results.
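As a rough illustration of how band-wise spectral values and their channel average might be computed, the sketch below applies a plain DFT periodogram to one epoch and averages the result across channels. The band edges come from the text; the 128 Hz sampling rate, the periodogram estimator, and the function names are assumptions, and the spectral estimator actually used in the analysis is not stated.

```python
import cmath
import math

# Band edges in Hz, as given in the text.
BANDS = {"theta": (3.0, 7.9), "alpha": (8.0, 12.9), "beta": (13.0, 30.0)}


def band_powers(samples, fs):
    """Mean periodogram power per band for one channel epoch (naive DFT)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the DC offset
    bins = {name: [] for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        for name, (lo, hi) in BANDS.items():
            if lo <= freq <= hi:
                coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                            for t, x in enumerate(centered))
                bins[name].append(abs(coeff) ** 2 / (fs * n))
    # A band with no frequency bins (very short epoch) is reported as 0.0.
    return {name: (sum(v) / len(v) if v else 0.0) for name, v in bins.items()}


def global_power(channel_epochs, fs):
    """Average the band powers over a group of channels."""
    per_channel = [band_powers(ch, fs) for ch in channel_epochs]
    return {name: sum(p[name] for p in per_channel) / len(per_channel)
            for name in BANDS}


FS = 128  # assumed sampling rate in Hz; not stated in the text
epoch = [math.sin(2 * math.pi * 10 * t / FS) for t in range(2 * FS)]  # 10 Hz test tone
powers = band_powers(epoch, FS)
print(max(powers, key=powers.get))  # alpha
```

A 10 Hz test tone lands in the alpha band, which gives a quick sanity check on the band edges before the function is applied to real epochs.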

In the with-music condition, digit 3 (the three-digit sequence) showed a positive Pearson correlation between alpha spectral values during the recall and maintenance stages (r = 0.993, p < 0.05) and between theta spectral values during the encoding and maintenance stages (r = 0.995, p < 0.05). Beta spectral values during the encoding and maintenance stages showed a strong positive Pearson correlation for digit 6 (r = 0.999, p < 0.05). Beta spectral values were also strongly positively correlated between the encoding and recall stages for digits 7 (r = 0.991, p < 0.05) and 6 (r = 0.989, p < 0.05), and between the maintenance and recall stages for digits 7 (r = 0.992, p < 0.05) and 6 (r = 0.988, p < 0.05). All p values were Bonferroni corrected. No significant correlations between stages were found for any digit in the no-music condition.
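For readers unfamiliar with the two procedures named above, the following sketch shows a Pearson correlation coefficient and a Bonferroni adjustment of p values. It is illustrative only: the analysis itself was run in SPSS, and the spectral values and uncorrected p values below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical alpha spectral values for four participants in two stages.
maintenance = [3.1, 4.0, 4.6, 5.2]
recall = [3.0, 4.1, 4.5, 5.4]
print(pearson_r(maintenance, recall))

# Hypothetical uncorrected p values from three stage-pair comparisons.
print(bonferroni([0.01, 0.02, 0.5]))
```

The adjustment is equivalent to comparing each uncorrected p value against 0.05 divided by the number of comparisons.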

A two-way repeated-measures ANOVA with factors Condition (2) × Stages (3) revealed a significant main effect of condition in the theta band for digit 6 alone, F(1, 3) = 11.42, p < 0.05, η² = 0.792. Descriptive statistics showed that mean theta spectral values were higher in the with-music condition (mean = 4.095) than in the no-music condition (mean = 3.785). The same analysis for digit 6 revealed a significant main effect of stage in the alpha band, F(2, 6) = 13.94, p < 0.05, η² = 0.823. A post hoc test with Bonferroni correction showed that mean spectral values were significantly higher during encoding than during maintenance (p < 0.05). No interaction effects were significant in either two-way repeated-measures ANOVA. Thus, theta spectral power increased in the with-music condition, where music competes with working memory. This result suggests that music interferes with working memory task performance.
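The effect sizes reported above can be checked against the F statistics and their degrees of freedom. The sketch below assumes the reported values are partial eta squared, which the text does not state explicitly; under that assumption both values are reproduced to three decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F: (F * df1) / (F * df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of condition in the theta band, F(1, 3) = 11.42.
print(round(partial_eta_squared(11.42, 1, 3), 3))  # 0.792

# Main effect of stage in the alpha band, F(2, 6) = 13.94.
print(round(partial_eta_squared(13.94, 2, 6), 3))  # 0.823
```

The error degrees of freedom (3 and 6) are what would be expected for four analyzed participants with two conditions and three stages.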

3.2 Behavioural Analysis

Typing duration and accuracy were the two behavioral parameters considered to determine music's influence on working memory. Mean typing duration and mean accuracy percentage for the two conditions are shown in Fig. 2 and Fig. 3, respectively. For digit 3, typing duration differed significantly between the with-music (mean = 1358.96 ± 297.60) and no-music (mean = 2126.42 ± 491.27) conditions, t(3) = −6.89, p < 0.05.

Fig. 2. Mean typing duration for two conditions. Error bars indicate standard errors.

Fig. 3. Mean accuracy percentage for two conditions. Error bars indicate standard errors.

A paired t-test for accuracy between the with-music (mean = 35 ± 34.15) and no-music (mean = 60 ± 36.51) conditions revealed a statistically significant mean difference for digit 6, t(3) = −5, p < 0.05. This result indicates that the presence of music reduces accuracy as working memory approaches its maximum capacity. Overall, as the digit sequence length increases, music competes for working memory resources, resulting in reduced accuracy and longer typing duration.
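The paired t-test used above works on the per-participant differences between conditions, which is why four analyzed participants give 3 degrees of freedom. The sketch below is a minimal stdlib version; the accuracy values are illustrative and are not the study's raw data. They were chosen so that their means match the reported 35 and 60, and under that choice the statistic comes out at the reported t(3) = −5.

```python
import math

T_CRIT_DF3 = 3.182  # two-tailed critical t at alpha = 0.05 for df = 3


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Illustrative accuracy percentages for four participants (hypothetical values).
with_music = [20, 0, 40, 80]
no_music = [40, 20, 80, 100]

t, df = paired_t(with_music, no_music)
print(t, df)                # -5.0 3
print(abs(t) > T_CRIT_DF3)  # True
```

Because the test depends on how the two conditions pair up within each participant, the same group means can yield very different t values; only the raw paired data determine the result.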

3.3 Survey Analysis

A questionnaire survey was conducted to determine preferences for listening to music while engaging in tasks of low, medium, and high cognitive intensity. Of the 219 responses obtained, three were rejected because of duplication errors, leaving 216 responses for further analysis. Household chores and browsing social media were categorized as low-intensity tasks; reading a new novel and multi-texting as moderate-intensity tasks; and driving, listening to a lecture, and studying as high-intensity tasks. The results are shown in Table 1.

Table 1. Music listening preferences during tasks

On average, 39.96% of respondents preferred listening to music regularly, and 38.16% did not like music, irrespective of task intensity. These results reveal the prevalence of music in day-to-day activities, which makes its effect on task performance an important factor to consider.

4 Discussion

High theta power in the with-music condition indicated a greater allocation of attentional resources as task difficulty increased in the presence of background music. This result agrees with Klimesch et al. (1997), who reported an increase in theta with increasing task difficulty. The increase in alpha spectral power while encoding the digits in the with-music condition suggested that greater effort was needed to encode the digits in memory successfully. Alpha power has been reported to decrease with increasing working memory load during encoding (Krause et al. 2000). However, auditory stimulation has been found to increase alpha synchronization (Krause et al. 1997). Klimesch et al. (1999) also observed an increase in upper alpha power during encoding at central and parietal brain regions. In line with this existing research, the increase in alpha during encoding, although unusual, could be attributed to stimulation of the auditory cortex by music. Accuracy decreased as the length of the digit sequences increased. The decrease in accuracy was greater in the with-music condition (mean = 62 ± 22.89) than in the no-music condition (mean = 72 ± 15.06). Typing duration showed an increasing trend as task difficulty increased, in accordance with the Sternberg effect, which states that recall time increases with the number of items held in working memory (Sternberg 1966). Thus, as tasks begin to use the maximum capacity of working memory, music becomes a hindrance to efficient performance, as the study results revealed. Although this study's sample size is limited, it gave insight into the interaction between music, task difficulty, and working memory capacity. The survey results showed music to be a preferred secondary task; nevertheless, the current study indicated that music negatively affected performance. The results recommend disengaging from secondary tasks as task difficulty increases.

Because participants chose their preferred music, the music may have captured greater salience (Gustavson 2014), demanding more attentional resources (Engle et al. 1999) and disrupting task performance. Future research could manipulate the choice of music to study its impact on performance. In conclusion, music negatively impacted working memory performance, and hence the hypothesis that background music negatively impacts working memory performance was accepted.