Introduction

Video games are a diverse medium, ranging from vast open worlds (e.g., Elden Ring) to casual mobile games (e.g., Candy Crush), from competitive battle royale games (e.g., Fortnite) to turn-based strategy titles (e.g., Civilization), and from meaningful single-player stories (e.g., Detroit: Become Human) to fun racers played with friends (e.g., Mario Kart). In short, video games vary greatly in their graphical representation, game mechanics, and narrative content (Predescu & Mocanu, 2020). Consequently, gamers may have different intrinsic and extrinsic motives to play (Reid, 2012). Furthermore, reasons to play differ substantially by gaming genre, gender, cultural setting, and personality factors (Park et al., 2011; Ratan et al., 2021).

Psychological research on why people play may further disentangle findings on positive and negative video game effects, such as aggression (Anderson et al., 2010), pathological use (Gentile, 2009), stress relief (Pallavicini et al., 2021), cognitive stimulation and well-being (Barr & Copeland-Stewart, 2022), or mental health (Jones et al., 2014). Despite the relevance of gaming motives, a “gold-standard” measure to systematically assess player motivation is still lacking. Therefore, the current study aims to establish a comprehensive and validated scale.

Gaming motivation and its assessment

Classification of gaming motives started with Bartle’s (1996) taxonomy of achievers, explorers, socializers, and killers. Based on this taxonomy, Yee (2006) and Yee et al. (2012) identified three higher-order motives (i.e., achievement, social, and immersion) for online games. Furthermore, Ryan et al. (2006) applied self-determination theory (SDT) to gaming, proposing autonomy, competence, and relatedness as key player motives. Other approaches are based on uses and gratifications research (e.g., Jansz et al., 2010) or social cognitive theory (De Grove et al., 2017).

All of the above approaches resulted in a variety of different measures of gaming motivation (e.g., DGMS, De Grove et al., 2017; MOGQ, Demetrovics et al., 2011; GAMES, Hilgard et al., 2013). Other researchers developed items focused on more specialized motivational constructs, such as stress relief (Reinecke, 2009) or skill development (Reid, 2012). Furthermore, scales have been constructed for specific genres (e.g., MMORPGs, Yee, 2006) or games (e.g., Pokémon Go, Cheng, 2019). Finally, more recent scales (Király et al., 2022; López-Fernández et al., 2020) were developed in and for clinical settings and may therefore be limited to assessing disordered gaming motivation. The approaches to constructing gaming motivation items and scales are similarly diverse. Authors frequently modify existing scales (e.g., Myrseth et al., 2017) or compile existing items within a literature review that are then reviewed by players (e.g., Demetrovics et al., 2011). Novel items are either based on qualitative player responses (e.g., De Grove et al., 2016) or constructed and reviewed by the experimenters themselves or within a larger group of experts (e.g., Hilgard et al., 2013). Unfortunately, there is often little to no information on the origin or construction of items. Within publications, the documentation of item construction and evaluation ranges from a few short sentences (e.g., Hilgard et al., 2013) to detailed reports (e.g., De Grove et al., 2016; Demetrovics et al., 2011). Similarly, the statistical evaluation of current scales varies greatly, ranging from detailed validation analyses, over the mere calculation of reliability coefficients, to no reported validation at all (see also Supplementary Table 1).

In summary, previous research has indicated the importance of studying gaming motivation while failing to develop and agree upon a comprehensive and thoroughly validated measure of general gaming motives. Existing scales are either limited to a specific theoretical framework, to selected gaming motives, game genres, or a single video game (and thus cannot capture gaming motivation in its entirety), or they lack empirical evaluation.

This project follows a bottom-up approach based on an extensive literature review to identify and evaluate already existing items on gaming motivation (Study 1). Subsequently, the dimensionality of the resulting item pool was thoroughly evaluated and the pool shortened using exploratory factor analysis (EFA; Study 2). Furthermore, initial evidence was collected on convergent and discriminant validity. Finally, confirmatory factor analysis (CFA) and exploratory structural equation modeling (ESEM; Study 3) cross-validated the factor structure of the scale. Moreover, we identified distinct motive-based gamer types using latent profile analysis (LPA). Thus, the MOPS is indirectly supported by several theoretical conceptions of gaming motivation without being limited to one theoretical background. Furthermore, the quality of individual items is not determined by subjective reviews, but via well-established statistical parameters.

Study 1: Systematic literature review

The initial item pool was developed by systematically accumulating already existing items measuring self-reported gaming motivation. Gaming motivation is defined in line with classical motivation as “knowledge representing emotional preferences […] and they refer to specific desirable or undesirable aims” (Demetrovics et al., 2011, p. 814). The systematic literature review followed the PRISMA guidelines for systematic reviews (Moher et al., 2015). Based on previous literature reviews in gaming (Boyle et al., 2016; Connolly et al., 2012; Pallavicini et al., 2018) and in collaboration with a team of experienced (gaming) scholars and librarians, the following search term was developed:

  • (“video game*” OR “videogame*” OR “digital game*” OR “electronic game*” OR “computer game*” OR “online game*” OR MMO* OR gaming) AND motiv* AND (scale* OR questionnaire* OR measure* OR item* OR instrument*).

This search string was applied to PsycINFO, PsycARTICLES, and Web of Science on March 27, 2020, searching titles, abstracts, and keywords (for all databases) as well as subject headings and tests and measures records (for PsycINFO/ARTICLES). No other search restrictions regarding specific populations, game genres, or languages were made. To identify initially missed publications, the first and third authors conducted a subsequent full-text manual search via Google Scholar.

After removing duplicates, results were screened for relevance regarding gaming motivation by eight researchers and trained student assistants. To further ensure proper coding, the first author double-coded at least 10% of the other coders’ screenings. Selected publications had to be thematically relevant, including theoretical or empirical research on motives to play video games. Publications were excluded only if their abstracts indicated that (a) solely motives for topics unrelated to gaming were discussed or (b) motivation was assessed as an outcome variable.

All remaining manuscripts were then subject to a full-text screening and a subsequent second coding step. This time, at least 30% of the other coders’ screenings were cross-checked by the first author. The inclusion criterion specifically targeted publications assessing gaming motivation via self-report items or referring to other manuscripts doing so. Publications featuring (a) only theoretical discussions of gaming motives, (b) qualitative assessment of gaming motivation, (c) the assessment of only adjacent motives (e.g., motivation to pay for game content), or (d) the assessment of the degree of motivation (i.e., “How motivated are you to play [again]?”) were discarded.

Results

The search term initially identified a large number (1,636) of publications (see Fig. 1). Removing duplicates (253) left a corpus of 1,383 publications for the first screening. Coding for thematic relevance to gaming motivation led to the exclusion of 1,142 manuscripts. Excluded publications mostly dealt with game-based learning and its effects on students’ intrinsic motivation. The remaining results were further reduced to 159 publications featuring or referring to items measuring gaming motivation (see Fig. 1).

Fig. 1 Flowchart of the systematic literature review, item extraction, and selection

Item extraction and initial item pool

After systematically identifying papers relevant to gaming motivation assessment, items were extracted. If scales or items were not directly reported in the manuscript or referenced publications, first authors were consulted, with no reply in 29 cases. Items in languages other than English or German were translated by speakers with at least near-native proficiency. The total number of items extracted was 1,097 (see Fig. 1). For a full list of all contributing publications, see Table A1 in the Electronic supplement, which also contains the reference numbers used in the following (e.g., Paper 1 = P1 hereafter). As several items included ambiguous or multiple statements (e.g., “chatting with and getting to know other people”), such items were split, leading to a total of 1,234 items. Furthermore, item wordings were streamlined to fit the phrase “I play video games to/because/for…”. This allowed for removing 647 duplicates or extremely similar statements (e.g., “…to compete with others” and “…to compete with other players”), leaving 587 items. Items were then removed that (a) referred to a certain game/genre and were thus overly specific (e.g., ‘… to be the most famous model’ when playing goSupermodel; P37), (b) were reversed and thus redundant, or (c) described behavior or statements but not motivations (e.g., ‘I try to make sessions last as long as possible’; P28). Further streamlining led to a final pool of 147 items, which was then subject to a first evaluation. For a full list, see Table A2 in the Electronic supplement.
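The near-duplicate removal step can be illustrated with a short, self-contained sketch that flags pairs of item wordings whose surface similarity exceeds a threshold. This is purely illustrative; the paper does not report an automated procedure, and the item strings and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher

# Illustrative excerpt of streamlined item stems ("I play video games ...")
items = [
    "... to compete with others",
    "... to compete with other players",
    "... to relax",
]

def near_duplicates(pool, threshold=0.85):
    """Return pairs of items whose wordings are highly similar."""
    pairs = []
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            ratio = SequenceMatcher(None, pool[i], pool[j]).ratio()
            if ratio >= threshold:
                pairs.append((pool[i], pool[j], round(ratio, 2)))
    return pairs

# Flags the two "compete" variants (similarity ~ 0.88) but not "... to relax"
print(near_duplicates(items))
```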

Study 2

Method

Participants

For a first online survey aimed at evaluating the final item pool, more than 400 participants were needed to obtain stable EFA parameters (Goretzko et al., 2019). Participants were recruited in May 2020 via the university’s online platform and social media (e.g., Twitter, Instagram, Reddit). Inclusion criteria were being at least 18 years old and having played video games at least once. Ultimately, N = 562 participants took part (Mage = 27.5; SD = 7.99; range = 18 to 66). Most participants identified as male (n = 394, 70.1%), fewer identified as female (n = 166, 29.5%), and a few (n = 2, 0.4%) had other gender identities. On average, participants spent M = 14.52 (SD = 12.95) hours per week playing video games and listed diverse games as their favorite ones, for example, titles from the Call of Duty (n = 49), FIFA (n = 20), The Legend of Zelda (n = 19), League of Legends (n = 18), and The Sims (n = 18) series. Participation took approximately 20–25 min and was rewarded with the chance to win a voucher (8 × 25 EUR). Visual inspection of the raw data and additional quality indicators (Leiner, 2019) gave no cause for concern. Thus, the entire collected data set was used for further analyses.

Measures

Besides providing information on basic demographics and gaming-related questions (e.g., favorite game, gaming hours/week), participants answered the final pool of 147 gaming motivation items on a scale from 1 (= strongly disagree) to 5 (= strongly agree). The 5-point response scale was chosen to balance simplicity and measurement precision (Simms et al., 2019). To evaluate convergent validity, participants rated the 24 items of the PENS+ (Oliver et al., 2016; Rigby & Ryan, 2007), a well-validated questionnaire on gaming and need satisfaction. Items, such as “Video games let you do interesting things”, were rated on a 5-point Likert-type scale (1 = strongly disagree to 5 = strongly agree; α = 0.93). To measure discriminant validity, participants filled out the Dirty Dozen scale (Jonason & Webster, 2010), which aims at measuring so-called dark personality characteristics. Items such as “I tend to lack remorse” were answered on a 9-point Likert-type scale (1 = strongly disagree to 9 = strongly agree; α = 0.84).

Statistical analysis

The dimensionality of the item pool was examined using EFA procedures. First, we explored the distributional properties of the data, as they can heavily impact EFA results (Watkins, 2018). Moreover, distributional properties are known to affect the selection of appropriate correlation, extraction, and factor retention methods (e.g., Garrido et al., 2013). As our data were categorical (i.e., items with 5-point Likert scales) and substantially skewed (see Results section below), we opted to submit a polychoric correlation matrix to EFA. Polychoric correlations also work well with most recommended dimensionality assessment techniques developed for continuous data (Garrido et al., 2011, 2013). As the polychoric correlation matrix was not positive definite, we applied the Bentler and Yuan (2011) smoothing algorithm for the dimensionality assessment, as recommended by Debelak and Tran (2016). Whether the data were appropriate for EFA was tested with Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy.
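As a rough illustration, both appropriateness checks are available in the Python package factor_analyzer. Note that it computes them from Pearson rather than polychoric correlations, so this sketch (run on simulated placeholder data) only approximates the analysis reported here.

```python
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity,
    calculate_kmo,
)

# Placeholder for the real responses: 562 participants x 147 items (coded 1-5)
rng = np.random.default_rng(seed=1)
data = pd.DataFrame(rng.integers(1, 6, size=(562, 147)))

# Bartlett's test of sphericity: tests whether the correlation matrix
# is an identity matrix (i.e., whether the items are uncorrelated)
chi2, p = calculate_bartlett_sphericity(data)

# KMO: sampling adequacy per item and overall (values near 1 are better)
kmo_per_item, kmo_overall = calculate_kmo(data)

print(f"Bartlett: chi2 = {chi2:.2f}, p = {p:.4f}")
print(f"KMO overall = {kmo_overall:.2f}")
```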

We applied several current and recommended factor retention criteria (Auerswald & Moshagen, 2019; Goretzko et al., 2019) to obtain a number of competing models. Horn’s (1965) parallel analysis with principal component analysis (PCA) extraction and mean eigenvalue criterion was chosen, as this specification has been found to be most effective under various conditions (Garrido et al., 2013). Furthermore, we conducted the minimum average partial correlation (MAP) test with the specification of Velicer (1976). Moreover, we applied the Empirical Kaiser Criterion and the Hull method with CAF (common part accounted for) criterion.
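For concreteness, a minimal implementation of the parallel analysis variant named above (PCA extraction, mean-eigenvalue criterion) could look as follows. It substitutes a Pearson correlation matrix for the smoothed polychoric matrix actually used, so it is a simplified sketch rather than the exact procedure.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Horn's parallel analysis with PCA extraction and mean-eigenvalue rule.

    Retains as many components as have observed eigenvalues exceeding the
    mean eigenvalues of random normal data of the same dimensions.
    Simplification: uses Pearson instead of (smoothed) polychoric correlations.
    """
    n, p = data.shape
    # Observed eigenvalues, sorted in descending order
    obs_eig = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    rng = np.random.default_rng(seed)
    rand_eig = np.empty((n_sims, p))
    for s in range(n_sims):
        sim = rng.standard_normal((n, p))
        rand_eig[s] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    # Count components whose eigenvalue beats the mean random eigenvalue
    return int(np.sum(obs_eig > rand_eig.mean(axis=0)))

# Usage: parallel_analysis(responses) -> suggested number of factors
```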

We conducted EFA with mean- and variance-adjusted weighted least squares (WLSMV) estimation. Furthermore, we used Browne’s (2001) modified version of Yates’s (1987) geomin rotation criterion (with ε = 0.01), as it may give satisfactory results when other rotation criteria fail to do so and because its parameter estimates are comparable with CFA when factor loading matrices adhere to a simple structure (Hattori et al., 2017). To estimate the factors’ internal consistency (along with its precision), categorical omega (ω) with 95% confidence intervals obtained via the bias-corrected and accelerated bootstrapping approach (10,000 bootstrap samples) was calculated, as this procedure has been recommended for categorical items (Kelley & Pornprasertmanit, 2016). Nevertheless, we also provide Cronbach’s α with 95% confidence intervals (based on the asymptotic method) for researchers not familiar with categorical omega.
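The logic of omega can be made explicit with a small helper. The sketch below implements the standard (linear) omega formula from standardized loadings; the categorical omega used in the paper additionally accounts for item thresholds, so this function is a simplified stand-in and the example loadings are hypothetical.

```python
import numpy as np

def mcdonalds_omega(loadings):
    """Simplified McDonald's omega from standardized factor loadings.

    omega = (sum lambda)^2 / ((sum lambda)^2 + sum(1 - lambda^2)),
    where 1 - lambda^2 is an item's residual variance under standardization.
    """
    lam = np.asarray(loadings, dtype=float)
    common = lam.sum() ** 2
    residual = np.sum(1.0 - lam**2)
    return common / (common + residual)

# Hypothetical loadings for one six-item factor
print(round(mcdonalds_omega([0.82, 0.78, 0.75, 0.71, 0.69, 0.66]), 2))  # 0.88
```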

Results

Item distribution and EFA appropriateness

Some items showed relatively large skewness and kurtosis (Mskewness = -0.06, SDskewness = 0.86; Minskewness = -2.39; Maxskewness = 1.84; Mkurtosis = -0.31, SDkurtosis = 1.43; Minkurtosis = -1.47; Maxkurtosis = 8.53; see Table A3 and Figure A1 in the Electronic supplement), which justified the calculation of polychoric correlations.

Polychoric correlations between items ranged between − 0.15 and 0.79 (Mpolycor = 0.28, SDpolycor = 0.14; see Figure A2 in the Electronic supplement). Bartlett’s test of sphericity rejected the hypothesis that the correlation matrix was an identity matrix (χ2 = 81,910.66; df = 10,731, p <.001). The KMO measure of sampling adequacy was acceptable with values of 0.86 for the complete dataset and 0.54 to 0.93 (MKMO = 0.85; SDKMO = 0.06) for each item. Altogether, these measures indicate that the correlation matrix is appropriate for EFA.

Dimensionality assessment and factor structure

The different factor retention criteria indicated a factor number between 9 and 17. The parallel analysis indicated 10, the MAP test 17, the Empirical Kaiser Criterion 9, and the Hull method 16 factors to extract (see Figure A3 in the Electronic supplement). Thus, we examined the suggested factor solutions in greater detail (see Figure A4-A12 in the Electronic supplement for pattern matrices, Figure A13-A21 for structure matrices, and Figure A22-A30 for factor correlation matrices).

The 9-factor solution showed symptoms of underextraction with many cross-loadings. In addition, no meaningful common theme summarizing the different items could be found. In contrast, the 11- to 17-factor solutions showed signs of overextraction. Here, factors were unstable and saliently loaded by only a few items with low factor loadings. The 10-factor solution had the best statistical balance between comprehensiveness and parsimony (Watkins, 2020). As this structure was additionally plausible and easy to interpret in terms of content, we opted for the 10-factor solution. To preserve these 10 factors, we subsequently selected the six items with the highest factor loadings on each factor (whereby factor six was saliently loaded by only four items; see the sketch below). This procedure resulted in a total of 58 items that were then submitted to a second EFA. This time, the different factor retention criteria indicated a factor number between 8 and 10, with parallel analysis indicating 9, the MAP test 10, the Empirical Kaiser Criterion 8, and the Hull method 10 factors to extract (see Figure A31 in the Electronic supplement). Thus, we again examined the 8- to 10-factor solutions in detail (see Figure A32-A34 for pattern matrices, Figure A35-A37 for structure matrices, and Figure A38-A40 for factor correlation matrices; all in the Electronic supplement) and chose the 10-factor structure as the final scale solution, as the 8- and 9-factor solutions showed signs of underextraction with many and strong cross-loadings. Reliabilities of these 10 factors reached good to excellent levels (αMin = 0.83, αMax = 0.90; ωMin = 0.86, ωMax = 0.92). A complete list of the factors including hypothetical names, meanings, and example items can be found in Table 1. Furthermore, Table 2 lists the factor reliabilities.
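To make the selection rule explicit, the following sketch (with hypothetical column and item names) picks the six highest-loading items per factor from a pattern matrix, first assigning each item to the factor it loads on most strongly.

```python
import pandas as pd

def top_items(loadings: pd.DataFrame, k: int = 6) -> dict:
    """Select the k items with the highest absolute loading on each factor.

    `loadings` is a pattern matrix with items as rows and factors as columns.
    Each item is first assigned to the factor it loads on most strongly.
    """
    assigned = loadings.abs().idxmax(axis=1)
    return {
        factor: loadings.loc[assigned == factor, factor]
        .abs()
        .nlargest(k)
        .index.tolist()
        for factor in loadings.columns
    }

# Usage: top_items(pattern_matrix) -> {"F1": [...6 items...], ..., "F10": [...]}
```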

Table 1 Hypothesized MOPS factors, meaning, and example items
Table 2 Reliabilities for MOPS factors in study 2 and study 3

A correlation matrix for the resulting 10 factors in combination with the PENS+ and Dirty Dozen subscales provided support for convergent and discriminant validity. As expected, MOPS and PENS+ subscales showed substantial correlations. In contrast, MOPS and Dirty Dozen subscales correlated less often and showed small but plausible associations in only a few cases. For a full correlational table and a graphical representation, see Table A4 and Figure A41 in the Electronic supplement.

Study 3

Method

Participants

In the second online survey, N = 732 participants took part. Participants were recruited from July to October 2021 via the university’s online platform, other survey websites (e.g., SurveyCircle, Psychological Research on the Net), and social media (e.g., Twitter, Instagram, Reddit). Inclusion criteria were the same as in Study 2. Unfortunately, visual inspection of the raw data indicated that several participants showed careless responding patterns. Thus, participants with overly short completion times (i.e., relative speed index ≥ 2; Leiner, 2019), straightlining (i.e., ≥ 15 identical ratings in a row and an average longstring of ≥ 5; Schonlau & Toepoel, 2015), and/or extremely high or low response variability (Dunn et al., 2018; Marjanovic et al., 2015) were inspected more closely by three independent coders (Krippendorff’s ɑ = 0.73). This coding led to the omission of n = 13 participants, so that only meaningful data (N = 719) were retained for further analyses.

The mean age of the remaining participants was M = 25.39 (SD = 8.42, range = 18 to 70). Most participants identified as male (n = 433, 60.2%) or female (n = 269, 37.4%). Furthermore, participants identified as non-binary (n = 5), trans (n = 1), or diverse (n = 1), and a few participants (n = 10) did not wish to indicate their gender identity. On average, participants spent M = 11.92 (SD = 12.9) hours per week playing video games and listed diverse titles as their favorite games (three nominations per participant), most prominently the Call of Duty series (k = 128). Titles across other popular genres, such as Minecraft (k = 67), the Super Mario series (k = 64), the GTA series (k = 50), The Sims series (k = 50), the Zelda series (k = 49), League of Legends (k = 47), the FIFA series (k = 42), The Elder Scrolls series (k = 38), and the Pokémon series (k = 36), were also mentioned among the ten most popular games. Participation took approximately 10–15 min and was rewarded with the chance to win a voucher (10 × 20 EUR).

Measures

Besides providing information on basic demographics (e.g., age, gender) and gaming-related questions (e.g., favorite game, gaming hours/week), participants answered the 58 MOPS items identified in Study 2 on a scale from 1 (strongly disagree) to 5 (strongly agree). Participants again rated the 24 PENS+ items as well (α = 0.94).

Statistical analysis

The 10-factor structure was evaluated with CFA and ESEM. For the CFA model, items were only allowed to load on their a priori theoretical factors (informed by the EFA results of Study 2); that is, cross-loadings were not permitted. In contrast, for the ESEM model, cross-loadings were permitted but targeted to be as close to zero as possible (i.e., target rotation; Marsh et al., 2014; Morin et al., 2020). All models were estimated with the WLSMV estimator to account for the ordinal nature of the items (Morin et al., 2020; van Zyl & ten Klooster, 2022).
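In Python, such a CFA specification could be sketched with the semopy package as below. The item identifiers are hypothetical placeholders, and semopy’s default maximum likelihood objective stands in for the WLSMV estimation reported here (which is available in, e.g., Mplus or lavaan).

```python
import pandas as pd
import semopy

# Hypothetical CFA specification: each item loads only on its a priori
# factor (no cross-loadings); item names such as ach_1 are placeholders.
MODEL_DESC = """
achievement =~ ach_1 + ach_2 + ach_3 + ach_4 + ach_5 + ach_6
competition =~ com_1 + com_2 + com_3 + com_4 + com_5 + com_6
escapism    =~ esc_1 + esc_2 + esc_3 + esc_4 + esc_5 + esc_6
"""  # ... remaining seven factors specified analogously

def fit_cfa(data: pd.DataFrame) -> pd.DataFrame:
    """Fit the CFA model and return fit statistics (chi2, RMSEA, CFI, TLI, ...)."""
    model = semopy.Model(MODEL_DESC)
    model.fit(data)  # default ML objective; the paper used WLSMV
    return semopy.calc_stats(model)
```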

The measurement models were evaluated based on goodness-of-fit indices and measurement quality indicators. Model fit was determined based on χ2 (together with p values and degrees of freedom), RMSEA, CFI, TLI, and WRMR. Measurement quality was assessed with standardized factor loadings (λ > 0.35), levels of tolerance for cross-loadings (only for the ESEM models; λ < 0.30), and communalities (overall R² > 0.50) for each item. Beyond goodness-of-fit indices and measurement quality indicators, to retain an ESEM over a CFA model, the ESEM model should (1) show better data-model fit, (2) yield smaller factor correlations, (3) show only small to medium cross-loadings (e.g., < 0.50), and (4) show well-defined factors (Morin et al., 2020). Internal consistencies were determined via categorical omega.

Factor scores (based on the maximum a posteriori approach) were used as LPA indicators, as fully latent models often show convergence problems (Meyer & Morin, 2016). All LPA models were estimated using a robust maximum likelihood estimator to appropriately account for non-normality of the indicators (Spurk et al., 2020). Variances of the indicators were freely estimated across classes, as wrongly assuming invariant variances may result in profile overextraction (Magidson & Vermunt, 2004). Moreover, our sample was large enough to include this degree of complexity in our models (Meyer & Morin, 2016). LPA models were estimated with 5,000 sets of random starts, allowing 1,000 iterations each and retaining the 200 best solutions for final optimization to avoid converging on a local maximum (Hipp & Bauer, 2006). To determine the number of profiles, statistical adequacy, substantive interpretability and utility, the relative size of emergent profiles, as well as classification diagnostics were jointly considered (Masyn, 2013; Nylund-Gibson & Choi, 2018). Statistical adequacy was evaluated with the following statistical criteria: Akaike’s information criterion (AIC), consistent AIC (CAIC), Bayesian information criterion (BIC), sample-size-adjusted BIC (SABIC), approximate weight of evidence (AWE), and the Lo-Mendell-Rubin likelihood ratio test (LMR-LRT). To ensure proper classification accuracy, we calculated two indices, namely entropy (i.e., an index of overall classification precision; Celeux & Soromenho, 1996) and average posterior probabilities (AvePP, i.e., an index of classification uncertainty; Masyn, 2013).
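Since the original profile analyses were run in dedicated mixture-modeling software, the following is only an approximate sketch: scikit-learn’s GaussianMixture with diagonal, class-specific covariances mimics an LPA with freely estimated indicator variances, and the factor scores are replaced by simulated placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder for the 10 ESEM factor scores of the N = 719 players
rng = np.random.default_rng(seed=2)
scores = rng.standard_normal((719, 10))

# Compare 1- to 10-profile solutions; covariance_type="diag" lets indicator
# variances differ across profiles, n_init approximates multiple random starts
for k in range(1, 11):
    gmm = GaussianMixture(
        n_components=k, covariance_type="diag", n_init=50, random_state=0
    ).fit(scores)
    print(f"{k} profiles: AIC = {gmm.aic(scores):.1f}, BIC = {gmm.bic(scores):.1f}")

# Relative entropy of a 4-profile solution (overall classification precision)
best = GaussianMixture(
    n_components=4, covariance_type="diag", n_init=50, random_state=0
).fit(scores)
post = best.predict_proba(scores)
entropy = 1 - (-np.sum(post * np.log(post + 1e-12)) / (len(scores) * np.log(4)))
print(f"entropy = {entropy:.3f}")
```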

Results

Preliminary analysis

Skewness and kurtosis were again quite pronounced for some MOPS items (Mskewness = -0.21, SDskewness = 0.88; Minskewness = -2.44; Maxskewness = 1.70; Mkurtosis = -0.17, SDkurtosis = 1.91; Minkurtosis = -1.39; Maxkurtosis = 8.21; see Table A5 and Figure A42 in the Electronic supplement), indicating the need to account for non-normality. Polychoric correlations between items ranged between − 0.11 and 0.85 (Mpolycor = 0.31, SDpolycor = 0.15; see Figure A43 in the Electronic supplement).

Measurement models

The CFA model showed reasonable data-model fit (χ2 = 4342.696; df = 1550; p <.001; RMSEA [90% CI] = 0.050 [0.048; 0.052]; CFI = 0.942; TLI = 0.938; WRMR = 1.577). However, the ESEM model showed an even better data-model fit (χ2 = 1685.169; df = 1118; p <.001; RMSEA [90% CI] = 0.027 [0.024; 0.029]; CFI = 0.988; TLI = 0.983; WRMR = 0.496). The factor correlations in the CFA model ranged between 0.17 and 0.68 (M = 0.43; SD = 0.14; see Figure A44) and in the ESEM model between − 0.05 and 0.49 (M = 0.29; SD = 0.13; see Figure A45). Standardized factor loadings in the CFA model ranged between 0.58 and 0.92 (M = 0.81; SD = 0.08; see Figure A46) and in the ESEM model between 0.31 and 0.94 (M = 0.72; SD = 0.13; only one item of the factor creativity, namely “… to satisfy my curiosity”, had a standardized factor loading < 0.35; see Figure A47). Communalities in the CFA model ranged between 0.33 and 0.85 (M = 0.67; SD = 0.12; with four items having an overall R² < 0.50) and in the ESEM model between 0.40 and 0.87 (M = 0.70; SD = 0.10; with three items having an overall R² < 0.50). Cross-loadings in the ESEM model ranged between − 0.24 and 0.32 (M = 0.02; SD = 0.08); only one skill item (‘to test my skills’; cross-loading of 0.32 on the factor competition) and one competition item (‘to prove to my friends that I am the best’; cross-loading of 0.32 on the factor prestige) showed cross-loadings > 0.30. These results clearly favored the ESEM model over the CFA model. Thus, we retained the ESEM model and calculated factor scores from this model as LPA indicators. Reliabilities of the 10 factors reached good to excellent levels (αMin = 0.82, αMax = 0.92; ωMin = 0.85, ωMax = 0.94; see Table 2 for more details). The final MOPS version can be found in Table A7 in the Electronic supplement.

A correlation matrix for the 10 factors combined with the PENS+ again provided support for convergent validity. For a full correlational table, see Table A6 in the Electronic supplement.

Latent profile analysis

Table 3 shows the fit indices associated with LPA solutions including 1 to 10 profiles (for the elbow plot of these indices, see Figure A48). The indices did not converge on a single solution. The AWE indicated a 3-profile and the LMR-LRT a 4-profile solution. The CAIC favored a 6-profile and the BIC a 7-profile solution. The AIC, SABIC, and bootstrapped likelihood ratio test (BLRT) kept decreasing without reaching a minimum. Examination of the elbow plots associated with the information criteria was more informative, suggesting only small decreases in most information criteria after 4 profiles were extracted. Comparing the 3-profile with the 4-profile solution, the added profile was substantially different from the other profiles, supporting the 4-profile solution (see Figure A49). Comparing the 4-profile solution with the 5- and 6-profile solutions, the 5-profile solution again added a substantially different profile, while the 6-profile solution arbitrarily split an existing profile. As the additional profiles were very small (5-profile: 1.9%; 6-profile: 1.8%, 2.6%; see Figure A50), we disregarded them and retained the 4-profile solution for further analyses. The 4-profile solution had a high level of classification accuracy (i.e., entropy of 0.813 and AvePP between 0.849 and 0.915; see Figure A51). Figure 2 displays the profile plot of the 4-profile solution. Members of the first profile were labeled Casual Players, as they had the lowest values on all motives, with playing to fight boredom, entertainment, and escapism being their strongest motives. The second profile was labeled High Performers, as it describes players scoring high on aggression, competition, and prestige and below average on boredom and enjoyment. Members of the third profile were labeled Crafters, showing high values on creativity and enjoyment and low values on aggression. Lastly, members of the fourth profile were labeled Highly Involved Players, with the highest values on all motives, especially escapism.

Table 3 Latent profile analysis models fit statistics
Fig. 2 Latent profiles of the 4-profile solution. Note: Profile indicators are estimated from ESEM factor scores with a mean of 0 and a standard deviation of 1. Percentages are based on most likely profile membership. Deviations from 100% are due to rounding

To further examine profile composition regarding other demographic and gaming-related variables, we conducted several descriptive analyses (for an overview see Table 4; for a more detailed graph see Figure A52 in the Electronic supplement).

Table 4 Descriptive statistics for the four profiles (casual players, crafters, high performers and highly involved players)

Discussion

The aim of the current studies was to develop and validate a comprehensive new questionnaire to assess gaming motivation. Initial construct validation of the novel MOPS indicates that it is a suitable measure of general gaming motivation. Three studies provided substantive (i.e., literature review), structural (i.e., item and factor analysis, reliability coefficients), and external (i.e., convergent and discriminant analysis) validity evidence (Flake et al., 2017). The strength of the current measure compared to previous scales lies in its item pool, which is based on a comprehensive literature review. Furthermore, the MOPS was constructed using a bottom-up approach without being limited to a specific theoretical background or gaming genre, and thus assesses gaming motivation in general. We can therefore conclude that the MOPS is applicable to a broad range of video games.

The factor analyses confirmed some factors identified in earlier work, while at the same time highlighting previously less accentuated dimensions. In particular, our factor on socializing motives has often been found in other researchers’ analyses (P9, P11, P20). Similarly, items related to the escapism factor have been identified previously, although sometimes under slightly different labels (e.g., distraction [P14], recovery [P38], or coping [P34]). Furthermore, prior subscales on fun (P13) and (hedonic) entertainment (P26) closely resemble our enjoyment subscale. Factors similar to our achievement subscale are included in several previous gaming motivation questionnaires (P15, P55-57). In addition, several previous questionnaires contain a competition subscale (P3). Since the MOPS items are based on a comprehensive pool of existing scales, this finding is not surprising. Nevertheless, our analyses also reveal stable motivational factors that have been less discussed in the literature. For example, only a few scales include an independent motive for exploration (P20, P40). Furthermore, boredom or passing time has been mentioned in only a few other scales (P11). Very little attention has been paid to measuring skill-based motives (P11) and aggression or violence (P20, P13). In contrast, prestige is captured here for the first time. As gaming has become a mainstream activity and gamers are less stereotyped (Kowert et al., 2014), prestige-based motives may occur more often.

Based on participants’ motives, we were also able to conduct LPA, resulting in four stable and equally sized player profiles: Casual Players, Crafters, High Performers, and Highly Involved Players. Profile assignment matched well with other demographic and gaming-related items. For example, in accordance with previous literature (Poels et al., 2012), Casual Players were found to have a rather short playing time per week and preferred to play on the phone or console. Furthermore, this group had a larger proportion of female players and covered a relatively wide age range (Scharkow et al., 2015). Highly Involved Players played comparatively longer, preferably on a console. In contrast, High Performers preferred playing on the PC and had the highest share of male gamers.

Limitations

As our analyses are exclusively based on a comprehensive pool of existing items on gaming motivation, the current MOPS only includes constructs that have been discussed in the literature to date. Thus, comparatively novel phenomena in the gaming landscape and related motives may have been overlooked. For example, as esports players and streamers play video games professionally, their motivational profiles may differ slightly from those of ‘casual’ players (e.g., higher values for competition; Bányai et al., 2019a, b). Other recently studied aspects of gaming motivation have not yet been adequately measured and therefore could not be included in our item pool. For example, entertainment media is not only used for pleasure (i.e., hedonic motivation), but also for ‘truth-seeking’ and experiencing meaningfulness (i.e., eudaimonic motivation; Oliver & Raney, 2011). Although video games successfully elicit meaningful experiences (Daneels et al., 2021; Oliver et al., 2016) and players are sometimes guided by eudaimonic motivation (Possler et al., 2020), validated scales on eudaimonic motivation for playing video games are currently missing. We would also like to emphasize that we used a convenience sample and that our results reflect only the motivation reported by the participants themselves. Therefore, our scale still needs to be validated against actual gaming behavior and with a more culturally diverse sample in the future (see Implications and future research). Nonetheless, due to its broad and general scope, we consider the MOPS, in its present or future adapted versions, an adequate testing tool.

Implications and future research

The present studies have important implications for theory, practice, and future research. Besides replicating several well-known gaming motives, the MOPS also revealed novel motivations. The current conceptualization sets the stage for other gaming scholars to further investigate the motivational appeal of new gaming topics (e.g., motivation to play meaningful video games, work in esports, or livestream gaming content). Practitioners could be especially interested in our latent profiles. Game manufacturers, for example, could use the scale to better tailor their products to the target audience (depending on gamer type and motivation). Finally, as gaming research increasingly focuses on who is playing and why, rather than only on (clinical) outcomes of gaming, the MOPS provides a solid tool for future research. Future work could expand the scale to include additional language versions to enable the assessment of gaming motivation across different cultural backgrounds.

Conclusion

Past research has indicated the importance of studying gaming motivation while at the same time failing to create a comprehensive, validated measure of general gaming motives. In this article, we propose and validate a new measure designed to assess general gaming motives. Results across three studies demonstrate that the 58-item MOPS and its 10-factor structure are reliable and valid. We therefore believe that this new measure provides researchers and practitioners with a useful tool to examine and expand our knowledge of gaming motivation.