1 Introduction

The adoption of mobile devices, smartphones in particular, has increased at an astronomical rate over the past decade in almost every corner of the world. As these devices become more ubiquitous, many researchers have studied their relationship with education. A cross-country analysis shows that participation rates in secondary schools are strongly associated with mobile subscription rates (Pratama 2017). A meta-analysis of 39 studies from 14 countries and regions between 2008 and 2017 shows that, in general, mobile phone use has a negative, albeit small, effect on educational outcomes (Kates et al. 2018). At the extreme, excessive use of mobile devices is associated with psychological distress (Beranuy et al. 2009) or may even lead to cognitive failures (Hadlington 2015). On a more optimistic note, other studies show that the use of mobile devices for learning, if and when done properly, can help support academic empowerment and growth (Mouza and Barrett-Greenly 2015) and enhance students’ learning attitudes and achievements (Hwang et al. 2011; Martin and Ertzberger 2013).

From the technological perspective, e-learning is a good example of how information and communication technology (ICT) can be used to facilitate teaching and learning activities. Over the past few decades, the adoption of e-learning by educational institutions and other organizations has become more common, especially as it can help overcome the time and geographical boundaries that might otherwise prevent teaching and learning activities from happening in the first place. Over time, e-learning has been shown to have many other advantages, such as increased engagement and knowledge of technology, or improved quality of instruction and learning outcomes (Chugh 2010). A wide range of e-learning tools can now be accessed from any personal computing device: access is no longer limited to desktop or laptop computers but extends to mobile devices such as smartphones and tablets.

Mobile learning (m-learning), a term that became recognized in 2005 as noted by Crompton (2013), refers to any kind of learning activity conducted via mobile devices, typically but not necessarily on the Internet. While some researchers argue that m-learning is derived from, and is thus a subset of, e-learning (Georgiev et al. 2004; Ozuorcun and Tabak 2012), others argue that there are major differences between the two, especially in pedagogical aspects as well as in communication and evaluation methods. For example, m-learning is more associated with informal and situated learning, as opposed to the formal and distance learning settings associated with e-learning (Laouris and Eteokleous 2005; Korucu and Alkan 2011). Nonetheless, both views share a common premise: mobile devices can be used to help improve the learning experience and, by extension, the quality of education in general.

Some m-learning activities have been shown to have a positive relationship with educational attainment, especially when mobile devices are included in learning activities by design rather than simply for the sake of catching up with the latest technology. For example, students who prefer m-learning activities over traditional e-learning on personal computers show a greater tendency to be active and collaborative learners (Pratama and Scarlatos 2020a). Another study found that some out-of-class learning activities with the help of mobile devices, as opposed to learning in front of a computer in a lab, can help improve students’ willingness to continue to engage in learning (Sung and Mayer 2013). M-learning can also be helpful when mobile devices are used to track and monitor the time devoted to self-regulated learning, which helps improve students’ time management skills (Tabuenca et al. 2015). The benefits of m-learning are attainable anywhere, but they are especially important in some places, such as developing countries with limited resources or countries with large populations scattered among different regions. As shown in studies across the globe, these countries need to grow their economies by educating their people through cost-effective ways of reaching everyone, including those in rural areas (Motlik 2008; Grimus et al. 2012).

With a population of over 270 million, the archipelago nation of Indonesia is the fourth most populous country in the world (The World Bank 2020a). Its GDP per capita is just over 4,000 USD, which placed it among the upper-middle income economies for the first time in 2019 after more than a decade in the lower-middle income category (The World Bank 2020b). Mobile subscriptions in Indonesia had already outnumbered the total population by early 2015, placing it ahead of many developed countries in the number of mobile connections per capita (Kemp 2015). Mobile penetration in the country increased significantly from 33.85% in 2015 to 61.85% in 2019 (Statista 2020), well ahead of the global penetration rate, which rose more modestly from 33.5% in 2015 to 41.5% in 2019 (Statista 2019). More importantly, half of the web traffic from the country comes from mobile devices, roughly one half more than the global average of 33% (Kemp 2017). An earlier study found that smartphones are the most commonly owned personal computing devices among Indonesian secondary school students. These students use their mobile devices mostly for accessing social media. Interestingly, they also use them for educational and productivity apps more than for gaming (Pratama and Scarlatos 2020b). Taken together, these conditions suggest that Indonesia could be an ideal environment for realizing the promise of mobile learning.

Given the importance of m-learning, it is necessary to investigate the factors that determine its acceptance. Numerous studies have done so and found that the perceived usefulness or performance expectancy of m-learning is important to users’ behavioral intention to use it (Huang et al. 2007; Wang et al. 2009; Park et al. 2012; Sidik and Syafar 2020). Nevertheless, those studies mostly used college students or adult learners as their subjects, so their findings might not necessarily apply to adolescents, who have unique characteristics of their own. For example, adolescents tend to report little self-perceived dependence on mobile devices while showing some symptoms of it, such as loss of control, withdrawal, and using their phones to escape from unwanted emotions (Nikolopoulou and Gialamas 2018). In another study, the intense use of text-messaging apps, which requires writing and reading, did not appear to be significantly associated with adolescents’ literacy levels (Zebroff and Kaufman 2016). These characteristics could mean that either different factors are at play, or the same factors are at play but with different magnitudes and degrees of importance, when it comes to adolescents’ acceptance of a new technology, including m-learning.

The objective of this study was to develop a general linear structural model of m-learning acceptance of adolescents in secondary schools as well as to investigate the effect of age, sex, and location differences in their m-learning acceptance by adopting and extending the technology acceptance model (TAM) by Davis et al. (1989). The findings of this study would help schools, educators, and policymakers when trying to implement m-learning, particularly in the context of adolescents in secondary school. The rest of this paper is organized as follows. In the next section, the reasoning for adopting TAM as the theoretical framework of this study will be elaborated, followed by descriptions of the research method and the proposed model. Finally, the results of the hypothesis tests will be presented prior to discussion of the implications of this study.

2 Theoretical framework

Ever since its introduction in the late 1980s, TAM has been one of the most popular models used by researchers to help explain and predict human acceptance of a new technology. In this model, the two main predictors of behavioral intention (BI) are perceived usefulness (PU) and perceived ease of use (PEOU), both of which are mediated by attitude (ATT). Over the years, TAM has been extensively used and validated in various applications in ICT sectors. In the case of m-learning, Huang et al. (2007) proposed an extended version of TAM in which two new exogenous variables, perceived enjoyment (PE) and perceived mobility value (PMV), are specified as external variables preceding both PU and PEOU in the original TAM, along with some minor changes in the relationships between variables. This model was tested against a dataset of college students in Taiwan, where it was able to explain 60% of the variance in students’ intention to use m-learning. Both new variables were shown to be significant, and the study also found that, in the case of m-learning, PU affects students’ ATT and BI more than PEOU does.

Apart from TAM, another model called the Unified Theory of Acceptance and Use of Technology (UTAUT) is also commonly used by researchers for predicting acceptance of a new technology (Venkatesh et al. 2003). This model was developed by integrating eight different models, including TAM. When tested using data from four organizations over a six-month period, this model was able to explain 69% of the variance in BI, whereas the eight older models could only explain between 17% and 53%. In the case of m-learning, UTAUT has been evaluated by other researchers using data from different countries. Magsamen-Conrad et al. (2015) used samples from the United States to predict multigenerational tablet adoption practices and found that this model could explain only up to 25% of the variance in tablet use intention, far lower than in the original study. They also found that only two of the four independent variables in UTAUT significantly predicted behavioral intention to use a tablet. Meanwhile, Wang et al. (2009) adopted this model and tested it against data from Taiwan, where it could explain 58% of the variance in behavioral intention to use m-learning. UTAUT has also been used recently in the Indonesian context, specifically for m-learning among college students, although the study gave no information about its explanatory power (Sidik and Syafar 2020).

Drawing on what has been learned from the literature, the proposed model in this study uses the extended TAM specified by Huang et al. (2007) as its base model, as it has shown the best explanatory power in predicting students’ intention to use m-learning, and adds two variables borrowed from UTAUT, namely social influence (SI) and facilitating conditions (FC). Both variables represent external motivation, in addition to the internal motivation already represented by the other variables in the base model. In total, seven factors are used to predict BI in this model, outlined as follows:

2.1 Perceived mobility value (PMV)

PMV refers to the individual’s awareness of mobility value (Huang et al. 2007). Mobility in the context of m-learning means students can learn from anywhere at any time, which is one of the most important and unique features that m-learning has to offer compared to traditional e-learning that requires a workstation that is usually attached to a specific location. Even when compared with e-learning with a laptop that offers flexibility to some degree, m-learning still has the advantage of always-on characteristics of smartphones and tablets, which is not the case in laptop computers. Hence, PMV is hypothesized to be the direct antecedent of perceived usefulness (PU).

  • H1. PMV has a positive effect on PU.

2.2 Social influence (SI)

Adopted directly from UTAUT, SI refers to the individual’s perception that others think they should use the new technology in question (Venkatesh et al. 2003). For a secondary school student, these others would be their friends and family. Using m-learning can be both more fun and easier for students when they have social support from their inner circle. Conversely, it can be dull, harder, and less appealing when their friends and family have rather negative attitudes toward it, turning social influence from a support system into social pressure that drives them away from adopting it. As a representation of external motivation, SI is hypothesized to influence both perceived enjoyment (PE) and perceived ease of use (PEOU).

  • H2. SI has a positive effect on PE.

  • H3. SI has a positive effect on PEOU.

2.3 Perceived enjoyment (PE)

PE refers to the individual’s feeling of enjoyment when using the system (Huang et al. 2007). As a relatively new technology, m-learning is less likely to be perceived as difficult to adopt when it is considered enjoyable and fun by the users. Individuals could also form a more positive attitude toward a system that is enjoyable (Huang et al. 2007). Hence, PE is hypothesized to influence both perceived ease of use (PEOU) and attitude toward m-learning (ATT).

  • H4. PE has a positive effect on PEOU.

  • H5. PE has a positive effect on ATT.

2.4 Perceived ease of use (PEOU)

PEOU is one of the original constructs in TAM, defined as an individual’s perception that using the technology in question (i.e., m-learning) will be rather effortless (Davis et al. 1989). Individuals will form a more negative attitude toward m-learning, and will not see it as useful, if it is difficult to use in the first place. Therefore, just as specified in the original TAM, PEOU is hypothesized to be a direct antecedent of both perceived usefulness (PU) and the individual’s attitude (ATT).

  • H6. PEOU has a positive effect on PU.

  • H7. PEOU has a positive effect on ATT.

2.5 Perceived usefulness (PU)

PU is another original construct in TAM, defined as an individual’s perception that using the technology in question (i.e., m-learning) will improve their performance (Davis et al. 1989). In the context of m-learning, it means that using m-learning can help improve students’ performance in learning. The more an individual believes that m-learning is useful, the more likely that individual will form a more positive attitude toward and a stronger intention of using it. PU is hypothesized to be a direct antecedent of both attitude (ATT) and behavioral intention (BI) of using m-learning.

  • H8. PU has a positive effect on ATT.

  • H9. PU has a positive effect on BI.

2.6 Attitude (ATT)

ATT is a mediator between PEOU/PU and BI in TAM as specified by Davis et al. (1989). With everything else being equal, individuals will form a behavioral intention of using m-learning when they have a positive rather than negative attitude towards it in the first place. Hence, ATT is hypothesized to be the direct antecedent of behavioral intention (BI).

  • H10. ATT has a positive effect on BI.

2.7 Facilitating conditions (FC)

Adopted directly from UTAUT, FC refers to the individual’s belief that some infrastructure, both organizational and technical, exists to support use of the system (Venkatesh et al. 2003). In the context of m-learning in Indonesian secondary schools, this construct refers to policies made by the schools and the government that can help support the use of m-learning. One good example is Buku Sekolah Elektronik, a free electronic textbook program by the Indonesian government that comes with an official mobile app to help students download and read the e-books (Pratama and Firmansyah 2020). As shown in UTAUT, FC is a significant predictor of technology acceptance, although such a support system does not necessarily make m-learning more enjoyable, easier, or more useful. Therefore, FC, as another representation of external motivation, is hypothesized to directly influence the individual’s attitude (ATT) and behavioral intention (BI).

  • H11. FC has a positive effect on ATT.

  • H12. FC has a positive effect on BI.

Figure 1 shows the proposed model with all hypothesized relationships between all variables.

Fig. 1 The proposed extended TAM in this study
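The twelve hypothesized paths can also be written down as a small directed graph, which makes the mediation structure explicit. The following is a minimal sketch in Python, not the authors' analysis code (the study itself used lavaan in R); the names `HYPOTHESES`, `build_graph`, and `paths_to` are illustrative assumptions. It lists every directed route from a construct to BI implied by H1 to H12.

```python
# Hypothesized direct paths H1-H12 as (predictor, outcome) pairs, copied from Section 2.
# All function and variable names here are illustrative, not from the original study.
HYPOTHESES = {
    "H1": ("PMV", "PU"),
    "H2": ("SI", "PE"),
    "H3": ("SI", "PEOU"),
    "H4": ("PE", "PEOU"),
    "H5": ("PE", "ATT"),
    "H6": ("PEOU", "PU"),
    "H7": ("PEOU", "ATT"),
    "H8": ("PU", "ATT"),
    "H9": ("PU", "BI"),
    "H10": ("ATT", "BI"),
    "H11": ("FC", "ATT"),
    "H12": ("FC", "BI"),
}


def build_graph(hypotheses):
    """Turn the hypothesis table into an adjacency list: predictor -> outcomes."""
    graph = {}
    for predictor, outcome in hypotheses.values():
        graph.setdefault(predictor, []).append(outcome)
    return graph


def paths_to(graph, start, target="BI"):
    """Return every directed path from start to target (the model has no cycles)."""
    if start == target:
        return [[target]]
    found = []
    for nxt in graph.get(start, []):
        for tail in paths_to(graph, nxt, target):
            found.append([start] + tail)
    return found


GRAPH = build_graph(HYPOTHESES)

if __name__ == "__main__":
    for construct in ["PMV", "SI", "PE", "PEOU", "PU", "ATT", "FC"]:
        for path in paths_to(GRAPH, construct):
            print(" -> ".join(path))
```

In this representation only PU, ATT, and FC have a direct path to BI; PMV, SI, PE, and PEOU reach BI solely through mediators, which is why their influence is later reported as indirect and total effects.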

3 Methods

An anonymous paper-based survey was conducted in 2016 in two middle schools and two high schools in Indonesia, with one school at each level in an urban area and one in a rural area. Ethical clearances were obtained from the local education authorities that regulate and supervise all research activities in K-12 education systems within their jurisdictions. All four schools were top-performing public schools in their respective regions and levels of education. A total of 1156 students participated in the study. Schools were selected with a purposive sampling method, while classes were selected with a stratified random sampling method based on school and grade. All students present in the selected classes filled out the survey during school hours, yielding a high response rate of 97.15%. The participants ranged from 13 to 19 years of age (M = 15.80, SD = 1.71). The characteristics of the respondents are shown in Table 1.

Table 1 Characteristics of the respondents

3.1 Measures

To ensure the content validity of the scales, the items selected need to represent the concept about which generalizations are to be made. Therefore, items used to measure all constructs in this model were adopted from the previous studies (Huang et al. 2007; Wang et al. 2009). Some adjustments were made by taking into consideration the local culture in Indonesian schools and the difference in characteristics between English and Bahasa Indonesia that was used in delivering the questionnaires to the students. All items were measured by using Likert scales (1–5), with anchors ranging from ‘strongly disagree’ to ‘strongly agree’. The original items in Bahasa Indonesia used in the study are available in the appendix, whereas the English equivalents are listed in Table 2.

Table 2 Means, standard deviations, and construct loadings for all measurement items

4 Results

The structural equation modeling (SEM) analysis was conducted using R 3.4.4 on RStudio 1.1.423 with the lavaan 0.6-3 package (Rosseel 2012).

4.1 Assessment of measurement model

The maximum likelihood method was used to estimate the parameters when analyzing the hypothesized relationships among the variables in this model. The proposed structural equation model was tested for overall goodness of fit with the data and evaluated with the two-index presentation strategy to control Type I and Type II errors simultaneously (Hu and Bentler 1998). In this case, the combination of SRMR = .05 and CFI = .95 means that the model met the requirements (i.e., SRMR ≤ .08 and CFI ≥ .95) and exhibited a good fit with the data collected.
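The decision rule above is simple enough to state as code. The sketch below is only an illustration of the two cutoffs quoted in the text; the function name `meets_two_index_rule` and its parameter names are assumptions, and the fit indices themselves would come from the SEM software rather than from this snippet.

```python
def meets_two_index_rule(srmr, cfi, srmr_max=0.08, cfi_min=0.95):
    """Return True when both cutoffs quoted in the text hold: SRMR <= .08 and CFI >= .95."""
    return srmr <= srmr_max and cfi >= cfi_min


if __name__ == "__main__":
    # Values reported in this section: SRMR = .05 and CFI = .95.
    print(meets_two_index_rule(srmr=0.05, cfi=0.95))  # prints True
```

Requiring both indices at once is what makes the rule a joint criterion: a model with an acceptable SRMR but a CFI below .95 would still be rejected.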

Convergent validity of this model was evaluated with two approaches. The first was to examine the factor loadings from the confirmatory factor analysis, as shown in Table 2. Following the recommendation made by Hair et al. (1992), a factor loading greater than 0.50 is considered to be very significant. The loadings of all measurement items in the proposed model range from .752 to .887, indicating adequate convergent validity.

The second approach was made by assessing the composite reliability (CR) and average variance extracted (AVE) as shown in Table 3. Following the recommendation made by Alarcón and Sánchez (2015) and using the criterion of Fornell-Larcker (Fornell and Larcker 1981), the recommended values for CR are above .70. Meanwhile, AVE values above .70 are considered very good, whereas the level of .50 is acceptable. In this case, the results show CR values for all factors range from .777 to .870 while the AVE values for all factors range from .636 to .770, both indicating adequate reliability and convergent validity.
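For readers unfamiliar with these two indices, the sketch below shows how CR and AVE are commonly computed from standardized factor loadings, assuming uncorrelated measurement errors. The loadings used are hypothetical values chosen to fall within the range reported above; they are not the loadings from Table 2, and the function names are illustrative.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    For standardized loadings, the error variance of each item is 1 - loading^2.
    """
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


if __name__ == "__main__":
    # Hypothetical standardized loadings for one three-item construct (NOT from Table 2).
    hypothetical_loadings = [0.76, 0.80, 0.85]
    cr = composite_reliability(hypothetical_loadings)
    ave = average_variance_extracted(hypothetical_loadings)
    print(f"CR  = {cr:.3f} (recommended: above .70)")
    print(f"AVE = {ave:.3f} (acceptable: .50, very good: above .70)")
```

Because AVE averages the squared loadings, a construct whose items all load at about .80 has an AVE near .64, which is consistent with the lower end of the range reported for this model.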

Table 3 Composite Reliability, Average Variance Extracted, and Shared Variance

Off-diagonal elements are the shared variance

To examine discriminant validity, the shared variance between factors was compared with the AVE of the individual factors, as recommended by Fornell and Larcker (1981). As shown in Table 3, all the shared variances between factors were lower than the AVE of the individual factors, thus confirming discriminant validity. Discriminant validity was also examined through the correlations among the constructs, also known as the heterotrait-monotrait ratio of correlations (HTMT). Following the recommendation made by Alarcón and Sánchez (2015) and using the criterion of Kline (2011), a correlation value larger than .85 indicates poor discriminant validity in SEM. As shown in Table 4, the correlation values range from .457 to .847, indicating adequate discriminant validity in this model.
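The two checks described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: shared variance is taken here as the squared inter-construct correlation, the AVE and correlation values are hypothetical rather than those in Tables 3 and 4, and the function names are invented for the example.

```python
def fornell_larcker_violations(ave, correlations):
    """Return construct pairs whose shared variance (r squared) is not below both AVEs."""
    violations = []
    for (a, b), r in correlations.items():
        shared_variance = r ** 2
        if shared_variance >= ave[a] or shared_variance >= ave[b]:
            violations.append((a, b))
    return violations


def high_correlation_pairs(correlations, cutoff=0.85):
    """Return construct pairs whose absolute correlation exceeds the .85 cutoff."""
    return [pair for pair, r in correlations.items() if abs(r) > cutoff]


if __name__ == "__main__":
    # Hypothetical values for three constructs (NOT taken from Tables 3 and 4).
    ave = {"PU": 0.70, "ATT": 0.66, "BI": 0.75}
    correlations = {("PU", "ATT"): 0.60, ("ATT", "BI"): 0.78, ("PU", "BI"): 0.55}
    print("Fornell-Larcker violations:", fornell_larcker_violations(ave, correlations))
    print("Pairs above .85:", high_correlation_pairs(correlations))
```

An empty list from both functions corresponds to the outcome described in the text, where every shared variance is lower than the relevant AVEs and no correlation exceeds .85.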

Table 4 Correlation matrix between factors

4.2 Structural model and hypotheses tests

The results of path analysis show that 11 out of 12 direct paths in the proposed model are significant, indicating that H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, and H11 are supported. As for H12, the direct effect of FC on BI is not significant, but when considering the indirect effect that is mediated by ATT, a significant total effect of FC on BI is found. Figure 2 shows the coefficients of all the hypothesized paths in the model. Meanwhile, Table 5 summarizes the direct, indirect, and total effects of all factors on BI.

Fig. 2 Path coefficients of the proposed extended TAM

Table 5 Standardized estimates of all predictors on BI
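The distinction between direct, indirect, and total effects follows the usual path-analysis rule: an indirect effect is the product of the coefficients along a mediated route, and the total effect is the sum over all routes. The sketch below illustrates that rule on the hypothesized model. The coefficients are hypothetical placeholders, not the estimates in Figure 2 or Table 5, and the names `COEF` and `effects` are assumptions made for this example.

```python
# Hypothetical standardized path coefficients for H1-H12.
# These are placeholders for illustration, NOT the estimates reported in this study.
COEF = {
    ("PMV", "PU"): 0.40,
    ("SI", "PE"): 0.60,
    ("SI", "PEOU"): 0.30,
    ("PE", "PEOU"): 0.50,
    ("PE", "ATT"): 0.40,
    ("PEOU", "PU"): 0.30,
    ("PEOU", "ATT"): 0.20,
    ("PU", "ATT"): 0.10,
    ("PU", "BI"): 0.10,
    ("ATT", "BI"): 0.70,
    ("FC", "ATT"): 0.20,
    ("FC", "BI"): 0.05,
}


def _total(coef, node, target):
    """Sum, over every directed path from node to target, of the product of coefficients."""
    if node == target:
        return 1.0
    return sum(b * _total(coef, dst, target)
               for (src, dst), b in coef.items() if src == node)


def effects(coef, source, target="BI"):
    """Return (direct, indirect, total) effects of source on target."""
    direct = coef.get((source, target), 0.0)
    total = _total(coef, source, target)
    return direct, total - direct, total


if __name__ == "__main__":
    for construct in ["PMV", "SI", "PE", "PEOU", "PU", "ATT", "FC"]:
        direct, indirect, total = effects(COEF, construct)
        print(f"{construct:5s} direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With these placeholder values, FC has a small direct effect on BI but a larger indirect effect through ATT, which mirrors the pattern described for H12: a weak direct path can still yield a meaningful total effect once mediation is taken into account. Significance of each effect would still need to be tested in the SEM software.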

4.3 Moderation analysis

To investigate whether sex, age, and location differences have a moderation effect on any factors in this model, the same SEM approach was repeated with multiple group analyses. To simplify age group assignments and to avoid an arbitrary cut-off when converting a continuous variable into a categorical one, education level is used to represent age in this analysis. In other words, middle school students represent the younger group, whereas high school students represent the older group. The total effects of each latent variable on behavioral intention across groups are summarized in Table 6.

Table 6 Total effects of all predictors on BI across groups

Table 7 summarizes the results of hypothesis tests in this study where 11 out of 12 hypotheses (H1-H11) are supported while the other one (H12) is only partially supported due to the non-significant direct effect while still having a significant total effect. Three hypotheses (H8, H9, and H12) are moderated by sex, age, and location differences, while another one (H11) is moderated by age differences.

Table 7 Summary of hypothesis tests results

5 Discussion

The findings show that the extended TAM in this study can be employed to explain and predict the acceptance of m-learning among Indonesian secondary school students. Overall, this model can explain 73.4% of the variance in students’ behavioral intention to use m-learning, which is much higher than the 60% explained by the base model as specified by Huang et al. (2007). All factors in the model are confirmed to be significant factors of m-learning acceptance. Just as in previous research, ATT is the most important factor behind m-learning acceptance, and it also serves as a mediator for all other predictors in the model (Davis et al. 1989; Huang et al. 2007).

Apart from ATT, PE affects BI more than any other predictor in the model, while itself being affected the most by SI. While significant in all groups, the effect of PE on BI is much stronger for male students than for female students. Interestingly, the total effect of SI on BI is the next strongest in this model. As with PE, the effect is stronger for male students than for female students. PEOU affects ATT and BI more than PU but less than PE. Its effects are also quite uniform across groups, indicating little to no moderation effect from sex, age, or location. PU, while still a significant factor of m-learning acceptance in this model, is heavily moderated by sex, age, and location differences at the same time. Interestingly, in some cases it is no longer a significant factor, especially for male students and younger students in middle school. FC is also found to be a significant factor of m-learning acceptance that is mediated by ATT. Its effect is moderated by location, where it is more important for students in rural areas than for students in urban areas, and even more so by age, where younger students in middle school are much more affected by it than older students in high school. PMV significantly affects PU; however, its total effect on BI is moderated by sex differences, where it is significant for female students but not for male students, and by age differences, where it is significant for older students in high school but not for younger students in middle school.

Some findings in this study are intriguing, considering that the literature suggests men are generally more strongly influenced by their perception of usefulness, whereas women are more strongly influenced by their perception of ease of use and social influence (Venkatesh and Morris 2000). The differences between the findings of this study and the literature can be attributed to the difference in research settings (i.e., location and time) and the unique characteristics of the subjects of this study compared to those of previous studies. Cultural differences between the West and the East might play an important role, especially for the effect of SI, which is stronger in a collectivist country like Indonesia (Triandis et al. 1993; Goodwin and Giles 2003; Ralston et al. 2014).

Furthermore, the subjects in this study were born in the late 1990s to early 2000s, also known as the post-millennial generation or Generation Z, whereas previous studies used the millennial generation or Generation Y (i.e., those born in the 1980s to mid-1990s) or even earlier generations as their subjects. Since the rise of social networking sites and mobile technology did not start until the mid-2000s, it is understandable that post-millennials are more accustomed to mobile technology: they had the opportunity to interact with it early in their childhood, as opposed to later in adulthood for the earlier generations. It is also a good indication that adolescents and younger generations in general do not share the same characteristics as adults (i.e., they are not as rational), as shown by how PE is much more important to them than PU and PEOU. For many of these students, the usefulness of m-learning is not as important as its playfulness. No matter how useful it is, they might not be interested in using it if they do not find it enjoyable. Nevertheless, this does not mean we should all jump on the edutainment (i.e., a hybrid of education and entertainment) train at once, especially when it tends to sacrifice the education part in favor of the entertainment one (Bosacki 2005; Okan 2003). Instead, more research needs to focus on how to provide a playful and enjoyable learning experience (Resnick 2004). One way of doing so is by integrating gamification into the learning process (Ofosu-Ampong et al. 2020).

Of the three moderator variables in this model, age differences turned out to be more decisive than sex and location differences. While sex differences affect internal motivation (i.e., PMV and PU) and location differences moderate external motivation (i.e., FC), age differences affect both internal and external motivation (i.e., PMV, PU, and FC) at once. Considering that the average age gap between middle school and high school students in this study is only three years, it is likely that a more significant difference can be found between groups with a wider age gap.

Although the most decisive factors of m-learning acceptance in this model are internal motivations (i.e., ATT and PE), two external motivations (i.e., SI and FC) also play an important role among the predictors. SI is a direct antecedent of, and the most influential factor behind, PE, while FC is especially important for younger students and students in rural areas. There is a huge opportunity for policymakers, be it schools or governments, to close the gap between sexes, generations, geographical locations, and socioeconomic statuses and thus help promote m-learning to everyone. This study shows this is true for adolescents in Indonesia, and it might also be true in other developing countries.

6 Conclusion

The theoretical significance of this study lies in its perspective on m-learning acceptance from adolescents’ point of view instead of that of college students and adult learners, who are more commonly studied in the literature. With a high explanatory power of 73%, the model proposed in this study emphasizes how the uniqueness of adolescents affects the way they accept a new technology like m-learning. As it turns out, adolescents in secondary schools are not as rational as adults, as shown by how perceived enjoyment and social influence are far more influential than perceived usefulness, which is usually the most important factor driving users’ acceptance of m-learning in the literature.

As for the practical implications, the findings also highlight how policymakers, schools and governments in particular, can help promote m-learning and close the gaps between different groups of students, particularly across sex, age group, and geographical location. As was evident in a previous study conducted with the same subjects, the Buku Sekolah Elektronik program has helped nudge Indonesian secondary school students to become e-book readers (Pratama and Firmansyah 2020). Thus, there is a huge opportunity for governments to help promote a more universal acceptance of m-learning by providing more facilitating conditions aimed at specific groups of students who are either in a disadvantaged position or simply need more help, nudging, and guidance to see the benefit of adopting m-learning.

The main limitation of this study is that it was conducted in only one country. It would be interesting to see whether the same findings hold in other parts of the world, not only in other developing countries that share more or less the same characteristics as Indonesia, but also in developed countries that usually show better utilization of mobile devices and whose students might be more accustomed to m-learning activities. By replicating this study with adolescents from other countries, we could see whether generational similarities across different cultures are indeed stronger than cultural similarities across different generations. The answer to this question could help shed light on how to better utilize mobile devices for educational purposes among younger generations across the globe.