1 Introduction

Mobile devices are widely used in daily life, and their adoption has driven the rapid development of mobile applications [1]. Commonly used mobile applications include mobile commerce, mobile social media, and mobile health [2]. For developers, designing mobile applications that people want to use is of great significance in a highly competitive market. According to TAM3 (Technology Acceptance Model 3), objective usability is one determinant of perceived ease of use, which in turn influences the perceived usefulness and usage intention of a technology [3]. Usability therefore has an indirect influence on usage intention. Because improving usability is an effective way to strengthen usage intention, research on mobile applications usability evaluation has become popular.

According to the definition by the International Organization for Standardization (ISO), usability is the ability of a system to enable a specific user to accomplish a specific goal in a specific context [4]. Wei & Dong defined mobile system usability as the ability of a mobile system to help the user meet the intended goals in a specific context [2]. Compared with the usability evaluation of other systems, several issues must be considered for mobile applications because of the characteristics of mobile devices [5,6,7]. The first issue is the small screen size [2, 5,6,7]. Because the screen is limited, deciding what information to display [2] and how to organize it effectively is essential to improving mobile applications usability. The second issue is the mobile context [5, 6]. Mobile applications are frequently used in dynamic environments that involve many interactions between the user and the context [5, 6]. For example, a driver using a mobile map app must pay attention to both the road conditions and the app. The third issue is the input method [2, 5,6,7], which also follows from the small screen size [2, 5, 6]. Buttons on mobile devices cannot be as large as those on other devices because of the screen size constraint. Small buttons, however, can increase operation difficulty and errors and lower input speed and efficiency [6]. The remaining issues include connectivity [5, 6] and display resolution [5,6,7].

This report is a systematic review of mobile applications usability evaluation based on bibliometric analysis. Scopus and Web of Science were used to search the literature, and three tools (VOSviewer, CiteSpace, and MAXQDA) were used for the bibliometric analysis. This first section is the introduction, which presents the background and the topics to be analyzed. The second section describes the procedure for searching related documents and the search results. The third section presents the analysis results. The fourth section is the discussion. The fifth section is the conclusion, which summarizes the main work of this report. The last section is the future work, which points out possible directions for further research.

2 Procedure

In this report, documents related to mobile applications usability evaluation were analyzed. Scopus and Web of Science were used to search for papers published from 2010 to 2021. Three search terms were used: mobile applications, usability, and evaluation. Table 1 shows the search results; more papers were found in Scopus than in Web of Science.

Table 1. Table of databases and search results.

After the search, a trend diagram was generated from the number of documents per year. Next, co-occurrence keyword analysis, co-authorship analysis, and co-citation analysis were performed in VOSviewer on the search results. CiteSpace, another effective tool for co-citation analysis, was also used. Then, the leading authors and leading sources for mobile applications usability evaluation were summarized from the papers found in Scopus. Finally, a word cloud was generated with MAXQDA.

3 Results

3.1 Trend Analysis

Papers found in both databases were analyzed. Figure 1 shows the number of documents per year. For the papers found in Scopus, the number of papers published per year increases almost continuously from 2010 to 2019, decreases in 2020, and increases again in 2021. For the papers found in Web of Science, the number of papers published per year decreases from 2010 to 2012, increases from 2013 to 2017, decreases in 2018 and 2020, and increases in 2019 and 2021. The generally increasing trend reflects the popularity of this topic in recent research.

Fig. 1. Trend diagram based on the number of documents per year from 2010 to 2021.

3.2 Co-occurrence Keywords Analysis

The papers found above were used for the co-occurrence keyword analysis. The minimum number of occurrences for a term was 20. With this threshold, 342 terms were selected from the papers found in Scopus and 64 terms from those found in Web of Science. Of these, 205 terms from Scopus and 38 terms from Web of Science were chosen as the most relevant terms. Figure 2 is the co-occurrence keyword diagram based on Scopus, and Fig. 3 is the co-occurrence keyword diagram based on Web of Science.

Fig. 2. Co-occurrence keyword diagram based on Scopus.

Fig. 3. Co-occurrence keyword diagram based on Web of Science.

Figure 2 and Fig. 3 are the co-occurrence keyword diagrams for mobile applications usability evaluation. In a co-occurrence keyword diagram, the size of a term’s circle reflects how often the term occurs in the titles or abstracts of the papers: the more frequently a term appears, the larger its circle [8]. “Mobile application” has the largest circle in both Fig. 2 and Fig. 3, which indicates that it is the most frequently occurring term in the titles and abstracts of the documents found in both databases. Next, the more frequently two terms co-occur, the closer they are placed to each other [8]. For example, “mobile health” and “diabete” in Fig. 2 and “effectiveness” and “satisfaction” in Fig. 3 are close together, which means that they frequently co-occur in papers. Finally, terms with the same color belong to the same cluster, and the terms in a cluster share a similar research topic [8]. Fig. 2 has four clusters and Fig. 3 has three clusters. In Fig. 2, the red cluster contains many terms related to mobile health, such as “mobile health”, “mhealth”, “diabete”, “patient”, and “treatment”, which represent the main topic of this cluster. In the blue cluster of Fig. 3, three terms are usability evaluation attributes: “efficiency”, “effectiveness”, and “satisfaction”.
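The two quantities behind such a diagram, the number of occurrences of each term and the number of papers in which two terms co-occur, can be illustrated with a minimal Python sketch. It is not VOSviewer's implementation. It assumes binary counting (a term is counted at most once per paper), which this report does not specify, and the function name `cooccurrence` and the toy corpus are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, min_occurrences=1):
    """Count term occurrences and pairwise co-occurrences.

    docs: list of term lists, one list per paper (terms taken from
    the title or abstract). Assumption: binary counting, so a term
    is counted at most once per paper.
    """
    occurrences = Counter()
    for terms in docs:
        occurrences.update(set(terms))

    # Keep only terms that reach the minimum-occurrence threshold.
    kept = {t for t, n in occurrences.items() if n >= min_occurrences}

    # A pair of kept terms gains one link for every paper containing both.
    links = Counter()
    for terms in docs:
        present = sorted(set(terms) & kept)
        for a, b in combinations(present, 2):
            links[(a, b)] += 1

    return {t: occurrences[t] for t in kept}, dict(links)


# Hypothetical toy corpus, not the data analyzed in this report.
toy_docs = [
    ["mobile application", "usability evaluation", "mobile health"],
    ["mobile application", "mobile health", "patient"],
    ["mobile application", "efficiency", "satisfaction"],
]
occ, links = cooccurrence(toy_docs, min_occurrences=2)
print(occ)
print(links)
```

In a diagram built from these counts, the occurrence count would drive the circle size and the co-occurrence count would pull frequently paired terms closer together; the layout and clustering steps themselves are not sketched here.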

To clarify the importance of the different terms, the terms were ranked by the number of occurrences reported by VOSviewer. Table 2 shows the top terms identified by the co-occurrence keyword analysis for the two databases. The top terms based on the documents found in Scopus are “Mobile application”, “Application”, “Paper”, “App”, “System”, “Participant”, “Usability evaluation”, “Patient”, “Model”, and “Score”. The top terms based on the documents found in Web of Science are “Mobile application”, “Paper”, “App”, “Usability evaluation”, “Quality”, “Participant”, “Review”, “Mobile device”, “Patient”, and “Problem”.

Table 2. Table of the top terms by co-occurrence keyword analysis in VOSviewer.

3.3 Co-authorship Analysis

The papers found above were used for the co-authorship analysis, which was conducted in VOSviewer. The minimum number of documents per author was 3. With this threshold, 122 authors were selected based on Scopus and 17 authors based on Web of Science. Figure 5 and Fig. 7 are the co-authorship diagrams, and Fig. 4 and Fig. 6 show details of them. In a co-authorship diagram, the size of a circle indicates the number of documents the author has [9]. For the papers found in Scopus, Hussain A has 41 papers and the largest circle in Fig. 5. For the papers found in Web of Science, Hussain A is also the leading author based on the circle size and the number of documents. The total link strength of an author is the sum of the strengths of that author’s links [10]. Hussain A has the highest total link strength for the papers found in both Scopus and Web of Science.
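As an illustration of how the document count and the total link strength relate, the following minimal Python sketch derives both from a list of author lists. It assumes full counting (every co-authored paper adds 1 to the link between each pair of its authors), which this report does not state, and it is not VOSviewer's implementation. The function name `coauthorship` and the author names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coauthorship(papers, min_documents=1):
    """Return documents per author, link strengths, and total link strength.

    papers: list of author lists, one list per paper.
    Assumption: full counting, so each shared paper adds 1 to a link.
    """
    documents = Counter()
    for authors in papers:
        documents.update(set(authors))

    # Keep only authors with at least `min_documents` papers.
    kept = {a for a, n in documents.items() if n >= min_documents}

    link_strength = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors) & kept), 2):
            link_strength[(a, b)] += 1

    # Total link strength: sum of the strengths of all links of an author.
    total = {a: 0 for a in kept}
    for (a, b), strength in link_strength.items():
        total[a] += strength
        total[b] += strength

    return {a: documents[a] for a in kept}, dict(link_strength), total


# Hypothetical toy data, not the authors found in Scopus or Web of Science.
toy_papers = [
    ["Author A", "Author B"],
    ["Author A", "Author C"],
    ["Author A", "Author B", "Author C"],
    ["Author D"],
]
docs_per_author, links, total = coauthorship(toy_papers, min_documents=2)
print(docs_per_author)
print(total)
```

In this toy example the author with the most documents also has the highest total link strength, which mirrors the pattern reported above but need not hold in general.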

Fig. 4. Detail of the co-authorship diagram based on Scopus.

Fig. 5. Co-authorship diagram based on Scopus.

Fig. 6. Detail of the co-authorship diagram based on Web of Science.

Fig. 7. Co-authorship diagram based on Web of Science.

3.4 Co-citation Analysis

Papers found in Scopus were used for the co-citation analysis. The minimum number of citations of a cited reference was 7, and 12 cited references were selected. Figure 8 is the co-citation diagram, and Fig. 9 lists the selected references. In Fig. 8, the 12 cited references are divided into 7 clusters. According to Fig. 9, paper [6] has the most citations and the highest total link strength, which indicates its importance in this area. Because the information available for one paper in Fig. 8 is not adequate, 11 papers (6 clusters) in Fig. 9 were selected for further analysis.

The first cluster contains three papers related to SUS (System Usability Scale) [11,12,13]. Brooke developed SUS to enable effective usability evaluation in different contexts [12] and reviewed it in 2013 [13]. SUS contains ten statements, and participants fill in the questionnaire based on their experience immediately after using a system in an experiment [12]. Bangor et al. investigated the relationship between the SUS score and usability by adding one statement to SUS [11]. The second cluster contains three papers that focus on usability evaluation models [5, 6, 14]. Papers [5] and [6] are reviews of mobile applications usability evaluation models that discuss usability evaluation methods, usability evaluation attributes, and related topics. Seffah et al. proposed an integrated usability evaluation model that builds on existing usability evaluation models [14]. The third cluster has two papers [7, 15]. Coursaris & Kim reviewed mobile applications usability evaluation based on previous empirical studies [7]. Nielsen & Molich used heuristic evaluation to find usability problems in different user interfaces [15]. The remaining three clusters contain one paper each [16,17,18]. Bangor et al. analyzed SUS based on empirical studies and verified its effectiveness [16]. Lund proposed the USE questionnaire, another subjective usability evaluation questionnaire [17]. Stoyanov et al. designed MARS (Mobile App Rating Scale) to evaluate the usability of mobile health apps [18].

Fig. 8. Co-citation diagram based on Scopus.

Fig. 9. Selected papers in Fig. 8.

CiteSpace is another effective tool for co-citation analysis. In this part, the papers found in Web of Science were used. Figure 10 shows part of the co-citation diagram generated by CiteSpace. In Fig. 10, some nodes are labeled with the author name and year, and some clusters are labeled with keywords. Several keywords were identified in this figure: “mobile applications”, “usability testing”, “digital psychiatry”, “health care evaluation mechanisms”, “older adults”, “technology acceptance”, “software”, and “information technology”. Moreover, citation burst detection in CiteSpace is an effective way to find essential papers on a given topic. As shown in Fig. 11, four important papers related to mobile applications usability evaluation were found, and three of them [5, 7, 18] also appear in Fig. 9. The red line in Fig. 11 marks the years in which a paper was frequently cited [19]. For instance, the paper by Harrison et al. [5] was frequently cited from 2016 to 2018.

Fig. 10. Co-citation diagram by CiteSpace based on Web of Science.

Fig. 11. Citation bursts based on Web of Science.

3.5 Leading Table

The first leading table is the leading author table based on the papers found in Scopus (Table 3). It is ranked by the number of papers each author published from 2010 to 2021. According to Table 3, Hussain A published the largest number of papers on this topic in that period. The second leading table is the leading source table based on the papers found in Scopus (Table 4). Based on the keywords, Table 3 and Table 4 show both the leading authors and sources and the popular research topics for each of them. Table 4 indicates that research related to mobile health was popular.

Table 3. Leading author table.
Table 4. Leading source table.

3.6 Word Cloud

Content analysis was conducted with MAXQDA. Four papers from Fig. 9 [5,6,7, 14] and two book chapters related to mobile applications usability evaluation [2, 20] were selected. In MAXQDA, the minimum word frequency was set to 50, and the number of words in the image was set to 50. The word cloud was generated after removing irrelevant words and repetitions. In the word cloud, the size of a term represents how often it occurs in the selected materials. In Fig. 12, “usability” and “mobile” are large, which means that these two terms appear frequently in the selected documents.
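The word selection behind such a word cloud can be approximated with a short Python sketch: count words, drop irrelevant ones, apply a minimum frequency, and keep a fixed number of the most frequent words. This is only an illustration under stated assumptions, not MAXQDA's procedure. The tokenization rule, the stop-word list, the function name `word_frequencies`, and the toy texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list standing in for the "irrelevant words"
# removed manually in the report.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def word_frequencies(texts, min_frequency=50, max_words=50,
                     stop_words=STOP_WORDS):
    """Return up to `max_words` (word, count) pairs, most frequent first,
    keeping only words that occur at least `min_frequency` times."""
    counts = Counter()
    for text in texts:
        # Assumption: a word is a run of ASCII letters, case-insensitive.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stop_words)

    frequent = [(w, n) for w, n in counts.most_common() if n >= min_frequency]
    return frequent[:max_words]


# Hypothetical toy texts with small thresholds so the example is readable.
toy_texts = [
    "Usability of mobile applications",
    "Mobile usability evaluation",
    "Evaluation of mobile apps",
]
top_words = word_frequencies(toy_texts, min_frequency=2, max_words=3)
print(top_words)
```

The defaults of 50 for both parameters correspond to the settings reported above; the font size of each word in the cloud would then scale with its count.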

Fig. 12. Word cloud generated by MAXQDA.

4 Discussion

4.1 Mobile Applications Usability

According to the ISO definition, if one or more of the three factors (user, goal, and context) change, the usability of the system may differ [4]. Coursaris & Kim proposed a mobile usability framework with three layers [7]. The first layer has four factors (user, environment, task/activity, and technology) that can influence mobile usability, the second layer includes several usability attributes that represent mobile usability from different perspectives, and the third layer shows the effect of usability [7]. Some keywords identified by the co-citation analysis in Fig. 10 are related to this framework. For example, “older adults” is related to the user in the first layer, and “technology acceptance” is an effect of usability.

In the first layer, usability can be influenced by the four factors above. Users can be divided into different groups based on their age, gender, culture, etc. [7]. When different users use the same system to complete the same task in the same context, the system usability may vary. Therefore, user differences should be considered in mobile applications usability evaluation to meet the requirements of different groups. For example, factors such as visual acuity, hearing, and memory should be emphasized when designing for the elderly [21]. Next, the usability of a system may change with the environment. Taking the mobile map app as an example, even though some mobile map apps are easy to use in a static environment, their usability might change in a dynamic environment, such as while walking or driving. Then, the task or activity has an impact on usability. As the difficulty or number of tasks increases, the usability of mobile applications may vary. Harrison et al. pointed out that mobile application developers tend to add functions to achieve as many goals as possible [5]. However, redundant functions may have negative effects on the original and primary goals [5]. Finally, technology is an essential factor that can influence usability. In the framework proposed by Coursaris & Kim, technology comprises device type and interface [7]. Mobile devices include mobile phones, tablets, and wearable devices [2, 22]. Kortum & Sorber found that the usability of phone applications is better than that of tablet applications [23]. The main difference between the phone and the tablet is the screen size. For the same mobile application, the tablet can display more information than the phone on one page. However, a large amount of information makes it harder to select the proper information and decreases usability. Additionally, because of the limited screen size and capacity, mobile application developers consider the necessity of each function more carefully [23].

The second layer comprises the usability attributes. Table 5 shows the usability attributes used in previous papers. The first and second rows of Table 5 are general usability attributes, and the third and last rows are mobile applications usability attributes. Based on Table 5, the frequently used mobile applications usability attributes are effectiveness, efficiency, satisfaction, learnability, memorability, and error [4,5,6, 24]. Effectiveness is the ability of a mobile application to enable users to accomplish specified goals completely and accurately [4,5,6]. Efficiency reflects the resources required to achieve goals with speed and accuracy [4,5,6, 24]. Satisfaction is the user’s subjective feeling after using a specific mobile application [4,5,6, 24]. Learnability means that the mobile application should be easy to learn, so that the user can reach a specific level of proficiency in a short time [5, 6, 24]. Memorability represents the extent to which the user can still use the mobile application after not using it for a given period [5, 6, 24]. Error can be measured by the number of errors that occur during usability testing, and well-designed mobile applications should keep the number of errors low during use [5, 6, 24].

Table 5. Usability attributes in previous papers.

The third layer is the effect of usability. Well-designed mobile applications enable users to accomplish tasks accurately, quickly, and with little effort. Improving mobile applications usability can attract more users, since people tend to use mobile applications that are easy to operate. According to technology acceptance models, objective usability is one factor that affects perceived ease of use, and perceived ease of use is one attribute that can influence perceived usefulness and usage intention [3, 25]. Therefore, improving usability can stimulate and sustain customers’ intention to use mobile applications.

4.2 Mobile Applications Usability Evaluation Methods

Lab experiments and field studies are two frequently used mobile applications usability evaluation methods [5,6,7]. The difference between them is that a lab experiment is conducted in a controlled context, while a field study is conducted in a real-world context [5, 6]. Moreover, it is easier to control and record the testing process in a lab experiment than in a field study [6].

After participants complete the tasks, their subjective feelings about the usability of the mobile application can be collected with a questionnaire. SUS is an effective usability evaluation instrument [12, 13] and is frequently used in mobile applications usability evaluation [26,27,28,29]. It contains ten statements that let participants assess system usability from different perspectives, and participants rate their agreement with each statement immediately after completing all tasks [12]. Nevertheless, how to interpret the final score in terms of usability remains a question [11, 16]. Bangor et al. added to SUS one statement about the overall feeling toward the product, investigated the relationship between the SUS score and the overall feeling, and found that they are highly correlated [11, 16]. The USE questionnaire, proposed by Lund, is another subjective usability evaluation questionnaire, which evaluates usability from three perspectives [17]. Each perspective has several statements, and participants rate each statement after using a product [17].
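The ten SUS ratings are conventionally combined into a single score between 0 and 100. The following Python sketch applies the standard scoring rule from Brooke's original description of SUS, which is not spelled out in this report: on a 1 to 5 agreement scale, each odd-numbered statement contributes its rating minus 1, each even-numbered statement contributes 5 minus its rating, and the sum is multiplied by 2.5. The function name `sus_score` is hypothetical.

```python
def sus_score(responses):
    """Compute a SUS score (0 to 100) from ten ratings on a 1 to 5 scale.

    responses: ratings for statements 1..10 in questionnaire order.
    Odd-numbered statements are positively worded, even-numbered
    statements negatively worded, so they are scored in opposite
    directions before summing.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for number, rating in enumerate(responses, start=1):
        if number % 2 == 1:
            total += rating - 1   # odd statement: higher agreement is better
        else:
            total += 5 - rating   # even statement: lower agreement is better
    return total * 2.5


# A participant who answers neutrally (3) to every statement scores 50.
print(sus_score([3] * 10))
```

The score is a single aggregate rather than a percentage, which is why its interpretation is treated as an open question in the studies cited above.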

Another subjective method is heuristic evaluation, an effective and efficient way to evaluate software usability [30]. Participants raise usability problems based on a set of heuristics and their own user experience. Nielsen & Molich pointed out that five evaluators can find most of the usability problems based on nine interface usability heuristics, and that more usability problems can be identified as the number of evaluators increases [15]. Nielsen later proposed ten interface usability heuristics in 1994, and the latest update was in 2020 [31]. However, the usability heuristics proposed by Nielsen focus mainly on desktop interfaces rather than mobile applications, for which the small screen size, input methods, and other characteristics of mobile devices should be taken into consideration [32]. Therefore, Joyce & Lilley developed usability heuristics for mobile applications based on previous literature and experts’ assessments [32]. Joyce et al. also verified the effectiveness of their usability heuristics in identifying mobile applications usability problems [33]. Inostroza et al. have also contributed to the development of mobile applications usability heuristics [34].

Mobile applications usability can also be evaluated with objective metrics. Different metrics evaluate usability from different perspectives, such as effectiveness, efficiency, learnability, memorability, and error [5, 6, 35]. For example, effectiveness is usually represented by the task completion rate; efficiency is often evaluated by the task completion time and task success rate; learnability is frequently assessed by the time spent reaching a specific level of proficiency or the time spent on the first trial; memorability can be evaluated by the time spent on the second use; and error can be measured by the error frequency [5, 6, 35].
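A minimal Python sketch of three of these metrics is given below, computed from per-trial records of a usability test. The record layout (`Trial`), the function name `summarize`, and the choice to average completion time over completed trials only are assumptions made for illustration; the cited papers may define the metrics differently, and the trial values are invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One participant's attempt at one task (hypothetical record layout)."""
    completed: bool   # whether the task was finished successfully
    seconds: float    # time spent on the task
    errors: int       # number of errors observed during the task


def summarize(trials):
    """Return simple effectiveness, efficiency, and error metrics."""
    if not trials:
        raise ValueError("at least one trial is required")
    completed = [t for t in trials if t.completed]
    return {
        # Effectiveness: share of trials in which the task was completed.
        "completion_rate": len(completed) / len(trials),
        # Efficiency: mean time of completed trials (None if none completed).
        "mean_time_completed": (
            mean([t.seconds for t in completed]) if completed else None
        ),
        # Error: average number of errors per trial.
        "errors_per_trial": sum(t.errors for t in trials) / len(trials),
    }


# Invented example data, not results from any study cited in this report.
toy_trials = [
    Trial(True, 30.0, 0),
    Trial(True, 50.0, 2),
    Trial(False, 90.0, 4),
    Trial(True, 40.0, 0),
]
metrics = summarize(toy_trials)
print(metrics)
```

Learnability and memorability could be derived from the same kind of records by comparing the time of a first trial with that of a later trial, provided each record also stored the trial order.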

5 Conclusion

Because the mobile applications market is highly competitive, research on mobile applications usability evaluation has become popular. This report therefore analyzed the topic based on recent literature. In the trend analysis, the number of papers on mobile applications usability evaluation found in Scopus increases almost every year from 2010 to 2021, except in 2020. The increasing number of papers reflects the popularity of this topic in recent years, customers’ demand for mobile applications with high usability, and developers’ awareness of the need to improve usability. Co-occurrence keyword analysis can indicate both the importance of each term and the relationships between terms [8]. Based on this analysis, some important terms were identified, as were closely related terms such as “mobile health” and “diabete”. Co-authorship analysis and co-citation analysis serve a similar function. In co-authorship analysis, the importance of an author can be ranked by the number of documents, citations, and total link strength, and the relationship between authors can be assessed from the distances and links in the co-authorship diagram. In this report, Hussain A has the largest number of documents and the highest total link strength in the papers found in both databases. Co-citation analysis was performed with VOSviewer and CiteSpace. Both tools can identify frequently cited references and the relationships between references. Additionally, the citation bursts generated by CiteSpace show which references were frequently cited in different periods. Based on the co-citation analysis, papers related to usability evaluation models, SUS, the USE questionnaire, and usability heuristics were identified. Then, two leading tables were constructed based on the number of documents per author and the number of documents per source. Finally, a word cloud was generated with MAXQDA from the selected materials, which directly shows the leading terms in them.

6 Future Work

This report is a literature review of mobile applications usability evaluation based on bibliometric analysis methods. However, the papers analyzed in depth are limited, since most of them come from the co-citation analysis based on two databases. Therefore, more databases could be searched and more papers included in the review. Next, the mobile usability evaluation framework, including usability attributes and usability evaluation methods, may differ with users, contexts, and goals. In future work on mobile applications usability evaluation, the choice of usability attributes, evaluation methods, and evaluation heuristics should take these differences into account. Finally, the objective of this research area is to detect and resolve, during the design process, usability problems that would influence technology acceptance. However, other factors can also influence technology acceptance. One prominent factor for mobile applications pointed out by Wei & Dong is security [2]. How to prevent security problems, and how to recover immediately after they occur to avoid losses, is of great significance for future work.