1 Introduction

Context-aware services and systems for mobile devices such as smartphones and tablet computers have recently become widely available. Because mobile devices are highly mobile and portable, the mobile environment offers various sensors, flexible network connectivity and convenient access to external resources, and these characteristics have driven a rising demand for context-aware services. In addition, as the paradigm shifts from the ubiquitous environment, where flexible network connectivity is what matters, to the mobile computing environment, which provides user-centric personal services, studies on mobile context awareness have become more important [11]. Technologies for analyzing user data have also become more critical as the mobile application market grows.

As the number of mobile applications skyrockets, however, it has become more difficult for users to find the applications or contents they want in the mobile market. Furthermore, when an upgraded application with similar contents is released, users easily lose interest in the applications they already have, so they keep searching for new applications with upgraded functions. These trends have led to the introduction of recommendation models in the application market.

This study describes a context-aware recommendation model built on a mobile application analysis platform. First, log data such as usage duration, location information and device information are collected from mobile applications, and a user profile is generated automatically from them. Second, the recommendation model creates a list of recommendation candidates by analyzing the usage frequency, usage duration and up-to-dateness of each mobile application. The proposed model has a flexible structure designed to generate a list of candidates that depends on the user’s circumstances in each period, and it therefore adopts a collaborative filtering model that handles duration, location and device information. Lastly, mobile applications are suggested on the basis of the contextual usage data.

This study is structured as follows. Section 2 introduces studies related to personalized recommendation and context-aware recommendation. Section 3 explains a mobile application analysis platform based on a big data system. Section 4 describes a recommendation model that uses various contextual data of mobile applications. Section 5 reports the results of the experiment conducted with the proposed model. Section 6 concludes and discusses future work.

2 Related work

2.1 Mobile log analysis

Many studies have addressed mobile log analysis. Because mobile devices are highly portable, in particular, data can be collected and analyzed from diverse perspectives.

Xu [24] collected and analyzed user data from diverse perspectives, taking the mobility and diversity of mobile devices into account. Exploiting the ease of use of the mobile device network, packet data were collected and used in usability analysis. However, it is difficult to collect and analyze user data in large amounts in this way. Girardello [10] proposed a method that recommends popular applications installed by many users by storing the numbers of installations and deletions in a database. The system targeted travelers interested in applications that help find a particular location. Instead of a general recommendation technique, only the user location and the app’s consumption information were used for recommendation. Yan [25] analyzed how users actually use apps with item-based collaborative filtering and recommended applications based on an analysis of what applications other nearby users used. Verkasalo [22] extracted and analyzed the usability of mobile applications. After MobiTrack was installed on the mobile device, usability-related data were collected automatically and periodically transferred to a server for synchronization. The collected data were analyzed under five categories only: communication, usage patterns of the mobile application, navigation through the internet, device, and multimedia information. The approach therefore could not provide analysis under more specific categories. Kang [12] collected and analyzed diverse usage patterns on the mobile device. Various information (e.g., GPS, SMS, battery) was collected and used for statistical analysis, from which the user status was inferred. Cloud-based sensor data were also collected and analyzed. Seo [20] collected and analyzed sensor and network information in order to improve user convenience.

Interest has been rising in studies that analyze usage patterns from log data [9, 18, 19]. However, it has been very hard to find a study that analyzes and materializes usage patterns on a cloud platform after collecting mobile logs [4, 6, 8, 16, 26].

2.2 Context-aware recommendation system

In the mobile environment, many earlier studies have addressed the recommendation of a suitable application or content according to user circumstances. Users’ context information can be divided into static context information (e.g., age, gender, occupation) and dynamic context information (e.g., temperature, weather, location). As context-aware technology becomes more important with the growth and expansion of the mobile industry, in particular, many techniques that infer user preference from these context data have been proposed in mobile recommendation studies [2, 3].

Debnath [7] proposed a method that, after recommending an application suitable to the user’s circumstances and obtaining feedback on user preference, adjusts the adequacy of the recommendation according to the duration and frequency of usage. Lee [17] suggested a methodology that improves recommendation performance by acquiring context information from the user log. The method recognizes users’ tendencies from time-based context information and applies them to collaborative filtering among users in similar circumstances. Lee [15] did not compare the rating information of all users but used only above-average users to improve recommendation accuracy. After the contexts of the current users were compared, higher weights were applied to similar users for recommendation. Lee [14] proposed a Fuzzy-Bayesian Network-based music recommendation method that attempts to derive user circumstances from dynamic context information (e.g., temperature, humidity, time) as well as static context information. Woerndl [23] suggested a system that recommends mobile applications based on user context through a hybrid recommendation engine under a multidimensional approach.

In addition, various other studies have derived user behavior or circumstances in order to recommend more customized services or contents in the mobile environment, or have recommended similar applications. However, most of these studies focused on static context information such as the user profile, experienced contents and item history [5]. In addition, they relied mainly on conventional filtering techniques to form groups of similar users. They were therefore limited in effectively composing a group of users with similar patterns or propensities under diverse circumstances. Lastly, in a mobile computing environment where various forms of contents are created explosively, there have been few studies on mobile context-aware recommendation that provides more personalized contents [13, 21].

3 A Mobile application analysis platform

The structure of the mobile application analysis platform for context-aware recommendation is shown in Figs. 1 and 2. The platform runs on a big data system hosted on Citrix CloudStack infrastructure and uses Hadoop and HBase to handle the problem of data extensibility. The cloud environment aims to minimize the system problems that can arise from system extensibility [1].

Fig. 1 Structure of the Mobile Application Analysis Platform Module System

Fig. 2 Configuration of the Mobile Application Analysis Platform Module

3.1 Client module

The client module collects and transfers sensor data based on mobile users’ activities (Fig. 2). It provides an SDK library that makes it easy to collect user activities on mobile devices. The module monitors all activities generated in applications that include the library and collects data related to the time and location dimensions. The collected data are then transmitted to the server module when the mobile device is in WiFi mode and the maximum allowable data size has been exceeded. In this study, two SDK versions (Android and iOS) are provided, and user log data are collected in three categories depending on their attributes: Ds, Dd and Du. First, Ds includes static data that are collected once, such as the UUID, application title, OS version, device model number and resolution. Second, Dd covers dynamic data (e.g., start time, end time, latitude, longitude) that are collected according to user activities. Third, Du includes user-defined data for collecting sensor data beyond the basic ones. The data collected from the mobile sensors are stored in SQLite and transferred to Flume over WiFi or 3G.
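As a rough illustration of the three log categories, the following Python sketch models Ds, Dd and Du as simple records and buffers dynamic records in a local SQLite table. The class names, the table schema, the timestamp representation and the `should_upload` rule (one possible reading of the WiFi and size condition above) are assumptions made for illustration; they are not taken from the SDK described in this study.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class StaticData:
    """Ds: values collected once per installation."""
    uuid: str
    app_title: str
    os_version: str
    device_model: str
    resolution: str


@dataclass
class DynamicData:
    """Dd: values collected for every user activity."""
    start_time: float  # Unix timestamps (assumed representation)
    end_time: float
    latitude: float
    longitude: float


@dataclass
class LogRecord:
    static: StaticData
    dynamic: DynamicData
    user_defined: dict = field(default_factory=dict)  # Du: extra sensor values


def should_upload(on_wifi: bool, buffered_bytes: int, max_bytes: int) -> bool:
    """Assumed rule: upload on WiFi once the buffer exceeds the size limit."""
    return on_wifi and buffered_bytes > max_bytes


def buffer_record(conn: sqlite3.Connection, rec: LogRecord) -> None:
    """Append one dynamic record to a local table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dd_log "
        "(uuid TEXT, start_time REAL, end_time REAL, lat REAL, lon REAL)"
    )
    d = rec.dynamic
    conn.execute(
        "INSERT INTO dd_log VALUES (?, ?, ?, ?, ?)",
        (rec.static.uuid, d.start_time, d.end_time, d.latitude, d.longitude),
    )
    conn.commit()
```

Keeping Ds separate from Dd means the static fields are stored once and each activity row only needs the UUID as a key.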

3.2 Server module

The server module stores the sensor data received from the client module and analyzes their usability with the analysis algorithm (Fig. 2). This module is built on the Citrix CloudStack big data system. The Hadoop-based big data system can minimize the system portability problems that can occur in a cloud environment.

The server module receives the sensor data collected by the client module through three different Flume designs, depending on the data attributes. The data are then processed and analyzed by the big data system. Regardless of the data processing environment, the module consists of a Tomcat web server, a TCP/IP file transfer server and a web service-based data transfer server. The server module receives data from the mobile device and analyzes patterns to detect abnormal data transmission. After pattern analysis, the map/reduce module computes the results line by line from the data in Hadoop or HBase.

4 Context-aware recommendation model

The proposed recommendation model uses a collaborative filtering algorithm that considers three aspects: frequency of use, duration of use and up-to-dateness. First, the usage frequency of each application executed on the mobile device is calculated. The frequency is analyzed with respect to several factors: time, location and device. The time-based frequency is calculated on daily, weekly and yearly time scales. The location-based frequency is estimated from latitude and longitude, while the device-based frequency is calculated from the model number, manufacturer and resolution. Second, the usage duration is calculated. Third, up-to-dateness is an important factor in usability analysis because the lifecycle of a mobile app is relatively short.
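The time-based frequency can be sketched as a simple bucketed count. The example below is a minimal illustration, assuming that each launch is logged as an (application, Unix timestamp) pair and that timestamps are interpreted in UTC; the function name and the bucket labels are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def time_frequency(launches, bucket="hour"):
    """Count launches per (app, time bucket).

    launches: iterable of (app_name, unix_timestamp) pairs (assumed log shape).
    bucket:   "hour"    -> hour of day, 0..23
              "weekday" -> 0 (Monday) .. 6 (Sunday)
              "month"   -> 1..12
    """
    counts = Counter()
    for app, ts in launches:
        t = datetime.fromtimestamp(ts, tz=timezone.utc)
        if bucket == "hour":
            key = t.hour
        elif bucket == "weekday":
            key = t.weekday()
        elif bucket == "month":
            key = t.month
        else:
            raise ValueError(f"unknown bucket: {bucket}")
        counts[(app, key)] += 1
    return counts
```

A real deployment would presumably convert timestamps to the user's local time before bucketing, since commuting and office hours are local notions.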

Let R denote the set of candidate lists that reflect the reliability of the mobile applications with respect to time, location and device.

$$ R=\left\{{R}_{time},{R}_{location},{R}_{device}\right\} $$
(1)

where $R_{time}$, $R_{location}$ and $R_{device}$ are each an ordered list of mobile applications based on the time, the location and the device type, respectively.

$R_{time}$ is ranked by the usage frequency, duration and recentness of the mobile applications. $R_{location}$ and $R_{device}$ then serve as filters on the candidate applications, treating the ordered list selected according to $R_{time}$ as an intermediate result. The model accepts these two parameters in order to determine the refined candidates.

The model determines which factors to use from the user context. For example, if a user wants to use location data to find mobile applications, the model collects candidates based on the current location together with $R_{time}$ and $R_{device}$. The three lists are combined with the weights $\alpha$, $\beta$ and $\gamma$:

$$ R=\alpha \times {R}_{time}+\beta \times {R}_{location}+\gamma \times {R}_{device}\left(\alpha +\beta +\gamma =1\right) $$
(2)
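Equation (2) can be illustrated with a short Python sketch. Because the text defines each list as an ordered list without specifying how lists are added, the sketch assumes that every list is available as a mapping from application to a numeric score and that an application missing from a list scores 0; both are assumptions, as are the function and variable names.

```python
def combine_scores(r_time, r_location, r_device, alpha, beta, gamma):
    """Weighted combination of three per-context score tables, as in Eq. (2).

    Each r_* argument maps an app name to a score (assumed representation).
    Returns the apps ordered by combined score, highest first.
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    apps = set(r_time) | set(r_location) | set(r_device)
    combined = {
        app: alpha * r_time.get(app, 0.0)
        + beta * r_location.get(app, 0.0)
        + gamma * r_device.get(app, 0.0)
        for app in apps
    }
    # Ties are broken alphabetically so that the output is deterministic.
    return sorted(combined, key=lambda a: (-combined[a], a))
```

Raising the weight of one context, for example the location weight when the user asks for location-based suggestions, shifts the ranking toward that context without changing the code.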

Let U denote the user set, which consists of the usage frequency (UFT), duration time (UDT), usage time per hour (URT), usage time per day (DOW), usage time per month (MOY), the operating system version (OSV) and the location (L).

$$ U=\left\{UFT,\ UDT,\ URT,\ DOW,\ MOY,\ OSV,\ L\right\} $$
(3)

Let UFT have an element $f_{h_i}$, which is the number of executions of the mobile application $a_j$ ($a_j \in A$). $c_{i,j}$ represents the execution time for the mobile application ($c_{i,j} = 0$ means that there is no relationship between $f_{h_i}$ and $a_j$).

$$ {\widehat{C}}_{ij}=\frac{{\displaystyle {\sum}_{f_{h_k}}}{w}_{i,k}\cdot {c}_{k,j}}{{\displaystyle {\sum}_{f_{h_k}}}{w}_{i,k}} $$
(4)

where $w_{i,k}$, the similarity between $f_{h_i}$ and $f_{h_k}$, is defined as follows.

$$ {w}_{i,k}=\frac{{\displaystyle {\sum}_{a_j\in A}}{c}_{i,j}{c}_{k,j}}{\sqrt{{\displaystyle {\sum}_{a_j\in A}}{c}_{i,j}^2}\sqrt{{\displaystyle {\sum}_{a_j\in A}}{c}_{k,j}^2}} $$
(5)
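Equations (4) and (5) amount to a similarity-weighted average with cosine similarity as the weight. The following Python sketch shows one way to compute them on a small matrix whose rows are indexed by $i$ and whose columns are indexed by the applications $a_j$. Excluding row $i$ from its own estimate and returning 0 for all-zero rows are assumptions added here, not statements from the text.

```python
import math


def cosine_similarity(row_i, row_k):
    """Eq. (5): cosine similarity between two usage rows over the same apps."""
    dot = sum(a * b for a, b in zip(row_i, row_k))
    norm_i = math.sqrt(sum(a * a for a in row_i))
    norm_k = math.sqrt(sum(b * b for b in row_k))
    if norm_i == 0.0 or norm_k == 0.0:
        return 0.0  # assumption: an all-zero row is similar to nothing
    return dot / (norm_i * norm_k)


def predict(c, i, j):
    """Eq. (4): similarity-weighted average of c[k][j] over the other rows k."""
    num = 0.0
    den = 0.0
    for k, row_k in enumerate(c):
        if k == i:
            continue  # assumption: row i does not contribute to its own estimate
        w = cosine_similarity(c[i], row_k)
        num += w * row_k[j]
        den += w
    return num / den if den else 0.0
```

The same two functions apply unchanged to any other non-negative usage matrix with the same row and column layout.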

The user preference can be recognized from this consumption activity (UFT) because usage frequency is an indirect measure of the propensity to consume mobile applications. In addition, the proposed approach distinguishes two types of day, weekday and weekend, for the recommendation.

Finally, we propose location-based contextual recommendation of mobile applications. While commuting, most people want to know when they should leave for their destination or where it is located. During office hours, on the other hand, most people use work-related mobile applications such as e-mail services, groupware services or office suites.

It is also very important to consider not only the usage frequency but also the usage duration, because duration reflects the popularity of the contents. For example, YouTube is often used for long periods owing to its popularity over TV or radio.

The spending activity ($d_{h_i} \in UDT$) on the mobile app $a_j$ ($a_j \in A$) is denoted as $s_{i,j}$.

$$ {w}_{i,k}=\frac{{\displaystyle {\sum}_{a_j\in A}}{s}_{i,j}{s}_{k,j}}{\sqrt{{\displaystyle {\sum}_{a_j\in A}}{s}_{i,j}^2}\sqrt{{\displaystyle {\sum}_{a_j\in A}}{s}_{k,j}^2}} $$
(6)
$$ {\widehat{S}}_{ij}=\frac{{\displaystyle {\sum}_{d_{h_k}}}{w}_{i,k}\cdot {s}_{k,j}}{{\displaystyle {\sum}_{d_{h_k}}}{w}_{i,k}} $$
(7)

The lifecycle of a mobile application registered in a market such as Google Play or the Apple App Store is much shorter than that of general PC applications. We therefore multiply the predicted value by a penalty based on the usage rate per month, and the consumption activity is recalculated with the following equation. Let $M$ be the set of months considered, so that $|M|$ is the number of months, and let $m_k$ denote the period of consumption of the application in month $k$.

$$ new\ {\widehat{C}}_{ij}={\widehat{C}}_{ij}*\frac{c_{ij}}{\frac{{\displaystyle {\sum}_{m_k\in M}}{c}_{i,j}}{\left|M\right|}} $$
(8)
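The penalty in Eq. (8) is the ratio of the current usage to the average monthly usage, so an application used less than its monthly average is scaled down and one used more is scaled up. A minimal Python sketch follows; it assumes that the monthly usage values are given oldest first and that the last entry plays the role of the current $c_{ij}$ in the numerator, which is one interpretation of the equation rather than a confirmed detail.

```python
def recency_adjusted(c_hat, monthly_usage):
    """Scale a predicted score by current usage relative to the monthly mean.

    c_hat:         predicted consumption from the collaborative filtering step.
    monthly_usage: per-month usage values, oldest first; the last entry is
                   taken as the current value (assumption about Eq. 8).
    """
    if not monthly_usage:
        raise ValueError("monthly_usage must not be empty")
    mean = sum(monthly_usage) / len(monthly_usage)
    if mean == 0:
        return 0.0  # assumption: an app that was never used gets no score
    return c_hat * monthly_usage[-1] / mean
```

For example, a prediction of 2.0 for an app whose monthly usage fell from 30 to 20 to 10 is halved, because the current value 10 is half of the mean 20.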

We also consider two additional factors, OS version and location-based data, to filter out unnecessary intermediate results. The hardware and software characteristics of mobile devices have recently become diverse because many companies continuously introduce new models, and some mobile applications are created specifically to showcase these new features. Furthermore, location-based data are useful for determining customized candidates.
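The filtering step can be sketched as follows. The candidate fields (`min_os`, `used_at`), the radius threshold and the use of the haversine great-circle distance are all illustrative assumptions; the text only states that OS version and location data are used to filter intermediate results.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def filter_candidates(candidates, device_os, here, radius_km):
    """Keep candidates that run on this OS version and were used nearby.

    candidates: list of dicts with hypothetical keys
                'app' (str), 'min_os' (version tuple), 'used_at' ((lat, lon)).
    device_os:  version tuple of the current device, e.g. (4, 4).
    here:       (lat, lon) of the current user location.
    """
    kept = []
    for cand in candidates:
        if cand["min_os"] > device_os:
            continue  # the app requires a newer OS than the device provides
        if haversine_km(*here, *cand["used_at"]) > radius_km:
            continue  # the app's usage was recorded too far from the user
        kept.append(cand["app"])
    return kept
```

Because the filter preserves the input order, it can be applied directly to a list that has already been ranked by time-based usage.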

5 Experiments

In this experiment, a background application developed with the Android SDK was installed on 10 mobile devices, and the usability data collected over 3 months were analyzed. Usability data were collected for 120 applications, excluding Android platform and middleware applications, which are unnecessary for usability analysis.

The importance of a mobile application is determined by how many people use it. In this study, in addition to the general features, some attributes were added to normalize the differences among usability patterns across daily activity hours. To investigate which time period is most important across all mobile applications, the relationship among daily activity hours was measured based on frequency and duration. To analyze this relationship, the objective similarities between all applications were normalized and calculated. The normalization process prevents the importance of a mobile application from being analyzed incorrectly on the basis of the total collected frequency and amount of consumption alone.

Figure 3 shows the relationship among daily activity hours based on the frequency of usage. The patterns during commuting times are more similar to one another than those at other times such as night and dawn. Likewise, the patterns during working hours are more similar to one another than those at other times of the day. This means that the real data have to be normalized with this similarity result in order to reduce the differences in usability across hours. After the normalization process, we evaluated the usage pattern with the frequency and the duration. As a result, we could recommend the mobile applications in which many people are interested. Figures 4 to 6 are presented in turn to explain and demonstrate the feasibility of the recommendation model.

Fig. 3 Relationship with daily activity hours

Fig. 4 Usage pattern analysis for the SMS applications

Figure 4 shows the usage patterns of applications related to SMS (Short Message Service) and MMS (Multimedia Messaging Service). The usage frequency and average fluctuate heavily along the time dimension. However, the proposed model shows an enhanced result because it normalizes the differences between the hours of the day and recalculates the similarity using the relative weights of the mobile applications.

Figure 5 shows the usage pattern for web browser applications. In general, many people frequently spend time navigating web sites. The usage duration at night is higher than at other times of the day. Overall, the usage pattern based on both the frequency and the duration clearly remains at a high level, which indicates that the web browser is the most popular application for acquiring information.

Fig. 5 Usage pattern analysis for the Web Browser applications

Figure 6 shows that many users spend time on mobile games while commuting. For example, the usage frequency is very high between 6 and 8 PM, which means that many users play mobile games during the commute after work. In addition, the curve of the proposed model also stays at a high level. This indicates that mobile games are relatively more important than other applications during commuting time, irrespective of the usage frequency and the usage duration.

Fig. 6 Usage pattern analysis for the Commuter applications

6 Conclusions

The model proposed in this study overcomes the limitations of conventional recommendation systems, which provide simple information without considering the features of a mobile device or the user’s circumstances, and realizes personalized services in the mobile environment. In addition, it is designed to provide reliable, intelligent and optimized recommendation services by analyzing cloud-based log data.

This study configured a big data-based mobile application analysis platform, analyzed mobile users’ usability logs on it and suggested a context-aware recommendation model. The mobile application analysis platform consists of a library-type client module and a big data-based server module. The client module collects usability log data from mobile devices and sends these data to the server. The server module analyzes and processes the data from the client module using pre-defined data patterns. The context-aware recommendation model builds a list of candidates using additional context data such as time and place. Preference was measured through usage experience rather than through a user profile or preferences stated directly by the user.

In future work, we will add mobile sensing data and analysis algorithms to the cloud-based usability analysis framework and extend the analysis from the user level to the category level. In addition, we will continue to study faster and more accurate recommendation methods through precise analysis of the data, which keep increasing with the rapid growth in the number of users, applications and contents.