1 Introduction

With the wide adoption of mobile devices with built-in positioning systems (e.g., GPS), LBSNs (Location-based Social Networks) are becoming increasingly popular, especially among the young. Nowadays, millions of people are using various LBSN apps to share interesting location-embedded information with others in their social networks, while simultaneously expanding their social networks with the new interdependency derived from their locations [1].

These LBSN apps can be roughly divided into two categories (I and II). LBSN apps of category I, such as Foursquare and Google+, encourage users to share location-embedded information with their friends. Foursquare, which has attracted more than 55 million users worldwide since 2009 [2], allows users to check in at interesting places and then share the check-in locations with their friends. Besides sharing check-in locations, Google+ even allows users to share their real-time locations with pre-specified users (e.g., their families) [3].

LBSN apps of category II concentrate on location-based social network discovery. Such apps allow users to search for and interact with nearby strangers and to make new friends. Salient examples of this category include Wechat, Momo, SayHi and Skout. Wechat, which now has more than 540 million monthly active users around the world [4], has a feature called “Nearby”. This feature allows users to get a list of other users nearby as well as their coarse-grained relative locations. People can use this feature to discover strangers (and be discovered by others at the same time), and then make friends with strangers of interest. Some apps (e.g., Facebook and Sina Weibo) that were not originally designed as LBSN apps have since been extended into this category. For example, Facebook Places was announced in 2010 to bring similar LBSN features into Facebook [5]. Sina Weibo, a Twitter-like microblog app in China, has also introduced a “Nearby” feature to let users discover nearby people, microblogs and hot places.

While using LBSN apps of category I to check in or share locations with friends, users are likely to explicitly publish their locations to their social networks [6]. In contrast, while using LBSN apps of category II to discover nearby users, users get information without explicitly making their locations public. In fact, when a user of a category II app searches for nearby users, the user’s exact location (e.g., GPS coordinates) is uploaded to the app server and then exposed (usually after obfuscation as needed) to nearby users by the app server. At first glance, the users’ exact locations would be secure as long as the app server is securely managed. However, there remains a risk of location privacy leakage when at least one of the following two threats materializes. First, the location exposed to nearby users by the app server is not properly obfuscated. Second, the exact location can be deduced from the (obfuscated) locations exposed to nearby users.

In this paper, we systematically investigate these two threats using typical LBSN apps of category II. We find that existing LBSN apps are vulnerable to these threats, which could be exploited by an adversary to perform automated and efficient large-scale location probing attacks. Such attacks could reveal the location of any user who uses the “Nearby” feature. We propose a series of novel methods to probe the location privacy of people using different LBSN apps and show that our location probing methods are general and applicable to the vast majority of existing LBSN apps.

Our work differs from existing work in two ways. First, we are able to probe the locations of any user, whereas in existing works such as [7–9] the attacker can only probe the locations of his/her friends. Second, we propose three general approaches for location probing and discuss the scenarios in which each approach applies, whereas existing studies usually focus on specific situations. For example, [10] uses Android virtual machines to carry out proof-of-concept attacks via Wechat, Skout and Momo, and [11] discusses the possibility of launching sniffing attacks if the HTTP traffic of LBSN apps can be intercepted and manipulated.

To the best of our knowledge, we are the first to carry out a large-scale experiment to evaluate the practical efficiency of such attacks in the real world. Our contributions include:

  • Track location information flows and evaluate the risk of location privacy leakage in popular LBSN apps. We analyze the location information flows from many aspects, including location accuracies, transport protocols and packet contents, in popular LBSN apps such as Wechat, Momo, Mitalk, SayHi, Skout, MeetMe and Weibo, and find that most of them have a high risk of location privacy leakage.

  • Propose three general attack methods for location probing and evaluate them via different LBSN apps. We propose three general attack methods to probe and track users’ location information, which can be applied to the majority of existing LBSN apps. We also discuss the scenarios for using different attack methods, and demonstrate these methods on SayHi, Mitalk and Wechat, separately.

  • Recommend countermeasures. We suggest some countermeasures against this new threat of privacy leakage in LBSNs.

The rest of the paper proceeds as follows. Section 2 presents an overview of location-based social networks and LBSN apps. Section 3 details three general attack approaches with examples. After recommending some countermeasures in Sect. 4, we conclude the paper in Sect. 5.

2 Overview of LBSN Apps

Most LBSN apps have two features: check-in and social network discovery. We focus on the latter because people’s check-in locations are shared with their friends and therefore they are not regarded as private information.

Fig. 1. The workflow of social network discovery in LBSN apps

The workflow of social network discovery in LBSN apps is elaborated in Fig. 1. The following steps will be performed in the scenario where a user searches for people nearby at location \(l_{0}\) and time \(t_{0}\).

  • Step (a): The mobile app sends the server a request that includes the authorization token and the user’s current location \(l_{0}\), which is obtained from GPS or online SDKs (e.g., Google SDK [12], Baidu Location SDK [13]). The authorization token is issued by the server as a unique identifier once the user logs in to the mobile app.

  • Step (b): Once the request from the user is received, the server saves the user’s location \(l_{0}\), time \(t_{0}\) and other information into the database for further use, such as letting the user be visible to others.

  • Step (c): The server searches the database, which contains the request times and locations of all the users who have ever searched for nearby people. It then finds the list of users who are not in the friend list of the user (user0) and have appeared around the location \(l_{0}\) (within a distance of \(\varDelta D\)) less than a finite time \(\varDelta T\) ago. Denoting a user by \(u\), the set of people in user \(u_{i}\)’s social network by \(U_{fi}\), and the distance between two locations \(l_{i}\), \(l_{j}\) by \(D_{l_{i},l_{j}}\), the nearby users queried from the database for user \(u_0\) can be described as:

    $$\begin{aligned} \{u,l,t | u \notin U_{f0}, D_{l,l_{0}} \leqslant \varDelta D, t \geqslant t_{0} - \varDelta T\} \end{aligned}$$
    (1)
  • Step (d): The server sends a response to the mobile app with the queried results. For the purpose of privacy protection, the results returned by most LBSN app servers only contain essential user information \(u\) and coarse-grained distances instead of the exact locations \(l\), because if accurate distances were provided, a user’s exact location could easily be calculated by trilateration positioning methods [14]. Finally, the mobile app displays these results to the user.
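
The query in Step (c) can be illustrated with a minimal sketch. It assumes an in-memory list in place of the server database and uses the haversine formula for \(D_{l,l_{0}}\); the names `Record`, `haversine_m` and `nearby` are hypothetical and are not taken from any of the examined apps.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass
class Record:
    user: str
    lat: float
    lon: float
    t: float  # time of the user's last "nearby" search, in seconds


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(db, friends, lat0, lon0, t0, delta_d, delta_t):
    """Return (user, distance) pairs that satisfy the three conditions of Eq. 1."""
    result = []
    for r in db:
        d = haversine_m(r.lat, r.lon, lat0, lon0)
        if r.user not in friends and d <= delta_d and r.t >= t0 - delta_t:
            result.append((r.user, d))
    return result
```

A real server would additionally coarsen the returned distance before Step (d); that step is omitted here.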

Figure 2 shows the displayed results in typical LBSN apps: Wechat, Mitalk, Momo, Weibo, Skout, SayHi, Badoo and LOVOO. The displayed user information generally contains nickname, gender and other information (e.g., personalized signatures). In particular, Wechat, Mitalk and Weibo provide distances to an accuracy of 100 meters, and Momo and SayHi do so to an accuracy of 10 meters. LOVOO provides distances accurate only to within 0.1 miles, which is the least precise.

The user can view detailed information (e.g., publicly available photos) of nearby strangers, send greetings to them, and finally make new friends to extend the user’s own social network.

Fig. 2. Search nearby people in LBSN apps

Figure 1 also presents two other scenarios to show under what circumstances a user can be found by others. In the first scenario, user1 searches for people nearby at a place close to the location of user0 (\(l_{0}\)) a short while (\(\varDelta t_1 < \varDelta T\)) after user0 searches for people nearby. According to Eq. 1, user0 can be found by user1 because \(D_{l_{0},l_{1}} \leqslant \varDelta D\) and \(t_0 \geqslant t_{1} - \varDelta T\). In the other scenario, user2 searches for people nearby after a long while (\(\varDelta t_2 > \varDelta T\)), so user0 cannot be found.

If an attacker \(u_A\) (whose friend list \(U_{fA}\) is empty) is able to send a fake location \(l_A\) to the server in Step (a), he will get a response containing the users \(u_i\) around \(l_A\) and their distances \(D_{l_{i},l_{A}}\) in Step (d). By changing the value of \(l_A\) constantly, the attacker can probe the users at any location.

In order to perform the location probing attack, we need to address the following challenging issues.

  • How to forge the request with fake locations: We need to intercept the request in Step (a) and tamper with the value of the current location \(l_{0}\). To secure data transportation, some LBSN apps use techniques such as SSL authentication and data encryption, which makes request forgery a challenging task. Therefore, we need to find ways to break or bypass these protection techniques.

  • How to perform large-scale probing effectively and economically: We need to use as few resources as possible (e.g., 1 PC) to probe thousands of locations in a large-scale attack. Because the location information of the users is cached for a while (\(\varDelta T\)) in Step (c), using many probers to probe different locations simultaneously is both resource-consuming and unnecessary. However, if the time span between two probes at nearby locations is too long (e.g., longer than \(\varDelta T\)), some data may be missed. For example, if a user appears at location \(l_0\) at time \(t_0\), the user’s location information can be probed only if the prober happens to probe a location near \(l_0\) between time \(t_0\) and \(t_0 + \varDelta T\).
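
The trade-off in the second issue can be made concrete with a rough sketch. It assumes that the target area is a small latitude/longitude box, that probe points are laid out on a square grid of side \(\sqrt{2}\,\varDelta D\) so that every spot is within \(\varDelta D\) of some probe point, and that each probe costs a fixed number of seconds (including sleep time). The functions `probe_grid` and `probers_needed` and the constant `M_PER_DEG_LAT` are illustrative assumptions, not part of the original experiments.

```python
import math

M_PER_DEG_LAT = 111_320.0  # approximate metres per degree of latitude


def probe_grid(lat_min, lat_max, lon_min, lon_max, delta_d):
    """Cell-centre grid whose radius-delta_d discs cover the box (planar approximation)."""
    step_m = delta_d * math.sqrt(2)  # half the cell diagonal then equals delta_d
    lat_step = step_m / M_PER_DEG_LAT
    mid_lat = math.radians((lat_min + lat_max) / 2)
    lon_step = step_m / (M_PER_DEG_LAT * math.cos(mid_lat))
    n_lat = max(1, math.ceil((lat_max - lat_min) / lat_step))
    n_lon = max(1, math.ceil((lon_max - lon_min) / lon_step))
    return [
        (lat_min + (i + 0.5) * lat_step, lon_min + (j + 0.5) * lon_step)
        for i in range(n_lat)
        for j in range(n_lon)
    ]


def probers_needed(n_points, seconds_per_probe, delta_t):
    """Probers required so that every grid point is revisited within delta_t seconds."""
    sweep_seconds = n_points * seconds_per_probe
    return max(1, math.ceil(sweep_seconds / delta_t))
```

Under these assumptions a single prober suffices whenever one full sweep of the grid takes no longer than \(\varDelta T\); beyond that, the grid has to be split among several probers.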

3 Location Privacy Probing via LBSN Apps

This section presents some general paradigms for location privacy probing via popular LBSN apps. We first look deeply into some popular LBSN apps and examine the security through their transport protocols, request encryptions, response data, etc. Then, we propose and demonstrate three general methods for location privacy probing, which can be applied to the majority of existing LBSN apps.

Fig. 3. Intercept and monitor network traffic with Fiddler 4

3.1 Examining Popular LBSN Apps

We install 8 popular LBSN apps (LOVOO, MeetMe, Mitalk, Momo, SayHi, Skout, Wechat and Weibo) on an Android mobile phone, and use a web debugging proxy named Fiddler [15] to intercept and examine the network traffic between the apps and their servers. We set up a proxy with Fiddler 4 on a computer, and configure the proxy settings of the Android mobile phone so that it accesses the Internet through our proxy. All the HTTP/HTTPS traffic of the LBSN apps can then be intercepted and monitored by Fiddler 4. Figure 3 shows the user interface of Fiddler 4. The left side of the user interface lists the intercepted HTTP/HTTPS requests, including Protocol, Host, URL, etc. On the right side, two windows show the details of the selected request and the decoded response, respectively.

We examine the security of the intercepted network traffic from different aspects.

  • Transport protocols: The content of HTTP requests can be easily intercepted and manipulated to launch request forgery attacks. HTTPS (HTTP over TLS/SSL) provides data encryption to prevent the data from being tampered with [16]. It is worth noting that TLS/SSL can be configured as either one-way or two-way. In one-way TLS/SSL communication, the server is required to present its certificate to the client, but the client is not required to present a certificate to the server, meaning that the server will accept requests from anyone. In this case, the HTTPS request can still be forged using a local self-signed certificate [17]. In two-way TLS/SSL communication, both the server and the client are required to present certificates, which makes request interception a hard task. Therefore, we consider HTTP and HTTPS with one-way SSL to be insecure, and HTTPS with two-way SSL to be safe.

  • Request encryptions: Another way to protect data is to encrypt some of the parameters in the HTTP request. For example, a checksum or signature can effectively prevent the request from being tampered with, as long as the encryption algorithm cannot be cracked easily.

  • Response data: The response data should not contain more information than the app client needs. If the response data contains much more information (e.g., a more accurate location than the one displayed in the app, or the last time the person appeared), it brings a risk of information leakage.

The analysis results are shown in Table 1. Most apps use HTTP or HTTPS with one-way SSL for data transportation and have no encrypted parameters in their requests. In this case, we can forge the HTTP/HTTPS requests to query nearby people at any location. Mitalk and LOVOO encrypt parameters (checksum and signature), so their requests can be forged only if we can crack the encryption algorithms and work out the values of the checksum or signature parameters. If the requests are too difficult to forge because the data is transported via HTTPS with two-way SSL or because the encryption algorithm is irreversible, we can instead use mobile phone emulators and automated testing methods to simulate user actions and obtain the people nearby at fake locations. Detailed demonstrations of these three methods are given in Sects. 3.2, 3.3 and 3.4.
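
The selection logic described in the paragraph above can be summarized as a small decision function. This is only a sketch of the reasoning; the function name `choose_attack` and its arguments are hypothetical labels for the properties examined in this section.

```python
def choose_attack(protocol, two_way_tls=False, signed_params=False,
                  algorithm_recoverable=True):
    """Map the examined properties of an app to one of the three probing methods."""
    protocol = protocol.upper()
    if protocol not in ("HTTP", "HTTPS"):
        raise ValueError(f"unsupported protocol: {protocol}")
    # Two-way TLS/SSL makes interception too hard, so fall back to the emulator.
    if protocol == "HTTPS" and two_way_tls:
        return "emulator simulation"
    # A checksum or signature must be reproduced before requests can be forged.
    if signed_params:
        return "encryption cracking" if algorithm_recoverable else "emulator simulation"
    # Plain HTTP, or one-way HTTPS without protected parameters.
    return "forging requests"
```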

Table 1. Examination results of popular LBSN apps

3.2 Forging Requests

For LBSN apps, the request for searching people nearby contains parameters which are used to locate the user. The attacker can search people at any location by intercepting and tampering the location parameters. We demonstrate the attack in the following steps:

Step 1: Request Interception. We use Fiddler as a web proxy to intercept the HTTP/HTTPS traffic between LBSN apps and their servers. For HTTP traffic in plaintext, we can directly get the contents of the requests and responses. Fiddler can also decrypt HTTPS traffic with one-way SSL, as long as a local self-signed certificate is generated and installed into the mobile phone. If certificate and public key pinning [18] is used in the LBSN app, reverse engineering work should be performed to replace the hard-coded key of the app with the one generated by Fiddler.

Some of the intercepted requests of different LBSN apps are as follows:

Step 2: Request Forgery. We forge requests sent over HTTP or over HTTPS with one-way SSL by modifying the values of the location parameters in the intercepted requests, which lets us search for nearby people at any location. We develop a program that repeatedly probes nearby people at random locations. To avoid triggering anomaly detection, the program sleeps for a short while after each probe.

Step 3: Response Parsing. For most of the LBSN apps, the responses to nearby-people searches are in JSON format because it is more efficient than XML and other data interchange formats [19]. We can extract useful information such as the person’s id, name, distance or geo-coordinate by comparing the response data with the information displayed in the app.
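
Steps 2 and 3 can be sketched generically as follows. The endpoint `https://lbsn.example/nearby`, the parameter names `latitude` and `longitude`, and the JSON layout (`users`, `id`, `name`, `distance`) are all hypothetical placeholders; the real names differ from app to app and have to be read from the intercepted traffic. The network call is passed in as a `fetch` callable so that the sketch stays self-contained.

```python
import json
import random
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def forge_url(intercepted_url, lat, lon, lat_key="latitude", lon_key="longitude"):
    """Return a copy of the intercepted URL with its location parameters replaced."""
    parts = urlsplit(intercepted_url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params[lat_key] = f"{lat:.6f}"
    params[lon_key] = f"{lon:.6f}"
    return urlunsplit(parts._replace(query=urlencode(params)))


def parse_nearby(body):
    """Extract (id, name, distance) tuples from a JSON response body."""
    data = json.loads(body)
    return [(u.get("id"), u.get("name"), u.get("distance"))
            for u in data.get("users", [])]


def probe(urls, fetch, min_sleep=1.0, max_sleep=3.0):
    """Fetch each forged URL and sleep for a short random time after each probe."""
    results = []
    for url in urls:
        results.extend(parse_nearby(fetch(url)))
        time.sleep(random.uniform(min_sleep, max_sleep))
    return results
```

For example, `forge_url("https://lbsn.example/nearby?token=abc&latitude=1.0&longitude=2.0", 39.9, 116.3)` keeps the token and rewrites only the two location parameters.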

Figure 4 shows the displayed results and the JSON-formatted response of searching nearby people in SayHi. SayHi provides distance values to an accuracy of 0.01 km. As shown in Fig. 4, although we can see that the user Sasithorn is 5.01 km away, we cannot figure out the exact location of Sasithorn only using this information. However, we find that the geo-coordinate of Sasithorn (116.339193, 39.9923481) is in the JSON-formatted response data.

Fig. 4. Displayed and JSON results of searching nearby people in SayHi

Besides the geo-coordinate of the user, the JSON-formatted response of Weibo also exposes the time when the user was located in that place for the last time (last_at field). Figure 5 demonstrates a real-world example, which indicates that the user with ID 2753134315 was at the location (116.30042, 40.02080) at 01:09:58, Sep 27th, 2015.

Fig. 5. Displayed and JSON results of searching nearby people in Weibo

3.3 Encryption Cracking

Some LBSN apps use data encryption techniques in addition to the HTTPS protocol to secure the data traffic. They add encrypted parameters such as a checksum or signature to the requests to detect data tampering. Take Mitalk as an example. The intercepted request for searching nearby people in Mitalk is shown in Fig. 6, in which latitude and longitude represent the searching location. The JSON-formatted response contains an “ok” code and a list of persons around the searching location. However, when we try to modify the value of latitude, longitude or any other parameter in the request, the response indicates an error with code 401.

After a series of experiments, we figure out that the parameter s in the request is generated by a customized algorithm and it represents the checksum of all other parameters. The server will recalculate the checksum and compare it to the value of s when it receives a request. If the values don’t match (i.e., one or more of the parameters might be tampered), an error message will be returned.

Fig. 6. Intercepted request and response of Mitalk

We perform reverse engineering to crack the algorithm that generates parameter s. We decompile the APK of Mitalk into Java using tools including apktool [20], dex2jar [21] and JD-GUI [22], and identify the generation procedure of s, which is shown in Fig. 7(a).

Fig. 7. Cracking encrypted parameter s of Mitalk

From the figure we conclude that the encrypted parameter s is calculated according to Eq. 2:

$$ \begin{aligned} {s=b(name1=value1 \& ... \& nameN=valueN \& paramString)} \end{aligned}$$
(2)

In the equation, name1 to nameN are the names of the request parameters, excluding s, in alphabetical order, and paramString is a fixed value. To obtain the value of paramString, we first disassemble the APK file of Mitalk into smali code and insert debug code that makes the app print the value of paramString to logcat (the Android logging system for collecting debug output) while running. Then, we repackage [23] the APK file and install it on an Android phone. When we launch the repackaged Mitalk app and log in with an account, the value of paramString can be observed in a logcat viewer, as shown in Fig. 7(b).

From com.xiaomi.channel.d.f.a.b, the function b in Eq. 2 can be decompiled. The decompiled function is shown in Fig. 8. We can calculate the value of s using Eq. (2) to bypass the data tampering detection of the server, and then use the same method in Sect. 3.2 to search nearby people at any location.
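
The structure of Eq. 2 can be sketched as follows. The actual function b is the one decompiled in Fig. 8 and is not reproduced here; the SHA-1 digest below is purely a stand-in assumption so that the example runs, and the helper names `sign` and `verify` are hypothetical. The sketch also assumes that the parameters are sorted by name.

```python
import hashlib


def b(data: str) -> str:
    """Stand-in for the app's function b (ASSUMPTION: SHA-1 hex digest, not the real algorithm)."""
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


def sign(params: dict, param_string: str) -> str:
    """Compute s as in Eq. 2 over all parameters except s itself."""
    pairs = [f"{name}={params[name]}" for name in sorted(params) if name != "s"]
    return b("&".join(pairs + [param_string]))


def verify(params: dict, param_string: str) -> bool:
    """Server-side check: recompute the checksum and compare it with s."""
    return params.get("s") == sign(params, param_string)
```

Once paramString and b are known, a tampered request passes the check simply by recomputing `sign` after the location parameters are changed.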

Fig. 8. Decompiled function b from com.xiaomi.channel.d.f.a.b

3.4 Emulator Simulation

Some LBSN apps like Wechat and LOVOO use HTTPS with two-way SSL protocol or use advanced encryption techniques. In this case, it will be too difficult, if not impossible, to intercept and forge the requests. Under these circumstances, we use mobile phone emulators and automated testing tools to simulate user’s actions to probe nearby people at any location.

We demonstrate the method on Wechat using the Android emulator [24] and uiautomator, a testing framework for Android [25]. We create an automated functional UI test case using uiautomator, which automatically presses a series of buttons to launch the Wechat app and search for nearby people in it. As soon as the results are displayed on the screen, the test case inspects the UI to find the layout hierarchy and reads the information we need, such as usernames and distances, from the properties of specific UI components. The UI and the corresponding layout hierarchy of Wechat are shown in Fig. 9. The algorithm of the test case is shown in Algorithm 1.

Fig. 9. Inspect the layout hierarchy of Wechat with UIAutomator

Algorithm 1

In our experiments, we first send fake geo-coordinates to the emulator using the GPS command geo fix in the emulator’s control console, and then launch the test case in the emulator to get the nearby people at the fake location. By repeating these two steps, we can probe nearby people at any location.
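
The two-step loop can be outlined as below. Note that the emulator console’s geo fix command takes the longitude before the latitude. How the command reaches the console and how the UI test case is started are left to the caller through the hypothetical callables `send_console` and `run_test_case`; neither is part of the original tooling.

```python
def geo_fix_command(lat: float, lon: float) -> str:
    """Build the emulator console command; longitude comes first, then latitude."""
    return f"geo fix {lon:.6f} {lat:.6f}"


def probe_locations(locations, send_console, run_test_case):
    """For each fake location, set the emulator's GPS fix and then run the UI test case."""
    results = {}
    for lat, lon in locations:
        send_console(geo_fix_command(lat, lon))  # step 1: fake the location
        results[(lat, lon)] = run_test_case()    # step 2: read nearby people from the UI
    return results
```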

3.5 Location Tracking

Once a large volume of data has been collected, it is likely that a specific person will have been probed multiple times at different places. We can then mark on a map the locations and times at which the person appeared in order to track his/her movements.
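
As a simple illustration, probed sightings can be grouped per user and ordered by time before being plotted. The tuple layout `(user_id, time, lat, lon)` and the function name `build_tracks` are assumptions made for this sketch.

```python
from collections import defaultdict


def build_tracks(sightings):
    """Group (user_id, time, lat, lon) sightings by user and sort each track by time."""
    tracks = defaultdict(list)
    for user_id, t, lat, lon in sightings:
        tracks[user_id].append((t, lat, lon))
    return {user_id: sorted(points) for user_id, points in tracks.items()}
```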

For some LBSN apps such as Weibo and SayHi, we can get the geo-coordinates of a targeted person directly, and hence we mark the exact locations of the person with points on a map, as shown in Fig. 10(a). For some other apps such as Wechat and Momo, we can only get the coarse-grained locations which are determined by the probed location and the distances from the targeted person to the probed location. In this case, we mark the approximate locations of the person with circles on a map, as shown in Fig. 10(b). The red points indicate the locations of the probers, and the circles denote the possible locations of the probed users.

According to the trilateration positioning method [14], if a point lies on two circles at the same time, we can narrow down the possible locations to the intersections of the two circles. If a point lies on three or more circles, we can narrow down the possibilities to a unique point. Figure 10(c) shows that at nearly the same time, a user is probed by 5 probers (red points) and another user is probed by 3 probers. The locations of these two users can be deduced precisely to Point1 and Point2.
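
A minimal sketch of this computation is given below. It assumes planar coordinates in metres (a reasonable approximation over a few kilometres) and solves the linearized circle equations by least squares, which also tolerates the small inconsistencies introduced by coarse-grained distances. The function name `trilaterate` is hypothetical, and this is not necessarily the exact procedure of [14].

```python
def trilaterate(circles):
    """Estimate (x, y) from three or more circles given as (x_i, y_i, r_i) in metres."""
    if len(circles) < 3:
        raise ValueError("need at least three circles")
    x1, y1, r1 = circles[0]
    # Subtracting the first circle equation from each of the others gives linear
    # equations a*x + b*y = c; accumulate the 2x2 normal equations for least squares.
    saa = sab = sbb = sac = sbc = 0.0
    for xi, yi, ri in circles[1:]:
        a = 2 * (xi - x1)
        b = 2 * (yi - y1)
        c = r1**2 - ri**2 + xi**2 - x1**2 + yi**2 - y1**2
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    det = saa * sbb - sab * sab
    if abs(det) < 1e-9:
        raise ValueError("prober positions are collinear")
    x = (sac * sbb - sbc * sab) / det
    y = (saa * sbc - sab * sac) / det
    return x, y
```

With probers at (0, 0), (1000, 0) and (0, 1000) and exact distances to a user at (300, 400), the function recovers that point up to floating-point error.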

Fig. 10. Location tracking via different LBSN apps

3.6 Risk Evaluation

We next evaluate the overall risk induced by the popular LBSN apps; the results are shown in Table 2. More than half of the apps (five out of eight) are easy to exploit. Meanwhile, half of the apps (four out of eight) can expose people’s locations with high accuracy. Weibo and SayHi have the highest risks, because they can be exploited for location probing easily and also leak people’s geo-coordinates directly. LOVOO and Wechat have a relatively low risk of being exploited for large-scale location probing, because emulator simulation is much less efficient than forging requests and these apps only expose people’s coarse-grained locations.

Table 2. Risk evaluation of popular LBSN apps

4 Recommendations on Counter-Measures

In this section, we discuss some possible counter-measures against the threat of location privacy leakage via LBSN apps.

First, using plaintext HTTP for data transportation is extremely unsafe, because it is vulnerable to both request forgery and MITM (man-in-the-middle) attacks. Moreover, since one-way TLS/SSL authentication does not require the server to check the validity of a certificate from the client, a self-signed local certificate can be generated and used to recover the plain content of the TLS/SSL traffic, which makes HTTPS with one-way TLS/SSL unsafe as well. Other misuses of TLS/SSL in app development, such as allowing all hostnames, trusting all certificates, SSL stripping and lazy SSL usage [26], also make the apps vulnerable.
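
For illustration, the difference between one-way and two-way TLS/SSL on the server side can be sketched with Python’s standard ssl module. The function name `make_server_context` and the file path arguments are hypothetical placeholders; a real deployment must also issue and protect the client certificates.

```python
import ssl


def make_server_context(require_client_cert, cert_file=None, key_file=None,
                        client_ca_file=None):
    """Build a server-side TLS context; optionally demand a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    if cert_file:
        # The server's own certificate and key (needed in both configurations).
        ctx.load_cert_chain(cert_file, key_file)
    if require_client_cert:
        # Two-way TLS: the handshake fails unless the client presents a valid certificate.
        ctx.verify_mode = ssl.CERT_REQUIRED
        if client_ca_file:
            ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

With `require_client_cert=False` the context performs no client verification, which corresponds to the one-way configuration discussed above.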

Second, service providers should use anti-probing and anomaly detection methods to distinguish automatic probers from normal human users. Simply limiting each user’s quota for searching nearby people, as Momo does, is not sufficient, because the limit can be bypassed easily by using multiple probing accounts and devices. A well-designed machine behavior model should be studied and applied for better detection and protection. Last but not least, in the client/server (C/S) model, the response should contain no more information than the app client needs; leaving data filtering and omission to the client brings the risk of information leakage.
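
The last recommendation can be sketched as a server-side filter that keeps only the fields the client displays and coarsens the distance before the response is sent. The field names (`nickname`, `gender`, `distance_m`) and the 100-meter granularity are assumptions chosen for this example.

```python
import math


def minimal_response(records, granularity_m=100):
    """Keep only displayed fields and round each distance up to the display granularity."""
    out = []
    for r in records:
        coarse = math.ceil(r["distance_m"] / granularity_m) * granularity_m
        out.append({
            "nickname": r["nickname"],
            "gender": r.get("gender"),
            "distance_m": coarse,
        })
    return out
```

Geo-coordinates, timestamps and any other stored attributes never leave the server in this sketch, so they cannot be read from the intercepted response.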

5 Conclusion

In this paper, we pointed out that mobile social apps introduce location privacy leakage when they provide the functionality of searching for nearby people. We examined the risks of location leakage in popular LBSN apps and found that they can be exploited to launch location probing attacks. Moreover, we proposed three general methods for conducting such attacks via LBSN apps.

Using the new attack methods, we evaluated the overall risk induced by popular LBSN apps. This study shows that the current methods for location privacy protection in LBSN apps are insufficient and that new protection mechanisms are needed to address this risk.