
1 Introduction

The diversity and density of retail shops make an urban district attractive and unique, thereby enhancing the competition between shops and enticing external visitors from other districts both nearby and abroad [1]. The presence of exploring pedestrians encourages other pedestrians to interact with one another, generating liveliness throughout the neighborhood [2]. Retailers, for their part, believe that a key driver of store performance is location [3], and store locations collectively determine the way a customer transitions from shop to shop. This is greatly influenced by geographical accessibility to said shops: a central location is easier to reach from anywhere, making it more visible and popular, and better able to attract both people and goods [4]. “Constraints on mobility determine where we can go and what we can buy” [5].

The objective of this paper is to analyze customers’ spatial distribution considering their consecutive transaction activities through three large-scale department stores in the city of Barcelona, Spain. We study similarities in customers’ origin and destination locations among these three stores, which belong to the same chain and are located in varying urban settings. Understanding this distribution largely depends on how each store’s power of attraction and distribution affects both the customers and the wider urban environment.

For this purpose, we employ a large-scale transaction dataset provided by one of Spain’s largest banks: Banco Bilbao Vizcaya Argentaria (BBVA). This dataset contains the zip code of the shop where a customer made a transaction, the timestamp, and the monetary amount of said transaction (see Sect. 4 for more details). We extracted the combinations of retail shops where customers make consecutive transactions before or after any transaction in one of three large-scale department stores. This approach differs from that in previous studies, which use credit card transactions in the analysis of human behavior [6, 7]. Similarly, it differs from analyzing the predictability of human spending activities [5], because the latter relies on detailed topological analysis whereas we use physical spatial analysis.

The advantages of our dataset can be summarized as follows. Unlike point-of-sale (POS) systems or customer loyalty cards [8], BBVA’s credit cards are designed to be used with specific readers installed in over 300,000 BBVA card terminals in Spain [6]. This enables us to analyze spatial distributions of a customer’s sequential purchasing behavior between retail shops over the territory. In addition, the detection scale for the purchase location is smaller than the one for passive mobile phone tracking [9–13], RFID-based studies [14, 15], or Bluetooth sensing techniques [16–20]. This indicates that the attractivity of each shop can be studied at a much finer resolution than in previous studies [21, 22].

Our research does, however, present several limitations. The dataset consists solely of customers who hold a BBVA credit or debit card and used it for the purchases we analyze. This suggests that our analysis contains a possible bias in terms of the type of customers we study (i.e., highly educated upper and middle class). In addition, our analysis is based on the successive order of customers’ purchases between different retail shops, meaning that we cannot deduce their transition path or their purchase decisions when they do not use a BBVA card. Moreover, our dataset cannot reveal customers’ decision-making processes or value consciousness because it does not capture the inner thought processes typically derived from interviews, questionnaires, or participant observation. Furthermore, there is an inherent temporal sparseness in the data, with just a small fraction of all activities being recorded, although this still provides a sufficient sample at the aggregated scale.

Within these limitations, we try to uncover the features of a customer’s transaction activities and the similarities of their spatial distribution through the city and the urban structure.

2 Context of the Study: Barcelona

The city of Barcelona is divided into 10 administrative districts and 73 neighborhoods within those districts, each with its own unique identity.

Figure 1 shows the districts, major avenues, and plazas which determine the urban structure of the city of Barcelona. There are approximately 50,000 business entities throughout the city, including department stores, commercial centers, supermarkets, shopping streets with exclusive designer boutiques and international/local brands.

Fig. 1. The map of the city of Barcelona, showing the zip codes, the 10 districts, and the 73 neighborhoods.

This paper analyzes customers’ spatial distributions through their mobility, based on the consecutive transactions they make before and after visiting one of three large-scale department stores of the same chain. The stores are located in three different neighborhoods in the city. We selected large-scale department stores of the same chain rather than small- and medium-scale shops because (1) we can expect a larger number of customer transactions because of the stores’ higher attractivity, (2) customers can come from distant as well as nearby locations, which enables us to analyze urban structure throughout a larger landscape, and (3) the obtained dataset of customers can be more homogeneous rather than distorted and biased.

Each one of these stores attracts a large volume of customers and is therefore able to create expanded distributions of customers to other retail shops in surrounding neighborhoods. They can be considered one of the strongest hubs in the district, triggering a customer’s sequential shopping movements. Thus, their presence has great spatial impact in the district in terms of the volume of attracted customers as well as the associated sequential movements.

The first shop (PC) is located in the city center, Ciutat Vella (old town). The Ciutat Vella district is composed of four neighborhoods: El Raval; El Gòtic; La Barceloneta; and Sant Pere, Santa Caterina i la Ribera. These neighborhoods are full of retail shops with the most famous brands in the wide commercial area between Pelayo and Portaferrissa streets and the Portal de l’Àngel. Because of its scenic monuments, architecture, and environment, this district attracts tourists as well as locals from all districts of the city.

The second one (AD) is located in the Eixample district. This district is divided into six neighborhoods (El Fort Pienc, Sagrada Família, Dreta de l’Eixample, Antiga Esquerra de l’Eixample, Nova Esquerra de l’Eixample, Sant Antoni). This area is a business district surrounded by a variety of private companies. Therefore, customers are likely to be workers for these companies as well as people from the wealthy neighborhoods of Pedralbes, Sant Gervasi, and Sarrià.

The last one (PA) is located in the Nou Barris district. The shop faces the corner of Sant Andreu and Avenida Meridiana, one of the biggest avenues in Barcelona. This area has a high concentration of immigrants and working-class citizens, as well as a high level of registered unemployment. The shop sits at an entrance to the city of Barcelona and therefore attracts customers traveling from adjacent districts and villages.

By comparing consumer patterns for the same store located in different regions of the city, our analysis reveals dependencies on neighborhood features more clearly than if different shops had been analyzed.

3 Methodology

Our goal is to isolate transactions made within a 24-hour window before and after a visit to one of three shops in the city of Barcelona. We will refer to these three shops (PC, AD, PA) as the focal shops of our study. Specifically, we extracted consecutive credit and debit card transactions as customers moved between stores either before or after visiting the focal shops.

We define an incoming customer as one who makes a transaction in any other shop before making a transaction in a focal shop. Similarly, we define a leaving customer as one who makes a transaction in any other shop after doing so in a focal shop.
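The two definitions above can be illustrated with a minimal sketch. It assumes a hypothetical record layout of `(customer_id, shop_id, timestamp)` tuples and treats the labels PC, AD, and PA as shop identifiers; the dataset itself records shops at the zip code level, so this is only an illustration of the pairing logic, not the procedure actually used.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FOCAL_SHOPS = {"PC", "AD", "PA"}   # labels used in the paper
WINDOW = timedelta(hours=24)       # 24-hour window stated in the text


def consecutive_pairs(transactions):
    """Return (incoming, leaving) lists of (customer, other_shop, focal_shop).

    `transactions` is an iterable of (customer_id, shop_id, timestamp)
    tuples; this layout is hypothetical.
    """
    by_customer = defaultdict(list)
    for customer, shop, ts in transactions:
        by_customer[customer].append((ts, shop))

    incoming, leaving = [], []
    for customer, rows in by_customer.items():
        rows.sort()  # chronological order per customer
        for (t0, s0), (t1, s1) in zip(rows, rows[1:]):
            if t1 - t0 > WINDOW:
                continue  # the two purchases are too far apart in time
            if s1 in FOCAL_SHOPS and s0 not in FOCAL_SHOPS:
                incoming.append((customer, s0, s1))
            if s0 in FOCAL_SHOPS and s1 not in FOCAL_SHOPS:
                leaving.append((customer, s1, s0))
    return incoming, leaving


# Made-up records for illustration only.
demo = [
    ("c1", "zip08002", datetime(2011, 7, 1, 10, 0)),
    ("c1", "PC", datetime(2011, 7, 1, 12, 0)),
    ("c1", "zip08010", datetime(2011, 7, 1, 15, 0)),
    ("c2", "AD", datetime(2011, 7, 1, 9, 0)),
    ("c2", "zip08030", datetime(2011, 7, 3, 9, 0)),  # more than 24 h later
]
incoming, leaving = consecutive_pairs(demo)
```

Only directly consecutive purchases are paired here, so a customer contributes at most one incoming and one leaving pair per focal visit.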

Figure 2(a), (b), and (c) show the location of each shop. We aggregate the number of customers within a radius of 1 km from each store. This methodology permits us to aggregate customer spending behavior in terms of spatial dimension, where they come from, and where they move to before or after visiting one of those stores.
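A distance-based aggregation of this kind can be sketched as follows. The sketch assumes that each transaction location is represented by a latitude/longitude pair (for example a zip code centroid, which is an assumption and not stated in the text) and uses the haversine great-circle distance; the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ring_index(distance_km, width_km=1.0):
    """Index of the concentric ring: 0 for 0-1 km, 1 for 1-2 km, and so on."""
    return int(distance_km // width_km)


# One degree of latitude is roughly 111 km (illustrative coordinates).
one_degree = haversine_km(41.0, 2.0, 42.0, 2.0)
```

Counting transactions per `ring_index` value then gives the number of customers in each 1 km ring around a store.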

Fig. 2. The location of each shop with a radius of 1 km: (a) PC, (b) AD, (c) PA.

Within this framework, this paper assesses the spatial distribution based on customers’ sequential movement around the large-scale department stores located in Barcelona.

4 Data Settings

Data for this paper was provided by one of the largest Spanish banks, Banco Bilbao Vizcaya Argentaria (BBVA). The data consists of bank card transactions performed by two groups of card users: direct customers who hold a debit or credit card issued by BBVA, and others who made transactions through one of the approximately 300,000 BBVA card terminals. Once customers make transactions with their debit or credit card, the system registers those activities. Each record contains the randomly generated ID of the customer, an indication of the customer’s residence and of the shop where the transaction was made (both at the zip code level), a timestamp, and the value of the transaction. The datasets do not contain information about items purchased, and the shops are categorized into 76 business categories such as restaurants, supermarkets, or hotels. In addition, the location where a customer makes transactions is denoted as a zip code rather than the actual street address. The data is aggregated and hashed for anonymization in accordance with all local privacy protection laws and regulations. The total number of customers is around 4.5 million, making more than 178 million transactions totaling over 10 billion euros during 2011 (see [6] for more details).
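To make the record structure concrete, the fields listed above could be modeled as follows. The class and field names are hypothetical and chosen for illustration; the actual schema of the BBVA dataset is not given in the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One anonymized card transaction (hypothetical field names)."""
    customer_id: str     # randomly generated customer ID
    home_zip: str        # zip code of the customer's residence
    shop_zip: str        # zip code of the shop
    category: str        # one of the 76 business categories
    timestamp: datetime  # time of the transaction
    amount_eur: float    # value of the transaction


def total_spend_by_zip(transactions):
    """Sum transaction values per shop zip code."""
    totals = {}
    for t in transactions:
        totals[t.shop_zip] = totals.get(t.shop_zip, 0.0) + t.amount_eur
    return totals


# Made-up records for illustration only.
sample = [
    Transaction("a1", "08001", "08002", "restaurants", datetime(2011, 1, 5, 13, 0), 20.0),
    Transaction("a1", "08001", "08002", "supermarkets", datetime(2011, 1, 5, 18, 0), 35.5),
    Transaction("b2", "08030", "08016", "hotels", datetime(2011, 7, 9, 11, 0), 80.0),
]
totals = total_spend_by_zip(sample)
```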

5 Spatial Analysis

5.1 Customers’ Distribution in the Micro Scale

In this section, we analyze the spatial distribution based on customer mobility at the microscopic scale, considering their purchase behaviors. We focus on transactions made at shops before or after visiting the three focal shops (AD, PA, PC) around the city of Barcelona. This reveals, on the one hand, the mobility of each shop’s customers in the city of Barcelona and, on the other hand, the degree of each shop’s attracting and distributing power and its customers’ sequential movements around it.

The volume of transactions against distance for the shop PA can be seen in Fig. 3(b). PA starts to attract customers from 1 km to 2 km (8.41 %), meaning that its customers do not make transactions nearby (0–1 km, 0.00 %) before or after visiting it. In addition, almost no customers make transactions at proximate locations such as within 2–3 km (0.26 %), 3–4 km (0.00 %), 4–5 km (0.00 %), or 5–6 km (0.00 %). The hot spots of customers’ locations of origin can be found within 6–7 km (9.24 %), 10–12 km (14.67 %), 12–14 km (14.41 %), and 16–18 km (15.12 %).

Fig. 3. The distance against the frequency of transactions for (a) shop AD, (b) shop PA, (c) shop PC, and (d) all shops.

Conversely, the shop AD attracts customers who make transactions nearby (0–1 km, 3.10 %). This distribution pattern is unique to AD. The number of customers increases with distance up to 6–7 km (e.g., 3–4 km, 3.71 %; 4–5 km, 4.53 %; 5–6 km, 8.12 %) and is maximized at 7–8 km. In addition, the locations far from the shop tend to show lower percentages of transactions (8–9 km, 2.37 %; 9–10 km, 4.95 %; 10–12 km, 4.58 %; 12–14 km, 4.44 %; 14–16 km, 4.34 %; 16–18 km, 8.05 %), indicating that the transaction volume for AD is concentrated in proximal locations.

With respect to the shop PC, customer transactions appear within 2–3 km (3.03 %); meanwhile, there are almost no customers within 2 km (0–1 km, 0.00 %; 1–2 km, 0.29 %). The highest concentration of customer transactions occurs within the 10–12 km band (19.50 %), with smaller shares in the intervening bands (4–5 km, 3.69 %; 5–6 km, 3.57 %; 6–7 km, 2.80 %; 7–8 km, 2.66 %; 8–9 km, 4.39 %; 9–10 km, 6.90 %). The share of customers also rises toward 20 km (12–14 km, 6.60 %; 14–16 km, 8.77 %; 16–18 km, 9.33 %; 18–20 km, 8.98 %).
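The percentages per distance band quoted in this section can be reproduced in principle with a simple binning step. The sketch below takes the band edges from the ranges quoted in the text (1 km steps up to 10 km, then 2 km steps up to 20 km); the function name is hypothetical, and normalizing over transactions within 20 km is an assumption, since the text does not state the denominator.

```python
from bisect import bisect_right
from collections import Counter

# Band edges in km, as read from the ranges quoted in the text.
EDGES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20]


def share_by_distance_band(distances_km):
    """Percentage of transactions in each [EDGES[i], EDGES[i+1]) band."""
    counts = Counter()
    for d in distances_km:
        if 0 <= d < EDGES[-1]:
            i = bisect_right(EDGES, d) - 1  # index of the band's lower edge
            counts[(EDGES[i], EDGES[i + 1])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {band: 100.0 * n / total for band, n in sorted(counts.items())}


# Made-up distances for illustration only.
shares = share_by_distance_band([0.5, 1.5, 1.7, 11.0])
```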

The following is an analysis of the overlap of those geographical locations between the three shops. Figure 4(a), (b), and (c) visualize the geographical concentration of customer transactions. Figure 5(a), (b), and (c) show the overlap of those concentrations between PC and AD, PC and PA, and PA and AD, respectively.

Fig. 4. Visualization of the peaks of the number of transactions for (a) shop AD, (b) shop PA, and (c) shop PC.

Fig. 5. Visualization of the peaks of the number of transactions for (a) shops PC and AD, (b) shops PC and PA, and (c) shops PA and AD.

As we can see, PA’s trading area partly overlaps with those of shops AD and PC. In the former case, the overlap lies southwest of Barcelona, and in the latter case, northwest of Barcelona. This indicates that those pairs of shops (i.e., PA and AD, and PA and PC) compete for their trading area rather than complement each other in the city. Conversely, the trading areas of shops AD and PC do not overlap. They are clearly separated, meaning that each shop operates without competing with the other despite the proximity between them.
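The overlap discussed above is read from the maps in Fig. 5. One possible way to express such an overlap as a number, which is an assumption of this sketch and not the method used in the paper, is the Jaccard index between the sets of zip codes where each shop’s transactions peak.

```python
def jaccard_overlap(peaks_a, peaks_b):
    """Jaccard index between two collections of peak locations.

    Returns 0.0 for disjoint trading areas and 1.0 for identical ones.
    """
    a, b = set(peaks_a), set(peaks_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up zip codes for illustration only.
partial = jaccard_overlap({"08001", "08002"}, {"08002", "08003"})
disjoint = jaccard_overlap({"08001"}, {"08003"})
```

A value near zero would correspond to separated trading areas, and a larger value to shops drawing on the same locations.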

All these facts uncover the hidden structures of shops’ trading areas and their similarities at the micro scale. Each shop has unique concentrations of customer transactions.

5.2 Customers’ Spatial Distributions in the Macro Scale

This section analyzes the customers’ origins and destinations for each store over the wider territory. The goal is to detect the macroscopic trading area through spatial analysis. The difference from the previous section is the scale: while the previous section examined the distribution within the city of Barcelona, this section focuses on the wider territory beyond the city.

We compute the cumulative number of transactions made by the leaving and incoming customers against the distance from the focal shops. December, January, and July show significantly larger numbers than other months in all three cases. This result coincides with previous studies, in which those three months constitute the high season of the year in Spain. In addition, this result suggests that an individual shop’s attractivity is dynamic rather than static, varying with the season.

We also compute the cumulative distribution of transactions against the distance from the shop (see Fig. 6(a)). The results show that the incoming and leaving customers of each shop have a particular pattern in terms of the distribution of locations where they make their consecutive transactions. For instance, shops PC and PA present a sudden increase in transactions around 14 km, while for shop AD the increase occurs at 7 km. With respect to shop PC, the slope starts to decrease at around 15 km, and at around 14 km in the case of shop PA. In addition, Fig. 6(b) presents the log-log plot of the number of transactions against the distance from the shop (Table 1).
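A normalized cumulative distribution of this kind can be sketched as follows: for each distance on a grid, compute the fraction of transactions made at or below that distance from the shop. The function name and the sample values are hypothetical.

```python
def cumulative_share(distances_km, grid_km):
    """Fraction of transactions made within each distance in `grid_km`.

    `grid_km` is assumed to be sorted in ascending order.
    """
    ordered = sorted(distances_km)
    n = len(ordered)
    shares = []
    j = 0  # number of transactions at or below the current grid value
    for g in grid_km:
        while j < n and ordered[j] <= g:
            j += 1
        shares.append(j / n if n else 0.0)
    return shares


# Made-up distances for illustration only.
curve = cumulative_share([1, 3, 7, 14], [2, 7, 20])
```

A sudden increase at a given distance, as described for shops PC and PA, would appear as a large jump between two neighboring grid values.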

Fig. 6. (a) The distance from the shop where transactions are made against the cumulative frequency of the normalized number of transactions of leaving/incoming customers. (b) The transaction frequencies for each rank of distance from the shop.

Table 1. The slope of the line of best fit for each log ranked customers’ frequency vs distance during the high seasons.

Let us examine the log-log plot of the spatial distribution of the number of transactions in each month. Figure 6(b) presents transaction frequencies for each ranked distance bin for the entire period, and Fig. 7 presents the change in the slope of this rank plot for each month. We can observe that the slope for both pre- and post-transactions in shops AD and PA is nearly -1.0 in January, July, and December, which correspond to the high seasons. This indicates that a few locations have a much higher number of transactions, while most locations have very few transactions. This tendency is even stronger in shop PA than in shops AD and PC. Most of PA’s customers tend to come from a small number of places and subsequently move to a few locations. Conversely, the origin and destination shops of PC’s customers become largely dispersed in January, July, and December compared to other months.
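The slope of a rank plot on log-log axes can be estimated with an ordinary least-squares fit of log frequency against log rank. The sketch below is one plausible way to obtain such a slope; the text does not specify the fitting procedure, so both the method and the function name are assumptions.

```python
import math


def loglog_rank_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Bins are ranked from the most to the least frequent; empty bins are
    dropped because their logarithm is undefined.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two non-empty bins")
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic counts proportional to 1/rank give a slope of -1.
zipf_like = [1000 / r for r in range(1, 11)]
slope = loglog_rank_slope(zipf_like)
```

A slope near -1 means the most frequent bin dominates, while a slope near 0 means transactions are spread evenly over the bins.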

Fig. 7. The change in the slope of the log-ranked distance bins by transaction frequency in each month.

We can see from these results that the spatial distribution of customer transaction activities follows patterns that are unique to each individual shop. We speculate that PA might attract local customers rather than tourists from far away. This would explain why the origins and destinations of its customers are quite similar and why those few places are the main sources of its customers. Conversely, PC appears to attract tourists rather than local citizens, and this tendency is magnified during the high seasons of the year. The customers’ origins and destinations become more dispersed throughout the discount season.

It is commonly assumed that high seasons increase the number of transactions because drastic discounts cause customers to rush to shops, even from abroad. Our result partially confirms this phenomenon in the case of shop PC, but it is not a consistent pattern among all stores. We also showed that, where this pattern does not hold, the number of transactions during the high season is distributed in the same proportions as during the low season: the transaction volume increases, while the spatial distribution of transaction activities remains essentially the same. The cause of the increase therefore varies largely depending on the specific store and its location. In the case of PA, the effect is not due to an increase in customers who come from other places but simply to a larger volume from the same places. In the case of PC, by contrast, the effect is largely due to customers coming from other places, indicating that a simple increase of the same customers from the same locations does not apply.

6 Conclusions

This paper uncovers customers’ spatial distributions by analyzing their mobility patterns. We extract locations of consecutive transactions made by customers before and after going to one of the selected three focal shops.

These shops, PC, AD, and PA, are each located in a different urban context across the city of Barcelona, which allows us to uncover unique characteristics of their customers as well as of the areas they are located in. The large-scale, anonymized credit card transaction dataset makes it possible to analyze the successive chains of a customer’s purchase history between shops dispersed over the territory, rather than within a single shop.

Our findings reveal that the trading area of each store is distributed in a specific way. Customers of shop PA come from places similar to those of shops AD and PC, resulting in competition to attract said customers from each other. Conversely, customers of shops AD and PC share no overlap within the city, allowing the two shops to coexist rather than compete.

In addition, we discover that some distributions of the number of transactions against the distance from the shop follow a power law. This reveals that a few locations have higher frequencies of transactions, while most of them have very few transactions. This tendency is amplified even further in shop PA compared to AD or PC. Moreover, our analysis discloses how transaction volumes differ between the high and low seasons. Specifically, in the case of shop PA, customers during high seasons come from similar places rather than from different locations. The number of transactions in the high season simply increases from the same places in proportion to the low season, meaning that the customers’ spatial distribution is essentially the same for both. However, in the case of shop PC, the customers’ mobility pattern is different. The origins and destinations of shop PC’s customers become dispersed during the high season rather than concentrated as in the low season.

The outcome is almost reversed between shops PC and PA, although they belong to the same chain of large-scale department stores. We speculate that this feature might be due to the geographical and sociocultural context of each store. While shop PA is situated in a suburban area with a higher rate of immigration, shop PC is located at the center of the city, which is one of the most popular tourist areas.

Intuitively, differences in urban context cause the features of stores and their customers to differ. For instance, a store located in a tourist setting may attract many more tourists than one in a business or suburban district, and vice versa. Going beyond this intuition, this paper reveals the difference quantitatively through spatial analysis based on a large-scale dataset.

None of these analyses were possible prior to our research. Previous researchers have frequently used the Huff model [21, 22] to estimate the trading area of a shop from a macroscopic point of view. This merely reveals a homogeneous distribution of customer home locations and the strength of the shop’s attractivity, since the model simply depends on the distance from and the size of the shop. Thus, the result of the analysis does not represent heterogeneous customers and their geographical features, or temporal factors. Also, this information cannot be obtained with active mobile phone tracking with or without GPS [23, 24], with passive mobile phone tracking [12], or with Bluetooth detection techniques [20]. The datasets collected by those methods provide only the users’ locations, without evidence of their purchases. Thus, purchases can only be inferred from them under a series of significant assumptions. The combination of RFID [14] and the POS system has been proposed to reveal the relationship between sales volumes and customers’ mobility patterns. However, this is possible only inside a single store or mall.
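For reference, the Huff model mentioned above is commonly written in terms of store size and distance only: the probability of choosing store j is its attractiveness S_j^alpha / d_j^beta divided by the sum of the same quantity over all stores. The sketch below implements this standard textbook form; the exponent values are illustrative defaults, not parameters taken from [21, 22].

```python
def huff_probabilities(sizes, distances_km, alpha=1.0, beta=2.0):
    """Probability of a customer choosing each store under the Huff model.

    P_j = (S_j**alpha / d_j**beta) / sum_k (S_k**alpha / d_k**beta)
    `alpha` and `beta` are illustrative defaults (assumptions).
    """
    utilities = [s ** alpha / d ** beta for s, d in zip(sizes, distances_km)]
    total = sum(utilities)
    return [u / total for u in utilities]


# Two stores of equal size, one twice as far away as the other.
probs = huff_probabilities([100.0, 100.0], [1.0, 2.0])
```

Because only size and distance enter the formula, two customers at the same location always receive the same probabilities, which is the homogeneity limitation noted above.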

Our proposed methodologies should address these drawbacks. Our dataset permits us to analyze customers’ purchasing behavior across different retail shops dispersed in the urban area; thus, we reveal subsequent purchase behaviors while considering the mobility involved when customers complete microscopic transaction activities. This means that our current research shows the locations of customer transactions rather than just customers passing through these shops. In addition, our methodology and analysis can reveal an individual shop’s attractivity and its influence in the territory as a trading area at the micro scale. Furthermore, our methodology and extracted knowledge can help improve Christaller’s urban centrality model [25] and reveal the urban structure as well as its hierarchy. Although the spatial structure and hierarchy of cities by size and distance have been well studied [25, 26], “the regularity of the urban size distribution poses a real puzzle, one that neither our approach nor the most plausible alternative approach to city sizes seems to answer” (page 219 in [27]).

These extracted patterns help improve spatial arrangements and services offered to customers. Thus, retail shops and their districts can improve sales as well as their environment, thereby revitalizing the center of the urban districts. In addition, these findings are useful to urban planners and city authorities in revitalizing deteriorated districts or rehabilitating neighborhoods. Understanding customers’ sequential movement with transaction activities enables us to identify potential customer groups and their geographical demographics spatially. Finally, city planners can consider optimizing the infrastructure and the locations of the retail shops to make the district more attractive and active by increasing the number of pedestrians. For instance, knowledge of customers’ sequential movement between different retail shops facilitates collaboration among all shops in a district as a whole, rather than individually, to organize planned sale periods. Based on our findings, neighborhood associations can distribute discount coupons or advertisements in relevant places. The patterns can also serve as an indicator of when customers are most likely to complete transactions and of their successive locations.