1 Introduction

The proliferation of online shopping in recent years has significantly diversified consumer choices, even regarding food-related products. Despite this trend, the demand for physical stores, such as supermarkets, remains robust. This persistent relevance of physical stores is largely attributable to a critical factor, namely food freshness [1]. In the evolving retail domain, physical stores must leverage their unique strengths and distinguish themselves from their online competitors to sustain consumer interest.

Furthermore, artificial intelligence (AI) technology is revolutionizing the retail industry by enabling a more nuanced response to customer needs and by enhancing the accuracy of inventory management and demand forecasting, thus facilitating more efficient product replenishment planning [2]. In particular, the adoption of smart shopping carts (SSCs) and AI-powered cameras in retail environments has significantly advanced the real-time understanding of consumer purchasing processes [3, 4]. Previously, understanding the customer purchasing process involved linking purchase and path data collected from ID-POS systems and Bluetooth beacons, as detailed by Ishimaru et al. [5]. Although this approach effectively enhanced the understanding of the customer’s journey, it offered limited scope for real-time analysis and interaction. The introduction of SSCs and AI cameras represents a paradigm shift from this approach, offering a dynamic and interactive shopping experience through real-time customer contact. This capability offers immediate insights into purchasing decisions and allows targeted coupons and personalized product recommendations to be delivered directly to shoppers through the device attached to the cart, thereby enhancing the shopping experience through real-time interaction while increasing sales and improving customer satisfaction.

The purchase of cooking ingredients is the primary motive for shopping in physical stores. However, many consumers consider meal preparation challenging, and \(\sim\) 60% of in-store purchases are impulsive [6]. Thus, recommending products that complement items that are already in the cart or offering discount coupons could support consumers’ cooking and shopping. For retailers, employing such unique in-store sales strategies could attract more customers and encourage repeat business, thereby sustaining and increasing demand.

Therefore, this study aims to predict in-store purchases and offer real-time product recommendations to enhance the shopping experience in physical supermarket environments.

2 Related Works

Sequential purchasing behavior in real stores has been modeled [5, 7, 8]. Particularly, Gilbride et al. [8] verified the correlation between unplanned and planned purchases during sequential shopping, indicating an increasing probability of unplanned purchases as the shopping process progresses. Mukhopadhyay et al. [9] noted that the successful purchases of an ingredient in the early stages of shopping may justify subsequent unplanned purchases. Hence, recommending products that complement the items in the shopping cart during the shopping process can stimulate unplanned consumer purchases, resulting in increased sales. Moreover, to achieve this, an accurate product recommendation method that incorporates consumer purchase intentions is essential.

In the product recommendation domain, various methods have been developed to enhance the shopping experience across physical stores and online platforms [10]. These methods predominantly leverage historical purchasing data to forecast consumers’ subsequent interests. The user- and item-based recommendation systems are among the most prevalent approaches. The user-based systems generate suggestions based on the purchasing behaviors of similar customers [11], whereas item-based systems recommend products akin to those previously bought by the user [12]. These techniques have proven effective in scenarios where consumer preferences remained relatively stable over time.

However, the dynamic nature of consumer interests, which is particularly evident in settings such as supermarkets, complicates these conventional recommendation systems. As customers’ purchasing intentions can vary significantly from day to day, there is a risk of generating recommendations that do not align with their current needs. This discrepancy underscores the necessity for recommendation systems to adopt a more context-aware approach, particularly considering the current state and contents of shoppers’ carts.

To address this, Liao et al. [13] designed a novel rough set-based association rule approach for online consumer recommendation systems. This method could effectively analyze customers’ previous online behaviors and current product information and reveal patterns and rules that facilitate more accurate and behaviorally aligned e-commerce recommendations.

Additionally, a method that considers users’ daily purchasing intentions is based on radio frequency identification (RFID) [14]. This method deploys RFID to identify items in the shopping cart, predict recipes associated with the contents of the cart, and recommend items used with those recipes. Employing the recipe information, this method performs product recommendation that considers the state of the items in the cart, which varies from day to day. However, it may not be suitable for products that consumers purchase regardless of their current recipe considerations, such as eggs or milk. Furthermore, when predicting recipes based on the shopping cart contents, the significance attributed to each ingredient does not derive from data; rather, it is uniformly assigned. Thus, this approach significantly increases the likelihood of recommending inappropriate recipes.

Considering these aforementioned limitations, this study was conducted to develop a recommendation methodology that simultaneously considers past purchase trends and the real-time context of the customers’ shopping cart to improve recommendation accuracy. The existing methods are separately applied, and are associated with issues regarding the recipe prediction logic, which results in the recommendation of items that may not be suitable in each case, thus leaving room for improving the overall recommendation accuracy.

3 Proposed Method

Here, we introduce our proposed product recommendation method. This method is distinguished by its ability to simultaneously consider co-purchasing patterns based on previous purchasing trends and contextual data in the cart. The proposed method assumes a situation in which a user has a list of products in his/her cart at the k-th point of purchase and recommends a list of products he/she is likely to purchase after the k-th point. That is, we assume a situation in which a user is shopping and recommend products to that user by integrating the historical data of items bought together with the specific context of the current contents of the user’s cart. Figure 1 shows an overview of the proposed method.

Fig. 1 Overview of the proposed method

The proposed method comprises two parts: the recommendation of products based on the shopping cart content using general previous purchasing patterns and the recommendation of products based on the shopping cart context with the recipe information. This method enables product recommendations based on the cart context while making recommendations based on overall purchasing patterns.

The list of products in the cart at the k-th point of purchases in session s is given by \(\displaystyle I^{s}_{k} = \{ i_1^s, i_2^s, \ldots , i_k^s \}\), and the list of products purchased after the k-th point of purchases is given by \(\displaystyle I_k^{'s} = \{ i_{k+1}^{'s}, i_{k+2 }^{'s}, \ldots , i_N^{'s} \}\). Based on this \(\displaystyle I^{s}_{k}\), the process involves extracting recommended products by analyzing frequent previous purchasing patterns and extracting recommended products based on the context provided by the recipe information. Subsequently, the products, which are identified by both distinct processes, are recommended for session s.

3.1 Recommendation Process Using General Previous Purchasing Patterns

Association rules are created by co-purchase analyses based on past purchase history data. Therefore, the association rule is used to recommend products that tend to be co-purchased with the products in the cart during recommendation.

Co-purchase analysis is a method for identifying the combinations of products that users purchase simultaneously in a series of purchases  [15, 16]. Further, association rules are combinations of products that exhibit co-purchasing tendencies based on the co-purchase analysis. Employing the co-purchase analysis, we obtained an association rule, as presented in the example in Table 1.

Table 1 Examples of association rules

For example, each record in Table 1 is a rule, and Rule No. 1 indicates that “when a user purchases eggs, they tend to also purchase milk.” The lift value of a rule with antecedent item A and consequent item B, \(\text {Lift}( A\rightarrow B)\), is given by Eq. (1), where \(N_{all}\) is the number of all sessions, \(N_A\) and \(N_B\) are the numbers of sessions during which items A and B were purchased, respectively, and \(N_{A\ \cap \ B}\) is the number of sessions in which items A and B were purchased together.

$$\begin{aligned} \text {Lift}(A \rightarrow B) = \displaystyle \frac{\frac{\displaystyle N_{A \cap B}}{\displaystyle N_{all}}}{\displaystyle \frac{N_A}{N_{all}} \displaystyle \frac{N_B}{N_{all}}} \end{aligned}$$
(1)

Generally, rules with \(\text {Lift}( A\rightarrow B) > 1\) indicate that A and B tend to be purchased together more often than would be expected if their purchases were independent.

In the proposed method, \(\text {Rule}_{all}\) denotes the set of association rules extracted from the purchase history data, and \(\text {Rule}_{s,k}\) denotes the subset of \(\text {Rule}_{all}\) whose antecedent item, A, is an element of \(I^{s}_{k}\). The consequent items, B, of the rules in \(\text {Rule}_{s,k}\) with the top N Lift values become the recommended products \((\text {Rec}_{co})\) for \(I^{s}_{k}\). Put differently, \(\text {Rec}_{\text {co}}\) is the set of items \(B\) from the rules in \(\text {Rule}_{s,k}\) where the value of \(\text {Lift}(A \rightarrow B)\) belongs to the top \(N\) rules. This method enables the recommendation of products that are likely to be purchased together with the items in the shopping cart based on previous purchasing patterns.
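As a minimal sketch of Eq. (1) and the top-N selection, the following Python fragment computes lift values from a toy list of sessions and returns the consequents of the highest-lift rules whose antecedent is in the cart. The function names `lift_table` and `recommend_co`, the toy sessions, and the tie-breaking order are illustrative assumptions rather than the authors' implementation; any support or confidence thresholds used during rule extraction are not described here and are therefore omitted.

```python
from collections import Counter
from itertools import permutations


def lift_table(sessions):
    """Return {(A, B): Lift(A -> B)} for every ordered item pair that co-occurs."""
    n_all = len(sessions)
    item_count = Counter()  # N_A: number of sessions containing item A
    pair_count = Counter()  # N_{A and B}: number of sessions containing both items
    for items in sessions:
        unique = set(items)
        item_count.update(unique)
        pair_count.update(permutations(sorted(unique), 2))
    return {
        (a, b): (n_ab / n_all) / ((item_count[a] / n_all) * (item_count[b] / n_all))
        for (a, b), n_ab in pair_count.items()
    }


def recommend_co(cart, lifts, top_n):
    """Consequents B of the top_n rules (by lift) whose antecedent A is in the cart."""
    rules = [(lift, a, b) for (a, b), lift in lifts.items() if a in cart]
    rules.sort(key=lambda r: (-r[0], r[1], r[2]))  # highest lift first, ties by name
    return [b for _, _, b in rules[:top_n]]


# Hypothetical toy data: four sessions over three items.
sessions = [
    ["egg", "milk"],
    ["egg", "milk", "bread"],
    ["egg", "bread"],
    ["milk"],
]
lifts = lift_table(sessions)
rec_co = recommend_co({"egg"}, lifts, top_n=2)
```

In this toy data, Lift(egg → bread) is (2/4) / ((3/4)(2/4)) = 4/3, whereas Lift(egg → milk) is (2/4) / ((3/4)(3/4)) = 8/9, so bread is ranked ahead of milk for a cart containing eggs.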

3.2 Recommendation Process Based on the Shopping Cart Context

Users do not intend to cook the same dishes every day. Therefore, the products they purchase from supermarkets and other stores might differ per shopping day. Thus, to make recommendations in supermarkets, having a viewpoint that recommends the most suitable products based on the context of the items in the user’s shopping cart is crucial. In this study, the shopping cart context is supplemented by the cooking recipe information, enabling product recommendations based on the user’s real-time shopping cart content. The recommendation using food recipes comprises the following steps:

  i) Extract nouns from the list of ingredient names in the recipe and assign significance to them.

  ii) Extract nouns from \(I^{s}_{k}\) (the list of products purchased at the k-th point of purchases in session s).

  iii) Score each recipe using the noun significance and predict recipes.

  iv) Recommend products used in the predicted recipes.

As product information includes proper nouns, such as the place of origin and brand name (e.g., “potatoes from Saitama Prefecture (brand: xxxx)”), it rarely matches the name of ingredients in the recipe information exactly. Therefore, we propose a method for matching the products in a cart with recipe information by using common nouns among words.

In i) and ii), morphological analysis (see Footnote 1) is used to extract common nouns from the names of the ingredients in the recipes and from the list of products in the cart, respectively, in order to conduct noun-by-noun matching.

To better reflect the significance of each ingredient in the recipes, we introduced a concept of importance similar to the term frequency-inverse document frequency (tf-idf), which evaluates the significance of a word based on its frequency in a document relative to its frequency across all documents [17]. This approach allows us to assign weights to ingredients based on their specificity and generality, thereby enhancing the accuracy of recipe predictions and product recommendations.

In the product recommendation based on recipe information by Banno et al. [14], the significance of the ingredients in each recipe is set uniformly. Therefore, the differences in the significance cannot be considered based on the objective frequency of use, such as “Potatoes are generally assumed to be used in a wide range of culinary categories, although they are most often used in category x.” A comparison of the occurrence frequency of the noun in the curry and stew categories indicates that “potato” is used more often in the stew category than in the curry category (Table 2).

Table 2 Proportion of ingredients in the curry and stew categories

Without considering how frequently each ingredient occurs in each recipe category, recipe prediction from the cart may yield recipes that rarely use a purchased ingredient, which may result in false recommendations. Thus, this method assigns a degree of significance to each recipe noun using two measures: the degree to which the noun is specific to particular recipes among all recipes (uniqueness) and the degree to which it is commonly used within a specific recipe category (generality). Both are calculated using a concept similar to tf-idf [17, 18]. The uniqueness of the noun m among all recipes (\(U_m\)) and the generality of the noun m in the recipe category c \(( G_{c,m} )\) are calculated using Eqs. (2) and (3), respectively.

$$\begin{aligned} U_m&= \log \left( \frac{R_{all}}{R_m} \right) \end{aligned}$$
(2)
$$\begin{aligned} G_{c,m}&= \log \left( \frac{R^c_m}{R^c} + 1 \right) \end{aligned}$$
(3)
$$ \begin{pmatrix} R_{all}: \#{\text { of all recipes}} \\ R_m: \#\text { of recipes using the noun { m}} \\ R^c_m: \#\text { of recipes using noun { m} in category { c}} \\ R^c: \#\text { of recipes in category { c}} \end{pmatrix} $$

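The logarithm is used to bring the distributions of both \(U_m\) and \(G_{c,m}\) closer to the shape of a normal distribution, as they both exhibit long right tails. Additionally, 1 is added to the ratio \(R^c_m / R^c\) in Eq. (3) before taking the logarithm so that the argument is always at least 1; \(G_{c,m}\) therefore remains defined and nonnegative even when no recipe in category c uses the noun m.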

Afterward, the significance (importance) of the noun, m, in the recipe category, c, is calculated using Eq. (4) with the two indices and the parameters p and q. In this study, the values of the parameters p and q were set to 1 and 2, respectively, referencing the study by Ikejiri et al. [18].

$$\begin{aligned} Im_{c,m} = \left( U_m \right) ^p \times \left( G_{c,m} \right) ^q \end{aligned}$$
(4)

Employing the above method, the importance of m in each c is calculated quantitatively. The higher the importance, \(Im_{c,m}\), the more frequently the noun, m, is used in a particular c relative to its use across all recipes. For example, “curry roux” is used in few recipes overall; however, as it is used frequently in the “curry category,” the importance of “curry roux” in the “curry category” is high.
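The calculation in Eqs. (2) to (4) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `importance`, the toy recipes, and the use of the natural logarithm are not specified in the text, and only (category, noun) pairs that actually occur are stored (for pairs that never occur, Eq. (3) gives \(G_{c,m} = \log 1 = 0\), so the importance is 0).

```python
import math


def importance(recipes, p=1, q=2):
    """recipes: list of (category, set_of_nouns). Returns {(category, noun): Im}."""
    r_all = len(recipes)
    r_m = {}   # R_m: number of recipes using noun m
    r_c = {}   # R^c: number of recipes in category c
    r_cm = {}  # R^c_m: number of recipes in category c using noun m
    for category, nouns in recipes:
        r_c[category] = r_c.get(category, 0) + 1
        for m in set(nouns):
            r_m[m] = r_m.get(m, 0) + 1
            r_cm[(category, m)] = r_cm.get((category, m), 0) + 1
    im = {}
    for (c, m), count in r_cm.items():
        u = math.log(r_all / r_m[m])      # Eq. (2): uniqueness
        g = math.log(count / r_c[c] + 1)  # Eq. (3): generality
        im[(c, m)] = (u ** p) * (g ** q)  # Eq. (4): importance
    return im


# Hypothetical toy recipes: two curry recipes and two stew recipes.
recipes = [
    ("curry", {"curry roux", "potato", "onion"}),
    ("curry", {"curry roux", "onion", "carrot"}),
    ("stew", {"potato", "onion", "milk"}),
    ("stew", {"potato", "carrot", "milk"}),
]
im = importance(recipes)
```

In this toy data, “curry roux” and “onion” both appear in every curry recipe, so their generality in the curry category is identical; “onion” also appears in a stew recipe, which lowers its uniqueness and hence its importance relative to “curry roux.”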

Next, in iii), matching is performed between the nouns extracted from the ingredient information of each recipe and the nouns extracted from the items in the cart at the recommendation point to predict recipes from the items in the cart. In this approach, the similarity of the string between nouns is used, and the nouns with a similarity above a threshold are considered matched. The similarity of the strings was calculated using the Levenshtein distance, and whether they matched was determined based on a threshold value of 0.7. Specifically, when matching occurs between the noun, m, in the cart of session, s, and m in the recipe, r, s is considered to have purchased m in recipe r. Subsequently, using the matched nouns in recipe r per session, the score of s and r (\(Score_{s,r}\)) is calculated using Eq. (5).

$$\begin{aligned} Score_{s,r} = \displaystyle \frac{ \sum _{m \in M_{r}} \left( f_{s,m} \times Im_{c,m} \right) }{\sum _{m \in M_{r}} Im_{c,m}} \end{aligned}$$
(5)
$$\begin{pmatrix} f_{s,m} = {\left\{ \begin{array}{ll} 0 &{} (\text {Unmatched}) \\ 1 &{} (\text {Matched}) \end{array}\right. } \\ \\ \quad M_{r}: \text {Set of nouns in recipe } r \quad \end{pmatrix}$$

Equation (5) is the ratio of the sum of the importance of the matched nouns in session s (the numerator) to the sum of the importance of all nouns in recipe r (the denominator). The function \(f_{s,m}\) in the numerator takes the value 1 if the noun, m, in recipe r is matched in session s and 0 otherwise, so the numerator accumulates the importance of the matched nouns only. A higher \(Score_{s, r}\) indicates that s matches a larger proportion of the important nouns in recipe r. Put differently, it signifies a higher matching rate between r and s, indicating that r aligns with the context and requirements of s and is likely a suitable recipe of interest to the user. This scoring is performed per session; by extracting the cooking recipes with high scores per session, cooking recipes can be predicted based on the items in the cart at the recommendation point. In this approach, the Top N recipes with the highest scores in each session are extracted as the predicted recipes (Top-N-Recipe), and the items associated with these predicted recipes are considered as the recommended items (\(\text {Rec}_{*Recipe}\)). In this study, \(\text {Rec}_{*Recipe}\) is formed by randomly selecting N items from the products used in Top-N-Recipe.
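The matching and scoring steps can be sketched in Python as follows. Several details are assumptions made only for illustration: the text does not state how the Levenshtein distance is converted into a similarity, so the sketch normalizes by the length of the longer string; a similarity equal to the threshold is treated as a match; and the function names, toy importance values, and tie-breaking rule are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def recipe_score(cart_nouns, recipe_importance, threshold=0.7):
    """Eq. (5): importance of matched recipe nouns over importance of all recipe nouns."""
    total = sum(recipe_importance.values())
    if total == 0:
        return 0.0
    matched = sum(
        im for noun, im in recipe_importance.items()
        if any(similarity(noun, c) >= threshold for c in cart_nouns)
    )
    return matched / total


def top_n_recipes(cart_nouns, recipes, n):
    """recipes: {recipe_id: {noun: Im}}. Returns the n highest-scoring recipe ids."""
    ranked = sorted(recipes, key=lambda r: (-recipe_score(cart_nouns, recipes[r]), r))
    return ranked[:n]


# Hypothetical importance values for two recipes and a two-item cart.
recipes = {
    "curry": {"curry roux": 0.33, "potato": 0.10, "onion": 0.14},
    "stew": {"milk": 0.30, "potato": 0.20, "carrot": 0.10},
}
cart = ["potatoes", "onion"]
best = top_n_recipes(cart, recipes, n=1)
```

Here “potatoes” matches “potato” with a similarity of 1 − 2/8 = 0.75, so the curry recipe scores 0.24/0.57 and the stew recipe scores 0.20/0.60, and curry is ranked first.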

3.3 Implementation of Recommendations

Based on the procedures in 3.1 and 3.2, the recommended items were extracted based on co-occurrence relationships from the purchase history \((\text {Rec}_{co})\) and recipe information \((\text {Rec}_{*Recipe})\) in the shopping cart in session \(s\). The items, which were extracted using these two methods, are considered recommended at the k-th point of purchase in session s. This approach makes it possible to perform product recommendations that simultaneously consider previous purchasing trends through co-purchase analysis and the context of items in the cart through recipe information. In the extant studies, these methods were often used separately, highlighting the novelty of this research. Additionally, if recommended items overlap in this study, the duplicates are not removed because we assumed recommendations at the product category level. Furthermore, the overlapping of recommendations offers a potential for applications, such as product recommendations from different brands.
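The combination step can be illustrated with a short Python sketch. The function name `combine_recommendations` and the example items are hypothetical; the only behavior taken from the text is that the two lists are merged and that overlapping items are not removed.

```python
def combine_recommendations(rec_co, rec_recipe):
    """Concatenate both recommendation lists; overlapping items are kept on purpose."""
    return list(rec_co) + list(rec_recipe)


# Hypothetical outputs of the co-purchase and recipe-based processes.
combined = combine_recommendations(["milk", "bread"], ["potato", "milk"])
overlap = sorted(set(["milk", "bread"]) & set(["potato", "milk"]))
```

Keeping the duplicate entry for "milk" mirrors the design choice described above: at the product category level, a repeated entry could be filled with a different brand.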

In the following sections, we apply the proposed method to sequential purchase history data from retail stores and validate the effectiveness of this approach using quantitative evaluation metrics.

4 Data and Preprocessing

In this section, we describe the utilized data and its treatment. The utilized data were sequential purchase history data obtained from a supermarket’s SSC and the Rakuten Recipe dataset provided by Rakuten Group, Inc.

4.1 Sequential Purchase History Data

The sequential purchase history data were obtained from SSCs that were installed in a supermarket store in Japan, and Table 3 represents a summary.

Table 3 Overview of the sequential purchase history data

The dataset originated from SSCs in a comprehensive Japanese supermarket; the SSCs were equipped with integrated cash register functions. This supermarket is notable for its extensive product range, from food items to daily necessities. The dataset encompasses sequential purchase history data collected between October 4, 2022, and May 30, 2023. The key data fields included User ID, Session ID, Scan time, JAN code (indicative of the scanned item), Item name, and Item count, capturing the details of each transaction. The dataset comprised 85,104 unique products, represented across 162,189 unique shopping sessions by 12,066 distinct users. Table 4 presents an example of this sequential purchase history data.

Table 4 Example of sequential purchase history data

These data contain the session ID, which uniquely identifies each user’s purchase, the scan time per item, and the scanned item. Therefore, these sequential purchase history data could be used to determine which customers purchased what items, as well as when and in what order the purchases were made. The following preprocessing was applied to these sequential purchase history data to perform product recommendations.

In this study, we assumed a situation in which we recommended products that were likely to be purchased after the k-th point of purchase based on the products purchased up to the k-th point. Thus, the list of products purchased up to the k-th point of purchase as well as a list of products purchased after the k-th point were recorded per session, and Table 5 presents an example of the data.

Table 5 Example of the preprocessed sequential purchase history data
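The preprocessing described above can be sketched as follows. The function name `split_session`, the tuple layout, and the example scans are hypothetical; the sketch also assumes that scan times are strings in a format that sorts chronologically.

```python
def split_session(scans, k):
    """Split one session into (I_k, I'_k) at the k-th scan.

    scans: list of (scan_time, item_name) tuples for a single session.
    """
    ordered = [item for _, item in sorted(scans, key=lambda s: s[0])]
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < number of scanned items")
    return ordered[:k], ordered[k:]


# Hypothetical scans for one session, deliberately out of order.
scans = [
    ("2022-10-04 10:03:10", "milk"),
    ("2022-10-04 10:01:45", "egg"),
    ("2022-10-04 10:05:30", "bread"),
]
cart, after = split_session(scans, k=2)
```

With k = 2, the first list holds the two earliest scans (egg, milk) and the second holds the item purchased afterward (bread), which corresponds to the two lists recorded per session.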

4.2 Recipe Data

In this study, we used recipe data to complement the cart context with food recipe information. The recipe data were collected from the Rakuten Recipe Dataset [19] provided by the National Institute of Informatics (NII) and Rakuten Group, Inc. These data were obtained from Rakuten Recipe [20], a website that the Japanese refer to when preparing food. These data include a recipe ID, which uniquely identifies each recipe, the recipe title, the recipe category, and the recipe ingredients. The ingredients of each recipe also include the quantity information, although we did not consider the quantity of the purchased products or the quantity information in this study in order to simplify the data model and enhance the flexibility of our product recommendations. The dataset contains \(\sim\) 800,000 recipes. As the recipes in this study are intended for daily cooking, 4,648 recipes with a minimum of four ingredients were selected from the recipes, excluding the event and foreign cuisine (not common cuisines in Japan) categories.

5 Experiments

5.1 Experimental Overview

In this study, we validated the effectiveness of the proposed method using the purchase history data from the physical supermarket store, as described in 4.1. The purchase history data were obtained from SSCs in the physical supermarket store, where the purchased items per session, \(s\), of user \(u\) were sequentially recorded at the time of \(k\)-item purchase. Table 6 presents an overview of the experiment. Note that the data in Table 6 is a subset of Table 3, both obtained from a single physical supermarket store.

Table 6 Experimental overview

The experiment mainly considered a specific selection of target products categorized into “Grocery,” “Daily,” and “Fresh Produce,” spanning 256 items by category. These categories are essential for individuals who engage in cooking, as they reflect the typical needs of a household’s kitchen. The defined target customers are individuals aged between 45 and 55 years who are frequent shoppers with visit intervals of \(\le\) 4 days. This demographic was identified as the primary purchasing group for the experiment.

Further, we delineated target sessions based on the purchase of the aforementioned product categories, with sessions categorized by the number of items purchased: 10 items (N=10), 11 items (N=11), and 12 items (N=12), corresponding to 636, 512, and 417 sessions, respectively.

We incorporated 4,648 recipes into the experiment, each containing a minimum of four ingredients. These recipes are used as part of the recommendation system to propose relevant products to the targeted customers. The effectiveness of these recommendations was evaluated using three key metrics, namely “Recall,” “Precision,” and the “F1 score”. The function of these metrics was to quantify the accuracy of the recommendations by comparing the items that were proposed to the customers with those they had purchased post-recommendation. This comparison aimed at measuring the degree of alignment of the recommendation system to actual customer purchasing behaviors.

Moreover, the following six experimental methods were adopted:

  A) Recommendation of 10 random items (Baseline1). This method involves the uniform random selection and recommendation of 10 products from the target products. The results obtained using this method, which are completely random, can subsequently be used to verify the effectiveness of the methods described below.

  B) Recommendation of 10 items based on co-purchase (Baseline2). This method relies solely on the basket analysis in 3.1 and is based on the association rules extracted from previous purchases. It recommends items (\(\text {Rec}_{co}\)) that are frequently purchased with those in the cart. In this case, the recommendation size (\(|\text {Rec}_{co}|\)) is set to 10.

  C) Recommendation of 10 items based on recipes without importance. In this method, the importance, Im, of all ingredients is set to 1 (i.e., the importance, Im, is not used), and the recommendations are based on the recipe information. The top five recipes (Top-5-Recipe) are predicted from the products in the cart, and the products used in the predicted recipes are used as the recommended products (\(\text {Rec}_{Recipe}\); see Footnote 2). The recommendation size (\(|\text {Rec}_{Recipe}|\)) is set to 10.

  D) Recommendation of 10 items based on recipes with importance. This method, which is solely based on 3.2, predicts the top five recipes (Top-5-Recipe) from the items in the cart and recommends items (\(\text {Rec}_{*Recipe}\)) used therein. The recommendation size (\(|\text {Rec}_{*Recipe}|\)) is set to 10.

  E) Recommendation of 10 items: five based on co-purchase and five based on the recipes without importance. This method combines the approaches described in 3.1 and Method C). It simultaneously recommends items based on the frequent co-purchase patterns identified from previous purchase data (\(\text {Rec}_{co}\)) and items predicted from recipes (\(\text {Rec}_{Recipe}\)) without importance, Im. To maintain consistency with the other methods, the recommendation size was adjusted to \(|\text {Rec}_{co}| = 5\) for the co-purchase and \(|\text {Rec}_{Recipe}| = 5\) for the recipe-based recommendations (the total recommendation size is set to 10).

  F) Recommendation of 10 items: five based on co-purchase and five based on the recipes using Im (Proposed Method). This proposed method combines the approaches described in 3.1 and 3.2. It simultaneously recommends items based on the frequent co-purchase patterns identified from previous purchase data (\(\text {Rec}_{co}\)) and items predicted from recipes (\(\text {Rec}_{*Recipe}\)). To maintain consistency with the other methods, the recommendation size was adjusted to \(|\text {Rec}_{co}| = 5\) for the co-purchase and \(|\text {Rec}_{*Recipe}| = 5\) for the recipe-based recommendations (the total recommendation size is set to 10).

We examined the effectiveness of the proposed method by comparing these three metrics across the different methods. Let \(\text {Answer}_{s, k}\) denote the set of items purchased after recommendation time \(k\) (i.e., the \(k\)-th item purchase point) in session \(s\), and let \(\text {Predict}_{s, k}\) denote the set of items recommended at recommendation time \(k\) in session \(s\). The evaluation metrics are calculated as follows:

  • \(\text {Recall} = \displaystyle \frac{|\text {Predict}_{s, k} \cap \text {Answer}_{s, k}|}{|\text {Answer}_{s, k}|}\)

  • \(\text {Precision} = \displaystyle \frac{|\text {Predict}_{s, k} \cap \text {Answer}_{s, k}|}{|\text {Predict}_{s, k}|}\)

  • \(\text {F1} = \displaystyle \frac{2(\text {Precision} \times \text {Recall})}{\text {Precision} + \text {Recall}}\)

where Recall indicates the proportion of the products actually purchased after the k-item point that were successfully recommended. Precision indicates the percentage of products actually purchased among those recommended at the k-item point, and \(\text {F1}\) is the harmonic mean of \(\text {Precision}\) and \(\text {Recall}\), acting as a composite metric of the two. Generally, \(\text {Precision}\) and \(\text {Recall}\) exhibit a trade-off relationship, and \(\text {F1}\) is high only when both are high and balanced. Put differently, \(\text {F1}\) tends to increase when there is a balance between \(\text {Precision}\) and \(\text {Recall}\) and when the quality of recommendations is high. Conversely, a low \(\text {F1}\) value indicates that \(\text {Precision}\), \(\text {Recall}\), or both performed poorly.
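The three metrics can be computed per session and recommendation point as in the following Python sketch. The function name `evaluate` and the example items are hypothetical, and returning 0 when a denominator is zero is an assumed convention that the text does not specify.

```python
def evaluate(predicted, answer):
    """Recall, Precision, and F1 for one session at one recommendation point."""
    predict_set, answer_set = set(predicted), set(answer)
    hits = len(predict_set & answer_set)
    recall = hits / len(answer_set) if answer_set else 0.0
    precision = hits / len(predict_set) if predict_set else 0.0
    f1 = 0.0 if hits == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f1


# Hypothetical example: four recommended items, three items purchased afterward.
recall, precision, f1 = evaluate(
    predicted=["milk", "bread", "potato", "onion"],
    answer=["milk", "onion", "carrot"],
)
```

In this example, two of the three subsequently purchased items were recommended (Recall = 2/3), two of the four recommendations were purchased (Precision = 1/2), and F1 = 4/7.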

5.2 Experimental Results

The proposed method was applied to the sequential purchase history data of a physical retail store, and the results of the recommendations are presented below.

5.2.1 Comparison of the Metrics in Each Method

The average values of Recall, Precision, and F1 for all purchased item counts (\(N=10, 11, 12\)) were calculated for each recommendation point, \(k\), and the results are illustrated in Figs. 2, 3, 4.

Fig. 2 Comparison of Recall per \(N\)

Fig. 3 Comparison of Precision per \(N\)

Fig. 4 Comparison of F1 per \(N\)

Figures 2, 3, 4 illustrate the Recall, Precision, and F1 comparison for the various recommendation methods across different recommendation points, denoted by \(k\), with \(N\) representing the number of items purchased in the session. The horizontal axis denotes the recommendation point (\(k\)), i.e., the point at which \(k\) products had been placed in the cart, and the vertical axis represents the metric value, which quantifies the recommendation accuracy.

The method labeled Baseline 1 (A) in the figures refers to a random recommendation strategy, where 10 products were uniformly recommended from the target products, serving as a control for assessing the performances of the other methods. Conversely, Baseline 2 (B) is based on the co-purchase patterns derived from the basket analysis (3.1), recommending items that are frequently bought together with the current items in the user’s cart. Additionally, the light blue line (C) shows the recipe-based recommendation without the importance (\(Im\)), and the blue line (D) shows the recipe-based recommendation that incorporates the importance weights of the ingredients (\(Im\)). The figures also include a combined approach that integrates co-purchase patterns and recipe information without \(Im\) (light red line, E) and a method that integrates co-purchase patterns with recipe predictions considering \(Im\) (red line, F).

Each figure shows that the methods considering co-purchases and recipe information simultaneously (Methods E and F) exhibited improvements in each evaluation metric compared with the method that randomly recommended items (Method A) and the methods that considered only co-purchase or recipe information separately (Methods B, C, and D). Particularly, between Methods E and F, the approach incorporating importance weighting (Method F) exhibited slightly improved metrics compared with the method without importance (Method E). This result suggests that considering Im in the recipe-based component slightly improves the performance metrics, as shown in Figs. 2, 3, 4.

Additionally, the distributions of Recall, Precision, and F1 for each level of the purchased item count (N) were examined by visualizing the kernel density functions. The kernel density plots per metric at different N values are shown in Figs. 5, 6, 7.

Fig. 5 Kernel density functions of Recall per method

Fig. 6 Kernel density functions of Precision per method

Fig. 7 Kernel density functions of F1 per method
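For readers unfamiliar with kernel density plots such as those in Figs. 5, 6, 7, the sketch below shows one way a density curve could be estimated from per-session metric values. It is an assumption-laden illustration: the Gaussian kernel, the normal-reference bandwidth rule, and the sample Recall values are all hypothetical choices, since the estimator settings used for the figures are not stated here.

```python
import math


def gaussian_kde(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`.

    Assumption: when no bandwidth is given, a normal-reference rule of thumb
    (1.06 * std * n^(-1/5)) is used; it needs at least two distinct samples.
    """
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
        bandwidth = 1.06 * std * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in samples)
        for g in grid
    ]


# Hypothetical per-session Recall values for one method
recalls = [0.0, 0.0, 0.1, 0.2, 0.25, 0.5]
grid = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
density = gaussian_kde(recalls, grid)
for g, d in zip(grid[:5], density[:5]):
    print(f"{g:.2f} {d:.3f}")
```

A method with many zero-hit sessions produces a density peak near zero, which is the pattern the comparison across methods looks for.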

Moreover, as observed in Fig. 5, the proposed method reduced the number of sessions in which Recall is close to zero compared with the other methods. This indicates that the proposed method, which uses both co-purchase patterns and recipe information, reduced the number of inappropriate recommendations with no hits.

The method that considered only recipe information yielded lower metric values than the baselines and the methods that considered both co-purchases and recipe information. The kernel density plots in Figs. 5, 6, 7 show that the recipe-only method recorded more sessions with metric values between 0 and 0.2 than the other methods, indicating that relying solely on recipe information to capture the cart context is undesirable for improving recommendation accuracy.

Furthermore, the comparison of the comprehensive indicator, F1, in Fig. 4 shows that the F1 value decreases as the recommendation point \(k\) approaches the purchase completion point, \(k=N\). Although Fig. 2 shows that the Recall values remain stable regardless of the purchase timing, Fig. 3 reveals a decrease in the Precision values, particularly near the purchase completion point, \(k=N\). Thus, the overall decrease in the F1 values is attributed to the decrease in the Precision values near the purchase completion point, as evident from the trends in Figs. 2 and 3.

5.2.2 Analysis of Products with Improved Accuracy Using Each Method

After applying the six methods to the sequential purchase history data of a physical retail store, we examined the items that displayed improved accuracy in each method. Here, we denoted the number of sessions where item \(i\) is recommended as \(N_{predict}^i\) and denoted the number of sessions where item \(i\) is recommended and where the recommendation is accurate (purchased post-recommendation) as \(N_{hit}^i\). We calculated the hit ratio, \(P^i\), for item \(i\) using the following equation:

$$\begin{aligned} P^i = \frac{N_{hit}^i}{N_{predict}^i} \end{aligned}$$
(6)
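Equation (6) can be computed per item from session-level logs as in the following minimal sketch. The input layout (one pair of recommended items and subsequently purchased items per session) and the function name `hit_ratios` are assumptions made for illustration.

```python
from collections import Counter


def hit_ratios(sessions):
    """Return {item: P^i} following Eq. (6).

    sessions: iterable of (recommended_items, items_purchased_afterwards),
              one pair per session (hypothetical input layout).
    """
    n_predict = Counter()  # sessions in which item i was recommended
    n_hit = Counter()      # ... and purchased after the recommendation
    for recommended, purchased_later in sessions:
        purchased = set(purchased_later)
        for item in set(recommended):  # count each item once per session
            n_predict[item] += 1
            if item in purchased:
                n_hit[item] += 1
    return {item: n_hit[item] / n_predict[item] for item in n_predict}


# Hypothetical sessions
sessions = [
    (["eggs", "milk"], ["milk"]),
    (["eggs", "pasta"], ["eggs", "pasta"]),
    (["milk"], []),
]
print(hit_ratios(sessions))
```

Items that are never recommended have no defined \(P^i\) and are therefore absent from the result.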

Subsequently, we calculated the difference in \(P^i\) per item between the different methods, focusing on the items with improved accuracy. Specifically, we analyzed the characteristics of items in the following categories:

  1. The top 15 items where \(P^i\) improved with the recipe method compared with co-purchase

  2. The top 15 items where \(P^i\) improved with the proposed method (co-purchase + recipe (using \(Im\))) compared with only co-purchase

  3. The top 15 items where \(P^i\) improved with the proposed method (co-purchase + recipe (using \(Im\))) compared with only the recipe method (using \(Im\))
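Each of the three comparisons above amounts to ranking items by the difference in \(P^i\) between two methods. A minimal sketch follows; restricting the comparison to items recommended by both methods is an assumption, and the function name and sample values are hypothetical.

```python
def top_improved(p_new, p_base, n=15):
    """Items whose hit ratio P^i is higher under p_new than under p_base.

    p_new, p_base: {item: P^i} for the two methods being compared.
    Assumption: only items recommended by both methods are compared.
    Returns up to n (item, difference) pairs, largest improvement first.
    """
    common = p_new.keys() & p_base.keys()
    improved = [(i, p_new[i] - p_base[i]) for i in common if p_new[i] > p_base[i]]
    return sorted(improved, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical hit ratios for two methods
p_recipe = {"curry roux": 0.30, "pasta": 0.25, "eggs": 0.10}
p_copurchase = {"curry roux": 0.10, "pasta": 0.20, "eggs": 0.40}
for item, diff in top_improved(p_recipe, p_copurchase, n=2):
    print(f"{item}: +{diff:.2f}")
```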

Fig. 8 Top 15 items where \(P^i\) improved with each method

First, Fig. 8 shows that the method considering the cart context through recipes exhibited improved hit rates for items such as cooking ingredients, seasonings, and noodles compared with the method considering purchase tendencies through co-purchases. Furthermore, Fig. 8 shows that the method simultaneously considering the cart context and co-purchase tendencies also displays improved hit rates. In particular, for the items where considering the cart context is effective (Fig. 8), the proposed method outperformed the method relying solely on co-purchase tendencies, indicating that considering the cart context and co-purchase tendencies together allows each approach to cover products that the other fails to recommend appropriately.

Additionally, Fig. 8 reveals that the method simultaneously considering the cart context and co-purchase tendencies outperformed the method relying solely on recipes, particularly for items such as eggs, milk, and bean sprouts. These products are expected to be kept in stock in households irrespective of recipe considerations, so accounting for co-purchase tendencies improved their hit rates.

6 Discussion

6.1 Potential Impact of the Proposed Method

Based on the comparison of recommendation metrics presented in 5.2.1, the proposed method, which integrates co-purchase data and recipe information, outperforms the baseline methods in accuracy. Specifically, the graphs in 5.2.1 indicate that the random recommendation strategy has a minimal hit rate, essentially serving as a baseline with almost no predictive success. In contrast, the proposed method registers a marked improvement over the random baseline and an improvement of about 2% over the co-purchase baseline in Recall.

Under a conservative estimate in which each customer receives an average of 10 recommendations with a conversion rate of 5% and an average value of $5 per recommended item, this 2% improvement in Recall translates to an additional $0.05 in revenue per customer (see Footnote 3). While this may seem nominal, when scaled across the thousands of customers that a supermarket serves daily, the increment can result in a meaningful uplift in overall revenue.
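The arithmetic behind this estimate can be reproduced as follows. This sketch assumes that the 2% improvement is applied as a relative uplift to the expected revenue from recommendations (10 × 5% × $5 = $2.50 per customer); the figure of 3,000 customers per day is purely hypothetical and used only to illustrate scaling.

```python
def extra_revenue_per_customer(n_recommendations=10, conversion_rate=0.05,
                               item_value=5.0, uplift=0.02):
    """Additional revenue per customer from a relative uplift in hits.

    Assumption: the uplift scales the expected revenue from recommendations.
    """
    baseline = n_recommendations * conversion_rate * item_value  # $2.50
    return baseline * uplift


per_customer = extra_revenue_per_customer()
customers_per_day = 3000  # hypothetical store traffic
print(f"per customer: ${per_customer:.2f}")                    # $0.05
print(f"per day: ${per_customer * customers_per_day:.2f}")     # $150.00
```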

Moreover, the insights from the field indicate that the proposed method offers a level of improvement that is comparable with or marginally better than the currently applied recommendation systems. This underscores the incremental value of incorporating context-aware strategies within a practical retail context. A nuanced understanding of the dynamics between co-purchase patterns and recipe usage can refine customer recommendations slightly. While not dramatically transformative, this insight can incrementally enhance the shopping experience and contribute to business outcomes.

6.2 Implication of the Proposed Method

The proposed method has shown improved hit rates for items such as curry roux (seasoning) and pasta (noodles) compared with the baseline method, highlighting the effectiveness of considering the context within the cart. In the daily shopping behavior of users at supermarkets, there are products purchased without much consideration for recipes, such as eggs and milk, as well as items bought with a specific dish in mind. Conventional co-purchase-based recommendations cannot grasp the overall context of the current cart; they can only recommend items that tend to be purchased together with each individual item in the cart. Conversely, recommendation methods such as the proposed one, which also incorporates recipe information, can make informed suggestions based on an understanding of the type of dish the user is planning to make. The method thus adapts the recommendation to the context of each cart.

The observed improvement in accuracy with the proposed method suggests the importance of considering both macro-level purchase tendencies through co-purchase and the specific context of each cart for in-progress recommendations in a supermarket setting. The proposed method aligns with the consumer’s purchase intention by capturing both the general (macro-level) purchase trend and the individual (micro-level) cart context.

This research aims to identify effective contexts to consider for recommendations, as demonstrated in this study with the inclusion of recipes and in-cart contexts. Clarifying the utility of incorporating an in-cart context in the unique recommendation environment of a food supermarket is a key contribution of this study. With this understanding, we believe that various sequential recommendation techniques can be applied to further improve accuracy. In our experiments, we fixed the number of recommendations based on co-purchase patterns and recipes to five each, treating these as key hyperparameters. We acknowledge that optimizing these hyperparameters is crucial for enhancing effectiveness. However, it is noteworthy that the outcomes of such hyperparameter optimization may vary with changes in store or customer characteristics. Given our objective of validating the effectiveness of incorporating both co-purchase patterns and recipes, we evaluated results that equally weighed these factors.
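To make the fixed five-plus-five split concrete, the sketch below merges two ranked candidate lists into one recommendation list of 10 items. It is not the authors' implementation: skipping items already in the cart, removing duplicates across the two lists, and the function name `combine_recommendations` are all assumptions made for illustration.

```python
def combine_recommendations(co_purchase_ranked, recipe_ranked, cart, n_each=5):
    """Take n_each items from each ranked candidate list.

    Assumptions (not stated in the study): items already in the cart are
    skipped, and an item chosen from the co-purchase list is not repeated
    in the recipe-based part.
    """
    def take(ranked, exclude, n):
        chosen = []
        for item in ranked:
            if item not in exclude and item not in chosen:
                chosen.append(item)
            if len(chosen) == n:
                break
        return chosen

    in_cart = set(cart)
    from_co_purchase = take(co_purchase_ranked, in_cart, n_each)
    from_recipe = take(recipe_ranked, in_cart | set(from_co_purchase), n_each)
    return from_co_purchase + from_recipe


# Hypothetical cart and ranked candidates
cart = ["onion", "carrot"]
co_ranked = ["potato", "eggs", "milk", "onion", "bread", "tofu", "beef"]
recipe_ranked = ["curry roux", "potato", "beef", "rice", "pork", "garlic", "ginger"]
print(combine_recommendations(co_ranked, recipe_ranked, cart))
```

Exposing `n_each` as a parameter reflects the point that the split is a hyperparameter whose best value may differ by store or customer base.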

Furthermore, supermarket customers are likely to engage in unplanned purchases as they progress through their shopping journey, especially after achieving their main goal of ingredient procurement [8]. Therefore, it cannot always be assumed that providing “correct” recommendations, i.e., suggesting items likely to be purchased next, will consistently influence customer purchasing decisions. The experimental results suggest that, in the less frequent case of planned purchases, where customers shop with specific recipes in mind, providing “correct” recommendations could help prevent customers from forgetting essential items and spark interest in exploring new culinary arrangements.

Additionally, the experimental results suggest that considering cart contexts with recipe information may be effective in addressing related items for “impulse buying” and unplanned purchases, providing valuable insights for developing targeted recommendation systems in the supermarket context. Deepening the understanding of models that achieve “correct” recommendations in situations closer to planned purchases is considered a significant advancement. This understanding is essential for developing recommendations that effectively complement more unplanned and impulse purchases.

6.3 Practical Use

We examine the prerequisites for applying the proposed method to real-world scenarios and explore potential application domains. To apply the proposed method in practice, the store needs capabilities similar to those of the smart shopping carts already deployed in major supermarkets in Japan, which are equipped with LCD displays, integrate customer IDs, and support real-time network communication. In such an environment, there is no need for extensive deployment of costly RFID tags or beacons throughout the store. Instead, real-time purchase information can be transmitted through the network to the recommendation system, which displays the recommended products to customers as “recommendations.” While this display functionality is separate from the proposed method and is already realized by data providers, we believe that integrating it with the recommendation system makes application in real-world cases fully feasible.

We also consider the potential applicability of the proposed method’s conceptual framework to other domains. In domains such as electronics and fashion apparel, which feature higher-priced items, customer purchasing behavior is significantly influenced by previous purchase history. Whereas recipes served as the context in the supermarket setting, items that have not been previously purchased could be recommended in these domains by considering contexts such as essential household sets in electronics and outfit coordination in fashion apparel. However, these domains often involve purchases across multiple stores, which makes it harder to obtain comprehensive customer purchase histories than in the relatively contained environment of a supermarket.

6.4 Limitations

This research has three limitations. First, it relies on purchase history data for evaluation, and the recipes predicted by the proposed method lack explicit user feedback on whether the user actually planned or wished to cook them. To assess the effectiveness of the predicted recipes, empirical experiments or survey studies are necessary.

The second limitation is the lack of consideration for individual user purchase histories, user attributes, and regional factors. The method therefore does not incorporate individual purchasing characteristics, such as “tending to purchase ingredient \(x\) instead of \(y\)” or “preferring brand \(z\) for curry roux.” By incorporating such information, the method could evolve into a more personalized recommendation approach, potentially enhancing recommendation accuracy by considering individual purchasing intentions more effectively. Moreover, user preferences generally vary across demographics and regions, especially concerning recipes. In future work, we aim to enhance accuracy by incorporating demographic and regional factors into recipe-based recommendations.

Finally, the data used in this study are from a single store and a single period. Our analysis primarily focused on users aged 45–55 who may exhibit a preference for cooking over purchasing ready-made meals, as indicated by preliminary data insights. Future research should expand the scope of users and clarify the effectiveness of the method across different user demographics.

7 Conclusion

In this study, we developed a product recommendation method based on the items in users’ shopping carts, using both the co-purchase trends from purchase history data and the cart context derived from cooking recipes. When applied to the sequential purchase history data from a supermarket, the proposed method provided more accurate recommendations than methods that consider the co-purchase trend or the cart context separately. Additionally, supermarket customers are likely to purchase items in two scenarios: products restocked regardless of recipes and products purchased with a specific recipe in mind. The analysis of the results obtained from applying the proposed method indicates that recommending restocking items benefits from co-purchase-based methods, whereas considering the cart context is effective for items purchased with a specific purpose. The simultaneous consideration of both aspects could contribute to improving prediction accuracy.