1 Introduction

Artificial intelligence (AI) turns objects into machines that exhibit aspects of human intelligence [1]. The application to objects of a combination of AI techniques, such as automatic speech recognition (ASR) and natural language understanding (NLU), is having profound effects on the practice of marketing [2, 3]. AI-powered smartphones, smart homes, and smart speakers connect the various nodes of consumers’ lives into one ubiquitous experience while providing new forms of knowledge, entertainment, and shopping [4]. These modern machines are expected to collect relevant information about consumers, learn from such data, and identify consumption patterns, all to predict future individual behaviors [5, 6]. Through such a process, smart objects are able to personalize and contextualize experiences with the potential to alter both consumer [e.g., 7, 8] and managerial behavior [e.g., 9, 10].

The subset of AI-enabled machines with the fastest adoption rate, even above smartphones and tablets [11], is the so-called “voice assistants”. The term voice assistant (VA) refers to conversational agents that perform tasks with or for an individual, whether of a functional or social nature, and have the ability to self-improve their understanding of the interlocutor and context [12]. VAs, also called smart speaker assistants [13] or intelligent personal assistants [14], can take various forms, from in-home devices such as Bluetooth speakers (e.g., Amazon Echo) to built-in software agents for smartphones and computers (e.g., Apple Siri). All the major tech companies are commercializing in-place voice platforms, with a dominance of U.S.-based (Amazon Echo, Apple HomePod, Google Home) and China-based manufacturers (Alibaba Tmall Genie, Xiaomi Xiao AI, and Baidu Xiaodu). As of today, over 200 million in-home voice devices are installed globally, with mainland China registering a record year-on-year penetration growth of 166% in 2019 [15].

Although the world is confronted with the rise of AI-powered VAs, increasingly used for their shopping capabilities, the algorithms underlying the intelligence of VAs represent a “black box” that is difficult, if not impossible, to decode [16, 17]. These agents will incrementally influence consumer behaviors as they become better at learning consumer preferences and habits. In doing so, VAs may assume a central relational role in the consumer market and progressively mediate market interactions, with a severe impact on brand owners.

The ecosystem in which the firm operates has a profound impact on how the firm can make (and react to) changes [18]. From a network-based perspective [19], changes in the ecosystem structure (e.g., the introduction of a new node like Amazon Alexa) may cause further modification in terms of relationship establishment and technology development. Managers’ interpretations of the marketplace and sense-making of the exchange mechanisms related to activities, resources, and actors [20] are “theories-in-use,” posited to influence and guide companies’ marketing choices [21]. This research seeks to examine managers’ perceptions regarding the evolution of shopping-related VAs and their potential impacts on marketing practice. In particular, this article aims to support brand owners by explaining how VAs work and examining their effects on consumption.

2 Theoretical Background

2.1 The Rise of Voice Marketing and Voice Commerce

Although the most popular VA functions are simple commands like playing music, providing weather information, and setting up alarms [22], an increasing number of users are seeking more sophisticated experiences through the interaction with third-party (branded) apps.

From a marketing perspective, VAs represent a new direct-to-consumer touchpoint for brands [23] that incentivizes novel forms of interaction between consumers and brands [24]. Over 100,000 voice applications are available on the Alexa Skills Store, and 10,000 official apps (“actions”) can be downloaded for Google Home. These applications are used to perform a variety of tasks, such as hailing an Uber ride (utility), guiding the tasting of Talisker whiskey (entertainment), obtaining stain removal tips from Tide (informational), and learning all about Mars from NASA (educational). As the adoption of VAs grows, it is strategic for brands to develop a strong voice presence that fulfills the needs of sophisticated consumers as well as those with disabilities [25].

From a commercial standpoint, voice applications may represent a new revenue stream for premium services such as interactive stories or exclusive features. In-app (or in-skill) purchasing allows users to shop for products in a seamless fashion using payment options associated with their account. For instance, with Domino’s skill, customers can build a new pizza order, repurchase the most recent one, or track each stage of the delivery process. In this dynamic context, the number of consumers who have completed at least one purchase through a smart speaker is rising fast. However, the penetration of voice commerce varies widely among product categories. For instance, 21% of U.S. VA owners purchase entertainment such as music or movies, 8% household items, and 7% electronic devices [26].

The act of placing orders online using VAs goes under the names of “voice commerce,” “voice shopping,” or “v-commerce”. This phenomenon is not limited to the transactional phase of the purchasing process but concerns all the commerce capabilities that allow users to search for a product, listen to reviews, add items to a shopping list, track the order, access customer service, and so on. As such, it has the potential to substantially alter all the stages of the consumer journey, from search to (automated) repurchase [e.g., 27, 28].

2.2 Unique Characteristics of AI-Enabled Voice Assistants

A key driver of VAs’ rapid diffusion is the promise of fast, repeatable, and low-cost decision-making combined with an increased level of accuracy, achieved through network effects and feedback loops. Unlike other machines, VAs can naturally converse with users, contextually elaborate requests, and expand their knowledge while learning from mistakes.

VAs are built to mimic natural human-to-human interactions. As such, they react to interlocutors when their name is called and assume a persona (“I”) to refer to themselves. Similar to interpersonal relationships, VAs “memorize” relevant facts during the conversation to give a sense of continuity to the following interactions. The ability of VAs to display emotions through voice [4] and a sense of “spontaneity” using casual language or jokes makes them pleasant conversational partners [14].

VAs become context-aware when the interactions with humans, and other machines, are personalized to the current context. Providing adaptive and context-specific responses requires them to collect and process contextual clues such as the identity of the user, location of the device, time, and purchasing history [1]. Context-awareness is a constituting factor of VAs, allowing them to learn consumers’ personal preferences precisely and automate routines [29].

Increasingly, AI-powered assistants display self-learning abilities that make them different from the initial programming of their developers [30, 31]. Unsupervised AI systems, which operate without manual human annotation, allow VAs to detect unsatisfactory interactions or failures of understanding and automatically recover from these errors. For instance, if a user systematically misspells the name of a song, the system “learns” to address the issue and deploys corrections shortly after. While learning from mistakes, VAs expand their knowledge and reduce friction during interactions [32]. Automatically applying adjustments to a large number of queries using self-learning techniques allows VAs to develop at a faster pace. As such, a significant leap in VAs’ capabilities may be expected [10].

The diffusion of context-aware and self-learning conversational assistants may lead to a transformation of how shoppers search for product information and make purchase decisions.

2.3 The Agency Role of Voice Assistants

In today’s digital age, an increasing number of choices involve the use of AI-enabled agents. These agents assume different forms, from chatbot to newsfeed, and achieve economies of scale and scope for product-related searches as they self-improve the more they are used. Even search engines, the heart of web applications, have become increasingly personalized in their delivery of results, to the extent that they can also be considered recommendation agents [33]. VAs assume the role of an agent during the shopping process and beyond [34]. In fact, VAs can be conceptualized as interactive decision aid tools that generate personalized suggestions in an attempt to match products to consumers’ expressed preferences or implicit behaviors [36]. These algorithms are indispensable in online shopping environments where a potentially extensive set of alternatives is available. Research has shown that agents help consumers by reducing information overload and search complexity [34, 35]. As a result, they have the potential to improve the quality of consumer decisions [34, 37, 38, 39], which also increases consumer satisfaction and loyalty [40]. Corporate managers also positively embrace algorithms that lower search, transaction, and decision-making costs for consumers [41].

The relationship between VAs and their users can be described as an agency relationship. Characterized by information asymmetry, such a relationship requires the consumer to trust the ability of a VA to perform the requested tasks while taking its users’ interests into account [39]. However, given their central role in a complex ecosystem, a goal incongruence between the user and the VA provider might be expected [42]. As such, VAs need to be envisioned as a multi-stakeholder agent, where the strategic goals of the retailer, merchant, advertiser, and VA itself need to coexist [33].

In this context, VAs are expected to match consumer preferences more closely than consumers would if they chose independently [43], thanks to their ability to collect data systematically and silently over time [44].

2.4 Literature on Voice Assistants

The voice touchpoint is rapidly becoming a focal point in research because of its swift adoption and disruptive potential in buying dynamics. Recent studies have produced insights on the functional characteristics of VAs [13, 51, 52], their adoption and social roles [14, 53, 54], attitudes towards the technology [55, 56, 57], and applications for marketing [43, 58]. However, these investigations have not led to a deeper understanding of the consequences for consumers and brand owners. At the same time, studies on consumer technologies for shopping, such as personal computers, smartphones, and tablets, seem insufficient to understand the unique nature of this new channel and shopping method. In contrast to other consumer touchpoints, VAs are, in fact, designed to process one request at a time and on a turn-by-turn basis to decrease the speech recognition error rate coming from a possible voice overlap. This style of interaction represents a radical difference compared to communicatively richer devices like computers or smartphones, which present multiple pieces of information on a screen concurrently. Although exemplary research on consumer behavior and media possesses insights that are likely transferable to VAs, the peculiarities of this technology require new theories that are not yet fully developed [58].

Despite the significance of the phenomenon to our daily lives and the society at large, the study of VAs as a new shopping medium remains substantially unexplored. Munz and Morwitz [27] conducted six experiments demonstrating that information presented by voice is usually more difficult to process than when presented visually. Choice difficulty produces a lack of differentiation between auditory choice options that leads, according to the authors, to greater acceptance of the assistant’s recommendations, but also to a higher likelihood of deferring choice compared to options presented visually.

Sun [28] and her co-authors from academia and Alibaba Inc. used a natural experiment to explore the effect of voice shopping on consumer search and purchasing behavior. According to the researchers, the usage of Alibaba Genie leads consumers to purchase larger quantities and spend more. In terms of search behavior, after adopting the VA, consumers on average searched more categories and more products within the same category.

A conceptual paper by Labecki, Klaus, and Zaichkowsky [59] concludes that voice commerce presents both challenges and opportunities for brand owners. On the one hand, e-commerce has paved the way for voice shopping, leading consumers to overcome their initial reluctance to buy without directly seeing, touching, or smelling an object. On the other hand, voice technologies further limit the users’ senses, asking consumers to make shopping decisions without browsing photos, videos, or any other visual content.

VAs are “always on” devices that can process (or even automate) orders with a simple command and without requiring additional information such as credit card or address details. However, the uniqueness of VA technologies brings up a new set of interaction rules modeled after the active (and proactive) nature of these smart devices that companies need to learn how to master [60]. To the authors’ knowledge, there is no empirical academic work exploring the effect of voice commerce from a managerial perspective.

3 Methodology

Using a “theories-in-use” (TIU) approach [61], this research seeks to examine managers’ perceptions regarding the evolution of shopping-related VAs and their potential effects on the marketing practice. To study these fast-changing market dynamics within the context of voice commerce, we employed an inductive theory construction process based on a mixed-method approach (machine observations, in-depth interviews, and expert surveys). Theories-in-use research is a natural approach for creating theories best suited for addressing broad and profound questions still unexplored [70]. Data were collected and analyzed with the objective of identifying a novel “grounded theory” [62].

This research begins with the observation of “real-world marketing-relevant phenomena and then identify constructs and relationships that can explain them” [63, p. 3]. This phenomenon-construct mapping process was built on three distinct phases.

First, the authors observed Amazon Alexa’s choice architecture (U.S. version) to explore the machine behavior in the context of voice commerce. Data were systematically collected during over 100 direct interactions with Alexa and 40 video reviews on social media to determine the common machine behavior in the shopping mode (see Sect. 4.1).

Second, the authors conducted a total of 30 semi-structured in-depth interviews (December 2018) with corporate executives, consultants, and researchers selected for both their proven knowledge about AI technology and their diverse backgrounds (marketing and business, data science, IT). The TIU approach sees study participants as “theory holders” and “active partners” with whom researchers co-create new theory constructs [70]. One-to-one conversations with expert participants did not employ theoretical perspectives in order to facilitate the emergence of insights. Conversations were audio-taped, and transcriptions were analyzed using an inductive line-by-line coding approach in NVivo v12 for Mac. Following a constant comparative data analysis [62], codes were grouped into themes and then re-evaluated to ensure that they reflected data extracts. The emerging conceptual nodes were related to the dual agency and market mediator roles of VAs as well as the main consequences of the diffusion of shopping-related VAs for brands (see Table 1).

Third, a multi-industry expert survey of Swiss and European managers (N = 62) was conducted to collect managers’ present and future-oriented perspectives on how VAs’ diffusion may alter the path-to-purchase process and in-market dynamics. Expert respondents were recruited for their expertise in e-commerce through a database made available by NetComm Suisse (the largest eCommerce association in Switzerland) and participated on a voluntary basis. Building on the previous research phase (qualitative study), the recommendation agent literature, and VA studies, the authors analyzed the consequences that managers anticipate VA use will have on their: 1) customer base; 2) marketing and branding strategies; 3) future action plan. In each section of the questionnaire, the focus of the study, “in-home voice assistants like Amazon Echo, Google Home, and Apple HomePod,” was repeated to avoid confusion with other VA types.

The survey, composed of a total of 53 randomized questions (5-point Likert scale), was divided into five sections: i) anticipated effects on consumers; ii) anticipated effects on brands; iii) anticipated reactions from brands; iv) personal attitude; v) personal information. A link to the Qualtrics survey was shared via email with the managers, and data were analyzed using SPSS Statistics v26 for Mac. This research draws a comprehensive overview of the challenges and opportunities arising with the diffusion of VAs through the eyes of voice commerce-aware managers (see Sect. 4.2).

4 Findings

4.1 Choice Architecture on Voice Assistant

Voice assistant retailers, such as Amazon or Alibaba, function as “choice architects” that organize the general context in which people make decisions while voice shopping [47]. In the context of voice commerce, a distinction can be made between VAs designed to find the best-suited products (product brokering) and those designed to find the best-suited vendor (merchant brokering) [45]. A popular example of a product brokering recommendation agent is Amazon Alexa, whereas Google Home is a retailer-neutral provider that, at present, suggests a selection of merchants. For the purpose of this article, the authors focus on the former type.

Generally speaking, Alexa proposes two distinct interaction flows based on whether users: a) buy in a new product category or b) repurchase in the same category. At the same time, the voice commerce retailer fulfills three main search objectives: 1) broad match; 2) exact match; 3) automated match (Fig. 1).

Fig. 1. Shopping flow on Amazon Alexa

First, during a “broad match,” a user asks Alexa to recommend items for a generic product category, such as “batteries” or “toilet paper”. Because this is the first time the consumer purchases in the searched category using an Alexa device connected to an Amazon.com account, the VA interprets the individual’s request for products and makes recommendations accordingly. Alexa suggests the selected “top search result” in the form of a default option. A default is the choice option that individuals adopt unless they actively choose an alternative [46, 71]. This suggestion, similar to a pre-checked box on the Internet [48], is designed such that only a single item is presented to a consumer at a time. The sequential (versus simultaneous) presentation of additional items continues only if the consumer answers “No” to the assistant’s question, “Do you want to order this?”. The purchasing process ends when a user agrees to buy the item or quits the operation.

Second, whenever a user expresses a clear brand preference, for instance, “Duracell batteries” or “Colgate toothpaste,” Alexa performs a search for the “exact match”. In case the mentioned brand is not available, the VA will sequentially recommend new items until the user makes a definitive decision.

Lastly, whenever a user buys on Alexa, the information about the bought item is stored in the system. This information is retrieved in the following purchases within the same product category, whether the user had previously expressed a brand preference (exact match) or not (broad match). When the purchase history is available, Alexa performs an “automated match” that allows the user to complete the transaction or seek alternative products swiftly. In this case, the VA tells the user, “Based on your order history, I found one matching item [product information]. Shall I order it?”
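
The three search paths described above can be summarized in a short sketch. It is an illustration under stated assumptions, not Amazon's implementation: the names `Product`, `Shopper`, `rank_candidates`, and `voice_shop` are hypothetical, the order of the catalog stands in for the undisclosed ranking algorithm, and giving the order history priority over a spoken brand is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Product:
    name: str
    brand: str
    category: str


@dataclass
class Shopper:
    # Last item bought per category (a stand-in for the stored order history).
    order_history: Dict[str, Product] = field(default_factory=dict)


def rank_candidates(catalog: List[Product], category: str,
                    brand: Optional[str], shopper: Shopper) -> Tuple[str, List[Product]]:
    """Pick the search path and return the items in presentation order."""
    in_category = [p for p in catalog if p.category == category]
    previous = shopper.order_history.get(category)
    if previous is not None:
        # Automated match: the previously bought item becomes the default.
        return "automated", [previous] + [p for p in in_category if p != previous]
    if brand is not None:
        # Exact match: the requested brand first, other items as fallbacks.
        branded = [p for p in in_category if p.brand == brand]
        others = [p for p in in_category if p.brand != brand]
        return "exact", branded + others
    # Broad match: the top-ranked item is offered as the default option.
    return "broad", in_category


def voice_shop(catalog: List[Product], category: str, shopper: Shopper,
               answer: Callable[[Product], str],
               brand: Optional[str] = None) -> Tuple[str, Optional[Product]]:
    """Offer one item at a time until the shopper says "yes" or "quit"."""
    match_type, candidates = rank_candidates(catalog, category, brand, shopper)
    for product in candidates:
        reply = answer(product)  # stands in for "Do you want to order this?"
        if reply == "yes":
            shopper.order_history[category] = product  # remembered for next time
            return match_type, product
        if reply == "quit":
            break
        # Any other reply ("no") moves on to the next item.
    return match_type, None


# Hypothetical two-item catalog; "BrandA" is a made-up brand.
catalog = [
    Product("AA batteries 8-pack", "BrandA", "batteries"),
    Product("AA batteries 8-pack", "Duracell", "batteries"),
]
shopper = Shopper()
first = voice_shop(catalog, "batteries", shopper, lambda p: "yes")
again = voice_shop(catalog, "batteries", shopper, lambda p: "yes")
print(first[0], again[0])  # broad automated
```

Modeling the shopper's reply as a callback keeps the one-item-at-a-time presentation explicit: whichever item is ranked first is the only one heard unless the shopper actively declines it.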

In all three search paths, Alexa presents a single item at a time, sequentially. Consequently, the VA may reduce consumers’ visibility of product alternatives [49] and increase brand polarization while enhancing the risk of the so-called filter bubble or echo chamber effects [50]. In this context, product ranking algorithms on VAs assume an even more critical role than in other consumer applications. In fact, providing visibility on the lowest level of product alternatives (one option) is an additional step towards the complete choice delegation to the machine. Ultimately, shopping-related VAs might lead to a user’s loss of autonomy during the decision process, with implications for their assessment and decisions [44].

4.2 Expert Survey

Key findings of the expert survey are discussed in the following paragraphs. Study respondents (28% Female; M age = 40) are well distributed among frequency of VA usage (never or rarely vs. often), voice shopping usage (never vs. at least once), country of residence (Switzerland vs. Europe) as well as the company size (small vs. medium or large). These managers work for 10+ industries with a prevalence of “Consumer goods” (26%), “Marketing, communication, and media” (24%), “Fashion and retail” (21%). In addition, three sub-groups around the key areas of responsibility were formed: “Marketing and sales” (43%), “General management” (24%), and “Other function” (33%).

The authors analyze managers’ perceptions on a) disruptive potential of voice assistants for marketing; b) possible threats for brand owners; c) customer-centricity of VAs; d) effect of VAs on the shopping process; e) short- and long-term relevance of voice commerce.

4.3 The Disruptive Potential of Voice Assistants for Marketing

A review of the current studies on AI-powered VAs led to the definition of seven propositions representing, in the authors’ view, the potential drivers of a disruptive market change. Exploratory factor analysis shows the presence of two distinct factors influencing the perception of managers towards the potential disruption of voice commerce for marketing. Concerning factor one, “market mediation” (4 items, 45% variance, loadings from .675 to .865), a total of 87% of managers agree that VAs will become a “powerful marketing, sales, and distribution channel” (M = 4.19, SD = 0.74, P < .001) [43]. Roughly 80% of respondents see VAs as “technology increasingly able to influence consumer’s choices” (M = 4.02, SD = 0.89, P < .001) [58]. Three-fourths of the study participants believe that VAs will become a “new middleman between brands and consumers” (M = 3.90, SD = 1.04, P < .001) [23], and around two-thirds (67%) expect a “severe impact on consumer brands” (M = 3.81, SD = 1.07, P < .001) [24]. While functioning as a “salesperson,” VAs are redefining relationships among consumers and brands. Managers might be concerned by the rapid adoption of VAs as the bargaining power is shifting in favor of VA manufacturers [12, 23, 69]. This fear is primarily expressed towards Amazon, which accounts for nearly 45% of the total U.S. retail e-commerce [26].

In factor two, named “customer experience” (3 items, 20% variance, loadings from .662 to .836), 58% agree VAs will “remove friction from customer experience” (M = 3.53, SD = 1.02, P < .001) [72, 73]. Not surprisingly, managers who often use VAs (vs. never or rarely) believe to a greater degree that VAs will “win consumer’s trust better than other technologies” (P = .015) [43]. Overall, the study participants seem undecided about the VAs’ promise to “make the user smarter by adding a layer of intelligence”.

4.4 Possible Threats to Brand Owners

The outcome of the second phase of this research (in-depth interviews) is summarized in six propositions that reflect the primary concern of AI-aware experts connected to the diffusion of voice commerce (Table 1).

Search algorithms represent the gatekeeper for modern companies. Compared to display-enabled smart devices, the optimization of voice search results on VAs presents structural challenges due to the nature of consumer interactions and information framing. As such, nearly two-thirds (65%) of managers believe that brands will “have reduced visibility on voice assistants compared to other touchpoints” (M = 3.65, SD = 1.11, P < .001). Participants’ viewpoints appear to differ among functions and seniority levels. Respondents in the “Marketing and sales” function believe to a greater extent than “General management” that brands will have reduced visibility on VAs (P = .074). At the same time, “C-level” is less concerned than “Mid-management” on the same topic (P = .069).

Table 1. Experts’ view on possible threats for brand owners.

Managers are aware that Amazon’s biased placement on VAs of its private labels might challenge national brands. A total of 71% of respondents “somewhat agree” or “strongly agree” that Alexa will “disproportionally place its private labels while penalizing other consumer brands” (M = 3.76, SD = 1.00, P < .001). With Amazon’s private label portfolio growing to 135 brands and 330 exclusive brands, an increasing number of product categories may be gradually affected by private labels. Those participants that have never purchased on VAs (vs. at least once) think, to a greater extent, that Alexa will disproportionally place its private label compared to other consumer brands (P = .061). In this context, search advertising in the form of voice assumes a paramount role in the marketing practice. Compared to web browser navigation, where search engines can display ten results per page and up to five advertisements, VAs can only suggest a few results with limited space for sponsored messages. This scarcity of space might increase competition among advertisers with a consequent rise in advertising costs. Among respondents, 61% agree that “advertising cost on voice assistants would be higher than web-based advertising because of the limited space available for sponsored messages (paid recommendations)” (M = 3.65, SD = 1.21, P < .001).

Furthermore, voice assistant retailers might function as a gatekeeper of consumer information while improving their power position in the market. While “General management” has a moderate view on this topic, “Marketing and sales” managers think that VA manufacturers will not have access to relevant consumer information (P = .042). If VA manufacturers decide not to share consumer data and insights with brand owners, there might be a higher likelihood of cross-category “commoditization” [64]. Quoting the conversation with one expert, “Alexa does commoditize entire product categories, all the way from diamonds to detergents”. A total of 44% of managers “somewhat disagree” or “strongly disagree” with the statement that “low involvement product categories will be the only affected by the voice assistant’s diffusion” (M = 2.68, SD = 1.18, P = .036). It is possible to observe a substantial divergence of opinion among functions and seniority levels. While “General management” believes that voice commerce will only affect low involvement product categories, “Marketing and sales” (P = .034) and “Other functions” (P = .054) do not believe the disruption is limited to these categories. Also, unlike “Mid-management,” “C-level” does not consider this effect “universal” across categories (P = .052).

Moreover, three-fourths of respondents believe that VAs will “ongoingly re-evaluate the consumer’s product choice and suggest better alternatives” (M = 3.85, SD = 0.94, P < .001). In a context in which brands are required to continually justify their positions, competition might increase.

4.5 Customer-Centricity of Voice Assistants

This section of the questionnaire deals with managers’ perceptions of the level of customer-centricity demonstrated by VAs. Questions are formed around three key concepts in human-computer interaction: trusting belief [65, 66], perceived personalization [67], and intention to adopt as a delegated agent [67].

In terms of competence, 53% of study participants trust that VAs possess a “good knowledge of products” (M = 3.34, SD = 1.17, P = .026). Unlike “Marketing and sales” (P = .091) and “Other functions” (P < .001), “General management” does not believe that VAs show a good understanding of the products.
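
The source does not name the test behind its P values. The reported figures are consistent with a one-sample t-test of each item mean against the neutral midpoint of the scale (3 on the 5-point scale), so the sketch below assumes that test and assumes that all N = 62 respondents answered the item. The function names are hypothetical, and the p-value is obtained by numerically integrating the t density rather than by calling a statistics package.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma keeps the normalizing constant stable for larger df.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def one_sample_t(mean: float, sd: float, n: int,
                 midpoint: float = 3.0, steps: int = 2000):
    """Return (t, two-sided p) for H0: the population mean equals midpoint."""
    t = (mean - midpoint) / (sd / math.sqrt(n))
    df = n - 1
    # Simpson's rule (steps must be even) for the density mass on [0, |t|].
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3
    # By symmetry, the two tails hold whatever lies outside [-|t|, |t|].
    p = max(0.0, 1.0 - 2.0 * central_mass)
    return t, p


# Summary statistics reported for the "good knowledge of products" item.
t_stat, p_value = one_sample_t(mean=3.34, sd=1.17, n=62)
print(round(t_stat, 2), round(p_value, 3))
```

With the values reported above for “good knowledge of products” (M = 3.34, SD = 1.17), the sketch gives a t of about 2.29 on 61 degrees of freedom and a two-sided p between .02 and .03, in line with the reported P = .026.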

On average, respondents do not believe that VAs put the “customer’s interest first” (M = 2.73, SD = 1.23, P = .084). This sentiment is stronger in managers that “never or rarely” use a VA (P = .038), “never” purchase on VAs (P = .031), and work for “large organizations” with more than 10,000 employees (P = .014). Remarkably, over 85% of the respondents “somewhat agree” or “strongly agree” that VAs “want to understand customer’s needs and preferences” (M = 4.24, SD = 0.92, P < .001), showing a trusting belief in benevolence, i.e., that the agent cares about the user and acts in the user’s interest.

Furthermore, when evaluating the integrity of VAs, nearly two-thirds (65%) declared they “somewhat disagree” or “strongly disagree” with the statement that VAs “provide unbiased product recommendations” (M = 2.21, SD = 1.20, P < .001). In particular, respondents who have never purchased through a VA (vs. purchased at least once) believe that VAs provide unfair recommendations to a higher extent (P = .032).

Overall, managers do not believe that VAs act with integrity, i.e., that they adhere to a set of principles that the user finds acceptable (M = 2.66, SD = 1.11, P = .020), with “Consumer goods” managers showing greater skepticism towards VAs than managers working for “Marketing, communication, and media” (P < .001), “Fashion and retail” (P = .009), and “Pharma and health” (P = .070) companies.

In terms of perceived personalization, that is, the extent to which VAs understand and represent a user’s personal opinion, 63% of managers agree that VAs “provide tailored advice to customers” (M = 3.68, SD = 1.00, P < .001). The assistive nature of the interaction with VAs implies a delegation of responsibility, at least in the absence of explicit requests by the user (exact match). In this respect, 55% of respondents agree that consumers will “delegate to VAs the repurchase of as many products as they can in an automated way” (M = 3.42, SD = 1.15, P = .006). In essence, managers seem to have realized that the ultimate goal of voice commerce is the automation of the buying experience.

4.6 Effect of Voice Assistants on the Shopping Process

The hypotheses presented by Häubl and Trifts [34] in their seminal research on the effect of interactive decision aids in an online shopping environment were adapted to capture managers’ opinions on the effect of VAs on shopping behavior. In particular, managers shared their opinion on three measures: the amount of search for product information, consideration set size, and decision quality.

According to 77% of respondents, the use of VAs for shopping “reduces the number of products for which detailed information is obtained” (M = 3.90, SD = 0.95, P < .001). In line with the literature [68], 81% of managers see a risk in the “reduction of the number of alternatives seriously considered by the user” (consideration set size) (M = 3.98, SD = 0.91, P < .001). Should this expectation materialize, brand owners might witness a higher tendency of users to select the recommended default option.

In terms of decision quality, roughly two-thirds of participants (68%) believe that voice commerce will “reduce the user’s probability of switching to another brand, after making the initial purchase decision” (M = 3.56, SD = 1.05, P < .001). In other words, there is an assumption that VAs may introduce lock-in mechanisms that will reduce the user’s variety seeking, with potential visibility issues for challenger brands. In the same direction, two-thirds of respondents believe that voice commerce leads to a “reduction in consumers’ product choice autonomy” (M = 3.63, SD = 1.04, P < .001).

Overall, managers working for “Consumer goods” companies are particularly doubtful about the ability of VAs to improve consumers’ decision-making. Unlike “Consultant” (P = .038) and “Fashion and retailer” (P = .045) managers, they do not believe that VAs will lead to a “higher degree of confidence in the shopping decisions.”

4.7 Short- and Long-Term Relevance of Voice Commerce

When assessing managers’ overall perceptions of VA-related challenges and opportunities, nearly three-fourths of the respondents (74%) believe that voice commerce represents a “great opportunity for their brand” (M = 3.87, SD = 0.97, P < .001). At the same time, 77% consider it a significant challenge (M = 3.97, SD = 0.75, P < .001). This dual mindset captures the current state of mind of companies that see voice commerce as a revolution in marketing and brand management, but also as a phenomenon with potentially detrimental consequences.

In terms of the relevance of the phenomenon, roughly 70% of managers believe that voice commerce is important for their industry (M = 3.82, SD = 0.98, P < .001) and the future of their company (M = 3.92, SD = 0.93, P < .001). Regardless of industry and function, respondents with a high frequency of VA usage (vs. never or rarely) see the practice of shopping using in-home VAs as more strategic, both in the short term (1 to 3 years) to win market share and in the long term (3 to 5 years) for the future of their company.

Overall, managers operating in Switzerland find this technological trend less critical for their country compared to European managers (P = .066). At the same time, European managers believe to a greater extent than the Swiss that voice commerce is “important to win shares in the short-term” (P = .053).

Figure 2 shows the activities that managers expect their customers to conduct using VAs. Almost three-fourths of respondents, coming from both B2B and B2C organizations, believe that customers will “Reorder products,” followed by “Track orders of products” (71%), “Research products” (68%), and “Buy products” (66%). Although nearly 50% of the managers expect customers to carry out all the listed activities, most of the B2B organizations unsurprisingly give less weight to activities such as “Automate the repurchase of products (e.g., subscriptions)” (53%), “Buy services related to products” (48%), and “Retrieve deals, sales, and promotions” (47%).

Fig. 2. How managers expect their customers to use VAs.

Based on their current understanding of the challenges and opportunities that voice commerce brings, managers believe that the best way for their company to react is to “Employ a voice optimization strategy to improve search ranking results” (40%), “Build a voice presence to develop voice-related experiences” (39%), and “Invest in advertising on voice assistant platform - paid recommendations” (35%). When listing their company’s top 3 priorities (Fig. 3), respondents generally attributed lower importance to branding-related activities, with “Create a voice identity” ranking highest (31%), followed by “Build relevant branded apps (skills or actions)” (27%), and “Invest in branding activities to increase consumer’s brand recall” (27%).

Fig. 3. Managers’ anticipated best way to react to the VA’s diffusion.

5 Discussion

The main goal of this paper was to investigate how managers perceive the evolution of voice assistants and their potential effects on marketing practice. There is no doubt that managers look at VAs as a disruptive technology able to radically change the ecosystem in which their company operates. In the managers’ view, VAs may assume a central relational role in the consumer market and progressively mediate market interactions, with a severe impact on brand owners. VAs represent not only a powerful new marketing, sales, and distribution channel but also a middleman between brands and consumers, which will increasingly influence consumers’ choices. However, this research found that managers often diverge in their opinions on the basis of four key factors: industry, function, seniority level, and familiarity with voice commerce.

Among the different industries, “Consumer goods” managers show marked skepticism towards VAs. In particular, they place less trust in the machine, especially in the area of integrity, that is, the ability of VAs to adhere to a set of principles that the user finds acceptable and to provide unbiased product recommendations. Unlike managers in other industries, in particular “Fashion and retailer,” managers working for consumer packaged goods (CPG) firms do not believe that VAs will improve consumers’ decision quality, either in terms of the degree of confidence in the shopping decisions or in terms of overall decision-making abilities. The divergence of opinion between these two industries might be driven by the historical dependence of consumer goods companies on intermediaries (retailers) functioning as the gatekeepers of their distribution. Fashion managers appear significantly less concerned about potentially unfair practices by VA manufacturers and the influence that the VA might exercise on consumers’ decision-making. This might be due to the importance of a multi-sensory experience during the selection process. Besides, automated purchases through product subscription seem less relevant for the fashion industry than for others because only a few selected items have the potential to be automatically reordered (e.g., underwear or sneakers). In that context, also given the acceleration of digital practices, the fashion industry might look at the evolution of VAs more from a voice marketing perspective than from a transactional standpoint [75, 76].

Participants’ perceptions differ significantly between “Marketing and sales” managers and those in “General management”. Marketers appear to feel threatened by the uncertain effect that VAs may have on their brands. Their higher exposure to the practicalities of new consumer touchpoints and applications, compared to general managers, might be reflected in the belief that brands will have reduced visibility on VAs and limited access to relevant consumer information. Unlike general managers, marketers believe to a greater extent that voice commerce will affect several product categories and not only low-involvement ones. This divergence stems mostly from a disagreement on the current competence of VAs. Marketers believe that VAs have a good understanding of products, while general managers appear hesitant about such capabilities. In the context of voice commerce, these two groups of managers have a different sense of urgency, which might result in a strategic misalignment.

In terms of seniority level, a similar dynamic can be noticed between “C-level” executives and “Mid-management”. Senior executives are less worried about the impact of VAs on their brands. For instance, they do not consider Amazon’s biased placement of its private labels a potential challenge to national brands. Additionally, they do not believe that VAs will affect product categories other than low-involvement ones.

A key factor influencing the opinion of the study participants is familiarity with voice commerce, defined here as having purchased at least once using a voice assistant. Voice commerce users consider AI-enabled VAs a strategic touchpoint with the potential to drive market share growth in the short term (1 to 3 years). Extant research posits that companies tend to overestimate the benefit of AI in the short term but underestimate it in the long term [5]. However, non-users of voice commerce seem to minimize the short-term impact of voice shopping while predicting its long-term effect.

Figure 4 shows a conceptual framework underlining the relevance of being a voice commerce user in the formation of beliefs, attitudes, intentions, and behaviors towards this phenomenon. Building on the theory of reasoned action (TRA) [74], the authors show that managers’ usage of voice commerce is correlated with their a) trust towards VAs (beliefs), b) perception of VAs’ personalization and of consumers’ intention to delegate the buying process to the VA (attitude), c) view of VAs as a disruptive technology, and d) view of the importance of voice commerce for their industry and the opportunity for their brand (intention). Direct experience with shopping-enabled VAs also contributes to i) a dual short- and long-term strategic focus, as well as ii) more articulated planning to act (and react) to change (behavior). In particular, voice commerce users expect their companies to structure an action plan according to a mix of strategic (e.g., build a voice presence), tactical (e.g., buy voice search ads), and organizational actions (e.g., run organization-wide training on voice).

Fig. 4. Conceptual framework. Correlation between “voice commerce usage” and any variable in the model (2-tailed). P-value reported as P < .05 (*), P < .01 (**), P < .001 (***).

6 Conclusions

This study was motivated by the swift adoption of shopping-related voice assistants and their potential effect on marketing practice. We aimed to improve the understanding of managers’ perceptions regarding VAs and their link to future marketing choices. In the absence of extant academic research, we used an inductive theory construction process based on a theories-in-use approach. Beginning with the observation of a real-world marketing-relevant phenomenon, we employed a phenomenon-construct mapping process following a mixed-method approach. We studied the phenomenon of voice commerce through the eyes of AI experts and voice-aware managers in three distinct data collection phases. First, systematic machine behavior observations uncovered the unique characteristics of voice shopping. Second, in-depth interviews with executives outlined brand owners’ current challenges and opportunities in the context of voice commerce. Third, an expert survey with international managers revealed the expected impact of voice assistants on the shopping process.

We conclude that managers have a shared understanding of voice commerce’s challenges and opportunities for their brands. A dual mindset sees voice commerce as a revolution in marketing and brand management, but also as a phenomenon with potentially detrimental consequences. Our findings show that managers often diverge in their opinions on the basis of four key factors: industry, function, seniority level, and familiarity with voice commerce.

This study sheds light on the managers’ perspective on this relevant topic and provides further structure and guidance to brand owners. However, this research is not without limitations. Data quality was prioritized over quantity, resulting in a reduced sample size (N = 62). Future cross-industry studies covering multiple functions and seniority levels should better represent the population of managers with an adequate sample size. Researchers and marketers are urged to further explore this emerging stream of research to anticipate the effects of voice commerce on both consumers and brands.

7 Managerial Implications

Managers’ interpretations and individual sense-making of the marketplace are posited to influence and guide companies’ marketing choices. Within the context of voice shopping, brand owners need to carefully monitor its evolution to better understand the effects on their consumers’ behavior while preparing to (re)act. The results of this research offer support to brand owners for developing resilient and sustainable brands in the context of voice commerce. In particular, this article explains how VAs work and examines their effects on consumption.

In order to successfully face a potential market disruption coming from the diffusion of VAs, managers should employ a series of sequential actions to spread the awareness of voice technologies across their organizations.

First, managers are called to understand the unique characteristics of VAs, together with their agency and market mediation role. As such, brand owners need to explore how the VA’s choice architecture can influence the path-to-purchase process. In particular, they need to anticipate their consumers’ reactions to the machine behavior when searching for products according to a broad, exact, and automated match. Understanding the potential effects of default or lock-in mechanisms on the customer base is deemed fundamental. However, this exploratory task is made more difficult by the continuous evolution of VAs and their choice architecture.

Second, brand owners need to explore opportunities and challenges emerging from the dissemination of VAs. As consumers’ relationships with VAs shift from limited influence to steadfast dependency, brands need to understand how to redesign their value chain [24]. The objective of their brand should be to gain (or protect) a “top of mind” position while building strong relationships with consumers. When consumers are able to express their brand preferences and have a strong attachment to the brand, they become less conditioned by the machine behavior. This brand building (or strengthening) process does not happen on the voice touchpoint in isolation but requires a brand activation across channels. Paradoxically, companies are called to further invest in traditional branding activities that drive brand awareness and recall before they can benefit from the fast growth of voice commerce [23].

Third, managers should understand the divergent views across the organization about the evolution of VAs for marketing. Creating a voice-first strategy that includes a mix of strategic, tactical, and organizational actions requires managers to reach an internal alignment on the strategic relevance of VAs. This study shows the importance of gaining direct experience with the voice shopping process. Not only do voice commerce users have a more optimistic view of VAs, in terms of beliefs, attitudes, and intentions, but they also have a higher sense of urgency (behavior). Since a short-term focus might help brand owners to react faster to market changes, companies need to foster the usage of voice shopping across the organization. Such direct experience might help the organization to acquire maturity and examine how to leverage voice marketing and voice commerce to grow its brand(s) sustainably.

In light of these changes, which are exogenous to the firm, researchers are called to further study the interplay between consumers and brands in response to “machine behaviors” [31].