Introduction

Advances in information and communication technology have made the Internet a convenient place to obtain information. Even as technology has expanded the range of tools for finding information online, such as search engines and generative (Gen) artificial intelligence (AI) chatbots, people continue to use social question-and-answer (Q&A) platforms (Fang and Zhang, 2019). While global social Q&A platforms, such as Google Answers and Yahoo! Answers, have experienced severe downturns and shutdowns, social Q&A services that specialise in specific regions or languages remain popular, such as Naver Knowledge-iN (kin.naver.com) in South Korea, Baidu Zhidao (zhidao.baidu.com) in China, and Gutefrage (gutefrage.net) in Germany (Zheng et al., 2021).

While people engage with social Q&A platforms primarily to obtain answers to their questions, other factors are known to motivate people to ask questions on these platforms as well, including user interaction, emotional attachment to the platform, and social factors such as a shared language and vision (Gazan, 2010; Fang and Zhang, 2019; Wang et al., 2021; Zheng et al., 2021). Social Q&A platforms are also evolving from services in which people participate for non-monetary rewards, such as reputation, to paid services in which askers obtain more professional answers and answerers receive financial incentives (Liu et al., 2021; Ye et al., 2021). These characteristics of human interaction are difficult for Gen AI chatbots to replace completely, indicating that social Q&A platforms will persist for the foreseeable future.

User-generated questions and answers are the core content of social Q&A platforms. Users answer questions based on their expertise and experience and share rich, useful information. These answers are perceived to provide valuable knowledge and to be helpful to other users. Answerers invest time and effort into carefully answering questions from other users and strive to maintain originality and quality (Fang and Zhang, 2019). Among the numerous users of social Q&A platforms, only a small group produces high-quality content (i.e., answers) and thus exhibits different behaviour on these platforms compared with other users (Shen et al., 2014; Zheng et al., 2021). Social Q&A platforms should manage these few answerers (i.e., the best contributors) well because they play a major role in providing the platforms' core content, supplying valuable information, and helping users (Nam et al., 2009). Their work contributes significantly to the overall quality and value of social Q&A platforms. In addition, managing the best contributors helps to maintain and support their passion and creativity (Zheng et al., 2021).

Furthermore, a small number of the best contributors on social Q&A platforms have the potential to become entrepreneurs and therefore play a vital role that platforms need to manage well. An entrepreneur is an individual who is responsible for starting, managing, and operating a business or venture to make a profit (Portuguez et al., 2021). The best answerers on social Q&A platforms can be creative in building their reputation and recognition around a skill and an area of expertise while providing valuable information and helping users. Thus, they may have future opportunities to translate their expertise into commercial value and pursue independent businesses or entrepreneurship. For the small group of best answerers with the potential to become entrepreneurs, networking and building trust through social Q&A platforms can be an important starting point for developing future business opportunities.

Thus, the best answerers on social Q&A platforms are important not only as content generators but also as potential entrepreneurs. Therefore, the characteristics that make social Q&A platform users the best answerers need to be understood. This study explored this topic based on the following research question:

  • RQ: What are the characteristics of the best answerers on Korean social Q&A platforms?

Presumably, the best answerers perform well by writing quality answers that social Q&A users rate highly. Therefore, in this study, we consider a user a better answerer if askers select their answers at a high rate. We used data from Naver Knowledge-iN, a popular South Korean social Q&A platform. We collected the best answerers’ profile information and activity data and performed hierarchical regression analysis to identify their characteristics. The results indicate higher answerer performance under the following conditions: (1) longer answers, (2) greater similarity between questions and answers, (3) a higher number of credentials beyond social Q&A platforms, (4) answers across diverse subject areas, and (5) active participation as both asker and answerer. Based on this study’s findings, users with such characteristics can be managed separately as potential entrepreneurs to encourage them to voluntarily and continuously produce quality content (i.e., answers) on social Q&A platforms.

The remainder of this paper is organised as follows. Section 2 reviews the literature on common methods for searching for answers online and social Q&A entrepreneurs, to formulate our hypotheses. Section 3 describes the research data, variables, and methodology, and Section 4 reports the results. Finally, Section 5 discusses the significance of this study and future research directions.

Literature review

Obtaining answers online

People actively utilise various tools to obtain answers online, including search engines, social Q&A platforms, and more recently, Gen AI (Table 1). These tools differ in the language used when searching for answers. For example, search engines are keyword-based and require users to enter a refined word or phrase to search for the information, and then search repeatedly until the desired information is found (Shen and Wang, 2017). In contrast, social Q&A platforms and Gen AI chatbots based on natural language processing (NLP) allow people to search for information as if they are having a conversation in their own language. Gen AI chatbots are AI-powered systems that generate human-like responses in conversational interactions (Nicolescu and Tudorache, 2022). While previous large language models have been able to perform various NLP tasks, recent Gen AI chatbots are optimised for conversations and particularly adept at talking in a human-like manner (De Angelis et al., 2023). Unlike rule-based chatbots that rely on predefined rules or templates, Gen AI chatbots can generate unique and contextualised responses based on input, allowing for more natural and informative interactions with users (Adamopoulou and Moussiades, 2020). Thus, people can use natural language to search for information on social Q&A platforms and with Gen AI chatbots.

Table 1 Common ways to get answers online.

However, social Q&A platforms and Gen AI chatbots serve different purposes and have distinct characteristics regarding how people ask questions. First, Gen AI chatbots focus on generating new content based on patterns and structures learned from existing data (Ray, 2023). When someone asks a Gen AI chatbot a question, the system generates a response or output based on its learning, and these answers are generated by an algorithm rather than based on human expertise or personal experience (Koubaa et al., 2023). However, on social Q&A platforms, real people ask and answer the questions. Users of these platforms ask questions to receive answers, insights, opinions, or solutions from other users who have knowledge or experience in a related field, and the answers provided on social Q&A platforms are from people who share their expertise, insights, and personal experiences (Fang and Zhang, 2019; Zheng et al., 2021).

Second, Gen AI chatbots learn from vast amounts of data collected from various sources on the Internet. The responses they provide are based on the patterns and information present in the training data; however, their accuracy and reliability may vary because the information generated by the model cannot be verified (Koubaa et al., 2023; Rudolph et al., 2024). In contrast, on social Q&A platforms, answerers provide information directly based on their knowledge, expertise, or experiences, and users can evaluate credibility and accuracy based on factors such as the contributor’s reputation, expertise, and community feedback (Jin et al., 2015).

Third, when people ask Gen AI chatbots questions, responses are typically generated through a one-way process. People ask questions and the Gen AI model responds without further interaction or explanation. Gen AI models may be unable to engage in dynamic conversations or seek additional information to provide more accurate or contextualised answers (Koubaa et al., 2023). However, on social Q&A platforms, users can have interactive discussions with the answerer, such as by asking follow-up questions, seeking clarification, and engaging in back-and-forth conversations (Jin et al., 2015).

Finally, Gen AI chatbots lack the personal experience, opinions, and domain-specific expertise of human contributors (Koubaa et al., 2023). Although Gen AI chatbots can generate content based on patterns, they may not provide the same level of depth, context, or domain expertise as human-authored answers. In contrast, human contributors on social Q&A platforms can provide answers that reflect their knowledge, expertise, and personal insights (Fang and Zhang, 2019; Zheng et al., 2021), share real-world stories, offer nuanced perspectives, and provide contextual recommendations beyond what Gen AI chatbots can offer.

Top contributors on social Q&A platforms

Contributors are the main content producers on social Q&A platforms because the questions and answers users voluntarily contribute are the core content. The quality of social Q&A platform services is also significantly affected by how many high-quality answers are consistently generated (Fang and Zhang, 2019); thus, social Q&A platforms want to know which users produce quality answers. Activity on social Q&A platforms is heavy-tailed, with only a few answerers accounting for many of the answers (Nam et al., 2009), and these answerers behave differently from other answerers or lurkers (Shen et al., 2014; Zheng et al., 2021). Although some users answer tens of thousands of questions, askers rarely ask more than a few hundred questions (Nam et al., 2009). Therefore, while it is unlikely that high-contributing askers (i.e., contributors who ask many questions) contribute significantly to site activity, high-contributing answerers (i.e., contributors who answer many questions) can help many askers. Furthermore, these few answerers differ from others in both the quantity and quality of their answers (Nam et al., 2009). Therefore, social Q&A platforms rank users and attempt to manage their best answerers as content providers (Fang and Zhang, 2019; Jin et al., 2016; Nam et al., 2009; Yan and Zhou, 2015; Zheng et al., 2021).

In addition to their ability to provide content, the best answerers on social Q&A platforms have the potential to become entrepreneurs based on their expertise, reputation, and ability to create content, similar to influencers on social media platforms (Bi and Liu, 2022). On social Q&A platforms, the best answerers are free to create content (i.e., answers) based on their expertise and knowledge without being bound by specific guidelines or content requirements set by the platform. This independence allows them to develop unique approaches, styles, and brands as content creators. In addition, the best answerers can build strong reputations both on and off the platform through consistent, high-quality contributions. Such a reputation can attract devoted followers and build trust and credibility among audiences. Building a reputation is a crucial aspect of starting a business because it opens the door to various opportunities (Xie and Lv, 2018). Although immediate monetisation options within social Q&A platforms may be limited, the best answerers can leverage their expertise and reputation to explore different revenue streams. For example, they can expand their presence to other platforms, such as blogs or social media accounts, opening up revenue streams that include sponsored content, affiliate marketing, digital product sales, and consulting services. Furthermore, the best answerers can consider expanding their business beyond content creation as they gain recognition and build their brand. They can also look for opportunities for speaking engagements, workshops, training programmes, or collaborations with other businesses or organisations in their niche. Expanding into various entrepreneurial ventures can help answerers grow and diversify their businesses.

Furthermore, if the same platform offers free and paid social Q&A services, the best answerers with entrepreneurship potential are likely to be active in both. The best answerers can contribute their expertise and knowledge to a paid social Q&A service and be compensated for it, providing them the opportunity to leverage the reputation and knowledge they have built on the free social Q&A service to provide specialised services and work independently. This allows answerers to create greater value and explore business opportunities by responding to consumer needs in their areas of expertise. The best answerers who participate in both free and paid social Q&A services on the same platform can be considered platform-dependent entrepreneurs. A platform-dependent entrepreneur builds or relies heavily on a specific platform or technology infrastructure, usually provided by another company or organisation (Cenamor et al., 2019; Cutolo and Kenney, 2021; Nambisan and Baron, 2021). Platform-dependent entrepreneurs leverage platforms to create, scale, and deliver products or services without having to build everything from scratch (Cenamor et al., 2019; Cutolo and Kenney, 2021; Nambisan and Baron, 2021). Thus, platform-dependent entrepreneurs are characterised by their dependence on a platform and cannot leave it easily. From a social Q&A platform perspective, encouraging the best answerers on a free social Q&A service to switch to a paid service can strengthen platform loyalty and attract and retain users who offer expertise through a service. Social Q&A platforms can encourage users to continue providing free services while building their expertise by earning rewards through paid services. Therefore, social Q&A platforms can support their best answerers to ensure the continued growth and development of the platform.

Hypotheses

Social Q&A platforms have their own methods of determining the best answerers among a large group of users. For example, social Q&A platforms can rank members based on certain criteria, such as the number of times they log onto the platform, write a question or answer, or upvote or downvote other answers (Fang and Zhang, 2019; Nam et al., 2009; Yan and Zhou, 2015; Zheng et al., 2021). These methods are frequently updated; however, what is important for social Q&A platforms is that users consider answers written by a particular user as reliable and useful. In this context, most social Q&A platforms have a feature that allows askers to select the answer they find the most useful among many (Jin et al., 2016), allowing the best answerer to be defined as the user whose answers are selected the most by askers.

Among many answers, askers consider various aspects of an answer and select the best. This decision-making process can be described using the dual-process theory of information processing (James, 2007), which suggests that decision-making and information processing involve two distinct cognitive processes: systematic processing and heuristic processing. Systematic processing, or the central route, involves deliberate thinking, logical reasoning, and cognitive effort to engage in systematic analysis, problem-solving, and decision-making based on available information. Heuristic processing, or the peripheral route, relies on heuristics, associations, and intuitive judgments to make decisions and is driven by emotions, biases, and past experiences. This process generates rapid responses and intuitive judgments without conscious reasoning. These dual processes work concurrently rather than separately during information processing. When selecting the best answers on a social Q&A platform, askers can easily see both the answer and related information (e.g., author, date, and comments) because the platform displays these together. Therefore, systematic processing allows askers to decide whether to select an answer as the best based on the response itself, while heuristic processing allows askers to decide based on the related information (Jin et al., 2016). In this context, we formulated hypotheses based on four primary perspectives to comprehend the characteristics of the best answerers: the content of the answer, the characteristics of the answer’s author, the level of activity exhibited by the answerer as a platform user, and social endorsement. These factors are pivotal in determining which answers are chosen as the best on social Q&A sites.

First, the answer content itself has various aspects, including length, immediacy, and similarity. The longer the answer, the more likely the asker is to select it as the best answer (Kim and Oh, 2009; Qi et al., 2021). More immediate answers are provided almost as soon as the question is posted, making it more likely that askers will select them as the best (Fu and Oh, 2019; Gazan, 2010; Jin et al., 2016). Regarding similarity, the more similar an answerer’s response is to the question, the more likely the asker is to choose it as the best (Jin et al., 2016; Fu and Oh, 2019). Finally, we regard a user whose answers are selected as the best at a higher rate as a better answerer. Therefore, we propose the following hypotheses:

  • H1-1: The longer the answers a user writes, the more likely that user is to be the best answerer.

  • H1-2: The more immediate the responses of a user, the more likely that user is to be the best answerer.

  • H1-3: The more similarity a user’s response shares with the question, the more likely the user is to be the best answerer.

Second, regarding the information related to the answer, the asker can access information about the answerer provided by the social Q&A platform (Dong et al., 2021; Ginsca and Popescu, 2013; Zhou et al., 2012). Information about answerers can be broadly divided into information answerers write themselves and information about their activity on the platform. Because the answerer serves as the source of the answer, the asker assesses the quality of the answer based on the available information about the answerer (Jin et al., 2016; Shah and Pomerantz, 2010). Regarding self-described information, social Q&A platforms permit users to write details about themselves on their profile pages, which are visible to all users. The answerer’s profile is recognised as a source characteristic; the greater the degree of openness in the user’s personal information, the more trustworthy other users perceive them to be (Jin et al., 2016; Shah and Pomerantz, 2010). Thus, askers tend to trust answerers more when they provide more detailed profiles (Jin et al., 2016; Liang, 2017; Liu et al., 2021). Moreover, considering that social Q&A platforms mainly focus on answering questions, the expertise of their users is highly valued (Liu et al., 2021). Therefore, users with generally accepted credentials will presumably be more reliable (Kim, 2010). Accordingly, we propose the following hypotheses:

  • H2-1: The more detailed a user’s self-introduction, the more likely that user is to be the best answerer.

  • H2-2: The more credentials a user has, the more likely that user is to be the best answerer.

Third, regarding the activity history of an answerer on a social Q&A platform, askers can judge the quality of an answer by looking at the answerer’s past posting behaviour (Gazan, 2010). Individuals who actively participate in various activities on a given platform earn a good reputation (Liu et al., 2021; Ye et al., 2021), and the same applies to social Q&A platforms. Thus, on a social Q&A platform, the asker can gauge the quality of an answer based on the breadth of the fields in which the answerer mainly provides answers. Those who contribute answers across different domains on social Q&A platforms possess diverse interests in a variety of topics and willingly share their professional expertise on the platform. This enables users in various fields to search for answers and obtain reliable information. In addition, users who are active in several fields can interact with other users and form a rich community, enabling active participation on the platform and an exchange of opinions. Furthermore, as users provide answers in various fields, platform utilisation increases. This raises awareness that the platform is suitable for various topics and users, helping attract new users and retain existing ones.

On social Q&A platforms, users are not necessarily required to choose between being an asker or answerer; they can act in both roles as information providers and seekers (Adamic et al., 2008; Fu and Oh, 2019). Askers may believe that if answerers are users who have played both roles on the platform, they will be more likely to have knowledge on a wider range of activities and be more engaged with the platform. Therefore, askers may believe that these active users will provide more credible answers. Accordingly, we propose the following hypotheses:

  • H3-1: The more fields in which a user provides answers, the more likely that user is to be the best answerer.

  • H3-2: The more active a user is as both an asker and answerer, the more likely the user is to be the best answerer.

Finally, social Q&A platforms are important not only to askers and answerers but also to the overwhelming majority of lurkers. On social Q&A platforms, not only can the asker select the best answer from the answerers’ posts, but lurkers can also react by upvoting or downvoting an answer (Fang and Zhang, 2019). If lurkers respond positively to a user’s answers, those answers can be considered of sufficient quality (Jin et al., 2016). Therefore, we propose the following hypothesis:

  • H4: The more positive responses a user’s answers receive from lurkers, the more likely the user is to be the best answerer.

Methodology

Research site: Naver Knowledge-iN

This study utilises data from Naver Knowledge-iN (kin.naver.com), a leading South Korean social Q&A platform, to identify the characteristics of the best answerers. Naver Knowledge-iN, launched in October 2002, is a section of the portal where users can ask questions and obtain answers from the community. At the time Naver Knowledge-iN was launched, Google was gaining ground as a search engine for English content; however, it performed poorly when searching for Korean content, and Korean content was also scarce online. Thus, Naver attempted to improve the performance of its search engine while encouraging people to generate Korean content by asking and answering questions on Naver Knowledge-iN. Through this service, Naver accumulated a knowledge database of 1.65 million entries within a year owing to the explosive response from young users, helping Naver become the leading portal site in Korea (Chae, 2003). Since its launch, 32 million cumulative users have communicated through Naver Knowledge-iN over the past two decades, generating 300 million questions and 500 million answers, for a total of 800 million accumulated database entries. In 2021, Naver Knowledge-iN added 1 million new questions and 730,000 new answers, and the service currently averages nearly 30 million page views per day (Cho, 2022).

Naver Knowledge-iN offers various categories and topics that users can explore. Users can ask questions on topics such as technology, health, travel, entertainment, and education. The platform has a large user base, which people use to seek information, opinions, and advice from fellow users. Similar to other social Q&A platforms, Naver Knowledge-iN allows users to vote on answers, and the most popular and helpful answers receive higher visibility. Users can also participate in discussions, comment on answers, and build a reputation based on their contributions. In addition, as with other social Q&A platforms, user contributions are essential for the sustainability of Naver Knowledge-iN; therefore, the platform has been attempting various methods to encourage more users to write better answers and participate in knowledge-sharing. For example, the platform offers monetary rewards, virtual points, badges, and rankings. In addition, the platform provides a profile page for each user and a threaded Q&A page (Figs. 2 and 3).

Naver operates Naver Knowledge-iN, a free social Q&A platform, and Naver eXpert (expert.naver.com), a paid social Q&A platform. On Naver eXpert, users can ask questions and obtain answers from verified experts in various fields, such as medicine, law, and finance. Naver eXpert operates on a fee-based model, meaning that users must pay a certain amount of money to ask an expert a question or access premium content provided by experts. Users are given the opportunity to obtain professional advice and insights from experts in a specific field, and the experts are financially compensated for the knowledge and expertise they share on the platform. Because answerers on Naver eXpert can provide knowledge and monetise it, they can be considered entrepreneurs running one-person businesses. Moreover, because they rely on a specific platform to function, they can be considered platform-dependent entrepreneurs: they depend on the platform’s infrastructure, user base, and features to connect with and serve their audience and monetise their content based on the platform’s revenue model. Although they do not rely on the platform to run their entire business, their ability to reach customers, generate revenue, and build their reputation depends heavily on the features and opportunities the platform provides.

As Fig. 1 shows, Naver eXpert and Naver Knowledge-iN are provided on the same Internet portal, Naver. Therefore, the best contributors with entrepreneurial potential on Naver Knowledge-iN are likely to utilise the reputation and knowledge they have built on the free social Q&A service to work on the paid social Q&A service for financial rewards, transforming them into platform-dependent entrepreneurs. In this context, Naver strives to manage the best contributors to the free social Q&A service to encourage their conversion to the paid social Q&A service, thereby strengthening users’ dependence on the platform and attracting users who offer their expertise. Attracting the top contributors from Naver Knowledge-iN to Naver eXpert has several benefits for Naver as a platform. First, it can build expertise and trust on the platform. The best contributors have built a reputation on Naver Knowledge-iN and have a high level of expertise. By bringing them to Naver eXpert, the platform can attract providers who offer specialised knowledge and services. Other users can trust the expertise available on Naver eXpert, thereby increasing the value of the platform. Second, by attracting the best Naver Knowledge-iN contributors to Naver eXpert, Naver maintains the quality and consistency of the content provided on the platform. The best contributors provide authoritative answers based on their knowledge and experience and offer valuable information to platform users. This helps improve the platform’s reputation and user experience. Third, answerers on Naver eXpert can provide paid services to answer questions and receive compensation for doing so; therefore, if the best contributors convert to Naver eXpert, the platform can monetise their services, which will help the platform continue to operate and grow.

Fig. 1 Screenshot of Naver Knowledge-iN.

Research data

For this study, we developed a Python crawler to collect the profile and activity data of the top 1000 contributors on Naver Knowledge-iN. First, as shown in Fig. 2, we crawled these users’ profile pages to gather their profile information as of 15 October 2022. To investigate these users’ answering activity, we randomly selected 100 posts from their replies and collected the corresponding question post for each reply. For targeted users with fewer than 100 selected answers, we collected all the replies they had written. Ultimately, we analysed 903 users after excluding posts with zero characters and those with an excessive difference between the question and answer dates.
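As a hedged illustration of this pipeline (not the authors’ actual crawler), the skeleton below fetches and parses one profile page; the URL pattern and CSS selectors are hypothetical placeholders that would need to match Naver Knowledge-iN’s real page structure before use.

```python
# Illustrative crawler skeleton, not the study's actual code. The URL pattern
# and CSS selectors below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

def fetch_profile(user_id: str) -> dict:
    url = f"https://kin.naver.com/profile/{user_id}"  # hypothetical URL pattern
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    intro = soup.select_one(".self-introduction")  # hypothetical selector
    credentials = soup.select(".credential-item")  # hypothetical selector
    return {
        "self_introduction": intro.get_text(strip=True) if intro else "",
        "credentials": [c.get_text(strip=True) for c in credentials],
    }
```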

Fig. 2 Example of a user profile page in Naver Knowledge-iN.

To measure answerer performance, the dependent variable, we calculated the percentage of each answerer’s written answers that the askers selected as the best.

As shown in Fig. 3, the independent variables related to the content of users’ answers are a key part of their activity as top contributors; therefore, we measured three aspects according to our research hypotheses. For H1-1, we assessed the level of detail and information in responses by considering response length. The level of detail or information can be examined based on both quantitative and qualitative aspects. Quantitative aspects, such as the average number of words, are relatively straightforward to measure. However, qualitative aspects, such as the content, structure, and logical flow of a response, pose challenges owing to the time-consuming, expensive, and subjective nature of expert evaluations. NLP techniques offer a potential avenue for evaluating the qualitative aspects of responses; however, their effectiveness is hindered by their reliance on domain-specific analysis. In particular, the overall accuracy of Korean-language NLP techniques remains relatively low (Kim et al., 2022). Although specificity cannot be equated solely with response length, longer answers are considered to provide more information and are more likely to be specific (Peng et al., 2020; Qi et al., 2021). Consequently, we measured the level of detail or the amount of information in a response by calculating the average number of words, providing a broad gauge of specificity. In addition, analysing the average word count across multiple responses from a user allows us to discern individual response-writing patterns. A consistently high or low average word count suggests that a user tends to express themselves in a particular manner.
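As a minimal sketch, the H1-1 measure can be computed as follows; whitespace tokenisation is our simplifying assumption and may differ from the exact word-counting procedure used in the study.

```python
# Minimal sketch of the H1-1 measure: a user's average answer length in words.
# Whitespace splitting is a simplifying assumption for Korean text.
def average_word_count(answers: list[str]) -> float:
    if not answers:
        return 0.0
    return sum(len(answer.split()) for answer in answers) / len(answers)
```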

Fig. 3 Example of a question-answer exchange in Naver Knowledge-iN.

Regarding the independent variables related to the answerer’s information, we first measured the degree of openness of the user’s personal information, which reflects the degree of detail that the users provide to describe themselves. This was measured based on the number of words on a user’s profile page, aligned with the same rationale applied to the quantitative measurement of answer specificity (H2-1). We also measured the number of credentials (H2-2) users included in their profiles, as the level of credentials is an important factor in gaining askers’ trust on social Q&A platforms.

In terms of the answerer’s platform activity, we measured the number of activity categories in the answer threads written by a user to capture the scope of their activity on the social Q&A platform (H3-1). We also measured the ratio of their selected best answers to the posts they wrote as askers to capture the extent of their contribution as both asker and answerer on the social Q&A platform (H3-2).

Regarding the level of social endorsement, we measured the number of upvotes received on the posts written by a user to capture the degree of response from lurkers to that user’s replies (H4).
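Taken together, the dependent variable and the answerer-side variables can be sketched from a hypothetical per-user record as follows; all field names are placeholders rather than Naver’s actual schema, and the H3-2 ratio reflects our reading of the measure described above.

```python
# Illustrative construction of the dependent variable and the answerer-side
# independent variables. Field names are hypothetical placeholders.
def user_variables(user: dict) -> dict:
    answers_written = max(user["answers_written"], 1)  # guard against division by zero
    return {
        "performance": user["best_answers"] / answers_written,         # dependent variable
        "intro_word_count": len(user["self_introduction"].split()),    # H2-1
        "n_credentials": len(user["credentials"]),                     # H2-2
        "n_fields": len(set(user["answer_categories"])),               # H3-1
        "asker_answerer_ratio": user["best_answers"]
            / max(user["asker_posts"], 1),                             # H3-2 (assumed reading)
        "avg_upvotes": user["total_upvotes"] / answers_written,        # H4
    }
```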

Table 2 summarises the variables used to test the hypotheses. For H1-2, we calculated the difference between the creation time of the question post and that of the answer post to measure the immediacy of the responses. Finally, for H1-3, to measure the similarity between the question and answer posts, we calculated the average cosine similarity (Jin et al., 2016) between them, excluding questions from the analysis if the thread was private. The detailed steps for calculating the cosine similarity between question and answer posts are as follows. To analyse the content of the answers, which are unstructured documents, we must first structure them. Accordingly, we performed the standard preprocessing steps of an NLP task: tokenising the raw text and converting each document into a vector. To tokenise the answers written in Korean, we used the KoNLPy library for Korean NLP in Python and the verified Python-based morphological analyser MeCab (Kang and Yang, 2018). All words were tagged as parts of speech based on their syntactic categories, and a vector space model was used to represent each document as a point in a vector space with one dimension for each term in the vocabulary. Each document $d$ is represented by Eq. (1), where $t_i$ denotes a term contained in $d$ and $f_i$ denotes the frequency of occurrence of that term in $d$.

$$d=\{(t_{1},f_{1}),(t_{2},f_{2}),\ldots ,(t_{i},f_{i})\}$$
(1)
Table 2 Variables.

The cosine similarity between question $d_q$ and its answer $d_a$ is given by Eq. (2), where $f_{qi}$ and $f_{ai}$ represent the frequencies of term $t_i$ in $d_q$ and $d_a$, respectively, and $n$ is the number of terms contained in $d_q$ and $d_a$.

$$\text{cosine}(d_{q},d_{a})=\frac{\sum_{i=1}^{n}f_{qi}\times f_{ai}}{\sqrt{\sum_{i=1}^{n}f_{qi}^{2}}\sqrt{\sum_{i=1}^{n}f_{ai}^{2}}}$$
(2)
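The following Python sketch illustrates this computation, assuming KoNLPy’s MeCab wrapper and its Korean dictionary are installed; the part-of-speech tagging and any filtering applied in the study are omitted for brevity, and the function names are ours.

```python
# Minimal sketch of Eqs. (1)-(2): morpheme-level term frequencies and cosine
# similarity between a question and an answer. Assumes konlpy and the MeCab
# Korean dictionary are installed; POS filtering is omitted for brevity.
from collections import Counter
from math import sqrt

from konlpy.tag import Mecab

mecab = Mecab()

def term_frequencies(text: str) -> Counter:
    # Eq. (1): represent a document as {(t_1, f_1), ..., (t_i, f_i)}
    return Counter(mecab.morphs(text))

def cosine_similarity(question: str, answer: str) -> float:
    # Eq. (2): dot product of term-frequency vectors over the product of norms
    fq, fa = term_frequencies(question), term_frequencies(answer)
    dot = sum(fq[t] * fa[t] for t in fq.keys() & fa.keys())
    norm = sqrt(sum(f * f for f in fq.values())) * sqrt(sum(f * f for f in fa.values()))
    return dot / norm if norm else 0.0
```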

Empirical analysis results

Descriptive statistics

We first conducted a descriptive analysis of the 903 best users as of 15 October 2022; Table 3 summarises the descriptive statistics of the variables and the correlation analysis results. The mean value of answerer performance was 0.847 with a standard deviation of 0.150, indicating that approximately 85% of the answers written by the best answerers were selected as the best answers. The number of words in the answerer’s profile self-introduction (r = 0.070, p < 0.05), the number of credentials the answerer holds outside the social Q&A platform (r = 0.099, p < 0.05), the average cosine similarity between questions and answers (r = 0.095, p < 0.05), the number of fields to which the answers belong (r = 0.136, p < 0.001), and the percentage of posts and answers selected by the answerer (r = 0.114, p < 0.001) showed significant positive correlations with answerer performance.

Table 3 Descriptive statistics and correlation matrix (903 best users).

Hypothesis testing

Hierarchical regression analysis, performed with Stata (MP 14.2), was used to examine the research model. Hierarchical regression offers a distinct advantage in detecting the total variance of the outcome variable, as demonstrated by the incremental increase in the coefficient of determination, the proportion of variance in the dependent variable explained by the independent variables (Petrocelli, 2003). The sequential arrangement of the independent variables was methodically predetermined, guided by the research objectives and rationale. The hierarchical order of entry was strategically structured, emphasising the prioritisation of causal relationships, the elimination of confounding or spurious relationships, alignment with research goals, and consideration of the structural characteristics of the factors under investigation (Cohen et al., 2003). The analytical process was divided into successive blocks, each encapsulating a distinct set of independent variables (Falvo and Earhart, 2009). Table 4 presents the results of the hierarchical regression analysis.

Table 4 Result of hierarchical regression analysis.
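To make the blockwise procedure concrete, the sketch below fits nested OLS models in Python with statsmodels and reports the R² change between blocks. It is an analogue of, not a substitute for, the Stata analysis reported here; the data are synthetic placeholders and the variable names and block ordering are illustrative.

```python
# Blockwise (hierarchical) OLS with R-squared change between models, mirroring
# the Table 4 procedure. Data are synthetic placeholders, not the study's data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 903  # number of users analysed in the study
df = pd.DataFrame({
    "performance": rng.uniform(0.4, 1.0, n),
    "avg_word_count": rng.poisson(120, n),
    "n_credentials": rng.poisson(2, n),
    "cosine_sim": rng.uniform(0.0, 1.0, n),
    "n_fields": rng.poisson(5, n),
    "asker_answerer_ratio": rng.uniform(0.0, 1.0, n),
    "avg_upvotes": rng.poisson(3, n),
})

blocks = [  # illustrative entry order; see Table 4 for the actual blocks
    "avg_word_count",
    "avg_word_count + n_credentials",
    "avg_word_count + n_credentials + cosine_sim + n_fields + asker_answerer_ratio",
    "avg_word_count + n_credentials + cosine_sim + n_fields + asker_answerer_ratio"
    " + avg_upvotes",
]
prev_r2 = 0.0
for i, rhs in enumerate(blocks, start=1):
    fit = smf.ols(f"performance ~ {rhs}", data=df).fit()
    print(f"Model {i}: R2 = {fit.rsquared:.3f}, delta R2 = {fit.rsquared - prev_r2:.3f}")
    prev_r2 = fit.rsquared
```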

In Model 1, the effect of the average answer word count on answerer performance was significant (F = 9.910, p < 0.001, R2 = 0.042), indicating that a higher word count for answers (β = 0.014, p < 0.05) was associated with higher answerer performance.

Model 2 showed that the effects of the average answer word count and the answerer’s number of credentials outside the social Q&A platform on answerer performance were significant (F = 8.520, p < 0.001, R2 = 0.054). Compared with Model 1, the change in explanatory power was 1.2%, which was significant (p < 0.001). This indicates that the greater the word count of answers (β = 0.014, p < 0.05) and the greater the number of credentials an answerer has outside the social Q&A platform (β = 0.026, p < 0.05), the higher the answerer performance.

Model 3 demonstrated that the effects of the average answer word count, the average cosine similarity between questions and answers, the answerer’s number of credentials outside the social Q&A platform, the number of fields to which the answers belong, and the percentage of posts and answers selected by the user on answerer performance were significant (F = 9.040, p < 0.001, R2 = 0.075). Compared with Model 2, the change in explanatory power was 2.1%, which was significant (p < 0.001). This indicates that a higher answer word count (β = 0.016, p < 0.05), average cosine similarity between questions and answers (β = 0.087, p < 0.05), number of answerer credentials outside the platform (β = 0.024, p < 0.05), number of fields to which the answers belong (β = 0.010, p < 0.05), and percentage of posts and answers selected by the user (β = 0.052, p < 0.05) are associated with higher answerer performance.

Model 4 also demonstrated that the average answer word count, the average cosine similarity between questions and answers, the answerer’s number of credentials outside the social Q&A platform, the number of fields to which the answers belong, and the percentage of posts and answers selected by the user had significant effects on answerer performance (F = 8.080, p < 0.001, R2 = 0.075). However, compared with Model 3, the change in explanatory power was 0%. This indicates that a higher answer word count (β = 0.015, p < 0.05), average cosine similarity between questions and answers (β = 0.082, p < 0.05), number of answerer credentials outside the platform (β = 0.025, p < 0.05), number of fields to which the answers belong (β = 0.010, p < 0.05), and percentage of posts and answers selected by the user (β = 0.053, p < 0.05) are associated with higher answerer performance. Table 5 summarises the verification results for the hypotheses derived from the analysis of Model 4.

Table 5 Hypothesis testing results.

Discussion and conclusion

Discussion

This study analysed data on 903 of the best answerers from an actual social Q&A platform, Naver Knowledge-iN, to identify the characteristics of contributors who generate high-quality answers, the core content of social Q&A platforms. The results of our hierarchical regression model show that longer answers, greater similarity between the question and answer, more answerer credentials outside social Q&A platforms, answering in a greater number of categories, and more active participation as both asker and answerer are associated with better performance among the top answerers.

Our analysis results can be explained as follows. First, regarding content characteristics, the predictive power of answer length for answerer performance was validated (H1-1), which is in line with our expectations and previous literature regarding the length of answers on social Q&A platforms (Fu and Oh, 2019; Jin et al., 2016). Longer answers are more likely to be comprehensive and informative, likely because they provide more context and elaboration, which can help users understand the answer better and apply it to their own situation. In addition, longer answers may convey a sense of sincerity and effort on the part of the answerer, which can positively influence the asker’s perception of the answer quality.

However, answer immediacy was not found to have a predictive effect on answerer performance (H1-2), contrary to our expectations. Based on the existing literature (Fu and Oh, 2019; Gazan, 2010; Jin et al., 2016), we expected answer immediacy to have a positive effect on answerer performance; however, the results in our analysis were not significant. This suggests that the value placed on answer immediacy varies depending on the context and user expectations. What people value when asking a question differs between social Q&A platforms and Gen AI chatbots. When people ask a Gen AI chatbot a question, they want an immediate answer. However, when they ask a question on a social Q&A platform, they are looking for something other than immediacy. Social Q&A platforms cater to a more thoughtful and nuanced approach to information-seeking, whereas chatbots prioritise immediacy for quick on-the-go queries. Understanding these differences can help in designing and optimising these platforms to meet their users’ specific needs.

The predictive effect of similarity between answers and questions on answerer performance was supported (H1-3). This suggests that contributors who understand the nuances of a question are more likely to provide helpful answers. This finding is in line with our expectations and previous literature regarding the similarity between questions and answers on social Q&A platforms (Fu and Oh, 2019; Jin et al., 2016). The ability to grasp the nuances of a question reflects a contributor’s deep understanding of the subject matter and aptitude for effective communication. These qualities are essential for providing high-quality answers that are both informative and engaging, which ultimately contributes to the platform’s overall success.

Second, regarding the profile information provided by the best answerers, the detail in a user’s self-introduction did not have a significant effect (H2-1); however, the number of credentials did (H2-2), which is consistent with previous research (Kim, 2010). Thus, askers judge the answerer’s credibility based on the number of credentials they have rather than details in their self-introduction. This indicates that askers tend to emphasise objective indicators such as credentials. This aligns with the concept of ‘expertise signals’ in social settings. People often rely on external cues to assess others’ credibility, particularly in situations where direct experience or knowledge is limited. On social Q&A platforms, credentials serve as visible markers of expertise, providing a quick and easy way for askers to gauge the trustworthiness of potential answerers.

Third, regarding the scope of platform activity based on the best answerers’ posting behaviours, the predictive power of the number of answer categories for answerer performance was verified (H3-1). The results showed that those who answered questions in more categories performed better than those who answered questions in fewer categories. The positive relationship between the number of answer categories and answerer performance highlights the value of versatility on social Q&A platforms. Contributors who demonstrate proficiency in multiple areas are more likely to provide valuable insights, gain recognition, and establish themselves as trusted sources of information within the community. Relatedly, the results support H3-2, which posits that enhanced performance is associated with users who actively participate as both askers and answerers. This dual engagement allows contributors to gain a deeper understanding of community needs, build stronger connections, identify emerging trends, and contribute to knowledge-sharing, ultimately leading to enhanced performance and a more valuable platform for users.

Finally, regarding social endorsement, the degree to which lurkers responded positively to answers did not have a significant effect (H4). As an information-seeker, a lurker’s response may be important on social Q&A platforms for engaging a high number of lurkers; however, it does not have a significant effect on the answer an asker selects as the best. Askers on social Q&A platforms prioritise their own assessment of answer quality and may not be swayed solely by lurker endorsements. This highlights the importance of answerers consistently providing high-quality, relevant, and informative responses to establish themselves as trusted sources of information within a community.

Implications

Our study examined the characteristics of social Q&A platforms, focusing on Naver Knowledge-iN, a representative social Q&A platform in Korea. We expect the results of this study to provide practical implications for social Q&A platforms, not only in Korea but also in other countries, for identifying top contributors who are likely to produce quality content. The best answerers on social Q&A platforms can produce quality answers as content providers, which is a crucial capability that social Q&A platforms must manage to continue providing services. Accordingly, understanding the distinguishing traits of exemplary contributors, or ‘best answerers’, offers universal insights that can be applied by social Q&A platforms worldwide. In addition, the best answerers have the potential to become entrepreneurs based on their expertise, reputation, and content-creation skills; thus, social Q&A platforms should manage them well. Social Q&A platforms must keep their best users with entrepreneurial skills active on their platforms and not on others. To achieve this, social Q&A platforms must systematise how their best users can consistently receive tangible and intangible rewards for the content they create. Based on our findings, social Q&A platforms worldwide can proactively manage users who have the potential to become the best answerers by screening them in advance. For example, to encourage detailed and informative answers, a platform can implement features that promote longer and more comprehensive responses, such as word count indicators or badges for detailed answers. Furthermore, to highlight answer–question similarity, the platform can develop algorithms that identify answers that closely align with the original question’s intent and context and prioritise those answers in search results or recommendations. Platforms can also encourage contributors to expand their areas of expertise by providing incentives for answering questions in multiple categories. This can diversify the expert pool and increase the platform’s coverage of diverse topics.

Our results provide actionable recommendations for social Q&A platforms seeking to cultivate supportive and engaging community environments. By implementing features that encourage detailed and informative answers and prioritising answer–question similarity, platforms can foster an atmosphere conducive to quality contributions. These strategies apply not only to Korean social Q&A platforms but also to foreign social Q&A platforms aiming to enhance user satisfaction and platform relevance.

Our results can also be used when social Q&A platforms worldwide want to use tools other than humans (e.g., Gen AI chatbots) to generate content. Despite the variety of tools (e.g., search engines, social Q&A platforms, and Gen AI chatbots) available for finding answers online, we found that people often use the same criteria to determine a quality answer regardless of the search tool. Our results show that when people seek answers on social Q&A platforms, they place greater importance on knowing that the answers are sincere and empathetic than on factors such as the immediacy of the answer or the specificity of an answerer’s self-introduction. These qualities, often overlooked in other information-seeking tools, can significantly affect the quality and value of answers. Since the evolution of social Q&A platforms alongside AI-powered tools is a global trend, recognising and nurturing these human qualities will be essential for their continued success and relevance.

Limitations

This study suggests several avenues for future research. First, the cross-sectional nature of the data introduced certain constraints. A more nuanced understanding could be achieved by utilising panel data, allowing for an in-depth examination of dynamics such as whether the asker’s selection of the best answer is influenced by the behaviours of distinguished contributors or lurkers.

Second, we did not perform a comparative analysis of the characteristics of the top and bottom answerers. Several previous investigations have explored distinctions between active users and lurkers using survey data (e.g., Zheng et al., 2021); however, studies that have systematically analysed the actual behavioural data of social Q&A platform users are scarce. While the limited number of posts from the bottom answerers presents a challenge, augmenting our dataset with additional user information, including demographics and other platform activities, could provide richer insights into the factors that influence answer quality.

Third, to unravel the qualities that contribute to the performance of top answerers, we adopted a comprehensive perspective to scrutinise the various dimensions of best-answerer participation. However, additional variables, including the depth of the response, readability, objectivity, keyword density, and topic similarity, as recommended by Fu and Oh (2019) and Mousavi et al. (2020), could not be incorporated because of the limited size of the dataset; when we attempted to include them, we failed to obtain significant results. This limitation is attributed to inherent constraints in the dataset, intricate relationships, and potential interactions among the variables. Accordingly, future studies would require access to larger datasets or the application of strategic sampling techniques to bolster statistical power, enabling a more comprehensive exploration of the multifaceted nature of responses from top answerers.

Future studies should also consider alternative variables to explore the scope of answerers’ platform activities in more depth. Metrics such as the Gini coefficient or entropy could offer a viable approach for quantifying diversity across domain categories; this study could not use them because of the dynamic nature of Naver Knowledge-iN’s answering fields. Such measures may provide a more nuanced and comprehensive understanding of the various activities on a platform, contributing to a more sophisticated analysis of user engagement and expertise across domains. Although our study emphasises user engagement, we acknowledge the potential value of incorporating alternative metrics and encourage future researchers to explore these avenues for a more nuanced understanding of the varied characteristics that influence response quality.
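For illustration, the entropy-based diversity measure suggested above could be computed as follows; this is a sketch of one possible operationalisation, not part of this study’s analysis.

```python
# Hedged sketch: Shannon entropy of a user's answer-category distribution,
# one possible diversity metric for future work (not used in this study).
from collections import Counter
from math import log

def category_entropy(categories: list[str]) -> float:
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Higher entropy means answers are spread more evenly across categories.
    return -sum((c / total) * log(c / total) for c in counts.values())
```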

Furthermore, future studies should explore alternative approaches for categorising answer fields to better capture the nuances associated with different certifications. The current problems in classifying answer fields, as highlighted by the dynamic nature of Naver Knowledge-iN’s platform and the diverse certifications users may hold, necessitate a more sophisticated analysis. Researchers could investigate alternative methods that consider not only the number of certifications but also the specific domains and expertise levels associated with each certification. This approach could provide a more nuanced understanding of how user qualifications and certifications relate to their answering behaviour, thereby contributing to a more comprehensive interpretation of response quality.

Additionally, our analysis was focused on data from Naver Knowledge-iN, a prominent social Q&A platform in Korea. While the results yielded statistical significance, the scope of the study was confined to a single country and exclusively focused on social Q&A platforms operating in Korea. Future research should consider extending the analysis to include data from social Q&A platforms in diverse countries. This approach would enable cross-cultural and cross-linguistic comparisons, allowing the variations attributed to cultural and linguistic factors to be explored.

In summary, addressing these aspects in future research will contribute to a more comprehensive understanding of the dynamics and factors influencing user engagement and answer quality on social Q&A platforms.