
1 Introduction

Home users tend to learn “just enough” to keep going [9]. Learning as you go, ideally without consulting the manual, becomes a common practice. It comes as no surprise that these users are frequently faced with doubts about how to proceed. These doubts might be resolved through trial-and-error cycles or by resorting to help desks. The former might be cumbersome and unsuccessful, while support for end users tends to be rather limited in most organizations. Indeed, different studies point to the lack of assistance as a major stumbling block for these users’ self-reliance [14, 24]. In this scenario, Question&Answer platforms (Q&A platforms) come in handy [10]. In Q&A sites, users (askers) post questions, and rely on other community members (answerers) to provide a suitable solution to their information needs [23]. Unfortunately, home users are not always aware of Q&A platforms or, more commonly, they are discouraged by the upfront cost of posing a question and the risk of not getting a satisfying answer [11]. Findings confirm that the likelihood of (re)editing a question was larger for users with limited familiarity with either Stack Overflow or the topic at hand [25], both aspects that characterize our target audience.

The irony is that those who can benefit most from Q&A platforms might lack the skills (and support) to pose “good questions”. Studies found that successful question posting correlates with affect, presentation quality, and time [5]. This leads to our research question: how can home users be assisted in successful question posting in Q&A platforms?

Our main premise is that doubts (and the examples that illustrate them) emerge while conducting tasks. Unfortunately, askers need to move away from the working environment (where questions arise) to the Q&A platform (where questions are posed). This requires askers to re-create the task scenario in the question. As an example, consider Google Sheets as the working environment, and Stack Overflow as the Q&A platform. When struggling with spreadsheet formulas, end users might resort to the Stack Overflow community. This involves moving to this Q&A platform and posting the question. This can be achieved in a rambling textual way, or conveniently formatted using markdown (i.e. the markup language Stack Overflow uses to format text and code snippets). More to the point, in accordance with Stack Overflow’s own recommendations [18], screenshots or a separate shared spreadsheet should be provided for respondents to more reliably reproduce the task scenario. This requires more involvement (and skills) on the asker’s side but, in return, it permits potential answerers not only to provide their answers but also to validate them against the sample data. Indeed, the presence of code snippets in questions highly correlates with getting prompt and appropriate answers [5]. Hence, users need to balance the upfront investment (i.e. time/effort spent on writing the question) against the risk of obtaining low-quality answers or no answer at all. The issue is that the upfront “attention capital” available to home users might be too limited.

To reduce this upfront investment, we resort to inline question posting through a question scaffold. Here, users can seamlessly channel questions to Q&A platforms without leaving their working environment. The aim: reducing the cost of posting questions while increasing the payoff, i.e. obtaining more accurate answers by making questions clearer through examples.

QuestionSheet is a research prototype we built to demonstrate the feasibility of this approach on top of Google Sheets. To use QuestionSheet, home users start with an ordinary spreadsheet. Then, in cells where a formula is needed but unknown, the user enters a special =QUESTION() formula. This triggers an assisted process that helps users pose their questions on Stack Overflow, resorting to examples from the current spreadsheet to illustrate the desired output, all without leaving Google Sheets. Through this work, we pursue three main contributions:

  • a scaffold design aimed at guiding users towards successful inline question posting (Sect. 3),

  • a Google Sheets extension for inline question posting in Stack Overflow (Sect. 4),

  • five use cases that provide insights about the benefits brought by inline question posting (Sect. 5).

We start by framing this work within the related literature.

2 Social Computing in Programming

While Personal Computing describes the behavior of isolated users, Social Computing highlights social interaction achieved through computers. Q&A platforms, microblogging and wikis can be framed within this term. The social aspect does not stop at collaboratively editing a wiki article or tweeting about a movie. Programming has also been the subject of socialization, and approaches to provide this kind of capability within programming environments soon proliferated. The vision is for users to seamlessly channel questions to Q&A platforms without leaving their working environment. By channelling we mean the interplay between the Q&A platform and the working platform during the question lifecycle. Questions are written from the working platform, and next, transparently posted into the Q&A platform. And the other way around: the Q&A platform is periodically polled for answers that will eventually be available at the working platform. Differences among approaches stem from the working environment and the target social platform to tap into. Next, we cluster references based on two main working environments: Eclipse and spreadsheet programs (see Table 1).

Table 1. Approaches to social computing in programming.

Eclipse is an IDE for professional applications. The success of Q&A services soon led to devising mechanisms “for employees to appropriate status message Q&A as one possible source of stable peer support” [20]. For Stack Overflow, Fishtail is an Eclipse plugin that assists programmers in discovering code examples and documentation on the Web relevant to their current task [17]. Another plugin, Seahawk, turns Eclipse-hosted code snippets into hyperlinks which point to relevant Stack Overflow question threads [1]. Seahawk transparently channels all the communication between Eclipse and Stack Overflow.

Spreadsheet frameworks also attempt to capitalize on social networks. Smartsheet [6] adds crowdsourcing capabilities to spreadsheets. Specifically, it permits creating tasks associated with spreadsheet cells so that cell values are obtained as a result of crowdsourcing tasks. Smartsheet transparently handles all the back-end process through Amazon’s Mechanical Turk. However, Smartsheet is not targeted at end users struggling with formulas but at professional programmers looking for data. In a similar vein, AskSheet [16] helps users gather data for decision making in the form of spreadsheets. Decision-making spreadsheets are data intensive, and include functions to aggregate data to help users make a decision. To this end, AskSheet provides the ASK() function. ASK() takes as parameters a range of expected values (e.g. “1 to 10”), the item and attribute labels (e.g. “Galaxy S3”, “screen size”), the name of the task (“check basic specs”) and the target crowdsourcing platform (Amazon’s Mechanical Turk). The side effect is the creation of a crowdsourcing task (a.k.a. HIT in Mechanical Turk parlance). Its main contribution rests on optimizing the number of human tasks required to resolve all the cell dependencies, thereby reducing the economic cost used as incentive for the human workers.

Our work is akin to these efforts insofar as it inlines Q&A capabilities within the working environment. In so doing, questions are posted in context, facilitating focus and avoiding moves to the Q&A platform. However, previous approaches consider different target audiences. As for askers, previous works target professional programmers, for whom coming up with good questions is not an issue; the challenge is rather the smooth integration with the working environment. On the other hand, answerers might be either rewarded or altruistic. This introduces a remarkable difference: in money-rewarded systems (e.g. Amazon’s Mechanical Turk), success is driven mainly by the amount of the incentive, whereas in altruistic sites (e.g. Stack Overflow), success very much depends on how interesting the problem is and how well it is described.

Table 1 highlights how our work differs from previous approaches as for the combination (asker, answerer) being considered: (home users, altruistic respondents). If answerers are altruistic, then we should delve into what factors impact respondent engagement (Sect. 3). If askers are home users, then we should strive to provide some kind of scaffold that acts upon those engaging factors. Left on their own, newcomers tend to have lower chances of having their questions properly answered [4].

3 Scaffold Design for Successful Question Posting

This section gathers studies about how askers can increase the chances of eliciting a successful answer. Specifically, our approach pivots around the model for successful question posting on Stack Overflow presented in [5]. Calefato et al. focus on three main factors askers can act upon: affect (i.e. the positive or negative sentiment conveyed by the text), presentation quality, and time (i.e. the moment at which the question is posted). Figure 1 depicts the main metrics introduced by Calefato et al. together with pairs (coefficient estimate, odds ratio) resulting from their experiment. Next, we elaborate on means to assist users in increasing these metrics, i.e. the scaffold.

Fig. 1. A model for successful question posting (adapted from [5]). Arrows are labeled by pairs (coefficient estimate, odds ratio) taken from [5]. Live editors are mentioned in this study but not investigated.

Affect. Successful questions adopt a neutral emotional style. This insight is aligned with previous investigations where expressing sentiment, either positive or negative, might decrease the chances of getting help [2].

Scaffold insight: Parse user text for sentiment analysis. Alternatively, a boilerplate template can be used.
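
As an illustration, the sketch below (TypeScript, as it could run in a Google Apps Script extension via clasp) shows a minimal word-list neutrality check a scaffold could run before posting. The word list, threshold, and helper names are our own hypothetical choices; a production scaffold might rather delegate to a dedicated sentiment-analysis service.

```typescript
// Minimal sketch of a pre-posting neutrality check (hypothetical helper,
// not QuestionSheet's published API). A real scaffold could instead call
// an external sentiment-analysis service.
const EMOTIONAL_MARKERS = [
  "awesome", "terrible", "hate", "love", "stupid", "urgent", "please help!!!",
];

// Returns true when the draft reads emotionally neutral enough to post.
function isNeutralEnough(draft: string): boolean {
  const text = draft.toLowerCase();
  return !EMOTIONAL_MARKERS.some((marker) => text.includes(marker));
}

// Usage: warn the asker before posting, or fall back to the template.
// if (!isNeutralEnough(message)) { suggestNeutralTemplate(); }
```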

Presentation Quality. Successful questions are “short, contain code snippets, and do not abuse with uppercase characters” [5]. The importance of attaching code snippets to questions was highlighted in a separate questionnaire where 95% of respondents strongly agreed with this view. An interesting twist was put forward by one of the respondents about the importance of using live editors, which would permit answerers to fiddle directly with the code samples provided by askers.

The literature backs the use of examples when posing questions [19]. Good examples can be a substitute for long explanations. This is particularly important for home users, who might lack the skills to describe their needs in abstract terms; hence, examples might be the easiest way to get the idea across. Good-practice manuals exist about how to write good questions [8, 10, 18]. Nasehi et al. advise for questions to have enough detail (but not too much), enough depth (without drifting from the core subject), examples (if applicable), as well as the avenues already investigated by the asker [15]. This effort pays off in terms of getting more high-quality answers, and hence reduces the risk of obtaining inappropriate answers, or even no answer at all [13, 22].

Scaffold insight: Provide means for examples to be easily extracted from the working environment.

Time Slice. Experiments found that questions posted during the weekend are more likely to be answered than questions posted during the week [3]. In addition, the most successful time slices correspond to 3:00–6:00 PM US West Coast time, that is, when most experts are available.

Scaffold insight: Provide a timer that decouples question specification from question posting. Users in a different time zone can work out their questions at the time that suits them, and let the timer post them during US working hours or hold them until the weekend.

Fig. 2. QUESTION() causes a QuestionSheet to be displayed as a side panel: the question and the working sheet are shown side by side. Smileys are overlaid to highlight points of interest.

4 A Scaffold for Inline Question Posting in Google Sheets

This section tackles inline question posting using Google Sheets as the working environment, and Stack Overflow (SO) as the Q&A platform. The outcome is QuestionSheet, an extension for Google Sheets available for download at https://goo.gl/RMD3p3. Readers are encouraged to watch this video to see QuestionSheet at work: https://goo.gl/wSy767.

Functional requirements are those that derive from handling the question lifecycle. This includes: question description (i.e. title, message, answers, votes, respondent data, etc.); question sharing (i.e. seamless propagation of the question from the working environment to the Q&A platform); answer awaiting (i.e. once the question is posted, mechanisms should be in place for the working environment to monitor the Q&A platform); and answer resolution (i.e. once answers show up, mechanisms are required to handle answer resolution and integration into the working environment).
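
To make this lifecycle concrete, the following TypeScript sketch outlines one plausible shape for the data a question carries across these stages. Field names are our own illustration, not the prototype’s actual schema.

```typescript
// Hypothetical data shape for a question over its lifecycle.
type LifecycleStage = "draft" | "posted" | "answered" | "resolved";

interface Answer {
  body: string;                 // raw answer text (markdown) from SO
  votes: number;
  respondentReputation: number;
}

interface QuestionRecord {
  title: string;
  message: string;              // template-based description with examples
  expectedOutputs: string[];    // transient example data for validation
  stage: LifecycleStage;
  soQuestionId?: number;        // set once shared with Stack Overflow
  answers: Answer[];            // filled while awaiting/resolving answers
}
```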

As for the non-functional requirements, we prioritize “compatibility”, i.e. “the degree to which an innovation is perceived as being consistent with the existing values, needs, and past experiences of potential adopters” [21]. In our setting, we start from a working platform with which the user is already familiar (i.e. Google Sheets). Compatibility calls for the new functionality (i.e. inline question posting) to mimic as much as possible the modus operandi of Google Sheets. Targeting home users, consistency lowers the entry bar and eases user adoption [12].

The rest of this section is structured along the stages of the question lifecycle. As the running example, consider the spreadsheet in Fig. 2. Employee data is scattered across two tables. One table holds the employee’s ID and hours worked. The table below keeps the ID and the employee’s name. The user wonders how the employee’s name could be replicated beside the ID column in the first table. Rather than re-typing this information, the user looks for a function that achieves this.

4.1 Question Description

For the sake of compatibility, QuestionSheet is realized through two spreadsheet functions: QUESTION(), and SCREENSHOT(Range).

Initialization. To use QuestionSheet, users start with an ordinary spreadsheet. Then, in cells where a formula is needed but unknown, the user enters a special =QUESTION() formula. Figure 2 shows the case for our running example (notice the expression =QUESTION() in the formula bar). From the perspective of Google Sheets, QUESTION() is just another formula. Hence, QUESTION() can appear any place a formula is expected. The difference stems from the output. Traditional formulae output data. However, QUESTION() outputs a formula, and it has a side effect: popping up a QuestionSheet on the right side. A QuestionSheet holds a cell for each of the question elements (e.g. Title, Message, etc.). Function QUESTION() then obtains its arguments from this sheet. Like any other formula, QUESTION()’s output is re-evaluated as its companion QuestionSheet changes its content.
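
A minimal sketch of how QUESTION() could be realized as an Apps Script custom function is given below (TypeScript via clasp, assuming @types/google-apps-script). Since Apps Script custom functions run sandboxed and cannot open sidebars themselves, we assume an installable onEdit trigger pops up the side panel; getQuestionState() and the property names are hypothetical plumbing, not the prototype’s actual code.

```typescript
type Stage = "draft" | "posted" | "answered";

// Hypothetical helper: reads the state saved by the QuestionSheet panel
// (e.g. in document properties or a hidden sheet).
function getQuestionState(): { stage: Stage; answerCount: number } {
  const raw = PropertiesService.getDocumentProperties()
      .getProperty("questionState");
  return raw ? JSON.parse(raw) : { stage: "draft", answerCount: 0 };
}

// The custom function itself: its display tracks the question lifecycle.
function QUESTION(): string {
  const state = getQuestionState();
  if (state.stage === "draft") return "Question not posted yet ...";
  if (state.stage === "posted") return "Question posted ...";
  return `Question answered (${state.answerCount}) ...`;
}

// Assumed installable trigger that pops up the QuestionSheet side panel
// whenever a cell is edited to hold =QUESTION().
function onEditShowPanel(e: GoogleAppsScript.Events.SheetsOnEdit): void {
  if (e.range.getFormula().toUpperCase() === "=QUESTION()") {
    SpreadsheetApp.getUi().showSidebar(
        HtmlService.createHtmlOutputFromFile("QuestionSheetPanel"));
  }
}
```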

Description. At the onset, the Message cell contains a template that serves as a basic guideline for users. A hallmark of our approach is to enrich these messages with examples, i.e. providing input values and their expected outputs. For our running case, this entails adding the employee names beside the IDs (red shaded in Fig. 3). The user is prompted to introduce the expected outputs that the sought formula would return. This not only facilitates understanding by potential respondents (and hence, the chances of getting a more accurate answer) but also offers a way to validate the solution. Indeed, Stack Overflow’s own blog advises providing a shared spreadsheet where potential respondents can more reliably reproduce the task scenario and validate their solutions [18]. Including expected outcomes in the working sheets might come in handy, but it also pollutes spreadsheets with spurious data. To avoid smudging the spreadsheet, the data introduced for illustration purposes is transient, i.e. it is kept from QUESTION()’s first enactment until the question is posted. Once the question is posted, no trace is left in the working sheet. Transient data is highlighted with red shading.

Fig. 3. Function SCREENSHOT() inlays sheet snippets into the message. (Color figure online)

Enriching Questions with Sample Data. Sample data from the working sheet can be inlaid into the question’s message through SCREENSHOT(Range). Figure 3 illustrates the use of this function. SCREENSHOT() obtains (1) a screenshot, and (2) a detached sample spreadsheet for the selected Range. The screenshot helps message understanding, while the detached spreadsheet allows respondents to easily try out their formulae. The actual thread can be found at https://goo.gl/c9vbCF.
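
The detached-spreadsheet half of SCREENSHOT() could be sketched as follows (the screenshot half is omitted). The file and permission handling shown is an assumption about how the prototype might work, not its actual implementation.

```typescript
// Sketch: materialize a detached, link-shared sample spreadsheet for the
// selected range, so respondents can try out their formulae on real data.
function detachSample(rangeA1: string): string {
  const source = SpreadsheetApp.getActiveSheet().getRange(rangeA1);
  const sample = SpreadsheetApp.create("QuestionSheet sample");
  sample.getActiveSheet()
      .getRange(1, 1, source.getNumRows(), source.getNumColumns())
      .setValues(source.getValues());
  // Share by link so anyone answering the SO thread can fiddle with it.
  DriveApp.getFileById(sample.getId())
      .setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.EDIT);
  return sample.getUrl(); // URL to be inlaid into the question message
}
```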

4.2 Question Sharing

QuestionSheets can be posted immediately or handed to the timer. In this way, question specification is decoupled from question posting, letting users decide the most appropriate time slice for posting their questions. So far, the timer admits three values: immediate, USA_evening and USA_weekend, based on the studies in [5]. When posted, QuestionSheets are transparently turned into their question-thread counterparts. Therefore, questions have a double representation: one in SO (as question threads) and one in Google Sheets (as QuestionSheets). These two representations need to be kept in sync, i.e. answers in SO are propagated to the QuestionSheet counterpart, while changes in the Q&A spreadsheet might lead to re-edits of the question in SO (see later). Appropriate drivers map the QuestionSheet representation to the data structure expected by the Q&A platform at hand.
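
A sketch of such a driver follows. The Stack Exchange API does expose a write method (/2.3/questions/add) that requires an OAuth access token and a registered application key; the default tag and the timer helpers here are our own assumptions.

```typescript
// Sketch of the posting driver for Stack Overflow.
function postToStackOverflow(title: string, body: string,
                             accessToken: string, apiKey: string): number {
  const resp = UrlFetchApp.fetch(
      "https://api.stackexchange.com/2.3/questions/add", {
        method: "post",
        payload: {
          site: "stackoverflow",
          title: title,
          body: body,
          tags: "google-sheets",       // assumed default tag
          access_token: accessToken,
          key: apiKey,
        },
      });
  return JSON.parse(resp.getContentText()).items[0].question_id;
}

// Timer: either post right away or schedule a one-off trigger.
function scheduleQuestion(slot: "immediate" | "USA_evening" | "USA_weekend"): void {
  if (slot === "immediate") { postPendingQuestion(); return; }
  ScriptApp.newTrigger("postPendingQuestion")
      .timeBased().at(nextSlotDate(slot)).create();
}

// Hypothetical helpers, stubbed for the sketch.
function postPendingQuestion(): void { /* read draft, call postToStackOverflow */ }
function nextSlotDate(slot: string): Date {
  return new Date(); // would compute the next US evening/weekend instant
}
```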

Fig. 4. Incoming answers cause function QUESTION() to be re-evaluated, showing the number of SO answers so far.

4.3 Answer Awaiting

Once the question is posted, drivers periodically poll SO for answers. This happens either when the spreadsheet is opened or on a polling-frequency basis. By default, the polling frequency is set to 15 minutes. In this way, answers in SO are propagated to the associated QuestionSheet. This, in turn, causes QUESTION() to be re-evaluated. At this point, the user will notice how the QUESTION() display changes from “Question posted...” to “Question answered (n) ...” where “n” stands for the number of available answers. Figure 4 shows the case when two answers are available. The QuestionSheet does not need to be visible. Similar to the warnings about available app updates on mobile phones, the existence of answers does not force users to move to resolution right away. Based on the urgency, users might select the first answer that shows up or rather wait to see if additional answers come up.
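
A sketch of the polling driver is shown below. Reads against the Stack Exchange API need no authentication (an application key only raises quotas); the property-based plumbing that feeds the QuestionSheet is our own assumption.

```typescript
// Install a 15-minute time-driven trigger (Apps Script supports
// everyMinutes(1|5|10|15|30)).
function installPolling(): void {
  ScriptApp.newTrigger("pollAnswers").timeBased().everyMinutes(15).create();
}

// Pull answers for the posted question and propagate them so that
// QUESTION() displays "Question answered (n) ...".
function pollAnswers(): void {
  const props = PropertiesService.getDocumentProperties();
  const questionId = props.getProperty("soQuestionId");
  if (!questionId) return; // nothing posted yet
  const url = "https://api.stackexchange.com/2.3/questions/" + questionId +
      "/answers?site=stackoverflow&filter=withbody";
  const items = JSON.parse(UrlFetchApp.fetch(url).getContentText()).items;
  // Persisting the answers is what, in our sketch, stands in for the
  // QuestionSheet update that triggers QUESTION()'s display change.
  props.setProperty("answers", JSON.stringify(items));
  props.setProperty("questionState",
      JSON.stringify({ stage: "answered", answerCount: items.length }));
}
```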

4.4 Answer Resolution

SO threads are mapped back into QuestionSheets. Information collected from SO includes the answer itself, the contributor’s reputation, and the number of votes (if any) (see Fig. 5). Users can move between answers by clicking on the tabs. This causes QUESTION() to be re-evaluated. Specifically, QUESTION() accesses the cell containing the answer, and attempts to extract the formula from the text. In most cases, QUESTION() is successful, since most contributors follow SO’s guidelines of using markdown for code. If so, QUESTION() returns the formula. This, in turn, causes Google Sheets to evaluate the obtained formula, and return a value (e.g. Ben Hale). This value is compared with the expected outcomes (should outcomes be provided at question time), shading the answer tab green, yellow or red, depending on the number of expected outcomes the answer has matched. In this way, a single click on the answer tab promptly permits users to see whether the selected answer fits their expectations or not. There is no need to understand what the formula is about: formulas are judged by their outputs. This highlights the importance of providing expected outcomes with proper coverage.
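
The extraction and grading steps could look as follows. The pattern matching (markdown backticks or rendered <code> tags) and the traffic-light thresholds are plausible reconstructions of the behavior described above, not the prototype’s actual code.

```typescript
// Extract the first code span from an SO answer body.
function extractFormula(answerBody: string): string | null {
  const match = answerBody.match(/<code>([\s\S]*?)<\/code>/) ||
                answerBody.match(/`([^`]+)`/);
  return match ? match[1].trim() : null;
}

// Try the formula on the example cells and grade it against the
// expected outcomes provided at question time.
function gradeAnswer(formula: string, exampleCellsA1: string[],
                     expected: string[]): "green" | "yellow" | "red" {
  const sheet = SpreadsheetApp.getActiveSheet();
  let matches = 0;
  exampleCellsA1.forEach((a1, i) => {
    const cell = sheet.getRange(a1);
    cell.setFormula(formula.startsWith("=") ? formula : "=" + formula);
    SpreadsheetApp.flush(); // force re-evaluation before reading outputs
    if (String(cell.getValue()) === expected[i]) matches++;
  });
  if (matches === expected.length) return "green";
  return matches > 0 ? "yellow" : "red";
}
```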

Fig. 5. Answer Resolution. Answers are accessible via tabs (see smiley at the bottom). Clicking on a tab has a three-fold effect: (1) the sought formula is obtained from the SO thread using pattern matching; (2) this formula is displayed in the formula bar; and (3) the associated QUESTION() is re-evaluated to show what would be the formula output (e.g. Ben Hale). (Color figure online)

Users can wait until an answer produces the desired outputs. Once one does, they can make that answer final. To this effect, users click on the stop-sharing button (see top of Fig. 5). This ends the question lifecycle: the QuestionSheet is deleted together with the transient data. As for the SO counterpart, the question thread is closed following SO good practices. The only cue that a formula was obtained through SO is a comment attached to the corresponding cell. The comment holds the URL of the SO thread counterpart. This is the only trace left about how the formula was obtained.
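
A sketch of this finalization step: setNote() is the actual Sheets call for attaching a note to a cell, while the cleanup plumbing is assumed.

```typescript
// Make an answer final: fix the formula, leave the SO thread URL as the
// only trace, and wipe the transient question state.
function finalizeAnswer(targetCellA1: string, formula: string,
                        threadUrl: string): void {
  const cell = SpreadsheetApp.getActiveSheet().getRange(targetCellA1);
  cell.setFormula(formula);
  cell.setNote("Formula obtained via Stack Overflow: " + threadUrl);
  PropertiesService.getDocumentProperties().deleteAllProperties();
}
```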

5 Evaluation

This section reports on a formative evaluation conducted through five case studies. For QuestionSheet, the experiment does not stop at posing the question but extends along the question lifecycle, including waiting for the SO community to answer. This potentially long lifespan is what sustains the use of case studies for formative evaluation.

Table 2. Task description. The QuestionSheet-generated SO thread is included for reference.

5.1 The Experiment

Subjects. Participants were recruited locally. To qualify as “home users”, participants should have had at least one year of exposure to Google Sheets but must not have created more than six sheets in this period. Five students qualified. Participants were given a brief introduction to QuestionSheet where a running example was developed.

Case Studies. Scenarios were carefully selected so that questions would be neither too trivial nor too complicated. For the validity of the experiment, it is most important to find the right balance. Too-easy cases would not justify the effort of QuestionSheet, as most users would know the answers without resorting to SO. In addition, easy questions will get prompt answers no matter whether the description of the query is enhanced with examples or not (one of the aspects to be tested). On the other hand, too-complicated cases would also be unlikely for home users to stumble upon in their daily activities. Table 2 displays the selected cases, inspired by real doubts raised in forums. The table also includes the sought formula for readers to ponder the scenarios’ complexity. In addition, a link to the QuestionSheet-generated question in Stack Overflow is also included.

Table 3. Constructs of diffusion theory adapted for QuestionSheet.

Methodology. Each subject addressed one of the previous scenarios. To prevent scenario understanding from polluting the performance measures on the usage of QuestionSheet, the task was split into parts. First, we made sure subjects understood the scenario, i.e. they were asked to complete the sample spreadsheet with the values the expected formula should return (red shaded in Table 2). This helped to ensure scenarios were properly understood. Next, subjects were asked to post their questions using QuestionSheet. Once the testing session was over, participants were asked to fill in a questionnaire that rates different aspects of QuestionSheet through Likert scales (see Fig. 7). In addition, open comments were also welcome. The questionnaire builds upon a reduction of Rogers’ model of Diffusion of Innovations that includes only those constructs consistently related to technology adoption behavior: relative advantage, complexity and compatibility [21]. Table 3 elaborates these constructs for the QuestionSheet case.

5.2 Results

Elapsed Time. That is, the time it took for the SO community to answer the QuestionSheet-generated questions. We frame this outcome w.r.t. the distribution of time-to-answer in SO; more specifically, we obtained the elapsed-time first-answer distribution for “google-spreadsheet”-tagged questions in SO (see Fig. 6). Against this distribution, T3, T4 and T5 are among the top 10%, T2 in the top 30% and T1 in the top 50%. We believe these results to be rather good, even more so considering that the questions were posted by SO newcomers with no reputation, reputation being indicated as a main correlate of response speed [5]. This anecdotal evidence supports the findings of Calefato et al. regarding the use of examples as main answer attractors.

Fig. 6. Elapsed-time first-answer distribution for “google-spreadsheet”-tagged questions at Stack Overflow as of September 2017. Questions without an answer were removed.

Questionnaire. In general, users perceive QuestionSheet as beneficial w.r.t. directly typing the question in SO (questions 1–4 in Fig. 7). Importantly, users ranked QuestionSheet’s compatibility with Google Sheets’ modus operandi high (question 5). This is a main incentive for user adoption.

Fig. 7. Diverging stacked bar chart for the satisfaction questionnaire using Likert scales.

Threats to Validity. A larger number of participants is required to draw any conclusive remarks. As for external validity, QuestionSheet could post questions to Q&A platforms other than Stack Overflow as long as appropriate API-based drivers were provided. On the other hand, our experience can be of interest for frameworks that target home users who can resort to examples to make themselves understood in Q&A forums.

5.3 Lessons Learned

This subsection discusses lessons learned along the dimensions of the model introduced in [5].

Affect. Calefato et al. highlight the importance of a neutral emotional style to promote answers. In a similar vein, we realized subjects were not aware of the impact of posting questions repeatedly. Specifically, two users posted their questions (or small variations thereof) more than once. QuestionSheet might make Stack Overflow too transparent and question posting too easy, with the resulting risk of “question spam”. This bears out the famous quote: “Everything Should Be Made as Simple as Possible, But Not Simpler”. Users ignored the importance of reputation when interacting with social networks, and the eventuality of being banned if proper etiquette is not observed. Hence, QuestionSheet needs to be revised to restrict the number of questions per sheet, or to prevent questions which have already been posted on Stack Overflow.

Presentation Quality. Calefato et al. underscore the importance of attaching code snippets and examples to questions. This remark was even more important in our setting: non-native English speakers. Two subjects noticed that without examples they would not have been able to describe their questions in a narrative way using English. One subject indicated that, though she knew Stack Overflow, she was afraid of posing questions due to her limited command of English. Here, templates and examples lower the entry bar for home users.

Time Slice. Users were pleasantly surprised by the promptness of the answers. As a subject put it, “it is almost like Google!”. This promptness might well be due to the presentation quality of the questions (i.e. neutral language, example-based). But it might also be influenced by the medium complexity of the questions. Since askers are home users, questions are expected to be of medium complexity, and hence easily answerable by a larger pool of proficient answerers. In this respect, delaying question posting until the evenings or weekends might make sense when there is a need to attract a large number of answerers. However, medium-complexity questions might well be addressed equally by a smaller population without waiting until the weekend. If this were the case, it would certainly challenge the need for the built-in timer in QuestionSheet.

6 Conclusions

This work tackles how to spread the benefits of Q&A platforms to home users. To this end, we look into assisted inline question posting, where users post their questions without leaving their working environment. QuestionSheet tests this out for Google Sheets. Offering Q&A facilities from within Google Sheets not only avoids platform switches and facilitates focus, but also accounts for in-place example construction to be used for question clarification, which in turn helps to engage answerers. Formative evaluation is being conducted through five use cases. Though conclusive statements cannot yet be drawn, first insights suggest that in-place scaffolds might help to combat the digital divide between home users and more tech-savvy ones in enjoying the benefits of Q&A platforms. Yet, the benefits of educating users to formulate questions go beyond the askers themselves. Further investigation should look at how “good questions” enhance effective knowledge-sharing behavior that will eventually lead to the creation of long-lasting, valuable pieces of knowledge in Q&A platforms. Echoing Israelmore Ayivor’s quote “To get answers, ask questions; but to get good answers, ask good questions”, we conclude that for Q&A platforms to become a repository of good answers, first you need good questions.