1 Introduction

To maintain high product quality, participants in Free/Libre and Open Source Software (FLOSS) projects constantly report and resolve issues such as bugs and feature requests [12]. Several factors make issue resolution a complicated task. In large and active projects, e.g., GNOME (Footnote 1) and Mozilla (Footnote 2), more than one thousand new issue reports are filed per month by various participants, including users and developers with diverse experience and skills, which leads to mixed report quality. Meanwhile, many reporters and developers in FLOSS projects participate voluntarily, and some only occasionally, which brings further difficulties such as unstable communication and lack of time and effort [12, 13]. To address these difficulties and get issues properly resolved in time, projects develop an issue workflow, i.e., a sequence of issue resolution steps that allows participants to communicate with each other and to coordinate tasks. The workflow has to evolve to stay efficient. For example, a Mozilla bugmaster meeting discussed how to adjust the issue triage strategy to scale as more Mozilla teams and community members engaged in triage (Footnote 3). This suggests the importance of understanding issue workflow.

However, knowledge of issue workflow in practice is typically tacit. It is often neither known explicitly nor accurately reflected in the meager documentation, even when such documentation exists. For example, in Mozilla, the issue workflow confused a junior developer (Footnote 4) even though it was posted online. While GNOME also defines standard triage steps on its website (Footnote 5), they are not consistent with how triage is done in practice, as illustrated in this paper.

Issue tracking systems such as Bugzilla (Footnote 6) record the history of how issue reports were resolved [5]. Practitioners can use these data to review the workflows they actually followed in the past or to learn from others. However, conducting such exploration on raw issue repositories is complicated: it requires data collection, cleaning, sorting, aggregation, measurement and visualization [8]. To facilitate these investigations, we propose Issue Workflow Explorer (IWE). Based on issue tracking data, IWE discovers, measures and visualizes issue workflows, and it quantifies how various workflows affect the lead time, complexity and output of the resolution process. By applying IWE to GNOME and Mozilla, we study two major concerns about issue workflow: issue triaging and the handling of incomplete issue reports. Workflows with private triage vs. public triage, and workflows that leave incomplete issues open vs. those that close them, are discovered and evaluated. We obtain the insights that triage conducted by reporters themselves should be restricted and that it is not cost-effective to keep incomplete issue reports open. These studies demonstrate the ability of IWE to discover and evaluate different types of issue workflows.

The paper is organized as follows: Sect. 2 introduces the concept of issue workflow, Sect. 3 describes the design of IWE, Sect. 4 presents the empirical evaluation, Sect. 5 demonstrates the detailed operations of IWE, Sect. 6 reviews related work, and Sect. 7 concludes.

2 Issue Workflow

In an issue workflow, issue reports pass through a sequence of steps, e.g., submission, triaging, and fixing. The results of these steps are shown as status labels in issue tracking systems. For example, in the standard workflow defined by GNOME (see Footnote 1), newly filed issues are labelled UNCONFIRMED. When a triager confirms that a report describes a valid issue, its status changes to NEW. Alternatively, if it is, for example, a duplicate report, it may be closed immediately and its status changes to RESOLVED. When the report does not contain sufficient information for developers to reproduce and fix the issue, the status changes to NEEDINFO while waiting for the reporter or others to complete it. Issue reports in status NEW are to be assigned and resolved. The assignee may accept the report (status ASSIGNED), pass it to someone else (the report remains in status NEW), or resolve it (status RESOLVED). Finally, each RESOLVED report carries a resolution of FIXED, DUPLICATE, INCOMPLETE, or INVALID. In IWE, issue workflows are described through transitions between these statuses.
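The status transitions described above can be read as a small state machine. The following is a minimal sketch of that reading, not IWE's implementation: the names `ALLOWED_TRANSITIONS` and `follows_standard_workflow` are hypothetical, and the transitions out of NEEDINFO and ASSIGNED are assumptions, since the text does not spell them out.

```python
# Status transitions named in the GNOME workflow description above.
# Assumption: the text does not say where NEEDINFO or ASSIGNED reports go
# next; the entries marked "assumption" are illustrative guesses.
ALLOWED_TRANSITIONS = {
    "UNCONFIRMED": {"NEW", "NEEDINFO", "RESOLVED"},
    "NEW": {"ASSIGNED", "RESOLVED"},
    "NEEDINFO": {"NEW", "RESOLVED"},  # assumption
    "ASSIGNED": {"RESOLVED"},         # assumption
    "RESOLVED": set(),
}
RESOLUTIONS = {"FIXED", "DUPLICATE", "INCOMPLETE", "INVALID"}


def follows_standard_workflow(path, resolution=None):
    """Return True if every consecutive status pair in `path` is allowed.

    If `resolution` is given, the path must also end in RESOLVED and the
    resolution must be one of the four results listed in the text.
    """
    if not path:
        return False
    for current, nxt in zip(path, path[1:]):
        if nxt not in ALLOWED_TRANSITIONS.get(current, set()):
            return False
    if resolution is not None:
        return path[-1] == "RESOLVED" and resolution in RESOLUTIONS
    return True


# A confirmed, assigned and fixed report follows the standard workflow.
print(follows_standard_workflow(
    ["UNCONFIRMED", "NEW", "ASSIGNED", "RESOLVED"], "FIXED"))
```

A report that starts directly in NEW or ASSIGNED still passes this check, which is consistent with the observation in Sect. 4.1 that not all reports begin with UNCONFIRMED.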

3 Design of IWE

The basic goal of IWE is to simplify the querying and evaluation of issue workflows through visualization [1, 6], assisting practitioners in analysing previous issue resolution practices. We introduce four types of measures to evaluate issue workflows, and we design selectors and interactive views to conduct and visualize the measurements. An overview of the tool is shown in Fig. 1.

Fig. 1. Overview of IWE.

3.1 Measurements

Based on the literature and our experience in investigating issue workflow (e.g., [16, 17]), we introduce five measures (M1 to M5) to characterize issue workflows and quantify their efficiency and effectiveness.

The number of issue reports within a defined scope, e.g., modules or a time span, is a basic metric that indicates project workload, software quality [2, 7], etc. We use it to measure the population trend of the investigated issues, i.e., the number of reports with properties \(P_s\) that were submitted or resolved during time span T (M1).

Status transitions tell what workflow practitioners follow to resolve the reports, and a number of studies have tried to model and present the transitions [2, 14, 16]. We measure the occurrence of a workflow through the number of selected issue reports that passed through status sequence \(S_s\) (M2).

People have shown great interest in how efficiently time is spent on resolving issue reports [7]. To address this concern, we calculate the time spent on the transition from the beginning to status \(S_e\) (M3).

Finally, the results and complexity of the issue resolution process, e.g., whether issues are fixed, determine how effectively effort is spent, which has also received attention [3]. To address this concern, we calculate the fraction of reports with resolution result R (M4) and the number of transitions a report experienced (M5).
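To make the five measures concrete, here is a minimal sketch of how they could be computed over a list of reports. The `Report` structure and all function names are hypothetical, not IWE's API. Two assumptions are made where the text is open: M1 counts by submission date only, and M2 treats \(S_s\) as a contiguous run of statuses.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Report:
    """Hypothetical report record; history[0] is the submission event."""
    product: str
    history: List[Tuple[str, date]]  # (status, date), chronological
    resolution: Optional[str] = None


def m1_population(reports, has_properties, start, end):
    """M1: reports with the given properties submitted within [start, end]."""
    return sum(1 for r in reports
               if has_properties(r) and start <= r.history[0][1] <= end)


def m2_occurrence(reports, sequence):
    """M2: reports whose status path contains `sequence` contiguously."""
    seq = list(sequence)

    def contains(path):
        return any(path[i:i + len(seq)] == seq
                   for i in range(len(path) - len(seq) + 1))

    return sum(1 for r in reports if contains([s for s, _ in r.history]))


def m3_days_to_status(report, status):
    """M3: days from submission to first reaching `status` (None if never)."""
    submitted = report.history[0][1]
    for s, d in report.history:
        if s == status:
            return (d - submitted).days
    return None


def m4_resolution_fraction(reports, resolution):
    """M4: fraction of reports that ended with `resolution`."""
    return sum(r.resolution == resolution for r in reports) / len(reports)


def m5_transitions(report):
    """M5: number of status transitions the report experienced."""
    return len(report.history) - 1


# Two made-up reports, for illustration only.
a = Report("gtk", [("UNCONFIRMED", date(2007, 1, 1)), ("NEW", date(2007, 1, 3)),
                   ("RESOLVED", date(2007, 1, 11))], "FIXED")
b = Report("gtk", [("UNCONFIRMED", date(2007, 2, 1)), ("NEEDINFO", date(2007, 2, 2)),
                   ("RESOLVED", date(2007, 3, 4))], "INCOMPLETE")
print(m3_days_to_status(a, "RESOLVED"), m5_transitions(b))
```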

3.2 Selectors

We propose eight selectors for querying workflows. With them, users can customize the input of the views (introduced in Sect. 3.3) to direct or narrow down their exploration. Issue tracking systems, e.g., Bugzilla, define a number of fields (Footnote 7) to describe issue properties. We focus on the commonly used fields [11, 15] and build the products, severity and priority selectors to specify the product, severity and priority of the issue reports to investigate. The transition selectors, i.e., starts with and includes, pick out issue reports that start with or include the selected status. Since workflow effectiveness is related to the issue resolution result, we add a resolution selector to pick out issue reports ending with a specific result. The remaining two selectors, report time and resolve time, set the time span within which the chosen issue reports were submitted or resolved.
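The eight selectors can be thought of as optional filters that are combined with a logical AND. The sketch below illustrates this under stated assumptions: the `Selection` class, the dictionary keys of a report, and the rule that "includes" matches when any selected status appears in the path are all hypothetical choices, not taken from IWE.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Set, Tuple


@dataclass
class Selection:
    """One optional field per selector; None means 'no restriction'."""
    products: Optional[Set[str]] = None
    severity: Optional[Set[str]] = None
    priority: Optional[Set[str]] = None
    starts_with: Optional[Set[str]] = None
    includes: Optional[Set[str]] = None
    resolution: Optional[Set[str]] = None
    report_time: Optional[Tuple[date, date]] = None
    resolve_time: Optional[Tuple[date, date]] = None


def in_span(day, span):
    """True if `day` is set and lies within the inclusive (start, end) span."""
    return day is not None and span[0] <= day <= span[1]


def is_selected(report: dict, sel: Selection) -> bool:
    """Apply every active selector; a report must pass all of them."""
    path = report["path"]
    return all([
        sel.products is None or report["product"] in sel.products,
        sel.severity is None or report["severity"] in sel.severity,
        sel.priority is None or report["priority"] in sel.priority,
        sel.starts_with is None or path[0] in sel.starts_with,
        sel.includes is None or bool(sel.includes & set(path)),
        sel.resolution is None or report["resolution"] in sel.resolution,
        sel.report_time is None or in_span(report["reported"], sel.report_time),
        sel.resolve_time is None or in_span(report["resolved"], sel.resolve_time),
    ])


# A made-up report, for illustration only.
report = {
    "product": "gtk", "severity": "normal", "priority": "Normal",
    "path": ["UNCONFIRMED", "NEEDINFO", "RESOLVED"], "resolution": "INCOMPLETE",
    "reported": date(2007, 2, 1), "resolved": date(2007, 3, 4),
}
print(is_selected(report, Selection(includes={"NEEDINFO"},
                                    resolution={"INCOMPLETE"})))
```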

3.3 Views

We design three views, which receive their input from the selectors, for users to conduct the measurements and observe the results.

The Workflow View visualizes the measurements of issue population (M1), status transitions (M2), efficiency (M3) and effectiveness (M4). We present issue workflows, i.e., status transitions, in the form of trees. The thickness and length of an edge visually indicate the number of issue reports experiencing the transition and the time cost, respectively. The level of a leaf gives the number of transitions that reports in this workflow experienced. Quantitative measures are shown in tooltips, where the time spent in each workflow is given by the first quartile, median and third quartile in addition to the mean, because the distribution is skewed. To support the investigation of a single workflow or a part of it, we add selectors that pick it out as the input for the Investigation View and Time Trend View.
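The tree of status transitions can be sketched as a prefix tree over status paths, where each node counts the reports passing through it (the quantity mapped to edge thickness). This is an illustrative sketch, not IWE's code: the function names are hypothetical, and the choice of the "inclusive" quartile method is an assumption, since the text does not say how quartiles are computed.

```python
from statistics import mean, quantiles


def build_workflow_tree(paths):
    """Build a prefix tree; each node is {"count": int, "children": {...}}."""
    root = {"count": 0, "children": {}}
    for path in paths:
        node = root
        node["count"] += 1
        for status in path:
            node = node["children"].setdefault(
                status, {"count": 0, "children": {}})
            node["count"] += 1  # reports passing through this transition
    return root


def time_summary(days):
    """First quartile, median, third quartile and mean of resolve times."""
    q1, med, q3 = quantiles(days, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3, "mean": mean(days)}


# Made-up status paths, for illustration only.
paths = [
    ("UNCONFIRMED", "NEW", "RESOLVED"),
    ("UNCONFIRMED", "RESOLVED"),
    ("NEW", "RESOLVED"),
]
tree = build_workflow_tree(paths)
print(tree["children"]["UNCONFIRMED"]["count"])
print(time_summary([1, 2, 3, 4, 5]))
```

The depth of a node in this tree equals the number of statuses on the path leading to it, so the level of a leaf reflects how many transitions the reports in that workflow went through.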

The Investigation View is built for studying the efficiency and effectiveness of selected workflows. It has two parts, the Resolution View and the Resolve Time View, which present measures of effectiveness (M4) and efficiency (M3), respectively. The Resolution View is a bar chart showing the fraction of issue reports with each result. The Resolve Time View is a line chart showing the fraction of reports resolved within the time indicated by the x axis.
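The data behind the two charts can be sketched as follows: a per-result fraction for the bar chart and a cumulative fraction over resolve time for the line chart. The function names and the day thresholds are hypothetical and serve only as an illustration.

```python
from collections import Counter


def resolution_fractions(resolutions):
    """Bar chart data: fraction of reports ending with each result."""
    total = len(resolutions)
    return {result: count / total
            for result, count in Counter(resolutions).items()}


def fraction_resolved_within(resolve_days, thresholds):
    """Line chart data: for each x value, fraction resolved within x days."""
    total = len(resolve_days)
    return [(x, sum(d <= x for d in resolve_days) / total)
            for x in thresholds]


# Made-up inputs, for illustration only.
print(resolution_fractions(["FIXED", "FIXED", "DUPLICATE", "INCOMPLETE"]))
print(fraction_resolved_within([1, 3, 10, 40], [1, 7, 30, 90]))
```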

We design the Time Trend View to break down the measurements in the Workflow View and Investigation View by time slots (e.g., days, weeks or months) along a timeline, in order to investigate trends in those measures. The report time or resolution time option decides which time slot an issue report belongs to. For example, for a report submitted on day d1 and resolved on day d2, if report time is selected and the time slot is a day, the report is counted in day d1; otherwise, it is counted in day d2. The meaning of the y axis depends on the view that triggers it: the number of reports in the selected workflow for the Workflow View, the fraction of fixed issues for the Resolution View, and the number of days within which 90% of reports were resolved for the Resolve Time View.
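The slotting rule described above can be sketched as a simple bucketing step. The names `slot_key` and `time_trend` and the slot label formats are hypothetical; only the counting variant of the y axis (number of reports per slot) is shown.

```python
from collections import Counter
from datetime import date


def slot_key(day, slot):
    """Map a date to the label of the day, ISO week or month it falls in."""
    if slot == "day":
        return day.isoformat()
    if slot == "week":
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if slot == "month":
        return f"{day.year}-{day.month:02d}"
    raise ValueError(f"unknown slot: {slot}")


def time_trend(reports, slot="month", by="report"):
    """Count reports per slot, using report time or resolve time.

    `reports` is a list of (reported, resolved) date pairs.
    """
    counts = Counter()
    for reported, resolved in reports:
        day = reported if by == "report" else resolved
        counts[slot_key(day, slot)] += 1
    return dict(sorted(counts.items()))


# Made-up reports, for illustration only.
reports = [
    (date(2007, 3, 30), date(2007, 4, 2)),
    (date(2007, 4, 1), date(2007, 4, 20)),
]
print(time_trend(reports, "month", "report"))
print(time_trend(reports, "month", "resolve"))
```

The first report falls in March when slotted by report time and in April when slotted by resolve time, which is exactly the d1 versus d2 distinction in the text.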

To allow comparison of the efficiency, effectiveness and trends of different workflows, we retain recent measurement results in the Investigation and Time Trend views.

4 Empirical Evaluation

To demonstrate the value of IWE in helping practitioners understand issue workflows, we study the following research questions.

RQ0: Can IWE help users discover and evaluate previous workflows in practice and get insights for future development? This is the overall question about the effectiveness of the tool. We answer it by studying two major concerns of issue workflow: issue triaging and the handling of incomplete issues.

RQ1: What manners of issue triage are there, and what are their strengths and weaknesses in efficiency and effectiveness? Issue triage has received wide attention from both practitioners (see Footnote 1) and researchers [9, 17]. We expect to discover and evaluate workflows with different triage manners and derive implications.

RQ2: What strategies are there for handling incomplete issue reports, and what are their advantages and disadvantages? It is well known that many reporters participate in FLOSS projects voluntarily or only occasionally. Therefore, the issue resolution process often stops and waits a long time for responses from reporters or developers [16]. Similar to RQ1, we want to discover and evaluate workflows with different strategies for incomplete reports and obtain insights for improvement.

We use the issue tracking data of GNOME and Mozilla, two well-known large-scale FLOSS projects, from our previous study [18]. Both of them use Bugzilla, and their issue repositories contain 432K and 679K reports, respectively, submitted over more than 10 years. In the following, we study the above questions through IWE.

4.1 Issue Triage

Since the issue tracking systems of FLOSS projects are open to everyone, a reporter may have little experience in filing issues properly. As mentioned in Sect. 2, the community conducts an inspection process to filter out irrelevant reports, pick out reports needing additional information and assign the remaining ones to the right product. Usually, projects set the status of new reports to UNCONFIRMED, and after public triage, valid reports are moved to NEW, waiting for developers to fix the issues.

Workflow Discovery: Among the workflows of Mozilla and GNOME (Fig. 2(a) and (b)) presented by IWE, we find that not all reports began with UNCONFIRMED. This implies that trusted reporters were granted the privilege to file a report directly as NEW or ASSIGNED. Unlike public triage, such triage is conducted by the reporters themselves (private triage), who may be skilled developers. In the Time Trend View, we can see that private triage was periodically popular, which means that these two projects have tried and adjusted this manner over a long time. We propose Hypothesis 1: workflows with private triage shorten the lead time of the resolution process when the triage is correct, but increase the time and effort cost when it is not.

Fig. 2. Workflows of Mozilla and GNOME.

Efficiency and Effectiveness Exploring: To test Hypothesis 1, we first study the Mozilla project. Selecting issue reports that started with NEW or ASSIGNED and those that began with UNCONFIRMED, we evaluate these two types of workflows separately through the Investigation View. Figure 3(a) shows the results: reports that went through private triage were processed faster than those that went through public triage. However, when we restrict the selection to reports that were judged to be duplicates, i.e., we only include incorrect private triages, the results (see Fig. 3(b)) are reversed. Looking at the Workflow View, we find the reason: when the private triage was incorrect and the reports were assigned to developers, the resolution was delayed. This may also waste the effort of the assigned developers. When we apply the same investigation to the GNOME project (Footnote 8), we get similar results. All of this evidence supports Hypothesis 1.
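The comparison above amounts to grouping reports by their first status and comparing resolve times, optionally restricted to one resolution. The sketch below illustrates that procedure only. The function names are hypothetical, and the toy numbers are invented for illustration; they are not the paper's data or results.

```python
from statistics import median

# Reports filed directly as NEW or ASSIGNED count as private triage.
PRIVATE_START = {"NEW", "ASSIGNED"}


def triage_manner(path):
    """Classify a status path by how the report entered the workflow."""
    return "private" if path[0] in PRIVATE_START else "public"


def median_days_by_manner(reports, resolution=None):
    """Median resolve time per triage manner.

    `reports` is a list of (path, resolution, days_to_resolve) tuples.
    If `resolution` is given, only reports with that result are used.
    """
    groups = {"private": [], "public": []}
    for path, result, days in reports:
        if resolution is not None and result != resolution:
            continue
        groups[triage_manner(path)].append(days)
    return {manner: median(days) for manner, days in groups.items() if days}


# Invented toy data, NOT taken from the Mozilla or GNOME repositories.
toy = [
    (("NEW", "RESOLVED"), "FIXED", 3),
    (("ASSIGNED", "RESOLVED"), "FIXED", 5),
    (("UNCONFIRMED", "NEW", "RESOLVED"), "FIXED", 10),
    (("NEW", "ASSIGNED", "RESOLVED"), "DUPLICATE", 20),
    (("UNCONFIRMED", "RESOLVED"), "DUPLICATE", 2),
]
print(median_days_by_manner(toy))
print(median_days_by_manner(toy, resolution="DUPLICATE"))
```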

Fig. 3. Efficiency of different triage manners.

Insights: By investigating the workflows and testing H1, we have answered RQ1. When properly applied, private issue triage can help the community save effort and time in resolving issues, but it has side effects when the triage is wrong. Therefore, the privilege to conduct private triage should be strictly restricted to experienced developers, and developers who hold it should exercise it carefully.

4.2 Handling of Incomplete Reports

Less experienced reporters may file issue reports without enough information for developers to understand or reproduce the bug. Therefore, the resolution process must stop and wait for the reporters to come back and complete it. Too many stuck reports may overwhelm other reports, and projects need strategies to address this problem.

Workflow Discovery: In the workflows of GNOME presented by IWE, we find a status, NEEDINFO, which is used to label incomplete issue reports. However, the workflows of Mozilla do not have this status. In Mozilla, incomplete reports are closed directly with the resolution INCOMPLETE. If reporters come back and find this resolution, they may reopen the report and provide additional information. In the Time Trend View of GNOME (Fig. 4), we observe that from October 2006 to November 2007 a great many reports were resolved as INCOMPLETE. In particular, the majority of the reports submitted before April 2007 experienced NEEDINFO, while most of the reports filed after that time did not. This suggests that the GNOME community decreased its use of NEEDINFO when there were too many incomplete reports. We searched the GNOME mailing list and found an email that confirms this observation (Footnote 9). We propose Hypothesis 2: skipping the NEEDINFO status and closing incomplete issues directly reduces the resolve time and avoids the retention of too many unresolved issues, while using NEEDINFO lets incomplete issues obtain sufficient information later.

Fig. 4. Trend of incomplete issue reports.

Efficiency and Effectiveness Exploring: To test Hypothesis 2, we study the GNOME project, which applied both strategies for incomplete reports. Targeting reports resolved as incomplete, we select those that went through NEEDINFO and those that did not in the Workflow View. Conducting the measurement in the Investigation View, we find that issue reports processed with the latter strategy were resolved much faster than those processed with the former (Fig. 5(a)). This offers evidence for the first half of H2. When no criteria are given, the Investigation View shows that 14.3% of all 432k issue reports, i.e., 62k reports, were finally incomplete (Fig. 5(b)). When we select issue reports that experienced NEEDINFO, we see that 45.7% of the 71k reports were finally incomplete (Fig. 5(b)), i.e., only 54.3% of them obtained enough information. This evidence does not strongly support the second half of H2.
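The percentages quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the rounded figures given in the text (432k, 71k, 14.3%, 45.7%); the variable names are hypothetical, and the derived counts are approximate because the inputs are themselves rounded.

```python
# Rounded figures as stated in the text.
TOTAL_REPORTS = 432_000
INCOMPLETE_SHARE = 0.143           # share of all reports finally incomplete
NEEDINFO_REPORTS = 71_000
NEEDINFO_INCOMPLETE_SHARE = 0.457  # share of NEEDINFO reports finally incomplete

# 14.3% of 432k reports: roughly the 62k stated in the text.
incomplete_total = TOTAL_REPORTS * INCOMPLETE_SHARE

# Complement of 45.7%: the share of NEEDINFO reports that got enough information.
needinfo_completed_share = 1 - NEEDINFO_INCOMPLETE_SHARE

# Approximate number of NEEDINFO reports that still ended as INCOMPLETE.
needinfo_incomplete = NEEDINFO_REPORTS * NEEDINFO_INCOMPLETE_SHARE

print(round(incomplete_total / 1000), "k reports finally incomplete")
print(round(needinfo_completed_share * 100, 1), "% of NEEDINFO reports completed")
print(round(needinfo_incomplete / 1000, 1), "k NEEDINFO reports still incomplete")
```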

Fig. 5. Efficiency and effectiveness of strategies for incomplete reports.

Insights: By exploring the workflows and testing H2, we have answered RQ2. To avoid blocking the resolution process and overwhelming other issues, skipping the NEEDINFO status should be the first choice. Using the NEEDINFO status cannot ensure that the majority of incomplete issues obtain sufficient information.

Having studied RQ1 and RQ2, we can answer RQ0: IWE can help users discover previous workflows in practice, evaluate their efficiency and effectiveness, and get insights for future development.

Fig. 6. Workflows ending with INCOMPLETE.

5 Operations of IWE

We walk through the detailed operations of IWE to show its usability. We take the analysis conducted in Sect. 4.2 as an example.

First, we select the GNOME project through the drop-down list of Project.

Second, to measure the resolve time, we start from the Selectors and specify the issue reports resolved as INCOMPLETE in the drop-down list of Resolution. After we click the “Redraw” button in the Workflow View, IWE draws the workflow tree shown in Fig. 6, which contains two workflows: one has the NEEDINFO status and the other does not. On the workflow tree, we pick out the workflow with the NEEDINFO status using the Select button. In the bottom half of the Investigation View, we click the Draw button to see the time spent within this workflow. We measure the resolve time of the workflow without the NEEDINFO status through similar steps. These results are presented in Fig. 5(a).

Finally, to measure the output of the workflow with the NEEDINFO status, we select this status in the drop-down list of Include Status and click the Draw button in the top half of the Investigation View to obtain the result, which is shown in Fig. 5(b).

The analysis of the results is elaborated in Sect. 4.2.

6 Related Work

There has been a substantial amount of work on developing approaches or tools to investigate issue tracking repositories in order to understand the bug life cycle and bug properties. Data visualization is a popular topic in this field, and the approach proposed by D’Ambros et al. [2] was one of the early attempts. They proposed a system-level view, which visualizes the distribution of open bugs in the components over time, and a bug-level view, which shows the status changes of a single bug. Similarly, Knab et al. visualized effort measures, the sequence of issue resolution steps and the duration of each step [10]. Hora et al. proposed a tool to present the bugs in each class/package of a software system [7]. Unlike [2], they detailed how the number of bugs changes across versions and the bug lifetime in each class. Gong and Zhang visualized the location of bugs in the system as a topographic map, where contour lines depict the number of bugs in each component/file [4]. Besides visualization, Ripoche et al. proposed a generalized probabilistic network model in the form of a Markov model and probabilistic finite state automata (FSA) as a statistical and computational foundation for understanding the bug fix process [14].

In this paper, we focus on discovering and evaluating issue workflows in a quantitative and visual way to help practitioners make decisions.

7 Conclusions

We build IWE to support the exploration of issue workflow. By providing practitioners with visualized measurements, IWE makes it easy to find unsatisfactory workflows that need improvement. The empirical studies show that IWE achieves this goal. In the future, we plan to make IWE compatible with more issue tracking systems and to introduce it to both commercial and FLOSS projects to collect feedback for improvement.