Bug Triage

General Overview

Most open source software development projects incorporate an open bug repository that allows both developers and users to report problems encountered with the software, suggest possible enhancements, and comment on existing bug reports. One potential advantage of an open bug repository is that it may allow more bugs to be identified and solved, improving the quality of the software produced.

However, this potential advantage comes with a significant cost. Each reported bug must be triaged to determine whether it describes a meaningful new problem or enhancement and, if it does, it must be assigned to an appropriate developer for further handling. Consider the Eclipse Platform open source project over a four-month period (January 1, 2005 to April 30, 2005), when 3426 reports were filed, an average of 29 reports per day. Assuming that a triager takes approximately five minutes to read and handle each report, over two person-hours per day are spent on this activity. If all of these reports led to improvements in the code, this might be an acceptable cost to the project. However, since many of the reports are duplicates of existing reports or are not valid reports, much of this work does not improve the product. For instance, of the 3426 reports for Eclipse, 1190 (36%) were marked as invalid, as a duplicate, as a bug that could not be replicated, or as one that will not be fixed.
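The arithmetic behind the daily triage cost can be sketched as follows. This is a minimal illustration using only the figures quoted above; the function name `triage_cost` and the constant names are hypothetical, and the five-minute handling time is the same assumption made in the text.

```python
from datetime import date

# Figures quoted in the text for the Eclipse Platform project.
PERIOD_START = date(2005, 1, 1)
PERIOD_END = date(2005, 4, 30)
REPORTS_FILED = 3426
MINUTES_PER_REPORT = 5  # assumed handling time per report, as in the text


def triage_cost(reports, start, end, minutes_per_report):
    """Return (reports per day, person-hours per day) for an inclusive date range."""
    days = (end - start).days + 1  # +1 so that both endpoints are counted
    reports_per_day = reports / days
    hours_per_day = reports_per_day * minutes_per_report / 60
    return reports_per_day, hours_per_day


reports_per_day, hours_per_day = triage_cost(
    REPORTS_FILED, PERIOD_START, PERIOD_END, MINUTES_PER_REPORT
)
print(f"{reports_per_day:.2f} reports/day, {hours_per_day:.2f} person-hours/day")
```

Changing `MINUTES_PER_REPORT` shows how sensitive the estimate is to the assumed handling time: the daily cost scales linearly with it.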

We seek to improve the bug triage process by understanding how triage is done and by creating tools that support triagers in their decisions about bug reports. The subprojects that we are working on are bug report assignment and duplicate detection, each described below.

Our work has been published in a variety of venues.

Publications

Bug Report Assignment

Overview

to be written - John

Duplicate Detection

Overview

to be written - Lyndon

Publications

Topic revision: r1 - 2006-02-28 - JohnAnvik