Wednesday, December 18, 2002

SCRUM: Dealing with Bugs in a Sprint

Update on 17 Sep 2008: Please note that this is a very old posting from the early days of PatientKeeper, before we learned how to eliminate the "QA Sprint" or "stabilization Sprint" before going to full production in a hospital system. The stabilization Sprint fully involved cross-functional teams. In 2003, a new CEO asked why the definition of DONE couldn't be at least four large hospital systems live with no outstanding issues at the end of every monthly Sprint. It took us two years of work to achieve this, and beginning in 2005, PatientKeeper has gone live at the end of every Sprint with no extra time for QA, regression or performance testing. Also, the Sprint in this posting was an emergency that painfully taught us that a Sprint longer than one month should never be done. This Sprint represents a failure mode that we have never repeated.

The good news about this posting is that it shows how early PatientKeeper implemented the best Scrum tracking system I have ever seen. It still has significant advantages over any Scrum tool on the market. It has been migrated to Jira as the underlying storage system for all tasks. Bugs are simply another development task in this system.

Recently a question was posted on the object-technology list asking how to handle bugs during a SCRUM sprint. We have refined the art of SCRUM project management at PatientKeeper during the past couple of years to handle multiple simultaneous sprints with thousands of simultaneous tasks, integrating bug tracking with development project tracking. The requirement for the project management system was less than 60 seconds per day per developer to update, and less than 10 minutes per day for the project manager to do complete reporting with charts and graphs.

In order to meet this requirement, the developers could not be asked to use any software other than the bug tracking system they already used. We added only a few data items so that development tasks could be tracked alongside bugs: class (development task or bug), initial estimate, time invested, and percent complete. Our team was already using the open source bug tracking system GNATS. We enhanced GNATS with these data items and the ability to dump data into an Excel spreadsheet automatically, every day, for the project manager.
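To make the shape of that data concrete, here is a minimal sketch of one tracker entry and a daily dump to CSV, which Excel can open. It is not our GNATS code: the `Item` class, its field names, the `dump_csv` helper, and the use of days as the unit are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Item:
    """One tracker entry. Field names are illustrative, not GNATS's schema."""
    item_id: int
    item_class: str          # "task" or "bug"
    initial_estimate: float  # assumed unit: days
    time_invested: float     # assumed unit: days
    percent_complete: int    # 0 to 100


def dump_csv(items):
    """Return all items as CSV text, one row per task or bug."""
    buf = io.StringIO()
    names = [f.name for f in fields(Item)]
    writer = csv.DictWriter(buf, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buf.getvalue()


# Made-up sample data: one development task and one bug.
sample = [Item(1, "task", 3.0, 1.5, 50), Item(2, "bug", 0.5, 0.0, 0)]
print(dump_csv(sample))
```

Keeping tasks and bugs in one record type is the point: the developer updates a single list, and the project manager gets a single file.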

The Burndown Chart above was published in: Sutherland, Jeff. Agile Can Scale: Inventing and Reinventing SCRUM in Five Companies. Cutter IT Journal 14:12:5-11, Dec 2001. If you would like a copy, send me email.

The chart represents a rather long Sprint required to deliver Release 6 of the PatientKeeper mobile product for clinical results. Planning for this project started in April (while another release was in progress), and development items, including initial estimates, were entered into GNATS. Initial estimates are frozen and cannot be changed after entry. This allows us to assess estimation accuracy at a later date. You can see the backlog building as new project pieces are entered. The Sprint was kicked off in June and there was a rapid increase in backlog as product marketing added new tasks and developers found out their initial estimates were too short or tasks were missing. In the initial part of a Sprint, the challenge is to get the backlog complete and start flying the backlog curve down into the delivery date. In this case delivery had a hard stop at 20 August.

As bugs are found they go into the same system. It is possible to enter estimates for fixing bugs, but we usually track them by count. The data dumped to the project manager can be looked at in multiple ways. It also shows bug inflow and outflow by day, so you can see how many bugs are being found and fixed as you move forward. This metric is one of the most critical pieces of information for assessing the stability of the code, as well as the success of the project. In the development Sprint which ended on 20 August in this chart, the focus is on cumulative days remaining for unfinished tasks, which includes testers working with developers to eliminate bugs. On 20 August, we moved into a stabilization Sprint to assure the system was fully production ready and to take it live at a large hospital system. The teams then shifted to focus totally on bug counts remaining day by day, including inflow and outflow.
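The inflow and outflow numbers are simple to derive once each bug carries an opened date and, when fixed, a closed date. The sketch below shows one way to do it; the `bug_flow` function and the (opened, closed) pair layout are assumptions for illustration, not the actual spreadsheet logic, and the dates are made up.

```python
from collections import Counter
from datetime import date


def bug_flow(bugs):
    """Summarize bugs found and fixed per day.

    bugs: list of (opened, closed) date pairs; closed is None while open.
    Returns a list of (day, inflow, outflow, open_count), sorted by day.
    """
    inflow = Counter(opened for opened, _ in bugs)
    outflow = Counter(closed for _, closed in bugs if closed is not None)
    rows = []
    open_count = 0
    for day in sorted(set(inflow) | set(outflow)):
        # A Counter returns 0 for days with no bugs found or fixed.
        open_count += inflow[day] - outflow[day]
        rows.append((day, inflow[day], outflow[day], open_count))
    return rows


# Made-up sample data: three bugs over two days, one still open.
sample = [
    (date(2000, 1, 3), date(2000, 1, 4)),
    (date(2000, 1, 3), None),
    (date(2000, 1, 4), date(2000, 1, 4)),
]
for row in bug_flow(sample):
    print(row)
```

When outflow stays above inflow for several days running, the open count falls, which is the signal to watch during stabilization.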

Testing is going on during the development cycle and tasks should not be completed until high priority bugs are eliminated. If bugs are found after the task is logged complete, it can be reopened, or the bug can be fixed by the developer owner in the midst of working on other tasks. In any case, this effort is reflected in a rising or falling backlog curve. At the end of every day, each developer reviews tasks and bugs being worked on in the same GNATS user interface, and updates the database in less than 60 seconds.
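A rough sketch of how one point on the backlog curve could be computed from those daily updates follows. The post does not spell out the formula, so the projection used here is an assumption: total effort is extrapolated from time invested and percent complete, and the frozen initial estimate is used only when a task has not been started. The function names are hypothetical.

```python
def remaining_days(initial_estimate, time_invested, percent_complete):
    """Estimate days of work left on one task (assumed formula)."""
    if percent_complete >= 100:
        return 0.0
    if percent_complete <= 0:
        # Nothing reported yet, so fall back on the initial estimate.
        return float(initial_estimate)
    projected_total = time_invested / (percent_complete / 100)
    return projected_total - time_invested


def backlog(items):
    """Sum remaining days over (estimate, invested, percent) tuples."""
    return sum(remaining_days(*item) for item in items)


# Made-up sample data for one day's snapshot.
snapshot = [
    (3.0, 2.0, 50),   # running over: 2 days in and only half done
    (1.0, 0.0, 0),    # not started
    (2.0, 2.5, 100),  # finished
]
print(backlog(snapshot))
```

Under this assumption the first task contributes 2.0 days rather than the 1.0 its initial estimate would suggest, which is how the curve can rise even when no new tasks are added.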

This is the best project management system I have ever seen or experienced. Time required for update and reporting project management statistics has been reduced by at least two orders of magnitude over other approaches I've seen. In addition, the Backlog Curve reflects thousands of individual estimates that are updated daily. This micro-costing of the project effort yields more accurate estimates of project completion than other approaches. Highly recommended!
