Thursday, December 22, 2011

Powerful Strategy for Defect Prevention: Improve the Quality of Your Product



A classic paper from IBM shows how they systematically reduced defects by analyzing root causes. Implementing this practice costs less than fixing the defects you will have without it, so it should always be implemented.

1. First understand your architecture and where the bugs are coming from by type, severity, component, and point of injection during the development life cycle.

2. You will find 80% of the bugs come from 20% of the code. Mapping these defects on a component architecture will show swarms of bugs around specific components.

3. Apply bug spray through a carefully prioritized automated testing strategy. Find the biggest problem that occurs during final regression testing prior to deployment. Using the detailed knowledge you have developed about bug infestation in your product, implement an automated test that makes it impossible for this problem to recur. Write a single test that can prevent 100 common problems. Then go to the next highest priority problem and repeat. Writing a few automated tests a week will eventually make your build bulletproof with remarkably few tests.
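
As a rough illustration of steps 1 and 2, here is a minimal Python sketch that tallies defects by component and picks out the few components responsible for most of them. The defect log, the component names, and the pareto_components helper are all hypothetical and invented for this example; real data would come from your own bug tracker.

```python
from collections import Counter

# Hypothetical defect log of (component, severity) pairs.
# Component names and counts are invented for illustration only.
defects = (
    [("billing", "high")] * 5
    + [("auth", "medium")] * 3
    + [("reports", "low")]
    + [("ui", "low")]
)


def pareto_components(defect_log, threshold=0.8):
    """Return (component, count) pairs, largest first, that together
    account for at least `threshold` of all logged defects."""
    counts = Counter(component for component, _severity in defect_log)
    total = sum(counts.values())
    selected = []
    running = 0
    for component, n in counts.most_common():
        selected.append((component, n))
        running += n
        if running / total >= threshold:
            break
    return selected


if __name__ == "__main__":
    # Print the components where the bugs swarm, with their share of the total.
    total = len(defects)
    for component, n in pareto_components(defects):
        print(f"{component}: {n} defects ({n / total:.0%} of total)")
```

The components this returns are the ones to target first with the prioritized automated tests described in step 3.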

In three months, one of our venture companies cut a 4-6 week deployment cycle to 2 weeks with only 120 tests. It took one person three weeks to write the tests, which eliminated several weeks of work by an entire team. It reduced defects, radically reduced support calls, and the customers liked the new release enough to buy more product, raising revenue.

Everyone should implement this. The return on investment is astronomical. I thought this was basic stuff, but our investors say almost none of their companies had implemented it before we invested in them. The developers are often junior, right out of university, and the managers are domain experts, not engineering experts. We have to teach them the basics.

by R. G. Mays, C. L. Jones, G. J. Holloway, D. P. Studinski
IBM SYSTEMS JOURNAL. VOL 29, NO 1, 1990

Defect Prevention is the process of improving quality and productivity by preventing the injection of defects into a product. It consists of four elements integrated into the development process: (1) causal analysis meetings to identify the root cause of defects and suggest preventive actions; (2) an action team to implement the preventive actions; (3) kickoff meetings to increase awareness of quality issues specific to each development stage; and (4) data collection and tracking of associated data. The Defect Prevention Process has been successfully implemented in a variety of organizations within IBM, some for more than six years. This paper discusses the steps needed to implement this process and the results that may be obtained. Data on quality, process costs, benefits, and practical experiences are also presented. Insights into the nature of programming errors and the application of this process to a variety of working environments are discussed.

3 comments:

Ben Linders said...

Savings with Root Cause Analysis to prevent defects can be big. It's good to see this recognized and amplified, thanks Jeff!

Earlier this year I ran a survey on LinkedIn to find the main business reason to do Root Cause Analysis. Reducing operational/project risks was seen as the biggest benefit, delivering high quality software was second.

Jan van Galen said...

I'm sorry Jeff, but this is basic stuff. Are you serious? What are they teaching these whizz kids these days?

Jeff Sutherland said...

If this was basic stuff then 50% of the Scrum teams out there would not be failing to get done at the end of the sprint. Seriously, at OpenView we have 18 startups, run by many kids out of college or managers who do not understand and have not implemented this basic stuff. You may be out of touch with what is really going on in the IT community out there. It is not pretty. You might start publishing some real metrics on velocity and quality improvement by doing this basic stuff. If all of us did that we would not have to retrain most of the managers that are running IT operations! On another note, poor implementation of TDD in some organizations shows more bugs in the tests than in the code. Boiling the ocean with TDD tests needs to move to ATDD where the tests focus on acceptance tests and not unit tests.