Monday 10 December 2012

Software Professionals do Inspections

Are you a software professional or not?

I'm not talking about having some kind of official certification here.  I'm asking whether creating high quality code on a repeatable basis is your top priority.

Professionals do everything possible to write quality code. Are you and your organization doing everything possible to write quality code?  Of course, whether you are a professional or not can only be answered by your peers.

If you are not doing software inspections then you are not doing everything possible to improve the quality of your code.  Software inspections are not the same as code walk throughs, which are used to inform the rest of the team about what you have written and are used mainly for educational purposes.  Walk throughs will find surface defects, but most walk throughs are not designed to find as many defects as possible.  


How do defects get into the code?  It's not like there are elves and goblins that come out at night and put defects into your code.  If the defects are there it is because the team injected them.  Many defects can be discovered and prevented before they cause problems for development.  Defects are only identified when you go looking for them, and that is typically only in QA.

Benefits of Inspections

Inspections involve several people and  require intense preparation before conducting the review. The purpose of inspections is to find defects and eliminate them as early as possible.  Inspections apply to every artifact of software development:
  • Requirements (use cases, user stories)
  • Design (high level and low level, UML diagrams)
  • Code
  • Test plans and cases
Inspections as a methodology have been around since the 1970s and were certainly well codified by the time M. E. Fagan published his IEEE paper in 1986.  The idea behind inspections is to find defects as early as possible in the software development process and eliminate them.  Without inspections, defects accumulate in the code until testing, when you discover all the defects from every phase of development simultaneously.



This diagram from Radice shows that defects will accumulate until testing begins.  Your quality will be limited by the number of defects that you can find before you ship your software.

With inspections, you begin to inspect your artifacts (use cases, user stories, UML diagrams, code, test plans, etc) as they are produced.  You attempt to eliminate defects before they have a chance to cascade and cause other phases of software development to create defects.  For example, a defect during requirements or in the architecture can cause coding problems that are detected very late (see Inspections are not Optional). With inspections the defect injection and removal curve looks like this:



When effective inspections are mandatory, the quality gap shrinks and the quality of the software produced goes up dramatically.  In The Economics of Software Quality, Capers Jones and Olivier Bonsignour show that defect removal rates rarely top 80% without inspections, but with inspections you can get to 97%.

Why Don't We Do Inspections?

There is a mistaken belief that inspections waste time.  Yet study after study shows that inspections will dramatically reduce the amount of time in quality assurance.  There is no doubt that inspections require an up-front effort, but that up-front effort pays back with dividends. The hidden effect of inspections is as follows:

 

The issue is that people know that they make mistakes but don't want to admit it, i.e. who wants to admit that they put the milk in the cupboard? They certainly don't want their peers to know about it!

Many defects in a software system are caused by ignorance, a lack of due diligence, or simply a lack of concentration.   Most of these defects can be found by inspection, however, people feel embarrassed and exposed in inspections because simple problems become apparent.

For inspections to work, they must be conducted in a non-judgemental environment where the goal is to eliminate defects and improve quality.  When inspections turn into witch hunts and/or the focus is on style rather than on substance then inspections will fail miserably and they will become a waste of time.

Professional software developers are concerned with high quality code.  Finding out as soon as possible how you inject defects into code is the fastest way to learn how to prevent those defects in the future and become a better developer.  Professionals are always asking themselves how they can become better.  Do you?

Conclusion

Code inspections have been done for 40 years, and the results offer conclusive proof that they greatly improve software quality without increasing cost or time for delivery.  If you are not doing inspections then you are not producing the best quality software possible.



Thursday 15 November 2012

Inspections are not Optional

Every developer is aware that code inspections are possible, and some have experienced their usefulness.  The fact is, however, that inspections are not optional.

Without inspections your defect removal rate will stall out at 85% of defects removed; with inspections defect removal rates of 97% have been achieved.
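To make the gap between 85% and 97% concrete, here is a small back-of-the-envelope sketch.  The figure of 1000 injected defects is purely hypothetical; only the two removal rates come from the statistics quoted above.

```python
def escaped_defects(injected: int, removal_efficiency: float) -> int:
    """Defects still in the product at release for a given removal efficiency (0..1)."""
    return round(injected * (1.0 - removal_efficiency))


INJECTED = 1000  # hypothetical number of defects injected during development

without_inspections = escaped_defects(INJECTED, 0.85)
with_inspections = escaped_defects(INJECTED, 0.97)

print("escaped without inspections:", without_inspections)
print("escaped with inspections:", with_inspections)
print("ratio:", without_inspections / with_inspections)
```

Under these assumptions the team without inspections ships five times as many defects to its customers.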

Code inspections are only the most talked about type of inspection; the reality is that all artifacts from all phases of development should be inspected prior to being used.  Inspections are necessary because software is intangible, and you do not want to wait until everything is coded to notice problems.

In the physical world it is easier to spot problems because they can be tangible.  For example, if you have specified marble tiles for your bathroom and you see the contractor bring in a pile of ceramic tiles then you know something is wrong.  You don't need the contractor to install the ceramic tiles to realize that there is a problem.

In software, we tend to code up an entire set of functionality, demonstrate it, and then find out that we have built the wrong thing!  If you are working in a domain with many requirements then this is inevitable; however, many times we can find problems through inspection before we create the wrong solutions.  Let's look at some physical examples and then discuss their software equivalents.

Requirement Defects

The requirements in software design are equivalent to the blueprints that are given to a contractor.  The requirements specify the system to be built, however, if those requirements are not inspected then you can end up with the following:

[Images: Balcony no Door, Missing Landing, Chimney Covering Window, Stairs Displaced, Stairs to Ceiling, Door no Balcony]

All of the above pictures represent physical engineering failures.  Every one of these disasters could have been identified in the blueprints if a simple inspection had been done.  It must have become clear to the developers that the building features specified by the requirements were incompatible, but they completed the solution anyway.

Balcony no Door

This design flaw can be caused by changing requirements; here there is a balcony feature that has no access to it.  In Balcony no Door it is possible that someone noticed that there was sufficient room for a balcony on the lower floor and put it into one set of plans.  The problem was that the developers who install the sliding doors did not have their plans updated.

Here the changed requirement did not lead to an inspection to see if there was an inconsistency introduced by the change.

Door no Balcony

In Door no Balcony something similar probably happened; however, notice that the two issues are not symmetric.  Balcony no Door represents a feature that is inaccessible because no access was created, i.e. the missing sliding door.

In Door no Balcony we have a feature that is accessible but dangerous if used. In this case a requirements inspection should have turned up the half implemented feature and either: 1) the door should have been removed from the requirements, or 2) a balcony should have been added.

Missing Landing


The Missing Landing occurs because the requirements show a need for stairs, but it does not occur to the architect that a landing is missing.  Looking at a set of blueprints gives you a two dimensional view of the plan and clearly they drew in the stairs.  To make the stairs usable requires a landing so that changing direction is simple.  This represents a missing requirement that causes another feature to be only partially usable.

This problem should have been caught by the architect when the blueprint was drawn up. However, barring the architect locating the issue a simple checklist and inspection of the plans would have turned up the problem.

Stairs to Ceiling

The staircase goes up to a ceiling and therefore is a useless feature.  Not only is the feature incomplete because it does not give access to the next level, but the developers also wasted time and effort in building the staircase.

If this problem had been caught in the requirements stage as an inconsistency then either the staircase could have been removed or an access created to the next floor.  The net effect is that construction starts and the developers find the inconsistency when it is too late.

At a minimum the developers should have noticed that the stairway did not serve any purpose and not built the staircase, which was a waste of time and materials.

Stairs Displaced

Here we have a clear case of changed requirements.  The stairs were supposed to be centered under the door; in all likelihood the plans changed the location of the door and did not move the dependent feature.

When the blueprint was updated to move the door the designer should have looked to see if there was any other dependent feature that would be impacted by the change.

Architectural Defects

Architectural defects come from not understanding the requirements or the environment that you are working with.  In software, you need to understand the non-functional requirements behind the functional requirements -- the ilities of the project (availability, scalability, reliability, customizability, stability, etc). Architectural features are structural and connective.  In a building the internal structure must be strong enough to support the building, i.e. foundation and load bearing walls.

[Images: Insufficient Foundation, Insufficient Structure, Connectivity Problem]

Insufficient Foundation

Here the building was built correctly, however, the architect did not check the environment to see if the foundation would be sufficient for the building. The building is identical to the building behind it, so odds are they just duplicated the plan without checking the ground structure.

The equivalent in software is to design for an environment that can not support the functionality that you have designed.  Even if that functionality is perfect, if the environment doesn't support it you will not succeed.


Insufficient Structure

Here the environment is sufficient to hold up the building, however, the architect did not design enough structural strength in the building. The equivalent in software design is to choose architectural components that will not handle the load demanded by the system.

For example, distributed object technologies such as CORBA, EJB, and DCOM provided a way to make objects available remotely, however, the resulting architectures did not scale well under load.

Connectivity Problem

Here a calculation error was made when the two sides of the bridge were started.  When development got to the center they discovered that one side was off and you have an ugly problem joining the two sides.


The equivalent for this problem is when technologies don't quite line up and require awkward and ugly techniques to join different philosophical approaches.  In software, a classic problem is mapping object-oriented structures into relational databases.  The philosophical mismatch accounts for large amounts of code to translate from one scheme into the other.


Coding Defects

Coding defects are better understood (or at least yelled about :-), so I won't spend too much time on them.  Classic coding defects include:
  • Uninitialized data
  • Uncaught exceptions
  • Memory leaks
  • Buffer overruns
  • Not freeing up resources
  • Concurrency violations
  • Insufficient pathways, i.e. 5 conditions but only 4 coded pathways
Many of these problems can be caught with code inspections and static code inspection tools.
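As an illustration of the last item in the list, here is a minimal sketch of how an inspection checklist item (or a simple automated check) can expose a missing pathway.  The OrderState domain and the HANDLERS table are hypothetical, invented only for this example.

```python
from enum import Enum


class OrderState(Enum):  # hypothetical domain with five conditions
    NEW = "new"
    PAID = "paid"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


# Only four pathways were coded: CANCELLED was forgotten.
HANDLERS = {
    OrderState.NEW: "await payment",
    OrderState.PAID: "schedule shipment",
    OrderState.SHIPPED: "track parcel",
    OrderState.DELIVERED: "close order",
}


def missing_pathways(states, handlers):
    """Return every condition that has no coded pathway."""
    return [state for state in states if state not in handlers]


print(missing_pathways(OrderState, HANDLERS))
```

The check reports OrderState.CANCELLED, which is exactly the kind of gap a reviewer looks for when comparing the list of conditions against the code.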

Testing Defects

Testing defects occur when the test plan flags a defect that is a phantom problem or a false positive. This often occurs when requirements are poorly documented and/or poorly understood and QA perceives a defect when there is none.

False positives slow down development.

The reverse also happens where requirements are not documented and QA does not perceive a defect, i.e. false negative.

False negatives can slip through to your customers.

Both false positives and negatives can be caught by inspecting the requirements and comparing them with the test cases.
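One way to picture this comparison is as a traceability check between requirements and test cases.  The sketch below is only an illustration; the requirement IDs, test case IDs, and the inspect_traceability function are all hypothetical.

```python
# Hypothetical requirement IDs and a map of test case -> requirement it claims to verify.
requirements = {"REQ-1", "REQ-2", "REQ-3"}
test_cases = {
    "TC-1": "REQ-1",
    "TC-2": "REQ-2",
    "TC-3": "REQ-9",  # refers to a requirement that does not exist
}


def inspect_traceability(requirements, test_cases):
    """Compare requirements with test cases and report both kinds of mismatch."""
    covered = set(test_cases.values())
    # Requirements with no test case: a defect here would go unnoticed (false negative).
    untested = sorted(requirements - covered)
    # Test cases with no matching requirement: likely source of phantom defects (false positive).
    orphaned = sorted(tc for tc, req in test_cases.items() if req not in requirements)
    return untested, orphaned


untested, orphaned = inspect_traceability(requirements, test_cases)
print("requirements without tests:", untested)
print("tests without requirements:", orphaned)
```

In this made-up data set REQ-3 has no test and TC-3 tests something that was never required, so both would be raised during the inspection rather than during a release.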

Root Cause of Firefighting

When inspections are not done in all phases of software development there will be fire-fighting in the project in the zone of chaos.

Most software organizations only record and test for defects during the Testing phase.

Unfortunately, at this point you will detect the defects from all previous phases at once.


QA has a tendency to assume that all defects are coding defects -- however, the analysis of 18,000+ projects does not confirm this.  In The Economics of Software Quality, Capers Jones and Olivier Bonsignour show that defects fall into different categories. Below we give the category, the frequency of the defect, and the business role that will address the defect.

Defect Role Category | Frequency | Role
Requirements defect | 9.58% | BA/Product Management
Architecture or design defect * | 14.58% | Architect
Code defect * | 16.67% | Developer
Testing defect | 15.42% | Quality Assurance
Documentation defect | 6.25% | Technical Writer
Database defect * | 22.92% | DBA
Website defect | 14.58% | Operations/Webmaster

Note, only the rows marked with * above are assigned to developers.

Notice that about 24% of the defects (requirements 9.58% plus architecture 14.58%) occur before coding even starts.  These defects are just like the physical defects shown above and only manifest themselves once the code needs to be written.

It is much less expensive to fix requirements and architecture problems before coding.

Also, only about 54% of defects are actually resolvable by developers, so by assigning all defects to the developers you will waste time 46% of the time when you discover that the developer can not resolve the issue.
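The percentages quoted above can be re-derived from the table.  In the sketch below the frequencies and roles are copied from the table; treating Architect, Developer, and DBA as the development group is my assumption, chosen because it reproduces the 54% figure.

```python
# Frequencies (percent) and roles copied from the table above.
DEFECTS = {
    "Requirements": (9.58, "BA/Product Management"),
    "Architecture or design": (14.58, "Architect"),
    "Code": (16.67, "Developer"),
    "Testing": (15.42, "Quality Assurance"),
    "Documentation": (6.25, "Technical Writer"),
    "Database": (22.92, "DBA"),
    "Website": (14.58, "Operations/Webmaster"),
}

# Assumption: these three roles together form the development group.
DEVELOPMENT_ROLES = {"Architect", "Developer", "DBA"}


def share(categories):
    """Total frequency, in percent, of the given defect categories."""
    return round(sum(DEFECTS[c][0] for c in categories), 2)


before_coding = share(["Requirements", "Architecture or design"])
by_developers = share(c for c, (_, role) in DEFECTS.items() if role in DEVELOPMENT_ROLES)
elsewhere = round(100 - by_developers, 2)

print("injected before coding:", before_coding)
print("resolvable by developers:", by_developers)
print("resolved by other roles:", elsewhere)
```

The three results round to roughly a quarter, 54%, and 46%, which are the figures used in the text.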

Fire-fighting is basically when there need to be dozens of meetings that pull together large numbers of people on the team to sort out inconsistencies. These inconsistencies will lie dormant because there are no inspections.

Of course, when all the issues come out, there are so many issues from all the phases of development that it is difficult to sort out the problem!

Learn how to augment your Bug Tracker to help you to understand where your defects are coming from in Bug Tracker Hell and How to Get Out!

Solutions

There are two basic solutions to reducing defects:
  1. Inspect all artifacts
  2. Shorten your development cycle
The second solution is the one adopted by organizations that are pursuing Agile software development.  However, shorter development cycles will reduce the amount of fire-fighting but they will only improve code quality to a point.

In The Economics of Software Quality the statistics show that defect removal is not effective in most organizations.  In fact, on large projects the test coverage will drop below 80% and the defect removal efficiency is rarely above 85%.  So even if you are using Agile development you will still not be achieving a high level of defect removal and will be limited in the software quality that you can deliver.

Agile development can reduce fire-fighting but does not address defect removal

Inspect All Artifacts

Organizations that have formal inspections of all artifacts have achieved defect removal efficiencies of 97%! If you are intent on delivering high quality software then inspections are not optional. Of course, inspections are only possible for phases in which you have actual artifacts. Here are the artifacts that may be associated with each phase of development:

Phase | Artifact
Requirements | Use cases, user stories, UML diagrams (Activity, Use Case)
Architecture or design | UML diagrams (Class, Interaction, Deployment)
Coding | UML diagrams (Class, Interaction, State), source code
Testing | Test plans and cases
Documentation | Documentation
Database | Entity-Relationship diagrams, stored procedures

Effective Inspections

Inspections are only effective when the review process involves people who know what they are looking for and are accountable for the result of the inspection.  People must be trained to understand what they are looking for and effective check lists need to be developed for each artifact type that you review, e.g. use case inspections will be different than source code reviews.

Inspections must have teeth otherwise they are a waste of time.  For example, one way to put accountability into the process is to have someone other than the author be accountable for any problems found.  There are numerous resources available if you decide that you wish to do inspections.

Conclusion

The statistics overwhelmingly suggest that inspections will not only remove defects from a software system but also prevent defects from getting in.  With inspections software defect removal rates can achieve 97% and without inspections you are lucky to get to 85%.

Interestingly, ever since IBM investigated the issue in 1973, the finding has been that teams trained in performing inspections eventually learn how to prevent injecting defects into the software system.  Reduced injection of defects into a system reduces the amount of time spent in fire-fighting and in QA.

You can only inspect artifacts that you take the time to create.  Many smaller organizations don't have any artifacts other than their requirements documents and source code. Discover which artifacts need to be created by augmenting your Bug Tracker (see Bug Tracker Hell and How To Get Out!).   Any phase of development where significant defects are coming from should be documented with an appropriate artifact and be subject to inspections.


Good books on how to perform inspections:

All statistics quoted from The Economics of Software Quality by Capers Jones and Olivier Bonsignour:
Capers Jones can be reached by sending me an email: Dalip Mahal


Other Engineering Disasters
SkyLab (1979) NASA's planners failed to account for sunspot activity causing drag on the satellite during a solar warming cycle, causing the satellite to crash. Cost: $10 billion in 2010 inflation adjusted dollars.
Tacoma Narrows Bridge (1940) Engineers failed to account for wind shear and the Tacoma Narrows Bridge failed spectacularly when high winds caused the bridge to resonate and oscillate in perfect sine waves before breaking and falling apart. Cost: $100 million in 2011 inflation adjusted dollars.
The Vasa (1628) The Vasa was built top-heavy with insufficient ballast and foundered and sank in 32 meters of water just 120 meters from shore as soon as she encountered a wind stronger than a breeze, just a few minutes after first setting sail on her maiden voyage on August 10th, 1628. Despite clearly lacking stability even in port, she was allowed to set sail.

Thursday 27 September 2012

Does Agile hide development sins?

There are really only two key principles in software development that every project tries to adhere to:
  1. To be highly focused on building the right thing (effectiveness principle).
  2. Deliver code with few defects by preventing them from getting into the code or removing them as quickly as possible.
To be succinct it means build to the correct requirements and minimize defects.



At project completion your code falls into one of the above four buckets.  The amount of code in the correct requirements section with no defects is what determines if the project is successful. If you adhere to principle 1 then there should be little or no code in the red section.  If you adhere to principle 2 then the yellow section will be minimized.  The application of these two principles is what leads to a successful project.

Software projects are still failing at an alarming rate (see Understanding your chances) because people are still not understanding these two principles.  Initially principle 1 is much more important than principle 2; after all, who cares if you build a defect free system if it doesn't do what is needed?

For better or worse, so-called Agile development has become synonymous with virtually any iterative and incremental software process and is generally assumed by many to be Scrum or XP.

The Good News

The Agile manifesto supports principle 1 with the following statements:
  •  Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
  • Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
  • Working software is the primary measure of progress.
This leads us to focus on an incremental approach to building software where your development cycle is shorter, from 2 to 6 weeks.  Each development cycle has a requirements phase where you can adjust your aim and make sure that you stay on track.

The Bad News

Shorter development cycles where you adjust your aim are critical to building the right software system, especially if you are faced with technical or requirements uncertainty (see Uncertainty trumps Risk). The problem is that iterative and incremental development will do very little to prevent defects from getting into your code or to get them out quickly once they are created.

Organizations adopting the so-called Agile methodologies (Scrum, XP, etc.) are building better software by getting a handle on principle 1; the problem is that they are doing very little to address principle 2, i.e. minimizing defects.

For example, in Scrum all the defects get put into the backlog to be addressed at a future time.  How often do people consider the source of the defect or whether it could have been avoided in the first place?

Defects come from multiple sources and need to be addressed by different people. The table below is from Bug Tracker Hell and How to Get Out! where we detail the different kinds of defects, how often they occur, and who has to fix them.

Defect Role Category | Frequency | Role
Requirements defect | 9.58% | BA/Product Management
Architecture or design defect | 14.58% | Architect
Code defect | 16.67% | Developer
Testing defect | 15.42% | Quality Assurance
Documentation defect | 6.25% | Technical Writer
Database defect | 22.92% | DBA
Website defect | 14.58% | Operations/Webmaster

Agile won't help you eliminate requirements defects!

With an incremental build cycle that focuses on requirements each cycle you are virtually guaranteed to make some kind of forward progress on your project. But, if you have a poor requirements process and/or poor business analysts (product managers) then you may take 2-3 cycles to solve a problem that could have been solved in one cycle.  The mechanics of a short development cycle will bury defects in your requirements.

Agile won't help you eliminate design defects!

If your development team has built a suboptimal architecture then defects in that architecture will be difficult to detect.  You know that you have an architectural defect when what seems to be a simple change has a huge time estimate on it, something that ends up touching many of the source files. Agile will cover up these problems as people incrementally build out workaround solutions instead of fixing the architecture.  In reality, your architecture will get worse and worse over time and this will manifest in slower and slower development.

Agile won't help you eliminate testing defects!

Defects in test plans and test cases occur 15.4% of the time.  These defects manifest in your bug tracker as Functions as Designed defects.  Each of these defects masks the fact that QA either:
  • did not get a requirement
  • got a bad requirement for the defect
  • did not understand the requirement correctly
Since QA is a continuous process with Agile, QA will fix the test case during the cycle in which they discover it.  This does not leave you with any information on how to prevent these kinds of defects from happening again. Just these 3 categories of defects account for 40% of defects in your development process, and the way that most organizations will implement "Agile" development will bury these problems.  It may surprise you to learn that...

True Agile development does not have these problems.

So just because you are doing Scrum, XP, Crystal, DSDM, or Lean and Kanban software development does not mean that you are Agile.

What it Means to Be Agile

To really be Agile you must support all the principles behind the Agile manifesto. The Agile manifesto supports principle 2 with the following statements:
  • Simplicity--the art of maximizing the amount  of work not done--is essential.
  • Continuous attention to technical excellence and good design enhances agility.
  • At regular intervals, the team reflects on how  to become more effective, then tunes and adjusts its behavior accordingly.
One of the consequences of implementing the simplicity principle above is to prevent defects from getting into the system and causing work for someone else.  This means categorizing your defects and understanding when any of the extended team members (BAs, architects, development, and QA) is creating defects that sandbag your overall development. It means conducting inspections at the requirements, design, and code level to remove defects before they get into the system.  Organizations that do inspections at all these levels achieve an overall defect removal of 97%.  Organizations that do not do inspections will only be able to remove about 80-85% of their defects  (see Capers Jones and Olivier Bonsignour, "The Economics of Software Quality").

To really be Agile you must support all the principles behind the Agile manifesto.

Friday 14 September 2012

Bug Tracker Hell and How To Get Out!

Whether you call it a defect or bug or change request or issue or enhancement you need an application to record and track the life-cycle of these problems.  For brevity, let's call it the Bug Tracker.

Bug trackers are like a roach motel, once defects get in they don't check out!  Because they are append only, shouldn't we be careful and disciplined when we add "tickets" to the bug tracker?  We should, but in the chaos of a release (especially start-ups :-)) the bug tracker goes to hell.

Bug Tracker Hell happens when inconsistent usage of the tool leads to various problems such as duplicate bugs, inconsistent priorities and severities.  While 80% of defects  are straight forward to add to the Bug Tracker, it is the other 20% of the defects that cause real problems.

The most important attribute of a defect is its DefectLifecycleStatus; not surprisingly every Bug Tracker makes this the primary field for sorting.   This primary field is used to generate reports and to manage the defect removal process.  If we manage this field carefully we can generate reports that not only help the current version but also provide key feedback for post-mortem analysis.

Every Bug Tracker has at least the states Open, Fixed, and Closed; however, due to special cases we are tempted to create new statuses for problems that have nothing to do with the life cycle.  The creation of life cycle statuses that are not life cycle states is what causes inconsistent usage of the tool, because it then becomes unclear how to enter a defect.

It is much easier to have consistent life cycle states than to have a 10 page manual on how to enter a defect.



What Life Cycle States Do We Need?

Clearly we want to know how many Open defects need to be fixed for the  current release; after all, management is often breathing down our neck to get this information.

Ideally we would get the defects outstanding report by finding out how many defects are Open. Unfortunately, there are numerous open defects that will not be fixed in the current release (or ever :-( ) and so we seek ways to remove those defects from the defects outstanding.

Why complicate life?
In particular we are tempted to create states like Deferred,  WontFix, and FunctionsAsDesigned, to remove defects from the  defects outstanding. These states have the apparent effect of simplifying the defects outstanding report but will end up complicating other matters.

For example, Deferred is simply an open defect that is not getting fixed in the current release; WontFix is an open defect that the business has decided not to fix; and FunctionsAsDesigned indicates that either the requirements were faulty or QA saw a phantom problem, but once this defect gets into the Bug Tracker you can't get it out. All three states above are variants of the Open life cycle state, and creating these life cycle states will create more problems than they solve. The focus of this article is on how to fix the defect life cycle; however, other common issues are addressed.

Life cycle states for Deferred, WontFix, or FunctionsAsDesigned are like a "Go directly to Bug Tracker Hell" card!


Each Defect Must Be Unambiguous

The ideal state of a Bug Tracker is to be able to look at any defect in the system and have a clear answer to each of the following questions.
  • Where is the defect in the life-cycle?
  • Has the problem been verified?
  • How consistently can the problem be reproduced or is it intermittent?
  • Which team role will resolve the issue? (team role, not person)
The initial way to get out of hell is to be consistent with the life cycle state.

Defect Life Cycle

All defects go through the following life cycle (DefectLifecycleStatus) regardless of whether we track all of these states or not:
  • New
  • Verified
  • Open
  • Work in Process
  • Work complete
  • Fixed
  • Closed


Anyone should be able to enter a New defect, but just because someone thinks "I tawt I taw a defect!" in the system doesn't mean that the defect is real.  In poorly specified software systems QA will often perceive a defect where there is none, the famous functions as designed (FAD) issue.

FAD = Requirements or test defect; if someone thought there was a defect then requirements are insufficient or the test plan is incorrect or someone did not follow a test plan.  You have a defect, it is just not in engineering.

Since there are duplicate and phantom issues that are entered into the Bug Tracker, we need to kick the tires on all New defects before assigning them to someone.  It is much faster and cheaper to verify defects than to simply throw them at the development team and assume that they can fix them.

Trust But Verify

New defects not entered by QA should be assigned to the QA role.  These defects should be verified by QA before the life cycle status is updated to Verified.  QA should also make sure that the steps to reproduce the defect are complete and accurate before moving the defect to the Verified life cycle status.  Ideally even defects entered by QA should be verified by someone else in QA to make sure that the defect is entered correctly.

By introducing a Verified state you separate out potential work from actual work. If a bug is a phantom then QA can mark it as Closed before we assign it to someone and waste their time.  If a bug is a duplicate then it can be marked as such, linked to the other defect, and Closed.

The advantage of the Verified status is that the intermittent bugs get more attention to figure out how to reproduce them.  If QA discovers that a defect is intermittent then a separate field in the Bug Tracker, Reproducibility, should be populated with one of the following values:
  • Always (default)
  • Sometimes
  • Rare
  • Can't reproduce
Defects that are hard to reproduce stay in the New state until you can reproduce them.  If you can't reproduce them then you can mark the issue as Closed without impacting the development team.
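
The verification step above can be sketched as a small decision function.  This is only an illustration, not a feature of any particular Bug Tracker: the NewDefect record, its is_phantom and duplicate_of fields, and the verify function are hypothetical names.  It also assumes that a defect QA cannot reproduce simply stays in New; closing it is left as a separate, manual decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Reproducibility(Enum):
    ALWAYS = "Always"  # default value in the Bug Tracker
    SOMETIMES = "Sometimes"
    RARE = "Rare"
    CANT_REPRODUCE = "Can't reproduce"


@dataclass
class NewDefect:
    """Hypothetical record for a defect that is still in the New state."""
    defect_id: int
    reproducibility: Reproducibility = Reproducibility.ALWAYS
    is_phantom: bool = False            # QA found no real defect
    duplicate_of: Optional[int] = None  # id of the defect this one duplicates


def verify(defect: NewDefect) -> str:
    """Return the life cycle status after QA has looked at a New defect."""
    # Phantoms and duplicates are closed before anyone is assigned to them.
    if defect.is_phantom or defect.duplicate_of is not None:
        return "Closed"
    # Assumption: a defect that cannot be reproduced yet stays in New.
    if defect.reproducibility is Reproducibility.CANT_REPRODUCE:
        return "New"
    return "Verified"
```

For example, verify(NewDefect(7, duplicate_of=3)) returns "Closed", so the duplicate never reaches the development team.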


Assign the Defect to a Role

QA has a tendency to assume that all defects are coding defects -- however, the analysis of 18,000+ projects does not confirm this.  In The Economics of Software Quality, Capers Jones and Olivier Bonsignour show that defects fall into different categories. Below we give the category, the frequency of the defect, and the business role that will address the defect.

Note, only the bolded rows below are assigned to developers.


Defect Role Category             Frequency   Role
Requirements defect              9.58%       BA/Product Management
Architecture or design defect    14.58%      Architect
Code defect                      16.67%      Developer
Testing defect                   15.42%      Quality Assurance
Documentation defect             6.25%       Technical Writer
Database defect                  22.92%      DBA
Website defect                   14.58%      Operations/Webmaster

Defect Role Categories are important to accelerating your overall development speed!

Even if all architecture, design, coding, and database defects are handled by the development group this only represents 54% of all defects.  So assigning any New defect to the development group without verification is likely to cause problems inside the team.

Note, 25% of all defects are caused by poor requirements and bad test cases, not bad code.  This means that the business analysts and QA folks are responsible for fixing them.

Given that 46% of all defects are not resolved by the development team there needs to be a triage before a bug is assigned to a role.  Lack of bug triages is the Root cause of 'Fire-Fighting' in Software Projects.

The Bug Tracker should be extended to record the DefectRole in addition to the assigned attribute.  Just this attribute will help to straighten out the Bug Tracker!
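
The percentages quoted above follow directly from the table.  Here is a minimal sketch of the arithmetic; the frequencies are copied from the table, while the variable names and the choice of which categories count as "development" (architecture/design, code, and database, as in the 54% figure) are assumptions made for illustration.

```python
# (category, frequency in %, role) copied from the table above
DEFECT_ROLE_CATEGORIES = [
    ("Requirements defect", 9.58, "BA/Product Management"),
    ("Architecture or design defect", 14.58, "Architect"),
    ("Code defect", 16.67, "Developer"),
    ("Testing defect", 15.42, "Quality Assurance"),
    ("Documentation defect", 6.25, "Technical Writer"),
    ("Database defect", 22.92, "DBA"),
    ("Website defect", 14.58, "Operations/Webmaster"),
]

# Assumption: these are the categories a development group could absorb.
DEVELOPMENT_CATEGORIES = {
    "Architecture or design defect",
    "Code defect",
    "Database defect",
}


def share(categories):
    """Sum the frequencies (in %) of the named categories."""
    total = sum(freq for name, freq, _ in DEFECT_ROLE_CATEGORIES if name in categories)
    return round(total, 2)


development_share = share(DEVELOPMENT_CATEGORIES)                           # 54.17
requirements_and_testing = share({"Requirements defect", "Testing defect"})  # 25.0
non_development_share = round(100 - development_share, 2)                    # 45.83
```

Rounded to whole numbers these are the 54%, 25%, and 46% used in the text.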

Non-development Defects

Most Bug Tracking systems have a category called enhancement.  Enhancements are simply defects in the requirements and should be recorded but not specified in the Bug Tracker; the defect should be Open with a DefectRole of ProductManagement.

Enhancements need to be assigned to product managers/BAs who should document and include a reference to that documentation in the defect.  The description for the defect is not the proper place to keep requirements documentation.  The life cycle of a product requirement is generally very different from a code defect because the requirement is likely to be deferred to a later release if you are late in your product cycle.
Business requirements may have to be confirmed with the end users and/or approved by the business.  As such, they generally take longer to become work items than code defects.

QA should not send enhancements to development without involvement of product management.

Note that 15.42% of the defects are a QA problem and are fixed in the test plans and test cases.

Bug Triage

The only way to correctly assign resources to fix a defect is to have a triage team meet regularly that can identify what the problem is.  A defect triage team needs to include a product manager, QA person, and developer.   The defect triage team should meet at least once a week during development and at least once a day during releases.  Defect triages save you time because only 54% of the defects can be fixed by the developers; correctly assigning defects avoids miscommunication.

Bug triage meetings are efficient when their only purpose is to correctly assign defects.  Be aggressive and keep design discussions out of triages.

Defects should be assigned to a role and not a specific person to allow maximum flexibility in getting the work done; they should only be assigned to a specific person when there is only one person who can resolve an issue.

Assigning unverified and intermittent defects to the wrong person will start your team playing the blame game.

As the defects are triaged, product management (not QA) should set the priority and severity as they represent the business.  With a multi-functional team these two values will be set consistently.  In addition the triage team should set the version in which the defect is expected to be fixed (i.e. ExpectedFixVersion).  Some teams like to put the actual version number in this field; I prefer to use the following values:
  • Next bug fix
  • Next minor release
  • Next major  release
  • Won't fix
I like ExpectedFixVersion because it is conditional; it represents a desire.  Like it or not, it is very hard to guess when every defect will be fixed.  The reality is that if the release date gets pulled in or the work turns out to be more involved than expected the fix version could be deferred (possibly indefinitely).  If you put actual version numbers in this field and guess wrong then you will spend a considerable amount of time changing it.

Getting the Defect Resolved

Once the defects are in the system each functional role can assign the work to its resources.  At that point the defect life cycle state is Work in Process.

All Work complete means is that the individual working on the defect believes that it is resolved.  When the work is resolved the FixVersion should be set as the next version that will be released.  Note, if you use release numbers in the ExpectedFixVersion field then you should update that field if it is wrong :-)
Of course the defect may or may not be resolved; however, the status of Work complete acts as a signal that someone else has work to do.

If a requirements defect is fixed then the issue should be moved to Fixed and assigned to the development manager, who will give the work to the team.  Once the team has verified their understanding of the requirement the defect can move from Fixed to Closed.

Work complete means that the fixer believes that the problem is resolved; Fixed means that the team has acknowledged the fix!

For code defects the Work complete status is a signal to QA to retest the issue.  If QA establishes that the defect is fixed they should move the issue to Fixed.  If the issue is not fixed at all then the defect should move back to Open; if the defect is partially fixed then the defect should move to Verified so that it goes back through the bug triage process (i.e. severity and priority may have changed).

Once a release is complete, all Fixed items can be moved to Closed.
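
The life cycle described so far can be sketched as a small state machine.  This is a sketch under stated assumptions rather than a definitive workflow: the ALLOWED_TRANSITIONS table and the move function are hypothetical names, and the Verified to Open step (taken when the triage assigns the defect to a role) is my reading of the process rather than something spelled out above.

```python
from enum import Enum


class DefectLifecycleStatus(Enum):
    NEW = "New"
    VERIFIED = "Verified"
    OPEN = "Open"
    WORK_IN_PROCESS = "Work in Process"
    WORK_COMPLETE = "Work complete"
    FIXED = "Fixed"
    CLOSED = "Closed"


S = DefectLifecycleStatus

# Which states a defect may move to from each state.
ALLOWED_TRANSITIONS = {
    S.NEW: {S.VERIFIED, S.CLOSED},        # verified by QA, or phantom/duplicate
    S.VERIFIED: {S.OPEN},                 # assumption: triage assigns it to a role
    S.OPEN: {S.WORK_IN_PROCESS},          # the role hands the work to someone
    S.WORK_IN_PROCESS: {S.WORK_COMPLETE},
    # Retest result: confirmed, not fixed at all, or only partially fixed.
    S.WORK_COMPLETE: {S.FIXED, S.OPEN, S.VERIFIED},
    S.FIXED: {S.CLOSED},                  # once the release is complete
    S.CLOSED: set(),
}


def move(current, target):
    """Return the target status, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move a defect from {current.value} to {target.value}")
    return target
```

With this table a defect cannot jump from New straight to Fixed; move(S.NEW, S.FIXED) raises a ValueError, which is the kind of consistency the life cycle is meant to enforce.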

Tracking Defects Caused by Fixing Defects

Virtually all Bug Trackers allow you to link one or more issues together.  However, it is extremely important to know why bugs are linked; in most cases you link bugs because they are duplicates.
Bugs can also be linked together because fixing one defect may cause another.  On average this happens once for every 14 defects fixed, but in the worst organizations it can happen once for every 4 defects fixed.  Keeping a field called ResultedFromDefect where you record the number of the other defect allows you to determine how many new defects are the result of fixing other defects.

Knowing how many defects are created while fixing others is important to improving your development process.
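
To put the numbers above in perspective, here is a minimal sketch of the arithmetic.  The function names and the dictionary layout of the sample records are hypothetical; only the ResultedFromDefect idea and the 1-in-14 and 1-in-4 figures come from the text.

```python
def count_injected(defects):
    """Count the defects whose ResultedFromDefect field is populated."""
    return sum(1 for d in defects if d["resulted_from_defect"] is not None)


def injected_per_fix_percentage(fixed_count, injected_count):
    """New defects caused by fixes, as a percentage of the defects fixed."""
    if fixed_count == 0:
        return 0.0
    return 100.0 * injected_count / fixed_count


average_rate = injected_per_fix_percentage(14, 1)  # roughly 7.1%
worst_rate = injected_per_fix_percentage(4, 1)     # 25.0%

# Hypothetical tracker export: defect 103 was caused by the fix for defect 101.
sample = [
    {"id": 101, "resulted_from_defect": None},
    {"id": 102, "resulted_from_defect": None},
    {"id": 103, "resulted_from_defect": 101},
]
injected = count_injected(sample)  # 1
```

In other words, an average team creates a new defect with about 7% of its fixes, and the worst teams with 25% of them.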


Summary

Let's recap how the above mechanisms will help you get out of hell.
  1. By introducing the Verified step you make sure that bugs are vetted before anyone gets pulled into a wild goose chase.
    1. This also will catch intermittent defects and give them a home while you figure out how often they are occurring and work out if there is a reliable way to produce them.
    2.  If you can't reproduce a defect then at least you can annotate it as Can't Reproduce, i.e. status stays as New and it doesn't clog the system
  2. By conducting triage meetings with product management, QA, and development you will end up with very consistent uses of priority and severity
  3. Bug triages will end up categorizing defects according to the role that will fix them which will reduce or eliminate:
    1. The blame game
    2. Defects being assigned to the wrong people
  4. By having the ExpectedFixVersion be conditional you won't have to run around fixing version numbers for defects that did not get fixed in a particular release.  It also gives you a convenient way to tag a defect as Won't Fix; in that case the status should go back to Verified.
  5. By having the person who fixes a defect set the FixVersion you will have an accurate picture of when defects are fixed
  6. When partially fixed defects go back to Verified the priority and severity can be updated properly during the release.

Benefits of the Process

By implementing the defect life cycle process above you will get the following benefits:
  • Phantom bugs and duplicates won't sandbag the team
  • Intermittent bugs will receive more attention to determine their reproducibility
    • Reproducible bugs are much easier to fix
  • Proper triages will direct defects to the appropriate role
  • You will discover how many defects you create by fixing other defects
By having an extended set of life cycle states you will be able to start reporting on the following:
  • % of defects introduced while fixing defects (value in ResultedFromDefect)
  • % of New bugs that are phantoms or duplicates, relates to QA efficiency
  • % of defects that are NOT development problems, relates to extended team efficiency (i.e. DefectRole <> Development)
  • % of requirements defects which relates to the efficiency of your product management (i.e. DefectRole = ProductManagement)
  • % of defects addressed but not confirmed (Work complete)
  • % of defects fixed and confirmed (Fixed)
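
The reports above are simple percentages over the defects in the Bug Tracker.  Here is a rough sketch of how they could be computed, assuming you can export each defect with its status, DefectRole, and ResultedFromDefect values.  The Defect record, the closed_as field (used here to mark phantoms and duplicates), and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defect:
    status: str                                 # life cycle status
    defect_role: str                            # e.g. "Development", "ProductManagement"
    resulted_from_defect: Optional[int] = None  # id of the fix that caused it
    closed_as: Optional[str] = None             # hypothetical: "Phantom" or "Duplicate"


def percent(defects, predicate):
    """Percentage of defects for which predicate(defect) is true."""
    if not defects:
        return 0.0
    return 100.0 * sum(1 for d in defects if predicate(d)) / len(defects)


def build_report(defects):
    return {
        "introduced_by_fixes": percent(defects, lambda d: d.resulted_from_defect is not None),
        "phantom_or_duplicate": percent(defects, lambda d: d.closed_as in ("Phantom", "Duplicate")),
        "not_development": percent(defects, lambda d: d.defect_role != "Development"),
        "requirements": percent(defects, lambda d: d.defect_role == "ProductManagement"),
        "work_complete": percent(defects, lambda d: d.status == "Work complete"),
        "fixed": percent(defects, lambda d: d.status == "Fixed"),
    }


sample = [
    Defect("Fixed", "Development"),
    Defect("Work complete", "Development", resulted_from_defect=1),
    Defect("Closed", "QA", closed_as="Duplicate"),
    Defect("Open", "ProductManagement"),
]
report = build_report(sample)
```

With this sample, half of the defects are not development problems and a quarter are requirements defects.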
It may sound like too much work to change your existing process, but if you are already in Bug Tracker hell, what is your alternative?


Need help getting out of Bug Tracker hell?  Write to me at dmahal@AcceleratedDevelopment.ca


Appendix: Importance of Capturing Requirements Defects

The report on the % of requirements defects is particularly important because it represents the amount of scope shift (creep) in your project.  You can see this in the blog Shift Happens.  Also, rates of scope shift of 2% per month are strong indicators of impending swarms of bugs and project failure.  Analysis shows that the probability of a project being canceled is highly correlated with the amount of scope shift.  Simply creating enhancements in the Bug Tracker hides this problem and does not help the team.