Wednesday 20 June 2012

Efficiency is for Losers

Focusing on efficiency and ignoring effectiveness is the root cause of most software project failures.

Effectiveness is producing the intended or expected result. Efficiency is the ability to accomplish a job with a minimum expenditure of time and effort.

Effective software projects deliver code that the end users need; efficient projects deliver that code with a minimum number of resources and time.

Sometimes, we become so obsessed with things we can measure, e.g. the project end date or kLOC, that we somehow forget what we were building in the first place.  When you're up to your hips in alligators, it's hard to remember you were there to drain the swamp.


Efficiency only matters if you are being effective.

After 50 years, the top three end-user complaints about software are:
  1. It took too long
  2. It cost too much
  3. It doesn't do what we need
Salaries are the biggest cost of most software projects, hence if it takes too long then it will cost too much, so we can reduce the complaints to:  
  1. It took too long
  2. It doesn't do what we need  
The first issue is a complaint about our efficiency and the second is a complaint about our effectiveness. Let's make sure that we have common  definitions of these two issues before continuing to look at the interplay between efficiency and effectiveness.

Are We There Yet? 

Are you late if you miss the project end date? 

That depends on your point of view; consider a well-specified project (i.e. good requirements) with a good work breakdown structure, estimated by competent architects to take a competent team of 10 developers at least 15 months to build. Let's consider five scenarios, identical except as stated below:

Under which circumstances is the project late?
  A. Senior management gives the team 6 months to build the software.  
  B. Senior management assigns a team of 5 competent developers instead of 10.  
  C. Senior management assigns a team of 10 untrained developers  
  D. You have the correct team, but, each developer needs to spend 20-35% of their time maintaining code on another legacy system  
  E. The project is staffed as expected  

Here are the above scenarios in a table:

  #   Team                      Resource Commitment   Months Given   Result
  A   10 competent developers   100%                  6              Unrealistic estimate
  B   5 competent developers    100%                  15             Understaffed
  C   10 untrained developers   100%                  15             Untrained staff
  D   10 competent developers   65-80%                15             Team under-committed
  E   10 competent developers   100%                  15             Late


Only the last project (E) is late, because only in that scenario was the end date estimated consistently with the project resources actually available.
   
Other well known variations which are not late when the end date is missed:  
  • Project end date is a SWAG or management declared
  • Project has poor requirements
  • You tell the end-user 10 months when the estimate is 15 months.
If any of the conditions of project E are missing then you have a problem in estimation.  You may still be late, but not based on the project end date computed with bad assumptions.
Of course, being late may be acceptable if you deliver a subset of the expected system.

It Doesn't Work


“It doesn't do what we need” is a failure to deliver what the end user needs. How do we figure out what the end user needs?

The requirements for a system come from a variety of sources:
  1. End-users
  2. Sales and marketing (includes competitors)
  3. Product management
  4. Engineering

These initial requirements will rarely be consistent with each other. In fact, each of these constituents will have a different impression of the requirements.

You would expect the raw requirements to be contradictory in places. The beliefs are like the 4 circles to the left, and the intersection of their beliefs would be the black area.



The different sources of requirements do not agree because:
  • Everyone has a different point of view
  • Everyone has a different set of beliefs about what is being built
  • Everyone has a different capability of articulating their needs
  • Product managers have varying abilities to synthesize consistent requirements
It is the job of product management to synthesize the different viewpoints into a single set of consistent requirements. If engineering starts before requirements are consistent then you will end up with many fire-fighting meetings and lose time.

Many projects start before the requirements are consistent enough. We hope the initial requirements are a subset of what is required; in practice, we will have missed some requirements and included others that are not needed (see the Capers Jones data at the bottom of this post).

The yellow circle represents what we have captured, the black circle represents the real requirements.


We rarely have consistent requirements when we start a project; that is why different forms of the following cartoon are lying around on the Internet.

If you don't do all the following:
  • Interview all stakeholders for requirements  
  • Have product management get end-users to articulate their real needs
  • Synthesize consistent requirements  
Then you will fail to build the correct software. Skip any of this work and you are guaranteed to get the response, "It doesn't do what we need".

Effectiveness vs. Efficiency

So, let's repeat our user complaints:  
  1. It took too long
  2. It doesn't do what we need  
It's possible to deliver the correct software late.
It's impossible to deliver on time if the software doesn't work.
Focusing on effectiveness is more important than efficiency if a software project is to be delivered successfully.  

Ineffectiveness Comes from Poor Requirements

Most organizations don’t test the validity or completeness of their requirements before starting a software project. The requirements get translated into a project plan and then the project manager will attempt to execute the project plan. The project plan becomes the bible and everyone marches to it. As long as tasks are completed on time everyone assumes that you are effective, i.e. doing the right thing.

That is until virtually all the tasks are jammed at 95% complete and the project is nowhere near completion.

At some point someone will notice something and say, “I don’t think this feature should work this way”. This will provoke discussions between developers, QA, and product management on correct program behavior. This will spark a series of fire-fighting meetings to resolve the inconsistency, issue a defect, and fix the problem. All of the extra meetings will start causing tasks on the project plan to slip.


We discussed the root causes of fire-fighting in a  previous blog entry.

When fire-fighting starts productivity will grind to a halt. Developers will lose productivity because they will end up being pulled into the endless meetings. At this point the schedule starts slipping and we become focused on the project plan and deadline. Scope gets reduced to help make the project deadline; unfortunately, we tend to throw effectiveness out the window at this point.

With any luck the project and product manager can find a way to reduce scope enough to declare victory after missing the original deadline.  

The interesting thing here is that the project failed before it started. The real cause of the failure was the inconsistent requirements.

But, in the chaos of fire-fighting and endless meetings, no one will remember that the requirements were the root cause of the problem.  

What is the cost of poor requirements? Fortunately, WWMCCS has an answer. As a military organization, they must track everything in a detailed fashion and perform root cause analysis for each defect (see diagram).

This drawing shows what we know to be true.
The longer a requirement problem takes to discover, the harder and more expensive it is to fix!  
A requirement that would take 1 hour to fix will take 900 hours to fix if it slips to system testing.

Conclusion

It is much more important to focus on effectiveness during a project than efficiency. When it becomes clear that you will not make the project end date, you need to stay focused on building the correct software.

Are you tired of the cycle of:  
  • Collecting inconsistent requirements?  
  • Building a project plan based on the inconsistent requirements?  
  • Estimating projects and having senior management disbelieve the estimates?
  • Focusing on the project end date and not on end user needs?  
  • Fire-fighting over inconsistent requirements?  
  • Losing developer productivity from endless meetings?
  • Missing the end date and also failing to deliver what the end-users need?  
The fact that organizations go through this cycle over and over while expecting successful projects is insanity – real world Dilbert cartoons.

How many times are you going to rinse and repeat this process until you try something different? If you want to break this cycle, then you need to start collecting consistent requirements.

Think about the impact to your career of the following scenarios:
  1. You miss the deadline but build a subset of what the end-user needs
  2. You miss the deadline and don't have what the end-user needs  
You can at least declare some kind of victory in scenario 1 and your resume will not take a big hit. It's pretty hard to make up for scenario 2 no matter how you slice it.

Alternatively, you can save yourself wasted time by making sure the requirements are consistent before you start development. Inconsistent requirements will lead to fire-fighting later in the project.

As a developer, when you are handed the requirements, make a point of looking for inconsistencies. The entire team should go through the requirements together and force product management to fix any inconsistencies before you start developing.

It may sound like a waste of time but it will push the problem of poor requirements back into product management and save you from being in endless meetings.

Cultivating the patience to hold out for good requirements will lower your blood pressure and help you sleep at night.  Of course, once you get good requirements, you should then hold out for proper project estimates :-)

Moo?


Want to see another sacred cow get killed? Check out



Courtesy of Capers Jones via LinkedIn on 6/22

Customers themselves are often not sure of their requirements.

For a large system of about 10,000 function points, here is what might be seen for the requirements.

This is from a paper on requirements problems - send an email to capers.jones3@gmail.com if you want a copy.

Requirements specification pages = 2,500
Requirements words = 1,125,000
Requirements diagrams = 300

Specific user requirements = 7,407
Missing requirements = 1,050
Incorrect requirements = 875
Superfluous requirements = 375
Toxic harmful requirements = 18

Initial requirements completeness = < 60%
Total requirements creep = 2,687 function points

Deferred requirements to meet schedule = 1,522

Complete and accurate requirements are only possible below about 1,000 function points. Above that, errors and missing requirements are endemic.

Monday 18 June 2012

Uncertainty and Risk in Software Development (2 of 3)

Defining Risk and its Components

Part 1 of 3 is here.

There are future events whose impact can have a negative outcome or consequence to our project. A future event can only be risky if the event is uncertain. If an event is certain then it is no longer a risk even if the entire team does not perceive the certainty of the event.

Risks always apply to a measurable goal that we are trying to achieve; if there is no goal there can be no risk, i.e. a project can't have schedule risk if it has no deadline.

Once a goal has been impacted by a risk we say that the risk has triggered. The severity of the outcome depends on how far it displaces us from our goal. Once triggered, there should be a mitigation process to reduce the severity of the possible outcomes.
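Purely as an illustration of how these components hang together (the record layout and field names below are my own, not part of any formal method), a risk entry might look like this in Python:

# Python: illustrative record of the components of risk

from dataclasses import dataclass

@dataclass
class Risk:
    goal: str           # the measurable goal the risk applies to
    event: str          # the uncertain future event
    probability: float  # likelihood that the event occurs (0.0 to 1.0)
    severity: str       # how far a triggered event displaces us from the goal
    mitigation: str     # process to reduce the severity once the risk triggers

schedule_risk = Risk(
    goal="Deliver the system by the estimated end date",
    event="Tasks on the critical path are underestimated",
    probability=0.3,
    severity="The project misses the end date",
    mitigation="Re-plan scope and re-estimate the remaining work",
)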



Before looking at software project risks tied to these goals, let's make sure that we all understand the components of risk by going through an example.

Risk Example: Auto Collision

Let's talk about risk using a physical example to make things concrete. The primary goal of driving a car is to get from point A to point B. Some secondary goals are:
  • Get to the destination in a reasonable time
  • Make sure all passengers arrive in good health.
  • Make sure that the car arrives in the same condition it departs in.
There is a risk of collision every time you drive your car:
  • The event of a collision is uncertain
  • The outcome is the damage cost and possible personal injury
  • The severity is proportional to the amount of damage and personal injury sustained if there is an accident
    • If there is loss of life then the severity is catastrophic
A collision will affect one or more of the above goals. Risk management with respect to auto collisions involves:
  • Reducing the probability of a collision
  • Minimizing the effects of a collision
There are actions that can reduce or increase the likelihood of a collision:
  • Things that reduce the chance of collision
    • Understanding safe driving techniques  
    • Driving when there are fewer drivers on the road
    • Using proper turn signals
  • Things that increase the chance of collision
    • Drinking and driving
    • Driving in heavy fog 
    • Wearing sunglasses at night
By taking the actions that reduce a collision while avoiding the actions that increase it we can reduce the probability or likelihood of a collision.

Reducing the likelihood of a collision does not change the severity of the event if it occurs. The likelihood of an event and its consequence are independent, even though some actions (e.g. driving slowly) reduce both the likelihood and the consequences of an event.

If an auto collision happens then a mitigation strategy would attempt to minimize the effect of the impact. Mitigation strategies with respect to auto collision are:
  • Wear a seat belt
  • Drive a car with excellent safety features
  • Have insurance for collisions
  • Have the ability to communicate at all times (i.e. cell phone, etc)  
Having a mitigation strategy will not reduce the chance of a collision, it will only lessen the severity.

Goals of a Software Project

The primary goals of a software project are:
  1. Building the correct software system
  2. Building the system so that its benefits exceed its costs (i.e. NPV positive)

Building the Correct Software System

What is the correct software system? Cartoons similar to this one are easily found on the Internet:

The correct system is shown in the last frame of the cartoon; so let's define the  correct system as what the customer actually needs. To build the correct system we will need to have correct requirements. 

How Long Will The Project Take?

Let's assume we have complete and consistent requirements for a correct system. How long will it take to build this system? One approach is to take a competent team and have them build out the system without imposing a deadline. Once the system is built we would have the actual time to build the system (Tbuild).

Tbuild is theoretical because, unless you are using an Agile methodology, you will want to estimate how long it takes to produce the system before you start. Nonetheless, given your resources and requirements, Tbuild does exist and is a finite number; as one of my colleagues used to say, "the software will take the time that it takes".

Most executives want to know how long a project is going to take before the project starts. To do this we take the requirements and form an estimate (Testimate) of how long the system will take to build. The key point is that the actual time to build, Tbuild, and the estimated time to build the system, Testimate, will be different, and Testimate is only valid to the extent that you use a valid methodology for establishing it.

Building the System so that its Benefits Exceed its Costs 

Building a system so that its benefits exceed its costs is equivalent to saying that the project puts money on the organization's bottom line. We hope that an organization will do the following:
  1. Define the system correctly (project scope)
  2. Assess the financial viability of the project (capital budgeting)
  3. Establish a viable project plan
Financial viability implies that the available resources will be able to produce the desired system before a specific date (Tviable). If Tbuild < Tviable then the organization will have a financially successful project; if Tbuild > Tviable then the organization will have a financial failure.
The problem is that we don't know Tbuild unless we build out the software system first, but we don't want to build the system if it is not viable, i.e. if Tbuild > Tviable and the project would be a financial failure. We need a reasonable expectation that the project is viable BEFORE we build it out, so we use a proxy: we estimate the time it will take to build the software (Testimate) from our project plan.

Once we have a time estimate then we can go forward on the project if Testimate < Tviable. The estimate, Testimate, for a project can be done in multiple ways:
  1. Formal cost estimation techniques
  2. Senior management declared deadlines
  3. SWAG  estimates  
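To make the comparison concrete, here is a minimal sketch of the viability check described above; the names follow the post's Tbuild/Testimate/Tviable notation, and the function itself is hypothetical:

# Python: hypothetical go/no-go check using the post's notation

def project_is_viable(t_estimate_months, t_viable_months):
    # Proceed only if the estimated build time beats the financially viable date
    return t_estimate_months < t_viable_months

# Example: a 15-month estimate against an 18-month viability window
if project_is_viable(15, 18):
    print("Reasonable expectation of a financially successful project")
else:
    print("Estimate exceeds the viable date; rework scope or stop")

Of course, the check is only as good as Testimate; an arbitrary deadline fed into it tells you nothing about Tbuild.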

Software Project Risks

There are several primary risks for a software project:
  • Schedule risk
  • Estimation risk
  • Requirements risk
  • Technical risk
We often confuse schedule risk and estimation risk. Schedule risk is the risk that the tasks on the critical path have been underestimated and the project will miss the end date (i.e. Tbuild > Testimate). A project that takes longer than the estimate is not necessarily a failure unless Tbuild > Tviable.
You can only talk meaningfully about schedule risk in projects where:
  • formal estimation techniques are used
  • proper task dependency analysis is done
  • project critical path is identified
Most of us do not work for organizations that are CMM level 4+ (or equivalent), so you are unlikely to be using formal methods. When the project end date is arbitrary (i.e. method 2 or 3 above) it is not meaningful to talk about schedule risk, especially since history shows that we underestimate how long it will take to build the system, i.e. Testimate <<<  Tbuild. When formal methods are not used (i.e. method 2 or 3 above) then the real issue is estimation risk and not schedule risk.

The real tragedy is when an IT department attempts to meet an unrealistic date set by management when a realistic date would still yield a viable project (below). Unfortunately, unrealistic deadlines cause developers to take shortcuts that usually cripple the architecture, so by the time management grants additional time after the original date fails, the damage to the architecture is terminal and you can't achieve the initial objective.
Requirements risk is the risk that we do not have the correct requirements and are unable to get to a subset of the requirements that enables us to build the correct system prior to the project end date. There are many reasons for having incorrect requirements when a project starts:
  • The customer can not articulate what he needs
  • Requirements are not gathered from all stakeholders for the project
  • Requirements are incomplete
  • Requirements are inconsistent
Technical risk is the risk that some feature of the correct system cannot be implemented for a technical reason. If a technical issue has no workaround and is critical to the correct system then the project will need to be abandoned.

If the technical issue has a workaround, then:
  • If the workaround still prevents the correct system from being built then we have requirements risk
  • If the workaround takes too long it can trigger schedule risk
Next blog:
  • Discuss other risks and how they roll up into one of the 4 risks outlined above
  • Discuss how risk probability and severity combines to form acceptable or unacceptable risks
  • Discuss risk mitigation strategies
  • Discuss how to form a risk table/database
  • Discuss how to redefine victory for informal projects

Friday 8 June 2012

Polymorphism and Inheritance are Independent of Each Other


Flexible programs focus on polymorphism and not inheritance.  Some languages rely on static type checking (C++, Java, C#), which links the two concepts and reduces polymorphic opportunities.  Languages that separate the concepts allow you to focus on polymorphism and create more robust code.  JavaScript, Python, Ruby, and VB.NET do not have typed variables and defer type checking to runtime.  Is the value of static type checking worth giving up the power of pure polymorphism at runtime?

Inheritance and polymorphism are independent but related entities – it is possible to have one without the other.  If we use a language that requires variables to have a specific type (C++, C#, Java) then we might believe that these concepts are linked.

If you only use languages that do not require variables to be declared with a specific type, i.e. var in JavaScript, def in Python, def in Ruby, dim in VB.NET, then you probably have no idea what I'm squawking about! :-)

I believe that the benefits of pure polymorphism outweigh the value of static type checking.  Now that we have fast processors, sophisticated debuggers, and runtime exception constructs the value of type checking at compile time is minimal.   Some struggle with polymorphism, so let's define it:

Polymorphism is the ability to send a message to an object without knowing what its type is.   

Polymorphism is the reason why we can drive each other's cars and why we can use different light switches.  A car is polymorphic because you can send commonly understood messages to ANY car (start(), accelerate(), turnLeft(), turnRight(), etc.) without knowing WHO built the car.  A light switch is polymorphic because you can send the messages turnOn() and turnOff() to any light switch without knowing who manufactured it.

Polymorphism is literally what makes our economy work.  It allows us to build functionally equivalent products that can have radically different implementations.  This is the basis for price and quality differences in products, i.e. toasters, blenders, etc.

Polymorphism through Inheritance


The UML diagram above shows how polymorphism is stated in languages like C++, Java, and C#.  The method (a.k.a operation) start() is declared to be abstract (in UML), which defers the  implementation of the method to the subclasses in your target language.  The method for start() is declared in class Car and specifies only the method signature and not an implementation (technically polymorphism requires that no code exists for method start() in class Car).

The code for method start() is then implemented separately in the VolkswagenBeetle and SportsCar subclasses.  Polymorphism implies that start() is implemented using different attributes in the subclasses; otherwise the start() method could simply have been implemented in the superclass Car.

Even though most of us no longer code in C++, it is instructive to see why a strong link between inheritance and polymorphism kills flexibility.

// C++ polymorphism through inheritance

class Car {
public:
     // declare signature as a pure virtual function
     virtual bool start() = 0;
};

class VolkswagenBeetle : public Car {
public:
     bool start() {
          // implementation code
          return true;
     }
};

class SportsCar : public Car {
public:
     bool start() {
          // implementation code
          return true;
     }
};

// Invocation of polymorphism
Car* cars[] = { new VolkswagenBeetle(), new SportsCar() };

for (int i = 0; i < 2; i++)
     cars[i]->start();


The cars array holds pointers to Car and can only hold objects that derive from Car (VolkswagenBeetle and SportsCar), and polymorphism works as expected.  However, suppose I had the following additional class in my C++ program:

// C++ lack of polymorphism with no inheritance

class Jalopy {
public:
     bool start() {
          // implementation code
          return true;
     }
};

// Jalopy does not inherit from Car, so the following is illegal
Car* cars[] = { new VolkswagenBeetle(), new Jalopy() };  // compile error

for (int i = 0; i < 2; i++)
     cars[i]->start();

At compile time this will generate an error because the Jalopy type is not derived from Car.  Even though they both implement the start() method with an identical signature, the compiler will stop me because there is a static type error.

Strong type checking imposed at compile time means that all polymorphism has to come through inheritance.  This leads to deep inheritance hierarchies and multiple inheritance, both of which bring all kinds of unexpected side effects.  Even moderately complex programs become very hard to understand and maintain in C++.

Historical note: C++ was dominant until the mid 1990s simply because it was an object oriented solution that was NOT interpreted.  This meant that on the slow CPUs of the time it had decent performance.  We used C++ because we could not get comparable performance with any of the interpreted object-oriented languages of the time, i.e. Smalltalk.

Weakening the Link

The negative effects of the tight link between inheritance and polymorphism led both Java and C# to introduce the concept of an interface, which pries apart the ideas of inheritance and polymorphism while keeping strong type checking at compile time.

First it is possible to implement the above C++ example using inheritance as shown by the C# below:

// C# polymorphism using inheritance

abstract class Car {
     public abstract bool start();  // declare signature only
}

class VolkswagenBeetle : Car {
     public override bool start() {
          // implementation code
          return true;
     }
}

class SportsCar : Car {
     public override bool start() {
          // implementation code
          return true;
     }
}

// Invocation of polymorphism
Car[] cars = { new VolkswagenBeetle(), new SportsCar() };

for (int i = 0; i < 2; i++)
     cars[i].start();

In addition, through the use of the interface concept we can write the classes in Java as follows:

// Java polymorphism using interface

interface Car {
     boolean start();
}

class VolkswagenBeetle implements Car {
     public boolean start() {
          // implementation code
          return true;
     }
}

class SportsCar implements Car {
     public boolean start() {
          // implementation code
          return true;
     }
}

By using an interface, the implementations of VolkswagenBeetle and SportsCar can be completely independent as long as they continue to satisfy the Car interface.  In this manner, we can now make our Jalopy class polymorphic with the other two classes simply by implementing the Car interface:

class Jalopy implements Car {
     public boolean start() { return true; }  // implementation code
}


Polymorphism without Inheritance

There are languages where you have polymorphism without using inheritance.  Some examples are JavaScript, Python, Ruby, VB.NET, and Smalltalk.

In each of these languages it is possible to write car.start() without knowing anything about the object car and its method. 

# Python polymorphism (no common base class required)

class VolkswagenBeetle:
    def start(self):
        pass  # Code to start Volkswagen

class SportsCar:
    def start(self):
        pass  # Code to start SportsCar

# Invocation of polymorphism
cars = [VolkswagenBeetle(), SportsCar()]
for car in cars:
    car.start()

The ability to get pure polymorphism stems from the fact that these languages have only a single variable type prior to runtime, i.e. var in JavaScript, def in Python and Ruby, dim in VB.NET.  With only one variable type, there cannot be a type error prior to runtime.

Historical note: It was only around the time frame when Java and C# were introduced that CPU power became sufficient for interpreted languages to give acceptable performance at run time.  The transition from having polymorphism and inheritance tightly coupled to being more loosely coupled depended on run time interpreters being able to execute practical applications with decent performance.

There is no Such Thing as a Free Lunch

When type checking is deferred to runtime you can end up with strange behaviors when you make method calls to objects that don’t implement the method, i.e. sending start() to an object with no start() method.

When type checking is deferred to runtime, you want an object to respond “I have no idea how to start()” if you send the start() method to it by accident.

Pure polymorphic languages generally have a way of detecting missing methods:
  • In Visual Basic you can get the NotImplementedException
  • In Ruby you either implement the method_missing()  method or catch a NoMethodError exception
  • In Smalltalk you get the #doesNotUnderstand exception

Other languages are clunkier if you want to detect the problem gracefully:
  • In Python, calling a missing method raises an AttributeError; to test before calling you have to use getattr() to see if the attribute exists and then callable() to figure out if it can be called.  For the car example above it would look like:
startCar = getattr(car, "start", None)
if callable(startCar):
    startCar()
  • In JavaScript (ECMAScript), calling a missing method raises a TypeError; only Firefox/SpiderMonkey offered a non-standard hook to intercept the missing call.

Even if you have a well defined exception mechanism (i.e. try catch), when you defer type checking to runtime it becomes harder to prove that your programs work correctly. 

Untyped variables at development time allow a developer to create collections of heterogeneous objects (i.e. sets, bags, vectors, maps, arrays). 

When you iterate over these heterogeneous collections there is always the possibility that a method will be called on an object that is not implemented.  Even if you get an exception when this happens, it can take a long time for subtle problems to be found.
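Here is a minimal sketch of that failure mode, restating the car classes from the Python example above and adding a hypothetical Jalopy that deliberately lacks a start() method:

# Python: a heterogeneous collection hiding a missing method

class VolkswagenBeetle:
    def start(self):
        print("Volkswagen started")

class SportsCar:
    def start(self):
        print("Sports car started")

class Jalopy:
    pass  # deliberately has no start() method

cars = [VolkswagenBeetle(), SportsCar(), Jalopy()]
for car in cars:
    try:
        car.start()
    except AttributeError as error:
        # the defect only shows up at runtime, and only when this path executes
        print("Cannot start", type(car).__name__, "-", error)

Nothing complains until the loop actually reaches the Jalopy, which is exactly why these defects can take a long time to surface.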

Conclusion

The concepts of polymorphism and inheritance are linked only if your language requires static type checking  (C++, Java, C#, etc).  Any language with only a generic type for variable declaration has full separation of polymorphism and inheritance (JavaScript, Python, Ruby, VB.NET), whether they are compiled to byte code or are directly interpreted.

The original compiled languages (C++, etc.) performed static type checking because of performance issues.  Performing type checking at compile time created a strong link between inheritance and polymorphism.  The resulting deep class inheritance structures and multiple inheritance led to runtime side effects and code that was hard to understand.

Languages like C# and Java use the notion of an interface to weaken the link between inheritance and polymorphism while preserving type checking at compile time.  In addition, these languages compile to byte code that is interpreted at runtime, striking a balance between static type checking and runtime performance.

Languages like Ruby, Python, JavaScript, Visual Basic, and Smalltalk take advantage of powerful CPUs and use interpreters to defer type checking to run time (whether the source code is compiled to byte code or purely interpreted).  By deferring type checking we break the link between inheritance and polymorphism; however, this power comes with the difficulty of proving that a subtle runtime problem won't emerge.

The one caveat to pure polymorphism is that we may develop subtle bugs that can be difficult to track down and fix.  Pure polymorphism is only worth seeking if the language that you are using can reliably throw an exception when a method is not implemented.


Effective programmers seek polymorphism, not inheritance.  In my view, the benefits of pure polymorphism outweigh any advantage that compile-time type checking provides, especially now that we have access to very sophisticated debuggers and support for runtime exception handling.

Tuesday 5 June 2012

Comments are for Losers

Imagine how much time you would waste driving in a city where road signs looked like the one on the right.

If software development is like driving a car then comments are road signs along the way.

Comments are purely informational and do NOT affect the final machine code. 

A good comment is one that reduces the development life cycle for the next developer that drives down the road. 

A bad comment is one that increases the development life cycle for any developer unfortunate enough to have to drive down that road.  

Sometimes that next unfortunate driver will be you several years later!

I know most developers are hardwired to automatically assume that comments are a good thing.  I'd be interested in hearing from anyone who has never experienced the problems below... 

Comments do not Necessarily Increase Development Speed

I was in university in 1985 and one of my professors presented a paper (which I have been unable to locate) of a study done in the 1970s.  The study took a software program, introduced defects into it, and then asked several teams to find as many defects as they could.

The interesting part of the study was that 50% of the teams had the comments completely removed from the source code.  The result was that the teams without comments not only found more defects but also found them in less time.


So unfortunately, comments can serve as weapons of mass distraction

Bad comments

A bad comment is one that wastes your time and does not help you to drive your development faster.  Let's go through the categories of really bad comments:
  • Too many comments
  • Excessive history comments
  • Emotional and humorous comments
Too many comments are a clear case of less being more.  Some of us have worked on programs where there were so many comments that you could barely find the code!

History comments make some sense, but then again isn't that what the version control comment is for?  History comments are questionable when you have to page down multiple times just to get to the beginning of the source code.  If anything, history comments should be moved to the bottom of the file so that Ctrl-End actually takes you to the bottom of the modification history.

We have all run across comments that are not relevant.  Some comments are purely about the developer's instantaneous emotional and intellectual state, some are about how clever they are, and some are simply attempts at humor (don't quit your day job!). 

Check out some of these gems (more can be found here):

//Mr. Compiler, please do not read this.

// I am not sure if we need this, but too scared to delete.

//When I wrote this, only God and I understood what I was doing
//Now, God only knows

// I am not responsible of this code.
// They made me write it, against my will.

// I have to find a better job

try { ... }
catch (SQLException ex) {
    // Basically, without saying too much, you're screwed. Royally and totally.
} catch (Exception ex) {
    // If you thought you were screwed before, boy have I news for you!!!
}

// Catching exceptions is for communists

// If you're reading this, that means you have been put in charge of my previous project. 
// I am so, so sorry for you. God speed.

// if i ever see this again i'm going to start bringing guns to work

//You are not expected to understand this

Use Self-Documenting Code

We are practitioners of computer science and not computer art.  We apply science to software by checking the functionality we desire (requirements model) against the behavior of the program (machine code model).  

When observations of the final program disagree with the requirements model we have a defect which leads us to change our machine code model.

Of course we don't alter the machine code model directly (at least most of us); we update the source code, which is the only real model.  Since comments are not compiled into the machine code, there is some logic to making sure that the source code model is self-documenting.

Code is the only model that really counts!

Self-documenting code requires that you choose good names for variables, classes, functions, and enumerated types.  Self-documenting means that OTHER developers can understand what you have done.  Good self-documenting code has the same characteristic as good comments; it decreases the time it takes to do development.
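As a small, invented illustration of the difference (the function and variable names here are hypothetical):

# Python: the same calculation with and without self-documenting names

# Hard to follow: the reader has to reverse-engineer the intent
def calc(a, b, c):
    return a * b * (1 + c)

# Self-documenting: the names carry the intent, no comment required
def invoice_total(unit_price, quantity, tax_rate):
    return unit_price * quantity * (1 + tax_rate)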

Practically, your code is self-documenting when your peers say that it is, not when YOU say that it is. Peer-reviewed comments and code are the only way to make sure that the code will lead to faster development cycles.

Comments gone Wild

Even if all the comments in a program are good (i.e. reduce development life cycle) they are subject to drift over time.  

The speed of software development makes it difficult to make sure that comments stay in alignment with the source code.

Comments that are allowed to drift become road signs that are no longer relevant to drivers.

Good comments go wild when the developer is so focused on getting a release out that he does not stop to maintain comments.  Comments have gone wild when they become misaligned with the source code; you have to terminate them.

No animals (or comments) were harmed in the writing of this blog.

Commented Code

Code gets commented out during a software release as we experiment with different designs or debug.  What is really not clear is why code remains commented out at the final check-in of a software release.

Over my career as a manager and trainer, I've asked developers why they comment out sections of code. The universal answer that I get is “just in case”. 


Just in case what?  

At the end of a software release you have already established that you are not going to use your commented code, so why are you dragging it around? People hang on to commented code as if it is a “Get Out of Jail Free” card; it isn't.

The reality is that commented code can be a big distraction.  When you leave commented code in your source code you are leaving a land mine for the next developer that walks through it. 

When the pressure is on to get defects fixed developers will uncomment previously commented code to see if it will fix the problem.  There is no substitute for understanding the code you are working on – you might get lucky when you reinstate commented code; in all likelihood it will blow up in your face.

Solutions

If your developers are not taking (or are not given) enough time to put in good comments then they should not write ANY comments. You will get more productivity because they will not waste time putting in bad comments that slow everyone else down.

Time spent on writing self-documenting code will help you and your successors reduce development life cycles.  It is absolutely false to believe that you do not have time to write self-documenting code.

If you are going to take on the hazards of writing comments then they need to be peer reviewed to make sure that OTHER developers understand the code.  Unless the code reviewer(s) understands all the comments the code should not pass inspection.

If you don't have a code review process then you are only commenting the code for yourself.  The key principle when writing comments is Non Nobis Solum (not for ourselves alone).

When you see a comment that sends you on a wild goose chase – fix it or delete it.  

If you are the new guy on the team and realize that the comments are wasting your time – get rid of them; your development speed will go up.




Other articles in the "Loser" series
Moo?

Want to see more sacred cows get tipped? Check out
Make no mistake, I am the biggest "Loser" of them all.  I believe that I have made every mistake in the book at least once :-)