
The Real Cost of Technical Debt

by Platinum Edge
[Figure: the cycle of not automating because we don't have time, and not having time because we don't automate]

“Technical debt” is a term coined by Ward Cunningham to explain how having to rework software to align with the desired outcome slows down a team; it is like paying interest on a loan. It can be a deceptive concept: the idea that “it’s okay for now” can grow to the point where it impacts velocity, quality, value and morale. In the end, technical debt is tricky in that you don’t know what it costs until you have to pay it.

Technical debt is much like financial debt: we use it for various reasons (some calculated, some risky) with the understanding that we will be paying interest on it.  Steve McConnell, author of Code Complete, points out that it is the interest that defines technical debt.  Risks such as poor coding practices, insufficient testing, and unclear requirements cannot be ignored in the long run. As soon as a team accepts the risk and moves forward with risky practices, the negative impact on the overall system and product begins, whether visible or not. At some point, the impact will be exposed and will have to be addressed. By then, addressing the issue will usually be more complex and expensive, given whatever changes and evolution the system has undergone since the risk was incurred. This is the interest that must be paid. It compounds over time and slows down velocity later on.

In limited cases, technical debt may be used in a calculated manner to:

•       Gain short-term speed (before a release or to hit a market window)

•       Extend short-term resources (startup capital, a grant, etc.)

•       Avoid wasting resources when the technical debt will soon be irrelevant (servicing a system near end-of-life)

In the above strategic examples, part of the calculation should involve a plan and budget for paying down the accepted debt sooner rather than later.

Aside from the above exceptions, riskier technical debt accumulation behavior may stem from:

•       Tendency for short-term focus (“just get this release/fix/feature out”)

•       Laziness or unwillingness to do something correctly (e.g. poor coding techniques or lack of coding standards, skipping unit or other types of tests because it’s boring or seems nit-picky, etc.)

•       Lack of infrastructure to support needed practices (e.g. continuous integration systems, automation of frequent or tedious tasks, etc)

•       Insufficient knowledge or practices (e.g. not doing automated testing, the team isn’t learning from each other, code writers and testers operating in silos, etc.)

•       The work isn’t directly related to a new valuable feature

•       Deadlines imposed on developers without consideration of impact on technical quality

Even if technical debt is incurred for a calculated reason, that doesn’t mean it doesn’t need to be paid.  Not paying down technical debt, like financial debt, means the debt will increase over time.

Impact on Velocity

With enough credit card debt (e.g. maxed out limits), all of your disposable income can go towards just paying off the existing interest, and no new purchases can be made.  You can get to the same point with technical debt, where maintaining current functionality requires all capacity, limiting your ability to innovate and deliver additional value.

For example, suppose that you have a team writing a calculator app. Consider the following progression of value delivered and risk (debt) accumulated:

·       Sprint 1: Basic GUI—enter numbers and they show up in the display.  It’s such a simple program, we decide to not write any automated tests for this.

·       Sprint 2: Implement the addition feature.  Works great, still easy to test manually.  We count our testing of addition as also testing number entry.

·       Sprint 3: Subtraction.  Subtraction is just as easy to test as addition was.  But, to keep quality high, we need to manually test both subtraction and addition.

·       Sprint 4: Multiplication.  Again, the feature works wonderfully, but the tester is starting to get a little frustrated.  She needs to test multiplication, subtraction and addition.

·       Sprint 5: Division.  Manual testing will now include testing for four features—three of which were tested in a previous sprint.  She manages to work a little extra on Saturday, but gets everything done. She’s pretty good at running those manual test scripts now, but she lost part of her weekend.

·       Sprint 6: Decimals.  But in sprint planning, the tester says that it is too big for a sprint.  The rest of the team says that it seems no larger in effort than the other features.  The tester points out that she will now need to retest everything (which was already an entire sprint’s worth of work, and a bit more), as well as test everything using decimals.  Instead of 6 units of testing (GUI, addition, subtraction, multiplication, division, decimals), she will really have 10 (the previous five sprints’ worth of functionality, both with and without decimals).  If current quality is to be maintained, velocity has to slow down.

If the development team had managed its technical debt and begun using automation earlier, the tester’s point about needing to create new tests for all the previous features would still be valid.  However, the existing automated tests would already cover half the work.  Also, the automation would allow the tester to begin writing tests while the team was implementing the decimals feature, instead of having to do all the testing towards the end of the sprint.
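As a rough illustration of what that automation might look like, here is a minimal Python sketch. The calculate function, the operator table and the test cases are all hypothetical (the calculator team in this example never shows us their code), so treat it as one possible shape rather than a prescribed approach. The point is that the whole-number cases are written once and rerun for free every sprint, leaving only the decimal cases as new work in sprint 6.

```python
from decimal import Decimal
import operator

# Hypothetical calculator core; the article does not specify an implementation.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left: str, op: str, right: str) -> Decimal:
    """Evaluate one binary operation on numbers entered as text."""
    return OPERATIONS[op](Decimal(left), Decimal(right))


# Sprints 2-5: whole-number cases, written once and rerun every sprint.
WHOLE_NUMBER_CASES = [
    ("2", "+", "3", "5"),
    ("7", "-", "4", "3"),
    ("6", "*", "5", "30"),
    ("8", "/", "2", "4"),
]

# Sprint 6: only the decimal cases are new work for the tester.
DECIMAL_CASES = [
    ("0.1", "+", "0.2", "0.3"),
    ("1.5", "-", "0.25", "1.25"),
    ("2.5", "*", "0.4", "1.0"),
    ("1.5", "/", "0.5", "3"),
]


def run_suite(cases):
    """Return the cases whose actual result differs from the expected one."""
    failures = []
    for left, op, right, expected in cases:
        actual = calculate(left, op, right)
        if actual != Decimal(expected):
            failures.append((left, op, right, expected, str(actual)))
    return failures


if __name__ == "__main__":
    failures = run_suite(WHOLE_NUMBER_CASES + DECIMAL_CASES)
    print("failures:", failures)
```

Running both lists together takes seconds, so the regression burden no longer grows with every feature the way the manual test scripts did.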

Decrease in Quality

Technical debt is the choice to take the easier short-term solution at the expense of the long term.  This will always have an impact on quality assurance (i.e. does the product meet the customer or business need it was meant to address?) and quality control (i.e. is the product of high technical quality?). In both cases, the result is customer disappointment and wasteful rework later.

Less Value Delivered

As decreases in quality lead to more customer-reported defects and firefighting, the product owner’s forecast includes less and less valuable functionality.  Although the development team may still be working just as hard and just as many hours (likely more), less new value is delivered, because capacity has to go towards paying off accrued debt that can no longer be ignored. A small investment in quality practices at the beginning would have enabled continuous, sustainable delivery of value over the long term.

Lower Team Morale

Technical debt also has a negative impact on the development team.  Most software engineers (or any other professionals) take pride in doing their job well.  Being told to do an implementation that they feel is not up to their level of quality diminishes their feelings of autonomy, mastery and purpose.

Real-world Examples

Qantas

Qantas, an airline in Australia, has an IT shop that is more than 50 years old, and was struggling with more than 700 applications, many of which were written in old languages such as COBOL and FORTRAN.  This aging infrastructure forced Qantas to undertake some large and expensive upgrade projects.  One of these was Jetsmart, a parts management system.  Qantas didn’t meet with any of the airplane mechanics to learn what their needs were, but assumed it knew well enough what to put in.  The system was so unwieldy that the mechanics’ union told its members not to use the software, and the $40 million project was scrapped because of disuse.

Technical debt incurred by Qantas:

•       Not doing incremental upgrades on legacy systems, but instead waiting until the existing system was too complex and difficult to use

•       Not clarifying needs with users and key stakeholders

•       Not doing usability testing with real users

 

HealthCare.gov

A critical piece of the Affordable Care Act, HealthCare.gov experienced long response times and service outages at launch and had 43% uptime.  The system’s failure was so spectacular that less than three weeks after launch, President Obama and his Chief of Staff created a special team to rework the website.  The website was deployed without using any caching (a standard practice for any website).  There was also insufficient hardware redundancy: a single data storage unit and switch failed during maintenance and stopped work for nearly five days.  Also, no metrics or dashboards were used to verify the health of the system (the special team did this their first day on the job).

Technical debt incurred by HealthCare.gov:

•       No caching - not doing the backend properly

•       No hardware redundancy

•       No performance or stress testing

•       No verification of performance requirements

•       No backend monitoring to track the system and help the team maintain system health

Strategies for Dealing with Technical Debt

Quite often the decision to use technical debt is not a technical one, but a social or business decision.  Lack of top management support for building in quality is an early key indicator of project failure, and the same support is needed for paying technical debt.  It is important to be able to quantify and explain (both from the development team to the product owner, and from the product owner to the rest of the organization) the effects of technical debt and how it is currently being used or abused.

Enforce the Definition of Done

An adequate definition of done goes a long way toward avoiding technical debt.  Items often found in a definition of done, such as automation of tests and tasks, documentation of how things work and of key decisions, and code reviews, help to ensure that the work is done correctly.

A product owner is the last line of defense to make sure that a team’s work on a backlog item is up to par before it is accepted.  A scrum master helps the product owner (the entire scrum team, really) enforce the definition of done for every piece of functionality implemented.  If the product owner accepts completed functionality that does not meet the definition of done, debt has accrued.

Track the Debt

Whenever a decision is made that creates technical debt (or other technical debt is discovered), a product backlog item to address that debt should be created and prioritized on the product backlog.  Don’t let it be forgotten or ignored.  These technical debt product backlog items are also estimated like any other product backlog item and prioritized against all other items on the product backlog. If technical debt items are identified by developers, it is the developers’ responsibility to communicate the debt in terms that enable the product owner to make an informed business prioritization decision.

On the flip side, product owners should also ask questions for clarification if it’s not obvious what business value there is in addressing a technical debt issue (yes, there is a business value implication with any work to be done). These backlog items also help the product owner communicate outside of the scrum team how technical debt is affecting priorities.

Prioritize Appropriately

Once technical debt is accrued, it may be more realistic to address it consistently over time than in one big push that stops new value creation for an extended period.  Just like with financial debt, not all technical debt items are equally expensive, risky or impactful. Just as you would pay down financial debts with different balances or interest rates at different priorities, technical debts have different urgency factors. Addressing all technical debt items at once may not make economic sense. Addressing items over time allows for a sustainable pace alongside the team’s normal velocity.  Large pushes are more prone to failure and can be so large that they lack the clear direction that working on smaller items brings.

Some technical debt occurs because of changes outside of the company.  Things like new software or language versions, a vendor going out of business, or adjusting to increasing scaling requirements as the business grows fall into this category.  These changes also need to be tracked and prioritized so that they can be worked on before they become problems.

Communicate

Communication is crucial because paying down the debt will impact priorities and schedules.  As mentioned above, the team first tracks their technical debt.  From there, the scrum master and product owner can work together to help educate, quantify and expose the risk.

Quantify

New features often have a monetary value associated with them.  So, in prioritizing technical debt, it can help to quantify what the cost of that debt is.  This can be measured by how much it is slowing down development (the cost of team member salaries), by opportunity cost (not being able to deliver faster), or by the impact on the customer and the revenue lost if the debt goes unaddressed.  Also consider other ways to quantify debt that may be specific to your business (customer satisfaction, recruiting, employee morale, etc.).
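As a back-of-the-envelope sketch of the salary-based measure, the Python below multiplies the hours a debt item costs the team each sprint by a loaded hourly rate, and estimates how many sprints it takes for a fix to pay for itself. The function names, parameters and figures are assumptions made up for illustration; your own numbers and cost model will differ.

```python
import math


def debt_carrying_cost(hours_lost_per_sprint: float,
                       loaded_hourly_rate: float,
                       sprints: int) -> float:
    """Salary cost of living with a debt item for a number of sprints."""
    return hours_lost_per_sprint * loaded_hourly_rate * sprints


def break_even_sprints(fix_hours: float, hours_lost_per_sprint: float) -> int:
    """Sprints until the hours saved equal the hours spent on the fix."""
    if hours_lost_per_sprint <= 0:
        raise ValueError("the debt item must cost some time each sprint")
    return math.ceil(fix_hours / hours_lost_per_sprint)


if __name__ == "__main__":
    # Hypothetical figures: 10 hours lost per sprint at 80 per hour.
    print(debt_carrying_cost(10, 80.0, sprints=6))
    print(break_even_sprints(fix_hours=40, hours_lost_per_sprint=10))
```

A number like this covers only the interest paid in team time. Opportunity cost and customer impact usually need separate estimates from the product owner.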

Expose the Risk

Technical debt is risk.  The higher the technical debt, the more risk there is in the project.  Incorrect requirements, rushed code and fragile infrastructure all bring risk.  These risks increase the chance that the product will miss market opportunities, displease customers and lose money.

The financial metaphor is useful here as well.  When assessing the risk of lending someone money, a credit score is used.  This score is composed of multiple factors, such as the number of accounts, how long those accounts have been active, total debt, missed payments, etc.

A technical debt credit score can be created.  Though how it is calculated will be different for each business, some factors can include:

•       The number of technical debt backlog items

•       The total estimate of technical debt backlog items (e.g. story points)

•       The average age of technical debt backlog items

•       The ratio of technical debt paid vs. created (are you gaining or losing technical debt?)

•       Amount per sprint dedicated to paying technical debt

This score can be a useful metric to succinctly communicate the current technical debt situation and track progress of paying it down.  For example, you could say to a team that is considering taking on some technical debt to get a release out sooner, “Our technical debt credit score is high, so we feel confident that we will be given the time to address this after the release.”  Conversely, you might say to stakeholders about some forecasts, “Our technical debt credit score is too low.  Our risk in the project is high enough that we can’t have confidence in our forecasts.  We need to work to address the risk in the project before we can know when we will release.”
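To make the factors listed above concrete, here is a small Python sketch that computes them from a list of technical debt backlog items. The DebtItem fields and the debt_factors function are hypothetical, and the sketch deliberately stops short of combining the factors into a single number, since that weighting will be different for each business.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DebtItem:
    """One technical debt backlog item (fields are illustrative)."""
    title: str
    points: int                  # estimate, e.g. story points
    created: date
    paid: Optional[date] = None  # None while the debt is still open


def debt_factors(items, today, sprint_capacity_points, debt_points_this_sprint):
    """Compute the raw factors that could feed a technical debt credit score."""
    if sprint_capacity_points <= 0:
        raise ValueError("sprint capacity must be positive")
    open_items = [i for i in items if i.paid is None]
    paid_items = [i for i in items if i.paid is not None]
    ages = [(today - i.created).days for i in open_items]
    return {
        "open_count": len(open_items),
        "open_points": sum(i.points for i in open_items),
        "average_age_days": sum(ages) / len(ages) if ages else 0.0,
        # Share of all recorded debt items that have been paid off.
        "paid_to_created_ratio": len(paid_items) / len(items) if items else 1.0,
        # Portion of this sprint's capacity spent on paying down debt.
        "sprint_share": debt_points_this_sprint / sprint_capacity_points,
    }


if __name__ == "__main__":
    backlog = [
        DebtItem("No automated tests for addition", 5, date(2024, 1, 1)),
        DebtItem("Manual deployment steps", 3, date(2024, 1, 11)),
        DebtItem("Outdated build tooling", 8, date(2024, 1, 5), paid=date(2024, 1, 20)),
    ]
    print(debt_factors(backlog, date(2024, 1, 21), 30, 6))
```

Tracking these factors sprint over sprint shows whether the trend is improving, which is often more persuasive to stakeholders than any single snapshot.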

Managing technical debt is an ongoing part of software development (and many other industries, too).  Just like in finances, some manage it cautiously and wisely to expand their opportunities, while others drown in it.  Either way, you have to pay it someday.
