The Real Cost of Technical Debt

Ward Cunningham coined the term “technical debt” to explain how having to rework software to align with the desired outcome slows down a team. It’s like paying interest on a loan. It can also be a deceptive concept: the idea that “it’s okay for now” can hurt velocity, quality, value, and morale. In the end, you don’t know what technical debt costs until you have to pay it.

Technical debt echoes the impact of financial debt

Technical debt is much like financial debt. We use it for various reasons (some calculated, some risky) with the understanding we’ll be paying interest on it. Steve McConnell, the author of Code Complete, points out that interest defines technical debt.

In the long run, you can’t ignore risks such as poor coding practices, insufficient testing, and unclear requirements. As soon as a team accepts the risk and moves forward, the negative impact on the system and product begins, whether visible or not.

At some point, you’ll have to address it. By then, addressing the issue will usually be more complex and expensive than it would have been at the outset. How much more depends on how the system has changed and evolved since the risk was first incurred.

This is the interest. It compounds over time and will slow down velocity.

In limited cases, you can use technical debt in a calculated manner to:

Gain short-term speed (before a release or to hit a market window).
Extend short-term resources (startup capital, a grant, etc.).
Avoid wasting resources when the technical debt will soon be irrelevant (servicing a system near end-of-life).

In the above strategic examples, part of the calculation should involve some plan and budget for paying down the accepted debt sooner rather than later.

Aside from the above exceptions, riskier technical debt accumulation behavior may stem from:

A tendency for short-term focus (“just get this release/fix/feature out”).
Unwillingness to do something correctly (e.g., poor coding techniques or lack of coding standards, skipping unit or other types of tests because it’s boring or seems nit-picky, etc.).
Lack of infrastructure to support needed practices (e.g., continuous integration systems, automation of frequent or tedious tasks, etc.).
Insufficient knowledge or practices (e.g., not automating testing, the team isn’t learning from each other, code writers and testers operating in silos, etc.).
The work isn’t directly relevant to a new, valuable feature.
Developer-imposed deadlines that don’t consider the impact on technical quality.

Even if you incur technical debt for a deliberate reason, it still needs to be paid. Unpaid technical debt, like unpaid financial debt, increases over time.

Impact on velocity

If you max out credit card debt, your disposable income only goes toward the interest, barring you from new purchases. You can get to the same point with technical debt, where maintaining current functionality requires all capacity, limiting your ability to innovate and deliver additional value.

For example, suppose you’re building a calculator app. Consider the following progression of value delivery and risk (debt) accumulation:

Sprint 1: Basic GUI—enter numbers, and they show up in the display. It’s such a simple program; we decide not to write any automated tests for it.
Sprint 2: Implement the addition feature. It works great, still easy to test manually. We count our testing of addition as also testing number entry.
Sprint 3: Subtraction is just as easy to test as the addition was. But, to keep quality high, we need to test both subtraction and addition manually.
Sprint 4: Multiplication. Again, the feature works wonderfully, but the tester reaches frustration. She needs to test multiplication, subtraction, and addition.
Sprint 5: Division. Manual testing will now include testing for four features, including the three tested in previous sprints. She manages to work extra hours but gets everything done. She’s good at running those manual test scripts now, but she lost part of her weekend.
Sprint 6: Decimals. The tester says that it’s too big for a sprint. The rest of the team says it seems no larger in effort than the other features. The tester points out she will need to retest everything (which was an entire sprint’s worth of work, and a bit more) as well as the new feature. Instead of six units of test (GUI, addition, subtraction, multiplication, division, and decimals), she will have 10 (the previous five sprints’ worth of functionality, both with and without decimals). To maintain current quality, you’ll see velocity slow.

If the development team had managed technical debt and used automation earlier, the tester’s point about new tests for the previous features would still be valid.

However, the existing automated tests would already cover half the work. Automation would allow the tester to begin writing tests as the team implemented the decimals feature, instead of doing all the testing towards the end of the sprint.
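The arithmetic behind the calculator example can be sketched in a few lines of Python. This is a minimal illustration, assuming one unit of testing per feature; the function name `decimals_sprint_load` and the unit counts are hypothetical and only mirror the numbers in the story above.

```python
# Features delivered in sprints 1 through 5 of the calculator example.
FEATURES = ["GUI", "addition", "subtraction", "multiplication", "division"]


def decimals_sprint_load(features, automated):
    """Estimate test units for the decimals sprint (assumes one unit per feature)."""
    existing = len(features)  # earlier features, retested without decimals
    new = len(features)       # the same features, exercised with decimal input
    # With automation, the existing checks rerun on their own and only the
    # new decimal cases need hands-on work.
    manual = new if automated else existing + new
    return {"total": existing + new, "manual": manual}


# Without automation, the hands-on load grows by one unit every sprint.
for sprint, feature in enumerate(FEATURES, start=1):
    print(f"Sprint {sprint}: added {feature}, {sprint} unit(s) to test by hand")

print(decimals_sprint_load(FEATURES, automated=False))
print(decimals_sprint_load(FEATURES, automated=True))
```

Under these assumptions, automation doesn’t shrink the total number of test units, but it halves the part that someone has to do by hand in the decimals sprint.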

Decrease in quality

Technical debt is the choice to take the easier short-term solution at the expense of the long term. This decision will always have an impact on quality assurance and control.

Quality assurance: Does the product meet the business/customer need?
Quality control: Does the product have high technical quality?

When either one suffers, the result is customer disappointment and wasteful rework later.

Less value delivered

A decrease in quality leads to more customer-reported defects and firefighting. The product owner’s forecast includes less and less valuable functionality. Although the development team is still working just as hard, the amount of new value is smaller because of accrued debt that is no longer ignorable.

A small investment in quality practices at the beginning would have enabled continuous, sustainable delivery of value over the long term.

Lower team morale

Technical debt also harms the development team. Most software engineers (or any other professionals) take pride in doing their job well. Being told to do an implementation they feel is not up to their standards diminishes their feelings of autonomy, mastery, and purpose.

Real-world examples

Let’s look at some real-world examples of technical debt in various industries:

Qantas

Australian airline Qantas has an IT shop that is more than 50 years old. The company struggled with more than 700 applications, many written in old languages such as COBOL and FORTRAN. This aging infrastructure created the need for some substantial and expensive upgrade projects.
One of these was called Jetsmart, a parts management system. Qantas didn’t meet with its airplane mechanics to assess their needs. They assumed they knew well enough what to put in. The system was so unwieldy that the mechanics’ union told their members not to use the software. Disuse caused the $40M project to falter.

Qantas’ technical debt:

Failure to make incremental upgrades on legacy systems, and instead waiting until the existing system was too complex and difficult to use.
Not clarifying needs with users and key stakeholders.
Absence of usability testing with real users.

HealthCare.gov

A critical piece of the Affordable Care Act, HealthCare.gov experienced long response times and service outages at launch. It only had 43% uptime. The system’s failure was so spectacular that less than three weeks after launch, President Obama and his Chief of Staff created a special team to rework the website.

Deployment of the website did not include caching, a standard in website design. There was also insufficient hardware redundancy. A single data storage unit and switch failed during maintenance and stopped work for nearly five days. Also, no metrics or dashboards were available to verify the health of the system. The special team set these up on their first day on the job.

HealthCare.gov’s technical debt:

Not using best practices for backend development.
No hardware redundancy.
No performance or stress testing.
No verification of performance requirements.
No backend monitoring to track the system and help the team maintain its health.

Strategies for dealing with technical debt

Quite often, the decision to use technical debt is not a technical one, but a social or business decision. Lack of top management support for building in quality is an early key indicator of project failure. The same support is necessary to pay technical debt. It’s important to be able to quantify and explain the effects of technical debt, along with its current use or abuse. That explanation should flow from the development team to the product owner, and from the product owner to the rest of the organization.

Enforce the definition of done

An adequate definition of done goes a long way toward avoiding technical debt. It should include:

Automation of tests and tasks.
Documentation of how things work and key decisions.
Code reviews.
Other things that help to ensure the accuracy of work.

A product owner is the last line of defense to ensure a team’s work on a backlog item is up to par before acceptance. A scrum master helps the product owner (the entire scrum team, really) enforce the definition of done holistically. If the product owner accepts completed functionality that doesn’t meet the definition of done, you’ll accrue debt.

Track the debt

Whenever a decision creates technical debt, or debt is discovered, create a product backlog item to address it and treat it as a priority. Don’t forget or ignore it. These technical debt backlog items are estimated like any other item. If developers identify technical debt, they should communicate it in terms that empower the product owner to make the best prioritization decision.

Likewise, product owners should ask clarifying questions if the business value isn’t clear. These backlog items help product owners communicate outside of the scrum team how technical debt impacts priorities.

Prioritize appropriately

Once you accrue technical debt, it may be more practical to address it consistently over time. Tackling it in one big push halts new value creation for long periods.

Just like with financial debt, not all technical debt items are equally expensive, risky, or impactful. For example, in paying down financial debt, priority usually goes to the debts with the highest interest rates or balances.

Handling all technical debt items at once may not make economic sense. A plan that deals with them over time allows for a sustainable pace with the team’s normal velocity. Large pushes are more prone to failure. They may be so sizable that they lack the clear direction that working on smaller items brings.
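One way to picture paying debt down over time is a small planner that fills a fixed per-sprint budget with the items that hurt most. The sketch below is only an illustration: the `DebtItem` fields, the interest-per-cost ranking, the sample backlog, and the budget are all assumptions, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    name: str
    cost: int        # estimate to pay it down, e.g. story points (must be > 0)
    interest: float  # hypothetical: drag it adds per sprint while unpaid


def plan_sprint(items, budget):
    """Pick debt items for one sprint, highest interest per point of cost first."""
    ranked = sorted(items, key=lambda item: item.interest / item.cost, reverse=True)
    chosen, remaining = [], budget
    for item in ranked:
        if item.cost <= remaining:  # skip anything that no longer fits
            chosen.append(item)
            remaining -= item.cost
    return chosen


# Made-up backlog for illustration only.
backlog = [
    DebtItem("automate regression tests", cost=8, interest=4.0),
    DebtItem("upgrade language version", cost=5, interest=1.0),
    DebtItem("add monitoring dashboard", cost=3, interest=2.0),
]

print([item.name for item in plan_sprint(backlog, budget=11)])
```

Ranking by interest per point of cost is one heuristic among many. A team could just as reasonably rank by risk or by age, as long as the budget keeps the work at a sustainable pace.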

Some technical debt occurs because of external factors. New software or language versions, vendor sunsets, or adjusting to scaling fall into this category. Track and prioritize these items, too.

Communicate

Communication is crucial, as paying down debt affects priorities and schedules. As mentioned above, the team first tracks their technical debt. The scrum master and product owner work together to help educate, quantify, and expose the risk.

Quantify

New features often have a monetary value. So, in prioritizing technical debt, it’s helpful to quantify the cost of that debt. You can do this by calculating in three areas:

How it’s slowing down development (cost of team member salary).
Opportunity costs (being able to deliver faster).
Lost revenue.

There are other ways to quantify debt, specific to your business, such as customer satisfaction, recruiting, employee morale, etc.
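As a rough sketch, the three cost areas above can be added up per sprint. Everything here is an assumption for illustration: the function name, the parameter names, and especially the sample figures, which are invented and not drawn from any real project.

```python
def debt_cost_per_sprint(hours_lost, hourly_rate, delayed_value, lost_revenue):
    """Add up the three cost areas for one sprint, all in the same currency."""
    slowdown = hours_lost * hourly_rate  # salary spent working around the debt
    total = slowdown + delayed_value + lost_revenue
    return {
        "slowdown": slowdown,
        "opportunity": delayed_value,  # value of features that shipped late
        "lost_revenue": lost_revenue,
        "total": total,
    }


# Invented numbers: 40 hours lost at 75 per hour, plus the two estimates.
estimate = debt_cost_per_sprint(
    hours_lost=40, hourly_rate=75, delayed_value=2000, lost_revenue=1000
)
print(estimate)
```

Even a coarse figure like this gives the product owner something to weigh against the monetary value of a new feature.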

Expose the risk

Technical debt is risk. The higher the technical debt, the more risk. Incorrect requirements, rushed code, and fragile infrastructure all generate risk. These risks increase the chance that the product will miss market opportunities, displease customers, and lose money.

The financial metaphor is useful here, as well. When assessing risk to lend money, lenders use a credit score as a qualification. This score depends on multiple factors, such as the number of accounts, how long they have been active, total debt, missed payments, etc.

You can create a technical debt credit score. Its calculation will be different for each business, relating to:

The number of technical debt backlog items.
The total estimate of technical debt backlog items (e.g. story points).
The average age of technical debt backlog items.
The ratio of technical debt paid vs. created (are you gaining or losing technical debt?).
Amount per sprint dedicated to paying technical debt.

This score can be a useful metric to illustrate the state of technical debt and track paying it down. For example, a team may take on technical debt to accelerate a release. They may be more willing to do so if their technical debt credit score is high, because it gives them confidence that they will have the time to address the debt post-release.

Conversely, if the score is too low, the team may not be able to trust its forecasts. In this case, the risk is too high. The team will need to pay down debt before they can be certain about release dates.
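Here is a minimal sketch of how such a score might be computed from the five factors listed above. The equal weighting, the 0 to 100 scale, the limits (`max_items`, `max_points`, `max_age`), and the `target_share` are all hypothetical; each business would tune or replace them.

```python
from dataclasses import dataclass


@dataclass
class DebtSnapshot:
    item_count: int             # number of technical debt backlog items
    total_points: int           # total estimate of those items
    average_age_sprints: float  # average age of the items, in sprints
    paid_points: int            # debt paid down in the period
    created_points: int         # debt created in the period
    capacity_share: float       # share of each sprint spent on debt (0.0 to 1.0)


def credit_score(s, max_items=50, max_points=200, max_age=12, target_share=0.2):
    """Return a 0-100 score; higher means healthier. All limits are assumptions."""

    def headroom(value, limit):
        # 1.0 when there is no debt on this factor, 0.0 at or beyond the limit.
        return max(0.0, 1.0 - value / limit)

    churn = s.paid_points + s.created_points
    paydown = s.paid_points / churn if churn else 1.0
    parts = [
        headroom(s.item_count, max_items),
        headroom(s.total_points, max_points),
        headroom(s.average_age_sprints, max_age),
        paydown,
        min(1.0, s.capacity_share / target_share),
    ]
    return round(100 * sum(parts) / len(parts))  # equal weights, for simplicity


snapshot = DebtSnapshot(
    item_count=10, total_points=50, average_age_sprints=3,
    paid_points=15, created_points=5, capacity_share=0.1,
)
print(credit_score(snapshot))
```

The absolute number matters less than its trend. Recomputing the score every sprint shows whether the team is gaining or losing ground.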

Managing technical debt is a constant challenge

Managing technical debt will always be part of software development and other industries, too. As in finances, some manage it cautiously and wisely, while others drown in it. Either way, you have to pay it someday.
