Your Codebase Has Technical Debt. But Does Your Team Have Comprehension Debt?

Have you ever been thrown into an unfamiliar codebase while deadlines got tighter, stress levels rose, and incidents became harder to resolve?

I have, and it's not a pleasant experience.

This got me thinking about a kind of debt engineering teams rarely measure.

Not technical debt.

Something more subtle:

comprehension debt.

I think of comprehension debt as the gap between how fast a system changes and how well the team understands it.

And AI is making this gap more important.

AI didn't create the problem.

This problem existed long before AI.

Teams have always struggled with knowledge silos, undocumented systems, fragile ownership, and “only one person knows how this works” situations.

But AI can accelerate the problem.

When AI helps us write, refactor, and ship code faster, the codebase can evolve faster than the team’s shared understanding.

That speed is useful when paired with strong review, explanation, documentation, and ownership.

But it is dangerous when it turns into:

“The code changed, but nobody really understands the system better.”

That is the kind of risk I wanted to make more visible.

What if I could quantify comprehension debt, at least approximately?

I started exploring the different variables and components that would impact comprehension, for better or worse.

Based on my experience, conversations with other engineers, and patterns I’ve seen across teams, I started building a scoring methodology to approximate comprehension debt.

My goal here is to help engineering teams spot where critical or highly connected systems are changing faster than the team understands them.

What is comprehension debt?

Let's start with a more formal definition:

Comprehension debt rises when system impact, complexity, dependency surface area, change velocity, and AI-assisted change speed outpace team understanding, coverage, redundancy, documentation, and human ownership.

I am not basing this just on theory.

I recently experienced the negative effects of high comprehension debt on one of my previous teams.

It was a stressful and demoralizing experience.

I constantly felt behind and was always scrambling to catch up.

Incidents were getting worse and harder to resolve over time because the level of system understanding across the team was too low.

The code existed.

The services existed.

The tickets kept moving.

But the shared mental model of the system was not keeping up.

That is a hard place to work from.

You feel reactive all the time.

You are not just debugging the incident.

You are debugging your own lack of context.

Why AI makes this worth measuring now

AI-assisted development can be incredibly useful.

I use AI myself, quite frequently.

But I keep coming back to this question:

Are we increasing shipping velocity without increasing understanding velocity?

Because those are not the same thing.

A team can ship more code and still understand less of the system over time.

AI can help generate implementation options, refactor code, explain files, write tests, and speed up repetitive work.

But if AI-assisted changes are merged without enough human explanation, review, documentation, or ownership, comprehension debt can accumulate faster.

The issue is not:

“Did AI write this?”

The better question is:

“Can the team still explain, review, modify, deploy, and recover this system safely?”

How the debt score is calculated

This is not meant to be a perfect mathematical model.

It is an attempt to make an invisible engineering risk visible enough to discuss, compare, and improve.

At a high level, the score combines two sides:

1) System pressure

These factors increase comprehension debt:

high criticality
high complexity
high change velocity
high incident sensitivity
high dependency surface area
high ownership concentration
high AI acceleration risk when AI is used without strong human guardrails

2) Team comprehension coverage

These factors reduce comprehension debt:

more safe modifiers
more clear explainers
stronger reviewer redundancy
better documentation
more recent hands-on exposure
stronger on-call familiarity
better engineer-system capability scores

So the rough model is:

Comprehension Debt =

(system pressure
+ dependency pressure
+ ownership concentration
+ optional AI acceleration)

minus

(human coverage
+ documentation quality
+ reviewer redundancy
+ recent exposure
+ operational familiarity)
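To make the rough model concrete, here is a minimal Python sketch of it. This is only an illustration, not the exact formula the spreadsheet uses: the 1 to 5 scale, the equal weights, the field names, and the example numbers are all my assumptions for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class SystemScores:
    """Hypothetical per-system inputs, each on an assumed 1-5 scale."""
    # Pressure side (higher = more pressure)
    criticality: int
    complexity: int
    change_velocity: int
    incident_sensitivity: int
    dependency_surface: int
    ownership_concentration: int
    # Coverage side (higher = better coverage)
    human_coverage: int
    documentation_quality: int
    reviewer_redundancy: int
    recent_exposure: int
    operational_familiarity: int
    # Optional: leave at 0 when AI acceleration is not being scored
    ai_acceleration: int = 0


def comprehension_debt(s: SystemScores) -> int:
    """Pressure total minus coverage total, with equal (assumed) weights."""
    pressure = (
        s.criticality
        + s.complexity
        + s.change_velocity
        + s.incident_sensitivity
        + s.dependency_surface
        + s.ownership_concentration
        + s.ai_acceleration
    )
    coverage = (
        s.human_coverage
        + s.documentation_quality
        + s.reviewer_redundancy
        + s.recent_exposure
        + s.operational_familiarity
    )
    return pressure - coverage


# Made-up example: a critical service with thin coverage.
payments = SystemScores(
    criticality=5, complexity=4, change_velocity=4, incident_sensitivity=5,
    dependency_surface=4, ownership_concentration=5,
    human_coverage=2, documentation_quality=2, reviewer_redundancy=1,
    recent_exposure=2, operational_familiarity=2,
    ai_acceleration=3,
)
print(comprehension_debt(payments))  # 30 - 9 = 21
```

The absolute number means little on its own. It is mainly useful for comparing systems against each other and for watching how one system's score moves over time.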

To detect dangerous gaps, the methodology also runs a Minimum Viable Coverage (MVC) check on critical systems.

For a critical system, the spreadsheet checks whether it has:

at least 2 safe modifiers
at least 2 capable reviewers
documentation quality of 3 or higher
recent hands-on exposure of 3 or higher
on-call familiarity of 3 or higher

If any of these is missing, the system gets an MVC gap.

A critical system with an MVC gap should be flagged even if the overall debt score looks moderate.
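The Minimum Viable Coverage check can be sketched in a few lines of Python. The thresholds come from the list above. The class, field, and function names are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Coverage:
    """Hypothetical coverage inputs for one system."""
    safe_modifiers: int         # number of people
    capable_reviewers: int      # number of people
    documentation_quality: int  # rating
    recent_exposure: int        # rating
    oncall_familiarity: int     # rating


# Minimums from the checklist: 2 people, or a rating of 3 or higher.
MVC_MINIMUMS = {
    "safe_modifiers": 2,
    "capable_reviewers": 2,
    "documentation_quality": 3,
    "recent_exposure": 3,
    "oncall_familiarity": 3,
}


def mvc_gaps(cov: Coverage) -> list[str]:
    """Return the names of every coverage item that falls below its minimum."""
    return [
        name
        for name, minimum in MVC_MINIMUMS.items()
        if getattr(cov, name) < minimum
    ]


def should_flag(is_critical: bool, cov: Coverage) -> bool:
    """Flag a critical system with any MVC gap, regardless of its debt score."""
    return is_critical and bool(mvc_gaps(cov))


# Made-up example: one reviewer short, and weak on-call familiarity.
cov = Coverage(
    safe_modifiers=2, capable_reviewers=1, documentation_quality=3,
    recent_exposure=4, oncall_familiarity=2,
)
print(mvc_gaps(cov))           # ['capable_reviewers', 'oncall_familiarity']
print(should_flag(True, cov))  # True
```

Keeping this check separate from the overall score is deliberate: a single missing item on a critical system should not be averaged away by strong numbers elsewhere.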

Discussion

I am very open to feedback on how the methodology could be improved.

Also curious:

How does your team maintain understanding of the codebase?

What signals tell you that your team is starting to lose understanding of a system?

And have you noticed AI changing the speed at which your systems evolve compared to the speed at which your team understands them?

I think this is going to become a much bigger engineering leadership problem as AI generation and automation accelerate.

We are getting better at generating code.

But we still need to get better at preserving shared understanding.

Get the template

To make this easier to reason about, I turned the methodology into a spreadsheet-first template:
System Comprehension Heatmap

It includes:

System Inventory
Engineer-System Matrix
Overview Dashboard
Risk Recommendations
optional AI acceleration scoring
Minimum Viable Coverage checks
quick-start PDF guide

You can get it for free here

I would love feedback.