Consulting Decision Velocity: The KPI Most Firms Ignore (and Why It Matters More Than Utilisation)

Consulting decision velocity—how fast your firm answers client questions—predicts profitability better than utilisation. Here's the architecture that makes it possible.

Rui Luís

Consulting decision velocity measures how quickly a firm can answer a client question and move a decision forward — and it predicts profitability more accurately than utilisation. When context retrieval is slow, partners spend their week in information-gathering mode instead of judgment mode, and every "let me check and get back to you" is a tax receipt on that gap. Firms with real-time context access make decisions 3–5x faster, reduce rework by approximately 40%, and close proposals within days instead of weeks.

Key takeaways
  • Consulting decision velocity — how fast a firm moves a client decision forward — is a leading indicator of profitability, while utilisation is a lagging one; firms that decide in hours instead of days win more business and retain clients longer.
  • Firms with real-time context access make decisions 3–5x faster and reduce rework by approximately 40%. For a 10-person firm with $2M in annual revenue, that rework reduction alone recovers $150K–$200K in annual capacity.
  • The architectural gap is not talent or process: HubSpot doesn't read Slack, Slack doesn't know what's in the proposal deck, and nobody has connected them — the firms that connect it change the fundamental economics of how partners spend their time.

Three years ago, a GM production manager spent 36 hours assembling context to answer a single question about supplier variance that Toyota's floor supervisors could pull in 90 seconds. Same lean manufacturing research. Same consultants. Same frameworks. The difference wasn't talent or strategy — it was retrieval speed. Toyota had built a system where context was queryable in real time. GM was still running batch cycles.

Last month, I watched a partner at a mid-market consulting firm do the same thing: spend Tuesday morning reassembling a client conversation from six weeks prior because the context was scattered across email, Slack, and a shared drive folder nobody could quickly search. The client needed a number. The number existed. It took four hours to find it. The decision that depended on it slipped a week.

Consulting firms are running the same experiment on themselves right now — and most are on the GM side of it without realising it.

The Batch Decision Cycle Is the Problem, Not the Symptom

Most consulting firms operate on what I'd call a batch decision cycle. You gather information on Monday. You synthesise it by Wednesday. You have the conversation on Friday. If a client asks a question Tuesday afternoon that requires context from two months ago, the answer is "let me check and get back to you."

That phrase — let me check and get back to you — is a tax receipt.

It means the information exists somewhere in your system but isn't queryable in the moment it's needed. It means the meeting continues without the relevant context, or stalls waiting for it. It means a proposal that could go out Thursday goes out the following Monday after someone spends a morning assembling what was already known.

The batch cycle isn't a workflow preference. It's what happens when retrieval is slow. And slow retrieval is a direct drag on consulting decision velocity — the firm's ability to move client decisions forward without a lag.

This is why utilisation is the wrong primary KPI for most consulting firms. Utilisation measures the share of available hours that get billed. It doesn't measure whether those hours produced a decision, a deliverable, or just another round of information-gathering that will need to be repeated next week. A firm can be at 85% utilisation and still be haemorrhaging margin on rework cycles and retrieval overhead that never show up on the capacity dashboard.

What Real-Time Context Access Actually Changes

The shift isn't just speed. It's the structure of how decisions get made.

In a batch model, the sequence is: gather information → assemble → decide. The bottleneck is assembly. Partners spend a meaningful portion of their week in information-gathering mode before they can do the judgment work they're actually paid for. The actual product of a consulting firm — judgment, creativity, client relationships — gets crowded out by overhead that competes with it for the same partner hours.

In a real-time model, the sequence flips: decide → verify instantly. A partner asks a question mid-client call and gets an answer with citations — the original email thread, the Slack conversation from March, the specific line item in the proposal — in under 60 seconds. The judgment moment and the information moment collapse into the same moment.

The cognitive difference is significant. When you don't have to hold context in your head or schedule a retrieval exercise, you can stay in judgment mode. You can think about what the client actually needs rather than managing the logistics of remembering what you already decided. That's the real unlock: not that AI is doing the thinking, but that it's clearing the overhead so the thinking can happen at full capacity.

This is exactly the distinction between overhead and judgment in professional services — and it's why context architecture is becoming a competitive differentiator, not just an operational convenience.

The Numbers That Justify Building the Architecture

This isn't abstract. The operational impact is measurable.

Firms with real-time context access make decisions 3–5x faster, reduce rework by approximately 40%, and close proposals within days instead of weeks. The rework reduction alone is worth examining: rework in consulting is almost always downstream of a decision made without complete context — a deliverable built on an assumption that contradicts something said on a call four weeks earlier that nobody could quickly retrieve when it mattered.

A 10-person firm doing $2M in annual revenue that reduces rework by 40% is recovering somewhere between $150K and $200K in capacity that was previously burning on correction cycles. That's not a productivity metric. That's a profitability number — and it shows up in margin, not in utilisation.
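The range above can be sanity-checked by back-solving for the rework share it implies. The sketch below assumes that recovered capacity equals revenue × rework share × reduction. That formula is a simplification introduced here for illustration, not something the figures above are stated to derive from, and the function names are hypothetical.

```python
# Back-solve the rework share implied by the $150K-$200K range.
# Assumption (not from the article): recovered = revenue * rework_share * reduction.

REVENUE = 2_000_000   # annual revenue used in the example above
REDUCTION = 0.40      # rework reduction used in the example above


def implied_rework_share(recovered: float,
                         revenue: float = REVENUE,
                         reduction: float = REDUCTION) -> float:
    """Share of revenue-equivalent capacity that must be going to rework."""
    return recovered / (revenue * reduction)


def recovered_capacity(rework_share: float,
                       revenue: float = REVENUE,
                       reduction: float = REDUCTION) -> float:
    """Capacity recovered for a given rework share, under the same assumption."""
    return revenue * rework_share * reduction


low = implied_rework_share(150_000)
high = implied_rework_share(200_000)
print(f"Implied rework share: {low:.2%} to {high:.2%}")
```

Under that assumption, the range corresponds to rework consuming roughly 19% to 25% of revenue-equivalent capacity. A firm can pass its own estimate of rework share to `recovered_capacity` to see whether the figure holds for it.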

This is also why the hidden cost of slow information retrieval compounds so aggressively at the partner level: every retrieval failure isn't just one lost hour, it's a downstream decision that either slips or gets made on incomplete context. Both outcomes cost more than the retrieval time itself.

The Architecture Is the Strategy

The firms moving fastest on client decisions right now aren't doing it with better talent or longer hours. They've built — or are building — a context layer that makes institutional memory queryable in real time.

The tools exist in their stack already. The problem is that HubSpot doesn't read Slack, and Slack doesn't know what's in the proposal deck, and the proposal deck doesn't reference the email thread from the original scoping call. The context is distributed. Nobody has connected it.

This is the same architectural failure that separated Toyota from GM on the factory floor — and it's playing out identically in consulting firms right now. Toyota's supervisors didn't have better instincts. They had a system that made what was already known instantly accessible. GM's managers had to run a batch cycle to retrieve the same information.

The firms that connect their context layer change the fundamental economics of how partners spend their time. The overhead — recall, retrieval, briefing, rework — shrinks. The judgment expands. That's not an AI story. It's an architecture story. AI is just what makes the architecture queryable.

The question isn't whether your firm has the information. It almost certainly does. The question is whether that information is connected in a way that makes it retrievable in the moment a decision needs to be made — or whether it's sitting in five separate tools that don't talk to each other, waiting to become someone's Tuesday morning project.
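To make "connected" concrete, here is a deliberately minimal sketch of a context layer: records from several tools normalised into one shape and queried together, with each hit carrying a pointer back to its source. Everything in it is an assumption for illustration: the `ContextRecord` fields, the `ContextIndex` class, the plain keyword matching, and the sample data. It does not call any real HubSpot or Slack API, and a production system would use proper connectors and search.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ContextRecord:
    source: str   # hypothetical label, e.g. "email", "slack", "crm", "proposal"
    client: str
    when: date
    text: str
    ref: str      # hypothetical pointer back to the original item (the citation)


class ContextIndex:
    """In-memory index over records from every tool (illustrative only)."""

    def __init__(self) -> None:
        self._records: list[ContextRecord] = []

    def add(self, record: ContextRecord) -> None:
        self._records.append(record)

    def query(self, client: str, terms: list[str]) -> list[ContextRecord]:
        """Return records for a client containing all terms, newest first."""
        wanted = [t.lower() for t in terms]
        hits = [
            r for r in self._records
            if r.client == client and all(t in r.text.lower() for t in wanted)
        ]
        return sorted(hits, key=lambda r: r.when, reverse=True)


# Made-up sample data spanning three tools.
index = ContextIndex()
index.add(ContextRecord("email", "Acme", date(2024, 1, 15),
                        "Scoping call: supplier variance target agreed at 3%",
                        "email/scoping-thread"))
index.add(ContextRecord("slack", "Acme", date(2024, 3, 4),
                        "Client now wants supplier variance under 2.5%",
                        "slack/acme-channel"))
index.add(ContextRecord("proposal", "Acme", date(2024, 2, 1),
                        "Phase 2 pricing and timeline",
                        "deck/phase-2"))

for hit in index.query("Acme", ["supplier", "variance"]):
    print(hit.when, hit.source, hit.ref, "|", hit.text)
```

The point of the sketch is the shape rather than the search: one query spans every source, and each answer comes with the reference needed to verify it.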

How to Start Measuring Consulting Decision Velocity

Most firms have no idea what their current decision velocity actually is, because they've never measured it. Here's a simple starting point:

Track time-to-answer on inbound client questions. Pick the last ten questions a client sent that required you to check something before responding. Measure the elapsed time between receipt and substantive reply. That number — averaged across ten questions — is a rough proxy for your firm's decision velocity.

If the average is more than four hours, you're operating a batch cycle. If it's more than 24 hours, the retrieval overhead is almost certainly affecting client satisfaction scores and renewal conversations, even if nobody has named it yet.
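The measurement above is easy to script. The following is a minimal sketch, assuming each question is logged as a (received, replied) pair of timestamps and that elapsed time is plain wall-clock time rather than business hours. The function names, the category labels, and the sample data are all hypothetical; only the four-hour and 24-hour thresholds come from the text.

```python
from datetime import datetime, timedelta
from statistics import mean

BATCH_THRESHOLD_HOURS = 4     # above this: operating a batch cycle
AT_RISK_THRESHOLD_HOURS = 24  # above this: likely affecting satisfaction and renewals


def hours_to_answer(received: datetime, replied: datetime) -> float:
    """Elapsed wall-clock hours between receipt and substantive reply."""
    return (replied - received).total_seconds() / 3600


def classify(avg_hours: float) -> str:
    """Map an average time-to-answer onto the thresholds described above."""
    if avg_hours > AT_RISK_THRESHOLD_HOURS:
        return "at risk"
    if avg_hours > BATCH_THRESHOLD_HOURS:
        return "batch cycle"
    return "real-time"


def decision_velocity(samples: list[tuple[datetime, datetime]]) -> tuple[float, str]:
    """Average time-to-answer across the samples, plus its category."""
    avg = mean(hours_to_answer(received, replied) for received, replied in samples)
    return avg, classify(avg)


# Ten made-up questions, each answered some number of hours after receipt.
start = datetime(2024, 1, 8, 9, 0)
delays = [1, 2, 30, 5, 0.5, 3, 48, 6, 2, 2.5]
samples = [(start, start + timedelta(hours=h)) for h in delays]

avg, label = decision_velocity(samples)
print(f"Average time-to-answer: {avg:.1f} h ({label})")
```

In this made-up sample, two slow answers pull the average well past four hours even though most replies were quick, so it is worth looking at the individual durations as well as the mean.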

The goal isn't to optimise for speed as an end in itself. It's to remove the retrieval tax that's converting partner judgment capacity into information-management overhead. Measure decision velocity first. Then you'll know exactly where the architecture needs to change.

Frequently asked questions

What is consulting decision velocity and why does it matter?

Consulting decision velocity measures how quickly a firm can retrieve relevant context, answer a client question, and move a decision forward. It matters because it is a leading indicator of profitability — faster decisions mean fewer rework cycles, stronger client retention, and proposals that close in days rather than weeks. Utilisation (billable hours) is a lagging indicator that tells you what already happened; consulting decision velocity tells you whether the firm's judgment capacity is being spent on actual judgment or on information-gathering overhead.

How do you measure decision velocity in a consulting firm?

A practical starting point is to track time-to-answer on inbound client questions. Take the last ten questions a client sent that required checking something before responding, and measure elapsed time from receipt to substantive reply. Averaging that figure across ten questions gives a rough proxy for current decision velocity. Anything above four hours typically indicates a batch retrieval cycle; anything above 24 hours usually signals that retrieval overhead is already affecting client satisfaction and renewal rates, even if it hasn't been named as such.

Why is utilisation the wrong KPI for consulting firm profitability?

Utilisation measures hours billed but not whether those hours produced a decision, a deliverable, or another round of information-gathering that will need to be repeated. A consulting firm can operate at 85% utilisation while haemorrhaging margin on rework cycles and retrieval overhead that never appears on the capacity dashboard. Consulting decision velocity is a better leading indicator because it captures whether partner time is being spent on judgment — the actual product — or on overhead that competes with it for the same hours.