The AI Token Trap: Mistaking Inputs for Outcomes

Tracking token usage won’t help you understand where AI is adding to bottom-line results. Move toward true agentic implementation by measuring what matters.

Key Takeaways

  • Token usage is not a measure of AI progress. Counting seats, prompts, and tokens tells you the tool is being used, not whether it moved the P&L.
  • AI adoption means occurrence; AI absorption means outcomes. You can have high adoption and near-zero absorption, which is exactly where most stalled AI programs sit.
  • What gets measured, AI will automate. Measure the wrong thing and you don’t just train your people to chase it. You train your AI agents to chase it, hard-coding the wrong objective into the workflow.
  • AI ROI is an allocation problem, not a usage problem. Value comes from routing AI at the work where it matters most, not from burning tokens.
  • The people doing the work should redesign it, working backward from the outcome. Done right, this allows them to reduce rote work and improve productivity without changing headcount.
  • Measure what impacts the P&L: cost per outcome, cycle time, and revenue per head.

AI Spend Management: The Input Explosion

Ask most companies how their AI rollout is going, and they’ll show you a number that’s going up. Seats activated. Weekly active users. Prompts sent. Tokens consumed, trending up and to the right. The number is bigger than last quarter, so the story writes itself: adoption is working.

It usually isn’t.

AI spend is moving faster than the proof. Global corporate AI investment more than doubled to $582 billion in 2025 according to Stanford’s 2026 AI Index, with 88% of organizations reporting at least some level of AI adoption. Meanwhile, OpenAI’s annual compute spend jumped to $16.3 billion, and Anthropic’s to $6.8 billion.



Yet, according to McKinsey’s State of Organizations 2026, 81% of organizations report no meaningful bottom-line gains. Only 1% of C-suite respondents describe their generative AI rollouts as mature, and only 19% report an AI-accelerated revenue increase of more than 5 percent.

Why isn’t this enormous, constantly accelerating spend generating real value for companies, even as their teams burn through tokens?



The problem is that token usage was never a measure of progress in the first place. It measures consumption, not the value delivered. When an organization manages to a consumption number, it gets exactly that: more consumption, layered on top of the same broken workflows it already has.



What Gets Measured Gets Automated

The old adage “what gets measured gets managed” is due for an update in the age of agentic AI. As researchers put it in the Harvard Business Review, what gets measured, AI will automate. Point an organization at token usage and it will optimize for token usage: more automation bolted onto whatever process already happens to exist. If that process is bloated, AI just helps you run a bloated process faster.



In their research paper on AGI economics, the same team stresses that:

 “…the binding constraint on growth is no longer intelligence. It is human verification bandwidth: the scarce capacity to validate outcomes, audit behavior, and underwrite meaning and responsibility when execution is abundant.” 

The cost to automate is falling exponentially. The real bottleneck is the cost to verify, which is limited by human time and biology. If AI verifies AI, blind spots can propagate endlessly. A human-in-the-loop approach is more difficult, but it is a necessary control to ensure AI is running properly and automating what matters.

Measure token consumption and you don’t just train your teams to chase it; you train your AI agents to chase it. You hard-code the wrong objective into the workflow and automate away the real value: the work humans actually do.

This isn’t a new worry; it’s an old law with a new accelerant. Economists call it Goodhart’s Law: once a measure becomes a target, it stops being a good measure, because people optimize to the number rather than what the number was meant to stand for. Metrics that are easy to measure, such as token use, get fixated upon. Meanwhile, the process improvements where true value lives go unaddressed.

Value Lives in the Process

Erik Brynjolfsson and colleagues studied over 5,000 customer support agents for their report Generative AI at Work. Instead of measuring usage, they measured the process: Resolutions Per Hour (RPH), Average Handle Time (AHT), Chats Per Hour (CPH), Resolution Rate (RR), and Net Promoter Score (NPS), comparing customer support staff before, during, and after they got AI access.



Some results:

Resolutions Per Hour (RPH). Access to the AI large language model (LLM) increased productivity, measured by issues resolved per hour, by 15% on average, with a 34% improvement for novice and low-skilled workers and minimal impact on experienced staff.



Average Handle Time (AHT). Workers using an LLM spent an average of 35 minutes per chat, versus 40 minutes for colleagues without it, a roughly 12% reduction in handle time.

Chats per hour / ramp time. Within two months, AI-assisted workers were resolving 2.5 chats per hour versus 1.7 for non-users, who needed eight months to reach that level. AI users increased customer chats resolved per hour by 13.8%.



For the most experienced, top-skilled workers, there was no effect on handle time, only a small effect on chats per hour, and small but statistically significant decreases in resolution rate and customer satisfaction.



Blanket "AI uplift" from token use is a fiction. You need precise measurement that can be broken down by factors like skill tier, task, or organizational group. The same tool that helps novices might not have the same impact on the most skilled workers. Some of the most significant gains from AI investment come from capturing the tacit process knowledge of top performers and propagating it.
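To make the skill-tier point concrete, here is a minimal sketch in Python. The records, tier labels, and helper names (`uplift_by_tier`, `blended_uplift`) are hypothetical and are not data or methods from the study. The sketch only illustrates how a single blended average can hide very different effects across groups.

```python
from collections import defaultdict

# Hypothetical records: (skill_tier, resolutions/hour before AI, after AI).
# Illustrative figures only; they are not taken from the study.
RECORDS = [
    ("novice", 1.5, 2.0),
    ("novice", 1.7, 2.3),
    ("experienced", 3.0, 3.0),
    ("experienced", 3.2, 3.1),
]


def pct_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


def uplift_by_tier(records):
    """Percentage change in mean resolutions per hour, per skill tier."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # before sum, after sum, count
    for tier, before, after in records:
        totals[tier][0] += before
        totals[tier][1] += after
        totals[tier][2] += 1
    return {
        tier: pct_change(b / n, a / n)
        for tier, (b, a, n) in totals.items()
    }


def blended_uplift(records):
    """Percentage change in the mean across all workers, ignoring tier."""
    before = sum(r[1] for r in records)
    after = sum(r[2] for r in records)
    return pct_change(before, after)


if __name__ == "__main__":
    for tier, uplift in uplift_by_tier(RECORDS).items():
        print(f"{tier}: {uplift:+.1f}%")
    print(f"blended: {blended_uplift(RECORDS):+.1f}%")
```

With sample data shaped like this, the blended figure sits between a large gain for one tier and a slight decline for the other, so reporting only the blend would misstate both.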

Value shows up where the process is visible, measurable, and replicable.

Adoption Is Not Absorption

Two terms that often get used interchangeably shouldn’t be. AI Adoption is usage: how many people touched the tool, how often, how many tokens it burned. AI Absorption is something else entirely. It’s when the work itself changes shape, and that change shows up where the CFO can see it: in cost per outcome, cycle time, and revenue per head.

Further reading: Insightful examines the difference and lays out a 90-day implementation roadmap for enabling AI absorption in the AI Adoption Audit Playbook.

Economists draw a similar line. Cambridge’s Diane Coyle, writing on measuring transformative AI for NBER, separates the extensive margin of AI (whether a firm uses it) from the intensive margin (the value and type of use). Surveys and dashboards, she notes, capture the first and miss the second entirely. It helps to place four overused terms within that one frame:

Extensive:

  • Applied AI is AI pointed at a real business process rather than a demo.
  • AI Adoption is the usage layer, necessary but never the goal.

Intensive:

  • AI Absorption is when AI is embedded in the workflow itself: time allocation changes, AI use co-occurs with core systems, output is integrated into deliverables, and usage persists after training or rollout.
  • AI Implementation is the work of getting from one to the other. Adoption is the input. Absorption is the return.

Maximizing Value in AI Cost Management

In a formal model, Agrawal, Gans, and Goldfarb show that the value of transformative AI flows not from the raw supply of intelligence but from how it’s routed: pointing capacity at the questions where it creates the most value, not at the furthest extremes, and not everywhere at once. Spraying tokens uniformly at every task is the precise negation of that.



In other words: AI ROI is an allocation problem, not a usage problem. You can’t allocate AI to what you can’t see at the task level. McKinsey’s data shows how wide the gap is in practice: even among organizations using AI, nearly two-thirds have not scaled it beyond limited pilots. That’s adoption without absorption.
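One way to picture allocation is as a simple ranking exercise. The Python sketch below is an illustrative assumption, not the formal model from the paper: the step names, the cost and savings estimates, and the `allocate` function are all hypothetical. It funds process steps in order of estimated savings per dollar of AI spend until a fixed budget runs out, rather than spreading the budget evenly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    ai_cost: float      # estimated AI spend to cover this step (hypothetical)
    est_savings: float  # estimated savings from applying AI here (hypothetical)


def allocate(steps, budget):
    """Greedy heuristic: fund steps by savings per dollar while budget remains.

    Returns the funded step names and the unspent budget. This is a simple
    ranking rule, not an optimal solution to the underlying budgeting problem.
    """
    ranked = sorted(steps, key=lambda s: s.est_savings / s.ai_cost, reverse=True)
    funded, remaining = [], budget
    for step in ranked:
        if step.ai_cost <= remaining:
            funded.append(step.name)
            remaining -= step.ai_cost
    return funded, remaining


# Hypothetical task-level estimates for one team.
STEPS = [
    Step("claims triage", ai_cost=400.0, est_savings=2000.0),
    Step("email drafting", ai_cost=300.0, est_savings=300.0),
    Step("invoice matching", ai_cost=500.0, est_savings=1500.0),
    Step("meeting notes", ai_cost=200.0, est_savings=100.0),
]

if __name__ == "__main__":
    funded, left = allocate(STEPS, budget=1000.0)
    print("funded:", funded)
    print("unspent:", left)
```

The ranking depends entirely on task-level estimates, which is the practical point: without visibility into individual steps, there is nothing to rank.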

The People Who Do the Work Have to Redesign It

EY’s look at agentic AI and work redesign around ROI uses a renovation metaphor that lands: a fresh coat of paint can make a building look modernized, but it does nothing for an aging foundation.

Implementing AI in a workplace works the same way. Inject it into an existing process (order-to-cash, claims intake, record-to-report) and you can automate a few steps. What’s harder, and far more valuable, is rewriting the workflow to run natively on AI: built in, not bolted on. Rebuilt to be AI-first, the same work can collapse to a fraction of the previous steps and run in hours instead of weeks.

AI strategy is usually handed down: a tool is purchased at the top, a usage target gets set, and the people who actually run the process are told to adopt it. That’s the Token Trap in organizational form: top-down deployment, measured by consumption, disconnected from the work.

The better approach is to work backward from the outcome, with the power users closest to the work defining the change. The doers redesign the doing. They have the greatest insight into what the current steps are and how they add value. Replicating power users’ unique digital footprint and empowering them to train others is a change management issue, not a technology issue.

And they want the job. EY’s workplace survey found that 84% of desk workers are eager to take on agentic AI, specifically because it can pull them out of rote work and toward the parts of the job that are actually engaging.

There’s a thirty-year-old idea underneath this. The service-profit chain logic is simple: engaged employees do better work, better work keeps customers, and kept customers drive profit. Gallup’s most recent data puts a number on it: business units in the top quartile of engagement deliver 23% higher profit than those in the bottom quartile.



The implication for AI is direct. The employees who absorb AI best are not your heaviest token users. They’re the ones who use it to do measurably better work. Find them, learn how they’ve redesigned their work, and give the rest of the organization the blueprint.


A Process-First Framework for Real AI ROI

A framework doesn’t have to be complicated to be useful. It needs a baseline, the right signals, the right people, and a number the CFO already trusts. Four steps:

  1. Baseline the work, not the tool. Before you deploy, capture how the work actually flows today: who touches it, how long it takes, which skills, what it costs per unit of output. Without a baseline of the process, a usage dashboard is the only thing left to measure, which is how you fall into the trap.
  2. Separate adoption signals from absorption signals. Logins, prompts, and tokens tell you the tool is in use. Cycle time, rework rates, and cost per outcome tell you the work changed. Track both, but never report the first as if it were the second. AI maturity means understanding where in the work people are applying AI most meaningfully.
  3. Redesign from the outcome backward, with the doers. Pick a process, or part of a process. Define the input and the output, then treat everything in between as fair game to eliminate, replace, or rebuild AI-first. Route AI at the steps where it creates the most value, not uniformly. Audit through the people who run the process and master it, not just the people who bought the tool.
  4. Tie it to the P&L. Convert the redesign into a metric leadership already believes in: cost per outcome, revenue per employee, cycle time to delivered work. That’s the number that survives a budget review. Token spend isn’t.
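The four steps above can be sketched as a small before-and-after comparison. This is a minimal Python illustration under stated assumptions: the `ProcessSnapshot` fields, the `compare` function, and the sample figures are hypothetical, and a real baseline would need whatever cost and attribution rules your finance team already uses. The one design choice worth noting is that adoption signals are returned in a separate section from absorption metrics, so the two cannot be reported as the same thing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessSnapshot:
    """One measurement period for a single process (all fields hypothetical)."""
    total_cost: float        # fully loaded cost of running the process
    outcomes: int            # completed units of output, e.g. resolved claims
    cycle_time_hours: float  # average time from input to delivered output
    revenue: float           # revenue attributed to the process
    headcount: int           # people working on the process
    tokens_used: int         # adoption signal only


def absorption_metrics(s: ProcessSnapshot) -> dict:
    """Outcome-side metrics: cost per outcome, cycle time, revenue per head."""
    return {
        "cost_per_outcome": s.total_cost / s.outcomes,
        "cycle_time_hours": s.cycle_time_hours,
        "revenue_per_head": s.revenue / s.headcount,
    }


def compare(baseline: ProcessSnapshot, current: ProcessSnapshot) -> dict:
    """Percentage change per absorption metric, with adoption kept apart."""
    before = absorption_metrics(baseline)
    after = absorption_metrics(current)
    return {
        "absorption_pct_change": {
            name: (after[name] - before[name]) / before[name] * 100.0
            for name in before
        },
        # A raw delta, because a pre-deployment baseline may have zero tokens.
        "adoption": {"tokens_used_delta": current.tokens_used - baseline.tokens_used},
    }


if __name__ == "__main__":
    baseline = ProcessSnapshot(50_000.0, 1_000, 40.0, 200_000.0, 10, 0)
    current = ProcessSnapshot(45_000.0, 1_200, 30.0, 240_000.0, 10, 5_000_000)
    report = compare(baseline, current)
    for name, change in report["absorption_pct_change"].items():
        print(f"{name}: {change:+.1f}%")
    print("tokens used delta:", report["adoption"]["tokens_used_delta"])
```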

How Work Intelligence Makes Absorption Visible

The hard part isn’t agreeing that absorption matters. It’s measuring it. The data you already have on hand lives in the wrong places: disconnected tools, self-reported time logs, and manager estimates that don’t match how work actually flows.

In NBER’s work on measuring AI, the instrument prescribed for capturing process reengineering is, in plain terms, organization-wide process metrics and workflow audits. The economists studying the AI transformation and the operators trying to capture its ROI are pointing at the same missing layer: a view of how work actually changes with the arrival of AI.

Insightful’s Work Intelligence is purpose-built to be that layer. It establishes the baseline for how work actually happens, then shows how the shape of that work changes after AI is introduced: which teams genuinely redesigned a process, where the time actually went, and whether cost per outcome moved.

Request a free 14-day Work Intelligence Audit to see if your organization qualifies for Insightful’s invite-only cohort. Or book a demo to see how Insightful’s AI Adoption Report feature can work for your organization.

FAQs

What is the AI token trap?

The token trap is mistaking AI inputs, such as seats, prompts, and tokens, for AI outcomes. A usage curve going up doesn't mean the work got better or a return on investment was realized. When a company manages to a consumption number, it tends to scale the workflows it already has, inefficiencies included.

What's the difference between AI adoption and AI absorption?

Adoption is surface usage: how many people use an AI tool and how often they log in. Absorption is whether the work itself was redesigned around AI. The benefit of absorption is that the metrics linked to ROI, such as cost per outcome, cycle time, or revenue per head, actually change, and the change sticks. Economists frame the same split as the extensive margin (whether AI is used) versus the intensive margin (the value of that use).

Can tracking token usage help with my company's AI spend management?

Only on the input side. Token tracking helps you forecast and cap what you pay for AI, which is useful for controlling the bill, but says little about whether that spend produced a return. A rising token count doesn’t predict AI ROI. Real AI spend management connects cost to the outcome produced, distinguishing which spend is creating value and which is just consumption.

What's the best way to measure AI absorption if I want to optimize AI cost management?

Baseline how a process works before AI, then track how its shape changes after: fewer steps, less rework, lower cost per outcome. The instrument is a workflow audit, which a work intelligence layer can run continuously. For cost management specifically, the lever is allocation: route AI towards the steps where it lowers cost per outcome the most, rather than spreading spend evenly across every task.
