I audited 12 AI implementations in Spanish mid-market companies…

Between January and December 2025 I worked, at different points, with twelve Spanish mid-market companies that were implementing, or had tried to implement, AI systems. Sector mix: three in legal and tax services, two in manufacturing, two in e-commerce with revenue above five million euros, two in B2B SaaS, and three in consulting of varying sizes.

The range went from companies that had spent less than 30,000 euros on a pilot to one that had committed more than 400,000 to an 18-month project with a large consultancy.

The failure pattern was the same in eleven of the twelve.

It wasn’t the technology. It wasn’t the technical team’s talent. It wasn’t the budget.

The pattern: decision rights vacuum

I call the pattern a “decision rights vacuum”. No one — not the CEO, not the CTO, not the project lead — had clear authority to do three things:

1. Decide which use cases to kill when they weren’t working
2. Define which of the system’s outputs were acceptable and which weren’t
3. Establish which metrics determined real success or failure

The consequence is always the same: pilots that stay alive for 18-24 months without anyone being able to explain why they’re still alive, without anyone being able to close them without a high political cost, draining budget and attention from the teams that maintain them.

It isn’t inaction through ignorance. It’s inaction by structure.

Case 1: the tax-services firm with the eternal pilot

A tax-advisory firm with 80 employees launched, in January 2024, a pilot to automate the review of tax filings. The initial goal: reduce manual review time by 40%.

In late 2025, the pilot was still active. Twenty-two months had passed.

When I came in to review the situation, the system processed around 15% of the filings that were in the original scope. The rest were still reviewed manually. The human-intervention rate on the system’s outputs was above 80% — that is, more than 8 out of every 10 outputs required correction or extensive validation before they could be used.
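Those two figures compound. The sketch below works through the arithmetic, treating "around 15%" and "above 80%" as exact values, which is a simplifying assumption; the function and variable names are illustrative only.

```python
def unassisted_share(coverage: float, intervention_rate: float) -> float:
    """Share of originally scoped filings handled with no human correction.

    coverage: fraction of in-scope filings the system processes at all.
    intervention_rate: fraction of its outputs that need human correction.
    """
    return coverage * (1.0 - intervention_rate)


# Figures from the case, rounded to exact values (an assumption).
coverage = 0.15
intervention_rate = 0.80

share = unassisted_share(coverage, intervention_rate)
print(f"Filings handled without human intervention: {share:.1%}")
```

Under these assumptions, about 3% of the filings in the original scope went through without human correction. Because the real intervention rate was above 80%, that figure is an upper bound, set against an initial goal of cutting manual review time by 40%.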

I asked who had the authority to close the pilot if it didn’t meet its goals. No one could answer clearly. The CTO said it was a business decision. The managing partner said it was a technical decision. The project lead had changed twice.

The pilot stayed active because no one had the mandate — and the accepted political cost — to close it.

Total investment up to that point: more than 120,000 euros across licenses, integration hours, and internal-team time.

Case 2: the e-commerce business with three recommendation systems

An e-commerce business with annual revenue of around eight million euros had, when I ran the diagnostic, three product-recommendation systems running in parallel. Not as a controlled experiment. All three were in production, serving recommendations to different user segments or in different parts of the funnel, without any team having a unified view of which one worked best.

Each one had its internal champion. The marketing team defended the first because they had seen a conversion lift in the period after its launch. The product team defended the second because it had improved average order value (AOV) in some categories. IT owned the third, which was the newest and the one they had invested the most development hours in.

No one had authority to shut down two of the three. The CEO knew the situation was inefficient but didn’t want to create conflict between teams before having “definitive” data.

The problem: data is never definitive if there’s no prior criterion for what defines success. With three systems in production mixing signals, each one’s data was contaminated by the other two.
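For contrast, a controlled comparison would give each user exactly one system for the whole test, so that no system's numbers are mixed with another's. The sketch below shows one common way to do that with a deterministic hash. It is not what this company ran; the system names, the salt, and the function name are hypothetical.

```python
import hashlib

# Hypothetical labels for the three systems in production.
SYSTEMS = ("recommender_a", "recommender_b", "recommender_c")


def assign_system(user_id: str, salt: str = "reco-comparison") -> str:
    """Map a user to exactly one system, stably across sessions.

    Hashing the salted user id gives a repeatable pseudo-random bucket,
    so the same user always sees the same recommender during the test.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(SYSTEMS)
    return SYSTEMS[bucket]


# Count how a batch of synthetic user ids is split across the systems.
counts = {name: 0 for name in SYSTEMS}
for i in range(3000):
    counts[assign_system(f"user-{i}")] += 1
print(counts)
```

The assignment only removes the mixing of signals. It still needs what the company lacked: a success metric fixed before the comparison starts, and a named person with the mandate to act on the result.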

The cost: licenses for all three systems, maintenance hours, and the opportunity cost of not having a single, well-optimized system.

The solution wasn’t technical. It was a question of who had the mandate to decide.

Case 3: the consultancy with the 18-month project

This is the most common case in companies with around 200 employees or more and a more comfortable budget.

A 180-person consultancy had hired a large firm to implement an “AI assistant” system for its consultants. The project had an 18-month timeline, a price above 300,000 euros, and a promise of “a significant reduction in time spent on documentation and research tasks”.

By month 14, the system was technically working. The consultants used it occasionally. Voluntary adoption was around 20%.

The problem wasn’t the system’s quality. The system was competent for the tasks it had been built for.

The problem was that no one had defined, before signing the contract, what “significant reduction” meant. There was no documented baseline of current time spent on documentation and research. There was no agreed adoption metric. There was no success criterion that would allow anyone to say, at month 18, whether the project had worked or not.

In the project’s closing meeting, the vendor presented system-usage data. The company presented the leadership team’s general impression. No one could say with data whether 300,000 euros had been a good investment or not.

The maintenance contract was renewed.

Why the vacuum happens

The decision rights vacuum in AI implementations isn’t an accident. It has structural causes:

AI arrives as a cross-cutting initiative. It crosses IT, operations, product, and business. When something is everyone’s responsibility, it’s no one’s responsibility.

No one wants to be the person who kills the initiative. If the pilot fails after someone closes it, that person bears the political cost. If the pilot fails on its own, the cost is diluted. The incentive is to keep the pilot alive.

Success criteria are vague by design. When the criteria are imprecise (“improve efficiency”, “increase productivity”), it’s impossible to determine when the project has failed. That’s convenient for the vendor and for the project’s internal champion.

The budget is already committed. Once a company has spent 80,000 euros on a pilot, closing the pilot turns that spend into a visible loss. Spending another 20,000 feels less painful, even when it’s economically worse.
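The budget point is ordinary sunk-cost arithmetic, and a small sketch makes it concrete. The 80,000 and 20,000 euro figures come from the paragraph above; the expected benefit of the extra spend is a hypothetical estimate chosen only for illustration.

```python
def total_loss(sunk: float, additional: float, recovered: float) -> float:
    """Net loss on the pilot: everything spent minus whatever value comes back."""
    return sunk + additional - recovered


SUNK = 80_000.0             # already spent; identical in both scenarios
ADDITIONAL = 20_000.0       # the proposed extra spend
EXPECTED_BENEFIT = 5_000.0  # hypothetical estimate of what the extra spend returns

loss_if_closed = total_loss(SUNK, 0.0, 0.0)
loss_if_extended = total_loss(SUNK, ADDITIONAL, EXPECTED_BENEFIT)

# Forward-looking rule: only future cost and future benefit enter the decision.
extend = EXPECTED_BENEFIT > ADDITIONAL

print(f"Loss if closed now: {loss_if_closed:,.0f} EUR")
print(f"Loss if extended:   {loss_if_extended:,.0f} EUR")
print(f"Extend the pilot?   {extend}")
```

With these numbers, closing makes an 80,000 euro loss visible, while extending produces a 95,000 euro loss that is easier to defer. The 80,000 appears in both scenarios, so it should not influence the choice.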

The 20-minute diagnostic

I use three questions to diagnose whether an AI implementation has this pattern, within a 20-minute conversation with the CEO or COO:

1. If the AI system fails to reach its goals, who has the authority to close it without needing approval from more than two people?
2. What is the numeric metric that defines success at 90 days, and who set it before the project started?
3. If you had to stop this project today, what exactly would happen: who would decide, with what process, and what would the real operational cost be?

If the answers are ambiguous or the person can’t answer in under two minutes, the decision rights vacuum is present.

What a Fractional CAIO solves here

The promise of a Fractional CAIO (a part-time Chief AI Officer) isn’t “make the AI work”. It’s to install operational governance so that decisions about AI, including the decision to shut down what isn’t working, have an owner, a criterion, and a process.

The three concrete things it solves in this context:

Kill criteria before launch: no pilot starts without a written definition of which conditions trigger closure. That’s not pessimism; it’s what gives the pilots that do move forward a real mandate.

Explicit decision rights: who can stop it, who approves continuing, and under what criteria. A one-page table. Not a framework.

Pre-agreed baseline and metric: the success metric is defined and documented before the project starts, not after. If the vendor won’t accept a pre-agreed success metric, that says something about their incentives.
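The three items above fit in a single small record written before launch. The sketch below is one possible shape for it, not a prescribed template; every field name, role, and number in the example is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class KillCriterion:
    description: str
    # Returns True when the observed metrics trigger closure.
    triggered: Callable[[Dict[str, float]], bool]


@dataclass
class PilotCharter:
    name: str
    stop_owner: str          # who can stop the pilot
    continue_approver: str   # who approves continuing
    metric: str              # the pre-agreed success metric
    baseline: float          # documented before the project starts
    target: float            # value that counts as success
    review_day: int          # when the criteria are checked
    kill_criteria: List[KillCriterion]

    def triggered_criteria(self, observed: Dict[str, float]) -> List[str]:
        """List the kill criteria that the observed metrics trigger."""
        return [c.description for c in self.kill_criteria if c.triggered(observed)]


# Hypothetical charter for a filing-review pilot (all values illustrative).
charter = PilotCharter(
    name="Tax filing review",
    stop_owner="COO",
    continue_approver="Managing partner",
    metric="minutes_per_filing",
    baseline=30.0,
    target=18.0,
    review_day=90,
    kill_criteria=[
        KillCriterion("Intervention rate above 50%",
                      lambda m: m["intervention_rate"] > 0.50),
        KillCriterion("Review time still above target",
                      lambda m: m["minutes_per_filing"] > 18.0),
    ],
)

observed = {"intervention_rate": 0.80, "minutes_per_filing": 28.0}
hits = charter.triggered_criteria(observed)
print(f"Day {charter.review_day}: {charter.stop_owner} reviews {len(hits)} triggered criteria")
```

The code itself matters less than the fact that each field has to be filled in by a named person before the pilot starts. A blank `stop_owner` is the decision rights vacuum in its simplest form.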

The twelfth company — the one that didn’t have this pattern — had these three things. It wasn’t the most technically sophisticated. It had the clearest owner and the most explicit closure criterion. Its pilot ended in six months, with a documented decision to scale. The process was boring and effective.

Next action

If you have an AI pilot that’s been active for more than six months, ask yourself this question: if you had to close it tomorrow, who makes the decision, with what criterion, and with what process?

If you can’t answer in one paragraph, you have a decision rights vacuum. Solving that doesn’t require more technology. It requires a conversation about mandate and criteria.

For an AI governance diagnostic at your company, go to contact or open a session at cal.com/brthls.

For the full profile of the role, read What a Fractional CAIO Really Does (and when you don’t need one).

Translated from the Spanish original with AI assistance and reviewed for accuracy. Read the original in Spanish.
