
Dec 14, 2025

Why 80% of AI Projects Fail (It's Not the AI)

Teddy Kim

Why do most AI projects fail? The quality of an AI deliverable is capped by the quality of its inputs. If upstream data systems don't guarantee correctness, the AI inherits that uncertainty.

Most AI initiatives don't fail because of AI. They fail because the internal deliverables that feed the AI are garbage. You cannot build a reliable system on top of unreliable dependencies. This is true in software, and it's especially true in machine learning.

I have a personal adage I call "Kim's Law": the quality you deliver to customers cannot exceed the quality you give each other. In other words, customer-facing quality is bounded by internal quality. If your internal deliverables are sloppy, your external deliverables will be sloppy too. AI just makes this painfully visible, because ML models have a gift for amplifying upstream dysfunction.

Companies are burning through millions on AI initiatives that are doomed from the start. Not because the models are wrong. Not because the data scientists are incompetent. But because nobody bothered to ask a basic question: do the systems that feed this thing make any guarantees about correctness?

The quality ceiling

Here's something that should be obvious but apparently isn't: the quality of an AI deliverable is capped by the quality of its inputs. If your data warehouse doesn't guarantee freshness, your features inherit that staleness. If your event stream doesn't guarantee ordering, your model inherits that chaos. If your CRM doesn't guarantee deduplication, your predictions inherit those ghosts. Every upstream compromise accumulates downstream.

Zillow learned this the hard way. In 2021, Zillow shut down its iBuying business and wrote off $569 million. The Zestimate algorithm that powered iBuying relied on a patchwork of internal and external data sources. When the housing market got weird during COVID, the algorithm got weird too. But the root cause wasn't the algorithm. The root cause was that Zillow's internal data couldn't keep pace with rapidly shifting market conditions. The features feeding the model were stale before they ever reached it.

IBM Watson Health is another cautionary tale. IBM poured billions into Watson, promising it would revolutionize cancer treatment. But Watson's recommendations were only as good as the training data, which came from a single hospital system with its own idiosyncratic treatment protocols. When Watson was deployed to other health systems, the recommendations didn't translate. The AI wasn't broken. The internal deliverable—the training data—was never fit for purpose.

This is Kim's Law in action. IBM's internal data quality was mediocre, so the external quality delivered to hospitals was mediocre. The AI was just the messenger.

Responsibility without control

When internal systems make no guarantees, something perverse happens. The consuming team inherits responsibility for validating what they consume. But how can that work? The consuming team doesn't control the upstream system.

This is what I call "responsibility without control," and it's organizational poison.

Imagine you're on the ML team. Your model is producing bad predictions. Leadership wants answers. You dig into the feature pipeline and discover that one of your key features—let's say customer lifetime value—has been wrong for three months because an upstream team changed the definition without telling anyone. Now you're in a meeting explaining why your model is broken, except it's not really your model that's broken. It's someone else's data. But guess whose head is on the block?

This dynamic creates a confusing mess where teams are held accountable for things they cannot influence. The rational response is to build defensive validation layers around every upstream dependency. But that's insane. You're essentially rebuilding the data quality apparatus that should have existed upstream, except now it's fragmented across every consuming team, each with their own interpretation of "correct."
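
To make that concrete, here's a rough sketch of the defensive layer a consuming ML team ends up writing. Everything in it is hypothetical: the table, the column names, the thresholds. The point is that none of this logic belongs on the consumer side at all.

```python
# A defensive validation wrapper an ML team might bolt onto an upstream table
# it doesn't control. Table, columns, and thresholds are hypothetical.
import pandas as pd


def validate_customer_features(df: pd.DataFrame, max_staleness_hours: int = 24) -> pd.DataFrame:
    """Reject upstream data that is stale, duplicated, or missing key fields."""
    problems = []

    # Freshness: the warehouse makes no guarantee, so we check it ourselves.
    # Assumes updated_at is a tz-aware UTC timestamp column.
    staleness = pd.Timestamp.now(tz="UTC") - df["updated_at"].max()
    if staleness > pd.Timedelta(hours=max_staleness_hours):
        problems.append(f"features are {staleness} old (limit {max_staleness_hours}h)")

    # Deduplication: ghost records silently skew training and predictions.
    if df.duplicated(subset=["customer_id"]).any():
        problems.append("duplicate customer_id rows detected")

    # Correctness proxy: an upstream redefinition often shows up as a null spike.
    null_rate = df["lifetime_value"].isna().mean()
    if null_rate > 0.01:
        problems.append(f"lifetime_value null rate {null_rate:.1%} exceeds 1%")

    if problems:
        raise ValueError("Upstream data failed validation: " + "; ".join(problems))
    return df
```

Multiply this by every consuming team, each with its own thresholds and its own idea of "correct," and you have the fragmentation problem in miniature.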

Amazon's AI recruiting tool ran into exactly this problem. The model was trained on historical hiring data that reflected decades of bias toward male candidates. The ML team was responsible for the model, but they didn't control the hiring data. They inherited it. By the time someone noticed the model was penalizing resumes that included the word "women's," the damage was done. Amazon scrapped the project entirely.

The AI initiative was cancelled, but the real culprit was the internal deliverable. The hiring data was never audited for bias because nobody thought of it as a product with quality requirements. It was just... data. Sitting there. Waiting to ruin someone's day.

The real problem is SLAs

When AI initiatives fail, the typical response is to cancel the initiative. Leadership concludes that AI isn't ready, or the use case wasn't viable, or the team wasn't skilled enough. But that amounts to killing the messenger. Perhaps it makes more sense to heed the message:

Your AI stuff is floundering because key internal deliverables don't have SLAs.

Think about it. If you consume an external API, you expect a service level agreement. You expect guarantees about uptime, latency, and data quality. If the API violates the SLA, you have recourse. But internal teams? They ship whatever they ship, whenever they ship it, with whatever definition of "correct" they feel like using that quarter.

This asymmetry is bizarre when you think about it. We hold external vendors to higher standards than we hold ourselves. We negotiate contracts with third parties but communicate with colleagues through vibes and assumptions. No wonder internal quality degrades. There's no mechanism to prevent it.

This is why data contracts are becoming a thing. Companies like PayPal and Saxo Bank have started treating internal data products like external APIs—with explicit schemas, versioning, and quality guarantees. When an upstream team wants to change a field definition, they have to negotiate with downstream consumers. When quality degrades, there are actual consequences.

This sounds like bureaucracy, but it's actually the opposite. Without explicit contracts, you get implicit contracts, which are worse. Implicit contracts are discovered through breakage. Explicit contracts are negotiated through conversation.
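
For the skeptics: an explicit contract doesn't have to be heavyweight. Here's a minimal sketch of what one could look like, written in Python purely for illustration; the dataset, owner, and thresholds are made up, and real teams might express the same thing in YAML or a schema registry instead.

```python
# A minimal, declarative data contract. The producing team publishes it,
# versions it, and negotiates changes with consumers before shipping them.
# All names and numbers here are illustrative.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldSpec:
    name: str
    dtype: str            # e.g. "string", "float64", "timestamp"
    nullable: bool = False


@dataclass(frozen=True)
class DataContract:
    dataset: str
    version: str                # bumped on any breaking change
    owner: str                  # a named person, not a team alias
    freshness_sla_hours: int    # max age before the data counts as stale
    fields: tuple[FieldSpec, ...] = field(default_factory=tuple)


customer_ltv_contract = DataContract(
    dataset="analytics.customer_lifetime_value",
    version="2.1.0",
    owner="jane.doe@example.com",
    freshness_sla_hours=24,
    fields=(
        FieldSpec("customer_id", "string"),
        FieldSpec("lifetime_value", "float64"),
        FieldSpec("updated_at", "timestamp"),
    ),
)
```

The format matters far less than the fact that the guarantee is written down, versioned, and owned. That's what turns "discovered through breakage" into "negotiated through conversation."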

The competitive angle

Here's the part that should terrify you: cancelling your AI initiative doesn't make the problem go away. It just means your competitors will solve it first.

You're not competing in a vacuum. Somewhere out there, a rival firm has figured out that AI failures are really internal quality failures. They're hardening their SLAs. They're investing in data contracts. They're treating internal deliverables like products. And while you're writing postmortems about why the AI project didn't work out, they're shipping features that make your product look obsolete.

Quitting halfway just guarantees your competition is going to eat your lunch. The firms that win the AI race won't be the ones with the fanciest models. They'll be the ones with the cleanest internal systems. Because Kim's Law doesn't care about your excuses. The quality you deliver to customers cannot exceed the quality you give each other.

This is why a failed AI initiative is a death knell. It indicates that you can't get your internal house in order, and you're not going to compete in a world where AI is table stakes. You're signaling to the market that your organization lacks the discipline to execute on the most important technology shift since the internet.

Hardening the foundation

If you're serious about AI, you need to get serious about internal deliverable quality. This means treating your data systems, your APIs, and your internal tools as products with customers rather than utilities that just exist.

Here's what that looks like in practice:

Contracts. Every internal deliverable should have an explicit interface, a freshness guarantee, and a definition of correctness. If the deliverable changes, downstream consumers should be notified before it breaks their stuff.

Ownership. Every internal deliverable (including datasets) should have a named owner who is accountable for its quality. Not a team. A person. Someone who will feel personal pain when things go wrong.

Observability. You can't manage what you can't measure. Quality should be monitored continuously, with alerts when things drift out of spec. This applies to data pipelines, feature stores, internal APIs, and anything else that feeds downstream systems. (A rough sketch of what such a monitor could look like follows below.)

Consequences. If an upstream team ships a deliverable that breaks a downstream system, there should be actual consequences. Not blame-storming. Consequences. Like having to fix it before their next sprint starts.
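
To show what "monitored continuously, with alerts when things drift out of spec" might look like, here's a minimal freshness monitor. The dataset name, the 24-hour SLA, and the alert hook are all assumptions for the sketch; in practice you'd wire this into whatever scheduler and paging system you already run.

```python
# A minimal freshness monitor: compare the deliverable's last update against
# its SLA and page the owner when it drifts out of spec. Dataset, SLA, and
# the alert hook are assumptions for this sketch.
from datetime import datetime, timedelta, timezone

DATASET = "analytics.customer_lifetime_value"
FRESHNESS_SLA = timedelta(hours=24)
OWNER = "jane.doe@example.com"


def send_alert(recipient: str, message: str) -> None:
    # Stand-in for PagerDuty/Slack/email so the sketch stays runnable.
    print(f"ALERT to {recipient}: {message}")


def check_freshness(last_updated_at: datetime, now: datetime | None = None) -> bool:
    """Return True if the deliverable is within its freshness SLA."""
    now = now or datetime.now(timezone.utc)
    age = now - last_updated_at
    if age > FRESHNESS_SLA:
        send_alert(OWNER, f"{DATASET} is {age} old (SLA {FRESHNESS_SLA}); "
                          "downstream models are training on stale features")
        return False
    return True


# Example: a table last refreshed 30 hours ago trips the alert.
check_freshness(datetime.now(timezone.utc) - timedelta(hours=30))
```

Run something like this on a schedule against every contract, and "out of spec" stops being something the ML team discovers three months too late.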

The punchline

AI initiatives get cancelled all the time, but cancellation is rarely the rational response. The rational response is to harden SLAs and lock in on internal deliverable quality.

Kim's Law holds: the quality you deliver to customers cannot exceed the quality you give each other. If your internal deliverables are held together by duct tape and good intentions, your customer-facing AI will be too.
