Most conversations about AI begin with the model. Which model is the most capable? Which is the fastest? Which scores highest on benchmarks? These are interesting technical questions, but they are almost never the questions that determine whether an AI system will produce real value in a business environment.
The question that matters more, and that is consistently underestimated, is simpler: what data do you actually have?
The model is not the bottleneck
Large language models and other AI architectures have reached a level of general capability that makes them useful across a wide range of tasks. The bottleneck in most applied AI projects is not the model. It is the data that the model has access to and the quality of the context it receives.
A model without relevant data is a powerful engine with no fuel. It can generate plausible-sounding outputs, but those outputs are disconnected from the operational reality of the business. They lack the specificity, the historical awareness, and the contextual depth that make AI genuinely useful rather than merely impressive. This is the sandbox problem in its most fundamental form: the model operates in an isolated environment where everything you do not explicitly provide simply does not exist.
This is why so many AI pilots succeed in controlled environments and then fail when deployed into real workflows. The demo works because the inputs are curated. The production system struggles because the data is messy, incomplete, inconsistent, or simply absent. A practical example of this principle at work: building a product recognition pipeline without training a model required not just a capable vision model, but a carefully structured test harness — the data layer that made iterative improvement possible.
What historical data actually provides
Historical data is not just a training set. In applied AI, it serves a much broader function. It provides the operational memory that allows a system to contextualise new inputs, detect meaningful patterns, and produce outputs that are grounded in real experience rather than statistical approximation.
Consider a few concrete examples:
- Customer support automation. A model can generate polite responses to any question. But without access to historical support tickets, resolution patterns, product-specific issues, and past customer interactions, it cannot prioritise correctly, escalate reliably, or resolve issues in a way that reflects how the business actually operates. The same principle applies to AI-assisted bug reporting, where an intake agent that structures vague reports into actionable tickets is only as useful as the quality of the conversational data it collects.
- Demand forecasting. Any model can produce a forecast. But a forecast that ignores seasonal patterns, historical sales cycles, past promotional impacts, supply chain disruptions, and category-specific trends is not a forecast. It is a guess with confidence intervals.
- Process anomaly detection. Identifying that something is wrong in an operational workflow requires knowing what normal looks like. That knowledge comes exclusively from historical data. Without it, the system cannot distinguish between a genuine anomaly and a routine variation.
In each case, the model provides the reasoning capability. The data provides the substance. Without the substance, the reasoning is empty.
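The anomaly-detection case makes this concrete. A minimal sketch, assuming a simple z-score rule over a hypothetical series of daily order volumes: the detection logic is trivial, and everything that makes it work comes from the historical data that defines "normal".

```python
from statistics import mean, stdev

def is_anomaly(value, history, threshold=3.0):
    """Flag a value as anomalous if it deviates from the historical
    baseline by more than `threshold` standard deviations."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return value != baseline
    return abs(value - baseline) / spread > threshold

# Hypothetical daily order volumes: the historical record defines "normal".
history = [102, 98, 110, 95, 105, 99, 101, 104, 97, 103]
print(is_anomaly(100, history))  # routine variation -> False
print(is_anomaly(240, history))  # genuine anomaly -> True
```

With ten days of history the rule works; with none, there is no baseline and the same function cannot run at all. The data is not an input to the technique. It is the technique.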
The data quality problem
Even when historical data exists, its quality is often far below what is needed for reliable AI outputs. This is not a new problem, but AI amplifies it. Traditional reporting systems can tolerate a certain level of data inconsistency because humans interpret the results with contextual knowledge. AI systems, by contrast, treat the data as ground truth and propagate errors with confidence. This is precisely why post-commit testing remains critical even for AI-written code — stacked data quality issues often only become visible when the first layer of error is corrected.
Common data quality issues that undermine AI projects include:
- Inconsistent formats. Dates stored in multiple formats, addresses with varying structures, product names that change over time without mapping.
- Missing records. Gaps in transactional data, incomplete customer histories, periods where logging was disabled or misconfigured.
- Duplicate entries. Multiple records for the same entity without a clear deduplication strategy.
- Lack of context. Data that records what happened but not why. A cancelled order without a reason code. A support ticket marked resolved without a resolution summary.
- Schema drift. Database structures that have evolved over years without documentation, making it difficult to interpret older records correctly.
These problems are not exotic edge cases. They are the default state of most business data environments. Addressing them is unglamorous work, but it is the work that determines whether an AI system will be useful or unreliable.
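What that unglamorous work looks like can be sketched in a few lines. This is a minimal illustration, not a production pipeline, and the field names (`order_id`, `date`) and date formats are hypothetical: it normalises inconsistent date formats, deduplicates on a stable entity key, and fails loudly on anything it cannot interpret rather than letting bad data through silently.

```python
from datetime import datetime

# Hypothetical raw records showing two of the issues described above:
# mixed date formats, and duplicate entries for the same order.
raw_records = [
    {"order_id": "A-100", "date": "2023-01-15"},
    {"order_id": "A-100", "date": "15/01/2023"},   # duplicate, other format
    {"order_id": "A-101", "date": "Jan 16, 2023"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")

def normalise_date(value):
    """Try each known format; raise on unknown ones so that bad
    data surfaces during cleaning, not downstream in the AI system."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {value!r}")

def clean(records):
    seen = set()
    cleaned = []
    for rec in records:
        key = rec["order_id"]          # deduplicate on a stable entity key
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**rec, "date": normalise_date(rec["date"])})
    return cleaned

print(clean(raw_records))
# [{'order_id': 'A-100', 'date': '2023-01-15'},
#  {'order_id': 'A-101', 'date': '2023-01-16'}]
```

Even this toy version encodes two decisions a real pipeline must make explicitly: which entity key defines a duplicate, and whether unparseable values are dropped, flagged, or allowed to halt the run.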
Data as competitive advantage
There is a strategic dimension to this that is worth stating directly. Models are increasingly commoditised. The major providers offer broadly similar capabilities, and open-source alternatives continue to improve. The differentiator for any business investing in AI is not which model it uses. It is the quality, depth, and relevance of its proprietary data.
A company with twenty years of well-structured operational data has a genuine competitive advantage in applied AI. A company with the same model but poor data infrastructure has an expensive experiment.
This is why the most impactful AI investments are often not in model development or prompt engineering, but in data engineering: cleaning, structuring, connecting, and governing the historical data that makes AI outputs specific, reliable, and actionable.
The gap between demos and production
The AI industry has a presentation problem. Demos are designed to showcase capability in ideal conditions. They use clean data, well-defined tasks, and forgiving evaluation criteria. Production environments have none of these advantages.
In production, the data arrives with errors. The task boundaries are ambiguous. The users interact with the system in unexpected ways. The edge cases multiply. And the system must perform reliably not once, but continuously, under changing conditions.
The gap between demo and production is almost always a data problem. The model can handle the complexity. What it cannot handle is operating without the data foundation that makes its outputs meaningful in context. This same gap appears in AI-assisted development: a coding agent works when given vague instructions in a demo because the demo is a closed system. Production drifts because the agent fills ambiguous instructions with its own assumptions. Structured requirements eliminate that gap by making task boundaries explicit and acceptance criteria verifiable.
This is not an argument against AI. It is an argument for taking the data foundation seriously before investing in the model layer. The organisations that do this consistently are the ones that extract real, sustained value from AI. The ones that skip it consistently end up cycling through pilots that never reach production.
What this means in practice
For anyone working on applied AI in a business context, the practical implications are straightforward:
- Audit your data before selecting a model. Understand what historical data exists, where it lives, how consistent it is, and what gaps need to be addressed.
- Invest in data engineering. Cleaning, structuring, and connecting historical data is not preparation for AI. It is the core of the AI project.
- Set expectations based on data quality. The ceiling of what AI can achieve in your environment is determined by your data, not by the model’s theoretical capabilities.
- Design for data feedback loops. Every AI interaction should generate structured data that improves future performance. The system should get better over time because the data gets better over time. This is exactly the pattern behind test-driven AI optimisation, where a structured test suite functions as the feedback mechanism that drives measurable, compounding improvements.
- Be sceptical of model-first narratives. When someone proposes an AI solution without asking about data quality, they are proposing a demo, not a production system.
Historical data is not merely a prerequisite for AI. It is the foundation. Without it, even the most capable models produce outputs that are technically impressive and operationally useless. With it, even modest models can deliver measurable, sustained business value.
The real work of applied AI is not building smarter models. It is building the data infrastructure that allows any model to be genuinely useful. And the human skill that makes this possible — the ability to judge what matters, to structure complexity, and to articulate what a system should become — is the same capability that defines the future of software itself.