
Why a modern data foundation takes more than a new platform

Too many data modernization efforts begin with the platform. The conversation turns to replacing the underlying data environment, moving reporting workloads to the cloud or retiring legacy tooling. Those decisions matter, but in my experience, they are rarely what makes the work hard.

What makes the work hard is everything that has built up around the platform over time.

I have seen this most often in organizations that inherited legacy architecture through acquisition, accumulated technical debt through years of deferred investment or saw reporting logic and master data evolve without enough enterprise discipline. On the surface, the environment may still appear functional. Dashboards are still refreshing. Reports still go out. Teams still find ways to get numbers. But once the business begins to scale, the weaknesses become much harder to hide.

The warning signs usually appear before the platform itself becomes the problem. Different teams start using different numbers for the same KPI, critical reporting logic begins to live outside core systems and analysts spend more time reconciling data than interpreting it. New business units take longer to onboard, reporting changes become harder than they should be and, before long, the issue is no longer just the data platform. It becomes a broader problem of trust, scalability and control.

That is why so many modernization efforts fail when they are scoped too narrowly. Replacing the platform is only one part of the challenge. The real work is untangling years of logic, definitions and integration patterns that were never designed to scale together.

The platform is only one layer of the problem

One of the clearest lessons I have learned is that legacy data environments rarely fail in an isolated way. They fail by becoming harder to trust and harder to change.

In many environments, the data platform is carrying far more than data. It is carrying years of workarounds for things that source systems were never able to handle cleanly. Reporting logic ends up split across ETL jobs, SQL transformations, scripts, spreadsheets and side databases. Some of it was built quickly to solve immediate business needs. Some of it was necessary at the time. But over time, those decisions create duplicated logic, hidden dependencies and handoffs that become harder to govern every time the business changes.

The issue is not only technical debt in the traditional sense. It is also reporting debt, where inconsistent definitions and duplicated logic across reports make data harder to trust and maintain. KPI definitions evolve differently across functions. Business logic gets embedded in too many places. Teams build local workarounds to compensate for mismatched source data. The business keeps moving, but the data foundation falls further behind.

That is why I think CIOs need to treat modernization less like a platform replacement and more like an effort to restore architectural separation and control.

In practice, that means separating ingestion, transformation and reporting instead of allowing all three to collapse into the same layer. It means reducing the number of places where business logic can live. It means establishing a clear source of truth for key metrics before they show up in executive dashboards. It also means making sure master data is defined consistently enough that teams are not comparing duplicate records or conflicting definitions and assuming the platform is to blame.
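
To make "reducing the number of places where business logic can live" concrete, here is a minimal sketch, assuming a Python reporting stack, of a KPI defined once in a shared module; the metric and field names are illustrative assumptions, not a prescribed standard.

```python
# A minimal sketch of giving a KPI exactly one home. The metric and
# field names (net_revenue, gross, returns, discounts) are illustrative.

def net_revenue(rows: list[dict]) -> float:
    """The single agreed definition of the KPI: gross amount minus
    returns and discounts. Dashboards and exports import this function
    instead of re-deriving the number locally."""
    return sum(r["gross"] - r["returns"] - r["discounts"] for r in rows)

# Every consumer calls the shared definition, so the KPI cannot drift:
orders = [{"gross": 1200.0, "returns": 100.0, "discounts": 50.0}]
print(net_revenue(orders))  # 1050.0
```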

Fit matters more than feature depth

Platform decisions are often misunderstood.

On paper, most modern data platforms are capable. They all promise scale, flexibility and performance. But in practice, the decision is rarely about capability alone. It is about fit.

In recent modernization work, I have seen firsthand that the wrong decision is not always choosing an inferior technology. More often, it is choosing a platform that introduces unnecessary complexity into an environment that is already fragmented.

That complexity shows up quickly in the form of another cloud to manage, another billing model to track, another toolchain to support, another integration layer to maintain, another set of skills to build and another governance surface to control.

Those costs do not always show up clearly in vendor comparisons, but they show up immediately in execution.

That is why I have become more disciplined about asking a different question. Not what is the most powerful platform on paper, but what choice best aligns with the operating model, capabilities and simplification goals of the enterprise.

There is no one-size-fits-all answer. For some organizations, a separate cloud native warehouse may make perfect sense. For others, a more unified platform approach is the better fit because it leverages current skills, preserves momentum and avoids duplicating effort inside an ongoing modernization program.

That distinction matters.

The goal is not to build the most theoretically flexible architecture. It is to build one the organization can actually govern, extend and operate over time.

Master data is where credibility starts

Modernization does not become credible until master data starts to improve.

That is not a side effort. It is part of the foundation.

In many enterprises, the root problem is not just the reporting layer. It is the fact that core entities such as customers, products, suppliers and locations are still defined differently across systems. When that happens, every downstream discussion about trust, reporting consistency and AI readiness becomes harder than it should be.

One area where this becomes tangible is syndication and deduplication. In most legacy environments, the same customer, product or supplier exists multiple times across systems, often with slight variations in naming, attributes or hierarchy. Over time, teams build local workarounds to compensate, which only reinforces the fragmentation.

Deduplication is not just a technical exercise. It forces alignment on what defines a unique entity. Syndication operationalizes that alignment, ensuring that once data is standardized, it is consistently distributed across systems and downstream processes. Without both, organizations end up maintaining multiple versions of the same truth, and the platform becomes harder to trust regardless of how modern it is.
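
As a hedged illustration of what that alignment can look like in code, the sketch below matches customer records using only the Python standard library. The normalization rules, threshold and corroborating attribute are assumptions for illustration; production master data tools add blocking, survivorship rules and stewardship workflows on top of this idea.

```python
# A minimal sketch of rule-based deduplication across source systems.
# Matching rules and the 0.9 threshold are illustrative assumptions.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Strip case, punctuation and common legal suffixes before comparing."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    for suffix in (" inc", " llc", " ltd", " gmbh"):
        cleaned = cleaned.removesuffix(suffix)  # Python 3.9+
    return cleaned.strip()

def is_same_customer(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Treat two records as one entity when normalized names are close
    and a corroborating attribute (postal code) also matches."""
    score = SequenceMatcher(None, normalize(a["name"]),
                            normalize(b["name"])).ratio()
    return score >= threshold and a.get("zip") == b.get("zip")

records = [
    {"id": 1, "name": "Acme Inc.", "zip": "10001"},  # from the ERP
    {"id": 2, "name": "ACME, Inc", "zip": "10001"},  # from the CRM
    {"id": 3, "name": "Apex Ltd.", "zip": "94105"},  # a distinct entity
]

# Naive pairwise pass; real tools use blocking to avoid O(n^2) comparisons.
duplicates = [(a["id"], b["id"])
              for i, a in enumerate(records)
              for b in records[i + 1:]
              if is_same_customer(a, b)]
print(duplicates)  # [(1, 2)]
```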

That is why I keep coming back to master data discipline. If important reports are not built on agreed business definitions and trusted logic, leaders end up looking at different versions of the same KPI. If customers, products and suppliers are not defined consistently across the business, the platform may look modern while the reporting remains hard to trust.

That is also why phased execution matters. Master data does not have to be fully resolved upfront, but it does need to be mature enough in the right domains to support the first releases and give the organization a foundation it can extend with confidence.

A modern foundation has to be engineered for change

What has worked best in my experience is a disciplined architecture that separates ingestion, transformation and reporting instead of mixing them together in ways that are hard to maintain.

That is where the medallion model becomes practical, giving the organization a structured way to separate raw data, standardized data and business-facing reporting. Bronze is where data first comes in from different systems. Silver is where it gets standardized, so the business is not working from conflicting definitions or duplicate records. Gold is where reporting and KPIs can sit on a more trusted foundation. That separation makes the environment easier to scale, troubleshoot and govern over time. The value is not in terminology, but in the discipline behind it.
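
As a sketch of how that layering can look in code, assuming a PySpark environment with a metastore, the fragment below keeps each layer in its own schema. The paths, table names and transformations are illustrative assumptions, not a definitive implementation.

```python
# A minimal medallion-style sketch in PySpark. Paths and table names
# (s3://landing/orders/, bronze.raw_orders, etc.) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land source data as-is, adding only ingestion metadata.
bronze = (spark.read.json("s3://landing/orders/")
          .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.mode("append").saveAsTable("bronze.raw_orders")

# Silver: standardize types and remove duplicate business keys, so the
# business is not working from conflicting definitions.
silver = (spark.table("bronze.raw_orders")
          .withColumn("order_date", F.to_date("order_date"))
          .dropDuplicates(["order_id"]))
silver.write.mode("overwrite").saveAsTable("silver.orders")

# Gold: business-facing KPIs built only from the standardized layer.
gold = (spark.table("silver.orders")
        .groupBy("customer_id")
        .agg(F.sum("amount").alias("total_revenue")))
gold.write.mode("overwrite").saveAsTable("gold.customer_revenue")
```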

I have seen organizations modernize into cloud data warehouses, data lakes and lakehouse architectures. The pattern is the same. If the underlying logic, master data and governance are still fragmented, the new platform inherits the same trust problems as the old one.

That same discipline has to carry through to the platform itself. If the environment is going to hold up under growth, the pipelines have to be observable, versioned and resilient enough to support change without constant rework. Environment separation, CI/CD workflows and operational monitoring are not extras. They are part of what makes the platform sustainable.
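
A minimal sketch of what environment separation can look like in practice: the same versioned pipeline code runs everywhere, and only configuration changes between environments. The environment names and the PIPELINE_ENV variable are assumptions for illustration.

```python
# A minimal sketch of configuration-driven environment separation.
import logging
import os
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

@dataclass(frozen=True)
class PipelineConfig:
    warehouse: str   # target database, one per environment
    fail_fast: bool  # stop on the first validation error in prod

CONFIGS = {
    "dev":  PipelineConfig(warehouse="analytics_dev", fail_fast=False),
    "prod": PipelineConfig(warehouse="analytics_prod", fail_fast=True),
}

def run_pipeline() -> None:
    # The deployed code is identical in every environment; CI/CD promotes
    # the same artifact from dev to prod, and only the config differs.
    env = os.environ.get("PIPELINE_ENV", "dev")
    cfg = CONFIGS[env]
    log.info("starting run: env=%s warehouse=%s", env, cfg.warehouse)
    # ... ingestion, transformation and validation steps would run here ...
    log.info("run complete: env=%s", env)

if __name__ == "__main__":
    run_pipeline()
```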

I also would not lead a modernization effort with AI, even when the pressure is high. AI raises the stakes, but it does not change the core problem. If the data foundation is still fragmented, poorly governed or inconsistent, a new AI layer will not solve it. That is increasingly showing up in the market, with Gartner warning that many generative AI efforts will stall because of poor data quality, inadequate risk controls, escalating costs or unclear business value. Foundry’s latest AI research reinforces this, identifying data storage and management as a top foundational investment for internal AI.

Final thought

The technology will continue to evolve.

The organizations that benefit most will not be the ones chasing every new platform. They will be the ones making disciplined decisions about how those platforms fit into their operating model and executing against them consistently.

Modernization does not fail because the technology is not good enough.

It struggles when the decisions behind it are not grounded in how the business actually runs.

This article is published as part of the Foundry Expert Contributor Network.

Data debt will cripple your AI strategy if left unaddressed

As every CIO knows, AI success hinges on rock-solid data practices. But as CEOs and boards have emphasized digital transformations in recent years, funding for data management transformation efforts has been piecemeal at best. Now, with AI atop the CEO agenda, many CIOs find themselves in a bind, having to also overhaul data operations and address years, or decades, of accumulated data debt.   

If your enterprise has data debt, AI will expose it. In fact, data debt can lead to devastating failure rates with AI projects. For technology leaders, there’s no time like the present to pay down this debt with a comprehensive remediation strategy.

Data debt can arise for a variety of reasons, including old and outdated data management practices, shortcuts and compromises in infrastructure to meet near-term goals, poorly documented data sources, and inefficient data storage practices.

Research firm IDC, in its 2026 CIO Agenda Predictions, notes that by 2027, CIOs who delay the launch of data debt remediation will face 50% higher AI failure rates and rising costs, as model underperformance exposes issues from siloed, redundant, or poor-quality data.

“These findings reinforce that scaling AI requires disciplined investment in data foundations and integrated platforms, and that postponing these fundamentals risks turning AI ambition into sustained operational friction,” the report says.

“AI doesn’t create data problems; it exposes and accelerates them,” says Hrishikesh Pippadipally, CIO at accounting and advisory firm Wiss. “When organizations lack standardized processes, consistent definitions, and disciplined data governance, data naturally decays over time. That decay may not be visible in traditional reporting environments, but AI systems surface those inconsistencies quickly.”

Data debt is often the result of process drift — multiple teams using different definitions, inconsistent data entry standards, and siloed systems evolving independently, Pippadipally says.

“Without standardization and clear ownership, even modern systems degrade,” he says. “At our organization, we’ve learned that remediation isn’t just about cleaning historical data. It’s about instituting disciplined processes that prevent decay going forward: clear data ownership, standardized workflows, and governance embedded into daily operations.”

That said, not all AI initiatives are blocked by imperfect data, Pippadipally says. “There are smaller, well-bounded use cases, such as document summarization, drafting assistance, anomaly flagging, or reconciliation support, that can deliver value with human-in-the-loop verification,” he says. “These contained applications allow organizations to build AI maturity while foundational data improvements are under way.”

A mounting problem that requires a fast fix

A widespread problem, data debt at most organizations has grown organically over decades. In addition to an increasing emphasis on data collection, companies have accumulated data debt through years of mergers and acquisitions, as well as through the deployment of new systems and services, whether enterprisewide or by individual departments.

“Systems were layered in response to immediate needs, acquisitions, regulatory requirements, or departmental preferences,” Pippadipally says. “Over time, inconsistent processes and standards lead to fragmented data environments.”

Moreover, data management inefficiencies have historically been addressed with manual work-arounds, Pippadipally says. “Teams reconciled reports manually,” he says. “Analysts compensated for inconsistent definitions. But AI reduces tolerance for ambiguity. When automated systems operate at scale, inconsistencies multiply rather than average out.”

It’s vital to address this now because AI initiatives are moving faster than process maturity. There is a clear sense of urgency.

“If organizations don’t institutionalize process discipline and standardization, they risk automating chaos instead of improving outcomes,” Pippadipally says. “The issue is not simply poor data; it is the absence of sustained governance to keep data reliable over time.”

For many enterprises, data debt can stay hidden while they are conducting traditional business intelligence or one-off analytics, says Juan Nassif, regional CTO at software development provider BairesDev.

“AI is different; it’s far less forgiving and it quickly exposes duplicates, inconsistent definitions, missing context, and ‘mystery fields’ with unclear lineage,” Nassif says. “When you scale beyond pilots, those issues show up as model underperformance, higher iteration cycles, and rising operational costs. It’s absolutely a concern for us, too, and we treat it as a prerequisite for scaling AI responsibly.”

If data is incomplete, inconsistent, or duplicated, the output from AI models becomes unreliable. “That can mean wrong answers, poor recommendations, or automations that break at the worst time,” Nassif says. “Teams end up spending most of their time wrangling data, reworking pipelines, and compensating for poor inputs with repeated tuning and exceptions.”

Some form of data debt is present in every sector, and in virtually all sizes of organizations.

“I witness the consequences of data debt in my daily work with schools in the UK every single week,” says Mark Friend, director of Classroom365, which advises educational institutions on technology architecture and strategy.

“Most people assume that when they purchase the latest AI tool, all their problems will be solved no matter how messy the foundation underneath the hood,” Friend says. “My experience with this is that even the most expensive software is useless if the input is not reliable.” Data debt is “a fundamental risk to institutional stability,” he says.

Tips for effective data debt remediation

Enterprise-wide data debt remediation can be a significant, costly undertaking that involves multiple aspects of the business. It’s not just a technology issue, but a discipline issue as well. It requires cleaning up historical data as well as strengthening process governance to keep from repeating the mistakes or poor practices of the past.

Because of this, building and executing an effective strategy requires an organized and thorough approach. Here are some tips from experts.

Get senior management and board-level sponsorship

Any major IT initiative typically needs buy-in from senior business executives and even boards, particularly if it involves a large, global enterprise. Data debt remediation is no different. There is significant financial risk if remediation does not have the blessing and full backing of senior executives and board members.

Explaining the potential ramifications is a good way to bring attention to the need for remediation. “Make data debt visible and tie it to business risk,” Nassif says. “Data debt won’t get prioritized until it’s linked to AI failure rates, rising costs, and compliance exposure.”

Data debt is now a board-level risk, says Adrian Lawrence, founder of executive recruitment firm NED Capital, who advises boards and finance leaders on enterprise data governance, reporting integrity, and AI readiness.

“I see the pressure mounting with boards linking their AI investment to productivity and profitability objectives, but disjointed financial, sales, and operations data severely undermine model accuracy,” Lawrence says. “They lay bare the deficiencies [enterprise platform] upgrades and antiquated technology did not fully address.”

Success with debt remediation “demands executive sponsorship, disciplined data governance, and staged architecture cleanup treating data as an asset on the balance sheet,” Lawrence says.

Standardize core processes before scaling AI

To make the benefits of data debt remediation last, enterprises need to standardize their core business processes.

“Data quality reflects process quality,” Pippadipally says. “Leaders must align on standardized workflows, definitions, and system usage before expecting AI to operate consistently. Without process standardization, remediation efforts will be temporary.”

AI performs best in predictable environments, Pippadipally says, and standardization creates the stability AI requires.

BairesDev has embedded automated checks for data freshness, completeness, duplicates, and schema changes, so data quality issues get caught before they reach analytics or AI workflows, Nassif says.
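
The article does not describe BairesDev's implementation, but a minimal sketch of such checks in plain Python might look like the following; the column names, expected schema and 24-hour freshness window are illustrative assumptions.

```python
# A minimal sketch of automated batch-level data quality checks covering
# schema drift, completeness, duplicates and freshness. All names and
# thresholds are hypothetical.
from datetime import datetime, timedelta, timezone

EXPECTED_SCHEMA = {"id", "name", "updated_at"}

def check_batch(rows: list[dict]) -> list[str]:
    """Return data quality failures for one incoming batch."""
    failures = []
    # Schema drift: every row must carry exactly the expected columns.
    if any(set(r) != EXPECTED_SCHEMA for r in rows):
        failures.append("schema drift detected")
    # Completeness: no null business keys.
    if any(r.get("id") is None for r in rows):
        failures.append("null id detected")
    # Duplicates: the business key must be unique within the batch.
    ids = [r["id"] for r in rows if r.get("id") is not None]
    if len(ids) != len(set(ids)):
        failures.append("duplicate ids detected")
    # Freshness: the newest record must be recent enough
    # (assumes timezone-aware datetimes in updated_at).
    newest = max(r["updated_at"] for r in rows)
    if datetime.now(timezone.utc) - newest > timedelta(hours=24):
        failures.append("data older than 24 hours")
    return failures

batch = [{"id": 1, "name": "Acme",
          "updated_at": datetime.now(timezone.utc)}]
print(check_batch(batch))  # [] means the batch passed every check
```

Running checks like these before data reaches analytics or AI workflows is what turns quality from a periodic cleanup into a gate in the pipeline itself.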

Establish data ownership and ongoing governance

Another way to ensure long-term benefits from a remediation effort is to have ongoing governance and accountability processes in place.

“Data remediation is not a one-time cleanup initiative,” Pippadipally says. “Assigning clear ownership at the domain level, and establishing continuous monitoring, prevents data from degrading again.”

This is important, because governance ensures sustainability. “Without discipline, organizations reaccumulate data debt even after cleanup efforts,” Pippadipally says.

“We’ve been tightening dataset ownership and standardizing common business definitions, so teams aren’t training or prompting on conflicting ‘versions of truth,’” Nassif says. “We’ve been strengthening our cataloging and lineage practices, so teams can trace where data comes from, how it transforms, and who can use it — critical for both trust and governance.”

The biggest shift is mindset. “We don’t treat data remediation as a one-time cleanup,” Nassif says. “We treat it as ongoing engineering with guardrails that prevent debt from coming right back.”
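
As a hedged sketch of the cataloging and lineage practice Nassif describes above (not BairesDev's actual tooling), the fragment below records ownership and upstream sources alongside each produced dataset; all field names are assumptions for illustration.

```python
# A minimal sketch of recording lineage metadata with each transformation,
# so consumers can trace where a dataset came from and who owns it.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    dataset: str           # the dataset this run produced
    owner: str             # the accountable domain owner
    sources: list[str]     # upstream datasets consumed
    transform: str         # the job or script that produced it
    produced_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# Hypothetical entry, written to a catalog after each pipeline run:
record = LineageRecord(
    dataset="gold.customer_revenue",
    owner="finance-data-team",
    sources=["silver.orders"],
    transform="jobs/customer_revenue.py",
)
print(record)
```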

Prioritize high-value, contained AI use cases

While large data modernization initiatives progress within an organization, CIOs can deploy AI in tightly scoped areas where outputs are verifiable and human oversight is straightforward, Pippadipally says.


“Examples include drafting support, controlled reconciliations, workflow triage, or anomaly flagging,” Pippadipally says. “This approach builds organizational confidence and demonstrates ROI without overexposing the enterprise to data risk.”


Clean up storage

When it comes to data storage practices, there’s no doubt that organizations need to clean up their act. Poor practices lead to poor data quality, which could impact AI-driven projects.

“Schools are often very good at storing data like [in] an attic where they just keep throwing boxes without looking inside,” Friend says. “Anyone who has lived through a technology refresh knows that messy storage is a massive financial burden.”

Decades of bad collection practices “have created a technical rot that we can no longer ignore,” Friend says. “You might think that your legacy storage is harmless, but it actually places a massive financial burden in the form of rising operational costs,” and can negatively impact AI initiatives.

OpenText wins in two categories at the 2026 SAP Global Partner Awards

The SAP Partner Awards program recognizes global partners that have contributed to customers' business transformation and results over the past year. Winners are selected through performance metrics and data-driven evaluation.

According to OpenText, the awards came in the Partner Solution Success and Human Capital Management (HCM) Solution Excellence categories, recognizing the company's SAP-based innovation and customer value creation. OpenText was also recognized for its technology leadership and collaboration in the SAP Solution Extensions space. A key factor in the evaluation was the company's integration of information management, content, data and AI capabilities within SAP environments, strengthening both operational efficiency and regulatory compliance.

Through their partnership, the two companies help enterprises accelerate AI-driven content use, automation and cloud transformation across SAP-based business processes, allowing organizations to reduce complexity and improve productivity during SAP S/4HANA cloud migrations.


OpenText also supports digital document management and workflow automation in human resources (HR) through its integration with SAP SuccessFactors, helping improve efficiency across the enterprise.

"This award reflects the results of our close collaboration with SAP to deliver real support for customers' digital transformation and AI-driven innovation," said Mark Bailey, vice president of SAP partnerships at OpenText. "We plan to expand our support so that enterprises can respond to the AI era faster and more securely."

Going forward, OpenText plans to continue building on its collaboration with SAP to accelerate cloud transformation and AI adoption, and to support information management and business innovation for global customers.


