AI sprawl: Why your productivity trap is about to get expensive

I have seen this movie before.

A decade ago, at Tesla, our Finance team faced a data crisis. We had information scattered across accounting, supply chain and delivery systems, all disconnected, all using different structures. The engineering team was rightfully focused on Full Self-Driving (FSD) and manufacturing. So, we did what productivity-hungry teams always do: We built our own solution. We taught ourselves Structured Query Language (SQL), normalized the data with creative IF-THEN logic and created our own reporting database.

It worked beautifully. Until it became a governance nightmare. The VP of Engineering hated our siloed system with embedded business logic. We eventually handed it over to IT, but not before our workaround forced the company to finally resource a proper data team.

The pattern is always the same: Productivity-hungry teams build workarounds faster than the organization can govern them, and by the time leadership notices, the workarounds have become the infrastructure.

That was more than a decade ago. The pattern took years to unfold.

Today, I am watching the exact same dynamic play out in insurance and industries across the board, but compressed into months, not years. AI adoption is sprawling across organizations, led by the same productivity-hungry individuals, but without central platforms or governance. Leadership has not created space for safe experimentation, so adoption spreads like a city without a highway system. The difference? Back then, we were building SQL databases. In 2026, we are building AI agents. And the cost of fragmentation is exponentially higher.

What is AI sprawl?

AI sprawl is what happens when the cost of building AI drops faster than an organization can govern it. Teams spin up models, agents and automations independently. Each one works in isolation. None of them connect. The result is fragmented data, drifting decisions and intelligent systems that quietly get abandoned.

It happens because execution has become cheap. Large Language Model (LLM) APIs, no-code tools and cloud infrastructure have made spinning up AI trivially easy. A claims team builds an automation to speed adjudication. Underwriting builds a model to assess risk. Customer service deploys a chatbot. Each initiative delivers local value. No single project looks like a problem.

But collectively, they create an ungovernable landscape.

Over the past 18 months, the GenAI acceleration intensified what IDC calls the GenAI scramble: scattered, fragmented and sometimes redundant applications launched by business-led initiatives without central oversight. Many organizations have fallen into what researchers describe as a productivity trap: Focusing on short-sighted value generation instead of scalability, which limits their ability to create reusable capabilities across departments.

AI sprawl is everywhere

A major property and casualty carrier recently invited us to speak with their innovation leadership about implementing process automation. We spoke with more than 10 key stakeholders across multiple lines of business and found more than a dozen different POCs and local solutions across claims intake, underwriting and fraud detection.

Six of them were solving overlapping problems. None shared data infrastructure. Two had been abandoned months earlier but were still running and still being billed.

This is not an outlier. It is the norm.

AI sprawl persists because it is insidious, hiding in plain sight unless you look for it. Business units move fast, build independently and solve immediate problems. IT discovers shadow AI only when something breaks, when an audit is triggered or when a vendor renewal surfaces a tool nobody knew existed. And the problem multiplies with every innovative team in the organization.

The 4 hidden costs of sprawl

AI sprawl creates costs that compound over time, many of which are not visible in any single budget line. It results in a dangerous cascade of failures:

  1. Governance becomes impossible. Companies cannot govern what they cannot see. When AI systems scatter across departments, audit trails fragment. Bias monitoring becomes inconsistent. Explainability standards vary by team.
  2. Scaling stalls. Disconnected systems cannot integrate. Every new initiative starts from scratch instead of building on shared infrastructure.
  3. Maintenance and redundant spending multiply. Teams that built AI to accelerate their work end up spending most of their time maintaining it. One carrier reported that 60% of their AI engineering capacity was devoted to maintaining existing tools rather than building new capabilities. Meanwhile, teams unknowingly pay for overlapping capabilities because nobody has a complete view of AI spending.
  4. Talent drains away. The best AI engineers want to solve hard problems. When they are cornered into spending their time maintaining fragmented infrastructure, they walk out the door.

Why traditional governance fails

Seventy percent of large insurers are investing in AI governance frameworks. Yet only 5% have mature frameworks in place. This gap is not about commitment or resources. It is about a category mistake.

For the last two decades, enterprise software governance worked because the software itself worked a certain way. Systems were point solutions. A claims platform did claims. A policy admin system did policy admin. Each tool had a clear owner, a defined scope and a predictable boundary. Governance could wrap around the edges, through access controls, audit logs, change management, vendor reviews, because the edges were visible. We governed the perimeter because the perimeter was the product.

AI is not a point solution. It is foundational technology, closer to electricity or a database than to a piece of software. It does not sit inside a defined boundary; it flows across every process, every decision and every department that touches data. And because it flows, it cannot be governed at the perimeter.

This is why carriers applying the old playbook keep running in place. Policy documents, oversight committees and compliance checklists were designed to govern systems that stood still. AI does not stand still. It is built, modified, retrained and extended by the same teams it is meant to serve, often in the same week. By the time a governance committee reviews it, three more versions exist somewhere else in the organization.

The failure is not that carriers are governing AI badly. It is that they are governing it as if it were software, when it’s actually infrastructure. Infrastructure requires a different discipline: Shared foundations, common standards and the assumption that everyone will build on top of it. You do not govern electricity by reviewing each appliance. You govern it by standardizing the grid.

Until carriers make that shift, their frameworks will keep maturing on paper while sprawl compounds underneath.

3 questions every insurance CIO should be able to answer

If the failure of traditional governance is a category mistake, the first job of leadership is to check which category they are actually operating in. These three questions are not meant to produce tidy answers. They are meant to reveal whether you are still governing AI as software when you should be governing it as infrastructure.

1. Are you governing AI at the perimeter, or at the foundation?

Look at your current AI governance artifacts: the policies, the committees, the review processes. Are they designed to wrap around individual tools after they are built, or to set shared standards that every tool must be built on top of? Perimeter governance asks, “Is this specific model compliant?” Foundational governance asks, “Does every model in this organization inherit the same definitions, the same lineage and the same guardrails by default?” If your governance only kicks in at review time, you’re still treating AI like software. You’re already behind.
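The difference shows up clearly in code. Below is a minimal Python sketch of foundational governance, assuming a hypothetical internal model registry: every registration inherits the organization’s guardrails by default instead of being checked against them at review time. The registry, field names and guardrail set are illustrative, not a real framework.

```python
from dataclasses import dataclass, field

# Hypothetical organization-wide standards. Every model inherits these
# by default; nothing here reflects a real framework or vendor API.
ORG_GUARDRAILS = {
    "pii_redaction": True,
    "decision_logging": True,
    "bias_monitoring": "quarterly",
}

@dataclass
class ModelRegistration:
    name: str
    owner: str
    data_lineage: str  # single source of truth for where inputs come from
    guardrails: dict = field(default_factory=lambda: dict(ORG_GUARDRAILS))

def register(model: ModelRegistration) -> ModelRegistration:
    """Foundational governance: shared standards apply at registration,
    not in an after-the-fact perimeter review."""
    missing = [k for k in ORG_GUARDRAILS if k not in model.guardrails]
    if missing:
        raise ValueError(f"{model.name} is missing guardrails: {missing}")
    return model
```

The design choice is the default: a team has to actively remove a guardrail to escape it, which inverts the perimeter model, where a team has to actively submit to review.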

2. If you standardized one thing across your entire organization tomorrow, what would create the most leverage and why haven’t you?

Every carrier has a list of things they know should be standardized but have not been. Shared definitions for core entities. Common ways of handling unstructured inputs. A single source of truth for how decisions get logged. The question is not which item belongs at the top of the list; most CIOs already know. The question is what has been blocking the standardization: Is it political, budgetary, or organizational? Because that blocker, whatever it is, is also what is letting sprawl compound. Governance frameworks cannot fix what foundational decisions have been deferred.

3. When a new AI initiative launches next quarter, what will it automatically inherit from what already exists?

This is the real test. In a point-solution world, every new system is built fresh and governance is applied afterward. In a foundational world, every new system inherits shared standards, shared definitions, shared oversight before a single line of code is written. If the honest answer is “it will inherit nothing, and we will govern it after the fact,” then you do not have an AI governance problem. You have an AI foundation problem, and no amount of policy will close the gap.

The uncomfortable truth is that most carriers will answer these questions honestly and discover they are still operating from the old playbook. It is a signal that the work to be done is not more governance, but different governance, the kind that assumes AI is the ground floor, not the top floor.

From copilot to control plane: Where serious AI governance starts

Serious governance means setting the rules for identity, model access, permissions, logging and human approval before AI tools or agents are allowed to operate inside business workflows. The practical starting point is to identify where AI is already touching repositories, tickets, internal knowledge and business systems, then establish a minimum common control set across those entry points.

The first enterprise AI conversations I kept getting pulled into sounded like tooling debates.

Which copilot should we allow? Which model should we approve? How quickly can teams start using it in the IDE? How much faster will developers move?

Those are reasonable opening questions. In my experience, they are rarely the questions that determine whether AI scales safely inside an enterprise. They are just the entry point.

More than once, I have watched a meeting begin with a simple request to approve an AI coding assistant and end twenty minutes later in a debate about repository access, model approvals, prompt retention, audit trails and whether an agent should be allowed anywhere near a deployment workflow. That is the pattern that matters.

What I have seen instead is a predictable progression. First comes enthusiasm around copilots and coding assistants. Teams want faster code completion, quicker debugging, better documentation and help writing tests. Then the conversation shifts. Leaders start asking what these tools can see, where prompts go, which models are approved, whether responses are retained and how generated output should be reviewed. Then the issue gets bigger again. Once AI starts interacting with repositories, tickets, pipelines, internal knowledge, APIs and systems of record, the problem is no longer the assistant itself. It is the control plane around it.

That is why I no longer think this is mainly a coding tools story. Software development is simply where the governance problem becomes visible first. The broader enterprise issue is whether there is a shared layer for identity, permissions, approved model access, secure context, auditability and action boundaries before AI becomes an execution surface inside the business.

Software development is where the issue surfaces first

Development teams encounter this shift early because the platforms themselves are already moving beyond simple assistance. GitHub Copilot policy controls now let organizations govern feature and model availability, while GitHub’s enterprise AI controls provide a centralized place to manage and monitor policies and agents across the enterprise. GitHub has also made its enterprise AI controls and agent control plane generally available, explicitly positioning them as governance features for deeper control and stronger auditability. That is a sign that governance is starting to surface directly in product design.

Google is sending a similar signal. Gemini Code Assist is framed as support to build, deploy and operate applications across the full software development lifecycle, not just as an IDE helper. Its newer agent mode documentation describes access to built-in tools, and Google’s data governance documentation says Standard and Enterprise prompts and responses are not used to train Gemini Code Assist models and are encrypted in transit. When vendors start documenting lifecycle coverage, tool access, data governance and validation expectations, the market is already telling you what matters next.

Microsoft is even more explicit. Microsoft Agent 365 is described as a control plane for AI agents, with unified observability through telemetry, dashboards and alerts. Microsoft’s Copilot architecture and data protection model put equal emphasis on permissions, data flow, Conditional Access, MFA, labeling and auditing. In other words, the control-plane idea is no longer theoretical. Major platforms are operationalizing it.

That is why the productivity-only debate misses the larger point. DORA’s 2025 report argues that AI primarily acts as an amplifier, magnifying an organization’s existing strengths and weaknesses, and that the biggest gains come from the surrounding system, not from the tool by itself. The DORA AI Capabilities Model pushes the same idea further by laying out organizational capabilities required to get real value from AI-assisted software development. That lines up with what I have seen in practice. Enterprises do not fail because a model is impressive or unimpressive. They fail when they mistake local tool adoption for operating readiness.

The developer productivity research is mixed, which is exactly why leadership should be careful. MIT Sloan summarized field research showing productivity gains from AI coding assistants, especially among less-experienced developers. METR’s 2025 trial, by contrast, found that experienced open-source developers using early-2025 AI tools took longer in that setting. I do not read those findings as contradictions. I read them as a warning against building enterprise strategy around a narrow “hours saved in the IDE” lens. For leaders, the implication is simple: Mixed productivity data is a reason to strengthen governance and operating discipline, not to make strategy from benchmark claims alone.

The shift from assistant to execution layer

The real change happens when AI stops being a suggestion surface and starts becoming an execution surface.

That threshold arrives faster than many leaders expect. GitHub’s coding agent can create pull requests, make changes in response to comments and work in the background before requesting review. GitHub also documents centralized agent management and policy-compliant execution patterns using hooks to log prompts and control which tools Copilot CLI can run. Once a tool can act inside the delivery system, permission design stops being optional.

Anthropic’s documentation makes the same shift visible from another angle. Claude Code is described as an agentic coding tool that reads a codebase, edits files, runs commands and integrates with development tools. Anthropic’s sandboxing work explains how filesystem and network isolation were added to reduce permission prompts while improving safety. Its work on advanced tool use describes dynamic discovery and loading of tools on demand rather than preloading everything into context. Once tools can be discovered dynamically and invoked during work, governance must move above the assistant.

This is usually the point when the room changes. What started as a discussion about developer productivity becomes a discussion about identity, authority, logging, approval boundaries and who owns the risk if an AI-enabled action causes real enterprise impact. The issue is no longer, “Did the assistant help write code?” The issue becomes, “Who authorized this path from context to action?”

Serious governance starts above the tool

If an organization is serious about AI, governance must start above the assistant.

The first control is identity. Who is acting: A human, a service account, a bot or an agent? Microsoft’s Copilot architecture and agent management guidance make this concrete by tying access to user authorization, Conditional Access and MFA. That is the right instinct. AI does not remove the identity problem. It sharpens it.

The second control is permissions. What can the actor read, write, retrieve or execute? This is where many early deployments are still too loose. If an AI tool can read internal knowledge, query systems, write to a repository or trigger workflows, those capabilities need clear tiering just as privileged human access does. In practice, that usually means mapping agent permissions onto existing identity and access models so read, write, query and execution rights follow least-privilege rules rather than tool convenience. That can mean giving an agent read access to internal knowledge, limited write access in development environments and no production execution rights without an explicit approval boundary.
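As a rough illustration of that tiering, here is a hedged Python sketch. The agent names, tiers and grant table are hypothetical; the point is deny-by-default permissions with an explicit human approval boundary on production execution.

```python
from enum import Enum, auto

class Tier(Enum):
    READ_KNOWLEDGE = auto()  # read internal knowledge bases
    WRITE_DEV = auto()       # limited write access in dev environments
    EXECUTE_PROD = auto()    # production execution, approval-gated

# Illustrative grants mapping agent identities to permission tiers.
AGENT_GRANTS = {
    "docs-summarizer": {Tier.READ_KNOWLEDGE},
    "ci-fixer": {Tier.READ_KNOWLEDGE, Tier.WRITE_DEV},
}

def authorize(agent: str, tier: Tier, human_approved: bool = False) -> bool:
    """Least privilege: deny by default; production execution
    additionally requires an explicit human approval."""
    granted = tier in AGENT_GRANTS.get(agent, set())
    if tier is Tier.EXECUTE_PROD:
        return granted and human_approved
    return granted
```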

The third control is approved model access. GitHub now lets organizations govern model and feature availability in Copilot. Google documents edition-specific data handling and validation expectations. Enterprises need a way to decide which models are allowed for which workloads and data classes. Otherwise, every team ends up inventing its own routing logic and risk posture.
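A minimal sketch of what that routing decision could look like, assuming a hypothetical allowlist keyed by data class (the model names are placeholders, not recommendations):

```python
# Which models are approved for which data classes. Illustrative only.
APPROVED_MODELS = {
    "public": {"model-a", "model-b"},
    "internal": {"model-a"},
    "regulated": set(),  # nothing approved yet for regulated data
}

def route(data_class: str, requested_model: str) -> str:
    """Central routing beats every team inventing its own risk posture."""
    allowed = APPROVED_MODELS.get(data_class, set())
    if requested_model not in allowed:
        raise PermissionError(
            f"{requested_model} is not approved for {data_class} data"
        )
    return requested_model
```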

The fourth control is secure context. This is where real exposure often sits: Connectors, retrieval, embedded knowledge, prompts and tool calls. Anthropic’s work on context engineering for agents is useful because it shows how agents increasingly load data just in time through references and tools. That is powerful, but it also means context discipline matters as much as model discipline.
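One way to picture that discipline is a just-in-time loader that resolves references only from allowlisted sources at the moment they are needed. This is an illustrative sketch, not any vendor’s API:

```python
from typing import Callable

ALLOWED_SOURCES = {"wiki", "tickets"}  # connectors an agent may resolve

def load_context(reference: str, fetch: Callable[[str], str]) -> str:
    """Just-in-time context: resolve a reference such as 'wiki:runbook-42'
    only when needed, and only from an allowlisted source."""
    source = reference.split(":", 1)[0]
    if source not in ALLOWED_SOURCES:
        raise PermissionError(f"source {source!r} is not allowlisted")
    return fetch(reference)
```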

The fifth control is auditability. If a system suggests code, opens a ticket, retrieves enterprise content, triggers a tool or initiates a change, the enterprise needs evidence. GitHub’s enterprise agent monitoring and Microsoft’s auditing model both point in this direction. Governance without reconstructable evidence is not governance. It is optimism.
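In its simplest form, reconstructable evidence is one structured record per AI-initiated action, written before the action executes. A hypothetical sketch, with the append-only sink stubbed out:

```python
import json
import time
import uuid

def audit_event(actor: str, action: str, target: str,
                approved_by: str | None = None) -> dict:
    """One record per AI-initiated action, emitted before execution."""
    event = {
        "id": str(uuid.uuid4()),
        "ts": time.time(),
        "actor": actor,          # human, service account, or agent identity
        "action": action,        # e.g. "open_pr" or "run_tool"
        "target": target,        # repository, ticket, pipeline, etc.
        "approved_by": approved_by,
    }
    print(json.dumps(event))     # stand-in for an append-only audit sink
    return event
```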

The standards are already telling us this

The control-plane framing matters because it aligns with where the standards bodies are already going.

NIST’s Secure Software Development Framework says secure practices need to be integrated into each SDLC implementation. NIST SP 800-218A extends that logic with AI-specific practices for model development throughout the software lifecycle. NIST’s Generative AI Profile treats generative AI as a risk-management problem spanning design, development, use and evaluation rather than as a narrow feature rollout. That is consistent with what enterprises are now learning in practice: Once AI touches real delivery and operating processes, governance becomes architectural.

The security community is saying the same thing. OWASP’s LLM Top 10 flags prompt injection, sensitive information disclosure, supply chain vulnerabilities and excessive agency as core risk areas. Those are not merely model-quality issues. They are control issues that show up when AI has context, tools and authority.

Software supply chain discipline matters here, too. SLSA ties stronger software trust to provenance and tamper resistance, while OpenSSF’s MLSecOps whitepaper and its Security-Focused Guide for AI Code Assistant Instructions show that AI-assisted development now needs explicit security practice in both pipelines and prompting. In an AI-assisted delivery environment, provenance and secure instruction design become more important, not less.

The market is moving toward a real control-plane layer

This is not just a framework conversation anymore. It is becoming a market category.

Forrester’s agent control plane research described enterprise needs across three functional planes: Building agents, embedding them into workflows and managing and governing them at scale. That matters because it validates the idea that governance has to sit outside the build plane if it is going to remain consistent as agents proliferate.

The market signal is clear. Microsoft is calling Agent 365 a control plane. GitHub has generally available enterprise AI controls and an agent control plane. Airia’s governance launch explicitly positions governance as a distinct layer alongside security and orchestration. The category is converging around the same problem statement: If agents can act, someone has to govern the conditions under which that action is allowed. Any control-plane solution worth serious consideration should work across models and tools while preserving policy consistency, auditability and clear operational boundaries.

The real leadership question

When this becomes real, I usually stop asking which assistant a team prefers and start asking different questions:

  • Who is the actor, and under what identity does it run?
  • What can it read, what can it write and what can it execute?
  • Which models, endpoints and data flows are approved?
  • What evidence survives an audit, an incident review or a board-level question?
  • Where are the mandatory human checkpoints before an AI-assisted action becomes an enterprise action?

Those questions change the quality of the conversation quickly. They move the discussion out of demo mode and into operating model territory. That is also where alignment starts, because governance becomes a cross-functional operating issue for architecture, security, engineering and risk rather than a tooling preference inside one team.

In the conversations I have been in, that is usually the point when the room stops talking about tools and starts talking about control.

The wrong question for this phase is, “Which copilot should we standardize on?”

The better question is, “What control plane will govern AI wherever it runs?”

That is where serious enterprise AI governance starts.

Data debt will cripple your AI strategy if left unaddressed

As every CIO knows, AI success hinges on rock-solid data practices. But as CEOs and boards have emphasized digital transformations in recent years, funding for data management transformation efforts has been piecemeal at best. Now, with AI atop the CEO agenda, many CIOs find themselves in a bind, having to also overhaul data operations and address years, or decades, of accumulated data debt.   

If your enterprise has data debt, AI will expose it. In fact, data debt can lead to devastating failure rates with AI projects. For technology leaders, there’s no time like the present to pay down this debt with a comprehensive remediation strategy.

Data debt can arise for a variety of reasons, including old and outdated data management practices, shortcuts and compromises in infrastructure to meet near-term goals, poorly documented data sources, and inefficient data storage practices.

Research firm IDC, in its 2026 CIO Agenda Predictions, notes that by 2027, CIOs who delay the launch of data debt remediation will face 50% higher AI failure rates and rising costs, as model underperformance exposes issues from siloed, redundant, or poor-quality data.

“These findings reinforce that scaling AI requires disciplined investment in data foundations and integrated platforms, and that postponing these fundamentals risks turning AI ambition into sustained operational friction,” the report says.

“AI doesn’t create data problems; it exposes and accelerates them,” says Hrishikesh Pippadipally, CIO at accounting and advisory firm Wiss. “When organizations lack standardized processes, consistent definitions, and disciplined data governance, data naturally decays over time. That decay may not be visible in traditional reporting environments, but AI systems surface those inconsistencies quickly.”

Data debt is often the result of process drift — multiple teams using different definitions, inconsistent data entry standards, and siloed systems evolving independently, Pippadipally says.

“Without standardization and clear ownership, even modern systems degrade,” he says. “At our organization, we’ve learned that remediation isn’t just about cleaning historical data. It’s about instituting disciplined processes that prevent decay going forward: clear data ownership, standardized workflows, and governance embedded into daily operations.”

That said, not all AI initiatives are blocked by imperfect data, Pippadipally says. “There are smaller, well-bounded use cases, such as document summarization, drafting assistance, anomaly flagging, or reconciliation support, that can deliver value with human-in-the-loop verification,” he says. “These contained applications allow organizations to build AI maturity while foundational data improvements are under way.”

A mounting problem that requires a fast fix

A widespread problem, data debt at most organizations has grown organically over decades. Beyond the ever-increasing emphasis on data collection, companies have accumulated data debt through years of mergers and acquisitions, as well as through the deployment of new systems and services, whether enterprisewide or by individual departments.

“Systems were layered in response to immediate needs, acquisitions, regulatory requirements, or departmental preferences,” Pippadipally says. “Over time, inconsistent processes and standards lead to fragmented data environments.”

Moreover, data management inefficiencies have historically been addressed with manual work-arounds, Pippadipally says. “Teams reconciled reports manually,” he says. “Analysts compensated for inconsistent definitions. But AI reduces tolerance for ambiguity. When automated systems operate at scale, inconsistencies multiply rather than average out.”

It’s vital to address this now because AI initiatives are moving faster than process maturity, and the sense of urgency is clear.

“If organizations don’t institutionalize process discipline and standardization, they risk automating chaos instead of improving outcomes,” Pippadipally says. “The issue is not simply poor data; it is the absence of sustained governance to keep data reliable over time.”

For many enterprises, data debt can stay hidden while they are conducting traditional business intelligence or one-off analytics, says Juan Nassif, regional CTO at software development provider BairesDev.

“AI is different; it’s far less forgiving and it quickly exposes duplicates, inconsistent definitions, missing context, and ‘mystery fields’ with unclear lineage,” Nassif says. “When you scale beyond pilots, those issues show up as model underperformance, higher iteration cycles, and rising operational costs. It’s absolutely a concern for us, too, and we treat it as a prerequisite for scaling AI responsibly.”

If data is incomplete, inconsistent, or duplicated, the output from AI models becomes unreliable. “That can mean wrong answers, poor recommendations, or automations that break at the worst time,” Nassif says. “Teams end up spending most of their time wrangling data, reworking pipelines, and compensating for poor inputs with repeated tuning and exceptions.”

Some form of data debt is present in every sector, and in virtually all sizes of organizations.

“I witness the consequences of data debt in my daily work with schools in the UK every single week,” says Mark Friend, director of Classroom365, which advises educational institutions on technology, architecture and strategy.

“Most people assume that when they purchase the latest AI tool, all their problems will be solved no matter how messy the foundation underneath the hood,” Friend says. “My experience with this is that even the most expensive software is useless if the input is not reliable.” Data debt is “a fundamental risk to institutional stability,” he says.

Tips for effective data debt remediation

Enterprise-wide data debt remediation can be a significant, costly undertaking that involves multiple aspects of the business. It’s not just a technology issue, but a discipline issue as well. It requires cleaning up historical data as well as strengthening process governance to keep from repeating the mistakes or poor practices of the past.

Because of this, building and executing an effective strategy requires an organized and thorough approach. Here are some tips from experts.

Get senior management and board-level sponsorship

Any major IT initiative typically needs buy-in from senior business executives and even boards, particularly if it involves a large, global enterprise. Data debt remediation is no different. There is significant financial risk if remediation does not have the blessing and full backing of senior executives and board members.

Explaining the potential ramifications is a good way to bring attention to the need for remediation. “Make data debt visible and tie it to business risk,” Nassif says. “Data debt won’t get prioritized until it’s linked to AI failure rates, rising costs, and compliance exposure.”

Data debt is now a board-level risk, says Adrian Lawrence, founder of executive recruitment firm NED Capital, who advises boards and finance leaders on enterprise data governance, reporting integrity, and AI readiness.

“I see the pressure mounting with boards linking their AI investment to productivity and profitability objectives, but disjointed financial, sales, and operations data severely undermine model accuracy,” Lawrence says. “They lay bare the deficiencies [enterprise platform] upgrades and antiquated technology did not fully address.”

Success with debt remediation “demands executive sponsorship, disciplined data governance, and staged architecture cleanup treating data as an asset on the balance sheet,” Lawrence says.

Standardize core processes before scaling AI

To make the benefits of data debt remediation more long lasting, enterprises need to standardize their core business processes.

“Data quality reflects process quality,” Pippadipally says. “Leaders must align on standardized workflows, definitions, and system usage before expecting AI to operate consistently. Without process standardization, remediation efforts will be temporary.”

AI performs best in predictable environments, Pippadipally says, and standardization creates the stability AI requires.

BairesDev has embedded automated checks for data freshness, completeness, duplicates, and schema changes, so data quality issues get caught before they reach analytics or AI workflows, Nassif says.
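A minimal sketch of that kind of automated check, using pandas; the thresholds and column conventions are assumptions for illustration, not BairesDev’s actual implementation:

```python
import pandas as pd

def quality_report(df: pd.DataFrame, expected_cols: set,
                   ts_col: str, max_age_days: int = 7) -> dict:
    """Freshness, completeness, duplicate and schema checks that can
    run before data reaches analytics or AI workflows."""
    age_days = (pd.Timestamp.now() - pd.to_datetime(df[ts_col]).max()).days
    return {
        "stale": age_days > max_age_days,
        "missing_cols": expected_cols - set(df.columns),
        "null_rate": float(df.isna().mean().mean()),
        "duplicate_rows": int(df.duplicated().sum()),
    }
```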

Establish data ownership and ongoing governance

Another way to assure long-term benefits from a remediation effort is to have ongoing governance and accountability processes in place.

“Data remediation is not a one-time cleanup initiative,” Pippadipally says.

“Assigning clear ownership at the domain level, and establishing continuous monitoring, prevents data from degrading again.”

This is important, because governance ensures sustainability. “Without discipline, organizations reaccumulate data debt even after cleanup efforts,” Pippadipally says.

“We’ve been tightening dataset ownership and standardizing common business definitions, so teams aren’t training or prompting on conflicting ‘versions of truth,’” Nassif says. “We’ve been strengthening our cataloging and lineage practices, so teams can trace where data comes from, how it transforms, and who can use it — critical for both trust and governance.”
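At its simplest, a catalog entry that captures ownership, lineage and approved uses could look like the hypothetical sketch below; the fields and values are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CatalogEntry:
    dataset: str
    owner: str             # an accountable domain owner, not a team alias
    source: str            # where the data originates
    transforms: tuple      # ordered steps applied on the way in
    approved_uses: tuple   # who may train or prompt on it

entry = CatalogEntry(
    dataset="policy_master",
    owner="underwriting-data",
    source="policy_admin_system",
    transforms=("dedupe", "standardize_definitions"),
    approved_uses=("reporting", "ai_training"),
)
```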

The biggest shift is mindset. “We don’t treat data remediation as a one-time cleanup,” Nassif says. “We treat it as ongoing engineering with guardrails that prevent debt from coming right back.”

Prioritize high-value, contained AI use cases

While large data modernization initiatives progress within an organization, CIOs can deploy AI in tightly scoped areas where outputs are verifiable and human oversight is straightforward, Pippadipally says.


“Examples include drafting support, controlled reconciliations, workflow triage, or anomaly flagging,” Pippadipally says. “This approach builds organizational confidence and demonstrates ROI without overexposing the enterprise to data risk.”


Clean up storage

When it comes to data storage practices, there’s no doubt that organizations need to clean up their act. Poor practices lead to poor data quality, which could impact AI-driven projects.

“Schools are often very good at storing data like [in] an attic where they just keep throwing boxes without looking inside,” Friend says. “Anyone who has lived through a technology refresh knows that messy storage is a massive financial burden.”

Decades of bad collection practices “have created a technical rot that we can no longer ignore,” Friend says. “You might think that your legacy storage is harmless, but it actually places a massive financial burden in the form of rising operational costs,” and can negatively impact AI initiatives.

Micro and macro agents: The emerging architecture of the agentic enterprise

Artificial intelligence is entering a new phase. For the past decade, enterprises have focused primarily on predictive analytics and automation — using machine learning models to classify data, detect patterns and improve decision making. Today, a new paradigm is emerging: Agentic AI, systems capable of autonomously executing tasks and coordinating complex workflows.

Yet despite the rapid growth of AI agents, the term itself is often used loosely. Many organizations describe any AI-powered automation as an “agent,” even when it performs only a single function. As enterprises move toward large-scale deployment of autonomous systems, a clearer framework is needed to understand how these systems will be structured.

One useful way to think about the emerging architecture is through the distinction between micro agents and macro agents — two complementary layers that together form the foundation of the agentic enterprise.

The rise of micro agents

Most AI systems being deployed today can be best described as micro agents.

Micro agents are specialized AI systems designed to perform narrow, well-defined tasks within a workflow. They typically operate within existing applications and platforms, augmenting specific functions rather than managing entire processes.

Examples of micro agents are increasingly common across industries:

  • A document extraction agent that reads contracts or insurance policies
  • A fraud detection agent that analyzes transactional anomalies
  • A summarization agent that condenses large volumes of text
  • A classification agent that categorizes customer requests
  • A risk scoring agent that evaluates underwriting inputs

These agents are powerful because they combine machine learning models, large language models and automation tools to complete tasks that previously required human intervention.

In many ways, micro agents resemble AI-powered microservices. Each is optimized for a specific capability and integrated into a broader digital workflow.
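That analogy can be made concrete. Here is a minimal Python sketch of a micro-agent interface with one illustrative implementation; the names are hypothetical and the model call is stubbed out:

```python
from typing import Protocol

class MicroAgent(Protocol):
    """One narrow, well-defined task: the AI analogue of a microservice."""
    name: str
    def run(self, payload: dict) -> dict: ...

class SummarizationAgent:
    name = "summarizer"

    def run(self, payload: dict) -> dict:
        # Stand-in for an LLM call; a real agent would invoke a model here.
        text = payload["text"]
        return {"summary": text[:200]}
```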

However, micro agents have an inherent limitation: They operate at the task level, not the workflow level.

The emergence of macro agents

The next stage in enterprise AI will be defined by the rise of macro agents.

Macro agents operate at a higher level of abstraction. Rather than performing a single task, they coordinate multiple micro agents to complete an end-to-end business process.

Macro agents are, therefore, goal-oriented systems. Their objective is not simply to perform an activity but to deliver an outcome.

This allows macro agents to integrate with systems that require real-time decisions and dynamic engagement.

Consider a typical insurance claims process. Traditionally, this workflow involves numerous steps:

  • First notice of loss intake
  • Document analysis
  • Damage assessment
  • Fraud detection
  • Coverage validation
  • Payment authorization

A macro agent could orchestrate each of these steps by coordinating specialized micro agents responsible for individual tasks. The macro agent would manage the workflow, evaluate outcomes and ensure the process is completed successfully.

This orchestration capability fundamentally changes the role of AI in enterprises. Instead of acting as a set of isolated tools, AI begins to function more like a coordinated digital workforce.
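As a rough sketch of the pattern, reusing the micro-agent interface above: the macro agent sequences the steps, delegates each to a specialized micro agent and checks outcomes rather than doing the work itself. Step names and the escalation convention are illustrative assumptions.

```python
def run_claim(claim: dict, agents: dict) -> dict:
    """A macro agent as orchestrator: coordinate micro agents toward
    an outcome instead of performing any single task."""
    steps = ["intake", "documents", "damage", "fraud", "coverage", "payment"]
    state = dict(claim)
    for step in steps:
        result = agents[step].run(state)   # delegate to a micro agent
        if result.get("escalate"):         # outcome check, not task work
            state["status"] = f"human_review:{step}"
            return state
        state.update(result)
    state["status"] = "settled"
    return state
```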

The key point is that macro agents are outcome-based, which is what businesses ultimately want.

The need for governance: Meta agents

As organizations deploy networks of interacting agents, another challenge quickly emerges: Governance.

The struggle for good AI governance is real. Many organizations deploying AI recognize the need for guardrails, but few have figured out how to build a mature governance system.

Autonomous systems that make decisions, coordinate tasks and execute actions must be monitored carefully to ensure they stay compliant, secure and aligned with business objectives.

This creates the need for a third layer in the agentic architecture: Meta agents.

Meta agents oversee and monitor other agents. Their responsibilities may include:

  • Monitoring risk and model behavior
  • Validating regulatory compliance
  • Auditing decision logic
  • Managing cost and resource consumption
  • Escalating decisions to human operators when necessary

In essence, meta agents serve as the governance layer of the agentic enterprise, ensuring that autonomy does not come at the expense of control.
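A hedged sketch of that layer, continuing the same illustrative interface: the meta agent wraps calls to other agents, enforces a resource limit, keeps an audit trail and escalates to a human when a threshold is crossed.

```python
class MetaAgent:
    """Governance layer: observe other agents, enforce limits,
    escalate to a human when thresholds are crossed."""

    def __init__(self, cost_limit: float):
        self.cost_limit = cost_limit
        self.spend = 0.0
        self.log = []   # audit trail of supervision decisions

    def supervise(self, agent, payload: dict, est_cost: float) -> dict:
        if self.spend + est_cost > self.cost_limit:
            self.log.append((agent.name, "blocked: cost limit"))
            return {"escalate": True, "reason": "cost_limit"}
        result = agent.run(payload)
        self.spend += est_cost
        self.log.append((agent.name, "ok"))
        return result
```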

The need for governance is critical, and meta agents will be central to balancing governance with innovation in the age of AI. According to Ian Ruffle, head of data and insight at UK breakdown specialist RAC, “Success is about having the right relationships and never trying to sweep issues under the carpet.”

The agentic enterprise stack

Together, these layers form what can be described as the agentic enterprise stack:

  • Meta agents: Governance and oversight. Monitoring, compliance and risk management across agent systems.
  • Macro agents: Workflow intelligence. Coordination of multi-step processes and delivery of business outcomes.
  • Micro agents: Task execution. Specialized systems responsible for discrete capabilities and actions.

This layered architecture reflects how large-scale AI systems will likely evolve. Instead of deploying isolated tools, enterprises will build interconnected ecosystems of agents, each operating at a different level of responsibility.

This framework could move today’s ERP platforms from systems of record to a new generation of systems of intelligence.

Where most companies are today

Despite growing interest in agentic AI, most organizations remain in the micro-agent stage.

Many AI initiatives focus on improving individual tasks — automating document processing, generating summaries, or assisting customer service representatives. These use cases deliver meaningful productivity gains, but they represent only the early phase of the agentic transformation.

The real shift will occur when enterprises begin to deploy macro agents capable of managing entire workflows, coordinating dozens of micro agents in the background.

At that point, AI moves beyond augmentation and begins to function as an operating system for work itself.

Implications across industries

The emergence of agentic architectures will have profound implications across industries.

In financial services and insurance, macro agents could manage complex processes such as underwriting decisions, claims resolution and regulatory reporting.

In healthcare, macro agents may coordinate patient intake, diagnosis support and care management workflows.

In manufacturing and supply chains, agent systems could orchestrate procurement, logistics and production planning.

Across sectors, the defining shift will be the transition from AI tools that assist humans to AI systems that manage workflows autonomously while remaining governed by human oversight.

From automation to autonomy

The evolution from micro agents to macro agents represents more than a technological upgrade. It signals a fundamental shift in how organizations think about work.

Digital transformation modernized technology, while intelligent transformation modernizes the enterprise itself.

Ultimately, success will not be determined by who can showcase the most impressive agent, but by who can develop the most trustworthy agentic ecosystem — one that is secure by design, outcome-oriented and embraced by employees who feel empowered rather than displaced.

For decades, enterprise technology has focused on improving the efficiency of human tasks. Agentic systems instead aim to restructure how work itself is executed, distributing responsibilities across networks of autonomous systems.

In this emerging model, micro agents act as the specialized workers, macro agents serve as workflow managers and meta agents provide the governance and oversight required for responsible autonomy.

This approach moves organizations from a model where humans initiate AI agents to one where AI initiates AI agents, sometimes with a human overseeing the outcome.

Organizations that understand and design for this layered architecture, and are willing to redesign workflows and roles, will be best positioned to build the agentic enterprises of the future. Adopting this architecture is what turns value creation into value realization.
