Traditional enterprise data architectures rely on moving data into centralized lakes or warehouses, which creates complex pipelines and consistency issues. This article proposes a paradigm shift: Agent-centric data architecture. Instead of consolidating data, intelligent agents bring processing and analysis directly to the source systems.
The Three Pillars
- Declarative Modeling: Systems define what data is needed; agents figure out how to retrieve it.
- Robust Governance: Metadata and shared semantics ensure traceability and control directly within operational systems.
- Data Fabresh Layer: A governance layer that regulates contracts and policies rather than storing data.
By prioritizing governed intelligence over storage, this model eliminates the rigid layers of traditional architecture. It offers a real-time, simplified infrastructure suited for a distributed, AI-driven world – evolving beyond centralized repositories to focus on data at its point of origin.
“And Yet It Moves”
For many years, data architecture in organizations has followed a fairly stable logic: extracting information from operational systems, transferring it, and concentrating it on platforms designed for analysis. Over time, this approach has taken different forms – first data warehouses, then data lakes, and more recently, lakehouses – but the fundamental premise has remained unchanged. All these models assume that, in order to analyze data, it must be extracted from the systems where it is generated and transferred to a separate environment optimized for its use.
However, recent advances in computing power, virtualization, and distributed processing challenge this basic idea: What if it were no longer necessary to move data in order to analyze it?
The exponential growth in computing power, the maturity of distributed architectures, and the emergence of intelligent agents capable of orchestrating data processes make it possible to envision a new paradigm: agentic data architecture.
In this model, data does not move. What moves is intelligence.
The End of the “Copy Paradigm”
Moving data has always been costly and problematic. Every pipeline, every ETL, and every replication introduces additional complexity:
- Unnecessary latency
- Consistency issues
- Duplicate storage costs
- Security risks
- Deterioration of data quality
Furthermore, each copy raises an inevitable new question: Which is the correct version of the data?
Traditional architecture attempts to solve these problems by centralizing data. But the result is often an ecosystem riddled with intermediate layers:
- Data warehouse
- Data lake
- Data marts, cubes
- Operational replications
- ETL/ELT pipelines
Paradoxically, the more we try to consolidate data, the more complex the architecture becomes.
Agent-based architecture proposes a radically different approach:
Data must remain where it is generated: in the operational systems.
From Data Transfer to Computing Power Transfer
The key shift is based on a simple technological premise: Today, it is cheaper to move the processing than to move the data.
Instead of building large centralized data warehouses, an agent-based architecture allows intelligent data agents to perform analytical processes directly on the source systems.
These agents can:
- Access operational data in real time (with sentinels monitoring for anomalies)
- Apply quality rules at the source (within processes)
- Perform on-demand transformations (minimal if the data is of high quality)
- Orchestrate federated queries across multiple systems (including in a virtualized manner, with queries moving instead of data)
- Generate dynamic data products (in a secure, governed-by-design environment)
The result is an architecture where analysis comes to the data, rather than bringing the data to the analysis.
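To make the idea concrete, here is a minimal sketch of a federated query in which the query travels to each source system and only results leave it. The source systems, their schemas, and the `federated_query` helper are all illustrative assumptions, not a real API:

```python
# Hypothetical sketch: the query is evaluated inside each source system;
# only the combined result leaves them -- no data is copied to a central store.

def query_source(source, predicate):
    """Run a filter directly at a source system; only matching rows leave it."""
    return [row for row in source["rows"] if predicate(row)]

def federated_query(sources, predicate):
    """Send the same query to every relevant source and merge the answers."""
    results = []
    for source in sources:
        results.extend(query_source(source, predicate))
    return results

# Two operational systems, each keeping its data at the point of origin.
crm = {"name": "crm", "rows": [{"customer": "ACME", "churn_risk": 0.8}]}
erp = {"name": "erp", "rows": [{"customer": "Globex", "churn_risk": 0.2}]}

at_risk = federated_query([crm, erp], lambda r: r["churn_risk"] > 0.5)
```

In a real deployment the per-source filter would be pushed down to the system's own query engine; the point of the sketch is only that the predicate moves, not the rows.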
The True Cornerstone: Data Governance
This model is feasible only if there is an extremely robust level of data governance.
Without governance, direct access to operational systems would be chaotic.
With governance, it becomes an extremely powerful architecture.
The governance framework must ensure:
1. Data quality at the source: Quality rules are not applied after ETL, but within operational processes.
- Data is created already validated.
2. Shared semantics: A business metamodel defines:
- Entities
- Definitions
- Business rules
- Relationships between data
This allows agents to interpret information correctly.
3. Access control: Agents must operate in compliance with strict policies regarding:
- Security
- Privacy
- Traceability
- Regulatory compliance
4. Complete audit trail: Every query or result generated by an agent maintains a complete audit trail back to the operational systems.
There are no intermediate layers that hide the origin of the data. The data itself is not exposed; only queries travel through the systems, faster and more securely.
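A minimal sketch of what the audit-trail requirement could look like in practice: every agent query is wrapped so that its result records which agent touched which operational system. The `audited_query` helper and the record fields are assumptions for illustration:

```python
# Illustrative sketch: each agent query records its provenance back to the
# operational system it touched. Names and record fields are hypothetical.
from datetime import datetime, timezone

audit_log = []

def audited_query(agent_id, system, query_fn):
    """Execute a query against a source system and record who asked what."""
    result = query_fn(system["rows"])
    audit_log.append({
        "agent": agent_id,
        "system": system["name"],
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rows_returned": len(result),
    })
    return result

billing = {"name": "billing", "rows": [{"invoice": 1, "paid": True},
                                       {"invoice": 2, "paid": False}]}
unpaid = audited_query("analytical-01", billing,
                       lambda rows: [r for r in rows if not r["paid"]])
```

Because the trail is written at query time, traceability does not depend on any intermediate storage layer.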
The Role of Data Agents
In this architecture, the key players are what we can call “data agents.”
A data agent is an autonomous component capable of:
- Understanding the semantic context of the data from a transactional perspective
- Identifying relevant sources
- Executing distributed queries (using virtualization and caching engines to avoid overloading the operational systems)
- Applying business rules
- Generating analytical results instantly and without ETL
We can envision several types of agents:
- Discovery agents: Identify where relevant information is located. (We already have these.)
- Quality agents: Verify and correct anomalies in real time. (They are already here.)
- Analytical agents: Generate metrics, indicators, or models directly from operational data.
- Governance agents: Ensure compliance with data usage policies.
This ecosystem creates a network of distributed intelligence that operates on existing systems.
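The four agent roles above can be sketched as plain functions over an illustrative catalog. Everything here, the catalog structure, the field names, the anomaly rule, is an assumption chosen to show how the roles divide responsibility, not a real framework:

```python
# Hypothetical sketch of the four agent roles over one operational dataset.

catalog = {
    "orders": {"system": "erp", "tags": ["sales"],
               "rows": [{"id": 1, "amount": 120.0}, {"id": 2, "amount": -5.0}]},
}

def discovery_agent(catalog, tag):
    """Identify where relevant information is located."""
    return [name for name, meta in catalog.items() if tag in meta["tags"]]

def quality_agent(rows):
    """Flag anomalies in real time (here: negative order amounts)."""
    return [r for r in rows if r["amount"] < 0]

def analytical_agent(rows):
    """Generate a metric directly from operational data."""
    return sum(r["amount"] for r in rows if r["amount"] >= 0)

def governance_agent(dataset, allowed_systems):
    """Check that the dataset may be used under current policy."""
    return dataset["system"] in allowed_systems

found = discovery_agent(catalog, "sales")
anomalies = quality_agent(catalog["orders"]["rows"])
revenue = analytical_agent(catalog["orders"]["rows"])
permitted = governance_agent(catalog["orders"], {"erp"})
```

The division of labor is the point: discovery finds the data, quality and analytics operate on it in place, and governance gates every step.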
A Radical Change in Architecture
Agent-based data architecture implies a profound conceptual shift.
Traditional architecture:
- Move data
- Store data
- Transform data
- Consume data
Agent-based architecture:
- Govern data
- Discover data
- Perform data processing
- Generate knowledge on demand
The difference is fundamental.
Instead of building massive archives of historical data, the organization creates an intelligent network of governed access and processing.
Benefits of the Model
When implemented correctly, this approach offers significant benefits.
- Elimination of duplicates: A single set of data in the source system.
- Real-time information: No ETL delays or replicas.
- Reduced infrastructure costs: Less storage space, fewer pipelines, less maintenance.
- Greater traceability: Every result can be traced back to the process that generated the data.
- Better quality: Data is validated at the time of creation.
- No MDM required: Governed processes do not fragment information; every record is already a golden record.
From the Layered Model to Agent-Based Architecture
This approach also breaks with another of the classic pillars of data architecture: the separation into conceptual, logical, and physical layers. For decades, enterprise architecture has sought to structure information systems through this layering, in which each layer abstracts the next. However, in an agentic data architecture, this distinction loses its relevance.
Agents operate directly on the systems where the data resides, using metadata, semantics, and governance rules to interpret information in real time. Conceptual understanding, business logic, and physical execution are no longer separated into rigid layers, but become part of a single dynamic system governed by metadata and declarative policies. In place of a static, layered architecture, a living, distributed, and contextual architecture emerges, in which agents directly link intention, meaning, and execution on operational data.
The Challenges
This model is not without challenges. The main one is cultural and organizational in nature.
Many companies have built their data strategy around large centralized repositories. The transition to a distributed architecture requires:
- Maturity in data governance
- Clear definition of domains
- Robust semantic models
- Advanced automation
In addition, operational systems must be able to support additional analytical workloads, which is not always the case in legacy architectures. However, it is possible to virtualize them and make intelligent use of caching in virtualization engines.
The Future of Data Architecture
Agent-based architecture does not necessarily imply the immediate disappearance of data warehouses or data lakes. Many organizations will continue to use them for years to come, and some will need to materialize data, if only for regulatory reasons.
But the trend is clear. As distributed computing, data virtualization, and intelligent agents evolve, the model of massive data movement will lose its relevance.
The future may not lie in building ever-larger repositories, but in creating systems capable of processing data where it is generated.
In that world, the question will no longer be, “Where do we store the data?” but rather, “How do we govern the intelligence that operates on it?”
And this is, precisely, the promise of agentic data architecture.
The Role of the Declarative Approach
For this model to work, we must also move away from the classic procedural paradigm of data engineering.
For years, data architectures have been built by defining how processes should be executed: step-by-step ETL pipelines, scripts, chained transformations, and complex orchestrations.
In an agent-based architecture, the approach shifts toward a declarative model.
Instead of specifying how to move and transform data, what is defined is:
- What data is needed
- What rules it must satisfy
- What quality is acceptable
- Which governance policies must be applied
In other words, you declare the desired state of the data, and the system’s agents automatically figure out how to achieve it.
This is a profound shift: the data engineer stops programming pipelines and instead defines contracts, rules, and objectives, working at the business level.
Data agents, supported by metadata and semantic models, perform the actions necessary to fulfill these declarations. As always, the success of the model lies in the metadata.
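Such a declaration might look like the sketch below: the engineer states what the data must satisfy, and an agent-side function decides how to verify it. The contract format, field names, and quality threshold are all hypothetical:

```python
# Sketch of the declarative shift: the desired state of the data is declared
# as a contract; the "how" stays inside the fulfilling agent.
# The contract structure and rule format are illustrative assumptions.

contract = {
    "dataset": "customers",
    "rules": [
        ("email", lambda v: v is not None and "@" in v),
        ("age", lambda v: v is None or v >= 0),
    ],
    "acceptable_quality": 0.9,  # minimum share of rows passing all rules
}

def fulfill(contract, rows):
    """Agent-side: evaluate the declared rules and report whether the
    desired state holds. No pipeline steps are specified by the declarer."""
    passing = [r for r in rows
               if all(check(r.get(field)) for field, check in contract["rules"])]
    quality = len(passing) / len(rows) if rows else 1.0
    return {"quality": quality,
            "meets_contract": quality >= contract["acceptable_quality"]}

rows = [{"email": "a@x.com", "age": 30}, {"email": None, "age": 25}]
report = fulfill(contract, rows)
```

The engineer never says where the rows come from or how they are checked; the contract only declares what "good" means.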
The Role of Data Fabresh: Guardians of Processes, Not Data
In this context, an interesting evolution of the traditional concept of the data fabric emerges: what Daniel Torbellino and I call Data Fabresh.
While many data architectures seek to build a layer that controls or centralizes data, the Data Fabresh approach is different.
Data Fabresh does not store data. It governs processes.
Its primary function is to act as a layer of orchestration, governance, and control that ensures data agents and processes are executed correctly on operational systems.
Instead of becoming a new repository, Data Fabresh acts as:
- A custodian of quality rules
- Guarantor of data contracts
- An orchestrator of declarative processes
- Controller of provenance and traceability
- Manager of access and compliance policies
In this way, the focus is no longer on where the data resides, but on how the processes that use it are governed.
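The roles listed above can be condensed into a small sketch: a layer that stores no data, only checks that a process complies with its registered policy before letting it run, and records every decision. The `Fabresh` class and its interface are assumptions for illustration only:

```python
# Hypothetical sketch of the Data Fabresh idea: govern processes, not data.
# It holds policies and a provenance trace, never the data itself.

class Fabresh:
    def __init__(self):
        self.policies = {}   # process name -> policy check
        self.trace = []      # provenance of every governed execution

    def register(self, process_name, policy):
        """Record the policy a process must satisfy to run."""
        self.policies[process_name] = policy

    def run(self, process_name, process_fn, context):
        """Execute a process only if its policy allows it; log the decision."""
        policy = self.policies.get(process_name)
        allowed = policy(context) if policy else False
        self.trace.append({"process": process_name, "allowed": allowed})
        return process_fn(context) if allowed else None

fabresh = Fabresh()
fabresh.register("monthly_report", lambda ctx: ctx["role"] == "analyst")

result = fabresh.run("monthly_report",
                     lambda ctx: f"report for {ctx['role']}",
                     {"role": "analyst"})
denied = fabresh.run("monthly_report", lambda ctx: "x", {"role": "guest"})
```

Note that the layer sees only contexts and decisions; the data a governed process touches never passes through it.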
Conclusion
Agentic data architecture represents a profound paradigm shift: moving from the movement and replication of data to its processing directly where it is generated, using a declarative approach governed by intelligent agents. This model allows us to work with operational data with guaranteed quality, eliminates duplication, reduces latency, and transforms the way we understand architecture, integrating the conceptual, logical, and physical aspects into a dynamic, distributed system.

