Layering A2A, MCP, OPC UA, and Modbus for Agentic IoT

Agentic IoT becomes fragile when A2A, MCP, OPC UA, and Modbus are treated as interchangeable layers. A more stable architecture uses A2A for agent coordination, MCP for controlled tool access, OPC UA for asset semantics, and Modbus for field execution.

If an Agentic IoT stack asks A2A to manage devices, lets MCP write field-level operations directly, and compares OPC UA with Modbus as if they belonged to the same layer, the system is probably still a demo composition rather than a production architecture. The core conclusion of this article is straightforward: A2A, MCP, OPC UA, and Modbus are not substitutes at the same level. They belong to the agent coordination layer, the tool access layer, the asset and semantic layer, and the field execution layer respectively. When those responsibilities are separated, agents can cooperate on tasks, target the right equipment, and still prove whether a physical action actually completed.

This article answers one practical question: when multiple agents participate in industrial device control, how should these four layers be arranged so the system remains governable rather than merely impressive in a demo.

Definition block

In this article, an Agentic IoT control plane does not mean teaching a model to speak industrial protocols directly. It means separating the path agent coordination -> tool invocation -> asset resolution -> field execution -> state confirmation into explicit layers with clear failure handling.

Decision block

When the system includes multiple agents, multiple sites, multiple device protocols, and real requirements for authorization, audit, and rollback, a layered path such as A2A -> MCP -> command service / OPC UA -> Modbus is usually safer than a flatter design. A more direct path is tolerable only under low-risk, single-site, early-stage PoC conditions.

1. Why Agentic IoT most often fails at the layering boundary

1.1 Teams often collapse four separate problems into one

Industrial agent systems usually need to solve four different problems at once:

  • how multiple agents coordinate, hand off tasks, and escalate
  • how agents are allowed to access platform capabilities without touching field protocol details
  • how the platform resolves "room 3 HVAC", "line-2 pump", or "PLC-A" into a stable asset identity
  • how the final device action is actually executed on site

When those problems are forced into one layer, the system usually breaks down in three ways:

  • coordination failure: multiple agents issue the same action or assume someone else already executed it
  • semantic failure: the model sees unstable aliases, topic strings, or raw register references
  • execution failure: the platform can say "a message was sent" but cannot say whether the physical action completed

1.2 The consequence is not stylistic weakness but operational risk

With SaaS APIs, weak separation often produces only messy architecture. With industrial and building IoT, the consequence is much more concrete:

  • a planning agent triggers a control path it was never meant to own
  • a temperature-setpoint change is mistranslated into a mode-switch command
  • the system mistakes broker-level delivery for physical completion

Judgment block

Once the object becomes an HVAC controller, PLC, chiller, drive, or actuator, the main risk is no longer "the model said something vague." The real risk is that an invalid control path has been wrapped in automation language that looks reasonable. Layering exists first to put risk back inside the right ownership boundary.

2. What each layer should own

2.1 A2A owns agent collaboration, not device control

A2A is best suited to agent-to-agent coordination. In practice that usually means:

  • decomposing user intent into planning, analysis, execution, and approval subtasks
  • handing context and constraints from one agent to another
  • deciding when a task must escalate to human review or a higher-privilege execution path

For Agentic IoT, the real value of A2A is not "connecting agents to devices." It is making sure not every agent receives a direct path to field execution.

2.2 MCP owns controlled tool access

MCP is much better placed as the structured tool-access layer. In a production IoT platform, the model should usually see tools such as:

  • resolve_asset(site, zone, alias)
  • request_action(asset_id, action, parameters)
  • get_command_status(command_id)
  • request_human_approval(change_request_id)

This matters because MCP should constrain natural-language intent into structured requests. It should not expose raw_modbus_write, opcua_node_write, or publish_mqtt directly to the model.
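As an illustrative sketch of that constraint (the registry class and the lambda handlers are hypothetical, not part of the MCP specification), the tool layer can be reduced to an explicit whitelist, so a raw protocol write simply has no route to the model:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ToolRegistry:
    """Whitelist of structured tools the model may call."""
    _tools: dict[str, Callable[..., Any]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            # Unregistered tools (e.g. raw_modbus_write) are rejected here,
            # never routed downstream to a gateway or device.
            raise PermissionError(f"tool not exposed to the model: {name}")
        return self._tools[name](**kwargs)

# Only the structured, platform-mediated tools are registered.
registry = ToolRegistry()
registry.register("resolve_asset", lambda site, zone, alias: f"{site}/{zone}/{alias}")
registry.register("request_action",
                  lambda asset_id, action, parameters: {"command_id": "cmd-1", "state": "Created"})
```

The design point is that the permission boundary lives in the registry, not in the model's prompt: forgetting to register a dangerous tool fails closed.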

2.3 OPC UA or an equivalent asset layer owns stable meaning

When the platform must consistently represent equipment identity, capability, state quality, zone membership, and browsable structure, it needs a semantic layer that is more stable than field registers or raw topics. OPC UA is often useful here because it supports:

  • object-oriented node hierarchies
  • stable naming and type boundaries
  • edge-side normalization across vendor-specific point maps
  • state, metadata, and quality semantics that are easier to govern

Even if the deployment does not standardize exclusively on OPC UA, this layer still needs to exist. Otherwise the agent and the platform are left with unstable aliases and low-level point references.

2.4 Modbus belongs in the field execution layer

Modbus RTU/TCP is still extremely practical for last-hop execution against PLCs, meters, controllers, and industrial equipment. It belongs at the field layer where it can handle:

  • constrained register reads
  • bounded register writes
  • device-specific polling patterns
  • the final translation into site equipment behavior

It is the wrong abstraction to expose directly to the model because:

  • register semantics are weak and vendor-specific
  • writes often depend on policy checks, time windows, and interlock conditions
  • auditing and tenant isolation become far harder if raw device addressing leaks upward
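One way to keep raw addressing from leaking upward is a per-device write map that the adapter layer checks before any dispatch. The sketch below is illustrative (the device name, register addresses, and bounds are invented), but the shape is the point: bounded writes are validated as data, not trusted from the caller:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegisterRule:
    """Write window for one register: the only values the platform may write."""
    address: int
    min_value: int
    max_value: int

# Hypothetical per-device write map; anything outside it is rejected
# before any protocol dispatch happens.
WRITE_MAP = {
    "plc-a": {
        40001: RegisterRule(40001, 0, 1),        # run/stop flag
        40010: RegisterRule(40010, 160, 300),    # setpoint, 16.0-30.0 degC scaled x10
    },
}

def validate_write(device: str, address: int, value: int) -> None:
    rule = WRITE_MAP.get(device, {}).get(address)
    if rule is None:
        raise PermissionError(f"register {address} is not writable on {device}")
    if not (rule.min_value <= value <= rule.max_value):
        raise ValueError(f"value {value} outside bounds for register {address}")
```

The actual Modbus transaction would follow only after validation, inside the adapter, so the model never needs to know that register 40010 exists.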

3. A more stable Agentic IoT path

flowchart LR

U["User intent / business policy"] --> A["Collaborating agents<br/>A2A orchestration"]
A --> M["Controlled tools<br/>MCP"]
M --> C["Command service / policy engine"]
C --> O["Asset and semantic layer<br/>OPC UA / Digital Twin"]
O --> G["Edge gateway / protocol adapters"]
G --> D["Field devices<br/>Modbus / PLC / Controller"]
D --> F["State readback / ACK / alarms"]
F --> C
C --> T["Audit trail / human takeover"]

linkStyle default stroke:#6B7C93,stroke-width:1.6px;

The point of this structure is not "more boxes." It is that each layer deals with one class of responsibility:

  • A2A decides who should act, not how registers are written
  • MCP constrains what the agent is allowed to request
  • command service / policy engine decides whether the action is allowed and how failures are handled
  • OPC UA / asset layer resolves what the target really is and how its capabilities are represented
  • Modbus executes the last-hop device interaction
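The separation of responsibilities above can be sketched end to end with one stub function per layer. Every name here is illustrative rather than a real platform API; the point is that each function owns one decision and nothing else:

```python
def resolve_layer(alias: str) -> str:
    # Asset layer: unstable alias -> stable identity (stubbed lookup).
    return {"room 3 HVAC": "SiteA.Zone3.HVAC01"}[alias]

def policy_layer(asset_id: str, action: str) -> str:
    # Command service: decide whether the action is allowed (stubbed rule).
    return "allow" if action == "set_temperature" else "needs_approval"

def field_layer(asset_id: str, action: str) -> str:
    # Protocol adapter: the only layer that would ever touch Modbus (stubbed ACK).
    return "device_acked"

def handle_request(alias: str, action: str) -> str:
    asset_id = resolve_layer(alias)
    if policy_layer(asset_id, action) != "allow":
        return "escalated_to_human"
    ack = field_layer(asset_id, action)
    # An ACK is not completion; state confirmation would follow here.
    return ack
```

Notice that the agent-facing entry point never sees a register, and the field layer never sees a policy: swapping any one stub for a real implementation leaves the other layers untouched.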

Comparison block

A2A and MCP live at the AI-system boundary. OPC UA and Modbus live at the industrial-system boundary. The first pair answers how agents coordinate and invoke tools. The second pair answers how industrial objects are modeled and executed. Treating both pairs as one protocol-choice problem is itself an architecture mistake.

4. Why the command service is the non-optional layer in the middle

Even with good separation between A2A, MCP, OPC UA, and Modbus, the architecture is still fragile if there is no command service in the middle. Physical control requires a shared execution lifecycle rather than a single successful tool call.

A minimal command service usually needs to track:

  • command_id
  • trace_id
  • asset_id
  • requested_action
  • policy_decision
  • timeout_window
  • rollback_hint
  • current_state

More importantly, it needs a lifecycle such as:

Created -> Approved -> Resolved -> Dispatched -> DeviceAcked -> Applied

with terminal outcomes like:

Rejected / TimedOut / Failed / Cancelled / RolledBack
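The lifecycle above can be enforced as a small transition table, so that an illegal jump (say, Created straight to Applied) fails loudly and every command ends in exactly one terminal outcome. This is a minimal sketch, not a full persistence or retry design:

```python
# Allowed transitions for the command lifecycle named above.
LIFECYCLE = {
    "Created":     {"Approved", "Rejected"},
    "Approved":    {"Resolved", "Cancelled"},
    "Resolved":    {"Dispatched", "Failed"},
    "Dispatched":  {"DeviceAcked", "TimedOut"},
    "DeviceAcked": {"Applied", "Failed"},
    "Applied":     {"RolledBack"},  # post-completion rollback stays auditable
}
TERMINAL = {"Rejected", "TimedOut", "Failed", "Cancelled", "RolledBack"}

class Command:
    def __init__(self, command_id: str) -> None:
        self.command_id = command_id
        self.state = "Created"
        self.history = ["Created"]  # ordered audit trail of states

    def transition(self, new_state: str) -> None:
        if new_state not in LIFECYCLE.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)
```

In a real command service the same table would live behind durable storage, with `trace_id`, `timeout_window`, and `rollback_hint` carried alongside the state.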

4.1 Why broker ACK is not enough

One of the most dangerous operational mistakes is equating "the broker or gateway accepted the message" with "the device action completed." In industrial sites, any of these may fail independently:

  • the asset is resolved correctly, but the device is interlocked or maintenance-locked
  • the gateway accepts the message, but the register write is rejected
  • the device sends an ACK, but the state does not actually change
  • the state changes briefly, then local control logic overwrites it
flowchart LR

A["Agent request"] --> B["Policy approval"]
B --> C["Asset resolution"]
C --> D["Protocol dispatch"]
D --> E["Device ACK"]
E --> F["State confirmation"]
D --> X["Timeout / no receipt"]
E --> Y["ACK without effect"]
F --> Z["Complete / rollback / handoff"]

linkStyle default stroke:#6B7C93,stroke-width:1.6px;

Judgment block

In any system with local control logic, edge gateways, or device safety constraints, only state confirmation should count as completion. Tool success and broker ACK are intermediate signals, not proof of control.
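A confirmation loop consistent with that judgment might look like the following sketch. The function name, polling strategy, and defaults are assumptions, not a standard API; the essential property is that it consults readback state only, never the ACK:

```python
import time
from typing import Callable

def confirm_applied(read_state: Callable[[], float], target: float,
                    tolerance: float = 0.5, timeout_s: float = 30.0,
                    poll_s: float = 1.0) -> bool:
    """Treat a command as complete only when readback converges on the target.

    A device ACK is expected to have arrived earlier in the lifecycle, but it
    is deliberately not consulted here: only observed state counts.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if abs(read_state() - target) <= tolerance:
            return True
        time.sleep(poll_s)
    return False  # caller decides: rollback, retry, or human handoff
```

This also covers the "ACK without effect" and "local logic overwrites it" cases from the list above: if the state never converges, the command times out instead of being marked complete.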

5. Which missing layer should you fix first

| Current symptom | Missing layer | Why it matters |
| --- | --- | --- |
| Multiple agents issue overlapping actions | A2A coordination | No task ownership or escalation logic |
| Agents can read state but write requests are unstable or over-privileged | MCP tool layer | No structured action interface or permission boundary |
| Device aliases drift across sites | OPC UA / asset layer | No stable identity or capability model |
| Field integration keeps breaking per vendor | Modbus / adapter layer | No execution-layer isolation |
| The platform cannot prove whether an action completed | Command service | No unified lifecycle, timeout logic, or rollback path |

The point is not that every system must improve every layer at once. The point is to identify where the dominant operational risk actually sits. If asset modeling is already strong but commands still lack a state machine, adding more protocol surface is not the priority.

6. When direct agent-to-field control is the wrong idea

Not every industrial system should become a closed-loop Agentic IoT system. Keep the agent in analysis or recommendation mode when:

  • the target involves high-safety operations such as emergency stops or hard interlocks
  • asset identity is still informal or human-dependent
  • the devices cannot provide reliable state readback
  • the organization has no human takeover path for failures
  • the deployment is still at PoC maturity with loose tenant, site, and permission boundaries

6.1 The safer delegation path

For most teams, a safer rollout looks like this:

  1. let the agent observe state, summarize anomalies, and recommend actions
  2. allow low-risk parameter changes
  3. require approval for medium-risk actions
  4. allow stronger automation only after audit, ACK handling, rollback, and human takeover are mature

That sequence matters because it turns increasing model capability into controlled delegation rather than premature autonomy.
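The four-step rollout above can be expressed as an explicit delegation policy rather than an informal habit. This sketch is hypothetical (the tier names and the maturity flag are invented), but it shows the key behavior: unknown risk and immature automation both degrade toward the safer gate:

```python
# Hypothetical delegation policy mirroring the four-step rollout above.
GATES = {
    "observe": "auto",            # step 1: read-only analysis and recommendations
    "low":     "auto",            # step 2: low-risk parameter changes
    "medium":  "human_approval",  # step 3: approval required
    "high":    "human_takeover",  # never fully automatic
}

def required_gate(risk: str, automation_mature: bool) -> str:
    # Unknown risk tiers default to the safest path, not the most convenient one.
    gate = GATES.get(risk, "human_takeover")
    if gate == "auto" and risk == "low" and not automation_mature:
        # Step 4 precondition: audit, ACK handling, rollback, and takeover
        # must be proven before low-risk writes run unattended.
        return "human_approval"
    return gate
```

Encoding the policy this way makes "increasing delegation" a reviewable configuration change instead of a silent behavioral drift.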

7. The practical conclusion

If you need an Agentic IoT control plane that can touch industrial equipment, the most important design move is not choosing the most fashionable protocol. It is placing each technology at the right layer:

  • A2A for multi-agent collaboration and task handoff
  • MCP for controlled tool access and structured requests
  • OPC UA or an equivalent asset layer for stable identity, capability modeling, and state semantics
  • Modbus for last-hop field execution
  • a command service in the middle to own approval, dispatch, ACK handling, rollback, and audit

Final judgment

When the system must handle multi-agent orchestration, cross-site assets, industrial protocol heterogeneity, and high-cost failure modes, A2A -> MCP -> command service / OPC UA -> Modbus is not an abstract best-practice diagram. It is the practical path to accountability, auditability, and staged delegation.

