If an Agentic IoT stack asks A2A to manage devices, lets MCP write field-level operations directly, and compares OPC UA with Modbus as if they belonged to the same layer, the system is probably still a demo composition rather than a production architecture. The core conclusion of this article is straightforward: A2A, MCP, OPC UA, and Modbus are not substitutes at the same level. They belong to the agent coordination layer, the tool access layer, the asset and semantic layer, and the field execution layer respectively. When those responsibilities are separated, agents can cooperate on tasks, target the right equipment, and still prove whether a physical action actually completed.
This article answers one practical question: when multiple agents participate in industrial device control, how should these four layers be arranged so that the system remains governable rather than merely impressive in a demo?
Definition block
In this article, an Agentic IoT control plane does not mean teaching a model to speak industrial protocols directly. It means separating the path agent coordination -> tool invocation -> asset resolution -> field execution -> state confirmation into explicit layers with clear failure handling.
Decision block
When the system includes multiple agents, multiple sites, multiple device protocols, and real requirements for authorization, audit, and rollback, a layered path such as A2A -> MCP -> command service / OPC UA -> Modbus is usually safer than a flatter design. A more direct path is only tolerable for low-risk, single-site, early-stage PoC conditions.
1. Why Agentic IoT most often fails at the layering boundary
1.1 Teams often collapse four separate problems into one
Industrial agent systems usually need to solve four different problems at once:
- how multiple agents coordinate, hand off tasks, and escalate
- how agents are allowed to access platform capabilities without touching field protocol details
- how the platform resolves "room 3 HVAC", "line-2 pump", or "PLC-A" into a stable asset identity
- how the final device action is actually executed on site
When those problems are forced into one layer, the system usually breaks down in three ways:
- coordination failure: multiple agents issue the same action or assume someone else already executed it
- semantic failure: the model sees unstable aliases, topic strings, or raw register references
- execution failure: the platform can say "a message was sent" but cannot say whether the physical action completed
1.2 The consequence is not stylistic weakness but operational risk
With SaaS APIs, weak separation often produces only messy architecture. With industrial and building IoT, the consequence is much more concrete:
- a planning agent triggers a control path it was never meant to own
- a temperature-setpoint change is mistranslated into a mode-switch command
- the system mistakes broker-level delivery for physical completion
Judgment block
Once the object becomes an HVAC controller, PLC, chiller, drive, or actuator, the main risk is no longer "the model said something vague." The real risk is that an invalid control path has been wrapped in automation language that looks reasonable. Layering exists first to put risk back inside the right ownership boundary.
2. What each layer should own
2.1 A2A owns agent collaboration, not device control
A2A is best suited to agent-to-agent coordination. In practice that usually means:
- decomposing user intent into planning, analysis, execution, and approval subtasks
- handing context and constraints from one agent to another
- deciding when a task must escalate to human review or a higher-privilege execution path
For Agentic IoT, the real value of A2A is not "connecting agents to devices." It is making sure not every agent receives a direct path to field execution.
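The idea can be sketched in a few lines of plain Python. This is an illustration of the ownership rule only, not an A2A implementation: the role names, the `ROLE_TOOLS` grants, the `TaskBoard` class, and `may_call` are all hypothetical.

```python
# Hypothetical role-to-tool grants; only the executor may request a device action.
ROLE_TOOLS = {
    "planner": frozenset({"resolve_asset", "get_command_status"}),
    "executor": frozenset({"resolve_asset", "request_action", "get_command_status"}),
    "approver": frozenset({"get_command_status", "request_human_approval"}),
}


class TaskBoard:
    """Tracks which agent owns a task so two agents cannot both execute it."""

    def __init__(self):
        self._owners = {}

    def claim(self, task_id, agent_id):
        # setdefault keeps the first claimant; a later claim by another agent returns False.
        return self._owners.setdefault(task_id, agent_id) == agent_id

    def handoff(self, task_id, from_agent, to_agent):
        # Only the current owner may pass the task on.
        if self._owners.get(task_id) != from_agent:
            raise PermissionError(f"{from_agent} does not own {task_id}")
        self._owners[task_id] = to_agent


def may_call(role, tool):
    """Return True only if the role has been granted the tool."""
    return tool in ROLE_TOOLS.get(role, frozenset())
```

In this sketch a planning agent can resolve assets and read command status, but it has to hand the task to an executor before any action request is possible.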
2.2 MCP owns controlled tool access
MCP is much better placed as the structured tool-access layer. In a production IoT platform, the model should usually see tools such as:
- resolve_asset(site, zone, alias)
- request_action(asset_id, action, parameters)
- get_command_status(command_id)
- request_human_approval(change_request_id)
This matters because MCP should constrain natural-language intent into structured requests. It should not expose raw_modbus_write, opcua_node_write, or publish_mqtt directly to the model.
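A minimal sketch of that constraint, written in plain Python and deliberately not tied to any MCP SDK. Only the `request_action(asset_id, action, parameters)` shape comes from the tool list above; the `ALLOWED_ACTIONS` table, its action names and bounds, the `ActionRequest` type, and `validate_request` are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical allow-list: action name -> parameter name -> validation rule.
ALLOWED_ACTIONS = {
    "set_temperature_setpoint": {"celsius": {"min": 16.0, "max": 30.0}},
    "set_fan_mode": {"mode": {"choices": ("auto", "low", "high")}},
}


@dataclass(frozen=True)
class ActionRequest:
    asset_id: str
    action: str
    parameters: dict = field(default_factory=dict)


def validate_request(asset_id, action, parameters):
    """Turn a model-supplied tool call into a structured request, or raise ValueError."""
    spec = ALLOWED_ACTIONS.get(action)
    if spec is None:
        # Raw protocol verbs such as raw_modbus_write are simply not in the allow-list.
        raise ValueError(f"action not exposed to agents: {action}")
    if set(parameters) != set(spec):
        raise ValueError(f"expected parameters {sorted(spec)}, got {sorted(parameters)}")
    for name, rule in spec.items():
        value = parameters[name]
        if "choices" in rule:
            if value not in rule["choices"]:
                raise ValueError(f"{name}={value!r} not one of {rule['choices']}")
        else:
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not rule["min"] <= value <= rule["max"]:
                raise ValueError(f"{name}={value!r} outside [{rule['min']}, {rule['max']}]")
    return ActionRequest(asset_id, action, dict(parameters))
```

The design choice is that the agent can only name an action and its bounded parameters. Register addresses, node IDs, and topics never appear in the request at all.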
2.3 OPC UA or an equivalent asset layer owns stable meaning
When the platform must consistently represent equipment identity, capability, state quality, zone membership, and browsable structure, it needs a semantic layer that is more stable than field registers or raw topics. OPC UA is often useful here because it supports:
- object-oriented node hierarchies
- stable naming and type boundaries
- edge-side normalization across vendor-specific point maps
- state, metadata, and quality semantics that are easier to govern
Even if the deployment does not standardize exclusively on OPC UA, this layer still needs to exist. Otherwise the agent and the platform are left with unstable aliases and low-level point references.
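As a rough illustration of what "stable identity" buys, the sketch below resolves human aliases to one asset record. It is an in-memory stand-in, not an OPC UA client: the `Asset` type, the registry contents, and identifiers such as `ahu-0003` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    asset_id: str        # stable identity, independent of registers or topics
    asset_type: str
    capabilities: tuple  # actions the asset is modeled as supporting


# Hypothetical registry keyed by (site, zone, normalized alias).
_ALIASES = {
    ("plant-1", "room-3", "hvac"): "ahu-0003",
    ("plant-1", "room-3", "room 3 hvac"): "ahu-0003",
    ("plant-1", "line-2", "line-2 pump"): "pump-0102",
}
_ASSETS = {
    "ahu-0003": Asset("ahu-0003", "air_handling_unit", ("set_temperature_setpoint",)),
    "pump-0102": Asset("pump-0102", "pump", ("start", "stop")),
}


def resolve_asset(site, zone, alias):
    """Map a human alias to one stable asset, or fail loudly instead of guessing."""
    key = (site, zone, " ".join(alias.lower().split()))
    asset_id = _ALIASES.get(key)
    if asset_id is None:
        raise LookupError(f"no asset for alias {alias!r} in {site}/{zone}")
    return _ASSETS[asset_id]
```

Two different aliases land on the same `asset_id`, and an unknown alias raises rather than falling back to a nearest match, which is the behavior a control path needs.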
2.4 Modbus belongs in the field execution layer
Modbus RTU/TCP is still extremely practical for last-hop execution against PLCs, meters, controllers, and industrial equipment. It belongs at the field layer where it can handle:
- constrained register reads
- bounded register writes
- device-specific polling patterns
- the final translation into site equipment behavior
It is the wrong abstraction to expose directly to the model because:
- register semantics are weak and vendor-specific
- writes often depend on policy checks, time windows, and interlock conditions
- auditing and tenant isolation become far harder if raw device addressing leaks upward
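A bounded write at this layer might look like the sketch below. It uses an in-memory transport instead of a real Modbus library so that it stays runnable; the `PointMapping` fields, the `POINT_MAP` entry (unit 1, address 100, scale 10), and `write_setpoint` are assumptions, and a real adapter would take them from a vendor point map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointMapping:
    unit: int         # Modbus unit (slave) id
    address: int      # holding register address
    scale: float      # engineering value * scale = raw register value
    min_value: float
    max_value: float


# Hypothetical point map: (asset_id, action) -> register mapping.
POINT_MAP = {
    ("ahu-0003", "set_temperature_setpoint"): PointMapping(1, 100, 10.0, 16.0, 30.0),
}


class InMemoryTransport:
    """Stand-in for a Modbus client; stores register values in a dict."""

    def __init__(self):
        self.registers = {}

    def write_register(self, unit, address, value):
        self.registers[(unit, address)] = value

    def read_register(self, unit, address):
        return self.registers.get((unit, address), 0)


def write_setpoint(transport, asset_id, action, value):
    """Translate a semantic action into one bounded register write; return the raw value."""
    mapping = POINT_MAP[(asset_id, action)]  # KeyError if the point is not mapped
    if not mapping.min_value <= value <= mapping.max_value:
        raise ValueError(f"{value} outside [{mapping.min_value}, {mapping.max_value}]")
    raw = round(value * mapping.scale)
    if not 0 <= raw <= 0xFFFF:  # a holding register is 16 bits wide
        raise ValueError(f"raw value {raw} does not fit a 16-bit register")
    transport.write_register(mapping.unit, mapping.address, raw)
    return raw
```

Everything vendor-specific (address, scaling, limits) lives in the mapping, so the layers above only ever see an asset and an action.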
3. A more stable Agentic IoT path
flowchart LR
U["User intent / business policy"] --> A["Collaborating agents<br/>A2A orchestration"]
A --> M["Controlled tools<br/>MCP"]
M --> C["Command service / policy engine"]
C --> O["Asset and semantic layer<br/>OPC UA / Digital Twin"]
O --> G["Edge gateway / protocol adapters"]
G --> D["Field devices<br/>Modbus / PLC / Controller"]
D --> F["State readback / ACK / alarms"]
F --> C
C --> T["Audit trail / human takeover"]
linkStyle default stroke:#6B7C93,stroke-width:1.6px;
The point of this structure is not "more boxes." It is that each layer deals with one class of responsibility:
- A2A decides who should act, not how registers are written
- MCP constrains what the agent is allowed to request
- command service / policy engine decides whether the action is allowed and how failures are handled
- OPC UA / asset layer resolves what the target really is and how its capabilities are represented
- Modbus executes the last-hop device interaction
Comparison block
A2A and MCP live at the AI-system boundary. OPC UA and Modbus live at the industrial-system boundary. The first pair answers how agents coordinate and invoke tools. The second pair answers how industrial objects are modeled and executed. Treating both pairs as one protocol-choice problem is itself an architecture mistake.
4. Why the command service is the non-optional layer in the middle
Even with good separation between A2A, MCP, OPC UA, and Modbus, the architecture is still fragile if there is no command service in the middle. Physical control requires a shared execution lifecycle rather than a single successful tool call.
A minimal command service usually needs to track:
- command_id
- trace_id
- asset_id
- requested_action
- policy_decision
- timeout_window
- rollback_hint
- current_state
More importantly, it needs a lifecycle such as:
Created -> Approved -> Resolved -> Dispatched -> DeviceAcked -> Applied
with terminal outcomes like:
Rejected / TimedOut / Failed / Cancelled / RolledBack
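One way to enforce that lifecycle is an explicit transition table. The sketch below is a minimal version: the state names come from the lifecycle above, but which terminal outcome is reachable from which state is an assumption, as are the `Command` class and its `advance` method.

```python
from enum import Enum


class CommandState(Enum):
    CREATED = "Created"
    APPROVED = "Approved"
    RESOLVED = "Resolved"
    DISPATCHED = "Dispatched"
    DEVICE_ACKED = "DeviceAcked"
    APPLIED = "Applied"
    REJECTED = "Rejected"
    TIMED_OUT = "TimedOut"
    FAILED = "Failed"
    CANCELLED = "Cancelled"
    ROLLED_BACK = "RolledBack"


S = CommandState
# Assumed transition table; states without an entry are terminal.
TRANSITIONS = {
    S.CREATED: {S.APPROVED, S.REJECTED, S.CANCELLED},
    S.APPROVED: {S.RESOLVED, S.FAILED, S.CANCELLED},
    S.RESOLVED: {S.DISPATCHED, S.FAILED, S.CANCELLED},
    S.DISPATCHED: {S.DEVICE_ACKED, S.TIMED_OUT, S.FAILED},
    S.DEVICE_ACKED: {S.APPLIED, S.TIMED_OUT, S.FAILED, S.ROLLED_BACK},
    S.APPLIED: {S.ROLLED_BACK},  # assumption: an applied change can still be rolled back
}


class Command:
    def __init__(self, command_id):
        self.command_id = command_id
        self.state = S.CREATED
        self.history = [S.CREATED]  # kept for the audit trail

    def advance(self, new_state):
        """Move to new_state only if the transition table allows it."""
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)
```

With a table like this, a command cannot jump from Created to Applied, and a rejected command cannot be revived. Both are the kind of shortcut that a single "tool call succeeded" flag would allow.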
4.1 Why broker ACK is not enough
One of the most dangerous operational mistakes is equating "the broker or gateway accepted the message" with "the device action completed." In industrial sites, any of these may fail independently:
- the asset is resolved correctly, but the device is interlocked or maintenance-locked
- the gateway accepts the message, but the register write is rejected
- the device sends an ACK, but the state does not actually change
- the state changes briefly, then local control logic overwrites it
flowchart LR
A["Agent request"] --> B["Policy approval"]
B --> C["Asset resolution"]
C --> D["Protocol dispatch"]
D --> E["Device ACK"]
E --> F["State confirmation"]
D --> X["Timeout / no receipt"]
E --> Y["ACK without effect"]
F --> Z["Complete / rollback / handoff"]
linkStyle default stroke:#6B7C93,stroke-width:1.6px;
Judgment block
In any system with local control logic, edge gateways, or device safety constraints, only state confirmation should count as completion. Tool success and broker ACK are intermediate signals, not proof of control.
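State confirmation can be sketched as a readback loop that requires the value to hold, not just to appear once. The function name `confirm_applied`, its parameters, and the default tolerance and sample counts are assumptions for illustration; `read_state` stands for whatever readback the asset layer provides.

```python
import time


def confirm_applied(read_state, expected, tolerance=0.5, hold_samples=3,
                    max_samples=10, interval_s=1.0, sleep=time.sleep):
    """Return True only if the readback matches for hold_samples consecutive reads."""
    consecutive = 0
    for _ in range(max_samples):
        if abs(read_state() - expected) <= tolerance:
            consecutive += 1
            if consecutive >= hold_samples:
                return True
        else:
            # The value drifted or was overwritten by local logic; start counting again.
            consecutive = 0
        sleep(interval_s)
    return False  # caller should treat this as TimedOut, not as success
```

Requiring consecutive matches is what catches the case where the state changes briefly and is then overwritten by local control logic. A single matching read would report success too early.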
5. Which missing layer should you fix first
| Current symptom | Missing layer | Why it matters |
|---|---|---|
| Multiple agents issue overlapping actions | A2A coordination | No task ownership or escalation logic |
| Agents can read state but write requests are unstable or over-privileged | MCP tool layer | No structured action interface or permission boundary |
| Device aliases drift across sites | OPC UA / asset layer | No stable identity or capability model |
| Field integration keeps breaking per vendor | Modbus / adapter layer | No execution-layer isolation |
| The platform cannot prove whether an action completed | Command service | No unified lifecycle, timeout logic, or rollback path |
The point is not that every system must improve every layer at once. The point is to identify where the dominant operational risk actually sits. If asset modeling is already strong but commands still lack a state machine, adding more protocol surface is not the priority.
6. When direct agent-to-field control is the wrong idea
Not every industrial system should become a closed-loop Agentic IoT system. Keep the agent in analysis or recommendation mode when:
- the target involves high-safety operations such as emergency stops or hard interlocks
- asset identity is still informal or human-dependent
- the devices cannot provide reliable state readback
- the organization has no human takeover path for failures
- the deployment is still at PoC maturity with loose tenant, site, and permission boundaries
6.1 The safer delegation path
For most teams, a safer rollout looks like this:
- let the agent observe state, summarize anomalies, and recommend actions
- allow low-risk parameter changes
- require approval for medium-risk actions
- allow stronger automation only after audit, ACK handling, rollback, and human takeover are mature
That sequence matters because it turns increasing model capability into controlled delegation rather than premature autonomy.
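The staged rollout can be expressed as a small policy function. This is a sketch under assumptions: the three risk tiers, the `Decision` values, and the rule that medium-risk actions may run unattended once automation is mature are illustrative choices, not a prescribed policy.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Decision(Enum):
    RECOMMEND_ONLY = "recommend_only"
    AUTO_EXECUTE = "auto_execute"
    REQUIRE_APPROVAL = "require_approval"


def decide(risk, *, readback_available, human_takeover_ready, automation_mature=False):
    """Map an action's risk tier and platform maturity to a delegation decision."""
    # High-safety targets, or platforms without readback or takeover, stay advisory.
    if risk is Risk.HIGH or not readback_available or not human_takeover_ready:
        return Decision.RECOMMEND_ONLY
    if risk is Risk.LOW:
        return Decision.AUTO_EXECUTE
    # Medium risk: approval by default, unattended only once the platform is mature.
    return Decision.AUTO_EXECUTE if automation_mature else Decision.REQUIRE_APPROVAL
```

For example, `decide(Risk.MEDIUM, readback_available=True, human_takeover_ready=True)` returns `Decision.REQUIRE_APPROVAL`, while the same call without reliable readback falls back to `Decision.RECOMMEND_ONLY`.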
7. The practical conclusion
If you need an Agentic IoT control plane that can touch industrial equipment, the most important design move is not choosing the most fashionable protocol. It is placing each technology at the right layer:
- A2A for multi-agent collaboration and task handoff
- MCP for controlled tool access and structured requests
- OPC UA or an equivalent asset layer for stable identity, capability modeling, and state semantics
- Modbus for last-hop field execution
- a command service in the middle to own approval, dispatch, ACK handling, rollback, and audit
Final judgment
When the system must handle multi-agent orchestration, cross-site assets, industrial protocol heterogeneity, and high-cost failure modes, A2A -> MCP -> command service / OPC UA -> Modbus is not an abstract best-practice diagram. It is the practical path to accountability, auditability, and staged delegation.