CVE-2026-49121: AITER ROCm Unauthenticated RCE

AMD's AI Tensor Engine for ROCm deserializes attacker-supplied pickle payloads from an unauthenticated ZeroMQ socket, letting one crafted message execute code simultaneously across every reader worker in a cluster.

Few anti-patterns are as durable in this field as "deserialize untrusted data with pickle." Python's pickle format is, by construction, a code-execution mechanism — unpickling can instantiate arbitrary objects and invoke arbitrary callables — which is why feeding it attacker-controlled bytes is treated as equivalent to handing the attacker a shell. CVE-2026-49121 is that anti-pattern wired directly to an unauthenticated network socket inside AMD's AI Tensor Engine for ROCm, the AITER library used to accelerate inference workloads, and the National Vulnerability Database scores it 8.1.
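To make the "unpickling is execution" point concrete, here is a minimal and deliberately harmless sketch. The `Probe` class and the `restricted_loads` helper are illustrative names, not anything taken from AITER; the callable used is the built-in `len`, standing in for whatever an attacker would actually choose. The second half follows the "restricting globals" pattern from the Python documentation, which narrows the risk but is not a substitute for refusing untrusted pickles.

```python
import io
import pickle


class Probe:
    """Stand-in for an attacker-crafted object (harmless here)."""

    def __reduce__(self):
        # pickle stores "call len('abc')" instead of the object's state.
        # An attacker would name a dangerous callable in place of len.
        return (len, ("abc",))


payload = pickle.dumps(Probe())

# Loading does not rebuild a Probe; it calls len("abc") and returns the result.
result = pickle.loads(payload)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no callable can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Unpickle plain containers and scalars only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(result, blocked)
```

The receiver never needs the `Probe` class to exist on its side: the payload names the callable and its arguments, and the unpickler does the rest.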

The vulnerable code is the inter-process broadcast machinery that AITER uses to fan messages out to worker processes. According to the NVD, the flaw lives "in the MessageQueue.recv() function within shm_broadcast.py," and the transport it trusts is a ZeroMQ subscribe socket with no gatekeeping whatsoever.

"AI Tensor Engine for ROCm (AITER) through 0.1.14 contains an unauthenticated remote code execution vulnerability in the MessageQueue.recv() function within shm_broadcast.py that allows unauthenticated remote attackers to execute arbitrary code by sending a malicious pickle payload to a ZMQ SUB socket with no authentication, HMAC, or format validation." (NVD)

Three missing controls are named explicitly, and each is load-bearing. There is no authentication, so the receiver never verifies who sent the message. There is no HMAC, so it never verifies that the bytes were not forged or tampered with in transit. And there is no format validation, so it never checks that the payload has the expected type and structure before handing it to the deserializer. With all three absent, a SUB socket that accepts pickle is an open door: whatever arrives gets unpickled, and unpickling is execution.
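As a rough illustration of what those three controls look like when present, the sketch below wraps each message in an HMAC-SHA256 tag, encodes the body as JSON instead of pickle, and checks the decoded shape before returning it. The function names `seal` and `open_sealed`, the `"op"` field, and the pre-shared key are all assumptions made for the example; this is not the upstream fix, and it does not address replay or key distribution.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def seal(key: bytes, message: dict) -> bytes:
    """Serialize as JSON and prepend an HMAC-SHA256 tag over the body."""
    body = json.dumps(message, separators=(",", ":"), sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def open_sealed(key: bytes, frame: bytes) -> dict:
    """Verify the tag first, then decode, then validate the shape."""
    if len(frame) <= TAG_LEN:
        raise ValueError("frame too short")
    tag, body = frame[:TAG_LEN], frame[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison; nothing is parsed unless the tag matches.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad HMAC")
    message = json.loads(body)
    # Format validation: hypothetical schema with a mandatory string "op".
    if not isinstance(message, dict) or not isinstance(message.get("op"), str):
        raise ValueError("unexpected message format")
    return message
```

The ordering is the point: the tag is checked before any parser sees the bytes, and the parser is one that can only produce data, never calls.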

The amplification is the scary part

A single-host RCE is serious. What elevates CVE-2026-49121 is the broadcast topology it abuses. The whole point of the shm_broadcast module is to distribute a message from one writer to many readers, and the vulnerability inherits that fan-out. As the NVD record continues, "Attackers who can reach the writer XPUB endpoint on the cluster network or supply a forged Handle with an attacker-controlled remote_subscribe_addr can deliver a crafted pickle payload that executes arbitrary code simultaneously as the inference worker process on every remote reader worker."

Read that as the operational reality: one message, every worker. An attacker does not compromise a single node and then grind through lateral movement; the legitimate broadcast fabric does the lateral movement for them, delivering the payload to every subscribed reader at once. In a GPU inference cluster, those workers are exactly the high-value, expensive, GPU-attached processes an attacker most wants — ideal for cryptomining, model and data theft, or staging deeper intrusion. The two reachability conditions the advisory describes — network access to the writer's XPUB endpoint, or the ability to supply a forged Handle pointing at an attacker-controlled remote_subscribe_addr — are both plausible in real deployments where the inter-worker fabric was assumed to live on a trusted internal network.
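For the forged-Handle path specifically, one defensive idea is to refuse to connect to any subscribe address outside the subnets where legitimate writers are expected. The sketch below is hypothetical hardening, not AITER code: `TRUSTED_WRITER_NETS`, the example subnet, and `is_trusted_subscribe_addr` are invented for illustration, and the only thing assumed about the handle is that `remote_subscribe_addr` is a `tcp://host:port` string.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist: subnets where legitimate writers are expected to live.
TRUSTED_WRITER_NETS = [ipaddress.ip_network("10.20.0.0/24")]


def is_trusted_subscribe_addr(addr: str, trusted=TRUSTED_WRITER_NETS) -> bool:
    """Return True only for tcp:// endpoints whose IP is inside the allowlist."""
    parts = urlsplit(addr)
    if parts.scheme != "tcp" or parts.hostname is None:
        return False
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Hostnames are rejected outright: name resolution is one more
        # thing an attacker on the network may be able to influence.
        return False
    return any(ip in net for net in trusted)
```

A check like this narrows where a reader will connect, but it does nothing about a writer endpoint that is itself reachable by an attacker, so it complements message authentication and does not replace it.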

Reading the 8.1

The CVSS vector is AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H, and the one term holding the score below the critical range (9.0 and up) is AC:H, high attack complexity; with AC:L, the same vector would score 9.8 under the CVSS v3.x formula. AC:H reflects the reachability preconditions: an attacker has to get onto the cluster network path to the XPUB endpoint, or manufacture the forged handle scenario, rather than firing blindly from the open internet. Every other exploitability and impact metric sits at its most severe value: no privileges required, no user interaction, and high impact across confidentiality, integrity, and availability, with the weakness classified as CWE-502, deserialization of untrusted data. The "high complexity" qualifier should not be misread as comfort. In a flat internal network, the default for many ML clusters, reaching a worker's message bus is not a high bar at all, and once reached, exploitation is a single payload.
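The arithmetic behind the 8.1 can be reproduced in a few lines. This is a sketch that assumes the vector is scored under CVSS v3.1 and handles only the scope-unchanged (S:U) case; the metric weights and the rounding rule are taken from the published v3.1 specification, while the function names are mine.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 spec."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for an S:U vector such as 'AV:N/AC:H/.../A:H'."""
    m = dict(part.split(":") for part in vector.split("/"))
    if m["S"] != "U":
        raise ValueError("this sketch only handles S:U")
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 8.1
print(base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

Flipping the single AC term from H to L moves the exploitability sub-score from roughly 2.2 to roughly 3.9, which is the whole difference between "high" and "critical" here.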

The most important factor in any given deployment is network exposure of the broadcast endpoints. An AITER setup whose ZMQ sockets bind only to loopback or a tightly segmented, mutually authenticated fabric is far harder to reach than one that binds to a routable cluster interface accessible from the same VPC, a neighboring tenant, or a compromised adjacent pod. The default bind address and the deployment topology determine whether AC:H is a meaningful obstacle or a formality.
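When auditing a deployment, a quick first pass is to classify each configured endpoint by how widely it listens. The helper below is a hypothetical audit aid (the name `bind_exposure` and its labels are mine), assuming endpoints are written in the usual ZeroMQ `tcp://host:port` or `ipc://path` form; it inspects strings only and opens no sockets.

```python
import ipaddress
from urllib.parse import urlsplit


def bind_exposure(endpoint: str) -> str:
    """Classify a ZeroMQ-style endpoint string by how widely it listens."""
    parts = urlsplit(endpoint)
    if parts.scheme in ("ipc", "inproc"):
        return "local"  # never leaves the host
    host = parts.hostname
    if host is None or host == "*":
        return "all-interfaces"  # tcp://*:port binds every interface
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return "unknown"  # interface name or hostname: inspect by hand
    if ip.is_unspecified:
        return "all-interfaces"  # 0.0.0.0 or ::
    if ip.is_loopback:
        return "loopback"
    return "network"  # a specific, potentially routable address


for ep in ("tcp://127.0.0.1:5555", "tcp://*:5555", "tcp://10.0.3.7:5555"):
    print(ep, "->", bind_exposure(ep))
```

Anything that comes back as "all-interfaces" or "network" deserves a look at the surrounding network policy; "loopback" and "local" are only as safe as the host itself.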

What to do about it

The upstream project tracked the issue and shipped a fix; the NVD entry references the ROCm/aiter issue and the pull request that addresses it, along with a VulnCheck advisory. Operators should upgrade past 0.1.14 to the patched version. The structural remedy is the standard one for this bug class: stop trusting pickle on the wire. Replace it with a safe serialization format, or at minimum add authentication and an HMAC so that only legitimately originated, untampered messages are ever deserialized, and validate the payload format before processing.

This vulnerability also belongs to a larger trend worth naming. As inference has scaled from single boxes to distributed clusters, ML systems have absorbed a great deal of distributed-systems plumbing — message queues, shared-memory broadcast, RPC fabrics — often borrowed or hand-rolled under intense performance pressure, and frequently with the implicit assumption that the cluster network is a trusted environment. That assumption is exactly what attackers now probe, whether via a compromised neighboring workload, a misconfigured network policy, or a multi-tenant boundary that turns out to be thinner than believed. Pickle-over-ZMQ is a recurring offender in this space precisely because it is the path of least resistance for passing arbitrary Python objects between workers, and it is correspondingly the path of least resistance for an attacker who reaches the bus. Treating the inter-worker transport as hostile by default — authenticated, integrity-protected, and using a non-executable serialization format — is the durable fix that outlives any single CVE.

Until the patched build is deployed, the highest-leverage compensating control is network segmentation: ensure the ZMQ XPUB/SUB endpoints are not reachable from anywhere an attacker might sit, bind them to trusted interfaces only, and treat the inter-worker fabric as a sensitive boundary rather than an implementation detail. CVE-2026-49121 is a clean reminder that the plumbing connecting distributed ML workers is real attack surface — and that pickle on an unauthenticated socket is one of the most reliable ways to turn a performance optimization into a cluster-wide compromise.
