CVE-2025-29770 - Vulnerability Details

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0.
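The growth pattern described above can be illustrated with a minimal sketch. It does not reproduce the code in vLLM or outlines: the key function, both cache classes, and the schema generator are hypothetical, and an in-memory dict stands in for the on-disk cache. The point is only that a cache keyed on the schema gains one entry per unique schema unless it has an eviction bound.

```python
from collections import OrderedDict
import hashlib
import json


def schema_key(schema: dict) -> str:
    """Hypothetical cache key: a hash of the canonical JSON form of the schema."""
    canonical = json.dumps(schema, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class UnboundedCache:
    """Stand-in for a cache that never evicts (the pattern described above)."""

    def __init__(self):
        self.entries = {}

    def put(self, schema: dict, compiled: bytes) -> None:
        self.entries[schema_key(schema)] = compiled


class BoundedCache:
    """Same interface, but evicts the least recently used entry past max_entries."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def put(self, schema: dict, compiled: bytes) -> None:
        key = schema_key(schema)
        self.entries[key] = compiled
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the oldest entry


def unique_schemas(n: int):
    """Yield n schemas that differ only in one property name."""
    for i in range(n):
        yield {"type": "object", "properties": {f"field_{i}": {"type": "string"}}}


unbounded = UnboundedCache()
bounded = BoundedCache(max_entries=100)
for schema in unique_schemas(1000):
    placeholder = b"compiled-grammar"  # placeholder for a compiled grammar
    unbounded.put(schema, placeholder)
    bounded.put(schema, placeholder)

print(len(unbounded.entries), len(bounded.entries))  # 1000 100
```

After 1000 unique schemas the unbounded cache holds 1000 entries while the bounded one stays at its limit of 100. On a real filesystem the equivalent bound would be a size or entry limit on the cache directory.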

No CVSS v4.0

Attack Vector Network

Attack Complexity Low

Privileges Required Low

Scope Unchanged

Confidentiality Impact None

Integrity Impact None

Availability Impact High

User Interaction None

No CVSS v3.0

No CVSS v2
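The metric values listed above correspond to the cvssV3_1 vector recorded in the History section (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H, score 6.5). As a worked check, the sketch below recomputes the base score using the CVSS v3.1 base equations. It only covers the metric values present in this record and the Scope: Unchanged case; the function and table names are illustrative.

```python
import math

# CVSS v3.1 metric weights, limited to the values that appear in this record.
# The PR weight of 0.62 for "Low" applies when Scope is Unchanged.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"L": 0.62},
    "UI": {"N": 0.85},
    "C": {"N": 0.0},
    "I": {"N": 0.0},
    "A": {"H": 0.56},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    as_int = round(value * 100000)
    if as_int % 10000 == 0:
        return as_int / 100000.0
    return (math.floor(as_int / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with S:U."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H")
print(score)  # 6.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 2.84; their sum, about 6.43, rounds up to 6.5.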

This CVE is not in the KEV list.

The EPSS score is 0.00316.

Exploitation none

Automatable no

Technical Impact partial

Default status is the baseline for the product; each version can override it (e.g. patched versions marked unaffected).

Vendor	Product	Default status
vllm-project	vllm	affected

Versions

Version	Status	Constraints
`< 0.8.0`	affected	—
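The `< 0.8.0` constraint can be checked against a version string with a short comparison. The sketch below uses only the standard library; the helper names are illustrative, and it deliberately ignores pre-release and local suffixes, so it is a rough triage aid and not a full version parser.

```python
import re


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release components, e.g. '0.7.3' -> (0, 7, 3).

    Simplification: pre-release and local suffixes ('rc1', '+cu118', ...) are
    ignored, so '0.8.0rc1' is treated like '0.8.0'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


FIXED_IN = (0, 8, 0)  # first unaffected version according to this record


def is_affected(version: str) -> bool:
    """True when the version falls under the '< 0.8.0' constraint."""
    parts = release_tuple(version)
    # Pad to equal length so that '0.8' compares equal to '0.8.0'.
    width = max(len(parts), len(FIXED_IN))
    padded = parts + (0,) * (width - len(parts))
    fixed = FIXED_IN + (0,) * (width - len(FIXED_IN))
    return padded < fixed


for candidate in ("0.7.3", "0.8.0", "0.8", "0.10.1"):
    print(candidate, is_affected(candidate))
```

For an installed package, the version string could be obtained with `importlib.metadata.version("vllm")`. Note that the description limits the issue to the V0 engine, so a version match alone does not establish exposure.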

Configuration 1

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
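For readers unfamiliar with the CPE 2.3 formatted string, the sketch below splits the entry above into its named attributes. The function name is illustrative, and the parser assumes no escaped colons inside attribute values, which holds for this record.

```python
# Attribute order of a CPE 2.3 formatted string after the "cpe:2.3:" prefix.
CPE23_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its eleven named attributes.

    Simplification: assumes no escaped colons inside attribute values.
    """
    pieces = cpe.split(":")
    if len(pieces) != 13 or pieces[0] != "cpe" or pieces[1] != "2.3":
        raise ValueError(f"not a plain CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE23_FIELDS, pieces[2:]))


attrs = parse_cpe23("cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*")
print(attrs["part"], attrs["vendor"], attrs["product"], attrs["version"])
```

The part value `a` denotes an application, and `*` in the version field matches any version, which is why the affected range is carried separately by the `< 0.8.0` constraint.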

Advisories

Source	ID	Title
EUVD	EUVD-2025-6726	vLLM denial of service via outlines unbounded cache on disk
Github GHSA	GHSA-mgrm-fgjv-mhv8	vLLM denial of service via outlines unbounded cache on disk

Fixes

Solution

No solution given by the vendor.

Workaround

No workaround given by the vendor.

References

Link	Providers
https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py
https://github.com/vllm-project/vllm/pull/14837
https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8
https://nvd.nist.gov/vuln/detail/CVE-2025-29770
https://www.cve.org/CVERecord?id=CVE-2025-29770

History

Thu, 31 Jul 2025 16:00:00 +0000

Type	Values Removed	Values Added
First Time appeared		Vllm Vllm vllm
CPEs		cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
Vendors & Products		Vllm Vllm vllm

Thu, 20 Mar 2025 14:00:00 +0000

Type	Values Removed	Values Added
References		https://nvd.nist.gov/vuln/detail/CVE-2025-29770 https://www.cve.org/CVERecord?id=CVE-2025-29770
Metrics	threat_severity `None`	threat_severity `Moderate`

Wed, 19 Mar 2025 21:15:00 +0000

Type	Values Removed	Values Added
Metrics		ssvc `{'options': {'Automatable': 'no', 'Exploitation': 'none', 'Technical Impact': 'partial'}, 'version': '2.0.3'}`

Wed, 19 Mar 2025 15:45:00 +0000

Type	Values Removed	Values Added
Description		vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0.
Title		vLLM denial of service via outlines unbounded cache on disk
Weaknesses		CWE-770
References		https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py https://github.com/vllm-project/vllm/pull/14837 https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8
Metrics		cvssV3_1 `{'score': 6.5, 'vector': 'CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H'}`

MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published: 2025-03-19T15:31:00.403Z

Updated: 2025-03-19T20:15:47.505Z

Reserved: 2025-03-11T14:23:00.474Z

Link: CVE-2025-29770

Vulnrichment

Updated: 2025-03-19T20:15:43.284Z

NVD

Status: Analyzed

Published: 2025-03-19T16:15:31.977

Modified: 2026-06-17T09:05:38.477

Link: CVE-2025-29770

Redhat

Severity: Moderate

Public Date: 2025-03-19T15:31:00Z

Links: CVE-2025-29770 - Bugzilla

OpenCVE Enrichment

No data.

Weaknesses

CWE-770
