CVE-2026-34756

MEDIUM NVD

CVSS Score 6.5

Severity MEDIUM

Published Apr 06, 2026

Vendor unknown

Description

vLLM is an inference and serving engine for large language models (LLMs). Versions from 0.1.0 up to, but not including, 0.19.0 contain a denial-of-service vulnerability in the vLLM OpenAI-compatible API server. The ChatCompletionRequest and CompletionRequest Pydantic models do not validate an upper bound on the n parameter, so an unauthenticated attacker can send a single HTTP request with an extremely large n value. The server then allocates millions of request object copies on the heap before the request reaches the scheduling queue, which blocks the Python asyncio event loop and causes an immediate out-of-memory crash. The vulnerability is fixed in 0.19.0.
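
The missing check can be illustrated with a minimal, standard-library sketch of upper-bound validation on n, of the kind an operator might place in front of an unpatched server. This is not the vLLM patch: the function name check_n and the cap MAX_N are hypothetical and chosen only for illustration, and the default of 1 for a missing or null n is an assumption about OpenAI-style request bodies.

```python
import json

# Hypothetical cap, not taken from vLLM; pick a value that fits the deployment.
MAX_N = 16


def check_n(raw_body: bytes, max_n: int = MAX_N) -> int:
    """Return the validated n from a JSON request body, or raise ValueError."""
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")

    # Assumption: a missing or null n means a single completion.
    n = payload.get("n")
    if n is None:
        return 1

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise ValueError("n must be an integer")

    # Reject the value before any per-completion objects are allocated.
    if not 1 <= n <= max_n:
        raise ValueError(f"n must be between 1 and {max_n}, got {n}")
    return n
```

The point of the sketch is ordering: the bound is enforced while n is still a single integer, so an oversized value is rejected at constant cost instead of driving allocations. Upgrading to 0.19.0 remains the fix named in the advisory.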

References

https://github.com/vllm-project/vllm/commit/b111f8a61f100fdca08706f41f29ef3548de7380
https://github.com/vllm-project/vllm/pull/37952
https://github.com/vllm-project/vllm/security/advisories/GHSA-3mwp-wvh9-7528