cvelistv5 - cve-2025-62164

CVE-2025-62164 (GCVE-0-2025-62164)

Vulnerability from cvelistv5

Published

2025-11-21 01:18

Modified

2025-11-21 01:18

Severity

8.8 (High) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-20 - Improper Input Validation
CWE-123 - Write-what-where Condition
CWE-502 - Deserialization of Untrusted Data
CWE-787 - Out-of-bounds Write

Summary

vLLM is an inference and serving engine for large language models (LLMs). In versions 0.10.2 up to, but not including, 0.11.1, a memory corruption vulnerability that could lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1.

References

	URL	Tags
	security-advisories@github.com	https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b
	security-advisories@github.com	https://github.com/vllm-project/vllm/pull/27204
	security-advisories@github.com	https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf

Impacted products

	Vendor	Product	Version
	vllm-project	vllm	Version: >= 0.10.2, < 0.11.1
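The affected range above can be checked mechanically. The following is a minimal sketch, assuming plain dotted release numbers; the helper names `release_tuple` and `is_affected` are hypothetical and not part of vLLM. Pre-release or local suffixes (for example `rc1`) are ignored, so such builds should be reviewed by hand.

```python
import re
from typing import Tuple

# Range taken from the record: >= 0.10.2, < 0.11.1
INTRODUCED = (0, 10, 2)
FIXED = (0, 11, 1)


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading dotted-number part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(version: str) -> bool:
    """Return True if the release falls inside the affected range."""
    return INTRODUCED <= release_tuple(version) < FIXED
```

In an environment where vLLM is installed, the installed version string could be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`.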

JSON

{
  "containers": {
    "cna": {
      "affected": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.10.2, \u003c 0.11.1"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability could lead to a crash (denial-of-service) and potentially remote code execution (RCE), exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 8.8,
            "baseSeverity": "HIGH",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "HIGH",
            "privilegesRequired": "LOW",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-20",
              "description": "CWE-20: Improper Input Validation",
              "lang": "en",
              "type": "CWE"
            }
          ]
        },
        {
          "descriptions": [
            {
              "cweId": "CWE-123",
              "description": "CWE-123: Write-what-where Condition",
              "lang": "en",
              "type": "CWE"
            }
          ]
        },
        {
          "descriptions": [
            {
              "cweId": "CWE-502",
              "description": "CWE-502: Deserialization of Untrusted Data",
              "lang": "en",
              "type": "CWE"
            }
          ]
        },
        {
          "descriptions": [
            {
              "cweId": "CWE-787",
              "description": "CWE-787: Out-of-bounds Write",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-11-21T01:18:38.803Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf"
        },
        {
          "name": "https://github.com/vllm-project/vllm/pull/27204",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/pull/27204"
        },
        {
          "name": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b"
        }
      ],
      "source": {
        "advisory": "GHSA-mrw7-hf4f-83pf",
        "discovery": "UNKNOWN"
      },
      "title": "VLLM deserialization vulnerability leading to DoS and potential RCE"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2025-62164",
    "datePublished": "2025-11-21T01:18:38.803Z",
    "dateReserved": "2025-10-07T16:12:03.425Z",
    "dateUpdated": "2025-11-21T01:18:38.803Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2",
  "vulnerability-lookup:meta": {
    "nvd": "{\"cve\":{\"id\":\"CVE-2025-62164\",\"sourceIdentifier\":\"security-advisories@github.com\",\"published\":\"2025-11-21T02:15:43.193\",\"lastModified\":\"2025-11-21T15:13:13.800\",\"vulnStatus\":\"Undergoing Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability could lead to a crash (denial-of-service) and potentially remote code execution (RCE), exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. 
This issue has been patched in version 0.11.1.\"}],\"metrics\":{\"cvssMetricV31\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Secondary\",\"cvssData\":{\"version\":\"3.1\",\"vectorString\":\"CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H\",\"baseScore\":8.8,\"baseSeverity\":\"HIGH\",\"attackVector\":\"NETWORK\",\"attackComplexity\":\"LOW\",\"privilegesRequired\":\"LOW\",\"userInteraction\":\"NONE\",\"scope\":\"UNCHANGED\",\"confidentialityImpact\":\"HIGH\",\"integrityImpact\":\"HIGH\",\"availabilityImpact\":\"HIGH\"},\"exploitabilityScore\":2.8,\"impactScore\":5.9}]},\"weaknesses\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Primary\",\"description\":[{\"lang\":\"en\",\"value\":\"CWE-20\"},{\"lang\":\"en\",\"value\":\"CWE-123\"},{\"lang\":\"en\",\"value\":\"CWE-502\"},{\"lang\":\"en\",\"value\":\"CWE-787\"}]}],\"references\":[{\"url\":\"https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/vllm-project/vllm/pull/27204\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf\",\"source\":\"security-advisories@github.com\"}]}}"
  }
}

ghsa-mrw7-hf4f-83pf

Vulnerability from github

Published

2025-11-20 20:59

Modified

2025-11-21 15:31

Severity

8.8 (High) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

Summary

vLLM deserialization vulnerability leading to DoS and potential RCE

Details

Summary

A memory corruption vulnerability that can lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint of vLLM versions 0.10.2 up to, but not including, 0.11.1. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation.

Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM.
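To make the mechanism concrete, here is a simplified pure-Python model of converting a one-dimensional sparse (index, value) representation to a dense buffer. It is an illustration only and not PyTorch's implementation: the names `check_invariants` and `scatter_to_dense` are hypothetical, and a Python list stands in for process memory, with the cells past `size` playing the role of unrelated neighboring data.

```python
from typing import List, Sequence


def check_invariants(indices: Sequence[int], values: Sequence[float], size: int) -> None:
    """Reject sparse data whose indices do not fit the declared size."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    for index in indices:
        if not 0 <= index < size:
            raise ValueError(f"index {index} is out of range for size {size}")


def scatter_to_dense(
    indices: Sequence[int],
    values: Sequence[float],
    size: int,
    memory: List[float],
    validate: bool,
) -> None:
    """Write values into memory[0:size]; cells beyond size model other data."""
    if validate:
        check_invariants(indices, values, size)
    for index, value in zip(indices, values):
        # Without validation, nothing stops index >= size from
        # overwriting cells that do not belong to the tensor.
        memory[index] = value
```

With `validate=False`, an index of 6 for a tensor of declared size 4 overwrites a cell outside the tensor's region, which is the model's analogue of the out-of-bounds write. With `validate=True`, the same input is rejected before any write happens.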

Details

A vulnerability that can lead to RCE through the Completions API endpoint exists in vLLM: because checks are missing when user-provided tensors are loaded, an out-of-bounds write can be triggered. Since PyTorch 2.8.0, torch.load(tensor, weights_only=True) does not perform validity checks on sparse tensors by default; the checks must be enabled explicitly with the torch.sparse.check_sparse_tensor_invariants context manager.

The vulnerability is in the following code at vllm/entrypoints/renderer.py:148:

    def _load_and_validate_embed(embed: bytes) -> EngineEmbedsPrompt:
        tensor = torch.load(
            io.BytesIO(pybase64.b64decode(embed, validate=True)),
            weights_only=True,
            map_location=torch.device("cpu"),
        )
        assert isinstance(tensor, torch.Tensor) and tensor.dtype in (
            torch.float32,
            torch.bfloat16,
            torch.float16,
        )
        tensor = tensor.to_dense()

Because of the missing checks, loading invalid prompt embedding tensors provided by the user can cause an out-of-bounds write in the call to to_dense().
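As a hedged illustration of the kind of hardening the advisory describes, the sketch below loads the tensor inside the torch.sparse.check_sparse_tensor_invariants context manager and then rejects any non-dense layout outright. This is not the upstream patch (see the linked pull request for that). The function name `load_prompt_embedding` is hypothetical, the standard-library `base64` module stands in for `pybase64`, and refusing sparse tensors entirely is an assumption that is stricter than calling to_dense().

```python
import base64
import io


def load_prompt_embedding(embed: bytes):
    """Decode and load a base64-encoded tensor with sparse checks enabled."""
    # Strict base64 decoding; raises binascii.Error (a ValueError) on bad input.
    raw = base64.b64decode(embed, validate=True)

    # Imported lazily so this module can be imported without PyTorch installed.
    import torch

    # Re-enable the sparse tensor invariant checks for the duration of the load.
    with torch.sparse.check_sparse_tensor_invariants():
        tensor = torch.load(
            io.BytesIO(raw),
            weights_only=True,
            map_location=torch.device("cpu"),
        )

    # Explicit exceptions instead of assert, which is skipped under python -O.
    if not isinstance(tensor, torch.Tensor):
        raise ValueError("payload did not deserialize to a tensor")
    if tensor.dtype not in (torch.float32, torch.bfloat16, torch.float16):
        raise ValueError(f"unsupported dtype: {tensor.dtype}")
    # Assumption: prompt embeddings are always dense, so refuse sparse layouts.
    if tensor.layout != torch.strided:
        raise ValueError(f"unsupported tensor layout: {tensor.layout}")
    return tensor
```

Raising exceptions rather than relying on assert keeps the validation active even when Python runs with optimizations enabled.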

Impact

All users with access to this API are able to exploit this vulnerability. Unsafe deserialization of untrusted input can be abused to achieve DoS and potentially remote code execution (RCE) in the vLLM server process. This impacts deployments running vLLM as a server or any instance that deserializes untrusted/model-provided payloads.

Fix

https://github.com/vllm-project/vllm/pull/27204

Acknowledgements

Finder: AXION Security Research Team (Omri Fainaro, Bary Levy): discovery and coordinated disclosure.

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.10.2"
            },
            {
              "fixed": "0.11.1"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2025-62164"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-123",
      "CWE-20",
      "CWE-502",
      "CWE-787"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2025-11-20T20:59:34Z",
    "nvd_published_at": "2025-11-21T02:15:43Z",
    "severity": "HIGH"
  },
  "details": "### Summary\nA memory corruption vulnerability that leading to a crash (denial-of-service) and potentially remote code execution (RCE) exists in vLLM versions 0.10.2 and later, in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation.\n\nDue to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM.\n\n### Details\nA vulnerability that can lead to RCE from the completions API endpoint exists in vllm, where due to missing checks when loading user-provided tensors, an out-of-bounds write can be triggered. This happens because the default behavior of `torch.load(tensor, weights_only=True)`  since pytorch 2.8.0 is to not perform validity checks for sparse tensors, and this needs to be enabled explicitly using the [torch.sparse.check_sparse_tensor_invariants](https://docs.pytorch.org/docs/stable/generated/torch.sparse.check_sparse_tensor_invariants.html) context manager.\n\nThe vulnerability is in the following code in [vllm/entrypoints/renderer.py:148](https://github.com/vllm-project/vllm/blob/a332b84578cdc0706e040f6a765954c8a289904f/vllm/entrypoints/renderer.py#L148)\n\n```python\n    def _load_and_validate_embed(embed: bytes) -\u003e EngineEmbedsPrompt:\n        tensor = torch.load(\n            io.BytesIO(pybase64.b64decode(embed, validate=True)),\n            weights_only=True,\n            map_location=torch.device(\"cpu\"),\n        )\n        assert isinstance(tensor, torch.Tensor) and tensor.dtype in (\n            torch.float32,\n            torch.bfloat16,\n            torch.float16,\n        )\n        tensor = tensor.to_dense()\n```\n\nBecause of 
the missing checks, loading invalid prompt embedding tensors provided by the user can cause an out-of-bounds write in the call to `to_dense` .\n\n### Impact\nAll users with access to this API are able to exploit this vulnerability. Unsafe deserialization of untrusted input can be abused to achieve DoS and potentially remote code execution (RCE) in the vLLM server process. This impacts deployments running vLLM as a server or any instance that deserializes untrusted/model-provided payloads.\n\n## Fix\n\nhttps://github.com/vllm-project/vllm/pull/27204\n\n## Acknowledgements\n\nFinder: AXION Security Research Team (Omri Fainaro, Bary Levy): discovery and coordinated disclosure.",
  "id": "GHSA-mrw7-hf4f-83pf",
  "modified": "2025-11-21T15:31:32Z",
  "published": "2025-11-20T20:59:34Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-62164"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/pull/27204"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "vLLM deserialization vulnerability leading to DoS and potential RCE"
}

fkie_cve-2025-62164

Vulnerability from fkie_nvd

Published

2025-11-21 02:15

Modified

2025-11-21 15:13

Severity

8.8 (High) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

Summary

References

	URL	Tags
	security-advisories@github.com	https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b
	security-advisories@github.com	https://github.com/vllm-project/vllm/pull/27204
	security-advisories@github.com	https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf

Impacted products

	Vendor	Product	Version

JSON

{
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability could lead to a crash (denial-of-service) and potentially remote code execution (RCE), exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1."
    }
  ],
  "id": "CVE-2025-62164",
  "lastModified": "2025-11-21T15:13:13.800",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "HIGH",
          "baseScore": 8.8,
          "baseSeverity": "HIGH",
          "confidentialityImpact": "HIGH",
          "integrityImpact": "HIGH",
          "privilegesRequired": "LOW",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 2.8,
        "impactScore": 5.9,
        "source": "security-advisories@github.com",
        "type": "Secondary"
      }
    ]
  },
  "published": "2025-11-21T02:15:43.193",
  "references": [
    {
      "source": "security-advisories@github.com",
      "url": "https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b"
    },
    {
      "source": "security-advisories@github.com",
      "url": "https://github.com/vllm-project/vllm/pull/27204"
    },
    {
      "source": "security-advisories@github.com",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf"
    }
  ],
  "sourceIdentifier": "security-advisories@github.com",
  "vulnStatus": "Undergoing Analysis",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-20"
        },
        {
          "lang": "en",
          "value": "CWE-123"
        },
        {
          "lang": "en",
          "value": "CWE-502"
        },
        {
          "lang": "en",
          "value": "CWE-787"
        }
      ],
      "source": "security-advisories@github.com",
      "type": "Primary"
    }
  ]
}
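
The scores in the record above (base 8.8, exploitability 2.8, impact 5.9) can be reproduced from the vector string. The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only; the function names `roundup` and `cvss31_scores` are hypothetical, and the subscores are rounded to one decimal to match how the record displays them.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_scores(vector: str):
    """Return (base, exploitability, impact) for a Scope: Unchanged vector."""
    # Skip the leading "CVSS:3.1" prefix, then split "AV:N" style pairs.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch only handles Scope: Unchanged")
    w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    impact = 6.42 * (1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"]))
    base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)
```

For this record, the unrounded exploitability is about 2.835 and the unrounded impact about 5.873; their sum, about 8.708, rounds up to the published base score of 8.8.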

Sightings

Author	Source	Type	Date

Nomenclature

Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
Confirmed: The vulnerability is confirmed from an analyst perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
Patched: This vulnerability was successfully patched by the user reporting the sighting.
Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
Not confirmed: The user expresses doubt about the veracity of the vulnerability.
Not patched: This vulnerability was not successfully patched by the user reporting the sighting.
