GHSA-3WW4-5JV9-J5GM

Vulnerability from github – Published: 2026-06-10 17:11 – Updated: 2026-06-10 17:11
Summary
vLLM's Artifact Pin Decay allows pinned deployments to load unpinned code, weights, and processors
Details

Summary

vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies --revision or --code-revision can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository subfolder weights/config from an unpinned/default revision.

This is a supply-chain integrity issue for pinned vLLM deployments. Operators can believe they are serving a reviewed model revision while vLLM resolves behavior-affecting nested or sibling artifacts outside that reviewed revision.

Details

The expected invariant is:

When a vLLM operator supplies a model or code revision pin, every code, config, processor, weight file, side weight, and same-repository subfolder artifact loaded as part of that model should resolve under that pin unless vLLM exposes and enforces a separate explicit pin for that artifact.

The main branch was verified as affected at commit 3795d7acf431980e62e738493f437ae2a51549da.

Affected source boundaries:

  • vllm/model_executor/models/registry.py:1045-1051 and :1058-1064
      • _try_resolve_transformers() passes revision=model_config.revision and trust_remote_code=model_config.trust_remote_code, but omits code_revision=model_config.code_revision for external auto_map dynamic module imports.
  • vllm/model_executor/model_loader/gguf_loader.py:58-60
      • The direct-file GGUF form repo/file.gguf calls hf_hub_download(repo_id=repo_id, filename=filename) without passing revision.
  • vllm/model_executor/models/roberta.py:203-209
      • BGE-M3 secondary sparse and ColBERT side weights are declared with revision=None.
  • vllm/model_executor/models/kimi_k25.py:111-114
      • Kimi-K2.5 calls cached_get_image_processor() without passing model_config.revision.
  • vllm/model_executor/models/kimi_audio.py:92-95
      • Kimi-Audio loads Whisper config from the whisper-large-v3 subfolder without a revision argument.
  • vllm/model_executor/models/kimi_audio.py:425-430
      • Kimi-Audio declares same-repository whisper-large-v3 secondary weights with revision=None.
  • vllm/model_executor/model_loader/default_loader.py:287-301
      • The default loader preserves model_config.revision for the primary source, then consumes model-supplied secondary sources as declared.
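
As a rough illustration of the GGUF item in the list above, the direct-file form could forward the operator's pin to hf_hub_download, which accepts a revision keyword. This is a minimal sketch and not the change made in the fix PR: fetch_gguf and its injectable download parameter are hypothetical names introduced here so the logic can be exercised without network access.

```python
from typing import Callable, Optional


def fetch_gguf(
    model: str,
    revision: Optional[str],
    download: Optional[Callable[..., str]] = None,
) -> str:
    """Download '<repo_id>/<file>.gguf', forwarding the operator's revision pin.

    fetch_gguf is a hypothetical helper for illustration, not a vLLM function.
    """
    # Split "org/repo/file.gguf" into the repository id and the file name.
    repo_id, _, filename = model.rpartition("/")
    if not repo_id or not filename.endswith(".gguf"):
        raise ValueError(f"expected '<repo_id>/<file>.gguf', got {model!r}")

    if download is None:
        # Imported lazily so the sketch can be tested with a stub downloader.
        from huggingface_hub import hf_hub_download as download

    # The key point: pass the revision through instead of omitting it.
    return download(repo_id=repo_id, filename=filename, revision=revision)
```

Passing revision=None keeps the library's default behavior, so an unpinned deployment would behave as before while a pinned one resolves the file under the pin.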

The strongest example is Kimi-Audio: the primary moonshotai/Kimi-Audio-7B-Instruct weights preserve the configured model revision, but the same-repository whisper-large-v3 audio tower config/weights do not. A pinned Kimi-Audio deployment can therefore load the Whisper subfolder outside the audited revision.
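
One way to picture the missing propagation is a loader step that fills in the primary revision for same-repository secondary sources that declare none. The sketch below is an assumption-laden illustration: WeightSource and inherit_pin are hypothetical names, not vLLM's actual source type or the logic of the fix, and "abc123" in the usage is a placeholder revision.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class WeightSource:
    """Hypothetical stand-in for a loader's weight-source record."""

    repo_id: str
    subfolder: Optional[str] = None
    revision: Optional[str] = None


def inherit_pin(
    primary: WeightSource, secondary: Iterable[WeightSource]
) -> List[WeightSource]:
    """Apply the primary revision to same-repository sources that declare none."""
    pinned = []
    for src in secondary:
        if src.revision is None and src.repo_id == primary.repo_id:
            # Same repository and no explicit pin: reuse the operator's pin.
            src = replace(src, revision=primary.revision)
        pinned.append(src)
    return pinned


primary = WeightSource("moonshotai/Kimi-Audio-7B-Instruct", revision="abc123")
whisper = WeightSource("moonshotai/Kimi-Audio-7B-Instruct", subfolder="whisper-large-v3")
print(inherit_pin(primary, [whisper])[0].revision)
```

Sources from a different repository are left untouched in this sketch, because the primary pin says nothing about them; under the invariant stated earlier they would need their own explicit pin.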

This report does not claim a trust_remote_code=False bypass, unauthenticated RCE, or real artifact compromise. The issue is improper propagation of explicit artifact pins across supported loader paths.

Impact

Affected users are operators who pin vLLM model deployments to a reviewed Hugging Face revision for safety review, provenance, rollback, or reproducibility. The impact is that the pin does not reliably describe the full set of artifacts vLLM serves. Even when the operator selects an audited revision, vLLM can resolve behavior-affecting secondary artifacts from the repository default branch or another mutable ref.

Depending on the model path, the unpinned artifact can be dynamic model code, a GGUF file, an image processor, retrieval side weights, or the same-repository Kimi-Audio Whisper subfolder weights/config.

This breaks the operational guarantee of a pinned deployment: "serve the exact artifact set I reviewed." A later change to an unpinned secondary artifact can alter model behavior without changing the operator's configured revision, making review, rollback, incident response, and audit records unreliable.
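
Operators who want to check a deployment from the outside could compare the files a process actually opened against the pinned commit. The sketch below assumes two things that are not stated in the advisory: that the pin is a full 40-character commit hash rather than a branch or tag, and that artifacts are read from a Hugging Face cache laid out as .../snapshots/<commit>/.... All function names are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List, Optional

_COMMIT_RE = re.compile(r"[0-9a-f]{40}")


def is_immutable_pin(revision: Optional[str]) -> bool:
    """True only for a full commit hash; branches and tags can move."""
    return revision is not None and _COMMIT_RE.fullmatch(revision) is not None


def snapshot_commit(cached_path: str) -> Optional[str]:
    """Return the path component following 'snapshots', if there is one."""
    parts = PurePosixPath(cached_path).parts
    if "snapshots" in parts:
        idx = parts.index("snapshots")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None


def outside_pin(paths: Iterable[str], pinned_commit: str) -> List[str]:
    """List cached files that do not live under the pinned snapshot."""
    return [p for p in paths if snapshot_commit(p) != pinned_commit]
```

Any path returned by outside_pin is a candidate for the kind of secondary artifact described above and would be worth reviewing by hand.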

Occurrences

  • vllm/model_executor/models/kimi_k25.py L111-L114 — Kimi-K2.5 loads its image processor with cached_get_image_processor() but does not pass self.ctx.model_config.revision. The processor can therefore resolve from the default repository revision even when the model deployment is pinned.
  • vllm/model_executor/models/kimi_audio.py L425-L430 — Kimi-Audio declares same-repository whisper-large-v3 secondary weights with revision=None. A pinned Kimi-Audio deployment can therefore load the Whisper audio tower weights from an unpinned/default revision.
  • vllm/model_executor/models/kimi_audio.py L92-L95 — Kimi-Audio loads Whisper config from the same repository's whisper-large-v3 subfolder without passing the top-level model revision. The config for this behavior-affecting subcomponent can be resolved outside the audited model revision.
  • vllm/model_executor/models/registry.py L1058-L1064 — The later dynamic model-class resolution repeats the same pin-decay pattern: it forwards revision and trust_remote_code, but omits code_revision. This means an operator-provided code pin is not enforced at the dynamic module loader boundary.
  • vllm/model_executor/model_loader/gguf_loader.py L58-L60 — The direct GGUF form repo/file.gguf calls hf_hub_download(repo_id=repo_id, filename=filename) without passing model_config.revision. A deployment that pins the model revision can therefore resolve this GGUF file from the repository default revision.
  • vllm/model_executor/models/registry.py L1045-L1051 — try_get_class_from_dynamic_module() is called for external auto_map config/model classes with revision=model_config.revision, but without forwarding model_config.code_revision. When --code-revision is set, this dynamic module resolution can still fall back to the default code revision instead of the audited code revision.
  • vllm/model_executor/models/roberta.py L203-L209 — BgeM3EmbeddingModel creates same-repository secondary sparse/ColBERT weight sources with revision=None. The primary model revision is not propagated to these side weights, so they can be downloaded outside the operator-selected model revision.
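
Call sites like the ones listed above can be searched for mechanically. The following sketch uses Python's standard ast module to report calls to a named function that omit a given keyword argument. It is a heuristic under stated assumptions, not the method used to find these occurrences: calls_missing_keyword is a hypothetical helper, it matches on the function name only, and it does not detect a value passed positionally.

```python
import ast
from typing import List


def calls_missing_keyword(source: str, func_name: str, keyword: str) -> List[int]:
    """Return line numbers of calls to func_name that do not pass keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = node.func
        # Handle both bare names (f(...)) and attribute access (mod.f(...)).
        if isinstance(callee, ast.Name):
            name = callee.id
        else:
            name = getattr(callee, "attr", None)
        if name != func_name:
            continue
        passed = {kw.arg for kw in node.keywords}
        # kw.arg is None for a **kwargs expansion, which might supply the
        # keyword at runtime, so such calls are not reported.
        if keyword not in passed and None not in passed:
            missing.append(node.lineno)
    return sorted(missing)


sample = "path = hf_hub_download(repo_id=repo_id, filename=filename)\n"
print(calls_missing_keyword(sample, "hf_hub_download", "revision"))
```

The same function could be pointed at other loader entry points by changing func_name and keyword, for example to look for dynamic-module calls that omit code_revision.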

Fixes

This was fixed in: https://github.com/vllm-project/vllm/pull/42616


Originally filed via huntr: https://huntr.com/bounties/3f1e24c0-87d2-4f6c-a705-820f380879ac.

The vLLM maintainer (Russell Bryant) redirected the report to the private GHSA channel. Offline proof bundle (vllm_artifact_pin_decay_bundle_verify.py + bundle-verification-20260430T143506Z.json) is available upon request.


{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.22.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-47155"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-345"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-06-10T17:11:38Z",
    "nvd_published_at": null,
    "severity": "MODERATE"
  },
  "details": "### Summary\n\nvLLM\u0027s revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies `--revision` or `--code-revision` can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository subfolder weights/config from an unpinned/default revision.\n\nThis is a supply-chain integrity issue for pinned vLLM deployments. Operators can believe they are serving a reviewed model revision while vLLM resolves behavior-affecting nested or sibling artifacts outside that reviewed revision.\n\n### Details\n\nThe expected invariant is:\n\n\u003e When a vLLM operator supplies a model or code revision pin, every code, config, processor, weight file, side weight, and same-repository subfolder artifact loaded as part of that model should resolve under that pin unless vLLM exposes and enforces a separate explicit pin for that artifact.\n\nCurrent `main` was verified affected at commit `3795d7acf431980e62e738493f437ae2a51549da`.\n\nAffected source boundaries:\n\n- `vllm/model_executor/models/registry.py:1045-1051` and `:1058-1064`\n  - `_try_resolve_transformers()` passes `revision=model_config.revision` and `trust_remote_code=model_config.trust_remote_code`, but omits `code_revision=model_config.code_revision` for external `auto_map` dynamic module imports.\n- `vllm/model_executor/model_loader/gguf_loader.py:58-60`\n  - The direct-file GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `revision`.\n- `vllm/model_executor/models/roberta.py:203-209`\n  - BGE-M3 secondary sparse and ColBERT side weights are declared with `revision=None`.\n- `vllm/model_executor/models/kimi_k25.py:111-114`\n  - Kimi-K2.5 calls `cached_get_image_processor()` without passing `model_config.revision`.\n- `vllm/model_executor/models/kimi_audio.py:92-95`\n  - Kimi-Audio loads Whisper config from the `whisper-large-v3` subfolder without a `revision` argument.\n- `vllm/model_executor/models/kimi_audio.py:425-430`\n  - Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`.\n- `vllm/model_executor/model_loader/default_loader.py:287-301`\n  - The default loader preserves `model_config.revision` for the primary source, then consumes model-supplied secondary sources as declared.\n\nThe strongest example is Kimi-Audio: the primary `moonshotai/Kimi-Audio-7B-Instruct` weights preserve the configured model revision, but the same-repository `whisper-large-v3` audio tower config/weights do not. A pinned Kimi-Audio deployment can therefore load the Whisper subfolder outside the audited revision.\n\nThis report does not claim a `trust_remote_code=False` bypass, unauthenticated RCE, or real artifact compromise. The issue is improper propagation of explicit artifact pins across supported loader paths.\n\n### Impact\n\nAffected users are operators who pin vLLM model deployments to a reviewed Hugging Face revision for safety review, provenance, rollback, or reproducibility. The impact is that the pin does not reliably describe the full set of artifacts vLLM serves. Even when the operator selects an audited revision, vLLM can resolve behavior-affecting secondary artifacts from the repository default branch or another mutable ref.\n\nDepending on the model path, the unpinned artifact can be dynamic model code, a GGUF file, an image processor, retrieval side weights, or the same-repository Kimi-Audio Whisper subfolder weights/config.\n\nThis breaks the operational guarantee of a pinned deployment: \"serve the exact artifact set I reviewed.\" A later change to an unpinned secondary artifact can alter model behavior without changing the operator\u0027s configured revision, making review, rollback, incident response, and audit records unreliable.\n\n### Occurrences\n\n- `vllm/model_executor/models/kimi_k25.py` L111-L114 \u2014 Kimi-K2.5 loads its image processor with `cached_get_image_processor()` but does not pass `self.ctx.model_config.revision`. The processor can therefore resolve from the default repository revision even when the model deployment is pinned.\n- `vllm/model_executor/models/kimi_audio.py` L425-L430 \u2014 Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`. A pinned Kimi-Audio deployment can therefore load the Whisper audio tower weights from an unpinned/default revision.\n- `vllm/model_executor/models/kimi_audio.py` L92-L95 \u2014 Kimi-Audio loads Whisper config from the same repository\u0027s `whisper-large-v3` subfolder without passing the top-level model revision. The config for this behavior-affecting subcomponent can be resolved outside the audited model revision.\n- `vllm/model_executor/models/registry.py` L1058-L1064 \u2014 The later dynamic model-class resolution repeats the same pin-decay pattern: it forwards `revision` and `trust_remote_code`, but omits `code_revision`. This means an operator-provided code pin is not enforced at the dynamic module loader boundary.\n- `vllm/model_executor/model_loader/gguf_loader.py` L58-L60 \u2014 The direct GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `model_config.revision`. A deployment that pins the model revision can therefore resolve this GGUF file from the repository default revision.\n- `vllm/model_executor/models/registry.py` L1045-L1051 \u2014 `try_get_class_from_dynamic_module()` is called for external `auto_map` config/model classes with `revision=model_config.revision`, but without forwarding `model_config.code_revision`. When `--code-revision` is set, this dynamic module resolution can still fall back to the default code revision instead of the audited code revision.\n- `vllm/model_executor/models/roberta.py` L203-L209 \u2014 `BgeM3EmbeddingModel` creates same-repository secondary sparse/ColBERT weight sources with `revision=None`. The primary model revision is not propagated to these side weights, so they can be downloaded outside the operator-selected model revision.\n\n### Fixes\n\nThis was fixed in: https://github.com/vllm-project/vllm/pull/42616\n\n___\n\nOriginally filed via huntr: https://huntr.com/bounties/3f1e24c0-87d2-4f6c-a705-820f380879ac.\n\nThe vLLM maintainer (Russell Bryant) redirected the report to the private GHSA channel. Offline proof bundle (`vllm_artifact_pin_decay_bundle_verify.py` + `bundle-verification-20260430T143506Z.json`) is available upon request.",
  "id": "GHSA-3ww4-5jv9-j5gm",
  "modified": "2026-06-10T17:11:38Z",
  "published": "2026-06-10T17:11:38Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-3ww4-5jv9-j5gm"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:H/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "vLLM\u0027s Artifact Pin Decay allows pinned deployments to load unpinned code, weights, and processors"
}
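
The record above marks every vllm release before 0.22.0 as affected. A deployment script could test an installed version string against that range along the lines of the sketch below. This is a simplified comparison written for illustration: release_tuple and is_affected are hypothetical helpers, only the leading numeric release segment is compared, and pre-release or local suffixes are ignored, so a dedicated version-parsing library would be a safer choice in practice.

```python
import re
from typing import Tuple

FIXED_VERSION = "0.22.0"  # taken from the "fixed" event in the record above


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading dotted-numeric part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(installed: str, fixed: str = FIXED_VERSION) -> bool:
    """True when the installed release sorts before the fixed release."""
    a, b = release_tuple(installed), release_tuple(fixed)
    # Pad with zeros so that "0.22" and "0.22.0" compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b
```

The installed version string could be obtained with importlib.metadata.version("vllm") and passed to is_affected.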

