Vulnerability-Lookup

PYSEC-2024-235

Vulnerability from pysec - Published: 2024-02-26 16:27 - Updated: 2025-02-26 02:48

Details

With the following crawler configuration:

from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders import RecursiveUrlLoader

url = "https://example.com"
loader = RecursiveUrlLoader(
    url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()

An attacker who controls the content of https://example.com could place a malicious HTML file there containing links such as "https://example.completely.different/my_file.html", and the crawler would download that file as well, even though prevent_outside=True (the loader's default, which the configuration above does not change).

https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51

Resolved in https://github.com/langchain-ai/langchain/pull/15559
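
The linked fix is not reproduced here. As a minimal sketch of the kind of check that keeps a crawler on its starting site, the hypothetical helper is_same_site below (not part of LangChain) resolves each link against the base URL and compares the parsed scheme, host, and port instead of raw string prefixes. The example link in this advisory begins with the same characters as the base URL, so a naive prefix comparison would accept it; whether the vulnerable code failed in exactly this way is not stated in this record.

```python
from urllib.parse import urljoin, urlparse


def is_same_site(base_url: str, link: str) -> bool:
    """Hypothetical helper: True only if `link` stays on the base URL's site."""
    base = urlparse(base_url)
    # Resolve relative links ("/docs/page.html") against the base first.
    target = urlparse(urljoin(base_url, link))
    return (
        target.scheme in ("http", "https")
        and target.scheme == base.scheme
        and target.hostname == base.hostname  # parsed host, not a string prefix
        and target.port == base.port
    )


base = "https://example.com"
outside = "https://example.completely.different/my_file.html"

# A plain prefix test is fooled because the hostname merely starts with "example.com".
print(outside.startswith(base))         # True
print(is_same_site(base, outside))      # False
print(is_same_site(base, "/docs/a.html"))  # True
```

Comparing parsed components is a deliberate choice here: it rejects look-alike hosts while still accepting relative links that resolve to the same site.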

Severity

8.1 (High)

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H
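
To show how a vector string like the one above maps to its numeric score, here is a small sketch of the CVSS v3.x base-score formula. The function names are illustrative, the metric weights are taken from the CVSS v3.1 specification, and only base metrics are handled. The rounding helper follows the v3.1 method; v3.0 defines rounding slightly differently, although both give the same result for the vectors on this page.

```python
import math

# Metric weights from the CVSS v3.x specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the CVSS v3.1 integer method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/...'."""
    parts = vector.split("/")
    metrics = dict(p.split(":") for p in parts[1:])  # skip the 'CVSS:3.x' prefix
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08  # scope-changed vectors are scaled before rounding
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 8.1
```

The same function applied to the CVSS:3.0 vector that appears in the CVE record on this page yields 3.7, which matches the score listed there.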

Impacted products

Name	purl
langchain-exa	pkg:pypi/langchain-exa

Aliases

CVE-2024-0243

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "langchain-exa",
        "purl": "pkg:pypi/langchain-exa"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
            },
            {
              "fixed": "bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
            }
          ],
          "repo": "https://github.com/langchain-ai/langchain",
          "type": "GIT"
        },
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.1.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ],
      "versions": [
        "0.0.1"
      ]
    }
  ],
  "aliases": [
    "CVE-2024-0243"
  ],
  "details": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559",
  "id": "PYSEC-2024-235",
  "modified": "2025-02-26T02:48:56.937312+00:00",
  "published": "2024-02-26T16:27:49+00:00",
  "references": [
    {
      "type": "EVIDENCE",
      "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
    },
    {
      "type": "FIX",
      "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
    },
    {
      "type": "FIX",
      "url": "https://github.com/langchain-ai/langchain/pull/15559"
    },
    {
      "type": "REPORT",
      "url": "https://github.com/langchain-ai/langchain/pull/15559"
    },
    {
      "type": "REPORT",
      "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
    },
    {
      "type": "WEB",
      "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
    }
  ],
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ]
}
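
As a rough illustration of how the ECOSYSTEM range in the record above can be read, the sketch below walks the "introduced" and "fixed" events in order and reports whether a given version falls inside the affected range. The helper names are hypothetical, and the version parser is a simplification that only understands dotted integers; real PyPI versions follow PEP 440 and would need a proper version library.

```python
def parse_version(text: str) -> tuple:
    """Simplified parser: '0.1.0' -> (0, 1, 0). Assumes dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version: str, events: list) -> bool:
    """Walk range events in order; the last matching event decides the status."""
    current = parse_version(version)
    affected = False
    for event in events:  # assumes events are already sorted by version
        if "introduced" in event and current >= parse_version(event["introduced"]):
            affected = True
        elif "fixed" in event and current >= parse_version(event["fixed"]):
            affected = False
    return affected


# Events copied from the ECOSYSTEM range of the record above.
events = [{"introduced": "0"}, {"fixed": "0.1.0"}]

print(is_affected("0.0.1", events))  # True
print(is_affected("0.1.0", events))  # False
```

The "fixed" version itself is treated as unaffected, which is why 0.1.0 falls outside the range.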

CVE-2024-0243 (GCVE-0-2024-0243)

Vulnerability from cvelistv5 – Published: 2024-02-24 17:59 – Updated: 2025-04-22 16:14

Title

Server-side Request Forgery In Recursive URL Loader

Summary

With the following crawler configuration:

```python
from bs4 import BeautifulSoup as Soup

url = "https://example.com"
loader = RecursiveUrlLoader(
    url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()
```

An attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though `prevent_outside=True`.

https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51

Resolved in https://github.com/langchain-ai/langchain/pull/15559

Severity

3.7 (Low)

CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N

CWE

CWE-918 - Server-Side Request Forgery (SSRF)

Assigner

@huntr_ai

References

	URL	Tags

	https://huntr.com/bounties/370904e7-10ac-40a4-a8d…
	https://github.com/langchain-ai/langchain/commit/…
	https://github.com/langchain-ai/langchain/pull/15559

Impacted products

	Vendor	Product	Version
	langchain-ai	langchain-ai/langchain	Affected: unspecified , < 0.1.0 (custom)

JSON

{
  "containers": {
    "adp": [
      {
        "affected": [
          {
            "cpes": [
              "cpe:2.3:a:langchain-ai:langchain-ai\\/langchain:*:*:*:*:*:*:*:*"
            ],
            "defaultStatus": "unknown",
            "product": "langchain-ai\\/langchain",
            "vendor": "langchain-ai",
            "versions": [
              {
                "lessThan": "0.1.0",
                "status": "affected",
                "version": "0",
                "versionType": "custom"
              }
            ]
          }
        ],
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2024-0243",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2024-02-26T18:43:11.371044Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2025-04-22T16:14:26.674Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      },
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-01T17:41:16.443Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "tags": [
              "x_transferred"
            ],
            "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
          },
          {
            "tags": [
              "x_transferred"
            ],
            "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
          },
          {
            "tags": [
              "x_transferred"
            ],
            "url": "https://github.com/langchain-ai/langchain/pull/15559"
          }
        ],
        "title": "CVE Program Container"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "langchain-ai/langchain",
          "vendor": "langchain-ai",
          "versions": [
            {
              "lessThan": "0.1.0",
              "status": "affected",
              "version": "unspecified",
              "versionType": "custom"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559"
        }
      ],
      "metrics": [
        {
          "cvssV3_0": {
            "attackComplexity": "HIGH",
            "attackVector": "LOCAL",
            "availabilityImpact": "NONE",
            "baseScore": 3.7,
            "baseSeverity": "LOW",
            "confidentialityImpact": "LOW",
            "integrityImpact": "LOW",
            "privilegesRequired": "HIGH",
            "scope": "CHANGED",
            "userInteraction": "REQUIRED",
            "vectorString": "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N",
            "version": "3.0"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-918",
              "description": "CWE-918 Server-Side Request Forgery (SSRF)",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2024-03-13T20:57:24.633Z",
        "orgId": "c09c270a-b464-47c1-9133-acb35b22c19a",
        "shortName": "@huntr_ai"
      },
      "references": [
        {
          "url": "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861"
        },
        {
          "url": "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22"
        },
        {
          "url": "https://github.com/langchain-ai/langchain/pull/15559"
        }
      ],
      "source": {
        "advisory": "370904e7-10ac-40a4-a8d4-e2d16e1ca861",
        "discovery": "EXTERNAL"
      },
      "title": "Server-side Request Forgery In Recursive URL Loader"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "c09c270a-b464-47c1-9133-acb35b22c19a",
    "assignerShortName": "@huntr_ai",
    "cveId": "CVE-2024-0243",
    "datePublished": "2024-02-24T17:59:26.498Z",
    "dateReserved": "2024-01-04T21:47:13.281Z",
    "dateUpdated": "2025-04-22T16:14:26.674Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1"
}

