CVE-2023-52934 (GCVE-0-2023-52934)
Vulnerability from cvelistv5
Published: 2025-03-27 16:37
Modified: 2025-05-04 07:46
Severity: not provided
Summary
In the Linux kernel, the following vulnerability has been resolved:

mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups

In commit 34488399fa08 ("mm/madvise: add file and shmem support to MADV_COLLAPSE") we make the following change to find_pmd_or_thp_or_none():

    -       if (!pmd_present(pmde))
    -               return SCAN_PMD_NULL;
    +       if (pmd_none(pmde))
    +               return SCAN_PMD_NONE;

This was for use by MADV_COLLAPSE file/shmem codepaths, where MADV_COLLAPSE might identify a pte-mapped hugepage, only to have khugepaged race in, free the pte table, and clear the pmd. Such codepaths include:

A) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER already in the pagecache.
B) In retract_page_tables(), if we fail to grab mmap_lock for the target mm/address.

In these cases, collapse_pte_mapped_thp() really does expect a none (not just !present) pmd, and we want to identify that case separately from the case where no pmd is found, or it's a bad-pmd (of course, many things could happen once we drop mmap_lock, and the pmd could plausibly undergo multiple transitions due to intervening fault, split, etc). Regardless, the code is prepared to install a huge-pmd only when the existing pmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.

However, the commit introduces a logical hole; namely, that we've allowed !none- && !huge- && !bad-pmds to be classified as genuine pte-table-mapping-pmds. One example that could leak through is a swap entry. The pmd values aren't checked again before use in pte_offset_map_lock(), which is expecting nothing less than a genuine pte-table-mapping-pmd.

We want to put back the !pmd_present() check (below the pmd_none() check), but need to be careful to deal with subtleties in pmd transitions and treatments by various archs.

The issue is that __split_huge_pmd_locked() temporarily clears the present bit (or otherwise marks the entry as invalid), but pmd_present() and pmd_trans_huge() still need to return true while the pmd is in this transitory state. For example, x86's pmd_present() also checks the _PAGE_PSE bit, riscv's version also checks the _PAGE_LEAF bit, and arm64 also checks a PMD_PRESENT_INVALID bit.

Covering all 4 cases for x86 (all checks done on the same pmd value):

1) pmd_present() && pmd_trans_huge()
   All we actually know here is that the PSE bit is set. Either:
   a) We aren't racing with __split_huge_page(), and PRESENT or PROTNONE is set.
      => huge-pmd
   b) We are currently racing with __split_huge_page(). The danger here is that we proceed as if we have a huge-pmd, but really we are looking at a pte-mapping-pmd. So, what is the risk of this danger?

      The only relevant path is:

          madvise_collapse() -> collapse_pte_mapped_thp()

      Where we might just incorrectly report back "success", when really the memory isn't pmd-backed. This is fine, since split could happen immediately after (actually) successful madvise_collapse(). So, it should be safe to just assume huge-pmd here.

2) pmd_present() && !pmd_trans_huge()
   Either:
   a) PSE not set and either PRESENT or PROTNONE is.
      => pte-table-mapping pmd (or PROT_NONE)
   b) devmap. This routine can be called immediately after unlocking/locking mmap_lock -- or called with no locks held (see khugepaged_scan_mm_slot()), so previous VMA checks have since been invalidated.

3) !pmd_present() && pmd_trans_huge()
   Not possible.

4) !pmd_present() && !pmd_trans_huge()
   Neither PRESENT nor PROTNONE set
   => not present

I've checked all archs that implement pmd_trans_huge() (arm64, riscv, powerpc, loongarch, x86, mips, s390) and this logic roughly translates (though devmap treatment is unique to x86 and powerpc, and (3) doesn't necessarily hold in general -- but that doesn't matter since !pmd_present() always takes failure path).

Also, add a comment above find_pmd_or_thp_or_none() ---truncated---
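The check ordering described in the summary can be illustrated with a small user-space model. This is a minimal sketch under stated assumptions, not the kernel code: every `toy_`/`TOY_` name, the flag values, and the `TOY_SWAP` bit are hypothetical, the two non-error result names are placeholders, the predicates only loosely follow the x86 behaviour described above, and the bad-pmd and devmap handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy flag bits, loosely modelled on x86. Values are illustrative only. */
#define TOY_PRESENT  (1u << 0)
#define TOY_PSE      (1u << 1)
#define TOY_PROTNONE (1u << 2)
#define TOY_SWAP     (1u << 3) /* hypothetical stand-in for a non-present swap entry */

typedef uint32_t toy_pmd_t;

enum toy_scan_result {
    TOY_SCAN_SUCCEED,    /* treated as a pte-table-mapping pmd */
    TOY_SCAN_PMD_NULL,   /* not present */
    TOY_SCAN_PMD_NONE,   /* empty (none) pmd */
    TOY_SCAN_PMD_MAPPED, /* treated as a huge pmd */
};

static bool toy_pmd_none(toy_pmd_t p) { return p == 0; }

/* Stays true while PSE is set, so a pmd in mid-split still counts as present. */
static bool toy_pmd_present(toy_pmd_t p)
{
    return (p & (TOY_PRESENT | TOY_PROTNONE | TOY_PSE)) != 0;
}

static bool toy_pmd_trans_huge(toy_pmd_t p) { return (p & TOY_PSE) != 0; }

/* Ordering without the !present check: a swap-like entry falls through. */
static enum toy_scan_result toy_classify_buggy(toy_pmd_t p)
{
    if (toy_pmd_none(p))
        return TOY_SCAN_PMD_NONE;
    if (toy_pmd_trans_huge(p))
        return TOY_SCAN_PMD_MAPPED;
    return TOY_SCAN_SUCCEED;
}

/* Ordering with the !present check restored below the none check. */
static enum toy_scan_result toy_classify_fixed(toy_pmd_t p)
{
    /* Case 3 (!present && trans_huge) cannot occur in this toy model. */
    assert(toy_pmd_present(p) || !toy_pmd_trans_huge(p));

    if (toy_pmd_none(p))
        return TOY_SCAN_PMD_NONE;
    if (!toy_pmd_present(p))
        return TOY_SCAN_PMD_NULL;
    if (toy_pmd_trans_huge(p))
        return TOY_SCAN_PMD_MAPPED;
    return TOY_SCAN_SUCCEED;
}
```

In this model, `toy_classify_buggy(TOY_SWAP)` reports a pte-table-mapping pmd, while `toy_classify_fixed(TOY_SWAP)` takes the not-present failure path. A value with only `TOY_PSE` set (the mid-split state) is classified as huge by both, matching case 1b.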
Impacted products
Vendor   Product   Version
Linux    Linux     34488399fa08faaf664743fa54b271eb6f9e1321
Linux    Linux     6.1
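The semver ranges in this record (6.1 affected, 6.1.11 and 6.2 unaffected) can be turned into a simple version check. The sketch below is an illustration only: the `kver` struct and both function names are hypothetical, and the result reflects the upstream ranges listed in the record, not any particular distribution kernel, which may carry the fix under a different version string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type: numeric parts of a kernel release string. */
struct kver {
    int major;
    int minor;
    int patch;
};

/* Parse "X.Y" or "X.Y.Z[-suffix]"; a missing patch level is treated as 0. */
static bool parse_kver(const char *release, struct kver *out)
{
    assert(release != NULL && out != NULL);
    out->major = 0;
    out->minor = 0;
    out->patch = 0;
    return sscanf(release, "%d.%d.%d",
                  &out->major, &out->minor, &out->patch) >= 2;
}

/*
 * True if the release falls inside the range the record marks as
 * vulnerable: 6.1 <= version < 6.1.11. Unparseable input yields false.
 */
static bool in_affected_range(const char *release)
{
    struct kver v;

    if (!parse_kver(release, &v))
        return false;
    return v.major == 6 && v.minor == 1 && v.patch < 11;
}
```

For example, `in_affected_range("6.1.10")` is true and `in_affected_range("6.1.11")` is false. A string such as the output of `uname -r` can be passed directly, since any trailing suffix after the numeric part is ignored.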


{
  "containers": {
    "cna": {
      "affected": [
        {
          "defaultStatus": "unaffected",
          "product": "Linux",
          "programFiles": [
            "mm/khugepaged.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "lessThan": "96aaaf8666010a39430cecf8a65c7ce2908a030f",
              "status": "affected",
              "version": "34488399fa08faaf664743fa54b271eb6f9e1321",
              "versionType": "git"
            },
            {
              "lessThan": "edb5d0cf5525357652aff6eacd9850b8ced07143",
              "status": "affected",
              "version": "34488399fa08faaf664743fa54b271eb6f9e1321",
              "versionType": "git"
            }
          ]
        },
        {
          "defaultStatus": "affected",
          "product": "Linux",
          "programFiles": [
            "mm/khugepaged.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "status": "affected",
              "version": "6.1"
            },
            {
              "lessThan": "6.1",
              "status": "unaffected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.1.*",
              "status": "unaffected",
              "version": "6.1.11",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "version": "6.2",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "cpeApplicability": [
        {
          "nodes": [
            {
              "cpeMatch": [
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.1.11",
                  "versionStartIncluding": "6.1",
                  "vulnerable": true
                },
                {
                  "criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
                  "versionEndExcluding": "6.2",
                  "versionStartIncluding": "6.1",
                  "vulnerable": true
                }
              ],
              "negate": false,
              "operator": "OR"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups\n\nIn commit 34488399fa08 (\"mm/madvise: add file and shmem support to\nMADV_COLLAPSE\") we make the following change to find_pmd_or_thp_or_none():\n\n\t-       if (!pmd_present(pmde))\n\t-               return SCAN_PMD_NULL;\n\t+       if (pmd_none(pmde))\n\t+               return SCAN_PMD_NONE;\n\nThis was for-use by MADV_COLLAPSE file/shmem codepaths, where\nMADV_COLLAPSE might identify a pte-mapped hugepage, only to have\nkhugepaged race-in, free the pte table, and clear the pmd.  Such codepaths\ninclude:\n\nA) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER\n   already in the pagecache.\nB) In retract_page_tables(), if we fail to grab mmap_lock for the target\n   mm/address.\n\nIn these cases, collapse_pte_mapped_thp() really does expect a none (not\njust !present) pmd, and we want to suitably identify that case separate\nfrom the case where no pmd is found, or it\u0027s a bad-pmd (of course, many\nthings could happen once we drop mmap_lock, and the pmd could plausibly\nundergo multiple transitions due to intervening fault, split, etc). \nRegardless, the code is prepared install a huge-pmd only when the existing\npmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.\n\nHowever, the commit introduces a logical hole; namely, that we\u0027ve allowed\n!none- \u0026\u0026 !huge- \u0026\u0026 !bad-pmds to be classified as genuine\npte-table-mapping-pmds.  One such example that could leak through are swap\nentries.  The pmd values aren\u0027t checked again before use in\npte_offset_map_lock(), which is expecting nothing less than a genuine\npte-table-mapping-pmd.\n\nWe want to put back the !pmd_present() check (below the pmd_none() check),\nbut need to be careful to deal with subtleties in pmd transitions and\ntreatments by various arch.\n\nThe issue is that __split_huge_pmd_locked() temporarily clears the present\nbit (or otherwise marks the entry as invalid), but pmd_present() and\npmd_trans_huge() still need to return true while the pmd is in this\ntransitory state.  For example, x86\u0027s pmd_present() also checks the\n_PAGE_PSE , riscv\u0027s version also checks the _PAGE_LEAF bit, and arm64 also\nchecks a PMD_PRESENT_INVALID bit.\n\nCovering all 4 cases for x86 (all checks done on the same pmd value):\n\n1) pmd_present() \u0026\u0026 pmd_trans_huge()\n   All we actually know here is that the PSE bit is set. Either:\n   a) We aren\u0027t racing with __split_huge_page(), and PRESENT or PROTNONE\n      is set.\n      =\u003e huge-pmd\n   b) We are currently racing with __split_huge_page().  The danger here\n      is that we proceed as-if we have a huge-pmd, but really we are\n      looking at a pte-mapping-pmd.  So, what is the risk of this\n      danger?\n\n      The only relevant path is:\n\n\tmadvise_collapse() -\u003e collapse_pte_mapped_thp()\n\n      Where we might just incorrectly report back \"success\", when really\n      the memory isn\u0027t pmd-backed.  This is fine, since split could\n      happen immediately after (actually) successful madvise_collapse().\n      So, it should be safe to just assume huge-pmd here.\n\n2) pmd_present() \u0026\u0026 !pmd_trans_huge()\n   Either:\n   a) PSE not set and either PRESENT or PROTNONE is.\n      =\u003e pte-table-mapping pmd (or PROT_NONE)\n   b) devmap.  This routine can be called immediately after\n      unlocking/locking mmap_lock -- or called with no locks held (see\n      khugepaged_scan_mm_slot()), so previous VMA checks have since been\n      invalidated.\n\n3) !pmd_present() \u0026\u0026 pmd_trans_huge()\n  Not possible.\n\n4) !pmd_present() \u0026\u0026 !pmd_trans_huge()\n  Neither PRESENT nor PROTNONE set\n  =\u003e not present\n\nI\u0027ve checked all archs that implement pmd_trans_huge() (arm64, riscv,\npowerpc, longarch, x86, mips, s390) and this logic roughly translates\n(though devmap treatment is unique to x86 and powerpc, and (3) doesn\u0027t\nnecessarily hold in general -- but that doesn\u0027t matter since\n!pmd_present() always takes failure path).\n\nAlso, add a comment above find_pmd_or_thp_or_none()\n---truncated---"
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-05-04T07:46:19.066Z",
        "orgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
        "shortName": "Linux"
      },
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/96aaaf8666010a39430cecf8a65c7ce2908a030f"
        },
        {
          "url": "https://git.kernel.org/stable/c/edb5d0cf5525357652aff6eacd9850b8ced07143"
        }
      ],
      "title": "mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups",
      "x_generator": {
        "engine": "bippy-1.2.0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
    "assignerShortName": "Linux",
    "cveId": "CVE-2023-52934",
    "datePublished": "2025-03-27T16:37:14.857Z",
    "dateReserved": "2024-08-21T06:07:11.020Z",
    "dateUpdated": "2025-05-04T07:46:19.066Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "vulnerability-lookup:meta": {
    "nvd": "{\"cve\":{\"id\":\"CVE-2023-52934\",\"sourceIdentifier\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\",\"published\":\"2025-03-27T17:15:43.207\",\"lastModified\":\"2025-03-28T18:11:49.747\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"In the Linux kernel, the following vulnerability has been resolved:\\n\\nmm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups\\n\\nIn commit 34488399fa08 (\\\"mm/madvise: add file and shmem support to\\nMADV_COLLAPSE\\\") we make the following change to find_pmd_or_thp_or_none():\\n\\n\\t-       if (!pmd_present(pmde))\\n\\t-               return SCAN_PMD_NULL;\\n\\t+       if (pmd_none(pmde))\\n\\t+               return SCAN_PMD_NONE;\\n\\nThis was for-use by MADV_COLLAPSE file/shmem codepaths, where\\nMADV_COLLAPSE might identify a pte-mapped hugepage, only to have\\nkhugepaged race-in, free the pte table, and clear the pmd.  Such codepaths\\ninclude:\\n\\nA) If we find a suitably-aligned compound page of order HPAGE_PMD_ORDER\\n   already in the pagecache.\\nB) In retract_page_tables(), if we fail to grab mmap_lock for the target\\n   mm/address.\\n\\nIn these cases, collapse_pte_mapped_thp() really does expect a none (not\\njust !present) pmd, and we want to suitably identify that case separate\\nfrom the case where no pmd is found, or it\u0027s a bad-pmd (of course, many\\nthings could happen once we drop mmap_lock, and the pmd could plausibly\\nundergo multiple transitions due to intervening fault, split, etc). \\nRegardless, the code is prepared install a huge-pmd only when the existing\\npmd entry is either a genuine pte-table-mapping-pmd, or the none-pmd.\\n\\nHowever, the commit introduces a logical hole; namely, that we\u0027ve allowed\\n!none- \u0026\u0026 !huge- \u0026\u0026 !bad-pmds to be classified as genuine\\npte-table-mapping-pmds.  One such example that could leak through are swap\\nentries.  The pmd values aren\u0027t checked again before use in\\npte_offset_map_lock(), which is expecting nothing less than a genuine\\npte-table-mapping-pmd.\\n\\nWe want to put back the !pmd_present() check (below the pmd_none() check),\\nbut need to be careful to deal with subtleties in pmd transitions and\\ntreatments by various arch.\\n\\nThe issue is that __split_huge_pmd_locked() temporarily clears the present\\nbit (or otherwise marks the entry as invalid), but pmd_present() and\\npmd_trans_huge() still need to return true while the pmd is in this\\ntransitory state.  For example, x86\u0027s pmd_present() also checks the\\n_PAGE_PSE , riscv\u0027s version also checks the _PAGE_LEAF bit, and arm64 also\\nchecks a PMD_PRESENT_INVALID bit.\\n\\nCovering all 4 cases for x86 (all checks done on the same pmd value):\\n\\n1) pmd_present() \u0026\u0026 pmd_trans_huge()\\n   All we actually know here is that the PSE bit is set. Either:\\n   a) We aren\u0027t racing with __split_huge_page(), and PRESENT or PROTNONE\\n      is set.\\n      =\u003e huge-pmd\\n   b) We are currently racing with __split_huge_page().  The danger here\\n      is that we proceed as-if we have a huge-pmd, but really we are\\n      looking at a pte-mapping-pmd.  So, what is the risk of this\\n      danger?\\n\\n      The only relevant path is:\\n\\n\\tmadvise_collapse() -\u003e collapse_pte_mapped_thp()\\n\\n      Where we might just incorrectly report back \\\"success\\\", when really\\n      the memory isn\u0027t pmd-backed.  This is fine, since split could\\n      happen immediately after (actually) successful madvise_collapse().\\n      So, it should be safe to just assume huge-pmd here.\\n\\n2) pmd_present() \u0026\u0026 !pmd_trans_huge()\\n   Either:\\n   a) PSE not set and either PRESENT or PROTNONE is.\\n      =\u003e pte-table-mapping pmd (or PROT_NONE)\\n   b) devmap.  This routine can be called immediately after\\n      unlocking/locking mmap_lock -- or called with no locks held (see\\n      khugepaged_scan_mm_slot()), so previous VMA checks have since been\\n      invalidated.\\n\\n3) !pmd_present() \u0026\u0026 pmd_trans_huge()\\n  Not possible.\\n\\n4) !pmd_present() \u0026\u0026 !pmd_trans_huge()\\n  Neither PRESENT nor PROTNONE set\\n  =\u003e not present\\n\\nI\u0027ve checked all archs that implement pmd_trans_huge() (arm64, riscv,\\npowerpc, longarch, x86, mips, s390) and this logic roughly translates\\n(though devmap treatment is unique to x86 and powerpc, and (3) doesn\u0027t\\nnecessarily hold in general -- but that doesn\u0027t matter since\\n!pmd_present() always takes failure path).\\n\\nAlso, add a comment above find_pmd_or_thp_or_none()\\n---truncated---\"}],\"metrics\":{},\"references\":[{\"url\":\"https://git.kernel.org/stable/c/96aaaf8666010a39430cecf8a65c7ce2908a030f\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/edb5d0cf5525357652aff6eacd9850b8ced07143\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"}]}}"
  }
}

