CVE-2024-38306
Vulnerability from cvelistv5
Published: 2024-06-25 14:22
Modified: 2024-12-19 09:03
Severity ?
Summary
In the Linux kernel, the following vulnerability has been resolved:

btrfs: protect folio::private when attaching extent buffer folios

[BUG]
Since v6.8 there are rare kernel crashes reported by various people,
the common factor is bad page status error messages like this:

  BUG: Bad page state in process kswapd0  pfn:d6e840
  page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c
  pfn:0xd6e840
  aops:btree_aops ino:1
  flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)
  page_type: 0xffffffff()
  raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0
  raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: non-NULL mapping

[CAUSE]
Commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to
allocate-then-attach method") changes the sequence when allocating a new
extent buffer.

Previously we always called grab_extent_buffer() under
mapping->i_private_lock, to ensure the safety on modification on
folio::private (which is a pointer to extent buffer for regular
sectorsize).

This can lead to the following race:

Thread A is trying to allocate an extent buffer at bytenr X, with 4
4K pages, meanwhile thread B is trying to release the page at X + 4K
(the second page of the extent buffer at X).

           Thread A                |                 Thread B
-----------------------------------+-------------------------------------
                                   | btree_release_folio()
                                   | | This is for the page at X + 4K,
                                   | | Not page X.
                                   | |
alloc_extent_buffer()              | |- release_extent_buffer()
|- filemap_add_folio() for the     | |  |- atomic_dec_and_test(eb->refs)
|  page at bytenr X (the first     | |  |
|  page).                          | |  |
|  Which returned -EEXIST.         | |  |
|                                  | |  |
|- filemap_lock_folio()            | |  |
|  Returned the first page locked. | |  |
|                                  | |  |
|- grab_extent_buffer()            | |  |
|  |- atomic_inc_not_zero()        | |  |
|  |  Returned false               | |  |
|  |- folio_detach_private()       | |  |- folio_detach_private() for X
|     |- folio_test_private()      | |     |- folio_test_private()
      |  Returned true             | |     |  Returned true
      |- folio_put()               |       |- folio_put()

Now there are two puts on the same folio at folio X, leading to refcount
underflow of the folio X, and eventually causing the BUG_ON() on the
page->mapping.

The condition is not that easy to hit:

- The release must be triggered for the middle page of an eb
  If the release is on the same first page of an eb, page lock would kick
  in and prevent the race.

- folio_detach_private() has a very small race window
  It's only between folio_test_private() and folio_clear_private().

That's exactly when mapping->i_private_lock is used to prevent such race,
and commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to
allocate-then-attach method") screwed that up.

At that time, I thought the page lock would kick in as
filemap_release_folio() also requires the page to be locked, but forgot
the filemap_release_folio() only locks one page, not all pages of an
extent buffer.

[FIX]
Move all the code requiring i_private_lock into
attach_eb_folio_to_filemap(), so that everything is done with proper
lock protection.

Furthermore to prevent future problems, add an extra
lockdep_assert_locked() to ensure we're holding the proper lock.

A reproducer that is able to hit the race (takes a few minutes with
instrumented code inserting delays to alloc_extent_buffer()):

  #!/bin/sh
  drop_caches () {
          while(true); do
                  echo 3 > /proc/sys/vm/drop_caches
                  echo 1 > /proc/sys/vm/compact_memory
          done
  }

  run_tar () {
          while(true); do
                  for x in `seq 1 80` ; do
                          tar cf /dev/zero /mnt > /dev/null &
                  done
                  wait
          done
  }

  mkfs.btrfs -f -d single -m single
---truncated---
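The double put described in the summary can be illustrated with a small, deterministic Python model. This is only a sketch under stated assumptions, not kernel code: the `Folio` class, the starting refcount of 1, and the helper names `detach_private_steps`, `run_unlocked` and `run_locked` are all hypothetical. The generator mirrors the test-then-clear shape of folio_detach_private(), with a `yield` marking the window in which the other thread can run.

```python
from dataclasses import dataclass


@dataclass
class Folio:
    """Toy stand-in for a folio: a refcount plus a 'private' flag (assumed model)."""
    refcount: int = 1        # simplification: only the reference held for folio::private
    has_private: bool = True


def detach_private_steps(folio):
    """Mimic the test-then-clear sequence; the yield is the race window."""
    if not folio.has_private:      # stands in for folio_test_private()
        return
    yield                          # another thread may run here
    folio.has_private = False      # stands in for folio_clear_private()
    folio.refcount -= 1            # stands in for folio_put()


def run_unlocked(folio):
    """Interleave two detach attempts with no lock: both pass the test."""
    a = detach_private_steps(folio)
    b = detach_private_steps(folio)
    next(a, None)   # thread A: test returns true
    next(b, None)   # thread B: test returns true
    next(a, None)   # thread A: clear + put
    next(b, None)   # thread B: clear + put, the second put
    return folio.refcount


def run_locked(folio):
    """Run each detach to completion before the next starts, as a lock would force."""
    for _ in range(2):
        for _ in detach_private_steps(folio):
            pass
    return folio.refcount


print(run_unlocked(Folio()))  # -1: refcount underflow
print(run_locked(Folio()))    # 0: the second detach sees the flag already cleared
```

Serializing the two sequences is what holding mapping->i_private_lock around the whole sequence achieves in the fix: the second caller observes the cleared flag and never reaches the put.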
Impacted products
Vendor   Product   Version
Linux    Linux     6.8


{
  "containers": {
    "adp": [
      {
        "providerMetadata": {
          "dateUpdated": "2024-08-02T04:04:25.336Z",
          "orgId": "af854a3a-2127-422b-91ae-364da2661108",
          "shortName": "CVE"
        },
        "references": [
          {
            "tags": [
              "x_transferred"
            ],
            "url": "https://git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b"
          },
          {
            "tags": [
              "x_transferred"
            ],
            "url": "https://git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309"
          }
        ],
        "title": "CVE Program Container"
      },
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2024-38306",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2024-09-10T17:08:21.055578Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2024-09-11T17:34:42.868Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "defaultStatus": "unaffected",
          "product": "Linux",
          "programFiles": [
            "fs/btrfs/extent_io.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "lessThan": "952f048eb901881a7cc6f7c1368b53cd386ead7b",
              "status": "affected",
              "version": "09e6cef19c9fc0e10547135476865b5272aa0406",
              "versionType": "git"
            },
            {
              "lessThan": "f3a5367c679d31473d3fbb391675055b4792c309",
              "status": "affected",
              "version": "09e6cef19c9fc0e10547135476865b5272aa0406",
              "versionType": "git"
            }
          ]
        },
        {
          "defaultStatus": "affected",
          "product": "Linux",
          "programFiles": [
            "fs/btrfs/extent_io.c"
          ],
          "repo": "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git",
          "vendor": "Linux",
          "versions": [
            {
              "status": "affected",
              "version": "6.8"
            },
            {
              "lessThan": "6.8",
              "status": "unaffected",
              "version": "0",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "6.9.*",
              "status": "unaffected",
              "version": "6.9.5",
              "versionType": "semver"
            },
            {
              "lessThanOrEqual": "*",
              "status": "unaffected",
              "version": "6.10",
              "versionType": "original_commit_for_fix"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "In the Linux kernel, the following vulnerability has been resolved:\n\nbtrfs: protect folio::private when attaching extent buffer folios\n\n[BUG]\nSince v6.8 there are rare kernel crashes reported by various people,\nthe common factor is bad page status error messages like this:\n\n  BUG: Bad page state in process kswapd0  pfn:d6e840\n  page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c\n  pfn:0xd6e840\n  aops:btree_aops ino:1\n  flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)\n  page_type: 0xffffffff()\n  raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0\n  raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000\n  page dumped because: non-NULL mapping\n\n[CAUSE]\nCommit 09e6cef19c9f (\"btrfs: refactor alloc_extent_buffer() to\nallocate-then-attach method\") changes the sequence when allocating a new\nextent buffer.\n\nPreviously we always called grab_extent_buffer() under\nmapping-\u003ei_private_lock, to ensure the safety on modification on\nfolio::private (which is a pointer to extent buffer for regular\nsectorsize).\n\nThis can lead to the following race:\n\nThread A is trying to allocate an extent buffer at bytenr X, with 4\n4K pages, meanwhile thread B is trying to release the page at X + 4K\n(the second page of the extent buffer at X).\n\n           Thread A                |                 Thread B\n-----------------------------------+-------------------------------------\n                                   | btree_release_folio()\n\t\t\t\t   | | This is for the page at X + 4K,\n\t\t\t\t   | | Not page X.\n\t\t\t\t   | |\nalloc_extent_buffer()              | |- release_extent_buffer()\n|- filemap_add_folio() for the     | |  |- atomic_dec_and_test(eb-\u003erefs)\n|  page at bytenr X (the first     | |  |\n|  page).                          | |  |\n|  Which returned -EEXIST.         
| |  |\n|                                  | |  |\n|- filemap_lock_folio()            | |  |\n|  Returned the first page locked. | |  |\n|                                  | |  |\n|- grab_extent_buffer()            | |  |\n|  |- atomic_inc_not_zero()        | |  |\n|  |  Returned false               | |  |\n|  |- folio_detach_private()       | |  |- folio_detach_private() for X\n|     |- folio_test_private()      | |     |- folio_test_private()\n      |  Returned true             | |     |  Returned true\n      |- folio_put()               |       |- folio_put()\n\nNow there are two puts on the same folio at folio X, leading to refcount\nunderflow of the folio X, and eventually causing the BUG_ON() on the\npage-\u003emapping.\n\nThe condition is not that easy to hit:\n\n- The release must be triggered for the middle page of an eb\n  If the release is on the same first page of an eb, page lock would kick\n  in and prevent the race.\n\n- folio_detach_private() has a very small race window\n  It\u0027s only between folio_test_private() and folio_clear_private().\n\nThat\u0027s exactly when mapping-\u003ei_private_lock is used to prevent such race,\nand commit 09e6cef19c9f (\"btrfs: refactor alloc_extent_buffer() to\nallocate-then-attach method\") screwed that up.\n\nAt that time, I thought the page lock would kick in as\nfilemap_release_folio() also requires the page to be locked, but forgot\nthe filemap_release_folio() only locks one page, not all pages of an\nextent buffer.\n\n[FIX]\nMove all the code requiring i_private_lock into\nattach_eb_folio_to_filemap(), so that everything is done with proper\nlock protection.\n\nFurthermore to prevent future problems, add an extra\nlockdep_assert_locked() to ensure we\u0027re holding the proper lock.\n\nTo reproducer that is able to hit the race (takes a few minutes with\ninstrumented code inserting delays to alloc_extent_buffer()):\n\n  #!/bin/sh\n  drop_caches () {\n\t  while(true); do\n\t\t  echo 3 \u003e 
/proc/sys/vm/drop_caches\n\t\t  echo 1 \u003e /proc/sys/vm/compact_memory\n\t  done\n  }\n\n  run_tar () {\n\t  while(true); do\n\t\t  for x in `seq 1 80` ; do\n\t\t\t  tar cf /dev/zero /mnt \u003e /dev/null \u0026\n\t\t  done\n\t\t  wait\n\t  done\n  }\n\n  mkfs.btrfs -f -d single -m single\n---truncated---"
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2024-12-19T09:03:53.382Z",
        "orgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
        "shortName": "Linux"
      },
      "references": [
        {
          "url": "https://git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b"
        },
        {
          "url": "https://git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309"
        }
      ],
      "title": "btrfs: protect folio::private when attaching extent buffer folios",
      "x_generator": {
        "engine": "bippy-5f407fcff5a0"
      }
    }
  },
  "cveMetadata": {
    "assignerOrgId": "416baaa9-dc9f-4396-8d5f-8c081fb06d67",
    "assignerShortName": "Linux",
    "cveId": "CVE-2024-38306",
    "datePublished": "2024-06-25T14:22:36.903Z",
    "dateReserved": "2024-06-24T13:53:25.575Z",
    "dateUpdated": "2024-12-19T09:03:53.382Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "meta": {
    "nvd": "{\"cve\":{\"id\":\"CVE-2024-38306\",\"sourceIdentifier\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\",\"published\":\"2024-06-25T15:15:13.367\",\"lastModified\":\"2024-11-21T09:25:20.867\",\"vulnStatus\":\"Awaiting Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"In the Linux kernel, the following vulnerability has been resolved:\\n\\nbtrfs: protect folio::private when attaching extent buffer folios\\n\\n[BUG]\\nSince v6.8 there are rare kernel crashes reported by various people,\\nthe common factor is bad page status error messages like this:\\n\\n  BUG: Bad page state in process kswapd0  pfn:d6e840\\n  page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c\\n  pfn:0xd6e840\\n  aops:btree_aops ino:1\\n  flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)\\n  page_type: 0xffffffff()\\n  raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0\\n  raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000\\n  page dumped because: non-NULL mapping\\n\\n[CAUSE]\\nCommit 09e6cef19c9f (\\\"btrfs: refactor alloc_extent_buffer() to\\nallocate-then-attach method\\\") changes the sequence when allocating a new\\nextent buffer.\\n\\nPreviously we always called grab_extent_buffer() under\\nmapping-\u003ei_private_lock, to ensure the safety on modification on\\nfolio::private (which is a pointer to extent buffer for regular\\nsectorsize).\\n\\nThis can lead to the following race:\\n\\nThread A is trying to allocate an extent buffer at bytenr X, with 4\\n4K pages, meanwhile thread B is trying to release the page at X + 4K\\n(the second page of the extent buffer at X).\\n\\n           Thread A                |                 Thread B\\n-----------------------------------+-------------------------------------\\n                                   | btree_release_folio()\\n\\t\\t\\t\\t   | | This is for the page at X + 4K,\\n\\t\\t\\t\\t   | | Not page X.\\n\\t\\t\\t\\t   | 
|\\nalloc_extent_buffer()              | |- release_extent_buffer()\\n|- filemap_add_folio() for the     | |  |- atomic_dec_and_test(eb-\u003erefs)\\n|  page at bytenr X (the first     | |  |\\n|  page).                          | |  |\\n|  Which returned -EEXIST.         | |  |\\n|                                  | |  |\\n|- filemap_lock_folio()            | |  |\\n|  Returned the first page locked. | |  |\\n|                                  | |  |\\n|- grab_extent_buffer()            | |  |\\n|  |- atomic_inc_not_zero()        | |  |\\n|  |  Returned false               | |  |\\n|  |- folio_detach_private()       | |  |- folio_detach_private() for X\\n|     |- folio_test_private()      | |     |- folio_test_private()\\n      |  Returned true             | |     |  Returned true\\n      |- folio_put()               |       |- folio_put()\\n\\nNow there are two puts on the same folio at folio X, leading to refcount\\nunderflow of the folio X, and eventually causing the BUG_ON() on the\\npage-\u003emapping.\\n\\nThe condition is not that easy to hit:\\n\\n- The release must be triggered for the middle page of an eb\\n  If the release is on the same first page of an eb, page lock would kick\\n  in and prevent the race.\\n\\n- folio_detach_private() has a very small race window\\n  It\u0027s only between folio_test_private() and folio_clear_private().\\n\\nThat\u0027s exactly when mapping-\u003ei_private_lock is used to prevent such race,\\nand commit 09e6cef19c9f (\\\"btrfs: refactor alloc_extent_buffer() to\\nallocate-then-attach method\\\") screwed that up.\\n\\nAt that time, I thought the page lock would kick in as\\nfilemap_release_folio() also requires the page to be locked, but forgot\\nthe filemap_release_folio() only locks one page, not all pages of an\\nextent buffer.\\n\\n[FIX]\\nMove all the code requiring i_private_lock into\\nattach_eb_folio_to_filemap(), so that everything is done with proper\\nlock protection.\\n\\nFurthermore to prevent future 
problems, add an extra\\nlockdep_assert_locked() to ensure we\u0027re holding the proper lock.\\n\\nTo reproducer that is able to hit the race (takes a few minutes with\\ninstrumented code inserting delays to alloc_extent_buffer()):\\n\\n  #!/bin/sh\\n  drop_caches () {\\n\\t  while(true); do\\n\\t\\t  echo 3 \u003e /proc/sys/vm/drop_caches\\n\\t\\t  echo 1 \u003e /proc/sys/vm/compact_memory\\n\\t  done\\n  }\\n\\n  run_tar () {\\n\\t  while(true); do\\n\\t\\t  for x in `seq 1 80` ; do\\n\\t\\t\\t  tar cf /dev/zero /mnt \u003e /dev/null \u0026\\n\\t\\t  done\\n\\t\\t  wait\\n\\t  done\\n  }\\n\\n  mkfs.btrfs -f -d single -m single\\n---truncated---\"},{\"lang\":\"es\",\"value\":\"En el kernel de Linux, se ha resuelto la siguiente vulnerabilidad: btrfs: proteger folio::privado al adjuntar folios de b\u00fafer de extensi\u00f3n [ERROR] Desde la versi\u00f3n 6.8, varias personas reportan fallas raras del kernel, el factor com\u00fan son mensajes de error de estado incorrecto de la p\u00e1gina as\u00ed: ERROR: Estado incorrecto de la p\u00e1gina en el proceso kswapd0 pfn:d6e840 p\u00e1gina: refcount:0 mapcount:0 mapeo:000000007512f4f2 index:0x2796c2c7c pfn:0xd6e840 aops:btree_aops ino:1 flags: 0x17ffffe0000008(uptodate|node=0|zone= 2 |lastcpupid=0x3fffff) tipo de p\u00e1gina: 0xffffffff() raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0 raw: 00000002796c2c7c 0000000000000000 0000 0000ffffffff 0000000000000000 p\u00e1gina volcada porque: mapeo no NULL [CAUSA] Commit 09e6cef19c9f (\\\"btrfs: refactor alloc_extent_buffer() para asignar el m\u00e9todo luego adjuntar \\\") cambia la secuencia al asignar un nuevo b\u00fafer de extensi\u00f3n. Anteriormente siempre llam\u00e1bamos a grab_extent_buffer() en mapeo-\u0026gt;i_private_lock, para garantizar la seguridad en la modificaci\u00f3n en folio::private (que es un puntero al b\u00fafer de extensi\u00f3n para el tama\u00f1o de sector normal). 
Esto puede llevar a la siguiente ejecuci\u00f3n: el subproceso A est\u00e1 intentando asignar un b\u00fafer de extensi\u00f3n en el bytenr X, con 4 p\u00e1ginas de 4K, mientras que el subproceso B est\u00e1 intentando liberar la p\u00e1gina en X + 4K (la segunda p\u00e1gina del b\u00fafer de extensi\u00f3n en X) . Hilo A | Hilo B -----------------------------------+------------ ------------------------- | btree_release_folio() | | Esto es para la p\u00e1gina en X + 4K, | | No la p\u00e1gina X. | | alloc_extent_buffer() | |- release_extent_buffer() |- filemap_add_folio() para el | | |- atomic_dec_and_test(eb-\u0026gt;refs) | p\u00e1gina en bytenr X (la primera | | | | p\u00e1gina). | | | | Que devolvi\u00f3 -EEXIST. | | | | | | | |- filemap_lock_folio() | | | | Devolvi\u00f3 la primera p\u00e1gina bloqueada. | | | | | | | |- grab_extent_buffer() | | | | |- atomic_inc_not_zero() | | | | | Devuelto falso | | | | |- folio_detach_private() | | |- folio_detach_private() para X | |- folio_test_private() | | |- folio_test_private() | Devuelto verdadero | | | Devuelto verdadero |- folio_put() | |- folio_put() Ahora hay dos opciones de venta en el mismo folio en el folio X, lo que provoca un recuento insuficiente del folio X y, finalmente, provoca el error BUG_ON() en la p\u00e1gina-\u0026gt;mapeo. La condici\u00f3n no es tan f\u00e1cil de cumplir: - La publicaci\u00f3n debe activarse para la p\u00e1gina intermedia de un eb. Si la publicaci\u00f3n est\u00e1 en la misma primera p\u00e1gina de un eb, el bloqueo de p\u00e1gina se activar\u00eda e impedir\u00eda la ejecuci\u00f3n. - folio_detach_private() tiene una ventana de ejecuci\u00f3n muy peque\u00f1a. Es solo entre folio_test_private() y folio_clear_private(). Eso es exactamente cuando se usa mapeo-\u0026gt;i_private_lock para evitar dicha ejecuci\u00f3n, y la confirmaci\u00f3n 09e6cef19c9f (\\\"btrfs: refactor alloc_extent_buffer() para asignar-luego-adjuntar m\u00e9todo\\\") arruin\u00f3 eso. 
En ese momento, pens\u00e9 que el bloqueo de p\u00e1gina se activar\u00eda ya que filemap_release_folio() tambi\u00e9n requiere que la p\u00e1gina est\u00e9 bloqueada, pero olvid\u00e9 que filemap_release_folio() solo bloquea una p\u00e1gina, no todas las p\u00e1ginas de un b\u00fafer de extensi\u00f3n. [FIX] Mueva todo el c\u00f3digo que requiere i_private_lock a adjunto_eb_folio_to_filemap(), para que todo se haga con la protecci\u00f3n de bloqueo adecuada. Adem\u00e1s, para evitar problemas futuros, agregue un lockdep_assert_locked() adicional para garantizar que mantenemos el bloqueo adecuado. Para el reproductor que puede iniciar la ejecuci\u00f3n (tarda unos minutos con el c\u00f3digo instrumentado insertando retrasos en alloc_extent_buffer()): #!/bin/sh drop_caches () { while(true); hacer echo 3 \u0026gt; /proc/sys/vm/drop_caches echo 1 \u0026gt; /proc/sys/vm/compact_memory hecho } run_tar () { while(true); hacer para x en `seq 1 80`; hacer tar cf /dev/zero /mnt \u0026gt; /dev/null \u0026amp; hecho esperar hecho } mkfs.btrfs -f -d single -m single ---truncado---\"}],\"metrics\":{},\"references\":[{\"url\":\"https://git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309\",\"source\":\"416baaa9-dc9f-4396-8d5f-8c081fb06d67\"},{\"url\":\"https://git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b\",\"source\":\"af854a3a-2127-422b-91ae-364da2661108\"},{\"url\":\"https://git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309\",\"source\":\"af854a3a-2127-422b-91ae-364da2661108\"}]}}"
  }
}
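The release-based ranges in the "versions" array of the record above can be read as a simple decision procedure. The following Python sketch is a hand translation of those four entries, not a general CVE JSON version matcher. The helper names `parse` and `is_affected` are hypothetical, and only plain dotted numeric versions are handled (a string such as an -rc tag would raise an error).

```python
def parse(version):
    """Turn '6.9.5' into (6, 9, 5); handles plain dotted numbers only."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version):
    """Apply the record's release ranges to one kernel version string."""
    v = parse(version)
    if v < (6, 8):
        # "version": "0", "lessThan": "6.8" -> unaffected
        return False
    if v[:2] == (6, 9) and v >= (6, 9, 5):
        # "version": "6.9.5", "lessThanOrEqual": "6.9.*" -> unaffected
        return False
    if v >= (6, 10):
        # "version": "6.10", "lessThanOrEqual": "*" -> unaffected
        return False
    # Everything else falls under "defaultStatus": "affected"
    return True


for candidate in ("6.7.9", "6.8", "6.9.4", "6.9.5", "6.10"):
    print(candidate, is_affected(candidate))
```

By this reading, 6.8 and 6.9.4 are reported as affected, while 6.7.9, 6.9.5 and 6.10 are not. The git-based ranges in the first "affected" entry (commit hashes) are not covered by this sketch.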

