ghsa-3j29-8r33-hfrc
Vulnerability from github
Published
2025-02-27 21:32
Modified
2025-02-27 21:32
Details

In the Linux kernel, the following vulnerability has been resolved:

skbuff: fix coalescing for page_pool fragment recycling

Fix a use-after-free when using page_pool with page fragments. We encountered this problem during normal RX in the hns3 driver:

(1) Initially we have three descriptors in the RX queue. The first one allocates PAGE1 through page_pool, and the other two allocate one half of PAGE2 each. Page references look like this:

            RX_BD1 _______ PAGE1
            RX_BD2 _______ PAGE2
            RX_BD3 _________/

(2) Handle RX on the first descriptor. Allocate SKB1, eventually added to the receive queue by tcp_queue_rcv().

(3) Handle RX on the second descriptor. Allocate SKB2 and pass it to netif_receive_skb():

netif_receive_skb(SKB2)
  ip_rcv(SKB2)
    SKB3 = skb_clone(SKB2)

SKB2 and SKB3 share a reference to PAGE2 through
skb_shinfo()->dataref. The other ref to PAGE2 is still held by
RX_BD3:

                  SKB2 ---+- PAGE2
                  SKB3 __/   /
            RX_BD3 _________/

(3b) Now while handling TCP, coalesce SKB3 with SKB1:

  tcp_v4_rcv(SKB3)
    tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds
    kfree_skb_partial(SKB3)
      skb_release_data(SKB3)                // drops one dataref

                  SKB1 _____ PAGE1
                       \____
                  SKB2 _____ PAGE2
                             /
            RX_BD3 _________/

In skb_try_coalesce(), __skb_frag_ref() takes a page reference to
PAGE2, where it should instead have increased the page_pool frag
reference, pp_frag_count. Without coalescing, when releasing both
SKB2 and SKB3, a single reference to PAGE2 would be dropped. Now
when releasing SKB1 and SKB2, two references to PAGE2 will be
dropped, resulting in underflow.

(3c) Drop SKB2:

  af_packet_rcv(SKB2)
    consume_skb(SKB2)
      skb_release_data(SKB2)                // drops second dataref
        page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count

                  SKB1 _____ PAGE1
                       \____
                             PAGE2
                             /
            RX_BD3 _________/

(4) Userspace calls recvmsg() Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we release the SKB3 page as well:

tcp_eat_recv_skb(SKB1)
  skb_release_data(SKB1)
    page_pool_return_skb_page(PAGE1)
    page_pool_return_skb_page(PAGE2)        // drops second pp_frag_count

(5) PAGE2 is freed, but the third RX descriptor was still using it! In our case this causes IOMMU faults, but it would silently corrupt memory if the IOMMU was disabled.

Change the logic that checks whether pp_recycle SKBs can be coalesced. We still reject differing pp_recycle between 'from' and 'to' SKBs, but in order to avoid the situation described above, we also reject coalescing when both 'from' and 'to' are pp_recycled and 'from' is cloned.

The new logic allows coalescing a cloned pp_recycle SKB into a page refcounted one, because in this case the release (4) will drop the right reference, the one taken by skb_try_coalesce().

Show details on source website


{
  "affected": [],
  "aliases": [
    "CVE-2022-49093"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-416"
    ],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2025-02-26T07:00:46Z",
    "severity": "HIGH"
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nskbuff: fix coalescing for page_pool fragment recycling\n\nFix a use-after-free when using page_pool with page fragments. We\nencountered this problem during normal RX in the hns3 driver:\n\n(1) Initially we have three descriptors in the RX queue. The first one\n    allocates PAGE1 through page_pool, and the other two allocate one\n    half of PAGE2 each. Page references look like this:\n\n                RX_BD1 _______ PAGE1\n                RX_BD2 _______ PAGE2\n                RX_BD3 _________/\n\n(2) Handle RX on the first descriptor. Allocate SKB1, eventually added\n    to the receive queue by tcp_queue_rcv().\n\n(3) Handle RX on the second descriptor. Allocate SKB2 and pass it to\n    netif_receive_skb():\n\n    netif_receive_skb(SKB2)\n      ip_rcv(SKB2)\n        SKB3 = skb_clone(SKB2)\n\n    SKB2 and SKB3 share a reference to PAGE2 through\n    skb_shinfo()-\u003edataref. The other ref to PAGE2 is still held by\n    RX_BD3:\n\n                      SKB2 ---+- PAGE2\n                      SKB3 __/   /\n                RX_BD3 _________/\n\n (3b) Now while handling TCP, coalesce SKB3 with SKB1:\n\n      tcp_v4_rcv(SKB3)\n        tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds\n        kfree_skb_partial(SKB3)\n          skb_release_data(SKB3)                // drops one dataref\n\n                      SKB1 _____ PAGE1\n                           \\____\n                      SKB2 _____ PAGE2\n                                 /\n                RX_BD3 _________/\n\n    In skb_try_coalesce(), __skb_frag_ref() takes a page reference to\n    PAGE2, where it should instead have increased the page_pool frag\n    reference, pp_frag_count. Without coalescing, when releasing both\n    SKB2 and SKB3, a single reference to PAGE2 would be dropped. Now\n    when releasing SKB1 and SKB2, two references to PAGE2 will be\n    dropped, resulting in underflow.\n\n (3c) Drop SKB2:\n\n      af_packet_rcv(SKB2)\n        consume_skb(SKB2)\n          skb_release_data(SKB2)                // drops second dataref\n            page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count\n\n                      SKB1 _____ PAGE1\n                           \\____\n                                 PAGE2\n                                 /\n                RX_BD3 _________/\n\n(4) Userspace calls recvmsg()\n    Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we\n    release the SKB3 page as well:\n\n    tcp_eat_recv_skb(SKB1)\n      skb_release_data(SKB1)\n        page_pool_return_skb_page(PAGE1)\n        page_pool_return_skb_page(PAGE2)        // drops second pp_frag_count\n\n(5) PAGE2 is freed, but the third RX descriptor was still using it!\n    In our case this causes IOMMU faults, but it would silently corrupt\n    memory if the IOMMU was disabled.\n\nChange the logic that checks whether pp_recycle SKBs can be coalesced.\nWe still reject differing pp_recycle between \u0027from\u0027 and \u0027to\u0027 SKBs, but\nin order to avoid the situation described above, we also reject\ncoalescing when both \u0027from\u0027 and \u0027to\u0027 are pp_recycled and \u0027from\u0027 is\ncloned.\n\nThe new logic allows coalescing a cloned pp_recycle SKB into a page\nrefcounted one, because in this case the release (4) will drop the right\nreference, the one taken by skb_try_coalesce().",
  "id": "GHSA-3j29-8r33-hfrc",
  "modified": "2025-02-27T21:32:11Z",
  "published": "2025-02-27T21:32:11Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2022-49093"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/1effe8ca4e34c34cdd9318436a4232dcb582ebf4"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/72bb856d16e883437023ff2ff77d0c498018728a"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/ba965e8605aee5387cecaa28fcf7ee9f61779a49"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/c4fa19615806a9a7e518c295b39175aa47a685ac"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ]
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading…

Loading…

Loading…

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.


Loading…