ghsa-cgvj-gwgx-p5f7
Vulnerability from github
In the Linux kernel, the following vulnerability has been resolved:
mm: vmscan: remove deadlock due to throttling failing to make progress
A soft lockup bug in kcompactd was reported in a private bugzilla with the following visible in dmesg;
watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479]
The machine had 256G of RAM with no swap and an earlier failed allocation indicated that node 0 where kcompactd was run was potentially unreclaimable;
Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes
Vlastimil Babka investigated a crash dump and found that a task migrating pages was trying to drain PCP lists;
PID: 52922 TASK: ffff969f820e5000 CPU: 19 COMMAND: "kworker/u128:3" Call Trace: __schedule schedule schedule_timeout wait_for_completion __flush_work __drain_all_pages __alloc_pages_slowpath.constprop.114 __alloc_pages alloc_migration_target migrate_pages migrate_to_node do_migrate_pages cpuset_migrate_mm_workfn process_one_work worker_thread kthread ret_from_fork
This failure is specific to CONFIG_PREEMPT=n builds. The root of the problem is that kcompact0 is not rescheduling on a CPU while a task that has isolated a large number of the pages from the LRU is waiting on kcompact0 to reschedule so the pages can be released. While shrink_inactive_list() only loops once around too_many_isolated, reclaim can continue without rescheduling if sc->skipped_deactivate == 1 which could happen if there was no file LRU and the inactive anon list was not low.
{ "affected": [], "aliases": [ "CVE-2022-48800" ], "database_specific": { "cwe_ids": [ "CWE-667" ], "github_reviewed": false, "github_reviewed_at": null, "nvd_published_at": "2024-07-16T12:15:04Z", "severity": "MODERATE" }, "details": "In the Linux kernel, the following vulnerability has been resolved:\n\nmm: vmscan: remove deadlock due to throttling failing to make progress\n\nA soft lockup bug in kcompactd was reported in a private bugzilla with\nthe following visible in dmesg;\n\n watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479]\n watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479]\n watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479]\n watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479]\n\nThe machine had 256G of RAM with no swap and an earlier failed\nallocation indicated that node 0 where kcompactd was run was potentially\nunreclaimable;\n\n Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB\n inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB\n mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp:\n 0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB\n kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes\n\nVlastimil Babka investigated a crash dump and found that a task\nmigrating pages was trying to drain PCP lists;\n\n PID: 52922 TASK: ffff969f820e5000 CPU: 19 COMMAND: \"kworker/u128:3\"\n Call Trace:\n __schedule\n schedule\n schedule_timeout\n wait_for_completion\n __flush_work\n __drain_all_pages\n __alloc_pages_slowpath.constprop.114\n __alloc_pages\n alloc_migration_target\n migrate_pages\n migrate_to_node\n do_migrate_pages\n cpuset_migrate_mm_workfn\n process_one_work\n worker_thread\n kthread\n ret_from_fork\n\nThis failure is specific to CONFIG_PREEMPT=n builds. The root of the\nproblem is that kcompact0 is not rescheduling on a CPU while a task that\nhas isolated a large number of the pages from the LRU is waiting on\nkcompact0 to reschedule so the pages can be released. While\nshrink_inactive_list() only loops once around too_many_isolated, reclaim\ncan continue without rescheduling if sc-\u003eskipped_deactivate == 1 which\ncould happen if there was no file LRU and the inactive anon list was not\nlow.", "id": "GHSA-cgvj-gwgx-p5f7", "modified": "2024-08-21T18:31:27Z", "published": "2024-07-16T12:30:40Z", "references": [ { "type": "ADVISORY", "url": "https://nvd.nist.gov/vuln/detail/CVE-2022-48800" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/3980cff6349687f73d5109f156f23cb261c24164" }, { "type": "WEB", "url": "https://git.kernel.org/stable/c/b485c6f1f9f54b81443efda5f3d8a5036ba2cd91" } ], "schema_version": "1.4.0", "severity": [ { "score": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "type": "CVSS_V3" } ] }
Sightings
Author | Source | Type | Date |
---|
Nomenclature
- Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
- Confirmed: The vulnerability is confirmed from an analyst perspective.
- Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
- Patched: This vulnerability was successfully patched by the user reporting the sighting.
- Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
- Not confirmed: The user expresses doubt about the veracity of the vulnerability.
- Not patched: This vulnerability was not successfully patched by the user reporting the sighting.