ghsa-r696-8qvr-qpj7
Vulnerability from github
Published
2025-12-16 15:30
Modified
2025-12-16 15:30
Details

In the Linux kernel, the following vulnerability has been resolved:

amd/amdkfd: enhance kfd process check in switch partition

current switch partition only check if kfd_processes_table is empty. kfd_prcesses_table entry is deleted in kfd_process_notifier_release, but kfd_process tear down is in kfd_process_wq_release.

consider two processes:

Process A (workqueue) -> kfd_process_wq_release -> Access kfd_node member Process B switch partition -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw -> kfd_node tear down.

Process A and B may trigger a race as shown in dmesg log.

This patch is to resolve the race by adding an atomic kfd_process counter kfd_processes_count, it increment as create kfd process, decrement as finish kfd_process_wq_release.

v2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds and bug fix. (Philip Yang)

[3966658.307702] divide error: 0000 [#1] SMP NOPTI [3966658.350818] i10nm_edac [3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted [3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu] [3966658.362839] nfit [3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu] [3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00 [3966658.380967] x86_pkg_temp_thermal [3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246 [3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000 [3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00 [3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4 [3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000 [3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800 [3966658.391533] FS: 0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000 [3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0 [3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [3966658.391536] PKRU: 55555554 [3966658.391536] Call Trace: [3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu] [3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu] [3966658.399754] intel_powerclamp [3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu] [3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu] [3966658.410516] coretemp [3966658.434016] process_one_work+0x1ad/0x380 [3966658.434021] worker_thread+0x49/0x310 [3966658.438963] kvm_intel [3966658.446041] ? process_one_work+0x380/0x380 [3966658.446045] kthread+0x118/0x140 [3966658.446047] ? __kthread_bind_mask+0x60/0x60 [3966658.446050] ret_from_fork+0x1f/0x30 [3966658.446053] Modules linked in: kpatch_20765354(OEK) [3966658.455310] kvm [3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK) [3966658.473462] idxd_mdev [3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqo ---truncated---

Show details on source website


{
  "affected": [],
  "aliases": [
    "CVE-2025-68174"
  ],
  "database_specific": {
    "cwe_ids": [],
    "github_reviewed": false,
    "github_reviewed_at": null,
    "nvd_published_at": "2025-12-16T14:15:49Z",
    "severity": null
  },
  "details": "In the Linux kernel, the following vulnerability has been resolved:\n\namd/amdkfd: enhance kfd process check in switch partition\n\ncurrent switch partition only check if kfd_processes_table is empty.\nkfd_prcesses_table entry is deleted in kfd_process_notifier_release, but\nkfd_process tear down is in kfd_process_wq_release.\n\nconsider two processes:\n\nProcess A (workqueue) -\u003e kfd_process_wq_release -\u003e Access kfd_node member\nProcess B switch partition -\u003e amdgpu_xcp_pre_partition_switch -\u003e amdgpu_amdkfd_device_fini_sw\n-\u003e kfd_node tear down.\n\nProcess A and B may trigger a race as shown in dmesg log.\n\nThis patch is to resolve the race by adding an atomic kfd_process counter\nkfd_processes_count, it increment as create kfd process, decrement as\nfinish kfd_process_wq_release.\n\nv2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds\nand bug fix. (Philip Yang)\n\n[3966658.307702] divide error: 0000 [#1] SMP NOPTI\n[3966658.350818]  i10nm_edac\n[3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted\n[3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]\n[3966658.362839]  nfit\n[3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu]\n[3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 \u003cf7\u003e be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00\n[3966658.380967]  x86_pkg_temp_thermal\n[3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246\n[3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000\n[3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00\n[3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4\n[3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000\n[3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800\n[3966658.391533] FS:  0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000\n[3966658.391534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033\n[3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0\n[3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000\n[3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400\n[3966658.391536] PKRU: 55555554\n[3966658.391536] Call Trace:\n[3966658.391674]  deallocate_sdma_queue+0x38/0xa0 [amdgpu]\n[3966658.391762]  process_termination_cpsch+0x1ed/0x480 [amdgpu]\n[3966658.399754]  intel_powerclamp\n[3966658.402831]  kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu]\n[3966658.402908]  kfd_process_wq_release+0x1a/0x1a0 [amdgpu]\n[3966658.410516]  coretemp\n[3966658.434016]  process_one_work+0x1ad/0x380\n[3966658.434021]  worker_thread+0x49/0x310\n[3966658.438963]  kvm_intel\n[3966658.446041]  ? process_one_work+0x380/0x380\n[3966658.446045]  kthread+0x118/0x140\n[3966658.446047]  ? __kthread_bind_mask+0x60/0x60\n[3966658.446050]  ret_from_fork+0x1f/0x30\n[3966658.446053] Modules linked in: kpatch_20765354(OEK)\n[3966658.455310]  kvm\n[3966658.464534]  mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK)\n[3966658.473462]  idxd_mdev\n[3966658.482306]  kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqo\n---truncated---",
  "id": "GHSA-r696-8qvr-qpj7",
  "modified": "2025-12-16T15:30:44Z",
  "published": "2025-12-16T15:30:44Z",
  "references": [
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-68174"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/45da20e00d5da842e17dfc633072b127504f0d0e"
    },
    {
      "type": "WEB",
      "url": "https://git.kernel.org/stable/c/536d80f660ec12058e461f4db387ea42bee9250d"
    }
  ],
  "schema_version": "1.4.0",
  "severity": []
}


Log in or create an account to share your comment.




Tags
Taxonomy of the tags.


Loading…

Loading…

Loading…

Sightings

Author Source Type Date

Nomenclature

  • Seen: The vulnerability was mentioned, discussed, or seen somewhere by the user.
  • Confirmed: The vulnerability is confirmed from an analyst perspective.
  • Published Proof of Concept: A public proof of concept is available for this vulnerability.
  • Exploited: This vulnerability was exploited and seen by the user reporting the sighting.
  • Patched: This vulnerability was successfully patched by the user reporting the sighting.
  • Not exploited: This vulnerability was not exploited or seen by the user reporting the sighting.
  • Not confirmed: The user expresses doubt about the veracity of the vulnerability.
  • Not patched: This vulnerability was not successfully patched by the user reporting the sighting.


Loading…

Loading…