Summary: A deadlock in the MCS queue of the powerpc/qspinlock code in the Linux kernel has been identified and fixed. This post explains the problem and the fix, along with the command used to reproduce it, the resulting lockup trace, and references.

Introduction: A vulnerability has been resolved in the Linux kernel, specifically in the powerpc/qspinlock code. It is a deadlock in the MCS queue that can cause a system to lock up under heavy lock contention. This post covers the details of the vulnerability, including how it was triggered, the kernel log it produces, and how it was fixed.

Background: powerpc/qspinlock is the part of the Linux kernel that implements queued spinlocks on PowerPC-based systems. The problem was observed while running stress-ng, a stress testing tool, on a 16-core (16EC/16VP) shared LPAR. During the stress test, the system occasionally locked up.

Problem Description: The deadlock occurs when an interrupt arrives in queued_spin_lock_slowpath() after qnodesp->count has been incremented but before node->lock has been initialized. At that point the newly claimed qnode still holds the lock value left over from an earlier use. Another CPU that queues behind this one may then see the stale value in get_tail_qnode(), and if the stale value happens to match the lock that CPU is acquiring, it writes to the "next" pointer of the wrong qnode. This causes a deadlock: the interrupted CPU, once it reaches the head of the MCS queue, spins indefinitely waiting for its "next" pointer to be set by its successor in the queue.
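The sequence above can be replayed in a small single-threaded model. This is a sketch in plain Python, not kernel code: the QNode and QNodes classes, MAX_NODES = 4, the lock names "A" and "B", and the helper simulate_stale_match() are assumptions made for illustration, and the lookup only mirrors the behaviour described for get_tail_qnode() (return the first node whose lock field matches).

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_NODES = 4  # assumed number of per-CPU queue nodes, for illustration only


@dataclass
class QNode:
    lock: Optional[str] = None      # lock this node is queued on
    next: Optional["QNode"] = None  # successor in the MCS queue


@dataclass
class QNodes:
    count: int = 0  # number of nodes currently claimed on this CPU
    nodes: List[QNode] = field(
        default_factory=lambda: [QNode() for _ in range(MAX_NODES)]
    )


def get_tail_qnode(qnodes: QNodes, lock: str) -> QNode:
    """Return the first node on the previous-tail CPU whose lock matches."""
    for node in qnodes.nodes:
        if node.lock == lock:
            return node
    raise RuntimeError("no matching qnode")


def simulate_stale_match() -> bool:
    cpu0 = QNodes()

    # CPU0 previously queued on lock A with nodes[0] and released the slot,
    # but the old lock value was left behind in the node.
    cpu0.nodes[0].lock = "A"

    # CPU0 starts queueing on lock B: the slot is claimed first ...
    idx_b = cpu0.count
    cpu0.count += 1
    # ... and an interrupt arrives before nodes[idx_b].lock = "B" is stored.

    # The interrupt handler on CPU0 queues on lock A using the next slot.
    idx_a = cpu0.count
    cpu0.count += 1
    cpu0.nodes[idx_a].next = None
    cpu0.nodes[idx_a].lock = "A"

    # CPU1 queues behind CPU0 on lock A and looks up CPU0's node.
    cpu1_node = QNode(lock="A")
    prev = get_tail_qnode(cpu0, "A")
    prev.next = cpu1_node

    # True if CPU1 linked itself to the stale node and the node that CPU0
    # is really waiting on never got a successor.
    return prev is cpu0.nodes[idx_b] and cpu0.nodes[idx_a].next is None
```

In this model simulate_stale_match() reports that the successor was attached to nodes[0] (the slot claimed for lock B) rather than nodes[1] (the slot actually queued on lock A), which is the mislinked "next" pointer described above.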

Exploit Details: To trigger this vulnerability, the stress-ng tool was run with the following command:

$ stress-ng --all 128 --vm-bytes 80% --aggressive \
           --maximize --oomable --verify  --syslog \
           --metrics  --times  --timeout 5m

The kernel log then shows a hard lockup similar to the following:

watchdog: CPU 15 Hard LOCKUP
NIP [c0000000000b78f4] queued_spin_lock_slowpath+0x1184/0x1490
LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90
Call Trace:
0xc000002cfffa3bf0 (unreliable)
_raw_spin_lock+0x6c/0x90
raw_spin_rq_lock_nested.part.135+0x4c/0xd0
sched_ttwu_pending+0x60/0x1f0
__flush_smp_call_function_queue+0x1dc/0x670
smp_ipi_demux_relaxed+0xa4/0x100
xive_muxed_ipi_action+0x20/0x40
__handle_irq_event_percpu+0x80/0x240
handle_irq_event_percpu+0x2c/0x80
handle_percpu_irq+0x84/0xd0
generic_handle_irq+0x54/0x80
__do_irq+0xac/0x210
__do_IRQ+0x74/0xd0
0x0
do_IRQ+0x8c/0x170
hardware_interrupt_common_virt+0x29c/0x2a0
--- interrupt: 500 at queued_spin_lock_slowpath+0x4b8/0x1490
NIP [c0000000000b6c28] queued_spin_lock_slowpath+0x4b8/0x1490
LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90
--- interrupt: 500
0xc0000029c1a41d00 (unreliable)
_raw_spin_lock+0x6c/0x90
futex_wake+0x100/0x260
do_futex+0x21c/0x2a0
sys_futex+0x98/0x270
system_call_exception+0x14c/0x2f0
system_call_vectored_common+0x15c/0x2ec

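The trace shows the same CPU inside queued_spin_lock_slowpath() twice: once from the futex system call path, and again from the interrupt that arrived while it was there. A step-by-step code flow of the deadlock is given in the original patch posting on the Linux kernel mailing list.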

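Solution: The Linux kernel developers resolved this issue by clearing node->lock (setting it to NULL) before the node is released, that is, before qnodesp->count is decremented, with a compiler barrier (barrier()) between the two stores so that nothing is written to the node after it has been released. A released node therefore never holds a stale lock value, and get_tail_qnode() on another CPU can no longer match it by accident.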
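As a rough illustration of why a leftover lock value matters, the toy model below replays the sequence from the problem description twice: once leaving the old lock value in a released slot, and once clearing it on release. It is plain Python, not the kernel patch: the list lock_of, the clear_on_release flag, MAX_NODES = 4 and the lock names are assumptions made for the sketch, and it does not model barriers or real concurrency.

```python
from typing import List, Optional

MAX_NODES = 4  # assumed number of per-CPU queue nodes, for illustration only


def matched_slot(clear_on_release: bool) -> int:
    """Return the slot index that a remote lookup for lock "A" finds on CPU0."""
    lock_of: List[Optional[str]] = [None] * MAX_NODES  # models nodes[i].lock
    count = 0                                          # models qnodesp->count

    # CPU0 takes and releases lock A through the slow path.
    idx = count
    count += 1
    lock_of[idx] = "A"
    if clear_on_release:
        lock_of[idx] = None  # clear the lock value before releasing the slot
    count -= 1

    # CPU0 starts queueing on lock B: the slot is claimed, but an interrupt
    # arrives before the lock value for B is stored into it.
    count += 1

    # The interrupt handler queues on lock A using the next free slot.
    idx = count
    count += 1
    lock_of[idx] = "A"

    # Another CPU looks for CPU0's node for lock A: first match wins.
    return lock_of.index("A")
```

With clear_on_release=False the lookup lands on slot 0, the half-initialized node, which reproduces the mislinked queue. With clear_on_release=True slot 0 holds no lock value, so the lookup finds slot 1, the node that is really queued on lock A.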

Conclusion: The deadlock in the Linux kernel powerpc/qspinlock code has been resolved. Users running Linux on PowerPC systems should make sure they are running a kernel that includes the fix.

Timeline

Published on: 09/18/2024 08:15:06 UTC
Last modified on: 09/20/2024 18:18:18 UTC