iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected

For endpoint devices connected to the system through hotplug-capable ports, a user can request a device reset by flapping the device's link via the slot's Link Control register. In response to the resulting DLLSC (Data Link Layer State Changed) interrupt, handled in pciehp_ist(), pciehp unloads the device driver and powers the device off.

This sequence causes an IOMMU device-TLB invalidation request (the Intel VT-d spec term; called ATS Invalidation in PCIe spec r6.1) to be sent to a target device that no longer exists. Once the resulting ITE (Invalidation Time-out Error) fault is triggered, the request is retried in a dead loop in interrupt context.

This can lead to continuous hard lockup warnings and a system hang, as shown below:

[ 4211.433662] pcieport 0000:17:01.0: pciehp: Slot(108): Link Down
[ 4211.433664] pcieport 0000:17:01.0: pciehp: Slot(108): Card not present
[ 4223.822591] NMI watchdog: Watchdog detected hard LOCKUP on cpu 144
[ 4223.822622] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S OE kernel version xxxx

---truncated---

This vulnerability has been resolved by not issuing an ATS Invalidation request when the device is disconnected.
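
The idea behind the fix can be illustrated with a minimal user-space sketch. This is not the kernel patch: the struct, the result codes, and every sketch_* name below are hypothetical stand-ins invented for illustration. The kernel has a helper named pci_dev_is_disconnected(); whether the actual change calls it at the point modeled here is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI endpoint; NOT the kernel's struct pci_dev. */
struct sketch_dev {
    bool disconnected;       /* set once link-down / surprise removal is seen */
    bool ats_enabled;        /* device-TLB (ATS) was enabled for this device */
    int  invalidations_sent; /* counts requests that would reach the device */
};

/* Hypothetical result codes for this sketch. */
enum sketch_result { SKETCH_SENT, SKETCH_SKIPPED };

/* Models a "is the device gone?" check (assumed analogue of
 * pci_dev_is_disconnected()). */
static bool sketch_dev_is_disconnected(const struct sketch_dev *dev)
{
    return dev->disconnected;
}

/* Models the device-TLB flush path: return early instead of queuing an
 * invalidation that a removed device can never acknowledge. */
static enum sketch_result sketch_flush_dev_tlb(struct sketch_dev *dev)
{
    assert(dev != NULL);

    if (!dev->ats_enabled)
        return SKETCH_SKIPPED;   /* nothing cached on the device side */

    if (sketch_dev_is_disconnected(dev))
        return SKETCH_SKIPPED;   /* no target: avoid the timeout and retry */

    dev->invalidations_sent++;   /* stands in for submitting the request */
    return SKETCH_SENT;
}
```

In this sketch, calling sketch_flush_dev_tlb() on a device with disconnected set returns SKETCH_SKIPPED without incrementing invalidations_sent, while a present device with ATS enabled gets exactly one request. Checking before submission, rather than handling the timeout afterwards, keeps the interrupt-context path from ever entering the retry loop described above.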

For more information, please refer to the original references:

- Linux kernel commits history
- Linux Kernel Mailing List

Timeline

Published on: 04/17/2024 11:15:10 UTC
Last modified on: 06/25/2024 22:15:25 UTC