Message ID | b712b87a509dc64619a3aec7c673935ed09be5de.1531488173.git.ps@pks.im (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Jul 13, 2018 at 9:29 PM, Patrick Steinhardt <ps@pks.im> wrote: > When power management for SCSI is enabled and if a device uses blk-mq, > it is possible to trigger a `NULL` pointer exception when resuming that > device. The NPE is triggered when trying to dereference the `request_fn` > function pointer of the device's `request_queue`: > > __blk_run_queue_uncond:470 > __blk_run_queue:490 > blk_post_runtime_resume:3889 > sdev_runtime_resume:263 > scsi_runtime_resume:275 > > When the SCSI device is being allocated by `scsi_alloc_sdev`, the > device's request queue will either be initialized via > `scsi_mq_alloc_queue` or `scsi_old_alloc_queue`. But the `request_fn` > member of the request queue is in fact only being set in > `scsi_old_alloc_queue`, which will then later cause the mentioned NPE. > > Fix the issue by checking whether the `request_fn` is set in > `__blk_run_queue_uncond`. In case it is unset, we'll silently return and > not try to invoke the callback, thus fixing the NPE. > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > --- > > Since at least v4.14, I am easily able to trigger above NPE by > unplugging USB mass storage devices on my computer (Skylake, ASUS > Z170I) with CONFIG_SCSI_MQ_DEFAULT=y. The attached patch fixes > the issue, but keep in mind that this is my first patch, so the > proposed fix may not be appropriate at all. Feedback would be > highly appreciated. > > block/blk-core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/blk-core.c b/block/blk-core.c > index f84a9b7b6f5a..0a2041660cd9 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -456,7 +456,7 @@ inline void __blk_run_queue_uncond(struct request_queue *q) > lockdep_assert_held(q->queue_lock); > WARN_ON_ONCE(q->mq_ops); > > - if (unlikely(blk_queue_dead(q))) > + if (unlikely(!q->request_fn) || unlikely(blk_queue_dead(q))) > return; > Now runtime PM is disabled for blk-mq/scsi_mq, not sure how this issue is triggered on your machine. Could you share the steps for reproducing this issue? Thanks, Ming Lei
On Fri, Jul 13, 2018 at 09:41:41PM +0800, Ming Lei wrote: > On Fri, Jul 13, 2018 at 9:29 PM, Patrick Steinhardt <ps@pks.im> wrote: > > When power management for SCSI is enabled and if a device uses blk-mq, > > it is possible to trigger a `NULL` pointer exception when resuming that > > device. The NPE is triggered when trying to dereference the `request_fn` > > function pointer of the device's `request_queue`: > > > > __blk_run_queue_uncond:470 > > __blk_run_queue:490 > > blk_post_runtime_resume:3889 > > sdev_runtime_resume:263 > > scsi_runtime_resume:275 > > > > When the SCSI device is being allocated by `scsi_alloc_sdev`, the > > device's request queue will either be initialized via > > `scsi_mq_alloc_queue` or `scsi_old_alloc_queue`. But the `request_fn` > > member of the request queue is in fact only being set in > > `scsi_old_alloc_queue`, which will then later cause the mentioned NPE. > > > > Fix the issue by checking whether the `request_fn` is set in > > `__blk_run_queue_uncond`. In case it is unset, we'll silently return and > > not try to invoke the callback, thus fixing the NPE. > > > > Signed-off-by: Patrick Steinhardt <ps@pks.im> > > --- > > > > Since at least v4.14, I am easily able to trigger above NPE by > > unplugging USB mass storage devices on my computer (Skylake, ASUS > > Z170I) with CONFIG_SCSI_MQ_DEFAULT=y. The attached patch fixes > > the issue, but keep in mind that this is my first patch, so the > > proposed fix may not be appropriate at all. Feedback would be > > highly appreciated. > > > > block/blk-core.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/block/blk-core.c b/block/blk-core.c > > index f84a9b7b6f5a..0a2041660cd9 100644 > > --- a/block/blk-core.c > > +++ b/block/blk-core.c > > @@ -456,7 +456,7 @@ inline void __blk_run_queue_uncond(struct request_queue *q) > > lockdep_assert_held(q->queue_lock); > > WARN_ON_ONCE(q->mq_ops); > > > > - if (unlikely(blk_queue_dead(q))) > > + if (unlikely(!q->request_fn) || unlikely(blk_queue_dead(q))) > > return; > > > > Now runtime PM is disabled for blk-mq/scsi_mq, not sure how this issue is > triggered on your machine. > > Could you share the steps for reproducing this issue? I bet that the issue stems from custom hotplugging scripts then, which change the value of power/control. See the attachment of this mail for all sysfs changes that are being performed after plugging in the USB stick. Basically, the reproduction steps on my machine are: 1. plug in USB stick (assumed to be /dev/sdb now) 2. wait a short amount of time 3. dd if=/dev/sdb of=/dev/null bs=4M 4. wait a short amount of time 5. unplug the USB stick, which immediately crashes the system The issue isn't always reproducible, I think there is some variance depending on how much time passes by at step 2 and/or 4. It probably is related to the autosuspend delay. From my experience the crash becomes more likely the longer I wait after step 4. Regards Patrick /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/power/control -> on /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/scsi_host/host4/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/scsi_host/host4/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/scsi_device/4:0:0:0/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/bsg/4:0:0:0/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/scsi_device/4:0:0:0/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/bsg/4:0:0:0/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/power/control -> auto /sys//devices/virtual/bdi/8:16/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/scsi_disk/4:0:0:0/power/autosuspend_delay_ms -> 60000 /sys//devices/virtual/bdi/8:16/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/scsi_disk/4:0:0:0/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/sdb2/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/sdb2/power/control -> auto /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/sdb1/power/autosuspend_delay_ms -> 60000 /sys//devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2:1.0/host4/target4:0:0/4:0:0:0/block/sdb/sdb1/power/control -> auto
On Fri, 2018-07-13 at 15:29 +0200, Patrick Steinhardt wrote: > When power management for SCSI is enabled and if a device uses blk-mq, > it is possible to trigger a `NULL` pointer exception when resuming that > device. The NPE is triggered when trying to dereference the `request_fn` > function pointer of the device's `request_queue`: > > __blk_run_queue_uncond:470 > __blk_run_queue:490 > blk_post_runtime_resume:3889 > sdev_runtime_resume:263 > scsi_runtime_resume:275 > > When the SCSI device is being allocated by `scsi_alloc_sdev`, the > device's request queue will either be initialized via > `scsi_mq_alloc_queue` or `scsi_old_alloc_queue`. But the `request_fn` > member of the request queue is in fact only being set in > `scsi_old_alloc_queue`, which will then later cause the mentioned NPE. > > Fix the issue by checking whether the `request_fn` is set in > `__blk_run_queue_uncond`. In case it is unset, we'll silently return and > not try to invoke the callback, thus fixing the NPE. Which kernel version are you using? Can you check whether the following two commits are in your kernel tree? * 4fd41a8552af ("SCSI: Fix NULL pointer dereference in runtime PM"; December 2015). * 765e40b675a9 ("block: disable runtime-pm for blk-mq"; July 2017). Thanks, Bart.
On Wed, Jul 25, 2018 at 06:13:02PM +0000, Bart Van Assche wrote: > On Fri, 2018-07-13 at 15:29 +-0200, Patrick Steinhardt wrote: > +AD4- When power management for SCSI is enabled and if a device uses blk-mq, > +AD4- it is possible to trigger a +AGA-NULL+AGA- pointer exception when resuming that > +AD4- device. The NPE is triggered when trying to dereference the +AGA-request+AF8-fn+AGA- > +AD4- function pointer of the device's +AGA-request+AF8-queue+AGA-: > +AD4- > +AD4- +AF8AXw-blk+AF8-run+AF8-queue+AF8-uncond:470 > +AD4- +AF8AXw-blk+AF8-run+AF8-queue:490 > +AD4- blk+AF8-post+AF8-runtime+AF8-resume:3889 > +AD4- sdev+AF8-runtime+AF8-resume:263 > +AD4- scsi+AF8-runtime+AF8-resume:275 > +AD4- > +AD4- When the SCSI device is being allocated by +AGA-scsi+AF8-alloc+AF8-sdev+AGA-, the > +AD4- device's request queue will either be initialized via > +AD4- +AGA-scsi+AF8-mq+AF8-alloc+AF8-queue+AGA- or +AGA-scsi+AF8-old+AF8-alloc+AF8-queue+AGA-. But the +AGA-request+AF8-fn+AGA- > +AD4- member of the request queue is in fact only being set in > +AD4- +AGA-scsi+AF8-old+AF8-alloc+AF8-queue+AGA-, which will then later cause the mentioned NPE. > +AD4- > +AD4- Fix the issue by checking whether the +AGA-request+AF8-fn+AGA- is set in > +AD4- +AGAAXwBf-blk+AF8-run+AF8-queue+AF8-uncond+AGA-. In case it is unset, we'll silently return and > +AD4- not try to invoke the callback, thus fixing the NPE. > > Which kernel version are you using? Can you check whether the following two > commits are in your kernel tree? > > +ACo- 4fd41a8552af (+ACI-SCSI: Fix NULL pointer dereference in runtime PM+ACIAOw- December > 2015). > +ACo- 765e40b675a9 (+ACI-block: disable runtime-pm for blk-mq+ACIAOw- July 2017). Commit 765e40b675a9 (block: disable runtime-pm for blk-mq July 2017) was indeed not part of my kernel. Back when I upgraded to v4.14, suspend to RAM on my laptop was broken, and a bisect showed that this particular commit was the culprit. So I reverted it and forgot until it bit me now -- so it's been my own stupidity. Guess it serves me right for not checking my own changes. That still leaves the other problem of broken suspend. I've just checked with v4.17.10, and it's still there: as soon as I resume from suspend, the kernel oopses and reboots the machine. I guess I'll have to do another debugging session and see where it fails exactly (and no, this time there are no more changes to the kernel tree). Anyway, thanks for pointing out my own mistakes. Regards Patrick
On Thu, 2018-07-26 at 10:38 +0200, Patrick Steinhardt wrote: > That still leaves the other problem of broken suspend. I've just > checked with v4.17.10, and it's still there: as soon as I resume > from suspend, the kernel oopses and reboots the machine. I guess > I'll have to do another debugging session and see where it fails > exactly (and no, this time there are no more changes to the > kernel tree). Can you share the kernel oops details before you start running a bisect? Thanks, Bart.
On Thu, Jul 26, 2018 at 01:51:48PM +0000, Bart Van Assche wrote: > On Thu, 2018-07-26 at 10:38 +-0200, Patrick Steinhardt wrote: > +AD4- That still leaves the other problem of broken suspend. I've just > +AD4- checked with v4.17.10, and it's still there: as soon as I resume > +AD4- from suspend, the kernel oopses and reboots the machine. I guess > +AD4- I'll have to do another debugging session and see where it fails > +AD4- exactly (and no, this time there are no more changes to the > +AD4- kernel tree). > > Can you share the kernel oops details before you start running a bisect? I've already done the bisect a few months ago, which resulted in commit 765e40b675a9 (block: disable runtime-pm for blk-mq July 2017). I've never been able to get the Oops, as the screen was blank, but I tried a little harder with pstore today and finally got hold of it. Please see below for the Oops and previous dmesg entries. Kernel version was v4.17.10. Thanks Patrick <6>[ 49.088931] PM: suspend entry (deep) <6>[ 49.088933] PM: Syncing filesystems ... done. <6>[ 49.093066] Freezing user space processes ... (elapsed 0.001 seconds) done. <6>[ 49.094283] OOM killer disabled. <6>[ 49.094284] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. <6>[ 49.095470] Suspending console(s) (use no_console_suspend to debug) <6>[ 50.350548] ACPI: EC: interrupt blocked <6>[ 50.383913] ACPI: Preparing to enter system sleep state S3 <6>[ 50.397529] ACPI: EC: EC stopped <6>[ 50.397529] ACPI: EC: event blocked <6>[ 50.397530] PM: Saving platform NVS memory <6>[ 50.397605] Disabling non-boot CPUs ... <6>[ 50.414940] smpboot: CPU 1 is now offline <6>[ 50.445153] smpboot: CPU 2 is now offline <6>[ 50.471702] smpboot: CPU 3 is now offline <6>[ 50.473710] ACPI: Low-level resume complete <6>[ 50.473770] ACPI: EC: EC started <6>[ 50.473771] PM: Restoring platform NVS memory <6>[ 50.476149] Enabling non-boot CPUs ... <6>[ 50.476175] x86: Booting SMP configuration: <6>[ 50.476176] smpboot: Booting Node 0 Processor 1 APIC 0x2 <4>[ 50.478031] cache: parent cpu1 should not be sleeping <6>[ 50.478333] CPU1 is up <6>[ 50.478348] smpboot: Booting Node 0 Processor 2 APIC 0x1 <4>[ 50.478950] cache: parent cpu2 should not be sleeping <6>[ 50.479194] CPU2 is up <6>[ 50.479206] smpboot: Booting Node 0 Processor 3 APIC 0x3 <4>[ 50.479824] cache: parent cpu3 should not be sleeping <6>[ 50.480470] CPU3 is up <6>[ 50.490620] ACPI: Waking up from system sleep state S3 <6>[ 53.334256] ACPI: EC: interrupt unblocked <6>[ 53.411349] ACPI: EC: event unblocked <1>[ 53.411700] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8 <6>[ 53.411705] PGD 0 P4D 0 <4>[ 53.411711] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI <4>[ 53.411716] CPU: 3 PID: 1600 Comm: kworker/u8:48 Tainted: G U 4.17.10 #9 <4>[ 53.411717] Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A15 01/23/2018 <4>[ 53.411724] Workqueue: events_unbound async_run_entry_fn <4>[ 53.411733] RIP: 0010:blk_set_runtime_active+0x2d/0x50 <4>[ 53.411736] RSP: 0000:ffff88020ef37df0 EFLAGS: 00010046 <4>[ 53.411739] RAX: 0000000000000000 RBX: ffff880210a9a000 RCX: 0000000000000000 <4>[ 53.411742] RDX: 00000000fffedf06 RSI: 0000000000000009 RDI: ffff880210a9a1b0 <4>[ 53.411745] RBP: ffffffff815158d0 R08: ffffffff81c13ce0 R09: 0000000000000000 <4>[ 53.411747] R10: 0000000000004400 R11: 0000000000005cd3 R12: ffff8802141671e8 <4>[ 53.411749] R13: 0000000000000000 R14: ffff880215414c98 R15: 0000000000000000 <4>[ 53.411753] FS: 0000000000000000(0000) GS:ffff88021e780000(0000) knlGS:0000000000000000 <4>[ 53.411755] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 53.411757] CR2: 00000000000001a8 CR3: 000000000300a003 CR4: 00000000003606e0 <4>[ 53.411758] Call Trace: <4>[ 53.411766] scsi_bus_resume_common+0xf8/0x110 <4>[ 53.411771] ? scsi_bus_thaw+0x10/0x10 <4>[ 53.411776] dpm_run_callback.isra.16+0x27/0x70 <4>[ 53.411781] device_resume+0xab/0x170 <4>[ 53.411785] async_resume+0x14/0x40 <4>[ 53.411788] async_run_entry_fn+0x34/0x100 <4>[ 53.411793] process_one_work+0x14d/0x2c0 <4>[ 53.411797] worker_thread+0x29/0x370 <4>[ 53.411800] ? process_one_work+0x2c0/0x2c0 <4>[ 53.411803] kthread+0x109/0x120 <4>[ 53.411806] ? __kthread_create_on_node+0x190/0x190 <4>[ 53.411810] ret_from_fork+0x35/0x40 <4>[ 53.411813] Code: 89 fb 48 8b bf b8 01 00 00 e8 30 8c 4e 00 48 8b 83 40 02 00 00 be 09 00 00 00 c7 83 48 02 00 00 00 00 00 00 48 8b 15 03 31 cf 00 <48> 89 90 a8 01 00 00 48 8b bb 40 02 00 00 e8 00 1e 1e 00 48 8b <1>[ 53.411873] RIP: blk_set_runtime_active+0x2d/0x50 RSP: ffff88020ef37df0 <4>[ 53.411874] CR2: 00000000000001a8 <4>[ 53.411877] ---[ end trace bbd10b6fcf31d6c4 ]---
On Fri, 2018-07-27 at 14:35 +0200, Patrick Steinhardt wrote: > <1>[ 53.411700] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8 > <4>[ 53.411733] RIP: 0010:blk_set_runtime_active+0x2d/0x50 Are you sure commit 4fd41a8552af ("SCSI: Fix NULL pointer dereference in runtime PM"; December 2015) is in your kernel tree? Bart.
Hi, On Fri, Jul 13, 2018 at 09:41:41PM +0800, Ming Lei wrote: > Now runtime PM is disabled for blk-mq/scsi_mq, not sure how this issue is > triggered on your machine. While Patrick did miss the following patch: * 765e40b675a9 ("block: disable runtime-pm for blk-mq"; July 2017). there is at least one other way to trigger it -- enable laptop-mode-tools or tlp which enable runtime-pm for all devices. The "disable runtime-pm for blk-mq" only disables it _by_default_, but doesn't prevent it from being enabled again from user-space, which it is unless one manually blacklists sd devices from runtime-pm enablement. It's bitten a few people already: https://github.com/rickysarraf/laptop-mode-tools/issues/123 (I found this thread because I'm also getting the NULL pointer dereference at 00000000000001a8 on resume from suspend.)
On Sun, Jul 29, 2018 at 11:41:31AM +0200, Tomas Janousek wrote: > Hi, > > On Fri, Jul 13, 2018 at 09:41:41PM +0800, Ming Lei wrote: > > Now runtime PM is disabled for blk-mq/scsi_mq, not sure how this issue is > > triggered on your machine. > > While Patrick did miss the following patch: > > * 765e40b675a9 ("block: disable runtime-pm for blk-mq"; July 2017). > > there is at least one other way to trigger it -- enable laptop-mode-tools > or tlp which enable runtime-pm for all devices. > > The "disable runtime-pm for blk-mq" only disables it _by_default_, but doesn't > prevent it from being enabled again from user-space, which it is unless one > manually blacklists sd devices from runtime-pm enablement. It's bitten a few > people already: https://github.com/rickysarraf/laptop-mode-tools/issues/123 > > (I found this thread because I'm also getting the NULL pointer dereference at > 00000000000001a8 on resume from suspend.) Huh, I did send out some more details on how I reproduce the issue, but it seems like my mail didn't get through. While I don't use laptop-mode-tools, I do have some custom hotplugging scripts which do in fact enable runtime-PM for most devices. Regards Patrick
On Mon, Jul 30, 2018 at 09:59:49AM +0200, Patrick Steinhardt wrote: > On Sun, Jul 29, 2018 at 11:41:31AM +0200, Tomas Janousek wrote: > > Hi, > > > > On Fri, Jul 13, 2018 at 09:41:41PM +0800, Ming Lei wrote: > > > Now runtime PM is disabled for blk-mq/scsi_mq, not sure how this issue is > > > triggered on your machine. > > > > While Patrick did miss the following patch: > > > > * 765e40b675a9 ("block: disable runtime-pm for blk-mq"; July 2017). > > > > there is at least one other way to trigger it -- enable laptop-mode-tools > > or tlp which enable runtime-pm for all devices. > > > > The "disable runtime-pm for blk-mq" only disables it _by_default_, but doesn't > > prevent it from being enabled again from user-space, which it is unless one > > manually blacklists sd devices from runtime-pm enablement. It's bitten a few > > people already: https://github.com/rickysarraf/laptop-mode-tools/issues/123 > > > > (I found this thread because I'm also getting the NULL pointer dereference at > > 00000000000001a8 on resume from suspend.) > > Huh, I did send out some more details on how I reproduce the > issue, but it seems like my mail didn't get through. While I > don't use laptop-mode-tools, I do have some custom hotplugging > scripts which do in fact enable runtime-PM for most devices. Now runtime PM is still enabled for sd/sr, and I believe the following patch is needed until we figure out one perfect way for supporting it well: diff --git a/block/blk-core.c b/block/blk-core.c index 03a4ea93a5f3..090b782df129 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -3769,9 +3769,11 @@ EXPORT_SYMBOL(blk_finish_plug); */ void blk_pm_runtime_init(struct request_queue *q, struct device *dev) { - /* not support for RQF_PM and ->rpm_status in blk-mq yet */ - if (q->mq_ops) + /* Don't enable runtime PM for blk-mq until it is ready */ + if (q->mq_ops) { + pm_runtime_disable(dev); return; + } q->dev = dev; q->rpm_status = RPM_ACTIVE; Thanks, Ming
Hi Ming, On Mon, Jul 30, 2018 at 07:50:03PM +0800, Ming Lei wrote: > Now runtime PM is still enabled for sd/sr, and I believe the following > patch is needed until we figure out one perfect way for supporting it > well: > [...] > + pm_runtime_disable(dev); Awesome, thanks for the quick reaction! :-) (I don't happen to have a .config for the hardware I'm experincing the problem on, so don't expect a Tested-By from me, but I don't see why it wouldn't fix the issue, and it did for Patrick.)
diff --git a/block/blk-core.c b/block/blk-core.c index f84a9b7b6f5a..0a2041660cd9 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -456,7 +456,7 @@ inline void __blk_run_queue_uncond(struct request_queue *q) lockdep_assert_held(q->queue_lock); WARN_ON_ONCE(q->mq_ops); - if (unlikely(blk_queue_dead(q))) + if (unlikely(!q->request_fn) || unlikely(blk_queue_dead(q))) return; /*
When power management for SCSI is enabled and if a device uses blk-mq, it is possible to trigger a `NULL` pointer exception when resuming that device. The NPE is triggered when trying to dereference the `request_fn` function pointer of the device's `request_queue`: __blk_run_queue_uncond:470 __blk_run_queue:490 blk_post_runtime_resume:3889 sdev_runtime_resume:263 scsi_runtime_resume:275 When the SCSI device is being allocated by `scsi_alloc_sdev`, the device's request queue will either be initialized via `scsi_mq_alloc_queue` or `scsi_old_alloc_queue`. But the `request_fn` member of the request queue is in fact only being set in `scsi_old_alloc_queue`, which will then later cause the mentioned NPE. Fix the issue by checking whether the `request_fn` is set in `__blk_run_queue_uncond`. In case it is unset, we'll silently return and not try to invoke the callback, thus fixing the NPE. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Since at least v4.14, I am easily able to trigger above NPE by unplugging USB mass storage devices on my computer (Skylake, ASUS Z170I) with CONFIG_SCSI_MQ_DEFAULT=y. The attached patch fixes the issue, but keep in mind that this is my first patch, so the proposed fix may not be appropriate at all. Feedback would be highly appreciated. block/blk-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)