
scsi: core: Call blk_mq_free_tag_set() earlier

Message ID 20220628175612.2157218-1-bvanassche@acm.org (mailing list archive)
State Superseded
Series scsi: core: Call blk_mq_free_tag_set() earlier

Commit Message

Bart Van Assche June 28, 2022, 5:56 p.m. UTC
There are two .exit_cmd_priv implementations. Both implementations use the
SCSI host pointer. Make sure that the SCSI host pointer is valid when
.exit_cmd_priv is called by moving the .exit_cmd_priv calls from
scsi_device_dev_release() to scsi_forget_host(). Moving
blk_mq_free_tag_set() from scsi_device_dev_release() to scsi_forget_host()
is safe because scsi_forget_host() drains all the request queues that use
the host tag set. This guarantees that no requests are in flight and also
that no new requests will be allocated from the host tag set.

This patch fixes the following use-after-free:

==================================================================
BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
Read of size 8 at addr ffff888100337000 by task multipathd/16727
Call Trace:
 <TASK>
 dump_stack_lvl+0x34/0x44
 print_report.cold+0x5e/0x5db
 kasan_report+0xab/0x120
 srp_exit_cmd_priv+0x27/0xd0 [ib_srp]
 scsi_mq_exit_request+0x4d/0x70
 blk_mq_free_rqs+0x143/0x410
 __blk_mq_free_map_and_rqs+0x6e/0x100
 blk_mq_free_tag_set+0x2b/0x160
 scsi_host_dev_release+0xf3/0x1a0
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 scsi_device_dev_release_usercontext+0x4c1/0x4e0
 execute_in_process_context+0x23/0x90
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 scsi_disk_release+0x3f/0x50
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 disk_release+0x17f/0x1b0
 device_release+0x54/0xe0
 kobject_put+0xa5/0x120
 dm_put_table_device+0xa3/0x160 [dm_mod]
 dm_put_device+0xd0/0x140 [dm_mod]
 free_priority_group+0xd8/0x110 [dm_multipath]
 free_multipath+0x94/0xe0 [dm_multipath]
 dm_table_destroy+0xa2/0x1e0 [dm_mod]
 __dm_destroy+0x196/0x350 [dm_mod]
 dev_remove+0x10c/0x160 [dm_mod]
 ctl_ioctl+0x2c2/0x590 [dm_mod]
 dm_ctl_ioctl+0x5/0x10 [dm_mod]
 __x64_sys_ioctl+0xb4/0xf0
 dm_ctl_ioctl+0x5/0x10 [dm_mod]
 __x64_sys_ioctl+0xb4/0xf0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Zhijian <lizhijian@fujitsu.com>
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Tested-by: Li Zhijian <lizhijian@fujitsu.com>
Fixes: 65ca846a5314 ("scsi: core: Introduce {init,exit}_cmd_priv()")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/hosts.c    | 10 +++++-----
 drivers/scsi/scsi_lib.c |  3 +++
 2 files changed, 8 insertions(+), 5 deletions(-)
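
The ordering argument in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `demo_*` name is hypothetical, the "driver data" stands in for whatever host-associated resources an .exit_cmd_priv implementation touches, and a counter replaces the actual invalid memory access. The release path in the model still calls the guarded destroy function to show that the guard turns it into a no-op; the patch itself removes that call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct demo_driver_data {
	int freed_cmd_bufs;
};

struct demo_host {
	struct demo_driver_data *drv;	/* assumed to go away at remove time */
	bool tags_allocated;
	int nr_cmds;
	int unsafe_exit_calls;		/* exit callback ran after drv was gone */
};

/* Models .exit_cmd_priv: it needs driver data reachable through the host. */
static void demo_exit_cmd_priv(struct demo_host *h)
{
	if (!h->drv) {
		h->unsafe_exit_calls++;	/* real code would read freed memory */
		return;
	}
	h->drv->freed_cmd_bufs++;
}

/* Models scsi_mq_destroy_tags() with the early-return guard. */
static void demo_destroy_tags(struct demo_host *h)
{
	if (!h->tags_allocated)
		return;
	for (int i = 0; i < h->nr_cmds; i++)
		demo_exit_cmd_priv(h);
	h->tags_allocated = false;
}

/* Models host removal followed by the driver freeing its own data. */
static void demo_remove_host(struct demo_host *h, bool free_tags_early)
{
	if (free_tags_early)
		demo_destroy_tags(h);
	free(h->drv);
	h->drv = NULL;
}

/* Models the release handler that runs when the last reference is dropped. */
static void demo_host_release(struct demo_host *h)
{
	demo_destroy_tags(h);
}

/* Returns how many exit callbacks ran without valid driver data, or -1. */
static int demo_run(bool free_tags_early)
{
	struct demo_host h = { .tags_allocated = true, .nr_cmds = 4 };

	h.drv = calloc(1, sizeof(*h.drv));
	if (!h.drv)
		return -1;
	demo_remove_host(&h, free_tags_early);
	demo_host_release(&h);
	return h.unsafe_exit_calls;
}
```

In this model, `demo_run(false)` reports one unsafe callback per command because the tags are only freed from the release handler, while `demo_run(true)` reports none because the callbacks run before the driver data is freed.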

Comments

Ming Lei June 29, 2022, 1:17 a.m. UTC | #1
On Tue, Jun 28, 2022 at 10:56:12AM -0700, Bart Van Assche wrote:
> There are two .exit_cmd_priv implementations. Both implementations use the
> SCSI host pointer. Make sure that the SCSI host pointer is valid when
> .exit_cmd_priv is called by moving the .exit_cmd_priv calls from
> scsi_device_dev_release() to scsi_forget_host(). Moving
> blk_mq_free_tag_set() from scsi_device_dev_release() to scsi_forget_host()
> is safe because scsi_forget_host() drains all the request queues that use
> the host tag set. This guarantees that no requests are in flight and also
> that no new requests will be allocated from the host tag set.

I am not sure scsi_forget_host() really drains all queues, since it skips
any sdev whose state is SDEV_DEL, so removal of such an sdev could still be
in progress rather than done.

Thanks,
Ming
Bart Van Assche June 29, 2022, 9:49 p.m. UTC | #2
On 6/28/22 18:17, Ming Lei wrote:
> On Tue, Jun 28, 2022 at 10:56:12AM -0700, Bart Van Assche wrote:
>> There are two .exit_cmd_priv implementations. Both implementations use the
>> SCSI host pointer. Make sure that the SCSI host pointer is valid when
>> .exit_cmd_priv is called by moving the .exit_cmd_priv calls from
>> scsi_device_dev_release() to scsi_forget_host(). Moving
>> blk_mq_free_tag_set() from scsi_device_dev_release() to scsi_forget_host()
>> is safe because scsi_forget_host() drains all the request queues that use
>> the host tag set. This guarantees that no requests are in flight and also
>> that no new requests will be allocated from the host tag set.
> 
> I am not sure scsi_forget_host() really drains all queues, since it skips
> any sdev whose state is SDEV_DEL, so removal of such an sdev could still be
> in progress rather than done.

Ah, that's right. How about making scsi_forget_host() wait until all request
activity on the associated queues has stopped, e.g. by replacing the current
scsi_forget_host() implementation by the one below?

/**
 * scsi_forget_host() - Remove all SCSI devices from a host.
 * @shost: SCSI host to remove devices from.
 *
 * Removes all SCSI devices that have not yet been removed. For the SCSI devices
 * for which removal started before scsi_forget_host(), wait until the
 * associated request queue has reached the "dead" state. In that state it is
 * guaranteed that no new requests will be allocated and also that no requests
 * are in progress anymore.
 */
void scsi_forget_host(struct Scsi_Host *shost)
{
	struct scsi_device *sdev;

	might_sleep();

 restart:
	spin_lock_irq(shost->host_lock);
	list_for_each_entry(sdev, &shost->__devices, siblings) {
		if (sdev->sdev_state == SDEV_DEL &&
		    blk_queue_dead(sdev->request_queue)) {
			continue;
		}
		if (sdev->sdev_state == SDEV_DEL) {
			get_device(&sdev->sdev_gendev);
			spin_unlock_irq(shost->host_lock);

			while (!blk_queue_dead(sdev->request_queue))
				msleep(10);

			/*
			 * Drop the reference without retaking host_lock:
			 * put_device() may sleep, and the code at "restart"
			 * reacquires the lock itself.
			 */
			put_device(&sdev->sdev_gendev);
			goto restart;
		}
		spin_unlock_irq(shost->host_lock);
		__scsi_remove_device(sdev);
		goto restart;
	}
	spin_unlock_irq(shost->host_lock);
}

Thanks,

Bart.
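
The restart-on-every-change scan used in the proposal above can be tried out in userspace with a minimal sketch. Assumptions: all `demo_*` names are hypothetical, locking is omitted because the model is single-threaded, and a per-device countdown stands in for the progress a concurrent remover would make while the real function sleeps in msleep().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_state { DEMO_RUNNING, DEMO_DEL };

struct demo_dev {
	enum demo_state state;
	bool queue_dead;
	int polls_until_dead;	/* stand-in for a concurrent remover's progress */
	struct demo_dev *next;
};

/* Models one msleep() interval during which removal elsewhere advances. */
static void demo_sleep_tick(struct demo_dev *d)
{
	if (d->polls_until_dead > 0)
		d->polls_until_dead--;
	if (d->polls_until_dead <= 0)
		d->queue_dead = true;
}

/* Models __scsi_remove_device(): afterwards the queue is dead. */
static void demo_remove_device(struct demo_dev *d)
{
	d->state = DEMO_DEL;
	d->queue_dead = true;
}

/* Rescans from the head after every action; returns the number of scans. */
static int demo_forget_host(struct demo_dev *head)
{
	struct demo_dev *d;
	int scans = 0;

restart:
	scans++;
	for (d = head; d; d = d->next) {
		if (d->state == DEMO_DEL && d->queue_dead)
			continue;	/* fully removed already */
		if (d->state == DEMO_DEL) {
			/* removal started elsewhere: wait for it to finish */
			while (!d->queue_dead)
				demo_sleep_tick(d);
			goto restart;
		}
		demo_remove_device(d);
		goto restart;
	}
	return scans;
}

/* One running device, one mid-removal device, one fully removed device. */
static int demo_forget_example(void)
{
	struct demo_dev c = { DEMO_DEL, true, 0, NULL };
	struct demo_dev b = { DEMO_DEL, false, 3, &c };
	struct demo_dev a = { DEMO_RUNNING, false, 0, &b };
	int scans = demo_forget_host(&a);

	if (!(a.queue_dead && b.queue_dead && c.queue_dead))
		return -1;
	return scans;
}
```

The function only returns after a scan that finds every device both deleted and dead, which is the property the proposal relies on before the host tag set is freed. Restarting from the head is quadratic in the worst case, but it avoids holding an iterator across the points where the real code drops the host lock.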
Ming Lei June 30, 2022, 1:01 a.m. UTC | #3
Hi Bart,

I'd rather understand the issue first.

On Wed, Jun 29, 2022 at 02:49:27PM -0700, Bart Van Assche wrote:
> On 6/28/22 18:17, Ming Lei wrote:
> > On Tue, Jun 28, 2022 at 10:56:12AM -0700, Bart Van Assche wrote:
> > > There are two .exit_cmd_priv implementations. Both implementations use the
> > > SCSI host pointer. Make sure that the SCSI host pointer is valid when
> > > .exit_cmd_priv is called by moving the .exit_cmd_priv calls from
> > > scsi_device_dev_release() to scsi_forget_host(). Moving

.exit_cmd_priv is actually called from scsi_host_dev_release() instead
of scsi_device_dev_release(). Both the scsi host pointer and host->shost_data are
still valid when calling .exit_cmd_priv via scsi_mq_destroy_tags().

I previously fixed[1] a similar issue that was caused by early module
unloading: anywhere host->hostt is referenced, the scsi driver module
must be prevented from being unloaded.


[1] f2b85040acec scsi: core: Put LLD module refcnt after SCSI device is released


Thanks,
Ming
Ming Lei June 30, 2022, 8:57 a.m. UTC | #4
On Thu, Jun 30, 2022 at 09:01:39AM +0800, Ming Lei wrote:
> Hi Bart,
> 
> I'd rather understand the issue first.
> 
> On Wed, Jun 29, 2022 at 02:49:27PM -0700, Bart Van Assche wrote:
> > On 6/28/22 18:17, Ming Lei wrote:
> > > On Tue, Jun 28, 2022 at 10:56:12AM -0700, Bart Van Assche wrote:
> > > > There are two .exit_cmd_priv implementations. Both implementations use the
> > > > SCSI host pointer. Make sure that the SCSI host pointer is valid when
> > > > .exit_cmd_priv is called by moving the .exit_cmd_priv calls from
> > > > scsi_device_dev_release() to scsi_forget_host(). Moving
> 
> .exit_cmd_priv is actually called from scsi_host_dev_release() instead
> of scsi_device_dev_release(). Both the scsi host pointer and host->shost_data are
> still valid when calling .exit_cmd_priv via scsi_mq_destroy_tags().
> 
> I previously fixed[1] a similar issue that was caused by early module
> unloading: anywhere host->hostt is referenced, the scsi driver module
> must be prevented from being unloaded.
> 
> 
> [1] f2b85040acec scsi: core: Put LLD module refcnt after SCSI device is released

Hi Bart,

BTW, Changhui reported a very similar issue when running elevator
switching together with scsi_debug LUN hotplug.

From Changhui's report, the issue is basically the same as the one
f2b85040acec tried to address, but the try_module_get() in
scsi_device_dev_release() may fail, so the scsi_debug module
can still be unloaded.

The thing is that an sdev can be released asynchronously, and target/host
release is triggered by scsi_device_dev_release_usercontext().

So after scsi_remove_host() returns, the shost may still be live from the
driver core/sysfs viewpoint, and its release handler can be called
after the LLD module is unloaded. That is what triggers this kind of issue.

There seem to be at least two approaches for fixing the issue:

1) the one suggested in this thread:
- move any reference to shost->hostt in the host release handler into
scsi_remove_host(); scsi_mq_destroy_tags() and
scsi_proc_hostdir_rm(shost->hostt) should be covered at least

2) wait until all targets are released in scsi_remove_host()

I am fine with either of the two approaches.

Bart, please let me know if you are working towards the approach in 1).
If not, I have one patch which implements 2).

BTW, after either 1) or 2) is done, commit f2b85040acec can be reverted.


Thanks,
Ming
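
The try_module_get() failure mode described in the message above can be modeled in a few lines of userspace C. This is a sketch under stated assumptions: the `demo_*` names are hypothetical, and the model keeps only two rules of the kernel's module reference counting, namely that taking a reference is refused once unloading has begun and that a non-forced unload does not begin while references are held.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_mod_state { DEMO_MOD_LIVE, DEMO_MOD_GOING };

struct demo_module {
	enum demo_mod_state state;
	int refcnt;
};

/* Models try_module_get(): refuses once unloading has started. */
static bool demo_try_get(struct demo_module *m)
{
	if (m->state != DEMO_MOD_LIVE)
		return false;
	m->refcnt++;
	return true;
}

/* Models module_put(). */
static void demo_put(struct demo_module *m)
{
	m->refcnt--;
}

/* Models the start of a non-forced unload: needs a zero refcount. */
static bool demo_begin_unload(struct demo_module *m)
{
	if (m->refcnt != 0)
		return false;
	m->state = DEMO_MOD_GOING;
	return true;
}

/* A pin attempted from a late, asynchronous release can come too late. */
static bool demo_late_pin_fails(void)
{
	struct demo_module m = { DEMO_MOD_LIVE, 0 };
	bool unloading = demo_begin_unload(&m);	/* nothing pins the module */
	bool pinned = demo_try_get(&m);		/* release handler runs after */

	return unloading && !pinned;
}

/* A pin taken while the module is still live blocks the unload. */
static bool demo_early_pin_blocks_unload(void)
{
	struct demo_module m = { DEMO_MOD_LIVE, 0 };
	bool pinned = demo_try_get(&m);
	bool unloading = demo_begin_unload(&m);

	if (pinned)
		demo_put(&m);
	return pinned && !unloading;
}
```

Both approaches listed in the message sidestep the late pin: approach 1) stops the release handler from needing driver code at all, and approach 2) makes the release happen before scsi_remove_host() returns, while the module is necessarily still loaded.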
Bart Van Assche June 30, 2022, 8:19 p.m. UTC | #5
On 6/29/22 18:01, Ming Lei wrote:
> .exit_cmd_priv is actually called from scsi_host_dev_release() instead
> of scsi_device_dev_release().

Agreed, I will fix this in the patch description.

> Both the scsi host pointer and host->shost_data are
> still valid when calling .exit_cmd_priv via scsi_mq_destroy_tags().

I will change the patch description to make it clear that both 
exit_cmd_priv implementations use resources associated with the SCSI host.

Thanks,

Bart.
Bart Van Assche June 30, 2022, 8:22 p.m. UTC | #6
On 6/30/22 01:57, Ming Lei wrote:
> BTW, Changhui reported a very similar issue when running elevator
> switching together with scsi_debug LUN hotplug.
> 
> From Changhui's report, the issue is basically the same as the one
> f2b85040acec tried to address, but the try_module_get() in
> scsi_device_dev_release() may fail, so the scsi_debug module
> can still be unloaded.
> 
> The thing is that an sdev can be released asynchronously, and target/host
> release is triggered by scsi_device_dev_release_usercontext().
> 
> So after scsi_remove_host() returns, the shost may still be live from the
> driver core/sysfs viewpoint, and its release handler can be called
> after the LLD module is unloaded. That is what triggers this kind of issue.
> 
> There seem to be at least two approaches for fixing the issue:
> 
> 1) the one suggested in this thread:
> - move any reference to shost->hostt in the host release handler into
> scsi_remove_host(); scsi_mq_destroy_tags() and
> scsi_proc_hostdir_rm(shost->hostt) should be covered at least
> 
> 2) wait until all targets are released in scsi_remove_host()
> 
> I am fine with either of the two approaches.
> 
> Bart, please let me know if you are working towards the approach in 1).
> If not, I have one patch which implements 2).
> 
> BTW, after either 1) or 2) is done, commit f2b85040acec can be reverted.

Hi Ming,

All I need is that all activity on the host tag set has stopped by the 
time scsi_forget_host() returns. I do not need (1) or (2) to fix the bug 
reported in the description of the patch at the start of this thread.

Thanks,

Bart.

Patch

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ef6c0e37acce..74bfa187fe19 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -182,6 +182,8 @@  void scsi_remove_host(struct Scsi_Host *shost)
 	mutex_unlock(&shost->scan_mutex);
 	scsi_proc_host_rm(shost);
 
+	scsi_mq_destroy_tags(shost);
+
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (scsi_host_set_state(shost, SHOST_DEL))
 		BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
@@ -295,8 +297,8 @@  int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	return error;
 
 	/*
-	 * Any host allocation in this function will be freed in
-	 * scsi_host_dev_release().
+	 * Any resources associated with the SCSI host in this function except
+	 * the tag set will be freed by scsi_host_dev_release().
 	 */
  out_del_dev:
 	device_del(&shost->shost_dev);
@@ -312,6 +314,7 @@  int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
 	pm_runtime_disable(&shost->shost_gendev);
 	pm_runtime_set_suspended(&shost->shost_gendev);
 	pm_runtime_put_noidle(&shost->shost_gendev);
+	scsi_mq_destroy_tags(shost);
  fail:
 	return error;
 }
@@ -345,9 +348,6 @@  static void scsi_host_dev_release(struct device *dev)
 		kfree(dev_name(&shost->shost_dev));
 	}
 
-	if (shost->tag_set.tags)
-		scsi_mq_destroy_tags(shost);
-
 	kfree(shost->shost_data);
 
 	ida_free(&host_index_ida, shost->host_no);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6ffc9e4258a8..1aa1a279f8f3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1990,7 +1990,10 @@  int scsi_mq_setup_tags(struct Scsi_Host *shost)
 
 void scsi_mq_destroy_tags(struct Scsi_Host *shost)
 {
+	if (!shost->tag_set.tags)
+		return;
 	blk_mq_free_tag_set(&shost->tag_set);
+	WARN_ON_ONCE(shost->tag_set.tags);
 }
 
 /**