diff mbox series

[PATCH-next,v2,2/2] scsi: fix iscsi rescan fails to create block device

Message ID 20230128094146.205858-3-zhongjinghua@huawei.com (mailing list archive)
State Rejected
Series scsi, driver core: fix iscsi rescan fails to create block device

Commit Message

zhongjinghua Jan. 28, 2023, 9:41 a.m. UTC
When the three iSCSI operations delete, logout, and rescan run
concurrently, device_add_disk() can fail to add the disk.
The interleaving is as follows:

T0: scan host // echo 1 > /sys/devices/platform/host1/scsi_host/host1/scan
T1: delete target // echo 1 > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
T2: logout // iscsiadm -m node --logout
T3: T2 scsi_queue_work
T4: T0 bus_probe_device

T0                          T1                     T2                     T3
scsi_scan_target
 mutex_lock(&shost->scan_mutex);
  __scsi_scan_target
   scsi_report_lun_scan
    scsi_add_lun
     scsi_sysfs_add_sdev
      device_add
       kobject_add
       //create session1/target1:0:0/1:0:0:1/
       ...
       bus_probe_device
       // Create block asynchronously
 mutex_unlock(&shost->scan_mutex);
                       sdev_store_delete
                        scsi_remove_device
                         device_remove_file
                          mutex_lock(scan_mutex)
                           __scsi_remove_device
                            res = scsi_device_set_state(sdev, SDEV_CANCEL)
                                             iscsi_if_recv_msg
                                              scsi_queue_work
                                                                 __iscsi_unbind_session
                                                                 session->target_id = ISCSI_MAX_TARGET
                                                                   __scsi_remove_target
                                                                   sdev->sdev_state == SDEV_CANCEL
                                                                   continue;
                                                                   // end, No delete kobject 1:0:0:1
                                             iscsi_if_recv_msg
                                              transport->destroy_session(session)
                                               __iscsi_destroy_session
                                               iscsi_session_teardown
                                                iscsi_remove_session
                                                 __iscsi_unbind_session
                                                  iscsi_session_event
                                                 device_del
                                                 // delete session
T4:
// create the block, its parent is 1:0:0:1
// If kobject 1:0:0:1 does not exist, it won't go down
__device_attach_async_helper
 device_lock
 ...
 __device_attach_driver
  driver_probe_device
   really_probe
    sd_probe
     device_add_disk
      register_disk
       device_add
      // error

The block is created after the session is deleted.
When T2 deletes the session, it marks the block's parent 1:0:0:1 as unusable:
T2
device_del
 kobject_del
  sysfs_remove_dir
   __kernfs_remove
   // Mark the children under the session as unusable
    while ((pos = kernfs_next_descendant_post(pos, kn)))
		if (kernfs_active(pos))
			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);

Then, create the block:
T4
device_add
 kobject_add
  kobject_add_varg
   kobject_add_internal
    create_dir
     sysfs_create_dir_ns
      kernfs_create_dir_ns
       kernfs_add_one
        if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active(parent))
		goto out_unlock;
		// return error

This error will cause a warning:
kobject_add_internal failed for block (error: -2 parent: 1:0:0:1).
Older kernels (such as 5.10) have no corresponding error handling, and continuing
past the failure triggers a kernel panic, so Cc stable.
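
The parent check in kernfs_add_one() quoted above can be modelled in user space. The following is only a minimal sketch under stated assumptions: struct model_node, MODEL_DEACTIVATED_BIAS and the model_* functions are hypothetical stand-ins, not kernel code. It shows why an add under a deactivated parent comes back as -ENOENT (-2).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a kernfs node; not the kernel struct. */
struct model_node {
	struct model_node *parent;
	int active;		/* >= 0 while usable, negative once deactivated */
	bool activated;		/* plays the role of the KERNFS_ACTIVATED flag */
};

/* Large negative bias, playing the role of KN_DEACTIVATED_BIAS (assumed value). */
#define MODEL_DEACTIVATED_BIAS (INT_MIN + 1)

static bool model_active(const struct model_node *n)
{
	return n->active >= 0;
}

/* Removal path: bias the counter so that later adds under this node fail. */
static void model_deactivate(struct model_node *n)
{
	if (model_active(n))
		n->active += MODEL_DEACTIVATED_BIAS;
}

/* Add path: refuse to attach a child to a parent that was deactivated. */
static int model_add_one(struct model_node *child, struct model_node *parent)
{
	assert(child && parent);

	if (parent->activated && !model_active(parent))
		return -ENOENT;	/* -2, the error seen in the warning */

	child->parent = parent;
	child->active = 0;
	child->activated = true;
	return 0;
}
```

Calling model_add_one() before model_deactivate() on the parent succeeds. Calling it afterwards fails with -ENOENT and leaves the child unattached, which corresponds to T4 running after T2 in the trace.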

Therefore, the block must not be created after the session is deleted.
More practically, we should ensure that the target under the session is deleted
before the session itself. There are then two possibilities:

1) If the process (T1) deleting the target runs first, it grabs device_lock(),
and the process (T4) creating the block waits for the deletion to complete.
By then the block's parent 1:0:0:1 has been deleted, so T4 does not proceed.

2) If the process (T4) creating the block runs first, it grabs device_lock(),
and the process (T1) deleting the target waits for the block creation to complete.
The process (T2) deleting the session then has to wait for that deletion to complete.

Fix it by removing the SDEV_CANCEL state check in __scsi_remove_target()
to enforce the deletion order. __scsi_remove_target() then waits on the
scan_mutex held by T1, and device_del() in __scsi_remove_device() waits on the
device_lock(dev) held by T4.
However, removing the check reintroduces the problem fixed by
commit 81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()").
So we use get_device_unless_zero() instead of get_device() to avoid that problem.
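
For readers unfamiliar with the pattern, here is a minimal user-space sketch of "increment unless zero" semantics, which is what the name get_device_unless_zero() suggests. It is an assumption-laden model using C11 atomics: struct model_ref and the model_* functions are hypothetical and are not the driver core implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space counter; the kernel uses kref/refcount_t instead. */
struct model_ref {
	atomic_int count;
};

/* Unconditional get: only valid if the caller already knows count > 0. */
static void model_get(struct model_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Conditional get: succeeds only while at least one reference remains. */
static bool model_get_unless_zero(struct model_ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool model_put(struct model_ref *r)
{
	int old = atomic_fetch_sub(&r->count, 1);

	assert(old > 0);	/* putting an already dead object is a bug */
	return old == 1;
}
```

The conditional variant only reports that the count had already reached zero. It says nothing about whether the memory behind the counter is still valid at that point, so the caller needs some other guarantee for that.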

Fixes: 81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
---
 drivers/scsi/scsi_sysfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Greg KH Jan. 28, 2023, 10:45 a.m. UTC | #1
On Sat, Jan 28, 2023 at 05:41:46PM +0800, Zhong Jinghua wrote:
> When the three iscsi operations delete, logout, and rescan are concurrent
> at the same time, there is a probability of failure to add disk through
> device_add_disk(). The concurrent process is as follows:
> 
> T0: scan host // echo 1 > /sys/devices/platform/host1/scsi_host/host1/scan
> T1: delete target // echo 1 > /sys/devices/platform/host1/session1/target1:0:0/1:0:0:1/delete
> T2: logout // iscsiadm -m node --login
> T3: T2 scsi_queue_work
> T4: T0 bus_probe_device
> 
> T0                          T1                     T2                     T3
> scsi_scan_target
>  mutex_lock(&shost->scan_mutex);
>   __scsi_scan_target
>    scsi_report_lun_scan
>     scsi_add_lun
>      scsi_sysfs_add_sdev
>       device_add
>        kobject_add
>        //create session1/target1:0:0/1:0:0:1/
>        ...
>        bus_probe_device
>        // Create block asynchronously
>  mutex_unlock(&shost->scan_mutex);
>                        sdev_store_delete
>                         scsi_remove_device
>                          device_remove_file
>                           mutex_lock(scan_mutex)
>                            __scsi_remove_device
>                             res = scsi_device_set_state(sdev, SDEV_CANCEL)
>                                              iscsi_if_recv_msg
>                                               scsi_queue_work
>                                                                  __iscsi_unbind_session
>                                                                  session->target_id = ISCSI_MAX_TARGET
>                                                                    __scsi_remove_target
>                                                                    sdev->sdev_state == SDEV_CANCEL
>                                                                    continue;
>                                                                    // end, No delete kobject 1:0:0:1
>                                              iscsi_if_recv_msg
>                                               transport->destroy_session(session)
>                                                __iscsi_destroy_session
>                                                iscsi_session_teardown
>                                                 iscsi_remove_session
>                                                  __iscsi_unbind_session
>                                                   iscsi_session_event
>                                                  device_del
>                                                  // delete session
> T4:
> // create the block, its parent is 1:0:0:1
> // If kobject 1:0:0:1 does not exist, it won't go down
> __device_attach_async_helper
>  device_lock
>  ...
>  __device_attach_driver
>   driver_probe_device
>    really_probe
>     sd_probe
>      device_add_disk
>       register_disk
>        device_add
>       // error
> 
> The block is created after the seesion is deleted.
> When T2 deletes the session, it will mark block'parent 1:0:01 as unusable:
> T2
> device_del
>  kobject_del
>   sysfs_remove_dir
>    __kernfs_remove
>    // Mark the children under the session as unusable
>     while ((pos = kernfs_next_descendant_post(pos, kn)))
> 		if (kernfs_active(pos))
> 			atomic_add(KN_DEACTIVATED_BIAS, &pos->active);
> 
> Then, create the block:
> T4
> device_add
>  kobject_add
>   kobject_add_varg
>    kobject_add_internal
>     create_dir
>      sysfs_create_dir_ns
>       kernfs_create_dir_ns
>        kernfs_add_one
>         if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active(parent))
> 		goto out_unlock;
> 		// return error
> 
> This error will cause a warning:
> kobject_add_internal failed for block (error: -2 parent: 1:0:0:1).
> In the lower version (such as 5.10), there is no corresponding error handling, continuing
> to go down will trigger a kernel panic, so cc stable.
> 
> Therefore, creating the block should not be done after deleting the session.
> More practically, we should ensure that the target under the session is deleted first,
> and then the session is deleted. In this way, there are two possibilities:
> 
> 1) if the process(T1) of deleting the target execute first, it will grab the device_lock(),
> and the process(T4) of creating the block will wait for the deletion to complete.
> Then, block's parent 1:0:0:1 has been deleted, it won't go down.
> 
> 2) if the process(T4) of creating block execute first, it will grab the device_lock(),
> and the process(T1) of deleting the target will wait for the creation block to complete.
> Then, the process(T2) of deleting the session should need wait for the deletion to complete.
> 
> Fix it by removing the judgment of state equal to SDEV_CANCEL in
> __scsi_remove_target() to ensure the order of deletion. Then, it will wait for
> T1's mutex_lock(scan_mutex) and device_del() in __scsi_remove_device() will wait for
> T4's device_lock(dev).
> But we found that such a fix would cause the previous problem:
> commit 81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()").
> So we use get_device_unless_zero() instead of get_devcie() to fix the previous problem.
> 
> Fixes: 81b6c9998979 ("scsi: core: check for device state in __scsi_remove_target()")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index cac7c902cf70..a22109cdb8ef 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1535,9 +1535,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
>  		if (sdev->channel != starget->channel ||
>  		    sdev->id != starget->id)
>  			continue;
> -		if (sdev->sdev_state == SDEV_DEL ||
> -		    sdev->sdev_state == SDEV_CANCEL ||
> -		    !get_device(&sdev->sdev_gendev))
> +		if (!get_device_unless_zero(&sdev->sdev_gendev))

If sdev_gendev is 0 here, the object is gone and you are working with
memory that is already freed so something is _VERY_ wrong.

This isn't ok, sorry.

greg k-h
Yu Kuai Jan. 29, 2023, 1:13 a.m. UTC | #2
Hi, Greg

On 2023/01/28 18:45, Greg KH wrote:
>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>> index cac7c902cf70..a22109cdb8ef 100644
>> --- a/drivers/scsi/scsi_sysfs.c
>> +++ b/drivers/scsi/scsi_sysfs.c
>> @@ -1535,9 +1535,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
>>   		if (sdev->channel != starget->channel ||
>>   		    sdev->id != starget->id)
>>   			continue;
>> -		if (sdev->sdev_state == SDEV_DEL ||
>> -		    sdev->sdev_state == SDEV_CANCEL ||
>> -		    !get_device(&sdev->sdev_gendev))
>> +		if (!get_device_unless_zero(&sdev->sdev_gendev))
> 
> If sdev_gendev is 0 here, the object is gone and you are working with
> memory that is already freed so something is _VERY_ wrong.

In fact, this patch will work:

In __scsi_remove_target(), 'host_lock' is held while iterating the
siblings, and scsi_device_dev_release() has to take the same lock to
remove the object from the siblings list. Hence sdev will not be
freed until the lock is released.

Thanks,
Kuai
> 
> This isn't ok, sorry.
> 
> greg k-h
> .
>
Greg KH Jan. 29, 2023, 6:46 a.m. UTC | #3
On Sun, Jan 29, 2023 at 09:13:55AM +0800, Yu Kuai wrote:
> Hi, Greg
> 
> On 2023/01/28 18:45, Greg KH wrote:
> > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > > index cac7c902cf70..a22109cdb8ef 100644
> > > --- a/drivers/scsi/scsi_sysfs.c
> > > +++ b/drivers/scsi/scsi_sysfs.c
> > > @@ -1535,9 +1535,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
> > >   		if (sdev->channel != starget->channel ||
> > >   		    sdev->id != starget->id)
> > >   			continue;
> > > -		if (sdev->sdev_state == SDEV_DEL ||
> > > -		    sdev->sdev_state == SDEV_CANCEL ||
> > > -		    !get_device(&sdev->sdev_gendev))
> > > +		if (!get_device_unless_zero(&sdev->sdev_gendev))
> > 
> > If sdev_gendev is 0 here, the object is gone and you are working with
> > memory that is already freed so something is _VERY_ wrong.
> 
> In fact, this patch will work:
> 
> In __scsi_remove_target(), 'host_lock' is held to protect iterating
> siblings, and object will wait for this lock in
> scsi_device_dev_release() to remove siblings. Hence sdev will not be
> freed untill the lock is released.

Then you got lucky, as that is not how a reference counted object should
work (i.e. the reference count dropped to 0 and the object is still kept alive).

Please fix up the scsi logic here, don't abuse the reference count code.

thanks,

greg k-h
Yu Kuai Jan. 29, 2023, 6:55 a.m. UTC | #4
Hi,

On 2023/01/29 14:46, Greg KH wrote:
> On Sun, Jan 29, 2023 at 09:13:55AM +0800, Yu Kuai wrote:
>> Hi, Greg
>>
>> On 2023/01/28 18:45, Greg KH wrote:
>>>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>>>> index cac7c902cf70..a22109cdb8ef 100644
>>>> --- a/drivers/scsi/scsi_sysfs.c
>>>> +++ b/drivers/scsi/scsi_sysfs.c
>>>> @@ -1535,9 +1535,7 @@ static void __scsi_remove_target(struct scsi_target *starget)
>>>>    		if (sdev->channel != starget->channel ||
>>>>    		    sdev->id != starget->id)
>>>>    			continue;
>>>> -		if (sdev->sdev_state == SDEV_DEL ||
>>>> -		    sdev->sdev_state == SDEV_CANCEL ||
>>>> -		    !get_device(&sdev->sdev_gendev))
>>>> +		if (!get_device_unless_zero(&sdev->sdev_gendev))
>>>
>>> If sdev_gendev is 0 here, the object is gone and you are working with
>>> memory that is already freed so something is _VERY_ wrong.
>>
>> In fact, this patch will work:
>>
>> In __scsi_remove_target(), 'host_lock' is held to protect iterating
>> siblings, and object will wait for this lock in
>> scsi_device_dev_release() to remove siblings. Hence sdev will not be
>> freed untill the lock is released.
> 
> Then you got lucky, as that is not how a reference counted object should
> be working (i.e. the reference dropped to 0 and it still be kept alive.)
> 
> Please fix up the scsi logic here, don't abuse the reference count code.
> 

Thanks for the reply, I agree that we should fix this in scsi layer.

Kuai
James Bottomley Jan. 29, 2023, 5:30 p.m. UTC | #5
On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
> This error will cause a warning:
> kobject_add_internal failed for block (error: -2 parent: 1:0:0:1).
> In the lower version (such as 5.10), there is no corresponding error
> handling, continuing
> to go down will trigger a kernel panic, so cc stable.

Is this the important point: what you're saying is that this only
panics on kernels before 5.10 or so, because after that it's correctly
failed by the block device error handling, so there's nothing to fix in
later kernels?

In that case, isn't the correct fix to look at backporting the block
device error handling:

commit 83cbce9574462c6b4eed6797bdaf18fae6859ab3
Author: Luis Chamberlain <mcgrof@kernel.org>
Date:   Wed Aug 18 16:45:40 2021 +0200

    block: add error handling for device_add_disk / add_disk

?

James
Yu Kuai Jan. 30, 2023, 3:07 a.m. UTC | #6
Hi,

On 2023/01/30 1:30, James Bottomley wrote:
> On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
>> This error will cause a warning:
>> kobject_add_internal failed for block (error: -2 parent: 1:0:0:1).
>> In the lower version (such as 5.10), there is no corresponding error
>> handling, continuing
>> to go down will trigger a kernel panic, so cc stable.
> 
> Is this is important point and what you're saying is that this only
> panics on kernels before 5.10 or so because after that it's correctly
> failed by block device error handling so there's nothing to fix in
> later kernels?
> 
> In that case, isn't the correct fix to look at backporting the block
> device error handling:

That is the last commit of the error handling support; it depends on
many earlier patches, and the block layer was refactored heavily along
the way. Backporting error handling to older versions is not a good idea.

Although error handling prevents the kernel crash in this case, I still
think it makes sense to ensure kobjects are deleted in order: a parent
should not be deleted before its child.

Thanks,
Kuai
> 
> commit 83cbce9574462c6b4eed6797bdaf18fae6859ab3
> Author: Luis Chamberlain <mcgrof@kernel.org>
> Date:   Wed Aug 18 16:45:40 2021 +0200
> 
>      block: add error handling for device_add_disk / add_disk
> 
> ?
> 
> James
> 
> .
>
James Bottomley Jan. 30, 2023, 3:29 a.m. UTC | #7
On Mon, 2023-01-30 at 11:07 +0800, Yu Kuai wrote:
> Hi,
> 
> On 2023/01/30 1:30, James Bottomley wrote:
> > On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
> > > This error will cause a warning:
> > > kobject_add_internal failed for block (error: -2 parent:
> > > 1:0:0:1). In the lower version (such as 5.10), there is no
> > > corresponding error handling, continuing to go down will trigger
> > > a kernel panic, so cc stable.
> > 
> > Is this is important point and what you're saying is that this only
> > panics on kernels before 5.10 or so because after that it's
> > correctly failed by block device error handling so there's nothing
> > to fix in later kernels?
> > 
> > In that case, isn't the correct fix to look at backporting the
> > block device error handling:
> 
> This is the last commit that support error handling, and there are
> many relied patches, and there are lots of refactor in block layer.
> It's not a good idea to backport error handling to lower version.
> 
> Althrough error handling can prevent kernel crash in this case, I
> still think it make sense to make sure kobject is deleted in order,
> parent should not be deleted before child.

Well, look, you've created a very artificial situation where a create
closely followed by a delete of the underlying sdev races with the
create of the block gendisk devices of sd that bind asynchronously to
the created sdev.  The asynchronous nature of the bind gives the
elongated race window so the only real fix is some sort of check that
the sdev is still viable by the time the bind occurs ... probably in
sd_probe(), say a scsi_device_get of sdp at the top which would ensure
viability of the sdev for the entire bind or fail the probe if the sdev
can't be got.
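
As a rough illustration of that suggestion, the following user-space sketch models a probe that takes a reference first and fails the bind when the device is already being removed. Everything here is hypothetical (enum model_state, struct model_sdev and the model_* functions); it is single-threaded and ignores the locking a real sd_probe() change would need.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced set of device states for illustration only. */
enum model_state { MODEL_RUNNING, MODEL_CANCEL, MODEL_DEL };

struct model_sdev {
	enum model_state state;
	int refs;		/* references currently held on the device */
	bool has_disk;		/* set once the block device was created */
};

/* Take a reference only while the device is still viable. */
static int model_device_get(struct model_sdev *sdev)
{
	if (sdev->state == MODEL_CANCEL || sdev->state == MODEL_DEL)
		return -ENXIO;
	sdev->refs++;
	return 0;
}

static void model_device_put(struct model_sdev *sdev)
{
	assert(sdev->refs > 0);
	sdev->refs--;
}

/* Probe that bails out early if the device is already being removed. */
static int model_probe(struct model_sdev *sdev)
{
	int err = model_device_get(sdev);

	if (err)
		return err;	/* fail the bind, create nothing */

	sdev->has_disk = true;	/* stands in for device_add_disk() */
	model_device_put(sdev);
	return 0;
}
```

With a device in MODEL_RUNNING the probe creates the disk. With a device already in MODEL_CANCEL it returns -ENXIO before anything is added under the parent.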

James
Yu Kuai Jan. 30, 2023, 3:46 a.m. UTC | #8
Hi,

On 2023/01/30 11:29, James Bottomley wrote:
> On Mon, 2023-01-30 at 11:07 +0800, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/01/30 1:30, James Bottomley wrote:
>>> On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
>>>> This error will cause a warning:
>>>> kobject_add_internal failed for block (error: -2 parent:
>>>> 1:0:0:1). In the lower version (such as 5.10), there is no
>>>> corresponding error handling, continuing to go down will trigger
>>>> a kernel panic, so cc stable.
>>>
>>> Is this is important point and what you're saying is that this only
>>> panics on kernels before 5.10 or so because after that it's
>>> correctly failed by block device error handling so there's nothing
>>> to fix in later kernels?
>>>
>>> In that case, isn't the correct fix to look at backporting the
>>> block device error handling:
>>
>> This is the last commit that support error handling, and there are
>> many relied patches, and there are lots of refactor in block layer.
>> It's not a good idea to backport error handling to lower version.
>>
>> Althrough error handling can prevent kernel crash in this case, I
>> still think it make sense to make sure kobject is deleted in order,
>> parent should not be deleted before child.
> 
> Well, look, you've created a very artificial situation where a create
> closely followed by a delete of the underlying sdev races with the
> create of the block gendisk devices of sd that bind asynchronously to
> the created sdev.  The asynchronous nature of the bind gives the
> elongated race window so the only real fix is some sort of check that
> the sdev is still viable by the time the bind occurs ... probably in
> sd_probe(), say a scsi_device_get of sdp at the top which would ensure
> viability of the sdev for the entire bind or fail the probe if the sdev
> can't be got.

Sorry, I don't follow here. 
James Bottomley Jan. 30, 2023, 1:17 p.m. UTC | #9
On Mon, 2023-01-30 at 11:46 +0800, Yu Kuai wrote:
> Hi,
> 
> On 2023/01/30 11:29, James Bottomley wrote:
> > On Mon, 2023-01-30 at 11:07 +0800, Yu Kuai wrote:
> > > Hi,
> > > 
> > > On 2023/01/30 1:30, James Bottomley wrote:
> > > > On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
> > > > > This error will cause a warning:
> > > > > kobject_add_internal failed for block (error: -2 parent:
> > > > > 1:0:0:1). In the lower version (such as 5.10), there is no
> > > > > corresponding error handling, continuing to go down will
> > > > > trigger a kernel panic, so cc stable.
> > > > 
> > > > Is this is important point and what you're saying is that this
> > > > only panics on kernels before 5.10 or so because after that
> > > > it's correctly failed by block device error handling so there's
> > > > nothing to fix in later kernels?
> > > > 
> > > > In that case, isn't the correct fix to look at backporting the
> > > > block device error handling:
> > > 
> > > This is the last commit that support error handling, and there
> > > are many relied patches, and there are lots of refactor in block
> > > layer. It's not a good idea to backport error handling to lower
> > > version. 
> > > Althrough error handling can prevent kernel crash in this case, I
> > > still think it make sense to make sure kobject is deleted in
> > > order, parent should not be deleted before child.
> > 
> > Well, look, you've created a very artificial situation where a
> > create closely followed by a delete of the underlying sdev races
> > with the create of the block gendisk devices of sd that bind
> > asynchronously to the created sdev.  The asynchronous nature of the
> > bind gives the elongated race window so the only real fix is some
> > sort of check that the sdev is still viable by the time the bind
> > occurs ... probably in sd_probe(), say a scsi_device_get of sdp at
> > the top which would ensure viability of the sdev for the entire
> > bind or fail the probe if the sdev can't be got.
> 
> Sorry, I don't follow here. 
Yu Kuai Jan. 31, 2023, 1:43 a.m. UTC | #10
Hi,

On 2023/01/30 21:17, James Bottomley wrote:
> On Mon, 2023-01-30 at 11:46 +0800, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/01/30 11:29, James Bottomley wrote:
>>> On Mon, 2023-01-30 at 11:07 +0800, Yu Kuai wrote:
>>>> Hi,
>>>>
>>>> On 2023/01/30 1:30, James Bottomley wrote:
>>>>> On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
>>>>>> This error will cause a warning:
>>>>>> kobject_add_internal failed for block (error: -2 parent:
>>>>>> 1:0:0:1). In the lower version (such as 5.10), there is no
>>>>>> corresponding error handling, continuing to go down will
>>>>>> trigger a kernel panic, so cc stable.
>>>>>
>>>>> Is this is important point and what you're saying is that this
>>>>> only panics on kernels before 5.10 or so because after that
>>>>> it's correctly failed by block device error handling so there's
>>>>> nothing to fix in later kernels?
>>>>>
>>>>> In that case, isn't the correct fix to look at backporting the
>>>>> block device error handling:
>>>>
>>>> This is the last commit that support error handling, and there
>>>> are many relied patches, and there are lots of refactor in block
>>>> layer. It's not a good idea to backport error handling to lower
>>>> version.
>>>> Althrough error handling can prevent kernel crash in this case, I
>>>> still think it make sense to make sure kobject is deleted in
>>>> order, parent should not be deleted before child.
>>>
>>> Well, look, you've created a very artificial situation where a
>>> create closely followed by a delete of the underlying sdev races
>>> with the create of the block gendisk devices of sd that bind
>>> asynchronously to the created sdev.  The asynchronous nature of the
>>> bind gives the elongated race window so the only real fix is some
>>> sort of check that the sdev is still viable by the time the bind
>>> occurs ... probably in sd_probe(), say a scsi_device_get of sdp at
>>> the top which would ensure viability of the sdev for the entire
>>> bind or fail the probe if the sdev can't be got.
>>
>> Sorry, I don't follow here. 
James Bottomley Jan. 31, 2023, 3:25 a.m. UTC | #11
On Tue, 2023-01-31 at 09:43 +0800, Yu Kuai wrote:
> Hi,
> 
> On 2023/01/30 21:17, James Bottomley wrote:
> > On Mon, 2023-01-30 at 11:46 +0800, Yu Kuai wrote:
> > > Hi,
> > > 
> > > On 2023/01/30 11:29, James Bottomley wrote:
> > > > On Mon, 2023-01-30 at 11:07 +0800, Yu Kuai wrote:
> > > > > Hi,
> > > > > 
> > > > > On 2023/01/30 1:30, James Bottomley wrote:
> > > > > > On Sat, 2023-01-28 at 17:41 +0800, Zhong Jinghua wrote:
> > > > > > > This error will cause a warning:
> > > > > > > kobject_add_internal failed for block (error: -2 parent:
> > > > > > > 1:0:0:1). In the lower version (such as 5.10), there is
> > > > > > > no corresponding error handling, continuing to go down
> > > > > > > will trigger a kernel panic, so cc stable.
> > > > > > 
> > > > > > Is this is important point and what you're saying is that
> > > > > > this only panics on kernels before 5.10 or so because after
> > > > > > that it's correctly failed by block device error handling
> > > > > > so there's nothing to fix in later kernels?
> > > > > > 
> > > > > > In that case, isn't the correct fix to look at backporting
> > > > > > the block device error handling:
> > > > > 
> > > > > This is the last commit that support error handling, and
> > > > > there are many relied patches, and there are lots of refactor
> > > > > in block layer. It's not a good idea to backport error
> > > > > handling to lower version. Althrough error handling can
> > > > > prevent kernel crash in this case, I still think it make
> > > > > sense to make sure kobject is deleted in order, parent should
> > > > > not be deleted before child.
> > > > 
> > > > Well, look, you've created a very artificial situation where a
> > > > create closely followed by a delete of the underlying sdev
> > > > races with the create of the block gendisk devices of sd that
> > > > bind asynchronously to the created sdev.  The asynchronous
> > > > nature of the bind gives the elongated race window so the only
> > > > real fix is some sort of check that the sdev is still viable by
> > > > the time the bind occurs ... probably in sd_probe(), say a
> > > > scsi_device_get of sdp at the top which would ensure viability
> > > > of the sdev for the entire bind or fail the probe if the sdev
> > > > can't be got.
> > > 
> > > Sorry, I don't follow here. 

Patch

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index cac7c902cf70..a22109cdb8ef 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1535,9 +1535,7 @@  static void __scsi_remove_target(struct scsi_target *starget)
 		if (sdev->channel != starget->channel ||
 		    sdev->id != starget->id)
 			continue;
-		if (sdev->sdev_state == SDEV_DEL ||
-		    sdev->sdev_state == SDEV_CANCEL ||
-		    !get_device(&sdev->sdev_gendev))
+		if (!get_device_unless_zero(&sdev->sdev_gendev))
 			continue;
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		scsi_remove_device(sdev);