diff mbox series

[V2] scsi: core: put LLD module refcnt after SCSI device is released

Message ID 20210930074026.1011114-1-ming.lei@redhat.com (mailing list archive)
State Superseded

Commit Message

Ming Lei Sept. 30, 2021, 7:40 a.m. UTC
SCSI host release is triggered when SCSI device is freed, and we have to
make sure that LLD module won't be unloaded before SCSI host instance is
released because shost->hostt is required in host release handler.

So put LLD module refcnt after SCSI device is released.

The real release handler may run from wq context when in_interrupt() is
true, so add an atomic counter to serialize putting the module between the
current context and the wq context. This is acceptable because
scsi_device_put() is not called in the fast IO path.

Reported-by: Changhui Zhong <czhong@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi.c        |  8 +++++++-
 drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
 include/scsi/scsi_device.h |  2 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

Comments

Ming Lei Sept. 30, 2021, 7:50 a.m. UTC | #1
On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> SCSI host release is triggered when SCSI device is freed, and we have to
> make sure that LLD module won't be unloaded before SCSI host instance is
> released because shost->hostt is required in host release handler.
> 
> So put LLD module refcnt after SCSI device is released.
> 
> The real release handler can be run from wq context in case of
> in_interrupt(), so add one atomic counter for serializing putting
> module via current and wq context. This way is fine since we don't
> call scsi_device_put() in fast IO path.
> 
> Reported-by: Changhui Zhong <czhong@redhat.com>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/scsi.c        |  8 +++++++-
>  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
>  include/scsi/scsi_device.h |  2 ++
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index b241f9e3885c..b6612161587f 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
>   */
>  void scsi_device_put(struct scsi_device *sdev)
>  {
> -	module_put(sdev->host->hostt->module);
> +	struct module *mod = sdev->host->hostt->module;
> +
> +	atomic_inc(&sdev->put_dev_cnt);
> +
>  	put_device(&sdev->sdev_gendev);
> +
> +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> +		module_put(mod);

Oops, sdev may already have been freed at this point, so this approach isn't
good either. :-(

Will think further about the solution.

thanks,
Ming
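
For readers following the thread, the handshake the patch attempts can be
sketched in userspace with C11 atomics. This is only a model of the
semantics of the kernel's atomic_dec_if_positive(): decrement when the
result stays non-negative, and return the old value minus one either way.
The name dec_if_positive() is hypothetical and this is not the kernel
implementation.

```c
#include <stdatomic.h>

/*
 * Userspace model of atomic_dec_if_positive() semantics (hypothetical helper,
 * not kernel code): decrement *v only if the result stays >= 0, and always
 * return the old value minus one.
 */
static int dec_if_positive(atomic_int *v)
{
	int old = atomic_load(v);
	int dec;

	do {
		dec = old - 1;
		if (dec < 0)
			break;	/* would go negative: leave *v untouched */
		/* on failure, 'old' is reloaded with the current value */
	} while (!atomic_compare_exchange_weak(v, &old, dec));

	return dec;
}
```

With the counter at 1, whichever of the two paths calls this first gets a
result >= 0 and would call module_put(); the other gets -1 and skips it. In
the patch the counter is a member of struct scsi_device, so the check in
scsi_device_put() reads memory that put_device() may already have freed,
which is the problem noted above.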
Greg KH Sept. 30, 2021, 8:07 a.m. UTC | #2
On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> SCSI host release is triggered when SCSI device is freed, and we have to
> make sure that LLD module won't be unloaded before SCSI host instance is
> released because shost->hostt is required in host release handler.
> 
> So put LLD module refcnt after SCSI device is released.
> 
> The real release handler can be run from wq context in case of
> in_interrupt(), so add one atomic counter for serializing putting
> module via current and wq context. This way is fine since we don't
> call scsi_device_put() in fast IO path.
> 
> Reported-by: Changhui Zhong <czhong@redhat.com>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/scsi.c        |  8 +++++++-
>  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
>  include/scsi/scsi_device.h |  2 ++
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index b241f9e3885c..b6612161587f 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
>   */
>  void scsi_device_put(struct scsi_device *sdev)
>  {
> -	module_put(sdev->host->hostt->module);
> +	struct module *mod = sdev->host->hostt->module;
> +
> +	atomic_inc(&sdev->put_dev_cnt);

Ick, no!  Why are you making a new lock and reference count for no
reason?

> +
>  	put_device(&sdev->sdev_gendev);
> +
> +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> +		module_put(mod);

How do you know if your module pointer is still valid here?

Why do you care?

What problem are you trying to solve and why is it unique to scsi
devices?

thanks,

greg k-h
Ming Lei Sept. 30, 2021, 8:20 a.m. UTC | #3
On Thu, Sep 30, 2021 at 10:07:44AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> > SCSI host release is triggered when SCSI device is freed, and we have to
> > make sure that LLD module won't be unloaded before SCSI host instance is
> > released because shost->hostt is required in host release handler.
> > 
> > So put LLD module refcnt after SCSI device is released.
> > 
> > The real release handler can be run from wq context in case of
> > in_interrupt(), so add one atomic counter for serializing putting
> > module via current and wq context. This way is fine since we don't
> > call scsi_device_put() in fast IO path.
> > 
> > Reported-by: Changhui Zhong <czhong@redhat.com>
> > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/scsi/scsi.c        |  8 +++++++-
> >  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
> >  include/scsi/scsi_device.h |  2 ++
> >  3 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > index b241f9e3885c..b6612161587f 100644
> > --- a/drivers/scsi/scsi.c
> > +++ b/drivers/scsi/scsi.c
> > @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
> >   */
> >  void scsi_device_put(struct scsi_device *sdev)
> >  {
> > -	module_put(sdev->host->hostt->module);
> > +	struct module *mod = sdev->host->hostt->module;
> > +
> > +	atomic_inc(&sdev->put_dev_cnt);
> 
> Ick, no!  Why are you making a new lock and reference count for no
> reason?

The reason is to make sure that the LLD module is put exactly once, from
either scsi_device_put() or scsi_device_dev_release_usercontext().

> 
> > +
> >  	put_device(&sdev->sdev_gendev);
> > +
> > +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> > +		module_put(mod);
> 
> How do you know if your module pointer is still valid here?

The module refcnt is grabbed in scsi_device_get(), so the pointer is still valid.

> 
> Why do you care?
> 
> What problem are you trying to solve and why is it unique to scsi
> devices?

See it from the commit log:

	SCSI host release is triggered when SCSI device is freed, and we have to
	make sure that LLD module won't be unloaded before SCSI host instance is
	released because shost->hostt is required in host release handler.
	
	So put LLD module refcnt after SCSI device is released.

and the upstream report on the issue:

https://lore.kernel.org/linux-block/CAHj4cs8XNtkzbbiLnFmVu82wYeQpLkVp6_wCtrnbhODay+OP9w@mail.gmail.com/


Thanks,
Ming
Greg KH Sept. 30, 2021, 8:29 a.m. UTC | #4
On Thu, Sep 30, 2021 at 04:20:11PM +0800, Ming Lei wrote:
> On Thu, Sep 30, 2021 at 10:07:44AM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> > > SCSI host release is triggered when SCSI device is freed, and we have to
> > > make sure that LLD module won't be unloaded before SCSI host instance is
> > > released because shost->hostt is required in host release handler.
> > > 
> > > So put LLD module refcnt after SCSI device is released.
> > > 
> > > The real release handler can be run from wq context in case of
> > > in_interrupt(), so add one atomic counter for serializing putting
> > > module via current and wq context. This way is fine since we don't
> > > call scsi_device_put() in fast IO path.
> > > 
> > > Reported-by: Changhui Zhong <czhong@redhat.com>
> > > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > ---
> > >  drivers/scsi/scsi.c        |  8 +++++++-
> > >  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
> > >  include/scsi/scsi_device.h |  2 ++
> > >  3 files changed, 19 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > > index b241f9e3885c..b6612161587f 100644
> > > --- a/drivers/scsi/scsi.c
> > > +++ b/drivers/scsi/scsi.c
> > > @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
> > >   */
> > >  void scsi_device_put(struct scsi_device *sdev)
> > >  {
> > > -	module_put(sdev->host->hostt->module);
> > > +	struct module *mod = sdev->host->hostt->module;
> > > +
> > > +	atomic_inc(&sdev->put_dev_cnt);
> > 
> > Ick, no!  Why are you making a new lock and reference count for no
> > reason?
> 
> The reason is to make sure that the LLD module is only put from either
> scsi_device_put() and scsi_device_dev_release_usercontext().
> 
> > 
> > > +
> > >  	put_device(&sdev->sdev_gendev);
> > > +
> > > +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> > > +		module_put(mod);
> > 
> > How do you know if your module pointer is still valid here?
> 
> module refcnt is grabbed in scsi_device_get(), so it is valid.

Then you don't need the extra atomic variable.

> > 
> > Why do you care?
> > 
> > What problem are you trying to solve and why is it unique to scsi
> > devices?
> 
> See it from the commit log:
> 
> 	SCSI host release is triggered when SCSI device is freed, and we have to
> 	make sure that LLD module won't be unloaded before SCSI host instance is
> 	released because shost->hostt is required in host release handler.

What is "hostt"?

> 	
> 	So put LLD module refcnt after SCSI device is released.

Why not just drop it explicitly when you drop the reference count of the
device object?  Like you tried to do here, but no need for the extra
atomic variable.

thanks,

greg k-h
Ming Lei Sept. 30, 2021, 8:44 a.m. UTC | #5
On Thu, Sep 30, 2021 at 10:29:24AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Sep 30, 2021 at 04:20:11PM +0800, Ming Lei wrote:
> > On Thu, Sep 30, 2021 at 10:07:44AM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> > > > SCSI host release is triggered when SCSI device is freed, and we have to
> > > > make sure that LLD module won't be unloaded before SCSI host instance is
> > > > released because shost->hostt is required in host release handler.
> > > > 
> > > > So put LLD module refcnt after SCSI device is released.
> > > > 
> > > > The real release handler can be run from wq context in case of
> > > > in_interrupt(), so add one atomic counter for serializing putting
> > > > module via current and wq context. This way is fine since we don't
> > > > call scsi_device_put() in fast IO path.
> > > > 
> > > > Reported-by: Changhui Zhong <czhong@redhat.com>
> > > > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >  drivers/scsi/scsi.c        |  8 +++++++-
> > > >  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
> > > >  include/scsi/scsi_device.h |  2 ++
> > > >  3 files changed, 19 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > > > index b241f9e3885c..b6612161587f 100644
> > > > --- a/drivers/scsi/scsi.c
> > > > +++ b/drivers/scsi/scsi.c
> > > > @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
> > > >   */
> > > >  void scsi_device_put(struct scsi_device *sdev)
> > > >  {
> > > > -	module_put(sdev->host->hostt->module);
> > > > +	struct module *mod = sdev->host->hostt->module;
> > > > +
> > > > +	atomic_inc(&sdev->put_dev_cnt);
> > > 
> > > Ick, no!  Why are you making a new lock and reference count for no
> > > reason?
> > 
> > The reason is to make sure that the LLD module is only put from either
> > scsi_device_put() and scsi_device_dev_release_usercontext().
> > 
> > > 
> > > > +
> > > >  	put_device(&sdev->sdev_gendev);
> > > > +
> > > > +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> > > > +		module_put(mod);
> > > 
> > > How do you know if your module pointer is still valid here?
> > 
> > module refcnt is grabbed in scsi_device_get(), so it is valid.
> 
> Then you don't need the extra atomic variable.
> 
> > > 
> > > Why do you care?
> > > 
> > > What problem are you trying to solve and why is it unique to scsi
> > > devices?
> > 
> > See it from the commit log:
> > 
> > 	SCSI host release is triggered when SCSI device is freed, and we have to
> > 	make sure that LLD module won't be unloaded before SCSI host instance is
> > 	released because shost->hostt is required in host release handler.
> 
> What is "hostt"?

hostt is a 'struct scsi_host_template', which is defined in the LLD module
and often allocated as a static global variable; that is what
try_module_get() is meant to protect.

> 
> > 	
> > 	So put LLD module refcnt after SCSI device is released.
> 
> Why not just drop it explicitly when you drop the reference count of the
> device object?  Like you tried to do here, but no need for the extra
> atomic variable.

scsi_device_dev_release_usercontext() may be scheduled via schedule_work() from
the device object's release handler to release the scsi_device, and that may
in turn trigger the scsi host's release handler, in which hostt is required.

If we simply call module_put() right after put_device(), the module refcnt
may be dropped before scsi_device_dev_release_usercontext() has run, so the
kernel panic still isn't addressed.


Thanks,
Ming
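
The ordering problem described in this reply can be illustrated with a toy,
single-threaded userspace model. Everything here (struct model and the
model_*() helpers) is hypothetical and stands in for the real kernel objects.
It assumes rmmod has already started, so the last module reference going away
unloads the module, and that the final put_device() only queues the release
work.

```c
#include <stdbool.h>

struct model {
	int module_refs;	/* stands in for the LLD module refcnt */
	bool module_loaded;	/* cleared when the last module ref is put */
	bool work_pending;	/* release work queued but not yet run */
	bool touched_unloaded;	/* release ran after the module went away */
};

static void model_module_put(struct model *m)
{
	if (--m->module_refs == 0)
		m->module_loaded = false;	/* assumed: rmmod in progress */
}

/* Last device reference dropped: release is deferred to a work item. */
static void model_put_device(struct model *m)
{
	m->work_pending = true;
}

/* Deferred release; the host release path would dereference hostt here. */
static void model_run_release_work(struct model *m)
{
	if (!m->work_pending)
		return;
	m->work_pending = false;
	if (!m->module_loaded)
		m->touched_unloaded = true;
}

/* module_put() right after put_device(), before the work has run */
static bool naive_put(void)
{
	struct model m = { .module_refs = 1, .module_loaded = true };

	model_put_device(&m);
	model_module_put(&m);
	model_run_release_work(&m);
	return m.touched_unloaded;
}

/* module_put() only after the release work has finished */
static bool put_after_release(void)
{
	struct model m = { .module_refs = 1, .module_loaded = true };

	model_put_device(&m);
	model_run_release_work(&m);
	model_module_put(&m);
	return m.touched_unloaded;
}
```

In this model naive_put() reports that the release work ran after the module
was gone, while put_after_release() does not. The model says nothing about
how the real code should decide which context performs the final
module_put(), which is the open question in this thread.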
Greg KH Sept. 30, 2021, 10:12 a.m. UTC | #6
On Thu, Sep 30, 2021 at 04:44:07PM +0800, Ming Lei wrote:
> On Thu, Sep 30, 2021 at 10:29:24AM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Sep 30, 2021 at 04:20:11PM +0800, Ming Lei wrote:
> > > On Thu, Sep 30, 2021 at 10:07:44AM +0200, Greg Kroah-Hartman wrote:
> > > > On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> > > > > SCSI host release is triggered when SCSI device is freed, and we have to
> > > > > make sure that LLD module won't be unloaded before SCSI host instance is
> > > > > released because shost->hostt is required in host release handler.
> > > > > 
> > > > > So put LLD module refcnt after SCSI device is released.
> > > > > 
> > > > > The real release handler can be run from wq context in case of
> > > > > in_interrupt(), so add one atomic counter for serializing putting
> > > > > module via current and wq context. This way is fine since we don't
> > > > > call scsi_device_put() in fast IO path.
> > > > > 
> > > > > Reported-by: Changhui Zhong <czhong@redhat.com>
> > > > > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > > ---
> > > > >  drivers/scsi/scsi.c        |  8 +++++++-
> > > > >  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
> > > > >  include/scsi/scsi_device.h |  2 ++
> > > > >  3 files changed, 19 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > > > > index b241f9e3885c..b6612161587f 100644
> > > > > --- a/drivers/scsi/scsi.c
> > > > > +++ b/drivers/scsi/scsi.c
> > > > > @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
> > > > >   */
> > > > >  void scsi_device_put(struct scsi_device *sdev)
> > > > >  {
> > > > > -	module_put(sdev->host->hostt->module);
> > > > > +	struct module *mod = sdev->host->hostt->module;
> > > > > +
> > > > > +	atomic_inc(&sdev->put_dev_cnt);
> > > > 
> > > > Ick, no!  Why are you making a new lock and reference count for no
> > > > reason?
> > > 
> > > The reason is to make sure that the LLD module is only put from either
> > > scsi_device_put() and scsi_device_dev_release_usercontext().
> > > 
> > > > 
> > > > > +
> > > > >  	put_device(&sdev->sdev_gendev);
> > > > > +
> > > > > +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> > > > > +		module_put(mod);
> > > > 
> > > > How do you know if your module pointer is still valid here?
> > > 
> > > module refcnt is grabbed in scsi_device_get(), so it is valid.
> > 
> > Then you don't need the extra atomic variable.
> > 
> > > > 
> > > > Why do you care?
> > > > 
> > > > What problem are you trying to solve and why is it unique to scsi
> > > > devices?
> > > 
> > > See it from the commit log:
> > > 
> > > 	SCSI host release is triggered when SCSI device is freed, and we have to
> > > 	make sure that LLD module won't be unloaded before SCSI host instance is
> > > 	released because shost->hostt is required in host release handler.
> > 
> > What is "hostt"?
> 
> hostt is 'struct scsi_host_template' which is defined in LLD module, and
> often allocated as static global variable, that is what try_get_module()
> tries to protect.
> 
> > 
> > > 	
> > > 	So put LLD module refcnt after SCSI device is released.
> > 
> > Why not just drop it explicitly when you drop the reference count of the
> > device object?  Like you tried to do here, but no need for the extra
> > atomic variable.
> 
> scsi_device_dev_release_usercontext() may be scheduled via schedule_work from
> the device object's release handler for releasing the scsi_device, which may
> trigger scsi host's release handler in which hostt is required.

If a release handler can be called from the device release function,
then that is when you need to drop the reference, after that function is
finished being called, right?

thanks,

greg k-h
Ming Lei Sept. 30, 2021, 11:07 a.m. UTC | #7
On Thu, Sep 30, 2021 at 12:12:36PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Sep 30, 2021 at 04:44:07PM +0800, Ming Lei wrote:
> > On Thu, Sep 30, 2021 at 10:29:24AM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Sep 30, 2021 at 04:20:11PM +0800, Ming Lei wrote:
> > > > On Thu, Sep 30, 2021 at 10:07:44AM +0200, Greg Kroah-Hartman wrote:
> > > > > On Thu, Sep 30, 2021 at 03:40:26PM +0800, Ming Lei wrote:
> > > > > > SCSI host release is triggered when SCSI device is freed, and we have to
> > > > > > make sure that LLD module won't be unloaded before SCSI host instance is
> > > > > > released because shost->hostt is required in host release handler.
> > > > > > 
> > > > > > So put LLD module refcnt after SCSI device is released.
> > > > > > 
> > > > > > The real release handler can be run from wq context in case of
> > > > > > in_interrupt(), so add one atomic counter for serializing putting
> > > > > > module via current and wq context. This way is fine since we don't
> > > > > > call scsi_device_put() in fast IO path.
> > > > > > 
> > > > > > Reported-by: Changhui Zhong <czhong@redhat.com>
> > > > > > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > > > ---
> > > > > >  drivers/scsi/scsi.c        |  8 +++++++-
> > > > > >  drivers/scsi/scsi_sysfs.c  | 10 ++++++++++
> > > > > >  include/scsi/scsi_device.h |  2 ++
> > > > > >  3 files changed, 19 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> > > > > > index b241f9e3885c..b6612161587f 100644
> > > > > > --- a/drivers/scsi/scsi.c
> > > > > > +++ b/drivers/scsi/scsi.c
> > > > > > @@ -553,8 +553,14 @@ EXPORT_SYMBOL(scsi_device_get);
> > > > > >   */
> > > > > >  void scsi_device_put(struct scsi_device *sdev)
> > > > > >  {
> > > > > > -	module_put(sdev->host->hostt->module);
> > > > > > +	struct module *mod = sdev->host->hostt->module;
> > > > > > +
> > > > > > +	atomic_inc(&sdev->put_dev_cnt);
> > > > > 
> > > > > Ick, no!  Why are you making a new lock and reference count for no
> > > > > reason?
> > > > 
> > > > The reason is to make sure that the LLD module is only put from either
> > > > scsi_device_put() and scsi_device_dev_release_usercontext().
> > > > 
> > > > > 
> > > > > > +
> > > > > >  	put_device(&sdev->sdev_gendev);
> > > > > > +
> > > > > > +	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
> > > > > > +		module_put(mod);
> > > > > 
> > > > > How do you know if your module pointer is still valid here?
> > > > 
> > > > module refcnt is grabbed in scsi_device_get(), so it is valid.
> > > 
> > > Then you don't need the extra atomic variable.
> > > 
> > > > > 
> > > > > Why do you care?
> > > > > 
> > > > > What problem are you trying to solve and why is it unique to scsi
> > > > > devices?
> > > > 
> > > > See it from the commit log:
> > > > 
> > > > 	SCSI host release is triggered when SCSI device is freed, and we have to
> > > > 	make sure that LLD module won't be unloaded before SCSI host instance is
> > > > 	released because shost->hostt is required in host release handler.
> > > 
> > > What is "hostt"?
> > 
> > hostt is 'struct scsi_host_template' which is defined in LLD module, and
> > often allocated as static global variable, that is what try_get_module()
> > tries to protect.
> > 
> > > 
> > > > 	
> > > > 	So put LLD module refcnt after SCSI device is released.
> > > 
> > > Why not just drop it explicitly when you drop the reference count of the
> > > device object?  Like you tried to do here, but no need for the extra
> > > atomic variable.
> > 
> > scsi_device_dev_release_usercontext() may be scheduled via schedule_work from
> > the device object's release handler for releasing the scsi_device, which may
> > trigger scsi host's release handler in which hostt is required.
> 
> If a release handler can be called from the device release function,
> then that is when you need to drop the reference, after that function is
> finished being called, right?

No.

Unlike the device object refcnt, the module refcnt is special: it is grabbed
when someone is using the device, so the module won't be unloaded while there
is an active user. When no one is using the device, we should allow rmmod to
unload the module.

The SCSI stack models device use via scsi_device_get() and scsi_device_put().
For example, scsi_device_get() is called when a disk is opened, and
scsi_device_put() is called when the disk is closed.

Wrt. this issue, we can think of the LLD module as holding one device refcnt
on the scsi_host, which is dropped when running module_exit(). But a disk
attached to the host may still be opened by userspace, so the scsi_host won't
be released until the disk is closed, and the LLD module won't be unloaded
until the disk is closed either. When the disk is closed, scsi_device_put()
is called, and module_put() can unload the LLD module immediately if rmmod
has been started.

So what the patch does is make sure that the module refcnt is put after the
device is released, since the host is released directly from the device's
release handler if rmmod has been started.


Thanks,
Ming
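
The get/put pairing described in this reply can be sketched as a toy
userspace model. The toy_* types and functions are hypothetical stand-ins,
not the kernel API: toy_try_module_get() mimics a module get that fails once
unload has begun, and toy_device_put() uses the pre-patch order (module
reference first, then device reference).

```c
#include <stdbool.h>

struct toy_module {
	int refcnt;	/* active users of the LLD module */
	bool going;	/* set once unload has started */
};

struct toy_sdev {
	int dev_refs;			/* device object references */
	struct toy_module *lld;		/* module providing the host template */
};

static bool toy_try_module_get(struct toy_module *m)
{
	if (m->going)
		return false;	/* no new users once unload has begun */
	m->refcnt++;
	return true;
}

/* Called on open: take a device reference, then a module reference. */
static int toy_device_get(struct toy_sdev *s)
{
	s->dev_refs++;
	if (!toy_try_module_get(s->lld)) {
		s->dev_refs--;	/* undo the device reference on failure */
		return -1;	/* toy error code */
	}
	return 0;
}

/* Called on close, pre-patch order: module reference first, then device. */
static void toy_device_put(struct toy_sdev *s)
{
	s->lld->refcnt--;
	s->dev_refs--;
}

static bool toy_can_unload(const struct toy_module *m)
{
	return m->refcnt == 0;
}
```

While the disk is open the module cannot be unloaded in this model. As soon
as toy_device_put() drops the module reference it can, even though the device
reference is only dropped afterwards, which is the window the patch is trying
to close.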
Patch

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index b241f9e3885c..b6612161587f 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -553,8 +553,14 @@  EXPORT_SYMBOL(scsi_device_get);
  */
 void scsi_device_put(struct scsi_device *sdev)
 {
-	module_put(sdev->host->hostt->module);
+	struct module *mod = sdev->host->hostt->module;
+
+	atomic_inc(&sdev->put_dev_cnt);
+
 	put_device(&sdev->sdev_gendev);
+
+	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0)
+		module_put(mod);
 }
 EXPORT_SYMBOL(scsi_device_put);
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 86793259e541..ba6defac91ae 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -449,9 +449,16 @@  static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
 	unsigned long flags;
+	struct module *mod;
+	bool put_mod = false;
 
 	sdev = container_of(work, struct scsi_device, ew.work);
 
+	if (atomic_dec_if_positive(&sdev->put_dev_cnt) >= 0) {
+		put_mod = true;
+		mod = sdev->host->hostt->module;
+	}
+
 	scsi_dh_release_device(sdev);
 
 	parent = sdev->sdev_gendev.parent;
@@ -502,6 +509,9 @@  static void scsi_device_dev_release_usercontext(struct work_struct *work)
 
 	if (parent)
 		put_device(parent);
+
+	if (put_mod)
+		module_put(mod);
 }
 
 static void scsi_device_dev_release(struct device *dev)
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 430b73bd02ac..76268c473c22 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -111,6 +111,8 @@  struct scsi_device {
 	struct sbitmap budget_map;
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
 
+	atomic_t put_dev_cnt;	/* increased by 1 when we are putting device */
+
 	atomic_t restarts;
 	spinlock_t list_lock;
 	struct list_head starved_entry;