Message ID | 20180913121546.5710-12-ming.lei@redhat.com (mailing list archive)
---|---
State | Changes Requested
Series | [V3,01/17] blk-mq: allow to pass default queue flags for creating & initializing queue
Hi Ming

On 09/13/2018 08:15 PM, Ming Lei wrote:
> EXPORT_SYMBOL(__scsi_execute);
> @@ -3246,6 +3251,7 @@ static int scsi_internal_device_block(struct scsi_device *sdev)
> 		else
> 			scsi_wait_for_queuecommand(sdev);
> 	}
> +	wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending));
> 	mutex_unlock(&sdev->state_mutex);
>
> 	return err;
...
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 3aee9464a7bf..8bcb7ecc0c06 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1393,6 +1393,7 @@ void __scsi_remove_device(struct scsi_device *sdev)
>
> 	blk_cleanup_queue(sdev->request_queue);
> 	cancel_work_sync(&sdev->requeue_work);
> +	wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending))

This nr_admin_pending could drain the ongoing scsi_request_fn/scsi_queue_rq,
but I'm afraid it cannot stop new ones coming in, such as the ones that have
passed the sdev state checking and have not crossed the
atomic_inc(&sdev->nr_admin_pending).

Thanks
Jianchao
On Fri, Sep 14, 2018 at 11:40 AM jianchao.wang <jianchao.w.wang@oracle.com> wrote:
>
> Hi Ming
>
> On 09/13/2018 08:15 PM, Ming Lei wrote:
[...]
>
> This nr_admin_pending could drain the ongoing scsi_request_fn/scsi_queue_rq,
> but I'm afraid it cannot stop new ones coming in, such as the ones that have passed
> the sdev state checking and have not crossed the atomic_inc(&sdev->nr_admin_pending).
>

The counter of .nr_admin_pending is introduced for draining queued
admin requests to this scsi device.

Actually new requests have been prevented from entering scsi_queue_rq(),
please see the two callers of wait_event(sdev->admin_wq,
!atomic_read(&sdev->nr_admin_pending)).

Thanks,
Ming Lei
Hi Ming

On 09/14/2018 07:33 PM, Ming Lei wrote:
[...]
> The counter of .nr_admin_pending is introduced for draining queued
> admin requests to this scsi device.
>
> Actually new requests have been prevented from entering scsi_queue_rq(),
> please see the two callers of wait_event(sdev->admin_wq,
> !atomic_read(&sdev->nr_admin_pending)).
>
For example

_scsi_execute                          scsi_internal_device_block
...                                      scsi_internal_device_block_nowait
                                         blk_mq_quiesce_queue
                                       wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending))
atomic_inc(&sdev->nr_admin_pending);

blk_execute_rq(...)

atomic_dec(&sdev->nr_admin_pending);
wake_up_all(&sdev->admin_wq);

Or do you mean the scsi_queue_rq -> scsi_prep_state_check could gate out of ?

Thanks
Jianchao
On Mon, Sep 17, 2018 at 10:46:34AM +0800, jianchao.wang wrote:
[...]
> For example
>
> _scsi_execute                          scsi_internal_device_block
> ...                                      scsi_internal_device_block_nowait
>                                          blk_mq_quiesce_queue
>                                        wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending))
> atomic_inc(&sdev->nr_admin_pending);
>
> blk_execute_rq(...)
>
> atomic_dec(&sdev->nr_admin_pending);
> wake_up_all(&sdev->admin_wq);
>
> Or do you mean the scsi_queue_rq -> scsi_prep_state_check could gate out of ?

I got it, then this issue can be fixed simply by moving
atomic_inc/dec(&sdev->nr_admin_pending) and the related
wake_up_all(&sdev->admin_wq) into scsi_admin_queue_rq().

Thanks,
Ming
Hi Ming

On 09/17/2018 07:35 PM, Ming Lei wrote:
[...]
> I got it, then this issue can be fixed simply by moving atomic_inc/dec(&sdev->nr_admin_pending)
> and related wake_up_all(&sdev->admin_wq) into scsi_admin_queue_rq().
>

I don't think so. It is a similar scenario.

I guess a state check is needed after atomic_inc(&sdev->nr_admin_pending), like:

_scsi_execute                          scsi_internal_device_block
...                                      scsi_internal_device_block_nowait
                                         blk_mq_quiesce_queue
                                       wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending))
atomic_inc(&sdev->nr_admin_pending);
if state checking fails
	goto done

blk_execute_rq(...)

atomic_dec(&sdev->nr_admin_pending);
wake_up_all(&sdev->admin_wq);

Thanks
Jianchao
On Tue, Sep 18, 2018 at 09:22:32AM +0800, jianchao.wang wrote:
[...]
> I don't think so. It is a similar scenario.
>
> I guess a state checking is needed after atomic_inc(&sdev->nr_admin_pending), like:
>
> _scsi_execute
> ...
> atomic_inc(&sdev->nr_admin_pending);
> if state checking fails
> 	goto done

The check will be done in scsi_admin_queue_rq().

> blk_execute_rq(...)
>
> atomic_dec(&sdev->nr_admin_pending);
> wake_up_all(&sdev->admin_wq);

I guess you may misunderstand the purpose of .nr_admin_pending, which is
for draining requests to .queue_rq(). So it is enough to just move the
inc/dec of .nr_admin_pending into scsi_admin_queue_rq(), right?

Thanks,
Ming
On 09/18/2018 03:39 PM, Ming Lei wrote:
[...]
> I guess you may misunderstand the purpose of .nr_admin_pending, which is
> for draining requests to .queue_rq(). So it is enough to just move the
> inc/dec of .nr_admin_pending into scsi_admin_queue_rq(), right?

Yes.

But I just think of how to assign with queue quiesce.
The existence of per-host adminq seems to break it.

Thanks
Jianchao
On Tue, Sep 18, 2018 at 03:51:10PM +0800, jianchao.wang wrote:
[...]
> > I guess you may misunderstand the purpose of .nr_admin_pending, which is
> > for draining requests to .queue_rq(). So it is enough to just move the
> > inc/dec of .nr_admin_pending into scsi_admin_queue_rq(), right?
> Yes.

Thanks for your confirmation.

> But I just think of how to assign with queue quiesce.
> The existence of per-host adminq seems to break it.

The per-host adminq won't be frozen or quiesced at all, could you explain
your concern about 'assign with queue quiesce' in a bit more detail?

Thanks,
Ming
On 09/18/2018 03:55 PM, Ming Lei wrote:
[...]
> The per-host adminq won't be frozen or quiesced at all,

This is the place which makes me confused.

Before this patchset, when scsi_internal_device_block quiesces the
request_queue, neither normal IO nor admin requests can be processed any more.

Given the per-host adminq, we could say every scsi device has two
request_queues, or rather that there are two request_queues sending requests
to a scsi_device:
- the normal IO request_queue
- the shared per-host adminq

We can quiesce the normal IO request_queue, but what about the per-host adminq?

> could you explain
> your concern about 'assign with queue quiesce' in a bit more detail?
>

The scsi_queue_rq->scsi_prep_state_check could stop requests from entering
further, but scsi_queue_rq is still invoked.

Thanks
Jianchao
On Tue, Sep 18, 2018 at 04:25:33PM +0800, jianchao.wang wrote:
[...]
> Before this patchset, when scsi_internal_device_block quiesces the request_queue,
> both normal IO and admin requests cannot be processed any more.
>
> Given the per-host adminq, we could say every scsi device has two request_queues,
> or there could be two request_queues sending requests to a scsi_device:
> - normal IO request_queue
> - shared per-host adminq
>
> We could quiesce the normal IO request_queue, but what about the per-host adminq ?

The adminq request can be quiesced at the scsi_device level, just as a normal
IO request can, because the scsi_device instance is the same as for normal IO.

But at the blk-mq level, this admin queue won't be quiesced or frozen at all.

> The scsi_queue_rq->scsi_prep_state_check could stop requests entering further, but
> the scsi_queue_rq is still invoked.

As I mentioned, there isn't any change at the scsi level in handling the admin
request, because the scsi_device is shared, and the scsi_host too.

Thanks,
Ming
Hi Ming

On 09/18/2018 08:15 PM, Ming Lei wrote:
> The adminq request can be quiesced at the scsi_device level, just as a normal
> IO request can, because the scsi_device instance is the same as for normal IO.

Yes, the scsi_queue_rq->scsi_prep_state_check could gate things out.

> But at the blk-mq level, this admin queue won't be quiesced or frozen at all.

But scsi_queue_rq/scsi_request_fn will still be invoked, right?

This is different from the original implementation.

Thanks
Jianchao
On Wed, Sep 19, 2018 at 11:52:21AM +0800, jianchao.wang wrote:
[...]
> But scsi_queue_rq/scsi_request_fn will still be invoked, right?
>
> This is different from the original implementation.

For the legacy path, as you mentioned, scsi_prep_fn() will do the gating, so
scsi_request_fn() won't be called.

For blk-mq, scsi_queue_rq() is always called, and that is a bit different, but
the result is fine given that admin requests are submitted much less frequently.

Thanks,
Ming
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 8d129b601cc5..4db08458a127 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -299,6 +299,8 @@ int __scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
 	req->cmd_flags |= flags;
 	req->rq_flags |= rq_flags | RQF_QUIET;
 
+	atomic_inc(&sdev->nr_admin_pending);
+
 	/*
 	 * head injection *required* here otherwise quiesce won't work
 	 */
@@ -323,6 +325,9 @@ int __scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
  out:
 	blk_put_request(req);
 
+	atomic_dec(&sdev->nr_admin_pending);
+	wake_up_all(&sdev->admin_wq);
+
 	return ret;
 }
 EXPORT_SYMBOL(__scsi_execute);
@@ -3246,6 +3251,7 @@ static int scsi_internal_device_block(struct scsi_device *sdev)
 		else
 			scsi_wait_for_queuecommand(sdev);
 	}
+	wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending));
 	mutex_unlock(&sdev->state_mutex);
 
 	return err;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 78ca63dfba4a..b83ad0dc8890 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -243,6 +243,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	mutex_init(&sdev->inquiry_mutex);
 	INIT_WORK(&sdev->event_work, scsi_evt_thread);
 	INIT_WORK(&sdev->requeue_work, scsi_requeue_run_queue);
+	init_waitqueue_head(&sdev->admin_wq);
 
 	sdev->sdev_gendev.parent = get_device(&starget->dev);
 	sdev->sdev_target = starget;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 3aee9464a7bf..8bcb7ecc0c06 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1393,6 +1393,7 @@ void __scsi_remove_device(struct scsi_device *sdev)
 
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
+	wait_event(sdev->admin_wq, !atomic_read(&sdev->nr_admin_pending));
 
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 202f4d6a4342..f6820da1dc37 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -227,6 +227,10 @@ struct scsi_device {
 	struct mutex		state_mutex;
 	enum scsi_device_state sdev_state;
 	struct task_struct	*quiesced_by;
+
+	atomic_t		nr_admin_pending;
+	wait_queue_head_t	admin_wq;
+
 	unsigned long		sdev_data[0];
 } __attribute__((aligned(sizeof(unsigned long))));
Firstly we have to make sure that all pending admin commands to one same
scsi_device are completed before removing the scsi_device.

Secondly scsi_internal_device_block() needs this too.

So introduce one waitqueue and atomic counter for this purpose.

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c    | 6 ++++++
 drivers/scsi/scsi_scan.c   | 1 +
 drivers/scsi/scsi_sysfs.c  | 1 +
 include/scsi/scsi_device.h | 4 ++++
 4 files changed, 12 insertions(+)