diff mbox series

[1/1] xen-blkback: do not wake up shutdown_wq after xen_blkif_schedule() is stopped

Message ID 1547646461-29803-1-git-send-email-dongli.zhang@oracle.com (mailing list archive)
State New, archived
Headers show
Series [1/1] xen-blkback: do not wake up shutdown_wq after xen_blkif_schedule() is stopped | expand

Commit Message

Dongli Zhang Jan. 16, 2019, 1:47 p.m. UTC
There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
to already wake up the kernel thread.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/block/xen-blkback/xenbus.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Roger Pau Monné Jan. 16, 2019, 2:29 p.m. UTC | #1
On Wed, Jan 16, 2019 at 09:47:41PM +0800, Dongli Zhang wrote:
> There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
> to already wake up the kernel thread.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

kthread_stop waits for the thread to exit, so it must obviously wake
it up.

Thanks, Roger.
Roger Pau Monné Jan. 16, 2019, 2:52 p.m. UTC | #2
On Wed, Jan 16, 2019 at 09:47:41PM +0800, Dongli Zhang wrote:
> There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
> to already wake up the kernel thread.
> 
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
>  drivers/block/xen-blkback/xenbus.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index a4bc74e..37ac59e 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -254,10 +254,8 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
>  		if (!ring->active)
>  			continue;
>  
> -		if (ring->xenblkd) {
> +		if (ring->xenblkd)
>  			kthread_stop(ring->xenblkd);
> -			wake_up(&ring->shutdown_wq);

I've now realized that shutdown_wq is basically useless, and should be
removed, could you please do it in this patch?

Thanks, Roger.
Dongli Zhang Jan. 17, 2019, 2:10 a.m. UTC | #3
Hi Roger,

On 2019/1/16 下午10:52, Roger Pau Monné wrote:
> On Wed, Jan 16, 2019 at 09:47:41PM +0800, Dongli Zhang wrote:
>> There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
>> to already wake up the kernel thread.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>>  drivers/block/xen-blkback/xenbus.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
>> index a4bc74e..37ac59e 100644
>> --- a/drivers/block/xen-blkback/xenbus.c
>> +++ b/drivers/block/xen-blkback/xenbus.c
>> @@ -254,10 +254,8 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
>>  		if (!ring->active)
>>  			continue;
>>  
>> -		if (ring->xenblkd) {
>> +		if (ring->xenblkd)
>>  			kthread_stop(ring->xenblkd);
>> -			wake_up(&ring->shutdown_wq);
> 
> I've now realized that shutdown_wq is basically useless, and should be
> removed, could you please do it in this patch?

I do not think shutdown_wq is useless.

It is used to halt the xen_blkif_schedule() kthread when
RING_REQUEST_PROD_OVERFLOW() returns true in __do_block_io_op():

1121 static int
1122 __do_block_io_op(struct xen_blkif_ring *ring)
... ...
1134         if (RING_REQUEST_PROD_OVERFLOW(&blk_rings->common, rp)) {
1135                 rc = blk_rings->common.rsp_prod_pvt;
1136                 pr_warn("Frontend provided bogus ring requests (%d - %d =
%d). Halting ring processing on dev=%04x\n",
1137                         rp, rc, rp - rc, ring->blkif->vbd.pdevice);
1138                 return -EACCES;
1139         }


If there is bogus/invalid ring requests, __do_block_io_op() would return -EACCES
without modifying prod/cons index.

Without shutdown_wq (just simply assuming we remove the below code without
handling -EACCES in xen_blkif_schedule()), the kernel thread would continue the
while loop.

648                 if (ret == -EACCES)
649                         wait_event_interruptible(ring->shutdown_wq,
650                                                  kthread_should_stop());


If xen_blkif_be_int() is triggered again (let's assume there is no optimization
on guest part and guest would send event for every request it puts on ring
buffer), we may come to do_block_io_op() again.


As the prod/cons index are not modified last time the code runs into
do_block_io_op() to process bogus request, we would hit the bogus request issue
again.


With shutdown_wq, the kernel kthread is blocked forever until such queue/ring is
destroyed. If we remove shutdown_wq, we are changing the policy to handle bogus
requests on ring buffer?

Please correct me if my understanding is wrong.

Thank you very much!

Dongli Zhang
Roger Pau Monné Jan. 17, 2019, 8:20 a.m. UTC | #4
On Thu, Jan 17, 2019 at 10:10:00AM +0800, Dongli Zhang wrote:
> Hi Roger,
> 
> On 2019/1/16 下午10:52, Roger Pau Monné wrote:
> > On Wed, Jan 16, 2019 at 09:47:41PM +0800, Dongli Zhang wrote:
> >> There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
> >> to already wake up the kernel thread.
> >>
> >> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> >> ---
> >>  drivers/block/xen-blkback/xenbus.c | 4 +---
> >>  1 file changed, 1 insertion(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> >> index a4bc74e..37ac59e 100644
> >> --- a/drivers/block/xen-blkback/xenbus.c
> >> +++ b/drivers/block/xen-blkback/xenbus.c
> >> @@ -254,10 +254,8 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
> >>  		if (!ring->active)
> >>  			continue;
> >>  
> >> -		if (ring->xenblkd) {
> >> +		if (ring->xenblkd)
> >>  			kthread_stop(ring->xenblkd);
> >> -			wake_up(&ring->shutdown_wq);
> > 
> > I've now realized that shutdown_wq is basically useless, and should be
> > removed, could you please do it in this patch?
> 
> I do not think shutdown_wq is useless.
> 
> It is used to halt the xen_blkif_schedule() kthread when
> RING_REQUEST_PROD_OVERFLOW() returns true in __do_block_io_op():
> 
> 1121 static int
> 1122 __do_block_io_op(struct xen_blkif_ring *ring)
> ... ...
> 1134         if (RING_REQUEST_PROD_OVERFLOW(&blk_rings->common, rp)) {
> 1135                 rc = blk_rings->common.rsp_prod_pvt;
> 1136                 pr_warn("Frontend provided bogus ring requests (%d - %d =
> %d). Halting ring processing on dev=%04x\n",
> 1137                         rp, rc, rp - rc, ring->blkif->vbd.pdevice);
> 1138                 return -EACCES;
> 1139         }
> 
> 
> If there is bogus/invalid ring requests, __do_block_io_op() would return -EACCES
> without modifying prod/cons index.
> 
> Without shutdown_wq (just simply assuming we remove the below code without
> handling -EACCES in xen_blkif_schedule()), the kernel thread would continue the
> while loop.
> 
> 648                 if (ret == -EACCES)
> 649                         wait_event_interruptible(ring->shutdown_wq,
> 650                                                  kthread_should_stop());
> 
> 
> If xen_blkif_be_int() is triggered again (let's assume there is no optimization
> on guest part and guest would send event for every request it puts on ring
> buffer), we may come to do_block_io_op() again.
> 
> 
> As the prod/cons index are not modified last time the code runs into
> do_block_io_op() to process bogus request, we would hit the bogus request issue
> again.
> 
> 
> With shutdown_wq, the kernel kthread is blocked forever until such queue/ring is
> destroyed. If we remove shutdown_wq, we are changing the policy to handle bogus
> requests on ring buffer?

AFAICT the only wakeup call to shutdown_wq is removed in this patch,
hence waiting on it seems useless. I would replace the
wait_event_interruptible call in xen_blkif_schedule with a break, so
that the kthread ends as soon as a bogus request is found. I think
there's no point in waiting for xen_blkif_disconnect to stop the
kthread.

Thanks, Roger.
Dongli Zhang Jan. 17, 2019, 10:23 a.m. UTC | #5
Hi Roger,

On 01/17/2019 04:20 PM, Roger Pau Monné wrote:
> On Thu, Jan 17, 2019 at 10:10:00AM +0800, Dongli Zhang wrote:
>> Hi Roger,
>>
>> On 2019/1/16 下午10:52, Roger Pau Monné wrote:
>>> On Wed, Jan 16, 2019 at 09:47:41PM +0800, Dongli Zhang wrote:
>>>> There is no need to wake up xen_blkif_schedule() as kthread_stop() is able
>>>> to already wake up the kernel thread.
>>>>
>>>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>>>> ---
>>>>  drivers/block/xen-blkback/xenbus.c | 4 +---
>>>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
>>>> index a4bc74e..37ac59e 100644
>>>> --- a/drivers/block/xen-blkback/xenbus.c
>>>> +++ b/drivers/block/xen-blkback/xenbus.c
>>>> @@ -254,10 +254,8 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
>>>>  		if (!ring->active)
>>>>  			continue;
>>>>  
>>>> -		if (ring->xenblkd) {
>>>> +		if (ring->xenblkd)
>>>>  			kthread_stop(ring->xenblkd);
>>>> -			wake_up(&ring->shutdown_wq);
>>>
>>> I've now realized that shutdown_wq is basically useless, and should be
>>> removed, could you please do it in this patch?
>>
>> I do not think shutdown_wq is useless.
>>
>> It is used to halt the xen_blkif_schedule() kthread when
>> RING_REQUEST_PROD_OVERFLOW() returns true in __do_block_io_op():
>>
>> 1121 static int
>> 1122 __do_block_io_op(struct xen_blkif_ring *ring)
>> ... ...
>> 1134         if (RING_REQUEST_PROD_OVERFLOW(&blk_rings->common, rp)) {
>> 1135                 rc = blk_rings->common.rsp_prod_pvt;
>> 1136                 pr_warn("Frontend provided bogus ring requests (%d - %d =
>> %d). Halting ring processing on dev=%04x\n",
>> 1137                         rp, rc, rp - rc, ring->blkif->vbd.pdevice);
>> 1138                 return -EACCES;
>> 1139         }
>>
>>
>> If there is bogus/invalid ring requests, __do_block_io_op() would return -EACCES
>> without modifying prod/cons index.
>>
>> Without shutdown_wq (just simply assuming we remove the below code without
>> handling -EACCES in xen_blkif_schedule()), the kernel thread would continue the
>> while loop.
>>
>> 648                 if (ret == -EACCES)
>> 649                         wait_event_interruptible(ring->shutdown_wq,
>> 650                                                  kthread_should_stop());
>>
>>
>> If xen_blkif_be_int() is triggered again (let's assume there is no optimization
>> on guest part and guest would send event for every request it puts on ring
>> buffer), we may come to do_block_io_op() again.
>>
>>
>> As the prod/cons index are not modified last time the code runs into
>> do_block_io_op() to process bogus request, we would hit the bogus request issue
>> again.
>>
>>
>> With shutdown_wq, the kernel kthread is blocked forever until such queue/ring is
>> destroyed. If we remove shutdown_wq, we are changing the policy to handle bogus
>> requests on ring buffer?
> 
> AFAICT the only wakeup call to shutdown_wq is removed in this patch,
> hence waiting on it seems useless. I would replace the
> wait_event_interruptible call in xen_blkif_schedule with a break, so
> that the kthread ends as soon as a bogus request is found. I think
> there's no point in waiting for xen_blkif_disconnect to stop the
> kthread.
> 
> Thanks, Roger.
> 

My fault. The shutdown_wq is useless. I think I was going to say
wait_event_interruptible() is useful as it is used to halt the kthread when
there is bogus request.

It is fine to replace wait_event_interruptible with a break to exit immediately
when there is bogus request.

Thank you very much!

Dongli Zhang
diff mbox series

Patch

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index a4bc74e..37ac59e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -254,10 +254,8 @@  static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		if (!ring->active)
 			continue;
 
-		if (ring->xenblkd) {
+		if (ring->xenblkd)
 			kthread_stop(ring->xenblkd);
-			wake_up(&ring->shutdown_wq);
-		}
 
 		/* The above kthread_stop() guarantees that at this point we
 		 * don't have any discard_io or other_io requests. So, checking