diff mbox

[1/2] xen-blkfront: don't call talk_to_blkback when already connected to blkback

Message ID 1464685157-30738-1-git-send-email-bob.liu@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bob Liu May 31, 2016, 8:59 a.m. UTC
Sometimes blkfont may receive twice blkback_changed() notification after
migration, then talk_to_blkback() will be called twice too and confused
xen-blkback.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
---
 drivers/block/xen-blkfront.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Konrad Rzeszutek Wilk May 31, 2016, 8:33 p.m. UTC | #1
On Tue, May 31, 2016 at 04:59:16PM +0800, Bob Liu wrote:
> Sometimes blkfont may receive twice blkback_changed() notification after
> migration, then talk_to_blkback() will be called twice too and confused
> xen-blkback.

Could you enlighten the patch description by having some form of
state transition here? I am curious how you got the frontend
to get in XenbusStateConnected (via blkif_recover right) and then
the backend triggering the update once more?

Or is just a simple race - the backend moves from XenbusStateConnected->
XenbusStateConnected - which retriggers the frontend to hit in
blkback_changed the XenbusStateConnected state and go in there?
(That would be in conenct_ring changing the state). But I don't
see how the frontend_changed code get there as we have:

 770                 /*
 771                  * Ensure we connect even when two watches fire in
 772                  * close succession and we miss the intermediate value
 773                  * of frontend_state.
 774                  */
 775                 if (dev->state == XenbusStateConnected)
 776                         break;
 777 

?

Now what about 'blkfront_connect' being called on the second time?

Ah, info->connected is probably by then in BLKIF_STATE_CONNECTED
(as blkif_recover changed) and we just reread the size of the disk.

Is that how about the flow goes?
> 
> Signed-off-by: Bob Liu <bob.liu@oracle.com>
> ---
>  drivers/block/xen-blkfront.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index ca13df8..01aa460 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -2485,7 +2485,8 @@ static void blkback_changed(struct xenbus_device *dev,
>  		break;
>  
>  	case XenbusStateConnected:
> -		if (dev->state != XenbusStateInitialised) {
> +		if ((dev->state != XenbusStateInitialised) &&
> +			(dev->state != XenbusStateConnected)) {
>  			if (talk_to_blkback(dev, info))
>  				break;
>  		}
> -- 
> 2.7.4
>
Bob Liu June 1, 2016, 5:49 a.m. UTC | #2
On 06/01/2016 04:33 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, May 31, 2016 at 04:59:16PM +0800, Bob Liu wrote:
>> Sometimes blkfont may receive twice blkback_changed() notification after
>> migration, then talk_to_blkback() will be called twice too and confused
>> xen-blkback.
> 
> Could you enlighten the patch description by having some form of
> state transition here? I am curious how you got the frontend
> to get in XenbusStateConnected (via blkif_recover right) and then
> the backend triggering the update once more?
> 
> Or is just a simple race - the backend moves from XenbusStateConnected->
> XenbusStateConnected - which retriggers the frontend to hit in
> blkback_changed the XenbusStateConnected state and go in there?
> (That would be in conenct_ring changing the state). But I don't
> see how the frontend_changed code get there as we have:
> 
>  770                 /*
>  771                  * Ensure we connect even when two watches fire in
>  772                  * close succession and we miss the intermediate value
>  773                  * of frontend_state.
>  774                  */
>  775                 if (dev->state == XenbusStateConnected)
>  776                         break;
>  777 
> 
> ?
> 
> Now what about 'blkfront_connect' being called on the second time?
> 
> Ah, info->connected is probably by then in BLKIF_STATE_CONNECTED
> (as blkif_recover changed) and we just reread the size of the disk.
> 
> Is that how about the flow goes?

	blkfront 					blkback
blkfront_resume()                                   
 > talk_to_blkback()
  > Set blkfront to XenbusStateInitialised
						Front changed()
  						 > Connect()
						  > Set blkback to XenbusStateConnected

blkback_changed()
 > Skip talk_to_blkback()
   because frontstate == XenbusStateInitialised
 > blkfront_connect()
  > Set blkfront to XenbusStateConnected


------------------------------------------------------------------
But sometimes blkfront receives
blkback_changed() event more than once!
Not sure why.

blkback_changed()
 > because now frontstate != XenbusStateInitialised
   talk_to_blkback() is also called again
  > blkfront state changed from 
    XenbusStateConnected to XenbusStateInitialised
    (Which is not correct!)


    						Front_changed():
						 > Do nothing because blkback
						   already in XenbusStateConnected


Now blkback is XenbusStateConnected but blkfront still XenbusStateInitialised.

>>
>> Signed-off-by: Bob Liu <bob.liu@oracle.com>
>> ---
>>  drivers/block/xen-blkfront.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index ca13df8..01aa460 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -2485,7 +2485,8 @@ static void blkback_changed(struct xenbus_device *dev,
>>  		break;
>>  
>>  	case XenbusStateConnected:
>> -		if (dev->state != XenbusStateInitialised) {
>> +		if ((dev->state != XenbusStateInitialised) &&
>> +			(dev->state != XenbusStateConnected)) {
>>  			if (talk_to_blkback(dev, info))
>>  				break;
>>  		}
>> -- 
>> 2.7.4
>>
Konrad Rzeszutek Wilk June 2, 2016, 2:30 p.m. UTC | #3
On Wed, Jun 01, 2016 at 01:49:23PM +0800, Bob Liu wrote:
> 
> On 06/01/2016 04:33 AM, Konrad Rzeszutek Wilk wrote:
> > On Tue, May 31, 2016 at 04:59:16PM +0800, Bob Liu wrote:
> >> Sometimes blkfont may receive twice blkback_changed() notification after
> >> migration, then talk_to_blkback() will be called twice too and confused
> >> xen-blkback.
> > 
> > Could you enlighten the patch description by having some form of
> > state transition here? I am curious how you got the frontend
> > to get in XenbusStateConnected (via blkif_recover right) and then
> > the backend triggering the update once more?
> > 
> > Or is just a simple race - the backend moves from XenbusStateConnected->
> > XenbusStateConnected - which retriggers the frontend to hit in
> > blkback_changed the XenbusStateConnected state and go in there?
> > (That would be in conenct_ring changing the state). But I don't
> > see how the frontend_changed code get there as we have:
> > 
> >  770                 /*
> >  771                  * Ensure we connect even when two watches fire in
> >  772                  * close succession and we miss the intermediate value
> >  773                  * of frontend_state.
> >  774                  */
> >  775                 if (dev->state == XenbusStateConnected)
> >  776                         break;
> >  777 
> > 
> > ?
> > 
> > Now what about 'blkfront_connect' being called on the second time?
> > 
> > Ah, info->connected is probably by then in BLKIF_STATE_CONNECTED
> > (as blkif_recover changed) and we just reread the size of the disk.
> > 
> > Is that how about the flow goes?
> 
> 	blkfront 					blkback
> blkfront_resume()                                   
>  > talk_to_blkback()
>   > Set blkfront to XenbusStateInitialised
> 						Front changed()
>   						 > Connect()
> 						  > Set blkback to XenbusStateConnected
> 
> blkback_changed()
>  > Skip talk_to_blkback()
>    because frontstate == XenbusStateInitialised
>  > blkfront_connect()
>   > Set blkfront to XenbusStateConnected
> 
> 
> ------------------------------------------------------------------
> But sometimes blkfront receives
> blkback_changed() event more than once!

Could the control stack (xend) be doing this?

> Not sure why.
> 
> blkback_changed()
>  > because now frontstate != XenbusStateInitialised
>    talk_to_blkback() is also called again
>   > blkfront state changed from 
>     XenbusStateConnected to XenbusStateInitialised
>     (Which is not correct!)
> 
> 
>     						Front_changed():
> 						 > Do nothing because blkback
> 						   already in XenbusStateConnected
> 
> 
> Now blkback is XenbusStateConnected but blkfront still XenbusStateInitialised.

Ah!
> 
> >>
> >> Signed-off-by: Bob Liu <bob.liu@oracle.com>
> >> ---
> >>  drivers/block/xen-blkfront.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> >> index ca13df8..01aa460 100644
> >> --- a/drivers/block/xen-blkfront.c
> >> +++ b/drivers/block/xen-blkfront.c
> >> @@ -2485,7 +2485,8 @@ static void blkback_changed(struct xenbus_device *dev,
> >>  		break;
> >>  
> >>  	case XenbusStateConnected:
> >> -		if (dev->state != XenbusStateInitialised) {
> >> +		if ((dev->state != XenbusStateInitialised) &&
> >> +			(dev->state != XenbusStateConnected)) {
> >>  			if (talk_to_blkback(dev, info))
> >>  				break;
> >>  		}
> >> -- 
> >> 2.7.4
> >>
diff mbox

Patch

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index ca13df8..01aa460 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2485,7 +2485,8 @@  static void blkback_changed(struct xenbus_device *dev,
 		break;
 
 	case XenbusStateConnected:
-		if (dev->state != XenbusStateInitialised) {
+		if ((dev->state != XenbusStateInitialised) &&
+			(dev->state != XenbusStateConnected)) {
 			if (talk_to_blkback(dev, info))
 				break;
 		}