xen-blkback: fix compatibility bug with single page rings

Message ID 20210127103034.2559-1-paul@xen.org (mailing list archive)
State Superseded

Commit Message

Paul Durrant Jan. 27, 2021, 10:30 a.m. UTC
From: Paul Durrant <pdurrant@amazon.com>

Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
behaviour of xen-blkback when connecting to a frontend was:

- read 'ring-page-order'
- if not present then expect a single page ring specified by 'ring-ref'
- else expect a ring specified by 'ring-refX' where X runs from 0 to
  (1 << ring-page-order) - 1

This was correct behaviour, but was broken by the aforementioned commit to
become:

- read 'ring-page-order'
- if not present then expect a single page ring
- expect a ring specified by 'ring-refX' where X runs from 0 to
  (1 << ring-page-order) - 1
- if that didn't work then see if there's a single page ring specified by
  'ring-ref'

This incorrect behaviour works most of the time but fails when a frontend
that sets 'ring-page-order' is unloaded and replaced by one that does not
because, instead of reading 'ring-ref', xen-blkback will read the stale
'ring-ref0' left around by the previous frontend and will try to map the
wrong grant reference.

This patch restores the original behaviour.

Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dongli Zhang <dongli.zhang@oracle.com>
---
 drivers/block/xen-blkback/common.h |  1 +
 drivers/block/xen-blkback/xenbus.c | 36 +++++++++++++-----------------
 2 files changed, 17 insertions(+), 20 deletions(-)
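
The difference between the two lookup orders described in the commit message can be sketched outside the kernel. This is only an illustration under stated assumptions: the key/value array stands in for a xenstore directory, and store_read(), first_ref_fallback(), first_ref_restored() and the values 8 and 9 are hypothetical names and numbers, not part of xen-blkback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a xenstore directory: a flat key/value list. */
struct kv {
	const char *key;
	unsigned int val;
};

/* Returns 1 when the key is found (like a one-item scanf), otherwise 0. */
static int store_read(const struct kv *s, size_t n, const char *key,
		      unsigned int *out)
{
	size_t i;

	assert(key != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(s[i].key, key) == 0) {
			*out = s[i].val;
			return 1;
		}
	}
	return 0;
}

/* Order after 4a8c31a1c6f5: 'ring-ref0' first, 'ring-ref' only as fallback. */
static int first_ref_fallback(const struct kv *s, size_t n, unsigned int *ref)
{
	if (store_read(s, n, "ring-ref0", ref) == 1)
		return 1;
	return store_read(s, n, "ring-ref", ref);
}

/* Restored order: the presence of 'ring-page-order' selects the key name. */
static int first_ref_restored(const struct kv *s, size_t n, unsigned int *ref)
{
	unsigned int order;

	if (store_read(s, n, "ring-page-order", &order) != 1)
		return store_read(s, n, "ring-ref", ref);
	return store_read(s, n, "ring-ref0", ref);
}

/* Directory left by a multi-page frontend, then reused by a legacy one. */
static const struct kv stale_store[] = {
	{ "ring-ref0", 8 },	/* stale entry from the previous frontend */
	{ "ring-ref", 9 },	/* written by the current single-page frontend */
};
```

With stale_store, the fallback order picks up the stale reference 8, while the restored order reads 9 from 'ring-ref' because 'ring-page-order' is absent.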

Comments

Jan Beulich Jan. 27, 2021, 10:56 a.m. UTC | #1
On 27.01.2021 11:30, Paul Durrant wrote:
> This patch restores the original behaviour.

Isn't this only the 2nd of a pair of fixes that's needed, the
first being the drivers, upon being unloaded, to fully clean up
after itself? Any stale key left may lead to confusion upon
re-use of the containing directory.

Jan
Paul Durrant Jan. 27, 2021, 11:09 a.m. UTC | #2
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 27 January 2021 10:57
> To: Paul Durrant <paul@xen.org>
> Cc: Paul Durrant <pdurrant@amazon.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Jens Axboe <axboe@kernel.dk>; Dongli Zhang <dongli.zhang@oracle.com>;
> linux-kernel@vger.kernel.org; linux-block@vger.kernel.org; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> Isn't this only the 2nd of a pair of fixes that's needed, the
> first being the drivers, upon being unloaded, to fully clean up
> after itself? Any stale key left may lead to confusion upon
> re-use of the containing directory.

In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself. Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend.

  Paul

> 
> Jan
Jan Beulich Jan. 27, 2021, 11:20 a.m. UTC | #3
On 27.01.2021 12:09, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 27 January 2021 10:57
>> To: Paul Durrant <paul@xen.org>
>> Cc: Paul Durrant <pdurrant@amazon.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Roger Pau
>> Monné <roger.pau@citrix.com>; Jens Axboe <axboe@kernel.dk>; Dongli Zhang <dongli.zhang@oracle.com>;
>> linux-kernel@vger.kernel.org; linux-block@vger.kernel.org; xen-devel@lists.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
>>
>> Isn't this only the 2nd of a pair of fixes that's needed, the
>> first being the drivers, upon being unloaded, to fully clean up
>> after itself? Any stale key left may lead to confusion upon
>> re-use of the containing directory.
> 
> In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself. Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend.

But the backend can't know whether a node exists because the present
frontend has written it, or because an earlier instance forgot to
delete it. It can only honor what's there. (In fact the other day I
was wondering whether some of the writes of boolean "false" nodes
wouldn't better be xenbus_rm() instead.)

Jan
Paul Durrant Jan. 27, 2021, 11:33 a.m. UTC | #4
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 27 January 2021 11:21
> To: paul@xen.org
> Cc: 'Paul Durrant' <pdurrant@amazon.com>; 'Konrad Rzeszutek Wilk' <konrad.wilk@oracle.com>; 'Roger Pau
> Monné' <roger.pau@citrix.com>; 'Jens Axboe' <axboe@kernel.dk>; 'Dongli Zhang'
> <dongli.zhang@oracle.com>; linux-kernel@vger.kernel.org; linux-block@vger.kernel.org; xen-
> devel@lists.xenproject.org
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> On 27.01.2021 12:09, Paul Durrant wrote:
> > In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself. Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend.
> 
> But the backend can't know whether a node exists because the present
> frontend has written it, or because an earlier instance forgot to
> delete it. It can only honor what's there. (In fact the other day I
> was wondering whether some of the writes of boolean "false" nodes
> wouldn't better be xenbus_rm() instead.)

In the particular case this patch is fixing for me, the frontends are the Windows XENVBD driver and the Windows crash version of the same driver (actually built from different code). The 'normal' instance is multi-page aware and the crash instance is not quite, i.e. it uses the old ring-ref but knows to clean up 'ring-page-order'.
Clearly, in a crash situation, we cannot rely on frontend to clean up so what you say does highlight that there indeed needs to be a second patch to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' cycles through Closed and back to InitWait. I think this patch does still stand on its own though.

  Paul

> 
> Jan
Jan Beulich Jan. 27, 2021, 11:37 a.m. UTC | #5
On 27.01.2021 12:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 27 January 2021 11:21
>> To: paul@xen.org
>> Cc: 'Paul Durrant' <pdurrant@amazon.com>; 'Konrad Rzeszutek Wilk' <konrad.wilk@oracle.com>; 'Roger Pau
>> Monné' <roger.pau@citrix.com>; 'Jens Axboe' <axboe@kernel.dk>; 'Dongli Zhang'
>> <dongli.zhang@oracle.com>; linux-kernel@vger.kernel.org; linux-block@vger.kernel.org; xen-
>> devel@lists.xenproject.org
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
>>
>> On 27.01.2021 12:09, Paul Durrant wrote:
>>> In a backend we shouldn't be relying on, nor really expect IMO, a frontend to clean up after itself. Any backend should know *exactly* what xenstore nodes it’s looking for from a frontend.
>>
>> But the backend can't know whether a node exists because the present
>> frontend has written it, or because an earlier instance forgot to
>> delete it. It can only honor what's there. (In fact the other day I
>> was wondering whether some of the writes of boolean "false" nodes
>> wouldn't better be xenbus_rm() instead.)
> 
> In the particular case this patch is fixing for me, the frontends are the Windows XENVBD driver and the Windows crash version of the same driver (actually built from different code). The 'normal' instance is multi-page aware and the crash instance is not quite, i.e. it uses the old ring-ref but knows to clean up 'ring-page-order'.
> Clearly, in a crash situation, we cannot rely on frontend to clean up

Ah, I see (and agree).

> so what you say does highlight that there indeed needs to be a second patch to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' cycles through Closed and back to InitWait.

And not just this one node then, I suppose?

> I think this patch does still stand on its own though.

Perhaps, yes.

Jan
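
The follow-up idea discussed above (the backend itself dropping ring-related nodes when 'state' cycles back to InitWait) might look roughly like the sketch below. It is purely illustrative: the in-memory node table, node_exists(), node_rm(), backend_forget_ring() and the choice of keys in ring_keys[] are all assumptions made here, and no such code is part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the frontend's xenstore directory. */
struct node {
	const char *key;
	bool present;
};

static bool node_exists(const struct node *s, size_t n, const char *key)
{
	size_t i;

	assert(key != NULL);
	for (i = 0; i < n; i++)
		if (s[i].present && strcmp(s[i].key, key) == 0)
			return true;
	return false;
}

/* Marks a key as removed; a missing key is silently ignored. */
static void node_rm(struct node *s, size_t n, const char *key)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(s[i].key, key) == 0)
			s[i].present = false;
}

/* Keys chosen for illustration only; the thread does not fix the list. */
static const char *const ring_keys[] = {
	"ring-page-order", "ring-ref", "ring-ref0", "ring-ref1",
};

/* Would run when the device goes through Closed and back to InitWait. */
static void backend_forget_ring(struct node *s, size_t n)
{
	size_t i;

	for (i = 0; i < sizeof(ring_keys) / sizeof(ring_keys[0]); i++)
		node_rm(s, n, ring_keys[i]);
}
```

After backend_forget_ring() a later frontend instance starts from a directory with no ring keys, so whatever the backend reads next must have been written by that instance.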
Dongli Zhang Jan. 27, 2021, 7:57 p.m. UTC | #6
On 1/27/21 2:30 AM, Paul Durrant wrote:
> diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
> index b0c71d3a81a0..524a79f10de6 100644
> --- a/drivers/block/xen-blkback/common.h
> +++ b/drivers/block/xen-blkback/common.h
> @@ -313,6 +313,7 @@ struct xen_blkif {
>  
>  	struct work_struct	free_work;
>  	unsigned int 		nr_ring_pages;
> +	bool                    multi_ref;
>  	/* All rings for this device. */
>  	struct xen_blkif_ring	*rings;
>  	unsigned int		nr_rings;
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index 9860d4842f36..4c1541cde68c 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -998,10 +998,15 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
>  	for (i = 0; i < nr_grefs; i++) {
>  		char ring_ref_name[RINGREF_NAME_LEN];
>  
> -		snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> +		if (blkif->multi_ref)
> +			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> +		else {
> +			WARN_ON(i != 0);
> +			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> +		}
> +
>  		err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
>  				   "%u", &ring_ref[i]);
> -
>  		if (err != 1) {
>  			if (nr_grefs == 1)
>  				break;

I think we should not simply break here, because the read can also fail when
nr_grefs == 1 and we are reading the legacy "ring-ref" node.

Should we do something as below?

err = -EINVAL;
xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
return err;

Dongli Zhang


> @@ -1013,18 +1018,6 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
>  		}
>  	}
>  
> -	if (err != 1) {
> -		WARN_ON(nr_grefs != 1);
> -
> -		err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
> -				   &ring_ref[0]);
> -		if (err != 1) {
> -			err = -EINVAL;
> -			xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> -			return err;
> -		}
> -	}
> -
>  	err = -ENOMEM;
>  	for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
>  		req = kzalloc(sizeof(*req), GFP_KERNEL);
> @@ -1129,10 +1122,15 @@ static int connect_ring(struct backend_info *be)
>  		 blkif->nr_rings, blkif->blk_protocol, protocol,
>  		 blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
>  
> -	ring_page_order = xenbus_read_unsigned(dev->otherend,
> -					       "ring-page-order", 0);
> -
> -	if (ring_page_order > xen_blkif_max_ring_order) {
> +	err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
> +			   &ring_page_order);
> +	if (err != 1) {
> +		blkif->nr_ring_pages = 1;
> +		blkif->multi_ref = false;
> +	} else if (ring_page_order <= xen_blkif_max_ring_order) {
> +		blkif->nr_ring_pages = 1 << ring_page_order;
> +		blkif->multi_ref = true;
> +	} else {
>  		err = -EINVAL;
>  		xenbus_dev_fatal(dev, err,
>  				 "requested ring page order %d exceed max:%d",
> @@ -1141,8 +1139,6 @@ static int connect_ring(struct backend_info *be)
>  		return err;
>  	}
>  
> -	blkif->nr_ring_pages = 1 << ring_page_order;
> -
>  	if (blkif->nr_rings == 1)
>  		return read_per_ring_refs(&blkif->rings[0], dev->otherend);
>  	else {
>
Paul Durrant Jan. 28, 2021, 8:30 a.m. UTC | #7
> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Dongli Zhang
> Sent: 27 January 2021 19:57
> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org; linux-block@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Cc: Paul Durrant <pdurrant@amazon.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Jens Axboe <axboe@kernel.dk>
> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page rings
> 
> 
> 
> On 1/27/21 2:30 AM, Paul Durrant wrote:
> > diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
> > index b0c71d3a81a0..524a79f10de6 100644
> > --- a/drivers/block/xen-blkback/common.h
> > +++ b/drivers/block/xen-blkback/common.h
> > @@ -313,6 +313,7 @@ struct xen_blkif {
> >
> >  	struct work_struct	free_work;
> >  	unsigned int 		nr_ring_pages;
> > +	bool                    multi_ref;
> >  	/* All rings for this device. */
> >  	struct xen_blkif_ring	*rings;
> >  	unsigned int		nr_rings;
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > index 9860d4842f36..4c1541cde68c 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -998,10 +998,15 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
> >  	for (i = 0; i < nr_grefs; i++) {
> >  		char ring_ref_name[RINGREF_NAME_LEN];
> >
> > -		snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> > +		if (blkif->multi_ref)
> > +			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
> > +		else {
> > +			WARN_ON(i != 0);
> > +			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
> > +		}
> > +
> >  		err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
> >  				   "%u", &ring_ref[i]);
> > -
> >  		if (err != 1) {
> >  			if (nr_grefs == 1)
> >  				break;
> 
> I think we should not simply break here, because the read can also fail when
> nr_grefs == 1 and we are reading the legacy "ring-ref" node.
> 

Yes, you're quite right. This special case is no longer correct.

> Should we do something as below?
> 
> err = -EINVAL;
> xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> return err;
> 

I think simply removing the 'if (nr_grefs == 1)' will be sufficient.

  Paul

> Dongli Zhang
> 
> 
> > @@ -1013,18 +1018,6 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
> >  		}
> >  	}
> >
> > -	if (err != 1) {
> > -		WARN_ON(nr_grefs != 1);
> > -
> > -		err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
> > -				   &ring_ref[0]);
> > -		if (err != 1) {
> > -			err = -EINVAL;
> > -			xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
> > -			return err;
> > -		}
> > -	}
> > -
> >  	err = -ENOMEM;
> >  	for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
> >  		req = kzalloc(sizeof(*req), GFP_KERNEL);
> > @@ -1129,10 +1122,15 @@ static int connect_ring(struct backend_info *be)
> >  		 blkif->nr_rings, blkif->blk_protocol, protocol,
> >  		 blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
> >
> > -	ring_page_order = xenbus_read_unsigned(dev->otherend,
> > -					       "ring-page-order", 0);
> > -
> > -	if (ring_page_order > xen_blkif_max_ring_order) {
> > +	err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
> > +			   &ring_page_order);
> > +	if (err != 1) {
> > +		blkif->nr_ring_pages = 1;
> > +		blkif->multi_ref = false;
> > +	} else if (ring_page_order <= xen_blkif_max_ring_order) {
> > +		blkif->nr_ring_pages = 1 << ring_page_order;
> > +		blkif->multi_ref = true;
> > +	} else {
> >  		err = -EINVAL;
> >  		xenbus_dev_fatal(dev, err,
> >  				 "requested ring page order %d exceed max:%d",
> > @@ -1141,8 +1139,6 @@ static int connect_ring(struct backend_info *be)
> >  		return err;
> >  	}
> >
> > -	blkif->nr_ring_pages = 1 << ring_page_order;
> > -
> >  	if (blkif->nr_rings == 1)
> >  		return read_per_ring_refs(&blkif->rings[0], dev->otherend);
> >  	else {
> >
Patch

diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index b0c71d3a81a0..524a79f10de6 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -313,6 +313,7 @@  struct xen_blkif {
 
 	struct work_struct	free_work;
 	unsigned int 		nr_ring_pages;
+	bool                    multi_ref;
 	/* All rings for this device. */
 	struct xen_blkif_ring	*rings;
 	unsigned int		nr_rings;
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 9860d4842f36..4c1541cde68c 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -998,10 +998,15 @@  static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
 	for (i = 0; i < nr_grefs; i++) {
 		char ring_ref_name[RINGREF_NAME_LEN];
 
-		snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
+		if (blkif->multi_ref)
+			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i);
+		else {
+			WARN_ON(i != 0);
+			snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref");
+		}
+
 		err = xenbus_scanf(XBT_NIL, dir, ring_ref_name,
 				   "%u", &ring_ref[i]);
-
 		if (err != 1) {
 			if (nr_grefs == 1)
 				break;
@@ -1013,18 +1018,6 @@  static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir)
 		}
 	}
 
-	if (err != 1) {
-		WARN_ON(nr_grefs != 1);
-
-		err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u",
-				   &ring_ref[0]);
-		if (err != 1) {
-			err = -EINVAL;
-			xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir);
-			return err;
-		}
-	}
-
 	err = -ENOMEM;
 	for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) {
 		req = kzalloc(sizeof(*req), GFP_KERNEL);
@@ -1129,10 +1122,15 @@  static int connect_ring(struct backend_info *be)
 		 blkif->nr_rings, blkif->blk_protocol, protocol,
 		 blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
 
-	ring_page_order = xenbus_read_unsigned(dev->otherend,
-					       "ring-page-order", 0);
-
-	if (ring_page_order > xen_blkif_max_ring_order) {
+	err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u",
+			   &ring_page_order);
+	if (err != 1) {
+		blkif->nr_ring_pages = 1;
+		blkif->multi_ref = false;
+	} else if (ring_page_order <= xen_blkif_max_ring_order) {
+		blkif->nr_ring_pages = 1 << ring_page_order;
+		blkif->multi_ref = true;
+	} else {
 		err = -EINVAL;
 		xenbus_dev_fatal(dev, err,
 				 "requested ring page order %d exceed max:%d",
@@ -1141,8 +1139,6 @@  static int connect_ring(struct backend_info *be)
 		return err;
 	}
 
-	blkif->nr_ring_pages = 1 << ring_page_order;
-
 	if (blkif->nr_rings == 1)
 		return read_per_ring_refs(&blkif->rings[0], dev->otherend);
 	else {
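
The three-way decision made by the connect_ring() hunk above can be exercised in isolation. The following is a minimal sketch, assuming a reduced struct ring_cfg in place of struct xen_blkif and a hypothetical helper classify_ring(); the 'found' argument plays the role of the xenbus_scanf() return value and max_order stands in for xen_blkif_max_ring_order.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of struct xen_blkif: only the two fields set here. */
struct ring_cfg {
	unsigned int nr_ring_pages;
	bool multi_ref;
};

/*
 * found:     1 when 'ring-page-order' was read successfully.
 * order:     the value read (ignored when found != 1).
 * max_order: stand-in for xen_blkif_max_ring_order.
 */
static int classify_ring(int found, unsigned int order, unsigned int max_order,
			 struct ring_cfg *cfg)
{
	assert(cfg != NULL);

	if (found != 1) {
		/* Node absent: legacy single-page ring named 'ring-ref'. */
		cfg->nr_ring_pages = 1;
		cfg->multi_ref = false;
	} else if (order <= max_order) {
		/* Node present: 'ring-ref%u' naming, even for order 0. */
		cfg->nr_ring_pages = 1u << order;
		cfg->multi_ref = true;
	} else {
		return -EINVAL;
	}
	return 0;
}
```

Note that an explicit 'ring-page-order' of 0 still selects the 'ring-ref%u' naming; only the absence of the node selects the legacy 'ring-ref' key.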