pv-grub guest booting fail with recent qemu-xen

Message ID B8376D2DEA074F45BA033984477C453E03440296@shsmsx102.ccr.corp.intel.com (mailing list archive)
State New, archived

Commit Message

Hao, Xudong March 30, 2016, 2:05 a.m. UTC
> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: Wednesday, March 30, 2016 12:58 AM
> To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Hao, Xudong <xudong.hao@intel.com>; wei.liu2@citrix.com;
> samuel.thibault@ens-lyon.org; stefano.stabellini@eu.citrix.com; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] pv-grub guest booting fail with recent qemu-xen
> 
> On Mon, Mar 28, 2016 at 09:21:14AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Mon, Mar 28, 2016 at 02:03:35AM +0000, Hao, Xudong wrote:
> > > > -----Original Message-----
> > > > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> > > > Of Konrad Rzeszutek Wilk
> > > > Sent: Saturday, March 26, 2016 2:58 AM
> > > > To: Hao, Xudong <xudong.hao@intel.com>
> > > > Cc: stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org
> > > > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent
> > > > qemu-xen
> > > >
> > > > On Wed, Mar 02, 2016 at 07:16:40AM +0000, Hao, Xudong wrote:
> > > > > Hi,
> > > > > For Xen upstream master branch with commit 1949868d, After
> > > > > updating qemu-
> > > > xen version from fcf6ac57 to 2ce1d30e, booting a pv-grub guest will fail.
> >
> 
> pv-grub should be using qemu-traditional, not qemu-xen
> 

I have never heard of this limitation.

> The log message you posted in your original post doesn't seem to reveal much.
> Can you have a look at relevant QEMU logs under /var/log/xen?

There is no useful QEMU log, only one line: "qemu: terminating on signal 1 from pid 36642".

I bisected qemu-xen, and the bad commit is:

commit 2ce1d30ef2858dfed72a281872579e5a26b090dd
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date:   Wed Jan 6 16:32:22 2016 +0000

    xenfb.c: avoid expensive loops when prod <= out_cons

    If the frontend sets out_cons to a value higher than out_prod, it will
    cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
    better checks at the beginning of the function.

    upstream-commit-id: ac0487e1d2ae811cd4d035741a109a4ecfb013f1

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Reported-by: Ling Liu <liuling-it@360.cn>
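
For illustration only, here is a minimal sketch (not QEMU code) of why the event loop runs about 2^32 times when out_cons is ahead of out_prod. It assumes both indices are 32-bit unsigned counters, as the "2^32" in the commit message suggests; the helper name xenfb_loop_iterations is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not part of QEMU: returns how many times a loop of
 * the form
 *     for (cons = out_cons; cons != prod; cons++)
 * executes when both indices are 32-bit unsigned counters. */
static uint32_t xenfb_loop_iterations(uint32_t prod, uint32_t out_cons)
{
    /* Unsigned subtraction wraps modulo 2^32, just as cons++ does, so the
     * difference is exactly the number of increments needed to reach prod. */
    return (uint32_t)(prod - out_cons);
}
```

With prod = 3 and out_cons = 5 the helper returns 4294967294, i.e. the loop would have to wrap almost all the way around before cons meets prod again.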

Comments

Konrad Rzeszutek Wilk April 1, 2016, 3:54 p.m. UTC | #1
On Wed, Mar 30, 2016 at 02:05:28AM +0000, Hao, Xudong wrote:
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > Sent: Wednesday, March 30, 2016 12:58 AM
> > To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Hao, Xudong <xudong.hao@intel.com>; wei.liu2@citrix.com;
> > samuel.thibault@ens-lyon.org; stefano.stabellini@eu.citrix.com; xen-
> > devel@lists.xen.org
> > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent qemu-xen
> > 
> > On Mon, Mar 28, 2016 at 09:21:14AM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Mon, Mar 28, 2016 at 02:03:35AM +0000, Hao, Xudong wrote:
> > > > > -----Original Message-----
> > > > > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf
> > > > > Of Konrad Rzeszutek Wilk
> > > > > Sent: Saturday, March 26, 2016 2:58 AM
> > > > > To: Hao, Xudong <xudong.hao@intel.com>
> > > > > Cc: stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org
> > > > > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent
> > > > > qemu-xen
> > > > >
> > > > > On Wed, Mar 02, 2016 at 07:16:40AM +0000, Hao, Xudong wrote:
> > > > > > Hi,
> > > > > > For Xen upstream master branch with commit 1949868d, After
> > > > > > updating qemu-
> > > > > xen version from fcf6ac57 to 2ce1d30e, booting a pv-grub guest will fail.
> > >
> > 
> > pv-grub should be using qemu-traditional, not qemu-xen
> > 
> 
> Never hear this limitation.
> 
> > The log message you posted in your original post doesn't seem to reveal much.
> > Can you have a look at relevant QEMU logs under /var/log/xen?
> 
> There is not valuable qemu log, only one line: "qemu: terminating on signal 1 from pid 36642".
> 
> Bisect and the bad commit of qemu-xen is:

If you use PV-grub without the framebuffer does it boot?


> 
> commit 2ce1d30ef2858dfed72a281872579e5a26b090dd
> Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Date:   Wed Jan 6 16:32:22 2016 +0000
> 
>     xenfb.c: avoid expensive loops when prod <= out_cons
> 
>     If the frontend sets out_cons to a value higher than out_prod, it will
>     cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
>     better checks at the beginning of the function.
> 
>     upstream-commit-id: ac0487e1d2ae811cd4d035741a109a4ecfb013f1
> 
>     Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>     Reported-by: Ling Liu <liuling-it@360.cn>
> 
> diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
> index 4e2a27a..8eb3046 100644
> --- a/hw/display/xenfb.c
> +++ b/hw/display/xenfb.c
> @@ -789,8 +789,9 @@ static void xenfb_handle_events(struct XenFB *xenfb)
> 
>      prod = page->out_prod;
>      out_cons = page->out_cons;
> -    if (prod == out_cons)
> -       return;
> +    if (prod - out_cons >= XENFB_OUT_RING_LEN) {
> +        return;
> +    }
>      xen_rmb();         /* ensure we see ring contents up to prod */
>      for (cons = out_cons; cons != prod; cons++) {
>         union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons);
>
Hao, Xudong April 5, 2016, 1:26 a.m. UTC | #2
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad
> Rzeszutek Wilk
> Sent: Friday, April 1, 2016 11:55 PM
> To: Hao, Xudong <xudong.hao@intel.com>
> Cc: samuel.thibault@ens-lyon.org; xen-devel@lists.xen.org; Wei Liu
> <wei.liu2@citrix.com>; stefano.stabellini@eu.citrix.com
> Subject: Re: [Xen-devel] pv-grub guest booting fail with recent qemu-xen
> 
> On Wed, Mar 30, 2016 at 02:05:28AM +0000, Hao, Xudong wrote:
> > > -----Original Message-----
> > > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > Sent: Wednesday, March 30, 2016 12:58 AM
> > > To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > Cc: Hao, Xudong <xudong.hao@intel.com>; wei.liu2@citrix.com;
> > > samuel.thibault@ens-lyon.org; stefano.stabellini@eu.citrix.com; xen-
> > > devel@lists.xen.org
> > > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent
> > > qemu-xen
> > >
> > > On Mon, Mar 28, 2016 at 09:21:14AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Mon, Mar 28, 2016 at 02:03:35AM +0000, Hao, Xudong wrote:
> > > > > > -----Original Message-----
> > > > > > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On
> > > > > > Behalf Of Konrad Rzeszutek Wilk
> > > > > > Sent: Saturday, March 26, 2016 2:58 AM
> > > > > > To: Hao, Xudong <xudong.hao@intel.com>
> > > > > > Cc: stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org
> > > > > > Subject: Re: [Xen-devel] pv-grub guest booting fail with
> > > > > > recent qemu-xen
> > > > > >
> > > > > > On Wed, Mar 02, 2016 at 07:16:40AM +0000, Hao, Xudong wrote:
> > > > > > > Hi,
> > > > > > > For Xen upstream master branch with commit 1949868d, After
> > > > > > > updating qemu-
> > > > > > xen version from fcf6ac57 to 2ce1d30e, booting a pv-grub guest will
> fail.
> > > >
> > >
> > > pv-grub should be using qemu-traditional, not qemu-xen
> > >
> >
> > Never hear this limitation.
> >
> > > The log message you posted in your original post doesn't seem to reveal
> much.
> > > Can you have a look at relevant QEMU logs under /var/log/xen?
> >
> > There is not valuable qemu log, only one line: "qemu: terminating on signal 1
> from pid 36642".
> >
> > Bisect and the bad commit of qemu-xen is:
> 
> If you use PV-grub without the framebuffer does it boot?

How do I disable the framebuffer when booting the VM?
With this patch reverted, the PV-grub guest boots successfully.

> 
> >
> > commit 2ce1d30ef2858dfed72a281872579e5a26b090dd
> > Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Date:   Wed Jan 6 16:32:22 2016 +0000
> >
> >     xenfb.c: avoid expensive loops when prod <= out_cons
> >
> >     If the frontend sets out_cons to a value higher than out_prod, it will
> >     cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
> >     better checks at the beginning of the function.
> >
> >     upstream-commit-id: ac0487e1d2ae811cd4d035741a109a4ecfb013f1
> >
> >     Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >     Reported-by: Ling Liu <liuling-it@360.cn>
> >
> > diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c index
> > 4e2a27a..8eb3046 100644
> > --- a/hw/display/xenfb.c
> > +++ b/hw/display/xenfb.c
> > @@ -789,8 +789,9 @@ static void xenfb_handle_events(struct XenFB
> > *xenfb)
> >
> >      prod = page->out_prod;
> >      out_cons = page->out_cons;
> > -    if (prod == out_cons)
> > -       return;
> > +    if (prod - out_cons >= XENFB_OUT_RING_LEN) {
> > +        return;
> > +    }
> >      xen_rmb();         /* ensure we see ring contents up to prod */
> >      for (cons = out_cons; cons != prod; cons++) {
> >         union xenfb_out_event *event = &XENFB_OUT_RING_REF(page,
> > cons);
> >
Wei Liu April 5, 2016, 3:41 p.m. UTC | #3
On Tue, Apr 05, 2016 at 01:26:47AM +0000, Hao, Xudong wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Konrad
> > Rzeszutek Wilk
> > Sent: Friday, April 1, 2016 11:55 PM
> > To: Hao, Xudong <xudong.hao@intel.com>
> > Cc: samuel.thibault@ens-lyon.org; xen-devel@lists.xen.org; Wei Liu
> > <wei.liu2@citrix.com>; stefano.stabellini@eu.citrix.com
> > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent qemu-xen
> > 
> > On Wed, Mar 30, 2016 at 02:05:28AM +0000, Hao, Xudong wrote:
> > > > -----Original Message-----
> > > > From: Wei Liu [mailto:wei.liu2@citrix.com]
> > > > Sent: Wednesday, March 30, 2016 12:58 AM
> > > > To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > Cc: Hao, Xudong <xudong.hao@intel.com>; wei.liu2@citrix.com;
> > > > samuel.thibault@ens-lyon.org; stefano.stabellini@eu.citrix.com; xen-
> > > > devel@lists.xen.org
> > > > Subject: Re: [Xen-devel] pv-grub guest booting fail with recent
> > > > qemu-xen
> > > >
> > > > On Mon, Mar 28, 2016 at 09:21:14AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > On Mon, Mar 28, 2016 at 02:03:35AM +0000, Hao, Xudong wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On
> > > > > > > Behalf Of Konrad Rzeszutek Wilk
> > > > > > > Sent: Saturday, March 26, 2016 2:58 AM
> > > > > > > To: Hao, Xudong <xudong.hao@intel.com>
> > > > > > > Cc: stefano.stabellini@eu.citrix.com; xen-devel@lists.xen.org
> > > > > > > Subject: Re: [Xen-devel] pv-grub guest booting fail with
> > > > > > > recent qemu-xen
> > > > > > >
> > > > > > > On Wed, Mar 02, 2016 at 07:16:40AM +0000, Hao, Xudong wrote:
> > > > > > > > Hi,
> > > > > > > > For Xen upstream master branch with commit 1949868d, After
> > > > > > > > updating qemu-
> > > > > > > xen version from fcf6ac57 to 2ce1d30e, booting a pv-grub guest will
> > fail.
> > > > >
> > > >
> > > > pv-grub should be using qemu-traditional, not qemu-xen
> > > >
> > >
> > > Never hear this limitation.
> > >
> > > > The log message you posted in your original post doesn't seem to reveal
> > much.
> > > > Can you have a look at relevant QEMU logs under /var/log/xen?
> > >
> > > There is not valuable qemu log, only one line: "qemu: terminating on signal 1
> > from pid 36642".
> > >
> > > Bisect and the bad commit of qemu-xen is:
> > 
> > If you use PV-grub without the framebuffer does it boot?
> 
> How to disable framebuffer when VM booting.
> Reverted this patch, PV-grub guest boot successfully.
> 

It could be the frontend that's buggy. I think mini-os's fbfront might
aggressively overwrite prod without checking whether there is space
available.

We should probably fix mini-os. On the other hand, can this check in
QEMU be improved a bit so that it accommodates a buggy frontend? Anthony?

Wei.


> > 
> > 
> > >
> > > commit 2ce1d30ef2858dfed72a281872579e5a26b090dd
> > > Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > Date:   Wed Jan 6 16:32:22 2016 +0000
> > >
> > >     xenfb.c: avoid expensive loops when prod <= out_cons
> > >
> > >     If the frontend sets out_cons to a value higher than out_prod, it will
> > >     cause xenfb_handle_events to loop about 2^32 times. Avoid that by using
> > >     better checks at the beginning of the function.
> > >
> > >     upstream-commit-id: ac0487e1d2ae811cd4d035741a109a4ecfb013f1
> > >
> > >     Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > >     Reported-by: Ling Liu <liuling-it@360.cn>
> > >
> > > diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c index
> > > 4e2a27a..8eb3046 100644
> > > --- a/hw/display/xenfb.c
> > > +++ b/hw/display/xenfb.c
> > > @@ -789,8 +789,9 @@ static void xenfb_handle_events(struct XenFB
> > > *xenfb)
> > >
> > >      prod = page->out_prod;
> > >      out_cons = page->out_cons;
> > > -    if (prod == out_cons)
> > > -       return;
> > > +    if (prod - out_cons >= XENFB_OUT_RING_LEN) {
> > > +        return;
> > > +    }
> > >      xen_rmb();         /* ensure we see ring contents up to prod */
> > >      for (cons = out_cons; cons != prod; cons++) {
> > >         union xenfb_out_event *event = &XENFB_OUT_RING_REF(page,
> > > cons);
> > >
> > 
Samuel Thibault April 10, 2016, 8:14 p.m. UTC | #4
Hello,

> > > > +    if (prod - out_cons >= XENFB_OUT_RING_LEN) {
> > > > +        return;
> > > > +    }

This test seems overzealous to me: AIUI, the producer can produce
XENFB_OUT_RING_LEN events, and thus prod - out_cons is exactly
XENFB_OUT_RING_LEN, i.e. there is no room left at all.

The frontend part is:

   while (page->out_prod - page->out_cons == XENFB_OUT_RING_LEN)
        schedule();

I.e. it waits while the buffer is exactly full.

So it seems to me the bug is at the backend side.

Samuel
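
To make the argument above concrete, here is a small sketch (not QEMU or mini-os code) that models the three conditions side by side. RING_LEN is an assumed illustrative capacity standing in for XENFB_OUT_RING_LEN, and all three function names are hypothetical. The "relaxed" variant is only one possible way to loosen the check; it is not taken from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ring capacity, for illustration only; the real value is
 * XENFB_OUT_RING_LEN from the Xen framebuffer interface header. */
#define RING_LEN 51u

/* Backend check as written in the patch: return early (consume nothing)
 * when prod - out_cons >= RING_LEN. */
static bool backend_skips_patched(uint32_t prod, uint32_t out_cons)
{
    return (uint32_t)(prod - out_cons) >= RING_LEN;
}

/* Hypothetical relaxed check: return early only when more events appear
 * pending than the ring can hold, which a well-behaved frontend that
 * waits on a full ring never causes. */
static bool backend_skips_relaxed(uint32_t prod, uint32_t out_cons)
{
    return (uint32_t)(prod - out_cons) > RING_LEN;
}

/* Frontend wait condition quoted above: block while exactly full. */
static bool frontend_waits(uint32_t prod, uint32_t out_cons)
{
    return (uint32_t)(prod - out_cons) == RING_LEN;
}
```

With a completely full ring (prod - out_cons equal to RING_LEN), backend_skips_patched and frontend_waits are both true, so neither side makes progress. backend_skips_relaxed is false in that case, yet it still rejects the wrapped case where out_cons is ahead of prod.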

Patch

diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 4e2a27a..8eb3046 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -789,8 +789,9 @@  static void xenfb_handle_events(struct XenFB *xenfb)

     prod = page->out_prod;
     out_cons = page->out_cons;
-    if (prod == out_cons)
-       return;
+    if (prod - out_cons >= XENFB_OUT_RING_LEN) {
+        return;
+    }
     xen_rmb();         /* ensure we see ring contents up to prod */
     for (cons = out_cons; cons != prod; cons++) {
        union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons);