diff mbox

virtio balloon: do not call blocking ops when !TASK_RUNNING

Message ID 87d24pjnkx.fsf@rustcorp.com.au (mailing list archive)
State New, archived
Headers show

Commit Message

Rusty Russell March 4, 2015, 6:14 a.m. UTC
"Michael S. Tsirkin" <mst@redhat.com> writes:
> On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
>> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
>> > On Thu, 26 Feb 2015 11:50:42 +1030
>> > Rusty Russell <rusty@rustcorp.com.au> wrote:
>> >
>> >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
>> >> >  Hi all,
>> >> >
>> >> > with the recent kernel 3.19, I get a kernel warning when I start my
>> >> > KVM guest on s390 with virtio balloon enabled:
>> >> 
>> >> The deeper problem is that virtio_ccw_get_config just silently fails on
>> >> OOM.
>> >> 
>> >> Neither get_config nor set_config are expected to fail.
>> >
>> > AFAIK this is currently not a problem. According to
>> > http://lwn.net/Articles/627419/ these kmalloc calls never
>> > fail because they allocate less than a page.
>> 
>> I strongly suggest you unlearn that fact.
>> The fix for this is in two parts:
>> 
>> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
>>    a few times in low memory situations, but this isn't a high
>>    performance path.
>> 
>> 2) Handle get_config (and other) failure in some more elegant way.
>> 
>> Cheers,
>> Rusty.
>
> I agree, but I'd like to point out that even without kmalloc,
> on s390 get_config is blocking - it's waiting
> for a hardware interrupt.
>
> And it makes sense: config is not data path, I don't think
> we should spin there.
>
> So I think besides these two parts, we still need my two patches:
>     virtio-balloon: do not call blocking ops when !TASK_RUNNING

I prefer to annotate, over trying to fix this.

Because it's not important.  We might spin a few times, but it's very
unlikely, and it's certainly not performance critical.

Thanks,
Rusty.

Subject: virtio_balloon: annotate possible sleep waiting for event.

CCW (s390) does this.

Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael S. Tsirkin March 4, 2015, 10:25 a.m. UTC | #1
On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
> >> > On Thu, 26 Feb 2015 11:50:42 +1030
> >> > Rusty Russell <rusty@rustcorp.com.au> wrote:
> >> >
> >> >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
> >> >> >  Hi all,
> >> >> >
> >> >> > with the recent kernel 3.19, I get a kernel warning when I start my
> >> >> > KVM guest on s390 with virtio balloon enabled:
> >> >> 
> >> >> The deeper problem is that virtio_ccw_get_config just silently fails on
> >> >> OOM.
> >> >> 
> >> >> Neither get_config nor set_config are expected to fail.
> >> >
> >> > AFAIK this is currently not a problem. According to
> >> > http://lwn.net/Articles/627419/ these kmalloc calls never
> >> > fail because they allocate less than a page.
> >> 
> >> I strongly suggest you unlearn that fact.
> >> The fix for this is in two parts:
> >> 
> >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> >>    a few times in low memory situations, but this isn't a high
> >>    performance path.
> >> 
> >> 2) Handle get_config (and other) failure in some more elegant way.
> >> 
> >> Cheers,
> >> Rusty.
> >
> > I agree, but I'd like to point out that even without kmalloc,
> > on s390 get_config is blocking - it's waiting
> > for a hardware interrupt.
> >
> > And it makes sense: config is not data path, I don't think
> > we should spin there.
> >
> > So I think besides these two parts, we still need my two patches:
> >     virtio-balloon: do not call blocking ops when !TASK_RUNNING
> 
> I prefer to annotate, over trying to fix this.
> 
> Because it's not important.  We might spin a few times, but it's very
> unlikely, and it's certainly not performance critical.
> 
> Thanks,
> Rusty.
> 
> Subject: virtio_balloon: annotate possible sleep waiting for event.
> 
> CCW (s390) does this.
> 
> Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 0413157f3b49..3f4d5acdbde0 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -340,6 +340,15 @@ static int balloon(void *_vballoon)
>  		s64 diff;
>  
>  		try_to_freeze();
> +
> +		/*
> +		 * Reading the config on the ccw backend involves an
> +		 * allocation, so we may actually sleep and have an
> +		 * extra iteration.  It's extremely unlikely,

Hmm, this part of the comment seems wrong to me.
Reading the config on the ccw backend always sleeps
because it's interrupt driven.

This is the relevant code:

static int ccw_io_helper(struct virtio_ccw_device *vcdev,
                         struct ccw1 *ccw, __u32 intparm)
{
        int ret;
        unsigned long flags;
        int flag = intparm & VIRTIO_CCW_INTPARM_MASK;

        do {
                spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags);
                ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0);
                if (!ret) {
                        if (!vcdev->curr_io)
                                vcdev->err = 0;
                        vcdev->curr_io |= flag;
                }
                spin_unlock_irqrestore(get_ccwdev_lock(vcdev->cdev), flags);
                cpu_relax();
        } while (ret == -EBUSY);
        wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0);
        return ret ? ret : vcdev->err;
}


>		 and this
> +		 * isn't a fast path in any sense.
> +		 */
> +		sched_annotate_sleep();
> +
>  		wait_event_interruptible(vb->config_change,
>  					 (diff = towards_target(vb)) != 0
>  					 || vb->need_stats_update


So the wait_event_interruptible always calls wait_event
which then becomes a busy wait on s390, which is not nice.

So I suspect
http://mid.gmane.org/1424874878-17155-1-git-send-email-mst@redhat.com
is better.

What do you think?
Cornelia Huck March 6, 2015, 11:56 a.m. UTC | #2
On Wed, 4 Mar 2015 11:25:56 +0100
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> writes:
> > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> > >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
> > >> > On Thu, 26 Feb 2015 11:50:42 +1030
> > >> > Rusty Russell <rusty@rustcorp.com.au> wrote:
> > >> >
> > >> >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
> > >> >> >  Hi all,
> > >> >> >
> > >> >> > with the recent kernel 3.19, I get a kernel warning when I start my
> > >> >> > KVM guest on s390 with virtio balloon enabled:
> > >> >> 
> > >> >> The deeper problem is that virtio_ccw_get_config just silently fails on
> > >> >> OOM.
> > >> >> 
> > >> >> Neither get_config nor set_config are expected to fail.
> > >> >
> > >> > AFAIK this is currently not a problem. According to
> > >> > http://lwn.net/Articles/627419/ these kmalloc calls never
> > >> > fail because they allocate less than a page.
> > >> 
> > >> I strongly suggest you unlearn that fact.
> > >> The fix for this is in two parts:
> > >> 
> > >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> > >>    a few times in low memory situations, but this isn't a high
> > >>    performance path.
> > >> 
> > >> 2) Handle get_config (and other) failure in some more elegant way.
> > >> 
> > >> Cheers,
> > >> Rusty.
> > >
> > > I agree, but I'd like to point out that even without kmalloc,
> > > on s390 get_config is blocking - it's waiting
> > > for a hardware interrupt.
> > >
> > > And it makes sense: config is not data path, I don't think
> > > we should spin there.
> > >
> > > So I think besides these two parts, we still need my two patches:
> > >     virtio-balloon: do not call blocking ops when !TASK_RUNNING
> > 
> > I prefer to annotate, over trying to fix this.
> > 
> > Because it's not important.  We might spin a few times, but it's very
> > unlikely, and it's certainly not performance critical.
> > 
> > Thanks,
> > Rusty.
> > 
> > Subject: virtio_balloon: annotate possible sleep waiting for event.
> > 
> > CCW (s390) does this.
> > 
> > Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > 
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index 0413157f3b49..3f4d5acdbde0 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -340,6 +340,15 @@ static int balloon(void *_vballoon)
> >  		s64 diff;
> >  
> >  		try_to_freeze();
> > +
> > +		/*
> > +		 * Reading the config on the ccw backend involves an
> > +		 * allocation, so we may actually sleep and have an
> > +		 * extra iteration.  It's extremely unlikely,
> 
> Hmm, this part of the comment seems wrong to me.
> Reading the config on the ccw backend always sleeps
> because it's interrupt driven.

(...)

> So I suspect
> http://mid.gmane.org/1424874878-17155-1-git-send-email-mst@redhat.com
> is better.
> 
> What do you think?

I'd prefer to fix this as well. While the I/O request completes
instantly on current qemu (the ssch backend handles the start function
immediately, not asynchronously as on real hardware), this (a) is an
implementation detail that may change and (b) doesn't account for the
need to deliver the interrupt to the guest - which might take non-zero
time.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rusty Russell March 10, 2015, 1:26 a.m. UTC | #3
Cornelia Huck <cornelia.huck@de.ibm.com> writes:
> On Wed, 4 Mar 2015 11:25:56 +0100
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
>> On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote:
>> > "Michael S. Tsirkin" <mst@redhat.com> writes:
>> > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
>> > >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
>> > >> > On Thu, 26 Feb 2015 11:50:42 +1030
>> > >> > Rusty Russell <rusty@rustcorp.com.au> wrote:
>> > >> >
>> > >> >> Thomas Huth <thuth@linux.vnet.ibm.com> writes:
>> > >> >> >  Hi all,
>> > >> >> >
>> > >> >> > with the recent kernel 3.19, I get a kernel warning when I start my
>> > >> >> > KVM guest on s390 with virtio balloon enabled:
>> > >> >> 
>> > >> >> The deeper problem is that virtio_ccw_get_config just silently fails on
>> > >> >> OOM.
>> > >> >> 
>> > >> >> Neither get_config nor set_config are expected to fail.
>> > >> >
>> > >> > AFAIK this is currently not a problem. According to
>> > >> > http://lwn.net/Articles/627419/ these kmalloc calls never
>> > >> > fail because they allocate less than a page.
>> > >> 
>> > >> I strongly suggest you unlearn that fact.
>> > >> The fix for this is in two parts:
>> > >> 
>> > >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
>> > >>    a few times in low memory situations, but this isn't a high
>> > >>    performance path.
>> > >> 
>> > >> 2) Handle get_config (and other) failure in some more elegant way.
>> > >> 
>> > >> Cheers,
>> > >> Rusty.
>> > >
>> > > I agree, but I'd like to point out that even without kmalloc,
>> > > on s390 get_config is blocking - it's waiting
>> > > for a hardware interrupt.
>> > >
>> > > And it makes sense: config is not data path, I don't think
>> > > we should spin there.
>> > >
>> > > So I think besides these two parts, we still need my two patches:
>> > >     virtio-balloon: do not call blocking ops when !TASK_RUNNING
>> > 
>> > I prefer to annotate, over trying to fix this.
>> > 
>> > Because it's not important.  We might spin a few times, but it's very
>> > unlikely, and it's certainly not performance critical.
>> > 
>> > Thanks,
>> > Rusty.
>> > 
>> > Subject: virtio_balloon: annotate possible sleep waiting for event.
>> > 
>> > CCW (s390) does this.
>> > 
>> > Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com>
>> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>> > 
>> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>> > index 0413157f3b49..3f4d5acdbde0 100644
>> > --- a/drivers/virtio/virtio_balloon.c
>> > +++ b/drivers/virtio/virtio_balloon.c
>> > @@ -340,6 +340,15 @@ static int balloon(void *_vballoon)
>> >  		s64 diff;
>> >  
>> >  		try_to_freeze();
>> > +
>> > +		/*
>> > +		 * Reading the config on the ccw backend involves an
>> > +		 * allocation, so we may actually sleep and have an
>> > +		 * extra iteration.  It's extremely unlikely,
>> 
>> Hmm, this part of the comment seems wrong to me.
>> Reading the config on the ccw backend always sleeps
>> because it's interrupt driven.
>
> (...)
>
>> So I suspect
>> http://mid.gmane.org/1424874878-17155-1-git-send-email-mst@redhat.com
>> is better.
>> 
>> What do you think?
>
> I'd prefer to fix this as well. While the I/O request completes
> instantly on current qemu (the ssch backend handles the start function
> immediately, not asynchronously as on real hardware), this (a) is an
> implementation detail that may change and (b) doesn't account for the
> need to deliver the interrupt to the guest - which might take non-zero
> time.

Ah, I see.  My mistake.

I've thrown out my patch, applied that one.

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0413157f3b49..3f4d5acdbde0 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -340,6 +340,15 @@  static int balloon(void *_vballoon)
 		s64 diff;
 
 		try_to_freeze();
+
+		/*
+		 * Reading the config on the ccw backend involves an
+		 * allocation, so we may actually sleep and have an
+		 * extra iteration.  It's extremely unlikely, and this
+		 * isn't a fast path in any sense.
+		 */
+		sched_annotate_sleep();
+
 		wait_event_interruptible(vb->config_change,
 					 (diff = towards_target(vb)) != 0
 					 || vb->need_stats_update