[v2,6/8] kvm tools: Add rwlock wrapper

Message ID 20110603193157.GO2333@linux.vnet.ibm.com (mailing list archive)
State New, archived

Commit Message

Paul E. McKenney June 3, 2011, 7:31 p.m. UTC
On Fri, Jun 03, 2011 at 10:54:19AM +0300, Sasha Levin wrote:
> On Fri, 2011-06-03 at 09:34 +0200, Ingo Molnar wrote:
> > * Sasha Levin <levinsasha928@gmail.com> wrote:
> > 
> > > > with no apparent progress being made.
> > > 
> > > Since it's something that worked in 2.6.37, I've looked into it to 
> > > find what might have caused this issue.
> > > 
> > > I've bisected guest kernels and found that the problem starts with:
> > > 
> > > a26ac2455ffcf3be5c6ef92bc6df7182700f2114 is the first bad commit
> > > commit a26ac2455ffcf3be5c6ef92bc6df7182700f2114
> > > Author: Paul E. McKenney <paul.mckenney@linaro.org>
> > > Date:   Wed Jan 12 14:10:23 2011 -0800
> > > 
> > >     rcu: move TREE_RCU from softirq to kthread
> > > 
> > > Ingo, could you confirm that the problem goes away for you when you 
> > > use an earlier commit?
> > 
> > testing will have to wait, but there's a recent upstream fix:
> > 
> >   d72bce0e67e8: rcu: Cure load woes
> > 
> > That *might* perhaps address this problem too.
> > 
> I've re-tested with Linus's current git, the problem is still there.
> 
> > If not then this appears to be some sort of RCU related livelock with 
> > brutally overcommitted vcpus. On native this would show up too, in a 
> > less drastic form, as a spurious bootup delay.
> 
> I don't think it was overcommitted by *that* much. With that commit it
> usually hangs at 20-40 vcpus, while without it I can go up to 255.

Here is a diagnostic patch, untested.  It assumes that your system
has only a few CPUs (maybe 8-16) and that timers are still running.
It dumps out some RCU state if grace periods extend for more than
a few seconds.

To activate it, call rcu_diag_timer_start() from process context.
To stop it, call rcu_diag_timer_stop(), also from process context.

Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

rcu: diagnostic check of kthread state
    
Not-signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
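
The patch body itself is not reproduced in this archive view. As a rough,
hypothetical sketch only -- not Paul's actual diff -- a grace-period watchdog
of the kind described above could be built around a self-rearming kernel
timer, with rcu_diag_timer_start() and rcu_diag_timer_stop() as the
process-context entry points; the ten-second period and the reporting details
are assumptions:

#include <linux/timer.h>
#include <linux/jiffies.h>
#include <linux/kernel.h>

#define RCU_DIAG_PERIOD	(10 * HZ)	/* assumption: check every 10 seconds */

static struct timer_list rcu_diag_timer;

static void rcu_diag_timer_handler(unsigned long unused)
{
	/*
	 * The real patch presumably checks how long the current grace
	 * period has been pending and dumps the per-CPU rcuc kthread
	 * state; this sketch only marks where that reporting would go.
	 */
	pr_err("rcu_diag: checking for stalled RCU grace periods\n");

	/* re-arm for the next check */
	mod_timer(&rcu_diag_timer, jiffies + RCU_DIAG_PERIOD);
}

void rcu_diag_timer_start(void)		/* call from process context */
{
	setup_timer(&rcu_diag_timer, rcu_diag_timer_handler, 0);
	mod_timer(&rcu_diag_timer, jiffies + RCU_DIAG_PERIOD);
}

void rcu_diag_timer_stop(void)		/* call from process context */
{
	del_timer_sync(&rcu_diag_timer);
}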


Comments

Sasha Levin June 3, 2011, 7:56 p.m. UTC | #1
On Fri, 2011-06-03 at 12:31 -0700, Paul E. McKenney wrote:
> Here is a diagnostic patch, untested.  It assumes that your system
> has only a few CPUs (maybe 8-16) and that timers are still running.
> It dumps out some RCU state if grace periods extend for more than
> a few seconds.
> 
> To activate it, call rcu_diag_timer_start() from process context.
> To stop it, call rcu_diag_timer_stop(), also from process context.

Since the hang happens in the guest kernel very early during boot, I can't
call rcu_diag_timer_start(). What would be a good place to put the
_start() call instead?
Paul E. McKenney June 3, 2011, 8:22 p.m. UTC | #2
On Fri, Jun 03, 2011 at 10:56:20PM +0300, Sasha Levin wrote:
> Since the hang happens in the guest kernel very early during boot, I can't
> call rcu_diag_timer_start(). What would be a good place to put the
> _start() call instead?

Assuming that the failure occurs in the host OS rather than in the guest
OS, I suggest placing rcu_diag_timer_start() in the host code that starts
up the guest.

On the other hand, if the failure is occurring in the guest OS, then
I suggest placing the call to rcu_diag_timer_start() just after timer
initialization -- that is, assuming that interrupts are enabled at the
time of the failure.  If interrupts are not yet enabled at the time of
the failure, color me clueless.

							Thanx, Paul
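
As an illustration of the guest-side placement (hypothetical sketch; only the
final open_softirq() call of init_timers() is shown, and rcu_diag_timer_start()
is the function from the debug patch above):

void __init init_timers(void)
{
	/* ... existing timer initialization ... */
	open_softirq(TIMER_SOFTIRQ, run_timer_softirq);

	/* arm the RCU diagnostic once timers can actually fire */
	rcu_diag_timer_start();
}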
Sasha Levin June 3, 2011, 9:03 p.m. UTC | #3
On Fri, 2011-06-03 at 13:22 -0700, Paul E. McKenney wrote:
> Assuming that the failure occurs in the host OS rather than in the guest
> OS, I suggest placing rcu_diag_timer_start() in the host code that starts
> up the guest.
> 
> On the other hand, if the failure is occurring in the guest OS, then
> I suggest placing the call to rcu_diag_timer_start() just after timer
> initialization -- that is, assuming that interrupts are enabled at the
> time of the failure.  If interrupts are not yet enabled at the time of
> the failure, color me clueless.

It happens in the guest OS; the bisection was done on a guest kernel.

I've placed the rcu debug _start() right at the end of init_timers(),
which happens before the hang, but I'm not seeing any output from the
debug function.
Paul E. McKenney June 3, 2011, 9:20 p.m. UTC | #4
On Sat, Jun 04, 2011 at 12:03:59AM +0300, Sasha Levin wrote:
> It happens in the guest OS; the bisection was done on a guest kernel.
> 
> I've placed the rcu debug _start() right at the end of init_timers(),
> which happens before the hang, but I'm not seeing any output from the
> debug function.

Sounds like the hang is happening before timers start?  Are scheduling
clock interrupts happening in the guest, as in scheduler_tick()?  If so,
I could just as easily put the debug there.

							Thanx, Paul
Sasha Levin June 3, 2011, 10:54 p.m. UTC | #5
On Fri, 2011-06-03 at 14:20 -0700, Paul E. McKenney wrote:
> Sounds like the hang is happening before timers start?  Are scheduling
> clock interrupts happening in the guest, as in scheduler_tick()?  If so,
> I could just as easily put the debug there.

Yes, scheduler_tick() is running before the hang.
Should I just call the handler function directly from there?
Paul E. McKenney June 3, 2011, 11:05 p.m. UTC | #6
On Sat, Jun 04, 2011 at 01:54:45AM +0300, Sasha Levin wrote:
> Yes, scheduler_tick() is running before the hang.
> Should I just call the handler function directly from there?

I suggest calling it no more than once every few seconds, but yes.
(Otherwise you will get a very full dmesg.)

							Thanx, Paul
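
A sketch of that suggestion (assumptions: a ten-second interval, and calling
the debug patch's rcu_diag_timer_handler() directly) would rate-limit the call
with jiffies inside scheduler_tick():

void scheduler_tick(void)
{
	static unsigned long next_rcu_diag;

	/* ... existing scheduler_tick() processing ... */

	/* fire the RCU diagnostic at most once every ~10 seconds */
	if (time_after(jiffies, next_rcu_diag)) {
		next_rcu_diag = jiffies + 10 * HZ;
		rcu_diag_timer_handler(0);	/* handler from the debug patch */
	}
}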
Sasha Levin June 4, 2011, 6:26 a.m. UTC | #7
On Fri, 2011-06-03 at 16:05 -0700, Paul E. McKenney wrote:
> I suggest calling it no more than once every few seconds, but yes.
> (Otherwise you will get a very full dmesg.)

Here is the last group of printk() output; it basically just repeats
itself when it's stuck:

KVM setup async PF for cpu 19
 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc17 2/46/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc18 2/23/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc19 0/0/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[    0.527044]  #20
 0000000000000000
Call Trace:
rcu_diag: rcuc3 0/0/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc4 2/358/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc5 2/340/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc6 2/321/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc7 2/301/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc8 2/277/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc9 2/255/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc10 2/226/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc11 2/214/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc12 2/193/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc13 2/172/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc14 2/146/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc15 2/130/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc16 2/107/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc17 2/86/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc18 2/62/1 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
rcu_diag: rcuc19 2/39/0 
kworker/0:1     R  running task        0     0     14 0x00000008
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
Paul E. McKenney June 4, 2011, 4:30 p.m. UTC | #8
On Sat, Jun 04, 2011 at 09:26:15AM +0300, Sasha Levin wrote:
> rcu_diag: rcuc5 2/340/1 

OK, this one (RCU's per-CPU kthread for CPU 5) has work to do (this
is the "1" after the final "/").  Does rcuc5 show up later in the
output?

							Thanx, Paul

Sasha Levin June 14, 2011, 10:26 p.m. UTC | #9
Sorry for the delay on this.

On Sat, 2011-06-04 at 09:30 -0700, Paul E. McKenney wrote:
> > rcu_diag: rcuc5 2/340/1 
> 
> OK, this one (RCU's per-CPU kthread for CPU 5) has work to do (this
> is the "1" after the final "/").  Does rcuc5 show up later in the
> output?
> 

The output grinds to a stop; I've pasted the last part of the prints -
there are no more prints at all past this one.

Paul E. McKenney June 14, 2011, 11:42 p.m. UTC | #10
On Tue, Jun 14, 2011 at 06:26:01PM -0400, Sasha Levin wrote:
> Sorry for the delay on this.

Actually, you might have had excellent timing on this one.

Could you please try Shaohua Li's patch at:

	https://lkml.org/lkml/2011/6/14/20

This patch makes RCU much less prone to causing massive scheduler lock
contention, which might well be the cause of the problems that you
are seeing.

							Thanx, Paul

> On Sat, 2011-06-04 at 09:30 -0700, Paul E. McKenney wrote:
> > On Sat, Jun 04, 2011 at 09:26:15AM +0300, Sasha Levin wrote:
> > > On Fri, 2011-06-03 at 16:05 -0700, Paul E. McKenney wrote:
> > > > On Sat, Jun 04, 2011 at 01:54:45AM +0300, Sasha Levin wrote:
> > > > > On Fri, 2011-06-03 at 14:20 -0700, Paul E. McKenney wrote:
> > > > > > On Sat, Jun 04, 2011 at 12:03:59AM +0300, Sasha Levin wrote:
> > > > > > > On Fri, 2011-06-03 at 13:22 -0700, Paul E. McKenney wrote:
> > > > > > > > On Fri, Jun 03, 2011 at 10:56:20PM +0300, Sasha Levin wrote:
> > > > > > > > > On Fri, 2011-06-03 at 12:31 -0700, Paul E. McKenney wrote:
> > > > > > > > > > On Fri, Jun 03, 2011 at 10:54:19AM +0300, Sasha Levin wrote:
> > > > > > > > > > > On Fri, 2011-06-03 at 09:34 +0200, Ingo Molnar wrote:
> > > > > > > > > > > > * Sasha Levin <levinsasha928@gmail.com> wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > > > with no apparent progress being made.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Since it's something that worked in 2.6.37, I've looked into it to 
> > > > > > > > > > > > > find what might have caused this issue.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I've bisected guest kernels and found that the problem starts with:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > a26ac2455ffcf3be5c6ef92bc6df7182700f2114 is the first bad commit
> > > > > > > > > > > > > commit a26ac2455ffcf3be5c6ef92bc6df7182700f2114
> > > > > > > > > > > > > Author: Paul E. McKenney <paul.mckenney@linaro.org>
> > > > > > > > > > > > > Date:   Wed Jan 12 14:10:23 2011 -0800
> > > > > > > > > > > > > 
> > > > > > > > > > > > >     rcu: move TREE_RCU from softirq to kthread
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Ingo, could you confirm that the problem goes away for you when you 
> > > > > > > > > > > > > use an earlier commit?
> > > > > > > > > > > > 
> > > > > > > > > > > > testing will have to wait, but there's a recent upstream fix:
> > > > > > > > > > > > 
> > > > > > > > > > > >   d72bce0e67e8: rcu: Cure load woes
> > > > > > > > > > > > 
> > > > > > > > > > > > That *might* perhaps address this problem too.
> > > > > > > > > > > > 
> > > > > > > > > > > I've re-tested with Linus's current git, the problem is still there.
> > > > > > > > > > > 
> > > > > > > > > > > > If not then this appears to be some sort of RCU related livelock with 
> > > > > > > > > > > > brutally overcommitted vcpus. On native this would show up too, in a 
> > > > > > > > > > > > less drastic form, as a spurious bootup delay.
> > > > > > > > > > > 
> > > > > > > > > > > I don't think it was overcommited by *that* much. With that commit it
> > > > > > > > > > > usually hangs at 20-40 vcpus, while without it I can go up to 255.
> > > > > > > > > > 
> > > > > > > > > > Here is a diagnostic patch, untested.  It assumes that your system
> > > > > > > > > > has only a few CPUs (maybe 8-16) and that timers are still running.
> > > > > > > > > > It dumps out some RCU state if grace periods extend for more than
> > > > > > > > > > a few seconds.
> > > > > > > > > > 
> > > > > > > > > > To activate it, call rcu_diag_timer_start() from process context.
> > > > > > > > > > To stop it, call rcu_diag_timer_stop(), also from process context.
> > > > > > > > > 
> > > > > > > > > Since the hang happens in guest kernel very early during boot, I can't
> > > > > > > > > call rcu_diag_timer_start(). What would be a good place to put the
> > > > > > > > > _start() code instead?
> > > > > > > > 
> > > > > > > > Assuming that the failure occurs in the host OS rather than in the guest
> > > > > > > > OS, I suggest placing rcu_diag_timer_start() in the host code that starts
> > > > > > > > up the guest.
> > > > > > > > 
> > > > > > > > On the other hand, if the failure is occuring in the guest OS, then
> > > > > > > > I suggest placing the call to rcu_diag_timer_start() just after timer
> > > > > > > > initialization -- that is, assuming that interrupts are enabled at the
> > > > > > > > time of the failure.  If interrupts are not yet enabled at the time of
> > > > > > > > the failure, color me clueless.
> > > > > > > 
> > > > > > > It happens in the guest OS, the bisection was done on a guest kernel.
> > > > > > > 
> > > > > > > I've placed the rcu debug _start() right at the end of init_timers()
> > > > > > > which happens before the hang, but I'm not seeing any output from the
> > > > > > > debug function.
> > > > > > 
> > > > > > Sounds like the hang is happening before timers start?  Are scheduling
> > > > > > clock interrupts happening in the guest, as in scheduler_tick()?  If so,
> > > > > > I could just as easily put the debug there.
> > > > > 
> > > > > Yes, scheduler_tick() is running before the hang.
> > > > > Should I just call the handler function directly from there?
> > > > 
> > > > I suggest calling it no more than once every few seconds, but yes.
> > > > (Otherwise you will get a very full dmesg.)
> > > 
> > > Here is the last group of printk(), basically it just repeats itself
> > > when it's stuck:
> > > 
> > > KVM setup async PF for cpu 19
> > >  0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc17 2/46/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc18 2/23/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc19 0/0/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > [    0.527044]  #20
> > >  0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc3 0/0/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc4 2/358/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc5 2/340/1 
> > 
> > OK, this one (RCU's per-CPU kthread for CPU 5) has work to do (this
> > is the "1" after the final "/").  Does rcuc5 show up later in the
> > output?
> > 
> 
> The output grinds to a stop, I've pasted the last part of the prints -
> there are no more prints at all past this one.
> 
> > 							Thanx, Paul
> > 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc6 2/321/1 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc7 2/301/1 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc8 2/277/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc9 2/255/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc10 2/226/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc11 2/214/1 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc12 2/193/1 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc13 2/172/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc14 2/146/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc15 2/130/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc16 2/107/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc17 2/86/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc18 2/62/1 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > rcu_diag: rcuc19 2/39/0 
> > > kworker/0:1     R  running task        0     0     14 0x00000008
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > >  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > Call Trace:
> > > 
> > > 
> > > -- 
> > > 
> > > Sasha.
> > > 
> 
> -- 
> 
> Sasha.
> 
Sasha Levin June 15, 2011, 1:25 a.m. UTC | #11
On Tue, 2011-06-14 at 16:42 -0700, Paul E. McKenney wrote:
> On Tue, Jun 14, 2011 at 06:26:01PM -0400, Sasha Levin wrote:
> > Sorry for the delay on this.
> 
> Actually, you might have had excellent timing on this one.
> 
> Could you please try Shaohua Li's patch at:
> 
> 	https://lkml.org/lkml/2011/6/14/20
> 
> This patch makes RCU much less prone to causing massive scheduler lock
> contention, which might well be the cause of the problems that you
> are seeing.

This patch has solved the hang issue I've had. kvm tools now boots 60
vcpus easily with a 3.0 kernel.
Paul E. McKenney June 15, 2011, 4:22 a.m. UTC | #12
On Tue, Jun 14, 2011 at 09:25:22PM -0400, Sasha Levin wrote:
> On Tue, 2011-06-14 at 16:42 -0700, Paul E. McKenney wrote:
> > On Tue, Jun 14, 2011 at 06:26:01PM -0400, Sasha Levin wrote:
> > > Sorry for the delay on this.
> > 
> > Actually, you might have had excellent timing on this one.
> > 
> > Could you please try Shaohua Li's patch at:
> > 
> > 	https://lkml.org/lkml/2011/6/14/20
> > 
> > This patch makes RCU much less prone to causing massive scheduler lock
> > contention, which might well be the cause of the problems that you
> > are seeing.
> 
> This patch has solved the hang issue I've had. kvm tools now boots 60
> vcpus easily with a 3.0 kernel.

Thank you, Sasha -- I have added your Tested-by.

							Thanx, Paul
diff mbox

Patch

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 99f9aa7..489ea1b 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -80,6 +80,8 @@  extern void call_rcu_sched(struct rcu_head *head,
 extern void synchronize_sched(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
+extern void rcu_diag_timer_start(void);
+extern void rcu_diag_timer_stop(void);
 
 static inline void __rcu_read_lock_bh(void)
 {
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 89419ff..bb61574 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2423,3 +2423,48 @@  void __init rcu_init(void)
 }
 
 #include "rcutree_plugin.h"
+
+/* Diagnostic code for boot-time hangs observed in early 3.0 days. */
+
+static int rcu_diag_timer_must_stop;
+struct timer_list rcu_diag_timer;
+#define RCU_DIAG_TIMER_PERIOD	(10 * HZ)
+
+static void rcu_diag_timer_handler(unsigned long unused)
+{
+	int cpu;
+
+	if (rcu_diag_timer_must_stop)
+		return;
+
+	if(ULONG_CMP_GE(jiffies,
+			rcu_sched_state.gp_start + RCU_DIAG_TIMER_PERIOD))
+		for_each_online_cpu(cpu) {
+			printk(KERN_ALERT "rcu_diag: rcuc%d %u/%u/%d ",
+			       cpu,
+			       per_cpu(rcu_cpu_kthread_status, cpu),
+			       per_cpu(rcu_cpu_kthread_loops, cpu),
+			       per_cpu(rcu_cpu_has_work, cpu));
+			sched_show_task(current);
+		}
+
+	if (rcu_diag_timer_must_stop)
+		return;
+	mod_timer(&rcu_diag_timer, RCU_DIAG_TIMER_PERIOD + jiffies);
+}
+
+void rcu_diag_timer_start(void)
+{
+	rcu_diag_timer_must_stop = 0;
+	setup_timer(&rcu_diag_timer,
+		    rcu_diag_timer_handler, (unsigned long) NULL);
+	mod_timer(&rcu_diag_timer, RCU_DIAG_TIMER_PERIOD + jiffies);
+}
+EXPORT_SYMBOL_GPL(rcu_diag_timer_start);
+
+void rcu_diag_timer_stop(void)
+{
+	rcu_diag_timer_must_stop = 1;
+	del_timer(&rcu_diag_timer);
+}
+EXPORT_SYMBOL_GPL(rcu_diag_timer_stop);
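
Editor's note: as a usage sketch only (not part of the original patch), the
diagnostic is driven entirely by the two exported entry points above, so the
simplest way to exercise it from process context is a throwaway module that
starts the timer on load and stops it on unload. The module name and file
below are arbitrary; the sketch assumes the patch above is applied, so that
rcu_diag_timer_start() and rcu_diag_timer_stop() are declared in
<linux/rcupdate.h> and exported to modules.

/* rcu_diag_mod.c -- illustrative helper, not part of the original patch. */
#include <linux/module.h>
#include <linux/rcupdate.h>

static int __init rcu_diag_mod_init(void)
{
	/* Process context: arm the periodic (10 * HZ) diagnostic timer. */
	rcu_diag_timer_start();
	return 0;
}

static void __exit rcu_diag_mod_exit(void)
{
	/* Process context: tell the handler to stop and delete the timer. */
	rcu_diag_timer_stop();
}

module_init(rcu_diag_mod_init);
module_exit(rcu_diag_mod_exit);
MODULE_LICENSE("GPL");

Once armed, each line the handler prints reads "rcuc<cpu> status/loops/has_work",
matching the three per_cpu values in the printk() above; as noted earlier in
the thread, a non-zero final field means that CPU's rcuc kthread still has
work pending.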