[v1,0/2] mm/page_alloc: fix stalls/soft lockups with huge VMs

Message ID: 20200401104156.11564-1-david@redhat.com

Message

David Hildenbrand April 1, 2020, 10:41 a.m. UTC
Two fixes for misleading stall messages / soft lockups with huge nodes /
zones during boot without CONFIG_PREEMPT.

David Hildenbrand (2):
  mm/page_alloc: fix RCU stalls during deferred page initialization
  mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()

 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

David Hildenbrand April 1, 2020, 2:10 p.m. UTC | #1
On 01.04.20 12:41, David Hildenbrand wrote:
> Two fixes for misleading stall messages / soft lockups with huge nodes /
> zones during boot without CONFIG_PREEMPT.
> 
> David Hildenbrand (2):
>   mm/page_alloc: fix RCU stalls during deferred page initialization
>   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
> 
>  mm/page_alloc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
page init"

https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
Pankaj Gupta April 1, 2020, 2:31 p.m. UTC | #2
> On 01.04.20 12:41, David Hildenbrand wrote:
> > Two fixes for misleading stall messages / soft lockups with huge nodes /
> > zones during boot without CONFIG_PREEMPT.
> >
> > David Hildenbrand (2):
> >   mm/page_alloc: fix RCU stalls during deferred page initialization
> >   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
> >
> >  mm/page_alloc.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
>
> Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
> page init"
>
> https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com

Thanks! Took me some time to figure it out.

Pankaj

>
> --
> Thanks,
>
> David / dhildenb
>
>
Daniel Jordan April 1, 2020, 2:45 p.m. UTC | #3
On Wed, Apr 01, 2020 at 04:31:51PM +0200, Pankaj Gupta wrote:
> > On 01.04.20 12:41, David Hildenbrand wrote:
> > > Two fixes for misleading stall messages / soft lockups with huge nodes /
> > > zones during boot without CONFIG_PREEMPT.
> > >
> > > David Hildenbrand (2):
> > >   mm/page_alloc: fix RCU stalls during deferred page initialization
> > >   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
> > >
> > >  mm/page_alloc.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> >
> > Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
> > page init"
> >
> > https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
> 
> Thanks! Took me some time to figure it out.

FYI, I'm planning to post an alternate version of that fix, hopefully today if
all goes well with my testing.
David Hildenbrand April 1, 2020, 3:54 p.m. UTC | #4
On 01.04.20 16:45, Daniel Jordan wrote:
> On Wed, Apr 01, 2020 at 04:31:51PM +0200, Pankaj Gupta wrote:
>>> On 01.04.20 12:41, David Hildenbrand wrote:
>>>> Two fixes for misleading stall messages / soft lockups with huge nodes /
>>>> zones during boot without CONFIG_PREEMPT.
>>>>
>>>> David Hildenbrand (2):
>>>>   mm/page_alloc: fix RCU stalls during deferred page initialization
>>>>   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
>>>>
>>>>  mm/page_alloc.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>
>>> Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
>>> page init"
>>>
>>> https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
>>
>> Thanks! Took me some time to figure it out.
> 
> FYI, I'm planning to post an alternate version of that fix, hopefully today if
> all goes well with my testing.
> 

Cool, please CC me :)
Daniel Jordan April 1, 2020, 4:10 p.m. UTC | #5
On Wed, Apr 01, 2020 at 05:54:40PM +0200, David Hildenbrand wrote:
> On 01.04.20 16:45, Daniel Jordan wrote:
> > On Wed, Apr 01, 2020 at 04:31:51PM +0200, Pankaj Gupta wrote:
> >>> On 01.04.20 12:41, David Hildenbrand wrote:
> >>>> Two fixes for misleading stall messages / soft lockups with huge nodes /
> >>>> zones during boot without CONFIG_PREEMPT.
> >>>>
> >>>> David Hildenbrand (2):
> >>>>   mm/page_alloc: fix RCU stalls during deferred page initialization
> >>>>   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
> >>>>
> >>>>  mm/page_alloc.c | 2 ++
> >>>>  1 file changed, 2 insertions(+)
> >>>>
> >>>
> >>> Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
> >>> page init"
> >>>
> >>> https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
> >>
> >> Thanks! Took me some time to figure it out.
> > 
> > FYI, I'm planning to post an alternate version of that fix, hopefully today if
> > all goes well with my testing.
> > 
> 
> Cool, please CC me :)

Sure, in fact you already were! :)
Andrew Morton April 1, 2020, 6:06 p.m. UTC | #6
On Wed, 1 Apr 2020 10:45:29 -0400 Daniel Jordan <daniel.m.jordan@oracle.com> wrote:

> On Wed, Apr 01, 2020 at 04:31:51PM +0200, Pankaj Gupta wrote:
> > > On 01.04.20 12:41, David Hildenbrand wrote:
> > > > Two fixes for misleading stall messages / soft lockups with huge nodes /
> > > > zones during boot without CONFIG_PREEMPT.
> > > >
> > > > David Hildenbrand (2):
> > > >   mm/page_alloc: fix RCU stalls during deferred page initialization
> > > >   mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
> > > >
> > > >  mm/page_alloc.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > >
> > > Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
> > > page init"
> > >
> > > https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
> > 
> > Thanks! Took me some time to figure it out.
> 
> FYI, I'm planning to post an alternate version of that fix, hopefully today if
> all goes well with my testing.

I assume you'll redo this two-patch series to apply on top of this
forthcoming patch?
David Hildenbrand April 1, 2020, 6:29 p.m. UTC | #7
> On 01.04.2020 at 20:06, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Wed, 1 Apr 2020 10:45:29 -0400 Daniel Jordan <daniel.m.jordan@oracle.com> wrote:
> 
>> On Wed, Apr 01, 2020 at 04:31:51PM +0200, Pankaj Gupta wrote:
>>>>> On 01.04.20 12:41, David Hildenbrand wrote:
>>>>>> Two fixes for misleading stall messages / soft lockups with huge nodes /
>>>>>> zones during boot without CONFIG_PREEMPT.
>>>>>> 
>>>>>> David Hildenbrand (2):
>>>>>>  mm/page_alloc: fix RCU stalls during deferred page initialization
>>>>>>  mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
>>>>>> 
>>>>>> mm/page_alloc.c | 2 ++
>>>>>> 1 file changed, 2 insertions(+)
>>>>>> 
>>>>> 
>>>>> Patch #1 requires "[PATCH v3] mm: fix tick timer stall during deferred
>>>>> page init"
>>>>> 
>>>>> https://lkml.kernel.org/r/20200311123848.118638-1-shile.zhang@linux.alibaba.com
>>> 
>>> Thanks! Took me some time to figure it out.
>> 
>> FYI, I'm planning to post an alternate version of that fix, hopefully today if
>> all goes well with my testing.
> 
> I assume you'll redo this two-patch series to apply on top of this
> forthcoming patch?
> 

Yes, will wait until the old one in -next has been replaced by a revised one. Thanks!