
[4/5] mm/hotplug: Avoid RCU stalls when removing large amounts of memory

Message ID 20190617043635.13201-5-alastair@au1.ibm.com (mailing list archive)
State New, archived
Series mm: Cleanup & allow modules to hotplug memory

Commit Message

Alastair D'Silva June 17, 2019, 4:36 a.m. UTC
From: Alastair D'Silva <alastair@d-silva.org>

When removing sufficiently large amounts of memory, we trigger RCU stall
detection. By periodically calling cond_resched(), we avoid bogus stall
warnings.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
 mm/memory_hotplug.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Mike Rapoport June 17, 2019, 6:53 a.m. UTC | #1
On Mon, Jun 17, 2019 at 02:36:30PM +1000, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When removing sufficiently large amounts of memory, we trigger RCU stall
> detection. By periodically calling cond_resched(), we avoid bogus stall
> warnings.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>  mm/memory_hotplug.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index e096c987d261..382b3a0c9333 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -578,6 +578,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
>  		__remove_section(zone, __pfn_to_section(pfn), map_offset,
>  				 altmap);
>  		map_offset = 0;
> +
> +		if (!(i & 0x0FFF))

No magic numbers please. And a comment would be appreciated.

> +			cond_resched();
>  	}
> 
>  	set_zone_contiguous(zone);
> -- 
> 2.21.0
>
Alastair D'Silva June 17, 2019, 6:58 a.m. UTC | #2
On Mon, 2019-06-17 at 09:53 +0300, Mike Rapoport wrote:
> On Mon, Jun 17, 2019 at 02:36:30PM +1000, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> > 
> > When removing sufficiently large amounts of memory, we trigger RCU stall
> > detection. By periodically calling cond_resched(), we avoid bogus stall
> > warnings.
> > 
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >  mm/memory_hotplug.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index e096c987d261..382b3a0c9333 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -578,6 +578,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
> >  		__remove_section(zone, __pfn_to_section(pfn), map_offset,
> >  				 altmap);
> >  		map_offset = 0;
> > +
> > +		if (!(i & 0x0FFF))
> 
> No magic numbers please. And a comment would be appreciated.
> 

Agreed, thanks for the review.
Michal Hocko June 17, 2019, 7:47 a.m. UTC | #3
On Mon 17-06-19 14:36:30,  Alastair D'Silva  wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> When removing sufficiently large amounts of memory, we trigger RCU stall
> detection. By periodically calling cond_resched(), we avoid bogus stall
> warnings.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
>  mm/memory_hotplug.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index e096c987d261..382b3a0c9333 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -578,6 +578,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
>  		__remove_section(zone, __pfn_to_section(pfn), map_offset,
>  				 altmap);
>  		map_offset = 0;
> +
> +		if (!(i & 0x0FFF))
> +			cond_resched();

We already have a cond_resched() before __remove_section(). Why is an
additional one needed?

>  	}
>  
>  	set_zone_contiguous(zone);
> -- 
> 2.21.0
>
Alastair D'Silva June 17, 2019, 7:57 a.m. UTC | #4
> -----Original Message-----
> From: Michal Hocko <mhocko@kernel.org>
> Sent: Monday, 17 June 2019 5:47 PM
> To: Alastair D'Silva <alastair@au1.ibm.com>
> Cc: alastair@d-silva.org; Arun KS <arunks@codeaurora.org>; Mukesh Ojha
> <mojha@codeaurora.org>; Logan Gunthorpe <logang@deltatee.com>; Wei
> Yang <richard.weiyang@gmail.com>; Peter Zijlstra <peterz@infradead.org>;
> Ingo Molnar <mingo@kernel.org>; linux-mm@kvack.org; Qian Cai
> <cai@lca.pw>; Thomas Gleixner <tglx@linutronix.de>; Andrew Morton
> <akpm@linux-foundation.org>; Mike Rapoport <rppt@linux.vnet.ibm.com>;
> Baoquan He <bhe@redhat.com>; David Hildenbrand <david@redhat.com>;
> Josh Poimboeuf <jpoimboe@redhat.com>; Pavel Tatashin
> <pasha.tatashin@soleen.com>; Juergen Gross <jgross@suse.com>; Oscar
> Salvador <osalvador@suse.com>; Jiri Kosina <jkosina@suse.cz>; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH 4/5] mm/hotplug: Avoid RCU stalls when removing large
> amounts of memory
> 
> On Mon 17-06-19 14:36:30,  Alastair D'Silva  wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > When removing sufficiently large amounts of memory, we trigger RCU
> > stall detection. By periodically calling cond_resched(), we avoid
> > bogus stall warnings.
> >
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
> >  mm/memory_hotplug.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index e096c987d261..382b3a0c9333 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -578,6 +578,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
> >  		__remove_section(zone, __pfn_to_section(pfn), map_offset,
> >  				 altmap);
> >  		map_offset = 0;
> > +
> > +		if (!(i & 0x0FFF))
> > +			cond_resched();
> 
> We already have a cond_resched() before __remove_section(). Why is an
> additional one needed?

I was getting stalls when removing ~1TB of memory.
Michal Hocko June 17, 2019, 8:21 a.m. UTC | #5
On Mon 17-06-19 17:57:16, Alastair D'Silva wrote:
> > -----Original Message-----
> > From: Michal Hocko <mhocko@kernel.org>
> > Sent: Monday, 17 June 2019 5:47 PM
> > To: Alastair D'Silva <alastair@au1.ibm.com>
> > Cc: alastair@d-silva.org; Arun KS <arunks@codeaurora.org>; Mukesh Ojha
> > <mojha@codeaurora.org>; Logan Gunthorpe <logang@deltatee.com>; Wei
> > Yang <richard.weiyang@gmail.com>; Peter Zijlstra <peterz@infradead.org>;
> > Ingo Molnar <mingo@kernel.org>; linux-mm@kvack.org; Qian Cai
> > <cai@lca.pw>; Thomas Gleixner <tglx@linutronix.de>; Andrew Morton
> > <akpm@linux-foundation.org>; Mike Rapoport <rppt@linux.vnet.ibm.com>;
> > Baoquan He <bhe@redhat.com>; David Hildenbrand <david@redhat.com>;
> > Josh Poimboeuf <jpoimboe@redhat.com>; Pavel Tatashin
> > <pasha.tatashin@soleen.com>; Juergen Gross <jgross@suse.com>; Oscar
> > Salvador <osalvador@suse.com>; Jiri Kosina <jkosina@suse.cz>; linux-
> > kernel@vger.kernel.org
> > Subject: Re: [PATCH 4/5] mm/hotplug: Avoid RCU stalls when removing large
> > amounts of memory
> > 
> > On Mon 17-06-19 14:36:30,  Alastair D'Silva  wrote:
> > > From: Alastair D'Silva <alastair@d-silva.org>
> > >
> > > When removing sufficiently large amounts of memory, we trigger RCU
> > > stall detection. By periodically calling cond_resched(), we avoid
> > > bogus stall warnings.
> > >
> > > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > > ---
> > >  mm/memory_hotplug.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > index e096c987d261..382b3a0c9333 100644
> > > --- a/mm/memory_hotplug.c
> > > +++ b/mm/memory_hotplug.c
> > > @@ -578,6 +578,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
> > >  		__remove_section(zone, __pfn_to_section(pfn), map_offset,
> > >  				 altmap);
> > >  		map_offset = 0;
> > > +
> > > +		if (!(i & 0x0FFF))
> > > +			cond_resched();
> > 
> > We already have a cond_resched() before __remove_section(). Why is an
> > additional one needed?
> 
> I was getting stalls when removing ~1TB of memory.

Have you debugged what the source of the stall is? We do cond_resched once
per memory section, which should be a constant unit of work regardless of
the total amount of memory to be removed.
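
To put rough numbers on the point above, here is a small userspace sketch
(not kernel code). It assumes that the loop index i counts sections, that
"~1TB" means 2^40 bytes, and that a section is 2^24 bytes (16MB) or 2^27
bytes (128MB) depending on the architecture. The helper names nr_sections()
and masked_hits() are made up for this illustration.

```c
#include <stdint.h>

/*
 * Number of memory sections covering `bytes`, assuming each section
 * spans 2^section_bits bytes (assumed values: 24 or 27).
 */
static uint64_t nr_sections(uint64_t bytes, unsigned int section_bits)
{
	return bytes >> section_bits;
}

/*
 * Count the loop indices i in [0, n) for which !(i & mask) is true.
 * `mask` must be of the form 2^k - 1 (0x0FFF in the patch), so the
 * test fires exactly when i is a multiple of mask + 1.
 */
static uint64_t masked_hits(uint64_t n, uint64_t mask)
{
	return (n + mask) / (mask + 1);
}
```

Under those assumptions, removing 2^40 bytes walks 65536 sections of 16MB
(or 8192 sections of 128MB). The existing per-section cond_resched() would
run on every one of those iterations, while the added masked call would run
only 16 (or 2) times over the whole removal.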
Oscar Salvador June 17, 2019, 3:49 p.m. UTC | #6
On Mon, Jun 17, 2019 at 05:57:16PM +1000, Alastair D'Silva wrote:
> I was getting stalls when removing ~1TB of memory.

Would you mind sharing one of those stall splats?
I am a bit sceptical here because, as Michal pointed out, we do cond_resched
once per section removed.

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e096c987d261..382b3a0c9333 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -578,6 +578,9 @@  void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 		__remove_section(zone, __pfn_to_section(pfn), map_offset,
 				 altmap);
 		map_offset = 0;
+
+		if (!(i & 0x0FFF))
+			cond_resched();
 	}
 
 	set_zone_contiguous(zone);
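
One way the review request in comment #1 (no magic numbers, plus a comment)
could be addressed is sketched below. This is a userspace mock-up, not a
revised version of the patch: RESCHED_INTERVAL, cond_resched_stub(),
remove_section_stub() and remove_sections() are hypothetical names, and the
stubs only stand in for the kernel's cond_resched() and __remove_section()
so that the loop shape can be exercised outside the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical constant replacing the bare 0x0FFF mask: yield once
 * every RESCHED_INTERVAL loop iterations. It must be a power of two
 * so that (RESCHED_INTERVAL - 1) is a valid bit mask.
 */
#define RESCHED_INTERVAL 0x1000UL

static unsigned long resched_calls;

/* Userspace stand-in for cond_resched(); it only counts invocations. */
static void cond_resched_stub(void)
{
	resched_calls++;
}

/* Stand-in for __remove_section(); intentionally does nothing. */
static void remove_section_stub(unsigned long idx)
{
	(void)idx;
}

/* Mirrors the loop shape of the hunk; returns how often we yielded. */
static unsigned long remove_sections(unsigned long nr_sections)
{
	unsigned long i;

	resched_calls = 0;
	for (i = 0; i < nr_sections; i++) {
		remove_section_stub(i);

		/*
		 * Yield periodically so that removing a very large range
		 * does not trip the RCU stall detector.
		 */
		if (!(i & (RESCHED_INTERVAL - 1)))
			cond_resched_stub();
	}
	return resched_calls;
}
```

With RESCHED_INTERVAL set to 0x1000 the mask is 0x0FFF, so the behaviour
matches the posted hunk: the yield happens on iteration 0 and then on every
4096th iteration after that.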