[v2,3/3] mm: page_alloc: fair zone allocator policy

Message ID 20130816201814.GA26409@cmpxchg.org (mailing list archive)
State New, archived

Commit Message

Johannes Weiner Aug. 16, 2013, 8:18 p.m. UTC
Hi Kevin,

On Fri, Aug 16, 2013 at 10:17:01AM -0700, Kevin Hilman wrote:
> Johannes Weiner <hannes@cmpxchg.org> writes:
> > On Wed, Aug 07, 2013 at 11:37:43AM -0400, Johannes Weiner wrote:
> > Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching
> >
> > Avoid dirtying the same cache line with every single page allocation
> > by making the fair per-zone allocation batch a vmstat item, which will
> > turn it into batched percpu counters on SMP.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> I bisected several boot failures on various ARM platforms in
> next-20130816 down to this patch (commit 67131f9837 in linux-next.)
> 
> Simply reverting it got things booting again on top of -next.  Example
> boot crash below.

Thanks for the bisect and report!

I deref the percpu pointers before initializing them properly.  It
didn't trigger on x86 because the percpu offset added to the pointer
is big enough so that it does not fall into PFN 0, but it probably
ended up corrupting something...
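
To illustrate the ordering problem described above, here is a minimal
userspace sketch. It is not the kernel code: every name ending in _sketch is
hypothetical, and the explicit NULL check exists only so the failure can be
observed safely. The kernel has no such check and simply adds the per-cpu
offset to the uninitialized pointer, which is why the result depends on the
architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; names are illustrative. */
enum { NR_CPUS_SKETCH = 4 };

struct pageset_sketch {
	long vm_stat_diff;	/* per-cpu delta, folded into a global later */
};

struct zone_sketch {
	struct pageset_sketch *pageset;	/* NULL until zone_pcp_init_sketch() */
	long managed_pages;
};

/* Boot-time per-cpu storage, zero-initialized like any static object. */
struct pageset_sketch boot_pageset_sketch[NR_CPUS_SKETCH];

/* Plays the role of zone_pcp_init(): point the zone at valid per-cpu data. */
void zone_pcp_init_sketch(struct zone_sketch *zone)
{
	zone->pageset = boot_pageset_sketch;
}

/*
 * Plays the role of mod_zone_page_state(): update this CPU's delta.
 * Returns -1 when called before initialization. The real kernel would
 * dereference the bad pointer here instead of reporting an error.
 */
int mod_zone_state_sketch(struct zone_sketch *zone, int cpu, long delta)
{
	if (zone->pageset == NULL)
		return -1;
	zone->pageset[cpu].vm_stat_diff += delta;
	return 0;
}
```

Calling mod_zone_state_sketch() on a zone before zone_pcp_init_sketch() has
run fails; calling it afterwards succeeds. That ordering is what the patch
below establishes in free_area_init_core().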

Could you try this patch on top of linux-next instead of the revert?

Thanks,
Johannes

---
From: Johannes Weiner <hannes@cmpxchg.org>
Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching fix

Initialize the per-cpu counters before modifying them.  Otherwise:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 3.11.0-rc5-next-20130816 (khilman@paris) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-1ubuntu1) ) #30 SMP Fri Aug 16 09:47:32 PDT 2013
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c53c7d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine: Generic AM33XX (Flattened Device Tree), model: TI AM335x BeagleBone
[    0.000000] bootconsole [earlycon0] enabled
[    0.000000] Memory policy: ECC disabled, Data cache writeback
[    0.000000] On node 0 totalpages: 130816
[    0.000000] free_area_init_node: node 0, pgdat c081d400, node_mem_map c12fc000
[    0.000000]   Normal zone: 1024 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000026
[    0.000000] pgd = c0004000
[    0.000000] [00000026] *pgd=00000000
[    0.000000] Internal error: Oops: 5 [#1] SMP ARM
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-rc5-next-20130816 #30
[    0.000000] task: c0793c70 ti: c0788000 task.ti: c0788000
[    0.000000] PC is at __mod_zone_page_state+0x2c/0xb4
[    0.000000] LR is at mod_zone_page_state+0x2c/0x4c

Reported-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Stephen Warren Aug. 16, 2013, 9:24 p.m. UTC | #1
On 08/16/2013 02:18 PM, Johannes Weiner wrote:
> Hi Kevin,
> 
> On Fri, Aug 16, 2013 at 10:17:01AM -0700, Kevin Hilman wrote:
>> Johannes Weiner <hannes@cmpxchg.org> writes:
>>> On Wed, Aug 07, 2013 at 11:37:43AM -0400, Johannes Weiner wrote:
>>> Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching
>>>
>>> Avoid dirtying the same cache line with every single page allocation
>>> by making the fair per-zone allocation batch a vmstat item, which will
>>> turn it into batched percpu counters on SMP.
>>>
>>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>>
>> I bisected several boot failures on various ARM platforms in
>> next-20130816 down to this patch (commit 67131f9837 in linux-next.)
>>
>> Simply reverting it got things booting again on top of -next.  Example
>> boot crash below.
> 
> Thanks for the bisect and report!
> 
> I deref the percpu pointers before initializing them properly.  It
> didn't trigger on x86 because the percpu offset added to the pointer
> is big enough so that it does not fall into PFN 0, but it probably
> ended up corrupting something...
> 
> Could you try this patch on top of linux-next instead of the revert?

That patch,
Tested-by: Stephen Warren <swarren@nvidia.com>

Kevin Hilman Aug. 16, 2013, 9:52 p.m. UTC | #2
Johannes Weiner <hannes@cmpxchg.org> writes:

> Hi Kevin,
>
> On Fri, Aug 16, 2013 at 10:17:01AM -0700, Kevin Hilman wrote:
>> Johannes Weiner <hannes@cmpxchg.org> writes:
>> > On Wed, Aug 07, 2013 at 11:37:43AM -0400, Johannes Weiner wrote:
>> > Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching
>> >
>> > Avoid dirtying the same cache line with every single page allocation
>> > by making the fair per-zone allocation batch a vmstat item, which will
>> > turn it into batched percpu counters on SMP.
>> >
>> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>> 
>> I bisected several boot failures on various ARM platforms in
>> next-20130816 down to this patch (commit 67131f9837 in linux-next.)
>> 
>> Simply reverting it got things booting again on top of -next.  Example
>> boot crash below.
>
> Thanks for the bisect and report!

You're welcome.  Thanks for the quick fix!

> I deref the percpu pointers before initializing them properly.  It
> didn't trigger on x86 because the percpu offset added to the pointer
> is big enough so that it does not fall into PFN 0, but it probably
> ended up corrupting something...
>
> Could you try this patch on top of linux-next instead of the revert?

Yup, that change fixes it.

Tested-by: Kevin Hilman <khilman@linaro.org>

Kevin

Stephen Rothwell Aug. 19, 2013, 12:48 a.m. UTC | #3
Hi all,

On Fri, 16 Aug 2013 14:52:11 -0700 Kevin Hilman <khilman@linaro.org> wrote:
>
> Johannes Weiner <hannes@cmpxchg.org> writes:
> 
> > On Fri, Aug 16, 2013 at 10:17:01AM -0700, Kevin Hilman wrote:
> >> Johannes Weiner <hannes@cmpxchg.org> writes:
> >> > On Wed, Aug 07, 2013 at 11:37:43AM -0400, Johannes Weiner wrote:
> >> > Subject: [patch] mm: page_alloc: use vmstats for fair zone allocation batching
> >> >
> >> > Avoid dirtying the same cache line with every single page allocation
> >> > by making the fair per-zone allocation batch a vmstat item, which will
> >> > turn it into batched percpu counters on SMP.
> >> >
> >> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> >> 
> >> I bisected several boot failures on various ARM platforms in
> >> next-20130816 down to this patch (commit 67131f9837 in linux-next.)
> >> 
> >> Simply reverting it got things booting again on top of -next.  Example
> >> boot crash below.
> >
> > Thanks for the bisect and report!
> 
> You're welcome.  Thanks for the quick fix!
> 
> > I deref the percpu pointers before initializing them properly.  It
> > didn't trigger on x86 because the percpu offset added to the pointer
> > is big enough so that it does not fall into PFN 0, but it probably
> > ended up corrupting something...
> >
> > Could you try this patch on top of linux-next instead of the revert?
> 
> Yup, that change fixes it.
> 
> Tested-by: Kevin Hilman <khilman@linaro.org>

> Tested-by: Stephen Warren <swarren@nvidia.com>

I will add that into the akpm-current tree in linux-next today (unless
Andrew releases a new mmotm in the meantime).

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6a95d39..b9e8f2f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4826,11 +4826,11 @@  static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 		spin_lock_init(&zone->lru_lock);
 		zone_seqlock_init(zone);
 		zone->zone_pgdat = pgdat;
+		zone_pcp_init(zone);
 
 		/* For bootup, initialized properly in watermark setup */
 		mod_zone_page_state(zone, NR_ALLOC_BATCH, zone->managed_pages);
 
-		zone_pcp_init(zone);
 		lruvec_init(&zone->lruvec);
 		if (!size)
 			continue;