From patchwork Mon Jan 19 17:43:17 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Felipe Balbi X-Patchwork-Id: 5659911 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 99AE4C058D for ; Mon, 19 Jan 2015 17:46:50 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 62DF0203B8 for ; Mon, 19 Jan 2015 17:46:49 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0B6A020443 for ; Mon, 19 Jan 2015 17:46:48 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YDGNS-0007QB-IL; Mon, 19 Jan 2015 17:44:38 +0000 Received: from bear.ext.ti.com ([192.94.94.41]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1YDGNI-0007DY-9e for linux-arm-kernel@lists.infradead.org; Mon, 19 Jan 2015 17:44:32 +0000 Received: from dflxv15.itg.ti.com ([128.247.5.124]) by bear.ext.ti.com (8.13.7/8.13.7) with ESMTP id t0JHi5eV032612; Mon, 19 Jan 2015 11:44:06 -0600 Received: from DFLE72.ent.ti.com (dfle72.ent.ti.com [128.247.5.109]) by dflxv15.itg.ti.com (8.14.3/8.13.8) with ESMTP id t0JHi5LY016587; Mon, 19 Jan 2015 11:44:05 -0600 Received: from dflp33.itg.ti.com (10.64.6.16) by DFLE72.ent.ti.com (128.247.5.109) with Microsoft SMTP Server id 14.3.174.1; Mon, 19 Jan 2015 11:44:05 -0600 Received: from localhost (ileax41-snat.itg.ti.com [10.172.224.153]) by dflp33.itg.ti.com (8.14.3/8.13.8) with ESMTP id t0JHi4s3020979; Mon, 19 Jan 2015 11:44:05 -0600 Date: Mon, 19 Jan 2015 11:43:17 -0600 From: Felipe Balbi To: Nishanth Menon Subject: Re: [next-20150119]regression (mm)? Message-ID: <20150119174317.GK20386@saruman> References: <54BD33DC.40200@ti.com> MIME-Version: 1.0 In-Reply-To: <54BD33DC.40200@ti.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150119_094428_570042_BAB75F1E X-CRM114-Status: GOOD ( 16.50 ) X-Spam-Score: -5.0 (-----) Cc: "Kirill A. Shutemov" , linux-omap , "Balbi, Felipe" , linux-next , "linux-arm-kernel@lists.infradead.org" X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: balbi@ti.com Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Hi, On Mon, Jan 19, 2015 at 10:42:04AM -0600, Nishanth Menon wrote: > Most platforms seem broken intoday's next tag. > > https://github.com/nmenon/kernel-test-logs/tree/next-20150119 > (defconfig: omap2plus_defconfig) > > > [ 7.166600] ------------[ cut here ]------------ > > [ 7.171676] WARNING: CPU: 0 PID: 54 at mm/mmap.c:2859 exit_mmap+0x1a8/0x21c() > > [ 7.179194] Modules linked in: > > [ 7.182479] CPU: 0 PID: 54 Comm: init Not tainted 3.19.0-rc5-next-20150119-00002-gfdefcded1272 #1 > > [ 7.191863] Hardware name: Generic AM33XX (Flattened Device Tree) > > [ 7.198318] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > > [ 7.206528] [] (show_stack) from [] (dump_stack+0x78/0x94) > > [ 7.214191] [] (dump_stack) from [] (warn_slowpath_common+0x7c/0xb4) > > [ 7.222751] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24) > > [ 7.232038] [] (warn_slowpath_null) from [] (exit_mmap+0x1a8/0x21c) > > [ 7.240536] [] (exit_mmap) from [] (mmput+0x44/0xec) > > [ 7.247612] [] (mmput) from [] (flush_old_exec+0x300/0x5a4) > > [ 7.255357] [] (flush_old_exec) from [] (load_elf_binary+0x2ec/0x1144) > > [ 7.264111] [] (load_elf_binary) from [] (search_binary_handler+0x88/0x1ac) > > [ 7.273311] [] (search_binary_handler) from [] (load_script+0x260/0x280) > > [ 7.282232] [] (load_script) from [] (search_binary_handler+0x88/0x1ac) > > [ 7.291066] [] (search_binary_handler) from [] (do_execveat_common+0x538/0x6c4) > > [ 7.300628] [] (do_execveat_common) from [] (do_execve+0x2c/0x34) > > [ 7.308881] [] (do_execve) from [] (ret_fast_syscall+0x0/0x4c) > > [ 7.316881] ---[ end trace 3b8a46b1b280f423 ]--- seems like it's caused by: b316feb3c37ff19cddcaf1f6b5056c633193257d is the first bad commit Adding Kiryl to the loop. git bisect start # good: [ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc] Linux 3.19-rc5 git bisect good ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc # bad: [a0d4287f787889e59db0fd295853a0f1f55d0699] Add linux-next specific files for 20150119 git bisect bad a0d4287f787889e59db0fd295853a0f1f55d0699 # good: [1c2f70b77b8ca77f10c59d479d009e07359d00d2] Merge remote-tracking branch 'drm/drm-next' git bisect good 1c2f70b77b8ca77f10c59d479d009e07359d00d2 # good: [73c1390843223d8bfc85795c560c36b3d0ffee40] Merge remote-tracking branch 'leds/for-next' git bisect good 73c1390843223d8bfc85795c560c36b3d0ffee40 # good: [7bc6bef35d48e91ad796b6eead7304998842c782] Merge remote-tracking branch 'pinctrl/for-next' git bisect good 7bc6bef35d48e91ad796b6eead7304998842c782 # bad: [45e1eaa38732ffa3de0d18fe95d2d2b960a7c777] lib: bitmap: change bitmap_shift_right to take unsigned parameters git bisect bad 45e1eaa38732ffa3de0d18fe95d2d2b960a7c777 # good: [c82a73a0369a7dd6dcfaf9e6bd572a4e5deda223] mm, page_alloc: reduce number of alloc_pages* functions' parameters git bisect good c82a73a0369a7dd6dcfaf9e6bd572a4e5deda223 # bad: [0b1c810fbc4bbff7e314dd6ff91c2b4af499199d] mm: don't split THP page when syscall is called git bisect bad 0b1c810fbc4bbff7e314dd6ff91c2b4af499199d # good: [54faa439355a9ae476a446429967e9e38f04363e] oom, PM: make OOM detection in the freezer path raceless git bisect good 54faa439355a9ae476a446429967e9e38f04363e # bad: [b6c9f11c6b6993303067f7c04a73258226a6e77e] mm/compaction: add tracepoint to observe behaviour of compaction defer git bisect bad b6c9f11c6b6993303067f7c04a73258226a6e77e # good: [9ce5d3fb13a80f28db450de4ecf2727893e99c93] mm: pagemap_read: limit scan to virtual region being asked git bisect good 9ce5d3fb13a80f28db450de4ecf2727893e99c93 # bad: [1a7a376546ca56e7750987c15d0c7541c17a512c] mm/compaction: change tracepoint format from decimal to hexadecimal git bisect bad 1a7a376546ca56e7750987c15d0c7541c17a512c # bad: [4081187ff19cf2186010c003939c17d70d0bbb27] page_writeback: put account_page_redirty() after set_page_dirty() git bisect bad 4081187ff19cf2186010c003939c17d70d0bbb27 # bad: [b316feb3c37ff19cddcaf1f6b5056c633193257d] mm: account pmd page tables to the process git bisect bad b316feb3c37ff19cddcaf1f6b5056c633193257d # first bad commit: [b316feb3c37ff19cddcaf1f6b5056c633193257d] mm: account pmd page tables to the process I've added a dump_mm() call when the bug happens followed by a while (true) loop (to avoid constant reprinting of the same thing), here's what I get: [ 7.235903] ------------[ cut here ]------------ [ 7.240881] WARNING: CPU: 0 PID: 58 at mm/mmap.c:2859 exit_mmap+0x1b4/0x218() [ 7.248369] Modules linked in: ipv6 autofs4 [ 7.252792] CPU: 0 PID: 58 Comm: systemd Not tainted 3.19.0-rc5-next-20150119-dirty #888 [ 7.261274] Hardware name: Generic AM43 (Flattened Device Tree) [ 7.267512] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 7.275651] [] (show_stack) from [] (dump_stack+0x84/0x9c) [ 7.283249] [] (dump_stack) from [] (warn_slowpath_common+0x78/0xb4) [ 7.291750] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24) [ 7.300977] [] (warn_slowpath_null) from [] (exit_mmap+0x1b4/0x218) [ 7.309376] [] (exit_mmap) from [] (mmput+0x44/0xec) [ 7.316385] [] (mmput) from [] (flush_old_exec+0x264/0x5d4) [ 7.324061] [] (flush_old_exec) from [] (load_elf_binary+0x288/0x1234) [ 7.332727] [] (load_elf_binary) from [] (search_binary_handler+0x84/0x1e8) [ 7.341857] [] (search_binary_handler) from [] (do_execveat_common+0x53c/0x6b8) [ 7.351346] [] (do_execveat_common) from [] (do_execve+0x24/0x2c) [ 7.359561] [] (do_execve) from [] (ret_fast_syscall+0x0/0x4c) [ 7.367485] ---[ end trace 633a89eb76b1d46e ]--- [ 7.372360] mm ed29fa00 mmap ed29b6b8 seqnum 0 task_size 3204448256 [ 7.372360] get_unmapped_area c001cfc0 [ 7.372360] mmap_base 3069620224 mmap_legacy_base 0 highest_vm_end 3202711552 [ 7.372360] pgd ed184000 mm_users 0 mm_count 1 nr_ptes 1 nr_pmds 4294967292 map_count 59 [ 7.372360] hiwater_rss 37 hiwater_vm 37e total_vm 37e locked_vm 0 [ 7.372360] pinned_vm 0 shared_vm 324 exec_vm 254 stack_vm 22 [ 7.372360] start_code 10000 end_code c9d48 start_data da1b8 end_data ea1a4 [ 7.372360] start_brk eb000 brk 10c000 start_stack becd5f10 [ 7.372360] arg_start becd5fd4 arg_end becd5fdf env_start becd5fdf env_end becd5ff1 [ 7.372360] binfmt c08b3158 flags cd core_state (null) [ 7.372360] ioctx_table (null) [ 7.372360] owner (null) exe_file ee49d040 [ 7.372360] tlb_flush_pending 0 [ 7.448908] flags: 0x0() Looking at nr_pmds, that's basically (unsigned long) -4, which tells me we are decrementing mm->nr_pmds without incrementing first. In, when I add: diff --git a/include/linux/mm.h b/include/linux/mm.h index ba5f3bcca55d..8425fb419eab 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1440,11 +1440,15 @@ static inline unsigned long mm_nr_pmds(struct mm_struct *mm) static inline void mm_inc_nr_pmds(struct mm_struct *mm) { atomic_long_inc(&mm->nr_pmds); + printk(KERN_INFO "===> %s nr_pmds %ld\n", __func__, + atomic_long_read(&mm->nr_pmds)); } static inline void mm_dec_nr_pmds(struct mm_struct *mm) { atomic_long_dec(&mm->nr_pmds); + printk(KERN_INFO "===> %s nr_pmds %ld\n", __func__, + atomic_long_read(&mm->nr_pmds)); } #endif I start getting: [...] [ 5.935390] ===> mm_dec_nr_pmds nr_pmds -1 [ 6.236832] random: systemd urandom read with 34 bits of entropy available [ 6.276340] systemd[1]: systemd 215 running in system mode. (+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBC RYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR) [ 6.291380] systemd[1]: Detected architecture 'arm'. Welcome to Debian GNU/Linux 8 (jessie)! [ 6.434013] systemd[1]: Inserted module 'autofs4' [ 7.152037] NET: Registered protocol family 10 [ 7.165770] systemd[1]: Inserted module 'ipv6' [ 7.170646] ===> mm_dec_nr_pmds nr_pmds -2 [ 7.174932] ===> mm_dec_nr_pmds nr_pmds -3 [ 7.179427] ===> mm_dec_nr_pmds nr_pmds -4 [ 7.258496] ===> mm_dec_nr_pmds nr_pmds -1 [ 7.262809] ===> mm_dec_nr_pmds nr_pmds -2 [ 7.267206] ===> mm_dec_nr_pmds nr_pmds -3 [ 7.271486] ===> mm_dec_nr_pmds nr_pmds -4 [ 7.275884] ------------[ cut here ]------------ [ 7.280773] WARNING: CPU: 0 PID: 58 at mm/mmap.c:2859 exit_mmap+0x1b4/0x218() [...] Which confirms my suspicion. So we never increment nr_pmds, but we decrement it. The simplest "fix" is to make mm_nr_pmds() return a signed long (see below) and cast roundup()'s return to (signed long), but that's not what we really want in this case because it's clear our PMD accounting is bogus. Kiryl, any better idea on how to balance mm_inc_nr_pmds() and mm_dec_nr_pmds() ? diff --git a/include/linux/mm.h b/include/linux/mm.h index ba5f3bcca55d..c0ce2e8a9d45 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1432,7 +1432,7 @@ static inline void mm_dec_nr_pmds(struct mm_struct *mm) {} #else int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address); -static inline unsigned long mm_nr_pmds(struct mm_struct *mm) +static inline long mm_nr_pmds(struct mm_struct *mm) { return atomic_long_read(&mm->nr_pmds); } diff --git a/mm/mmap.c b/mm/mmap.c index 25271805ab39..703474d3336b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2856,7 +2856,7 @@ void exit_mmap(struct mm_struct *mm) round_up(FIRST_USER_ADDRESS, PMD_SIZE) >> PMD_SHIFT); #ifdef PUD_SHIFT WARN_ON(mm_nr_pmds(mm) > - round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT); + (signed long) round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT); #endif }