From patchwork Tue Jan 26 20:28:29 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 8127071
Return-Path:
X-Original-To: patchwork-linux-scsi@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 11C3C9F859
	for ; Tue, 26 Jan 2016 20:28:39 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id CCF592024C
	for ; Tue, 26 Jan 2016 20:28:37 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 7759720260
	for ; Tue, 26 Jan 2016 20:28:36 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751260AbcAZU2f (ORCPT );
	Tue, 26 Jan 2016 15:28:35 -0500
Received: from mail-wm0-f45.google.com ([74.125.82.45]:38462 "EHLO
	mail-wm0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751963AbcAZU2c (ORCPT );
	Tue, 26 Jan 2016 15:28:32 -0500
Received: by mail-wm0-f45.google.com with SMTP id p63so4285775wmp.1
	for ; Tue, 26 Jan 2016 12:28:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=shutemov-name.20150623.gappssmtp.com; s=20150623;
	h=date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	bh=cWpBPuc5DCXuX/7gVuQQq+mtWwYuJro5paBmQI3a+b4=;
	b=a9wTcl3bIXxG2w5AeIKq2WODqDZ2VbJpXhuY6zCrH513qefbYCsfstIp8zYfYyznfJ
	PRQUDGYMbYjm2fR3o7EZBAMJqVogIZV2YVJ7pqrjWfBX5ju9KSX6eGx73YthX4T54Wpo
	gpHeWA1vF9mtL9dCx99L0EKv7WCiQRDU0RWIZYBrwEiVdKAmUL6yL8UY5v+yl8lPu37X
	qDb6YMWY9cu1kjukTHZMhf1XGB2X6ioV1YV7XocHtKMF30Xkwbnd3Llg00qNy6P3vt2Z
	XNn9Qz0Y5ZjI18qdXiOhXNMcQ7JRloG/aYs/GqDoqJa/BN2lv+DOViAIDzaqy0qbDVAd
	dhZQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:date:from:to:cc:subject:message-id:references
	:mime-version:content-type:content-disposition:in-reply-to
	:user-agent;
	bh=cWpBPuc5DCXuX/7gVuQQq+mtWwYuJro5paBmQI3a+b4=;
	b=TpVoqCQuJsdb5SIfl8+kHkFtOiLDC6P0+32tYgf8i/8dwWZPHRNz/0Q5ivT6WMgRui
	1mCW0v1sI5kQEPlmdFPUnJ42PSpcDeqUywWiEkof9Vf0YMH+KzA9/EyArz+Q5+0HJ3D8
	BpRSa3ukzRZhdfCvvrgRB3zEqOcDjEC1wfMTJmDPOaNLZ+dO01uOKX+6/72sG11yB6VV
	7Pvp41w6716vHI8BEC/e+I+0YvoSJ+5DOfS+pa0Ejaq7cq1OEYAl/azKgF15bVdh1cvK
	VgAWgQBamnPfSDIdXg71Mqah7h4LW6dN3nj3DURGV81jgxd0n2VKfivb648PXDMIsGur
	FeNA==
X-Gm-Message-State: AG10YOTyOsfpuJzqMKvqORS+hAhtMzZZnhBKyQ8L8Un7PLIFbYN+kmxHJNsjjeH2tgGmXg==
X-Received: by 10.194.192.198 with SMTP id hi6mr25414298wjc.141.1453840111410;
	Tue, 26 Jan 2016 12:28:31 -0800 (PST)
Received: from node.shutemov.name ([93.85.135.137])
	by smtp.gmail.com with ESMTPSA id g187sm5056467wmf.8.2016.01.26.12.28.30
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Tue, 26 Jan 2016 12:28:30 -0800 (PST)
Received: by node.shutemov.name (Postfix, from userid 1000)
	id 47FE564DF2A8; Tue, 26 Jan 2016 22:28:29 +0200 (EET)
Date: Tue, 26 Jan 2016 22:28:29 +0200
From: "Kirill A. Shutemov"
To: Dmitry Vyukov , Doug Gilbert , Andrew Morton , David Rientjes
Cc: Naoya Horiguchi , "Kirill A.
 Shutemov" , Shiraz Hashim , "linux-mm@kvack.org" , LKML ,
	Hugh Dickins , Sasha Levin , syzkaller , Kostya Serebryany ,
	Alexander Potapenko , linux-scsi@vger.kernel.org
Subject: Re: mm: VM_BUG_ON_PAGE(PageTail(page)) in mbind
Message-ID: <20160126202829.GA21250@node.shutemov.name>
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org
X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALID,UNPARSEABLE_RELAY
	autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On Tue, Jan 26, 2016 at 01:52:31PM +0100, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers the following bug:
>
> page:ffffea0000b82240 count:0 mapcount:1 mapping:dead0000ffffffff
> index:0x0 compound_mapcount: 0
> flags: 0x1fffc0000000000()
> page dumped because: VM_BUG_ON_PAGE(PageTail(page))
> ------------[ cut here ]------------
> kernel BUG at mm/vmscan.c:1446!
> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 1 PID: 6868 Comm: a.out Not tainted 4.5.0-rc1+ #287
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff88003e24af80 ti: ffff88002e808000 task.ti: ffff88002e808000
> RIP: 0010:[] []
> isolate_lru_page+0x4ea/0x6d0
> RSP: 0018:ffff88002e80fa50 EFLAGS: 00010282
> RAX: ffff88003e24af80 RBX: ffffea0000b82240 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffea0000b82278
> RBP: ffff88002e80fa88 R08: 0000000000000001 R09: 0000000000000000
> R10: ffff88003e24af80 R11: 0000000000000001 R12: ffffea0000b82260
> R13: ffffea0000b82200 R14: ffffea0000b82201 R15: 0000000020004000
> FS: 0000000000c1f880(0063) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000020005ff8 CR3: 000000002e324000 CR4: 00000000000006e0
> Stack:
>  dffffc0000000000 ffffffff816f5e89 ffff88003124b010 0000000020002000
>  dffffc0000000000 ffffea0000b82240 0000000020004000 ffff88002e80fb10
>  ffffffff817612bd ffffea0000000001 ffff88002e80fc70 ffff88002e80fde8
> Call Trace:
>  [< inline >] migrate_page_add mm/mempolicy.c:966
>  [] queue_pages_pte_range+0x4ad/0x10b0 mm/mempolicy.c:552
>  [< inline >] walk_pmd_range mm/pagewalk.c:50
>  [< inline >] walk_pud_range mm/pagewalk.c:90
>  [< inline >] walk_pgd_range mm/pagewalk.c:116
>  [] __walk_page_range+0x653/0xcd0 mm/pagewalk.c:204
>  [] walk_page_range+0x134/0x300 mm/pagewalk.c:281
>  [] queue_pages_range+0xfb/0x130 mm/mempolicy.c:687
>  [] do_mbind+0x2c1/0xdc0 mm/mempolicy.c:1239
>  [< inline >] SYSC_mbind mm/mempolicy.c:1351
>  [] SyS_mbind+0x13d/0x150 mm/mempolicy.c:1333
>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> Code: 89 df e8 aa 64 04 00 0f 0b e8 63 6d ed ff 4d 8d 6e ff e9 73 fb
> ff ff e8 55 6d ed ff 48 c7 c6 60 7b 5b 86 48 89 df e8 86 64 04 00 <0f>
> 0b e8 3f 6d ed ff 4d 8d 6e ff e9 eb fb ff ff c7 45 d0 f0 ff
> RIP [] isolate_lru_page+0x4ea/0x6d0 mm/vmscan.c:1446
> RSP
> ---[ end trace 310d844ac0b69c5b ]---
> BUG: sleeping function called from invalid context at include/linux/sched.h:2805
> in_atomic(): 1, irqs_disabled(): 0, pid: 6868, name: a.out
> INFO: lockdep is turned off.
> CPU: 1 PID: 6868 Comm: a.out Tainted: G D 4.5.0-rc1+ #287
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  00000000ffffffff ffff88002e80f548 ffffffff829f9d0d ffff88003e24af80
>  0000000000001ad4 0000000000000000 ffff88002e80f570 ffffffff813cba2b
>  ffff88003e24af80 ffffffff865527a0 0000000000000af5 ffff88002e80f5b0
> Call Trace:
>  [< inline >] __dump_stack lib/dump_stack.c:15
>  [] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
>  [] ___might_sleep+0x27b/0x3a0 kernel/sched/core.c:7703
>  [] __might_sleep+0x90/0x1a0 kernel/sched/core.c:7665
>  [< inline >] threadgroup_change_begin include/linux/sched.h:2805
>  [] exit_signals+0x81/0x430 kernel/signal.c:2392
>  [] do_exit+0x23c/0x2cb0 kernel/exit.c:701
>  [] oops_end+0x9f/0xd0 arch/x86/kernel/dumpstack.c:250
>  [] die+0x46/0x60 arch/x86/kernel/dumpstack.c:316
>  [< inline >] do_trap_no_signal arch/x86/kernel/traps.c:205
>  [] do_trap+0x18f/0x380 arch/x86/kernel/traps.c:251
>  [] do_error_trap+0x11e/0x280 arch/x86/kernel/traps.c:290
>  [] do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:303
>  [] invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:830
>  [< inline >] migrate_page_add mm/mempolicy.c:966
>  [] queue_pages_pte_range+0x4ad/0x10b0 mm/mempolicy.c:552
>  [< inline >] walk_pmd_range mm/pagewalk.c:50
>  [< inline >] walk_pud_range mm/pagewalk.c:90
>  [< inline >] walk_pgd_range mm/pagewalk.c:116
>  [] __walk_page_range+0x653/0xcd0 mm/pagewalk.c:204
>  [] walk_page_range+0x134/0x300 mm/pagewalk.c:281
>  [] queue_pages_range+0xfb/0x130 mm/mempolicy.c:687
>  [] do_mbind+0x2c1/0xdc0 mm/mempolicy.c:1239
>  [< inline >] SYSC_mbind mm/mempolicy.c:1351
>  [] SyS_mbind+0x13d/0x150 mm/mempolicy.c:1333
>  [] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> note: a.out[6868] exited with preempt_count 1
>
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include
> #include
> #include
> #include
> #include
> #include
>
> #ifndef SYS_mlock2
> #define SYS_mlock2 325
> #endif
>
> int main()
> {
>   long r[8];
>   memset(r, -1, sizeof(r));
>   r[0] = syscall(SYS_mmap, 0x20000000ul, 0x1000ul, 0x3ul, 0x32ul,
> 0xfffffffffffffffful, 0x0ul);
>   memcpy((void*)0x20000f33, "\x2f\x64\x65\x76\x2f\x73\x67\x23", 8);
>   r[2] = syscall(SYS_open, "/dev/sg0",O_RDWR);
>   r[3] = syscall(SYS_mmap, 0x20001000ul, 0x4000ul, 0x4ul, 0x12ul, r[2],
> 0x0ul);
>   r[4] = syscall(SYS_mlock2, 0x20001000ul, 0x3000ul, 0x1ul, 0, 0, 0);
>   r[5] = syscall(SYS_mmap, 0x20005000ul, 0x1000ul, 0x3ul, 0x32ul,
> 0xfffffffffffffffful, 0x0ul);
>   *(uint64_t*)0x20005ff8 = (uint64_t)0x80000000;
>   r[7] = syscall(SYS_mbind, 0x20000000ul, 0x4000ul, 0x8000ul,
> 0x20005ff8ul, 0x5ul, 0x2ul);
>   return 0;
> }
>
>
> On commit 92e963f50fc74041b5e9e744c330dca48e04f08d.

The patch below fixes the issue for me, but this bug makes me wonder how
many bugs like this we have in the kernel... :-/

Looks like we are too permissive about which VMAs are migratable:
vma_migratable() filters out VMAs by VM_IO and VM_PFNMAP. I think
VM_DONTEXPAND also correlates with VMAs which cannot be migrated.

$ git grep VM_DONTEXPAND drivers | grep -v '\(VM_IO\|VM_PFNMAP\)' | wc -l
33

Hm.. :-|

It's worth looking at them closely... And I wouldn't be surprised if some
VMAs without any of these flags are not migratable either.

Sigh.. Any thoughts?

From 396ad132be07a2d2b9ec5d1d6ec9fe2fffe8105e Mon Sep 17 00:00:00 2001
From: "Kirill A.
 Shutemov"
Date: Tue, 26 Jan 2016 22:59:16 +0300
Subject: [PATCH] sg: mark VMA as VM_IO to prevent migration

Reduced testcase:

	#include
	#include
	#include
	#include

	#define SIZE 0x2000

	int main()
	{
		int fd;
		void *p;

		fd = open("/dev/sg0", O_RDWR);
		p = mmap(NULL, SIZE, PROT_EXEC, MAP_PRIVATE | MAP_LOCKED, fd, 0);
		mbind(p, SIZE, 0, NULL, 0, MPOL_MF_MOVE);
		return 0;
	}

We shouldn't try to migrate pages in sg VMA as we don't have a way to
update Sg_scatter_hold::pages accordingly from mm core.

Let's mark the VMA as VM_IO to indicate to mm core that the VMA is
not migratable.

Signed-off-by: Kirill A. Shutemov
Reported-by: Dmitry Vyukov
Acked-by: Vlastimil Babka
---
 drivers/scsi/sg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 503ab8b46c0b..5e820674432c 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1261,7 +1261,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
 	}
 
 	sfp->mmap_called = 1;
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
 	vma->vm_private_data = sfp;
 	vma->vm_ops = &sg_mmap_vm_ops;
 	return 0;