From patchwork Tue Mar 14 15:17:21 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 9623737
Date: Tue, 14 Mar 2017 16:17:21 +0100
From: Radim Krčmář
To: Dmitry Vyukov
Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
 "x86@kernel.org", KVM list, LKML, Alan Stern, Steve Rutherford,
 Xiao Guangrong, Haozhong Zhang, syzkaller
Subject: Re: kvm: WARNING in mmu_spte_clear_track_bits
Message-ID: <20170314151720.GA4036@potion>
References: <3e72461c-7197-e941-1d35-1aca34df2f8e@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

2017-03-12 12:20+0100, Dmitry Vyukov:
> On Tue, Jan 17, 2017 at 5:00 PM, Dmitry Vyukov wrote:
>> On Tue, Jan 17, 2017 at 4:20 PM, Paolo Bonzini wrote:
>>>
>>>
>>> On 13/01/2017 12:15, Dmitry Vyukov wrote:
>>>>
>>>> I've commented out the WARNING for now, but I am seeing lots of
>>>> use-after-free's and rcu stalls involving mmu_spte_clear_track_bits:
>>>>
>>>>
>>>> BUG: KASAN: use-after-free in mmu_spte_clear_track_bits+0x186/0x190
>>>> arch/x86/kvm/mmu.c:597 at addr ffff880068ae2008
>>>> Read of size 8 by task syz-executor2/16715
>>>> page:ffffea00016e6170 count:0 mapcount:0 mapping: (null) index:0x0
>>>> flags: 0x500000000000000()
>>>> raw: 0500000000000000 0000000000000000 0000000000000000 00000000ffffffff
>>>> raw: ffffea00017ec5a0 ffffea0001783d48 ffff88006aec5d98
>>>> page dumped because: kasan: bad access detected
>>>> CPU: 2 PID: 16715 Comm: syz-executor2 Not tainted 4.10.0-rc3+ #163
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>>> Call Trace:
>>>> __dump_stack lib/dump_stack.c:15 [inline]
>>>> dump_stack+0x292/0x3a2 lib/dump_stack.c:51
>>>> kasan_report_error mm/kasan/report.c:213 [inline]
>>>> kasan_report+0x42d/0x460 mm/kasan/report.c:307
>>>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:333
>>>> mmu_spte_clear_track_bits+0x186/0x190 arch/x86/kvm/mmu.c:597
>>>> drop_spte+0x24/0x280 arch/x86/kvm/mmu.c:1182
>>>> kvm_zap_rmapp+0x119/0x260 arch/x86/kvm/mmu.c:1401
>>>> kvm_unmap_rmapp+0x1d/0x30 arch/x86/kvm/mmu.c:1412
>>>> kvm_handle_hva_range+0x54a/0x7d0 arch/x86/kvm/mmu.c:1565
>>>> kvm_unmap_hva_range+0x2e/0x40 arch/x86/kvm/mmu.c:1591
>>>> kvm_mmu_notifier_invalidate_range_start+0xae/0x140
>>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:360
>>>> __mmu_notifier_invalidate_range_start+0x1f8/0x300 mm/mmu_notifier.c:199
>>>> mmu_notifier_invalidate_range_start include/linux/mmu_notifier.h:282 [inline]
>>>> unmap_vmas+0x14b/0x1b0 mm/memory.c:1368
>>>> unmap_region+0x2f8/0x560 mm/mmap.c:2460
>>>> do_munmap+0x7b8/0xfa0 mm/mmap.c:2657
>>>> mmap_region+0x68f/0x18e0 mm/mmap.c:1612
>>>> do_mmap+0x6a2/0xd40 mm/mmap.c:1450
>>>> do_mmap_pgoff include/linux/mm.h:2031 [inline]
>>>> vm_mmap_pgoff+0x1a9/0x200 mm/util.c:305
>>>> SYSC_mmap_pgoff mm/mmap.c:1500 [inline]
>>>> SyS_mmap_pgoff+0x22c/0x5d0 mm/mmap.c:1458
>>>> SYSC_mmap arch/x86/kernel/sys_x86_64.c:95 [inline]
>>>> SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
>>>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>> RIP: 0033:0x445329
>>>> RSP: 002b:00007fb33933cb58 EFLAGS: 00000282 ORIG_RAX: 0000000000000009
>>>> RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 0000000000445329
>>>> RDX: 0000000000000003 RSI: 0000000000af1000 RDI: 0000000020000000
>>>> RBP: 00000000006dfe90 R08: ffffffffffffffff R09: 0000000000000000
>>>> R10: 0000000000000032 R11: 0000000000000282 R12: 0000000000700000
>>>> R13: 0000000000000006 R14: ffffffffffffffff R15: 0000000020001000
>>>> Memory state around the buggy address:
>>>> ffff880068ae1f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> ffff880068ae1f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>> ffff880068ae2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>> ^
>>>> ffff880068ae2080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>> ffff880068ae2100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>> ==================================================================
>>>
>>> This could be related to the gfn_to_rmap issues.
>>
>>
>> Humm... That's possible. Potentially I am not seeing any more of
>> spte-related crashes after I applied the following patch:
>>
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -968,8 +968,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>                 /* Check for overlaps */
>>                 r = -EEXIST;
>>                 kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
>> -                       if ((slot->id >= KVM_USER_MEM_SLOTS) ||
>> -                           (slot->id == id))
>> +                       if (slot->id == id)
>>                                 continue;
>>                         if (!((base_gfn + npages <= slot->base_gfn) ||
>>                               (base_gfn >= slot->base_gfn + slot->npages)))

I don't understand how this fixes the test: the only memslot that the
test creates is at memory range 0x0-0x1000, which should not overlap
with any private memslots. There should be just the
IDENTITY_PAGETABLE_PRIVATE_MEMSLOT @ 0xfffbc000ul.

Do you get any output with this hunk?

> Friendly ping. Just hit it on

And the warning happens at mmap ... I can't reproduce, but does the bug
happen on the second mmap()? (Test line 210 when i = 0.)

The change above makes sense as memslots currently cannot overlap
anywhere. There are three private memslots that can cause this problem:
TSS, IDENTITY_MAP and APIC. TSS and IDENTITY_MAP can be configured by
userspace and must not conflict by design, so we can safely enforce
that. The APIC memslot doesn't provide such guarantees and should be
overlaid over any memory, but assuming that userspace doesn't configure
memslots there seems bearable.

Still, I'd like to understand why that patch would fix this bug.

Thanks.
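The overlap argument above can be checked outside the kernel. The following is only a userspace sketch, not code from the tree: toy_slot, slots_overlap(), would_reject() and run_demo() are made-up names, 509 is an assumed value for KVM_USER_MEM_SLOTS on x86, the private slot id is arbitrary, and the identity pagetable slot is assumed to be a single 4 KiB page at 0xfffbc000. It applies the same interval test as the hunk, with and without the "slot->id >= KVM_USER_MEM_SLOTS" skip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_PAGE_SHIFT      12   /* assumption: 4 KiB pages */
#define TOY_USER_MEM_SLOTS  509  /* assumption: stand-in for KVM_USER_MEM_SLOTS */

/* Toy stand-in for struct kvm_memory_slot; only the fields the check reads. */
struct toy_slot {
	unsigned int id;
	uint64_t base_gfn;
	uint64_t npages;
};

/* Same half-open interval test as the overlap check in the hunk. */
static bool slots_overlap(const struct toy_slot *a, const struct toy_slot *b)
{
	return !((a->base_gfn + a->npages <= b->base_gfn) ||
		 (a->base_gfn >= b->base_gfn + b->npages));
}

/*
 * Returns true if "cand" would be refused (-EEXIST in the kernel).
 * skip_private = true models the check before the patch,
 * skip_private = false models the check after it.
 */
bool would_reject(const struct toy_slot *slots, size_t n,
		  const struct toy_slot *cand, bool skip_private)
{
	assert(cand != NULL && (n == 0 || slots != NULL));

	for (size_t i = 0; i < n; i++) {
		if (skip_private && slots[i].id >= TOY_USER_MEM_SLOTS)
			continue;
		if (slots[i].id == cand->id)
			continue;
		if (slots_overlap(cand, &slots[i]))
			return true;
	}
	return false;
}

/* Prints the three cases discussed in the mail. */
void run_demo(void)
{
	/* Private identity pagetable slot; id and size are illustrative. */
	struct toy_slot priv = {
		.id = TOY_USER_MEM_SLOTS + 2,
		.base_gfn = 0xfffbc000ull >> TOY_PAGE_SHIFT,
		.npages = 1,
	};
	/* The only slot the test creates: 0x0-0x1000. */
	struct toy_slot test_slot = { .id = 0, .base_gfn = 0, .npages = 1 };
	/* Hypothetical user slot placed on top of the private one. */
	struct toy_slot clash = { .id = 1, .base_gfn = priv.base_gfn, .npages = 1 };

	printf("test slot, patched check: %d\n",
	       would_reject(&priv, 1, &test_slot, false));
	printf("clash, original check:    %d\n",
	       would_reject(&priv, 1, &clash, true));
	printf("clash, patched check:     %d\n",
	       would_reject(&priv, 1, &clash, false));
}
```

Under these assumptions, calling run_demo() from a small main() shows the test's slot at 0x0-0x1000 accepted by both variants, while the hypothetical user slot on 0xfffbc000 is refused only by the patched check. That is the reason for the question above: the patch should change behaviour only when a user slot actually lands on a private one.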
> mmotm/86292b33d4b79ee03e2f43ea0381ef85f077c760 (without the above
> change):
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 31060 at arch/x86/kvm/mmu.c:682
> mmu_spte_clear_track_bits+0x3a1/0x420 arch/x86/kvm/mmu.c:682
> CPU: 1 PID: 31060 Comm: syz-executor0 Not tainted 4.11.0-rc1+ #328
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x1a7/0x26a lib/dump_stack.c:52
> panic+0x1f8/0x40f kernel/panic.c:180
> __warn+0x1c4/0x1e0 kernel/panic.c:541
> warn_slowpath_null+0x2c/0x40 kernel/panic.c:584
> mmu_spte_clear_track_bits+0x3a1/0x420 arch/x86/kvm/mmu.c:682
> drop_spte+0x24/0x280 arch/x86/kvm/mmu.c:1323
> mmu_page_zap_pte+0x223/0x350 arch/x86/kvm/mmu.c:2438
> kvm_mmu_page_unlink_children arch/x86/kvm/mmu.c:2460 [inline]
> kvm_mmu_prepare_zap_page+0x1ce/0x13d0 arch/x86/kvm/mmu.c:2504
> kvm_zap_obsolete_pages arch/x86/kvm/mmu.c:5134 [inline]
> kvm_mmu_invalidate_zap_all_pages+0x4d4/0x6b0 arch/x86/kvm/mmu.c:5175
> kvm_arch_flush_shadow_all+0x15/0x20 arch/x86/kvm/x86.c:8364
> kvm_mmu_notifier_release+0x71/0xb0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:472
> __mmu_notifier_release+0x1e5/0x6b0 mm/mmu_notifier.c:75
> mmu_notifier_release include/linux/mmu_notifier.h:235 [inline]
> exit_mmap+0x3a3/0x470 mm/mmap.c:2941
> __mmput kernel/fork.c:890 [inline]
> mmput+0x228/0x700 kernel/fork.c:912
> exit_mm kernel/exit.c:558 [inline]
> do_exit+0x9e8/0x1c20 kernel/exit.c:866
> do_group_exit+0x149/0x400 kernel/exit.c:983
> get_signal+0x6d9/0x1840 kernel/signal.c:2318
> do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1e5/0x2d0 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:191 [inline]
> syscall_return_slowpath+0x3bd/0x460 arch/x86/entry/common.c:260
> entry_SYSCALL_64_fastpath+0xc0/0xc2
> RIP: 0033:0x4458d9
> RSP: 002b:00007ffa472c3b58 EFLAGS: 00000286 ORIG_RAX: 00000000000000ce
> RAX: fffffffffffffff4 RBX: 0000000000708000 RCX: 00000000004458d9
> RDX: 0000000000000000 RSI: 000000002006bff8 RDI: 000000000000a05b
> RBP: 0000000000000fe0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000286 R12: 00000000006df0a0
> R13: 000000000000a05b R14: 000000002006bff8 R15: 0000000000000000

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a17d78759727..7e1929432232 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -888,6 +888,14 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
 	return old_memslots;
 }
 
+void kvm_dump_slot(struct kvm_memory_slot *slot)
+{
+	printk("kvm_memory_slot %p { .id = %u, .base_gfn = %#llx, .npages = %lu, "
+	       ".userspace_addr = %#lx, .flags = %u, .dirty_bitmap = %p, .arch = ? }\n",
+	       slot, slot->id, slot->base_gfn, slot->npages,
+	       slot->userspace_addr, slot->flags, slot->dirty_bitmap);
+}
+
 /*
  * Allocate some memory and give it an address in the guest physical address
  * space.
@@ -978,12 +986,14 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		/* Check for overlaps */
 		r = -EEXIST;
 		kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
-			if ((slot->id >= KVM_USER_MEM_SLOTS) ||
-			    (slot->id == id))
+			if (slot->id == id)
 				continue;
 			if (!((base_gfn + npages <= slot->base_gfn) ||
-			      (base_gfn >= slot->base_gfn + slot->npages)))
+			      (base_gfn >= slot->base_gfn + slot->npages))) {
+				kvm_dump_slot(&new);
+				kvm_dump_slot(slot);
 				goto out;
+			}
 		}
 	}
 