From patchwork Wed Feb 5 02:50:52 2020
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 11365683
From: Peter Xu
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Christophe de Dinechin, Sean Christopherson, Paolo Bonzini,
    Jason Wang, Yan Zhao, "Michael S. Tsirkin", peterx@redhat.com,
    Kevin Tian, Alex Williamson, "Dr. David Alan Gilbert",
    Vitaly Kuznetsov
David Alan Gilbert" , Vitaly Kuznetsov Subject: [PATCH v4 01/14] KVM: X86: Change parameter for fast_page_fault tracepoint Date: Tue, 4 Feb 2020 21:50:52 -0500 Message-Id: <20200205025105.367213-2-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025105.367213-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org It would be clearer to dump the return value to know easily on whether did we go through the fast path for handling current page fault. Remove the old two last parameters because after all the old/new sptes were dumped in the same line. Signed-off-by: Peter Xu --- arch/x86/kvm/mmutrace.h | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h index 3c6522b84ff1..456371406d2a 100644 --- a/arch/x86/kvm/mmutrace.h +++ b/arch/x86/kvm/mmutrace.h @@ -244,9 +244,6 @@ TRACE_EVENT( __entry->access) ); -#define __spte_satisfied(__spte) \ - (__entry->retry && is_writable_pte(__entry->__spte)) - TRACE_EVENT( fast_page_fault, TP_PROTO(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 error_code, @@ -274,12 +271,10 @@ TRACE_EVENT( ), TP_printk("vcpu %d gva %llx error_code %s sptep %p old %#llx" - " new %llx spurious %d fixed %d", __entry->vcpu_id, + " new %llx ret %d", __entry->vcpu_id, __entry->cr2_or_gpa, __print_flags(__entry->error_code, "|", kvm_mmu_trace_pferr_flags), __entry->sptep, - __entry->old_spte, __entry->new_spte, - __spte_satisfied(old_spte), __spte_satisfied(new_spte) - ) + __entry->old_spte, __entry->new_spte, __entry->retry) ); TRACE_EVENT( From patchwork Wed Feb 5 02:50:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365689 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C2CEA17E0 for ; Wed, 5 Feb 2020 02:51:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A1E8221D7D for ; Wed, 5 Feb 2020 02:51:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="JPBdHM4L" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727954AbgBECv3 (ORCPT ); Tue, 4 Feb 2020 21:51:29 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:55766 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727922AbgBECvS (ORCPT ); Tue, 4 Feb 2020 21:51:18 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871077; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Plaj5tCIvaG10j1wUs3FmUanZWkBG9z74VjeoL5zXQ=; b=JPBdHM4L0dcvqgry2wyZv0pZP3fR3sFNlYQLUO8lG0zJkRgFFIc9V7buakUqvFR2tAC34k cJjA+domOlctrXx14p6EoacukF6li5GMOMZz1b6UBM925tq503RAH2IIljHUe0d37ljQdq lWRK5+RdOrwrHL3fY+AHMhNB3iYzwA0= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-363-yzm4hxbFNkSHSy9nFxkPcQ-1; Tue, 04 Feb 2020 21:51:14 -0500 X-MC-Unique: yzm4hxbFNkSHSy9nFxkPcQ-1 Received: by 
mail-qt1-f199.google.com with SMTP id h11so404245qtq.11 for ; Tue, 04 Feb 2020 18:51:14 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5Plaj5tCIvaG10j1wUs3FmUanZWkBG9z74VjeoL5zXQ=; b=YSwgzqT46ZyUuz/B1b59zAqjRiUMw9+Hae3ZARVbQpJtzTidD2zosJUe2hF2GrWkD7 c+kscWk0Cveu3UlXK+3l6xA8EdRLaNTsaF1EutLEbBtMcAAQk7OcCYhHbSSM6OteDB+x dLxKaRdAC+mFlUK8RBpfwf4Si/xlWznNI7dBCkdPx/a7dIbzei16idoaxE2pkyrt7WPA I+wikf9zG9oIy5nNTrRft3RfArRS0FC5B4G47ADHL+ad9BXyx3W+zDDTLZJGceOAOEAI Rs53CWgc4NiyfgtnJ/6jsoOuYs5fvCx41O5152v6O6QP1ievi24EdNziQ+emgFeFf/WY qigg== X-Gm-Message-State: APjAAAUM2kIi5A/lxbnUCgB5NxIDDFqPOKtWI/cefwP6hylSc749TB4c HB9jn8yo7wufbTDMaPxb/dlo654uhF9pk3R6WmQWxxXK2QuRPXF3Uyms+ZEKisfB/phYuQSyVVL GBm4GiNVXauVQ X-Received: by 2002:a37:b883:: with SMTP id i125mr32117824qkf.64.1580871073488; Tue, 04 Feb 2020 18:51:13 -0800 (PST) X-Google-Smtp-Source: APXvYqzBej9VNfpaFC9AYibSNQqxHqKsfHwkxI7lciYbSIGQ7kQ3UunKMIE80komuW9A8u2g//rhrQ== X-Received: by 2002:a37:b883:: with SMTP id i125mr32117807qkf.64.1580871073231; Tue, 04 Feb 2020 18:51:13 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id b141sm12380923qkg.33.2020.02.04.18.51.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:51:12 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Christophe de Dinechin , Sean Christopherson , Paolo Bonzini , Jason Wang , Yan Zhao , "Michael S . Tsirkin" , peterx@redhat.com, Kevin Tian , Alex Williamson , "Dr . David Alan Gilbert" , Vitaly Kuznetsov Subject: [PATCH v4 02/14] KVM: Cache as_id in kvm_memory_slot Date: Tue, 4 Feb 2020 21:50:53 -0500 Message-Id: <20200205025105.367213-3-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025105.367213-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Cache the address space ID just like the slot ID. It will be used in order to fill in the dirty ring entries. 
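For reference, the dirty ring patch later in this series folds the
cached address space ID into the ring entry's slot field; a one-line
sketch of that use (taken from the later patch, not from this one):

    u32 slot = (memslot->as_id << 16) | memslot->id;  /* as_id in the high 16 bits */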
Suggested-by: Paolo Bonzini Suggested-by: Sean Christopherson Signed-off-by: Peter Xu --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 6d5331b0d937..62aad0a2707a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -346,6 +346,7 @@ struct kvm_memory_slot { unsigned long userspace_addr; u32 flags; short id; + u8 as_id; }; static inline unsigned long kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index eb3709d55139..69190f9f7bd8 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1035,6 +1035,8 @@ int __kvm_set_memory_region(struct kvm *kvm, new = old = *slot; + BUILD_BUG_ON(U8_MAX < KVM_ADDRESS_SPACE_NUM); + new.as_id = as_id; new.id = id; new.base_gfn = base_gfn; new.npages = npages; From patchwork Wed Feb 5 02:50:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365685 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E98D514E3 for ; Wed, 5 Feb 2020 02:51:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B469421D7D for ; Wed, 5 Feb 2020 02:51:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="OOyUR0jI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727933AbgBECv1 (ORCPT ); Tue, 4 Feb 2020 21:51:27 -0500 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:59897 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727945AbgBECvW (ORCPT ); Tue, 4 Feb 2020 21:51:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871080; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JlOS6C8tGQvFVsl+o6di+4ALD4s7OpNlxzWInCgkgGU=; b=OOyUR0jIXrMyQoVU+8glp3/DDIZVfiVwuaLUPE/Bk+x1PoT7GP5hYwh3bfcfNTP0bv7sN4 bEa6Kukkep+JkRFWz5KEj2Ww+/iU5IMaGV/ywj49tHM9QF65n4cxOH24+GTZQYKdr8pqxx H/LQFz4lAXLQdZc6o78uIlykj7GADcs= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-365-qtakMHO_PdO-uElLJOnzUQ-1; Tue, 04 Feb 2020 21:51:17 -0500 X-MC-Unique: qtakMHO_PdO-uElLJOnzUQ-1 Received: by mail-qv1-f70.google.com with SMTP id d7so611082qvq.12 for ; Tue, 04 Feb 2020 18:51:17 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JlOS6C8tGQvFVsl+o6di+4ALD4s7OpNlxzWInCgkgGU=; b=DhIj3UJTTtViXovcnQ+WP6oelsNodbKw+HA/Rch06VyNlRpe/LXUjBPmQGfkNrmhXF qBdT0uSHZNXminWu3JuD7fpywyEEfXbEd3rsFJdC1MO2n++9tt09SSOMKx40/fhKBZAe rA7oDU1Vo8op5sYmPTg1EqJTO2p4cspvVp/hHMC5KWyQnDe+tmmJkUmDRw7OrQQ/ufjs YxdCE4XBEsSmPP/Vdoe15OqrINNPILZNyswyXI508yt5i/cZcbL11bIEOgVfboMBfPit YesDdNnvrVfGk9rPqoCt36ZvGCDf+s2BOb5eUcnJCMdkDnrYYLKl8VWjjlV4idmcdWFR TNuw== X-Gm-Message-State: APjAAAWolR5d2JCjppglWr4QZQekvR0xSh736gt1mDSdvndharjXhzcF 
E3WIvJIF2nXAsJCuuA555cao+MdYYP4ejOEjLMfwTj4VW5a7/Av4/bhROaTEIDOeIDwnYV1dMXq 1OjcN7JtX4Cj/ X-Received: by 2002:a05:620a:20d0:: with SMTP id f16mr3507890qka.349.1580871075576; Tue, 04 Feb 2020 18:51:15 -0800 (PST) X-Google-Smtp-Source: APXvYqzlOB1Xe5sXg3yy7jrT914bM9INW2BmPt7HvmMxS/JhDswHqDwxZKkxpLE34tzsUvxNTX8Usw== X-Received: by 2002:a05:620a:20d0:: with SMTP id f16mr3507866qka.349.1580871075077; Tue, 04 Feb 2020 18:51:15 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id b141sm12380923qkg.33.2020.02.04.18.51.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:51:14 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Christophe de Dinechin , Sean Christopherson , Paolo Bonzini , Jason Wang , Yan Zhao , "Michael S . Tsirkin" , peterx@redhat.com, Kevin Tian , Alex Williamson , "Dr . David Alan Gilbert" , Vitaly Kuznetsov Subject: [PATCH v4 03/14] KVM: X86: Don't track dirty for KVM_SET_[TSS_ADDR|IDENTITY_MAP_ADDR] Date: Tue, 4 Feb 2020 21:50:54 -0500 Message-Id: <20200205025105.367213-4-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025105.367213-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Originally, we have three code paths that can dirty a page without vcpu context for X86: - init_rmode_identity_map - init_rmode_tss - kvmgt_rw_gpa init_rmode_identity_map and init_rmode_tss will be setup on destination VM no matter what (and the guest cannot even see them), so it does not make sense to track them at all. To do this, allow __x86_set_memory_region() to return the userspace address that just allocated to the caller. Then in both of the functions we directly write to the userspace address instead of calling kvm_write_*() APIs. Another trivial change is that we don't need to explicitly clear the identity page table root in init_rmode_identity_map() because no matter what we'll write to the whole page with 4M huge page entries. Suggested-by: Paolo Bonzini Signed-off-by: Peter Xu --- arch/x86/include/asm/kvm_host.h | 3 +- arch/x86/kvm/svm.c | 9 ++-- arch/x86/kvm/vmx/vmx.c | 78 ++++++++++++++++----------------- arch/x86/kvm/x86.c | 40 ++++++++++++++--- 4 files changed, 80 insertions(+), 50 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 77d206a93658..8fc46bbce57a 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1640,7 +1640,8 @@ void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu); int kvm_is_in_guest(void); -int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size); +void __user *__x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, + u32 size); bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu); bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index bf0556588ad0..160468e0898e 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1736,7 +1736,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu, */ static int avic_update_access_page(struct kvm *kvm, bool activate) { - int ret = 0; + void __user *ret; + int r = 0; mutex_lock(&kvm->slots_lock); /* @@ -1752,13 +1753,15 @@ static int avic_update_access_page(struct kvm *kvm, bool activate) APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, APIC_DEFAULT_PHYS_BASE, activate ? 
PAGE_SIZE : 0); - if (ret) + if (IS_ERR(ret)) { + r = PTR_ERR(ret); goto out; + } kvm->arch.apic_access_page_done = activate; out: mutex_unlock(&kvm->slots_lock); - return ret; + return r; } static int avic_init_backing_page(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 1419c53aed16..a01f3bcef27a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3447,34 +3447,26 @@ static bool guest_state_valid(struct kvm_vcpu *vcpu) return true; } -static int init_rmode_tss(struct kvm *kvm) +static int init_rmode_tss(struct kvm *kvm, void __user *ua) { - gfn_t fn; + const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0))); u16 data = 0; int idx, r; - idx = srcu_read_lock(&kvm->srcu); - fn = to_kvm_vmx(kvm)->tss_addr >> PAGE_SHIFT; - r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE); - if (r < 0) - goto out; + for (idx = 0; idx < 3; idx++) { + r = __copy_to_user(ua + PAGE_SIZE * idx, zero_page, PAGE_SIZE); + if (r) + return -EFAULT; + } + data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE; - r = kvm_write_guest_page(kvm, fn++, &data, - TSS_IOPB_BASE_OFFSET, sizeof(u16)); - if (r < 0) - goto out; - r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE); - if (r < 0) - goto out; - r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE); - if (r < 0) - goto out; + r = __copy_to_user(ua + TSS_IOPB_BASE_OFFSET, &data, sizeof(u16)); + if (r) + return -EFAULT; + data = ~0; - r = kvm_write_guest_page(kvm, fn, &data, - RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1, - sizeof(u8)); -out: - srcu_read_unlock(&kvm->srcu, idx); + r = __copy_to_user(ua + RMODE_TSS_SIZE - 1, &data, sizeof(u8)); + return r; } @@ -3483,6 +3475,7 @@ static int init_rmode_identity_map(struct kvm *kvm) struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); int i, r = 0; kvm_pfn_t identity_map_pfn; + void __user *uaddr; u32 tmp; /* Protect kvm_vmx->ept_identity_pagetable_done. 
*/ @@ -3495,22 +3488,24 @@ static int init_rmode_identity_map(struct kvm *kvm) kvm_vmx->ept_identity_map_addr = VMX_EPT_IDENTITY_PAGETABLE_ADDR; identity_map_pfn = kvm_vmx->ept_identity_map_addr >> PAGE_SHIFT; - r = __x86_set_memory_region(kvm, IDENTITY_PAGETABLE_PRIVATE_MEMSLOT, - kvm_vmx->ept_identity_map_addr, PAGE_SIZE); - if (r < 0) + uaddr = __x86_set_memory_region(kvm, + IDENTITY_PAGETABLE_PRIVATE_MEMSLOT, + kvm_vmx->ept_identity_map_addr, + PAGE_SIZE); + if (IS_ERR(uaddr)) { + r = PTR_ERR(uaddr); goto out; + } - r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE); - if (r < 0) - goto out; /* Set up identity-mapping pagetable for EPT in real mode */ for (i = 0; i < PT32_ENT_PER_PAGE; i++) { tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); - r = kvm_write_guest_page(kvm, identity_map_pfn, - &tmp, i * sizeof(tmp), sizeof(tmp)); - if (r < 0) + r = __copy_to_user(uaddr + i * sizeof(tmp), &tmp, sizeof(tmp)); + if (r) { + r = -EFAULT; goto out; + } } kvm_vmx->ept_identity_pagetable_done = true; @@ -3537,19 +3532,22 @@ static void seg_setup(int seg) static int alloc_apic_access_page(struct kvm *kvm) { struct page *page; - int r = 0; + void __user *r; + int ret = 0; mutex_lock(&kvm->slots_lock); if (kvm->arch.apic_access_page_done) goto out; r = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, APIC_DEFAULT_PHYS_BASE, PAGE_SIZE); - if (r) + if (IS_ERR(r)) { + ret = PTR_ERR(r); goto out; + } page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); if (is_error_page(page)) { - r = -EFAULT; + ret = -EFAULT; goto out; } @@ -3561,7 +3559,7 @@ static int alloc_apic_access_page(struct kvm *kvm) kvm->arch.apic_access_page_done = true; out: mutex_unlock(&kvm->slots_lock); - return r; + return ret; } int allocate_vpid(void) @@ -4479,7 +4477,7 @@ static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu) static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) { - int ret; + void __user *ret; if (enable_unrestricted_guest) return 0; @@ -4489,10 +4487,12 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) PAGE_SIZE * 3); mutex_unlock(&kvm->slots_lock); - if (ret) - return ret; + if (IS_ERR(ret)) + return PTR_ERR(ret); + to_kvm_vmx(kvm)->tss_addr = addr; - return init_rmode_tss(kvm); + + return init_rmode_tss(kvm, ret); } static int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7e3f1d937224..030435f1a033 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9732,7 +9732,33 @@ void kvm_arch_sync_events(struct kvm *kvm) kvm_free_pit(kvm); } -int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size) +/** + * __x86_set_memory_region: Setup KVM internal memory slot + * + * @kvm: the kvm pointer to the VM. + * @id: the slot ID to setup. + * @gpa: the GPA to install the slot (unused when @size == 0). + * @size: the size of the slot. Set to zero to uninstall a slot. + * + * This function helps to setup a KVM internal memory slot. Specify + * @size > 0 to install a new slot, while @size == 0 to uninstall a + * slot. The return code can be one of the following: + * + * - An error number if error happened, or, + * - For installation: the HVA of the newly mapped memory slot, or, + * - For uninstallation: zero if we successfully uninstall a slot. + * + * The caller should always use IS_ERR() to check the return value + * before use. 
NOTE: KVM internal memory slots are guaranteed and + * won't change until the VM is destroyed. This is also true to the + * returned HVA when installing a new memory slot. The HVA can be + * invalidated by either an errornous userspace program or a VM under + * destruction, however as long as we use __copy_{to|from}_user() + * properly upon the HVAs and handle the failure paths always then + * we're safe. + */ +void __user * __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, + u32 size) { int i, r; unsigned long hva; @@ -9741,12 +9767,12 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size) /* Called with kvm->slots_lock held. */ if (WARN_ON(id >= KVM_MEM_SLOTS_NUM)) - return -EINVAL; + return ERR_PTR(-EINVAL); slot = id_to_memslot(slots, id); if (size) { if (slot->npages) - return -EEXIST; + return ERR_PTR(-EEXIST); /* * MAP_SHARED to prevent internal slot pages from being moved @@ -9755,10 +9781,10 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size) hva = vm_mmap(NULL, 0, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, 0); if (IS_ERR((void *)hva)) - return PTR_ERR((void *)hva); + return (void __user *)hva; } else { if (!slot->npages) - return 0; + return ERR_PTR(0); hva = 0; } @@ -9774,13 +9800,13 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size) m.memory_size = size; r = __kvm_set_memory_region(kvm, &m); if (r < 0) - return r; + return ERR_PTR(r); } if (!size) vm_munmap(old.userspace_addr, old.npages * PAGE_SIZE); - return 0; + return (void __user *)hva; } EXPORT_SYMBOL_GPL(__x86_set_memory_region); From patchwork Wed Feb 5 02:58:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365705 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E807138D for ; Wed, 5 Feb 2020 02:58:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 23F3921741 for ; Wed, 5 Feb 2020 02:58:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Zk8Pnsf2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727862AbgBEC6u (ORCPT ); Tue, 4 Feb 2020 21:58:50 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:26702 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727708AbgBEC6t (ORCPT ); Tue, 4 Feb 2020 21:58:49 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871528; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iMoyuUQG6mGTZVCTKYztmS6Hs0K242TSdWPEeOxZW6A=; b=Zk8Pnsf2ajJs2A8PU1TMYhlCMP3QCH+1zyGe7XH59GdobpRhqwsakGFtcWBX6ZKSjAdfAr z/SBWwrcX7HbYbeCfvmgUYLgzluBlxc3hEwf3Jpex99tjYbcLq982afc4JuLs5z52Scaou AK2HwdMUH1ycPgML81mtVr8bpsWuo24= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-50-82FxKHxYPEGgZODUFk0-Ng-1; Tue, 04 Feb 2020 21:58:46 -0500 X-MC-Unique: 82FxKHxYPEGgZODUFk0-Ng-1 Received: by mail-qv1-f71.google.com with SMTP id dc2so631932qvb.7 for ; Tue, 04 Feb 2020 18:58:46 -0800 (PST) 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=iMoyuUQG6mGTZVCTKYztmS6Hs0K242TSdWPEeOxZW6A=; b=jBmc5V3iyIDPoS9LVIZbn9srDvE06K2trImDOoSECT5izMG0uXRoIpyVqDANtR7e3X 6JNAOcLB/eqMCLHlsq394Vq6RatAXZjVlx0OtlJ1mPI0864XijLcL//sC0YsmHhP/EX8 xWzNxNnOl14Ay6N+tZw7GJZTINOp271IypN8JxWFd6egfp07mR4T5BOwLsihkNteO8dc TLr3lDhc3W+shHkAYytX8JKxiU6I/tEbdPxy9+64ABFEJtZjLcXNCVBNCqn2VRdZ0TRK jjlXqGxUXwophf2lXchVDrqdWRLaX3AuYlPUlhicZvuh9C/CybppNIWnMWdlmUaDkYD4 4JSw== X-Gm-Message-State: APjAAAWhQVQ6KY48jO/BmSsUMqK8XvxgKaoE09T9GDFAU6KC1MALeUNB vfH0ZKLlRjXpDKLZ7bFUr28QRILfhAl3Zi+KIkK/12SsBpJNRQbbrStoLNLAXACPpEK/0GMKV7U j2zP2pA9fDUoB X-Received: by 2002:ac8:1415:: with SMTP id k21mr31821680qtj.300.1580871525712; Tue, 04 Feb 2020 18:58:45 -0800 (PST) X-Google-Smtp-Source: APXvYqwcW1YtKLT8v5bd10IJa63sORuNDIe53LmsHkfWJo8kyUKgeShP1B4VamT5gnlv2wvLf68pxw== X-Received: by 2002:ac8:1415:: with SMTP id k21mr31821664qtj.300.1580871525461; Tue, 04 Feb 2020 18:58:45 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:44 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 04/14] KVM: Pass in kvm pointer into mark_page_dirty_in_slot() Date: Tue, 4 Feb 2020 21:58:32 -0500 Message-Id: <20200205025842.367575-1-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025105.367213-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The context will be needed to implement the kvm dirty ring. 
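As a rough illustration of why the context is needed, this is
approximately how the follow-up dirty ring patch in this series ends up
using the new parameter (sketch based on that later patch):

    static void mark_page_dirty_in_slot(struct kvm *kvm,
                                        struct kvm_memory_slot *memslot,
                                        gfn_t gfn)
    {
            if (memslot && memslot->dirty_bitmap) {
                    unsigned long rel_gfn = gfn - memslot->base_gfn;
                    u32 slot = (memslot->as_id << 16) | memslot->id;

                    /* The kvm pointer tells us whether a dirty ring is enabled */
                    if (kvm->dirty_ring_size)
                            kvm_dirty_ring_push(kvm_dirty_ring_get(kvm),
                                                slot, rel_gfn);
                    else
                            set_bit_le(rel_gfn, memslot->dirty_bitmap);
            }
    }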
Reviewed-by: Paolo Bonzini Signed-off-by: Peter Xu --- virt/kvm/kvm_main.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 69190f9f7bd8..5307f6e33587 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -144,7 +144,9 @@ static void hardware_disable_all(void); static void kvm_io_bus_destroy(struct kvm_io_bus *bus); -static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot, gfn_t gfn); +static void mark_page_dirty_in_slot(struct kvm *kvm, + struct kvm_memory_slot *memslot, + gfn_t gfn); __visible bool kvm_rebooting; EXPORT_SYMBOL_GPL(kvm_rebooting); @@ -2057,7 +2059,8 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa, } EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic); -static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn, +static int __kvm_write_guest_page(struct kvm *kvm, + struct kvm_memory_slot *memslot, gfn_t gfn, const void *data, int offset, int len) { int r; @@ -2069,7 +2072,7 @@ static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn, r = __copy_to_user((void __user *)addr + offset, data, len); if (r) return -EFAULT; - mark_page_dirty_in_slot(memslot, gfn); + mark_page_dirty_in_slot(kvm, memslot, gfn); return 0; } @@ -2078,7 +2081,7 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, { struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn); - return __kvm_write_guest_page(slot, gfn, data, offset, len); + return __kvm_write_guest_page(kvm, slot, gfn, data, offset, len); } EXPORT_SYMBOL_GPL(kvm_write_guest_page); @@ -2087,7 +2090,7 @@ int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, { struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); - return __kvm_write_guest_page(slot, gfn, data, offset, len); + return __kvm_write_guest_page(vcpu->kvm, slot, gfn, data, offset, len); } EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest_page); @@ -2206,7 +2209,7 @@ int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, r = __copy_to_user((void __user *)ghc->hva + offset, data, len); if (r) return -EFAULT; - mark_page_dirty_in_slot(ghc->memslot, gpa >> PAGE_SHIFT); + mark_page_dirty_in_slot(kvm, ghc->memslot, gpa >> PAGE_SHIFT); return 0; } @@ -2273,7 +2276,8 @@ int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len) } EXPORT_SYMBOL_GPL(kvm_clear_guest); -static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot, +static void mark_page_dirty_in_slot(struct kvm *kvm, + struct kvm_memory_slot *memslot, gfn_t gfn) { if (memslot && memslot->dirty_bitmap) { @@ -2288,7 +2292,7 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn) struct kvm_memory_slot *memslot; memslot = gfn_to_memslot(kvm, gfn); - mark_page_dirty_in_slot(memslot, gfn); + mark_page_dirty_in_slot(kvm, memslot, gfn); } EXPORT_SYMBOL_GPL(mark_page_dirty); @@ -2297,7 +2301,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn) struct kvm_memory_slot *memslot; memslot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); - mark_page_dirty_in_slot(memslot, gfn); + mark_page_dirty_in_slot(vcpu->kvm, memslot, gfn); } EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty); From patchwork Wed Feb 5 02:58:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365711 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 
ACC7B138D for ; Wed, 5 Feb 2020 02:58:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 64A04217BA for ; Wed, 5 Feb 2020 02:58:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="h76+XeJu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727988AbgBEC65 (ORCPT ); Tue, 4 Feb 2020 21:58:57 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:52939 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727925AbgBEC65 (ORCPT ); Tue, 4 Feb 2020 21:58:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871535; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hpdraubGKKn/Ad6s+UvGI32R+k/N74IuIVKBY/+B63Q=; b=h76+XeJuJ+or/4r7Au46Hp3Ny5owQjJSENcQLPga/d1kXMZyXMiuizbrvfL4M//o29++X3 EXBwCM8qu01VpDuInhNHxS3mZLP/rpDl0Su6ACBouDTDyHBLAykmglTOopIalV/f8T1n3P OlpcLH+Cy+zT35mqguH0mBnd0Qj6hes= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-372-oxDqxrZpOBuIc8clGk4JAA-1; Tue, 04 Feb 2020 21:58:49 -0500 X-MC-Unique: oxDqxrZpOBuIc8clGk4JAA-1 Received: by mail-qt1-f200.google.com with SMTP id c8so390877qte.22 for ; Tue, 04 Feb 2020 18:58:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hpdraubGKKn/Ad6s+UvGI32R+k/N74IuIVKBY/+B63Q=; b=LmI/Kn+l5ED13GmbIxByKn9JvYeXF+LcaIMj7NpMfJOIqdnqRkuY/43SJoqnekwtC6 nAOjxTnQSIseFwlviP8BiwAQ3vvSk124pMQAYOW+tNIcPH2uUKZIsKC3Slk5Po2pm59o DYmAMoUQvvMhJdQ02dtRc9u9gQveOa5ZRzsNC9n7n1lieLzSBvZimwG4v5DGnaUmPTtz Mmu4fslM+JDigJfWmw+0MBt6+wXWR5LGfwYxslTjSFu8HxdyuHxCB0f3ZnzS/oG6nQPT WDtQVm8BO6BTav7IK4O6jCwhKmEa0O9ZvC4fQSxTzEdTLttR8EwTqytLnzmocZQ/dgrq XSTg== X-Gm-Message-State: APjAAAWcSam8y0FQO7iBcAWlcp3dzlzoa2Zf44JqHkYrzUi8lF2QcWrl /INT4BwYfia5s/8TH7I/+kpZjXi83jKLCjrkHZcYP6q1ceYkhpZp3zm/KJu4x9WcChgC1UzMxuo C3PdRlrpMG4mw X-Received: by 2002:a37:6fc1:: with SMTP id k184mr7209348qkc.53.1580871528415; Tue, 04 Feb 2020 18:58:48 -0800 (PST) X-Google-Smtp-Source: APXvYqyVhiziUUprjo6MsBpZ0G4VKKa8pUTXuKKUJISWV8la6mlx4ZeT1XfKfDFdbRrmHeMe05RiNA== X-Received: by 2002:a37:6fc1:: with SMTP id k184mr7209320qkc.53.1580871527468; Tue, 04 Feb 2020 18:58:47 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:46 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com, Lei Cao Subject: [PATCH 05/14] KVM: X86: Implement ring-based dirty memory tracking Date: Tue, 4 Feb 2020 21:58:33 -0500 Message-Id: <20200205025842.367575-2-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: 
<20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch is heavily based on previous work from Lei Cao and Paolo Bonzini . [1] KVM currently uses large bitmaps to track dirty memory. These bitmaps are copied to userspace when userspace queries KVM for its dirty page information. The use of bitmaps is mostly sufficient for live migration, as large parts of memory are be dirtied from one log-dirty pass to another. However, in a checkpointing system, the number of dirty pages is small and in fact it is often bounded---the VM is paused when it has dirtied a pre-defined number of pages. Traversing a large, sparsely populated bitmap to find set bits is time-consuming, as is copying the bitmap to user-space. A similar issue will be there for live migration when the guest memory is huge while the page dirty procedure is trivial. In that case for each dirty sync we need to pull the whole dirty bitmap to userspace and analyse every bit even if it's mostly zeros. The preferred data structure for above scenarios is a dense list of guest frame numbers (GFN). This patch series stores the dirty list in kernel memory that can be memory mapped into userspace to allow speedy harvesting. This patch enables dirty ring for X86 only. However it should be easily extended to other archs as well. [1] https://patchwork.kernel.org/patch/10471409/ Signed-off-by: Lei Cao Signed-off-by: Paolo Bonzini Signed-off-by: Peter Xu --- Documentation/virt/kvm/api.txt | 118 ++++++++++++++++++++ arch/x86/include/asm/kvm_host.h | 3 + arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/Makefile | 3 +- arch/x86/kvm/mmu/mmu.c | 6 ++ arch/x86/kvm/vmx/vmx.c | 7 ++ arch/x86/kvm/x86.c | 9 ++ include/linux/kvm_dirty_ring.h | 50 +++++++++ include/linux/kvm_host.h | 15 +++ include/trace/events/kvm.h | 78 ++++++++++++++ include/uapi/linux/kvm.h | 44 ++++++++ virt/kvm/dirty_ring.c | 176 ++++++++++++++++++++++++++++++ virt/kvm/kvm_main.c | 184 +++++++++++++++++++++++++++++++- 13 files changed, 692 insertions(+), 2 deletions(-) create mode 100644 include/linux/kvm_dirty_ring.h create mode 100644 virt/kvm/dirty_ring.c diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt index ebb37b34dcfc..558e719efdec 100644 --- a/Documentation/virt/kvm/api.txt +++ b/Documentation/virt/kvm/api.txt @@ -231,6 +231,7 @@ Based on their initialization different VMs may have different capabilities. It is thus encouraged to use the vm ioctl to query for capabilities (available with KVM_CAP_CHECK_EXTENSION_VM on the vm fd) + 4.5 KVM_GET_VCPU_MMAP_SIZE Capability: basic @@ -243,6 +244,18 @@ The KVM_RUN ioctl (cf.) communicates with userspace via a shared memory region. This ioctl returns the size of that region. See the KVM_RUN documentation for details. +Besides the size of the KVM_RUN communication region, other areas of +the VCPU file descriptor can be mmap-ed, including: + +- if KVM_CAP_COALESCED_MMIO is available, a page at + KVM_COALESCED_MMIO_PAGE_OFFSET * PAGE_SIZE; for historical reasons, + this page is included in the result of KVM_GET_VCPU_MMAP_SIZE. + KVM_CAP_COALESCED_MMIO is not documented yet. + +- if KVM_CAP_DIRTY_LOG_RING is available, a number of pages at + KVM_DIRTY_LOG_PAGE_OFFSET * PAGE_SIZE. For more information on + KVM_CAP_DIRTY_LOG_RING, see section 8.3. + 4.6 KVM_SET_MEMORY_REGION @@ -5376,6 +5389,7 @@ CPU when the exception is taken. 
If this virtual SError is taken to EL1 using AArch64, this value will be reported in the ISS field of ESR_ELx. See KVM_CAP_VCPU_EVENTS for more details. + 8.20 KVM_CAP_HYPERV_SEND_IPI Architectures: x86 @@ -5383,6 +5397,7 @@ Architectures: x86 This capability indicates that KVM supports paravirtualized Hyper-V IPI send hypercalls: HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx. + 8.21 KVM_CAP_HYPERV_DIRECT_TLBFLUSH Architecture: x86 @@ -5396,3 +5411,106 @@ handling by KVM (as some KVM hypercall may be mistakenly treated as TLB flush hypercalls by Hyper-V) so userspace should disable KVM identification in CPUID and only exposes Hyper-V identification. In this case, guest thinks it's running on Hyper-V and only use Hyper-V hypercalls. + +8.22 KVM_CAP_DIRTY_LOG_RING + +Architectures: x86 +Parameters: args[0] - size of the dirty log ring + +KVM is capable of tracking dirty memory using ring buffers that are +mmaped into userspace; there is one dirty ring per vcpu. + +One dirty ring is defined as below internally: + +struct kvm_dirty_ring { + u32 dirty_index; + u32 reset_index; + u32 size; + u32 soft_limit; + struct kvm_dirty_gfn *dirty_gfns; + int index; +}; + +Dirty GFNs (Guest Frame Numbers) are stored in the dirty_gfns array. +For each of the dirty entry it's defined as: + +struct kvm_dirty_gfn { + __u32 flags; + __u32 slot; /* as_id | slot_id */ + __u64 offset; +}; + +Each GFN is a state machine itself. The state is embeded in the flags +field, as defined in the uapi header: + +/* + * KVM dirty GFN flags, defined as: + * + * |---------------+---------------+--------------| + * | bit 1 (reset) | bit 0 (dirty) | Status | + * |---------------+---------------+--------------| + * | 0 | 0 | Invalid GFN | + * | 0 | 1 | Dirty GFN | + * | 1 | X | GFN to reset | + * |---------------+---------------+--------------| + * + * Lifecycle of a dirty GFN goes like: + * + * dirtied collected reset + * 00 -----------> 01 -------------> 1X -------+ + * ^ | + * | | + * +------------------------------------------+ + * + * The userspace program is only responsible for the 01->1X state + * conversion (to collect dirty bits). Also, it must not skip any + * dirty bits so that dirty bits are always collected in sequence. + */ +#define KVM_DIRTY_GFN_F_DIRTY BIT(0) +#define KVM_DIRTY_GFN_F_RESET BIT(1) +#define KVM_DIRTY_GFN_F_MASK 0x3 + +Userspace calls KVM_ENABLE_CAP ioctl right after KVM_CREATE_VM ioctl +to enable this capability for the new guest and set the size of the +rings. It is only allowed before creating any vCPU, and the size of +the ring must be a power of two. The larger the ring buffer, the less +likely the ring is full and the VM is forced to exit to userspace. The +optimal size depends on the workload, but it is recommended that it be +at least 64 KiB (4096 entries). + +Just like for dirty page bitmaps, the buffer tracks writes to +all user memory regions for which the KVM_MEM_LOG_DIRTY_PAGES flag was +set in KVM_SET_USER_MEMORY_REGION. Once a memory region is registered +with the flag set, userspace can start harvesting dirty pages from the +ring buffer. + +To harvest the dirty pages, userspace accesses the mmaped ring buffer +to read the dirty GFNs starting from zero. If the flags has the DIRTY +bit set (at this stage the RESET bit must be cleared), then it means +this GFN is a dirty GFN. 
The userspace should collect this GFN and +mark the flags from state 01b to 1Xb (bit 0 will be ignored by KVM, +but bit 1 must be set to show that this GFN is collected and waiting +for a reset), and move on to the next GFN. The userspace should +continue to do this until when the flags of a GFN has the DIRTY bit +cleared, it means we've collected all the dirty GFNs we have for now. +It's not a must that the userspace collects the all dirty GFNs in +once. However it must collect the dirty GFNs in sequence, i.e., the +userspace program cannot skip one dirty GFN to collect the one next to +it. + +After processing one or more entries in the ring buffer, userspace +calls the VM ioctl KVM_RESET_DIRTY_RINGS to notify the kernel about +it, so that the kernel will reprotect those collected GFNs. +Therefore, the ioctl must be called *before* reading the content of +the dirty pages. + +The dirty ring interface has a major difference comparing to the +KVM_GET_DIRTY_LOG interface in that, when reading the dirty ring from +userspace it's still possible that the kernel has not yet flushed the +hardware dirty buffers into the kernel buffer (the flushing was +previously done by the KVM_GET_DIRTY_LOG ioctl). To achieve that, one +needs to kick the vcpu out for a hardware buffer flush (vmexit) to +make sure all the existing dirty gfns are flushed to the dirty rings. + +The dirty ring can gets full. When it happens, the KVM_RUN of the +vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL. diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 8fc46bbce57a..8a2419505b33 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1199,6 +1199,7 @@ struct kvm_x86_ops { struct kvm_memory_slot *slot, gfn_t offset, unsigned long mask); int (*write_log_dirty)(struct kvm_vcpu *vcpu); + int (*cpu_dirty_log_size)(void); /* pmu operations of sub-arch */ const struct kvm_pmu_ops *pmu_ops; @@ -1688,4 +1689,6 @@ static inline int kvm_cpu_get_apicid(int mps_cpu) #define GET_SMSTATE(type, buf, offset) \ (*(type *)((buf) + (offset) - 0x7e00)) +int kvm_cpu_dirty_log_size(void); + #endif /* _ASM_X86_KVM_HOST_H */ diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 503d3f42da16..b59bf356c478 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -12,6 +12,7 @@ #define KVM_PIO_PAGE_OFFSET 1 #define KVM_COALESCED_MMIO_PAGE_OFFSET 2 +#define KVM_DIRTY_LOG_PAGE_OFFSET 64 #define DE_VECTOR 0 #define DB_VECTOR 1 diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index b19ef421084d..0acee817adfb 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -5,7 +5,8 @@ ccflags-y += -Iarch/x86/kvm KVM := ../../../virt/kvm kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ - $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o + $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o \ + $(KVM)/dirty_ring.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 84eeb61d06aa..92c250e26823 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1746,7 +1746,13 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu) { if (kvm_x86_ops->write_log_dirty) return kvm_x86_ops->write_log_dirty(vcpu); + return 0; +} +int kvm_cpu_dirty_log_size(void) +{ + if (kvm_x86_ops->cpu_dirty_log_size) + return kvm_x86_ops->cpu_dirty_log_size(); return 0; } diff --git a/arch/x86/kvm/vmx/vmx.c 
b/arch/x86/kvm/vmx/vmx.c index a01f3bcef27a..c25eff0156a2 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7673,6 +7673,7 @@ static __init int hardware_setup(void) kvm_x86_ops->slot_disable_log_dirty = NULL; kvm_x86_ops->flush_log_dirty = NULL; kvm_x86_ops->enable_log_dirty_pt_masked = NULL; + kvm_x86_ops->cpu_dirty_log_size = NULL; } if (!cpu_has_vmx_preemption_timer()) @@ -7745,6 +7746,11 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit) return supported & BIT(bit); } +static int vmx_cpu_dirty_log_size(void) +{ + return enable_pml ? PML_ENTITY_NUM : 0; +} + static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .cpu_has_kvm_support = cpu_has_kvm_support, .disabled_by_bios = vmx_disabled_by_bios, @@ -7868,6 +7874,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = { .flush_log_dirty = vmx_flush_log_dirty, .enable_log_dirty_pt_masked = vmx_enable_log_dirty_pt_masked, .write_log_dirty = vmx_write_pml_buffer, + .cpu_dirty_log_size = vmx_cpu_dirty_log_size, .pre_block = vmx_pre_block, .post_block = vmx_post_block, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 030435f1a033..5e6ceb9a9e73 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8132,6 +8132,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) bool req_immediate_exit = false; + /* Forbid vmenter if vcpu dirty ring is soft-full */ + if (unlikely(vcpu->kvm->dirty_ring_size && + kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) { + vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL; + trace_kvm_dirty_ring_exit(vcpu); + r = 0; + goto out; + } + if (kvm_request_pending(vcpu)) { if (kvm_check_request(KVM_REQ_GET_VMCS12_PAGES, vcpu)) { if (unlikely(!kvm_x86_ops->get_vmcs12_pages(vcpu))) { diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h new file mode 100644 index 000000000000..9f1bf3704036 --- /dev/null +++ b/include/linux/kvm_dirty_ring.h @@ -0,0 +1,50 @@ +#ifndef KVM_DIRTY_RING_H +#define KVM_DIRTY_RING_H + +/** + * kvm_dirty_ring: KVM internal dirty ring structure + * + * @dirty_index: free running counter that points to the next slot in + * dirty_ring->dirty_gfns, where a new dirty page should go + * @reset_index: free running counter that points to the next dirty page + * in dirty_ring->dirty_gfns for which dirty trap needs to + * be reenabled + * @size: size of the compact list, dirty_ring->dirty_gfns + * @soft_limit: when the number of dirty pages in the list reaches this + * limit, vcpu that owns this ring should exit to userspace + * to allow userspace to harvest all the dirty pages + * @dirty_gfns: the array to keep the dirty gfns + * @index: index of this dirty ring + */ +struct kvm_dirty_ring { + u32 dirty_index; + u32 reset_index; + u32 size; + u32 soft_limit; + struct kvm_dirty_gfn *dirty_gfns; + int index; +}; + +u32 kvm_dirty_ring_get_rsvd_entries(void); +int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size); +struct kvm_dirty_ring *kvm_dirty_ring_get(struct kvm *kvm); + +/* + * called with kvm->slots_lock held, returns the number of + * processed pages. 
+ */ +int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring); + +/* + * returns =0: successfully pushed + * <0: unable to push, need to wait + */ +void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset); + +/* for use in vm_operations_struct */ +struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset); + +void kvm_dirty_ring_free(struct kvm_dirty_ring *ring); +bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring); + +#endif /* KVM_DIRTY_RING_H */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 62aad0a2707a..e9d6e96a47be 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -34,6 +34,7 @@ #include #include +#include #ifndef KVM_MAX_VCPU_ID #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS @@ -319,6 +320,7 @@ struct kvm_vcpu { bool ready; struct kvm_vcpu_arch arch; struct dentry *debugfs_dentry; + struct kvm_dirty_ring dirty_ring; }; static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) @@ -500,6 +502,7 @@ struct kvm { struct srcu_struct srcu; struct srcu_struct irq_srcu; pid_t userspace_pid; + u32 dirty_ring_size; }; #define kvm_err(fmt, ...) \ @@ -828,6 +831,8 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, gfn_t gfn_offset, unsigned long mask); +void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask); + int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log); int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, @@ -1406,4 +1411,14 @@ int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn, uintptr_t data, const char *name, struct task_struct **thread_ptr); +/* + * This defines how many reserved entries we want to keep before we + * kick the vcpu to the userspace to avoid dirty ring full. This + * value can be tuned to higher if e.g. PML is enabled on the host. 
+ */ +#define KVM_DIRTY_RING_RSVD_ENTRIES 64 + +/* Max number of entries allowed for each kvm dirty ring */ +#define KVM_DIRTY_RING_MAX_ENTRIES 65536 + #endif diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h index 2c735a3e6613..3d850997940c 100644 --- a/include/trace/events/kvm.h +++ b/include/trace/events/kvm.h @@ -399,6 +399,84 @@ TRACE_EVENT(kvm_halt_poll_ns, #define trace_kvm_halt_poll_ns_shrink(vcpu_id, new, old) \ trace_kvm_halt_poll_ns(false, vcpu_id, new, old) +TRACE_EVENT(kvm_dirty_ring_push, + TP_PROTO(struct kvm_dirty_ring *ring, u32 slot, u64 offset), + TP_ARGS(ring, slot, offset), + + TP_STRUCT__entry( + __field(int, index) + __field(u32, dirty_index) + __field(u32, reset_index) + __field(u32, slot) + __field(u64, offset) + ), + + TP_fast_assign( + __entry->index = ring->index; + __entry->dirty_index = ring->dirty_index; + __entry->reset_index = ring->reset_index; + __entry->slot = slot; + __entry->offset = offset; + ), + + TP_printk("ring %d: dirty 0x%x reset 0x%x " + "slot %u offset 0x%llx (used %u)", + __entry->index, __entry->dirty_index, + __entry->reset_index, __entry->slot, __entry->offset, + __entry->dirty_index - __entry->reset_index) +); + +TRACE_EVENT(kvm_dirty_ring_reset, + TP_PROTO(struct kvm_dirty_ring *ring), + TP_ARGS(ring), + + TP_STRUCT__entry( + __field(int, index) + __field(u32, dirty_index) + __field(u32, reset_index) + ), + + TP_fast_assign( + __entry->index = ring->index; + __entry->dirty_index = ring->dirty_index; + __entry->reset_index = ring->reset_index; + ), + + TP_printk("ring %d: dirty 0x%x reset 0x%x (used %u)", + __entry->index, __entry->dirty_index, __entry->reset_index, + __entry->dirty_index - __entry->reset_index) +); + +TRACE_EVENT(kvm_dirty_ring_waitqueue, + TP_PROTO(bool enter), + TP_ARGS(enter), + + TP_STRUCT__entry( + __field(bool, enter) + ), + + TP_fast_assign( + __entry->enter = enter; + ), + + TP_printk("%s", __entry->enter ? "wait" : "awake") +); + +TRACE_EVENT(kvm_dirty_ring_exit, + TP_PROTO(struct kvm_vcpu *vcpu), + TP_ARGS(vcpu), + + TP_STRUCT__entry( + __field(int, vcpu_id) + ), + + TP_fast_assign( + __entry->vcpu_id = vcpu->vcpu_id; + ), + + TP_printk("vcpu %d", __entry->vcpu_id) +); + #endif /* _TRACE_KVM_MAIN_H */ /* This part must be outside protection */ diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index f0a16b4adbbd..5877d7fa88d1 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -236,6 +236,7 @@ struct kvm_hyperv_exit { #define KVM_EXIT_IOAPIC_EOI 26 #define KVM_EXIT_HYPERV 27 #define KVM_EXIT_ARM_NISV 28 +#define KVM_EXIT_DIRTY_RING_FULL 29 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. 
*/ @@ -1009,6 +1010,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_PPC_GUEST_DEBUG_SSTEP 176 #define KVM_CAP_ARM_NISV_TO_USER 177 #define KVM_CAP_ARM_INJECT_EXT_DABT 178 +#define KVM_CAP_DIRTY_LOG_RING 179 #ifdef KVM_CAP_IRQ_ROUTING @@ -1473,6 +1475,9 @@ struct kvm_enc_region { /* Available with KVM_CAP_ARM_SVE */ #define KVM_ARM_VCPU_FINALIZE _IOW(KVMIO, 0xc2, int) +/* Available with KVM_CAP_DIRTY_LOG_RING */ +#define KVM_RESET_DIRTY_RINGS _IO(KVMIO, 0xc3) + /* Secure Encrypted Virtualization command */ enum sev_cmd_id { /* Guest initialization commands */ @@ -1623,4 +1628,43 @@ struct kvm_hyperv_eventfd { #define KVM_HYPERV_CONN_ID_MASK 0x00ffffff #define KVM_HYPERV_EVENTFD_DEASSIGN (1 << 0) +/* + * KVM dirty GFN flags, defined as: + * + * |---------------+---------------+--------------| + * | bit 1 (reset) | bit 0 (dirty) | Status | + * |---------------+---------------+--------------| + * | 0 | 0 | Invalid GFN | + * | 0 | 1 | Dirty GFN | + * | 1 | X | GFN to reset | + * |---------------+---------------+--------------| + * + * Lifecycle of a dirty GFN goes like: + * + * dirtied collected reset + * 00 -----------> 01 -------------> 1X -------+ + * ^ | + * | | + * +------------------------------------------+ + * + * The userspace program is only responsible for the 01->1X state + * conversion (to collect dirty bits). Also, it must not skip any + * dirty bits so that dirty bits are always collected in sequence. + */ +#define KVM_DIRTY_GFN_F_DIRTY BIT(0) +#define KVM_DIRTY_GFN_F_RESET BIT(1) +#define KVM_DIRTY_GFN_F_MASK 0x3 + +/* + * KVM dirty rings should be mapped at KVM_DIRTY_LOG_PAGE_OFFSET of + * per-vcpu mmaped regions as an array of struct kvm_dirty_gfn. The + * size of the gfn buffer is decided by the first argument when + * enabling KVM_CAP_DIRTY_LOG_RING. + */ +struct kvm_dirty_gfn { + __u32 flags; + __u32 slot; + __u64 offset; +}; + #endif /* __LINUX_KVM_H */ diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c new file mode 100644 index 000000000000..9c4145ad93b2 --- /dev/null +++ b/virt/kvm/dirty_ring.c @@ -0,0 +1,176 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * KVM dirty ring implementation + * + * Copyright 2019 Red Hat, Inc. 
+ */ +#include +#include +#include +#include +#include + +int __weak kvm_cpu_dirty_log_size(void) +{ + return 0; +} + +u32 kvm_dirty_ring_get_rsvd_entries(void) +{ + return KVM_DIRTY_RING_RSVD_ENTRIES + kvm_cpu_dirty_log_size(); +} + +static u32 kvm_dirty_ring_used(struct kvm_dirty_ring *ring) +{ + return READ_ONCE(ring->dirty_index) - READ_ONCE(ring->reset_index); +} + +bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring) +{ + return kvm_dirty_ring_used(ring) >= ring->soft_limit; +} + +bool kvm_dirty_ring_full(struct kvm_dirty_ring *ring) +{ + return kvm_dirty_ring_used(ring) >= ring->size; +} + +struct kvm_dirty_ring *kvm_dirty_ring_get(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu = kvm_get_running_vcpu(); + + WARN_ON_ONCE(vcpu->kvm != kvm); + + return &vcpu->dirty_ring; +} + +int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size) +{ + ring->dirty_gfns = vmalloc(size); + if (!ring->dirty_gfns) + return -ENOMEM; + memset(ring->dirty_gfns, 0, size); + + ring->size = size / sizeof(struct kvm_dirty_gfn); + ring->soft_limit = ring->size - kvm_dirty_ring_get_rsvd_entries(); + ring->dirty_index = 0; + ring->reset_index = 0; + ring->index = index; + + return 0; +} + +static inline void kvm_dirty_gfn_set_invalid(struct kvm_dirty_gfn *gfn) +{ + gfn->flags = 0; +} + +static inline void kvm_dirty_gfn_set_dirtied(struct kvm_dirty_gfn *gfn) +{ + gfn->flags = KVM_DIRTY_GFN_F_DIRTY; +} + +static inline bool kvm_dirty_gfn_invalid(struct kvm_dirty_gfn *gfn) +{ + return gfn->flags == 0; +} + +static inline bool kvm_dirty_gfn_collected(struct kvm_dirty_gfn *gfn) +{ + return gfn->flags & KVM_DIRTY_GFN_F_RESET; +} + +int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring) +{ + u32 cur_slot, next_slot; + u64 cur_offset, next_offset; + unsigned long mask; + int count = 0; + struct kvm_dirty_gfn *entry; + bool first_round = true; + + /* This is only needed to make compilers happy */ + cur_slot = cur_offset = mask = 0; + + while (true) { + entry = &ring->dirty_gfns[ring->reset_index & (ring->size - 1)]; + + if (!kvm_dirty_gfn_collected(entry)) + break; + + next_slot = READ_ONCE(entry->slot); + next_offset = READ_ONCE(entry->offset); + + /* Update the flags to reflect that this GFN is reset */ + kvm_dirty_gfn_set_invalid(entry); + + ring->reset_index++; + count++; + /* + * Try to coalesce the reset operations when the guest is + * scanning pages in the same slot. + */ + if (!first_round && next_slot == cur_slot) { + s64 delta = next_offset - cur_offset; + + if (delta >= 0 && delta < BITS_PER_LONG) { + mask |= 1ull << delta; + continue; + } + + /* Backwards visit, careful about overflows! 
*/ + if (delta > -BITS_PER_LONG && delta < 0 && + (mask << -delta >> -delta) == mask) { + cur_offset = next_offset; + mask = (mask << -delta) | 1; + continue; + } + } + kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask); + cur_slot = next_slot; + cur_offset = next_offset; + mask = 1; + first_round = false; + } + + kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask); + + trace_kvm_dirty_ring_reset(ring); + + return count; +} + +void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset) +{ + struct kvm_dirty_gfn *entry; + + /* It should never get full */ + WARN_ON_ONCE(kvm_dirty_ring_full(ring)); + + entry = &ring->dirty_gfns[ring->dirty_index & (ring->size - 1)]; + + /* It should always be an invalid entry to fill in */ + WARN_ON_ONCE(!kvm_dirty_gfn_invalid(entry)); + + entry->slot = slot; + entry->offset = offset; + /* + * Make sure the data is filled in before we publish this to + * the userspace program. There's no paired kernel-side reader. + */ + smp_wmb(); + kvm_dirty_gfn_set_dirtied(entry); + ring->dirty_index++; + trace_kvm_dirty_ring_push(ring, slot, offset); +} + +struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset) +{ + return vmalloc_to_page((void *)ring->dirty_gfns + offset * PAGE_SIZE); +} + +void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) +{ + vfree(ring->dirty_gfns); + ring->dirty_gfns = NULL; +} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 5307f6e33587..b710cee7e897 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -64,12 +64,23 @@ #define CREATE_TRACE_POINTS #include +#include + /* Worst case buffer size needed for holding an integer. */ #define ITOA_MAX_LEN 12 MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); +/* + * Arch needs to define the macro after implementing the dirty ring + * feature. KVM_DIRTY_LOG_PAGE_OFFSET should be defined as the + * starting page offset of the dirty ring structures. + */ +#ifndef KVM_DIRTY_LOG_PAGE_OFFSET +#define KVM_DIRTY_LOG_PAGE_OFFSET 0 +#endif + /* Architectures should define their poll value according to the halt latency */ unsigned int halt_poll_ns = KVM_HALT_POLL_NS_DEFAULT; module_param(halt_poll_ns, uint, 0644); @@ -337,6 +348,50 @@ void kvm_reload_remote_mmus(struct kvm *kvm) kvm_make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD); } +#if (KVM_DIRTY_LOG_PAGE_OFFSET == 0) +/* + * If KVM_DIRTY_LOG_PAGE_OFFSET not defined, kvm_dirty_ring.o should + * not be included as well, so define these nop functions for the arch. 
+ */ +u32 kvm_dirty_ring_get_rsvd_entries(void) +{ + return 0; +} + +int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size) +{ + return 0; +} + +struct kvm_dirty_ring *kvm_dirty_ring_get(struct kvm *kvm) +{ + return NULL; +} + +int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring) +{ + return 0; +} + +void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset) +{ +} + +struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset) +{ + return NULL; +} + +void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) +{ +} + +bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring) +{ + return true; +} +#endif /* KVM_DIRTY_LOG_PAGE_OFFSET == 0 */ + static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) { mutex_init(&vcpu->mutex); @@ -359,6 +414,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) void kvm_vcpu_destroy(struct kvm_vcpu *vcpu) { + kvm_dirty_ring_free(&vcpu->dirty_ring); kvm_arch_vcpu_destroy(vcpu); /* @@ -2282,8 +2338,13 @@ static void mark_page_dirty_in_slot(struct kvm *kvm, { if (memslot && memslot->dirty_bitmap) { unsigned long rel_gfn = gfn - memslot->base_gfn; + u32 slot = (memslot->as_id << 16) | memslot->id; - set_bit_le(rel_gfn, memslot->dirty_bitmap); + if (kvm->dirty_ring_size) + kvm_dirty_ring_push(kvm_dirty_ring_get(kvm), + slot, rel_gfn); + else + set_bit_le(rel_gfn, memslot->dirty_bitmap); } } @@ -2630,6 +2691,16 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode) } EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin); +static bool kvm_page_in_dirty_ring(struct kvm *kvm, unsigned long pgoff) +{ + if (!KVM_DIRTY_LOG_PAGE_OFFSET) + return false; + + return (pgoff >= KVM_DIRTY_LOG_PAGE_OFFSET) && + (pgoff < KVM_DIRTY_LOG_PAGE_OFFSET + + kvm->dirty_ring_size / PAGE_SIZE); +} + static vm_fault_t kvm_vcpu_fault(struct vm_fault *vmf) { struct kvm_vcpu *vcpu = vmf->vma->vm_file->private_data; @@ -2645,6 +2716,10 @@ static vm_fault_t kvm_vcpu_fault(struct vm_fault *vmf) else if (vmf->pgoff == KVM_COALESCED_MMIO_PAGE_OFFSET) page = virt_to_page(vcpu->kvm->coalesced_mmio_ring); #endif + else if (kvm_page_in_dirty_ring(vcpu->kvm, vmf->pgoff)) + page = kvm_dirty_ring_get_page( + &vcpu->dirty_ring, + vmf->pgoff - KVM_DIRTY_LOG_PAGE_OFFSET); else return kvm_arch_vcpu_fault(vcpu, vmf); get_page(page); @@ -2658,6 +2733,14 @@ static const struct vm_operations_struct kvm_vcpu_vm_ops = { static int kvm_vcpu_mmap(struct file *file, struct vm_area_struct *vma) { + struct kvm_vcpu *vcpu = file->private_data; + unsigned long pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + + if ((kvm_page_in_dirty_ring(vcpu->kvm, vma->vm_pgoff) || + kvm_page_in_dirty_ring(vcpu->kvm, vma->vm_pgoff + pages - 1)) && + ((vma->vm_flags & VM_EXEC) || !(vma->vm_flags & VM_SHARED))) + return -EINVAL; + vma->vm_ops = &kvm_vcpu_vm_ops; return 0; } @@ -2751,6 +2834,13 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) if (r) goto vcpu_free_run_page; + if (kvm->dirty_ring_size) { + r = kvm_dirty_ring_alloc(&vcpu->dirty_ring, + id, kvm->dirty_ring_size); + if (r) + goto arch_vcpu_destroy; + } + kvm_create_vcpu_debugfs(vcpu); mutex_lock(&kvm->lock); @@ -2786,6 +2876,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) unlock_vcpu_destroy: mutex_unlock(&kvm->lock); debugfs_remove_recursive(vcpu->debugfs_dentry); + kvm_dirty_ring_free(&vcpu->dirty_ring); +arch_vcpu_destroy: kvm_arch_vcpu_destroy(vcpu); vcpu_free_run_page: free_page((unsigned long)vcpu->run); @@ -3256,12 
+3348,97 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) #endif case KVM_CAP_NR_MEMSLOTS: return KVM_USER_MEM_SLOTS; + case KVM_CAP_DIRTY_LOG_RING: +#ifdef CONFIG_X86 + return KVM_DIRTY_RING_MAX_ENTRIES * sizeof(struct kvm_dirty_gfn); +#else + return 0; +#endif default: break; } return kvm_vm_ioctl_check_extension(kvm, arg); } +void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask) +{ + struct kvm_memory_slot *memslot; + int as_id, id; + + as_id = slot >> 16; + id = (u16)slot; + if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS) + return; + + memslot = id_to_memslot(__kvm_memslots(kvm, as_id), id); + if (offset >= memslot->npages) + return; + + spin_lock(&kvm->mmu_lock); + kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask); + spin_unlock(&kvm->mmu_lock); +} + +static int kvm_vm_ioctl_enable_dirty_log_ring(struct kvm *kvm, u32 size) +{ + int r; + + if (!KVM_DIRTY_LOG_PAGE_OFFSET) + return -EINVAL; + + /* the size should be power of 2 */ + if (!size || (size & (size - 1))) + return -EINVAL; + + /* Should be bigger to keep the reserved entries, or a page */ + if (size < kvm_dirty_ring_get_rsvd_entries() * + sizeof(struct kvm_dirty_gfn) || size < PAGE_SIZE) + return -EINVAL; + + if (size > KVM_DIRTY_RING_MAX_ENTRIES * + sizeof(struct kvm_dirty_gfn)) + return -E2BIG; + + /* We only allow it to set once */ + if (kvm->dirty_ring_size) + return -EINVAL; + + mutex_lock(&kvm->lock); + + if (kvm->created_vcpus) { + /* We don't allow to change this value after vcpu created */ + r = -EINVAL; + } else { + kvm->dirty_ring_size = size; + r = 0; + } + + mutex_unlock(&kvm->lock); + return r; +} + +static int kvm_vm_ioctl_reset_dirty_pages(struct kvm *kvm) +{ + int i; + struct kvm_vcpu *vcpu; + int cleared = 0; + + if (!kvm->dirty_ring_size) + return -EINVAL; + + mutex_lock(&kvm->slots_lock); + + kvm_for_each_vcpu(i, vcpu, kvm) + cleared += kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring); + + mutex_unlock(&kvm->slots_lock); + + if (cleared) + kvm_flush_remote_tlbs(kvm); + + return cleared; +} + int __attribute__((weak)) kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { @@ -3279,6 +3456,8 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, kvm->manual_dirty_log_protect = cap->args[0]; return 0; #endif + case KVM_CAP_DIRTY_LOG_RING: + return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]); default: return kvm_vm_ioctl_enable_cap(kvm, cap); } @@ -3466,6 +3645,9 @@ static long kvm_vm_ioctl(struct file *filp, case KVM_CHECK_EXTENSION: r = kvm_vm_ioctl_check_extension_generic(kvm, arg); break; + case KVM_RESET_DIRTY_RINGS: + r = kvm_vm_ioctl_reset_dirty_pages(kvm); + break; default: r = kvm_arch_vm_ioctl(filp, ioctl, arg); } From patchwork Wed Feb 5 02:58:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365707 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3D5FC138D for ; Wed, 5 Feb 2020 02:58:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1BCF121741 for ; Wed, 5 Feb 2020 02:58:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CvrNleSq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S1727746AbgBEC6y (ORCPT ); Tue, 4 Feb 2020 21:58:54 -0500 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:32509 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727789AbgBEC6x (ORCPT ); Tue, 4 Feb 2020 21:58:53 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871532; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dlAqEpPTUndIZh7wPLDVCooz5AgvSxeI8vfIAUOxmtc=; b=CvrNleSqn+uq8W+iWiDwmIdnyIFtvF63afOV0xAl1pxLPfNRuYrrGbHTvQSX15SH1jpWk3 Crd6uXNE7KieNowrZVL8BwXBQop43dOKRvkSOflLRV/9V+mrYzwIsazs1df9qZwIIVXaq1 9RU9aXbfnXRQGI3CUdbcv0yk3imADcE= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-156-jCwaQJ8-M-KWtgHkF6bLgA-1; Tue, 04 Feb 2020 21:58:50 -0500 X-MC-Unique: jCwaQJ8-M-KWtgHkF6bLgA-1 Received: by mail-qv1-f70.google.com with SMTP id z39so638682qve.5 for ; Tue, 04 Feb 2020 18:58:50 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dlAqEpPTUndIZh7wPLDVCooz5AgvSxeI8vfIAUOxmtc=; b=tvBN1tUmrQSroP0eR7LDjMzJs8zki85MD0wHHyGGsC/CPUnZXJ1Vm7FgrfhhXXKTSS v3n/U18+WTlE1pBsL7aNbK0mL/Rqh5aU8K3cOI0UYjm3D+bsQzYfPfpQawXKPzYugdT2 mlmV+9JckuKQaoh+p34TSoarVRDp1vNEFIUSTT9fKpFHEOKuVm8HpLEXlHNXsUpE5Ptc LqIuAuN5yT3N5xDGLLZUHMQmE6V9zPpd9aoDnxvGlyIBHrq+nmqbOQxY531RczHVVprU EtWq9wfCRZUI5vRDiwuD/VoYVqTRz0c89acqsQmnMscywdBnuPsJLcGHPFUecINLH0J2 GKlg== X-Gm-Message-State: APjAAAWFYVW30aU6eUfeB85Q9LBnBSMh1lRcECN0vSzW7I6Yv6hYw0Qs ono1E5D6XXC/8XzGCS1INsTAyTffxftOe+eDasmeZSt/YeG7EIWrNTERS2sfcpAetidosJV+u0m f0joLRjjA9YAa X-Received: by 2002:a37:a14f:: with SMTP id k76mr31068424qke.170.1580871530252; Tue, 04 Feb 2020 18:58:50 -0800 (PST) X-Google-Smtp-Source: APXvYqxlLWnSK6a/QgArTLNRRly1uyWTIwUItXLoG9GWm3CQ5h4eRRPsHCvnbEv3po8kqmv45eTqfg== X-Received: by 2002:a37:a14f:: with SMTP id k76mr31068418qke.170.1580871529992; Tue, 04 Feb 2020 18:58:49 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:48 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 06/14] KVM: Make dirty ring exclusive to dirty bitmap log Date: Tue, 4 Feb 2020 21:58:34 -0500 Message-Id: <20200205025842.367575-3-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There's no good reason to use both the dirty bitmap logging and the new dirty ring buffer to track dirty bits. We should be able to even support both of them at the same time, but it could complicate things which could actually help little. 
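(Illustration only, not part of this patch.) From the userspace side, opting in to the ring is a single vm-wide KVM_ENABLE_CAP call issued before any vcpu is created; a minimal sketch, assuming vm_fd and ring_size come from the caller:

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/*
	 * Sketch: switch a VM to dirty ring mode.  ring_size is the
	 * per-vcpu ring size in bytes: a power of two, at least
	 * PAGE_SIZE, and no larger than what KVM_CHECK_EXTENSION
	 * reports for KVM_CAP_DIRTY_LOG_RING.
	 */
	static int enable_dirty_ring(int vm_fd, __u64 ring_size)
	{
		struct kvm_enable_cap cap = {
			.cap = KVM_CAP_DIRTY_LOG_RING,
			.args[0] = ring_size,
		};

		/* Must happen before any KVM_CREATE_VCPU call. */
		return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	}

Once that call succeeds the VM is committed to the ring for its whole lifetime, and (with this patch) the bitmap ioctls are rejected.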
Let's simply make it the rule before we enable dirty ring on any arch, that we don't allow these two interfaces to be used together. The big world switch would be KVM_CAP_DIRTY_LOG_RING capability enablement. That's where we'll switch from the default dirty logging way to the dirty ring way. As long as kvm->dirty_ring_size is setup correctly, we'll once and for all switch to the dirty ring buffer mode for the current virtual machine. Signed-off-by: Peter Xu --- Documentation/virt/kvm/api.txt | 7 +++++++ virt/kvm/kvm_main.c | 12 ++++++++++++ 2 files changed, 19 insertions(+) diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt index 558e719efdec..bbdd68583cde 100644 --- a/Documentation/virt/kvm/api.txt +++ b/Documentation/virt/kvm/api.txt @@ -5514,3 +5514,10 @@ make sure all the existing dirty gfns are flushed to the dirty rings. The dirty ring can gets full. When it happens, the KVM_RUN of the vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL. + +NOTE: the KVM_CAP_DIRTY_LOG_RING capability and the new ioctl +KVM_RESET_DIRTY_RINGS are exclusive to the existing KVM_GET_DIRTY_LOG +interface. After enabling KVM_CAP_DIRTY_LOG_RING with an acceptable +dirty ring size, the virtual machine will switch to the dirty ring +tracking mode, and KVM_GET_DIRTY_LOG, KVM_CLEAR_DIRTY_LOG ioctls will +stop working. diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b710cee7e897..5a6f83b7270f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1243,6 +1243,10 @@ int kvm_get_dirty_log(struct kvm *kvm, unsigned long n; unsigned long any = 0; + /* Dirty ring tracking is exclusive to dirty log tracking */ + if (kvm->dirty_ring_size) + return -EINVAL; + as_id = log->slot >> 16; id = (u16)log->slot; if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS) @@ -1300,6 +1304,10 @@ int kvm_get_dirty_log_protect(struct kvm *kvm, unsigned long *dirty_bitmap; unsigned long *dirty_bitmap_buffer; + /* Dirty ring tracking is exclusive to dirty log tracking */ + if (kvm->dirty_ring_size) + return -EINVAL; + as_id = log->slot >> 16; id = (u16)log->slot; if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS) @@ -1371,6 +1379,10 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm, unsigned long *dirty_bitmap; unsigned long *dirty_bitmap_buffer; + /* Dirty ring tracking is exclusive to dirty log tracking */ + if (kvm->dirty_ring_size) + return -EINVAL; + as_id = log->slot >> 16; id = (u16)log->slot; if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_USER_MEM_SLOTS) From patchwork Wed Feb 5 02:58:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365709 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC457138D for ; Wed, 5 Feb 2020 02:58:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8A3D9214AF for ; Wed, 5 Feb 2020 02:58:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="M+RsxE+Z" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727929AbgBEC6z (ORCPT ); Tue, 4 Feb 2020 21:58:55 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:50685 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727832AbgBEC6z (ORCPT ); Tue, 
4 Feb 2020 21:58:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871534; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FDwDto3hXPczPquqfclgMLW72/UWKwUFYUker/I93cA=; b=M+RsxE+Z4uLJZjLH2HnMk4DRJKIbLQikBOlVFOS1ktwJ0Rumr+yVF0Bwr+p2Ozz1e2edR8 YWcOTHnSaIr3gzDfCff+umydDiYGsxI3pqiIeoOp8wG1V9Lqa/qzdKbRcj3e9FXcAp/Bdm 8IGxB+2ChCusvCmFV+2XDktI9W/TurA= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-397-0FOGjb3uPv6EnqhNNpaa-A-1; Tue, 04 Feb 2020 21:58:53 -0500 X-MC-Unique: 0FOGjb3uPv6EnqhNNpaa-A-1 Received: by mail-qt1-f200.google.com with SMTP id c10so399445qtk.18 for ; Tue, 04 Feb 2020 18:58:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FDwDto3hXPczPquqfclgMLW72/UWKwUFYUker/I93cA=; b=gSY+t5Xwl/7i+YW4DFyG9daWHvhnb7ea58vmL3ZVsKfB9Iw6N/qdGHyUdo3j/oVXMQ mM/6g8CJofsywE9ovkGRZ2feSTrSoAqBB7OyGhToaMrJpO6F7acPtN33VNwYp2FdukMm CLy+hKhnlpbvoQ59TQV3DCmOxj8/D/udbcQCkGXKE1HU93QdUFfsK4dfWsoGciKx9dTH E1aCoibXmvZCkz2Mijj7sQq/NnwROwWkv/Iq4MXDo6pZwRWFKgRp3ZlQjsuXbzPI0/En jesmQxuD2r+yHm/40cnmcPHUCq/GWrKk1WwUaFYH/KQlClcyKNl5h43G0M9KFDHL69rA ijzw== X-Gm-Message-State: APjAAAUxlN8NqQrM3R9nBJKXHY/hn2tdIjuwkXzbNRzQcPfbfEbVcv35 xQMIOGb8P8U9vfV4DXMhZSJBf7qWZcRWCIldkcpzXrFtiBxf+i9T4kQ7xgxYI82nHkHUFqwuaMq WFijDm+dtAP3N X-Received: by 2002:ac8:1a19:: with SMTP id v25mr31897235qtj.146.1580871532265; Tue, 04 Feb 2020 18:58:52 -0800 (PST) X-Google-Smtp-Source: APXvYqy668FfAXq+R1N+S7y8JoV5xEv9AZL3pJHQVt+dfDh0K4BFDgeoMXGuKkNpqPRLfsJVQKtMaw== X-Received: by 2002:ac8:1a19:: with SMTP id v25mr31897220qtj.146.1580871532024; Tue, 04 Feb 2020 18:58:52 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:51 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 07/14] KVM: Don't allocate dirty bitmap if dirty ring is enabled Date: Tue, 4 Feb 2020 21:58:35 -0500 Message-Id: <20200205025842.367575-4-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Because kvm dirty rings and kvm dirty log is used in an exclusive way, Let's avoid creating the dirty_bitmap when kvm dirty ring is enabled. At the meantime, since the dirty_bitmap will be conditionally created now, we can't use it as a sign of "whether this memory slot enabled dirty tracking". Change users like that to check against the kvm memory slot flags. 
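For context (a sketch, not part of this patch): KVM_MEM_LOG_DIRTY_PAGES is the flag userspace passes when it registers or updates a memslot, so it is still a reliable sign of dirty tracking even when no bitmap gets allocated. Assuming vm_fd, slot, gpa, size and hva come from the caller:

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* Sketch: enable dirty tracking on one memslot from userspace. */
	static int memslot_enable_dirty_tracking(int vm_fd, __u32 slot,
						 __u64 gpa, __u64 size,
						 void *hva)
	{
		struct kvm_userspace_memory_region region = {
			.slot = slot,
			.flags = KVM_MEM_LOG_DIRTY_PAGES,
			.guest_phys_addr = gpa,
			.memory_size = size,
			.userspace_addr = (__u64)(unsigned long)hva,
		};

		return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
	}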
Note that there still can be chances where the kvm memory slot got its dirty_bitmap allocated, _if_ the memory slots are created before enabling of the dirty rings and at the same time with the dirty tracking capability enabled, they'll still with the dirty_bitmap. However it should not hurt much (e.g., the bitmaps will always be freed if they are there), and the real users normally won't trigger this because dirty bit tracking flag should in most cases only be applied to kvm slots only before migration starts, that should be far latter than kvm initializes (VM starts). Signed-off-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 4 ++-- include/linux/kvm_host.h | 5 +++++ virt/kvm/kvm_main.c | 5 +++-- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 92c250e26823..039d20043ca3 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1273,8 +1273,8 @@ gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn, slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); if (!slot || slot->flags & KVM_MEMSLOT_INVALID) return NULL; - if (no_dirty_log && slot->dirty_bitmap) - return NULL; + if (no_dirty_log && kvm_slot_dirty_track_enabled(slot)) + return false; return slot; } diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index e9d6e96a47be..a49e6846afe6 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -351,6 +351,11 @@ struct kvm_memory_slot { u8 as_id; }; +static inline bool kvm_slot_dirty_track_enabled(struct kvm_memory_slot *slot) +{ + return slot->flags & KVM_MEM_LOG_DIRTY_PAGES; +} + static inline unsigned long kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot) { return ALIGN(memslot->npages, BITS_PER_LONG) / 8; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 5a6f83b7270f..72b45f491692 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1152,7 +1152,8 @@ int __kvm_set_memory_region(struct kvm *kvm, } /* Allocate page dirty bitmap if needed */ - if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) { + if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap && + !kvm->dirty_ring_size) { if (kvm_create_dirty_bitmap(&new) < 0) goto out_free; } @@ -2348,7 +2349,7 @@ static void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_memory_slot *memslot, gfn_t gfn) { - if (memslot && memslot->dirty_bitmap) { + if (memslot && kvm_slot_dirty_track_enabled(memslot)) { unsigned long rel_gfn = gfn - memslot->base_gfn; u32 slot = (memslot->as_id << 16) | memslot->id; From patchwork Wed Feb 5 02:58:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365723 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E51F71395 for ; Wed, 5 Feb 2020 02:59:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C477A21741 for ; Wed, 5 Feb 2020 02:59:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="P+MuElQe" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727925AbgBEC7f (ORCPT ); Tue, 4 Feb 2020 21:59:35 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:35195 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727966AbgBEC66 (ORCPT 
); Tue, 4 Feb 2020 21:58:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871536; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v+zy2FHQw9KFVxtHaLqa0r2t8FDa4q1KvvQqvnAapvQ=; b=P+MuElQeZ1SNjpDfg0uhBongTItWT2GdAluzOpQ2B4d0vqgoH87gL8i3TYDoIEpXx4r5yt BPtXzHmf3ZFZtOwIXPJE4ojh2qwb+VM6fuQFVFnHn6DJ3LwezETsOB4vlY60IVsCqxve7U +eQY7IdY74F4jA/4DivlDKWZmqBmAtA= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-26-GoLHuFzANmKNUCqhhRtSTw-1; Tue, 04 Feb 2020 21:58:55 -0500 X-MC-Unique: GoLHuFzANmKNUCqhhRtSTw-1 Received: by mail-qk1-f200.google.com with SMTP id i11so399186qki.12 for ; Tue, 04 Feb 2020 18:58:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=v+zy2FHQw9KFVxtHaLqa0r2t8FDa4q1KvvQqvnAapvQ=; b=C3Hl9TIKQh92WV8WmQcUBwLBlenBbtwWYw3lHVtYAkl52mtvKaeWjxXzT/+jSBYENF L4MV7f0qWZKBp3qDeKk4+pC2V8HuITOI+eDHX879sbahqVgfTHw+bhKbvsEN6U7KwSz/ oOQrtM4ucXCUovCT/6N4V+RRvrXZI6YhmMdhfUF8Y3IklBvGiF92uFOpR43vUs+Nt77y Loal6a9ptUnDA846cXA8a2cK76gjDmc4W1AENNWhYN/cz+uie8BnrbsfG8chp8cLGUuf XH45dO69PyqJ/frpBsAZ5ySt1Yx8YRnKa22O561FmFGnU0gqBWgC7v0xWTLajVjM9M8x YgLA== X-Gm-Message-State: APjAAAWConV1PxsZRMivjtOn/IYWkUSerE1dTX1fCQEfrGU12zAf6BTp XALvk2zLNfMx2tdAqH98FfUADswhbCaZYA+hvDjhDWzZ53KcKhfyz8G4E8U7LRCUCWVst2pSJuw kVbOvgr5Aidzc X-Received: by 2002:ac8:8e7:: with SMTP id y36mr31638489qth.26.1580871534556; Tue, 04 Feb 2020 18:58:54 -0800 (PST) X-Google-Smtp-Source: APXvYqwBrtfjh/WegxUDnL0gSdyIHl6hW1ZsZRs5d/otM4voagMpbjCc/TQ+O2fujc/H2QcwMgpFTA== X-Received: by 2002:ac8:8e7:: with SMTP id y36mr31638473qth.26.1580871534337; Tue, 04 Feb 2020 18:58:54 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:53 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 08/14] KVM: selftests: Always clear dirty bitmap after iteration Date: Tue, 4 Feb 2020 21:58:36 -0500 Message-Id: <20200205025842.367575-5-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org We don't clear the dirty bitmap before because KVM_GET_DIRTY_LOG will clear it for us before copying the dirty log onto it. However we'd still better to clear it explicitly instead of assuming the kernel will always do it for us. More importantly, in the upcoming dirty ring tests we'll start to fetch dirty pages from a ring buffer, so no one is going to clear the dirty bitmap for us. 
Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 5614222a6628..3c0ffd34b3b0 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -197,7 +197,7 @@ static void vm_dirty_log_verify(unsigned long *bmap) page); } - if (test_bit_le(page, bmap)) { + if (test_and_clear_bit_le(page, bmap)) { host_dirty_count++; /* * If the bit is set, the value written onto From patchwork Wed Feb 5 02:58:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365721 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 313C7138D for ; Wed, 5 Feb 2020 02:59:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0FE4C21D7D for ; Wed, 5 Feb 2020 02:59:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="IamG6JW8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728010AbgBEC7A (ORCPT ); Tue, 4 Feb 2020 21:59:00 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:25837 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727995AbgBEC67 (ORCPT ); Tue, 4 Feb 2020 21:58:59 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871538; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oVtEI2Wr6dUMivwAFn+kYxL1mBYmiWLoQ2SvNOgo33g=; b=IamG6JW8O1YGx/oiafemraOKklRypzGaF0WqqKbQKAKCrMq1RKfJ4+nxFT9aBvxTipswx6 f3/6lxsRMx2yBCS0bTPbbbI6BgazlHYyq0nES0Krd8UIPKYQvP37qXYMWpMrWw9/AEndOT mNS51BPlPIzmuYBX27oy3i62qPL/FME= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-209-og26z4RzMni3Xk99pcscJA-1; Tue, 04 Feb 2020 21:58:57 -0500 X-MC-Unique: og26z4RzMni3Xk99pcscJA-1 Received: by mail-qk1-f199.google.com with SMTP id z1so396466qkl.15 for ; Tue, 04 Feb 2020 18:58:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oVtEI2Wr6dUMivwAFn+kYxL1mBYmiWLoQ2SvNOgo33g=; b=fsed93QAsKbcCBkiRH+XLfH8kaUiLqdHM1TkONwWw3JBY2X25UJpYapkscYUmSiPNU eW3K1jIE6btsg3LFE8yYKnveA5MHzblBlxQ5iCYUPAl9s/8B/hxkMOfHfg/kX3IRsdZb ja3o/CjNXxd3bXHFARx71UljVFupfSoVbF9Ae0DOKcu+b7rXHhKsrM6DHtMFhqeWI486 mW8cQRdsU5tPxz9LMSgTKLUphWh8h9AlV6lFQK+L3qtKrjxgOPv4OZ7lJkyMRv0iUja7 xxU+k2iVAfCpoCiWL2hZ1NDsNWl4ZDpaTb43jtSL1zoMz6LiZv//bU0GqSr8SaIPWsMp he+w== X-Gm-Message-State: APjAAAWVZJZkiZn40KPRWqFETjh20iB0z3vn9aeNB2E2iyKZzVXqybJ3 dlh/j9XiIDZzw0GM8VgePE+vXbPGz8cXF9sDvE7aBfvJa3vBpWs0F46dORjZfOhiOgoFCKYuQ/O lD5iXKrL/V7jq X-Received: by 2002:a37:a587:: with SMTP id o129mr32257552qke.268.1580871536712; Tue, 04 Feb 2020 18:58:56 -0800 (PST) X-Google-Smtp-Source: APXvYqzjBu3Cweu+6xXN4q5BSU1kYfV9A2MHlGfTnMfOKOcci7Hyu7FzNRkaxpD0SFrs9MSWaI5Biw== X-Received: by 
2002:a37:a587:: with SMTP id o129mr32257538qke.268.1580871536465; Tue, 04 Feb 2020 18:58:56 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:55 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 09/14] KVM: selftests: Sync uapi/linux/kvm.h to tools/ Date: Tue, 4 Feb 2020 21:58:37 -0500 Message-Id: <20200205025842.367575-6-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This will be needed to extend the kvm selftest program. Signed-off-by: Peter Xu --- tools/include/uapi/linux/kvm.h | 44 ++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index f0a16b4adbbd..5877d7fa88d1 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -236,6 +236,7 @@ struct kvm_hyperv_exit { #define KVM_EXIT_IOAPIC_EOI 26 #define KVM_EXIT_HYPERV 27 #define KVM_EXIT_ARM_NISV 28 +#define KVM_EXIT_DIRTY_RING_FULL 29 /* For KVM_EXIT_INTERNAL_ERROR */ /* Emulate instruction failed. */ @@ -1009,6 +1010,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_PPC_GUEST_DEBUG_SSTEP 176 #define KVM_CAP_ARM_NISV_TO_USER 177 #define KVM_CAP_ARM_INJECT_EXT_DABT 178 +#define KVM_CAP_DIRTY_LOG_RING 179 #ifdef KVM_CAP_IRQ_ROUTING @@ -1473,6 +1475,9 @@ struct kvm_enc_region { /* Available with KVM_CAP_ARM_SVE */ #define KVM_ARM_VCPU_FINALIZE _IOW(KVMIO, 0xc2, int) +/* Available with KVM_CAP_DIRTY_LOG_RING */ +#define KVM_RESET_DIRTY_RINGS _IO(KVMIO, 0xc3) + /* Secure Encrypted Virtualization command */ enum sev_cmd_id { /* Guest initialization commands */ @@ -1623,4 +1628,43 @@ struct kvm_hyperv_eventfd { #define KVM_HYPERV_CONN_ID_MASK 0x00ffffff #define KVM_HYPERV_EVENTFD_DEASSIGN (1 << 0) +/* + * KVM dirty GFN flags, defined as: + * + * |---------------+---------------+--------------| + * | bit 1 (reset) | bit 0 (dirty) | Status | + * |---------------+---------------+--------------| + * | 0 | 0 | Invalid GFN | + * | 0 | 1 | Dirty GFN | + * | 1 | X | GFN to reset | + * |---------------+---------------+--------------| + * + * Lifecycle of a dirty GFN goes like: + * + * dirtied collected reset + * 00 -----------> 01 -------------> 1X -------+ + * ^ | + * | | + * +------------------------------------------+ + * + * The userspace program is only responsible for the 01->1X state + * conversion (to collect dirty bits). Also, it must not skip any + * dirty bits so that dirty bits are always collected in sequence. + */ +#define KVM_DIRTY_GFN_F_DIRTY BIT(0) +#define KVM_DIRTY_GFN_F_RESET BIT(1) +#define KVM_DIRTY_GFN_F_MASK 0x3 + +/* + * KVM dirty rings should be mapped at KVM_DIRTY_LOG_PAGE_OFFSET of + * per-vcpu mmaped regions as an array of struct kvm_dirty_gfn. The + * size of the gfn buffer is decided by the first argument when + * enabling KVM_CAP_DIRTY_LOG_RING. 
+ */ +struct kvm_dirty_gfn { + __u32 flags; + __u32 slot; + __u64 offset; +}; + #endif /* __LINUX_KVM_H */ From patchwork Wed Feb 5 02:58:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365719 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B8CB41395 for ; Wed, 5 Feb 2020 02:59:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8DCB921D7D for ; Wed, 5 Feb 2020 02:59:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DFyqqGXu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727993AbgBEC7X (ORCPT ); Tue, 4 Feb 2020 21:59:23 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:28529 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728016AbgBEC7C (ORCPT ); Tue, 4 Feb 2020 21:59:02 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871540; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kPI9qPbIwSC0dShS/4Ovykpev+2eXmUvfFAp53pOJD8=; b=DFyqqGXuuhL7Z50zG8t6gA+QGccMAcIGhA81hWJ3HVT3qm4xkW6G3n1hYBnuNlHI8xDY4u 6U5j2AcYypx0u4BNL5zTI0AXz0oYmqSAxvYvo2vkrYXN0apqk5q4V2AutDqvF8XBgtVi0t UMLvqrVKLPeBROVOShIy/sPXrq8zZM4= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-187-P1xzX4xfMTiz4OhMZVIatw-1; Tue, 04 Feb 2020 21:58:59 -0500 X-MC-Unique: P1xzX4xfMTiz4OhMZVIatw-1 Received: by mail-qk1-f200.google.com with SMTP id q2so388853qkq.19 for ; Tue, 04 Feb 2020 18:58:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kPI9qPbIwSC0dShS/4Ovykpev+2eXmUvfFAp53pOJD8=; b=RSIwAg9RQOTEx1sD39CeDj6B5Y7qMCFd5KL68P9A3zpSNjfvPTbQwawuPPIk1ftLoX Q5E6JKhWuBxt7K98Zeq3RQOoU6vfGCA8fc78fzYP/09qeJlkw3WYbMk80B7xPQfD8YtU HXXQYU82165+0D2ixUtobABDScqbcGIWj00bmF0B/bHhma1KhhQkuAnBg59s/U0IpmCk ZSloxTcKw6y1mdxIt69BnHJJ39HSPMUOs2/377JLeCQMRMAl5cSlwl+Et4JGLYHTTOKP K74iKQ6EhRpWd0b62MZOZvK/o76Kfa0LouKxQZ3ZsIfQqV59RoaR1f2giy7dgXPEztrG NJAA== X-Gm-Message-State: APjAAAXKDcg2Yh2XOenW6Ms65+GojQ3NtDjzrTA+8YVR9Oz2WMUKccsT kz2w/5wNRAI3cXZLhsD4TFXiwUiKGexsaWUupQ9wSGPuFNbOQr/K31t3lMoJgeq/Neud6hkdpbP irfFSt4wdnlx0 X-Received: by 2002:a05:620a:90c:: with SMTP id v12mr3901561qkv.230.1580871538448; Tue, 04 Feb 2020 18:58:58 -0800 (PST) X-Google-Smtp-Source: APXvYqwKfOKgQYlWBeSycidVKm+E/Gh3raz1VGRf0r6YxQzSD8Ah50kFaTU36XFK4GLHVg+1F9djaQ== X-Received: by 2002:a05:620a:90c:: with SMTP id v12mr3901539qkv.230.1580871538093; Tue, 04 Feb 2020 18:58:58 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:57 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, 
jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 10/14] KVM: selftests: Use a single binary for dirty/clear log test Date: Tue, 4 Feb 2020 21:58:38 -0500 Message-Id: <20200205025842.367575-7-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Remove the clear_dirty_log test, instead merge it into the existing dirty_log_test. It should be cleaner to use this single binary to do both tests, also it's a preparation for the upcoming dirty ring test. The default test will still be the dirty_log test. To run the clear dirty log test, we need to specify "-M clear-log". Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/Makefile | 2 - .../selftests/kvm/clear_dirty_log_test.c | 2 - tools/testing/selftests/kvm/dirty_log_test.c | 131 +++++++++++++++--- 3 files changed, 110 insertions(+), 25 deletions(-) delete mode 100644 tools/testing/selftests/kvm/clear_dirty_log_test.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 89bf05d4c2f3..9744966a48c5 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -26,12 +26,10 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test TEST_GEN_PROGS_x86_64 += x86_64/xss_msr_test -TEST_GEN_PROGS_x86_64 += clear_dirty_log_test TEST_GEN_PROGS_x86_64 += dirty_log_test TEST_GEN_PROGS_x86_64 += demand_paging_test TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus -TEST_GEN_PROGS_aarch64 += clear_dirty_log_test TEST_GEN_PROGS_aarch64 += dirty_log_test TEST_GEN_PROGS_aarch64 += demand_paging_test TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus diff --git a/tools/testing/selftests/kvm/clear_dirty_log_test.c b/tools/testing/selftests/kvm/clear_dirty_log_test.c deleted file mode 100644 index 749336937d37..000000000000 --- a/tools/testing/selftests/kvm/clear_dirty_log_test.c +++ /dev/null @@ -1,2 +0,0 @@ -#define USE_CLEAR_DIRTY_LOG -#include "dirty_log_test.c" diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 3c0ffd34b3b0..a8ae8c0042a8 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -128,6 +128,66 @@ static uint64_t host_dirty_count; static uint64_t host_clear_count; static uint64_t host_track_next_count; +enum log_mode_t { + /* Only use KVM_GET_DIRTY_LOG for logging */ + LOG_MODE_DIRTY_LOG = 0, + + /* Use both KVM_[GET|CLEAR]_DIRTY_LOG for logging */ + LOG_MODE_CLERA_LOG = 1, + + LOG_MODE_NUM, +}; + +/* Mode of logging. 
Default is LOG_MODE_DIRTY_LOG */ +static enum log_mode_t host_log_mode; + +static void clear_log_create_vm_done(struct kvm_vm *vm) +{ + struct kvm_enable_cap cap = {}; + + if (!kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2)) { + fprintf(stderr, "KVM_CLEAR_DIRTY_LOG not available, skipping tests\n"); + exit(KSFT_SKIP); + } + + cap.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2; + cap.args[0] = 1; + vm_enable_cap(vm, &cap); +} + +static void dirty_log_collect_dirty_pages(struct kvm_vm *vm, int slot, + void *bitmap, uint32_t num_pages) +{ + kvm_vm_get_dirty_log(vm, slot, bitmap); +} + +static void clear_log_collect_dirty_pages(struct kvm_vm *vm, int slot, + void *bitmap, uint32_t num_pages) +{ + kvm_vm_get_dirty_log(vm, slot, bitmap); + kvm_vm_clear_dirty_log(vm, slot, bitmap, 0, num_pages); +} + +struct log_mode { + const char *name; + /* Hook when the vm creation is done (before vcpu creation) */ + void (*create_vm_done)(struct kvm_vm *vm); + /* Hook to collect the dirty pages into the bitmap provided */ + void (*collect_dirty_pages) (struct kvm_vm *vm, int slot, + void *bitmap, uint32_t num_pages); +} log_modes[LOG_MODE_NUM] = { + { + .name = "dirty-log", + .create_vm_done = NULL, + .collect_dirty_pages = dirty_log_collect_dirty_pages, + }, + { + .name = "clear-log", + .create_vm_done = clear_log_create_vm_done, + .collect_dirty_pages = clear_log_collect_dirty_pages, + }, +}; + /* * We use this bitmap to track some pages that should have its dirty * bit set in the _next_ iteration. For example, if we detected the @@ -137,6 +197,33 @@ static uint64_t host_track_next_count; */ static unsigned long *host_bmap_track; +static void log_modes_dump(void) +{ + int i; + + for (i = 0; i < LOG_MODE_NUM; i++) + printf("%s, ", log_modes[i].name); + puts("\b\b \b\b"); +} + +static void log_mode_create_vm_done(struct kvm_vm *vm) +{ + struct log_mode *mode = &log_modes[host_log_mode]; + + if (mode->create_vm_done) + mode->create_vm_done(vm); +} + +static void log_mode_collect_dirty_pages(struct kvm_vm *vm, int slot, + void *bitmap, uint32_t num_pages) +{ + struct log_mode *mode = &log_modes[host_log_mode]; + + TEST_ASSERT(mode->collect_dirty_pages != NULL, + "collect_dirty_pages() is required for any log mode!"); + mode->collect_dirty_pages(vm, slot, bitmap, num_pages); +} + static void generate_random_array(uint64_t *guest_array, uint64_t size) { uint64_t i; @@ -257,6 +344,7 @@ static struct kvm_vm *create_vm(enum vm_guest_mode mode, uint32_t vcpuid, #ifdef __x86_64__ vm_create_irqchip(vm); #endif + log_mode_create_vm_done(vm); vm_vcpu_add_default(vm, vcpuid, guest_code); return vm; } @@ -316,14 +404,6 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations, bmap = bitmap_alloc(host_num_pages); host_bmap_track = bitmap_alloc(host_num_pages); -#ifdef USE_CLEAR_DIRTY_LOG - struct kvm_enable_cap cap = {}; - - cap.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2; - cap.args[0] = 1; - vm_enable_cap(vm, &cap); -#endif - /* Add an extra memory slot for testing dirty logging */ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, guest_test_phys_mem, @@ -364,11 +444,8 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations, while (iteration < iterations) { /* Give the vcpu thread some time to dirty some pages */ usleep(interval * 1000); - kvm_vm_get_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap); -#ifdef USE_CLEAR_DIRTY_LOG - kvm_vm_clear_dirty_log(vm, TEST_MEM_SLOT_INDEX, bmap, 0, - host_num_pages); -#endif + log_mode_collect_dirty_pages(vm, TEST_MEM_SLOT_INDEX, + bmap, host_num_pages); 
vm_dirty_log_verify(bmap); iteration++; sync_global_to_guest(vm, iteration); @@ -413,6 +490,9 @@ static void help(char *name) TEST_HOST_LOOP_INTERVAL); printf(" -p: specify guest physical test memory offset\n" " Warning: a low offset can conflict with the loaded test code.\n"); + printf(" -M: specify the host logging mode " + "(default: log-dirty). Supported modes: \n\t"); + log_modes_dump(); printf(" -m: specify the guest mode ID to test " "(default: test all supported modes)\n" " This option may be used multiple times.\n" @@ -437,13 +517,6 @@ int main(int argc, char *argv[]) unsigned int host_ipa_limit; #endif -#ifdef USE_CLEAR_DIRTY_LOG - if (!kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2)) { - fprintf(stderr, "KVM_CLEAR_DIRTY_LOG not available, skipping tests\n"); - exit(KSFT_SKIP); - } -#endif - #ifdef __x86_64__ vm_guest_mode_params_init(VM_MODE_PXXV48_4K, true, true); #endif @@ -463,7 +536,7 @@ int main(int argc, char *argv[]) vm_guest_mode_params_init(VM_MODE_P40V48_4K, true, true); #endif - while ((opt = getopt(argc, argv, "hi:I:p:m:")) != -1) { + while ((opt = getopt(argc, argv, "hi:I:p:m:M:")) != -1) { switch (opt) { case 'i': iterations = strtol(optarg, NULL, 10); @@ -485,6 +558,22 @@ int main(int argc, char *argv[]) "Guest mode ID %d too big", mode); vm_guest_mode_params[mode].enabled = true; break; + case 'M': + for (i = 0; i < LOG_MODE_NUM; i++) { + if (!strcmp(optarg, log_modes[i].name)) { + DEBUG("Setting log mode to: '%s'\n", + optarg); + host_log_mode = i; + break; + } + } + if (i == LOG_MODE_NUM) { + printf("Log mode '%s' is invalid. " + "Please choose from: ", optarg); + log_modes_dump(); + exit(-1); + } + break; case 'h': default: help(argv[0]); From patchwork Wed Feb 5 02:58:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365713 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B394138D for ; Wed, 5 Feb 2020 02:59:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D4712214AF for ; Wed, 5 Feb 2020 02:59:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VCSKlQGU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727998AbgBEC7E (ORCPT ); Tue, 4 Feb 2020 21:59:04 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:55456 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728034AbgBEC7E (ORCPT ); Tue, 4 Feb 2020 21:59:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871542; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=aF1BdxIc7NPI7Dsk+3ZH17n/EhvhXEPJcTNSoSkH9Wk=; b=VCSKlQGU/4TyP64fsw8XjG/I5oU5tDss7/z/BCb4xH1GZO3x0vWVMH7fb+e50/4bg7QlxL PujOXpwFBkT0AgdzCmWNDnG6mx/9TU4dBNPAtJNQjwbH/QP0UDpwphLMp55V2FSKsoWsnm oPrVJ1FljygnJiupBL8/ohUXrIOlmH4= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-38-HSXl-euVNVOwxzkqwy6amw-1; Tue, 04 Feb 2020 21:59:01 -0500 X-MC-Unique: HSXl-euVNVOwxzkqwy6amw-1 Received: by 
mail-qv1-f70.google.com with SMTP id dw11so612646qvb.16 for ; Tue, 04 Feb 2020 18:59:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aF1BdxIc7NPI7Dsk+3ZH17n/EhvhXEPJcTNSoSkH9Wk=; b=lPcp/LmT3ZylnCA4HdPDPuQe0AjzXkV7OlaSjR83FM6AwkVIo9g7pgIqW6AbMOVTv0 cf1+sljK5Z6V3Bx2UVcqKtKHWjTX+VNmYdcujaiCrToe2CFxAvZevy1G6CIkf1s/qkbv YQbazuZK3FjSCVbPm3PnpiKiXa7PP3NWx1Ms9jRUzHspo3PtZYsY80KNqimVhYi64LcK JY43gFwqWryk8K2FYIoR+XnIIaRNUMG9Y+WFKiIIjP+S5Hd9rWsZJ8RxjoNwbGDRn7dS G6PjFlYevE7UecXtpU9rrGOIq4NaT5mLkEiBmL6JIns9VfS7a4pQNYd4uzdugI40MhPx bcPQ== X-Gm-Message-State: APjAAAXeshQpMx3U6hmi2/bJCi4JeOc5W0MbYDPszV9iuLKDwwfq14Bl iNeB7uxGT+qA5DrlGXJqOkMvz5dhLMpnDM/gtvZ85KOSkJ2G9KgKys0HsMAGwklt+kUMaIMzfnr qqRDBPPno5ZZZ X-Received: by 2002:a37:a8f:: with SMTP id 137mr12400368qkk.435.1580871540613; Tue, 04 Feb 2020 18:59:00 -0800 (PST) X-Google-Smtp-Source: APXvYqw6iuymQEWo4p6WSsu4I/VZfKMlmOnkU8i8JNdOSs5IKO4yUnCfo2RtVlAYGtyecP4HujqyjQ== X-Received: by 2002:a37:a8f:: with SMTP id 137mr12400354qkk.435.1580871540357; Tue, 04 Feb 2020 18:59:00 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.58.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:58:59 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 11/14] KVM: selftests: Introduce after_vcpu_run hook for dirty log test Date: Tue, 4 Feb 2020 21:58:39 -0500 Message-Id: <20200205025842.367575-8-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Provide a hook for the checks after vcpu_run() completes. Preparation for the dirty ring test because we'll need to take care of another exit reason. Since at it, drop the pages_count because after all we have a better summary right now with statistics, and clean it up a bit. 
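For context only (a sketch, not taken from this patch; the dirty_ring_vcpu_stop/cont semaphores are assumed helpers that only show up in the next patch): a ring-aware log mode can then provide a hook that also accepts the KVM_EXIT_DIRTY_RING_FULL exit reason, along these lines:

	/* Sketch: after_vcpu_run hook for a dirty-ring log mode. */
	static void dirty_ring_after_vcpu_run(struct kvm_vm *vm)
	{
		struct kvm_run *run = vcpu_state(vm, VCPU_ID);

		if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL) {
			/* Ring full: let the main thread reap, then resume. */
			sem_post(&dirty_ring_vcpu_stop);
			sem_wait(&dirty_ring_vcpu_cont);
		} else {
			TEST_ASSERT(get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC,
				    "Invalid guest sync status: exit_reason=%s\n",
				    exit_reason_str(run->exit_reason));
		}
	}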
Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 39 ++++++++++++-------- 1 file changed, 23 insertions(+), 16 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index a8ae8c0042a8..3542311f56ff 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -168,6 +168,15 @@ static void clear_log_collect_dirty_pages(struct kvm_vm *vm, int slot, kvm_vm_clear_dirty_log(vm, slot, bitmap, 0, num_pages); } +static void default_after_vcpu_run(struct kvm_vm *vm) +{ + struct kvm_run *run = vcpu_state(vm, VCPU_ID); + + TEST_ASSERT(get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC, + "Invalid guest sync status: exit_reason=%s\n", + exit_reason_str(run->exit_reason)); +} + struct log_mode { const char *name; /* Hook when the vm creation is done (before vcpu creation) */ @@ -175,16 +184,20 @@ struct log_mode { /* Hook to collect the dirty pages into the bitmap provided */ void (*collect_dirty_pages) (struct kvm_vm *vm, int slot, void *bitmap, uint32_t num_pages); + /* Hook to call when after each vcpu run */ + void (*after_vcpu_run)(struct kvm_vm *vm); } log_modes[LOG_MODE_NUM] = { { .name = "dirty-log", .create_vm_done = NULL, .collect_dirty_pages = dirty_log_collect_dirty_pages, + .after_vcpu_run = default_after_vcpu_run, }, { .name = "clear-log", .create_vm_done = clear_log_create_vm_done, .collect_dirty_pages = clear_log_collect_dirty_pages, + .after_vcpu_run = default_after_vcpu_run, }, }; @@ -224,6 +237,14 @@ static void log_mode_collect_dirty_pages(struct kvm_vm *vm, int slot, mode->collect_dirty_pages(vm, slot, bitmap, num_pages); } +static void log_mode_after_vcpu_run(struct kvm_vm *vm) +{ + struct log_mode *mode = &log_modes[host_log_mode]; + + if (mode->after_vcpu_run) + mode->after_vcpu_run(vm); +} + static void generate_random_array(uint64_t *guest_array, uint64_t size) { uint64_t i; @@ -237,31 +258,17 @@ static void *vcpu_worker(void *data) int ret; struct kvm_vm *vm = data; uint64_t *guest_array; - uint64_t pages_count = 0; - struct kvm_run *run; - - run = vcpu_state(vm, VCPU_ID); guest_array = addr_gva2hva(vm, (vm_vaddr_t)random_array); - generate_random_array(guest_array, TEST_PAGES_PER_LOOP); while (!READ_ONCE(host_quit)) { + generate_random_array(guest_array, TEST_PAGES_PER_LOOP); /* Let the guest dirty the random pages */ ret = _vcpu_run(vm, VCPU_ID); TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret); - if (get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC) { - pages_count += TEST_PAGES_PER_LOOP; - generate_random_array(guest_array, TEST_PAGES_PER_LOOP); - } else { - TEST_ASSERT(false, - "Invalid guest sync status: " - "exit_reason=%s\n", - exit_reason_str(run->exit_reason)); - } + log_mode_after_vcpu_run(vm); } - DEBUG("Dirtied %"PRIu64" pages\n", pages_count); - return NULL; } From patchwork Wed Feb 5 02:58:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365715 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A37D4138D for ; Wed, 5 Feb 2020 02:59:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 64F8C217F4 for ; Wed, 5 Feb 2020 02:59:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com 
header.b="MR+qJBNG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728055AbgBEC7H (ORCPT ); Tue, 4 Feb 2020 21:59:07 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:40265 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728034AbgBEC7G (ORCPT ); Tue, 4 Feb 2020 21:59:06 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1580871545; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=saPFynq7p41KzOERFMy4As7UwYCBExZGAIjj1Xf7Oj0=; b=MR+qJBNG2seuh7DpWp0IRuvCw5tUzLcvgTVne9TxjCl8rpG2Ibh5PAd6timHKJxi7l8QBp R4uRjPQGiHU86lSXtTX8wxjXaN385bcRGpG64v0Ey+XU3DHo9RqID7oysBFOD3BUnTolCq 2nFz2hAkj8teoUNFGysZmzp0uQhy5Jg= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-232-Q8sngrm9PLyUBgs2cUX3Eg-1; Tue, 04 Feb 2020 21:59:04 -0500 X-MC-Unique: Q8sngrm9PLyUBgs2cUX3Eg-1 Received: by mail-qt1-f200.google.com with SMTP id c8so391160qte.22 for ; Tue, 04 Feb 2020 18:59:04 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=saPFynq7p41KzOERFMy4As7UwYCBExZGAIjj1Xf7Oj0=; b=I9DKKLXuTPRPs6c8kSnjPkk/fRfBOkiqbbMgH2tttGyIg73LxVG963mLskmBxA/EgB 05dXIx74oh8Bz7kawxbsGZNdIriF7/55YlylN6/bhRD9CYW4ZeX4F+h9adY4zRWnvqNe W2ijXXUoAh2gfOToHzbA3jUAmf06xkMsLhaB6up/nkpPU966H2xldl651IpA776WvNNQ JLSsbwy8yCyKbwS2xDbU+vx4Yuahia8rNKrdWTmLs6Td0cjik6Hf/gDIqhVXAtLPvKmZ SrMvFzGjP+LLKJZPcsoKZe3MuYOLsaN8wx7foFirZF9NW6JNutdw9veceeoc8il42tRV tyvA== X-Gm-Message-State: APjAAAVMhG5zgSR0Voo9mwNOlhJ4WPnk5ha/tpZYe+blZdue3OtometL rad/mAF0SclrvkHcYZpKNFcOLyl8+cDOiPk6n9nFcBd49FTmxj3emJInmq5DGS5VfCWkDOa50Le TsZfa7J+ju8H8 X-Received: by 2002:ac8:1196:: with SMTP id d22mr31814137qtj.344.1580871543268; Tue, 04 Feb 2020 18:59:03 -0800 (PST) X-Google-Smtp-Source: APXvYqyCnE0GpCDDqFOrAqcYKdDKIPf7/Vkk6D/lVsSBnBy8pP8CxOATSjH44xXaBEjtCkzmya8syw== X-Received: by 2002:ac8:1196:: with SMTP id d22mr31814099qtj.344.1580871542644; Tue, 04 Feb 2020 18:59:02 -0800 (PST) Received: from xz-x1.redhat.com ([2607:9880:19c8:32::2]) by smtp.gmail.com with ESMTPSA id e64sm12961649qtd.45.2020.02.04.18.59.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2020 18:59:01 -0800 (PST) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 12/14] KVM: selftests: Add dirty ring buffer test Date: Tue, 4 Feb 2020 21:58:40 -0500 Message-Id: <20200205025842.367575-9-peterx@redhat.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add the initial dirty ring buffer test. The current test implements the userspace dirty ring collection, by only reaping the dirty ring when the ring is full. 
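In terms of the uapi added earlier in the series, "reaping" a vcpu's ring means walking the kvm_dirty_gfn array that userspace mmap()ed from the vcpu fd at KVM_DIRTY_LOG_PAGE_OFFSET, collecting each dirty entry (the 01->1X transition), and then issuing the vm-wide KVM_RESET_DIRTY_RINGS ioctl. A rough sketch, not the test code itself (ring, entries and fetch are assumed caller state):

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/* Sketch: collect and reset one vcpu's dirty ring. */
	static int reap_dirty_ring(int vm_fd, struct kvm_dirty_gfn *ring,
				   __u32 entries, __u32 *fetch)
	{
		int count = 0;

		for (;;) {
			struct kvm_dirty_gfn *cur = &ring[*fetch % entries];

			if (cur->flags != KVM_DIRTY_GFN_F_DIRTY)
				break;		/* nothing more to collect */
			/* consume cur->slot and cur->offset (gfn) here... */
			cur->flags = KVM_DIRTY_GFN_F_RESET;	/* 01 -> 1X */
			(*fetch)++;
			count++;
		}

		if (count)
			ioctl(vm_fd, KVM_RESET_DIRTY_RINGS);	/* kernel: 1X -> 00 */

		return count;
	}

Userspace owns only the 01->1X transition and must collect entries in order, per the lifecycle comment in the header.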
So it still runs synchronously, like this:

vcpu thread                          main thread
1. vcpu dirties pages
2. vcpu gets dirty ring full (userspace exit)
                                     3. main thread waits until full (so hardware buffers are flushed)
                                     4. main thread collects the dirty pages
                                     5. main thread continues the vcpu
6. vcpu continues, goes back to 1

We can't collect the dirty bits directly during vcpu execution, because then we can't guarantee that the hardware dirty bits have been flushed when we collect; the test is strict about the dirty bits, so that could make the later verify procedure fail. A follow-up patch will make this test support async mode, just like the existing dirty log test, by adding a vcpu kick mechanism. Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 195 +++++++++++++++++- .../testing/selftests/kvm/include/kvm_util.h | 3 + tools/testing/selftests/kvm/lib/kvm_util.c | 68 ++++++ .../selftests/kvm/lib/kvm_util_internal.h | 4 + 4 files changed, 268 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 3542311f56ff..b4c210f33dd7 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -12,8 +12,10 @@ #include #include #include +#include #include #include +#include #include "test_util.h" #include "kvm_util.h" @@ -57,6 +59,8 @@ # define test_and_clear_bit_le test_and_clear_bit #endif +#define TEST_DIRTY_RING_COUNT 1024 + /* * Guest/Host shared variables. Ensure addr_gva2hva() and/or * sync_global_to/from_guest() are used when accessing from @@ -128,6 +132,10 @@ static uint64_t host_dirty_count; static uint64_t host_clear_count; static uint64_t host_track_next_count; +/* Whether dirty ring reset is requested, or finished */ +static sem_t dirty_ring_vcpu_stop; +static sem_t dirty_ring_vcpu_cont; + enum log_mode_t { /* Only use KVM_GET_DIRTY_LOG for logging */ LOG_MODE_DIRTY_LOG = 0, @@ -135,6 +143,9 @@ enum log_mode_t { /* Use both KVM_[GET|CLEAR]_DIRTY_LOG for logging */ LOG_MODE_CLERA_LOG = 1, + /* Use dirty ring for logging */ + LOG_MODE_DIRTY_RING = 2, + LOG_MODE_NUM, }; @@ -177,6 +188,115 @@ static void default_after_vcpu_run(struct kvm_vm *vm) exit_reason_str(run->exit_reason)); } +static void dirty_ring_create_vm_done(struct kvm_vm *vm) +{ + /* + * Switch to dirty ring mode after VM creation but before any + * of the vcpu creation.
+ */ + vm_enable_dirty_ring(vm, TEST_DIRTY_RING_COUNT * + sizeof(struct kvm_dirty_gfn)); +} + +static inline bool dirty_gfn_is_dirtied(struct kvm_dirty_gfn *gfn) +{ + return gfn->flags == KVM_DIRTY_GFN_F_DIRTY; +} + +static inline void dirty_gfn_set_collected(struct kvm_dirty_gfn *gfn) +{ + gfn->flags = KVM_DIRTY_GFN_F_RESET; +} + +static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns, + int slot, void *bitmap, + uint32_t num_pages, uint32_t *fetch_index) +{ + struct kvm_dirty_gfn *cur; + uint32_t count = 0; + + while (true) { + cur = &dirty_gfns[*fetch_index % TEST_DIRTY_RING_COUNT]; + if (!dirty_gfn_is_dirtied(cur)) + break; + TEST_ASSERT(cur->slot == slot, "Slot number didn't match: " + "%u != %u", cur->slot, slot); + TEST_ASSERT(cur->offset < num_pages, "Offset overflow: " + "0x%llx >= 0x%llx", cur->offset, num_pages); + DEBUG("fetch 0x%x page %llu\n", *fetch_index, cur->offset); + set_bit(cur->offset, bitmap); + dirty_gfn_set_collected(cur); + (*fetch_index)++; + count++; + } + + return count; +} + +static void dirty_ring_collect_dirty_pages(struct kvm_vm *vm, int slot, + void *bitmap, uint32_t num_pages) +{ + /* We only have one vcpu */ + static uint32_t fetch_index = 0; + uint32_t count = 0, cleared; + + /* + * Before fetching the dirty pages, we need a vmexit of the + * worker vcpu to make sure the hardware dirty buffers were + * flushed. This is not needed for dirty-log/clear-log tests + * because get dirty log will natually do so. + * + * For now we do it in the simple way - we simply wait until + * the vcpu uses up the soft dirty ring, then it'll always + * do a vmexit to make sure that PML buffers will be flushed. + * In real hypervisors, we probably need a vcpu kick or to + * stop the vcpus (before the final sync) to make sure we'll + * get all the existing dirty PFNs even cached in hardware. 
+ */ + sem_wait(&dirty_ring_vcpu_stop); + + /* Only have one vcpu */ + count = dirty_ring_collect_one(vcpu_map_dirty_ring(vm, VCPU_ID), + slot, bitmap, num_pages, &fetch_index); + + cleared = kvm_vm_reset_dirty_ring(vm); + + /* Cleared pages should be the same as collected */ + TEST_ASSERT(cleared == count, "Reset dirty pages (%u) mismatch " + "with collected (%u)", cleared, count); + + DEBUG("Notifying vcpu to continue\n"); + sem_post(&dirty_ring_vcpu_cont); + + DEBUG("Iteration %ld collected %u pages\n", iteration, count); +} + +static void dirty_ring_after_vcpu_run(struct kvm_vm *vm) +{ + struct kvm_run *run = vcpu_state(vm, VCPU_ID); + + /* A ucall-sync or ring-full event is allowed */ + if (get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC) { + /* We should allow this to continue */ + ; + } else if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL) { + sem_post(&dirty_ring_vcpu_stop); + DEBUG("vcpu stops because dirty ring full...\n"); + sem_wait(&dirty_ring_vcpu_cont); + DEBUG("vcpu continues now.\n"); + } else { + TEST_ASSERT(false, "Invalid guest sync status: " + "exit_reason=%s\n", + exit_reason_str(run->exit_reason)); + } +} + +static void dirty_ring_before_vcpu_join(void) +{ + /* Kick another round of vcpu just to make sure it will quit */ + sem_post(&dirty_ring_vcpu_cont); +} + struct log_mode { const char *name; /* Hook when the vm creation is done (before vcpu creation) */ @@ -186,6 +306,7 @@ struct log_mode { void *bitmap, uint32_t num_pages); /* Hook to call when after each vcpu run */ void (*after_vcpu_run)(struct kvm_vm *vm); + void (*before_vcpu_join) (void); } log_modes[LOG_MODE_NUM] = { { .name = "dirty-log", @@ -199,6 +320,13 @@ struct log_mode { .collect_dirty_pages = clear_log_collect_dirty_pages, .after_vcpu_run = default_after_vcpu_run, }, + { + .name = "dirty-ring", + .create_vm_done = dirty_ring_create_vm_done, + .collect_dirty_pages = dirty_ring_collect_dirty_pages, + .before_vcpu_join = dirty_ring_before_vcpu_join, + .after_vcpu_run = dirty_ring_after_vcpu_run, + }, }; /* @@ -245,6 +373,14 @@ static void log_mode_after_vcpu_run(struct kvm_vm *vm) mode->after_vcpu_run(vm); } +static void log_mode_before_vcpu_join(void) +{ + struct log_mode *mode = &log_modes[host_log_mode]; + + if (mode->before_vcpu_join) + mode->before_vcpu_join(); +} + static void generate_random_array(uint64_t *guest_array, uint64_t size) { uint64_t i; @@ -292,14 +428,65 @@ static void vm_dirty_log_verify(unsigned long *bmap) } if (test_and_clear_bit_le(page, bmap)) { + bool matched; + host_dirty_count++; + /* * If the bit is set, the value written onto * the corresponding page should be either the * previous iteration number or the current one. */ - TEST_ASSERT(*value_ptr == iteration || - *value_ptr == iteration - 1, + matched = (*value_ptr == iteration || + *value_ptr == iteration - 1); + + if (host_log_mode == LOG_MODE_DIRTY_RING && !matched) { + if (*value_ptr == iteration - 2) { + /* + * Short answer: this case is special + * only for dirty ring test where the + * page is the last page before a kvm + * dirty ring full in iteration N-2. + * + * Long answer: Assuming ring size R, + * one possible condition is: + * + * main thr vcpu thr + * -------- -------- + * iter=1 + * write 1 to page 0~(R-1) + * full, vmexit + * collect 0~(R-1) + * kick vcpu + * write 1 to (R-1)~(2R-2) + * full, vmexit + * iter=2 + * collect (R-1)~(2R-2) + * kick vcpu + * write 1 to (2R-2) + * (NOTE!!! 
"1" cached in cpu reg) + * write 2 to (2R-1)~(3R-3) + * full, vmexit + * iter=3 + * collect (2R-2)~(3R-3) + * (here if we read value on page + * "2R-2" is 1, while iter=3!!!) + */ + matched = true; + } else { + /* + * This is also special for dirty ring + * when this page is exactly the last + * page touched before vcpu ring full. + * If it happens, we should expect the + * value to change in the next round. + */ + set_bit_le(page, host_bmap_track); + continue; + } + } + + TEST_ASSERT(matched, "Set page %"PRIu64" value %"PRIu64 " incorrect (iteration=%"PRIu64")", page, *value_ptr, iteration); @@ -460,6 +647,7 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations, /* Tell the vcpu thread to quit */ host_quit = true; + log_mode_before_vcpu_join(); pthread_join(vcpu_thread, NULL); DEBUG("Total bits checked: dirty (%"PRIu64"), clear (%"PRIu64"), " @@ -524,6 +712,9 @@ int main(int argc, char *argv[]) unsigned int host_ipa_limit; #endif + sem_init(&dirty_ring_vcpu_stop, 0, 0); + sem_init(&dirty_ring_vcpu_cont, 0, 0); + #ifdef __x86_64__ vm_guest_mode_params_init(VM_MODE_PXXV48_4K, true, true); #endif diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 29cccaf96baf..4b78a8d3e773 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -67,6 +67,7 @@ enum vm_mem_backing_src_type { int kvm_check_cap(long cap); int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap); +void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size); struct kvm_vm *vm_create(enum vm_guest_mode mode, uint64_t phy_pages, int perm); struct kvm_vm *_vm_create(enum vm_guest_mode mode, uint64_t phy_pages, int perm); @@ -76,6 +77,7 @@ void kvm_vm_release(struct kvm_vm *vmp); void kvm_vm_get_dirty_log(struct kvm_vm *vm, int slot, void *log); void kvm_vm_clear_dirty_log(struct kvm_vm *vm, int slot, void *log, uint64_t first_page, uint32_t num_pages); +uint32_t kvm_vm_reset_dirty_ring(struct kvm_vm *vm); int kvm_memcmp_hva_gva(void *hva, struct kvm_vm *vm, const vm_vaddr_t gva, size_t len); @@ -137,6 +139,7 @@ void vcpu_nested_state_get(struct kvm_vm *vm, uint32_t vcpuid, int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_nested_state *state, bool ignore_error); #endif +void *vcpu_map_dirty_ring(struct kvm_vm *vm, uint32_t vcpuid); const char *exit_reason_str(unsigned int exit_reason); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 31253b4fa12f..25edf20d1962 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -85,6 +85,26 @@ int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap) return ret; } +void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size) +{ + struct kvm_enable_cap cap = {}; + int ret; + + ret = kvm_check_cap(KVM_CAP_DIRTY_LOG_RING); + + TEST_ASSERT(ret >= 0, "KVM_CAP_DIRTY_LOG_RING"); + + if (ret == 0) { + fprintf(stderr, "KVM does not support dirty ring, skipping tests\n"); + exit(KSFT_SKIP); + } + + cap.cap = KVM_CAP_DIRTY_LOG_RING; + cap.args[0] = ring_size; + vm_enable_cap(vm, &cap); + vm->dirty_ring_size = ring_size; +} + static void vm_open(struct kvm_vm *vm, int perm) { vm->kvm_fd = open(KVM_DEV_PATH, perm); @@ -299,6 +319,11 @@ void kvm_vm_clear_dirty_log(struct kvm_vm *vm, int slot, void *log, strerror(-ret)); } +uint32_t kvm_vm_reset_dirty_ring(struct kvm_vm *vm) +{ + return ioctl(vm->fd, 
KVM_RESET_DIRTY_RINGS); +} + /* * Userspace Memory Region Find * @@ -410,6 +435,13 @@ static void vm_vcpu_rm(struct kvm_vm *vm, uint32_t vcpuid) struct vcpu *vcpu = vcpu_find(vm, vcpuid); int ret; + if (vcpu->dirty_gfns) { + ret = munmap(vcpu->dirty_gfns, vm->dirty_ring_size); + TEST_ASSERT(ret == 0, "munmap of VCPU dirty ring failed, " + "rc: %i errno: %i", ret, errno); + vcpu->dirty_gfns = NULL; + } + ret = munmap(vcpu->state, sizeof(*vcpu->state)); TEST_ASSERT(ret == 0, "munmap of VCPU fd failed, rc: %i " "errno: %i", ret, errno); @@ -1425,6 +1457,41 @@ int _vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, return ret; } +void *vcpu_map_dirty_ring(struct kvm_vm *vm, uint32_t vcpuid) +{ + struct vcpu *vcpu; + uint32_t size = vm->dirty_ring_size; + + TEST_ASSERT(size > 0, "Should enable dirty ring first"); + + vcpu = vcpu_find(vm, vcpuid); + + TEST_ASSERT(vcpu, "Cannot find vcpu %u", vcpuid); + + if (!vcpu->dirty_gfns) { + void *addr; + + addr = mmap(NULL, size, PROT_READ, + MAP_PRIVATE, vcpu->fd, + vm->page_size * KVM_DIRTY_LOG_PAGE_OFFSET); + TEST_ASSERT(addr == MAP_FAILED, "Dirty ring mapped private"); + + addr = mmap(NULL, size, PROT_READ | PROT_EXEC, + MAP_PRIVATE, vcpu->fd, + vm->page_size * KVM_DIRTY_LOG_PAGE_OFFSET); + TEST_ASSERT(addr == MAP_FAILED, "Dirty ring mapped exec"); + + addr = mmap(NULL, size, PROT_READ | PROT_WRITE, + MAP_SHARED, vcpu->fd, + vm->page_size * KVM_DIRTY_LOG_PAGE_OFFSET); + + vcpu->dirty_gfns = addr; + vcpu->dirty_gfns_count = size / sizeof(struct kvm_dirty_gfn); + } + + return vcpu->dirty_gfns; +} + /* * VM Ioctl * @@ -1519,6 +1586,7 @@ static struct exit_reason { {KVM_EXIT_INTERNAL_ERROR, "INTERNAL_ERROR"}, {KVM_EXIT_OSI, "OSI"}, {KVM_EXIT_PAPR_HCALL, "PAPR_HCALL"}, + {KVM_EXIT_DIRTY_RING_FULL, "DIRTY_RING_FULL"}, #ifdef KVM_EXIT_MEMORY_NOT_PRESENT {KVM_EXIT_MEMORY_NOT_PRESENT, "MEMORY_NOT_PRESENT"}, #endif diff --git a/tools/testing/selftests/kvm/lib/kvm_util_internal.h b/tools/testing/selftests/kvm/lib/kvm_util_internal.h index ac50c42750cf..452e1f02611a 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util_internal.h +++ b/tools/testing/selftests/kvm/lib/kvm_util_internal.h @@ -39,6 +39,9 @@ struct vcpu { uint32_t id; int fd; struct kvm_run *state; + struct kvm_dirty_gfn *dirty_gfns; + uint32_t fetch_index; + uint32_t dirty_gfns_count; }; struct kvm_vm { @@ -61,6 +64,7 @@ struct kvm_vm { vm_paddr_t pgd; vm_vaddr_t gdt; vm_vaddr_t tss; + uint32_t dirty_ring_size; }; struct vcpu *vcpu_find(struct kvm_vm *vm, uint32_t vcpuid); From patchwork Wed Feb 5 02:58:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365717 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CE9041395 for ; Wed, 5 Feb 2020 02:59:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 99BB4214AF for ; Wed, 5 Feb 2020 02:59:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="ZTIErKkd" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728044AbgBEC7Q (ORCPT ); Tue, 4 Feb 2020 21:59:16 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:39107 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728072AbgBEC7L (ORCPT ); Tue, 4 Feb 2020 21:59:11 -0500 
From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 13/14] KVM: selftests: Let dirty_log_test async for dirty ring test Date: Tue, 4 Feb 2020 21:58:41 -0500 Message-Id: <20200205025842.367575-10-peterx@redhat.com> In-Reply-To: <20200205025842.367575-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> <20200205025842.367575-1-peterx@redhat.com> Previously the dirty ring test ran synchronously, because only a vmexit (here, the ring-full event) guarantees that the hardware dirty bits have been flushed to the dirty ring. This patch introduces a vcpu kick mechanism based on SIGUSR1; the kick forces a vmexit, which in turn guarantees the flushing of the hardware dirty bits. With that, the vcpu dirtying work can run asynchronously from the whole collection procedure.
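The core of that kick mechanism, pulled out of the diff below for readability (simplified; the full version also asserts on the signal number):

#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>

#define SIG_IPI SIGUSR1

static pthread_t vcpu_thread;

/* Empty handler: its only job is to make KVM_RUN return -EINTR */
static void vcpu_sig_handler(int sig)
{
}

static void vcpu_kick(void)
{
	pthread_kill(vcpu_thread, SIG_IPI);
}

/* sem_wait() variant that tolerates being interrupted by the kick signal */
static void sem_wait_until(sem_t *sem)
{
	int ret;

	do
		ret = sem_wait(sem);
	while (ret == -1 && errno == EINTR);
}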
Still, we need to be careful: the test can only run asynchronously while the vcpu has not hit the soft limit (no KVM_EXIT_DIRTY_RING_FULL); otherwise we must collect the dirty bits before letting the vcpu continue. Also increase the dirty ring size to the current maximum, so that we stress the no-ring-full case harder; that should be the major scenario when hypervisors like QEMU use this feature. Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 123 +++++++++++++----- .../testing/selftests/kvm/include/kvm_util.h | 1 + tools/testing/selftests/kvm/lib/kvm_util.c | 8 ++ 3 files changed, 103 insertions(+), 29 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index b4c210f33dd7..6c754e91fc50 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -13,6 +13,9 @@ #include #include #include +#include +#include +#include #include #include #include @@ -59,7 +62,9 @@ # define test_and_clear_bit_le test_and_clear_bit #endif -#define TEST_DIRTY_RING_COUNT 1024 +#define TEST_DIRTY_RING_COUNT 65536 + +#define SIG_IPI SIGUSR1 /* * Guest/Host shared variables. Ensure addr_gva2hva() and/or @@ -135,6 +140,12 @@ static uint64_t host_track_next_count; /* Whether dirty ring reset is requested, or finished */ static sem_t dirty_ring_vcpu_stop; static sem_t dirty_ring_vcpu_cont; +/* + * This is updated by the vcpu thread to tell the host whether it's a + * ring-full event. It should only be read until a sem_wait() of + * dirty_ring_vcpu_stop and before vcpu continues to run. + */ +static bool dirty_ring_vcpu_ring_full; enum log_mode_t { /* Only use KVM_GET_DIRTY_LOG for logging */ @@ -151,6 +162,33 @@ enum log_mode_t { /* Mode of logging.
Default is LOG_MODE_DIRTY_LOG */ static enum log_mode_t host_log_mode; +pthread_t vcpu_thread; + +/* Only way to pass this to the signal handler */ +struct kvm_vm *current_vm; + +static void vcpu_sig_handler(int sig) +{ + TEST_ASSERT(sig == SIG_IPI, "unknown signal: %d", sig); +} + +static void vcpu_kick(void) +{ + pthread_kill(vcpu_thread, SIG_IPI); +} + +/* + * In our test we do signal tricks, let's use a better version of + * sem_wait to avoid signal interrupts + */ +static void sem_wait_until(sem_t *sem) +{ + int ret; + + do + ret = sem_wait(sem); + while (ret == -1 && errno == EINTR); +} static void clear_log_create_vm_done(struct kvm_vm *vm) { @@ -179,10 +217,13 @@ static void clear_log_collect_dirty_pages(struct kvm_vm *vm, int slot, kvm_vm_clear_dirty_log(vm, slot, bitmap, 0, num_pages); } -static void default_after_vcpu_run(struct kvm_vm *vm) +static void default_after_vcpu_run(struct kvm_vm *vm, int ret, int err) { struct kvm_run *run = vcpu_state(vm, VCPU_ID); + TEST_ASSERT(ret == 0 || (ret == -1 && err == EINTR), + "vcpu run failed: errno=%d", err); + TEST_ASSERT(get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC, "Invalid guest sync status: exit_reason=%s\n", exit_reason_str(run->exit_reason)); @@ -233,27 +274,37 @@ static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns, return count; } +static void dirty_ring_wait_vcpu(void) +{ + /* This makes sure that hardware PML cache flushed */ + vcpu_kick(); + sem_wait_until(&dirty_ring_vcpu_stop); +} + +static void dirty_ring_continue_vcpu(void) +{ + DEBUG("Notifying vcpu to continue\n"); + sem_post(&dirty_ring_vcpu_cont); +} + static void dirty_ring_collect_dirty_pages(struct kvm_vm *vm, int slot, void *bitmap, uint32_t num_pages) { /* We only have one vcpu */ static uint32_t fetch_index = 0; uint32_t count = 0, cleared; + bool continued_vcpu = false; - /* - * Before fetching the dirty pages, we need a vmexit of the - * worker vcpu to make sure the hardware dirty buffers were - * flushed. This is not needed for dirty-log/clear-log tests - * because get dirty log will natually do so. - * - * For now we do it in the simple way - we simply wait until - * the vcpu uses up the soft dirty ring, then it'll always - * do a vmexit to make sure that PML buffers will be flushed. - * In real hypervisors, we probably need a vcpu kick or to - * stop the vcpus (before the final sync) to make sure we'll - * get all the existing dirty PFNs even cached in hardware. 
- */ - sem_wait(&dirty_ring_vcpu_stop); + dirty_ring_wait_vcpu(); + + if (!dirty_ring_vcpu_ring_full) { + /* + * This is not a ring-full event, it's safe to allow + * vcpu to continue + */ + dirty_ring_continue_vcpu(); + continued_vcpu = true; + } /* Only have one vcpu */ count = dirty_ring_collect_one(vcpu_map_dirty_ring(vm, VCPU_ID), @@ -265,13 +316,16 @@ static void dirty_ring_collect_dirty_pages(struct kvm_vm *vm, int slot, TEST_ASSERT(cleared == count, "Reset dirty pages (%u) mismatch " "with collected (%u)", cleared, count); - DEBUG("Notifying vcpu to continue\n"); - sem_post(&dirty_ring_vcpu_cont); + if (!continued_vcpu) { + TEST_ASSERT(dirty_ring_vcpu_ring_full, + "Didn't continue vcpu even without ring full"); + dirty_ring_continue_vcpu(); + } DEBUG("Iteration %ld collected %u pages\n", iteration, count); } -static void dirty_ring_after_vcpu_run(struct kvm_vm *vm) +static void dirty_ring_after_vcpu_run(struct kvm_vm *vm, int ret, int err) { struct kvm_run *run = vcpu_state(vm, VCPU_ID); @@ -279,10 +333,16 @@ static void dirty_ring_after_vcpu_run(struct kvm_vm *vm) if (get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC) { /* We should allow this to continue */ ; - } else if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL) { + } else if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL || + (ret == -1 && err == EINTR)) { + /* Update the flag first before pause */ + WRITE_ONCE(dirty_ring_vcpu_ring_full, + run->exit_reason == KVM_EXIT_DIRTY_RING_FULL); sem_post(&dirty_ring_vcpu_stop); - DEBUG("vcpu stops because dirty ring full...\n"); - sem_wait(&dirty_ring_vcpu_cont); + DEBUG("vcpu stops because %s...\n", + dirty_ring_vcpu_ring_full ? + "dirty ring is full" : "vcpu is kicked out"); + sem_wait_until(&dirty_ring_vcpu_cont); DEBUG("vcpu continues now.\n"); } else { TEST_ASSERT(false, "Invalid guest sync status: " @@ -305,7 +365,7 @@ struct log_mode { void (*collect_dirty_pages) (struct kvm_vm *vm, int slot, void *bitmap, uint32_t num_pages); /* Hook to call when after each vcpu run */ - void (*after_vcpu_run)(struct kvm_vm *vm); + void (*after_vcpu_run)(struct kvm_vm *vm, int ret, int err); void (*before_vcpu_join) (void); } log_modes[LOG_MODE_NUM] = { { @@ -365,12 +425,12 @@ static void log_mode_collect_dirty_pages(struct kvm_vm *vm, int slot, mode->collect_dirty_pages(vm, slot, bitmap, num_pages); } -static void log_mode_after_vcpu_run(struct kvm_vm *vm) +static void log_mode_after_vcpu_run(struct kvm_vm *vm, int ret, int err) { struct log_mode *mode = &log_modes[host_log_mode]; if (mode->after_vcpu_run) - mode->after_vcpu_run(vm); + mode->after_vcpu_run(vm, ret, err); } static void log_mode_before_vcpu_join(void) @@ -394,15 +454,21 @@ static void *vcpu_worker(void *data) int ret; struct kvm_vm *vm = data; uint64_t *guest_array; + struct sigaction sigact; + + current_vm = vm; + memset(&sigact, 0, sizeof(sigact)); + sigact.sa_handler = vcpu_sig_handler; + sigaction(SIG_IPI, &sigact, NULL); guest_array = addr_gva2hva(vm, (vm_vaddr_t)random_array); while (!READ_ONCE(host_quit)) { + /* Clear any existing kick signals */ generate_random_array(guest_array, TEST_PAGES_PER_LOOP); /* Let the guest dirty the random pages */ - ret = _vcpu_run(vm, VCPU_ID); - TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret); - log_mode_after_vcpu_run(vm); + ret = __vcpu_run(vm, VCPU_ID); + log_mode_after_vcpu_run(vm, ret, errno); } return NULL; @@ -549,7 +615,6 @@ static struct kvm_vm *create_vm(enum vm_guest_mode mode, uint32_t vcpuid, static void run_test(enum vm_guest_mode mode, unsigned long iterations, unsigned long 
interval, uint64_t phys_offset) { - pthread_t vcpu_thread; struct kvm_vm *vm; unsigned long *bmap; diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 4b78a8d3e773..e64fbfe6bbd5 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -115,6 +115,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva); struct kvm_run *vcpu_state(struct kvm_vm *vm, uint32_t vcpuid); void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); +int __vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid); void vcpu_set_mp_state(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_mp_state *mp_state); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 25edf20d1962..5137882503bd 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1203,6 +1203,14 @@ int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid) return rc; } +int __vcpu_run(struct kvm_vm *vm, uint32_t vcpuid) +{ + struct vcpu *vcpu = vcpu_find(vm, vcpuid); + + TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid); + return ioctl(vcpu->fd, KVM_RUN, NULL); +} + void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid) { struct vcpu *vcpu = vcpu_find(vm, vcpuid);
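With the kick in place, the dirty-ring mode runs like the other logging modes of this selftest. Assuming the log mode is picked with the -M switch introduced earlier in the series (not shown in this excerpt), an example invocation could look like:

# 32 iterations, 10 ms between two dirty log collections, dirty-ring mode
./dirty_log_test -i 32 -I 10 -M dirty-ring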
From patchwork Wed Feb 5 03:00:42 2020 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 11365725 From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: dinechin@redhat.com, sean.j.christopherson@intel.com, pbonzini@redhat.com, jasowang@redhat.com, yan.y.zhao@intel.com, mst@redhat.com, peterx@redhat.com, kevin.tian@intel.com, alex.williamson@redhat.com, dgilbert@redhat.com, vkuznets@redhat.com Subject: [PATCH 14/14] KVM: selftests: Add "-c" parameter to dirty log test Date: Tue, 4 Feb 2020 22:00:42 -0500 Message-Id: <20200205030042.367713-1-peterx@redhat.com> In-Reply-To: <20200205025105.367213-1-peterx@redhat.com> References: <20200205025105.367213-1-peterx@redhat.com> The new "-c" parameter only overrides the default dirty ring size/count. With a bigger ring count we test the async behavior of the dirty ring; with a smaller ring count we test the ring-full code path. Async is the default. The parameter has no effect on the non-dirty-ring tests. Signed-off-by: Peter Xu --- tools/testing/selftests/kvm/dirty_log_test.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 6c754e91fc50..40312fdbe0d2 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -163,6 +163,7 @@ enum log_mode_t { /* Mode of logging. Default is LOG_MODE_DIRTY_LOG */ static enum log_mode_t host_log_mode; pthread_t vcpu_thread; +static uint32_t test_dirty_ring_count = TEST_DIRTY_RING_COUNT; /* Only way to pass this to the signal handler */ struct kvm_vm *current_vm; @@ -235,7 +236,7 @@ static void dirty_ring_create_vm_done(struct kvm_vm *vm) * Switch to dirty ring mode after VM creation but before any * of the vcpu creation.
*/ - vm_enable_dirty_ring(vm, TEST_DIRTY_RING_COUNT * + vm_enable_dirty_ring(vm, test_dirty_ring_count * sizeof(struct kvm_dirty_gfn)); } @@ -257,7 +258,7 @@ static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns, uint32_t count = 0; while (true) { - cur = &dirty_gfns[*fetch_index % TEST_DIRTY_RING_COUNT]; + cur = &dirty_gfns[*fetch_index % test_dirty_ring_count]; if (!dirty_gfn_is_dirtied(cur)) break; TEST_ASSERT(cur->slot == slot, "Slot number didn't match: " @@ -744,6 +745,9 @@ static void help(char *name) printf("usage: %s [-h] [-i iterations] [-I interval] " "[-p offset] [-m mode]\n", name); puts(""); + printf(" -c: specify dirty ring size, in number of entries\n"); + printf(" (only useful for dirty-ring test; default: %"PRIu32")\n", + TEST_DIRTY_RING_COUNT); printf(" -i: specify iteration counts (default: %"PRIu64")\n", TEST_HOST_LOOP_N); printf(" -I: specify interval in ms (default: %"PRIu64" ms)\n", @@ -799,8 +803,11 @@ int main(int argc, char *argv[]) vm_guest_mode_params_init(VM_MODE_P40V48_4K, true, true); #endif - while ((opt = getopt(argc, argv, "hi:I:p:m:M:")) != -1) { + while ((opt = getopt(argc, argv, "c:hi:I:p:m:M:")) != -1) { switch (opt) { + case 'c': + test_dirty_ring_count = strtol(optarg, NULL, 10); + break; case 'i': iterations = strtol(optarg, NULL, 10); break;