From patchwork Fri Sep 8 22:28:48 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377859
Date: Fri, 8 Sep 2023 22:28:48 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-2-amoorthy@google.com>
Subject: [PATCH v5 01/17] KVM: Clarify documentation of hva_to_pfn()'s 'atomic' parameter
From: Anish Moorthy <amoorthy@google.com>
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com,
    jthoughton@google.com, amoorthy@google.com, ricarkol@google.com,
    axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com,
    isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com
X-Mailing-List: kvm@vger.kernel.org

The current docstring can be read as "atomic -> allowed to sleep," when in
fact the intended statement is "atomic -> NOT allowed to sleep." Make that
clearer in the docstring.
Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d63cf1c4f5a7..84e90ed3a134 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2669,7 +2669,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 /*
  * Pin guest page in memory and return its pfn.
  * @addr: host virtual address which maps memory to the guest
- * @atomic: whether this function can sleep
+ * @atomic: whether this function is forbidden from sleeping
  * @interruptible: whether the process can be interrupted by non-fatal signals
  * @async: whether this function need to wait IO complete if the
  *	host page is not in the memory

From patchwork Fri Sep 8 22:28:49 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377858
Date: Fri, 8 Sep 2023 22:28:49 +0000
Message-ID: <20230908222905.1321305-3-amoorthy@google.com>
Subject: [PATCH v5 02/17] KVM: Add docstrings to __kvm_read/write_guest_page()
From: Anish Moorthy <amoorthy@google.com>

The (gfn, data, offset, len) order of parameters is a little strange, since
"offset" actually applies to "gfn" rather than to "data". Add docstrings to
make things perfectly clear.

Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 virt/kvm/kvm_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 84e90ed3a134..12837416ce8a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3014,6 +3014,9 @@ static int next_segment(unsigned long len, int offset)
 	return len;
 }
 
+/*
+ * Copy 'len' bytes from guest memory at '(gfn * PAGE_SIZE) + offset' to 'data'
+ */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 				 void *data, int offset, int len)
 {
@@ -3115,6 +3118,9 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 
+/*
+ * Copy 'len' bytes from 'data' into guest memory at '(gfn * PAGE_SIZE) + offset'
+ */
 static int __kvm_write_guest_page(struct kvm *kvm,
 				  struct kvm_memory_slot *memslot, gfn_t gfn,
 				  const void *data, int offset, int len)

From patchwork Fri Sep 8 22:28:50 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377860
Date: Fri, 8 Sep 2023 22:28:50 +0000
Message-ID: <20230908222905.1321305-4-amoorthy@google.com>
Subject: [PATCH v5 03/17] KVM: Simplify error handling in __gfn_to_pfn_memslot()
From: Anish Moorthy <amoorthy@google.com>

KVM_HVA_ERR_RO_BAD satisfies kvm_is_error_hva(), so there's no need to
duplicate the "if (writable)" block. Fix this by bringing all
kvm_is_error_hva() cases under one conditional.
Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 virt/kvm/kvm_main.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 12837416ce8a..8b2d5aab32bf 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2741,15 +2741,13 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 	if (hva)
 		*hva = addr;
 
-	if (addr == KVM_HVA_ERR_RO_BAD) {
-		if (writable)
-			*writable = false;
-		return KVM_PFN_ERR_RO_FAULT;
-	}
-
 	if (kvm_is_error_hva(addr)) {
 		if (writable)
 			*writable = false;
+
+		if (addr == KVM_HVA_ERR_RO_BAD)
+			return KVM_PFN_ERR_RO_FAULT;
+
 		return KVM_PFN_NOSLOT;
 	}
 

From patchwork Fri Sep 8 22:28:51 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377862
Date: Fri, 8 Sep 2023 22:28:51 +0000
Message-ID: <20230908222905.1321305-5-amoorthy@google.com>
Subject: [PATCH v5 04/17] KVM: Add KVM_CAP_MEMORY_FAULT_INFO
From: Anish Moorthy <amoorthy@google.com>

KVM_CAP_MEMORY_FAULT_INFO allows kvm_run to return useful information besides
a return value of -1 and errno of EFAULT when a vCPU fails an access to guest
memory which may be resolvable by userspace.

Add documentation, updates to the KVM headers, and a helper function
(kvm_handle_guest_uaccess_fault()) for implementing the capability.

Mark KVM_CAP_MEMORY_FAULT_INFO as available on arm64 and x86, even though
EFAULT annotations are currently totally absent. Picking a point to declare
the implementation "done" is difficult because

1. Annotations will be performed incrementally in subsequent commits across
   both core and arch-specific KVM.
2. The initial series will very likely miss some cases which need annotation.
   Although these omissions are to be fixed in the future, userspace thus
   still needs to expect and be able to handle unannotated EFAULTs.

Given these qualifications, just marking it available here seems the least
arbitrary thing to do.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Anish Moorthy <amoorthy@google.com>
---
 Documentation/virt/kvm/api.rst | 49 ++++++++++++++++++++++++++++++++--
 include/linux/kvm_host.h       | 35 ++++++++++++++++++++++++
 include/uapi/linux/kvm.h       | 34 +++++++++++++++++++++++
 tools/include/uapi/linux/kvm.h | 24 +++++++++++++++++
 virt/kvm/kvm_main.c            |  3 +++
 5 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 660d9ca7a251..92fd3faa6bab 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6130,8 +6130,10 @@ local APIC is not used.
 	__u16 flags;
 
-More architecture-specific flags detailing state of the VCPU that may
-affect the device's behavior. Current defined flags::
+Flags detailing state of the VCPU. The lower/upper bytes encode architecture
+specific/agnostic bytes respectively. Currently defined flags
+
+::
 
   /* x86, set if the VCPU is in system management mode */
   #define KVM_RUN_X86_SMM     (1 << 0)
@@ -6140,6 +6142,9 @@ affect the device's behavior. Current defined flags::
   /* arm64, set for KVM_EXIT_DEBUG */
   #define KVM_DEBUG_ARCH_HSR_HIGH_VALID  (1 << 0)
 
+  /* Arch-agnostic, see KVM_CAP_MEMORY_FAULT_INFO */
+  #define KVM_RUN_MEMORY_FAULT_FILLED (1 << 8)
+
 ::
 
 	/* in (pre_kvm_run), out (post_kvm_run) */
@@ -6750,6 +6755,18 @@ kvm_valid_regs for specific bits. These bits are architecture specific and
 usually define the validity of a groups of registers. (e.g. one bit for general
 purpose registers)
 
+::
+
+	union {
+		/* KVM_SPEC_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len; /* in bytes */
+		} memory_fault;
+
+Indicates a memory fault on the guest physical address range
+[gpa, gpa + len). See KVM_CAP_MEMORY_FAULT_INFO for more details.
+
 Please note that the kernel is allowed to use the kvm_run structure as the
 primary storage for certain register types. Therefore, the kernel may use the
 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
@@ -7736,6 +7753,34 @@ This capability is aimed to mitigate the threat that malicious VMs can
 cause CPU stuck (due to event windows don't open up) and make the CPU
 unavailable to host or other VMs.
 
+7.34 KVM_CAP_MEMORY_FAULT_INFO
+------------------------------
+
+:Architectures: x86, arm64
+:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
+
+The presence of this capability indicates that KVM_RUN may fill
+kvm_run.memory_fault in response to failed guest memory accesses in a vCPU
+context.
+
+When KVM_RUN returns an error with errno=EFAULT, (kvm_run.flags &
+KVM_RUN_MEMORY_FAULT_FILLED) indicates that the kvm_run.memory_fault field is
+valid. This capability is only partially implemented in that not all EFAULTs
+from KVM_RUN may be annotated in this way: these "bare" EFAULTs should be
+considered bugs and reported to the maintainers.
+
+The 'gpa' and 'len' (in bytes) fields of kvm_run.memory_fault describe the range
+of guest physical memory to which access failed, i.e. [gpa, gpa + len). 'flags'
+is a bitfield indicating the nature of the access: valid masks are
+
+  - KVM_MEMORY_FAULT_FLAG_READ:  The failed access was a read.
+  - KVM_MEMORY_FAULT_FLAG_WRITE: The failed access was a write.
+  - KVM_MEMORY_FAULT_FLAG_EXEC:  The failed access was an exec.
+
+Note: Userspaces which attempt to resolve memory faults so that they can retry
+KVM_RUN are encouraged to guard against repeatedly receiving the same
+error/annotated fault.
+
 8. Other capabilities.
 ======================
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index fb6c6109fdca..9206ac944d31 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -392,6 +392,12 @@ struct kvm_vcpu {
 	 */
 	struct kvm_memory_slot *last_used_slot;
 	u64 last_used_slot_gen;
+
+	/*
+	 * KVM_RUN initializes this value to KVM_SPEC_EXIT_UNUSED on entry and
+	 * sets it to something else when it fills the speculative_exit struct.
+	 */
+	u8 speculative_exit_canary;
 };
 
 /*
@@ -2318,4 +2324,33 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 /* Max number of entries allowed for each kvm dirty ring */
 #define KVM_DIRTY_RING_MAX_ENTRIES  65536
 
+/*
+ * Attempts to set the run struct's exit reason to KVM_EXIT_MEMORY_FAULT and
+ * populate the memory_fault field with the given information.
+ *
+ * WARNs and does nothing if the speculative exit canary has already been set
+ * or if 'vcpu' is not the current running vcpu.
+ */
+static inline void kvm_handle_guest_uaccess_fault(struct kvm_vcpu *vcpu,
+						  uint64_t gpa, uint64_t len, uint64_t flags)
+{
+	/*
+	 * Ensure that an unloaded vCPU's run struct isn't being modified
+	 */
+	if (WARN_ON_ONCE(vcpu != kvm_get_running_vcpu()))
+		return;
+
+	/*
+	 * Warn when overwriting an already-populated run struct.
+	 */
+	WARN_ON_ONCE(vcpu->speculative_exit_canary != KVM_SPEC_EXIT_UNUSED);
+
+	vcpu->speculative_exit_canary = KVM_SPEC_EXIT_MEMORY_FAULT;
+
+	vcpu->run->flags |= KVM_RUN_MEMORY_FAULT_FILLED;
+	vcpu->run->memory_fault.gpa = gpa;
+	vcpu->run->memory_fault.len = len;
+	vcpu->run->memory_fault.flags = flags;
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f089ab290978..b2e4ac83b5a8 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -265,6 +265,9 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
 
+#define KVM_SPEC_EXIT_UNUSED        0
+#define KVM_SPEC_EXIT_MEMORY_FAULT  1
+
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION	1
@@ -278,6 +281,9 @@ struct kvm_xen_exit {
 /* Flags that describe what fields in emulation_failure hold valid data. */
 #define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES	(1ULL << 0)
 
+/* KVM_CAP_MEMORY_FAULT_INFO flag for kvm_run.flags */
+#define KVM_RUN_MEMORY_FAULT_FILLED (1 << 8)
+
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
@@ -531,6 +537,27 @@ struct kvm_run {
 		struct kvm_sync_regs regs;
 		char padding[SYNC_REGS_SIZE_BYTES];
 	} s;
+
+	/*
+	 * This second exit union holds structs for exits which may be triggered
+	 * after KVM has already initiated a different exit, and/or may be
+	 * filled speculatively by KVM.
+	 *
+	 * For instance, because of limitations in KVM's uAPI, a memory fault
+	 * may be encountered after an MMIO exit is initiated and exit_reason and
+	 * kvm_run.mmio are filled: isolating the speculative exits here ensures
+	 * that KVM won't clobber information for the original exit.
+	 */
+	union {
+		/* KVM_SPEC_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len;
+		} memory_fault;
+		/* Fix the size of the union. */
+		char speculative_exit_padding[256];
+	};
 };
 
 /* for KVM_REGISTER_COALESCED_MMIO / KVM_UNREGISTER_COALESCED_MMIO */
@@ -1192,6 +1219,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_COUNTER_OFFSET 227
 #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
+#define KVM_CAP_MEMORY_FAULT_INFO 230
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -2249,4 +2277,10 @@ struct kvm_s390_zpci_op {
 /* flags for kvm_s390_zpci_op->u.reg_aen.flags */
 #define KVM_S390_ZPCIOP_REGAEN_HOST    (1 << 0)
 
+/* flags for KVM_CAP_MEMORY_FAULT_INFO */
+
+#define KVM_MEMORY_FAULT_FLAG_READ    (1 << 0)
+#define KVM_MEMORY_FAULT_FLAG_WRITE   (1 << 1)
+#define KVM_MEMORY_FAULT_FLAG_EXEC    (1 << 2)
+
 #endif /* __LINUX_KVM_H */
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index f089ab290978..d19aa7965392 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -278,6 +278,9 @@ struct kvm_xen_exit {
 /* Flags that describe what fields in emulation_failure hold valid data. */
 #define KVM_INTERNAL_ERROR_EMULATION_FLAG_INSTRUCTION_BYTES	(1ULL << 0)
 
+/* KVM_CAP_MEMORY_FAULT_INFO flag for kvm_run.flags */
+#define KVM_RUN_MEMORY_FAULT_FILLED (1 << 8)
+
 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
 	/* in */
@@ -531,6 +534,27 @@ struct kvm_run {
 		struct kvm_sync_regs regs;
 		char padding[SYNC_REGS_SIZE_BYTES];
 	} s;
+
+	/*
+	 * This second exit union holds structs for exits which may be triggered
+	 * after KVM has already initiated a different exit, and/or may be
+	 * filled speculatively by KVM.
+	 *
+	 * For instance, because of limitations in KVM's uAPI, a memory fault
+	 * may be encountered after an MMIO exit is initiated and exit_reason and
+	 * kvm_run.mmio are filled: isolating the speculative exits here ensures
+	 * that KVM won't clobber information for the original exit.
+	 */
+	union {
+		/* KVM_RUN_MEMORY_FAULT_FILLED + EFAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 len;
+		} memory_fault;
+		/* Fix the size of the union. */
+		char speculative_exit_padding[256];
+	};
 };
 
 /* for KVM_REGISTER_COALESCED_MMIO / KVM_UNREGISTER_COALESCED_MMIO */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b2d5aab32bf..e31435179764 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4151,6 +4151,8 @@ static long kvm_vcpu_ioctl(struct file *filp,
 			synchronize_rcu();
 			put_pid(oldpid);
 		}
+		vcpu->speculative_exit_canary = KVM_SPEC_EXIT_UNUSED;
+		vcpu->run->flags &= ~KVM_RUN_MEMORY_FAULT_FILLED;
 		r = kvm_arch_vcpu_ioctl_run(vcpu);
 		trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
 		break;
@@ -4539,6 +4541,7 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 	case KVM_CAP_CHECK_EXTENSION_VM:
 	case KVM_CAP_ENABLE_CAP_VM:
 	case KVM_CAP_HALT_POLL:
+	case KVM_CAP_MEMORY_FAULT_INFO:
 		return 1;
 #ifdef CONFIG_KVM_MMIO
 	case KVM_CAP_COALESCED_MMIO:

From patchwork Fri Sep 8 22:28:52 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377861
15:29:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1694212193; x=1694816993; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=wf8TO8PxR5yOir1Q53pTO4W/7i3l7j9F46MQ/NBc+xE=; b=AWtddfLlP054rvxFk4vlEGlSr0TQEOjgr8HD2IBhhizsqeo5B+AvmdmvdEqYq7XaGY 6T9H30Jo2vpaKBEIs4ypadq6L5x5/a49zrguOy8KZR89kP83r+Yw1RdRQNie9bJpvZVE Z+Odeg2x/SKPpGE77lYUdvLWKwfQ2QEGXoPauDaUUdw9mQaueSSqMTwGkhg/8r4wMLXh 2S4YCStuXRVTh3Gexdhox4/OmTmHRpaBNFw45zMldSwH+gOMDP+LSAQN7usOPgphpY29 ajMo4vVPmS6OTmuFclneeqcppGZPFFGzUligJq4xkrZbENtvIGhKDXAlJi5qxpoylVBc wWJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694212193; x=1694816993; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=wf8TO8PxR5yOir1Q53pTO4W/7i3l7j9F46MQ/NBc+xE=; b=QVCp1JAizSsN+UpMMr+c2gC2fFoQiHGDalSdkAzSZRsK7UTOXv+jN4j2U3IuJY7vB5 6Y1bknUPXAzvL7Snk1FKIj3918MO6CYRm2i1OsvCVapsZD88Yr9wWrl4USyaZ8Ph4+Hy fyUJlUPkkqnwnFIYK7Np72Td+3ezK28QlBmWJ5/xSp9V8loX3CiQP42Xels9ILVvIm93 FL/pO2fdJpDtdZ7QiPkR9vTlrm5InH7fHLB97XUesb9OIAqZ9Fyv0oQRR2WcComzvPQ4 UyeRIv5ol1XI8L2YKJeyncm7ulJrMVikvIhY7DZDUclG2EeRIaRhcCAhC2+R04bxw+3K iqTw== X-Gm-Message-State: AOJu0YxU9RUUvIK9A75I7rBgCmsMDlm74wh7m2Y1npXuKkTKZJT4gScM ZSwIo92CRF1AviNXTWqXg0FliRAsbx8dag== X-Google-Smtp-Source: AGHT+IGgTkMjv+1Zp4Trc1aoaNyokP2piKmk6xlEDnMLwdBUdnUHoIzK29yx1Osrr1Pf4OF8Tp58c7zQv62wxw== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a25:a1ca:0:b0:d78:2c3:e633 with SMTP id a68-20020a25a1ca000000b00d7802c3e633mr73175ybi.2.1694212193355; Fri, 08 Sep 2023 15:29:53 -0700 (PDT) Date: Fri, 8 Sep 2023 22:28:52 +0000 In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> X-Mailer: 
git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230908222905.1321305-6-amoorthy@google.com> Subject: [PATCH v5 05/17] KVM: Annotate -EFAULTs from kvm_vcpu_read/write_guest_page() From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Implement KVM_CAP_MEMORY_FAULT_INFO for uaccess failures in kvm_vcpu_read/write_guest_page() Signed-off-by: Anish Moorthy --- virt/kvm/kvm_main.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e31435179764..13aa2ed11d0d 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3043,8 +3043,12 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data, int offset, int len) { struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); + int r = __kvm_read_guest_page(slot, gfn, data, offset, len); - return __kvm_read_guest_page(slot, gfn, data, offset, len); + if (r) + kvm_handle_guest_uaccess_fault(vcpu, gfn * PAGE_SIZE + offset, + len, KVM_MEMORY_FAULT_FLAG_READ); + return r; } EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page); @@ -3149,8 +3153,12 @@ int kvm_vcpu_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, const void *data, int offset, int len) { struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); + int r = __kvm_write_guest_page(vcpu->kvm, slot, gfn, data, offset, len); - return __kvm_write_guest_page(vcpu->kvm, slot, gfn, data, offset, len); + if (r) + kvm_handle_guest_uaccess_fault(vcpu, gfn * PAGE_SIZE + offset, + len, KVM_MEMORY_FAULT_FLAG_WRITE); + return r; } EXPORT_SYMBOL_GPL(kvm_vcpu_write_guest_page); From patchwork Fri Sep 8 
22:28:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377863
Date: Fri, 8 Sep 2023 22:28:53 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-7-amoorthy@google.com>
Subject: [PATCH v5 06/17] KVM: x86: Annotate -EFAULTs from kvm_handle_error_pfn()
From: Anish Moorthy
X-Mailing-List: kvm@vger.kernel.org

Implement KVM_CAP_MEMORY_FAULT_INFO for -EFAULTs generated by
kvm_handle_error_pfn().
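The annotation added here reports exactly one access type per fault, with write taking priority over exec and read as the fallback. A minimal userspace-style sketch of that selection logic (the flag values below are stand-ins, not the real uapi constants from <linux/kvm.h>):

```c
#include <stdint.h>

/* Stand-in values; the real constants come from <linux/kvm.h>. */
#define KVM_MEMORY_FAULT_FLAG_READ  (1ULL << 0)
#define KVM_MEMORY_FAULT_FLAG_WRITE (1ULL << 1)
#define KVM_MEMORY_FAULT_FLAG_EXEC  (1ULL << 2)

/*
 * Mirror of the fault_flags selection in kvm_handle_error_pfn():
 * exactly one access-type flag is chosen for the annotated fault.
 */
static uint64_t select_fault_flags(int is_write, int is_exec)
{
	if (is_write)
		return KVM_MEMORY_FAULT_FLAG_WRITE;
	if (is_exec)
		return KVM_MEMORY_FAULT_FLAG_EXEC;
	return KVM_MEMORY_FAULT_FLAG_READ;
}
```

A fault should never be both a write and an exec in practice; the if/else ordering simply makes the write flag win if both are somehow set.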
Signed-off-by: Anish Moorthy
---
 arch/x86/kvm/mmu/mmu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e1d011c67cc6..deae8ac74d9a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3267,6 +3267,8 @@ static void kvm_send_hwpoison_signal(struct kvm_memory_slot *slot, gfn_t gfn)
 static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
+	u64 fault_flags;
+
 	if (is_sigpending_pfn(fault->pfn)) {
 		kvm_handle_signal_exit(vcpu);
 		return -EINTR;
@@ -3285,6 +3287,17 @@ static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fa
 		return RET_PF_RETRY;
 	}

+	WARN_ON_ONCE(fault->goal_level != PG_LEVEL_4K);
+
+	fault_flags = 0;
+	if (fault->write)
+		fault_flags = KVM_MEMORY_FAULT_FLAG_WRITE;
+	else if (fault->exec)
+		fault_flags = KVM_MEMORY_FAULT_FLAG_EXEC;
+	else
+		fault_flags = KVM_MEMORY_FAULT_FLAG_READ;
+	kvm_handle_guest_uaccess_fault(vcpu, gfn_to_gpa(fault->gfn), PAGE_SIZE,
+				       fault_flags);
 	return -EFAULT;
 }

From patchwork Fri Sep 8 22:28:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377864
Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net
Date: Fri, 8 Sep 2023 22:28:54 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-8-amoorthy@google.com>
Subject: [PATCH v5 07/17] KVM: arm64: Annotate -EFAULT from user_mem_abort()
From: Anish Moorthy
X-Mailing-List: kvm@vger.kernel.org

Implement KVM_CAP_MEMORY_FAULT_INFO for guest access failures in
user_mem_abort().

Signed-off-by: Anish Moorthy
---
 arch/arm64/kvm/mmu.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 587a104f66c3..8ede6c5edc5f 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1408,6 +1408,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
+	uint64_t memory_fault_flags;

 	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
 	write_fault = kvm_is_write_fault(vcpu);
@@ -1507,8 +1508,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
 	}
-	if (is_error_noslot_pfn(pfn))
+	if (is_error_noslot_pfn(pfn)) {
+		memory_fault_flags = 0;
+		if (write_fault)
+			memory_fault_flags = KVM_MEMORY_FAULT_FLAG_WRITE;
+		else if (exec_fault)
+			memory_fault_flags = KVM_MEMORY_FAULT_FLAG_EXEC;
+		else
+			memory_fault_flags = KVM_MEMORY_FAULT_FLAG_READ;
+		kvm_handle_guest_uaccess_fault(vcpu, round_down(gfn * PAGE_SIZE, vma_pagesize),
+					       vma_pagesize, memory_fault_flags);
 		return -EFAULT;
+	}

 	if (kvm_is_device_pfn(pfn)) {
 		/*

From patchwork Fri Sep 8 22:28:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377865
Date: Fri, 8 Sep 2023 22:28:55 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-9-amoorthy@google.com>
Subject: [PATCH v5 08/17] KVM: Allow hva_to_pfn_fast() to resolve read faults
From: Anish Moorthy
X-Mailing-List: kvm@vger.kernel.org

hva_to_pfn_fast() currently just fails for faults where establishing
writable mappings is forbidden, which is unnecessary.
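The change boils down to choosing GUP flags from the fault type instead of bailing out of the fast path: request FOLL_WRITE only when a writable mapping is needed or permitted. A sketch of that selection (the FOLL_WRITE value here is a stand-in, not the kernel's definition):

```c
/* Stand-in for the kernel's FOLL_WRITE GUP flag. */
#define FOLL_WRITE 0x01

/*
 * Mirror of the revised hva_to_pfn_fast() logic: read faults that may
 * not take a writable mapping now stay on the fast path with a
 * read-only GUP request instead of failing immediately.
 */
static unsigned int select_gup_flags(int write_fault, int writable_allowed)
{
	return (write_fault || writable_allowed) ? FOLL_WRITE : 0;
}
```

Only the third case below (a read fault into a mapping that must stay read-only) changes behavior relative to the old code, which returned false there.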
Instead, try getting the page without passing FOLL_WRITE. This allows the
aforementioned faults to (potentially) be resolved without falling back to
slow GUP.

Suggested-by: James Houghton
Signed-off-by: Anish Moorthy
---
 virt/kvm/kvm_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 13aa2ed11d0d..a7e6320dd7f0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2508,7 +2508,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
 }

 /*
- * The fast path to get the writable pfn which will be stored in @pfn,
+ * The fast path to get the pfn which will be stored in @pfn,
  * true indicates success, otherwise false is returned. It's also the
  * only part that runs if we can in atomic context.
  */
@@ -2522,10 +2522,9 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
 	 * or the caller allows to map a writable pfn for a read fault
 	 * request.
 	 */
-	if (!(write_fault || writable))
-		return false;
+	unsigned int gup_flags = (write_fault || writable) ? FOLL_WRITE : 0;

-	if (get_user_page_fast_only(addr, FOLL_WRITE, page)) {
+	if (get_user_page_fast_only(addr, gup_flags, page)) {
 		*pfn = page_to_pfn(page[0]);

 		if (writable)

From patchwork Fri Sep 8 22:28:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377866
Date: Fri, 8 Sep 2023 22:28:56 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-10-amoorthy@google.com>
Subject: [PATCH v5 09/17] KVM: Introduce KVM_CAP_USERFAULT_ON_MISSING without implementation
From: Anish Moorthy
X-Mailing-List: kvm@vger.kernel.org

Add documentation, memslot flags, useful helper
functions, and the definition of the capability. Implementation is provided
in a subsequent commit.

Memory fault exits on absent mappings are particularly useful for
userfaultfd-based postcopy live migration, where contention within uffd can
lead to slowness when many vCPUs fault on a single uffd/vma. Bypassing the
uffd entirely by returning information directly to the vCPU via an exit
avoids contention and can greatly improve the fault rate.

Suggested-by: James Houghton
Signed-off-by: Anish Moorthy
---
 Documentation/virt/kvm/api.rst | 28 +++++++++++++++++++++++++---
 include/linux/kvm_host.h       |  9 +++++++++
 include/uapi/linux/kvm.h       |  2 ++
 tools/include/uapi/linux/kvm.h |  1 +
 virt/kvm/Kconfig               |  3 +++
 virt/kvm/kvm_main.c            |  5 +++++
 6 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 92fd3faa6bab..c2eaacb6dc63 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1312,6 +1312,7 @@ yet and must be cleared on entry.
   /* for kvm_userspace_memory_region::flags */
   #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
   #define KVM_MEM_READONLY (1UL << 1)
+  #define KVM_MEM_USERFAULT_ON_MISSING (1UL << 2)

 This ioctl allows the user to create, modify or delete a guest physical
 memory slot.  Bits 0-15 of "slot" specify the slot id and this value
@@ -1342,12 +1343,15 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.

-The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
-KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
+The flags field supports three flags:
+
+1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
 writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
-use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
+use it.
+2. KVM_MEM_READONLY: can be set, if KVM_CAP_READONLY_MEM capability allows it,
 to make a new slot read-only.  In this case, writes to this memory will be
 posted to userspace as KVM_EXIT_MMIO exits.
+3. KVM_MEM_USERFAULT_ON_MISSING: see KVM_CAP_USERFAULT_ON_MISSING for details.

 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest.  For example, an
@@ -7781,6 +7785,24 @@ Note: Userspaces which attempt to resolve memory faults so that they can retry
 KVM_RUN are encouraged to guard against repeatedly receiving the same
 error/annotated fault.

+7.35 KVM_CAP_USERFAULT_ON_MISSING
+---------------------------------
+
+:Architectures: None
+:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
+
+The presence of this capability indicates that userspace may set the
+KVM_MEM_USERFAULT_ON_MISSING flag on memslots (via KVM_SET_USER_MEMORY_REGION).
+Said flag will cause KVM_RUN to fail (-EFAULT) in response to guest-context
+memory accesses which would require KVM to page fault on the userspace mapping.
+
+The range of guest physical memory causing the fault is advertised to userspace
+through KVM_CAP_MEMORY_FAULT_INFO. Userspace should determine how best to make
+the mapping present, take appropriate action, then return to KVM_RUN to retry
+the access.
+
+Attempts to enable this capability directly will fail.
+
 8. Other capabilities.
 ======================

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9206ac944d31..db5c3eae58fe 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2353,4 +2353,13 @@ static inline void kvm_handle_guest_uaccess_fault(struct kvm_vcpu *vcpu,
 	vcpu->run->memory_fault.flags = flags;
 }

+/*
+ * Whether non-atomic accesses to the userspace mapping of the memslot should
+ * be upgraded to atomic accesses when possible.
+ */
+static inline bool kvm_is_slot_userfault_on_missing(const struct kvm_memory_slot *slot)
+{
+	return slot && slot->flags & KVM_MEM_USERFAULT_ON_MISSING;
+}
+
 #endif

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b2e4ac83b5a8..a21921e4ee2a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -102,6 +102,7 @@ struct kvm_userspace_memory_region {
  */
 #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
 #define KVM_MEM_READONLY (1UL << 1)
+#define KVM_MEM_USERFAULT_ON_MISSING (1UL << 2)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
@@ -1220,6 +1221,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
 #define KVM_CAP_MEMORY_FAULT_INFO 230
+#define KVM_CAP_USERFAULT_ON_MISSING 231

 #ifdef KVM_CAP_IRQ_ROUTING

diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index d19aa7965392..188be8549070 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -102,6 +102,7 @@ struct kvm_userspace_memory_region {
  */
 #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
 #define KVM_MEM_READONLY (1UL << 1)
+#define KVM_MEM_USERFAULT_ON_MISSING (1UL << 2)

 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 484d0873061c..906878438687 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -92,3 +92,6 @@ config HAVE_KVM_PM_NOTIFIER

 config KVM_GENERIC_HARDWARE_ENABLING
 	bool
+
+config HAVE_KVM_USERFAULT_ON_MISSING
+	bool

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a7e6320dd7f0..aa81e41b1488 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1553,6 +1553,9 @@ static int check_memory_region_flags(const struct kvm_userspace_memory_region *m
 		valid_flags |= KVM_MEM_READONLY;
 #endif

+	if (IS_ENABLED(CONFIG_HAVE_KVM_USERFAULT_ON_MISSING))
+		valid_flags |= KVM_MEM_USERFAULT_ON_MISSING;
+
 	if (mem->flags & ~valid_flags)
 		return -EINVAL;

@@ -4588,6 +4591,8 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 	case KVM_CAP_BINARY_STATS_FD:
 	case KVM_CAP_SYSTEM_EVENT_DATA:
 		return 1;
+	case KVM_CAP_USERFAULT_ON_MISSING:
+		return IS_ENABLED(CONFIG_HAVE_KVM_USERFAULT_ON_MISSING);
 	default:
 		break;
 	}

From patchwork Fri Sep 8 22:28:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377867
Date: Fri, 8 Sep 2023 22:28:57 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-11-amoorthy@google.com>
Subject: [PATCH v5 10/17] KVM: Implement KVM_CAP_USERFAULT_ON_MISSING by atomizing __gfn_to_pfn_memslot() calls
From: Anish Moorthy
X-Mailing-List: kvm@vger.kernel.org

Change the "atomic" parameter of __gfn_to_pfn_memslot() to an enum which
reflects the possibility of non-atomic accesses (GUP calls) being
"upgraded" to atomic, and mark locations where such upgrading is allowed.

Concerning gfn_to_pfn_prot(): this function is unused on x86, and the only
usage on arm64 is from a codepath where upgrading GUP calls to atomic based
on the memslot is undesirable. Therefore, punt on adding any plumbing to
expose the 'atomicity' parameter.

Signed-off-by: Anish Moorthy
---
 arch/arm64/kvm/mmu.c                   |  4 +--
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  3 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  3 +-
 arch/x86/kvm/mmu/mmu.c                 |  8 +++---
 include/linux/kvm_host.h               | 14 +++++++++-
 virt/kvm/kvm_main.c                    | 38 ++++++++++++++++++++------
 6 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8ede6c5edc5f..ac77ae5b5d2b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1502,8 +1502,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);

-	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-				   write_fault, &writable, NULL);
+	pfn = __gfn_to_pfn_memslot(memslot, gfn, MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE,
+				   false, NULL, write_fault, &writable, NULL);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 7f765d5ad436..ab7caa86aa16 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -612,7 +612,8 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu,
 		write_ok = true;
 	} else {
 		/* Call KVM generic code to do the slow-path check */
-		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
+		pfn = __gfn_to_pfn_memslot(memslot, gfn, MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC,
+					   false, NULL,
 					   writing, &write_ok, NULL);
 		if (is_error_noslot_pfn(pfn))
 			return -EFAULT;

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 572707858d65..3fa05c8e96b0 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -846,7 +846,8 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 	unsigned long pfn;

 	/* Call KVM generic code to do the slow-path check */
-	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
+	pfn = __gfn_to_pfn_memslot(memslot, gfn, MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC,
+				   false, NULL,
 				   writing, upgrade_p, NULL);
 	if (is_error_noslot_pfn(pfn))
 		return -EFAULT;

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index deae8ac74d9a..43516eb50e06 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4297,8 +4297,8 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	}

 	async = false;
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
-					  fault->write, &fault->map_writable,
+	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE,
+					  false, &async, fault->write, &fault->map_writable,
 					  &fault->hva);
 	if (!async)
 		return RET_PF_CONTINUE; /* *pfn has correct page already */
@@ -4319,8 +4319,8 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 * to wait for IO.  Note, gup always bails if it is unable to quickly
 	 * get a page and a fatal signal, i.e. SIGKILL, is pending.
 	 */
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, true, NULL,
-					  fault->write, &fault->map_writable,
+	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE,
+					  true, NULL, fault->write, &fault->map_writable,
 					  &fault->hva);
 	return RET_PF_CONTINUE;
 }

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index db5c3eae58fe..fdd386e1d3c0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1188,8 +1188,20 @@ kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 			  bool *writable);
 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn);
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn);
+enum memslot_access_atomicity {
+	/* Force atomic access */
+	MEMSLOT_ACCESS_ATOMIC,
+	/*
+	 * Ask for non-atomic access, but allow upgrading to atomic depending
+	 * on the memslot
+	 */
+	MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE,
+	/* Force non-atomic access */
+	MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC
+};
 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
-			       bool atomic, bool interruptible, bool *async,
+			       enum memslot_access_atomicity atomicity,
+			       bool interruptible, bool *async,
 			       bool write_fault, bool *writable, hva_t *hva);

 void kvm_release_pfn_clean(kvm_pfn_t pfn);

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aa81e41b1488..d4f4ccb29e6d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2735,9 +2735,11 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
 }

 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
-			       bool atomic, bool interruptible, bool *async,
+			       enum memslot_access_atomicity atomicity,
+			       bool interruptible, bool *async,
 			       bool write_fault, bool *writable, hva_t *hva)
 {
+	bool atomic;
 	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);

 	if (hva)
@@ -2759,6 +2761,23 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 		writable = NULL;
 	}

+	if (atomicity == MEMSLOT_ACCESS_ATOMIC) {
+		atomic = true;
+	} else if (atomicity == MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE) {
+		atomic = false;
+		if (kvm_is_slot_userfault_on_missing(slot)) {
+			atomic = true;
+			if (async) {
+				*async = false;
+				async = NULL;
+			}
+		}
+	} else if (atomicity == MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC) {
+		atomic = false;
+	} else {
+		BUG();
+	}
+
 	return hva_to_pfn(addr, atomic, interruptible, async, write_fault,
 			  writable);
 }
@@ -2767,22 +2786,23 @@ EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
 kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 			  bool *writable)
 {
-	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, false,
-				    NULL, write_fault, writable, NULL);
+	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn,
+				    MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC,
+				    false, NULL, write_fault, writable, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);

 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, false, false, NULL, true,
-				    NULL, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE,
+				    false, NULL, true, NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);

 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, true, false, NULL, true,
-				    NULL, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, MEMSLOT_ACCESS_ATOMIC,
+				    false, NULL, true, NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);

@@ -2862,7 +2882,9 @@ int kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
 	if (!map)
 		return -EINVAL;

-	pfn = gfn_to_pfn(vcpu->kvm, gfn);
+	pfn = __gfn_to_pfn_memslot(gfn_to_memslot(vcpu->kvm, gfn), gfn,
+				   MEMSLOT_ACCESS_FORCE_ALLOW_NONATOMIC,
+				   false, NULL, true, NULL, NULL);
 	if (is_error_noslot_pfn(pfn))
 		return -EINVAL;

From patchwork Fri Sep 8 22:28:58 2023
Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13377868 Date: Fri, 8 Sep 2023 22:28:58 +0000 In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230908222905.1321305-12-amoorthy@google.com> Subject: [PATCH v5 11/17] KVM: x86: Enable KVM_CAP_USERFAULT_ON_MISSING From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The relevant __gfn_to_pfn_memslot() calls in __kvm_faultin_pfn() already use MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.
Signed-off-by: Anish Moorthy --- Documentation/virt/kvm/api.rst | 2 +- arch/x86/kvm/Kconfig | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index c2eaacb6dc63..a74d721a18f6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7788,7 +7788,7 @@ error/annotated fault. 7.35 KVM_CAP_USERFAULT_ON_MISSING --------------------------------- -:Architectures: None +:Architectures: x86 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. The presence of this capability indicates that userspace may set the diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index ed90f148140d..11d956f17a9d 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -49,6 +49,7 @@ config KVM select INTERVAL_TREE select HAVE_KVM_PM_NOTIFIER if PM select KVM_GENERIC_HARDWARE_ENABLING + select HAVE_KVM_USERFAULT_ON_MISSING help Support hosting fully virtualized guest machines using hardware virtualization extensions. 
You will need a fairly recent

From patchwork Fri Sep 8 22:28:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13377869 Date: Fri, 8 Sep 2023 22:28:59 +0000 In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230908222905.1321305-13-amoorthy@google.com> Subject: [PATCH v5 12/17] KVM: arm64: Enable KVM_CAP_USERFAULT_ON_MISSING From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The relevant __gfn_to_pfn_memslot() call in user_mem_abort() already uses MEMSLOT_ACCESS_NONATOMIC_MAY_UPGRADE.
Signed-off-by: Anish Moorthy --- Documentation/virt/kvm/api.rst | 2 +- arch/arm64/kvm/Kconfig | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index a74d721a18f6..b0b1124277e8 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7788,7 +7788,7 @@ error/annotated fault. 7.35 KVM_CAP_USERFAULT_ON_MISSING --------------------------------- -:Architectures: x86 +:Architectures: x86, arm64 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP. The presence of this capability indicates that userspace may set the diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 83c1e09be42e..d966a955d876 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -43,6 +43,7 @@ menuconfig KVM select GUEST_PERF_EVENTS if PERF_EVENTS select INTERVAL_TREE select XARRAY_MULTI + select HAVE_KVM_USERFAULT_ON_MISSING help Support hosting virtualized guest machines.

From patchwork Fri Sep 8 22:29:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13377870
Date: Fri, 8 Sep 2023 22:29:00 +0000 In-Reply-To:
<20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230908222905.1321305-14-amoorthy@google.com> Subject: [PATCH v5 13/17] KVM: selftests: Report per-vcpu demand paging rate from demand paging test From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Using the overall demand paging rate to measure performance can be slightly misleading when vCPU accesses are not overlapped. Adding more vCPUs will (usually) increase the overall demand paging rate even if performance remains constant or even degrades on a per-vcpu basis. As such, it makes sense to report both the total and per-vcpu paging rates. 
Signed-off-by: Anish Moorthy --- tools/testing/selftests/kvm/demand_paging_test.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 09c116a82a84..6dc823fa933a 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -135,6 +135,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct timespec ts_diff; struct kvm_vm *vm; int i; + double vcpu_paging_rate; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, p->src_type, p->partition_vcpu_memory_access); @@ -191,11 +192,17 @@ static void run_test(enum vm_guest_mode mode, void *arg) uffd_stop_demand_paging(uffd_descs[i]); } - pr_info("Total guest execution time: %ld.%.9lds\n", + pr_info("Total guest execution time:\t%ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); - pr_info("Overall demand paging rate: %f pgs/sec\n", - memstress_args.vcpu_args[0].pages * nr_vcpus / - ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / NSEC_PER_SEC)); + + vcpu_paging_rate = + memstress_args.vcpu_args[0].pages + / ((double)ts_diff.tv_sec + + (double)ts_diff.tv_nsec / NSEC_PER_SEC); + pr_info("Per-vcpu demand paging rate:\t%f pgs/sec/vcpu\n", + vcpu_paging_rate); + pr_info("Overall demand paging rate:\t%f pgs/sec\n", + vcpu_paging_rate * nr_vcpus); memstress_destroy_vm(vm);

From patchwork Fri Sep 8 22:29:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13377871
Date: Fri, 8 Sep 2023 22:29:01 +0000 In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com> Mime-Version: 1.0 References: <20230908222905.1321305-1-amoorthy@google.com> X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230908222905.1321305-15-amoorthy@google.com> Subject: [PATCH v5 14/17] KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand paging test From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org At the moment, demand_paging_test does not support profiling/testing multiple vCPU threads concurrently faulting on a single uffd because (a) "-u" (run test in userfaultfd mode) creates a uffd for each vCPU's region, so that each uffd services a single vCPU thread. (b) "-u -o" (userfaultfd mode + overlapped vCPU memory accesses) simply doesn't work: the test tries to register the same memory to multiple uffds, causing an error. Add support for many vcpus per uffd by (1) Keeping "-u" behavior unchanged. (2) Making "-u -a" create a single uffd for all of guest memory. (3) Making "-u -o" implicitly pass "-a", solving the problem in (b). In cases (2) and (3) all vCPU threads fault on a single uffd.
With potentially multiple vCPUs per UFFD, it makes sense to allow configuring the number of reader threads per UFFD as well: add the "-r" flag to do so. Signed-off-by: Anish Moorthy Acked-by: James Houghton --- .../selftests/kvm/aarch64/page_fault_test.c | 4 +- .../selftests/kvm/demand_paging_test.c | 76 +++++++++++++--- .../selftests/kvm/include/userfaultfd_util.h | 17 +++- .../selftests/kvm/lib/userfaultfd_util.c | 87 +++++++++++++------ 4 files changed, 137 insertions(+), 47 deletions(-) diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c index 47bb914ab2fa..e5ca4b4be903 100644 --- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c +++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c @@ -375,14 +375,14 @@ static void setup_uffd(struct kvm_vm *vm, struct test_params *p, *pt_uffd = uffd_setup_demand_paging(uffd_mode, 0, pt_args.hva, pt_args.paging_size, - test->uffd_pt_handler); + 1, test->uffd_pt_handler); *data_uffd = NULL; if (test->uffd_data_handler) *data_uffd = uffd_setup_demand_paging(uffd_mode, 0, data_args.hva, data_args.paging_size, - test->uffd_data_handler); + 1, test->uffd_data_handler); } static void free_uffd(struct test_desc *test, struct uffd_desc *pt_uffd, diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 6dc823fa933a..f7897a951f90 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -77,8 +77,20 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, copy.mode = 0; r = ioctl(uffd, UFFDIO_COPY, &copy); - if (r == -1) { - pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d with errno: %d\n", + /* + * When multiple vCPU threads fault on a single page and there are + * multiple readers for the UFFD, at least one of the UFFDIO_COPYs + * will fail with EEXIST: handle that case without signaling an + * error. + * + * Note that this also suppresses any EEXISTs occurring from, + * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never + * happens here, but a realistic VMM might potentially maintain + * some external state to correctly surface EEXISTs to userspace + * (or prevent duplicate COPY/CONTINUEs in the first place). + */ + if (r == -1 && errno != EEXIST) { + pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n", addr, tid, errno); return r; } @@ -89,8 +101,20 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, cont.range.len = demand_paging_size; r = ioctl(uffd, UFFDIO_CONTINUE, &cont); - if (r == -1) { - pr_info("Failed UFFDIO_CONTINUE in 0x%lx from thread %d with errno: %d\n", + /* + * When multiple vCPU threads fault on a single page and there are + * multiple readers for the UFFD, at least one of the UFFDIO_CONTINUEs + * will fail with EEXIST: handle that case without signaling an + * error. + * + * Note that this also suppresses any EEXISTs occurring from, + * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never + * happens here, but a realistic VMM might potentially maintain + * some external state to correctly surface EEXISTs to userspace + * (or prevent duplicate COPY/CONTINUEs in the first place).
+ */ + if (r == -1 && errno != EEXIST) { + pr_info("Failed UFFDIO_CONTINUE in 0x%lx, thread %d, errno = %d\n", addr, tid, errno); return r; } @@ -110,7 +134,9 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, struct test_params { int uffd_mode; + bool single_uffd; useconds_t uffd_delay; + int readers_per_uffd; enum vm_mem_backing_src_type src_type; bool partition_vcpu_memory_access; }; @@ -134,8 +160,9 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct timespec start; struct timespec ts_diff; struct kvm_vm *vm; - int i; + int i, num_uffds = 0; double vcpu_paging_rate; + uint64_t uffd_region_size; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, p->src_type, p->partition_vcpu_memory_access); @@ -148,7 +175,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) memset(guest_data_prototype, 0xAB, demand_paging_size); if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { - for (i = 0; i < nr_vcpus; i++) { + num_uffds = p->single_uffd ? 1 : nr_vcpus; + for (i = 0; i < num_uffds; i++) { vcpu_args = &memstress_args.vcpu_args[i]; prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa), vcpu_args->pages * memstress_args.guest_page_size); @@ -156,9 +184,13 @@ static void run_test(enum vm_guest_mode mode, void *arg) } if (p->uffd_mode) { - uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *)); + num_uffds = p->single_uffd ? 
1 : nr_vcpus; + uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds; + + uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *)); TEST_ASSERT(uffd_descs, "Memory allocation failed"); - for (i = 0; i < nr_vcpus; i++) { + for (i = 0; i < num_uffds; i++) { + struct memstress_vcpu_args *vcpu_args; void *vcpu_hva; vcpu_args = &memstress_args.vcpu_args[i]; @@ -171,7 +203,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) */ uffd_descs[i] = uffd_setup_demand_paging( p->uffd_mode, p->uffd_delay, vcpu_hva, - vcpu_args->pages * memstress_args.guest_page_size, + uffd_region_size, + p->readers_per_uffd, &handle_uffd_page_request); } } @@ -188,7 +221,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) if (p->uffd_mode) { /* Tell the user fault fd handler threads to quit */ - for (i = 0; i < nr_vcpus; i++) + for (i = 0; i < num_uffds; i++) uffd_stop_demand_paging(uffd_descs[i]); } @@ -214,15 +247,20 @@ static void run_test(enum vm_guest_mode mode, void *arg) static void help(char *name) { puts(""); - printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-d uffd_delay_usec]\n" - " [-b memory] [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name); + printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n" + " [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n" + " [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name); guest_modes_help(); printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n" " UFFD registration mode: 'MISSING' or 'MINOR'.\n"); kvm_print_vcpu_pinning_help(); + printf(" -a: Use a single userfaultfd for all of guest memory, instead of\n" + " creating one for each region paged by a unique vCPU\n" + " Set implicitly with -o, and no effect without -u.\n"); printf(" -d: add a delay in usec to the User Fault\n" " FD handler to simulate demand paging\n" " overheads. 
Ignored without -u.\n"); + printf(" -r: Set the number of reader threads per uffd.\n"); printf(" -b: specify the size of the memory region which should be\n" " demand paged by each vCPU. e.g. 10M or 3G.\n" " Default: 1G\n"); @@ -241,12 +279,14 @@ int main(int argc, char *argv[]) struct test_params p = { .src_type = DEFAULT_VM_MEM_SRC, .partition_vcpu_memory_access = true, + .readers_per_uffd = 1, + .single_uffd = false, }; int opt; guest_modes_append_default(); - while ((opt = getopt(argc, argv, "hm:u:d:b:s:v:c:o")) != -1) { + while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:c:r:")) != -1) { switch (opt) { case 'm': guest_modes_cmdline(optarg); @@ -258,6 +298,9 @@ int main(int argc, char *argv[]) p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR; TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'."); break; + case 'a': + p.single_uffd = true; + break; case 'd': p.uffd_delay = strtoul(optarg, NULL, 0); TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported."); @@ -278,6 +321,13 @@ int main(int argc, char *argv[]) break; case 'o': p.partition_vcpu_memory_access = false; + p.single_uffd = true; + break; + case 'r': + p.readers_per_uffd = atoi(optarg); + TEST_ASSERT(p.readers_per_uffd >= 1, + "Invalid number of readers per uffd %d: must be >=1", + p.readers_per_uffd); break; case 'h': default: diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h index 877449c34592..af83a437e74a 100644 --- a/tools/testing/selftests/kvm/include/userfaultfd_util.h +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h @@ -17,18 +17,27 @@ typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg); -struct uffd_desc { +struct uffd_reader_args { int uffd_mode; int uffd; - int pipefds[2]; useconds_t delay; uffd_handler_t handler; - pthread_t thread; + /* Holds the read end of the pipe for killing the reader. 
*/ + int pipe; +}; + +struct uffd_desc { + int uffd; + uint64_t num_readers; + /* Holds the write ends of the pipes for killing the readers. */ + int *pipefds; + pthread_t *readers; + struct uffd_reader_args *reader_args; }; struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void *hva, uint64_t len, - uffd_handler_t handler); + uint64_t num_readers, uffd_handler_t handler); void uffd_stop_demand_paging(struct uffd_desc *uffd); diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 271f63891581..6f220aa4fb08 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -27,10 +27,8 @@ static void *uffd_handler_thread_fn(void *arg) { - struct uffd_desc *uffd_desc = (struct uffd_desc *)arg; - int uffd = uffd_desc->uffd; - int pipefd = uffd_desc->pipefds[0]; - useconds_t delay = uffd_desc->delay; + struct uffd_reader_args *reader_args = (struct uffd_reader_args *)arg; + int uffd = reader_args->uffd; int64_t pages = 0; struct timespec start; struct timespec ts_diff; @@ -44,7 +42,7 @@ static void *uffd_handler_thread_fn(void *arg) pollfd[0].fd = uffd; pollfd[0].events = POLLIN; - pollfd[1].fd = pipefd; + pollfd[1].fd = reader_args->pipe; pollfd[1].events = POLLIN; r = poll(pollfd, 2, -1); @@ -92,9 +90,9 @@ static void *uffd_handler_thread_fn(void *arg) if (!(msg.event & UFFD_EVENT_PAGEFAULT)) continue; - if (delay) - usleep(delay); - r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg); + if (reader_args->delay) + usleep(reader_args->delay); + r = reader_args->handler(reader_args->uffd_mode, uffd, &msg); if (r < 0) return NULL; pages++; @@ -110,7 +108,7 @@ static void *uffd_handler_thread_fn(void *arg) struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void *hva, uint64_t len, - uffd_handler_t handler) + uint64_t num_readers, uffd_handler_t handler) { struct uffd_desc *uffd_desc; bool 
is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR); @@ -118,14 +116,26 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, struct uffdio_api uffdio_api; struct uffdio_register uffdio_register; uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY; - int ret; + int ret, i; PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n", is_minor ? "MINOR" : "MISSING", is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY"); uffd_desc = malloc(sizeof(struct uffd_desc)); - TEST_ASSERT(uffd_desc, "malloc failed"); + TEST_ASSERT(uffd_desc, "Failed to malloc uffd descriptor"); + + uffd_desc->pipefds = malloc(sizeof(int) * num_readers); + TEST_ASSERT(uffd_desc->pipefds, "Failed to malloc pipes"); + + uffd_desc->readers = malloc(sizeof(pthread_t) * num_readers); + TEST_ASSERT(uffd_desc->readers, "Failed to malloc reader threads"); + + uffd_desc->reader_args = malloc( + sizeof(struct uffd_reader_args) * num_readers); + TEST_ASSERT(uffd_desc->reader_args, "Failed to malloc reader_args"); + + uffd_desc->num_readers = num_readers; /* In order to get minor faults, prefault via the alias. 
*/ if (is_minor) @@ -148,18 +158,28 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) == expected_ioctls, "missing userfaultfd ioctls"); - ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK); - TEST_ASSERT(!ret, "Failed to set up pipefd"); - - uffd_desc->uffd_mode = uffd_mode; uffd_desc->uffd = uffd; - uffd_desc->delay = delay; - uffd_desc->handler = handler; - pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn, - uffd_desc); + for (i = 0; i < uffd_desc->num_readers; ++i) { + int pipes[2]; + + ret = pipe2((int *) &pipes, O_CLOEXEC | O_NONBLOCK); + TEST_ASSERT(!ret, "Failed to set up pipefd %i for uffd_desc %p", + i, uffd_desc); + + uffd_desc->pipefds[i] = pipes[1]; - PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n", - hva, hva + len); + uffd_desc->reader_args[i].uffd_mode = uffd_mode; + uffd_desc->reader_args[i].uffd = uffd; + uffd_desc->reader_args[i].delay = delay; + uffd_desc->reader_args[i].handler = handler; + uffd_desc->reader_args[i].pipe = pipes[0]; + + pthread_create(&uffd_desc->readers[i], NULL, uffd_handler_thread_fn, + &uffd_desc->reader_args[i]); + + PER_VCPU_DEBUG("Created uffd thread %i for HVA range [%p, %p)\n", + i, hva, hva + len); + } return uffd_desc; } @@ -167,19 +187,30 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void uffd_stop_demand_paging(struct uffd_desc *uffd) { char c = 0; - int ret; + int i, ret; - ret = write(uffd->pipefds[1], &c, 1); - TEST_ASSERT(ret == 1, "Unable to write to pipefd"); + for (i = 0; i < uffd->num_readers; ++i) { + ret = write(uffd->pipefds[i], &c, 1); + TEST_ASSERT( + ret == 1, "Unable to write to pipefd %i for uffd_desc %p", i, uffd); + } - ret = pthread_join(uffd->thread, NULL); - TEST_ASSERT(ret == 0, "Pthread_join failed."); + for (i = 0; i < uffd->num_readers; ++i) { + ret = pthread_join(uffd->readers[i], NULL); + TEST_ASSERT( + ret == 0, "Pthread_join failed on 
reader %i for uffd_desc %p", i, uffd); + } close(uffd->uffd); - close(uffd->pipefds[1]); - close(uffd->pipefds[0]); + for (i = 0; i < uffd->num_readers; ++i) { + close(uffd->pipefds[i]); + close(uffd->reader_args[i].pipe); + } + free(uffd->pipefds); + free(uffd->readers); + free(uffd->reader_args); free(uffd); }

From patchwork Fri Sep 8 22:29:02 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377872
Date: Fri, 8 Sep 2023 22:29:02 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-16-amoorthy@google.com>
Subject: [PATCH v5 15/17] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com,
nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

With multiple reader threads POLLing a single UFFD, the test suffers from the thundering herd problem: performance degrades as the number of reader threads is increased. Solve this issue [1] by switching the polling mechanism to EPOLL + EPOLLEXCLUSIVE.

Also, change the error-handling convention of uffd_handler_thread_fn. Instead of just printing errors and returning early from the polling loop, check for them via TEST_ASSERT. "return NULL" is reserved for a successful exit from uffd_handler_thread_fn, i.e. one triggered by a write to the exit pipe.

Performance samples generated by the command in [2] are given below.

Num Reader Threads    Paging Rate (POLL)    Paging Rate (EPOLL)
 1                    249k                  185k
 2                    201k                  235k
 4                    186k                  155k
16                    150k                  217k
32                     89k                  198k

[1] Single-vCPU performance does suffer somewhat.
[2] ./demand_paging_test -u MINOR -s shmem -v 4 -o -r

Signed-off-by: Anish Moorthy
Acked-by: James Houghton
--- .../selftests/kvm/demand_paging_test.c | 1 - .../selftests/kvm/lib/userfaultfd_util.c | 74 +++++++++---------- 2 files changed, 35 insertions(+), 40 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index f7897a951f90..0455347f932a 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include #include diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 6f220aa4fb08..2a179133645a 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "kvm_util.h" @@ -32,60 +33,55 @@ static void *uffd_handler_thread_fn(void *arg) int64_t pages = 0;
struct timespec start; struct timespec ts_diff; + int epollfd; + struct epoll_event evt; + + epollfd = epoll_create(1); + TEST_ASSERT(epollfd >= 0, "Failed to create epollfd."); + + evt.events = EPOLLIN | EPOLLEXCLUSIVE; + evt.data.u32 = 0; + TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, uffd, &evt) == 0, + "Failed to add uffd to epollfd"); + + evt.events = EPOLLIN; + evt.data.u32 = 1; + TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, reader_args->pipe, &evt) == 0, + "Failed to add pipe to epollfd"); clock_gettime(CLOCK_MONOTONIC, &start); while (1) { struct uffd_msg msg; - struct pollfd pollfd[2]; - char tmp_chr; int r; - pollfd[0].fd = uffd; - pollfd[0].events = POLLIN; - pollfd[1].fd = reader_args->pipe; - pollfd[1].events = POLLIN; - - r = poll(pollfd, 2, -1); - switch (r) { - case -1: - pr_info("poll err"); - continue; - case 0: - continue; - case 1: - break; - default: - pr_info("Polling uffd returned %d", r); - return NULL; - } + r = epoll_wait(epollfd, &evt, 1, -1); + TEST_ASSERT(r == 1, + "Unexpected number of events (%d) from epoll, errno = %d", + r, errno); - if (pollfd[0].revents & POLLERR) { - pr_info("uffd revents has POLLERR"); - return NULL; - } + if (evt.data.u32 == 1) { + char tmp_chr; - if (pollfd[1].revents & POLLIN) { - r = read(pollfd[1].fd, &tmp_chr, 1); + TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)), + "Reader thread received EPOLLERR or EPOLLHUP on pipe."); + r = read(reader_args->pipe, &tmp_chr, 1); TEST_ASSERT(r == 1, - "Error reading pipefd in UFFD thread\n"); + "Error reading pipefd in uffd reader thread"); break; } - if (!(pollfd[0].revents & POLLIN)) - continue; + TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)), + "Reader thread received EPOLLERR or EPOLLHUP on uffd."); r = read(uffd, &msg, sizeof(msg)); if (r == -1) { - if (errno == EAGAIN) - continue; - pr_info("Read of uffd got errno %d\n", errno); - return NULL; + TEST_ASSERT(errno == EAGAIN, + "Error reading from UFFD: errno = %d", errno); + continue; } - if (r != 
sizeof(msg)) { - pr_info("Read on uffd returned unexpected size: %d bytes", r); - return NULL; - } + TEST_ASSERT(r == sizeof(msg), + "Read on uffd returned unexpected number of bytes (%d)", r); if (!(msg.event & UFFD_EVENT_PAGEFAULT)) continue; @@ -93,8 +89,8 @@ static void *uffd_handler_thread_fn(void *arg) if (reader_args->delay) usleep(reader_args->delay); r = reader_args->handler(reader_args->uffd_mode, uffd, &msg); - if (r < 0) - return NULL; + TEST_ASSERT(r >= 0, + "Reader thread handler fn returned negative value %d", r); pages++; }

From patchwork Fri Sep 8 22:29:03 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377873
Date: Fri, 8 Sep 2023 22:29:03 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-17-amoorthy@google.com>
Subject: [PATCH v5 16/17] KVM: selftests: Add memslot_flags parameter to memstress_create_vm()
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev,
kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Memslot flags aren't currently exposed to the tests, and are just always set to 0. Add a parameter to allow tests to manually set those flags. Signed-off-by: Anish Moorthy --- tools/testing/selftests/kvm/access_tracking_perf_test.c | 2 +- tools/testing/selftests/kvm/demand_paging_test.c | 2 +- tools/testing/selftests/kvm/dirty_log_perf_test.c | 2 +- tools/testing/selftests/kvm/include/memstress.h | 2 +- tools/testing/selftests/kvm/lib/memstress.c | 4 ++-- .../testing/selftests/kvm/memslot_modification_stress_test.c | 2 +- .../selftests/kvm/x86_64/dirty_log_page_splitting_test.c | 2 +- 7 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index 3c7defd34f56..b51656b408b8 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -306,7 +306,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct kvm_vm *vm; int nr_vcpus = params->nr_vcpus; - vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, + vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, 0, params->backing_src, !overlap_memory_access); memstress_start_vcpu_threads(nr_vcpus, vcpu_thread_main); diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 0455347f932a..61bb2e23bef0 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -163,7 +163,7 @@ static void run_test(enum vm_guest_mode 
mode, void *arg) double vcpu_paging_rate; uint64_t uffd_region_size; - vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, + vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0, p->src_type, p->partition_vcpu_memory_access); demand_paging_size = get_backing_src_pagesz(p->src_type); diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c index d374dbcf9a53..8b1a84a4db3b 100644 --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -153,7 +153,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) int i; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, - p->slots, p->backing_src, + p->slots, 0, p->backing_src, p->partition_vcpu_memory_access); pr_info("Random seed: %u\n", p->random_seed); diff --git a/tools/testing/selftests/kvm/include/memstress.h b/tools/testing/selftests/kvm/include/memstress.h index ce4e603050ea..8be9609d3ca0 100644 --- a/tools/testing/selftests/kvm/include/memstress.h +++ b/tools/testing/selftests/kvm/include/memstress.h @@ -56,7 +56,7 @@ struct memstress_args { extern struct memstress_args memstress_args; struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, - uint64_t vcpu_memory_bytes, int slots, + uint64_t vcpu_memory_bytes, int slots, uint32_t slot_flags, enum vm_mem_backing_src_type backing_src, bool partition_vcpu_memory_access); void memstress_destroy_vm(struct kvm_vm *vm); diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c index df457452d146..dc145952d19c 100644 --- a/tools/testing/selftests/kvm/lib/memstress.c +++ b/tools/testing/selftests/kvm/lib/memstress.c @@ -123,7 +123,7 @@ void memstress_setup_vcpus(struct kvm_vm *vm, int nr_vcpus, } struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, - uint64_t vcpu_memory_bytes, int slots, + uint64_t vcpu_memory_bytes, int slots, 
uint32_t slot_flags, enum vm_mem_backing_src_type backing_src, bool partition_vcpu_memory_access) { @@ -211,7 +211,7 @@ struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, vm_userspace_mem_region_add(vm, backing_src, region_start, MEMSTRESS_MEM_SLOT_INDEX + i, - region_pages, 0); + region_pages, slot_flags); } /* Do mapping for the demand paging memory slot */ diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c index 9855c41ca811..0b19ec3ecc9c 100644 --- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c +++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c @@ -95,7 +95,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) { struct test_params *p = arg; struct kvm_vm *vm; - vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, + vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0, VM_MEM_SRC_ANONYMOUS, p->partition_vcpu_memory_access); diff --git a/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c index 634c6bfcd572..a770d7fa469a 100644 --- a/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c +++ b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c @@ -100,7 +100,7 @@ static void run_test(enum vm_guest_mode mode, void *unused) struct kvm_page_stats stats_dirty_logging_disabled; struct kvm_page_stats stats_repopulated; - vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size, + vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size, 0, SLOTS, backing_src, false); guest_num_pages = (VCPUS * guest_percpu_mem_size) >> vm->page_shift;

From patchwork Fri Sep 8 22:29:04 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13377874
Date: Fri, 8 Sep 2023 22:29:04 +0000
In-Reply-To: <20230908222905.1321305-1-amoorthy@google.com>
References: <20230908222905.1321305-1-amoorthy@google.com>
Message-ID: <20230908222905.1321305-18-amoorthy@google.com>
Subject: [PATCH v5 17/17] KVM: selftests: Handle memory fault exits in demand_paging_test
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: pbonzini@redhat.com, maz@kernel.org, robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, ricarkol@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

Demonstrate a (very basic) scheme for supporting memory fault exits. From the vCPU threads:

1. Simply issue UFFDIO_COPY/CONTINUEs in response to memory fault exits, with the purpose of establishing the absent mappings. Do so with wake_waiters=false to avoid serializing on the userfaultfd wait queue locks.

2.
When the UFFDIO_COPY/CONTINUE in (1) fails with EEXIST, assume that the mapping was already established but is currently absent [A], and attempt to populate it using MADV_POPULATE_WRITE.

Issue UFFDIO_COPY/CONTINUEs from the reader threads as well, but with wake_waiters=true to ensure that any threads sleeping on the uffd are eventually woken up.

A real VMM would track whether it had already COPY/CONTINUEd pages (e.g., via a bitmap) to avoid calls destined to fail with EEXIST. However, even the naive approach is enough to demonstrate the performance advantages of KVM_EXIT_MEMORY_FAULT.

[A] In reality it is much likelier that the vCPU thread simply lost a race to establish the mapping for the page.

Signed-off-by: Anish Moorthy
Acked-by: James Houghton
--- .../selftests/kvm/demand_paging_test.c | 245 +++++++++++++----- 1 file changed, 173 insertions(+), 72 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 61bb2e23bef0..ded5cdf6dde9 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include "kvm_util.h" @@ -31,36 +32,102 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE; static size_t demand_paging_size; static char *guest_data_prototype; +static int num_uffds; +static size_t uffd_region_size; +static struct uffd_desc **uffd_descs; +/* + * Delay when demand paging is performed through userfaultfd or directly by + * vcpu_worker in the case of an annotated memory fault.
+ */ +static useconds_t uffd_delay; +static int uffd_mode; + + +static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t hva, + bool is_vcpu); + +static void madv_write_or_err(uint64_t gpa) +{ + int r; + void *hva = addr_gpa2hva(memstress_args.vm, gpa); + + r = madvise(hva, demand_paging_size, MADV_POPULATE_WRITE); + TEST_ASSERT(r == 0, + "MADV_POPULATE_WRITE on hva 0x%lx (gpa 0x%lx) fail, errno %i\n", + (uintptr_t) hva, gpa, errno); +} + +static void ready_page(uint64_t gpa) +{ + int r, uffd; + + /* + * This test only registers memslot 1 w/ userfaultfd. Any accesses outside + * the registered ranges should fault in the physical pages through + * MADV_POPULATE_WRITE. + */ + if ((gpa < memstress_args.gpa) + || (gpa >= memstress_args.gpa + memstress_args.size)) { + madv_write_or_err(gpa); + } else { + if (uffd_delay) + usleep(uffd_delay); + + uffd = uffd_descs[(gpa - memstress_args.gpa) / uffd_region_size]->uffd; + + r = handle_uffd_page_request(uffd_mode, uffd, + (uint64_t) addr_gpa2hva(memstress_args.vm, gpa), true); + + if (r == EEXIST) + madv_write_or_err(gpa); + } +} + static void vcpu_worker(struct memstress_vcpu_args *vcpu_args) { struct kvm_vcpu *vcpu = vcpu_args->vcpu; int vcpu_idx = vcpu_args->vcpu_idx; struct kvm_run *run = vcpu->run; - struct timespec start; - struct timespec ts_diff; + struct timespec last_start; + struct timespec total_runtime = {}; int ret; - - clock_gettime(CLOCK_MONOTONIC, &start); - - /* Let the guest access its memory */ - ret = _vcpu_run(vcpu); - TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret); - if (get_ucall(vcpu, NULL) != UCALL_SYNC) { - TEST_ASSERT(false, - "Invalid guest sync status: exit_reason=%s\n", - exit_reason_str(run->exit_reason)); + u64 num_memory_fault_exits = 0; + bool annotated_memory_fault = false; + + while (true) { + clock_gettime(CLOCK_MONOTONIC, &last_start); + /* Let the guest access its memory */ + ret = _vcpu_run(vcpu); + annotated_memory_fault = errno == EFAULT + && run->flags | 
KVM_RUN_MEMORY_FAULT_FILLED; + TEST_ASSERT(ret == 0 || annotated_memory_fault, + "vcpu_run failed: %d\n", ret); + + total_runtime = timespec_add(total_runtime, + timespec_elapsed(last_start)); + if (ret != 0 && get_ucall(vcpu, NULL) != UCALL_SYNC) { + + if (annotated_memory_fault) { + ++num_memory_fault_exits; + ready_page(run->memory_fault.gpa); + continue; + } + + TEST_ASSERT(false, + "Invalid guest sync status: exit_reason=%s\n", + exit_reason_str(run->exit_reason)); + } + break; } - - ts_diff = timespec_elapsed(start); - PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_idx, - ts_diff.tv_sec, ts_diff.tv_nsec); + PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds, %d memory fault exits\n", + vcpu_idx, total_runtime.tv_sec, total_runtime.tv_nsec, + num_memory_fault_exits); } -static int handle_uffd_page_request(int uffd_mode, int uffd, - struct uffd_msg *msg) +static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t hva, + bool is_vcpu) { pid_t tid = syscall(__NR_gettid); - uint64_t addr = msg->arg.pagefault.address; struct timespec start; struct timespec ts_diff; int r; @@ -71,16 +138,15 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, struct uffdio_copy copy; copy.src = (uint64_t)guest_data_prototype; - copy.dst = addr; + copy.dst = hva; copy.len = demand_paging_size; - copy.mode = 0; + copy.mode = is_vcpu ? UFFDIO_COPY_MODE_DONTWAKE : 0; - r = ioctl(uffd, UFFDIO_COPY, ©); /* - * With multiple vCPU threads fault on a single page and there are - * multiple readers for the UFFD, at least one of the UFFDIO_COPYs - * will fail with EEXIST: handle that case without signaling an - * error. + * With multiple vCPU threads and at least one of multiple reader threads + * or vCPU memory faults, multiple vCPUs accessing an absent page will + * almost certainly cause some thread doing the UFFDIO_COPY here to get + * EEXIST: make sure to allow that case. 
 		 *
 		 * Note that this also suppress any EEXISTs occurring from,
 		 * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never
@@ -88,23 +154,24 @@ static int handle_uffd_page_request(int uffd_mode, int uffd,
 		 * some external state to correctly surface EEXISTs to userspace
 		 * (or prevent duplicate COPY/CONTINUEs in the first place).
 		 */
-		if (r == -1 && errno != EEXIST) {
-			pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n",
-				addr, tid, errno);
-			return r;
-		}
+		r = ioctl(uffd, UFFDIO_COPY, &copy);
+		TEST_ASSERT(r == 0 || errno == EEXIST,
+			    "Thread 0x%x failed UFFDIO_COPY on hva 0x%lx, errno = %d",
+			    tid, hva, errno);
 	} else if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
+		/* The comments in the UFFDIO_COPY branch also apply here. */
 		struct uffdio_continue cont = {0};
 
-		cont.range.start = addr;
+		cont.range.start = hva;
 		cont.range.len = demand_paging_size;
+		cont.mode = is_vcpu ? UFFDIO_CONTINUE_MODE_DONTWAKE : 0;
 
 		r = ioctl(uffd, UFFDIO_CONTINUE, &cont);
 		/*
-		 * With multiple vCPU threads fault on a single page and there are
-		 * multiple readers for the UFFD, at least one of the UFFDIO_COPYs
-		 * will fail with EEXIST: handle that case without signaling an
-		 * error.
+		 * With multiple vCPU threads and at least one of multiple reader threads
+		 * or vCPU memory faults, multiple vCPUs accessing an absent page will
+		 * almost certainly cause some thread doing the UFFDIO_COPY here to get
+		 * EEXIST: make sure to allow that case.
 		 *
 		 * Note that this also suppress any EEXISTs occurring from,
 		 * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never
@@ -112,32 +179,54 @@ static int handle_uffd_page_request(int uffd_mode, int uffd,
 		 * some external state to correctly surface EEXISTs to userspace
 		 * (or prevent duplicate COPY/CONTINUEs in the first place).
 		 */
-		if (r == -1 && errno != EEXIST) {
-			pr_info("Failed UFFDIO_CONTINUE in 0x%lx, thread %d, errno = %d\n",
-				addr, tid, errno);
-			return r;
-		}
+		TEST_ASSERT(r == 0 || errno == EEXIST,
+			    "Thread 0x%x failed UFFDIO_CONTINUE on hva 0x%lx, errno = %d",
+			    tid, hva, errno);
 	} else {
 		TEST_FAIL("Invalid uffd mode %d", uffd_mode);
 	}
 
+	/*
+	 * If the above UFFDIO_COPY/CONTINUE failed with EEXIST, waiting threads
+	 * will not have been woken: wake them here.
+	 */
+	if (!is_vcpu && r != 0) {
+		struct uffdio_range range = {
+			.start = hva,
+			.len = demand_paging_size
+		};
+		r = ioctl(uffd, UFFDIO_WAKE, &range);
+		TEST_ASSERT(r == 0,
+			    "Thread 0x%x failed UFFDIO_WAKE on hva 0x%lx, errno = %d",
+			    tid, hva, errno);
+	}
+
 	ts_diff = timespec_elapsed(start);
 	PER_PAGE_DEBUG("UFFD page-in %d \t%ld ns\n", tid,
 		       timespec_to_ns(ts_diff));
 	PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n",
-		       demand_paging_size, addr, tid);
+		       demand_paging_size, hva, tid);
 
 	return 0;
 }
 
+static int handle_uffd_page_request_from_uffd(int uffd_mode, int uffd,
+					      struct uffd_msg *msg)
+{
+	TEST_ASSERT(msg->event == UFFD_EVENT_PAGEFAULT,
+		    "Received uffd message with event %d != UFFD_EVENT_PAGEFAULT",
+		    msg->event);
+	return handle_uffd_page_request(uffd_mode, uffd,
+					msg->arg.pagefault.address, false);
+}
+
 struct test_params {
-	int uffd_mode;
 	bool single_uffd;
-	useconds_t uffd_delay;
 	int readers_per_uffd;
 	enum vm_mem_backing_src_type src_type;
 	bool partition_vcpu_memory_access;
+	bool memfault_exits;
 };
 
 static void prefault_mem(void *alias, uint64_t len)
@@ -155,16 +244,22 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 {
 	struct memstress_vcpu_args *vcpu_args;
 	struct test_params *p = arg;
-	struct uffd_desc **uffd_descs = NULL;
 	struct timespec start;
 	struct timespec ts_diff;
 	struct kvm_vm *vm;
-	int i, num_uffds = 0;
+	int i;
 	double vcpu_paging_rate;
-	uint64_t uffd_region_size;
+	uint32_t slot_flags = 0;
+	bool uffd_memfault_exits = uffd_mode && p->memfault_exits;
 
-	vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0,
-				 p->src_type, p->partition_vcpu_memory_access);
+	if (uffd_memfault_exits) {
+		TEST_ASSERT(kvm_has_cap(KVM_CAP_USERFAULT_ON_MISSING) > 0,
+			    "KVM does not have KVM_CAP_USERFAULT_ON_MISSING");
+		slot_flags = KVM_MEM_USERFAULT_ON_MISSING;
+	}
+
+	vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size,
+				 1, slot_flags, p->src_type, p->partition_vcpu_memory_access);
 
 	demand_paging_size = get_backing_src_pagesz(p->src_type);
 
@@ -173,21 +268,21 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		       "Failed to allocate buffer for guest data pattern");
 	memset(guest_data_prototype, 0xAB, demand_paging_size);
 
-	if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
-		num_uffds = p->single_uffd ? 1 : nr_vcpus;
-		for (i = 0; i < num_uffds; i++) {
-			vcpu_args = &memstress_args.vcpu_args[i];
-			prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa),
-				     vcpu_args->pages * memstress_args.guest_page_size);
-		}
-	}
-
-	if (p->uffd_mode) {
+	if (uffd_mode) {
 		num_uffds = p->single_uffd ? 1 : nr_vcpus;
 		uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds;
+		if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
+			for (i = 0; i < num_uffds; i++) {
+				vcpu_args = &memstress_args.vcpu_args[i];
+				prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa),
+					     uffd_region_size);
+			}
+		}
+
 		uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *));
-		TEST_ASSERT(uffd_descs, "Memory allocation failed");
+		TEST_ASSERT(uffd_descs, "Failed to allocate uffd descriptors");
+
 		for (i = 0; i < num_uffds; i++) {
 			struct memstress_vcpu_args *vcpu_args;
 			void *vcpu_hva;
@@ -201,10 +296,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		 * requests.
 		 */
 		uffd_descs[i] = uffd_setup_demand_paging(
-			p->uffd_mode, p->uffd_delay, vcpu_hva,
+			uffd_mode, uffd_delay, vcpu_hva,
 			uffd_region_size, p->readers_per_uffd,
-			&handle_uffd_page_request);
+			&handle_uffd_page_request_from_uffd);
 	}
 }
@@ -218,7 +313,7 @@
 	ts_diff = timespec_elapsed(start);
 	pr_info("All vCPU threads joined\n");
 
-	if (p->uffd_mode) {
+	if (uffd_mode) {
 		/* Tell the user fault fd handler threads to quit */
 		for (i = 0; i < num_uffds; i++)
 			uffd_stop_demand_paging(uffd_descs[i]);
@@ -239,7 +334,7 @@
 	memstress_destroy_vm(vm);
 
 	free(guest_data_prototype);
-	if (p->uffd_mode)
+	if (uffd_mode)
 		free(uffd_descs);
 }
 
@@ -248,7 +343,8 @@ static void help(char *name)
 	puts("");
 	printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n"
	       "          [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n"
-	       "          [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name);
+	       "          [-s type] [-v vcpus] [-c cpu_list] [-o] [-w]\n",
+	       name);
 	guest_modes_help();
 	printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n"
 	       "     UFFD registration mode: 'MISSING' or 'MINOR'.\n");
@@ -260,6 +356,7 @@
 	       "     FD handler to simulate demand paging\n"
 	       "     overheads. Ignored without -u.\n");
 	printf(" -r: Set the number of reader threads per uffd.\n");
+	printf(" -w: Enable kvm cap for memory fault exits.\n");
 	printf(" -b: specify the size of the memory region which should be\n"
 	       "     demand paged by each vCPU. e.g. 10M or 3G.\n"
 	       "     Default: 1G\n");
@@ -280,29 +377,30 @@ int main(int argc, char *argv[])
 		.partition_vcpu_memory_access = true,
 		.readers_per_uffd = 1,
 		.single_uffd = false,
+		.memfault_exits = false,
 	};
 	int opt;
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:c:r:")) != -1) {
+	while ((opt = getopt(argc, argv, "ahowm:u:d:b:s:v:c:r:")) != -1) {
 		switch (opt) {
 		case 'm':
 			guest_modes_cmdline(optarg);
 			break;
 		case 'u':
 			if (!strcmp("MISSING", optarg))
-				p.uffd_mode = UFFDIO_REGISTER_MODE_MISSING;
+				uffd_mode = UFFDIO_REGISTER_MODE_MISSING;
 			else if (!strcmp("MINOR", optarg))
-				p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
-			TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
+				uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
+			TEST_ASSERT(uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
 			break;
 		case 'a':
 			p.single_uffd = true;
 			break;
 		case 'd':
-			p.uffd_delay = strtoul(optarg, NULL, 0);
-			TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported.");
+			uffd_delay = strtoul(optarg, NULL, 0);
+			TEST_ASSERT(uffd_delay >= 0, "A negative UFFD delay is not supported.");
 			break;
 		case 'b':
 			guest_percpu_mem_size = parse_size(optarg);
@@ -328,6 +426,9 @@ int main(int argc, char *argv[])
 				    "Invalid number of readers per uffd %d: must be >=1",
 				    p.readers_per_uffd);
 			break;
+		case 'w':
+			p.memfault_exits = true;
+			break;
 		case 'h':
 		default:
 			help(argv[0]);
@@ -335,7 +436,7 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	if (p.uffd_mode == UFFDIO_REGISTER_MODE_MINOR &&
+	if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR &&
 	    !backing_src_is_shared(p.src_type)) {
 		TEST_FAIL("userfaultfd MINOR mode requires shared memory; pick a different -s");
 	}