From patchwork Wed Feb 15 01:16:07 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141157
Date: Wed, 15 Feb 2023 01:16:07 +0000
Message-ID: <20230215011614.725983-2-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 1/8] selftests/kvm: Fix bug in how demand_paging_test calculates paging rate
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org
Currently we're dividing tv_nsec by 1E8, not 1E9 (nanoseconds per
second): the fractional-seconds term of the denominator comes out up to
ten times too large, so the overall paging rate is under-reported.

Reported-by: James Houghton
Signed-off-by: Anish Moorthy
---
 tools/testing/selftests/kvm/demand_paging_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index b0e1fc4de9e29..6809184ce2390 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -194,7 +194,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		ts_diff.tv_sec, ts_diff.tv_nsec);
 	pr_info("Overall demand paging rate: %f pgs/sec\n",
 		memstress_args.vcpu_args[0].pages * nr_vcpus /
-		((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 100000000.0));
+		((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / 1E9));
 
 	memstress_destroy_vm(vm);
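(A standalone sketch of the corrected conversion, for illustration only
and not part of the patch; tv_nsec counts nanoseconds, so the fractional
term must divide by 1E9:)

	#include <time.h>

	/*
	 * Convert a timespec to fractional seconds. Dividing tv_nsec by
	 * 1E8 instead would make this term up to ten times too large and
	 * under-report any rate computed against it.
	 */
	static double timespec_to_seconds(const struct timespec *ts)
	{
		return (double)ts->tv_sec + (double)ts->tv_nsec / 1E9;
	}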
From patchwork Wed Feb 15 01:16:08 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141158
Date: Wed, 15 Feb 2023 01:16:08 +0000
Message-ID: <20230215011614.725983-3-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 2/8] selftests/kvm: Allow many vcpus per UFFD in demand paging test
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

The aim is to enable the demand paging selftest to benchmark multiple
vCPU threads concurrently faulting over a single region/uffd. Currently,

 (a) "-u" (run test in userfaultfd mode) will create a uffd for each
     vCPU's region, so that each uffd services a single vCPU thread.
 (b) "-u -o" (userfaultfd mode + overlapped vCPU memory accesses)
     simply doesn't work: the test will try to register all of guest
     memory with multiple uffds, and get an error for doing so.

With this change,

 (1) "-u" behavior is unchanged.
 (2) "-u -a" will create a single uffd for *all* of guest memory.
 (3) "-u -o" will implicitly pass "-a", resolving the breakage in (b).

In cases (2) and (3) all vCPU threads will fault on a single uffd (with
and without partitioned accesses, respectively), giving us the behavior
we want to test. With multiple threads now allowed to fault on a single
UFFD, it also makes sense to allow more than one reader thread per UFFD:
an option for this (-r) is added as well.
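For illustration (a hypothetical invocation, not one used in this
series), "./demand_paging_test -u MISSING -a -r 8 -v 8" would have eight
vCPU threads fault on a single uffd serviced by eight reader threads.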
Signed-off-by: Anish Moorthy
Acked-by: James Houghton
---
 .../selftests/kvm/aarch64/page_fault_test.c   |  4 +-
 .../selftests/kvm/demand_paging_test.c        | 62 +++++++++----
 .../selftests/kvm/include/userfaultfd_util.h  | 18 +++-
 .../selftests/kvm/lib/userfaultfd_util.c      | 86 +++++++++++++------
 4 files changed, 125 insertions(+), 45 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
index beb944fa6fd46..de4251b811e3b 100644
--- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c
+++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c
@@ -377,14 +377,14 @@ static void setup_uffd(struct kvm_vm *vm, struct test_params *p,
 
 	*pt_uffd = uffd_setup_demand_paging(uffd_mode, 0, pt_args.hva,
 					    pt_args.paging_size,
-					    test->uffd_pt_handler);
+					    1, test->uffd_pt_handler);
 
 	*data_uffd = NULL;
 	if (test->uffd_data_handler)
 		*data_uffd = uffd_setup_demand_paging(uffd_mode, 0, data_args.hva,
 						      data_args.paging_size,
-						      test->uffd_data_handler);
+						      1, test->uffd_data_handler);
 }
 
 static void free_uffd(struct test_desc *test, struct uffd_desc *pt_uffd,
diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 6809184ce2390..3c1d5b81c9822 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -58,7 +58,7 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 }
 
 static int handle_uffd_page_request(int uffd_mode, int uffd,
-		struct uffd_msg *msg)
+				    struct uffd_msg *msg)
 {
 	pid_t tid = syscall(__NR_gettid);
 	uint64_t addr = msg->arg.pagefault.address;
@@ -77,8 +77,15 @@ static int handle_uffd_page_request(int uffd_mode, int uffd,
 		copy.mode = 0;
 
 		r = ioctl(uffd, UFFDIO_COPY, &copy);
-		if (r == -1) {
-			pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d with errno: %d\n",
+		/*
+		 * When multiple vCPU threads fault on a single page and there
+		 * are multiple readers for the UFFD, at least one of the
+		 * UFFDIO_COPYs will fail with EEXIST: handle that case
+		 * without signaling an error.
+		 */
+		if (r == -1 && errno != EEXIST) {
+			pr_info(
+				"Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n",
 				addr, tid, errno);
 			return r;
 		}
@@ -89,8 +96,10 @@ static int handle_uffd_page_request(int uffd_mode, int uffd,
 		cont.range.len = demand_paging_size;
 
 		r = ioctl(uffd, UFFDIO_CONTINUE, &cont);
-		if (r == -1) {
-			pr_info("Failed UFFDIO_CONTINUE in 0x%lx from thread %d with errno: %d\n",
+		/* See the note about EEXISTs in the UFFDIO_COPY branch. */
+		if (r == -1 && errno != EEXIST) {
+			pr_info(
+				"Failed UFFDIO_CONTINUE in 0x%lx from thread %d, errno = %d\n",
 				addr, tid, errno);
 			return r;
 		}
@@ -110,7 +119,9 @@ static int handle_uffd_page_request(int uffd_mode, int uffd,
 
 struct test_params {
 	int uffd_mode;
+	bool single_uffd;
 	useconds_t uffd_delay;
+	int readers_per_uffd;
 	enum vm_mem_backing_src_type src_type;
 	bool partition_vcpu_memory_access;
 };
@@ -133,7 +144,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct timespec start;
 	struct timespec ts_diff;
 	struct kvm_vm *vm;
-	int i;
+	int i, num_uffds = 0;
+	uint64_t uffd_region_size;
 
 	vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
 				 p->src_type, p->partition_vcpu_memory_access);
@@ -146,10 +158,13 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	memset(guest_data_prototype, 0xAB, demand_paging_size);
 
 	if (p->uffd_mode) {
-		uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *));
+		num_uffds = p->single_uffd ? 1 : nr_vcpus;
+		uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds;
+
+		uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *));
 		TEST_ASSERT(uffd_descs, "Memory allocation failed");
 
-		for (i = 0; i < nr_vcpus; i++) {
+		for (i = 0; i < num_uffds; i++) {
 			struct memstress_vcpu_args *vcpu_args;
 			void *vcpu_hva;
 			void *vcpu_alias;
@@ -160,8 +175,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			vcpu_hva = addr_gpa2hva(vm, vcpu_args->gpa);
 			vcpu_alias = addr_gpa2alias(vm, vcpu_args->gpa);
 
-			prefault_mem(vcpu_alias,
-				vcpu_args->pages * memstress_args.guest_page_size);
+			prefault_mem(vcpu_alias, uffd_region_size);
 
 			/*
 			 * Set up user fault fd to handle demand paging
@@ -169,7 +183,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			 */
 			uffd_descs[i] = uffd_setup_demand_paging(
 				p->uffd_mode, p->uffd_delay, vcpu_hva,
-				vcpu_args->pages * memstress_args.guest_page_size,
+				uffd_region_size,
+				p->readers_per_uffd,
 				&handle_uffd_page_request);
 		}
 	}
@@ -186,7 +201,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 
 	if (p->uffd_mode) {
 		/* Tell the user fault fd handler threads to quit */
-		for (i = 0; i < nr_vcpus; i++)
+		for (i = 0; i < num_uffds; i++)
 			uffd_stop_demand_paging(uffd_descs[i]);
 	}
 
@@ -206,14 +221,19 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 static void help(char *name)
 {
 	puts("");
-	printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-d uffd_delay_usec]\n"
-	       "          [-b memory] [-s type] [-v vcpus] [-o]\n", name);
+	printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n"
+	       "          [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n"
+	       "          [-s type] [-v vcpus] [-o]\n", name);
 	guest_modes_help();
 	printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n"
 	       "     UFFD registration mode: 'MISSING' or 'MINOR'.\n");
+	printf(" -a: Use a single userfaultfd for all of guest memory, instead of\n"
+	       "     creating one for each region paged by a unique vCPU\n"
+	       "     Set implicitly with -o, and no effect without -u.\n");
 	printf(" -d: add a delay in usec to the User Fault\n"
 	       "     FD handler to simulate demand paging\n"
 	       "     overheads. Ignored without -u.\n");
+	printf(" -r: Set the number of reader threads per uffd.\n");
 	printf(" -b: specify the size of the memory region which should be\n"
	       "     demand paged by each vCPU. e.g. 10M or 3G.\n"
 	       "     Default: 1G\n");
@@ -231,12 +251,14 @@ int main(int argc, char *argv[])
 	struct test_params p = {
 		.src_type = DEFAULT_VM_MEM_SRC,
 		.partition_vcpu_memory_access = true,
+		.readers_per_uffd = 1,
+		.single_uffd = false,
 	};
 	int opt;
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "hm:u:d:b:s:v:o")) != -1) {
+	while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:r:")) != -1) {
 		switch (opt) {
 		case 'm':
 			guest_modes_cmdline(optarg);
@@ -248,6 +270,9 @@ int main(int argc, char *argv[])
 				p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
 			TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
 			break;
+		case 'a':
+			p.single_uffd = true;
+			break;
 		case 'd':
 			p.uffd_delay = strtoul(optarg, NULL, 0);
 			TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported.");
@@ -265,6 +290,13 @@ int main(int argc, char *argv[])
 			break;
 		case 'o':
 			p.partition_vcpu_memory_access = false;
+			p.single_uffd = true;
+			break;
+		case 'r':
+			p.readers_per_uffd = atoi(optarg);
+			TEST_ASSERT(p.readers_per_uffd >= 1,
+				    "Invalid number of readers per uffd %d: must be >=1",
+				    p.readers_per_uffd);
 			break;
 		case 'h':
 		default:
diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h
index 877449c345928..92cc1f9ec0686 100644
--- a/tools/testing/selftests/kvm/include/userfaultfd_util.h
+++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h
@@ -17,18 +17,30 @@
 
 typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg);
 
+struct uffd_reader_args {
+	int uffd_mode;
+	int uffd;
+	useconds_t delay;
+	uffd_handler_t handler;
+	/* Holds the read end of the pipe for killing the reader. */
+	int pipe;
+};
+
 struct uffd_desc {
 	int uffd_mode;
 	int uffd;
-	int pipefds[2];
 	useconds_t delay;
 	uffd_handler_t handler;
-	pthread_t thread;
+	uint64_t num_readers;
+	/* Holds the write ends of the pipes for killing the readers. */
+	int *pipefds;
+	pthread_t *readers;
+	struct uffd_reader_args *reader_args;
 };
 
 struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay,
 					   void *hva, uint64_t len,
-					   uffd_handler_t handler);
+					   uint64_t num_readers,
+					   uffd_handler_t handler);
 
 void uffd_stop_demand_paging(struct uffd_desc *uffd);
diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
index 92cef20902f1f..2723ee1e3e1b2 100644
--- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c
+++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
@@ -27,10 +27,8 @@
 
 static void *uffd_handler_thread_fn(void *arg)
 {
-	struct uffd_desc *uffd_desc = (struct uffd_desc *)arg;
-	int uffd = uffd_desc->uffd;
-	int pipefd = uffd_desc->pipefds[0];
-	useconds_t delay = uffd_desc->delay;
+	struct uffd_reader_args *reader_args = (struct uffd_reader_args *)arg;
+	int uffd = reader_args->uffd;
 	int64_t pages = 0;
 	struct timespec start;
 	struct timespec ts_diff;
@@ -44,7 +42,7 @@ static void *uffd_handler_thread_fn(void *arg)
 
 		pollfd[0].fd = uffd;
 		pollfd[0].events = POLLIN;
-		pollfd[1].fd = pipefd;
+		pollfd[1].fd = reader_args->pipe;
 		pollfd[1].events = POLLIN;
 
 		r = poll(pollfd, 2, -1);
@@ -92,9 +90,9 @@ static void *uffd_handler_thread_fn(void *arg)
 		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
 			continue;
 
-		if (delay)
-			usleep(delay);
-		r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg);
+		if (reader_args->delay)
+			usleep(reader_args->delay);
+		r = reader_args->handler(reader_args->uffd_mode, uffd, &msg);
 		if (r < 0)
 			return NULL;
 		pages++;
@@ -110,7 +108,7 @@ static void *uffd_handler_thread_fn(void *arg)
 
 struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay,
 					   void *hva, uint64_t len,
-					   uffd_handler_t handler)
+					   uint64_t num_readers,
+					   uffd_handler_t handler)
 {
 	struct uffd_desc *uffd_desc;
 	bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
@@ -118,14 +116,26 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay,
 	struct uffdio_api uffdio_api;
 	struct uffdio_register uffdio_register;
 	uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
-	int ret;
+	int ret, i;
 
 	PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
 		       is_minor ? "MINOR" : "MISSING",
 		       is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
 
 	uffd_desc = malloc(sizeof(struct uffd_desc));
-	TEST_ASSERT(uffd_desc, "malloc failed");
+	TEST_ASSERT(uffd_desc, "Failed to malloc uffd descriptor");
+
+	uffd_desc->pipefds = malloc(sizeof(int) * num_readers);
+	TEST_ASSERT(uffd_desc->pipefds, "Failed to malloc pipes");
+
+	uffd_desc->readers = malloc(sizeof(pthread_t) * num_readers);
+	TEST_ASSERT(uffd_desc->readers, "Failed to malloc reader threads");
+
+	uffd_desc->reader_args = malloc(
+		sizeof(struct uffd_reader_args) * num_readers);
+	TEST_ASSERT(uffd_desc->reader_args, "Failed to malloc reader_args");
+
+	uffd_desc->num_readers = num_readers;
 
 	/* In order to get minor faults, prefault via the alias. */
	if (is_minor)
@@ -148,18 +158,32 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay,
 	TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
 		    expected_ioctls, "missing userfaultfd ioctls");
 
-	ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK);
-	TEST_ASSERT(!ret, "Failed to set up pipefd");
-
 	uffd_desc->uffd_mode = uffd_mode;
 	uffd_desc->uffd = uffd;
 	uffd_desc->delay = delay;
 	uffd_desc->handler = handler;
-	pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn,
-		       uffd_desc);
-	PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
-		       hva, hva + len);
+
+	for (i = 0; i < uffd_desc->num_readers; ++i) {
+		int pipes[2];
+
+		ret = pipe2((int *) &pipes, O_CLOEXEC | O_NONBLOCK);
+		TEST_ASSERT(!ret, "Failed to set up pipefd %i for uffd_desc %p",
+			    i, uffd_desc);
+
+		uffd_desc->pipefds[i] = pipes[1];
+
+		uffd_desc->reader_args[i].uffd_mode = uffd_mode;
+		uffd_desc->reader_args[i].uffd = uffd;
+		uffd_desc->reader_args[i].delay = delay;
+		uffd_desc->reader_args[i].handler = handler;
+		uffd_desc->reader_args[i].pipe = pipes[0];
+
+		pthread_create(&uffd_desc->readers[i], NULL, uffd_handler_thread_fn,
+			       &uffd_desc->reader_args[i]);
+
+		PER_VCPU_DEBUG("Created uffd thread %i for HVA range [%p, %p)\n",
+			       i, hva, hva + len);
+	}
 
 	return uffd_desc;
 }
@@ -167,19 +191,31 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay,
 void uffd_stop_demand_paging(struct uffd_desc *uffd)
 {
 	char c = 0;
-	int ret;
+	int i, ret;
 
-	ret = write(uffd->pipefds[1], &c, 1);
-	TEST_ASSERT(ret == 1, "Unable to write to pipefd");
+	for (i = 0; i < uffd->num_readers; ++i) {
+		ret = write(uffd->pipefds[i], &c, 1);
+		TEST_ASSERT(
+			ret == 1, "Unable to write to pipefd %i for uffd_desc %p", i, uffd);
+	}
 
-	ret = pthread_join(uffd->thread, NULL);
-	TEST_ASSERT(ret == 0, "Pthread_join failed.");
+	for (i = 0; i < uffd->num_readers; ++i) {
+		ret = pthread_join(uffd->readers[i], NULL);
+		TEST_ASSERT(
+			ret == 0,
+			"Pthread_join failed on reader thread %i for uffd_desc %p", i, uffd);
+	}
 
 	close(uffd->uffd);
 
-	close(uffd->pipefds[1]);
-	close(uffd->pipefds[0]);
+	for (i = 0; i < uffd->num_readers; ++i) {
+		close(uffd->pipefds[i]);
+		close(uffd->reader_args[i].pipe);
+	}
 
+	free(uffd->pipefds);
+	free(uffd->readers);
+	free(uffd->reader_args);
 	free(uffd);
 }
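(The shutdown path above fans a one-byte write out to one pipe per
reader, then joins each thread. A condensed sketch of that idiom,
illustrative only, with error handling elided and assuming each reader
thread polls its read end and exits once it becomes readable:)

	#include <pthread.h>
	#include <unistd.h>

	struct reader {
		pthread_t thread;
		int pipe_wr;	/* write end, held by the controller */
	};

	static void stop_readers(struct reader *readers, int n)
	{
		char c = 0;
		int i;

		/* Wake every reader first so they can exit concurrently... */
		for (i = 0; i < n; i++)
			write(readers[i].pipe_wr, &c, 1);
		/* ...then reap them. */
		for (i = 0; i < n; i++)
			pthread_join(readers[i].thread, NULL);
	}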
From patchwork Wed Feb 15 01:16:09 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141159
Date: Wed, 15 Feb 2023 01:16:09 +0000
Message-ID: <20230215011614.725983-4-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 3/8] selftests/kvm: Switch demand paging uffd readers to epoll
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

With multiple reader threads for each UFFD, the test suffers from the
thundering herd problem: performance degrades as the number of reader
threads is increased. Switching the readers over to epoll (which offers
EPOLLEXCLUSIVE) solves this problem, although base-case performance does
suffer significantly.

This commit also changes the error-handling convention of
uffd_handler_thread_fn: instead of checking for errors and returning
when they're found, the function now uses TEST_ASSERT, and "return NULL"
indicates a successful exit (i.e. one triggered by a write to the
corresponding pipe).

Performance samples, in pages/sec, are given below; they were generated
by the command in [1].
Num Reader Threads    Paging Rate (POLL)    Paging Rate (EPOLL)
                 1                  249k                   185k
                 2                  201k                   235k
                 4                  186k                   155k
                16                  150k                   217k
                32                   89k                   198k

[1] ./demand_paging_test -u MINOR -s shmem -v 4 -o -r <num readers>

Signed-off-by: Anish Moorthy
Acked-by: James Houghton
---
 .../selftests/kvm/demand_paging_test.c        |  1 -
 .../selftests/kvm/lib/userfaultfd_util.c      | 76 +++++++++----------
 2 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 3c1d5b81c9822..34d5ba2044a2c 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -13,7 +13,6 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <time.h>
-#include <poll.h>
 #include <pthread.h>
 #include <linux/userfaultfd.h>
 #include <sys/syscall.h>
diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
index 2723ee1e3e1b2..863840d340105 100644
--- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c
+++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c
@@ -16,6 +16,7 @@
 #include <poll.h>
 #include <pthread.h>
 #include <linux/userfaultfd.h>
+#include <sys/epoll.h>
 #include <sys/syscall.h>
 
 #include "kvm_util.h"
@@ -32,60 +33,56 @@ static void *uffd_handler_thread_fn(void *arg)
 	int64_t pages = 0;
 	struct timespec start;
 	struct timespec ts_diff;
+	int epollfd;
+	struct epoll_event evt;
+
+	epollfd = epoll_create(1);
+	TEST_ASSERT(epollfd >= 0, "Failed to create epollfd.");
+
+	evt.events = EPOLLIN | EPOLLEXCLUSIVE;
+	evt.data.u32 = 0;
+	TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, uffd, &evt) == 0,
+		    "Failed to add uffd to epollfd");
+
+	evt.events = EPOLLIN;
+	evt.data.u32 = 1;
+	TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, reader_args->pipe, &evt) == 0,
+		    "Failed to add pipe to epollfd");
 
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	while (1) {
 		struct uffd_msg msg;
-		struct pollfd pollfd[2];
-		char tmp_chr;
 		int r;
 
-		pollfd[0].fd = uffd;
-		pollfd[0].events = POLLIN;
-		pollfd[1].fd = reader_args->pipe;
-		pollfd[1].events = POLLIN;
-
-		r = poll(pollfd, 2, -1);
-		switch (r) {
-		case -1:
-			pr_info("poll err");
-			continue;
-		case 0:
-			continue;
-		case 1:
-			break;
-		default:
-			pr_info("Polling uffd returned %d", r);
-			return NULL;
-		}
+		r = epoll_wait(epollfd, &evt, 1, -1);
+		TEST_ASSERT(
+			r == 1,
+			"Unexpected number of events (%d) returned by epoll, errno = %d",
+			r, errno);
 
-		if (pollfd[0].revents & POLLERR) {
-			pr_info("uffd revents has POLLERR");
-			return NULL;
-		}
+		if (evt.data.u32 == 1) {
+			char tmp_chr;
 
-		if (pollfd[1].revents & POLLIN) {
-			r = read(pollfd[1].fd, &tmp_chr, 1);
+			TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)),
+				    "Reader thread received EPOLLERR or EPOLLHUP on pipe.");
+			r = read(reader_args->pipe, &tmp_chr, 1);
 			TEST_ASSERT(r == 1,
-				    "Error reading pipefd in UFFD thread\n");
+				    "Error reading pipefd in uffd reader thread");
 			return NULL;
 		}
 
-		if (!(pollfd[0].revents & POLLIN))
-			continue;
+		TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)),
+			    "Reader thread received EPOLLERR or EPOLLHUP on uffd.");
 
 		r = read(uffd, &msg, sizeof(msg));
 		if (r == -1) {
-			if (errno == EAGAIN)
-				continue;
-			pr_info("Read of uffd got errno %d\n", errno);
-			return NULL;
+			TEST_ASSERT(errno == EAGAIN,
+				    "Error reading from UFFD: errno = %d", errno);
+			continue;
 		}
 
-		if (r != sizeof(msg)) {
-			pr_info("Read on uffd returned unexpected size: %d bytes", r);
-			return NULL;
-		}
+		TEST_ASSERT(r == sizeof(msg),
+			    "Read on uffd returned unexpected number of bytes (%d)", r);
 
 		if (!(msg.event & UFFD_EVENT_PAGEFAULT))
 			continue;
@@ -93,8 +90,9 @@ static void *uffd_handler_thread_fn(void *arg)
 		if (reader_args->delay)
			usleep(reader_args->delay);
 		r = reader_args->handler(reader_args->uffd_mode, uffd, &msg);
-		if (r < 0)
-			return NULL;
+		TEST_ASSERT(
+			r >= 0,
+			"Reader thread handler function returned negative value %d", r);
 		pages++;
 	}
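(For readers unfamiliar with EPOLLEXCLUSIVE: when several epoll
instances, one per reader thread here, register the same file descriptor
with EPOLLEXCLUSIVE, the kernel wakes only one or a few of the waiters
per event rather than all of them, which is what eliminates the
thundering herd. A minimal sketch of the registration, illustrative
only and assuming a valid uffd, not code from the patch:)

	#include <sys/epoll.h>
	#include <unistd.h>

	/* Returns an epoll fd watching `uffd` exclusively, or -1. */
	static int make_exclusive_epollfd(int uffd)
	{
		struct epoll_event evt = {
			.events = EPOLLIN | EPOLLEXCLUSIVE,
			.data = { .u32 = 0 },
		};
		int epollfd = epoll_create1(0);

		if (epollfd < 0)
			return -1;
		if (epoll_ctl(epollfd, EPOLL_CTL_ADD, uffd, &evt) != 0) {
			close(epollfd);
			return -1;
		}
		return epollfd;
	}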
From patchwork Wed Feb 15 01:16:10 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141160
Date: Wed, 15 Feb 2023 01:16:10 +0000
Message-ID: <20230215011614.725983-5-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 4/8] kvm: Allow hva_to_pfn_fast to resolve read-only faults
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

The upcoming mem_fault_nowait commits will make it so that, when the
relevant capability is enabled, hva_to_pfn() returns after calling
hva_to_pfn_fast() without ever attempting to pin memory via
hva_to_pfn_slow().

hva_to_pfn_fast() currently just fails for read-only faults. However,
there doesn't seem to be a reason that we can't simply try pinning the
page without FOLL_WRITE instead of immediately falling back to slow GUP.
This commit implements that behavior.

Suggested-by: James Houghton
Signed-off-by: Anish Moorthy
---
 virt/kvm/kvm_main.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d255964ec331e..dae5f48151032 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2479,7 +2479,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
 }
 
 /*
- * The fast path to get the writable pfn which will be stored in @pfn,
+ * The fast path to get the pfn which will be stored in @pfn,
  * true indicates success, otherwise false is returned. It's also the
  * only part that runs if we can in atomic context.
  */
@@ -2487,16 +2487,18 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
 			    bool *writable, kvm_pfn_t *pfn)
 {
 	struct page *page[1];
+	bool found_by_fast_gup =
+		get_user_page_fast_only(
+			addr,
+			/*
+			 * Fast pin a writable pfn only if it is a write fault request
+			 * or the caller allows to map a writable pfn for a read fault
+			 * request.
+			 */
+			(write_fault || writable) ? FOLL_WRITE : 0,
+			page);
 
-	/*
-	 * Fast pin a writable pfn only if it is a write fault request
-	 * or the caller allows to map a writable pfn for a read fault
-	 * request.
-	 */
-	if (!(write_fault || writable))
-		return false;
-
-	if (get_user_page_fast_only(addr, FOLL_WRITE, page)) {
+	if (found_by_fast_gup) {
 		*pfn = page_to_pfn(page[0]);
 
 		if (writable)
From patchwork Wed Feb 15 01:16:11 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141162
Date: Wed, 15 Feb 2023 01:16:11 +0000
Message-ID: <20230215011614.725983-6-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 5/8] kvm: Add cap/kvm_run field for memory fault exits
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

This new KVM exit allows userspace to handle missing memory. It
indicates that the pages in the range [gpa, gpa + size) must be mapped.

The "flags" field actually goes unused in this series: it's included for
forward compatibility with [1], should this series happen to go in
first.

[1] https://lore.kernel.org/all/CA+EHjTyzZ2n8kQxH_Qx72aRq1k+dETJXTsoOM3tggPZAZkYbCA@mail.gmail.com/

Signed-off-by: Anish Moorthy
Acked-by: James Houghton
---
 Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++
 include/linux/kvm_host.h       | 13 +++++++++++
 include/uapi/linux/kvm.h       | 13 ++++++++++-
 tools/include/uapi/linux/kvm.h |  7 ++++++
 virt/kvm/kvm_main.c            | 26 +++++++++++++++++++++
 5 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9807b05a1b571..4b06e60668686 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5937,6 +5937,18 @@ delivery must be provided via the "reg_aen" struct.
 The "pad" and "reserved" fields may be used for future extensions and should be
 set to 0s by userspace.
 
+4.137 KVM_SET_MEM_FAULT_NOWAIT
+------------------------------
+
+:Capability: KVM_CAP_MEM_FAULT_NOWAIT
+:Architectures: x86, arm64
+:Type: vm ioctl
+:Parameters: bool state (in)
+:Returns: 0 on success, or -1 if KVM_CAP_MEM_FAULT_NOWAIT is not present.
+
+Enables (state=true) or disables (state=false) waitless memory faults. For more
+information, see the documentation of KVM_CAP_MEM_FAULT_NOWAIT.
+
 5. The kvm_run structure
 ========================
 
@@ -6544,6 +6556,21 @@ array field represents return values. The userspace should update the return
 values of SBI call before resuming the VCPU. For more details on RISC-V SBI
 spec refer, https://github.com/riscv/riscv-sbi-doc.
 
+::
+
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 gpa;
+			__u64 size;
+		} memory_fault;
+
+If exit reason is KVM_EXIT_MEMORY_FAULT then it indicates that the VCPU has
+encountered a memory error which is not handled by KVM kernel module and
+which userspace may choose to handle.
+
+'gpa' and 'size' indicate the memory range the error occurs at. Userspace
+may handle the error and return to KVM to retry the previous memory access.
+
 ::
 
     /* KVM_EXIT_NOTIFY */
@@ -7577,6 +7604,21 @@ This capability is aimed to mitigate the threat that malicious VMs can
 cause CPU stuck (due to event windows don't open up) and make the CPU
 unavailable to host or other VMs.
 
+7.34 KVM_CAP_MEM_FAULT_NOWAIT
+-----------------------------
+
+:Architectures: x86, arm64
+:Target: VM
+:Parameters: None
+:Returns: 0 on success, or -EINVAL if capability is not supported.
+
+The presence of this capability indicates that userspace can enable/disable
+waitless memory faults through the KVM_SET_MEM_FAULT_NOWAIT ioctl.
+
+When waitless memory faults are enabled, fast get_user_pages failures when
+handling EPT/Shadow Page Table violations will cause a vCPU exit
+(KVM_EXIT_MEMORY_FAULT) instead of a fallback to slow get_user_pages.
+
 8. Other capabilities.
======================
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 109b18e2789c4..9352e7f8480fb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -801,6 +801,9 @@ struct kvm {
 	bool vm_bugged;
 	bool vm_dead;
 
+	rwlock_t mem_fault_nowait_lock;
+	bool mem_fault_nowait;
+
 #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER
 	struct notifier_block pm_notifier;
 #endif
@@ -2278,4 +2281,14 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 /* Max number of entries allowed for each kvm dirty ring */
 #define KVM_DIRTY_RING_MAX_ENTRIES  65536
 
+static inline bool memory_faults_enabled(struct kvm *kvm)
+{
+	bool ret;
+
+	read_lock(&kvm->mem_fault_nowait_lock);
+	ret = kvm->mem_fault_nowait;
+	read_unlock(&kvm->mem_fault_nowait_lock);
+	return ret;
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 55155e262646e..064fbfed97f01 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -264,6 +264,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_SBI        35
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
+#define KVM_EXIT_MEMORY_FAULT     38
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -505,6 +506,12 @@ struct kvm_run {
 #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
 			__u32 flags;
 		} notify;
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 size;
+		} memory_fault;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -1175,6 +1182,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
+#define KVM_CAP_MEM_FAULT_NOWAIT 226
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1658,7 +1666,7 @@ struct kvm_enc_region {
 /* Available with KVM_CAP_ARM_SVE */
 #define KVM_ARM_VCPU_FINALIZE	_IOW(KVMIO,  0xc2, int)
 
-/* Available with KVM_CAP_S390_VCPU_RESETS */
+/* Available with KVM_CAP_S390_VCPU_RESETS */
 #define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
 #define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
 
@@ -2228,4 +2236,7 @@ struct kvm_s390_zpci_op {
 /* flags for kvm_s390_zpci_op->u.reg_aen.flags */
 #define KVM_S390_ZPCIOP_REGAEN_HOST    (1 << 0)
 
+/* Available with KVM_CAP_MEM_FAULT_NOWAIT */
+#define KVM_SET_MEM_FAULT_NOWAIT _IOWR(KVMIO, 0xd2, bool)
+
 #endif /* __LINUX_KVM_H */
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 20522d4ba1e0d..5d9e3f48a9634 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -264,6 +264,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_SBI        35
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
+#define KVM_EXIT_MEMORY_FAULT     38
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -505,6 +506,12 @@ struct kvm_run {
 #define KVM_NOTIFY_CONTEXT_INVALID	(1 << 0)
 			__u32 flags;
 		} notify;
+		/* KVM_EXIT_MEMORY_FAULT */
+		struct {
+			__u64 flags;
+			__u64 gpa;
+			__u64 size;
+		} memory_fault;
 		/* Fix the size of the union. */
		char padding[256];
 	};
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dae5f48151032..8e5bfc00d1181 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1149,6 +1149,9 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	INIT_LIST_HEAD(&kvm->devices);
 	kvm->max_vcpus = KVM_MAX_VCPUS;
 
+	rwlock_init(&kvm->mem_fault_nowait_lock);
+	kvm->mem_fault_nowait = false;
+
 	BUILD_BUG_ON(KVM_MEM_SLOTS_NUM > SHRT_MAX);
 
 	/*
@@ -2313,6 +2316,16 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
 }
 #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
 
+static int kvm_vm_ioctl_set_mem_fault_nowait(struct kvm *kvm, bool state)
+{
+	if (!kvm_vm_ioctl_check_extension(kvm, KVM_CAP_MEM_FAULT_NOWAIT))
+		return -1;
+	write_lock(&kvm->mem_fault_nowait_lock);
+	kvm->mem_fault_nowait = state;
+	write_unlock(&kvm->mem_fault_nowait_lock);
+	return 0;
+}
+
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
 {
 	return __gfn_to_memslot(kvm_memslots(kvm), gfn);
@@ -4675,6 +4688,10 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 
 		return r;
 	}
+	case KVM_CAP_MEM_FAULT_NOWAIT:
+		if (!kvm_vm_ioctl_check_extension_generic(kvm, cap->cap))
+			return -EINVAL;
+		return 0;
 	default:
 		return kvm_vm_ioctl_enable_cap(kvm, cap);
 	}
@@ -4892,6 +4909,15 @@ static long kvm_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_MEM_FAULT_NOWAIT: {
+		bool state;
+
+		r = -EFAULT;
+		if (copy_from_user(&state, argp, sizeof(state)))
+			goto out;
+		r = kvm_vm_ioctl_set_mem_fault_nowait(kvm, state);
+		break;
+	}
 	case KVM_CHECK_EXTENSION:
 		r = kvm_vm_ioctl_check_extension_generic(kvm, arg);
 		break;
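(A hypothetical VMM-side loop consuming the new exit; an illustrative
sketch, not code from this series. resolve_fault() is an assumed helper
standing in for whatever userspace does to make [gpa, gpa + size)
present, e.g. UFFDIO_COPY or MADV_POPULATE_WRITE; the ioctl and exit
reason are the ones added by this patch:)

	#include <errno.h>
	#include <stdbool.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	extern void resolve_fault(__u64 gpa, __u64 size);	/* assumed helper */

	static int run_vcpu_with_nowait_faults(int vm_fd, int vcpu_fd,
					       struct kvm_run *run)
	{
		bool state = true;

		/* Opt in to waitless memory faults (ioctl added by this patch). */
		if (ioctl(vm_fd, KVM_SET_MEM_FAULT_NOWAIT, &state) < 0)
			return -1;

		for (;;) {
			int ret = ioctl(vcpu_fd, KVM_RUN, 0);

			if (run->exit_reason == KVM_EXIT_MEMORY_FAULT) {
				/* Fault the range in, then re-enter the guest. */
				resolve_fault(run->memory_fault.gpa,
					      run->memory_fault.size);
				continue;
			}
			if (ret < 0)
				return -1;
			/* Handle other exit reasons here. */
			return 0;
		}
	}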
From patchwork Wed Feb 15 01:16:12 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141161
Date: Wed, 15 Feb 2023 01:16:12 +0000
Message-ID: <20230215011614.725983-7-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 6/8] kvm/x86: Add mem fault exit on EPT violations
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

With the relevant KVM capability enabled, EPT violations will exit to
userspace with reason KVM_EXIT_MEMORY_FAULT instead of resolving the
fault via slow get_user_pages.

Signed-off-by: Anish Moorthy
Suggested-by: James Houghton
---
 arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++++++++++---
 arch/x86/kvm/x86.c     |  1 +
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index aeb240b339f54..28af8d60adee6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4201,6 +4201,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 {
 	struct kvm_memory_slot *slot = fault->slot;
 	bool async;
+	bool mem_fault_nowait;
 
 	/*
 	 * Retry the page fault if the gfn hit a memslot that is being deleted
@@ -4230,9 +4231,25 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	}
 
 	async = false;
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
-					  fault->write, &fault->map_writable,
-					  &fault->hva);
+	mem_fault_nowait = memory_faults_enabled(vcpu->kvm);
+
+	fault->pfn = __gfn_to_pfn_memslot(
+		slot, fault->gfn,
+		mem_fault_nowait,
+		false,
+		mem_fault_nowait ? NULL : &async,
+		fault->write, &fault->map_writable,
+		&fault->hva);
+
+	if (mem_fault_nowait) {
+		if (fault->pfn == KVM_PFN_ERR_FAULT) {
+			vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
+			vcpu->run->memory_fault.gpa = fault->gfn << PAGE_SHIFT;
+			vcpu->run->memory_fault.size = PAGE_SIZE;
+		}
+		return RET_PF_CONTINUE;
+	}
+
 	if (!async)
 		return RET_PF_CONTINUE; /* *pfn has correct page already */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 508074e47bc0e..fe39ab2af5db4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4427,6 +4427,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
 	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+	case KVM_CAP_MEM_FAULT_NOWAIT:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
From patchwork Wed Feb 15 01:16:13 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141164
Date: Wed, 15 Feb 2023 01:16:13 +0000
Message-ID: <20230215011614.725983-8-amoorthy@google.com>
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
References: <20230215011614.725983-1-amoorthy@google.com>
Subject: [PATCH 7/8] kvm/arm64: Implement KVM_CAP_MEM_FAULT_NOWAIT for arm64
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
    Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
    kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

Just do an atomic gfn_to_pfn_memslot when the capability is enabled.
Since we don't have to deal with async page faults, the implementation
is even simpler than on x86.

Signed-off-by: Anish Moorthy
Acked-by: James Houghton
---
 arch/arm64/kvm/arm.c |  1 +
 arch/arm64/kvm/mmu.c | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 698787ed87e92..31bec7866c346 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -220,6 +220,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
 	case KVM_CAP_ARM_SYSTEM_SUSPEND:
+	case KVM_CAP_MEM_FAULT_NOWAIT:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 01352f5838a00..964af7cd5f1c8 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1206,6 +1206,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	unsigned long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
+	bool mem_fault_nowait;
 
 	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
 	write_fault = kvm_is_write_fault(vcpu);
@@ -1301,8 +1302,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 */
 	smp_rmb();
 
-	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-				   write_fault, &writable, NULL);
+	mem_fault_nowait = memory_faults_enabled(vcpu->kvm);
+	pfn = __gfn_to_pfn_memslot(
+		memslot, gfn, mem_fault_nowait, false, NULL,
+		write_fault, &writable, NULL);
+
+	if (mem_fault_nowait && pfn == KVM_PFN_ERR_FAULT) {
+		vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
+		vcpu->run->memory_fault.gpa = gfn << PAGE_SHIFT;
+		vcpu->run->memory_fault.size = vma_pagesize;
+		return -EFAULT;
+	}
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
From patchwork Wed Feb 15 01:16:14 2023
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13141163
Date: Wed, 15 Feb 2023 01:16:14 +0000
In-Reply-To: <20230215011614.725983-1-amoorthy@google.com>
Mime-Version: 1.0
References: <20230215011614.725983-1-amoorthy@google.com>
X-Mailer: git-send-email 2.39.1.581.gbfd45094c4-goog
Message-ID: <20230215011614.725983-9-amoorthy@google.com>
Subject: [PATCH 8/8] selftests/kvm: Handle mem fault exits in demand paging test
From: Anish Moorthy
To: Paolo Bonzini, Marc Zyngier
Cc: Oliver Upton, Sean Christopherson, James Houghton, Anish Moorthy,
 Ben Gardon, David Matlack, Ricardo Koller, Chao Peng, Axel Rasmussen,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
X-Mailing-List: kvm@vger.kernel.org

When a memory fault exit is received for a GFN which has not yet been
UFFDIO_COPY/CONTINUEd, issue the corresponding call (ignoring any EEXIST).
Otherwise, fault the pages in via MADV_POPULATE_WRITE.
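Before the diff, the uffd resolve pattern the test adopts may be easier to
see in isolation: UFFDIO_COPY/CONTINUE is issued with DONTWAKE and EEXIST is
tolerated, with an explicit UFFDIO_WAKE only on the uffd-reader path. A
minimal sketch, assuming a userfaultfd already registered over the target
range; the helper names here are illustrative, not the test's:

    #include <errno.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/userfaultfd.h>

    /* Returns 0 on success or if another thread already resolved the page. */
    static int copy_page_nowake(int uffd, uint64_t dst, uint64_t src,
                                uint64_t len)
    {
            struct uffdio_copy copy = {
                    .dst = dst, .src = src, .len = len,
                    /* Defer waking: racing resolvers will just see EEXIST. */
                    .mode = UFFDIO_COPY_MODE_DONTWAKE,
            };

            if (ioctl(uffd, UFFDIO_COPY, &copy) && errno != EEXIST)
                    return -errno;
            return 0;
    }

    static void wake_range(int uffd, uint64_t start, uint64_t len)
    {
            struct uffdio_range range = { .start = start, .len = len };

            /*
             * Needed only on the uffd-reader path, where some thread is
             * blocked in the kernel on the fault; a vCPU that took a memory
             * fault exit just re-enters the guest, so it skips this.
             */
            ioctl(uffd, UFFDIO_WAKE, &range);
    }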
Signed-off-by: Anish Moorthy
Acked-by: James Houghton
---
 .../selftests/kvm/demand_paging_test.c        | 206 +++++++++++++-----
 1 file changed, 151 insertions(+), 55 deletions(-)

diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c
index 34d5ba2044a2c..ff731aef52fd0 100644
--- a/tools/testing/selftests/kvm/demand_paging_test.c
+++ b/tools/testing/selftests/kvm/demand_paging_test.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include

 #include "kvm_util.h"
@@ -31,6 +32,60 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
 static size_t demand_paging_size;
 static char *guest_data_prototype;

+static int num_uffds;
+static size_t uffd_region_size;
+static struct uffd_desc **uffd_descs;
+/*
+ * Delay when demand paging is performed through userfaultfd or directly by
+ * vcpu_worker in the case of a KVM_EXIT_MEMORY_FAULT.
+ */
+static useconds_t uffd_delay;
+static int uffd_mode;
+
+static int handle_uffd_page_request(
+	int uffd_mode, int uffd, uint64_t hva, bool is_vcpu
+);
+
+static void madv_write_or_err(uint64_t gpa)
+{
+	int r;
+	void *hva = addr_gpa2hva(memstress_args.vm, gpa);
+
+	r = madvise(hva, demand_paging_size, MADV_POPULATE_WRITE);
+	TEST_ASSERT(
+		r == 0,
+		"MADV_POPULATE_WRITE on hva 0x%lx (gpa 0x%lx) failed with errno %i\n",
+		(uintptr_t) hva, gpa, errno);
+}
+
+static void ready_page(uint64_t gpa)
+{
+	int r, uffd;
+
+	/*
+	 * This test only registers memslot 1 w/ userfaultfd. Any accesses outside
+	 * the registered ranges should fault in the physical pages through
+	 * MADV_POPULATE_WRITE.
+	 */
+	if ((gpa < memstress_args.gpa)
+		|| (gpa >= memstress_args.gpa + memstress_args.size)) {
+		madv_write_or_err(gpa);
+	} else {
+		if (uffd_delay)
+			usleep(uffd_delay);
+
+		uffd = uffd_descs[(gpa - memstress_args.gpa) / uffd_region_size]->uffd;
+
+		r = handle_uffd_page_request(
+			uffd_mode, uffd,
+			(uint64_t) addr_gpa2hva(memstress_args.vm, gpa), true);
+
+		if (r == EEXIST)
+			madv_write_or_err(gpa);
+	}
+}
+
 static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)
 {
 	struct kvm_vcpu *vcpu = vcpu_args->vcpu;
@@ -42,25 +97,34 @@ static void vcpu_worker(struct memstress_vcpu_args *vcpu_args)

 	clock_gettime(CLOCK_MONOTONIC, &start);

-	/* Let the guest access its memory */
-	ret = _vcpu_run(vcpu);
-	TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret);
-	if (get_ucall(vcpu, NULL) != UCALL_SYNC) {
-		TEST_ASSERT(false,
-			    "Invalid guest sync status: exit_reason=%s\n",
-			    exit_reason_str(run->exit_reason));
-	}
+	while (true) {
+		/* Let the guest access its memory */
+		ret = _vcpu_run(vcpu);
+		TEST_ASSERT(ret == 0 || (run->exit_reason == KVM_EXIT_MEMORY_FAULT),
+			    "vcpu_run failed: %d\n", ret);
+		if (get_ucall(vcpu, NULL) != UCALL_SYNC) {
+
+			if (run->exit_reason == KVM_EXIT_MEMORY_FAULT) {
+				ready_page(run->memory_fault.gpa);
+				continue;
+			}
+
+			TEST_ASSERT(false,
+				    "Invalid guest sync status: exit_reason=%s\n",
+				    exit_reason_str(run->exit_reason));
+		}

-	ts_diff = timespec_elapsed(start);
-	PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_idx,
-		       ts_diff.tv_sec, ts_diff.tv_nsec);
+		ts_diff = timespec_elapsed(start);
+		PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_idx,
+			       ts_diff.tv_sec, ts_diff.tv_nsec);
+		break;
+	}
 }

-static int handle_uffd_page_request(int uffd_mode, int uffd,
-				    struct uffd_msg *msg)
+static int handle_uffd_page_request(
+	int uffd_mode, int uffd, uint64_t hva, bool is_vcpu)
 {
 	pid_t tid = syscall(__NR_gettid);
-	uint64_t addr = msg->arg.pagefault.address;
 	struct timespec start;
 	struct timespec ts_diff;
 	int r;
@@ -71,58 +135,81 @@ static int handle_uffd_page_request(
 		struct uffdio_copy copy;

 		copy.src = (uint64_t)guest_data_prototype;
-		copy.dst = addr;
+		copy.dst = hva;
 		copy.len = demand_paging_size;
-		copy.mode = 0;
+		copy.mode = UFFDIO_COPY_MODE_DONTWAKE;

-		r = ioctl(uffd, UFFDIO_COPY, &copy);
 		/*
-		 * With multiple vCPU threads fault on a single page and there are
-		 * multiple readers for the UFFD, at least one of the UFFDIO_COPYs
-		 * will fail with EEXIST: handle that case without signaling an
-		 * error.
+		 * With multiple vCPU threads and at least one of multiple reader threads
+		 * or vCPU memory faults, multiple vCPUs accessing an absent page will
+		 * almost certainly cause some thread doing the UFFDIO_COPY here to get
+		 * EEXIST: make sure to allow that case.
 		 */
-		if (r == -1 && errno != EEXIST) {
-			pr_info(
-				"Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n",
-				addr, tid, errno);
-			return r;
-		}
+		r = ioctl(uffd, UFFDIO_COPY, &copy);
+		TEST_ASSERT(
+			r == 0 || errno == EEXIST,
+			"Thread 0x%x failed UFFDIO_COPY on hva 0x%lx, errno = %d",
+			gettid(), hva, errno);
 	} else if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
+		/* The comments in the UFFDIO_COPY branch also apply here. */
 		struct uffdio_continue cont = {0};

-		cont.range.start = addr;
+		cont.range.start = hva;
 		cont.range.len = demand_paging_size;
+		cont.mode = UFFDIO_CONTINUE_MODE_DONTWAKE;

 		r = ioctl(uffd, UFFDIO_CONTINUE, &cont);
-		/* See the note about EEXISTs in the UFFDIO_COPY branch. */
-		if (r == -1 && errno != EEXIST) {
-			pr_info(
-				"Failed UFFDIO_CONTINUE in 0x%lx from thread %d, errno = %d\n",
-				addr, tid, errno);
-			return r;
-		}
+		TEST_ASSERT(
+			r == 0 || errno == EEXIST,
+			"Thread 0x%x failed UFFDIO_CONTINUE on hva 0x%lx, errno = %d",
+			gettid(), hva, errno);
 	} else {
 		TEST_FAIL("Invalid uffd mode %d", uffd_mode);
 	}

+	/*
+	 * The UFFDIO_COPY/CONTINUE above never wakes threads waiting on the
+	 * UFFD (it is issued with DONTWAKE, and an EEXIST-ing call doesn't
+	 * wake either), so issue the wake here on the uffd-reader path.
+	 */
+	if (!is_vcpu) {
+		struct uffdio_range range = {
+			.start = hva,
+			.len = demand_paging_size
+		};
+		r = ioctl(uffd, UFFDIO_WAKE, &range);
+		TEST_ASSERT(
+			r == 0,
+			"Thread 0x%x failed UFFDIO_WAKE on hva 0x%lx, errno = %d",
+			gettid(), hva, errno);
+	}
+
 	ts_diff = timespec_elapsed(start);
 	PER_PAGE_DEBUG("UFFD page-in %d \t%ld ns\n", tid,
 		       timespec_to_ns(ts_diff));
 	PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n",
-		       demand_paging_size, addr, tid);
+		       demand_paging_size, hva, tid);

 	return 0;
 }

+static int handle_uffd_page_request_from_uffd(
+	int uffd_mode, int uffd, struct uffd_msg *msg)
+{
+	TEST_ASSERT(
+		msg->event == UFFD_EVENT_PAGEFAULT,
+		"Received uffd message with event %d != UFFD_EVENT_PAGEFAULT",
+		msg->event);
+	return handle_uffd_page_request(
+		uffd_mode, uffd, msg->arg.pagefault.address, false);
+}
+
 struct test_params {
-	int uffd_mode;
 	bool single_uffd;
-	useconds_t uffd_delay;
 	int readers_per_uffd;
 	enum vm_mem_backing_src_type src_type;
 	bool partition_vcpu_memory_access;
+	bool memfault_exits;
 };

 static void prefault_mem(void *alias, uint64_t len)
@@ -139,12 +226,10 @@ static void prefault_mem(void *alias, uint64_t len)
 static void run_test(enum vm_guest_mode mode, void *arg)
 {
 	struct test_params *p = arg;
-	struct uffd_desc **uffd_descs = NULL;
 	struct timespec start;
 	struct timespec ts_diff;
 	struct kvm_vm *vm;
-	int i, num_uffds = 0;
-	uint64_t uffd_region_size;
+	int i;

 	vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
 				 p->src_type, p->partition_vcpu_memory_access);
@@ -156,12 +241,18 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		    "Failed to allocate buffer for guest data pattern");
 	memset(guest_data_prototype, 0xAB, demand_paging_size);

-	if (p->uffd_mode) {
+	if (uffd_mode) {
 		num_uffds = p->single_uffd ? 1 : nr_vcpus;
 		uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds;

 		uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *));
-		TEST_ASSERT(uffd_descs, "Memory allocation failed");
+		TEST_ASSERT(uffd_descs, "Failed to allocate memory for uffd descriptors");
+
+		if (p->memfault_exits) {
+			TEST_ASSERT(vm_check_cap(vm, KVM_CAP_MEM_FAULT_NOWAIT) > 0,
+				    "VM does not have KVM_CAP_MEM_FAULT_NOWAIT");
+			vm_ioctl(vm, KVM_SET_MEM_FAULT_NOWAIT, &p->memfault_exits);
+		}

 		for (i = 0; i < num_uffds; i++) {
 			struct memstress_vcpu_args *vcpu_args;
@@ -181,10 +272,10 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 			 * requests.
 			 */
 			uffd_descs[i] = uffd_setup_demand_paging(
-				p->uffd_mode, p->uffd_delay, vcpu_hva,
+				uffd_mode, uffd_delay, vcpu_hva,
 				uffd_region_size,
 				p->readers_per_uffd,
-				&handle_uffd_page_request);
+				&handle_uffd_page_request_from_uffd);
 		}
 	}

@@ -198,7 +289,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	ts_diff = timespec_elapsed(start);
 	pr_info("All vCPU threads joined\n");

-	if (p->uffd_mode) {
+	if (uffd_mode) {
 		/* Tell the user fault fd handler threads to quit */
 		for (i = 0; i < num_uffds; i++)
 			uffd_stop_demand_paging(uffd_descs[i]);
@@ -213,7 +304,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	memstress_destroy_vm(vm);

 	free(guest_data_prototype);
-	if (p->uffd_mode)
+	if (uffd_mode)
 		free(uffd_descs);
 }

@@ -222,7 +313,7 @@ static void help(char *name)
 	puts("");
 	printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n"
 	       "          [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n"
-	       "          [-s type] [-v vcpus] [-o]\n", name);
+	       "          [-w] [-s type] [-v vcpus] [-o]\n", name);
 	guest_modes_help();
 	printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n"
 	       "     UFFD registration mode: 'MISSING' or 'MINOR'.\n");
@@ -233,6 +324,7 @@ static void help(char *name)
 	       "     FD handler to simulate demand paging\n"
 	       "     overheads. Ignored without -u.\n");
 	printf(" -r: Set the number of reader threads per uffd.\n");
+	printf(" -w: Enable kvm cap for memory fault exits.\n");
 	printf(" -b: specify the size of the memory region which should be\n"
 	       "     demand paged by each vCPU. e.g. 10M or 3G.\n"
 	       "     Default: 1G\n");
@@ -252,29 +344,30 @@ int main(int argc, char *argv[])
 		.partition_vcpu_memory_access = true,
 		.readers_per_uffd = 1,
 		.single_uffd = false,
+		.memfault_exits = false,
 	};
 	int opt;

 	guest_modes_append_default();

-	while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:r:")) != -1) {
+	while ((opt = getopt(argc, argv, "ahowm:u:d:b:s:v:r:")) != -1) {
 		switch (opt) {
 		case 'm':
 			guest_modes_cmdline(optarg);
 			break;
 		case 'u':
 			if (!strcmp("MISSING", optarg))
-				p.uffd_mode = UFFDIO_REGISTER_MODE_MISSING;
+				uffd_mode = UFFDIO_REGISTER_MODE_MISSING;
 			else if (!strcmp("MINOR", optarg))
-				p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
-			TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
+				uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
+			TEST_ASSERT(uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
 			break;
 		case 'a':
 			p.single_uffd = true;
 			break;
 		case 'd':
-			p.uffd_delay = strtoul(optarg, NULL, 0);
-			TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported.");
+			uffd_delay = strtoul(optarg, NULL, 0);
+			TEST_ASSERT(uffd_delay >= 0, "A negative UFFD delay is not supported.");
 			break;
 		case 'b':
 			guest_percpu_mem_size = parse_size(optarg);
@@ -297,6 +390,9 @@ int main(int argc, char *argv[])
 			    "Invalid number of readers per uffd %d: must be >=1",
 			    p.readers_per_uffd);
 			break;
+		case 'w':
+			p.memfault_exits = true;
+			break;
 		case 'h':
 		default:
 			help(argv[0]);
@@ -304,7 +400,7 @@ int main(int argc, char *argv[])
 		}
 	}

-	if (p.uffd_mode == UFFDIO_REGISTER_MODE_MINOR &&
+	if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR &&
 	    !backing_src_is_shared(p.src_type)) {
 		TEST_FAIL("userfaultfd MINOR mode requires shared memory; pick a different -s");
 	}
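With the series applied, the new path can be exercised by passing -w to the
test. An illustrative invocation (flags per the help text above, values
chosen arbitrarily):

    ./demand_paging_test -u MINOR -s shmem -w -r 4

i.e. MINOR-mode userfaultfd over shared memory with four reader threads per
uffd, memory fault exits enabled, and vCPU accesses outside the registered
range resolved via MADV_POPULATE_WRITE.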