From patchwork Thu Feb 15 23:53:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559293
Date: Thu, 15 Feb 2024 23:53:52 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-2-amoorthy@google.com>
Subject: [PATCH v7 01/14] KVM: Clarify meaning of hva_to_pfn()'s 'atomic' parameter
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
 dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
 nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

The current description can be read as "atomic -> allowed to sleep," when
in fact the intended statement is "atomic -> NOT allowed to sleep." Make
that clearer in the docstring.

Signed-off-by: Anish Moorthy
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff588677beb7..46e7b8dbb3d8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2959,7 +2959,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 /*
  * Pin guest page in memory and return its pfn.
  * @addr: host virtual address which maps memory to the guest
- * @atomic: whether this function can sleep
+ * @atomic: whether this function is forbidden from sleeping
  * @interruptible: whether the process can be interrupted by non-fatal signals
  * @async: whether this function need to wait IO complete if the
  *         host page is not in the memory

From patchwork Thu Feb 15 23:53:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559295
Date: Thu, 15 Feb 2024 23:53:53 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-3-amoorthy@google.com>
Subject: [PATCH v7 02/14] KVM: Add function comments for __kvm_read/write_guest_page()
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
 dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
 nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

The (gfn, data, offset, len) order of parameters is a little strange
since "offset" applies to "gfn" rather than to "data". Add function
comments to make things perfectly clear.
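For illustration only (not part of the patch): a minimal userspace sketch
of the addressing the new comments describe. It assumes guest memory can
be modelled as a flat byte array and a 4096-byte page; DEMO_PAGE_SIZE,
demo_read_guest_page() and demo_write_guest_page() are hypothetical names
invented here, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed page size, for this illustration only. */
#define DEMO_PAGE_SIZE 4096u

/*
 * Hypothetical model: guest memory is a flat byte array indexed by guest
 * physical address. Copy len bytes from guest memory at
 * '(gfn * DEMO_PAGE_SIZE) + offset' to data. The offset is relative to the
 * guest page, not to the data buffer.
 */
static void demo_read_guest_page(const uint8_t *guest_mem, uint64_t gfn,
				 void *data, int offset, int len)
{
	/* The access must stay within the single page named by gfn. */
	assert(offset >= 0 && len >= 0);
	assert((uint64_t)offset + (uint64_t)len <= DEMO_PAGE_SIZE);
	memcpy(data, guest_mem + gfn * DEMO_PAGE_SIZE + offset, (size_t)len);
}

/* Copy len bytes from data into guest memory at '(gfn * DEMO_PAGE_SIZE) + offset'. */
static void demo_write_guest_page(uint8_t *guest_mem, uint64_t gfn,
				  const void *data, int offset, int len)
{
	assert(offset >= 0 && len >= 0);
	assert((uint64_t)offset + (uint64_t)len <= DEMO_PAGE_SIZE);
	memcpy(guest_mem + gfn * DEMO_PAGE_SIZE + offset, data, (size_t)len);
}
```

In both helpers "data" is always accessed from its first byte; only the
guest-side address is shifted by "offset".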

Signed-off-by: Anish Moorthy
---
 virt/kvm/kvm_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 46e7b8dbb3d8..7186d301d617 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3304,6 +3304,7 @@ static int next_segment(unsigned long len, int offset)
 		return len;
 }
 
+/* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 				 void *data, int offset, int len)
 {
@@ -3405,6 +3406,7 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, gpa_t gpa,
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 
+/* Copy @len bytes from @data into guest memory at '(@gfn * PAGE_SIZE) + @offset' */
 static int __kvm_write_guest_page(struct kvm *kvm,
 				  struct kvm_memory_slot *memslot, gfn_t gfn,
 				  const void *data, int offset, int len)

From patchwork Thu Feb 15 23:53:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559296
Date: Thu, 15 Feb 2024 23:53:54 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-4-amoorthy@google.com>
Subject: [PATCH v7 03/14] KVM: Documentation: Make note of the KVM_MEM_GUEST_MEMFD memslot flag
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
 dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
 nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

The documentation for KVM_SET_USER_MEMORY_REGION2 describes what the
flag does, but the flag itself is absent from where the other memslot
flags are listed. Add it.

Signed-off-by: Anish Moorthy
---
 Documentation/virt/kvm/api.rst | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 3ec0b7a455a0..8f75fca2294e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1352,6 +1352,7 @@ yet and must be cleared on entry.
   /* for kvm_userspace_memory_region::flags */
   #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
   #define KVM_MEM_READONLY	(1UL << 1)
+  #define KVM_MEM_GUEST_MEMFD      (1UL << 2)
 
 This ioctl allows the user to create, modify or delete a guest physical
 memory slot.  Bits 0-15 of "slot" specify the slot id and this value
@@ -1382,12 +1383,16 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical.  This allows large pages in the guest to be backed by large
 pages in the host.
 
-The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
-KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
+The flags field supports three flags
+
+1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
 writes to memory within the slot.  See KVM_GET_DIRTY_LOG ioctl to know how to
-use it.  The latter can be set, if KVM_CAP_READONLY_MEM capability allows it,
+use it.
+2. KVM_MEM_READONLY: can be set, if KVM_CAP_READONLY_MEM capability allows it,
 to make a new slot read-only.  In this case, writes to this memory will be
 posted to userspace as KVM_EXIT_MMIO exits.
+3. KVM_MEM_GUEST_MEMFD: see KVM_SET_USER_MEMORY_REGION2. This flag is
+incompatible with KVM_SET_USER_MEMORY_REGION.
 
 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest.  For example, an

From patchwork Thu Feb 15 23:53:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559297
Date: Thu, 15 Feb 2024 23:53:55 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-5-amoorthy@google.com>
Subject: [PATCH v7 04/14] KVM: Simplify error handling in __gfn_to_pfn_memslot()
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
 dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
 nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

KVM_HVA_ERR_RO_BAD satisfies kvm_is_error_hva(), so there's no need to
duplicate the "if (writable)" block. Fix this by bringing all
kvm_is_error_hva() cases under one conditional.
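For illustration only (not part of the patch): a small standalone sketch
of why the two shapes are equivalent. The DEMO_* constants, the demo_pfn
enum and the demo_*() helpers are stand-ins invented here; the values are
not the kernel's definitions, and the only property relied on is the one
stated above, namely that the read-only error value also passes the
generic error check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values chosen for this sketch, not the kernel's definitions. */
#define DEMO_HVA_ERR_BASE    0xffff800000000000ull
#define DEMO_HVA_ERR_BAD     (DEMO_HVA_ERR_BASE)
#define DEMO_HVA_ERR_RO_BAD  (DEMO_HVA_ERR_BASE + 4096)

/* Premise of the cleanup: the read-only error value is itself an error hva. */
static_assert(DEMO_HVA_ERR_RO_BAD >= DEMO_HVA_ERR_BASE,
	      "RO_BAD must satisfy the error check");

enum demo_pfn { DEMO_PFN_OK, DEMO_PFN_ERR_RO_FAULT, DEMO_PFN_NOSLOT };

static bool demo_is_error_hva(uint64_t addr)
{
	return addr >= DEMO_HVA_ERR_BASE;
}

/* Shape before the change: two blocks that each clear *writable. */
static enum demo_pfn demo_before(uint64_t addr, bool *writable)
{
	if (addr == DEMO_HVA_ERR_RO_BAD) {
		if (writable)
			*writable = false;
		return DEMO_PFN_ERR_RO_FAULT;
	}

	if (demo_is_error_hva(addr)) {
		if (writable)
			*writable = false;
		return DEMO_PFN_NOSLOT;
	}

	return DEMO_PFN_OK; /* the real function goes on to resolve the pfn */
}

/* Shape after the change: one error block, with the RO case inside it. */
static enum demo_pfn demo_after(uint64_t addr, bool *writable)
{
	if (demo_is_error_hva(addr)) {
		if (writable)
			*writable = false;

		if (addr == DEMO_HVA_ERR_RO_BAD)
			return DEMO_PFN_ERR_RO_FAULT;

		return DEMO_PFN_NOSLOT;
	}

	return DEMO_PFN_OK;
}
```

Both helpers return the same value and leave *writable in the same state
for every input, which is what allows the duplicated block to be dropped.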

Signed-off-by: Anish Moorthy
---
 virt/kvm/kvm_main.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7186d301d617..67ca580a18c5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3031,15 +3031,13 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 	if (hva)
 		*hva = addr;
 
-	if (addr == KVM_HVA_ERR_RO_BAD) {
-		if (writable)
-			*writable = false;
-		return KVM_PFN_ERR_RO_FAULT;
-	}
-
 	if (kvm_is_error_hva(addr)) {
 		if (writable)
 			*writable = false;
+
+		if (addr == KVM_HVA_ERR_RO_BAD)
+			return KVM_PFN_ERR_RO_FAULT;
+
 		return KVM_PFN_NOSLOT;
 	}
 

From patchwork Thu Feb 15 23:53:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559298
Date: Thu, 15 Feb 2024 23:53:56 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-6-amoorthy@google.com>
Subject: [PATCH v7 05/14] KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to userspace
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
 kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
 dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
 nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

kvm_prepare_memory_fault_exit() already takes parameters describing the
RWX-ness of the relevant access but doesn't actually do anything with
them. Define and use the flags necessary to pass this information on to
userspace.
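For illustration only (not part of the patch): a standalone sketch of how
the flags are selected and how a VMM might decode them. The DEMO_FLAG_*
values mirror the KVM_MEMORY_EXIT_FLAG_* bits this patch defines, and
demo_fault_flags() follows the if/else chain added to
kvm_prepare_memory_fault_exit(); demo_access_name() is a hypothetical
userspace helper, not an existing API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the bit positions proposed by this patch. */
#define DEMO_FLAG_READ     (1ULL << 0)
#define DEMO_FLAG_WRITE    (1ULL << 1)
#define DEMO_FLAG_EXEC     (1ULL << 2)
#define DEMO_FLAG_PRIVATE  (1ULL << 3)

static_assert((DEMO_FLAG_READ | DEMO_FLAG_WRITE | DEMO_FLAG_EXEC |
	       DEMO_FLAG_PRIVATE) == 0xf, "flags occupy bits 0-3");

/*
 * Same selection as the patch: exactly one of READ/WRITE/EXEC is set,
 * with write checked before exec, and PRIVATE is OR-ed in independently.
 */
static uint64_t demo_fault_flags(bool is_write, bool is_exec, bool is_private)
{
	uint64_t flags = 0;

	if (is_write)
		flags |= DEMO_FLAG_WRITE;
	else if (is_exec)
		flags |= DEMO_FLAG_EXEC;
	else
		flags |= DEMO_FLAG_READ;

	if (is_private)
		flags |= DEMO_FLAG_PRIVATE;

	return flags;
}

/* Hypothetical VMM-side helper: name the access type carried by the exit. */
static const char *demo_access_name(uint64_t flags)
{
	if (flags & DEMO_FLAG_WRITE)
		return "write";
	if (flags & DEMO_FLAG_EXEC)
		return "exec";
	if (flags & DEMO_FLAG_READ)
		return "read";
	return "unknown";
}
```

Because of the else-if ordering, an access reported as both write and
exec is communicated as a write only.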
Suggested-by: Sean Christopherson Signed-off-by: Anish Moorthy --- Documentation/virt/kvm/api.rst | 5 +++++ include/linux/kvm_host.h | 9 ++++++++- include/uapi/linux/kvm.h | 3 +++ 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 8f75fca2294e..9f5d45c49e36 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6964,6 +6964,9 @@ spec refer, https://github.com/riscv/riscv-sbi-doc. /* KVM_EXIT_MEMORY_FAULT */ struct { + #define KVM_MEMORY_EXIT_FLAG_READ (1ULL << 0) + #define KVM_MEMORY_EXIT_FLAG_WRITE (1ULL << 1) + #define KVM_MEMORY_EXIT_FLAG_EXEC (1ULL << 2) #define KVM_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3) __u64 flags; __u64 gpa; @@ -6975,6 +6978,8 @@ could not be resolved by KVM. The 'gpa' and 'size' (in bytes) describe the guest physical address range [gpa, gpa + size) of the fault. The 'flags' field describes properties of the faulting access that are likely pertinent: + - KVM_MEMORY_EXIT_FLAG_READ/WRITE/EXEC - When set, indicates that the memory + fault occurred on a read/write/exec access respectively. - KVM_MEMORY_EXIT_FLAG_PRIVATE - When set, indicates the memory fault occurred on a private memory access. When clear, indicates the fault occurred on a shared access. diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7e7fd25b09b3..32cbe5c3a9d1 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2343,8 +2343,15 @@ static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu, vcpu->run->memory_fault.gpa = gpa; vcpu->run->memory_fault.size = size; - /* RWX flags are not (yet) defined or communicated to userspace. 
*/ vcpu->run->memory_fault.flags = 0; + + if (is_write) + vcpu->run->memory_fault.flags |= KVM_MEMORY_EXIT_FLAG_WRITE; + else if (is_exec) + vcpu->run->memory_fault.flags |= KVM_MEMORY_EXIT_FLAG_EXEC; + else + vcpu->run->memory_fault.flags |= KVM_MEMORY_EXIT_FLAG_READ; + if (is_private) vcpu->run->memory_fault.flags |= KVM_MEMORY_EXIT_FLAG_PRIVATE; } diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 2190adbe3002..36a51b162a71 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -428,6 +428,9 @@ struct kvm_run { } notify; /* KVM_EXIT_MEMORY_FAULT */ struct { +#define KVM_MEMORY_EXIT_FLAG_READ (1ULL << 0) +#define KVM_MEMORY_EXIT_FLAG_WRITE (1ULL << 1) +#define KVM_MEMORY_EXIT_FLAG_EXEC (1ULL << 2) #define KVM_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3) __u64 flags; __u64 gpa; From patchwork Thu Feb 15 23:53:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559299 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 735941468F3 for ; Thu, 15 Feb 2024 23:54:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041260; cv=none; b=uVF6KSIkScKkvG0sE5MxmJJOFQojwczRm9LBTla/Py30gANd1+9UroUXyKuMQlp4EF7gVeH3ExvHWQqtIx3ut2oagRdAg13rU6XqpkFePIRupz+kDv3VSzr2lTJt/AseE67QqTQOmOOFefs7PA4va1FhziUrE5i8cFdC/fXN8c0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041260; c=relaxed/simple; bh=SiM4xWXwUgH8LHEU5tPvjdfJ67jD/ky4OB4FJ6Uv42A=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; 
Date: Thu, 15 Feb 2024 23:53:57 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
Mime-Version: 1.0
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-7-amoorthy@google.com>
Subject: [PATCH v7 06/14] KVM: Add memslot flag to let userspace force an exit on missing hva mappings
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
	dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
	nadav.amit@gmail.com, isaku.yamahata@gmail.com,
	kconsul@linux.vnet.ibm.com

Allowing KVM to fault in pages during vcpu-context guest memory accesses
can be undesirable: during userfaultfd-based postcopy, it can cause
significant performance issues due to vCPUs contending for
userfaultfd-internal locks.

Add a new memslot flag (KVM_MEM_EXIT_ON_MISSING) through which userspace
can indicate that KVM_RUN should exit instead of faulting in pages
during vcpu-context guest memory accesses. The unfaulted pages are
reported by the accompanying KVM_EXIT_MEMORY_FAULT, allowing userspace
to determine and take appropriate action.

The basic implementation strategy is to check the memslot flag from
within __gfn_to_pfn_memslot() and override the caller-provided arguments
accordingly. Some callers (such as kvm_vcpu_map()) must be able to opt
out of this behavior, and do so by passing can_exit_on_missing=false.

No functional change intended: nothing sets KVM_MEM_EXIT_ON_MISSING or
passes can_exit_on_missing=true to __gfn_to_pfn_memslot().

Suggested-by: James Houghton
Suggested-by: Sean Christopherson
Signed-off-by: Anish Moorthy
---
 Documentation/virt/kvm/api.rst         | 23 +++++++++++++++++-
 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 |  4 ++--
 include/linux/kvm_host.h               | 12 +++++++++-
 include/uapi/linux/kvm.h               |  2 ++
 virt/kvm/Kconfig                       |  3 +++
 virt/kvm/kvm_main.c                    | 32 ++++++++++++++++++++++----
 9 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9f5d45c49e36..bf7bc21d56ac 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1353,6 +1353,7 @@ yet and must be cleared on entry.
   #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
   #define KVM_MEM_READONLY (1UL << 1)
   #define KVM_MEM_GUEST_MEMFD (1UL << 2)
+  #define KVM_MEM_EXIT_ON_MISSING (1UL << 3)
 
 This ioctl allows the user to create, modify or delete a guest physical
 memory slot. Bits 0-15 of "slot" specify the slot id and this value
@@ -1383,7 +1384,7 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
 be identical. This allows large pages in the guest to be backed by large
 pages in the host.
 
-The flags field supports three flags
+The flags field supports four flags
 
 1. KVM_MEM_LOG_DIRTY_PAGES: can be set to instruct KVM to keep track of
 writes to memory within the slot. See KVM_GET_DIRTY_LOG ioctl to know how to
@@ -1393,6 +1394,7 @@ to make a new slot read-only. In this case, writes to this memory will be
 posted to userspace as KVM_EXIT_MMIO exits.
 3. KVM_MEM_GUEST_MEMFD: see KVM_SET_USER_MEMORY_REGION2. This flag is
 incompatible with KVM_SET_USER_MEMORY_REGION.
+4. KVM_MEM_EXIT_ON_MISSING: see KVM_CAP_EXIT_ON_MISSING for details.
 
 When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
 the memory region are automatically reflected into the guest. For example, an
@@ -1408,6 +1410,9 @@ Instead, an abort (data abort if the cause of the page-table update
 was a load or a store, instruction abort if it was an instruction
 fetch) is injected in the guest.
 
+Note: KVM_MEM_READONLY and KVM_MEM_EXIT_ON_MISSING are currently mutually
+exclusive.
+
 4.36 KVM_SET_TSS_ADDR
 ---------------------
 
@@ -8044,6 +8049,22 @@ error/annotated fault.
 
 See KVM_EXIT_MEMORY_FAULT for more information.
 
+7.35 KVM_CAP_EXIT_ON_MISSING
+----------------------------
+
+:Architectures: None
+:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
+
+The presence of this capability indicates that userspace may set the
+KVM_MEM_EXIT_ON_MISSING flag on memslots. Said flag will cause KVM_RUN to fail
+(-EFAULT) in response to guest-context memory accesses which would require KVM
+to page fault on the userspace mapping.
+
+The range of guest physical memory causing the fault is advertised to userspace
+through KVM_CAP_MEMORY_FAULT_INFO. Userspace should take appropriate action.
+This could mean, for instance, checking that the fault is resolvable, faulting
+in the relevant userspace mapping, then retrying KVM_RUN.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d14504821b79..dfe0cbb5937c 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1487,7 +1487,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmap_read_unlock(current->mm);
 
 	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-				   write_fault, &writable, NULL);
+				   write_fault, &writable, false, NULL);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 2b1f0cdd8c18..31ebfe4fe8e1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -614,7 +614,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu,
 	} else {
 		/* Call KVM generic code to do the slow-path check */
 		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-					   writing, &write_ok, NULL);
+					   writing, &write_ok, false, NULL);
 		if (is_error_noslot_pfn(pfn))
 			return -EFAULT;
 		page = NULL;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 4a1abb9f7c05..03b0f1c4a0d8 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -853,7 +853,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 
 		/* Call KVM generic code to do the slow-path check */
 		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
-					   writing, upgrade_p, NULL);
+					   writing, upgrade_p, false, NULL);
 		if (is_error_noslot_pfn(pfn))
 			return -EFAULT;
 		page = NULL;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2d6cdeab1f8a..b89a9518f6de 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4371,7 +4371,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	async = false;
 	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
 					  fault->write, &fault->map_writable,
-					  &fault->hva);
+					  false, &fault->hva);
 	if (!async)
 		return RET_PF_CONTINUE; /* *pfn has correct page already */
 
@@ -4393,7 +4393,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 */
 	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, true, NULL,
 					  fault->write, &fault->map_writable,
-					  &fault->hva);
+					  false, &fault->hva);
 	return RET_PF_CONTINUE;
 }
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 32cbe5c3a9d1..210e07c4c2eb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1216,7 +1216,8 @@ kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn);
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn);
 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 			       bool atomic, bool interruptible, bool *async,
-			       bool write_fault, bool *writable, hva_t *hva);
+			       bool write_fault, bool *writable,
+			       bool can_exit_on_missing, hva_t *hva);
 
 void kvm_release_pfn_clean(kvm_pfn_t pfn);
 void kvm_release_pfn_dirty(kvm_pfn_t pfn);
@@ -2394,4 +2395,13 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 }
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
+/*
+ * Whether vCPUs should exit upon trying to access memory for which the
+ * userspace mappings are missing.
+ */
+static inline bool kvm_is_slot_exit_on_missing(const struct kvm_memory_slot *slot)
+{
+	return slot && slot->flags & KVM_MEM_EXIT_ON_MISSING;
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 36a51b162a71..e9f33ae93dee 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -51,6 +51,7 @@ struct kvm_userspace_memory_region2 {
 #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
 #define KVM_MEM_READONLY (1UL << 1)
 #define KVM_MEM_GUEST_MEMFD (1UL << 2)
+#define KVM_MEM_EXIT_ON_MISSING (1UL << 3)
 
 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {
@@ -920,6 +921,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_MEMORY_ATTRIBUTES 233
 #define KVM_CAP_GUEST_MEMFD 234
 #define KVM_CAP_VM_TYPES 235
+#define KVM_CAP_EXIT_ON_MISSING 236
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 29b73eedfe74..c7bdde127af4 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -109,3 +109,6 @@ config KVM_GENERIC_PRIVATE_MEM
        select KVM_GENERIC_MEMORY_ATTRIBUTES
        select KVM_PRIVATE_MEM
        bool
+
+config HAVE_KVM_EXIT_ON_MISSING
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 67ca580a18c5..469b99898be8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1600,7 +1600,7 @@ static void kvm_replace_memslot(struct kvm *kvm,
  * only allows these.
  */
 #define KVM_SET_USER_MEMORY_REGION_V1_FLAGS \
-	(KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY)
+	(KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY | KVM_MEM_EXIT_ON_MISSING)
 
 static int check_memory_region_flags(struct kvm *kvm,
 				     const struct kvm_userspace_memory_region2 *mem)
@@ -1618,8 +1618,14 @@ static int check_memory_region_flags(struct kvm *kvm,
 	valid_flags |= KVM_MEM_READONLY;
 #endif
 
+	if (IS_ENABLED(CONFIG_HAVE_KVM_EXIT_ON_MISSING))
+		valid_flags |= KVM_MEM_EXIT_ON_MISSING;
+
 	if (mem->flags & ~valid_flags)
 		return -EINVAL;
+	else if ((mem->flags & KVM_MEM_READONLY) &&
+		 (mem->flags & KVM_MEM_EXIT_ON_MISSING))
+		return -EINVAL;
 
 	return 0;
 }
@@ -3024,7 +3030,8 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
 
 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 			       bool atomic, bool interruptible, bool *async,
-			       bool write_fault, bool *writable, hva_t *hva)
+			       bool write_fault, bool *writable,
+			       bool can_exit_on_missing, hva_t *hva)
 {
 	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);
 
@@ -3047,6 +3054,19 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 		writable = NULL;
 	}
 
+	/* When the slot is exit-on-missing (and when we should respect that)
+	 * set atomic=true to prevent GUP from faulting in the userspace
+	 * mappings.
+	 */
+	if (!atomic && can_exit_on_missing &&
+	    kvm_is_slot_exit_on_missing(slot)) {
+		atomic = true;
+		if (async) {
+			*async = false;
+			async = NULL;
+		}
+	}
+
 	return hva_to_pfn(addr, atomic, interruptible, async, write_fault,
 			  writable);
 }
@@ -3056,21 +3076,21 @@ kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable)
 {
 	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, false,
-				    NULL, write_fault, writable, NULL);
+				    NULL, write_fault, writable, false, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
 
 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, false, false, NULL, true,
-				    NULL, NULL);
+				    NULL, false, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, true, false, NULL, true,
-				    NULL, NULL);
+				    NULL, false, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
 
@@ -4877,6 +4897,8 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
 #endif
+	case KVM_CAP_EXIT_ON_MISSING:
+		return IS_ENABLED(CONFIG_HAVE_KVM_EXIT_ON_MISSING);
 	default:
 		break;
 	}

From patchwork Thu Feb 15 23:53:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559300
Date: Thu, 15 Feb 2024 23:53:58 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
Mime-Version: 1.0
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-8-amoorthy@google.com>
Subject: [PATCH v7 07/14] KVM: x86: Enable KVM_CAP_EXIT_ON_MISSING and annotate EFAULTs from stage-2 fault handler
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
	dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
	nadav.amit@gmail.com, isaku.yamahata@gmail.com,
	kconsul@linux.vnet.ibm.com

Prevent the stage-2 fault handler from faulting in pages when
KVM_MEM_EXIT_ON_MISSING is set by allowing its __gfn_to_pfn_memslot()
calls to check the memslot flag.

To actually make that behavior useful, prepare a KVM_EXIT_MEMORY_FAULT
when the stage-2 handler returns EFAULT, e.g. when it cannot resolve the
pfn. With KVM_MEM_EXIT_ON_MISSING enabled, this effects the delivery of
stage-2 faults as vCPU exits, which userspace can attempt to resolve
without terminating the guest.

Delivering stage-2 faults to userspace in this way sidesteps the
significant scalability issues associated with using userfaultfd for the
same purpose.

Signed-off-by: Anish Moorthy
---
 Documentation/virt/kvm/api.rst | 2 +-
 arch/x86/kvm/Kconfig           | 1 +
 arch/x86/kvm/mmu/mmu.c         | 8 ++++++--
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index bf7bc21d56ac..d52757f9e1cb 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8052,7 +8052,7 @@ See KVM_EXIT_MEMORY_FAULT for more information.
 7.35 KVM_CAP_EXIT_ON_MISSING
 ----------------------------
 
-:Architectures: None
+:Architectures: x86
 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
 
 The presence of this capability indicates that userspace may set the
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d43efae05794..09224e306abf 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -44,6 +44,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
+	select HAVE_KVM_EXIT_ON_MISSING
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions. You will need a fairly recent
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b89a9518f6de..26388e4f42df 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3305,6 +3305,10 @@ static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fa
 		return RET_PF_RETRY;
 	}
 
+	WARN_ON_ONCE(fault->goal_level != PG_LEVEL_4K);
+
+	kvm_prepare_memory_fault_exit(vcpu, gfn_to_gpa(fault->gfn), PAGE_SIZE,
+				      fault->write, fault->exec, fault->is_private);
 	return -EFAULT;
 }
 
@@ -4371,7 +4375,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	async = false;
 	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
 					  fault->write, &fault->map_writable,
-					  false, &fault->hva);
+					  true, &fault->hva);
 	if (!async)
 		return RET_PF_CONTINUE; /* *pfn has correct page already */
 
@@ -4393,7 +4397,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 */
 	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, true, NULL,
 					  fault->write, &fault->map_writable,
-					  false, &fault->hva);
+					  true, &fault->hva);
 	return RET_PF_CONTINUE;
 }
 

From patchwork Thu Feb 15 23:53:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559301
Date: Thu, 15 Feb 2024 23:53:59 +0000
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
Mime-Version: 1.0
References: <20240215235405.368539-1-amoorthy@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Message-ID: <20240215235405.368539-9-amoorthy@google.com>
Subject: [PATCH v7 08/14] KVM: arm64: Enable KVM_CAP_MEMORY_FAULT_INFO and annotate fault in the stage-2 fault handler
From: Anish Moorthy
To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com,
	dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com,
	nadav.amit@gmail.com, isaku.yamahata@gmail.com,
	kconsul@linux.vnet.ibm.com

At the moment the only intended use case for KVM_CAP_MEMORY_FAULT_INFO
on arm64 is to annotate EFAULTs from the stage-2 fault handler, so add
that annotation now.

Signed-off-by: Anish Moorthy
---
 Documentation/virt/kvm/api.rst | 2 +-
 arch/arm64/kvm/arm.c           | 1 +
 arch/arm64/kvm/mmu.c           | 5 ++++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index d52757f9e1cb..7012f40332b3 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8031,7 +8031,7 @@ unavailable to host or other VMs.
 7.34 KVM_CAP_MEMORY_FAULT_INFO
 ------------------------------
 
-:Architectures: x86
+:Architectures: x86, arm64
 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
 
 The presence of this capability indicates that KVM_RUN will fill
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a25265aca432..ca4617f53250 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -240,6 +240,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_SYSTEM_SUSPEND:
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_COUNTER_OFFSET:
+	case KVM_CAP_MEMORY_FAULT_INFO:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index dfe0cbb5937c..5b740ddfcc8e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1492,8 +1492,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
 	}
-	if (is_error_noslot_pfn(pfn))
+	if (is_error_noslot_pfn(pfn)) {
+		kvm_prepare_memory_fault_exit(vcpu, gfn * PAGE_SIZE, PAGE_SIZE,
+					      write_fault, exec_fault, false);
 		return -EFAULT;
+	}
 
 	if (kvm_is_device_pfn(pfn)) {
 		/*

From patchwork Thu Feb 15 23:54:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anish Moorthy
X-Patchwork-Id: 13559302
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=a2S+MA0gE/d7KfEW/MnTZniPFqQuhRYWD821pFFz3JE=; b=NURQsrW3eOms4ZDS4jYXgFva3Z1rTisxJ3VEczzy038YLln0o3G3RCC7in0QIbBgrv +Ur9UC0mAC+HIGr0P4c4Pqc+JlM2QT6snq6f+aUpODs8Au5ya+Lo6sBQ/ND8VIkmVg1+ WwrWi2ArHDg261jYJfeLAtxHsvCDnOWNxLbgloM0iAIKrXN0f8uqEnzV1gsZuN/jGJvR yfOK8kaEzevH2YW3fW8HhvxP6Hm6VHSCf4WhCYlPC1qAErLqjY7QHdyn7Cr6FYgW059X 57P0eq2AxLeXFKksfx4KvCepvv5n6UJ0iaSfzse1qEufh+0FOTs3cvOM6O9PkwKRNjTs KkAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041261; x=1708646061; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=a2S+MA0gE/d7KfEW/MnTZniPFqQuhRYWD821pFFz3JE=; b=h190taYnOnBhEpJitZOLH79rE/c6WY1rRFpFztI+RLML1esAo+Wa9Yuw41TdQBdWj4 ukOAwy6wbrj7IHTh4gybr13ovXvbr46ar3LqVAYlJC3wJr2TGUqy/9AoU7But+5rA2xM k3fWcDyG/rEDgD6di2RE3uGF3vymvc/Vj0TB5QTEKLg0M/T65pyp4hVkfQhmSpJJWSFB r7d92i+uB3T+/3VcItGn80x+zfTNnzUxXwY1tkYo+HVjeIXLmpFwB1+/H1hF8iyDS7vy 6BX5ENWKl4O2Pz6fUHSEdhrcFgNV8ono7iOUyvBpehZTbox/VjUZZ/pPkpnbjLsB28+4 mhCw== X-Forwarded-Encrypted: i=1; AJvYcCXfKA4vJNAn4fqtz9ASbcxoS3Keq0IUFe8POqeIELCI1JVGHo/rUqoTAdnm737p2x7a3wRQ2CHGkPh/skN50cObU1sc X-Gm-Message-State: AOJu0Yyh/JbUZdLeBuhZdMULYJg9tNadzTmncp0yEVoCA1zuzjDWZK+N 5r2tmQq/6mIIYlmImvSG1FZb+SIjBxbTav2Y+ZgRCzI4z5dbV7bABHDhWoCIceDvMr4p/5AVc6h p5BT3BQgV5A== X-Google-Smtp-Source: AGHT+IGz7VwdL3z2Ro5FaEmODCGgndNM7ib0sMxb1cnlGqMmU2k8rLAE706Rj6Rrb94BZLFTt+/xxEEKwv9eHA== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a05:6902:110a:b0:dcb:fb69:eadc with SMTP id o10-20020a056902110a00b00dcbfb69eadcmr129670ybu.6.1708041260877; Thu, 15 Feb 2024 15:54:20 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:00 +0000 In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog Message-ID: <20240215235405.368539-10-amoorthy@google.com> Subject: [PATCH v7 09/14] KVM: arm64: Implement and advertise KVM_CAP_EXIT_ON_MISSING From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Prevent the stage-2 fault handler from faulting in pages when KVM_MEM_EXIT_ON_MISSING is set by allowing its __gfn_to_pfn_memslot() call to check the memslot flag. This effects the delivery of stage-2 faults as vCPU exits (see KVM_CAP_MEMORY_FAULT_INFO), which userspace can attempt to resolve without terminating the guest. Delivering stage-2 faults to userspace in this way sidesteps the significant scalability issues associated with using userfaultfd for the same purpose. Signed-off-by: Anish Moorthy --- Documentation/virt/kvm/api.rst | 2 +- arch/arm64/kvm/Kconfig | 1 + arch/arm64/kvm/mmu.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 7012f40332b3..01b762272b6f 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -8052,7 +8052,7 @@ See KVM_EXIT_MEMORY_FAULT for more information. 7.35 KVM_CAP_EXIT_ON_MISSING ---------------------------- -:Architectures: x86 +:Architectures: x86, arm64 :Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
The presence of this capability indicates that userspace may set the diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 01398d2996c7..309d8e7ebc1c 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -39,6 +39,7 @@ menuconfig KVM select SCHED_INFO select GUEST_PERF_EVENTS if PERF_EVENTS select XARRAY_MULTI + select HAVE_KVM_EXIT_ON_MISSING help Support hosting virtualized guest machines. diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 5b740ddfcc8e..b0f1fef0a52c 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1487,7 +1487,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, mmap_read_unlock(current->mm); pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, - write_fault, &writable, false, NULL); + write_fault, &writable, true, NULL); if (pfn == KVM_PFN_ERR_HWPOISON) { kvm_send_hwpoison_signal(hva, vma_shift); return 0; From patchwork Thu Feb 15 23:54:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559303 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBBFF145B29 for ; Thu, 15 Feb 2024 23:54:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041264; cv=none; b=nnF7odJgGPOdQKYRv7ywysZVbEFcW5U/ytQ3GXrtLhmIjAJNbwJa+47YQgHsFqAX7503jzHsrfxJjBHHzhwXrc2iQaskERod3E3pkMHQmFf2R7erLN94Bbb8b95eVHYyKJTa7M7T4A2meQ/Mh4djwCjV3JxLP94+cLfXggh/rtk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041264; c=relaxed/simple; bh=LU+P22WMx8+l0j28eDp7TZSAb4K8ddezaIfE2HB5OjY=; 
h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=AnaCtT2hIeG+M7gkeKkQK8fmefysRR+EJcif3pbzCszhvC+RfCCC4qItU+og4wEE3/szqkPb8WgXrfnpyaG+hBO8f6nQHrRETMMxmDznRFXuhLZ2jhCexUnKR758Z1uG+YQL4eD5C83Eu5Q2sB5lA7+VJ8t/8eg79coVc+EDuUg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=2crmDHJK; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="2crmDHJK" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-60790eb0f8fso23539767b3.1 for ; Thu, 15 Feb 2024 15:54:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708041262; x=1708646062; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=wxgMpdjPiqAl5L5CrBln707OMjyjZ2YsrvRZmT9MQFg=; b=2crmDHJKWsgNhKgKFHi2ltTJ5HFEOP60o1uHn7f/iTC4o8FtUIHsNVM06IY5/4TklZ cp+AWw5+AcVokEHbos5oEGzvt+GBmVhMNSBvynLqxo6yhVPDzUkOhmF3pESphhwvTqUj 8/05Q9wuBKVTwjoASedOC8t3jLuXD5RTUc4CrRjCLQHyH/pP/aXQZAyWAMGEuah5Uak5 I4+u3AGUlXy2NPdQnYIqhJTptdOvsJxtEdX5II7vxbJIhS+F3oUjKz9xt//Cw5PQeIDq n1e3s/sHpiiY+ixSvJVxFG5wNUbF83nM2Eh0kmL+MozE4aRoZXXKHGsAVWyBWZS4adwT 2ntQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041262; x=1708646062; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=wxgMpdjPiqAl5L5CrBln707OMjyjZ2YsrvRZmT9MQFg=; 
b=QfqPtWMOaMHgeSS5eCHULDNjNW6jqsRdWVqVBkwqGSsx0eiKFWVQwBmD5UPoWlO3WY qWjO2recUELqFWwI6a2MR/L+by+QarYHCG2Tcv4D2mf+LxrMpcPdSrnG32EsLITcyO0q 7m4WW20VSqLZCjmUU4U6yyPmize8Fa5LQDbbXQb0wlw+dpBQayC6gM3duxT27wcwPAA9 NG0pFZa6K8ve+rcRBc9ryVghDdAaCMKX1CPu+oJ0hV7clyR9i28EZvLZ83fkom6ykUjY P8TF9FBJ8B1wBXMNEZCtk+fSOG9QBD9lZsrp2fdgJU0NS9yTMzpVMDJrxBL3fEZO5C7K jC7A== X-Forwarded-Encrypted: i=1; AJvYcCXUCuN1+aT30VtxJ4VvxiOHhz0UUEmuPF35jrneXB/1jtiqxsQ0bZ3r6lT4vfzo2Q2GTqEzqEcgG0g+7X2ftkyu6t1+ X-Gm-Message-State: AOJu0YzrqrN7vyOch510P+/0LZ2memFUwYRh1wbRZNC1/IpGKIzYzkEH 6rZ2BxBuQ4rlsyPaTQZyby6bpMJwFJ9D47duaNxE6U+J90QKLL7S41OZM5MWldDIVrUC2Q9TG2s z+Wwt60AaPg== X-Google-Smtp-Source: AGHT+IGxwZE2D4TMTbojQuxIQsdHxi4CLt7P+BDw6owECL4gA7IpetyfAOdk5HtQUv5KTlzPChmWvxiHs6Ovkg== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a81:4c04:0:b0:607:cd22:1f32 with SMTP id z4-20020a814c04000000b00607cd221f32mr774161ywa.0.1708041261933; Thu, 15 Feb 2024 15:54:21 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:01 +0000 In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog Message-ID: <20240215235405.368539-11-amoorthy@google.com> Subject: [PATCH v7 10/14] KVM: selftests: Report per-vcpu demand paging rate from demand paging test From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Using the overall demand paging rate to measure performance can be slightly misleading when vCPU accesses are not overlapped. 
Adding more vCPUs will (usually) increase the overall demand paging rate even if performance remains constant or even degrades on a per-vcpu basis. As such, it makes sense to report both the total and per-vcpu paging rates. Signed-off-by: Anish Moorthy --- tools/testing/selftests/kvm/demand_paging_test.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 09c116a82a84..6dc823fa933a 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -135,6 +135,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct timespec ts_diff; struct kvm_vm *vm; int i; + double vcpu_paging_rate; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, p->src_type, p->partition_vcpu_memory_access); @@ -191,11 +192,17 @@ static void run_test(enum vm_guest_mode mode, void *arg) uffd_stop_demand_paging(uffd_descs[i]); } - pr_info("Total guest execution time: %ld.%.9lds\n", + pr_info("Total guest execution time:\t%ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); - pr_info("Overall demand paging rate: %f pgs/sec\n", - memstress_args.vcpu_args[0].pages * nr_vcpus / - ((double)ts_diff.tv_sec + (double)ts_diff.tv_nsec / NSEC_PER_SEC)); + + vcpu_paging_rate = + memstress_args.vcpu_args[0].pages + / ((double)ts_diff.tv_sec + + (double)ts_diff.tv_nsec / NSEC_PER_SEC); + pr_info("Per-vcpu demand paging rate:\t%f pgs/sec/vcpu\n", + vcpu_paging_rate); + pr_info("Overall demand paging rate:\t%f pgs/sec\n", + vcpu_paging_rate * nr_vcpus); memstress_destroy_vm(vm); From patchwork Thu Feb 15 23:54:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559306 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3FB11474CB for ; Thu, 15 Feb 2024 23:54:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041268; cv=none; b=TTgFaM+cyZ7PDJt/7yEZ6nj9OewLG6YQoAG5HK0TgedlTggoxVHUUuaT+WcdiVsUpsHy7DO2HU6186PFVndKJZxpDKQLe4OzvnELMYYciJfknlJBGxvzR4ZN32xOgFqXF82j01TgiMxtzLn5QLHm5d25pGImlQOY+J6+C908RvA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041268; c=relaxed/simple; bh=AnPqRYIC1QnoGXTWm3rmcWk+De5eqFNuKVfi0J28esc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ProBUMyH8H3998ry9gHOrMQgOIq9nGGx6rUTiFhUIzEhBNJnHsChJSqesTEL2nOwWhak6NAqMc8WkTK2tg2YQztWhRYWY1B0bmwaCrNlnmo4vObNKy62gevsd2GrKSkDauQJT/Qs2Cxg29KSxhYNqrtwn9zFyBG+h2hKX7uhR4I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=aesvWqQM; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="aesvWqQM" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-dcc15b03287so2044037276.3 for ; Thu, 15 Feb 2024 15:54:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708041263; x=1708646063; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; 
bh=V33K/kUOR8QF90cj78Or2u3/ENRnBAVw9m4e0yn9pfo=; b=aesvWqQMvgGnbOIN1dnGXf46AuHlgtqanuhd+vXcN/K2wsAvfklIbn3hsf69zrHeF6 l9gNNIZqIQyyQ9f5ln60aBW5ZkJ8Kl8pHn3uY7A0673OJxpJ4RwLB40KALMrNRYg63HU bTSJRrTanzS2khCNoReDqj8Zn2MC6T/g0kdFVfW97FWWPL47W4jiKbFyBsoEQyN35G17 xEfYMg2AG8kUQj1dy3734foQ2SA1bwxNvfzRlT/DU1YGgEPHVZ0gljKxCswbnvNznz3R QAse1zv5UOS2MX8DeWzKMF49iwkDUMSeBntaUw5jJ1oAYCEV40nh6bwvEhr7DZy5Lbeq XEyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041263; x=1708646063; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=V33K/kUOR8QF90cj78Or2u3/ENRnBAVw9m4e0yn9pfo=; b=hTKXFAETuT/vLiWWE8LPEGfAcJt97u9ivP9KJz4rRlBa7B5Z+bR+S7VsrwOtOxjnUn vQQvH1STD7JjFbMhp+g14iExk4eqcT0g7zzbzJBnSLiYnDRqQdtBSwfK2LA3cW8cZRFY g4NzmsMFZdtp35R1y4ocp77R9SwNq6HPXNi9Xb+92EOf7rUnELmE4ly3xYQEoDd10C56 jkinkRxRkiu0SPyLWLK8v1l8236Rr65/zUy9NBaMxd66j4A0BgLpX/phAYnkw1qf150W Nr7ptqMkuXfDTVhR1E3JvAqYHgNBuEN30baLtHX5ktT/sqVWT3XA6EDCx6el6rmNJkZ4 MteA== X-Forwarded-Encrypted: i=1; AJvYcCXg3xlLaIYZAZeZtpCbOWU7cvHLZNaikWCurHg8lGDayUQJOHufXnOT1bs7+fl4JMglx0bJyf8nZDSBGGkHuW/4bAJd X-Gm-Message-State: AOJu0YwxArOVCJZZHEIrjYwksU1M6g8X/kiy+QkzSEvNLGgcsZ997Fmj mbzy38OOOevoWufzEO/VEp7PlGwCBP1Az/ApCAeKoQfxrRRxjOXEfRtGUZ4p6APmxu/PJD04lWv UGDW+pNJWFw== X-Google-Smtp-Source: AGHT+IG1uwtqUD5sLGbojdTBDV4ey1ZLfXEvNLIVrSuykiZmXMD/5Lba0NjgEZ/yxuxIxTECeeSfwijsGH9ZRQ== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a05:6902:1b85:b0:dc2:26f6:fbc8 with SMTP id ei5-20020a0569021b8500b00dc226f6fbc8mr129460ybb.7.1708041263068; Thu, 15 Feb 2024 15:54:23 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:02 +0000 In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: 
git-send-email 2.44.0.rc0.258.g7320e95886-goog Message-ID: <20240215235405.368539-12-amoorthy@google.com> Subject: [PATCH v7 11/14] KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand paging test From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com At the moment, demand_paging_test does not support profiling/testing multiple vCPU threads concurrently faulting on a single uffd because (a) "-u" (run test in userfaultfd mode) creates a uffd for each vCPU's region, so that each uffd services a single vCPU thread. (b) "-u -o" (userfaultfd mode + overlapped vCPU memory accesses) simply doesn't work: the test tries to register the same memory to multiple uffds, causing an error. Add support for many vcpus per uffd by (1) Keeping "-u" behavior unchanged. (2) Making "-u -a" create a single uffd for all of guest memory. (3) Making "-u -o" implicitly pass "-a", solving the problem in (b). In cases (2) and (3) all vCPU threads fault on a single uffd. With potentially multiple vCPUs per UFFD, it makes sense to allow configuring the number of reader threads per UFFD as well: add the "-r" flag to do so. 
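For illustration only, the flag semantics above can be sketched roughly as follows. This is a minimal sketch rather than the selftest code itself: the struct and helper names (uffd_layout_opts, layout_num_uffds, layout_region_size) are hypothetical, and it assumes every vCPU region has the same size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the command-line options discussed above. */
struct uffd_layout_opts {
	bool uffd_enabled;	/* "-u" was passed */
	bool single_uffd;	/* "-a" was passed */
	bool overlap_accesses;	/* "-o" was passed */
};

/* Number of uffds that would be created for nr_vcpus vCPUs. */
static int layout_num_uffds(const struct uffd_layout_opts *o, int nr_vcpus)
{
	assert(nr_vcpus > 0);

	if (!o->uffd_enabled)
		return 0;
	/* "-o" implies "-a": overlapping memory cannot be split across uffds. */
	if (o->single_uffd || o->overlap_accesses)
		return 1;
	return nr_vcpus;
}

/* Bytes of guest memory registered with each uffd (0 without "-u"). */
static uint64_t layout_region_size(const struct uffd_layout_opts *o,
				   int nr_vcpus, uint64_t percpu_mem_size)
{
	int n = layout_num_uffds(o, nr_vcpus);

	return n ? (uint64_t)nr_vcpus * percpu_mem_size / n : 0;
}
```

With "-u" alone and 4 vCPUs this gives 4 uffds, each covering one vCPU's region; adding "-a" or "-o" gives a single uffd covering all four regions.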
Signed-off-by: Anish Moorthy Acked-by: James Houghton --- .../selftests/kvm/aarch64/page_fault_test.c | 4 +- .../selftests/kvm/demand_paging_test.c | 76 +++++++++++++--- .../selftests/kvm/include/userfaultfd_util.h | 17 +++- .../selftests/kvm/lib/userfaultfd_util.c | 87 +++++++++++++------ 4 files changed, 137 insertions(+), 47 deletions(-) diff --git a/tools/testing/selftests/kvm/aarch64/page_fault_test.c b/tools/testing/selftests/kvm/aarch64/page_fault_test.c index 08a5ca5bed56..dad1fb338f36 100644 --- a/tools/testing/selftests/kvm/aarch64/page_fault_test.c +++ b/tools/testing/selftests/kvm/aarch64/page_fault_test.c @@ -375,14 +375,14 @@ static void setup_uffd(struct kvm_vm *vm, struct test_params *p, *pt_uffd = uffd_setup_demand_paging(uffd_mode, 0, pt_args.hva, pt_args.paging_size, - test->uffd_pt_handler); + 1, test->uffd_pt_handler); *data_uffd = NULL; if (test->uffd_data_handler) *data_uffd = uffd_setup_demand_paging(uffd_mode, 0, data_args.hva, data_args.paging_size, - test->uffd_data_handler); + 1, test->uffd_data_handler); } static void free_uffd(struct test_desc *test, struct uffd_desc *pt_uffd, diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 6dc823fa933a..f7897a951f90 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -77,8 +77,20 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, copy.mode = 0; r = ioctl(uffd, UFFDIO_COPY, &copy); - if (r == -1) { - pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d with errno: %d\n", + /* + * With multiple vCPU threads fault on a single page and there are + * multiple readers for the UFFD, at least one of the UFFDIO_COPYs + * will fail with EEXIST: handle that case without signaling an + * error. + * + * Note that this also suppress any EEXISTs occurring from, + * e.g., the first UFFDIO_COPY/CONTINUEs on a page.
That never + * happens here, but a realistic VMM might potentially maintain + * some external state to correctly surface EEXISTs to userspace + * (or prevent duplicate COPY/CONTINUEs in the first place). + */ + if (r == -1 && errno != EEXIST) { + pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n", addr, tid, errno); return r; } @@ -89,8 +101,20 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, cont.range.len = demand_paging_size; r = ioctl(uffd, UFFDIO_CONTINUE, &cont); - if (r == -1) { - pr_info("Failed UFFDIO_CONTINUE in 0x%lx from thread %d with errno: %d\n", + /* + * With multiple vCPU threads fault on a single page and there are + * multiple readers for the UFFD, at least one of the UFFDIO_COPYs + * will fail with EEXIST: handle that case without signaling an + * error. + * + * Note that this also suppress any EEXISTs occurring from, + * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never + * happens here, but a realistic VMM might potentially maintain + * some external state to correctly surface EEXISTs to userspace + * (or prevent duplicate COPY/CONTINUEs in the first place). 
+ */ + if (r == -1 && errno != EEXIST) { + pr_info("Failed UFFDIO_CONTINUE in 0x%lx, thread %d, errno = %d\n", addr, tid, errno); return r; } @@ -110,7 +134,9 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, struct test_params { int uffd_mode; + bool single_uffd; useconds_t uffd_delay; + int readers_per_uffd; enum vm_mem_backing_src_type src_type; bool partition_vcpu_memory_access; }; @@ -134,8 +160,9 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct timespec start; struct timespec ts_diff; struct kvm_vm *vm; - int i; + int i, num_uffds = 0; double vcpu_paging_rate; + uint64_t uffd_region_size; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, p->src_type, p->partition_vcpu_memory_access); @@ -148,7 +175,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) memset(guest_data_prototype, 0xAB, demand_paging_size); if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { - for (i = 0; i < nr_vcpus; i++) { + num_uffds = p->single_uffd ? 1 : nr_vcpus; + for (i = 0; i < num_uffds; i++) { vcpu_args = &memstress_args.vcpu_args[i]; prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa), vcpu_args->pages * memstress_args.guest_page_size); @@ -156,9 +184,13 @@ static void run_test(enum vm_guest_mode mode, void *arg) } if (p->uffd_mode) { - uffd_descs = malloc(nr_vcpus * sizeof(struct uffd_desc *)); + num_uffds = p->single_uffd ? 
1 : nr_vcpus; + uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds; + + uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *)); TEST_ASSERT(uffd_descs, "Memory allocation failed"); - for (i = 0; i < nr_vcpus; i++) { + for (i = 0; i < num_uffds; i++) { + struct memstress_vcpu_args *vcpu_args; void *vcpu_hva; vcpu_args = &memstress_args.vcpu_args[i]; @@ -171,7 +203,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) */ uffd_descs[i] = uffd_setup_demand_paging( p->uffd_mode, p->uffd_delay, vcpu_hva, - vcpu_args->pages * memstress_args.guest_page_size, + uffd_region_size, + p->readers_per_uffd, &handle_uffd_page_request); } } @@ -188,7 +221,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) if (p->uffd_mode) { /* Tell the user fault fd handler threads to quit */ - for (i = 0; i < nr_vcpus; i++) + for (i = 0; i < num_uffds; i++) uffd_stop_demand_paging(uffd_descs[i]); } @@ -214,15 +247,20 @@ static void run_test(enum vm_guest_mode mode, void *arg) static void help(char *name) { puts(""); - printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-d uffd_delay_usec]\n" - " [-b memory] [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name); + printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n" + " [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n" + " [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name); guest_modes_help(); printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n" " UFFD registration mode: 'MISSING' or 'MINOR'.\n"); kvm_print_vcpu_pinning_help(); + printf(" -a: Use a single userfaultfd for all of guest memory, instead of\n" + " creating one for each region paged by a unique vCPU\n" + " Set implicitly with -o, and no effect without -u.\n"); printf(" -d: add a delay in usec to the User Fault\n" " FD handler to simulate demand paging\n" " overheads. 
Ignored without -u.\n"); + printf(" -r: Set the number of reader threads per uffd.\n"); printf(" -b: specify the size of the memory region which should be\n" " demand paged by each vCPU. e.g. 10M or 3G.\n" " Default: 1G\n"); @@ -241,12 +279,14 @@ int main(int argc, char *argv[]) struct test_params p = { .src_type = DEFAULT_VM_MEM_SRC, .partition_vcpu_memory_access = true, + .readers_per_uffd = 1, + .single_uffd = false, }; int opt; guest_modes_append_default(); - while ((opt = getopt(argc, argv, "hm:u:d:b:s:v:c:o")) != -1) { + while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:c:r:")) != -1) { switch (opt) { case 'm': guest_modes_cmdline(optarg); @@ -258,6 +298,9 @@ int main(int argc, char *argv[]) p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR; TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'."); break; + case 'a': + p.single_uffd = true; + break; case 'd': p.uffd_delay = strtoul(optarg, NULL, 0); TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported."); @@ -278,6 +321,13 @@ int main(int argc, char *argv[]) break; case 'o': p.partition_vcpu_memory_access = false; + p.single_uffd = true; + break; + case 'r': + p.readers_per_uffd = atoi(optarg); + TEST_ASSERT(p.readers_per_uffd >= 1, + "Invalid number of readers per uffd %d: must be >=1", + p.readers_per_uffd); break; case 'h': default: diff --git a/tools/testing/selftests/kvm/include/userfaultfd_util.h b/tools/testing/selftests/kvm/include/userfaultfd_util.h index 877449c34592..af83a437e74a 100644 --- a/tools/testing/selftests/kvm/include/userfaultfd_util.h +++ b/tools/testing/selftests/kvm/include/userfaultfd_util.h @@ -17,18 +17,27 @@ typedef int (*uffd_handler_t)(int uffd_mode, int uffd, struct uffd_msg *msg); -struct uffd_desc { +struct uffd_reader_args { int uffd_mode; int uffd; - int pipefds[2]; useconds_t delay; uffd_handler_t handler; - pthread_t thread; + /* Holds the read end of the pipe for killing the reader. 
*/ + int pipe; +}; + +struct uffd_desc { + int uffd; + uint64_t num_readers; + /* Holds the write ends of the pipes for killing the readers. */ + int *pipefds; + pthread_t *readers; + struct uffd_reader_args *reader_args; }; struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void *hva, uint64_t len, - uffd_handler_t handler); + uint64_t num_readers, uffd_handler_t handler); void uffd_stop_demand_paging(struct uffd_desc *uffd); diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 271f63891581..6f220aa4fb08 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -27,10 +27,8 @@ static void *uffd_handler_thread_fn(void *arg) { - struct uffd_desc *uffd_desc = (struct uffd_desc *)arg; - int uffd = uffd_desc->uffd; - int pipefd = uffd_desc->pipefds[0]; - useconds_t delay = uffd_desc->delay; + struct uffd_reader_args *reader_args = (struct uffd_reader_args *)arg; + int uffd = reader_args->uffd; int64_t pages = 0; struct timespec start; struct timespec ts_diff; @@ -44,7 +42,7 @@ static void *uffd_handler_thread_fn(void *arg) pollfd[0].fd = uffd; pollfd[0].events = POLLIN; - pollfd[1].fd = pipefd; + pollfd[1].fd = reader_args->pipe; pollfd[1].events = POLLIN; r = poll(pollfd, 2, -1); @@ -92,9 +90,9 @@ static void *uffd_handler_thread_fn(void *arg) if (!(msg.event & UFFD_EVENT_PAGEFAULT)) continue; - if (delay) - usleep(delay); - r = uffd_desc->handler(uffd_desc->uffd_mode, uffd, &msg); + if (reader_args->delay) + usleep(reader_args->delay); + r = reader_args->handler(reader_args->uffd_mode, uffd, &msg); if (r < 0) return NULL; pages++; @@ -110,7 +108,7 @@ static void *uffd_handler_thread_fn(void *arg) struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void *hva, uint64_t len, - uffd_handler_t handler) + uint64_t num_readers, uffd_handler_t handler) { struct uffd_desc *uffd_desc; bool 
is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR); @@ -118,14 +116,26 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, struct uffdio_api uffdio_api; struct uffdio_register uffdio_register; uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY; - int ret; + int ret, i; PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n", is_minor ? "MINOR" : "MISSING", is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY"); uffd_desc = malloc(sizeof(struct uffd_desc)); - TEST_ASSERT(uffd_desc, "malloc failed"); + TEST_ASSERT(uffd_desc, "Failed to malloc uffd descriptor"); + + uffd_desc->pipefds = malloc(sizeof(int) * num_readers); + TEST_ASSERT(uffd_desc->pipefds, "Failed to malloc pipes"); + + uffd_desc->readers = malloc(sizeof(pthread_t) * num_readers); + TEST_ASSERT(uffd_desc->readers, "Failed to malloc reader threads"); + + uffd_desc->reader_args = malloc( + sizeof(struct uffd_reader_args) * num_readers); + TEST_ASSERT(uffd_desc->reader_args, "Failed to malloc reader_args"); + + uffd_desc->num_readers = num_readers; /* In order to get minor faults, prefault via the alias. 
*/ if (is_minor) @@ -148,18 +158,28 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) == expected_ioctls, "missing userfaultfd ioctls"); - ret = pipe2(uffd_desc->pipefds, O_CLOEXEC | O_NONBLOCK); - TEST_ASSERT(!ret, "Failed to set up pipefd"); - - uffd_desc->uffd_mode = uffd_mode; uffd_desc->uffd = uffd; - uffd_desc->delay = delay; - uffd_desc->handler = handler; - pthread_create(&uffd_desc->thread, NULL, uffd_handler_thread_fn, - uffd_desc); + for (i = 0; i < uffd_desc->num_readers; ++i) { + int pipes[2]; + + ret = pipe2((int *) &pipes, O_CLOEXEC | O_NONBLOCK); + TEST_ASSERT(!ret, "Failed to set up pipefd %i for uffd_desc %p", + i, uffd_desc); + + uffd_desc->pipefds[i] = pipes[1]; - PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n", - hva, hva + len); + uffd_desc->reader_args[i].uffd_mode = uffd_mode; + uffd_desc->reader_args[i].uffd = uffd; + uffd_desc->reader_args[i].delay = delay; + uffd_desc->reader_args[i].handler = handler; + uffd_desc->reader_args[i].pipe = pipes[0]; + + pthread_create(&uffd_desc->readers[i], NULL, uffd_handler_thread_fn, + &uffd_desc->reader_args[i]); + + PER_VCPU_DEBUG("Created uffd thread %i for HVA range [%p, %p)\n", + i, hva, hva + len); + } return uffd_desc; } @@ -167,19 +187,30 @@ struct uffd_desc *uffd_setup_demand_paging(int uffd_mode, useconds_t delay, void uffd_stop_demand_paging(struct uffd_desc *uffd) { char c = 0; - int ret; + int i, ret; - ret = write(uffd->pipefds[1], &c, 1); - TEST_ASSERT(ret == 1, "Unable to write to pipefd"); + for (i = 0; i < uffd->num_readers; ++i) { + ret = write(uffd->pipefds[i], &c, 1); + TEST_ASSERT( + ret == 1, "Unable to write to pipefd %i for uffd_desc %p", i, uffd); + } - ret = pthread_join(uffd->thread, NULL); - TEST_ASSERT(ret == 0, "Pthread_join failed."); + for (i = 0; i < uffd->num_readers; ++i) { + ret = pthread_join(uffd->readers[i], NULL); + TEST_ASSERT( + ret == 0, "Pthread_join failed on 
reader %i for uffd_desc %p", i, uffd); + } close(uffd->uffd); - close(uffd->pipefds[1]); - close(uffd->pipefds[0]); + for (i = 0; i < uffd->num_readers; ++i) { + close(uffd->pipefds[i]); + close(uffd->reader_args[i].pipe); + } + free(uffd->pipefds); + free(uffd->readers); + free(uffd->reader_args); free(uffd); } From patchwork Thu Feb 15 23:54:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559304 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C5D171474C8 for ; Thu, 15 Feb 2024 23:54:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041266; cv=none; b=PMDMXn5oahafyXgzTDnEBBisn5fcCUnIQgjxkk8QKTlnjhr8sVJBDWiLI3XfF+rr7hCC9jaDMRdBnFzqVJCzO0+PjFYKzb81n9hZqYy41+pBVpyrEGAUQnED/kOCvK/86SSWzXzGl2y2Y6RE11T+mFgWpL1maPR/Zn+q/506oV4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041266; c=relaxed/simple; bh=a7eKLWtaoxoL7cVfHGdohyj2AF5ZACc8LDLsIrxBv08=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=k2CeMtQkk5vaB2sWP4NrkD02rEX+p3HBtbK/1PHOFducHqtQQAZL/TrYogtVybf4CEP2nK+XxXia+6OCTblpakMcYxfM/T1sri54HK6vfiDQT5rxBZvyCrn57He63z800BN/twn8jhiCC3di/dzXYghzPwurRWPbZJ2Lhy1EQdE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=T+1WAAsc; arc=none smtp.client-ip=209.85.219.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com 
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="T+1WAAsc" Received: by mail-yb1-f201.google.com with SMTP id 3f1490d57ef6-dc6b267bf11so256961276.2 for ; Thu, 15 Feb 2024 15:54:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708041264; x=1708646064; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Fp/0bdTdp/wTZtC8stanHESpr8tHISyMMSdV5bn6+Y0=; b=T+1WAAscF0gL3YptIoW1pfA/RiwY1yeuNvYiPtkNwv1wX4r2TbBUkB9SZOBBofD2Z8 wGqwyapSzVC/0Qsn9KOZd6/8Ffy6lOAMNpm2jxMspVrP/2E/njKlbKeOuTJWa5CYWY55 gsFdZ8XKLiZWF6zvQL9xpiup6RqKFzIMKVxaO4Q0opVMvzBcnCkO2RmInFnkvkVlA6cG LK+OOPT8p6U2XMT7b61AQCrGz6l8krvvMxniYWQbmpF5asjd88mVj/xVhyvr5PV1o+/w B+JrIdGxGUeq1AWXdremS1DCe7XK228ZCriH/oYMYDF84LWMCf/h9OygD0+dUhk+JoWA 1UnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041264; x=1708646064; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Fp/0bdTdp/wTZtC8stanHESpr8tHISyMMSdV5bn6+Y0=; b=tQXNxQFYIaPXnl82flQ9DNk/vgiyNrOnEQsfSaH7IrbrWdTtMRrBCGeiRB0fT+idoQ GbxNRL/TxuvzvosOHnJr01lNBLd7JaRotUlvTR3zDKQ0i2/dGvUDAaQ2kzWX7fwfs2lY MMPmpv5SGi2PqUWML2U1G/SDyNU9+v1Upo5rAdkrf1ru438WEfCtN8UGh9eljbC3ITps 9pgWQyjaPmmGvl9/8EPIY8ljtBHxujMyC8DaHmSMpVhhKO3dmDNVH/T3LIHpk5a7r1qT 7T4cuBBZFaB9CC15NPkv89j5LWwm0OxKQ+ucYSbITSPUrOMo8KRAPSrtbcAbUgGEVUWT yjSg== X-Forwarded-Encrypted: i=1; AJvYcCWvvRL2TUsFlPfAsWXIXEESpiKsrdTOqAJOzBLtTZznc69TMS5js8lsGC6gUGDJS2jHHh2xBJQwAWOApZQN1joi+XNb X-Gm-Message-State: AOJu0Yw6MZZ486+i3LhG8W5dlzyqkgjQ/ud+QPjzmcVUL3d+sAPK6HQF whCCQLZOzUA9kwi501J7qbvkCljqx3kGuBWXcyYqdQ8Vx4DSSrHrHF+taAUC0XJia57UdsFlrqA bhI5buhG+bg== X-Google-Smtp-Source: 
AGHT+IGo9PYNpFMRSL7MDuuM071FwDbG6033GhiZwLkGjhQJG67+BYPtQK3FZHpNcH0hyJDMDR4MZgbnf0Bqqw== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a5b:ca:0:b0:dc6:b813:5813 with SMTP id d10-20020a5b00ca000000b00dc6b8135813mr123862ybp.9.1708041263965; Thu, 15 Feb 2024 15:54:23 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:03 +0000 In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog Message-ID: <20240215235405.368539-13-amoorthy@google.com> Subject: [PATCH v7 12/14] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

With multiple reader threads POLLing a single UFFD, the test suffers
from the thundering herd problem: performance degrades as the number of
reader threads is increased. Solve this issue [1] by switching the
polling mechanism to EPOLL + EPOLLEXCLUSIVE.

Also, change the error-handling convention of uffd_handler_thread_fn.
Instead of just printing errors and returning early from the polling
loop, check for them via TEST_ASSERT. "return NULL" is reserved for a
successful exit from uffd_handler_thread_fn, i.e. one triggered by a
write to the exit pipe.

Performance samples generated by the command in [2] are given below.

Num Reader Threads  Paging Rate (POLL)  Paging Rate (EPOLL)
1                   249k                185k
2                   201k                235k
4                   186k                155k
16                  150k                217k
32                  89k                 198k

[1] Single-vCPU performance does suffer somewhat.
[2] ./demand_paging_test -u MINOR -s shmem -v 4 -o -r Signed-off-by: Anish Moorthy Acked-by: James Houghton --- .../selftests/kvm/demand_paging_test.c | 1 - .../selftests/kvm/lib/userfaultfd_util.c | 74 +++++++++---------- 2 files changed, 35 insertions(+), 40 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index f7897a951f90..0455347f932a 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include #include diff --git a/tools/testing/selftests/kvm/lib/userfaultfd_util.c b/tools/testing/selftests/kvm/lib/userfaultfd_util.c index 6f220aa4fb08..2a179133645a 100644 --- a/tools/testing/selftests/kvm/lib/userfaultfd_util.c +++ b/tools/testing/selftests/kvm/lib/userfaultfd_util.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include "kvm_util.h" @@ -32,60 +33,55 @@ static void *uffd_handler_thread_fn(void *arg) int64_t pages = 0; struct timespec start; struct timespec ts_diff; + int epollfd; + struct epoll_event evt; + + epollfd = epoll_create(1); + TEST_ASSERT(epollfd >= 0, "Failed to create epollfd."); + + evt.events = EPOLLIN | EPOLLEXCLUSIVE; + evt.data.u32 = 0; + TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, uffd, &evt) == 0, + "Failed to add uffd to epollfd"); + + evt.events = EPOLLIN; + evt.data.u32 = 1; + TEST_ASSERT(epoll_ctl(epollfd, EPOLL_CTL_ADD, reader_args->pipe, &evt) == 0, + "Failed to add pipe to epollfd"); clock_gettime(CLOCK_MONOTONIC, &start); while (1) { struct uffd_msg msg; - struct pollfd pollfd[2]; - char tmp_chr; int r; - pollfd[0].fd = uffd; - pollfd[0].events = POLLIN; - pollfd[1].fd = reader_args->pipe; - pollfd[1].events = POLLIN; - - r = poll(pollfd, 2, -1); - switch (r) { - case -1: - pr_info("poll err"); - continue; - case 0: - continue; - case 1: - break; - default: - pr_info("Polling uffd returned %d", r); - return 
NULL; - } + r = epoll_wait(epollfd, &evt, 1, -1); + TEST_ASSERT(r == 1, + "Unexpected number of events (%d) from epoll, errno = %d", + r, errno); - if (pollfd[0].revents & POLLERR) { - pr_info("uffd revents has POLLERR"); - return NULL; - } + if (evt.data.u32 == 1) { + char tmp_chr; - if (pollfd[1].revents & POLLIN) { - r = read(pollfd[1].fd, &tmp_chr, 1); + TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)), + "Reader thread received EPOLLERR or EPOLLHUP on pipe."); + r = read(reader_args->pipe, &tmp_chr, 1); TEST_ASSERT(r == 1, - "Error reading pipefd in UFFD thread\n"); + "Error reading pipefd in uffd reader thread"); break; } - if (!(pollfd[0].revents & POLLIN)) - continue; + TEST_ASSERT(!(evt.events & (EPOLLERR | EPOLLHUP)), + "Reader thread received EPOLLERR or EPOLLHUP on uffd."); r = read(uffd, &msg, sizeof(msg)); if (r == -1) { - if (errno == EAGAIN) - continue; - pr_info("Read of uffd got errno %d\n", errno); - return NULL; + TEST_ASSERT(errno == EAGAIN, + "Error reading from UFFD: errno = %d", errno); + continue; } - if (r != sizeof(msg)) { - pr_info("Read on uffd returned unexpected size: %d bytes", r); - return NULL; - } + TEST_ASSERT(r == sizeof(msg), + "Read on uffd returned unexpected number of bytes (%d)", r); if (!(msg.event & UFFD_EVENT_PAGEFAULT)) continue; @@ -93,8 +89,8 @@ static void *uffd_handler_thread_fn(void *arg) if (reader_args->delay) usleep(reader_args->delay); r = reader_args->handler(reader_args->uffd_mode, uffd, &msg); - if (r < 0) - return NULL; + TEST_ASSERT(r >= 0, + "Reader thread handler fn returned negative value %d", r); pages++; } From patchwork Thu Feb 15 23:54:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559305 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id 680B6145FEF for ; Thu, 15 Feb 2024 23:54:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041268; cv=none; b=UrvAvl6HWoaTiDXZzG/RwnoTGnK+Ky5MGWTHdGtpUhcXI7G5bKWHOhl5MD7VEYt6GIpMX8Ur6P6qyxGqsgjR9cFtNSjOyQpgK/qsystSpkoxE/FOZ4weibyLAYxWpTf8TShd9sF1WA4giOmGf1qNXnsJ4AaMAptA88cXA2e/E+8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041268; c=relaxed/simple; bh=7F0hdUQkgcbLnNEOLN45aw7eBCRbULVKfec0j/XaDOU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=lYYUjD1+Ac3G5GVWiRMIPT+tSRsMBGAm+uQgIkGaSBtzkwpPoLYssAFXpbm159r1l0ag6fnbuD+CBDFHfaaikbb1NSWfRmDv9JrmGTmvp+4dVneb2+CQL+y2RAyGgYqZZnWdfvKcWzvxyTF8KEK7RzcJRJqvJaLC0ePXLY+gC+c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=idLq/jo3; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="idLq/jo3" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-dc64e0fc7c8so1969468276.2 for ; Thu, 15 Feb 2024 15:54:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708041265; x=1708646065; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=pZ3z5wuWWtoz1ZMhcdaCZuytQaO+bHVR4Pertr9jsQQ=; 
b=idLq/jo3wTOfhzmYljHB5ByPoQVemZkQFFbDoxsa0awJN1SqIiMYnRFzRoe/QIlw0a WA+CMaVFhyx7HZlk6AWuNTilPUkX5SH3fAo93vRpiedHOkULGdlBbeL+Jtc1fbhOiHOl 8OOBDFLPWhS+k6u2HmzQTH+tL2u9PfGBR5o1O8+ZKMzuLVTSJ4I3jFhYuGfZ0VjSp2tq PHVliXsnanVBH1woWqacKqr05oFYZmF+JdrHTU/9HAZ9/Q1rKwxSE+K5W9u8jL/nw7cM 8M2nFJNUP2f0Le/IpT6e18KBoRPk4wdqacAQpXZFXaYRgA7zDpjFH1x8BiLufkgKdsK2 gWdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041265; x=1708646065; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=pZ3z5wuWWtoz1ZMhcdaCZuytQaO+bHVR4Pertr9jsQQ=; b=AoONFH2Pyyj2jzKH+NcaGY9XYeOp24cJgOGSfG/W6ufzvOg7+S+A28dsWhQb4NOiDj K87rrIzBr8jq4HGhiuPDsTf0xHme/K9vBmmyqVAp/o8QOUwx6OgvGNWwpAXfnG1MjPhV VW//wr16LMToaAD3mJYKR8PzRT0QxW+KhL75cTiCFMPabFXFKyi4qo0InFD8g1ZA6eWy mT91v3TYSUrZ0T0fJzT3gJyYFsWahjNqwZI/1xlQvfYQX9N+zj4VBgX9fA6bnd7a95Wa A/DLGAo6pNSU27+k0CWPWb/8S0H8cMhxXovXYuQGeRajANUsgn+yh/AlkZFoCEUSI0Mm PduQ== X-Forwarded-Encrypted: i=1; AJvYcCW92ZvbZ+0Yznrwt62aM+eLYDPuFT9Avscce6hJIme8aI4se1qog2WMl7aiXEuYJBuujM6UKhi8zskDK0kXSnlqd+JG X-Gm-Message-State: AOJu0YxjBdDKuI6tYAUlv9M0W3K9YsRQaStAHKTRymqZS7W1eTVX+OG/ QyZpvwjD0Cr4JuVnaA0Tgj+d+2F9E3NtMp4LrFArcDA9kahQCaFVUh5LgUp7XTGuBTLr1Q3CbHX BQZ7NxtDmIg== X-Google-Smtp-Source: AGHT+IEUZwP+LJ+E+jxNEEZlreKraG2jY/8zcXTmmfUSkAeCJVN8Cnv8dxS1QjyGU3QmLdYMdQnNWZ+EgajasA== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a25:9846:0:b0:dcd:c091:e86 with SMTP id k6-20020a259846000000b00dcdc0910e86mr128874ybo.13.1708041265304; Thu, 15 Feb 2024 15:54:25 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:04 +0000 In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog 
Message-ID: <20240215235405.368539-14-amoorthy@google.com> Subject: [PATCH v7 13/14] KVM: selftests: Add memslot_flags parameter to memstress_create_vm() From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com Memslot flags aren't currently exposed to the tests, and are just always set to 0. Add a parameter to allow tests to manually set those flags. Signed-off-by: Anish Moorthy --- tools/testing/selftests/kvm/access_tracking_perf_test.c | 2 +- tools/testing/selftests/kvm/demand_paging_test.c | 2 +- tools/testing/selftests/kvm/dirty_log_perf_test.c | 2 +- tools/testing/selftests/kvm/include/memstress.h | 2 +- tools/testing/selftests/kvm/lib/memstress.c | 4 ++-- .../testing/selftests/kvm/memslot_modification_stress_test.c | 2 +- .../selftests/kvm/x86_64/dirty_log_page_splitting_test.c | 2 +- 7 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index 3c7defd34f56..b51656b408b8 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -306,7 +306,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct kvm_vm *vm; int nr_vcpus = params->nr_vcpus; - vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, + vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, 0, params->backing_src, !overlap_memory_access); memstress_start_vcpu_threads(nr_vcpus, vcpu_thread_main); diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 0455347f932a..61bb2e23bef0 100644 --- 
a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -163,7 +163,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) double vcpu_paging_rate; uint64_t uffd_region_size; - vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, + vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0, p->src_type, p->partition_vcpu_memory_access); demand_paging_size = get_backing_src_pagesz(p->src_type); diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c index d374dbcf9a53..8b1a84a4db3b 100644 --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -153,7 +153,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) int i; vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, - p->slots, p->backing_src, + p->slots, 0, p->backing_src, p->partition_vcpu_memory_access); pr_info("Random seed: %u\n", p->random_seed); diff --git a/tools/testing/selftests/kvm/include/memstress.h b/tools/testing/selftests/kvm/include/memstress.h index ce4e603050ea..8be9609d3ca0 100644 --- a/tools/testing/selftests/kvm/include/memstress.h +++ b/tools/testing/selftests/kvm/include/memstress.h @@ -56,7 +56,7 @@ struct memstress_args { extern struct memstress_args memstress_args; struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, - uint64_t vcpu_memory_bytes, int slots, + uint64_t vcpu_memory_bytes, int slots, uint32_t slot_flags, enum vm_mem_backing_src_type backing_src, bool partition_vcpu_memory_access); void memstress_destroy_vm(struct kvm_vm *vm); diff --git a/tools/testing/selftests/kvm/lib/memstress.c b/tools/testing/selftests/kvm/lib/memstress.c index d05487e5a371..e74b09f39769 100644 --- a/tools/testing/selftests/kvm/lib/memstress.c +++ b/tools/testing/selftests/kvm/lib/memstress.c @@ -123,7 +123,7 @@ void memstress_setup_vcpus(struct kvm_vm *vm, int 
nr_vcpus, } struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, - uint64_t vcpu_memory_bytes, int slots, + uint64_t vcpu_memory_bytes, int slots, uint32_t slot_flags, enum vm_mem_backing_src_type backing_src, bool partition_vcpu_memory_access) { @@ -212,7 +212,7 @@ struct kvm_vm *memstress_create_vm(enum vm_guest_mode mode, int nr_vcpus, vm_userspace_mem_region_add(vm, backing_src, region_start, MEMSTRESS_MEM_SLOT_INDEX + i, - region_pages, 0); + region_pages, slot_flags); } /* Do mapping for the demand paging memory slot */ diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c index 9855c41ca811..0b19ec3ecc9c 100644 --- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c +++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c @@ -95,7 +95,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct test_params *p = arg; struct kvm_vm *vm; - vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, + vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0, VM_MEM_SRC_ANONYMOUS, p->partition_vcpu_memory_access); diff --git a/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c index 634c6bfcd572..a770d7fa469a 100644 --- a/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c +++ b/tools/testing/selftests/kvm/x86_64/dirty_log_page_splitting_test.c @@ -100,7 +100,7 @@ static void run_test(enum vm_guest_mode mode, void *unused) struct kvm_page_stats stats_dirty_logging_disabled; struct kvm_page_stats stats_repopulated; - vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size, + vm = memstress_create_vm(mode, VCPUS, guest_percpu_mem_size, 0, SLOTS, backing_src, false); guest_num_pages = (VCPUS * guest_percpu_mem_size) >> vm->page_shift; From patchwork Thu Feb 15 23:54:05 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anish Moorthy X-Patchwork-Id: 13559307 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AFB251482E1 for ; Thu, 15 Feb 2024 23:54:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041269; cv=none; b=SyqqRHrRFDL4e72d+Ivufa5dOsog0SOayxBKuf+kltbf/qABLMPObMa0o5PUs4lUSgZi69ScrYll5HKkc9kQZ+MVJLbq8uaV75KejqiwIMOgbQV5pVLZVx6IaknIYCWT6zfYSC61vx4BR904I5QGTbOBFm0jc0ofjZpa29KIBCI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708041269; c=relaxed/simple; bh=RAsvrg/i2F9gNfTS8X0SfS9qSZ0A01hR5Wjg09DfJAk=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=plQh2CgW35BDu+QeNzzLHum75kKt+rc0yOBSdzPMZZSlYddopMAJ0UOaXDwqKShiSuHHXkkvFA9PTC9YKfBNbx1c98T7UIsBK4OcQ4jnSnMaSlQCXQQGU8ZJr56LsGfC7vU74iMmLn76EeEpGAbxHTMVnyBtKOMDDvafw9pUrpI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=nQwuld+x; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--amoorthy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nQwuld+x" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-60781e8709eso17593367b3.1 for ; Thu, 15 Feb 2024 15:54:27 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708041266; x=1708646066; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=a6h4Eo5y6YkqEM75Hb+vk5Ws+PtTPrt2qfKZqi3f3os=; b=nQwuld+x+L1IPGytXg1mZ2MQl+YAthGdbhspGzkWaOiSDglQ5NMRw3wQDv3/eoe+6+ kuUqsEtUeePx9r+tj+RV4a7Ol7E2+0Xt12ProbdGBpE1fX/Uf8VXVBLZOf9iMDvwwZr5 TmDghD1sUqDS8Z57ToXW0+9LkBX0CrmiyL0A8erMIcJH06bF8yg7ZW/lvYzKSOwrHcJ3 38Sqq0xJR77jk9A0qClU9OaedDhWbroagrirXTrdD8OZ9AbW4VAPTEN/HdrXdlcwmVra rhfOLGz+TqlzgdMH3Qy55dX9HH74VGIt2omK7veIBv3uLG02okuQNYSjB808+QpT07CW Fezw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708041266; x=1708646066; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=a6h4Eo5y6YkqEM75Hb+vk5Ws+PtTPrt2qfKZqi3f3os=; b=Q99k9QBn0VkG2p7lcPyqjCMk9E6Ndxh3lcvzFDgzROkp/joH81zwkRqJQYCYe9EmfF WUhv+J+Gg90m+x7ZATfXnDuLd2th6xxNyqevnaIxmmzNkjG77HbXeTf6KiqFuttTMry+ PlJcLQKnGkl/TbImlcxjCijqTdL4PDNA+uIZUa17wZwOL7pPe4XnM2msMM0KnQ9hkwqx EBulXhi11XTv48yN9VNmq/KBWrwqj5GUC23YtiL1MPbbDtpm0zNW9vXX1wuVs9CvIOIG GxP/GmayIZgVwy1wgN4tvY67lq7io75Q0W5iDTusgwVVltyEWxI5AUP2lojXAFJ5fMB6 RguQ== X-Forwarded-Encrypted: i=1; AJvYcCVUg/7WCetSe1v2IGFb/rLFP2OIDKT2+yyGKnikaoAMlYPibYrDsntyTdIGomA9GaWYQ7blgV4zxoy0Oim61sl+1rNL X-Gm-Message-State: AOJu0YzhZmRZ7vjXZ62RxDAb8Pi4lqiQ8iUruVXWedrjaaL+C9Oe4726 xFGQ0HOv7KAOn4tTaqebMyLGl6W5hgdd7fZSk6ada3SNyHPwPyFyjqCwEdNBMKzn9bn9NgbrcVF iedar9QkM3Q== X-Google-Smtp-Source: AGHT+IFvJf/xu20a2d3/nspz229yfZ04T2h8KRb/BqXUe1MHzBquBNYTrUf1pRLCX3uAEeYJvtxOqbGG6NS6Mw== X-Received: from laogai.c.googlers.com ([fda3:e722:ac3:cc00:2b:7d90:c0a8:2c9]) (user=amoorthy job=sendgmr) by 2002:a05:690c:2b89:b0:607:f4b2:42f0 with SMTP id en9-20020a05690c2b8900b00607f4b242f0mr162491ywb.2.1708041266752; Thu, 15 Feb 2024 15:54:26 -0800 (PST) Date: Thu, 15 Feb 2024 23:54:05 +0000 
In-Reply-To: <20240215235405.368539-1-amoorthy@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240215235405.368539-1-amoorthy@google.com> X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog Message-ID: <20240215235405.368539-15-amoorthy@google.com> Subject: [PATCH v7 14/14] KVM: selftests: Handle memory fault exits in demand_paging_test From: Anish Moorthy To: seanjc@google.com, oliver.upton@linux.dev, maz@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: robert.hoo.linux@gmail.com, jthoughton@google.com, amoorthy@google.com, dmatlack@google.com, axelrasmussen@google.com, peterx@redhat.com, nadav.amit@gmail.com, isaku.yamahata@gmail.com, kconsul@linux.vnet.ibm.com

Demonstrate a (very basic) scheme for supporting memory fault exits.

From the vCPU threads:

1. Simply issue UFFDIO_COPY/CONTINUEs in response to memory fault exits,
   with the purpose of establishing the absent mappings. Do so with
   wake_waiters=false to avoid serializing on the userfaultfd wait queue
   locks.

2. When the UFFDIO_COPY/CONTINUE in (1) fails with EEXIST, assume that
   the mapping was already established but is currently absent [A] and
   attempt to populate it using MADV_POPULATE_WRITE.

Issue UFFDIO_COPY/CONTINUEs from the reader threads as well, but with
wake_waiters=true to ensure that any threads sleeping on the uffd are
eventually woken up.

A real VMM would track whether it had already COPY/CONTINUEd pages
(e.g., via a bitmap) to avoid calls destined to EEXIST. However, even
the naive approach is enough to demonstrate the performance advantages
of KVM_EXIT_MEMORY_FAULT.

[A] In reality it is much likelier that the vCPU thread simply lost a
    race to establish the mapping for the page.
Signed-off-by: Anish Moorthy Acked-by: James Houghton --- .../selftests/kvm/demand_paging_test.c | 245 +++++++++++++----- 1 file changed, 173 insertions(+), 72 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index 61bb2e23bef0..44bdcc7aad87 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include "kvm_util.h" @@ -31,36 +32,102 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE; static size_t demand_paging_size; static char *guest_data_prototype; +static int num_uffds; +static size_t uffd_region_size; +static struct uffd_desc **uffd_descs; +/* + * Delay when demand paging is performed through userfaultfd or directly by + * vcpu_worker in the case of an annotated memory fault. + */ +static useconds_t uffd_delay; +static int uffd_mode; + + +static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t hva, + bool is_vcpu); + +static void madv_write_or_err(uint64_t gpa) +{ + int r; + void *hva = addr_gpa2hva(memstress_args.vm, gpa); + + r = madvise(hva, demand_paging_size, MADV_POPULATE_WRITE); + TEST_ASSERT(r == 0, + "MADV_POPULATE_WRITE on hva 0x%lx (gpa 0x%lx) fail, errno %i\n", + (uintptr_t) hva, gpa, errno); +} + +static void ready_page(uint64_t gpa) +{ + int r, uffd; + + /* + * This test only registers memslot 1 w/ userfaultfd. Any accesses outside + * the registered ranges should fault in the physical pages through + * MADV_POPULATE_WRITE. 
+ */ + if ((gpa < memstress_args.gpa) + || (gpa >= memstress_args.gpa + memstress_args.size)) { + madv_write_or_err(gpa); + } else { + if (uffd_delay) + usleep(uffd_delay); + + uffd = uffd_descs[(gpa - memstress_args.gpa) / uffd_region_size]->uffd; + + r = handle_uffd_page_request(uffd_mode, uffd, + (uint64_t) addr_gpa2hva(memstress_args.vm, gpa), true); + + if (r == EEXIST) + madv_write_or_err(gpa); + } +} + static void vcpu_worker(struct memstress_vcpu_args *vcpu_args) { struct kvm_vcpu *vcpu = vcpu_args->vcpu; int vcpu_idx = vcpu_args->vcpu_idx; struct kvm_run *run = vcpu->run; - struct timespec start; - struct timespec ts_diff; + struct timespec last_start; + struct timespec total_runtime = {}; int ret; - - clock_gettime(CLOCK_MONOTONIC, &start); - - /* Let the guest access its memory */ - ret = _vcpu_run(vcpu); - TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret); - if (get_ucall(vcpu, NULL) != UCALL_SYNC) { - TEST_ASSERT(false, - "Invalid guest sync status: exit_reason=%s\n", - exit_reason_str(run->exit_reason)); + u64 num_memory_fault_exits = 0; + bool annotated_memory_fault = false; + + while (true) { + clock_gettime(CLOCK_MONOTONIC, &last_start); + /* Let the guest access its memory */ + ret = _vcpu_run(vcpu); + annotated_memory_fault = errno == EFAULT + && run->exit_reason == KVM_EXIT_MEMORY_FAULT; + TEST_ASSERT(ret == 0 || annotated_memory_fault, + "vcpu_run failed: %d\n", ret); + + total_runtime = timespec_add(total_runtime, + timespec_elapsed(last_start)); + if (ret != 0 && get_ucall(vcpu, NULL) != UCALL_SYNC) { + + if (annotated_memory_fault) { + ++num_memory_fault_exits; + ready_page(run->memory_fault.gpa); + continue; + } + + TEST_ASSERT(false, + "Invalid guest sync status: exit_reason=%s\n", + exit_reason_str(run->exit_reason)); + } + break; } - - ts_diff = timespec_elapsed(start); - PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds\n", vcpu_idx, - ts_diff.tv_sec, ts_diff.tv_nsec); + PER_VCPU_DEBUG("vCPU %d execution time: %ld.%.9lds, %d memory 
fault exits\n", vcpu_idx, total_runtime.tv_sec, total_runtime.tv_nsec, + num_memory_fault_exits); } -static int handle_uffd_page_request(int uffd_mode, int uffd, - struct uffd_msg *msg) +static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t hva, + bool is_vcpu) { pid_t tid = syscall(__NR_gettid); - uint64_t addr = msg->arg.pagefault.address; struct timespec start; struct timespec ts_diff; int r; @@ -71,16 +138,15 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, struct uffdio_copy copy; copy.src = (uint64_t)guest_data_prototype; - copy.dst = addr; + copy.dst = hva; copy.len = demand_paging_size; - copy.mode = 0; + copy.mode = is_vcpu ? UFFDIO_COPY_MODE_DONTWAKE : 0; - r = ioctl(uffd, UFFDIO_COPY, &copy); /* - * With multiple vCPU threads fault on a single page and there are - * multiple readers for the UFFD, at least one of the UFFDIO_COPYs - * will fail with EEXIST: handle that case without signaling an - * error. + * With multiple vCPU threads and at least one of multiple reader threads + * or vCPU memory faults, multiple vCPUs accessing an absent page will + * almost certainly cause some thread doing the UFFDIO_COPY here to get + * EEXIST: make sure to allow that case. * * Note that this also suppress any EEXISTs occurring from, * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never @@ -88,23 +154,24 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, * some external state to correctly surface EEXISTs to userspace * (or prevent duplicate COPY/CONTINUEs in the first place). */ - if (r == -1 && errno != EEXIST) { - pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d, errno = %d\n", - addr, tid, errno); - return r; - } + r = ioctl(uffd, UFFDIO_COPY, &copy); + TEST_ASSERT(r == 0 || errno == EEXIST, + "Thread 0x%x failed UFFDIO_COPY on hva 0x%lx, errno = %d", + tid, hva, errno); } else if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { + /* The comments in the UFFDIO_COPY branch also apply here. 
*/ struct uffdio_continue cont = {0}; - cont.range.start = addr; + cont.range.start = hva; cont.range.len = demand_paging_size; + cont.mode = is_vcpu ? UFFDIO_CONTINUE_MODE_DONTWAKE : 0; r = ioctl(uffd, UFFDIO_CONTINUE, &cont); /* - * With multiple vCPU threads fault on a single page and there are - * multiple readers for the UFFD, at least one of the UFFDIO_COPYs - * will fail with EEXIST: handle that case without signaling an - * error. + * With multiple vCPU threads and at least one of multiple reader threads + * or vCPU memory faults, multiple vCPUs accessing an absent page will + * almost certainly cause some thread doing the UFFDIO_COPY here to get + * EEXIST: make sure to allow that case. * * Note that this also suppress any EEXISTs occurring from, * e.g., the first UFFDIO_COPY/CONTINUEs on a page. That never @@ -112,32 +179,54 @@ static int handle_uffd_page_request(int uffd_mode, int uffd, * some external state to correctly surface EEXISTs to userspace * (or prevent duplicate COPY/CONTINUEs in the first place). */ - if (r == -1 && errno != EEXIST) { - pr_info("Failed UFFDIO_CONTINUE in 0x%lx, thread %d, errno = %d\n", - addr, tid, errno); - return r; - } + TEST_ASSERT(r == 0 || errno == EEXIST, + "Thread 0x%x failed UFFDIO_CONTINUE on hva 0x%lx, errno = %d", + tid, hva, errno); } else { TEST_FAIL("Invalid uffd mode %d", uffd_mode); } + /* + * If the above UFFDIO_COPY/CONTINUE failed with EEXIST, waiting threads + * will not have been woken: wake them here. 
+ */ + if (!is_vcpu && r != 0) { + struct uffdio_range range = { + .start = hva, + .len = demand_paging_size + }; + r = ioctl(uffd, UFFDIO_WAKE, &range); + TEST_ASSERT(r == 0, + "Thread 0x%x failed UFFDIO_WAKE on hva 0x%lx, errno = %d", + tid, hva, errno); + } + ts_diff = timespec_elapsed(start); PER_PAGE_DEBUG("UFFD page-in %d \t%ld ns\n", tid, timespec_to_ns(ts_diff)); PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n", - demand_paging_size, addr, tid); + demand_paging_size, hva, tid); return 0; } +static int handle_uffd_page_request_from_uffd(int uffd_mode, int uffd, + struct uffd_msg *msg) +{ + TEST_ASSERT(msg->event == UFFD_EVENT_PAGEFAULT, + "Received uffd message with event %d != UFFD_EVENT_PAGEFAULT", + msg->event); + return handle_uffd_page_request(uffd_mode, uffd, + msg->arg.pagefault.address, false); +} + struct test_params { - int uffd_mode; bool single_uffd; - useconds_t uffd_delay; int readers_per_uffd; enum vm_mem_backing_src_type src_type; bool partition_vcpu_memory_access; + bool memfault_exits; }; static void prefault_mem(void *alias, uint64_t len) @@ -155,16 +244,22 @@ static void run_test(enum vm_guest_mode mode, void *arg) { struct memstress_vcpu_args *vcpu_args; struct test_params *p = arg; - struct uffd_desc **uffd_descs = NULL; struct timespec start; struct timespec ts_diff; struct kvm_vm *vm; - int i, num_uffds = 0; + int i; double vcpu_paging_rate; - uint64_t uffd_region_size; + uint32_t slot_flags = 0; + bool uffd_memfault_exits = uffd_mode && p->memfault_exits; - vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1, 0, - p->src_type, p->partition_vcpu_memory_access); + if (uffd_memfault_exits) { + TEST_ASSERT(kvm_has_cap(KVM_CAP_EXIT_ON_MISSING) > 0, + "KVM does not have KVM_CAP_EXIT_ON_MISSING"); + slot_flags = KVM_MEM_EXIT_ON_MISSING; + } + + vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, + 1, slot_flags, p->src_type, p->partition_vcpu_memory_access); demand_paging_size = 
get_backing_src_pagesz(p->src_type); @@ -173,21 +268,21 @@ static void run_test(enum vm_guest_mode mode, void *arg) "Failed to allocate buffer for guest data pattern"); memset(guest_data_prototype, 0xAB, demand_paging_size); - if (p->uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { - num_uffds = p->single_uffd ? 1 : nr_vcpus; - for (i = 0; i < num_uffds; i++) { - vcpu_args = &memstress_args.vcpu_args[i]; - prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa), - vcpu_args->pages * memstress_args.guest_page_size); - } - } - - if (p->uffd_mode) { + if (uffd_mode) { num_uffds = p->single_uffd ? 1 : nr_vcpus; uffd_region_size = nr_vcpus * guest_percpu_mem_size / num_uffds; + if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) { + for (i = 0; i < num_uffds; i++) { + vcpu_args = &memstress_args.vcpu_args[i]; + prefault_mem(addr_gpa2alias(vm, vcpu_args->gpa), + uffd_region_size); + } + } + uffd_descs = malloc(num_uffds * sizeof(struct uffd_desc *)); - TEST_ASSERT(uffd_descs, "Memory allocation failed"); + TEST_ASSERT(uffd_descs, "Failed to allocate uffd descriptors"); + for (i = 0; i < num_uffds; i++) { struct memstress_vcpu_args *vcpu_args; void *vcpu_hva; @@ -201,10 +296,10 @@ static void run_test(enum vm_guest_mode mode, void *arg) * requests. 
*/ uffd_descs[i] = uffd_setup_demand_paging( - p->uffd_mode, p->uffd_delay, vcpu_hva, + uffd_mode, uffd_delay, vcpu_hva, uffd_region_size, p->readers_per_uffd, - &handle_uffd_page_request); + &handle_uffd_page_request_from_uffd); } } @@ -218,7 +313,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) ts_diff = timespec_elapsed(start); pr_info("All vCPU threads joined\n"); - if (p->uffd_mode) { + if (uffd_mode) { /* Tell the user fault fd handler threads to quit */ for (i = 0; i < num_uffds; i++) uffd_stop_demand_paging(uffd_descs[i]); @@ -239,7 +334,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) memstress_destroy_vm(vm); free(guest_data_prototype); - if (p->uffd_mode) + if (uffd_mode) free(uffd_descs); } @@ -248,7 +343,8 @@ static void help(char *name) puts(""); printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-a]\n" " [-d uffd_delay_usec] [-r readers_per_uffd] [-b memory]\n" - " [-s type] [-v vcpus] [-c cpu_list] [-o]\n", name); + " [-s type] [-v vcpus] [-c cpu_list] [-o] [-w] \n", + name); guest_modes_help(); printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n" " UFFD registration mode: 'MISSING' or 'MINOR'.\n"); @@ -260,6 +356,7 @@ static void help(char *name) " FD handler to simulate demand paging\n" " overheads. Ignored without -u.\n"); printf(" -r: Set the number of reader threads per uffd.\n"); + printf(" -w: Enable kvm cap for memory fault exits.\n"); printf(" -b: specify the size of the memory region which should be\n" " demand paged by each vCPU. e.g. 
10M or 3G.\n" " Default: 1G\n"); @@ -280,29 +377,30 @@ int main(int argc, char *argv[]) .partition_vcpu_memory_access = true, .readers_per_uffd = 1, .single_uffd = false, + .memfault_exits = false, }; int opt; guest_modes_append_default(); - while ((opt = getopt(argc, argv, "ahom:u:d:b:s:v:c:r:")) != -1) { + while ((opt = getopt(argc, argv, "ahowm:u:d:b:s:v:c:r:")) != -1) { switch (opt) { case 'm': guest_modes_cmdline(optarg); break; case 'u': if (!strcmp("MISSING", optarg)) - p.uffd_mode = UFFDIO_REGISTER_MODE_MISSING; + uffd_mode = UFFDIO_REGISTER_MODE_MISSING; else if (!strcmp("MINOR", optarg)) - p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR; - TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'."); + uffd_mode = UFFDIO_REGISTER_MODE_MINOR; + TEST_ASSERT(uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'."); break; case 'a': p.single_uffd = true; break; case 'd': - p.uffd_delay = strtoul(optarg, NULL, 0); - TEST_ASSERT(p.uffd_delay >= 0, "A negative UFFD delay is not supported."); + uffd_delay = strtoul(optarg, NULL, 0); + TEST_ASSERT(uffd_delay >= 0, "A negative UFFD delay is not supported."); break; case 'b': guest_percpu_mem_size = parse_size(optarg); @@ -328,6 +426,9 @@ int main(int argc, char *argv[]) "Invalid number of readers per uffd %d: must be >=1", p.readers_per_uffd); break; + case 'w': + p.memfault_exits = true; + break; case 'h': default: help(argv[0]); @@ -335,7 +436,7 @@ int main(int argc, char *argv[]) } } - if (p.uffd_mode == UFFDIO_REGISTER_MODE_MINOR && + if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR && !backing_src_is_shared(p.src_type)) { TEST_FAIL("userfaultfd MINOR mode requires shared memory; pick a different -s"); }