From: Peter Xu
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Sean Christopherson, peterx@redhat.com, John Hubbard, Paolo Bonzini,
    David Matlack, Andrew Morton, Andrea Arcangeli,
    "Dr . David Alan Gilbert", David Hildenbrand,
    Linux MM Mailing List, Mike Kravetz
Subject: [PATCH v4 0/4] kvm/mm: Allow GUP to respond to non fatal signals
Date: Tue, 11 Oct 2022 15:58:05 -0400
Message-Id: <20221011195809.557016-1-peterx@redhat.com>

v4:
- Split patch 2+3 into three patches [Sean]

rfc: https://lore.kernel.org/r/20220617014147.7299-1-peterx@redhat.com
v1:  https://lore.kernel.org/r/20220622213656.81546-1-peterx@redhat.com
v2:  https://lore.kernel.org/r/20220721000318.93522-1-peterx@redhat.com
v3:  https://lore.kernel.org/r/20220817003614.58900-1-peterx@redhat.com

An issue was reported that libvirt cannot stop the virtual machine using
the QMP command "stop" during a paused postcopy migration [1].

It fails because the "stop the VM" operation requires the hypervisor to
kick all the vcpu threads out using SIG_IPI in QEMU (which translates to
SIGUSR1).  However, during a paused postcopy the vcpu threads are hung in
handle_userfault(), so they simply do not respond to the kicks.  Worse,
the "stop" command then hangs the QMP channel as well.

The mm has a facility to process generic signals
(FAULT_FLAG_INTERRUPTIBLE), but it is only used in the page fault
handlers, not in GUP.  Unluckily, KVM is a heavy GUP user on guest page
faults.  This means we cannot interrupt a long page fault for KVM
fetching guest pages with what we have right now.

I think it's reasonable for GUP to only listen to fatal signals, as most
GUP users are not really ready to handle such a case.
But KVM is not such a user: it has rich infrastructure to handle even
generic signals and to properly deliver the signal to userspace.  The
page fault can then be retried in the next KVM_RUN.

This patchset adds FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE,
and lets KVM be the first user of it.  KVM and mm/gup have always been
able to respond to fatal signals, but not to non-fatal ones until this
patchset.

Note that this does not allow all KVM paths to respond to non-fatal
signals, only x86 slow page faults.  In the future, when more code is
ready to handle signal interruptions, we can explore having more GUP
callers use FOLL_INTERRUPTIBLE.

Tests
=====

I created a postcopy environment and paused the migration by shutting
down the network to emulate a network failure (so that
handle_userfault() gets stuck for a long time).  Then I tried three
things:

  (1) Sending QMP command "stop" to the QEMU monitor,
  (2) Hitting Ctrl-C from the QEMU cmdline,
  (3) Attaching GDB to the dest QEMU process.

Before this patchset, all three use cases hang.  After the patchset, all
of them work just as they do when there is no network failure at all.

Please have a look, thanks.

[1] https://gitlab.com/qemu-project/qemu/-/issues/1052

Peter Xu (4):
  mm/gup: Add FOLL_INTERRUPTIBLE
  kvm: Add KVM_PFN_ERR_SIGPENDING
  kvm: Add interruptible flag to __gfn_to_pfn_memslot()
  kvm: x86: Allow to respond to generic signals during slow PF

 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 | 18 ++++++++++----
 include/linux/kvm_host.h               | 14 +++++++++--
 include/linux/mm.h                     |  1 +
 mm/gup.c                               | 33 ++++++++++++++++++++++----
 mm/hugetlb.c                           |  5 +++-
 virt/kvm/kvm_main.c                    | 30 ++++++++++++++---------
 virt/kvm/kvm_mm.h                      |  4 ++--
 virt/kvm/pfncache.c                    |  2 +-
 11 files changed, 85 insertions(+), 28 deletions(-)
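
For readers who want the intended semantics in a nutshell, here is a
minimal user-space model in C of the behavior the cover letter describes:
a waiter always bails out on a fatal signal, and bails out on a non-fatal
signal only when the caller opted in.  This is only a sketch, not the
kernel code from this series.  Every name in it (SKETCH_INTERRUPTIBLE,
struct sketch_task, sketch_fault_in() and so on) is hypothetical, and the
"ticks" merely stand in for time spent blocked in handle_userfault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opt-in flag, loosely modeled on the FOLL_INTERRUPTIBLE idea. */
#define SKETCH_INTERRUPTIBLE 0x1u

/* Hypothetical outcomes: fault resolved, or abandoned due to a signal. */
enum sketch_result { SKETCH_DONE, SKETCH_SIGPENDING };

/* Hypothetical per-task signal state. */
struct sketch_task {
	bool fatal_pending;
	bool nonfatal_pending;
};

/*
 * Fatal signals always interrupt the wait.  Non-fatal signals interrupt
 * it only when the caller passed SKETCH_INTERRUPTIBLE.
 */
static bool sketch_should_bail(const struct sketch_task *t, unsigned int flags)
{
	if (t->fatal_pending)
		return true;
	return (flags & SKETCH_INTERRUPTIBLE) && t->nonfatal_pending;
}

/*
 * Model a fault that needs ticks_needed ticks to resolve.  A non-fatal
 * signal arrives at tick signal_at (pass -1 for "never").  The number of
 * ticks actually waited is stored in *ticks_waited.
 */
static enum sketch_result sketch_fault_in(struct sketch_task *t,
					  unsigned int flags,
					  int ticks_needed, int signal_at,
					  int *ticks_waited)
{
	assert(t != NULL && ticks_waited != NULL);

	for (int tick = 0; tick < ticks_needed; tick++) {
		if (tick == signal_at)
			t->nonfatal_pending = true;
		if (sketch_should_bail(t, flags)) {
			/* Caller is expected to deliver the signal and retry. */
			*ticks_waited = tick;
			return SKETCH_SIGPENDING;
		}
	}
	*ticks_waited = ticks_needed;
	return SKETCH_DONE;
}
```

In this model, a caller that does not pass SKETCH_INTERRUPTIBLE sits out
the whole wait even though a non-fatal signal is pending, which mirrors
the hang described above.  A caller that passes the flag gets
SKETCH_SIGPENDING back as soon as the signal arrives and can retry later,
in the same spirit as retrying the page fault in the next KVM_RUN.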