From patchwork Wed Aug 17 00:36:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12945370
From: Peter Xu
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Sean Christopherson, David Hildenbrand, Andrew Morton,
 Andrea Arcangeli, peterx@redhat.com, Paolo Bonzini,
 "Dr . David Alan Gilbert", Linux MM Mailing List, John Hubbard
Subject: [PATCH v3 0/3] kvm/mm: Allow GUP to respond to non fatal signals
Date: Tue, 16 Aug 2022 20:36:11 -0400
Message-Id: <20220817003614.58900-1-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0

v3:
- Patch 1
  - Added r-b for DavidH
  - Added support for hugetlbfs
- Patch 2 & 3
  - Comment fixes [Sean]
  - Move introduction of "interruptible" parameter into patch 2 [Sean]
  - Move sigpending handling into kvm_handle_bad_page [Sean]
  - Renamed kvm_handle_bad_page() to kvm_handle_error_pfn() [Sean, DavidM]
  - Use kvm_handle_signal_exit() [Sean]

rfc: https://lore.kernel.org/kvm/20220617014147.7299-1-peterx@redhat.com
v1:  https://lore.kernel.org/kvm/20220622213656.81546-1-peterx@redhat.com
v2:  https://lore.kernel.org/kvm/20220721000318.93522-1-peterx@redhat.com

An issue was reported that libvirt cannot stop the virtual machine with the
QMP command "stop" during a paused postcopy migration [1].

It fails because the "stop the VM" operation requires the hypervisor to kick
all the vcpu threads out using SIG_IPI in QEMU (which is translated to a
SIGUSR1). However, during a paused postcopy the vcpu threads are hung in
handle_userfault(), so they simply do not respond to the kicks. Worse, the
"stop" command then hangs the QMP channel as well.

The mm has a facility to process generic signals (FAULT_FLAG_INTERRUPTIBLE),
but it is used only in the PF handlers, not in GUP. Unfortunately, KVM is a
heavy GUP user on guest page faults. That means that, with what we have right
now, a long page fault taken while KVM is fetching guest pages cannot be
interrupted.
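
To make the difference between the two policies concrete, here is a minimal userspace sketch. It is a toy model only, not the kernel code from this series: struct toy_task, TOY_INTERRUPTIBLE, toy_should_abort() and toy_demo() are hypothetical names invented for illustration. It shows a waiter that always gives up on a fatal signal, but gives up on a non-fatal one (such as a vcpu kick) only when the caller has opted in.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opt-in flag for this sketch only; not a kernel definition. */
#define TOY_INTERRUPTIBLE 0x1u

/* Toy stand-in for the signal state of the thread stuck in a fault. */
struct toy_task {
    bool fatal_pending;     /* e.g. a SIGKILL is queued */
    bool nonfatal_pending;  /* e.g. a SIGUSR1 vcpu kick is queued */
};

/*
 * Decide whether a long-running fault should give up and return to its
 * caller. Fatal signals always abort the wait; non-fatal signals abort
 * it only if the caller passed TOY_INTERRUPTIBLE.
 */
static bool toy_should_abort(const struct toy_task *t, unsigned int flags)
{
    if (t->fatal_pending)
        return true;
    if (!(flags & TOY_INTERRUPTIBLE))
        return false;
    return t->nonfatal_pending;
}

/* Walk through the cases discussed in the cover letter. */
static void toy_demo(void)
{
    struct toy_task kicked = { .fatal_pending = false, .nonfatal_pending = true };
    struct toy_task killed = { .fatal_pending = true, .nonfatal_pending = false };

    /* Fatal-only policy: the kick is ignored, the thread stays stuck. */
    assert(!toy_should_abort(&kicked, 0));
    /* Opt-in policy: the same kick now ends the wait. */
    assert(toy_should_abort(&kicked, TOY_INTERRUPTIBLE));
    /* A fatal signal ends the wait under either policy. */
    assert(toy_should_abort(&killed, 0));
    assert(toy_should_abort(&killed, TOY_INTERRUPTIBLE));
}
```

Calling toy_demo() runs through the cases: a kick is ignored under the fatal-only policy, honoured under the opt-in policy, and a fatal signal ends the wait either way. Keeping the opt-in per caller leaves every caller that does not pass the flag with its existing behaviour.
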

I think it's reasonable for GUP to listen only to fatal signals, as most GUP
users are not really ready to handle such a case. But KVM is not such a user:
it has rich infrastructure to handle even generic signals and to deliver them
properly to userspace. The page fault can then be retried in the next KVM_RUN.

This patchset adds FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE, and
lets KVM be the first user. KVM and mm/gup have always been able to respond to
fatal signals, but not to non-fatal ones until this patchset.

One thing to mention is that this does not allow all KVM paths to respond to
non-fatal signals, only x86 slow page faults. In the future, when more code is
ready to handle signal interruptions, we can explore having more gup callers
use FOLL_INTERRUPTIBLE.

Tests
=====

I created a postcopy environment and paused the migration by shutting down the
network to emulate a network failure (so that handle_userfault() is stuck for
a long time). Then I tried three things:

  (1) Sending QMP command "stop" to QEMU monitor,
  (2) Hitting Ctrl-C from QEMU cmdline,
  (3) GDB attach to the dest QEMU process.

Before this patchset, all three use cases hang. After the patchset, all of
them work just as they do when there is no network failure at all.

Please have a look, thanks.

[1] https://gitlab.com/qemu-project/qemu/-/issues/1052

Peter Xu (3):
  mm/gup: Add FOLL_INTERRUPTIBLE
  kvm: Add new pfn error KVM_PFN_ERR_SIGPENDING
  kvm/x86: Allow to respond to generic signals during slow page faults

 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 | 18 ++++++++++----
 include/linux/kvm_host.h               | 14 +++++++++--
 include/linux/mm.h                     |  1 +
 mm/gup.c                               | 33 ++++++++++++++++++++++----
 mm/hugetlb.c                           |  5 +++-
 virt/kvm/kvm_main.c                    | 30 ++++++++++++++---------
 virt/kvm/kvm_mm.h                      |  4 ++--
 virt/kvm/pfncache.c                    |  2 +-
 11 files changed, 85 insertions(+), 28 deletions(-)