[v2,0/3] kvm/mm: Allow GUP to respond to non fatal signals


Message ID

20220721000318.93522-1-peterx@redhat.com

Headers

From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
Cc: David Hildenbrand <david@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	peterx@redhat.com,
	John Hubbard <jhubbard@nvidia.com>,
	Sean Christopherson <seanjc@google.com>,
	Linux MM Mailing List <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: [PATCH v2 0/3] kvm/mm: Allow GUP to respond to non fatal signals
Date: Wed, 20 Jul 2022 20:03:15 -0400
Message-Id: <20220721000318.93522-1-peterx@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

kvm/mm: Allow GUP to respond to non fatal signals

Message

Peter Xu July 21, 2022, 12:03 a.m. UTC

v2:
- Added r-b
- Rewrote the comment in faultin_page() for FOLL_INTERRUPTIBLE [John]
- Dropped the controversial patch to introduce a flag for
  __gfn_to_pfn_memslot(), and used a boolean instead for now [Sean]
- Renamed s/is_sigpending_pfn/KVM_PFN_ERR_SIGPENDING/ [Sean]
- Changed the comment in kvm_faultin_pfn() mentioning fatal signals [Sean]

rfc: https://lore.kernel.org/kvm/20220617014147.7299-1-peterx@redhat.com
v1:  https://lore.kernel.org/kvm/20220622213656.81546-1-peterx@redhat.com

It was reported that libvirt cannot stop a virtual machine using the QMP
command "stop" during a paused postcopy migration [1].

It doesn't work because the "stop the VM" operation requires the hypervisor
to kick all the vcpu threads out using SIG_IPI in QEMU (which is translated
to SIGUSR1).  However, during a paused postcopy the vcpu threads are hung
at handle_userfault(), so they simply don't respond to the kicks.  Worse,
the "stop" command then hangs the QMP channel as well.
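
As a userspace analogy for what a kick relies on, here is a minimal sketch
(not QEMU or kernel code) in which a blocked read() is ended by a non-fatal
SIGUSR1 and fails with EINTR.  The names wait_until_kicked() and
kick_handler() are made up for this illustration, and a forked helper
process stands in for the thread that sends the kick.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGUSR1 non-fatal. */
static void kick_handler(int sig)
{
	(void)sig;
}

/*
 * Block in read() on an empty pipe and let a helper process "kick" us with
 * SIGUSR1.  Returns the errno that ended the wait (EINTR when the kick
 * worked), 0 if read() unexpectedly succeeded, or -1 on setup failure.
 */
int wait_until_kicked(void)
{
	struct timespec delay = { 0, 10 * 1000 * 1000 };	/* 10 ms */
	struct sigaction sa;
	pid_t parent = getpid();
	pid_t kicker;
	int fds[2];
	char byte;
	ssize_t n;
	int err;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = kick_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;	/* no SA_RESTART, so read() is not restarted */
	if (sigaction(SIGUSR1, &sa, NULL) != 0 || pipe(fds) != 0)
		return -1;

	kicker = fork();
	if (kicker < 0)
		return -1;
	if (kicker == 0) {
		/* Helper: kick repeatedly, so a kick that lands early is harmless. */
		for (int i = 0; i < 1000; i++) {
			nanosleep(&delay, NULL);
			kill(parent, SIGUSR1);
		}
		_exit(0);
	}

	/* Nobody ever writes to the pipe: only a signal can end this read(). */
	n = read(fds[0], &byte, 1);
	err = (n < 0) ? errno : 0;

	/* Stop the helper and reap it, retrying if a late kick interrupts us. */
	kill(kicker, SIGKILL);
	while (waitpid(kicker, NULL, 0) < 0 && errno == EINTR)
		;
	close(fds[0]);
	close(fds[1]);
	return err;
}
```

Calling wait_until_kicked() is expected to return EINTR.  The vcpu threads
described above are in the opposite situation: their wait only gives up for
fatal signals, so the SIGUSR1 kick never ends it.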

The mm has a facility to process generic signals (FAULT_FLAG_INTERRUPTIBLE),
but it's only used in the PF handlers, not in GUP.  Unluckily, KVM is a
heavy GUP user on guest page faults.  It means that, with what we have right
now, we can't interrupt a long page fault while KVM is fetching guest
pages.

I think it's reasonable for GUP to only listen to fatal signals, as most of
the GUP users are not really ready to be interrupted by generic signals.
But KVM is not such a user: it has rich infrastructure to handle even
generic signals and properly deliver them to userspace.  Then the page
fault can be retried in the next KVM_RUN.
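
The retry-on-next-KVM_RUN idea can be sketched from the userspace side as a
toy model.  This is only an illustration under assumptions: struct
model_vcpu, model_kvm_run() and model_vcpu_loop() are hypothetical names,
and model_kvm_run() merely stands in for the real ioctl(vcpu_fd, KVM_RUN, 0),
which needs a live VM to run.

```c
#include <errno.h>

/* Hypothetical model of one vcpu; not a KVM structure. */
struct model_vcpu {
	int pending_kicks;	/* non-fatal signals that will interrupt the fault */
	int faults_resolved;	/* bumped once the slow page fault completes */
};

/*
 * Stand-in for ioctl(vcpu_fd, KVM_RUN, 0).  While a kick is pending, the
 * modelled slow page fault is abandoned and the call fails with EINTR;
 * otherwise the fault completes.
 */
static int model_kvm_run(struct model_vcpu *vcpu)
{
	if (vcpu->pending_kicks > 0) {
		vcpu->pending_kicks--;
		errno = EINTR;
		return -1;
	}
	vcpu->faults_resolved++;
	return 0;
}

/*
 * Re-enter the modelled KVM_RUN until the fault completes.  Returns how
 * many times it was interrupted first, or -1 on any other error.
 */
int model_vcpu_loop(struct model_vcpu *vcpu)
{
	int interruptions = 0;

	for (;;) {
		if (model_kvm_run(vcpu) == 0)
			return interruptions;
		if (errno != EINTR)
			return -1;
		/* Userspace would act on the kick here (e.g. honour "stop"). */
		interruptions++;
	}
}
```

With pending_kicks set to 2, model_vcpu_loop() reports two interruptions
and the fault is resolved exactly once, on the third entry.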

This patchset adds FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE,
and lets KVM be the first user of it.  KVM and mm/gup have always been able
to respond to fatal signals, but not to non-fatal ones until this patchset.
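
The rule in the paragraph above can be written down as a small userspace
model.  This is a sketch, not the code in mm/gup.c: the MODEL_* bit values,
struct model_task and both helpers are hypothetical and only mirror the
behaviour described here.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real ones live in the kernel headers. */
#define MODEL_FOLL_INTERRUPTIBLE	(1u << 0)
#define MODEL_FAULT_FLAG_INTERRUPTIBLE	(1u << 0)

/* Hypothetical summary of the signals pending for the current task. */
struct model_task {
	bool fatal_pending;
	bool nonfatal_pending;
};

/* A GUP caller's opt-in is passed down to the page fault handler. */
unsigned int model_fault_flags(unsigned int gup_flags)
{
	unsigned int fault_flags = 0;

	if (gup_flags & MODEL_FOLL_INTERRUPTIBLE)
		fault_flags |= MODEL_FAULT_FLAG_INTERRUPTIBLE;
	return fault_flags;
}

/* Should a long-running fault give up because of a pending signal? */
bool model_should_abort(const struct model_task *task, unsigned int gup_flags)
{
	if (task->fatal_pending)
		return true;	/* fatal signals are always honoured */
	if (!(gup_flags & MODEL_FOLL_INTERRUPTIBLE))
		return false;	/* without the opt-in, non-fatal ones are ignored */
	return task->nonfatal_pending;
}
```

In this model a task with only a non-fatal signal pending keeps waiting
when gup_flags is 0 and aborts once MODEL_FOLL_INTERRUPTIBLE is set, while
a fatal signal aborts the wait either way.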

One thing to mention is that this does not allow all KVM paths to respond
to non-fatal signals, only x86 slow page faults.  In the future, when more
code is ready to handle signal interruptions, we can explore having more
gup callers use FOLL_INTERRUPTIBLE.

Tests
=====

I created a postcopy environment and paused the migration by shutting down
the network to emulate a network failure (so that handle_userfault() gets
stuck for a long time), then I tried three things:

  (1) Sending QMP command "stop" to QEMU monitor,
  (2) Hitting Ctrl-C from QEMU cmdline,
  (3) Attaching GDB to the dest QEMU process.

Before this patchset, all three use cases hang.  After the patchset, all
work just like when there's no network failure at all.

Please have a look, thanks.

[1] https://gitlab.com/qemu-project/qemu/-/issues/1052

Peter Xu (3):
  mm/gup: Add FOLL_INTERRUPTIBLE
  kvm: Add new pfn error KVM_PFN_ERR_SIGPENDING
  kvm/x86: Allow to respond to generic signals during slow page faults

 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 | 16 +++++++++++--
 include/linux/kvm_host.h               | 15 ++++++++++--
 include/linux/mm.h                     |  1 +
 mm/gup.c                               | 33 ++++++++++++++++++++++----
 virt/kvm/kvm_main.c                    | 30 ++++++++++++++---------
 virt/kvm/kvm_mm.h                      |  4 ++--
 virt/kvm/pfncache.c                    |  2 +-
 10 files changed, 82 insertions(+), 25 deletions(-)

Comments

Peter Xu Aug. 10, 2022, 7:38 p.m. UTC | #1

Any further comments?  Thanks,
