mbox series

[GIT,PULL] KVM/rseq changes for Linux 5.15-rc3

Message ID 20210923181252.44385-1-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series [GIT,PULL] KVM/rseq changes for Linux 5.15-rc3 | expand

Pull-request

https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus-rseq

Message

Paolo Bonzini Sept. 23, 2021, 6:12 p.m. UTC
Linus,

The following changes since commit e4e737bb5c170df6135a127739a9e6148ee3da82:

  Linux 5.15-rc2 (2021-09-19 17:28:22 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus-rseq

for you to fetch changes up to 2da4a23599c263bd4a7658c2fe561cb3a73ea6ae:

  KVM: selftests: Remove __NR_userfaultfd syscall fallback (2021-09-22 10:24:02 -0400)

----------------------------------------------------------------
A fix for a bug with restartable sequences and KVM.  KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
informing rseq and leads to stale data in userspace's rseq struct.

I'm sending this as a separate pull request since it's not code
that I usually touch.  In particular, patch 2 ("entry: rseq: Call
rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
cleanup to try and make future bugs less likely.  If you prefer this to
be sent via Thomas and only in 5.16, please speak up.

----------------------------------------------------------------
Sean Christopherson (5):
      KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
      entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
      tools: Move x86 syscall number fallbacks to .../uapi/
      KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
      KVM: selftests: Remove __NR_userfaultfd syscall fallback

 arch/arm/kernel/signal.c                          |   1 -
 arch/arm64/kernel/signal.c                        |   4 +-
 arch/csky/kernel/signal.c                         |   4 +-
 arch/mips/kernel/signal.c                         |   4 +-
 arch/powerpc/kernel/signal.c                      |   4 +-
 include/linux/tracehook.h                         |   2 +
 kernel/entry/common.c                             |   4 +-
 kernel/rseq.c                                     |  14 +-
 tools/arch/x86/include/{ => uapi}/asm/unistd_32.h |   0
 tools/arch/x86/include/{ => uapi}/asm/unistd_64.h |   3 -
 tools/testing/selftests/kvm/.gitignore            |   1 +
 tools/testing/selftests/kvm/Makefile              |   3 +
 tools/testing/selftests/kvm/rseq_test.c           | 236 ++++++++++++++++++++++
 13 files changed, 258 insertions(+), 22 deletions(-)
 rename tools/arch/x86/include/{ => uapi}/asm/unistd_32.h (100%)
 rename tools/arch/x86/include/{ => uapi}/asm/unistd_64.h (83%)
 create mode 100644 tools/testing/selftests/kvm/rseq_test.c

Comments

pr-tracker-bot@kernel.org Sept. 23, 2021, 6:33 p.m. UTC | #1
The pull request you sent on Thu, 23 Sep 2021 14:12:52 -0400:

> https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus-rseq

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f10f0481a5b58f8986f626d43f8534472f7776c2

Thank you!
Linus Torvalds Sept. 23, 2021, 6:35 p.m. UTC | #2
On Thu, Sep 23, 2021 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> A fix for a bug with restartable sequences and KVM.  KVM's handling
> of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
> informing rseq and leads to stale data in userspace's rseq struct.

Ok, patches look reasonable.

> I'm sending this as a separate pull request since it's not code
> that I usually touch.  In particular, patch 2 ("entry: rseq: Call
> rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
> cleanup to try and make future bugs less likely.  If you prefer this to
> be sent via Thomas and only in 5.16, please speak up.

So I took the pull request this way, thanks for separating it like this.

But I'm adding a few people to the cc for a completely different
reason: the cleanup to move all the notify_resume stuff to
tracehook_notify_resume() is good, but it does make me go - once again
- "Hmm, that naming is really really bad".

The <linux/tracehook.h> code was literally meant for tracing. It's
where the name comes from, and it's the original intent: having a
place that you can hook into for tracing that doesn't depend on how
the core kernel code ends up changing.

But that's not how it actually acts right now. That header file is now
some very core functionality, and little of it is actually related to
tracing any more. It's more core process state handling for the user
space return path.

So I don't object to the patches, and they are merged, but I'm cc'ing people to

 (a) let them know about this (see commit a68de80f61f6: "entry: rseq:
Call rseq_handle_notify_resume() in tracehook_notify_resume()" in the
current -git tree)

 (b) possibly prod some people into perhaps moving/renaming some of
that code to actual core kernel C files, instead of a misnamed header
file..

Hmm?

         Linus
Paolo Bonzini Sept. 23, 2021, 6:44 p.m. UTC | #3
On 23/09/21 20:35, Linus Torvalds wrote:
> But I'm adding a few people to the cc for a completely different
> reason: the cleanup to move all the notify_resume stuff to
> tracehook_notify_resume() is good, but it does make me go - once again
> - "Hmm, that naming is really really bad".

Absolutely, it's even in the commit message:

---
Note, tracehook_notify_resume() is horribly named and arguably does not
belong in tracehook.h as literally every line of code in it has nothing
to do with tracing.  But, that's been true since commit a42c6ded827d
("move key_repace_session_keyring() into tracehook_notify_resume()")
first usurped tracehook_notify_resume() back in 2012.  Punt cleaning that
mess up to future patches.
---

As you point out, it's really all of the header and not just that one
function.

Paolo

> The <linux/tracehook.h> code was literally meant for tracing. It's
> where the name comes from, and it's the original intent: having a
> place that you can hook into for tracing that doesn't depend on how
> the core kernel code ends up changing.
> 
> But that's not how it actually acts right now. That header file is now
> some very core functionality, and little of it is actually related to
> tracing any more. It's more core process state handling for the user
> space return path.
> 
> So I don't object to the patches, and they are merged, but I'm cc'ing people to
> 
>   (a) let them know about this (see commit a68de80f61f6: "entry: rseq:
> Call rseq_handle_notify_resume() in tracehook_notify_resume()" in the
> current -git tree)
> 
>   (b) possibly prod some people into perhaps moving/renaming some of
> that code to actual core kernel C files, instead of a misnamed header
> file..
> 
> Hmm?
> 
>           Linus
>
Eric W. Biederman Sept. 24, 2021, 3:13 p.m. UTC | #4
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Sep 23, 2021 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> A fix for a bug with restartable sequences and KVM.  KVM's handling
>> of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
>> informing rseq and leads to stale data in userspace's rseq struct.
>
> Ok, patches look reasonable.
>
>> I'm sending this as a separate pull request since it's not code
>> that I usually touch.  In particular, patch 2 ("entry: rseq: Call
>> rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
>> cleanup to try and make future bugs less likely.  If you prefer this to
>> be sent via Thomas and only in 5.16, please speak up.
>
> So I took the pull request this way, thanks for separating it like this.
>
> But I'm adding a few people to the cc for a completely different
> reason: the cleanup to move all the notify_resume stuff to
> tracehook_notify_resume() is good, but it does make me go - once again
> - "Hmm, that naming is really really bad".
>
> The <linux/tracehook.h> code was literally meant for tracing. It's
> where the name comes from, and it's the original intent: having a
> place that you can hook into for tracing that doesn't depend on how
> the core kernel code ends up changing.
>
> But that's not how it actually acts right now. That header file is now
> some very core functionality, and little of it is actually related to
> tracing any more. It's more core process state handling for the user
> space return path.

Yes.  The tracehook header was a precursor to merging utrace which
ultimately was replaced by uprobes.  Quite a few of the tracehooks hooks
have become regular ptrace hooks over the years, and left tracehook.h

It looks like that is the path that should happen with the rest of the
hooks as well.

It looks like: tracehook_report_syscall_entry, and
tracehook_report_syscall_exit should just become
ptrace_report_syscall_entry and ptrace_report_syscall_exit.

That tracehook_signal_handler should just be inlined into it's one
caller.

That leaves set_notify_resume, tracehook_notify_resume,
tracehook_notify_signal, and set_notify_signal.

I am still waiting to hear if we can just remove
tracehook_notify_signal now that io_uring has become an ordinary process
thing.

It looks like tracehook_notify_resume should be renamed and put
somewhere I just don't know where.

The config option HAVE_ARCH_TRACEHOOK appears to have nothing to do
with the header tracehook.h any more.  It looks to be just about
regsets, and task_current_syscall.  It looks like only alpha, h8300,
m68k, and microblaze need an implementation and then we can make all
of the code that depends upon HAVE_ARCH_TRACEHOOK unconditional.


> So I don't object to the patches, and they are merged, but I'm cc'ing people to
>
>  (a) let them know about this (see commit a68de80f61f6: "entry: rseq:
> Call rseq_handle_notify_resume() in tracehook_notify_resume()" in the
> current -git tree)
>
>  (b) possibly prod some people into perhaps moving/renaming some of
> that code to actual core kernel C files, instead of a misnamed header
> file..
>
> Hmm?

It is on my radar.  Does anyone have any idea what to call
tracehook_notify_resume so that it describes it's current usage?

Eric
Paolo Bonzini Sept. 24, 2021, 4:01 p.m. UTC | #5
On 24/09/21 17:13, Eric W. Biederman wrote:
> Does anyone have any idea what to call
> tracehook_notify_resume so that it describes it's current usage?

There isn't a more precise definition than "sundry slowpaths to be 
invoked before returning to userspace".  Whenever one of the triggering 
conditions becomes true, set_notify_resume() is called and 
tracehook_notify_resume() will execute them all.

Paolo