Message ID | c7fb2aafd8a5885e9c333957abef9e29b691098e.1457723023.git.luto@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 11/03/2016 20:06, Andy Lutomirski wrote: > This adds paravirt hooks for unsafe MSR access. On native, they > call native_{read,write}_msr. On Xen, they use > xen_{read,write}_msr_safe. > > Nothing uses them yet for ease of bisection. The next patch will > use them in rdmsrl, wrmsrl, etc. > > I intentionally didn't make them OOPS on #GP on Xen. I think that > should be done separately by the Xen maintainers. Please do the same for KVM. Paolo
On Mar 14, 2016 7:03 AM, "Paolo Bonzini" <pbonzini@redhat.com> wrote: > > > > On 11/03/2016 20:06, Andy Lutomirski wrote: > > This adds paravirt hooks for unsafe MSR access. On native, they > > call native_{read,write}_msr. On Xen, they use > > xen_{read,write}_msr_safe. > > > > Nothing uses them yet for ease of bisection. The next patch will > > use them in rdmsrl, wrmsrl, etc. > > > > I intentionally didn't make them OOPS on #GP on Xen. I think that > > should be done separately by the Xen maintainers. > > Please do the same for KVM. Can you clarify? KVM uses the native version, and the native version only oopses with this series applied if panic_on_oops is set. --Andy
On Mar 14, 2016 9:53 AM, "Andy Lutomirski" <luto@amacapital.net> wrote: > > Can you clarify? KVM uses the native version, and the native version > only oopses with this series applied if panic_on_oops is set. Can we please remove that idiocy? There is no reason to panic whatsoever. Seriously. What's the upside of that logic? Linus
On Mon, Mar 14, 2016 at 9:58 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mar 14, 2016 9:53 AM, "Andy Lutomirski" <luto@amacapital.net> wrote: >> >> Can you clarify? KVM uses the native version, and the native version >> only oopses with this series applied if panic_on_oops is set. > > Can we please remove that idiocy? > > There is no reason to panic whatsoever. Seriously. What's the upside of that > logic? I imagine that people who set panic_on_oops want their systems to stop running user code if something happens that could corrupt the state or if there's any sign that user code is trying some non-deterministic exploit. So I'm guessing that they'd want this type of "the kernel screwed up -- abort" to actually result in a panic. As a concrete, although somewhat silly, example, suppose that a write to MSR_SYSENTER_STACK fails. If that happened, then user code could subsequently try to take over the kernel by evil manipulation of TF and/or perf. I'd be okay with removing this too, though, since arranging for MSR access to fail seems unlikely as an exploit vector. Borislav: SUSE actually uses panic_on_oops, right? What's their goal? --Andy
On 14/03/2016 18:02, Andy Lutomirski wrote: > On Mon, Mar 14, 2016 at 9:58 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> On Mar 14, 2016 9:53 AM, "Andy Lutomirski" <luto@amacapital.net> wrote: >>> >>> Can you clarify? KVM uses the native version, and the native version >>> only oopses with this series applied if panic_on_oops is set. >> >> Can we please remove that idiocy? >> >> There is no reason to panic whatsoever. Seriously. What's the upside of that >> logic? > > I imagine that people who set panic_on_oops want their systems to stop > running user code if something happens that could corrupt the state or > if there's any sign that user code is trying some non-deterministic > exploit. So I'm guessing that they'd want this type of "the kernel > screwed up -- abort" to actually result in a panic. > > As a concrete, although somewhat silly, example, suppose that a write > to MSR_SYSENTER_STACK fails. If that happened, then user code could > subsequently try to take over the kernel by evil manipulation of TF > and/or perf. > > I'd be okay with removing this too, though, since arranging for MSR > access to fail seems unlikely as an exploit vector. > > Borislav: SUSE actually uses panic_on_oops, right? What's their goal? RHEL also does, and it's mostly to trap kernel page faults before they do more damage such as filesystem corruption. The debug kernel has panic_on_oops=0, while the production kernel has panic_on_oops=1. Paolo
On 14/03/2016 17:53, Andy Lutomirski wrote: > > Please do the same for KVM. > > Can you clarify? KVM uses the native version, and the native version > only oopses with this series applied if panic_on_oops is set. Since you fixed the panic_on_oops thing, I'm okay with letting KVM use the unsafe version and seeing what breaks... Paolo
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 1487054a1a70..13da359881d7 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -119,8 +119,9 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr, return EAX_EDX_VAL(val, low, high); } -static inline void native_write_msr(unsigned int msr, - unsigned low, unsigned high) +/* Can be uninlined because referenced by paravirt */ +notrace static inline void native_write_msr(unsigned int msr, + unsigned low, unsigned high) { asm volatile("1: wrmsr\n" "2:\n" diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 2e49228ed9a3..68297d87e85c 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -129,6 +129,17 @@ static inline void wbinvd(void) #define get_kernel_rpl() (pv_info.kernel_rpl) +static inline u64 paravirt_read_msr(unsigned msr) +{ + return PVOP_CALL1(u64, pv_cpu_ops.read_msr, msr); +} + +static inline void paravirt_write_msr(unsigned msr, + unsigned low, unsigned high) +{ + return PVOP_VCALL3(pv_cpu_ops.write_msr, msr, low, high); +} + static inline u64 paravirt_read_msr_safe(unsigned msr, int *err) { return PVOP_CALL2(u64, pv_cpu_ops.read_msr_safe, msr, err); diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index 5a06cccd36f0..b85aec45a1b8 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -155,8 +155,14 @@ struct pv_cpu_ops { void (*cpuid)(unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx); - /* MSR operations. - err = 0/-EIO. wrmsr returns 0/-EIO. */ + /* Unsafe MSR operations. These will warn or panic on failure. */ + u64 (*read_msr)(unsigned int msr); + void (*write_msr)(unsigned int msr, unsigned low, unsigned high); + + /* + * Safe MSR operations. + * read sets err to 0 or -EIO. write returns 0 or -EIO. + */ u64 (*read_msr_safe)(unsigned int msr, int *err); int (*write_msr_safe)(unsigned int msr, unsigned low, unsigned high); diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 8aad95478ae5..f9583917c7c4 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -339,6 +339,8 @@ __visible struct pv_cpu_ops pv_cpu_ops = { .write_cr8 = native_write_cr8, #endif .wbinvd = native_wbinvd, + .read_msr = native_read_msr, + .write_msr = native_write_msr, .read_msr_safe = native_read_msr_safe, .write_msr_safe = native_write_msr_safe, .read_pmc = native_read_pmc, diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index ff2df20d0067..548cd3e02b9e 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1092,6 +1092,26 @@ static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high) return ret; } +static u64 xen_read_msr(unsigned int msr) +{ + /* + * This will silently swallow a #GP from RDMSR. It may be worth + * changing that. + */ + int err; + + return xen_read_msr_safe(msr, &err); +} + +static void xen_write_msr(unsigned int msr, unsigned low, unsigned high) +{ + /* + * This will silently swallow a #GP from WRMSR. It may be worth + * changing that. + */ + xen_write_msr_safe(msr, low, high); +} + void xen_setup_shared_info(void) { if (!xen_feature(XENFEAT_auto_translated_physmap)) { @@ -1222,6 +1242,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { .wbinvd = native_wbinvd, + .read_msr = xen_read_msr, + .write_msr = xen_write_msr, + .read_msr_safe = xen_read_msr_safe, .write_msr_safe = xen_write_msr_safe,
This adds paravirt hooks for unsafe MSR access. On native, they call native_{read,write}_msr. On Xen, they use xen_{read,write}_msr_safe. Nothing uses them yet for ease of bisection. The next patch will use them in rdmsrl, wrmsrl, etc. I intentionally didn't make them OOPS on #GP on Xen. I think that should be done separately by the Xen maintainers. Signed-off-by: Andy Lutomirski <luto@kernel.org> --- arch/x86/include/asm/msr.h | 5 +++-- arch/x86/include/asm/paravirt.h | 11 +++++++++++ arch/x86/include/asm/paravirt_types.h | 10 ++++++++-- arch/x86/kernel/paravirt.c | 2 ++ arch/x86/xen/enlighten.c | 23 +++++++++++++++++++++++ 5 files changed, 47 insertions(+), 4 deletions(-)