diff mbox series

[v5,2/5] powerpc/kprobes: Mark newly allocated probes as RO

Message ID 20191030073111.140493-3-ruscur@russell.cc (mailing list archive)
State New, archived
Headers show
Series Implement STRICT_MODULE_RWX for powerpc | expand

Commit Message

Russell Currey Oct. 30, 2019, 7:31 a.m. UTC
With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
W+X page at boot by default.  This can be tested with
CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
kernel log during boot.

powerpc doesn't implement its own alloc() for kprobes like other
architectures do, but we couldn't immediately mark RO anyway since we do
a memcpy to the page we allocate later.  After that, nothing should be
allowed to modify the page, and write permissions are removed well
before the kprobe is armed.

Thus mark newly allocated probes as read-only once it's safe to do so.

Signed-off-by: Russell Currey <ruscur@russell.cc>
---
 arch/powerpc/kernel/kprobes.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Daniel Axtens Nov. 1, 2019, 2:23 p.m. UTC | #1
Russell Currey <ruscur@russell.cc> writes:

> With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> W+X page at boot by default.  This can be tested with
> CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> kernel log during boot.
>
> powerpc doesn't implement its own alloc() for kprobes like other
> architectures do, but we couldn't immediately mark RO anyway since we do
> a memcpy to the page we allocate later.  After that, nothing should be
> allowed to modify the page, and write permissions are removed well
> before the kprobe is armed.
>
> Thus mark newly allocated probes as read-only once it's safe to do so.

So if I've got the flow right here:

register[_aggr]_kprobe
 -> prepare_kprobe
  -> arch_prepare_kprobe
       perform memcpy
       mark as readonly, after which no-one touches writes to the memory

which all seems right to me.

I have been trying to check if optprobes need special handling: it looks
like the buffer for them lives in kernel text, not dynamically allocated
memory, so it should be protected by the usual Strict RWX protections
without special treatment here.

So lgtm.

Reviewed-by: Daniel Axtens <dja@axtens.net>

Regards,
Daniel

>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> ---
>  arch/powerpc/kernel/kprobes.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 2d27ec4feee4..2610496de7c7 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -24,6 +24,7 @@
>  #include <asm/sstep.h>
>  #include <asm/sections.h>
>  #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>  
>  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
>  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> @@ -131,6 +132,8 @@ int arch_prepare_kprobe(struct kprobe *p)
>  			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
>  	}
>  
> +	set_memory_ro((unsigned long)p->ainsn.insn, 1);
> +
>  	p->ainsn.boostable = 0;
>  	return ret;
>  }
> -- 
> 2.23.0
Michael Ellerman Nov. 2, 2019, 10:45 a.m. UTC | #2
Russell Currey <ruscur@russell.cc> writes:
> With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
> W+X page at boot by default.  This can be tested with
> CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
> kernel log during boot.
>
> powerpc doesn't implement its own alloc() for kprobes like other
> architectures do, but we couldn't immediately mark RO anyway since we do
> a memcpy to the page we allocate later.  After that, nothing should be
> allowed to modify the page, and write permissions are removed well
> before the kprobe is armed.
>
> Thus mark newly allocated probes as read-only once it's safe to do so.
>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> ---
>  arch/powerpc/kernel/kprobes.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 2d27ec4feee4..2610496de7c7 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -24,6 +24,7 @@
>  #include <asm/sstep.h>
>  #include <asm/sections.h>
>  #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>  
>  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
>  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> @@ -131,6 +132,8 @@ int arch_prepare_kprobe(struct kprobe *p)
>  			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
>  	}
>  
> +	set_memory_ro((unsigned long)p->ainsn.insn, 1);
> +

That comes from:
	p->ainsn.insn = get_insn_slot();


Which ends up in __get_insn_slot() I think. And that looks very much
like it's going to hand out multiple slots per page, which isn't going
to work because you've just marked the whole page RO.

So I would expect this to crash on the 2nd kprobe that's installed. Have
you tested it somehow?

I think this code should just use patch_instruction() rather than
memcpy().

cheers
Michael Ellerman Dec. 5, 2019, 11:47 p.m. UTC | #3
Michael Ellerman <mpe@ellerman.id.au> writes:
> Russell Currey <ruscur@russell.cc> writes:
>> With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will be one
>> W+X page at boot by default.  This can be tested with
>> CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking the
>> kernel log during boot.
>>
>> powerpc doesn't implement its own alloc() for kprobes like other
>> architectures do, but we couldn't immediately mark RO anyway since we do
>> a memcpy to the page we allocate later.  After that, nothing should be
>> allowed to modify the page, and write permissions are removed well
>> before the kprobe is armed.
>>
>> Thus mark newly allocated probes as read-only once it's safe to do so.
>>
>> Signed-off-by: Russell Currey <ruscur@russell.cc>
>> ---
>>  arch/powerpc/kernel/kprobes.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
>> index 2d27ec4feee4..2610496de7c7 100644
>> --- a/arch/powerpc/kernel/kprobes.c
>> +++ b/arch/powerpc/kernel/kprobes.c
>> @@ -24,6 +24,7 @@
>>  #include <asm/sstep.h>
>>  #include <asm/sections.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/set_memory.h>
>>  
>>  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
>>  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>> @@ -131,6 +132,8 @@ int arch_prepare_kprobe(struct kprobe *p)
>>  			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
>>  	}
>>  
>> +	set_memory_ro((unsigned long)p->ainsn.insn, 1);
>> +
>
> That comes from:
> 	p->ainsn.insn = get_insn_slot();
>
>
> Which ends up in __get_insn_slot() I think. And that looks very much
> like it's going to hand out multiple slots per page, which isn't going
> to work because you've just marked the whole page RO.
>
> So I would expect this to crash on the 2nd kprobe that's installed. Have
> you tested it somehow?

I'm not sure if this is the issue I was talking about, but it doesn't
survive ftracetest:

  [ 1139.576047] ------------[ cut here ]------------
  [ 1139.576322] kernel BUG at mm/memory.c:2036!
  cpu 0x1f: Vector: 700 (Program Check) at [c000001fd6c675d0]
      pc: c00000000035d018: apply_to_page_range+0x318/0x610
      lr: c0000000000900bc: change_memory_attr+0x4c/0x70
      sp: c000001fd6c67860
     msr: 9000000000029033
    current = 0xc000001fa4a47880
    paca    = 0xc000001ffffe5c80   irqmask: 0x03   irq_happened: 0x01
      pid   = 7168, comm = ftracetest
  kernel BUG at mm/memory.c:2036!
  Linux version 5.4.0-gcc-8.2.0-11694-gf1f9aa266811 (michael@Raptor-2.ozlabs.ibm.com) (gcc version 8.2.0 (crosstool-NG 1.24.0-rc1.16-9627a04)) #1384 SMP Thu Dec 5 22:11:09 AEDT 2019
  enter ? for help
  [c000001fd6c67940] c0000000000900bc change_memory_attr+0x4c/0x70
  [c000001fd6c67970] c000000000053c48 arch_prepare_kprobe+0xb8/0x120
  [c000001fd6c679e0] c00000000022f718 register_kprobe+0x608/0x790
  [c000001fd6c67a40] c00000000022fc50 register_kretprobe+0x230/0x350
  [c000001fd6c67a80] c0000000002849b4 __register_trace_kprobe+0xf4/0x1a0
  [c000001fd6c67af0] c000000000285b18 trace_kprobe_create+0x738/0xf70
  [c000001fd6c67c30] c000000000286378 create_or_delete_trace_kprobe+0x28/0x70
  [c000001fd6c67c50] c00000000025f024 trace_run_command+0xc4/0xe0
  [c000001fd6c67ca0] c00000000025f128 trace_parse_run_command+0xe8/0x230
  [c000001fd6c67d40] c0000000002845d0 probes_write+0x20/0x40
  [c000001fd6c67d60] c0000000003eef4c __vfs_write+0x3c/0x70
  [c000001fd6c67d80] c0000000003f26a0 vfs_write+0xd0/0x200
  [c000001fd6c67dd0] c0000000003f2a3c ksys_write+0x7c/0x140
  [c000001fd6c67e20] c00000000000b9e0 system_call+0x5c/0x68
  --- Exception: c01 (System Call) at 00007fff8f06e420
  SP (7ffff93d6830) is in userspace
  1f:mon> client_loop: send disconnect: Broken pipe


Sorry I didn't get any more info on the crash, I lost the console and
then some CI bot stole the machine 8)

You should be able to reproduce just by running ftracetest.

cheers
Russell Currey Dec. 12, 2019, 6:43 a.m. UTC | #4
On Fri, 2019-12-06 at 10:47 +1100, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> > Russell Currey <ruscur@russell.cc> writes:
> > > With CONFIG_STRICT_KERNEL_RWX=y and CONFIG_KPROBES=y, there will
> > > be one
> > > W+X page at boot by default.  This can be tested with
> > > CONFIG_PPC_PTDUMP=y and CONFIG_PPC_DEBUG_WX=y set, and checking
> > > the
> > > kernel log during boot.
> > > 
> > > powerpc doesn't implement its own alloc() for kprobes like other
> > > architectures do, but we couldn't immediately mark RO anyway
> > > since we do
> > > a memcpy to the page we allocate later.  After that, nothing
> > > should be
> > > allowed to modify the page, and write permissions are removed
> > > well
> > > before the kprobe is armed.
> > > 
> > > Thus mark newly allocated probes as read-only once it's safe to
> > > do so.
> > > 
> > > Signed-off-by: Russell Currey <ruscur@russell.cc>
> > > ---
> > >  arch/powerpc/kernel/kprobes.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/arch/powerpc/kernel/kprobes.c
> > > b/arch/powerpc/kernel/kprobes.c
> > > index 2d27ec4feee4..2610496de7c7 100644
> > > --- a/arch/powerpc/kernel/kprobes.c
> > > +++ b/arch/powerpc/kernel/kprobes.c
> > > @@ -24,6 +24,7 @@
> > >  #include <asm/sstep.h>
> > >  #include <asm/sections.h>
> > >  #include <linux/uaccess.h>
> > > +#include <linux/set_memory.h>
> > >  
> > >  DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> > >  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
> > > @@ -131,6 +132,8 @@ int arch_prepare_kprobe(struct kprobe *p)
> > >  			(unsigned long)p->ainsn.insn +
> > > sizeof(kprobe_opcode_t));
> > >  	}
> > >  
> > > +	set_memory_ro((unsigned long)p->ainsn.insn, 1);
> > > +
> > 
> > That comes from:
> > 	p->ainsn.insn = get_insn_slot();
> > 
> > 
> > Which ends up in __get_insn_slot() I think. And that looks very
> > much
> > like it's going to hand out multiple slots per page, which isn't
> > going
> > to work because you've just marked the whole page RO.
> > 
> > So I would expect this to crash on the 2nd kprobe that's installed.
> > Have
> > you tested it somehow?
> 
> I'm not sure if this is the issue I was talking about, but it doesn't
> survive ftracetest:
> 
>   [ 1139.576047] ------------[ cut here ]------------
>   [ 1139.576322] kernel BUG at mm/memory.c:2036!
>   cpu 0x1f: Vector: 700 (Program Check) at [c000001fd6c675d0]
>       pc: c00000000035d018: apply_to_page_range+0x318/0x610
>       lr: c0000000000900bc: change_memory_attr+0x4c/0x70
>       sp: c000001fd6c67860
>      msr: 9000000000029033
>     current = 0xc000001fa4a47880
>     paca    = 0xc000001ffffe5c80   irqmask: 0x03   irq_happened: 0x01
>       pid   = 7168, comm = ftracetest
>   kernel BUG at mm/memory.c:2036!
>   Linux version 5.4.0-gcc-8.2.0-11694-gf1f9aa266811 (
> michael@Raptor-2.ozlabs.ibm.com) (gcc version 8.2.0 (crosstool-NG
> 1.24.0-rc1.16-9627a04)) #1384 SMP Thu Dec 5 22:11:09 AEDT 2019
>   enter ? for help
>   [c000001fd6c67940] c0000000000900bc change_memory_attr+0x4c/0x70
>   [c000001fd6c67970] c000000000053c48 arch_prepare_kprobe+0xb8/0x120
>   [c000001fd6c679e0] c00000000022f718 register_kprobe+0x608/0x790
>   [c000001fd6c67a40] c00000000022fc50 register_kretprobe+0x230/0x350
>   [c000001fd6c67a80] c0000000002849b4
> __register_trace_kprobe+0xf4/0x1a0
>   [c000001fd6c67af0] c000000000285b18 trace_kprobe_create+0x738/0xf70
>   [c000001fd6c67c30] c000000000286378
> create_or_delete_trace_kprobe+0x28/0x70
>   [c000001fd6c67c50] c00000000025f024 trace_run_command+0xc4/0xe0
>   [c000001fd6c67ca0] c00000000025f128
> trace_parse_run_command+0xe8/0x230
>   [c000001fd6c67d40] c0000000002845d0 probes_write+0x20/0x40
>   [c000001fd6c67d60] c0000000003eef4c __vfs_write+0x3c/0x70
>   [c000001fd6c67d80] c0000000003f26a0 vfs_write+0xd0/0x200
>   [c000001fd6c67dd0] c0000000003f2a3c ksys_write+0x7c/0x140
>   [c000001fd6c67e20] c00000000000b9e0 system_call+0x5c/0x68
>   --- Exception: c01 (System Call) at 00007fff8f06e420
>   SP (7ffff93d6830) is in userspace
>   1f:mon> client_loop: send disconnect: Broken pipe
> 
> 
> Sorry I didn't get any more info on the crash, I lost the console and
> then some CI bot stole the machine 8)
> 
> You should be able to reproduce just by running ftracetest.

The test that blew it up was test.d/kprobe/probepoint.tc for the
record.  It goes away when replacing the memcpy with a
patch_instruction().

> 
> cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 2d27ec4feee4..2610496de7c7 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -24,6 +24,7 @@ 
 #include <asm/sstep.h>
 #include <asm/sections.h>
 #include <linux/uaccess.h>
+#include <linux/set_memory.h>
 
 DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
 DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
@@ -131,6 +132,8 @@  int arch_prepare_kprobe(struct kprobe *p)
 			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
 	}
 
+	set_memory_ro((unsigned long)p->ainsn.insn, 1);
+
 	p->ainsn.boostable = 0;
 	return ret;
 }