diff mbox

[4/4] refcount: Report failures through CHECK_DATA_CORRUPTION

Message ID 1486164412-7338-5-git-send-email-keescook@chromium.org (mailing list archive)
State New, archived
Headers show

Commit Message

Kees Cook Feb. 3, 2017, 11:26 p.m. UTC
This converts from WARN_ON() to CHECK_DATA_CORRUPTION() in the
CONFIG_DEBUG_REFCOUNT case. Additionally moves refcount_t sanity check
conditionals into regular function flow. Since CHECK_DATA_CORRUPTION()
is marked __much_check, we override few cases where the failure has
already been handled but we want to explicitly report it.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/refcount.h | 35 ++++++++++++++++++++++-------------
 lib/Kconfig.debug        |  2 ++
 2 files changed, 24 insertions(+), 13 deletions(-)

Comments

Peter Zijlstra Feb. 5, 2017, 3:40 p.m. UTC | #1
On Fri, Feb 03, 2017 at 03:26:52PM -0800, Kees Cook wrote:
> This converts from WARN_ON() to CHECK_DATA_CORRUPTION() in the
> CONFIG_DEBUG_REFCOUNT case. Additionally moves refcount_t sanity check
> conditionals into regular function flow. Since CHECK_DATA_CORRUPTION()
> is marked __much_check, we override few cases where the failure has
> already been handled but we want to explicitly report it.
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  include/linux/refcount.h | 35 ++++++++++++++++++++++-------------
>  lib/Kconfig.debug        |  2 ++
>  2 files changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/refcount.h b/include/linux/refcount.h
> index 5b89cad62237..ef32910c7dd8 100644
> --- a/include/linux/refcount.h
> +++ b/include/linux/refcount.h
> @@ -43,10 +43,10 @@
>  #include <linux/spinlock.h>
>  
>  #if CONFIG_DEBUG_REFCOUNT
> -#define REFCOUNT_WARN(cond, str) WARN_ON(cond)
> +#define REFCOUNT_CHECK(cond, str) CHECK_DATA_CORRUPTION(cond, str)

OK, so that goes back to a full WARN() which will make the generated
code gigantic due to the whole printk() trainwreck :/
Kees Cook Feb. 5, 2017, 11:33 p.m. UTC | #2
On Sun, Feb 5, 2017 at 7:40 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Feb 03, 2017 at 03:26:52PM -0800, Kees Cook wrote:
>> This converts from WARN_ON() to CHECK_DATA_CORRUPTION() in the
>> CONFIG_DEBUG_REFCOUNT case. Additionally moves refcount_t sanity check
>> conditionals into regular function flow. Since CHECK_DATA_CORRUPTION()
>> is marked __much_check, we override few cases where the failure has
>> already been handled but we want to explicitly report it.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  include/linux/refcount.h | 35 ++++++++++++++++++++++-------------
>>  lib/Kconfig.debug        |  2 ++
>>  2 files changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/include/linux/refcount.h b/include/linux/refcount.h
>> index 5b89cad62237..ef32910c7dd8 100644
>> --- a/include/linux/refcount.h
>> +++ b/include/linux/refcount.h
>> @@ -43,10 +43,10 @@
>>  #include <linux/spinlock.h>
>>
>>  #if CONFIG_DEBUG_REFCOUNT
>> -#define REFCOUNT_WARN(cond, str) WARN_ON(cond)
>> +#define REFCOUNT_CHECK(cond, str) CHECK_DATA_CORRUPTION(cond, str)
>
> OK, so that goes back to a full WARN() which will make the generated
> code gigantic due to the whole printk() trainwreck :/

Hrm, perhaps we need three levels? WARN_ON, WARN, and BUG?

-Kees
Peter Zijlstra Feb. 6, 2017, 8:57 a.m. UTC | #3
On Sun, Feb 05, 2017 at 03:33:36PM -0800, Kees Cook wrote:
> On Sun, Feb 5, 2017 at 7:40 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Feb 03, 2017 at 03:26:52PM -0800, Kees Cook wrote:
> >> This converts from WARN_ON() to CHECK_DATA_CORRUPTION() in the
> >> CONFIG_DEBUG_REFCOUNT case. Additionally moves refcount_t sanity check
> >> conditionals into regular function flow. Since CHECK_DATA_CORRUPTION()
> >> is marked __much_check, we override few cases where the failure has
> >> already been handled but we want to explicitly report it.
> >>
> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >> ---
> >>  include/linux/refcount.h | 35 ++++++++++++++++++++++-------------
> >>  lib/Kconfig.debug        |  2 ++
> >>  2 files changed, 24 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/include/linux/refcount.h b/include/linux/refcount.h
> >> index 5b89cad62237..ef32910c7dd8 100644
> >> --- a/include/linux/refcount.h
> >> +++ b/include/linux/refcount.h
> >> @@ -43,10 +43,10 @@
> >>  #include <linux/spinlock.h>
> >>
> >>  #if CONFIG_DEBUG_REFCOUNT
> >> -#define REFCOUNT_WARN(cond, str) WARN_ON(cond)
> >> +#define REFCOUNT_CHECK(cond, str) CHECK_DATA_CORRUPTION(cond, str)
> >
> > OK, so that goes back to a full WARN() which will make the generated
> > code gigantic due to the whole printk() trainwreck :/
> 
> Hrm, perhaps we need three levels? WARN_ON, WARN, and BUG?

Did consider that, didn't really know if that made sense.

Like I wrote, ideally we'd end up using something like the x86 exception
table with a custom handler. Just no idea how to pull that off without
doing a full blown arch specific implementation, so I didn't go there
quite yet.


That way refcount_inc() would end up being inlined to something like:

        mov    0x148(%rdi),%eax
        jmp    2f
  1:    lock cmpxchg %edx,0x148(%rdi)
        je     4f
  2:    lea    -0x1(%rax),%ecx
        lea    0x1(%rax),%edx
        cmp    $0xfffffffd,%ecx
        jbe    1b
  3:    ud2
  4:

	_ASM_EXTABLE_HANDLE(3b, 4b, ex_handler_refcount_inc)


where:

bool ex_handler_refcount_inc(const struct exception_table_entry *fixup,
			     struct pt_regs *regs, int trapnr)
{
	regs->ip = ex_fixup_addr(fixup);

	if (!regs->ax)
		WARN(1, "refcount_t: increment on 0; use-after-free.\n");
	else
		WARN(1, "refcount_t: saturated; leaking memory.\n");

	return true;
}

and the handler is shared between all instances and can be as big and
fancy as we'd like.
Kees Cook Feb. 6, 2017, 4:54 p.m. UTC | #4
On Mon, Feb 6, 2017 at 12:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Sun, Feb 05, 2017 at 03:33:36PM -0800, Kees Cook wrote:
>> On Sun, Feb 5, 2017 at 7:40 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Fri, Feb 03, 2017 at 03:26:52PM -0800, Kees Cook wrote:
>> >> This converts from WARN_ON() to CHECK_DATA_CORRUPTION() in the
>> >> CONFIG_DEBUG_REFCOUNT case. Additionally moves refcount_t sanity check
>> >> conditionals into regular function flow. Since CHECK_DATA_CORRUPTION()
>> >> is marked __much_check, we override few cases where the failure has
>> >> already been handled but we want to explicitly report it.
>> >>
>> >> Signed-off-by: Kees Cook <keescook@chromium.org>
>> >> ---
>> >>  include/linux/refcount.h | 35 ++++++++++++++++++++++-------------
>> >>  lib/Kconfig.debug        |  2 ++
>> >>  2 files changed, 24 insertions(+), 13 deletions(-)
>> >>
>> >> diff --git a/include/linux/refcount.h b/include/linux/refcount.h
>> >> index 5b89cad62237..ef32910c7dd8 100644
>> >> --- a/include/linux/refcount.h
>> >> +++ b/include/linux/refcount.h
>> >> @@ -43,10 +43,10 @@
>> >>  #include <linux/spinlock.h>
>> >>
>> >>  #if CONFIG_DEBUG_REFCOUNT
>> >> -#define REFCOUNT_WARN(cond, str) WARN_ON(cond)
>> >> +#define REFCOUNT_CHECK(cond, str) CHECK_DATA_CORRUPTION(cond, str)
>> >
>> > OK, so that goes back to a full WARN() which will make the generated
>> > code gigantic due to the whole printk() trainwreck :/
>>
>> Hrm, perhaps we need three levels? WARN_ON, WARN, and BUG?
>
> Did consider that, didn't really know if that made sense.
>
> Like I wrote, ideally we'd end up using something like the x86 exception
> table with a custom handler. Just no idea how to pull that off without
> doing a full blown arch specific implementation, so I didn't go there
> quite yet.

I haven't spent much time looking at the extable stuff. (Though
coincidentally, I was poking at it for x86's test_nx stuff...) I
thought there was a way to build arch-agnostic extables already?
kernel/extable.c is unconditionally built-in, for example.

> That way refcount_inc() would end up being inlined to something like:
>
>         mov    0x148(%rdi),%eax
>         jmp    2f
>   1:    lock cmpxchg %edx,0x148(%rdi)
>         je     4f
>   2:    lea    -0x1(%rax),%ecx
>         lea    0x1(%rax),%edx
>         cmp    $0xfffffffd,%ecx
>         jbe    1b
>   3:    ud2
>   4:
>
>         _ASM_EXTABLE_HANDLE(3b, 4b, ex_handler_refcount_inc)
>
>
> where:
>
> bool ex_handler_refcount_inc(const struct exception_table_entry *fixup,
>                              struct pt_regs *regs, int trapnr)
> {
>         regs->ip = ex_fixup_addr(fixup);
>
>         if (!regs->ax)
>                 WARN(1, "refcount_t: increment on 0; use-after-free.\n");
>         else
>                 WARN(1, "refcount_t: saturated; leaking memory.\n");
>
>         return true;
> }
>
> and the handler is shared between all instances and can be as big and
> fancy as we'd like.

I'll dig a bit to see what I can build. Can you add the lkdtm tests to
the series, though? That should be fine as-is.

Thanks!

-Kees
Peter Zijlstra Feb. 7, 2017, 8:34 a.m. UTC | #5
On Mon, Feb 06, 2017 at 08:54:38AM -0800, Kees Cook wrote:
> >
> > Like I wrote, ideally we'd end up using something like the x86 exception
> > table with a custom handler. Just no idea how to pull that off without
> > doing a full blown arch specific implementation, so I didn't go there
> > quite yet.
> 
> I haven't spent much time looking at the extable stuff. (Though
> coincidentally, I was poking at it for x86's test_nx stuff...) I
> thought there was a way to build arch-agnostic extables already?
> kernel/extable.c is unconditionally built-in, for example.

That doesn't seem to be of much use. It only contains section sort and
search functions.

Another problem for generic code would be to figure out what register
the relevant variable would live in at the time of exception. Here its
'obviously' EAX because that's what cmpxchg requires, but in generic
you'd need a means of querying GCC's register allocator at the exception
point and somehow using that information for the generation of the
exception handler.

AFAIK that's not something GCC can do.

> > and the handler is shared between all instances and can be as big and
> > fancy as we'd like.
> 
> I'll dig a bit to see what I can build. Can you add the lkdtm tests to
> the series, though? That should be fine as-is.

Yes, I did. I also did the 's/--help--/help/' and 's/#if/#ifdef/' thing.

Thanks!
Mark Rutland Feb. 7, 2017, 11:10 a.m. UTC | #6
On Tue, Feb 07, 2017 at 09:34:05AM +0100, Peter Zijlstra wrote:
> On Mon, Feb 06, 2017 at 08:54:38AM -0800, Kees Cook wrote:
> > >
> > > Like I wrote, ideally we'd end up using something like the x86 exception
> > > table with a custom handler. Just no idea how to pull that off without
> > > doing a full blown arch specific implementation, so I didn't go there
> > > quite yet.
> > 
> > I haven't spent much time looking at the extable stuff. (Though
> > coincidentally, I was poking at it for x86's test_nx stuff...) I
> > thought there was a way to build arch-agnostic extables already?
> > kernel/extable.c is unconditionally built-in, for example.
> 
> That doesn't seem to be of much use. It only contains section sort and
> search functions.
> 
> Another problem for generic code would be to figure out what register
> the relevant variable would live in at the time of exception. Here its
> 'obviously' EAX because that's what cmpxchg requires, but in generic
> you'd need a means of querying GCC's register allocator at the exception
> point and somehow using that information for the generation of the
> exception handler.

I think we only need two arch-specific primitives:
(a) mangle a GCC assigned register into an idx stored in the extable
(b) take said index, and grab the relevant register from pt_regs

Then you can have a BUG_VALUE(v, ...), where we use an input "r" (val),
and mangle that into the idx in the extable. In the common case, I'd
hope GCC would leave the register in-place from the cmpxchg.

... or have I misundertood? :)

Thanks,
Mark.
Peter Zijlstra Feb. 7, 2017, 12:36 p.m. UTC | #7
On Tue, Feb 07, 2017 at 11:10:12AM +0000, Mark Rutland wrote:
> On Tue, Feb 07, 2017 at 09:34:05AM +0100, Peter Zijlstra wrote:
> > On Mon, Feb 06, 2017 at 08:54:38AM -0800, Kees Cook wrote:
> > > >
> > > > Like I wrote, ideally we'd end up using something like the x86 exception
> > > > table with a custom handler. Just no idea how to pull that off without
> > > > doing a full blown arch specific implementation, so I didn't go there
> > > > quite yet.
> > > 
> > > I haven't spent much time looking at the extable stuff. (Though
> > > coincidentally, I was poking at it for x86's test_nx stuff...) I
> > > thought there was a way to build arch-agnostic extables already?
> > > kernel/extable.c is unconditionally built-in, for example.
> > 
> > That doesn't seem to be of much use. It only contains section sort and
> > search functions.
> > 
> > Another problem for generic code would be to figure out what register
> > the relevant variable would live in at the time of exception. Here its
> > 'obviously' EAX because that's what cmpxchg requires, but in generic
> > you'd need a means of querying GCC's register allocator at the exception
> > point and somehow using that information for the generation of the
> > exception handler.
> 
> I think we only need two arch-specific primitives:
> (a) mangle a GCC assigned register into an idx stored in the extable
> (b) take said index, and grab the relevant register from pt_regs
> 
> Then you can have a BUG_VALUE(v, ...), where we use an input "r" (val),
> and mangle that into the idx in the extable. In the common case, I'd
> hope GCC would leave the register in-place from the cmpxchg.
> 
> ... or have I misundertood? :)

Right something along those lines. (a) will need GCC help, and (b) would
be kernel-arch specific. So this isn't something we can quickly do.
Mark Rutland Feb. 7, 2017, 1:50 p.m. UTC | #8
On Tue, Feb 07, 2017 at 01:36:30PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 07, 2017 at 11:10:12AM +0000, Mark Rutland wrote:
> > On Tue, Feb 07, 2017 at 09:34:05AM +0100, Peter Zijlstra wrote:
> > > On Mon, Feb 06, 2017 at 08:54:38AM -0800, Kees Cook wrote:
> > > > >
> > > > > Like I wrote, ideally we'd end up using something like the x86 exception
> > > > > table with a custom handler. Just no idea how to pull that off without
> > > > > doing a full blown arch specific implementation, so I didn't go there
> > > > > quite yet.
> > > > 
> > > > I haven't spent much time looking at the extable stuff. (Though
> > > > coincidentally, I was poking at it for x86's test_nx stuff...) I
> > > > thought there was a way to build arch-agnostic extables already?
> > > > kernel/extable.c is unconditionally built-in, for example.
> > > 
> > > That doesn't seem to be of much use. It only contains section sort and
> > > search functions.
> > > 
> > > Another problem for generic code would be to figure out what register
> > > the relevant variable would live in at the time of exception. Here its
> > > 'obviously' EAX because that's what cmpxchg requires, but in generic
> > > you'd need a means of querying GCC's register allocator at the exception
> > > point and somehow using that information for the generation of the
> > > exception handler.
> > 
> > I think we only need two arch-specific primitives:
> > (a) mangle a GCC assigned register into an idx stored in the extable
> > (b) take said index, and grab the relevant register from pt_regs
> > 
> > Then you can have a BUG_VALUE(v, ...), where we use an input "r" (val),
> > and mangle that into the idx in the extable. In the common case, I'd
> > hope GCC would leave the register in-place from the cmpxchg.
> > 
> > ... or have I misundertood? :)
> 
> Right something along those lines. (a) will need GCC help, and (b) would
> be kernel-arch specific. So this isn't something we can quickly do.

I agree this isn't something that can be hacked together quickly, and
certainly shouldn't block these patches.

However, I don't think we need anything new from GCC, and I think we
already have a generic API for (b).

For (a) we don't need new GCC help if we do something like we did in
commit 72c5839515260dce to do the mangling. Prepend a prefix to the
register, e.g. changing 'x0' to '__pt_regs_offset_x0', which we arrange
to hold the correct value.

For (b) we already have regs_get_register().

Thanks,
Mark.
Peter Zijlstra Feb. 7, 2017, 3:07 p.m. UTC | #9
On Tue, Feb 07, 2017 at 01:50:20PM +0000, Mark Rutland wrote:
> > Right something along those lines. (a) will need GCC help, and (b) would
> > be kernel-arch specific. So this isn't something we can quickly do.
> 
> I agree this isn't something that can be hacked together quickly, and
> certainly shouldn't block these patches.
> 
> However, I don't think we need anything new from GCC, and I think we
> already have a generic API for (b).
> 
> For (a) we don't need new GCC help if we do something like we did in
> commit 72c5839515260dce to do the mangling. Prepend a prefix to the
> register, e.g. changing 'x0' to '__pt_regs_offset_x0', which we arrange
> to hold the correct value.

I'm not sure I can decipher that commit and therefore have no idea if
something similar can be done for other architectures.
Mark Rutland Feb. 7, 2017, 4:03 p.m. UTC | #10
On Tue, Feb 07, 2017 at 04:07:37PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 07, 2017 at 01:50:20PM +0000, Mark Rutland wrote:
> > > Right something along those lines. (a) will need GCC help, and (b) would
> > > be kernel-arch specific. So this isn't something we can quickly do.
> > 
> > I agree this isn't something that can be hacked together quickly, and
> > certainly shouldn't block these patches.
> > 
> > However, I don't think we need anything new from GCC, and I think we
> > already have a generic API for (b).
> > 
> > For (a) we don't need new GCC help if we do something like we did in
> > commit 72c5839515260dce to do the mangling. Prepend a prefix to the
> > register, e.g. changing 'x0' to '__pt_regs_offset_x0', which we arrange
> > to hold the correct value.
> 
> I'm not sure I can decipher that commit and therefore have no idea if
> something similar can be done for other architectures.

For x86 it's a little painful due to '%' in the register names, but it looks
possible. The below appears to do the mangling correctly (then screams due to
the mangled result being nonexistent).

Thanks,
Mark.

---->8----
#define cmpxchg(ptr, old, new)						\
({									\
	typeof(*ptr) __ret;						\
	typeof(*ptr) __old = (old);					\
	typeof(*ptr) __new = (new);					\
									\
	volatile unsigned int *__ptr = (volatile unsigned int *)ptr;	\
	asm volatile("cmpxchgl %2, %1"					\
		     : "=a" (__ret), "+m" (*__ptr)			\
		     : "r" (__new), "0" (__old)				\
		     : "memory");					\
	__ret;								\
})

asm(
"	.macro	reg_to_offset	r\n"
"	.irp rs,eax,ebx,ecx,edx\n"
"	.ifc \\r, %\\rs\n"
"	__offset_of_\\rs\n"
"	.endif\n"
"	.endr\n"
"	.endm\n"
);

#define asm_sym(var)		asm volatile("reg_to_offset %0\n" : : "r" (var))

int foo(void)
{
	unsigned int mem = 0;
	unsigned int new;
	int bar = 7, baz = 11;

	new = cmpxchg(&mem, 1, 2);
	asm_sym(new);
}
Peter Zijlstra Feb. 7, 2017, 5:30 p.m. UTC | #11
On Tue, Feb 07, 2017 at 04:03:01PM +0000, Mark Rutland wrote:
> For x86 it's a little painful due to '%' in the register names, but it looks
> possible. The below appears to do the mangling correctly (then screams due to
> the mangled result being nonexistent).

> asm(
> "	.macro	reg_to_offset	r\n"
> "	.irp rs,eax,ebx,ecx,edx\n"
> "	.ifc \\r, %\\rs\n"
> "	__offset_of_\\rs\n"
> "	.endif\n"
> "	.endr\n"
> "	.endm\n"
> );
> 
> #define asm_sym(var)		asm volatile("reg_to_offset %0\n" : : "r" (var))

Oh gawd that's a most gnarly hack.

Do we want to go do that for all archs or somehow cook a generic
fallback that ends up doing a full function call or something?
Mark Rutland Feb. 7, 2017, 5:55 p.m. UTC | #12
On Tue, Feb 07, 2017 at 06:30:36PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 07, 2017 at 04:03:01PM +0000, Mark Rutland wrote:
> > For x86 it's a little painful due to '%' in the register names, but it looks
> > possible. The below appears to do the mangling correctly (then screams due to
> > the mangled result being nonexistent).
> 
> > asm(
> > "	.macro	reg_to_offset	r\n"
> > "	.irp rs,eax,ebx,ecx,edx\n"
> > "	.ifc \\r, %\\rs\n"
> > "	__offset_of_\\rs\n"
> > "	.endif\n"
> > "	.endr\n"
> > "	.endm\n"
> > );
> > 
> > #define asm_sym(var)		asm volatile("reg_to_offset %0\n" : : "r" (var))
> 
> Oh gawd that's a most gnarly hack.

:)

> Do we want to go do that for all archs or somehow cook a generic
> fallback that ends up doing a full function call or something?

Given the arch-specific reg->blah mapping is so "fun", I guess a generic
fallback would be a good start.

I haven't figured out all the plumbing details. It'd be nice to reuse
the bug infrastructure so that arches don't have to implement another
trap and callback pair, but I guess the reg details need to live in
another data structure.

Thanks,
Mark.
Peter Zijlstra Feb. 8, 2017, 9:12 a.m. UTC | #13
On Tue, Feb 07, 2017 at 05:55:42PM +0000, Mark Rutland wrote:
> On Tue, Feb 07, 2017 at 06:30:36PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 07, 2017 at 04:03:01PM +0000, Mark Rutland wrote:
> > > For x86 it's a little painful due to '%' in the register names, but it looks
> > > possible. The below appears to do the mangling correctly (then screams due to
> > > the mangled result being nonexistent).
> > 
> > > asm(
> > > "	.macro	reg_to_offset	r\n"
> > > "	.irp rs,eax,ebx,ecx,edx\n"
> > > "	.ifc \\r, %\\rs\n"
> > > "	__offset_of_\\rs\n"
> > > "	.endif\n"
> > > "	.endr\n"
> > > "	.endm\n"
> > > );
> > > 
> > > #define asm_sym(var)		asm volatile("reg_to_offset %0\n" : : "r" (var))
> > 
> > Oh gawd that's a most gnarly hack.
> 
> :)
> 
> > Do we want to go do that for all archs or somehow cook a generic
> > fallback that ends up doing a full function call or something?
> 
> Given the arch-specific reg->blah mapping is so "fun", I guess a generic
> fallback would be a good start.
> 
> I haven't figured out all the plumbing details. It'd be nice to reuse
> the bug infrastructure so that arches don't have to implement another
> trap and callback pair, but I guess the reg details need to live in
> another data structure.

On x86 have have __ex_table and __bug_table. The former is used for all
sorts of things, including fixing up faults.

Now, our struct exception_table_entry has a third field used to specify
a handler, see commit:

 548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options")

Also, given we trigger things with a known instruction at these sites,
the ->to field is reusable and can be used to encode the register
offset.

Still, if we want to allow a generic implementation that does a function
call, the handler prototype should probably look like:

	void exception_value(unsigned long value);

Which means the arch bits need a trampoline and we also need to encode
that. The best I've come up with is having nr_regs trampolines and
stuffing the trampoline function in the ->handler field and then using
the ->to field to encode the actual handler.

Something like:

#define EX_REG_HANDLER(_reg)					\
bool ex_handler_value_##_reg(const struct exception_table_entry *fixup, \
			    struct pt_regs *regs, int trapnr)	\
{								\
	void (*handler)(unsigned long) =			\
		(void *)((unsigned long)&fixup->to + fixup->to); \
								\
	if (trapnr != X86_TRAP_UD)				\
		return false;					\
								\
	regs->ip += 2; /* size of UD2 instruction */		\
	handler(regs->_reg);					\
	return true;						\
}

EX_REG_HANDLER(bx);
EX_REG_HANDLER(cx);
...
EX_REG_HANDLER(ss);


asm (
" .macro reg_to_handler	r\n"
" .irp rs,bx,cx,...,ss\n"
" .ifc \\r, %\\rs\n"
" ex_handler_value_\\rs\n"
" .endif\n"
" .endr\n"
" .endm\n"
);

#define EXCEPTION_VALUE(val, handler)			\
	asm volatile ("1: ud2"				\
		      _ASM_EXTABLE_HANDLE(1b, handler,	\
				     reg_to_handler %0) \
		      : : "r" (val))


Where the generic version can simply be:

#define EXCEPTION_VALUE(val, handler)	handler((unsigned long)val)


Makes sense?
Peter Zijlstra Feb. 8, 2017, 9:43 a.m. UTC | #14
On Wed, Feb 08, 2017 at 10:12:50AM +0100, Peter Zijlstra wrote:
> Something like:
> 
> #define EX_REG_HANDLER(_reg)					\
> bool ex_handler_value_##_reg(const struct exception_table_entry *fixup, \
> 			    struct pt_regs *regs, int trapnr)	\
> {								\
> 	void (*handler)(unsigned long) =			\
> 		(void *)((unsigned long)&fixup->to + fixup->to); \
> 								\
> 	if (trapnr != X86_TRAP_UD)				\
> 		return false;					\
> 								\
> 	regs->ip += 2; /* size of UD2 instruction */		\
> 	handler(regs->_reg);					\
> 	return true;						\
> }
> 
> EX_REG_HANDLER(bx);
> EX_REG_HANDLER(cx);
> ...
> EX_REG_HANDLER(ss);
> 
> 
> asm (
> " .macro reg_to_handler	r\n"
> " .irp rs,bx,cx,...,ss\n"
> " .ifc \\r, %\\rs\n"
> " ex_handler_value_\\rs\n"
> " .endif\n"
  " .ifc \\r, %e\\rs\n"
  " ex_handler_value_\\rs\n"
  " .endif\n"
  " .ifc \\r, %r\\rs\n"
  " ex_handler_value_\\rs\n"
  " .endif\n"
> " .endr\n"
> " .endm\n"
> );

to match the 16, 32 and 64 bit names of the same registers. The byte
registers will need additional magic :/


> #define EXCEPTION_VALUE(val, handler)			\
> 	asm volatile ("1: ud2"				\
> 		      _ASM_EXTABLE_HANDLE(1b, handler,	\
> 				     reg_to_handler %0) \
> 		      : : "r" (val))
> 
> 
> Where the generic version can simply be:
> 
> #define EXCEPTION_VALUE(val, handler)	handler((unsigned long)val)
> 
> 
> Makes sense?
Mark Rutland Feb. 8, 2017, 2:10 p.m. UTC | #15
On Wed, Feb 08, 2017 at 10:12:50AM +0100, Peter Zijlstra wrote:
> On x86 have have __ex_table and __bug_table. The former is used for all
> sorts of things, including fixing up faults.
> 
> Now, our struct exception_table_entry has a third field used to specify
> a handler, see commit:
> 
>  548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options")

Ah; neat!

> Still, if we want to allow a generic implementation that does a function
> call, the handler prototype should probably look like:
> 
> 	void exception_value(unsigned long value);
> 
> Which means the arch bits need a trampoline and we also need to encode
> that. The best I've come up with is having nr_regs trampolines and
> stuffing the trampoline function in the ->handler field and then using
> the ->to field to encode the actual handler.
> 
> Something like:
> 
> #define EX_REG_HANDLER(_reg)					\
> bool ex_handler_value_##_reg(const struct exception_table_entry *fixup, \
> 			    struct pt_regs *regs, int trapnr)	\
> {								\
> 	void (*handler)(unsigned long) =			\
> 		(void *)((unsigned long)&fixup->to + fixup->to); \
> 								\
> 	if (trapnr != X86_TRAP_UD)				\
> 		return false;					\
> 								\
> 	regs->ip += 2; /* size of UD2 instruction */		\
> 	handler(regs->_reg);					\
> 	return true;						\
> }
> 
> EX_REG_HANDLER(bx);
> EX_REG_HANDLER(cx);
> ...
> EX_REG_HANDLER(ss);
> 
> 
> asm (
> " .macro reg_to_handler	r\n"
> " .irp rs,bx,cx,...,ss\n"
> " .ifc \\r, %\\rs\n"
> " ex_handler_value_\\rs\n"
> " .endif\n"
> " .endr\n"
> " .endm\n"
> );
> 
> #define EXCEPTION_VALUE(val, handler)			\
> 	asm volatile ("1: ud2"				\
> 		      _ASM_EXTABLE_HANDLE(1b, handler,	\
> 				     reg_to_handler %0) \
> 		      : : "r" (val))
> 
> 
> Where the generic version can simply be:
> 
> #define EXCEPTION_VALUE(val, handler)	handler((unsigned long)val)
> 
> Makes sense?

That all makes sense to me.

I'll take a look at putting together an arm64 equivalent to the x86
extable patch, along with cleanup to our SW breakpoint code (which we
use in lieu for x86's UD2).

Thanks,
Mark.
Kees Cook Feb. 8, 2017, 9:20 p.m. UTC | #16
On Wed, Feb 8, 2017 at 1:12 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Feb 07, 2017 at 05:55:42PM +0000, Mark Rutland wrote:
>> On Tue, Feb 07, 2017 at 06:30:36PM +0100, Peter Zijlstra wrote:
>> > On Tue, Feb 07, 2017 at 04:03:01PM +0000, Mark Rutland wrote:
>> > > For x86 it's a little painful due to '%' in the register names, but it looks
>> > > possible. The below appears to do the mangling correctly (then screams due to
>> > > the mangled result being nonexistent).
>> >
>> > > asm(
>> > > " .macro  reg_to_offset   r\n"
>> > > " .irp rs,eax,ebx,ecx,edx\n"
>> > > " .ifc \\r, %\\rs\n"
>> > > " __offset_of_\\rs\n"
>> > > " .endif\n"
>> > > " .endr\n"
>> > > " .endm\n"
>> > > );
>> > >
>> > > #define asm_sym(var)              asm volatile("reg_to_offset %0\n" : : "r" (var))
>> >
>> > Oh gawd that's a most gnarly hack.
>>
>> :)
>>
>> > Do we want to go do that for all archs or somehow cook a generic
>> > fallback that ends up doing a full function call or something?
>>
>> Given the arch-specific reg->blah mapping is so "fun", I guess a generic
>> fallback would be a good start.
>>
>> I haven't figured out all the plumbing details. It'd be nice to reuse
>> the bug infrastructure so that arches don't have to implement another
>> trap and callback pair, but I guess the reg details need to live in
>> another data structure.
>
> On x86 have have __ex_table and __bug_table. The former is used for all
> sorts of things, including fixing up faults.
>
> Now, our struct exception_table_entry has a third field used to specify
> a handler, see commit:
>
>  548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options")
>
> Also, given we trigger things with a known instruction at these sites,
> the ->to field is reusable and can be used to encode the register
> offset.
>
> Still, if we want to allow a generic implementation that does a function
> call, the handler prototype should probably look like:
>
>         void exception_value(unsigned long value);
>
> Which means the arch bits need a trampoline and we also need to encode
> that. The best I've come up with is having nr_regs trampolines and
> stuffing the trampoline function in the ->handler field and then using
> the ->to field to encode the actual handler.
>
> Something like:
>
> #define EX_REG_HANDLER(_reg)                                    \
> bool ex_handler_value_##_reg(const struct exception_table_entry *fixup, \
>                             struct pt_regs *regs, int trapnr)   \
> {                                                               \
>         void (*handler)(unsigned long) =                        \
>                 (void *)((unsigned long)&fixup->to + fixup->to); \
>                                                                 \
>         if (trapnr != X86_TRAP_UD)                              \
>                 return false;                                   \
>                                                                 \
>         regs->ip += 2; /* size of UD2 instruction */            \
>         handler(regs->_reg);                                    \
>         return true;                                            \
> }
>
> EX_REG_HANDLER(bx);
> EX_REG_HANDLER(cx);
> ...
> EX_REG_HANDLER(ss);
>
>
> asm (
> " .macro reg_to_handler r\n"
> " .irp rs,bx,cx,...,ss\n"
> " .ifc \\r, %\\rs\n"
> " ex_handler_value_\\rs\n"
> " .endif\n"
> " .endr\n"
> " .endm\n"
> );
>
> #define EXCEPTION_VALUE(val, handler)                   \
>         asm volatile ("1: ud2"                          \
>                       _ASM_EXTABLE_HANDLE(1b, handler,  \
>                                      reg_to_handler %0) \
>                       : : "r" (val))
>
>
> Where the generic version can simply be:
>
> #define EXCEPTION_VALUE(val, handler)   handler((unsigned long)val)
>
>
> Makes sense?

Ooooh, that is intense. And the trampolines (EX_REG_HANDLERs) are all
just there to catch whatever register gcc decides to stuff the value
into? *cover face* Sure, okay. :)

I wonder how many existing WARN callsites could be repurposed to use this?

-Kees
Peter Zijlstra Feb. 9, 2017, 10:27 a.m. UTC | #17
On Wed, Feb 08, 2017 at 01:20:26PM -0800, Kees Cook wrote:

> Ooooh, that is intense. And the trampolines (EX_REG_HANDLERs) are all
> just there to catch whatever register gcc decides to stuff the value
> into? *cover face* Sure, okay. :)

Right, they shouldn't be big functions, but barring whole program LTO
there's just no knowing which are unused.

> I wonder how many existing WARN callsites could be repurposed to use this?

At the very least all WARN/BUG instances with trivial @format argument
that are inlined I think. For example, things like:

static inline some_function()
{
	/* ... */
	WARN(cond, "blah blah blah\n");
	/* ... */
}

where the format has no arguments. Here we can out-of-line the printk()
stuff, which, as is the purpose here, shrinks the size of the inline.
Kees Cook Feb. 10, 2017, 11:39 p.m. UTC | #18
On Thu, Feb 9, 2017 at 2:27 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Feb 08, 2017 at 01:20:26PM -0800, Kees Cook wrote:
>
>> Ooooh, that is intense. And the trampolines (EX_REG_HANDLERs) are all
>> just there to catch whatever register gcc decides to stuff the value
>> into? *cover face* Sure, okay. :)
>
> Right, they shouldn't be big functions, but barring whole program LTO
> there's just no knowing which are unused.
>
>> I wonder how many existing WARN callsites could be repurposed to use this?
>
> At the very least all WARN/BUG instances with trivial @format argument
> that are inlined I think. For example, things like:
>
> static inline some_function()
> {
>         /* ... */
>         WARN(cond, "blah blah blah\n");
>         /* ... */
> }
>
> where the format has no arguments. Here we can out-of-line the printk()
> stuff, which, as is the purpose here, shrinks the size of the inline.

Unless there is some other unholy macros trick, I think we'd need a
separate "WARN_CONST" or something macro to do this (i.e.
WARN_CONST(const, const_str) instead of WARN(cond, fmt, ...)), since
detecting a single-item vararg in a macro is very weird/impossible to
do. Hrmm.

-Kees
diff mbox

Patch

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 5b89cad62237..ef32910c7dd8 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -43,10 +43,10 @@ 
 #include <linux/spinlock.h>
 
 #if CONFIG_DEBUG_REFCOUNT
-#define REFCOUNT_WARN(cond, str) WARN_ON(cond)
+#define REFCOUNT_CHECK(cond, str) CHECK_DATA_CORRUPTION(cond, str)
 #define __refcount_check	__must_check
 #else
-#define REFCOUNT_WARN(cond, str) (void)(cond)
+#define REFCOUNT_CHECK(cond, str) (!!(cond))
 #define __refcount_check
 #endif
 
@@ -86,14 +86,18 @@  bool refcount_add_not_zero(unsigned int i, refcount_t *r)
 			break;
 	}
 
-	REFCOUNT_WARN(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
+	val = REFCOUNT_CHECK(new == UINT_MAX,
+			     "refcount_t: add saturated; leaking memory.\n");
 
 	return true;
 }
 
 static inline void refcount_add(unsigned int i, refcount_t *r)
 {
-	REFCOUNT_WARN(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
+	bool __always_unused b;
+
+	b = REFCOUNT_CHECK(!refcount_add_not_zero(i, r),
+			   "refcount_t: addition on 0; use-after-free.\n");
 }
 
 /*
@@ -121,7 +125,8 @@  bool refcount_inc_not_zero(refcount_t *r)
 			break;
 	}
 
-	REFCOUNT_WARN(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
+	val = REFCOUNT_CHECK(new == UINT_MAX,
+			     "refcount_t: inc saturated; leaking memory.\n");
 
 	return true;
 }
@@ -134,7 +139,10 @@  bool refcount_inc_not_zero(refcount_t *r)
  */
 static inline void refcount_inc(refcount_t *r)
 {
-	REFCOUNT_WARN(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+	bool __always_unused b;
+
+	b = REFCOUNT_CHECK(!refcount_inc_not_zero(r),
+			   "refcount_t: increment on 0; use-after-free.\n");
 }
 
 /*
@@ -155,10 +163,9 @@  bool refcount_sub_and_test(unsigned int i, refcount_t *r)
 			return false;
 
 		new = val - i;
-		if (new > val) {
-			REFCOUNT_WARN(new > val, "refcount_t: underflow; use-after-free.\n");
+		if (REFCOUNT_CHECK(new > val,
+				"refcount_t: sub underflow; use-after-free.\n"))
 			return false;
-		}
 
 		if (atomic_try_cmpxchg_release(&r->refs, &val, new))
 			break;
@@ -183,7 +190,10 @@  bool refcount_dec_and_test(refcount_t *r)
 static inline
 void refcount_dec(refcount_t *r)
 {
-	REFCOUNT_WARN(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
+	bool __always_unused b;
+
+	b = REFCOUNT_CHECK(refcount_dec_and_test(r),
+			   "refcount_t: decrement hit 0; leaking memory.\n");
 }
 
 /*
@@ -224,10 +234,9 @@  bool refcount_dec_not_one(refcount_t *r)
 			return false;
 
 		new = val - 1;
-		if (new > val) {
-			REFCOUNT_WARN(new > val, "refcount_t: underflow; use-after-free.\n");
+		if (REFCOUNT_CHECK(new > val,
+				"refcount_t: dec underflow; use-after-free.\n"))
 			return true;
-		}
 
 		if (atomic_try_cmpxchg_release(&r->refs, &val, new))
 			break;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 20fde8d4523a..01e7aa578456 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -731,6 +731,7 @@  source "lib/Kconfig.kasan"
 
 config DEBUG_REFCOUNT
 	bool "Verbose refcount checks"
+	depends on DEBUG_KERNEL || BUG_ON_DATA_CORRUPTION
 	help
 	  Say Y here if you want reference counters (refcount_t and kref) to
 	  generate WARNs on dubious usage. Without this refcount_t will still
@@ -2011,6 +2012,7 @@  config TEST_STATIC_KEYS
 config BUG_ON_DATA_CORRUPTION
 	bool "Trigger a BUG when data corruption is detected"
 	select DEBUG_LIST
+	select DEBUG_REFCOUNT
 	help
 	  Select this option if the kernel should BUG when it encounters
 	  data corruption in kernel memory structures when they get checked