diff mbox series

[v6] mm: Uninline copy_overflow()

Message ID e1723b9cfa924bcefcd41f69d0025b38e4c9364e.1644819985.git.christophe.leroy@csgroup.eu (mailing list archive)
State New
Headers show
Series [v6] mm: Uninline copy_overflow() | expand

Commit Message

Christophe Leroy Feb. 14, 2022, 6:26 a.m. UTC
While building a small config with CONFIG_CC_OPTIMISE_FOR_SIZE,
I ended up with more than 50 times the following function in vmlinux
because GCC doesn't honor the 'inline' keyword:

	c00243bc <copy_overflow>:
	c00243bc:	94 21 ff f0 	stwu    r1,-16(r1)
	c00243c0:	7c 85 23 78 	mr      r5,r4
	c00243c4:	7c 64 1b 78 	mr      r4,r3
	c00243c8:	3c 60 c0 62 	lis     r3,-16286
	c00243cc:	7c 08 02 a6 	mflr    r0
	c00243d0:	38 63 5e e5 	addi    r3,r3,24293
	c00243d4:	90 01 00 14 	stw     r0,20(r1)
	c00243d8:	4b ff 82 45 	bl      c001c61c <__warn_printk>
	c00243dc:	0f e0 00 00 	twui    r0,0
	c00243e0:	80 01 00 14 	lwz     r0,20(r1)
	c00243e4:	38 21 00 10 	addi    r1,r1,16
	c00243e8:	7c 08 03 a6 	mtlr    r0
	c00243ec:	4e 80 00 20 	blr

With -Winline, GCC tells:

	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call is unlikely and code size would grow [-Winline]

copy_overflow() is a non conditional warning called by
check_copy_size() on an error path.

check_copy_size() have to remain inlined in order to benefit
from constant folding, but copy_overflow() is not worth inlining.

Uninline the warning when CONFIG_BUG is selected.

When CONFIG_BUG is not selected, WARN() does nothing so skip it.

This reduces the size of vmlinux by almost 4kbytes.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v6: I should have gone sleeping yesterday night instead of sending v5 out. Sorry for the noise. Fix EXPORT_SYMBOL()

v5: Change to EXPORT_SYMBOL() instead of EXPORT_SYMBOL_GPL()

v4: Make copy_overflow() a no-op when CONFIG_BUG is not selected

v3: Added missing ; after EXPORT_SYMBOL()

v2: Added missing EXPORT_SYMBOL() and enhanced commit message
---
 include/linux/thread_info.h | 5 ++++-
 mm/maccess.c                | 6 ++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

David Laight Feb. 14, 2022, 9:23 a.m. UTC | #1
From: Christophe Leroy
> Sent: 14 February 2022 06:27
> 
...
> v5: Change to EXPORT_SYMBOL() instead of EXPORT_SYMBOL_GPL()

Thanks.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Anshuman Khandual Feb. 14, 2022, 9:54 a.m. UTC | #2
On 2/14/22 11:56 AM, Christophe Leroy wrote:
> While building a small config with CONFIG_CC_OPTIMISE_FOR_SIZE,
> I ended up with more than 50 times the following function in vmlinux
> because GCC doesn't honor the 'inline' keyword:
> 
> 	c00243bc <copy_overflow>:
> 	c00243bc:	94 21 ff f0 	stwu    r1,-16(r1)
> 	c00243c0:	7c 85 23 78 	mr      r5,r4
> 	c00243c4:	7c 64 1b 78 	mr      r4,r3
> 	c00243c8:	3c 60 c0 62 	lis     r3,-16286
> 	c00243cc:	7c 08 02 a6 	mflr    r0
> 	c00243d0:	38 63 5e e5 	addi    r3,r3,24293
> 	c00243d4:	90 01 00 14 	stw     r0,20(r1)
> 	c00243d8:	4b ff 82 45 	bl      c001c61c <__warn_printk>
> 	c00243dc:	0f e0 00 00 	twui    r0,0
> 	c00243e0:	80 01 00 14 	lwz     r0,20(r1)
> 	c00243e4:	38 21 00 10 	addi    r1,r1,16
> 	c00243e8:	7c 08 03 a6 	mtlr    r0
> 	c00243ec:	4e 80 00 20 	blr
> 
> With -Winline, GCC tells:
> 
> 	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call is unlikely and code size would grow [-Winline]
> 
> copy_overflow() is a non conditional warning called by
> check_copy_size() on an error path.
> 
> check_copy_size() have to remain inlined in order to benefit
> from constant folding, but copy_overflow() is not worth inlining.
> 
> Uninline the warning when CONFIG_BUG is selected.
> 
> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
> 
> This reduces the size of vmlinux by almost 4kbytes.

Just wondering, is this the only such scenario which results in
an avoidable bloated vmlinux image ?

> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> v6: I should have gone sleeping yesterday night instead of sending v5 out. Sorry for the noise. Fix EXPORT_SYMBOL()
> 
> v5: Change to EXPORT_SYMBOL() instead of EXPORT_SYMBOL_GPL()
> 
> v4: Make copy_overflow() a no-op when CONFIG_BUG is not selected
> 
> v3: Added missing ; after EXPORT_SYMBOL()
> 
> v2: Added missing EXPORT_SYMBOL() and enhanced commit message
> ---
>  include/linux/thread_info.h | 5 ++++-
>  mm/maccess.c                | 6 ++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index 73a6f34b3847..9f392ec76f2b 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -209,9 +209,12 @@ __bad_copy_from(void);
>  extern void __compiletime_error("copy destination size is too small")
>  __bad_copy_to(void);
>  
> +void __copy_overflow(int size, unsigned long count);
> +
>  static inline void copy_overflow(int size, unsigned long count)
>  {
> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
> +	if (IS_ENABLED(CONFIG_BUG))
> +		__copy_overflow(size, count);
>  }
>  
>  static __always_inline __must_check bool
> diff --git a/mm/maccess.c b/mm/maccess.c
> index d3f1a1f0b1c1..3fed2b876539 100644
> --- a/mm/maccess.c
> +++ b/mm/maccess.c
> @@ -335,3 +335,9 @@ long strnlen_user_nofault(const void __user *unsafe_addr, long count)
>  
>  	return ret;
>  }
> +
> +void __copy_overflow(int size, unsigned long count)
> +{
> +	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
> +}
> +EXPORT_SYMBOL(__copy_overflow);
>
David Laight Feb. 14, 2022, 11:31 a.m. UTC | #3
From: Anshuman Khandual
> Sent: 14 February 2022 09:54
...
> > With -Winline, GCC tells:
> >
> > 	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call
> is unlikely and code size would grow [-Winline]
> >
> > copy_overflow() is a non conditional warning called by
> > check_copy_size() on an error path.
> >
> > check_copy_size() have to remain inlined in order to benefit
> > from constant folding, but copy_overflow() is not worth inlining.
> >
> > Uninline the warning when CONFIG_BUG is selected.
> >
> > When CONFIG_BUG is not selected, WARN() does nothing so skip it.
> >
> > This reduces the size of vmlinux by almost 4kbytes.
> 

> > +void __copy_overflow(int size, unsigned long count);
> > +
> >  static inline void copy_overflow(int size, unsigned long count)
> >  {
> > -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
> > +	if (IS_ENABLED(CONFIG_BUG))
> > +		__copy_overflow(size, count);
> >  }

> Just wondering, is this the only such scenario which results in
> an avoidable bloated vmlinux image ?

The more interesting question is whether the call to __copy_overflow()
is actually significantly smaller than the one to WARN()?
And if so why.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Christophe Leroy Feb. 14, 2022, 1:21 p.m. UTC | #4
Le 14/02/2022 à 12:31, David Laight a écrit :
> From: Anshuman Khandual
>> Sent: 14 February 2022 09:54
> ...
>>> With -Winline, GCC tells:
>>>
>>> 	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call
>> is unlikely and code size would grow [-Winline]
>>>
>>> copy_overflow() is a non conditional warning called by
>>> check_copy_size() on an error path.
>>>
>>> check_copy_size() have to remain inlined in order to benefit
>>> from constant folding, but copy_overflow() is not worth inlining.
>>>
>>> Uninline the warning when CONFIG_BUG is selected.
>>>
>>> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
>>>
>>> This reduces the size of vmlinux by almost 4kbytes.
>>
> 
>>> +void __copy_overflow(int size, unsigned long count);
>>> +
>>>   static inline void copy_overflow(int size, unsigned long count)
>>>   {
>>> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
>>> +	if (IS_ENABLED(CONFIG_BUG))
>>> +		__copy_overflow(size, count);
>>>   }
> 
>> Just wondering, is this the only such scenario which results in
>> an avoidable bloated vmlinux image ?
> 
> The more interesting question is whether the call to __copy_overflow()
> is actually significantly smaller than the one to WARN()?
> And if so why.
> 
unsigned long tst_copy_to_user(void __user *to, unsigned long n)
{
	return copy_to_user(to, &jiffies_64, n);
}

With the patch:

00003c78 <tst_copy_to_user>:
     3c78:	28 04 00 08 	cmplwi  r4,8
     3c7c:	7c 85 23 78 	mr      r5,r4
     3c80:	41 81 00 10 	bgt     3c90 <tst_copy_to_user+0x18>
     3c84:	3c 80 00 00 	lis     r4,0
			3c86: R_PPC_ADDR16_HA	jiffies_64
     3c88:	38 84 00 00 	addi    r4,r4,0
			3c8a: R_PPC_ADDR16_LO	jiffies_64
     3c8c:	48 00 00 00 	b       3c8c <tst_copy_to_user+0x14>
			3c8c: R_PPC_REL24	_copy_to_user

     3c90:	94 21 ff f0 	stwu    r1,-16(r1)
     3c94:	7c 08 02 a6 	mflr    r0
     3c98:	38 60 00 08 	li      r3,8
     3c9c:	90 01 00 14 	stw     r0,20(r1)
     3ca0:	90 81 00 08 	stw     r4,8(r1)
     3ca4:	48 00 00 01 	bl      3ca4 <tst_copy_to_user+0x2c>
			3ca4: R_PPC_REL24	__copy_overflow
     3ca8:	80 a1 00 08 	lwz     r5,8(r1)
     3cac:	80 01 00 14 	lwz     r0,20(r1)
     3cb0:	7c a3 2b 78 	mr      r3,r5
     3cb4:	7c 08 03 a6 	mtlr    r0
     3cb8:	38 21 00 10 	addi    r1,r1,16
     3cbc:	4e 80 00 20 	blr


Without the patch:

00003c88 <tst_copy_to_user>:
     3c88:	28 04 00 08 	cmplwi  r4,8
     3c8c:	7c 85 23 78 	mr      r5,r4
     3c90:	41 81 00 10 	bgt     3ca0 <tst_copy_to_user+0x18>
     3c94:	3c 80 00 00 	lis     r4,0
			3c96: R_PPC_ADDR16_HA	jiffies_64
     3c98:	38 84 00 00 	addi    r4,r4,0
			3c9a: R_PPC_ADDR16_LO	jiffies_64
     3c9c:	48 00 00 00 	b       3c9c <tst_copy_to_user+0x14>
			3c9c: R_PPC_REL24	_copy_to_user

     3ca0:	94 21 ff f0 	stwu    r1,-16(r1)
     3ca4:	3c 60 00 00 	lis     r3,0
			3ca6: R_PPC_ADDR16_HA	.rodata.str1.4+0x30
     3ca8:	90 81 00 08 	stw     r4,8(r1)
     3cac:	7c 08 02 a6 	mflr    r0
     3cb0:	38 63 00 00 	addi    r3,r3,0
			3cb2: R_PPC_ADDR16_LO	.rodata.str1.4+0x30
     3cb4:	38 80 00 08 	li      r4,8
     3cb8:	90 01 00 14 	stw     r0,20(r1)
     3cbc:	48 00 00 01 	bl      3cbc <tst_copy_to_user+0x34>
			3cbc: R_PPC_REL24	__warn_printk
     3cc0:	80 a1 00 08 	lwz     r5,8(r1)
     3cc4:	0f e0 00 00 	twui    r0,0
     3cc8:	80 01 00 14 	lwz     r0,20(r1)
     3ccc:	7c a3 2b 78 	mr      r3,r5
     3cd0:	7c 08 03 a6 	mtlr    r0
     3cd4:	38 21 00 10 	addi    r1,r1,16
     3cd8:	4e 80 00 20 	blr
David Laight Feb. 14, 2022, 2 p.m. UTC | #5
From: Christophe Leroy
> Sent: 14 February 2022 13:21
> 
> Le 14/02/2022 à 12:31, David Laight a écrit :
> > From: Anshuman Khandual
> >> Sent: 14 February 2022 09:54
> > ...
> >>> With -Winline, GCC tells:
> >>>
> >>> 	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call
> >> is unlikely and code size would grow [-Winline]
> >>>
> >>> copy_overflow() is a non conditional warning called by
> >>> check_copy_size() on an error path.
> >>>
> >>> check_copy_size() have to remain inlined in order to benefit
> >>> from constant folding, but copy_overflow() is not worth inlining.
> >>>
> >>> Uninline the warning when CONFIG_BUG is selected.
> >>>
> >>> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
> >>>
> >>> This reduces the size of vmlinux by almost 4kbytes.
> >>
> >
> >>> +void __copy_overflow(int size, unsigned long count);
> >>> +
> >>>   static inline void copy_overflow(int size, unsigned long count)
> >>>   {
> >>> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
> >>> +	if (IS_ENABLED(CONFIG_BUG))
> >>> +		__copy_overflow(size, count);
> >>>   }
> >
> >> Just wondering, is this the only such scenario which results in
> >> an avoidable bloated vmlinux image ?
> >
> > The more interesting question is whether the call to __copy_overflow()
> > is actually significantly smaller than the one to WARN()?
> > And if so why.
> >
> unsigned long tst_copy_to_user(void __user *to, unsigned long n)
> {
> 	return copy_to_user(to, &jiffies_64, n);
> }
> 
> With the patch:
> 
> 00003c78 <tst_copy_to_user>:
>      3c78:	28 04 00 08 	cmplwi  r4,8
>      3c7c:	7c 85 23 78 	mr      r5,r4
>      3c80:	41 81 00 10 	bgt     3c90 <tst_copy_to_user+0x18>
>      3c84:	3c 80 00 00 	lis     r4,0
> 			3c86: R_PPC_ADDR16_HA	jiffies_64
>      3c88:	38 84 00 00 	addi    r4,r4,0
> 			3c8a: R_PPC_ADDR16_LO	jiffies_64
>      3c8c:	48 00 00 00 	b       3c8c <tst_copy_to_user+0x14>
> 			3c8c: R_PPC_REL24	_copy_to_user
> 
>      3c90:	94 21 ff f0 	stwu    r1,-16(r1)
>      3c94:	7c 08 02 a6 	mflr    r0
>      3c98:	38 60 00 08 	li      r3,8
>      3c9c:	90 01 00 14 	stw     r0,20(r1)
>      3ca0:	90 81 00 08 	stw     r4,8(r1)
>      3ca4:	48 00 00 01 	bl      3ca4 <tst_copy_to_user+0x2c>
> 			3ca4: R_PPC_REL24	__copy_overflow
>      3ca8:	80 a1 00 08 	lwz     r5,8(r1)
>      3cac:	80 01 00 14 	lwz     r0,20(r1)
>      3cb0:	7c a3 2b 78 	mr      r3,r5
>      3cb4:	7c 08 03 a6 	mtlr    r0
>      3cb8:	38 21 00 10 	addi    r1,r1,16
>      3cbc:	4e 80 00 20 	blr
> 
> 
> Without the patch:
> 
> 00003c88 <tst_copy_to_user>:
>      3c88:	28 04 00 08 	cmplwi  r4,8
>      3c8c:	7c 85 23 78 	mr      r5,r4
>      3c90:	41 81 00 10 	bgt     3ca0 <tst_copy_to_user+0x18>
>      3c94:	3c 80 00 00 	lis     r4,0
> 			3c96: R_PPC_ADDR16_HA	jiffies_64
>      3c98:	38 84 00 00 	addi    r4,r4,0
> 			3c9a: R_PPC_ADDR16_LO	jiffies_64
>      3c9c:	48 00 00 00 	b       3c9c <tst_copy_to_user+0x14>
> 			3c9c: R_PPC_REL24	_copy_to_user
> 
>      3ca0:	94 21 ff f0 	stwu    r1,-16(r1)
>      3ca4:	3c 60 00 00 	lis     r3,0
> 			3ca6: R_PPC_ADDR16_HA	.rodata.str1.4+0x30
>      3ca8:	90 81 00 08 	stw     r4,8(r1)
>      3cac:	7c 08 02 a6 	mflr    r0
>      3cb0:	38 63 00 00 	addi    r3,r3,0
> 			3cb2: R_PPC_ADDR16_LO	.rodata.str1.4+0x30
>      3cb4:	38 80 00 08 	li      r4,8
>      3cb8:	90 01 00 14 	stw     r0,20(r1)
>      3cbc:	48 00 00 01 	bl      3cbc <tst_copy_to_user+0x34>
> 			3cbc: R_PPC_REL24	__warn_printk
>      3cc0:	80 a1 00 08 	lwz     r5,8(r1)
>      3cc4:	0f e0 00 00 	twui    r0,0
>      3cc8:	80 01 00 14 	lwz     r0,20(r1)
>      3ccc:	7c a3 2b 78 	mr      r3,r5
>      3cd0:	7c 08 03 a6 	mtlr    r0
>      3cd4:	38 21 00 10 	addi    r1,r1,16
>      3cd8:	4e 80 00 20 	blr

I make that 3 extra instructions.
Two are needed to load the format string.
Not sure why the third gets added.

Not really significant in the 12-15 the error call actually takes.
Although a lot of those are just generating the stack frame
in order to call the error function - and wouldn't be there in
a less trivial example.

More interesting would be changing copy_overflow() to return the size.
So copy_to_user() becomes:

	if (size_valid())
		return _copy_to_user();
	return copy_overflow()

In your example that would generate a tail call in the error path.
It also avoids having to save the transfer length.

Plausibly you'll get smaller code by making the prototypes
of _copy_to_to_user() and copy_overflow() match.
But compilers don't like generating the:
	(cond ? a : b)(args)
assembler that would really be needed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Christophe Leroy Feb. 14, 2022, 2:57 p.m. UTC | #6
Le 14/02/2022 à 15:00, David Laight a écrit :
> From: Christophe Leroy
>> Sent: 14 February 2022 13:21
>>
>> Le 14/02/2022 à 12:31, David Laight a écrit :
>>> From: Anshuman Khandual
>>>> Sent: 14 February 2022 09:54
>>> ...
>>>>> With -Winline, GCC tells:
>>>>>
>>>>> 	/include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call
>>>> is unlikely and code size would grow [-Winline]
>>>>>
>>>>> copy_overflow() is a non conditional warning called by
>>>>> check_copy_size() on an error path.
>>>>>
>>>>> check_copy_size() have to remain inlined in order to benefit
>>>>> from constant folding, but copy_overflow() is not worth inlining.
>>>>>
>>>>> Uninline the warning when CONFIG_BUG is selected.
>>>>>
>>>>> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
>>>>>
>>>>> This reduces the size of vmlinux by almost 4kbytes.
>>>>
>>>
>>>>> +void __copy_overflow(int size, unsigned long count);
>>>>> +
>>>>>    static inline void copy_overflow(int size, unsigned long count)
>>>>>    {
>>>>> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
>>>>> +	if (IS_ENABLED(CONFIG_BUG))
>>>>> +		__copy_overflow(size, count);
>>>>>    }
>>>
>>>> Just wondering, is this the only such scenario which results in
>>>> an avoidable bloated vmlinux image ?
>>>
>>> The more interesting question is whether the call to __copy_overflow()
>>> is actually significantly smaller than the one to WARN()?
>>> And if so why.
>>>
>> unsigned long tst_copy_to_user(void __user *to, unsigned long n)
>> {
>> 	return copy_to_user(to, &jiffies_64, n);
>> }
>>
>> With the patch:
>>
>> 00003c78 <tst_copy_to_user>:
>>       3c78:	28 04 00 08 	cmplwi  r4,8
>>       3c7c:	7c 85 23 78 	mr      r5,r4
>>       3c80:	41 81 00 10 	bgt     3c90 <tst_copy_to_user+0x18>
>>       3c84:	3c 80 00 00 	lis     r4,0
>> 			3c86: R_PPC_ADDR16_HA	jiffies_64
>>       3c88:	38 84 00 00 	addi    r4,r4,0
>> 			3c8a: R_PPC_ADDR16_LO	jiffies_64
>>       3c8c:	48 00 00 00 	b       3c8c <tst_copy_to_user+0x14>
>> 			3c8c: R_PPC_REL24	_copy_to_user
>>
>>       3c90:	94 21 ff f0 	stwu    r1,-16(r1)
>>       3c94:	7c 08 02 a6 	mflr    r0
>>       3c98:	38 60 00 08 	li      r3,8
>>       3c9c:	90 01 00 14 	stw     r0,20(r1)
>>       3ca0:	90 81 00 08 	stw     r4,8(r1)
>>       3ca4:	48 00 00 01 	bl      3ca4 <tst_copy_to_user+0x2c>
>> 			3ca4: R_PPC_REL24	__copy_overflow
>>       3ca8:	80 a1 00 08 	lwz     r5,8(r1)
>>       3cac:	80 01 00 14 	lwz     r0,20(r1)
>>       3cb0:	7c a3 2b 78 	mr      r3,r5
>>       3cb4:	7c 08 03 a6 	mtlr    r0
>>       3cb8:	38 21 00 10 	addi    r1,r1,16
>>       3cbc:	4e 80 00 20 	blr
>>
>>
>> Without the patch:
>>
>> 00003c88 <tst_copy_to_user>:
>>       3c88:	28 04 00 08 	cmplwi  r4,8
>>       3c8c:	7c 85 23 78 	mr      r5,r4
>>       3c90:	41 81 00 10 	bgt     3ca0 <tst_copy_to_user+0x18>
>>       3c94:	3c 80 00 00 	lis     r4,0
>> 			3c96: R_PPC_ADDR16_HA	jiffies_64
>>       3c98:	38 84 00 00 	addi    r4,r4,0
>> 			3c9a: R_PPC_ADDR16_LO	jiffies_64
>>       3c9c:	48 00 00 00 	b       3c9c <tst_copy_to_user+0x14>
>> 			3c9c: R_PPC_REL24	_copy_to_user
>>
>>       3ca0:	94 21 ff f0 	stwu    r1,-16(r1)
>>       3ca4:	3c 60 00 00 	lis     r3,0
>> 			3ca6: R_PPC_ADDR16_HA	.rodata.str1.4+0x30
>>       3ca8:	90 81 00 08 	stw     r4,8(r1)
>>       3cac:	7c 08 02 a6 	mflr    r0
>>       3cb0:	38 63 00 00 	addi    r3,r3,0
>> 			3cb2: R_PPC_ADDR16_LO	.rodata.str1.4+0x30
>>       3cb4:	38 80 00 08 	li      r4,8
>>       3cb8:	90 01 00 14 	stw     r0,20(r1)
>>       3cbc:	48 00 00 01 	bl      3cbc <tst_copy_to_user+0x34>
>> 			3cbc: R_PPC_REL24	__warn_printk
>>       3cc0:	80 a1 00 08 	lwz     r5,8(r1)
>>       3cc4:	0f e0 00 00 	twui    r0,0
>>       3cc8:	80 01 00 14 	lwz     r0,20(r1)
>>       3ccc:	7c a3 2b 78 	mr      r3,r5
>>       3cd0:	7c 08 03 a6 	mtlr    r0
>>       3cd4:	38 21 00 10 	addi    r1,r1,16
>>       3cd8:	4e 80 00 20 	blr
> 
> I make that 3 extra instructions.
> Two are needed to load the format string.
> Not sure why the third gets added.

Third instruction is 'twui', to 'trap' and get the warning oops.

> 
> Not really significant in the 12-15 the error call actually takes.
> Although a lot of those are just generating the stack frame
> in order to call the error function - and wouldn't be there in
> a less trivial example.


Yes, after looking once more, maybe making it __always_inline would be 
enough.

The starting point was that I got almost 50 times copy_overflow() in my 
vmlinux, each having its own format string as well.

So my patch reduced vmlinux size by 3908 bytes.

But with __always_inline I get a reduction by 3560 which is almost the same.

So if you prefer, I can just make copy_overflow() __always_inline and voila.


> 
> More interesting would be changing copy_overflow() to return the size.
> So copy_to_user() becomes:
> 
> 	if (size_valid())
> 		return _copy_to_user();
> 	return copy_overflow()

Yes that's something to try, allthough it means changing all callers of 
check_copy_size()

Christophe
David Laight Feb. 14, 2022, 3:10 p.m. UTC | #7
From: Christophe Leroy
> Sent: 14 February 2022 14:58
...
> > I make that 3 extra instructions.
> > Two are needed to load the format string.
> > Not sure why the third gets added.
> 
> Third instruction is 'twui', to 'trap' and get the warning oops.

I wondered what that did :-)
Although you really want the -- cut here -- to contain the pr_warn().
Doesn't WARN() do that for you?

I was looking at that last week because the 'scheduling while atomic'
trace is "BUG: xxxx" but doesn't have the '--- cut here --" marker.

> > Not really significant in the 12-15 the error call actually takes.
> > Although a lot of those are just generating the stack frame
> > in order to call the error function - and wouldn't be there in
> > a less trivial example.
> 
> 
> Yes, after looking once more, maybe making it __always_inline would be
> enough.
> 
> The starting point was that I got almost 50 times copy_overflow() in my
> vmlinux, each having its own format string as well.

Didn't the linker merge the format strings?
They ought to end up in strdata.ro.1 (or whatever it is called)
and the linker merge the references.

> So my patch reduced vmlinux size by 3908 bytes.
> 
> But with __always_inline I get a reduction by 3560 which is almost the same.
> 
> So if you prefer, I can just make copy_overflow() __always_inline and voila.

I suspect #define __inline __always_inline is the way to go.
Probable along with -Winline.

The kernel shouldn't have inline sprinkled where it isn't needed.

> > More interesting would be changing copy_overflow() to return the size.
> > So copy_to_user() becomes:
> >
> > 	if (size_valid())
> > 		return _copy_to_user();
> > 	return copy_overflow()
> 
> Yes that's something to try, allthough it means changing all callers of
> check_copy_size().

You could use a differently named function so they can be changed in stages.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Christophe Leroy Feb. 14, 2022, 3:31 p.m. UTC | #8
Le 14/02/2022 à 16:10, David Laight a écrit :
> From: Christophe Leroy
>> Sent: 14 February 2022 14:58
> ...
>>> I make that 3 extra instructions.
>>> Two are needed to load the format string.
>>> Not sure why the third gets added.
>>
>> Third instruction is 'twui', to 'trap' and get the warning oops.
> 
> I wondered what that did :-)
> Although you really want the -- cut here -- to contain the pr_warn().
> Doesn't WARN() do that for you?

I remember some discussion about that in the past. Will dig into it 
tomorrow.

> 
> I was looking at that last week because the 'scheduling while atomic'
> trace is "BUG: xxxx" but doesn't have the '--- cut here --" marker.
> 
>>> Not really significant in the 12-15 the error call actually takes.
>>> Although a lot of those are just generating the stack frame
>>> in order to call the error function - and wouldn't be there in
>>> a less trivial example.
>>
>>
>> Yes, after looking once more, maybe making it __always_inline would be
>> enough.
>>
>> The starting point was that I got almost 50 times copy_overflow() in my
>> vmlinux, each having its own format string as well.
> 
> Didn't the linker merge the format strings?
> They ought to end up in strdata.ro.1 (or whatever it is called)
> and the linker merge the references.
> 
>> So my patch reduced vmlinux size by 3908 bytes.
>>
>> But with __always_inline I get a reduction by 3560 which is almost the same.
>>
>> So if you prefer, I can just make copy_overflow() __always_inline and voila.
> 
> I suspect #define __inline __always_inline is the way to go.

That was the case until 889b3c1245de ("compiler: remove 
CONFIG_OPTIMIZE_INLINING entirely")


> Probable along with -Winline.
> 
> The kernel shouldn't have inline sprinkled where it isn't needed.
> 
>>> More interesting would be changing copy_overflow() to return the size.
>>> So copy_to_user() becomes:
>>>
>>> 	if (size_valid())
>>> 		return _copy_to_user();
>>> 	return copy_overflow()
>>
>> Yes that's something to try, allthough it means changing all callers of
>> check_copy_size().
> 
> You could use a differently named function so they can be changed in stages.
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
Christophe Leroy Feb. 15, 2022, 7:42 a.m. UTC | #9
Le 14/02/2022 à 16:31, Christophe Leroy a écrit :
> 
> 
> Le 14/02/2022 à 16:10, David Laight a écrit :
>> From: Christophe Leroy
>>> Sent: 14 February 2022 14:58
>> ...
>>>> I make that 3 extra instructions.
>>>> Two are needed to load the format string.
>>>> Not sure why the third gets added.
>>>
>>> Third instruction is 'twui', to 'trap' and get the warning oops.
>>
>> I wondered what that did :-)
>> Although you really want the -- cut here -- to contain the pr_warn().
>> Doesn't WARN() do that for you?
> 
> I remember some discussion about that in the past. Will dig into it 
> tomorrow.
> 
>>
>> I was looking at that last week because the 'scheduling while atomic'
>> trace is "BUG: xxxx" but doesn't have the '--- cut here --" marker.
>>

So I looked at it. Both WARN_ON() and WARN() properly display the cut 
here line:

WARN(1, "Testing whether cut here is there");

	[   35.051548] ------------[ cut here ]------------
	[   35.051611] Testing whether cut here is there
	[   35.051665] WARNING: CPU: 0 PID: 358 at 
arch/powerpc/kernel/setup-common.c:330 show_cpuinfo+0x234/0x30c


WARN_ON(1);

	[   35.058987] ------------[ cut here ]------------
	[   35.059033] WARNING: CPU: 0 PID: 358 at 
arch/powerpc/kernel/setup-common.c:331 show_cpuinfo+0x2b0/0x30c


So yes WARN() prints the "cut here", but what the 'twui' provides you is 
everything else, the dump of all registers, call trace, instruction 
dump, etc ...

The 'twui' is after the call to __warn_printk() so everything is after 
the 'cut here'.

Then I'm not sure I understood your question.

The 'scheduling while atomic' is not generated by a WARN() but by a 
printk in function __schedule_bug() hence the absence of '--- cut here ---'

Thanks
Christophe
diff mbox series

Patch

diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 73a6f34b3847..9f392ec76f2b 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -209,9 +209,12 @@  __bad_copy_from(void);
 extern void __compiletime_error("copy destination size is too small")
 __bad_copy_to(void);
 
+void __copy_overflow(int size, unsigned long count);
+
 static inline void copy_overflow(int size, unsigned long count)
 {
-	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
+	if (IS_ENABLED(CONFIG_BUG))
+		__copy_overflow(size, count);
 }
 
 static __always_inline __must_check bool
diff --git a/mm/maccess.c b/mm/maccess.c
index d3f1a1f0b1c1..3fed2b876539 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -335,3 +335,9 @@  long strnlen_user_nofault(const void __user *unsafe_addr, long count)
 
 	return ret;
 }
+
+void __copy_overflow(int size, unsigned long count)
+{
+	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
+}
+EXPORT_SYMBOL(__copy_overflow);