diff mbox series

[v3,2/3] xen: common: add ability to enable stack protector

Message ID 20241211020424.401614-3-volodymyr_babchuk@epam.com (mailing list archive)
State New
Headers show
Series Add stack protector | expand

Commit Message

Volodymyr Babchuk Dec. 11, 2024, 2:04 a.m. UTC
Both GCC and Clang support -fstack-protector feature, which add stack
canaries to functions where stack corruption is possible. This patch
makes general preparations to enable this feature on different
supported architectures:

 - Added CONFIG_HAS_STACK_PROTECTOR option so each architecture
   can enable this feature individually
 - Added user-selectable CONFIG_STACK_PROTECTOR option
 - Implemented code that sets up random stack canary and a basic
   handler for stack protector failures

Stack guard value is initialized in three phases:

1. Pre-defined randomly-selected value.

2. Early use of linear congruent random number generator. It relies on
get_cycles() being available very early. If get_cycles() returns zero,
it would leave pre-defined value from the previous step. Even when
get_cycles() is available, it's return value may be easily predicted,
especially on embedded systems, where boot time is quite consistent.

3. After hypervisor is sufficiently initialized, stack guard can be
set-up with get_random() function, which is expected to provide better
randomness.

Also this patch adds comment to asm-generic/random.h about stack
protector dependency on it.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

---

Changes in v3:
 - Fixed coding style in stack-protector.h
 - Extended panic() message
 - Included missed random.h
 - Renamed Kconfig option
 - Used Andrew's suggestion for the Kconfig help text
 - Added "asmlinkage" attribute to __stack_chk_fail() to make Eclair
 happy
 - Initial stack guard value is random
 - Added LCG to generate stack guard value at early boot stages
 - Added comment to asm-generic/random.h about dependencies
 - Extended the commit message

Changes in v2:
 - Moved changes to EMBEDDED_EXTRA_CFLAGS into separate patch
 - Renamed stack_protector.c to stack-protector.c
 - Renamed stack_protector.h to stack-protector.h
 - Removed #ifdef CONFIG_X86 in stack-protector.h
 - Updated comment in stack-protector.h
   (also, we can't call boot_stack_chk_guard_setup() from asm code in
   general case, because it calls get_random() and get_random() may
   depend in per_cpu infrastructure, which is initialized later)
 - Fixed coding style
 - Moved CONFIG_STACK_PROTECTOR into newly added "Compiler options"
 submenu
 - Marked __stack_chk_guard as __ro_after_init
---
 xen/Makefile                      |  4 +++
 xen/common/Kconfig                | 15 ++++++++++
 xen/common/Makefile               |  1 +
 xen/common/stack-protector.c      | 47 +++++++++++++++++++++++++++++++
 xen/include/asm-generic/random.h  |  5 ++++
 xen/include/xen/stack-protector.h | 30 ++++++++++++++++++++
 6 files changed, 102 insertions(+)
 create mode 100644 xen/common/stack-protector.c
 create mode 100644 xen/include/xen/stack-protector.h

Comments

Jan Beulich Dec. 11, 2024, 8:16 a.m. UTC | #1
On 11.12.2024 03:04, Volodymyr Babchuk wrote:
> Both GCC and Clang support -fstack-protector feature, which add stack
> canaries to functions where stack corruption is possible. This patch
> makes general preparations to enable this feature on different
> supported architectures:
> 
>  - Added CONFIG_HAS_STACK_PROTECTOR option so each architecture
>    can enable this feature individually
>  - Added user-selectable CONFIG_STACK_PROTECTOR option
>  - Implemented code that sets up random stack canary and a basic
>    handler for stack protector failures
> 
> Stack guard value is initialized in three phases:
> 
> 1. Pre-defined randomly-selected value.
> 
> 2. Early use of linear congruent random number generator. It relies on
> get_cycles() being available very early. If get_cycles() returns zero,
> it would leave pre-defined value from the previous step. Even when
> get_cycles() is available, it's return value may be easily predicted,
> especially on embedded systems, where boot time is quite consistent.
> 
> 3. After hypervisor is sufficiently initialized, stack guard can be
> set-up with get_random() function, which is expected to provide better
> randomness.
> 
> Also this patch adds comment to asm-generic/random.h about stack
> protector dependency on it.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> ---
> 
> Changes in v3:
>  - Fixed coding style in stack-protector.h
>  - Extended panic() message
>  - Included missed random.h
>  - Renamed Kconfig option
>  - Used Andrew's suggestion for the Kconfig help text
>  - Added "asmlinkage" attribute to __stack_chk_fail() to make Eclair
>  happy
>  - Initial stack guard value is random
>  - Added LCG to generate stack guard value at early boot stages
>  - Added comment to asm-generic/random.h about dependencies
>  - Extended the commit message
> 
> Changes in v2:
>  - Moved changes to EMBEDDED_EXTRA_CFLAGS into separate patch
>  - Renamed stack_protector.c to stack-protector.c
>  - Renamed stack_protector.h to stack-protector.h
>  - Removed #ifdef CONFIG_X86 in stack-protector.h
>  - Updated comment in stack-protector.h
>    (also, we can't call boot_stack_chk_guard_setup() from asm code in
>    general case, because it calls get_random() and get_random() may
>    depend in per_cpu infrastructure, which is initialized later)
>  - Fixed coding style
>  - Moved CONFIG_STACK_PROTECTOR into newly added "Compiler options"
>  submenu
>  - Marked __stack_chk_guard as __ro_after_init
> ---
>  xen/Makefile                      |  4 +++
>  xen/common/Kconfig                | 15 ++++++++++
>  xen/common/Makefile               |  1 +
>  xen/common/stack-protector.c      | 47 +++++++++++++++++++++++++++++++
>  xen/include/asm-generic/random.h  |  5 ++++
>  xen/include/xen/stack-protector.h | 30 ++++++++++++++++++++
>  6 files changed, 102 insertions(+)
>  create mode 100644 xen/common/stack-protector.c
>  create mode 100644 xen/include/xen/stack-protector.h
> 
> diff --git a/xen/Makefile b/xen/Makefile
> index 34ed8c0fc7..0de0101fd0 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -432,7 +432,11 @@ else
>  CFLAGS_UBSAN :=
>  endif
>  
> +ifeq ($(CONFIG_STACK_PROTECTOR),y)
> +CFLAGS += -fstack-protector
> +else
>  CFLAGS += -fno-stack-protector
> +endif

Personally I'd prefer if we consistently used the list approach we use
in various places, whenever possible:

CFLAGS-stack-protector-y := -fno-stack-protector
CFLAGS-stack-protector-$(CONFIG_STACK_PROTECTOR) := -fstack-protector
CFLAGS += $(CFLAGS-stack-protector-y)

> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -86,6 +86,9 @@ config HAS_UBSAN
>  config HAS_VMAP
>  	bool
>  
> +config HAS_STACK_PROTECTOR
> +	bool

Please obey to alphabetic sorting in this region of the file.

> @@ -213,6 +216,18 @@ config SPECULATIVE_HARDEN_LOCK
>  
>  endmenu
>  
> +menu "Compiler options"
> +
> +config STACK_PROTECTOR
> +	bool "Stack protector"
> +	depends on HAS_STACK_PROTECTOR
> +	help
> +	  Enable the Stack Protector compiler hardening option. This inserts a
> +	  canary value in the stack frame of functions, and performs an integrity
> +	  check on exit.
> +
> +endmenu

"Compiler options" reads a little odd to me as a menu title. The preceding one
is "Speculative hardening"; how about making this one "Other hardening" or some
such?

> --- /dev/null
> +++ b/xen/common/stack-protector.c
> @@ -0,0 +1,47 @@
> +// SPDX-License-Identifier: GPL-2.0-only

Nit: I don't think we permit C++ comments as per our style.

> +#include <xen/init.h>
> +#include <xen/lib.h>
> +#include <xen/random.h>
> +#include <xen/time.h>
> +
> +/*
> + * Initial value is chosen by a fair dice roll.
> + * It will be updated during boot process.
> + */
> +#if BITS_PER_LONG == 32
> +unsigned long __ro_after_init __stack_chk_guard = 0xdd2cc927UL;
> +#else
> +unsigned long __ro_after_init __stack_chk_guard = 0x2d853605a4d9a09cUL;
> +#endif
> +
> +/* This function should be called from ASM only */

And with no (stack-protector enabled) C functions up the call stack. This
may be as easy to express in the comment as by simply adding "early".
However, considering the so far hypothetical case of offering the feature
also on x86: What about xen.efi, which from the very start uses C code?

> +void __init asmlinkage boot_stack_chk_guard_setup_early(void)
> +{
> +    /*
> +     * Linear congruent generator (X_n+1 = X_n * a + c).
> +     *
> +     * Constant is taken from "Tables Of Linear Congruential
> +     * Generators Of Different Sizes And Good Lattice Structure" by
> +     * Pierre L’Ecuyer.
> +     */
> +#if BITS_PER_LONG == 32
> +    const unsigned long a = 2891336453UL;
> +#else
> +    const unsigned long a = 2862933555777941757UL;
> +#endif
> +    const unsigned long c = 1;
> +
> +    unsigned long cycles = get_cycles();
> +
> +    /* Use the initial value if we can't generate random one */
> +    if ( !cycles )
> +	    return;

Nit: Indentation (no hard tabs please).

> +    __stack_chk_guard = cycles * a + c;
> +}
> +
> +void asmlinkage __stack_chk_fail(void)
> +{
> +    panic("Stack Protector integrity violation identified in %ps\n",
> +	  __builtin_return_address(0));

Again.

Is panic() really the right construct to use here, though?
__builtin_return_address() will merely identify the immediate caller. A
full stack trace (from BUG()) would likely be more useful in identifying
the offender.

> --- a/xen/include/asm-generic/random.h
> +++ b/xen/include/asm-generic/random.h
> @@ -2,6 +2,11 @@
>  #ifndef __ASM_GENERIC_RANDOM_H__
>  #define __ASM_GENERIC_RANDOM_H__
>  
> +/*
> + * When implementing arch_get_random(), please make sure that
> + * it can provide random data before stack protector is initialized
> + * (i.e. before boot_stack_chk_guard_setup() is called).
> + */
>  static inline unsigned int arch_get_random(void)
>  {
>      return 0;

What exactly will go (entirely) wrong if the comment isn't followed?
(I'm afraid anyway that the comment living here is easy to miss.)

> --- /dev/null
> +++ b/xen/include/xen/stack-protector.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef XEN__STACK_PROTECTOR_H
> +#define XEN__STACK_PROTECTOR_H
> +
> +#ifdef CONFIG_STACK_PROTECTOR
> +
> +#include <xen/random.h>
> +
> +extern unsigned long __stack_chk_guard;
> +
> +/*
> + * This function should be always inlined. Also it should be called
> + * from a function that never returns or a function that has
> + * stack-protector disabled.
> + */

As to the latter - that's not just "a function" but an entire call
stack that would need to have stack protector disabled.

> +static always_inline void boot_stack_chk_guard_setup(void)
> +{
> +    __stack_chk_guard = get_random();
> +    if (BITS_PER_LONG == 64)

Nit: Style (missing blanks).

> +        __stack_chk_guard |= ((unsigned long)get_random()) << 32;

Nit: No real need for the outer pair of parentheses.

Jan
Volodymyr Babchuk Dec. 12, 2024, 12:47 a.m. UTC | #2
Hello Jan,

Jan Beulich <jbeulich@suse.com> writes:

> On 11.12.2024 03:04, Volodymyr Babchuk wrote:

[...]

>
>> @@ -213,6 +216,18 @@ config SPECULATIVE_HARDEN_LOCK
>>  
>>  endmenu
>>  
>> +menu "Compiler options"
>> +
>> +config STACK_PROTECTOR
>> +	bool "Stack protector"
>> +	depends on HAS_STACK_PROTECTOR
>> +	help
>> +	  Enable the Stack Protector compiler hardening option. This inserts a
>> +	  canary value in the stack frame of functions, and performs an integrity
>> +	  check on exit.
>> +
>> +endmenu
>
> "Compiler options" reads a little odd to me as a menu title. The preceding one
> is "Speculative hardening"; how about making this one "Other hardening" or some
> such?

It was on of the Andrew's suggestion. Other was "Hardening". So yes, I
can rename it to "Other hardening", hope Andrew will be okay with this.

[...]

>> +/* This function should be called from ASM only */
>
> And with no (stack-protector enabled) C functions up the call stack. This
> may be as easy to express in the comment as by simply adding "early".

Like "This function should be called from early ASM only" ?

> However, considering the so far hypothetical case of offering the feature
> also on x86: What about xen.efi, which from the very start uses C code?
>

It depends on what other services are available. If RNG can be used
already, we don't need to call this function at all and can use
boot_stack_chk_guard_setup() right away.

>> +void __init asmlinkage boot_stack_chk_guard_setup_early(void)
>> +{

[...]

>> +void asmlinkage __stack_chk_fail(void)
>> +{
>> +    panic("Stack Protector integrity violation identified in %ps\n",
>> +	  __builtin_return_address(0));
>
> Again.
>
> Is panic() really the right construct to use here, though?
> __builtin_return_address() will merely identify the immediate caller. A
> full stack trace (from BUG()) would likely be more useful in identifying
> the offender.

Okay, I'll put just plain BUG(); here.

>
>> --- a/xen/include/asm-generic/random.h
>> +++ b/xen/include/asm-generic/random.h
>> @@ -2,6 +2,11 @@
>>  #ifndef __ASM_GENERIC_RANDOM_H__
>>  #define __ASM_GENERIC_RANDOM_H__
>>  
>> +/*
>> + * When implementing arch_get_random(), please make sure that
>> + * it can provide random data before stack protector is initialized
>> + * (i.e. before boot_stack_chk_guard_setup() is called).
>> + */
>>  static inline unsigned int arch_get_random(void)
>>  {
>>      return 0;
>
> What exactly will go (entirely) wrong if the comment isn't followed?

This will not cause immediate harm, but it will give false confidence to
anyone who enables stack protector.

I'd prefer more substantial protection, but we can't even check if
random generator is available in runtime. Taking into account that we
potential can get 0 as result of RNG, we can't even put
WARN_ON(!arch_get_random()) check.

> (I'm afraid anyway that the comment living here is easy to miss.)

I didn't found a better place for it. Maybe you can suggest one?


[...]

Thank you for the review. I'll address all your other comments.
Andrew Cooper Dec. 12, 2024, 12:52 a.m. UTC | #3
On 11/12/2024 8:16 am, Jan Beulich wrote:
> On 11.12.2024 03:04, Volodymyr Babchuk wrote:
>> Both GCC and Clang support -fstack-protector feature, which add stack
>> canaries to functions where stack corruption is possible. This patch
>> makes general preparations to enable this feature on different
>> supported architectures:
>>
>>  - Added CONFIG_HAS_STACK_PROTECTOR option so each architecture
>>    can enable this feature individually
>>  - Added user-selectable CONFIG_STACK_PROTECTOR option
>>  - Implemented code that sets up random stack canary and a basic
>>    handler for stack protector failures
>>
>> Stack guard value is initialized in three phases:
>>
>> 1. Pre-defined randomly-selected value.
>>
>> 2. Early use of linear congruent random number generator. It relies on
>> get_cycles() being available very early. If get_cycles() returns zero,
>> it would leave pre-defined value from the previous step. Even when
>> get_cycles() is available, it's return value may be easily predicted,
>> especially on embedded systems, where boot time is quite consistent.
>>
>> 3. After hypervisor is sufficiently initialized, stack guard can be
>> set-up with get_random() function, which is expected to provide better
>> randomness.
>>
>> Also this patch adds comment to asm-generic/random.h about stack
>> protector dependency on it.
>>
>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>
>> ---
>>
>> Changes in v3:
>>  - Fixed coding style in stack-protector.h
>>  - Extended panic() message
>>  - Included missed random.h
>>  - Renamed Kconfig option
>>  - Used Andrew's suggestion for the Kconfig help text
>>  - Added "asmlinkage" attribute to __stack_chk_fail() to make Eclair
>>  happy
>>  - Initial stack guard value is random
>>  - Added LCG to generate stack guard value at early boot stages
>>  - Added comment to asm-generic/random.h about dependencies
>>  - Extended the commit message
>>
>> Changes in v2:
>>  - Moved changes to EMBEDDED_EXTRA_CFLAGS into separate patch
>>  - Renamed stack_protector.c to stack-protector.c
>>  - Renamed stack_protector.h to stack-protector.h
>>  - Removed #ifdef CONFIG_X86 in stack-protector.h
>>  - Updated comment in stack-protector.h
>>    (also, we can't call boot_stack_chk_guard_setup() from asm code in
>>    general case, because it calls get_random() and get_random() may
>>    depend in per_cpu infrastructure, which is initialized later)
>>  - Fixed coding style
>>  - Moved CONFIG_STACK_PROTECTOR into newly added "Compiler options"
>>  submenu
>>  - Marked __stack_chk_guard as __ro_after_init
>> ---
>>  xen/Makefile                      |  4 +++
>>  xen/common/Kconfig                | 15 ++++++++++
>>  xen/common/Makefile               |  1 +
>>  xen/common/stack-protector.c      | 47 +++++++++++++++++++++++++++++++
>>  xen/include/asm-generic/random.h  |  5 ++++
>>  xen/include/xen/stack-protector.h | 30 ++++++++++++++++++++
>>  6 files changed, 102 insertions(+)
>>  create mode 100644 xen/common/stack-protector.c
>>  create mode 100644 xen/include/xen/stack-protector.h
>>
>> diff --git a/xen/Makefile b/xen/Makefile
>> index 34ed8c0fc7..0de0101fd0 100644
>> --- a/xen/Makefile
>> +++ b/xen/Makefile
>> @@ -432,7 +432,11 @@ else
>>  CFLAGS_UBSAN :=
>>  endif
>>  
>> +ifeq ($(CONFIG_STACK_PROTECTOR),y)
>> +CFLAGS += -fstack-protector
>> +else
>>  CFLAGS += -fno-stack-protector
>> +endif
> Personally I'd prefer if we consistently used the list approach we use
> in various places, whenever possible:
>
> CFLAGS-stack-protector-y := -fno-stack-protector
> CFLAGS-stack-protector-$(CONFIG_STACK_PROTECTOR) := -fstack-protector
> CFLAGS += $(CFLAGS-stack-protector-y)

No - please stop this antipattern.

It saves 2 lines of code and makes the logic complete unintelligible.

I have a very strong preference for this patch to happen as Volodymyr
presented, and without the double := replacing the more-legible ifeq.

>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -86,6 +86,9 @@ config HAS_UBSAN
>>  config HAS_VMAP
>>  	bool
>>  
>> +config HAS_STACK_PROTECTOR
>> +	bool
> Please obey to alphabetic sorting in this region of the file.
>
>> @@ -213,6 +216,18 @@ config SPECULATIVE_HARDEN_LOCK
>>  
>>  endmenu
>>  
>> +menu "Compiler options"
>> +
>> +config STACK_PROTECTOR
>> +	bool "Stack protector"
>> +	depends on HAS_STACK_PROTECTOR
>> +	help
>> +	  Enable the Stack Protector compiler hardening option. This inserts a
>> +	  canary value in the stack frame of functions, and performs an integrity
>> +	  check on exit.

I'd be tempted to say "on function exit" to be a little more specific.

>> +
>> +endmenu
> "Compiler options" reads a little odd to me as a menu title. The preceding one
> is "Speculative hardening"; how about making this one "Other hardening" or some
> such?

In an ideal world, we'd have "Code hardening", with speculative just
being one item in the list beside stack protector, trivial auto var
init, FineIBT, etc.

>
>> --- /dev/null
>> +++ b/xen/common/stack-protector.c
>> @@ -0,0 +1,47 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
> Nit: I don't think we permit C++ comments as per our style.
>
>> +#include <xen/init.h>
>> +#include <xen/lib.h>
>> +#include <xen/random.h>
>> +#include <xen/time.h>
>> +
>> +/*
>> + * Initial value is chosen by a fair dice roll.
>> + * It will be updated during boot process.
>> + */
>> +#if BITS_PER_LONG == 32
>> +unsigned long __ro_after_init __stack_chk_guard = 0xdd2cc927UL;
>> +#else
>> +unsigned long __ro_after_init __stack_chk_guard = 0x2d853605a4d9a09cUL;
>> +#endif
>> +
>> +/* This function should be called from ASM only */
> And with no (stack-protector enabled) C functions up the call stack. This
> may be as easy to express in the comment as by simply adding "early".
> However, considering the so far hypothetical case of offering the feature
> also on x86: What about xen.efi, which from the very start uses C code?

The necessary property is "with no functions to unwind that have an
active canary".

This is why it ends up as "from early assembly, or a noreturn function",
where really the latter is even really "has an exit which escapes canary
tracking, such as Xen's reset_stack_and_jmp()".

>
>> +void __init asmlinkage boot_stack_chk_guard_setup_early(void)
>> +{
>> +    /*
>> +     * Linear congruent generator (X_n+1 = X_n * a + c).
>> +     *
>> +     * Constant is taken from "Tables Of Linear Congruential
>> +     * Generators Of Different Sizes And Good Lattice Structure" by
>> +     * Pierre L’Ecuyer.
>> +     */
>> +#if BITS_PER_LONG == 32
>> +    const unsigned long a = 2891336453UL;
>> +#else
>> +    const unsigned long a = 2862933555777941757UL;
>> +#endif
>> +    const unsigned long c = 1;
>> +
>> +    unsigned long cycles = get_cycles();
>> +
>> +    /* Use the initial value if we can't generate random one */
>> +    if ( !cycles )
>> +	    return;
> Nit: Indentation (no hard tabs please).
>
>> +    __stack_chk_guard = cycles * a + c;
>> +}

This is much much nicer.

That said, I don't think we two different setup phases.  I'd suggest
having only this get_cycles implementation and ignore the later
get_random() setup.

It's definitely good enough for a v1, (and frankly, get_random() wants
some separate improvements anyway.)


>> +
>> +void asmlinkage __stack_chk_fail(void)
>> +{
>> +    panic("Stack Protector integrity violation identified in %ps\n",
>> +	  __builtin_return_address(0));
> Again.
>
> Is panic() really the right construct to use here, though?
> __builtin_return_address() will merely identify the immediate caller. A
> full stack trace (from BUG()) would likely be more useful in identifying
> the offender.

Well - we have to be careful, because the backtrace from here is
specifically misleading in this case.

When this trips, it's either the caller itself that broke, or some
sibling call tree which is rubble under the active stack now.

BUG() also comes with 0 information.

So, maybe we want a dump_execution_state() (to get the backtrace), and
then this panic() which states it was a Stack Protection violation,
which hopefully is enough of a hint to people to look in the sibling
call tree.

~Andrew
Jan Beulich Dec. 12, 2024, 10:45 a.m. UTC | #4
On 12.12.2024 01:52, Andrew Cooper wrote:
> On 11/12/2024 8:16 am, Jan Beulich wrote:
>> On 11.12.2024 03:04, Volodymyr Babchuk wrote:
>>> --- a/xen/Makefile
>>> +++ b/xen/Makefile
>>> @@ -432,7 +432,11 @@ else
>>>  CFLAGS_UBSAN :=
>>>  endif
>>>  
>>> +ifeq ($(CONFIG_STACK_PROTECTOR),y)
>>> +CFLAGS += -fstack-protector
>>> +else
>>>  CFLAGS += -fno-stack-protector
>>> +endif
>> Personally I'd prefer if we consistently used the list approach we use
>> in various places, whenever possible:
>>
>> CFLAGS-stack-protector-y := -fno-stack-protector
>> CFLAGS-stack-protector-$(CONFIG_STACK_PROTECTOR) := -fstack-protector
>> CFLAGS += $(CFLAGS-stack-protector-y)
> 
> No - please stop this antipattern.
> 
> It saves 2 lines of code and makes the logic complete unintelligible.
> 
> I have a very strong preference for this patch to happen as Volodymyr
> presented, and without the double := replacing the more-legible ifeq.

Why "antipattern"? Surely there are cases where the list approach is
preferable. Surely there are cases where it ends up less legible, and
this may indeed be one such case. Yet then - where do you suggest to
draw the boundary?

>>> +void asmlinkage __stack_chk_fail(void)
>>> +{
>>> +    panic("Stack Protector integrity violation identified in %ps\n",
>>> +	  __builtin_return_address(0));
>> Again.
>>
>> Is panic() really the right construct to use here, though?
>> __builtin_return_address() will merely identify the immediate caller. A
>> full stack trace (from BUG()) would likely be more useful in identifying
>> the offender.
> 
> Well - we have to be careful, because the backtrace from here is
> specifically misleading in this case.
> 
> When this trips, it's either the caller itself that broke, or some
> sibling call tree which is rubble under the active stack now.
> 
> BUG() also comes with 0 information.

Not quite, the more that you suggest an alternative way to ...

> So, maybe we want a dump_execution_state() (to get the backtrace), and
> then this panic() which states it was a Stack Protection violation,
> which hopefully is enough of a hint to people to look in the sibling
> call tree.

... get register state and stack trace out. Which I'd be entirely fine
with.

Jan
Jan Beulich Dec. 12, 2024, 10:58 a.m. UTC | #5
On 12.12.2024 01:47, Volodymyr Babchuk wrote:
> Jan Beulich <jbeulich@suse.com> writes:
>> On 11.12.2024 03:04, Volodymyr Babchuk wrote:
>>> --- a/xen/include/asm-generic/random.h
>>> +++ b/xen/include/asm-generic/random.h
>>> @@ -2,6 +2,11 @@
>>>  #ifndef __ASM_GENERIC_RANDOM_H__
>>>  #define __ASM_GENERIC_RANDOM_H__
>>>  
>>> +/*
>>> + * When implementing arch_get_random(), please make sure that
>>> + * it can provide random data before stack protector is initialized
>>> + * (i.e. before boot_stack_chk_guard_setup() is called).
>>> + */
>>>  static inline unsigned int arch_get_random(void)
>>>  {
>>>      return 0;
>>
>> What exactly will go (entirely) wrong if the comment isn't followed?
> 
> This will not cause immediate harm, but it will give false confidence to
> anyone who enables stack protector.
> 
> I'd prefer more substantial protection, but we can't even check if
> random generator is available in runtime. Taking into account that we
> potential can get 0 as result of RNG, we can't even put
> WARN_ON(!arch_get_random()) check.

Right, and hence 0 isn't strictly something that can be called "bad".
With at least some randomness one will of course observe a possible
problem at least across two or more runs. However, you don't call
arch_get_random() directly anyway, and get_random() has fallback code,
which is no more likely to return 0 than arch_get_random() is.

In fact this fallback code means get_random() will also use it when
arch_get_random() returns 0 as coming from an actual RNG. Which can
be considered bogus, as for a good random number source _every_
possible value ought to have equal probability of being returned.
Plus if we special-case 0, why not also special-case ~0? Or any other
number with only very few bits set/clear?

For the purpose of stack protector we may want to consider using a
mix of static pattern (not used elsewhere in the codebase) and random
number. The static pattern part would then want arranging such that
at least any value representing a valid address (within Xen alone)
won't match possible canary values.

>> (I'm afraid anyway that the comment living here is easy to miss.)
> 
> I didn't found a better place for it. Maybe you can suggest one?

I'm of two minds here really. Part of me wants this simply dropped. Yet
then some may deem this worthwhile information, except that then it
needs to live is a suitably exposed place. Just that I can't think of
any that would really fit.

Jan
diff mbox series

Patch

diff --git a/xen/Makefile b/xen/Makefile
index 34ed8c0fc7..0de0101fd0 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -432,7 +432,11 @@  else
 CFLAGS_UBSAN :=
 endif
 
+ifeq ($(CONFIG_STACK_PROTECTOR),y)
+CFLAGS += -fstack-protector
+else
 CFLAGS += -fno-stack-protector
+endif
 
 ifeq ($(CONFIG_LTO),y)
 CFLAGS += -flto
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 90268d9249..5676339a66 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -86,6 +86,9 @@  config HAS_UBSAN
 config HAS_VMAP
 	bool
 
+config HAS_STACK_PROTECTOR
+	bool
+
 config MEM_ACCESS_ALWAYS_ON
 	bool
 
@@ -213,6 +216,18 @@  config SPECULATIVE_HARDEN_LOCK
 
 endmenu
 
+menu "Compiler options"
+
+config STACK_PROTECTOR
+	bool "Stack protector"
+	depends on HAS_STACK_PROTECTOR
+	help
+	  Enable the Stack Protector compiler hardening option. This inserts a
+	  canary value in the stack frame of functions, and performs an integrity
+	  check on exit.
+
+endmenu
+
 config DIT_DEFAULT
 	bool "Data Independent Timing default"
 	depends on HAS_DIT
diff --git a/xen/common/Makefile b/xen/common/Makefile
index b279b09bfb..ceb5b2f32b 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -45,6 +45,7 @@  obj-y += shutdown.o
 obj-y += softirq.o
 obj-y += smp.o
 obj-y += spinlock.o
+obj-$(CONFIG_STACK_PROTECTOR) += stack-protector.o
 obj-y += stop_machine.o
 obj-y += symbols.o
 obj-y += tasklet.o
diff --git a/xen/common/stack-protector.c b/xen/common/stack-protector.c
new file mode 100644
index 0000000000..922511555f
--- /dev/null
+++ b/xen/common/stack-protector.c
@@ -0,0 +1,47 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/random.h>
+#include <xen/time.h>
+
+/*
+ * Initial value is chosen by a fair dice roll.
+ * It will be updated during boot process.
+ */
+#if BITS_PER_LONG == 32
+unsigned long __ro_after_init __stack_chk_guard = 0xdd2cc927UL;
+#else
+unsigned long __ro_after_init __stack_chk_guard = 0x2d853605a4d9a09cUL;
+#endif
+
+/* This function should be called from ASM only */
+void __init asmlinkage boot_stack_chk_guard_setup_early(void)
+{
+    /*
+     * Linear congruent generator (X_n+1 = X_n * a + c).
+     *
+     * Constant is taken from "Tables Of Linear Congruential
+     * Generators Of Different Sizes And Good Lattice Structure" by
+     * Pierre L’Ecuyer.
+     */
+#if BITS_PER_LONG == 32
+    const unsigned long a = 2891336453UL;
+#else
+    const unsigned long a = 2862933555777941757UL;
+#endif
+    const unsigned long c = 1;
+
+    unsigned long cycles = get_cycles();
+
+    /* Use the initial value if we can't generate random one */
+    if ( !cycles )
+	    return;
+
+    __stack_chk_guard = cycles * a + c;
+}
+
+void asmlinkage __stack_chk_fail(void)
+{
+    panic("Stack Protector integrity violation identified in %ps\n",
+	  __builtin_return_address(0));
+}
diff --git a/xen/include/asm-generic/random.h b/xen/include/asm-generic/random.h
index d0d35dd217..7f6d8790c4 100644
--- a/xen/include/asm-generic/random.h
+++ b/xen/include/asm-generic/random.h
@@ -2,6 +2,11 @@ 
 #ifndef __ASM_GENERIC_RANDOM_H__
 #define __ASM_GENERIC_RANDOM_H__
 
+/*
+ * When implementing arch_get_random(), please make sure that
+ * it can provide random data before stack protector is initialized
+ * (i.e. before boot_stack_chk_guard_setup() is called).
+ */
 static inline unsigned int arch_get_random(void)
 {
     return 0;
diff --git a/xen/include/xen/stack-protector.h b/xen/include/xen/stack-protector.h
new file mode 100644
index 0000000000..bd324d9003
--- /dev/null
+++ b/xen/include/xen/stack-protector.h
@@ -0,0 +1,30 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef XEN__STACK_PROTECTOR_H
+#define XEN__STACK_PROTECTOR_H
+
+#ifdef CONFIG_STACK_PROTECTOR
+
+#include <xen/random.h>
+
+extern unsigned long __stack_chk_guard;
+
+/*
+ * This function should be always inlined. Also it should be called
+ * from a function that never returns or a function that has
+ * stack-protector disabled.
+ */
+static always_inline void boot_stack_chk_guard_setup(void)
+{
+    __stack_chk_guard = get_random();
+    if (BITS_PER_LONG == 64)
+        __stack_chk_guard |= ((unsigned long)get_random()) << 32;
+}
+
+#else
+
+static inline void boot_stack_chk_guard_setup(void) {}
+
+#endif /* CONFIG_STACK_PROTECTOR  */
+
+#endif /* XEN__STACK_PROTECTOR_H */