Message ID | 20211102070616.119780-1-ashimida@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Kees Cook |
Headers | show |
Series | [RFC,PR102768] aarch64: Add compiler support for Shadow Call Stack | expand |
The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: > Shadow Call Stack can be used to protect the return address of a > function at runtime, and clang already supports this feature[1]. > > To enable SCS in user mode, in addition to compiler, other support > is also required (as described in [2]). This patch only adds basic > support for SCS from the compiler side, and provides convenience > for users to enable SCS. > > For linux kernel, only the support of the compiler is required. > > [1] https://clang.llvm.org/docs/ShadowCallStack.html > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 i'm not a gcc maintainer, but i prefer such feature to be in upstream gcc instead of in a plugin. it will require update to the documentation: which should mention that it depends on -ffixed-x18 (probably that should be enforced too) which is an important abi issue: functions following the normal pcs can clobber x18 and break scs. and that there is no unwinder support. the abi issue means it is unlikely to be useful in linux user space (even if libc and unwinder support is implemented), but it can be still useful in freestanding code such as the linux kernel. thanks. > > gcc/c-family/ChangeLog: > > * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): > > gcc/ChangeLog: > > * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): > * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): > (aarch64_expand_prologue): > (aarch64_expand_epilogue): > * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): > * config/aarch64/aarch64.md (scs_push): > (scs_pop): > * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK): > * flag-types.h (enum sanitize_code): > * opts.c (finish_options): > > Signed-off-by: Dan Li <ashimida@linux.alibaba.com> > --- > gcc/c-family/c-attribs.c | 21 +++++++++++++++++++++ > gcc/config/aarch64/aarch64-protos.h | 1 + > gcc/config/aarch64/aarch64.c | 27 +++++++++++++++++++++++++++ > gcc/config/aarch64/aarch64.h | 4 ++++ > gcc/config/aarch64/aarch64.md | 18 ++++++++++++++++++ > gcc/defaults.h | 4 ++++ > gcc/flag-types.h | 2 ++ > gcc/opts.c | 6 ++++++ > 8 files changed, 83 insertions(+) > > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c > index 007b928c54b..9b3a35c06bf 100644 > --- a/gcc/c-family/c-attribs.c > +++ b/gcc/c-family/c-attribs.c > @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); > static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); > static tree handle_no_sanitize_address_attribute (tree *, tree, tree, > int, bool *); > +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, > + tree, int, bool *); > static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, > int, bool *); > static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, > @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = > handle_no_sanitize_attribute, NULL }, > { "no_sanitize_address", 0, 0, true, false, false, false, > handle_no_sanitize_address_attribute, NULL }, > + { "no_sanitize_shadow_call_stack", > + 0, 0, true, false, false, false, > + handle_no_sanitize_shadow_call_stack_attribute, > + NULL }, > { "no_sanitize_thread", 0, 0, true, false, false, false, > handle_no_sanitize_thread_attribute, NULL }, > { "no_sanitize_undefined", 0, 0, true, false, false, false, > @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, > return NULL_TREE; > } > > +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as in > + struct attribute_spec.handler. */ > +static tree > +handle_no_sanitize_shadow_call_stack_attribute (tree *node, tree name, > + tree, int, bool *no_add_attrs) > +{ > + *no_add_attrs = true; > + if (TREE_CODE (*node) != FUNCTION_DECL) > + warning (OPT_Wattributes, "%qE attribute ignored", name); > + else > + add_no_sanitize_value (*node, SANITIZE_SHADOW_CALL_STACK); > + > + return NULL_TREE; > +} > + > /* Handle a "no_sanitize_thread" attribute; arguments as in > struct attribute_spec.handler. */ > > diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h > index 768e8fae136..150c015df21 100644 > --- a/gcc/config/aarch64/aarch64-protos.h > +++ b/gcc/config/aarch64/aarch64-protos.h > @@ -893,6 +893,7 @@ void aarch64_register_pragmas (void); > void aarch64_relayout_simd_types (void); > void aarch64_reset_previous_fndecl (void); > bool aarch64_return_address_signing_enabled (void); > +bool aarch64_shadow_call_stack_enabled (void); > bool aarch64_bti_enabled (void); > void aarch64_save_restore_target_globals (tree); > void aarch64_addti_scratch_regs (rtx, rtx, rtx *, > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 699c105a42a..5a36a459f4e 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -79,6 +79,7 @@ > #include "tree-ssa-loop-niter.h" > #include "fractional-cost.h" > #include "rtlanal.h" > +#include "asan.h" > > /* This file should be included last. */ > #include "target-def.h" > @@ -7799,6 +7800,24 @@ aarch64_return_address_signing_enabled (void) > && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); > } > > +/* Return TRUE if shadow call stack should be enabled for the current > + function, otherwise return FALSE. */ > + > +bool > +aarch64_shadow_call_stack_enabled (void) > +{ > + /* This function should only be called after frame laid out. */ > + gcc_assert (cfun->machine->frame.laid_out); > + > + if (crtl->calls_eh_return) > + return false; > + > + /* We only deal with a function if its LR is pushed onto stack > + and attribute no_sanitize_shadow_call_stack is not specified. */ > + return (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) > + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0)); > +} > + > /* Return TRUE if Branch Target Identification Mechanism is enabled. */ > bool > aarch64_bti_enabled (void) > @@ -8810,6 +8829,10 @@ aarch64_expand_prologue (void) > RTX_FRAME_RELATED_P (insn) = 1; > } > > + /* Push return address to shadow call stack. */ > + if (aarch64_shadow_call_stack_enabled ()) > + emit_insn (gen_scs_push ()); > + > if (flag_stack_usage_info) > current_function_static_stack_size = constant_lower_bound (frame_size); > > @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) > RTX_FRAME_RELATED_P (insn) = 1; > } > > + /* Pop return address from shadow call stack. */ > + if (aarch64_shadow_call_stack_enabled ()) > + emit_insn (gen_scs_pop ()); > + > /* We prefer to emit the combined return/authenticate instruction RETAA, > however there are three cases in which we must instead emit an explicit > authentication instruction. > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > index 2792bb29adb..1a83875dec8 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -100,6 +100,10 @@ > generating stack clash probes. */ > #define STACK_CLASH_MAX_UNROLL_PAGES 4 > > +/* This value represents whether the shadow call stack is implemented on > + the target platform. */ > +#define TARGET_SUPPORT_SHADOW_CALL_STACK true > + > /* The architecture reserves all bits of the address for hardware use, > so the vbit must go into the delta field of pointers to member > functions. This is the same config as that in the AArch32 > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 1a39470a1fe..8e68a6f793d 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -6994,6 +6994,24 @@ (define_insn "xpaclri" > "hint\t7 // xpaclri" > ) > > +;; Save X30 in the X18-based POST_INC stack (consistent with clang). > +(define_insn "scs_push" > + [(set (mem:DI (reg:DI R18_REGNUM)) (reg:DI R30_REGNUM)) > + (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int 8)))] > + "" > + "str\\tx30, [x18], #8" > + [(set_attr "type" "store_8")] > +) > + > +;; Load X30 form the X18-based PRE_DEC stack (consistent with clang). > +(define_insn "scs_pop" > + [(set (reg:DI R18_REGNUM) (minus:DI (reg:DI R18_REGNUM) (const_int 8))) > + (set (reg:DI R30_REGNUM) (mem:DI (reg:DI R18_REGNUM)))] > + "" > + "ldr\\tx30, [x18, #-8]!" > + [(set_attr "type" "load_8")] > +) > + > ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and > ;; all of memory. This blocks insns from being moved across this point. > > diff --git a/gcc/defaults.h b/gcc/defaults.h > index bb68d0d1a79..0f1719a3bb5 100644 > --- a/gcc/defaults.h > +++ b/gcc/defaults.h > @@ -1172,6 +1172,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > #define PCC_BITFIELD_TYPE_MATTERS false > #endif > > +#ifndef TARGET_SUPPORT_SHADOW_CALL_STACK > +#define TARGET_SUPPORT_SHADOW_CALL_STACK false > +#endif > + > #ifndef INSN_SETS_ARE_DELAYED > #define INSN_SETS_ARE_DELAYED(INSN) false > #endif > diff --git a/gcc/flag-types.h b/gcc/flag-types.h > index a5a637160d7..c22ef35a289 100644 > --- a/gcc/flag-types.h > +++ b/gcc/flag-types.h > @@ -321,6 +321,8 @@ enum sanitize_code { > SANITIZE_HWADDRESS = 1UL << 28, > SANITIZE_USER_HWADDRESS = 1UL << 29, > SANITIZE_KERNEL_HWADDRESS = 1UL << 30, > + /* Shadow Call Stack. */ > + SANITIZE_SHADOW_CALL_STACK = 1UL << 31, > SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT, > SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE > | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN > diff --git a/gcc/opts.c b/gcc/opts.c > index 4472cec1b98..e94f316fb85 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -1308,6 +1308,11 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set, > sorry ("transactional memory is not supported with " > "%<-fsanitize=kernel-address%>"); > > + if ((opts->x_flag_sanitize & SANITIZE_SHADOW_CALL_STACK) > + && !TARGET_SUPPORT_SHADOW_CALL_STACK) > + error_at (loc, "%<-fsanitize=shadow-call-stack%> not supported " > + "in current platform"); > + > /* Currently live patching is not support for LTO. */ > if (opts->x_flag_live_patching && opts->x_flag_lto) > sorry ("live patching is not supported with LTO"); > @@ -1994,6 +1999,7 @@ const struct sanitizer_opts_s sanitizer_opts[] = > SANITIZER_OPT (vptr, SANITIZE_VPTR, true), > SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true), > SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true), > + SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false), > SANITIZER_OPT (all, ~0U, true), > #undef SANITIZER_OPT > { NULL, 0U, 0UL, false } > -- > 2.17.1 >
On 11/2/21 9:04 PM, Szabolcs Nagy wrote: > The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: >> Shadow Call Stack can be used to protect the return address of a >> function at runtime, and clang already supports this feature[1]. >> >> To enable SCS in user mode, in addition to compiler, other support >> is also required (as described in [2]). This patch only adds basic >> support for SCS from the compiler side, and provides convenience >> for users to enable SCS. >> >> For linux kernel, only the support of the compiler is required. >> >> [1] https://clang.llvm.org/docs/ShadowCallStack.html >> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 > > i'm not a gcc maintainer, but i prefer such feature > to be in upstream gcc instead of in a plugin. > > it will require update to the documentation: > > which should mention that it depends on -ffixed-x18 > (probably that should be enforced too) which is an > important abi issue: functions following the normal > pcs can clobber x18 and break scs. > Thanks Szabolcs, I will update the documentation in next version. It sounds reasonable to enforced -ffixed-x18 with scs, but I see that clang doesn’t do that. Maybe it is better to be consistent with clang here? > and that there is no unwinder support. > Ok, let me try to add a support for this. > the abi issue means it is unlikely to be useful in > linux user space (even if libc and unwinder support > is implemented), but it can be still useful in > freestanding code such as the linux kernel. > > thanks. >
The 11/03/2021 00:24, Dan Li wrote: > On 11/2/21 9:04 PM, Szabolcs Nagy wrote: > > The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: > > > Shadow Call Stack can be used to protect the return address of a > > > function at runtime, and clang already supports this feature[1]. > > > > > > To enable SCS in user mode, in addition to compiler, other support > > > is also required (as described in [2]). This patch only adds basic > > > support for SCS from the compiler side, and provides convenience > > > for users to enable SCS. > > > > > > For linux kernel, only the support of the compiler is required. > > > > > > [1] https://clang.llvm.org/docs/ShadowCallStack.html > > > [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 > > > > i'm not a gcc maintainer, but i prefer such feature > > to be in upstream gcc instead of in a plugin. > > > > it will require update to the documentation: > > > > which should mention that it depends on -ffixed-x18 > > (probably that should be enforced too) which is an > > important abi issue: functions following the normal > > pcs can clobber x18 and break scs. > > > Thanks Szabolcs, I will update the documentation in next version. > > It sounds reasonable to enforced -ffixed-x18 with scs, but I see > that clang doesn’t do that. Maybe it is better to be consistent > with clang here? i mean gcc can issue a diagnostic if -ffixed-x18 is not passed. (it seems clang rejects scs too without -ffixed-x18) > > and that there is no unwinder support. > > > Ok, let me try to add a support for this. i assume exception handling info has to change for scs to work (to pop the shadow stack when transferring control), so either scs must require -fno-exceptions or the eh info changes must be implemented. i think the kernel does not require exceptions and does not depend on the unwinder runtime in libgcc, so this is optional for the linux kernel use-case.
Hi Szabolcs, First of all, apologies for my late reply (since I just had a new baby, I'm quite busy recently and also because I'm not familiar with C++ exception handling, it takes me some time to learn this part). On 11/3/21 8:00 PM, Szabolcs Nagy wrote: > The 11/03/2021 00:24, Dan Li wrote: >> On 11/2/21 9:04 PM, Szabolcs Nagy wrote: >>> The 11/02/2021 00:06, Dan Li via Gcc-patches wrote: >>>> Shadow Call Stack can be used to protect the return address of a >>>> function at runtime, and clang already supports this feature[1]. >>>> >>>> To enable SCS in user mode, in addition to compiler, other support >>>> is also required (as described in [2]). This patch only adds basic >>>> support for SCS from the compiler side, and provides convenience >>>> for users to enable SCS. >>>> >>>> For linux kernel, only the support of the compiler is required. >>>> >>>> [1] https://clang.llvm.org/docs/ShadowCallStack.html >>>> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 >>> >>> i'm not a gcc maintainer, but i prefer such feature >>> to be in upstream gcc instead of in a plugin. >>> >>> it will require update to the documentation: >>> >>> which should mention that it depends on -ffixed-x18 >>> (probably that should be enforced too) which is an >>> important abi issue: functions following the normal >>> pcs can clobber x18 and break scs. >>> >> Thanks Szabolcs, I will update the documentation in next version. >> >> It sounds reasonable to enforced -ffixed-x18 with scs, but I see >> that clang doesn’t do that. Maybe it is better to be consistent >> with clang here? > > i mean gcc can issue a diagnostic if -ffixed-x18 is not passed. > (it seems clang rejects scs too without -ffixed-x18) > Oh, yes, you are right. Clang rejects scs without -ffixed-x18[1], I should add a similar check in next version. >>> and that there is no unwinder support. >>> >> Ok, let me try to add a support for this. > > i assume exception handling info has to change for scs to > work (to pop the shadow stack when transferring control), > so either scs must require -fno-exceptions or the eh info > changes must be implemented. > > i think the kernel does not require exceptions and does > not depend on the unwinder runtime in libgcc, so this > is optional for the linux kernel use-case. > I recompiled a glibc and gcc runtime library with -ffixed-x18 enabled. As you said, the scs stack needs to be popped at the same time during exception handling. I saw that Clang is processed by adding ".cfi_escape 0x16, 0x12, 0x02, 0x82, 0x78" directive (x18 -= 8;) after each emit of scs push[2]. But this directive has problems when executed in libgcc: 1)context->reg[x] in uw_init_context_1 are all based on cfa, most registers have no initial values by default. 2)Address of shadow call stack (x18) cannot(and should not) be calculated based on cfa, and I did not yet find a way to assign hardware register x18 to context->reg[18]. 3)This causes libgcc to crash when parsing .cfi_escape exp because of 0 address dereference (* x18) (execute_stack_op => case DW_OP_breg18: _Unwind_GetGR) 4)uw_install_context_1 does not restore all hardware registers by default before eh return, so context->reg[18] can't write directly to hw x18. (In clang, __unw_getcontext/__unw_resume will save/restore all hardware registers, so this directive works fine in my libunwind test.) I tried to fix this problem through a patch[3], the exception handling works fine in my test environment, but I'm not sure if this fix is ppropriate for two reasons: 1)libgcc does not push/pop all registers by default during exception handling. Is this change appropriate? 2)The test case may not be able to test this patch, because the test environment requires at least on glibc/gcc runtime compiled with -ffixed-x18. May be it's better to rely on -fno-exceptions for this patch first? and If the glibc/gcc runtime also supports SCS later, the problem can be fixed at the same time. PS: I'm still not familiar enough with exception handling in libgcc/libunwind, please correct me if there are any mistakes :) [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 [2] https://reviews.llvm.org/D54609 [3] https://gcc.gnu.org/bugzilla/attachment.cgi?id=51854&action=diff
The 11/23/2021 16:32, Dan Li wrote: > On 11/3/21 8:00 PM, Szabolcs Nagy wrote: > > i assume exception handling info has to change for scs to > > work (to pop the shadow stack when transferring control), > > so either scs must require -fno-exceptions or the eh info > > changes must be implemented. > > > > i think the kernel does not require exceptions and does > > not depend on the unwinder runtime in libgcc, so this > > is optional for the linux kernel use-case. > > > I recompiled a glibc and gcc runtime library with -ffixed-x18 enabled. > As you said, the scs stack needs to be popped at the same time during > exception handling. > > I saw that Clang is processed by adding > ".cfi_escape 0x16, 0x12, 0x02, 0x82, 0x78" > directive (x18 -= 8;) after each emit of scs push[2]. > > But this directive has problems when executed in libgcc: > 1)context->reg[x] in uw_init_context_1 are all based on cfa, most > registers have no initial values by default. > 2)Address of shadow call stack (x18) cannot(and should not) be calculated > based on cfa, and I did not yet find a way to assign hardware register > x18 to context->reg[18]. > 3)This causes libgcc to crash when parsing .cfi_escape exp because of 0 > address dereference (* x18) > (execute_stack_op => case DW_OP_breg18: _Unwind_GetGR) > 4)uw_install_context_1 does not restore all hardware registers by default > before eh return, so context->reg[18] can't write directly to hw x18. > (In clang, __unw_getcontext/__unw_resume will save/restore all hardware > registers, so this directive works fine in my libunwind test.) > > I tried to fix this problem through a patch[3], the exception handling > works fine in my test environment, but I'm not sure if this fix is > ppropriate for two reasons: > 1)libgcc does not push/pop all registers by default during exception > handling. Is this change appropriate? > 2)The test case may not be able to test this patch, because the test > environment requires at least on glibc/gcc runtime compiled with > -ffixed-x18. > > May be it's better to rely on -fno-exceptions for this patch first? and If > the glibc/gcc runtime also supports SCS later, the problem can be fixed > at the same time. i did not look at the exception handling in detail (that's difficult to understand for me too). to use scs, non-default abi is required anyway, so not supporting exceptions sounds fine to me. however it should be documented and ideally enforced (-fexceptions should be rejected, just like -fno-fixed-x18). i assume the linux kernel does not require -fexceptions. > > PS: > I'm still not familiar enough with exception handling in libgcc/libunwind, > please correct me if there are any mistakes :) > > [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 > [2] https://reviews.llvm.org/D54609 > [3] https://gcc.gnu.org/bugzilla/attachment.cgi?id=51854&action=diff >
On 11/23/21 6:51 PM, Szabolcs Nagy wrote: > The 11/23/2021 16:32, Dan Li wrote: >> On 11/3/21 8:00 PM, Szabolcs Nagy wrote: >>> i assume exception handling info has to change for scs to >>> work (to pop the shadow stack when transferring control), >>> so either scs must require -fno-exceptions or the eh info >>> changes must be implemented. >>> >>> i think the kernel does not require exceptions and does >>> not depend on the unwinder runtime in libgcc, so this >>> is optional for the linux kernel use-case. >>> >> I recompiled a glibc and gcc runtime library with -ffixed-x18 enabled. >> As you said, the scs stack needs to be popped at the same time during >> exception handling. >> >> I saw that Clang is processed by adding >> ".cfi_escape 0x16, 0x12, 0x02, 0x82, 0x78" >> directive (x18 -= 8;) after each emit of scs push[2]. >> >> But this directive has problems when executed in libgcc: >> 1)context->reg[x] in uw_init_context_1 are all based on cfa, most >> registers have no initial values by default. >> 2)Address of shadow call stack (x18) cannot(and should not) be calculated >> based on cfa, and I did not yet find a way to assign hardware register >> x18 to context->reg[18]. >> 3)This causes libgcc to crash when parsing .cfi_escape exp because of 0 >> address dereference (* x18) >> (execute_stack_op => case DW_OP_breg18: _Unwind_GetGR) >> 4)uw_install_context_1 does not restore all hardware registers by default >> before eh return, so context->reg[18] can't write directly to hw x18. >> (In clang, __unw_getcontext/__unw_resume will save/restore all hardware >> registers, so this directive works fine in my libunwind test.) >> >> I tried to fix this problem through a patch[3], the exception handling >> works fine in my test environment, but I'm not sure if this fix is >> ppropriate for two reasons: >> 1)libgcc does not push/pop all registers by default during exception >> handling. Is this change appropriate? >> 2)The test case may not be able to test this patch, because the test >> environment requires at least on glibc/gcc runtime compiled with >> -ffixed-x18. >> >> May be it's better to rely on -fno-exceptions for this patch first? and If >> the glibc/gcc runtime also supports SCS later, the problem can be fixed >> at the same time. > > i did not look at the exception handling in detail (that's > difficult to understand for me too). > > to use scs, non-default abi is required anyway, so not > supporting exceptions sounds fine to me. however it should > be documented and ideally enforced (-fexceptions should > be rejected, just like -fno-fixed-x18). Thanks Szabolcs, This sounds reasonable to me, and I'll fix it in the next version. > > i assume the linux kernel does not require -fexceptions. > AFAIK, -fexceptions are not used in the linux kernel. >> >> PS: >> I'm still not familiar enough with exception handling in libgcc/libunwind, >> please correct me if there are any mistakes :) >> >> [1] https://github.com/llvm/llvm-project/commit/f11eb3ebe77729426e562d7d4d7ebb1d5ff2e7c8 >> [2] https://reviews.llvm.org/D54609 >> [3] https://gcc.gnu.org/bugzilla/attachment.cgi?id=51854&action=diff >>
Hi, all, Sorry for bothering. I'm trying to commit aarch64 scs code to the gcc and there is an issue that I'm not sure about, could someone give me some suggestions? (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) When clang enables scs, the following instructions are usually generated: str x30, [x18], 8 ldp x29, x30, [sp], 16 ...... ldp x29, x30, [sp], 16 ## x30 pop ldr x30, [x18, -8]! ## x30 pop again ret The x30 register is popped twice here, Richard suggested that we can omit the first x30 pop here. AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm missing something with the kernel, could someone give some suggestions? The previous discussion can be found here [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html Thanks a lot! Dan On 1/25/22 22:51, Dan Li wrote: > > > On 1/25/22 02:19, Richard Sandiford wrote: >> Dan Li <ashimida@linux.alibaba.com> writes: >>>>> + >>>>> if (flag_stack_usage_info) >>>>> current_function_static_stack_size = constant_lower_bound (frame_size); >>>>> @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) >>>>> RTX_FRAME_RELATED_P (insn) = 1; >>>>> } >>>>> + /* Pop return address from shadow call stack. */ >>>>> + if (aarch64_shadow_call_stack_enabled ()) >>>>> + emit_insn (gen_scs_pop ()); >>>>> + >>>> >>>> This looks correct, but following on from the above, I guess we continue >>>> to restore x30 from the frame in the traditional way first, and then >>>> overwrite it with the SCS value. I think the output code would be >>>> slightly neater if we suppressed the first restore of x30. >>>> >>> Yes, the current epilogue is like: >>> ....... >>> ldp x29, x30, [sp], #16 >>> + ldr x30, [x18, #-8]! >>> >>> I think may be we can keep the two x30 pops here, for two reasons: >>> 1) The implementation of scs in clang is to pop x30 twice, it might be >>> better to be consistent with clang here[1]. >> >> This doesn't seem a very compelling reason on its own, unless it's >> explicitly meant to be observable behaviour. (But then, observed how?) >> > > Well, probably sticking to pop x30 twice is not a good idea. > AFAICT, there doesn't seem to be an explicit requirement that > this behavior must be followed. > > BTW: > Do we still need to emit the .cfi_restore 30 directive here if we > want to save a pop x30? (Sorry I'm not familiar enough with DWARF.) > > Since the aarch64 linux kernel always enables -fno-omit-frame-pointer, > the generated assembly code may be as follows: > > str x30, [x18], 8 > ldp x29, x30, [sp], 16 > ...... > ldr x29, [sp], 16 > ## Do we still need a .cfi_restore 30 here > .cfi_restore 29 > .cfi_def_cfa_offset 0 > ldr x30, [x18, -8]! > ## Or may be a non-CFA based directive here > ret > >>> 2) If we keep the first restore of x30, it may provide some flexibility >>> for other scenarios. For example, we can directly patch the scs_push/pop >>> insns in the binary to turn it into a binary without scs; >> >> Is that a supported (and ideally documented) use case? If it is, >> then it becomes important for correctness that the compiler emits >> exactly the opcodes in the original patch. If it isn't, then the >> compiler can emit other instructions that have the same effect. >> > > Oh, no, this is just a little trick that can be used for compatibility. > (Maybe some scenarios such as our company sometimes need to be > compatible with some non-open source binaries from different > manufacturers, two pops could make life easier :). ) > > If this is not a consideration for our community, please ignore > this request. > >> I'd like a definitive ruling on this from the kernel folks before >> the patch goes in. >> > > Ok, I'll cc some kernel folks to make sure I didn't miss something. > >> If binary patching is supposed to be possible then scs_push and >> scs_pop *do* need to be separate define_insns. But they also need >> to have some magic unspec that differentiates them from normal >> pushes and pops, e.g.: >> >> (set ...mem... >> (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) >> >> so that there is no chance that the pattern would be treated as >> a normal move and optimised accordingly. >> > > Yeah, this template looks more appropriate if it is to be treated > as a special directive. > > Thanks for your suggestions, > Dan
On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: > > Hi, all, > > Sorry for bothering. > > I'm trying to commit aarch64 scs code to the gcc and there is an issue > that I'm not sure about, could someone give me some suggestions? > (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) > > When clang enables scs, the following instructions are usually generated: > > str x30, [x18], 8 > ldp x29, x30, [sp], 16 > ...... > ldp x29, x30, [sp], 16 ## x30 pop > ldr x30, [x18, -8]! ## x30 pop again > ret > > The x30 register is popped twice here, Richard suggested that we can > omit the first x30 pop here. > > AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm > missing something with the kernel, could someone give some suggestions? > > The previous discussion can be found here [1]. > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html > As was pointed out in the discussion, binary patching is in fact a concern for the Linux kernel. E.g., Android uses generic binary builds, but we would like to be able to switch between pointer authentication and shadow call stack at boot time, rather than always support both, and take the SCS performance hit on systems that implement PAC as well. However, it seems more straight-forward to patch PACIASP and AUTIASP instructions into SCS push/pop instructions rather than the other way around, as we can force the use of these exact opcodes [in the NOP space]), as well as rely on existing unwind annotations to locate any such instruction in the binary. So omitting the load of X30 from the ordinary stack seems fine to me. > On 1/25/22 22:51, Dan Li wrote: > > > > > > On 1/25/22 02:19, Richard Sandiford wrote: > >> Dan Li <ashimida@linux.alibaba.com> writes: > >>>>> + > >>>>> if (flag_stack_usage_info) > >>>>> current_function_static_stack_size = constant_lower_bound (frame_size); > >>>>> @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) > >>>>> RTX_FRAME_RELATED_P (insn) = 1; > >>>>> } > >>>>> + /* Pop return address from shadow call stack. */ > >>>>> + if (aarch64_shadow_call_stack_enabled ()) > >>>>> + emit_insn (gen_scs_pop ()); > >>>>> + > >>>> > >>>> This looks correct, but following on from the above, I guess we continue > >>>> to restore x30 from the frame in the traditional way first, and then > >>>> overwrite it with the SCS value. I think the output code would be > >>>> slightly neater if we suppressed the first restore of x30. > >>>> > >>> Yes, the current epilogue is like: > >>> ....... > >>> ldp x29, x30, [sp], #16 > >>> + ldr x30, [x18, #-8]! > >>> > >>> I think may be we can keep the two x30 pops here, for two reasons: > >>> 1) The implementation of scs in clang is to pop x30 twice, it might be > >>> better to be consistent with clang here[1]. > >> > >> This doesn't seem a very compelling reason on its own, unless it's > >> explicitly meant to be observable behaviour. (But then, observed how?) > >> > > > > Well, probably sticking to pop x30 twice is not a good idea. > > AFAICT, there doesn't seem to be an explicit requirement that > > this behavior must be followed. > > > > BTW: > > Do we still need to emit the .cfi_restore 30 directive here if we > > want to save a pop x30? (Sorry I'm not familiar enough with DWARF.) > > > > Since the aarch64 linux kernel always enables -fno-omit-frame-pointer, > > the generated assembly code may be as follows: > > > > str x30, [x18], 8 > > ldp x29, x30, [sp], 16 > > ...... > > ldr x29, [sp], 16 > > ## Do we still need a .cfi_restore 30 here > > .cfi_restore 29 > > .cfi_def_cfa_offset 0 > > ldr x30, [x18, -8]! > > ## Or may be a non-CFA based directive here > > ret > > > >>> 2) If we keep the first restore of x30, it may provide some flexibility > >>> for other scenarios. For example, we can directly patch the scs_push/pop > >>> insns in the binary to turn it into a binary without scs; > >> > >> Is that a supported (and ideally documented) use case? If it is, > >> then it becomes important for correctness that the compiler emits > >> exactly the opcodes in the original patch. If it isn't, then the > >> compiler can emit other instructions that have the same effect. > >> > > > > Oh, no, this is just a little trick that can be used for compatibility. > > (Maybe some scenarios such as our company sometimes need to be > > compatible with some non-open source binaries from different > > manufacturers, two pops could make life easier :). ) > > > > If this is not a consideration for our community, please ignore > > this request. > > > >> I'd like a definitive ruling on this from the kernel folks before > >> the patch goes in. > >> > > > > Ok, I'll cc some kernel folks to make sure I didn't miss something. > > > >> If binary patching is supposed to be possible then scs_push and > >> scs_pop *do* need to be separate define_insns. But they also need > >> to have some magic unspec that differentiates them from normal > >> pushes and pops, e.g.: > >> > >> (set ...mem... > >> (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) > >> > >> so that there is no chance that the pattern would be treated as > >> a normal move and optimised accordingly. > >> > > > > Yeah, this template looks more appropriate if it is to be treated > > as a special directive. > > > > Thanks for your suggestions, > > Dan
Thanks, Ard, On 1/26/22 00:10, Ard Biesheuvel wrote: > On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: >> >> Hi, all, >> >> Sorry for bothering. >> >> I'm trying to commit aarch64 scs code to the gcc and there is an issue >> that I'm not sure about, could someone give me some suggestions? >> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) >> >> When clang enables scs, the following instructions are usually generated: >> >> str x30, [x18], 8 >> ldp x29, x30, [sp], 16 >> ...... >> ldp x29, x30, [sp], 16 ## x30 pop >> ldr x30, [x18, -8]! ## x30 pop again >> ret >> >> The x30 register is popped twice here, Richard suggested that we can >> omit the first x30 pop here. >> >> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm >> missing something with the kernel, could someone give some suggestions? >> >> The previous discussion can be found here [1]. >> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html >> > > As was pointed out in the discussion, binary patching is in fact a > concern for the Linux kernel. E.g., Android uses generic binary > builds, but we would like to be able to switch between pointer > authentication and shadow call stack at boot time, rather than always > support both, and take the SCS performance hit on systems that > implement PAC as well. > > However, it seems more straight-forward to patch PACIASP and AUTIASP > instructions into SCS push/pop instructions rather than the other way > around, as we can force the use of these exact opcodes [in the NOP > space]), as well as rely on existing unwind annotations to locate any > such instruction in the binary. > Well, then I think I don't need to submit a kernel patch to enable SCS for gcc :) BTW: Do we have a plan to submit patches of dynamic patch PAC into the kernel recently? > So omitting the load of X30 from the ordinary stack seems fine to me. > >> On 1/25/22 22:51, Dan Li wrote: >>> >>> >>> On 1/25/22 02:19, Richard Sandiford wrote: >>> >>> Well, probably sticking to pop x30 twice is not a good idea. >>> AFAICT, there doesn't seem to be an explicit requirement that > >>>> >>> >>> Ok, I'll cc some kernel folks to make sure I didn't miss something. >>> To Richard: Sorry for my mistake. Due to binary compatibility issues, SCS related code may not be directly merged into libgcc/glibc, do we still need to add this patch into GCC? (I'd like to finish it if that makes sense). Thanks all for your time! Dan >>>> If binary patching is supposed to be possible then scs_push and >>>> scs_pop *do* need to be separate define_insns. But they also need >>>> to have some magic unspec that differentiates them from normal >>>> pushes and pops, e.g.: >>>> >>>> (set ...mem... >>>> (unspec:DI [...reg...] UNSPEC_SCS_PUSH)) >>>> >>>> so that there is no chance that the pattern would be treated as >>>> a normal move and optimised accordingly. >>>> >>> >>> Yeah, this template looks more appropriate if it is to be treated >>> as a special directive. >>> >>> Thanks for your suggestions, >>> Dan
On Wed, 26 Jan 2022 at 11:40, Dan Li <ashimida@linux.alibaba.com> wrote: > > Thanks, Ard, > > On 1/26/22 00:10, Ard Biesheuvel wrote: > > On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: > >> > >> Hi, all, > >> > >> Sorry for bothering. > >> > >> I'm trying to commit aarch64 scs code to the gcc and there is an issue > >> that I'm not sure about, could someone give me some suggestions? > >> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) > >> > >> When clang enables scs, the following instructions are usually generated: > >> > >> str x30, [x18], 8 > >> ldp x29, x30, [sp], 16 > >> ...... > >> ldp x29, x30, [sp], 16 ## x30 pop > >> ldr x30, [x18, -8]! ## x30 pop again > >> ret > >> > >> The x30 register is popped twice here, Richard suggested that we can > >> omit the first x30 pop here. > >> > >> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm > >> missing something with the kernel, could someone give some suggestions? > >> > >> The previous discussion can be found here [1]. > >> > >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html > >> > > > > As was pointed out in the discussion, binary patching is in fact a > > concern for the Linux kernel. E.g., Android uses generic binary > > builds, but we would like to be able to switch between pointer > > authentication and shadow call stack at boot time, rather than always > > support both, and take the SCS performance hit on systems that > > implement PAC as well. > > > > However, it seems more straight-forward to patch PACIASP and AUTIASP > > instructions into SCS push/pop instructions rather than the other way > > around, as we can force the use of these exact opcodes [in the NOP > > space]), as well as rely on existing unwind annotations to locate any > > such instruction in the binary. > > > > Well, then I think I don't need to submit a kernel patch to > enable SCS for gcc :) > Not entirely. > BTW: > Do we have a plan to submit patches of dynamic patch PAC into > the kernel recently? > At the moment, there are just some ideas floating around. I did implement a proof of concept that uses unwind data, but it hit some issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not at all (Clang) in some cases. Using objtool would be another possibility. So in summary, getting SCS support proper into GCC is definitely worth the effort.
On 1/26/22 03:09, Ard Biesheuvel wrote: > On Wed, 26 Jan 2022 at 11:40, Dan Li <ashimida@linux.alibaba.com> wrote: >> >> Thanks, Ard, >> >> On 1/26/22 00:10, Ard Biesheuvel wrote: >>> On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: >>>> >>>> Hi, all, >>>> >>>> Sorry for bothering. >>>> >>>> I'm trying to commit aarch64 scs code to the gcc and there is an issue >>>> that I'm not sure about, could someone give me some suggestions? >>>> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) >>>> >>>> When clang enables scs, the following instructions are usually generated: >>>> >>>> str x30, [x18], 8 >>>> ldp x29, x30, [sp], 16 >>>> ...... >>>> ldp x29, x30, [sp], 16 ## x30 pop >>>> ldr x30, [x18, -8]! ## x30 pop again >>>> ret >>>> >>>> The x30 register is popped twice here, Richard suggested that we can >>>> omit the first x30 pop here. >>>> >>>> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm >>>> missing something with the kernel, could someone give some suggestions? >>>> >>>> The previous discussion can be found here [1]. >>>> >>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html >>>> >>> >>> As was pointed out in the discussion, binary patching is in fact a >>> concern for the Linux kernel. E.g., Android uses generic binary >>> builds, but we would like to be able to switch between pointer >>> authentication and shadow call stack at boot time, rather than always >>> support both, and take the SCS performance hit on systems that >>> implement PAC as well. >>> >>> However, it seems more straight-forward to patch PACIASP and AUTIASP >>> instructions into SCS push/pop instructions rather than the other way >>> around, as we can force the use of these exact opcodes [in the NOP >>> space]), as well as rely on existing unwind annotations to locate any >>> such instruction in the binary. >>> >> >> Well, then I think I don't need to submit a kernel patch to >> enable SCS for gcc :) >> > > Not entirely. > >> BTW: >> Do we have a plan to submit patches of dynamic patch PAC into >> the kernel recently? >> > > At the moment, there are just some ideas floating around. I did > implement a proof of concept that uses unwind data, but it hit some > issues with cfi_negate_ra_state being emitted imprecisely (GCC) or not > at all (Clang) in some cases. Using objtool would be another > possibility. > > So in summary, getting SCS support proper into GCC is definitely worth > the effort. > Got it! And thanks for the suggestion, Ard :)
Thanks for the discussion and sorry for the slow reply, was out most of last week. Dan Li <ashimida@linux.alibaba.com> writes: > Thanks, Ard, > > On 1/26/22 00:10, Ard Biesheuvel wrote: >> On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: >>> >>> Hi, all, >>> >>> Sorry for bothering. >>> >>> I'm trying to commit aarch64 scs code to the gcc and there is an issue >>> that I'm not sure about, could someone give me some suggestions? >>> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) >>> >>> When clang enables scs, the following instructions are usually generated: >>> >>> str x30, [x18], 8 >>> ldp x29, x30, [sp], 16 >>> ...... >>> ldp x29, x30, [sp], 16 ## x30 pop >>> ldr x30, [x18, -8]! ## x30 pop again >>> ret >>> >>> The x30 register is popped twice here, Richard suggested that we can >>> omit the first x30 pop here. >>> >>> AFAICT, it seems fine and also safe for SCS. But I'm not sure if I'm >>> missing something with the kernel, could someone give some suggestions? >>> >>> The previous discussion can be found here [1]. >>> >>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589257.html >>> >> >> As was pointed out in the discussion, binary patching is in fact a >> concern for the Linux kernel. E.g., Android uses generic binary >> builds, but we would like to be able to switch between pointer >> authentication and shadow call stack at boot time, rather than always >> support both, and take the SCS performance hit on systems that >> implement PAC as well. >> >> However, it seems more straight-forward to patch PACIASP and AUTIASP >> instructions into SCS push/pop instructions rather than the other way >> around, as we can force the use of these exact opcodes [in the NOP >> space]), as well as rely on existing unwind annotations to locate any >> such instruction in the binary. >> > > Well, then I think I don't need to submit a kernel patch to > enable SCS for gcc :) > > BTW: > Do we have a plan to submit patches of dynamic patch PAC into > the kernel recently? > >> So omitting the load of X30 from the ordinary stack seems fine to me. OK, thanks. Let's go with that for now then. There would still be time to change our minds before GCC 12 is released, if anyone feels that patching SCS code would be useful. Reading it back, I think my previous message came across as sounding like a complaint against binary patching, which wasn't the case at all. I think it would be fine to support patching, even if it was just for a single vendor rather than expected to be a common case. It's just that, if we did want to support it, we'd need to document it as a requirement (at least within GCC) and change the implementation accordingly. Thanks, Richard
On 1/31/22 08:26, Richard Sandiford wrote: > Thanks for the discussion and sorry for the slow reply, was out most of > last week. > > Dan Li <ashimida@linux.alibaba.com> writes: >> Thanks, Ard, >> >> On 1/26/22 00:10, Ard Biesheuvel wrote: >>> On Wed, 26 Jan 2022 at 08:53, Dan Li <ashimida@linux.alibaba.com> wrote: >>>> >>>> Hi, all, >>>> >>>> Sorry for bothering. >>>> >>>> I'm trying to commit aarch64 scs code to the gcc and there is an issue >>>> that I'm not sure about, could someone give me some suggestions? >>>> (To avoid noise, I did't cc PING^3 [1] to the kernel mail list :) ) >>>> >>> So omitting the load of X30 from the ordinary stack seems fine to me. > > OK, thanks. Let's go with that for now then. There would still be > time to change our minds before GCC 12 is released, if anyone feels > that patching SCS code would be useful. >> Reading it back, I think my previous message came across as sounding > like a complaint against binary patching, which wasn't the case at all. > I think it would be fine to support patching, even if it was just for a > single vendor rather than expected to be a common case. It's just that, > if we did want to support it, we'd need to document it as a requirement > (at least within GCC) and change the implementation accordingly. > Got it, then I'll implement this feature as discussed above and see if we could add additional options for SCS later. Thanks, Dan
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 007b928c54b..9b3a35c06bf 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -56,6 +56,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_shadow_call_stack_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -454,6 +456,10 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address", 0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_shadow_call_stack", + 0, 0, true, false, false, false, + handle_no_sanitize_shadow_call_stack_attribute, + NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -1175,6 +1181,21 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle a "no_sanitize_shadow_call_stack" attribute; arguments as in + struct attribute_spec.handler. */ +static tree +handle_no_sanitize_shadow_call_stack_attribute (tree *node, tree name, + tree, int, bool *no_add_attrs) +{ + *no_add_attrs = true; + if (TREE_CODE (*node) != FUNCTION_DECL) + warning (OPT_Wattributes, "%qE attribute ignored", name); + else + add_no_sanitize_value (*node, SANITIZE_SHADOW_CALL_STACK); + + return NULL_TREE; +} + /* Handle a "no_sanitize_thread" attribute; arguments as in struct attribute_spec.handler. */ diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 768e8fae136..150c015df21 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -893,6 +893,7 @@ void aarch64_register_pragmas (void); void aarch64_relayout_simd_types (void); void aarch64_reset_previous_fndecl (void); bool aarch64_return_address_signing_enabled (void); +bool aarch64_shadow_call_stack_enabled (void); bool aarch64_bti_enabled (void); void aarch64_save_restore_target_globals (tree); void aarch64_addti_scratch_regs (rtx, rtx, rtx *, diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 699c105a42a..5a36a459f4e 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -79,6 +79,7 @@ #include "tree-ssa-loop-niter.h" #include "fractional-cost.h" #include "rtlanal.h" +#include "asan.h" /* This file should be included last. */ #include "target-def.h" @@ -7799,6 +7800,24 @@ aarch64_return_address_signing_enabled (void) && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0))); } +/* Return TRUE if shadow call stack should be enabled for the current + function, otherwise return FALSE. */ + +bool +aarch64_shadow_call_stack_enabled (void) +{ + /* This function should only be called after frame laid out. */ + gcc_assert (cfun->machine->frame.laid_out); + + if (crtl->calls_eh_return) + return false; + + /* We only deal with a function if its LR is pushed onto stack + and attribute no_sanitize_shadow_call_stack is not specified. */ + return (sanitize_flags_p (SANITIZE_SHADOW_CALL_STACK) + && known_ge (cfun->machine->frame.reg_offset[LR_REGNUM], 0)); +} + /* Return TRUE if Branch Target Identification Mechanism is enabled. */ bool aarch64_bti_enabled (void) @@ -8810,6 +8829,10 @@ aarch64_expand_prologue (void) RTX_FRAME_RELATED_P (insn) = 1; } + /* Push return address to shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_push ()); + if (flag_stack_usage_info) current_function_static_stack_size = constant_lower_bound (frame_size); @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall) RTX_FRAME_RELATED_P (insn) = 1; } + /* Pop return address from shadow call stack. */ + if (aarch64_shadow_call_stack_enabled ()) + emit_insn (gen_scs_pop ()); + /* We prefer to emit the combined return/authenticate instruction RETAA, however there are three cases in which we must instead emit an explicit authentication instruction. diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 2792bb29adb..1a83875dec8 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -100,6 +100,10 @@ generating stack clash probes. */ #define STACK_CLASH_MAX_UNROLL_PAGES 4 +/* This value represents whether the shadow call stack is implemented on + the target platform. */ +#define TARGET_SUPPORT_SHADOW_CALL_STACK true + /* The architecture reserves all bits of the address for hardware use, so the vbit must go into the delta field of pointers to member functions. This is the same config as that in the AArch32 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1a39470a1fe..8e68a6f793d 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -6994,6 +6994,24 @@ (define_insn "xpaclri" "hint\t7 // xpaclri" ) +;; Save X30 in the X18-based POST_INC stack (consistent with clang). +(define_insn "scs_push" + [(set (mem:DI (reg:DI R18_REGNUM)) (reg:DI R30_REGNUM)) + (set (reg:DI R18_REGNUM) (plus:DI (reg:DI R18_REGNUM) (const_int 8)))] + "" + "str\\tx30, [x18], #8" + [(set_attr "type" "store_8")] +) + +;; Load X30 form the X18-based PRE_DEC stack (consistent with clang). +(define_insn "scs_pop" + [(set (reg:DI R18_REGNUM) (minus:DI (reg:DI R18_REGNUM) (const_int 8))) + (set (reg:DI R30_REGNUM) (mem:DI (reg:DI R18_REGNUM)))] + "" + "ldr\\tx30, [x18, #-8]!" + [(set_attr "type" "load_8")] +) + ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and ;; all of memory. This blocks insns from being moved across this point. diff --git a/gcc/defaults.h b/gcc/defaults.h index bb68d0d1a79..0f1719a3bb5 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1172,6 +1172,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define PCC_BITFIELD_TYPE_MATTERS false #endif +#ifndef TARGET_SUPPORT_SHADOW_CALL_STACK +#define TARGET_SUPPORT_SHADOW_CALL_STACK false +#endif + #ifndef INSN_SETS_ARE_DELAYED #define INSN_SETS_ARE_DELAYED(INSN) false #endif diff --git a/gcc/flag-types.h b/gcc/flag-types.h index a5a637160d7..c22ef35a289 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -321,6 +321,8 @@ enum sanitize_code { SANITIZE_HWADDRESS = 1UL << 28, SANITIZE_USER_HWADDRESS = 1UL << 29, SANITIZE_KERNEL_HWADDRESS = 1UL << 30, + /* Shadow Call Stack. */ + SANITIZE_SHADOW_CALL_STACK = 1UL << 31, SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT, SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN diff --git a/gcc/opts.c b/gcc/opts.c index 4472cec1b98..e94f316fb85 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -1308,6 +1308,11 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set, sorry ("transactional memory is not supported with " "%<-fsanitize=kernel-address%>"); + if ((opts->x_flag_sanitize & SANITIZE_SHADOW_CALL_STACK) + && !TARGET_SUPPORT_SHADOW_CALL_STACK) + error_at (loc, "%<-fsanitize=shadow-call-stack%> not supported " + "in current platform"); + /* Currently live patching is not support for LTO. */ if (opts->x_flag_live_patching && opts->x_flag_lto) sorry ("live patching is not supported with LTO"); @@ -1994,6 +1999,7 @@ const struct sanitizer_opts_s sanitizer_opts[] = SANITIZER_OPT (vptr, SANITIZE_VPTR, true), SANITIZER_OPT (pointer-overflow, SANITIZE_POINTER_OVERFLOW, true), SANITIZER_OPT (builtin, SANITIZE_BUILTIN, true), + SANITIZER_OPT (shadow-call-stack, SANITIZE_SHADOW_CALL_STACK, false), SANITIZER_OPT (all, ~0U, true), #undef SANITIZER_OPT { NULL, 0U, 0UL, false }
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as described in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_shadow_call_stack_attribute): gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch64_shadow_call_stack_enabled): * config/aarch64/aarch64.c (aarch64_shadow_call_stack_enabled): (aarch64_expand_prologue): (aarch64_expand_epilogue): * config/aarch64/aarch64.h (TARGET_SUPPORT_SHADOW_CALL_STACK): * config/aarch64/aarch64.md (scs_push): (scs_pop): * defaults.h (TARGET_SUPPORT_SHADOW_CALL_STACK): * flag-types.h (enum sanitize_code): * opts.c (finish_options): Signed-off-by: Dan Li <ashimida@linux.alibaba.com> --- gcc/c-family/c-attribs.c | 21 +++++++++++++++++++++ gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 27 +++++++++++++++++++++++++++ gcc/config/aarch64/aarch64.h | 4 ++++ gcc/config/aarch64/aarch64.md | 18 ++++++++++++++++++ gcc/defaults.h | 4 ++++ gcc/flag-types.h | 2 ++ gcc/opts.c | 6 ++++++ 8 files changed, 83 insertions(+)