Message ID | 20241211020424.401614-1-volodymyr_babchuk@epam.com (mailing list archive) |
---|---|
Headers | show |
Series | Add stack protector | expand |
On 11.12.2024 03:04, Volodymyr Babchuk wrote: > Both GCC and Clang support -fstack-protector feature, which add stack > canaries to functions where stack corruption is possible. This series > makes possible to use this feature in Xen. I tested this on ARM64 and > it is working as intended. Tested both with GCC and Clang. > > It is hard to enable this feature on x86, as GCC stores stack canary > in %fs:40 by default, but Xen can't use %fs for various reasons. It is > possibly to change stack canary location new newer GCC versions, but > this will change minimal GCC requirement, which is also hard due to > various reasons. So, this series focus mostly on ARM and RISCV. Why exactly would it not be possible to offer the feature when new enough gcc is in use? Jan
Hello Jan, Jan Beulich <jbeulich@suse.com> writes: > On 11.12.2024 03:04, Volodymyr Babchuk wrote: >> Both GCC and Clang support -fstack-protector feature, which add stack >> canaries to functions where stack corruption is possible. This series >> makes possible to use this feature in Xen. I tested this on ARM64 and >> it is working as intended. Tested both with GCC and Clang. >> >> It is hard to enable this feature on x86, as GCC stores stack canary >> in %fs:40 by default, but Xen can't use %fs for various reasons. It is >> possibly to change stack canary location new newer GCC versions, but >> this will change minimal GCC requirement, which is also hard due to >> various reasons. So, this series focus mostly on ARM and RISCV. > > Why exactly would it not be possible to offer the feature when new enough > gcc is in use? It is possible to use this feature with a modern enough GCC, yes. Are you suggesting to make HAS_STACK_PROTECTOR dependent on GCC_VERSION for x86 platform?
On 12/12/2024 12:13 am, Volodymyr Babchuk wrote: > Hello Jan, > > Jan Beulich <jbeulich@suse.com> writes: > >> On 11.12.2024 03:04, Volodymyr Babchuk wrote: >>> Both GCC and Clang support -fstack-protector feature, which add stack >>> canaries to functions where stack corruption is possible. This series >>> makes possible to use this feature in Xen. I tested this on ARM64 and >>> it is working as intended. Tested both with GCC and Clang. >>> >>> It is hard to enable this feature on x86, as GCC stores stack canary >>> in %fs:40 by default, but Xen can't use %fs for various reasons. It is >>> possibly to change stack canary location new newer GCC versions, but >>> this will change minimal GCC requirement, which is also hard due to >>> various reasons. So, this series focus mostly on ARM and RISCV. >> Why exactly would it not be possible to offer the feature when new enough >> gcc is in use? > It is possible to use this feature with a modern enough GCC, yes. Are > you suggesting to make HAS_STACK_PROTECTOR dependent on GCC_VERSION for > x86 platform? (With the knowledge that this is a disputed Kconfig pattern, and will need rebasing), the way I want this to work is simply: diff --git a/xen/Makefile b/xen/Makefile index 0de0101fd0bf..5d0a88fb3c3f 100644 --- a/xen/Makefile +++ b/xen/Makefile @@ -434,6 +434,9 @@ endif ifeq ($(CONFIG_STACK_PROTECTOR),y) CFLAGS += -fstack-protector +ifeq ($(CONFIG_X86),y) +CFLAGS += -mstack-protector-guard=global +endif else CFLAGS += -fno-stack-protector endif diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig index 9cdd04721afa..7951ca908b36 100644 --- a/xen/arch/x86/Kconfig +++ b/xen/arch/x86/Kconfig @@ -28,6 +28,7 @@ config X86 select HAS_PCI_MSI select HAS_PIRQ select HAS_SCHED_GRANULARITY + select HAS_STACK_PROTECTOR if $(cc-option,-mstack-protector-guard=global) select HAS_UBSAN select HAS_VMAP select HAS_VPCI if HVM Sadly, it doesn't build. I get a handful of: prelink.o: in function `cmdline_parse': /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed to convert GOTPCREL relocation against '__stack_chk_guard'; relink with --no-relax /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed to convert GOTPCREL relocation against '__stack_chk_guard'; relink with --no-relax which is more toolchain-whispering than I feel like doing tonight. ~Andrew
On 12/12/2024 1:17 am, Andrew Cooper wrote: > On 12/12/2024 12:13 am, Volodymyr Babchuk wrote: >> Hello Jan, >> >> Jan Beulich <jbeulich@suse.com> writes: >> >>> On 11.12.2024 03:04, Volodymyr Babchuk wrote: >>>> Both GCC and Clang support -fstack-protector feature, which add stack >>>> canaries to functions where stack corruption is possible. This series >>>> makes possible to use this feature in Xen. I tested this on ARM64 and >>>> it is working as intended. Tested both with GCC and Clang. >>>> >>>> It is hard to enable this feature on x86, as GCC stores stack canary >>>> in %fs:40 by default, but Xen can't use %fs for various reasons. It is >>>> possibly to change stack canary location new newer GCC versions, but >>>> this will change minimal GCC requirement, which is also hard due to >>>> various reasons. So, this series focus mostly on ARM and RISCV. >>> Why exactly would it not be possible to offer the feature when new enough >>> gcc is in use? >> It is possible to use this feature with a modern enough GCC, yes. Are >> you suggesting to make HAS_STACK_PROTECTOR dependent on GCC_VERSION for >> x86 platform? > (With the knowledge that this is a disputed Kconfig pattern, and will > need rebasing), the way I want this to work is simply: > > diff --git a/xen/Makefile b/xen/Makefile > index 0de0101fd0bf..5d0a88fb3c3f 100644 > --- a/xen/Makefile > +++ b/xen/Makefile > @@ -434,6 +434,9 @@ endif > > ifeq ($(CONFIG_STACK_PROTECTOR),y) > CFLAGS += -fstack-protector > +ifeq ($(CONFIG_X86),y) > +CFLAGS += -mstack-protector-guard=global > +endif > else > CFLAGS += -fno-stack-protector > endif > diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig > index 9cdd04721afa..7951ca908b36 100644 > --- a/xen/arch/x86/Kconfig > +++ b/xen/arch/x86/Kconfig > @@ -28,6 +28,7 @@ config X86 > select HAS_PCI_MSI > select HAS_PIRQ > select HAS_SCHED_GRANULARITY > + select HAS_STACK_PROTECTOR if > $(cc-option,-mstack-protector-guard=global) > select HAS_UBSAN > select HAS_VMAP > select HAS_VPCI if HVM > > > > Sadly, it doesn't build. I get a handful of: > > prelink.o: in function `cmdline_parse': > /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed > to convert GOTPCREL relocation against '__stack_chk_guard'; relink with > --no-relax > /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed > to convert GOTPCREL relocation against '__stack_chk_guard'; relink with > --no-relax > > which is more toolchain-whispering than I feel like doing tonight. P.S. Irrespective of the x86 side of things, you need a final patch on your series adjusting CHANGELOG.md. ~Andrew
On 12.12.2024 01:13, Volodymyr Babchuk wrote: > Jan Beulich <jbeulich@suse.com> writes: >> On 11.12.2024 03:04, Volodymyr Babchuk wrote: >>> Both GCC and Clang support -fstack-protector feature, which add stack >>> canaries to functions where stack corruption is possible. This series >>> makes possible to use this feature in Xen. I tested this on ARM64 and >>> it is working as intended. Tested both with GCC and Clang. >>> >>> It is hard to enable this feature on x86, as GCC stores stack canary >>> in %fs:40 by default, but Xen can't use %fs for various reasons. It is >>> possibly to change stack canary location new newer GCC versions, but >>> this will change minimal GCC requirement, which is also hard due to >>> various reasons. So, this series focus mostly on ARM and RISCV. >> >> Why exactly would it not be possible to offer the feature when new enough >> gcc is in use? > > It is possible to use this feature with a modern enough GCC, yes. Are > you suggesting to make HAS_STACK_PROTECTOR dependent on GCC_VERSION for > x86 platform? Only kind of. I remain yet to be convinced (or formally outvoted) on such wanting to live in (only) Kconfig. I actually proposed a hybrid model [1]. But yes - some time of build time dependency. Jan [1] https://lists.xen.org/archives/html/xen-devel/2022-09/msg01793.html
On 12.12.2024 02:17, Andrew Cooper wrote: > (With the knowledge that this is a disputed Kconfig pattern, and will > need rebasing), the way I want this to work is simply: > > diff --git a/xen/Makefile b/xen/Makefile > index 0de0101fd0bf..5d0a88fb3c3f 100644 > --- a/xen/Makefile > +++ b/xen/Makefile > @@ -434,6 +434,9 @@ endif > > ifeq ($(CONFIG_STACK_PROTECTOR),y) > CFLAGS += -fstack-protector > +ifeq ($(CONFIG_X86),y) > +CFLAGS += -mstack-protector-guard=global > +endif > else > CFLAGS += -fno-stack-protector > endif > diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig > index 9cdd04721afa..7951ca908b36 100644 > --- a/xen/arch/x86/Kconfig > +++ b/xen/arch/x86/Kconfig > @@ -28,6 +28,7 @@ config X86 > select HAS_PCI_MSI > select HAS_PIRQ > select HAS_SCHED_GRANULARITY > + select HAS_STACK_PROTECTOR if > $(cc-option,-mstack-protector-guard=global) > select HAS_UBSAN > select HAS_VMAP > select HAS_VPCI if HVM > > > > Sadly, it doesn't build. I get a handful of: > > prelink.o: in function `cmdline_parse': > /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed > to convert GOTPCREL relocation against '__stack_chk_guard'; relink with > --no-relax > /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed > to convert GOTPCREL relocation against '__stack_chk_guard'; relink with > --no-relax > > which is more toolchain-whispering than I feel like doing tonight. Imo the root of the problem is that the compiler doesn't itself mark __stack_chk_guard hidden (it does so for __stack_chk_fail, albeit only for 32-bit code), and hence finds it necessary to use @gotpcrel to access the variable. Even if the linker managed to relax all of these, it would then still be less efficient compared to direct RIP-relative accesses. I also can't see how we might be able to override the compiler's internal declaration to mark it hidden (the same appears to be true for other items the declares internally, like the retpoline thunks or even strcmp() et al). Passing -fvisibility=hidden doesn't make a difference (just as another data point). Playing with -fstack-protector* flavors, I observe: - -fstack-protector causing several failures, like you observed, oddly enough exclusively from __init functions, - -fstack-protector-all and -fstack-protector-strong each causing a single (but respectively different) failure, for apparently random non-__init functions. Taking this together it very much smells like a linker issue. I'll see about checking there further. Jan
On 12.12.2024 15:30, Jan Beulich wrote: > On 12.12.2024 02:17, Andrew Cooper wrote: >> (With the knowledge that this is a disputed Kconfig pattern, and will >> need rebasing), the way I want this to work is simply: >> >> diff --git a/xen/Makefile b/xen/Makefile >> index 0de0101fd0bf..5d0a88fb3c3f 100644 >> --- a/xen/Makefile >> +++ b/xen/Makefile >> @@ -434,6 +434,9 @@ endif >> >> ifeq ($(CONFIG_STACK_PROTECTOR),y) >> CFLAGS += -fstack-protector >> +ifeq ($(CONFIG_X86),y) >> +CFLAGS += -mstack-protector-guard=global >> +endif >> else >> CFLAGS += -fno-stack-protector >> endif >> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig >> index 9cdd04721afa..7951ca908b36 100644 >> --- a/xen/arch/x86/Kconfig >> +++ b/xen/arch/x86/Kconfig >> @@ -28,6 +28,7 @@ config X86 >> select HAS_PCI_MSI >> select HAS_PIRQ >> select HAS_SCHED_GRANULARITY >> + select HAS_STACK_PROTECTOR if >> $(cc-option,-mstack-protector-guard=global) >> select HAS_UBSAN >> select HAS_VMAP >> select HAS_VPCI if HVM >> >> >> >> Sadly, it doesn't build. I get a handful of: >> >> prelink.o: in function `cmdline_parse': >> /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed >> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >> --no-relax >> /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed >> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >> --no-relax >> >> which is more toolchain-whispering than I feel like doing tonight. > > Imo the root of the problem is that the compiler doesn't itself mark > __stack_chk_guard hidden (it does so for __stack_chk_fail, albeit only for > 32-bit code), and hence finds it necessary to use @gotpcrel to access the > variable. Even if the linker managed to relax all of these, it would then > still be less efficient compared to direct RIP-relative accesses. > > I also can't see how we might be able to override the compiler's internal > declaration to mark it hidden (the same appears to be true for other items > the declares internally, like the retpoline thunks or even strcmp() et al). > Passing -fvisibility=hidden doesn't make a difference (just as another > data point). > > Playing with -fstack-protector* flavors, I observe: > - -fstack-protector causing several failures, like you observed, oddly > enough exclusively from __init functions, > - -fstack-protector-all and -fstack-protector-strong each causing a single > (but respectively different) failure, for apparently random non-__init > functions. > Taking this together it very much smells like a linker issue. I'll see > about checking there further. The oddity with how many diags show up is down to internals of the linker. It processes a single input section in full (continuing over this specific type of error), but will stop processing afterwards if any such error was encountered. The issue itself is a wrong assumption in the linker: It believes that it would only ever build small-model code when encountering this kind of relocation, and when not linking a shared library or PIE. With this assumption it converts the relocation resulting from @gotpcrel to R_X86_64_32S (converting the MOV from GOT to MOV $imm), which of course overflows when later trying to actually resolve it. What I'm yet to understand is why it doesn't use R_X86_64_PC32 (also) in such a situation (it does e.g. when building a shared library). While so far I didn't try it, using --no-relax is presumably not an option, as I expect that it'll leave us with a non-empty .got. Plus I didn't even start looking into how the xen.efi linking would deal with the ELF-specific gotpcrel relocs; the concept of GOT doesn't exist in PE/COFF, after all. While the linker certainly wants fixing, I continue to think that getting the compiler side right would yield the better overall result. Jan
On 12/12/2024 4:52 pm, Jan Beulich wrote: > On 12.12.2024 15:30, Jan Beulich wrote: >> On 12.12.2024 02:17, Andrew Cooper wrote: >>> (With the knowledge that this is a disputed Kconfig pattern, and will >>> need rebasing), the way I want this to work is simply: >>> >>> diff --git a/xen/Makefile b/xen/Makefile >>> index 0de0101fd0bf..5d0a88fb3c3f 100644 >>> --- a/xen/Makefile >>> +++ b/xen/Makefile >>> @@ -434,6 +434,9 @@ endif >>> >>> ifeq ($(CONFIG_STACK_PROTECTOR),y) >>> CFLAGS += -fstack-protector >>> +ifeq ($(CONFIG_X86),y) >>> +CFLAGS += -mstack-protector-guard=global >>> +endif >>> else >>> CFLAGS += -fno-stack-protector >>> endif >>> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig >>> index 9cdd04721afa..7951ca908b36 100644 >>> --- a/xen/arch/x86/Kconfig >>> +++ b/xen/arch/x86/Kconfig >>> @@ -28,6 +28,7 @@ config X86 >>> select HAS_PCI_MSI >>> select HAS_PIRQ >>> select HAS_SCHED_GRANULARITY >>> + select HAS_STACK_PROTECTOR if >>> $(cc-option,-mstack-protector-guard=global) >>> select HAS_UBSAN >>> select HAS_VMAP >>> select HAS_VPCI if HVM >>> >>> >>> >>> Sadly, it doesn't build. I get a handful of: >>> >>> prelink.o: in function `cmdline_parse': >>> /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed >>> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >>> --no-relax >>> /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed >>> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >>> --no-relax >>> >>> which is more toolchain-whispering than I feel like doing tonight. >> Imo the root of the problem is that the compiler doesn't itself mark >> __stack_chk_guard hidden (it does so for __stack_chk_fail, albeit only for >> 32-bit code), and hence finds it necessary to use @gotpcrel to access the >> variable. Even if the linker managed to relax all of these, it would then >> still be less efficient compared to direct RIP-relative accesses. >> >> I also can't see how we might be able to override the compiler's internal >> declaration to mark it hidden (the same appears to be true for other items >> the declares internally, like the retpoline thunks or even strcmp() et al). >> Passing -fvisibility=hidden doesn't make a difference (just as another >> data point). >> >> Playing with -fstack-protector* flavors, I observe: >> - -fstack-protector causing several failures, like you observed, oddly >> enough exclusively from __init functions, >> - -fstack-protector-all and -fstack-protector-strong each causing a single >> (but respectively different) failure, for apparently random non-__init >> functions. >> Taking this together it very much smells like a linker issue. I'll see >> about checking there further. > The oddity with how many diags show up is down to internals of the linker. > It processes a single input section in full (continuing over this specific > type of error), but will stop processing afterwards if any such error was > encountered. > > The issue itself is a wrong assumption in the linker: It believes that it > would only ever build small-model code when encountering this kind of > relocation, and when not linking a shared library or PIE. With this > assumption it converts the relocation resulting from @gotpcrel to > R_X86_64_32S (converting the MOV from GOT to MOV $imm), which of course > overflows when later trying to actually resolve it. What I'm yet to > understand is why it doesn't use R_X86_64_PC32 (also) in such a situation > (it does e.g. when building a shared library). > > While so far I didn't try it, using --no-relax is presumably not an option, > as I expect that it'll leave us with a non-empty .got. Plus I didn't even > start looking into how the xen.efi linking would deal with the ELF-specific > gotpcrel relocs; the concept of GOT doesn't exist in PE/COFF, after all. > > While the linker certainly wants fixing, I continue to think that getting > the compiler side right would yield the better overall result. Ok, so what precisely needs doing here? For starters, I guess __stack_chk_guard wants to respect -fvisibilty=hidden and/or #pragma. I can see why it wouldn't want to in regular userspace, but we're not that. There's clearly also an LD error (bad assumptions about model). How many other bugs need opening in the various bugzillas? ~Andrew
On 19.12.2024 01:20, Andrew Cooper wrote: > On 12/12/2024 4:52 pm, Jan Beulich wrote: >> On 12.12.2024 15:30, Jan Beulich wrote: >>> On 12.12.2024 02:17, Andrew Cooper wrote: >>>> (With the knowledge that this is a disputed Kconfig pattern, and will >>>> need rebasing), the way I want this to work is simply: >>>> >>>> diff --git a/xen/Makefile b/xen/Makefile >>>> index 0de0101fd0bf..5d0a88fb3c3f 100644 >>>> --- a/xen/Makefile >>>> +++ b/xen/Makefile >>>> @@ -434,6 +434,9 @@ endif >>>> >>>> ifeq ($(CONFIG_STACK_PROTECTOR),y) >>>> CFLAGS += -fstack-protector >>>> +ifeq ($(CONFIG_X86),y) >>>> +CFLAGS += -mstack-protector-guard=global >>>> +endif >>>> else >>>> CFLAGS += -fno-stack-protector >>>> endif >>>> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig >>>> index 9cdd04721afa..7951ca908b36 100644 >>>> --- a/xen/arch/x86/Kconfig >>>> +++ b/xen/arch/x86/Kconfig >>>> @@ -28,6 +28,7 @@ config X86 >>>> select HAS_PCI_MSI >>>> select HAS_PIRQ >>>> select HAS_SCHED_GRANULARITY >>>> + select HAS_STACK_PROTECTOR if >>>> $(cc-option,-mstack-protector-guard=global) >>>> select HAS_UBSAN >>>> select HAS_VMAP >>>> select HAS_VPCI if HVM >>>> >>>> >>>> >>>> Sadly, it doesn't build. I get a handful of: >>>> >>>> prelink.o: in function `cmdline_parse': >>>> /home/andrew/xen.git/xen/common/kernel.c:216:(.init.text+0x20f2): failed >>>> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >>>> --no-relax >>>> /home/andrew/xen.git/xen/common/kernel.c:230:(.init.text+0x246f): failed >>>> to convert GOTPCREL relocation against '__stack_chk_guard'; relink with >>>> --no-relax >>>> >>>> which is more toolchain-whispering than I feel like doing tonight. >>> Imo the root of the problem is that the compiler doesn't itself mark >>> __stack_chk_guard hidden (it does so for __stack_chk_fail, albeit only for >>> 32-bit code), and hence finds it necessary to use @gotpcrel to access the >>> variable. Even if the linker managed to relax all of these, it would then >>> still be less efficient compared to direct RIP-relative accesses. >>> >>> I also can't see how we might be able to override the compiler's internal >>> declaration to mark it hidden (the same appears to be true for other items >>> the declares internally, like the retpoline thunks or even strcmp() et al). >>> Passing -fvisibility=hidden doesn't make a difference (just as another >>> data point). >>> >>> Playing with -fstack-protector* flavors, I observe: >>> - -fstack-protector causing several failures, like you observed, oddly >>> enough exclusively from __init functions, >>> - -fstack-protector-all and -fstack-protector-strong each causing a single >>> (but respectively different) failure, for apparently random non-__init >>> functions. >>> Taking this together it very much smells like a linker issue. I'll see >>> about checking there further. >> The oddity with how many diags show up is down to internals of the linker. >> It processes a single input section in full (continuing over this specific >> type of error), but will stop processing afterwards if any such error was >> encountered. >> >> The issue itself is a wrong assumption in the linker: It believes that it >> would only ever build small-model code when encountering this kind of >> relocation, and when not linking a shared library or PIE. With this >> assumption it converts the relocation resulting from @gotpcrel to >> R_X86_64_32S (converting the MOV from GOT to MOV $imm), which of course >> overflows when later trying to actually resolve it. What I'm yet to >> understand is why it doesn't use R_X86_64_PC32 (also) in such a situation >> (it does e.g. when building a shared library). >> >> While so far I didn't try it, using --no-relax is presumably not an option, >> as I expect that it'll leave us with a non-empty .got. Plus I didn't even >> start looking into how the xen.efi linking would deal with the ELF-specific >> gotpcrel relocs; the concept of GOT doesn't exist in PE/COFF, after all. >> >> While the linker certainly wants fixing, I continue to think that getting >> the compiler side right would yield the better overall result. > > Ok, so what precisely needs doing here? > > For starters, I guess __stack_chk_guard wants to respect > -fvisibilty=hidden and/or #pragma. I can see why it wouldn't want to in > regular userspace, but we're not that. Yes, this is one of the things that may want reporting as a deficiency. Imo it wants generalizing though, as it's not __stack_chk_guard alone which is affected. I'm not, btw, convinced the #pragma ought to have any effect. One might consider it legitimate to have an effect if there's a subsequent re- declaration. Yet one might also consider such an (incompatible) re- declaration be an error. > There's clearly also an LD error (bad assumptions about model). I'm still in the process of collecting data for an eventual email or bug report to be put together. For the purposes here I meanwhile think it is largely irrelevant. That's because, as said, even if the linker was fixed in this regard, there would still be the fact that we'd end up with a non- empty .got (I did hack the linker enough to verify this). Whereas once the compiler side was sorted, the linker issue wouldn't come into play anymore. Jan