Message ID | 20190614145755.10926-1-GNUtoo@cyberdimension.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | security: do not enable CONFIG_GCC_PLUGINS by default | expand |
+32-bit ARM folks On Fri, Jun 14, 2019 at 5:10 PM Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org> wrote: > On a Galaxy SIII (I9300), the patch mentioned below broke boot: > - The display still had the bootloader logo, while with this > patch, the 4 Tux logo appears. > - No print appeared on the serial port anymore after the kernel > was loaded, whereas with this patch, we have the serial > console working, and the device booting. > > Booting was broken by the following commit: > 9f671e58159a ("security: Create "kernel hardening" config area") > > As the bootloader of this device enables the MMU, I had the following > patch applied during the tests: > Author: Arve Hjønnevåg <arve@android.com> > Date: Fri Nov 30 17:05:40 2012 -0800 > > ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to client mode > > If the bootloader used a page table that is incompatible with domain 0 > in client mode, and boots with the mmu on, then swithing domain 0 to > client mode causes a fault if we don't flush the tlb after updating > the page table pointer. > > v2: Add ISB before loading dacr. > > diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S > index 7135820f76d4..6e87ceda3b29 100644 > --- a/arch/arm/boot/compressed/head.S > +++ b/arch/arm/boot/compressed/head.S > @@ -837,6 +837,8 @@ __armv7_mmu_cache_on: > bic r6, r6, #1 << 31 @ 32-bit translation system > bic r6, r6, #(7 << 0) | (1 << 4) @ use only ttbr0 > mcrne p15, 0, r3, c2, c0, 0 @ load page table pointer > + mcrne p15, 0, r0, c8, c7, 0 @ flush I,D TLBs > + mcr p15, 0, r0, c7, c5, 4 @ ISB > mcrne p15, 0, r1, c3, c0, 0 @ load domain access control > mcrne p15, 0, r6, c2, c0, 2 @ load ttb control > #endif > > Signed-off-by: Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org> > --- > scripts/gcc-plugins/Kconfig | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig > index e9c677a53c74..afa1db3d3471 100644 > --- a/scripts/gcc-plugins/Kconfig > +++ b/scripts/gcc-plugins/Kconfig > @@ -18,7 +18,6 @@ config GCC_PLUGINS > bool > depends on HAVE_GCC_PLUGINS > depends on PLUGIN_HOSTCC != "" > - default y > help > GCC plugins are loadable modules that provide extra features to the > compiler. They are useful for runtime instrumentation and static analysis. I don't think GCC_PLUGINS alone is supposed to generate any code? It just makes it possible to enable a bunch of other kconfig flags that can generate code. STACKPROTECTOR_PER_TASK defaults to y and depends on GCC_PLUGINS, so is that perhaps what broke? Can you try whether disabling just that works for you? My guess is that maybe there is some early boot code that needs to have the stack protector disabled, or something like that.
On Fri, 14 Jun 2019 18:05:19 +0200 Jann Horn <jannh@google.com> wrote: [...] > STACKPROTECTOR_PER_TASK defaults to y and depends on GCC_PLUGINS, so > is that perhaps what broke? Can you try whether disabling just that > works for you? I've tried that, and it's sufficient to make the device boot. Denis.
On Fri, Jun 14, 2019 at 06:05:19PM +0200, Jann Horn wrote: > +32-bit ARM folks > > On Fri, Jun 14, 2019 at 5:10 PM Denis 'GNUtoo' Carikli > <GNUtoo@cyberdimension.org> wrote: > > On a Galaxy SIII (I9300), the patch mentioned below broke boot: > > - The display still had the bootloader logo, while with this > > patch, the 4 Tux logo appears. > > - No print appeared on the serial port anymore after the kernel > > was loaded, whereas with this patch, we have the serial > > console working, and the device booting. > > > > Booting was broken by the following commit: > > 9f671e58159a ("security: Create "kernel hardening" config area") > > > > As the bootloader of this device enables the MMU, I had the following > > patch applied during the tests: > > Author: Arve Hjønnevåg <arve@android.com> > > Date: Fri Nov 30 17:05:40 2012 -0800 > > > > ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to client mode > > > > If the bootloader used a page table that is incompatible with domain 0 > > in client mode, and boots with the mmu on, then swithing domain 0 to > > client mode causes a fault if we don't flush the tlb after updating > > the page table pointer. > > > > v2: Add ISB before loading dacr. I'm wondering whether this is sloppy wording or whether the author is really implying that they call the kernel decompressor with the MMU enabled, against the express instructions in Documentation/arm/Booting. If they are going against the express instructions, all bets are off.
On Fri, 14 Jun 2019 17:28:11 +0100 Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > I'm wondering whether this is sloppy wording or whether the author is > really implying that they call the kernel decompressor with the MMU > enabled, against the express instructions in > Documentation/arm/Booting. According to [1] > If they are going against the express instructions, all bets are off. More background on the decompressor patch: - The "ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to client mode" patch is needed anyway since 3.4 in any case, and according to the thread about it [1], the MMU is on at boot. - There is a downstream u-boot port for the Galaxy SIII and other very similar devices, which doesn't setup the MMU at boot, but I'm not confident enough to test in on the devices I have. To test with u-boot I'd need to find a new device. - If I don't manage to find a new device to test on, since there is already some setup code like arch/arm/boot/compressed/head-sa1100.S that deal with MMU that are enabled with the bootloader, are patches to add a new file like that still accepted? The big downside is that using something like that is probably incompatible with ARCH_MULTIPLATFORM. References: ----------- [1]http://lkml.iu.edu/hypermail/linux/kernel/1212.1/02099.html [2]https://blog.forkwhiletrue.me/posts/an-almost-fully-libre-galaxy-s3/ Denis.
On Fri, Jun 14, 2019 at 08:14:34PM +0200, Denis 'GNUtoo' Carikli wrote: > On Fri, 14 Jun 2019 17:28:11 +0100 > Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > I'm wondering whether this is sloppy wording or whether the author is > > really implying that they call the kernel decompressor with the MMU > > enabled, against the express instructions in > > Documentation/arm/Booting. > According to [1] > > If they are going against the express instructions, all bets are off. > > More background on the decompressor patch: > - The "ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to > client mode" patch is needed anyway since 3.4 in any case, and > according to the thread about it [1], the MMU is on at boot. > - There is a downstream u-boot port for the Galaxy SIII and other very > similar devices, which doesn't setup the MMU at boot, but I'm not > confident enough to test in on the devices I have. To test with > u-boot I'd need to find a new device. > - If I don't manage to find a new device to test on, since there is > already some setup code like arch/arm/boot/compressed/head-sa1100.S > that deal with MMU that are enabled with the bootloader, are patches > to add a new file like that still accepted? The big downside is that > using something like that is probably incompatible with > ARCH_MULTIPLATFORM. SA11x0 pre-dates the booting document, which came about because of the desire to make the kernel less dependent on the host CPU type. So "sa11x0 does it so we can do it" is really not an argument I ever want to see to justify this kind of stuff. The booting requirements have been known since at least 2002, some SEVENTEEN years ago, and the problem was identified as buggy back in 2012. As far as I can see, nothing has changed. Entering the kernel with the MMU on and optionally caches on is an inherently unsafe thing to do. The kernel would have been placed into RAM via the data cache, and then we're trying to execute code - unless the caches have been properly cleaned and invalidated, there is no guarantee that we'd even reach any instructions to do our own cache cleaning and invalidation. So, caches on is utter madness. MMU on presents a problem: the kernel moves itself around during decompression - if it happens to move itself on top of the in-use page tables, then that would be really bad. There's another issue as well - if the page tables are already setup, and we create a different mapping for the virtual address range, the _only_ way to safely switch to that mapping is via a break-make arrangement, which means we need code to disable the MMU, flush it. It is not as simple as "a few extra instructions to flush TLBs" although that may work in the majority of cases. Architecturally, it is wrong. Things can get even worse - what if the page tables are located where the kernel writes its own page tables - modifying the live tables and changing the type of the entries. Architecturally unpredictable behaviour may result. What is written in Documentation/arm/Booting is not for our fun, it is there to spell out what the kernel requires to be able to boot reliably on hardware. If it isn't followed, then booting a kernel will be unreliable.
On Fri, Jun 14, 2019 at 06:05:19PM +0200, Jann Horn wrote: > On Fri, Jun 14, 2019 at 5:10 PM Denis 'GNUtoo' Carikli > <GNUtoo@cyberdimension.org> wrote: > > Booting was broken by the following commit: > > 9f671e58159a ("security: Create "kernel hardening" config area") > > I don't think GCC_PLUGINS alone is supposed to generate any code? It > just makes it possible to enable a bunch of other kconfig flags that > can generate code. > > STACKPROTECTOR_PER_TASK defaults to y and depends on GCC_PLUGINS, so > is that perhaps what broke? Can you try whether disabling just that > works for you? Yes, this has come up before: the option you want to disable is as Jann mentions: CONFIG_STACKPROTECTOR_PER_TASK. > My guess is that maybe there is some early boot code that needs to > have the stack protector disabled, or something like that. Right, though I'm not sure what portion would be specific to that device. You can turn off SSP on a per-file basis with: CFLAGS_target.o += $(DISABLE_ARM_SSP_PER_TASK_PLUGIN) or per-Makefile, as in arch/arm/boot/compressed/Makefile.
Hi, On Fri, 2019-06-14 at 20:14 +0200, Denis 'GNUtoo' Carikli wrote: > On Fri, 14 Jun 2019 17:28:11 +0100 > Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > I'm wondering whether this is sloppy wording or whether the author is > > really implying that they call the kernel decompressor with the MMU > > enabled, against the express instructions in > > Documentation/arm/Booting. > According to [1] > > If they are going against the express instructions, all bets are off. > > More background on the decompressor patch: > - The "ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to > client mode" patch is needed anyway since 3.4 in any case, and > according to the thread about it [1], the MMU is on at boot. > - There is a downstream u-boot port for the Galaxy SIII and other very > similar devices, which doesn't setup the MMU at boot, but I'm not > confident enough to test in on the devices I have. To test with > u-boot I'd need to find a new device. > - If I don't manage to find a new device to test on, since there is > already some setup code like arch/arm/boot/compressed/head-sa1100.S > that deal with MMU that are enabled with the bootloader, are patches > to add a new file like that still accepted? The big downside is that > using something like that is probably incompatible with > ARCH_MULTIPLATFORM. Maybe we could also consider having a shim that is executed before the kernel in order to sanitize things and allow booting a mainline kernel, which would be less invasive than a full U-Boot port. Other than that, we can probably manage keeping a tree around (at the Replicant project) with mainline and this patch (enabled through a dedicated config option). As long as it's not horrible to rebase, it can work well enough for us. I'm also not sure about the state of Android support in mainline today, but there's a chance we'll need to pick a few patches on top of mainline anyway. What do you think? Cheers, Paul
On Sat, 15 Jun 2019 12:13:15 +0200 Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote: > Hi, Hi, > Other than that, we can probably manage keeping a tree around (at the > Replicant project) with mainline and this patch (enabled through a > dedicated config option). As long as it's not horrible to rebase, it > can work well enough for us. I've managed to buy a new Galaxy SIII 4G (I9305) and I've tried u-boot on it, and it works flawlessly without any patches and it does also work with CONFIG_STACKPROTECTOR_PER_TASK=y. Merely rebasing that arm decompressor patch over time should not be an issue. However I really want to find a way to avoid having to look again and again over time for commits that incidentally broke booting, because, the bootloader doesn't do what it's supposed to do. > Maybe we could also consider having a shim that is executed before the > kernel in order to sanitize things and allow booting a mainline > kernel, which would be less invasive than a full U-Boot port. If I understand correctly, that isn't a solution either as it would also be affected by the issues mentioned by Russell King. More specifically I would need to do more research to find if the bootloader(s) shipped on such smartphones properly cleans and invalidates the caches before jumping to the first instruction. Doing that research probably requires decompiling the bootloader, which in turn would require me to get legal advise to understand if it's possible to do it, and if so how to do it while respecting the laws involved, and still being able to work on free and open source bootloaders without creating issues for the projects. Another alternative to that would be to make users use u-boot but this is not possible either because: - The bootloader is signed. So the bootrom checks the signature of the first bootloader (BL1), which in turn checks the second bootloader (S-Boot) in which the MMU setup probably happens. So I can't merely replace S-Boot like that. - Fortunately for that system on a chip, there is at least one BL1 that is signed but that doesn't check subsequent signatures[1]. The issue is that it's not redistributable[2]. If that BL1 had not been published I would always need to use additional patches to test the patch I send, which is very problematic in many ways: - The additional patches would need to be mentioned in most or all of the commits I send upstream. - If not, the maintainers and readers of the patch would be unaware that it would require another patch on top to work. So thanks to that, I'm at least able to test the patches I send in Linux without requiring additional patches on top, but I'm still not able to ship something usable to end users. This means that the work to complete the support for the affected devices will be way less useful, as there would be no guarantee of users still being able to use the device with newer Linux kernels. Are there other (Android) smartphones affected by similar bootloader issues? If so is it even possible to replace part of the bootloader? Did some people found a way to deal with that kind of bootloader issue? References: ----------- [1]https://wiki.odroid.com/_media/en/boot.tar.gz [2]https://github.com/hardkernel/u-boot_firmware/issues/1 Denis.
Hi, On Sat, 2019-06-22 at 01:42 +0200, Denis 'GNUtoo' Carikli wrote: > On Sat, 15 Jun 2019 12:13:15 +0200 > Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote: > > Other than that, we can probably manage keeping a tree around (at the > > Replicant project) with mainline and this patch (enabled through a > > dedicated config option). As long as it's not horrible to rebase, it > > can work well enough for us. > I've managed to buy a new Galaxy SIII 4G (I9305) and I've tried u-boot > on it, and it works flawlessly without any patches and it does also > work with CONFIG_STACKPROTECTOR_PER_TASK=y. That's good to know, maybe they realized that they got it wrong later on. Hopefully this can indicate that future models are not affected. > Merely rebasing that arm decompressor patch over time should not be an > issue. However I really want to find a way to avoid having to look > again and again over time for commits that incidentally broke booting, > because, the bootloader doesn't do what it's supposed to do. I don't think there are many more areas where the bootloader can misbehave to a point where it will influence Linux (of course, that's without mention of software running in the "secure" world, which can be totally out of control as you know). > > Maybe we could also consider having a shim that is executed before the > > kernel in order to sanitize things and allow booting a mainline > > kernel, which would be less invasive than a full U-Boot port. > If I understand correctly, that isn't a solution either as it > would also be affected by the issues mentioned by Russell King. It is definitely a solution, but it comes with the constraint that it must be able to run and act as a trampoline between the bootloader and Linux. This means that the code must be able to deal with MMU and cache enabled. > More specifically I would need to do more research to find if the > bootloader(s) shipped on such smartphones properly cleans and > invalidates the caches before jumping to the first instruction. > > Doing that research probably requires decompiling the bootloader, > which in turn would require me to get legal advise to understand if it's > possible to do it, and if so how to do it while respecting the laws > involved, and still being able to work on free and open > source bootloaders without creating issues for the projects. I would rather try to just write minimal code and make sure it generally works. We can't really have any hard guarantee, but a program that was shown to run over and over again without faulting is probably good enough, since it would be very small. > Another alternative to that would be to make users use u-boot but > this is not possible either because: > - The bootloader is signed. So the bootrom checks the signature of the > first bootloader (BL1), which in turn checks the second bootloader > (S-Boot) in which the MMU setup probably happens. So I can't merely > replace S-Boot like that. > - Fortunately for that system on a chip, there is at least one BL1 that > is signed but that doesn't check subsequent signatures[1]. The issue > is that it's not redistributable[2]. > > If that BL1 had not been published I would always need to use additional > patches to test the patch I send, which is very problematic in many > ways: > - The additional patches would need to be mentioned in most or all of > the commits I send upstream. > - If not, the maintainers and readers of the patch would be unaware > that it would require another patch on top to work. > > So thanks to that, I'm at least able to test the patches I send in > Linux without requiring additional patches on top, but I'm still not > able to ship something usable to end users. > > This means that the work to complete the support for the affected devices will > be way less useful, as there would be no guarantee of users still being > able to use the device with newer Linux kernels. I agree and while a U-Boot port is desirable, it's not the easiest solution users to bootstrap mainline Linux on the device. > Are there other (Android) smartphones affected by similar bootloader > issues? If so is it even possible to replace part of the bootloader? > Did some people found a way to deal with that kind of bootloader issue? As far as I know, the few Android devices supported by mainline Linux don't have similar issues (e.g. OMAP phones/tablets) so the situation probably hasn't occured much. Cheers, Paul > References: > ----------- > [1]https://wiki.odroid.com/_media/en/boot.tar.gz > [2]https://github.com/hardkernel/u-boot_firmware/issues/1 > > Denis.
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig index e9c677a53c74..afa1db3d3471 100644 --- a/scripts/gcc-plugins/Kconfig +++ b/scripts/gcc-plugins/Kconfig @@ -18,7 +18,6 @@ config GCC_PLUGINS bool depends on HAVE_GCC_PLUGINS depends on PLUGIN_HOSTCC != "" - default y help GCC plugins are loadable modules that provide extra features to the compiler. They are useful for runtime instrumentation and static analysis.
On a Galaxy SIII (I9300), the patch mentioned below broke boot: - The display still had the bootloader logo, while with this patch, the 4 Tux logo appears. - No print appeared on the serial port anymore after the kernel was loaded, whereas with this patch, we have the serial console working, and the device booting. Booting was broken by the following commit: 9f671e58159a ("security: Create "kernel hardening" config area") As the bootloader of this device enables the MMU, I had the following patch applied during the tests: Author: Arve Hjønnevåg <arve@android.com> Date: Fri Nov 30 17:05:40 2012 -0800 ANDROID: arm: decompressor: Flush tlb before swiching domain 0 to client mode If the bootloader used a page table that is incompatible with domain 0 in client mode, and boots with the mmu on, then swithing domain 0 to client mode causes a fault if we don't flush the tlb after updating the page table pointer. v2: Add ISB before loading dacr. diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S index 7135820f76d4..6e87ceda3b29 100644 --- a/arch/arm/boot/compressed/head.S +++ b/arch/arm/boot/compressed/head.S @@ -837,6 +837,8 @@ __armv7_mmu_cache_on: bic r6, r6, #1 << 31 @ 32-bit translation system bic r6, r6, #(7 << 0) | (1 << 4) @ use only ttbr0 mcrne p15, 0, r3, c2, c0, 0 @ load page table pointer + mcrne p15, 0, r0, c8, c7, 0 @ flush I,D TLBs + mcr p15, 0, r0, c7, c5, 4 @ ISB mcrne p15, 0, r1, c3, c0, 0 @ load domain access control mcrne p15, 0, r6, c2, c0, 2 @ load ttb control #endif Signed-off-by: Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org> --- scripts/gcc-plugins/Kconfig | 1 - 1 file changed, 1 deletion(-)