[v1,0/6] Allow kexec reboot for GICv3 and device tree

Message ID 20190826190056.27854-1-pasha.tatashin@soleen.com (mailing list archive)

Message

Pasha Tatashin Aug. 26, 2019, 7 p.m. UTC
Marc Zyngier added support for kexec with GICv3 on EFI-based systems.
However, it is still not possible to do so on systems with device trees.

Here are the EFI fixes from Marc:
https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com

For the Device Tree variant: let's allow reserving a memory region in the
interrupt controller node, and use this property to allocate interrupt
tables.

This way we are safe during kexec, as these interrupt tables are going to
stay the same after kexec.
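
To make the proposal concrete, here is a minimal userspace sketch of the
idea, not the code from this series. It assumes a hypothetical
`struct reserved_region` standing in for the memory described by the
proposed memory-region property, and a hypothetical `region_alloc()` helper
that carves tables out of it. Because the region has a fixed base and the
allocation order is deterministic, a later kernel that repeats the same
allocations gets the same addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a region named by a "memory-region" property. */
struct reserved_region {
	uint8_t *base;	/* start of the reserved memory, NULL if absent */
	size_t size;	/* total bytes reserved */
	size_t used;	/* bytes already handed out */
};

/*
 * Carve a zeroed table of 'len' bytes out of the region. 'align' must be
 * a power of two and is applied to the offset, so the base itself is
 * assumed to be suitably aligned. Returns NULL if the region is absent
 * or too small; a caller would then fall back to a normal allocation.
 */
static void *region_alloc(struct reserved_region *r, size_t len, size_t align)
{
	size_t start;

	assert(align != 0 && (align & (align - 1)) == 0);

	if (r == NULL || r->base == NULL)
		return NULL;

	/* Round the next free offset up to the requested alignment. */
	start = (r->used + align - 1) & ~(align - 1);
	if (start > r->size || len > r->size - start)
		return NULL;

	r->used = start + len;
	memset(r->base + start, 0, len);
	return r->base + start;
}
```

Resetting `used` to 0 and repeating the same calls models a second kernel
re-deriving the same table addresses after kexec.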

Pavel Tatashin (6):
  irqchip/gic-v3-its: reset prop table outside of allocation
  irqchip/gic-v3-its: use temporary va / pa variables
  irqchip/gic-v3-its: add reset pending table function
  irqchip/gic-v3-its: move reset pending table outside of allocator
  irqchip/gic-v3-its: move reset pending table outside of allocator
  dt-bindings: interrupt-controller: add optional memory-region

 .../interrupt-controller/arm,gic-v3.yaml      |   7 +
 drivers/irqchip/irq-gic-v3-its.c              | 121 +++++++++++++-----
 2 files changed, 96 insertions(+), 32 deletions(-)

Comments

Marc Zyngier Aug. 26, 2019, 7:13 p.m. UTC | #1
On Mon, 26 Aug 2019 15:00:50 -0400
Pavel Tatashin <pasha.tatashin@soleen.com> wrote:

> Marc Zyngier added support for kexec with GICv3 on EFI-based systems.
> However, it is still not possible to do so on systems with device trees.
> 
> Here are the EFI fixes from Marc:
> https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com
> 
> For the Device Tree variant: let's allow reserving a memory region in the
> interrupt controller node, and use this property to allocate interrupt
> tables.

There is no such thing as a "device tree variant". As long as your
bootloader implements EFI, everything will work correctly, whether
you're using DT, ACPI, or anything else.

This already works today, without any need to add anything to the
kernel (I have systems using EDK II and u-boot, both implementing EFI,
and I'm able to kexec without any issue). If your bootloader doesn't
support EFI, here's a good opportunity to implement it!

Thanks,

	M.
Pasha Tatashin Aug. 26, 2019, 9:25 p.m. UTC | #2
On Mon, Aug 26, 2019 at 3:13 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 26 Aug 2019 15:00:50 -0400
> Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
>
> > Marc Zyngier added support for kexec with GICv3 on EFI-based systems.
> > However, it is still not possible to do so on systems with device trees.
> >
> > Here are the EFI fixes from Marc:
> > https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com
> >
> > For the Device Tree variant: let's allow reserving a memory region in the
> > interrupt controller node, and use this property to allocate interrupt
> > tables.
>
> There is no such thing as a "device tree variant". As long as your
> bootloader implements EFI, everything will work correctly, whether
> you're using DT, ACPI, or anything else.
>
> This already works today, without any need to add anything to the
> kernel (I have systems using EDK II and u-boot, both implementing EFI,
> and I'm able to kexec without any issue). If your bootloader doesn't
> support EFI, here's a good opportunity to implement it!

Hi Marc,

Thank you very much for looking at this work.

Running Linux without EFI is common, and there are scenarios which
make it appropriate. As I understand it, most embedded Linux systems do
not have EFI enabled, and thus I do not see a reason why we would not
support a first class feature of Linux (kexec) on non-EFI bootloaders.

We (Microsoft) have a small highly secure device with a high uptime
requirement. The device also has PCIe and thus GICv3. The update for
this device relies on kexec. For a number of reasons, it was decided
to use U-Boot and Linux without EFI enabled. One of those reasons is
boot performance: enabling EFI in U-Boot alone adds half a second to
the boot time. Our total reboot budget is under a second, which makes
that half a second unacceptable. Also, adding EFI support to the kernel
increases its size, and there are security implications from enabling
more code in both U-Boot and Linux.

> --
> Without deviation from the norm, progress is not possible.

Totally agreed.

Thank you,
Pasha
Marc Zyngier Aug. 27, 2019, 8:15 a.m. UTC | #3
On 26/08/2019 22:25, Pavel Tatashin wrote:
> On Mon, Aug 26, 2019 at 3:13 PM Marc Zyngier <maz@kernel.org> wrote:
>>
>> On Mon, 26 Aug 2019 15:00:50 -0400
>> Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
>>
>>> Marc Zyngier added support for kexec with GICv3 on EFI-based systems.
>>> However, it is still not possible to do so on systems with device trees.
>>>
>>> Here are the EFI fixes from Marc:
>>> https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com
>>>
>>> For the Device Tree variant: let's allow reserving a memory region in the
>>> interrupt controller node, and use this property to allocate interrupt
>>> tables.
>>
>> There is no such thing as a "device tree variant". As long as your
>> bootloader implements EFI, everything will work correctly, whether
>> you're using DT, ACPI, or anything else.
>>
>> This already works today, without any need to add anything to the
>> kernel (I have systems using EDK II and u-boot, both implementing EFI,
>> and I'm able to kexec without any issue). If your bootloader doesn't
>> support EFI, here's a good opportunity to implement it!
> 
> Hi Marc,
> 
> Thank you very much for looking at this work.
> 
> Running Linux without EFI is common, and there are scenarios which
> make it appropriate. As I understand it, most embedded Linux systems do
> not have EFI enabled, and thus I do not see a reason why we would not
> support a first class feature of Linux (kexec) on non-EFI bootloaders.

Define "most". All the arm64 systems I have around (and trust me, that's
quite a number of them) can either use u-boot, which has more than
enough EFI support to use this functionality, or use EDK-II natively.

> We (Microsoft) have a small highly secure device with a high uptime
> requirement. The device also has PCIe and thus GICv3. The update for

PCIe doesn't imply GICv3 at all.

> this device relies on kexec. For a number of reasons, it was decided
> to use U-Boot and Linux without EFI enabled. One of those reasons is
> boot performance: enabling EFI in U-Boot alone adds half a second to
> the boot time. Our total reboot budget is under a second, which makes
> that half a second unacceptable. Also, adding EFI support to the kernel
> increases its size, and there are security implications from enabling
> more code in both U-Boot and Linux.

You are missing the point. kexec with EFI has 0 overhead (no
non-kernel EFI code gets executed), doesn't impact your time budget, and
only relies on a single in-memory table. This can be pretty trivially
provided by the dumbest EFI shim.
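
The "single in-memory table" can be pictured as a list of reserved physical
ranges that the next kernel consults before reusing memory. The following is
a simplified sketch under that assumption; `struct mem_range`,
`struct reserve_table` and `range_is_reserved()` are hypothetical names for
illustration and are not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one reserved physical range. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical table handed from one kernel to the next. */
struct reserve_table {
	size_t count;
	const struct mem_range *entry;
};

/*
 * Return true if [base, base + size) lies entirely inside one entry.
 * The comparisons are ordered so that no addition can overflow.
 */
static bool range_is_reserved(const struct reserve_table *t,
			      uint64_t base, uint64_t size)
{
	assert(t != NULL);

	for (size_t i = 0; i < t->count; i++) {
		const struct mem_range *e = &t->entry[i];

		if (base >= e->base && size <= e->size &&
		    base - e->base <= e->size - size)
			return true;
	}
	return false;
}
```

In this model, a kernel that finds its interrupt tables already covered by
an entry can keep using them in place instead of allocating new ones.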

All you are describing above is a set of self-imposed limitations in
your bootloader, which you are fully in control of. So instead of
reinventing a square wheel, I suggest you adopt the existing implementation.

Another reason not to do this is interoperability: I want to be able to
kexec whatever Linux kernel I want, without having to cope with all
flavours of the same functionality. Effectively, the EFI table is a
private ABI between two Linux kernels. We're not changing it.

	M.
Pasha Tatashin Aug. 27, 2019, 8:53 a.m. UTC | #4
> > Running Linux without EFI is common, and there are scenarios which
> > make it appropriate. As I understand it, most embedded Linux systems do
> > not have EFI enabled, and thus I do not see a reason why we would not
> > support a first class feature of Linux (kexec) on non-EFI bootloaders.
>
> Define "most". All the arm64 systems I have around (and trust me, that's
> quite a number of them) can either use u-boot, which has more than
> enough EFI support to use this functionality, or use EDK-II natively.

OK. Is this the most common configuration in the embedded ARM64
devices currently deployed: phones, cameras, consoles, players, etc?

> > We (Microsoft) have a small highly secure device with a high uptime
> > requirement. The device also has PCIe and thus GICv3. The update for
>
> PCIe doesn't imply GICv3 at all.

My impression was that without PCIe, GICv3 is rarely used, and this
could be the reason why this problem is not seen outside of larger
machines which normally have EFI enabled.

>
> > this device relies on kexec. For a number of reasons, it was decided
> > to use U-Boot and Linux without EFI enabled. One of those reasons is
> > boot performance: enabling EFI in U-Boot alone adds half a second to
> > the boot time. Our total reboot budget is under a second, which makes
> > that half a second unacceptable. Also, adding EFI support to the kernel
> > increases its size, and there are security implications from enabling
> > more code in both U-Boot and Linux.
>
> You are missing the point. kexec with EFI has 0 overhead (no
> non-kernel EFI code gets executed), doesn't impact your time budget, and
> only relies on a single in-memory table. This can be pretty trivially
> provided by the dumbest EFI shim.

Thanks, it makes sense that the Linux boot time won't be affected. I
have not tested how U-Boot is affected, but I was told it takes 0.5
seconds longer to start.

> All you are describing above is a set of self-imposed limitations in
> your bootloader, which you are fully in control of. So instead of
> reinventing a square wheel, I suggest you adopt the existing implementation.

I am not sure this analogy is correct; I do not think that non-EFI
kernels have become legacy.

> Another reason not to do this is interoperability: I want to be able to
> kexec whatever Linux kernel I want, without having to cope with all
> flavours of the same functionality. Effectively, the EFI table is a
> private ABI between two Linux kernels. We're not changing it.

This is exactly the problem: by having this region defined in a signed
DTB file, we reduce the amount of communication between the kernels.
Passing a modified EFI table causes us to pass more information from the
first kernel indefinitely through updates, which increases the chance
of a security compromise. We are not changing the EFI ABI between
kernels; it will stay as is. All this code does is give kernels
that do not have EFI table communication between them a way to do
kexec updates with a reduced amount of data exchange.

Thank you,
Pasha
Marc Zyngier Aug. 27, 2019, 9:24 a.m. UTC | #5
On 27/08/2019 09:53, Pavel Tatashin wrote:
>>> Running Linux without EFI is common, and there are scenarios which
>>> make it appropriate. As I understand it, most embedded Linux systems do
>>> not have EFI enabled, and thus I do not see a reason why we would not
>>> support a first class feature of Linux (kexec) on non-EFI bootloaders.
>>
>> Define "most". All the arm64 systems I have around (and trust me, that's
>> quite a number of them) can either use u-boot, which has more than
>> enough EFI support to use this functionality, or use EDK-II natively.
> 
> OK. Is this the most common configuration in the embedded ARM64
> devices currently deployed: phones, cameras, consoles, players, etc?

Which one of these has kexec as a requirement?

>>> We (Microsoft) have a small highly secure device with a high uptime
>>> requirement. The device also has PCIe and thus GICv3. The update for
>>
>> PCIe doesn't imply GICv3 at all.
> 
> My impression was that without PCIe, GICv3 is rarely used, and this
> could be the reason why this problem is not seen outside of larger
> machines which normally have EFI enabled.

Wrong impression. All the combinations exist and are widely deployed.

>>> this device relies on kexec. For a number of reasons, it was decided
>>> to use U-Boot and Linux without EFI enabled. One of those reasons is
>>> boot performance: enabling EFI in U-Boot alone adds half a second to
>>> the boot time. Our total reboot budget is under a second, which makes
>>> that half a second unacceptable. Also, adding EFI support to the kernel
>>> increases its size, and there are security implications from enabling
>>> more code in both U-Boot and Linux.
>>
>> You are missing the point. kexec with EFI has 0 overhead (no
>> non-kernel EFI code gets executed), doesn't impact your time budget, and
>> only relies on a single in-memory table. This can be pretty trivially
>> provided by the dumbest EFI shim.
> 
> Thanks, it makes sense that the Linux boot time won't be affected. I
> have not tested how U-Boot is affected, but I was told it takes 0.5
> seconds longer to start.

So you haven't even tried? :-(

> 
>> All you are describing above is a set of self-imposed limitations in
>> your bootloader, which you are fully in control of. So instead of
>> reinventing a square wheel, I suggest you adopt the existing implementation.
> 
> I am not sure this analogy is correct; I do not think that non-EFI
> kernels have become legacy.

Non-EFI systems have always had reduced functionality, such as not being
able to use runtime services.

> 
>> Another reason not to do this is interoperability: I want to be able to
>> kexec whatever Linux kernel I want, without having to cope with all
>> flavours of the same functionality. Effectively, the EFI table is a
>> private ABI between two Linux kernels. We're not changing it.
> 
> This is exactly the problem: by having this region defined in a signed
> DTB file, we reduce the amount of communication between the kernels.
> Passing a modified EFI table causes us to pass more information from the
> first kernel indefinitely through updates, which increases the chance
> of a security compromise.

Nothing says that it has to be modified. For what it's worth, you could
perform the allocation in your bootloader once and for all, configure
the GIC redistributors and enable LPIs there, and pass the EFI
reservation to the first kernel. The security argument is a fallacy.

> We are not changing the EFI ABI between
> kernels; it will stay as is. All this code does is give kernels
> that do not have EFI table communication between them a way to do
> kexec updates with a reduced amount of data exchange.

And to do that, you're adding yet another ABI we have to support, and
creating havoc in the kexec chain (kernel 1 knows about the DT hack,
kernel 2 doesn't, panic follows). My answer to this is *no*. We already
have a flexible interface that allows you to do what you want, and I'm
not adding another one.

Thanks,

	M.