Message ID | 1490196629-28088-1-git-send-email-ard.biesheuvel@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Wed, Mar 22, 2017 at 03:30:29PM +0000, Ard Biesheuvel wrote: > On UEFI systems, the PCI subsystem is enumerated by the firmware, > and if a graphical framebuffer is exposed by a PCI device, its base > address and size are exposed to the OS via the Graphics Output > Protocol (GOP). > > On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > scratch at boot. This may result in the GOP framebuffer address to > become stale, if the BAR covering the framebuffer is modified. This > will cause the framebuffer to become unresponsive, and may in some > cases result in unpredictable behavior if the range is reassigned to > another device. Hm, commit message seems to indicate the issue is restricted to arm64, yet there's no IS_ENABLED(ARM64) to constrain the added code to that arch? > +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources); Maybe this can be constrained to PCI_BASE_CLASS_DISPLAY? Thanks, Lukas
On 22 March 2017 at 19:31, Lukas Wunner <lukas@wunner.de> wrote: > On Wed, Mar 22, 2017 at 03:30:29PM +0000, Ard Biesheuvel wrote: >> On UEFI systems, the PCI subsystem is enumerated by the firmware, >> and if a graphical framebuffer is exposed by a PCI device, its base >> address and size are exposed to the OS via the Graphics Output >> Protocol (GOP). >> >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >> scratch at boot. This may result in the GOP framebuffer address to >> become stale, if the BAR covering the framebuffer is modified. This >> will cause the framebuffer to become unresponsive, and may in some >> cases result in unpredictable behavior if the range is reassigned to >> another device. > > Hm, commit message seems to indicate the issue is restricted to arm64, > yet there's no IS_ENABLED(ARM64) to constrain the added code to that arch? > True. I am eager to get some x86 coverage for this, since I would expect this not to do any harm. But I'm fine with making it ARM/arm64 specific in the final version. > >> +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources); > > Maybe this can be constrained to PCI_BASE_CLASS_DISPLAY? > How does one do that with DECLARE_PCI_FIXUP_HEADER?
On 3/22/2017 3:31 PM, Lukas Wunner wrote: > On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > scratch at boot. This may result in the GOP framebuffer address to > become stale, if the BAR covering the framebuffer is modified. Does this code still work with the case where resources are not reassigned on ARM64? As far as I know, kernel honors resource assignments done by the UEFI BIOS if they are correct. Kernel will reassign the resources only if something is wrong. Will this code break other platforms/architectures?
On 22 March 2017 at 19:36, Sinan Kaya <okaya@codeaurora.org> wrote: > On 3/22/2017 3:31 PM, Lukas Wunner wrote: >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >> scratch at boot. This may result in the GOP framebuffer address to >> become stale, if the BAR covering the framebuffer is modified. > > Does this code still work with the case where resources are not reassigned on ARM64? > > As far as I know, kernel honors resource assignments done by the UEFI BIOS if > they are correct. Kernel will reassign the resources only if something is wrong. > No, the kernel always reassigns all BARs on arm64. > Will this code break other platforms/architectures? > Which platforms/architectures are you referring to? EFIFB on a PCI device is currently broken on arm64. On x86, it works, given that BARs are usually not reassigned, and so the patch should be a no-op in that case (although I'd argue it is still an improvement to check whether the device that owns the BAR actually has memory decoding enabled before we attach the framebuffer driver to it)
On 3/22/2017 3:41 PM, Ard Biesheuvel wrote: >> As far as I know, kernel honors resource assignments done by the UEFI BIOS if >> they are correct. Kernel will reassign the resources only if something is wrong. >> > No, the kernel always reassigns all BARs on arm64. I think this is where the problem is. I'm not seeing this happen on QDF2400 which supports ACPI + UEFI BIOS combination only. I see that kernel honored the resources assigned by UEFI BIOS if I compare the BAR addresses. I see reassignment only when something is horribly broken. Then, there would be a bridge configuration invalid message in the boot log to confirm this. > >> Will this code break other platforms/architectures? >> > Which platforms/architectures are you referring to? EFIFB on a PCI > device is currently broken on arm64. In general or on your particular platform? > On x86, it works, given that BARs > are usually not reassigned, and so the patch should be a no-op in that > case (although I'd argue it is still an improvement to check whether > the device that owns the BAR actually has memory decoding enabled > before we attach the framebuffer driver to it) > I'm fine as long as it doesn't break anything. That's why, I'm asking.
On 22 March 2017 at 19:49, Sinan Kaya <okaya@codeaurora.org> wrote: > On 3/22/2017 3:41 PM, Ard Biesheuvel wrote: >>> As far as I know, kernel honors resource assignments done by the UEFI BIOS if >>> they are correct. Kernel will reassign the resources only if something is wrong. >>> >> No, the kernel always reassigns all BARs on arm64. > > I think this is where the problem is. > > I'm not seeing this happen on QDF2400 which supports ACPI + UEFI BIOS > combination only. > > I see that kernel honored the resources assigned by UEFI BIOS if I compare > the BAR addresses. > Well, if you are using the standard MdeModulePkg PciHostBridgeDxe driver in UEFI, the logic is quite similar, and so you usually end up with the same resource assignments. But the kernel does recompute them from scratch, and ignores the existing configuration that is present in the BARs > I see reassignment only when something is horribly broken. Then, there would > be a bridge configuration invalid message in the boot log to confirm this. > >> >>> Will this code break other platforms/architectures? >>> >> Which platforms/architectures are you referring to? EFIFB on a PCI >> device is currently broken on arm64. > > In general or on your particular platform? > >> On x86, it works, given that BARs >> are usually not reassigned, and so the patch should be a no-op in that >> case (although I'd argue it is still an improvement to check whether >> the device that owns the BAR actually has memory decoding enabled >> before we attach the framebuffer driver to it) >> > > I'm fine as long as it doesn't break anything. That's why, I'm asking. > If you have working EFIFB on your platform, it works by accident. This patch is needed to ensure that the BAR associated with the EFI framebuffer is left untouched.
On 3/22/2017 3:52 PM, Ard Biesheuvel wrote: > On 22 March 2017 at 19:49, Sinan Kaya <okaya@codeaurora.org> wrote: >> On 3/22/2017 3:41 PM, Ard Biesheuvel wrote: >>>> As far as I know, kernel honors resource assignments done by the UEFI BIOS if >>>> they are correct. Kernel will reassign the resources only if something is wrong. >>>> >>> No, the kernel always reassigns all BARs on arm64. >> >> I think this is where the problem is. >> >> I'm not seeing this happen on QDF2400 which supports ACPI + UEFI BIOS >> combination only. >> >> I see that kernel honored the resources assigned by UEFI BIOS if I compare >> the BAR addresses. >> > > Well, if you are using the standard MdeModulePkg PciHostBridgeDxe > driver in UEFI, the logic is quite similar, and so you usually end up > with the same resource assignments. But the kernel does recompute them > from scratch, and ignores the existing configuration that is present > in the BARs Yes, standard stuff in UEFI. OK. That may explain what I'm seeing. > >> I see reassignment only when something is horribly broken. Then, there would >> be a bridge configuration invalid message in the boot log to confirm this. >> >>> >>>> Will this code break other platforms/architectures? >>>> >>> Which platforms/architectures are you referring to? EFIFB on a PCI >>> device is currently broken on arm64. >> >> In general or on your particular platform? >> >>> On x86, it works, given that BARs >>> are usually not reassigned, and so the patch should be a no-op in that >>> case (although I'd argue it is still an improvement to check whether >>> the device that owns the BAR actually has memory decoding enabled >>> before we attach the framebuffer driver to it) >>> >> >> I'm fine as long as it doesn't break anything. That's why, I'm asking. >> > > If you have working EFIFB on your platform, it works by accident. This > patch is needed to ensure that the BAR associated with the EFI > framebuffer is left untouched. > I never claimed that. I was just double checking your assumptions. I really don't like ARM64 specific PCI code in general. PCI doesn't care about CPU type. I hope to see this patch become part of common code.
On 22 March 2017 at 19:57, Sinan Kaya <okaya@codeaurora.org> wrote: > On 3/22/2017 3:52 PM, Ard Biesheuvel wrote: [..] >> If you have working EFIFB on your platform, it works by accident. This >> patch is needed to ensure that the BAR associated with the EFI >> framebuffer is left untouched. >> > > I never claimed that. I was just double checking your assumptions. > I really don't like ARM64 specific PCI code in general. PCI doesn't care > about CPU type. I hope to see this patch become part of common code. > Good! Thanks
On Wed, Mar 22, 2017 at 07:32:43PM +0000, Ard Biesheuvel wrote: > On 22 March 2017 at 19:31, Lukas Wunner <lukas@wunner.de> wrote: > > On Wed, Mar 22, 2017 at 03:30:29PM +0000, Ard Biesheuvel wrote: > >> On UEFI systems, the PCI subsystem is enumerated by the firmware, > >> and if a graphical framebuffer is exposed by a PCI device, its base > >> address and size are exposed to the OS via the Graphics Output > >> Protocol (GOP). > >> > >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > >> scratch at boot. This may result in the GOP framebuffer address to > >> become stale, if the BAR covering the framebuffer is modified. This > >> will cause the framebuffer to become unresponsive, and may in some > >> cases result in unpredictable behavior if the range is reassigned to > >> another device. > > > > Hm, commit message seems to indicate the issue is restricted to arm64, > > yet there's no IS_ENABLED(ARM64) to constrain the added code to that arch? > > True. I am eager to get some x86 coverage for this, since I would > expect this not to do any harm. But I'm fine with making it ARM/arm64 > specific in the final version. I see. IIUC, this is only a problem because pci_bus_assign_resources() is called from arch/arm64/kernel/pci.c:pci_acpi_scan_root() (as well as the host drivers) and x86 isn't affected because it doesn't do that. I have no opinion on executing the quirk on x86 as well, I was just confused by the discrepancy between commit message and patch, but that can easily be remedied with a copy+paste of what you replied to Sinan: "On x86, it works, given that BARs are usually not reassigned, and so the patch should be a no-op in that case, although it should still be an improvement to check whether the device that owns the BAR actually has memory decoding enabled before we attach the framebuffer driver to it." > >> +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources); > > > > Maybe this can be constrained to PCI_BASE_CLASS_DISPLAY? > > How does one do that with DECLARE_PCI_FIXUP_HEADER? With DECLARE_PCI_FIXUP_CLASS_HEADER(). Thanks, Lukas
On 23 March 2017 at 08:48, Lukas Wunner <lukas@wunner.de> wrote: > On Wed, Mar 22, 2017 at 07:32:43PM +0000, Ard Biesheuvel wrote: >> On 22 March 2017 at 19:31, Lukas Wunner <lukas@wunner.de> wrote: >> > On Wed, Mar 22, 2017 at 03:30:29PM +0000, Ard Biesheuvel wrote: >> >> On UEFI systems, the PCI subsystem is enumerated by the firmware, >> >> and if a graphical framebuffer is exposed by a PCI device, its base >> >> address and size are exposed to the OS via the Graphics Output >> >> Protocol (GOP). >> >> >> >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >> >> scratch at boot. This may result in the GOP framebuffer address to >> >> become stale, if the BAR covering the framebuffer is modified. This >> >> will cause the framebuffer to become unresponsive, and may in some >> >> cases result in unpredictable behavior if the range is reassigned to >> >> another device. >> > >> > Hm, commit message seems to indicate the issue is restricted to arm64, >> > yet there's no IS_ENABLED(ARM64) to constrain the added code to that arch? >> >> True. I am eager to get some x86 coverage for this, since I would >> expect this not to do any harm. But I'm fine with making it ARM/arm64 >> specific in the final version. > > I see. IIUC, this is only a problem because pci_bus_assign_resources() > is called from arch/arm64/kernel/pci.c:pci_acpi_scan_root() (as well as > the host drivers) and x86 isn't affected because it doesn't do that. > Correct. But on x86 (or rather, on a PC), you can be sure that UEFI (or the legacy PCI bios) performed the resource assignment already. One could argue that this is equally the case when running arm64 in ACPI mode, but in general, you cannot assume the presence of firmware on ARM/arm64 that has already taken care of this, and so the state of the BARs has to be presumed invalid. > I have no opinion on executing the quirk on x86 as well, I was just > confused by the discrepancy between commit message and patch, but that > can easily be remedied with a copy+paste of what you replied to Sinan: > > "On x86, it works, given that BARs are usually not reassigned, > and so the patch should be a no-op in that case, although it > should still be an improvement to check whether the device that > owns the BAR actually has memory decoding enabled before we > attach the framebuffer driver to it." > > OK, I will include that. >> >> +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources); >> > >> > Maybe this can be constrained to PCI_BASE_CLASS_DISPLAY? >> >> How does one do that with DECLARE_PCI_FIXUP_HEADER? > > With DECLARE_PCI_FIXUP_CLASS_HEADER(). > Ah ok, thanks for pointing that out. I do wonder whether it makes sense to keep the code as is, so that we spot unexpected configurations, i.e., where the GOP points into a BAR that is unrelated to graphics. In this case, I think we should disable efifb rather than proceed without claiming any PCI resources (as we did without this patch)
On Thu, Mar 23, 2017 at 09:04:03AM +0000, Ard Biesheuvel wrote: > On 23 March 2017 at 08:48, Lukas Wunner <lukas@wunner.de> wrote: > > On Wed, Mar 22, 2017 at 07:32:43PM +0000, Ard Biesheuvel wrote: > >> On 22 March 2017 at 19:31, Lukas Wunner <lukas@wunner.de> wrote: > >> > On Wed, Mar 22, 2017 at 03:30:29PM +0000, Ard Biesheuvel wrote: > >> >> On UEFI systems, the PCI subsystem is enumerated by the firmware, > >> >> and if a graphical framebuffer is exposed by a PCI device, its base > >> >> address and size are exposed to the OS via the Graphics Output > >> >> Protocol (GOP). > >> >> > >> >> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > >> >> scratch at boot. This may result in the GOP framebuffer address to > >> >> become stale, if the BAR covering the framebuffer is modified. This > >> >> will cause the framebuffer to become unresponsive, and may in some > >> >> cases result in unpredictable behavior if the range is reassigned to > >> >> another device. > >> > > >> > Hm, commit message seems to indicate the issue is restricted to arm64, > >> > yet there's no IS_ENABLED(ARM64) to constrain the added code to that arch? > >> > >> True. I am eager to get some x86 coverage for this, since I would > >> expect this not to do any harm. But I'm fine with making it ARM/arm64 > >> specific in the final version. > > > > I see. IIUC, this is only a problem because pci_bus_assign_resources() > > is called from arch/arm64/kernel/pci.c:pci_acpi_scan_root() (as well as > > the host drivers) and x86 isn't affected because it doesn't do that. > > > > Correct. But on x86 (or rather, on a PC), you can be sure that UEFI > (or the legacy PCI bios) performed the resource assignment already. > One could argue that this is equally the case when running arm64 in > ACPI mode, but in general, you cannot assume the presence of firmware > on ARM/arm64 that has already taken care of this, and so the state of > the BARs has to be presumed invalid. The story is a bit more convoluted than that owing to x86 (and other arches) legacy. x86 tries to claim all PCI resources (in two passes - first enabled devices, second disabled devices) and that predates ACPI/UEFI. Mind, x86 reassign resources that can't be claimed too, the only difference with ARM64 is that, for the better or the worse, we have decided not to claim the FW PCI set-up on ARM64 even if it is sane, we do not even try, it was a deliberate choice. This patch should be harmless on x86 since if the FB PCI BAR is set up sanely, claiming it again should be a nop (to be checked). For all the talk about PCI being arch agnostic as I said manifold times before, that's just theory. In practice different arches treat PCI FW set-up differently, it would be ideal to make them uniform but legacy is huge and there is a massive risk of triggering regressions, it is no mean feat (if achievable at all). Lorenzo > > I have no opinion on executing the quirk on x86 as well, I was just > > confused by the discrepancy between commit message and patch, but that > > can easily be remedied with a copy+paste of what you replied to Sinan: > > > > "On x86, it works, given that BARs are usually not reassigned, > > and so the patch should be a no-op in that case, although it > > should still be an improvement to check whether the device that > > owns the BAR actually has memory decoding enabled before we > > attach the framebuffer driver to it." > > > > > > OK, I will include that. > > >> >> +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources); > >> > > >> > Maybe this can be constrained to PCI_BASE_CLASS_DISPLAY? > >> > >> How does one do that with DECLARE_PCI_FIXUP_HEADER? > > > > With DECLARE_PCI_FIXUP_CLASS_HEADER(). > > > > Ah ok, thanks for pointing that out. > > I do wonder whether it makes sense to keep the code as is, so that we > spot unexpected configurations, i.e., where the GOP points into a BAR > that is unrelated to graphics. In this case, I think we should disable > efifb rather than proceed without claiming any PCI resources (as we > did without this patch)
Hi Lorenzo, On 3/23/2017 6:57 AM, Lorenzo Pieralisi wrote: >> Correct. But on x86 (or rather, on a PC), you can be sure that UEFI >> (or the legacy PCI bios) performed the resource assignment already. >> One could argue that this is equally the case when running arm64 in >> ACPI mode, but in general, you cannot assume the presence of firmware >> on ARM/arm64 that has already taken care of this, and so the state of >> the BARs has to be presumed invalid. > The story is a bit more convoluted than that owing to x86 (and other > arches) legacy. > > x86 tries to claim all PCI resources (in two passes - first enabled > devices, second disabled devices) and that predates ACPI/UEFI. > > Mind, x86 reassign resources that can't be claimed too, the only > difference with ARM64 is that, for the better or the worse, we > have decided not to claim the FW PCI set-up on ARM64 even if it > is sane, we do not even try, it was a deliberate choice. > > This patch should be harmless on x86 since if the FB PCI BAR is set > up sanely, claiming it again should be a nop (to be checked). > > For all the talk about PCI being arch agnostic as I said manifold > times before, that's just theory. In practice different arches > treat PCI FW set-up differently, it would be ideal to make them > uniform but legacy is huge and there is a massive risk of triggering > regressions, it is no mean feat (if achievable at all). Can we explore having a uniform behavior across ALL ACPI bases systems by trying to reuse the resources assigned by firmware first and reassign them only if something is wrong? There are protocols like hotplug reservation in UEFI. It looks like Linux is not honoring any of these protocols by being too smart. Sinan
On 28 March 2017 at 22:27, Sinan Kaya <okaya@codeaurora.org> wrote: > Hi Lorenzo, > > On 3/23/2017 6:57 AM, Lorenzo Pieralisi wrote: >>> Correct. But on x86 (or rather, on a PC), you can be sure that UEFI >>> (or the legacy PCI bios) performed the resource assignment already. >>> One could argue that this is equally the case when running arm64 in >>> ACPI mode, but in general, you cannot assume the presence of firmware >>> on ARM/arm64 that has already taken care of this, and so the state of >>> the BARs has to be presumed invalid. >> The story is a bit more convoluted than that owing to x86 (and other >> arches) legacy. >> >> x86 tries to claim all PCI resources (in two passes - first enabled >> devices, second disabled devices) and that predates ACPI/UEFI. >> >> Mind, x86 reassign resources that can't be claimed too, the only >> difference with ARM64 is that, for the better or the worse, we >> have decided not to claim the FW PCI set-up on ARM64 even if it >> is sane, we do not even try, it was a deliberate choice. >> >> This patch should be harmless on x86 since if the FB PCI BAR is set >> up sanely, claiming it again should be a nop (to be checked). >> >> For all the talk about PCI being arch agnostic as I said manifold >> times before, that's just theory. In practice different arches >> treat PCI FW set-up differently, it would be ideal to make them >> uniform but legacy is huge and there is a massive risk of triggering >> regressions, it is no mean feat (if achievable at all). > > Can we explore having a uniform behavior across ALL ACPI bases systems > by trying to reuse the resources assigned by firmware first and reassign > them only if something is wrong? > > There are protocols like hotplug reservation in UEFI. It looks like Linux > is not honoring any of these protocols by being too smart. > Could you be more specific? Which protocol do you mean exactly, and how does not reusing resources interfere with it?
On 3/28/2017 5:39 PM, Ard Biesheuvel wrote: > On 28 March 2017 at 22:27, Sinan Kaya <okaya@codeaurora.org> wrote: >> Hi Lorenzo, >> >> On 3/23/2017 6:57 AM, Lorenzo Pieralisi wrote: >>>> Correct. But on x86 (or rather, on a PC), you can be sure that UEFI >>>> (or the legacy PCI bios) performed the resource assignment already. >>>> One could argue that this is equally the case when running arm64 in >>>> ACPI mode, but in general, you cannot assume the presence of firmware >>>> on ARM/arm64 that has already taken care of this, and so the state of >>>> the BARs has to be presumed invalid. >>> The story is a bit more convoluted than that owing to x86 (and other >>> arches) legacy. >>> >>> x86 tries to claim all PCI resources (in two passes - first enabled >>> devices, second disabled devices) and that predates ACPI/UEFI. >>> >>> Mind, x86 reassign resources that can't be claimed too, the only >>> difference with ARM64 is that, for the better or the worse, we >>> have decided not to claim the FW PCI set-up on ARM64 even if it >>> is sane, we do not even try, it was a deliberate choice. >>> >>> This patch should be harmless on x86 since if the FB PCI BAR is set >>> up sanely, claiming it again should be a nop (to be checked). >>> >>> For all the talk about PCI being arch agnostic as I said manifold >>> times before, that's just theory. In practice different arches >>> treat PCI FW set-up differently, it would be ideal to make them >>> uniform but legacy is huge and there is a massive risk of triggering >>> regressions, it is no mean feat (if achievable at all). >> >> Can we explore having a uniform behavior across ALL ACPI bases systems >> by trying to reuse the resources assigned by firmware first and reassign >> them only if something is wrong? >> >> There are protocols like hotplug reservation in UEFI. It looks like Linux >> is not honoring any of these protocols by being too smart. >> > > Could you be more specific? Which protocol do you mean exactly, and > how does not reusing resources interfere with it? > Let's assume for the moment that if ARM64 adaptation of PCIE reused the firmware assigned BAR addresses, would you still have this problem that you are trying to fix by a quirk? The protocol that I was looking at specifically is called hotplug reservation protocol. http://www.intel.com/content/www/us/en/architecture-and-technology/unified-extensible-firmware-interface/efi-hot-plug-pci-initialization-protocol-specification.html The idea is to pad the PCIe bridge window by some amount in UEFI. You boot the system without any cards present. when you do to hotplug, OS uses the range reserved by the BIOS for inserting the new card. what I see is that the bridge windows get overridden by the OS. I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead of working around it by quirks. I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. Legacy only applies to DT based systems. If there is still a legacy problem in ACPI ARM64, generic Linux code deals with this using some DMI/SMBIOS and setting a time bomb like products after this date shall follow the new behavior. We are not in a very hardened position when it comes to changes for ACPI based systems. Sinan
On 28 March 2017 at 22:49, Sinan Kaya <okaya@codeaurora.org> wrote: > On 3/28/2017 5:39 PM, Ard Biesheuvel wrote: >> On 28 March 2017 at 22:27, Sinan Kaya <okaya@codeaurora.org> wrote: >>> Hi Lorenzo, >>> >>> On 3/23/2017 6:57 AM, Lorenzo Pieralisi wrote: >>>>> Correct. But on x86 (or rather, on a PC), you can be sure that UEFI >>>>> (or the legacy PCI bios) performed the resource assignment already. >>>>> One could argue that this is equally the case when running arm64 in >>>>> ACPI mode, but in general, you cannot assume the presence of firmware >>>>> on ARM/arm64 that has already taken care of this, and so the state of >>>>> the BARs has to be presumed invalid. >>>> The story is a bit more convoluted than that owing to x86 (and other >>>> arches) legacy. >>>> >>>> x86 tries to claim all PCI resources (in two passes - first enabled >>>> devices, second disabled devices) and that predates ACPI/UEFI. >>>> >>>> Mind, x86 reassign resources that can't be claimed too, the only >>>> difference with ARM64 is that, for the better or the worse, we >>>> have decided not to claim the FW PCI set-up on ARM64 even if it >>>> is sane, we do not even try, it was a deliberate choice. >>>> >>>> This patch should be harmless on x86 since if the FB PCI BAR is set >>>> up sanely, claiming it again should be a nop (to be checked). >>>> >>>> For all the talk about PCI being arch agnostic as I said manifold >>>> times before, that's just theory. In practice different arches >>>> treat PCI FW set-up differently, it would be ideal to make them >>>> uniform but legacy is huge and there is a massive risk of triggering >>>> regressions, it is no mean feat (if achievable at all). >>> >>> Can we explore having a uniform behavior across ALL ACPI bases systems >>> by trying to reuse the resources assigned by firmware first and reassign >>> them only if something is wrong? >>> >>> There are protocols like hotplug reservation in UEFI. It looks like Linux >>> is not honoring any of these protocols by being too smart. >>> >> >> Could you be more specific? Which protocol do you mean exactly, and >> how does not reusing resources interfere with it? >> > > Let's assume for the moment that if ARM64 adaptation of PCIE reused the firmware > assigned BAR addresses, would you still have this problem that you are trying to > fix by a quirk? > Probably not. Although I should point out that the same issue exists on DT systems, and so we will need this patch regardless. > The protocol that I was looking at specifically is called hotplug reservation > protocol. > > http://www.intel.com/content/www/us/en/architecture-and-technology/unified-extensible-firmware-interface/efi-hot-plug-pci-initialization-protocol-specification.html > > The idea is to pad the PCIe bridge window by some amount in UEFI. You boot the > system without any cards present. when you do to hotplug, OS uses the range > reserved by the BIOS for inserting the new card. > > what I see is that the bridge windows get overridden by the OS. > > I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead > of working around it by quirks. > > I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. > > Legacy only applies to DT based systems. > I fully agree with this point: ACPI implies firmware, and so we should be able to rely on firmware to have initialized the PCIe subsystem by the time the kernel gets to access it. > If there is still a legacy problem in ACPI ARM64, generic Linux code deals with > this using some DMI/SMBIOS and setting a time bomb like products after this date > shall follow the new behavior. > > We are not in a very hardened position when it comes to changes for ACPI based > systems. > > Sinan > > -- > Sinan Kaya > Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. > Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: [...] > > I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead > > of working around it by quirks. > > > > I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. > > > > Legacy only applies to DT based systems. > > > > I fully agree with this point: ACPI implies firmware, and so we should > be able to rely on firmware to have initialized the PCIe subsystem by > the time the kernel gets to access it. https://lkml.org/lkml/2016/3/3/458 Lorenzo
On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: > On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: > > [...] > >> > I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >> > of working around it by quirks. >> > >> > I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >> > >> > Legacy only applies to DT based systems. >> > >> >> I fully agree with this point: ACPI implies firmware, and so we should >> be able to rely on firmware to have initialized the PCIe subsystem by >> the time the kernel gets to access it. > > https://lkml.org/lkml/2016/3/3/458 > I don't think the fact that at least one system existed over a year ago whose UEFI assigned resources incorrectly should prevent us from being normative in this case.
On 2017-03-30 06:09, Ard Biesheuvel wrote: > On 30 March 2017 at 11:05, Lorenzo Pieralisi > <lorenzo.pieralisi@arm.com> wrote: >> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >> >> [...] >> >>> > I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >>> > of working around it by quirks. >>> > >>> > I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >>> > >>> > Legacy only applies to DT based systems. >>> > >>> >>> I fully agree with this point: ACPI implies firmware, and so we >>> should >>> be able to rely on firmware to have initialized the PCIe subsystem by >>> the time the kernel gets to access it. >> >> https://lkml.org/lkml/2016/3/3/458 >> > > I don't think the fact that at least one system existed over a year > ago whose UEFI assigned resources incorrectly should prevent us from > being normative in this case. My suggestion is to try to reuse the resources if possible. Otherwise, reassign the resources. There were implementation problems by the time you proposed your patch and it was delaying progess. That was my objection. Now that base pci support is in and everybody is somewhat happy, let's figure out the bugs and get this done. I am ready to test/debug.
On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: >> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >> >> [...] >> >>> > I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >>> > of working around it by quirks. >>> > >>> > I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >>> > >>> > Legacy only applies to DT based systems. >>> > >>> >>> I fully agree with this point: ACPI implies firmware, and so we should >>> be able to rely on firmware to have initialized the PCIe subsystem by >>> the time the kernel gets to access it. >> >> https://lkml.org/lkml/2016/3/3/458 >> > > I don't think the fact that at least one system existed over a year > ago whose UEFI assigned resources incorrectly should prevent us from > being normative in this case. In any case, given that EFIFB is enabled by default on some distros, and the fact that DT boot is affected as well, I should get this patch in to prevent serious potential issues that could arise when someone with a graphical UEFI stack updates to such a new kernel. So I think we are in agreement that this is needed on both ARM and arm64, since their PCI configuration is usually not preserved. The open question is whether there is any harm in enabling it for x86 as well.
On 3/30/2017 9:38 AM, Ard Biesheuvel wrote: > On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: >>> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >>> >>> [...] >>> >>>>> I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >>>>> of working around it by quirks. >>>>> >>>>> I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >>>>> >>>>> Legacy only applies to DT based systems. >>>>> >>>> >>>> I fully agree with this point: ACPI implies firmware, and so we should >>>> be able to rely on firmware to have initialized the PCIe subsystem by >>>> the time the kernel gets to access it. >>> >>> https://lkml.org/lkml/2016/3/3/458 >>> >> >> I don't think the fact that at least one system existed over a year >> ago whose UEFI assigned resources incorrectly should prevent us from >> being normative in this case. > > In any case, given that EFIFB is enabled by default on some distros, > and the fact that DT boot is affected as well, I should get this patch > in to prevent serious potential issues that could arise when someone > with a graphical UEFI stack updates to such a new kernel. > > So I think we are in agreement that this is needed on both ARM and > arm64, since their PCI configuration is usually not preserved. The > open question is whether there is any harm in enabling it for x86 as > well. > Agreed, the other issue is about compatibility with UEFI and future proofing Linux for other potential issues like hotplug reservation.
On 30 March 2017 at 14:50, Sinan Kaya <okaya@codeaurora.org> wrote: > On 3/30/2017 9:38 AM, Ard Biesheuvel wrote: >> On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >>> On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: >>>> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >>>> >>>> [...] >>>> >>>>>> I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >>>>>> of working around it by quirks. >>>>>> >>>>>> I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >>>>>> >>>>>> Legacy only applies to DT based systems. >>>>>> >>>>> >>>>> I fully agree with this point: ACPI implies firmware, and so we should >>>>> be able to rely on firmware to have initialized the PCIe subsystem by >>>>> the time the kernel gets to access it. >>>> >>>> https://lkml.org/lkml/2016/3/3/458 >>>> >>> >>> I don't think the fact that at least one system existed over a year >>> ago whose UEFI assigned resources incorrectly should prevent us from >>> being normative in this case. >> >> In any case, given that EFIFB is enabled by default on some distros, >> and the fact that DT boot is affected as well, I should get this patch >> in to prevent serious potential issues that could arise when someone >> with a graphical UEFI stack updates to such a new kernel. >> >> So I think we are in agreement that this is needed on both ARM and >> arm64, since their PCI configuration is usually not preserved. The >> open question is whether there is any harm in enabling it for x86 as >> well. >> > > Agreed, the other issue is about compatibility with UEFI and future > proofing Linux for other potential issues like hotplug reservation. > OK, given the lack of feedback regarding the suitability of this patch for x86, I am going to rework it as a ARM/arm64 only patch, and queue it as a fix for v4.11. This way, we can backport it to stable (which is arguably appropriate, given that upgrading to a EFIFB enabled kernel may cause severe breakage for existing systems that implement the GOP protocol), and easily change the patch to apply to x86 going forwards, by removing the #ifdefs
On 2 April 2017 at 16:16, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 30 March 2017 at 14:50, Sinan Kaya <okaya@codeaurora.org> wrote: >> On 3/30/2017 9:38 AM, Ard Biesheuvel wrote: >>> On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >>>> On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: >>>>> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >>>>> >>>>> [...] >>>>> >>>>>>> I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >>>>>>> of working around it by quirks. >>>>>>> >>>>>>> I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >>>>>>> >>>>>>> Legacy only applies to DT based systems. >>>>>>> >>>>>> >>>>>> I fully agree with this point: ACPI implies firmware, and so we should >>>>>> be able to rely on firmware to have initialized the PCIe subsystem by >>>>>> the time the kernel gets to access it. >>>>> >>>>> https://lkml.org/lkml/2016/3/3/458 >>>>> >>>> >>>> I don't think the fact that at least one system existed over a year >>>> ago whose UEFI assigned resources incorrectly should prevent us from >>>> being normative in this case. >>> >>> In any case, given that EFIFB is enabled by default on some distros, >>> and the fact that DT boot is affected as well, I should get this patch >>> in to prevent serious potential issues that could arise when someone >>> with a graphical UEFI stack updates to such a new kernel. >>> >>> So I think we are in agreement that this is needed on both ARM and >>> arm64, since their PCI configuration is usually not preserved. The >>> open question is whether there is any harm in enabling it for x86 as >>> well. >>> >> >> Agreed, the other issue is about compatibility with UEFI and future >> proofing Linux for other potential issues like hotplug reservation. >> > > OK, given the lack of feedback regarding the suitability of this patch > for x86, I am going to rework it as a ARM/arm64 only patch, and queue > it as a fix for v4.11. This way, we can backport it to stable (which > is arguably appropriate, given that upgrading to a EFIFB enabled > kernel may cause severe breakage for existing systems that implement > the GOP protocol), and easily change the patch to apply to x86 going > forwards, by removing the #ifdefs > As it turns out, this patch does not solve the problem completely. For EFI framebuffers that are backed by a PCI bar of a device residing on bus 0, things work happily. However, claiming resources for devices behind bridges doesn't work. Given that we have not made the situation worse, fixing it is less urgent than it was before. I.e., there is no longer a risk of inadvertently poking the wrong BAR when writing to the framebuffer. There is a regression in functionality, though, since EFI fb devices that happened to work (because the firmware BAR == the kernel BAR) have stopped working if they are behind a bridge, which is of course always the case for PCIe. So before starting the next round of hacking to work around this, I would like rekindle the discussion regarding the way we blindly reassign all resources on ACPI/arm64 systems, and whether there is a way imaginable to avoid doing that. I suppose the state of the BARs as we inherit it from the firmware cannot be blindly trusted (and IIUC, this is Lorenzo's primary issue with it). So should there be some side channel (UEFI config table perhaps?) to describe this? How do other architectures deal with this? Is this what the PCI bios accesses are for? In any case, I have updated the UEFI firmware we have for ARM Juno to use EDK2's generic PCI host bridge driver instead of one that was specially written for ARM Juno, and may deviate in the way it allocates PCI resources.
On Mon, Apr 10, 2017 at 04:28:11PM +0100, Ard Biesheuvel wrote: > On 2 April 2017 at 16:16, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > > On 30 March 2017 at 14:50, Sinan Kaya <okaya@codeaurora.org> wrote: > >> On 3/30/2017 9:38 AM, Ard Biesheuvel wrote: > >>> On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > >>>> On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: > >>>>> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: > >>>>> > >>>>> [...] > >>>>> > >>>>>>> I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead > >>>>>>> of working around it by quirks. > >>>>>>> > >>>>>>> I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. > >>>>>>> > >>>>>>> Legacy only applies to DT based systems. > >>>>>>> > >>>>>> > >>>>>> I fully agree with this point: ACPI implies firmware, and so we should > >>>>>> be able to rely on firmware to have initialized the PCIe subsystem by > >>>>>> the time the kernel gets to access it. > >>>>> > >>>>> https://lkml.org/lkml/2016/3/3/458 > >>>>> > >>>> > >>>> I don't think the fact that at least one system existed over a year > >>>> ago whose UEFI assigned resources incorrectly should prevent us from > >>>> being normative in this case. > >>> > >>> In any case, given that EFIFB is enabled by default on some distros, > >>> and the fact that DT boot is affected as well, I should get this patch > >>> in to prevent serious potential issues that could arise when someone > >>> with a graphical UEFI stack updates to such a new kernel. > >>> > >>> So I think we are in agreement that this is needed on both ARM and > >>> arm64, since their PCI configuration is usually not preserved. The > >>> open question is whether there is any harm in enabling it for x86 as > >>> well. > >>> > >> > >> Agreed, the other issue is about compatibility with UEFI and future > >> proofing Linux for other potential issues like hotplug reservation. > >> > > > > OK, given the lack of feedback regarding the suitability of this patch > > for x86, I am going to rework it as a ARM/arm64 only patch, and queue > > it as a fix for v4.11. This way, we can backport it to stable (which > > is arguably appropriate, given that upgrading to a EFIFB enabled > > kernel may cause severe breakage for existing systems that implement > > the GOP protocol), and easily change the patch to apply to x86 going > > forwards, by removing the #ifdefs > > > > As it turns out, this patch does not solve the problem completely. > > For EFI framebuffers that are backed by a PCI bar of a device residing > on bus 0, things work happily. However, claiming resources for devices > behind bridges doesn't work. May I ask you to elaborate on this please ? It is because we do not claim the bridge windows upstream the device and they are reassigned ? > Given that we have not made the situation worse, fixing it is less > urgent than it was before. I.e., there is no longer a risk of > inadvertently poking the wrong BAR when writing to the framebuffer. > There is a regression in functionality, though, since EFI fb devices > that happened to work (because the firmware BAR == the kernel BAR) > have stopped working if they are behind a bridge, which is of course > always the case for PCIe. > > So before starting the next round of hacking to work around this, I > would like rekindle the discussion regarding the way we blindly > reassign all resources on ACPI/arm64 systems, and whether there is a > way imaginable to avoid doing that. There is a way if the whole ARM ecosystem work together to sort this out and we think that's the right way to do; I am personally not entirely convinced about that. > I suppose the state of the BARs as we inherit it from the firmware > cannot be blindly trusted (and IIUC, this is Lorenzo's primary issue > with it). So should there be some side channel (UEFI config table > perhaps?) to describe this? PCI firmware specifications rev 3.1, 4.6.5, "_DSM for Ignoring PCI Boot Configurations". Do we want to enforce it on ARM ? I do not know to be honest (and it still would not solve the DT firmware case). Whatever we do, it is not going to be clean cut IMO. I think that what x86 does is sensible (well, minus the link ordering dependency we discovered), I can do it for ARM64 but get ready for regressions and I still think we have no real FW binding support that would make this behaviour robust. I can provide you with examples where, by simply claiming resources on an ARM system you trigger resources allocation regressions by preventing the kernel from allocating bigger bridge windows than the ones set-up by FW. > How do other architectures deal with this? On an arch specific basis that make things work. > Is this what the PCI bios accesses are for? Which ones :) ? > In any case, I have updated the UEFI firmware we have for ARM Juno to > use EDK2's generic PCI host bridge driver instead of one that was > specially written for ARM Juno, and may deviate in the way it > allocates PCI resources. As long as the kernel is free to reallocate them and that FW quiesces devices at FW->OS handover I see no issues with that. If we want to try to claim the whole resource tree on boot (in ACPI) I can send a patch for that but there will be regressions. Lorenzo
On 4/10/2017 12:53 PM, Lorenzo Pieralisi wrote: > Do we want to enforce it on ARM ? I do not know to be honest (and it > still would not solve the DT firmware case). > Yes for ACPI on ARM. Probably no for DT. > Whatever we do, it is not going to be clean cut IMO. I think that > what x86 does is sensible (well, minus the link ordering dependency we > discovered), I can do it for ARM64 but get ready for regressions and > I still think we have no real FW binding support that would make this > behaviour robust. Is it possible to move the code from arch/x86 into drivers/acpi directory so that the code is shared across multiple ACPI based archs. We get more coverage this way. Can we fallback to the allocate everything behavior if we can't use the stuff from FW or would it be too late to recover? We can prevent regressions this way and issue a lot of warnings. We can sit down and fix those warnings over time whether the issue is in UEFI BIOS or the code itself.
On 10 April 2017 at 17:53, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: > On Mon, Apr 10, 2017 at 04:28:11PM +0100, Ard Biesheuvel wrote: >> On 2 April 2017 at 16:16, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> > On 30 March 2017 at 14:50, Sinan Kaya <okaya@codeaurora.org> wrote: >> >> On 3/30/2017 9:38 AM, Ard Biesheuvel wrote: >> >>> On 30 March 2017 at 11:09, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> >>>> On 30 March 2017 at 11:05, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: >> >>>>> On Thu, Mar 30, 2017 at 09:46:39AM +0100, Ard Biesheuvel wrote: >> >>>>> >> >>>>> [...] >> >>>>> >> >>>>>>> I'm asking why we don't fix the actual problem in PCIe ARM64 adaptation instead >> >>>>>>> of working around it by quirks. >> >>>>>>> >> >>>>>>> I don't see any reason why ACPI ARM64 should carry the burden of legacy systems. >> >>>>>>> >> >>>>>>> Legacy only applies to DT based systems. >> >>>>>>> >> >>>>>> >> >>>>>> I fully agree with this point: ACPI implies firmware, and so we should >> >>>>>> be able to rely on firmware to have initialized the PCIe subsystem by >> >>>>>> the time the kernel gets to access it. >> >>>>> >> >>>>> https://lkml.org/lkml/2016/3/3/458 >> >>>>> >> >>>> >> >>>> I don't think the fact that at least one system existed over a year >> >>>> ago whose UEFI assigned resources incorrectly should prevent us from >> >>>> being normative in this case. >> >>> >> >>> In any case, given that EFIFB is enabled by default on some distros, >> >>> and the fact that DT boot is affected as well, I should get this patch >> >>> in to prevent serious potential issues that could arise when someone >> >>> with a graphical UEFI stack updates to such a new kernel. >> >>> >> >>> So I think we are in agreement that this is needed on both ARM and >> >>> arm64, since their PCI configuration is usually not preserved. The >> >>> open question is whether there is any harm in enabling it for x86 as >> >>> well. >> >>> >> >> >> >> Agreed, the other issue is about compatibility with UEFI and future >> >> proofing Linux for other potential issues like hotplug reservation. >> >> >> > >> > OK, given the lack of feedback regarding the suitability of this patch >> > for x86, I am going to rework it as a ARM/arm64 only patch, and queue >> > it as a fix for v4.11. This way, we can backport it to stable (which >> > is arguably appropriate, given that upgrading to a EFIFB enabled >> > kernel may cause severe breakage for existing systems that implement >> > the GOP protocol), and easily change the patch to apply to x86 going >> > forwards, by removing the #ifdefs >> > >> >> As it turns out, this patch does not solve the problem completely. >> >> For EFI framebuffers that are backed by a PCI bar of a device residing >> on bus 0, things work happily. However, claiming resources for devices >> behind bridges doesn't work. > > May I ask you to elaborate on this please ? It is because we do not > claim the bridge windows upstream the device and they are reassigned ? > The pci_claim_resource() call fails like this pci 0000:01:01.0: can't claim BAR 0 [mem 0x10000000-0x10ffffff pref]: no compatible bridge window pci 0000:01:01.0: BAR 0: failed to claim resource for efifb! which is caused by the fact that the parent resources are all zeroes. It appears the BAR configuration is never read from the bridges, so the only way we will ever have meaningful values in there is if we allocate them ourselves. >> Given that we have not made the situation worse, fixing it is less >> urgent than it was before. I.e., there is no longer a risk of >> inadvertently poking the wrong BAR when writing to the framebuffer. >> There is a regression in functionality, though, since EFI fb devices >> that happened to work (because the firmware BAR == the kernel BAR) >> have stopped working if they are behind a bridge, which is of course >> always the case for PCIe. >> >> So before starting the next round of hacking to work around this, I >> would like rekindle the discussion regarding the way we blindly >> reassign all resources on ACPI/arm64 systems, and whether there is a >> way imaginable to avoid doing that. > > There is a way if the whole ARM ecosystem work together to sort this > out and we think that's the right way to do; I am personally not > entirely convinced about that. > So what are the pros and cons here? EFI fb is not a hugely important use case, but it is one that relies on BARs staying where they are. Are there others like that? >> I suppose the state of the BARs as we inherit it from the firmware >> cannot be blindly trusted (and IIUC, this is Lorenzo's primary issue >> with it). So should there be some side channel (UEFI config table >> perhaps?) to describe this? > > PCI firmware specifications rev 3.1, 4.6.5, "_DSM for Ignoring PCI > Boot Configurations". > > Do we want to enforce it on ARM ? I do not know to be honest (and it > still would not solve the DT firmware case). > No, it doesn't. But that doesn't mean we shouldn't solve it on ACPI if the pros outweigh the cons. > Whatever we do, it is not going to be clean cut IMO. I think that > what x86 does is sensible (well, minus the link ordering dependency we > discovered), I can do it for ARM64 but get ready for regressions and > I still think we have no real FW binding support that would make this > behaviour robust. > > I can provide you with examples where, by simply claiming resources > on an ARM system you trigger resources allocation regressions by > preventing the kernel from allocating bigger bridge windows than > the ones set-up by FW. > So how is this specific to ARM then? If we are running the same resource allocation routines, why should we end up with a firmware allocation that the kernel cannot use? >> How do other architectures deal with this? > > On an arch specific basis that make things work. > >> Is this what the PCI bios accesses are for? > > Which ones :) ? > Well, some of them :-) I guess the question was if the overridden __weak methods are supposed to hook into tables or other BIOS structures that contain information about the PCI resource allocations by the firmware. >> In any case, I have updated the UEFI firmware we have for ARM Juno to >> use EDK2's generic PCI host bridge driver instead of one that was >> specially written for ARM Juno, and may deviate in the way it >> allocates PCI resources. > > As long as the kernel is free to reallocate them and that FW quiesces > devices at FW->OS handover I see no issues with that. > The reason is to eliminate another unknown from the discussion whether UEFI can be expected to leave the entire PCI hierarchy in a sane state. > If we want to try to claim the whole resource tree on boot (in ACPI) > I can send a patch for that but there will be regressions. > I would like to see it, yes.
Hi Ard, I have one comment inclined. 在 3/22/2017 11:30 PM, Ard Biesheuvel 写道: > On UEFI systems, the PCI subsystem is enumerated by the firmware, > and if a graphical framebuffer is exposed by a PCI device, its base > address and size are exposed to the OS via the Graphics Output > Protocol (GOP). > > On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > scratch at boot. This may result in the GOP framebuffer address to > become stale, if the BAR covering the framebuffer is modified. This > will cause the framebuffer to become unresponsive, and may in some > cases result in unpredictable behavior if the range is reassigned to > another device. > > So add a quirk to the EFI fb driver to find the BAR associated with > the GOP base address, and set the IORESOURCE_PCI_FIXED attribute so > that the PCI core will leave it alone. > > Cc: Matt Fleming <matt@codeblueprint.co.uk> > Cc: Peter Jones <pjones@redhat.com> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > v3: check device is enabled before attempting to claim the resource > > drivers/video/fbdev/efifb.c | 60 +++++++++++++++++++- > 1 file changed, 59 insertions(+), 1 deletion(-) > > diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c > index 8c4dc1e1f94f..88f653864a01 100644 > --- a/drivers/video/fbdev/efifb.c > +++ b/drivers/video/fbdev/efifb.c > @@ -10,6 +10,7 @@ > #include <linux/efi.h> > #include <linux/errno.h> > #include <linux/fb.h> > +#include <linux/pci.h> > #include <linux/platform_device.h> > #include <linux/screen_info.h> > #include <video/vga.h> > @@ -143,6 +144,9 @@ static struct attribute *efifb_attrs[] = { > }; > ATTRIBUTE_GROUPS(efifb); > > +static bool pci_bar_found; /* did we find a BAR matching the efifb base? */ > +static bool pci_dev_disabled; /* was the device disabled? */ > + > static int efifb_probe(struct platform_device *dev) > { > struct fb_info *info; > @@ -152,7 +156,7 @@ static int efifb_probe(struct platform_device *dev) > unsigned int size_total; > char *option = NULL; > > - if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > + if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) > return -ENODEV; > > if (fb_get_options("efifb", &option)) > @@ -360,3 +364,57 @@ static struct platform_driver efifb_driver = { > }; > > builtin_platform_driver(efifb_driver); > + > +static void claim_efifb_bar(struct pci_dev *dev, int idx) > +{ > + u16 word; > + > + pci_bar_found = true; > + > + pci_read_config_word(dev, PCI_COMMAND, &word); > + if (!(word & PCI_COMMAND_MEMORY)) { > + pci_dev_disabled = true; > + dev_err(&dev->dev, > + "BAR %d: assigned to efifb but device is disabled!\n", > + idx); > + return; > + } > + > + if (pci_claim_resource(dev, idx)) { > + pci_dev_disabled = true; > + dev_err(&dev->dev, > + "BAR %d: failed to claim resource for efifb!\n", idx); > + return; > + } > + > + dev_info(&dev->dev, "BAR %d: assigned to efifb\n", idx); > +} > + > +static void efifb_fixup_resources(struct pci_dev *dev) > +{ > + u64 base = screen_info.lfb_base; > + u64 size = screen_info.lfb_size; > + int i; > + > + if (pci_bar_found || screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > + return; > + > + if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE) > + base |= (u64)screen_info.ext_lfb_base << 32; > + > + if (!base) > + return; > + > + for (i = 0; i < PCI_STD_RESOURCE_END; i++) { > + struct resource *res = &dev->resource[i]; > + > + if (!(res->flags & IORESOURCE_MEM)) > + continue; > + > + if (res->start <= base && res->end >= base + size - 1) { Have we considered PCI address translation here? I suppose the address reported by EFI GOP should be the address of CPU domain, not address of PCI domain. Address read from PCI BAR is PCI address which should add translation offset before being compared to CPU domain address. I've not familiar with Linux code, so please ignore my comment if the code has supported address translation. Regards, Gary (Heyi Guo) > + claim_efifb_bar(dev, i); > + break; > + } > + } > +} > +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources);
Hi Gary, Thanks for the question. On Wed, May 03, 2017 at 11:09:51AM +0800, Heyi Guo wrote: > Hi Ard, > > I have one comment inclined. > > > 在 3/22/2017 11:30 PM, Ard Biesheuvel 写道: > >On UEFI systems, the PCI subsystem is enumerated by the firmware, > >and if a graphical framebuffer is exposed by a PCI device, its base > >address and size are exposed to the OS via the Graphics Output > >Protocol (GOP). > > > >On arm64 PCI systems, the entire PCI hierarchy is reconfigured from > >scratch at boot. This may result in the GOP framebuffer address to > >become stale, if the BAR covering the framebuffer is modified. This > >will cause the framebuffer to become unresponsive, and may in some > >cases result in unpredictable behavior if the range is reassigned to > >another device. > > > >So add a quirk to the EFI fb driver to find the BAR associated with > >the GOP base address, and set the IORESOURCE_PCI_FIXED attribute so > >that the PCI core will leave it alone. > > > >Cc: Matt Fleming <matt@codeblueprint.co.uk> > >Cc: Peter Jones <pjones@redhat.com> > >Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > >--- > >v3: check device is enabled before attempting to claim the resource > > > > drivers/video/fbdev/efifb.c | 60 +++++++++++++++++++- > > 1 file changed, 59 insertions(+), 1 deletion(-) > > > >diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c > >index 8c4dc1e1f94f..88f653864a01 100644 > >--- a/drivers/video/fbdev/efifb.c > >+++ b/drivers/video/fbdev/efifb.c > >@@ -10,6 +10,7 @@ > > #include <linux/efi.h> > > #include <linux/errno.h> > > #include <linux/fb.h> > >+#include <linux/pci.h> > > #include <linux/platform_device.h> > > #include <linux/screen_info.h> > > #include <video/vga.h> > >@@ -143,6 +144,9 @@ static struct attribute *efifb_attrs[] = { > > }; > > ATTRIBUTE_GROUPS(efifb); > >+static bool pci_bar_found; /* did we find a BAR matching the efifb base? */ > >+static bool pci_dev_disabled; /* was the device disabled? */ > >+ > > static int efifb_probe(struct platform_device *dev) > > { > > struct fb_info *info; > >@@ -152,7 +156,7 @@ static int efifb_probe(struct platform_device *dev) > > unsigned int size_total; > > char *option = NULL; > >- if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > >+ if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) > > return -ENODEV; > > if (fb_get_options("efifb", &option)) > >@@ -360,3 +364,57 @@ static struct platform_driver efifb_driver = { > > }; > > builtin_platform_driver(efifb_driver); > >+ > >+static void claim_efifb_bar(struct pci_dev *dev, int idx) > >+{ > >+ u16 word; > >+ > >+ pci_bar_found = true; > >+ > >+ pci_read_config_word(dev, PCI_COMMAND, &word); > >+ if (!(word & PCI_COMMAND_MEMORY)) { > >+ pci_dev_disabled = true; > >+ dev_err(&dev->dev, > >+ "BAR %d: assigned to efifb but device is disabled!\n", > >+ idx); > >+ return; > >+ } > >+ > >+ if (pci_claim_resource(dev, idx)) { > >+ pci_dev_disabled = true; > >+ dev_err(&dev->dev, > >+ "BAR %d: failed to claim resource for efifb!\n", idx); > >+ return; > >+ } > >+ > >+ dev_info(&dev->dev, "BAR %d: assigned to efifb\n", idx); > >+} > >+ > >+static void efifb_fixup_resources(struct pci_dev *dev) > >+{ > >+ u64 base = screen_info.lfb_base; > >+ u64 size = screen_info.lfb_size; > >+ int i; > >+ > >+ if (pci_bar_found || screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) > >+ return; > >+ > >+ if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE) > >+ base |= (u64)screen_info.ext_lfb_base << 32; > >+ > >+ if (!base) > >+ return; > >+ > >+ for (i = 0; i < PCI_STD_RESOURCE_END; i++) { > >+ struct resource *res = &dev->resource[i]; > >+ > >+ if (!(res->flags & IORESOURCE_MEM)) > >+ continue; > >+ > >+ if (res->start <= base && res->end >= base + size - 1) { > Have we considered PCI address translation here? I suppose the > address reported by EFI GOP should be the address of CPU domain, not > address of PCI domain. Address read from PCI BAR is PCI address > which should add translation offset before being compared to CPU > domain address. Every address in dev->resource[] is a CPU domain address, so address translation offset should be handled correctly. This is because __pci_read_base() calls pcibios_bus_to_resource() when it fills in dev->resource[], and pcibios_bus_to_resource() applies any host bridge address translation. Bjorn
Hi Bjorn, Many thanks for your clear answer. Regards, Gary (Heyi Guo) 在 5/18/2017 10:01 PM, Bjorn Helgaas 写道: > Hi Gary, > > Thanks for the question. > > On Wed, May 03, 2017 at 11:09:51AM +0800, Heyi Guo wrote: >> Hi Ard, >> >> I have one comment inclined. >> >> >> 在 3/22/2017 11:30 PM, Ard Biesheuvel 写道: >>> On UEFI systems, the PCI subsystem is enumerated by the firmware, >>> and if a graphical framebuffer is exposed by a PCI device, its base >>> address and size are exposed to the OS via the Graphics Output >>> Protocol (GOP). >>> >>> On arm64 PCI systems, the entire PCI hierarchy is reconfigured from >>> scratch at boot. This may result in the GOP framebuffer address to >>> become stale, if the BAR covering the framebuffer is modified. This >>> will cause the framebuffer to become unresponsive, and may in some >>> cases result in unpredictable behavior if the range is reassigned to >>> another device. >>> >>> So add a quirk to the EFI fb driver to find the BAR associated with >>> the GOP base address, and set the IORESOURCE_PCI_FIXED attribute so >>> that the PCI core will leave it alone. >>> >>> Cc: Matt Fleming <matt@codeblueprint.co.uk> >>> Cc: Peter Jones <pjones@redhat.com> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>> --- >>> v3: check device is enabled before attempting to claim the resource >>> >>> drivers/video/fbdev/efifb.c | 60 +++++++++++++++++++- >>> 1 file changed, 59 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c >>> index 8c4dc1e1f94f..88f653864a01 100644 >>> --- a/drivers/video/fbdev/efifb.c >>> +++ b/drivers/video/fbdev/efifb.c >>> @@ -10,6 +10,7 @@ >>> #include <linux/efi.h> >>> #include <linux/errno.h> >>> #include <linux/fb.h> >>> +#include <linux/pci.h> >>> #include <linux/platform_device.h> >>> #include <linux/screen_info.h> >>> #include <video/vga.h> >>> @@ -143,6 +144,9 @@ static struct attribute *efifb_attrs[] = { >>> }; >>> ATTRIBUTE_GROUPS(efifb); >>> +static bool pci_bar_found; /* did we find a BAR matching the efifb base? */ >>> +static bool pci_dev_disabled; /* was the device disabled? */ >>> + >>> static int efifb_probe(struct platform_device *dev) >>> { >>> struct fb_info *info; >>> @@ -152,7 +156,7 @@ static int efifb_probe(struct platform_device *dev) >>> unsigned int size_total; >>> char *option = NULL; >>> - if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) >>> + if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) >>> return -ENODEV; >>> if (fb_get_options("efifb", &option)) >>> @@ -360,3 +364,57 @@ static struct platform_driver efifb_driver = { >>> }; >>> builtin_platform_driver(efifb_driver); >>> + >>> +static void claim_efifb_bar(struct pci_dev *dev, int idx) >>> +{ >>> + u16 word; >>> + >>> + pci_bar_found = true; >>> + >>> + pci_read_config_word(dev, PCI_COMMAND, &word); >>> + if (!(word & PCI_COMMAND_MEMORY)) { >>> + pci_dev_disabled = true; >>> + dev_err(&dev->dev, >>> + "BAR %d: assigned to efifb but device is disabled!\n", >>> + idx); >>> + return; >>> + } >>> + >>> + if (pci_claim_resource(dev, idx)) { >>> + pci_dev_disabled = true; >>> + dev_err(&dev->dev, >>> + "BAR %d: failed to claim resource for efifb!\n", idx); >>> + return; >>> + } >>> + >>> + dev_info(&dev->dev, "BAR %d: assigned to efifb\n", idx); >>> +} >>> + >>> +static void efifb_fixup_resources(struct pci_dev *dev) >>> +{ >>> + u64 base = screen_info.lfb_base; >>> + u64 size = screen_info.lfb_size; >>> + int i; >>> + >>> + if (pci_bar_found || screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) >>> + return; >>> + >>> + if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE) >>> + base |= (u64)screen_info.ext_lfb_base << 32; >>> + >>> + if (!base) >>> + return; >>> + >>> + for (i = 0; i < PCI_STD_RESOURCE_END; i++) { >>> + struct resource *res = &dev->resource[i]; >>> + >>> + if (!(res->flags & IORESOURCE_MEM)) >>> + continue; >>> + >>> + if (res->start <= base && res->end >= base + size - 1) { >> Have we considered PCI address translation here? I suppose the >> address reported by EFI GOP should be the address of CPU domain, not >> address of PCI domain. Address read from PCI BAR is PCI address >> which should add translation offset before being compared to CPU >> domain address. > Every address in dev->resource[] is a CPU domain address, so address > translation offset should be handled correctly. > > This is because __pci_read_base() calls pcibios_bus_to_resource() when > it fills in dev->resource[], and pcibios_bus_to_resource() applies any > host bridge address translation. > > Bjorn
diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c index 8c4dc1e1f94f..88f653864a01 100644 --- a/drivers/video/fbdev/efifb.c +++ b/drivers/video/fbdev/efifb.c @@ -10,6 +10,7 @@ #include <linux/efi.h> #include <linux/errno.h> #include <linux/fb.h> +#include <linux/pci.h> #include <linux/platform_device.h> #include <linux/screen_info.h> #include <video/vga.h> @@ -143,6 +144,9 @@ static struct attribute *efifb_attrs[] = { }; ATTRIBUTE_GROUPS(efifb); +static bool pci_bar_found; /* did we find a BAR matching the efifb base? */ +static bool pci_dev_disabled; /* was the device disabled? */ + static int efifb_probe(struct platform_device *dev) { struct fb_info *info; @@ -152,7 +156,7 @@ static int efifb_probe(struct platform_device *dev) unsigned int size_total; char *option = NULL; - if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) + if (screen_info.orig_video_isVGA != VIDEO_TYPE_EFI || pci_dev_disabled) return -ENODEV; if (fb_get_options("efifb", &option)) @@ -360,3 +364,57 @@ static struct platform_driver efifb_driver = { }; builtin_platform_driver(efifb_driver); + +static void claim_efifb_bar(struct pci_dev *dev, int idx) +{ + u16 word; + + pci_bar_found = true; + + pci_read_config_word(dev, PCI_COMMAND, &word); + if (!(word & PCI_COMMAND_MEMORY)) { + pci_dev_disabled = true; + dev_err(&dev->dev, + "BAR %d: assigned to efifb but device is disabled!\n", + idx); + return; + } + + if (pci_claim_resource(dev, idx)) { + pci_dev_disabled = true; + dev_err(&dev->dev, + "BAR %d: failed to claim resource for efifb!\n", idx); + return; + } + + dev_info(&dev->dev, "BAR %d: assigned to efifb\n", idx); +} + +static void efifb_fixup_resources(struct pci_dev *dev) +{ + u64 base = screen_info.lfb_base; + u64 size = screen_info.lfb_size; + int i; + + if (pci_bar_found || screen_info.orig_video_isVGA != VIDEO_TYPE_EFI) + return; + + if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE) + base |= (u64)screen_info.ext_lfb_base << 32; + + if (!base) + return; + + for (i = 0; i < PCI_STD_RESOURCE_END; i++) { + struct resource *res = &dev->resource[i]; + + if (!(res->flags & IORESOURCE_MEM)) + continue; + + if (res->start <= base && res->end >= base + size - 1) { + claim_efifb_bar(dev, i); + break; + } + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_ANY_ID, PCI_ANY_ID, efifb_fixup_resources);
On UEFI systems, the PCI subsystem is enumerated by the firmware, and if a graphical framebuffer is exposed by a PCI device, its base address and size are exposed to the OS via the Graphics Output Protocol (GOP). On arm64 PCI systems, the entire PCI hierarchy is reconfigured from scratch at boot. This may result in the GOP framebuffer address to become stale, if the BAR covering the framebuffer is modified. This will cause the framebuffer to become unresponsive, and may in some cases result in unpredictable behavior if the range is reassigned to another device. So add a quirk to the EFI fb driver to find the BAR associated with the GOP base address, and set the IORESOURCE_PCI_FIXED attribute so that the PCI core will leave it alone. Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Jones <pjones@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- v3: check device is enabled before attempting to claim the resource drivers/video/fbdev/efifb.c | 60 +++++++++++++++++++- 1 file changed, 59 insertions(+), 1 deletion(-)