Message ID | 20181115231605.24352-1-mr.nuke.me@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | PCI/AER: Consistently use _OSC to determine who owns AER | expand |
On 11/15/2018 3:16 PM, Alexandru Gagniuc wrote: > I've asked around a few people at Dell and they unanimously agree that > _OSC is the correct way to determine ownership of AER. In linux, we > use the result of _OSC to enable AER services, but we use HEST to > determine AER ownership. That's inconsistent. This series drops the > use of HEST in favor of _OSC. > > [1]https://lkml.org/lkml/2018/11/15/62 This change breaks the existing systems that rely on the HEST table telling the operating system about firmware first presence. Besides, HEST table has much more granularity about which PCI component needs firmware such as global/device/switch. You should probably circulate these ideas for wider consumption in UEFI forum as UEFI owns the HEST table definition.
From: Alexandru Gagniuc > Sent: 15 November 2018 23:16 ... > I've asked around a few people at Dell and they unanimously agree that > _OSC is the correct way to determine ownership of AER. This is all very well, but we have systems (they might be Dell ones) where failure of a PCIe link (even when all the drivers are removed) causes an NMI - and crashes the kernel. There are other systems which have AER registers available for some of the PCIe devices, but the BIOS ignores them and _OSC doesn't let the kernel take control. In this case the AER registers can contain useful information after a read returns ~0u. It would be useful to be able to get this information without having to grovel through all the PCIe config space. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Thu, Nov 15, 2018 at 8:49 PM Sinan Kaya <okaya@kernel.org> wrote: > > On 11/15/2018 3:16 PM, Alexandru Gagniuc wrote: > > I've asked around a few people at Dell and they unanimously agree that > > _OSC is the correct way to determine ownership of AER. In linux, we > > use the result of _OSC to enable AER services, but we use HEST to > > determine AER ownership. That's inconsistent. This series drops the > > use of HEST in favor of _OSC. > > > > [1]https://lkml.org/lkml/2018/11/15/62 > > This change breaks the existing systems that rely on the HEST table > telling the operating system about firmware first presence. > > Besides, HEST table has much more granularity about which PCI component > needs firmware such as global/device/switch. > > You should probably circulate these ideas for wider consumption in UEFI > forum as UEFI owns the HEST table definition. I agree with Sinan, this will break existing systems, and the granularity of the HEST definition is more useful than the single bit in _OSC.
On Mon, Nov 19, 2018 at 11:53:05AM -0500, Tyler Baicar wrote: > On Thu, Nov 15, 2018 at 8:49 PM Sinan Kaya <okaya@kernel.org> wrote: > > > > On 11/15/2018 3:16 PM, Alexandru Gagniuc wrote: > > > I've asked around a few people at Dell and they unanimously agree that > > > _OSC is the correct way to determine ownership of AER. In linux, we > > > use the result of _OSC to enable AER services, but we use HEST to > > > determine AER ownership. That's inconsistent. This series drops the > > > use of HEST in favor of _OSC. > > > > > > [1]https://lkml.org/lkml/2018/11/15/62 > > > > This change breaks the existing systems that rely on the HEST table > > telling the operating system about firmware first presence. > > > > Besides, HEST table has much more granularity about which PCI component > > needs firmware such as global/device/switch. > > > > You should probably circulate these ideas for wider consumption in UEFI > > forum as UEFI owns the HEST table definition. > > I agree with Sinan, this will break existing systems, and the granularity of the > HEST definition is more useful than the single bit in _OSC. But we're not using HEST as a fine grain control. We disable native AER handling if *any* device has FF set in HEST, and that just forces people to use pcie_ports=native to get around that.
On 11/19/2018 11:53 AM, Keith Busch wrote: > On Mon, Nov 19, 2018 at 11:53:05AM -0500, Tyler Baicar wrote: >> On Thu, Nov 15, 2018 at 8:49 PM Sinan Kaya <okaya@kernel.org> wrote: >>> >>> On 11/15/2018 3:16 PM, Alexandru Gagniuc wrote: >>>> I've asked around a few people at Dell and they unanimously agree that >>>> _OSC is the correct way to determine ownership of AER. In linux, we >>>> use the result of _OSC to enable AER services, but we use HEST to >>>> determine AER ownership. That's inconsistent. This series drops the >>>> use of HEST in favor of _OSC. >>>> >>>> [1]https://lkml.org/lkml/2018/11/15/62 >>> >>> This change breaks the existing systems that rely on the HEST table >>> telling the operating system about firmware first presence. >>> >>> Besides, HEST table has much more granularity about which PCI component >>> needs firmware such as global/device/switch. >>> >>> You should probably circulate these ideas for wider consumption in UEFI >>> forum as UEFI owns the HEST table definition. >> >> I agree with Sinan, this will break existing systems, and the granularity of the >> HEST definition is more useful than the single bit in _OSC. > > But we're not using HEST as a fine grain control. We disable native AER > handling if *any* device has FF set in HEST, and that just forces people > to use pcie_ports=native to get around that. > I don't see *any* in the code. aer_hest_parse() does the HEST table parsing. It switches to firmware first mode if global flag in HEST is set. Otherwise for each BDF in device, hest_match_pci() is used to do a cross-matching against HEST table contents. Am I missing something?
On Mon, Nov 19, 2018 at 12:32:42PM -0500, Sinan Kaya wrote: > On 11/19/2018 11:53 AM, Keith Busch wrote: > > On Mon, Nov 19, 2018 at 11:53:05AM -0500, Tyler Baicar wrote: > > > On Thu, Nov 15, 2018 at 8:49 PM Sinan Kaya <okaya@kernel.org> wrote: > > > > > > > > On 11/15/2018 3:16 PM, Alexandru Gagniuc wrote: > > > > > I've asked around a few people at Dell and they unanimously agree that > > > > > _OSC is the correct way to determine ownership of AER. In linux, we > > > > > use the result of _OSC to enable AER services, but we use HEST to > > > > > determine AER ownership. That's inconsistent. This series drops the > > > > > use of HEST in favor of _OSC. > > > > > > > > > > [1]https://lkml.org/lkml/2018/11/15/62 > > > > > > > > This change breaks the existing systems that rely on the HEST table > > > > telling the operating system about firmware first presence. > > > > > > > > Besides, HEST table has much more granularity about which PCI component > > > > needs firmware such as global/device/switch. > > > > > > > > You should probably circulate these ideas for wider consumption in UEFI > > > > forum as UEFI owns the HEST table definition. > > > > > > I agree with Sinan, this will break existing systems, and the granularity of the > > > HEST definition is more useful than the single bit in _OSC. > > > > But we're not using HEST as a fine grain control. We disable native AER > > handling if *any* device has FF set in HEST, and that just forces people > > to use pcie_ports=native to get around that. > > > > I don't see *any* in the code. aer_hest_parse() does the HEST table parsing. > It switches to firmware first mode if global flag in HEST is set. Otherwise > for each BDF in device, hest_match_pci() is used to do a cross-matching against > HEST table contents. > > Am I missing something? You might be. :) static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) { <snip> /* * If no specific device is supplied, determine whether * FIRMWARE_FIRST is set for *any* PCIe device. */ if (!info->pci_dev) { info->firmware_first |= ff; return 0; }
On Mon, Nov 19, 2018 at 12:42:25PM -0500, Sinan Kaya wrote: > On 11/19/2018 12:32 PM, Sinan Kaya wrote: > > > > > > But we're not using HEST as a fine grain control. We disable native AER > > > handling if *any* device has FF set in HEST, and that just forces people > > > to use pcie_ports=native to get around that. > > > > > > > I don't see *any* in the code. aer_hest_parse() does the HEST table parsing. > > It switches to firmware first mode if global flag in HEST is set. Otherwise > > for each BDF in device, hest_match_pci() is used to do a cross-matching against > > HEST table contents. > > > > Am I missing something? > > I see. I think you are talking about aer_firmware_first, right? > > aer_set_firmware_first() and pcie_aer_get_firmware_first() seem to do the right > thing. Right, but what difference does it make if device specific AER checks do the right thing if pcie_aer_init() doesn't even register it's port driver? > aer_firmware_first is probably getting set because events are all routed to a > single root port and aer_acpi_firmware_first() is used to decide whether AER > should be initialized or not. > > I think I understand what is going on now. > > Still, breaking existing systems that rely on HEST table is not cool. > I'd rather have users specify "pcie_ports=native" to skip FF rather than > having broken systems by default to be honest. The pcie_ports=native work-around ignores FF to potentially unknown results, though.
On 11/19/2018 12:32 PM, Sinan Kaya wrote: >> >> But we're not using HEST as a fine grain control. We disable native AER >> handling if *any* device has FF set in HEST, and that just forces people >> to use pcie_ports=native to get around that. >> > > I don't see *any* in the code. aer_hest_parse() does the HEST table parsing. > It switches to firmware first mode if global flag in HEST is set. Otherwise > for each BDF in device, hest_match_pci() is used to do a cross-matching against > HEST table contents. > > Am I missing something? I see. I think you are talking about aer_firmware_first, right? aer_set_firmware_first() and pcie_aer_get_firmware_first() seem to do the right thing. aer_firmware_first is probably getting set because events are all routed to a single root port and aer_acpi_firmware_first() is used to decide whether AER should be initialized or not. I think I understand what is going on now. Still, breaking existing systems that rely on HEST table is not cool. I'd rather have users specify "pcie_ports=native" to skip FF rather than having broken systems by default to be honest.
On 11/19/2018 12:41 PM, Keith Busch wrote: >> Still, breaking existing systems that rely on HEST table is not cool. >> I'd rather have users specify "pcie_ports=native" to skip FF rather than >> having broken systems by default to be honest. > The pcie_ports=native work-around ignores FF to potentially unknown > results, though. > How about being able to enable/disable FF in BIOS? We can't really turn off firmware first in the kernel without asking help from the firmware. Like you said, it causes unpredictable results. There will be two competing software trying to touch the same registers.
On Mon, Nov 19, 2018 at 12:56:56PM -0500, Sinan Kaya wrote: > On 11/19/2018 12:41 PM, Keith Busch wrote: > > > Still, breaking existing systems that rely on HEST table is not cool. > > > I'd rather have users specify "pcie_ports=native" to skip FF rather than > > > having broken systems by default to be honest. > > The pcie_ports=native work-around ignores FF to potentially unknown > > results, though. > > > > How about being able to enable/disable FF in BIOS? > > We can't really turn off firmware first in the kernel without asking help > from the firmware. The _OSC method this patch utilizes is the ACPI spec defined way for the kernel to wrest control from firmware. BIOS specific menu settings shouldn't be our only recourse when we have a spec authority defining generic OS interfaces to accomplish the same thing. Unless there is a disagreement on the _OSC interpreation, we'd have to accept that platforms breaking from this patch are non-compliant. > Like you said, it causes unpredictable results. > > There will be two competing software trying to touch the same registers.
On 11/19/2018 1:10 PM, Keith Busch wrote: >> We can't really turn off firmware first in the kernel without asking help >> from the firmware. > The _OSC method this patch utilizes is the ACPI spec defined way for > the kernel to wrest control from firmware. BIOS specific menu settings > shouldn't be our only recourse when we have a spec authority defining > generic OS interfaces to accomplish the same thing. > > Unless there is a disagreement on the _OSC interpreation, we'd have to > accept that platforms breaking from this patch are non-compliant. > It depends on which spec you look :) UEFI HEST table specification also claims that it should be the ultimate table for when PCI firmware-first should be disabled/enabled. I think somebody needs to fix these. I saw an email from Harb Abdulhamid sent to aswg this morning. That's why I suggested circulating this idea in UEFI forums first. Let's see what everybody thinks. We can go from there.
On 11/19/2018 12:24 PM, Sinan Kaya wrote: > On 11/19/2018 1:10 PM, Keith Busch wrote: >>> We can't really turn off firmware first in the kernel without asking >>> help >>> from the firmware. >> The _OSC method this patch utilizes is the ACPI spec defined way for >> the kernel to wrest control from firmware. BIOS specific menu settings >> shouldn't be our only recourse when we have a spec authority defining >> generic OS interfaces to accomplish the same thing. >> >> Unless there is a disagreement on the _OSC interpreation, we'd have to >> accept that platforms breaking from this patch are non-compliant. >> > > It depends on which spec you look :) > > UEFI HEST table specification also claims that it should be the ultimate > table for when PCI firmware-first should be disabled/enabled. IIRC, EFI absorbed ACPI before FFS was a thing. Could you point me to the UEFI chapter that says HEST is authoritative? (not being a smartie, just that my free software PDF readers can't search within a file that large) > I think somebody needs to fix these. I saw an email from Harb Abdulhamid > sent to aswg this morning. > > That's why I suggested circulating this idea in UEFI forums first. > Let's see what everybody thinks. We can go from there. However you look at it, we have a glaring inconsistency in how we handle AER control in linux. I'm surprised we didn't see huge issues because of mixing HEST/_OSC. What systems rely on the HEST definition as opposed to _OSC? It doesn't make sense to me that you could have a system with mixed FFS and native AER on the same root port. The granularity of HEST shouldn't matter here because of how AER works. I'd like see how exactly we break one of those elusive systems with _OSC. I suspect _OSC and HEST end up having the same information, and that's why we didn't see any real-life issue with mixing the approaches. Alex P.S. (SARCASM ALERT) Isn't UEFI is a pile of stuff that didn't stick to the wall? P.P.S I remember someone asking why we don't disable FFS in the BIOS. I think it would be next to impossible to get certain platform vendors to relinquish FFS control, unless the specs required things to be that way -- and had a "standard" way to do so. Then getting the specs to change in that direction is also a battle.
>> UEFI HEST table specification also claims that it should be the ultimate >> table for when PCI firmware-first should be disabled/enabled. > > IIRC, EFI absorbed ACPI before FFS was a thing. Could you point me to the UEFI > chapter that says HEST is authoritative? > (not being a smartie, just that my free software PDF readers can't search within > a file that large) > ACPI 6.2: 18.3.2.4 PCI Express Root Port AER Structure Flags: Bit [0] - FIRMWARE_FIRST: If set, this bit indicates to the OSPM that system firmware will handle errors from this source first. Bit [1] - GLOBAL: If set, indicates that the settings contained in this structure apply globally to all PCI Express Devices. All other bits must be set to zero. It doesn't say shall, may or might. It says will. > >> I think somebody needs to fix these. I saw an email from Harb Abdulhamid >> sent to aswg this morning. >> >> That's why I suggested circulating this idea in UEFI forums first. >> Let's see what everybody thinks. We can go from there. > > However you look at it, we have a glaring inconsistency in how we handle AER > control in linux. I'm surprised we didn't see huge issues because of mixing > HEST/_OSC. > > What systems rely on the HEST definition as opposed to _OSC? It doesn't make > sense to me that you could have a system with mixed FFS and native AER on the > same root port. The granularity of HEST shouldn't matter here because of how AER > works. I think It depends on your PCI topology. For other topologies with multiple PCI root complexes, I can see this being used per root complex flag to indicate which root complex needs firmware first and which one doesn't. > > I'd like see how exactly we break one of those elusive systems with _OSC. I > suspect _OSC and HEST end up having the same information, and that's why we > didn't see any real-life issue with mixing the approaches. I'm already aware of two systems that rely on HEST table to pass information to the OS that firmware first is enabled. Both of the systems do not change their _OSC bits during this assuming HEST table has priority over _OSC for firmware first. If we add this patch, OS will try to claim the AER address space while firmware wants exclusive access. As I said in my previous email, the right place to talk about this is UEFI forum. > > Alex > > > P.S. > (SARCASM ALERT) Isn't UEFI is a pile of stuff that didn't stick to the wall? > > P.P.S > I remember someone asking why we don't disable FFS in the BIOS. I think it would > be next to impossible to get certain platform vendors to relinquish FFS control, > unless the specs required things to be that way -- and had a "standard" way to > do so. > > Then getting the specs to change in that direction is also a battle. >
On 11/19/2018 01:32 PM, Sinan Kaya wrote: > ACPI 6.2: > > 18.3.2.4 PCI Express Root Port AER Structure > > Flags: > > Bit [0] - FIRMWARE_FIRST: If set, this bit indicates to the OSPM that system > firmware will handle errors from this source first. > Bit [1] - GLOBAL: If set, indicates that the settings contained in this > structure apply globally to all PCI Express Devices. > All other bits must be set to zero. > > It doesn't say shall, may or might. It says will. It says "system firmware will handle errors". It does not say "system firmware owns AER registers". In absence on any descriptor text on the meaning of these tables, this really looks to me like it should be interpreted as a descriptor of APEI error sources, not a mutex on who writes to certain bits-- AER in this case. I don't think that is contradictory or inconsistent. I also wasn't able to find any reference to HEST in UEFI 2.7, only in ACPI spec. > I think It depends on your PCI topology. > > For other topologies with multiple PCI root complexes, I can see this being > used per root complex flag to indicate which root complex needs firmware first > and which one doesn't. _OSC is per root bus, so it's already granular enough, right? Why would it depend on PCI topology? >> I'd like see how exactly we break one of those elusive systems with _OSC. I >> suspect _OSC and HEST end up having the same information, and that's why we >> didn't see any real-life issue with mixing the approaches. > > I'm already aware of two systems that rely on HEST table to pass information to > the OS that firmware first is enabled. Both of the systems do not change their > _OSC bits during this assuming HEST table has priority over _OSC for firmware > first. Are those hax86 systems? It seems like the systems have broken firmware. I see several ways to handle broken systems like those: - Parse both HEST and _OSC, and decide AER ownership with root bridge granularity. i.e. host_bridge->native_aer is authoritative, but is derived from both HEST and _OSC - Add quirks for the broken systems - Keep doing what we're doing until current code breaks a new system > If we add this patch, OS will try to claim the AER address space while firmware > wants exclusive access. Yay! FFS wants exclusive access, but does not claim it. Oh, FFS! > As I said in my previous email, the right place to talk about this is UEFI > forum. The way I would present the problem to he spec writers is that, although the spec appears to be consistent, we've seen firmware vendors that made the wrong assumptions about HEST/_OSC. Instead of describing AER ownership with _OSC, they attempted to do it with HEST. So we should add an implementation note, or clarification about this. Alex
On 11/19/2018 3:16 PM, Alex_Gagniuc@Dellteam.com wrote: > On 11/19/2018 01:32 PM, Sinan Kaya wrote: >> ACPI 6.2: >> >> 18.3.2.4 PCI Express Root Port AER Structure >> >> Flags: >> >> Bit [0] - FIRMWARE_FIRST: If set, this bit indicates to the OSPM that system >> firmware will handle errors from this source first. >> Bit [1] - GLOBAL: If set, indicates that the settings contained in this >> structure apply globally to all PCI Express Devices. >> All other bits must be set to zero. >> >> It doesn't say shall, may or might. It says will. > > It says "system firmware will handle errors". It does not say "system > firmware owns AER registers". In absence on any descriptor text on the > meaning of these tables, this really looks to me like it should be > interpreted as a descriptor of APEI error sources, not a mutex on who > writes to certain bits-- AER in this case. True. I was trying to get it out in a rush. I omitted words. However; table assumes governance about for which entities firmware first should be enabled. There is no cross reference to _OSC or permission negotiation like _OST. > > I don't think that is contradictory or inconsistent. > I also wasn't able to find any reference to HEST in UEFI 2.7, only in > ACPI spec. You are right. It was a confusion on my side. The right place to look is ACPI specification. I was involved in this a couple of years ago. Some pieces were in UEFI spec. Other pieces were in ACPI. I guess they got unified now. > >> I think It depends on your PCI topology. >> >> For other topologies with multiple PCI root complexes, I can see this being >> used per root complex flag to indicate which root complex needs firmware first >> and which one doesn't. > > _OSC is per root bus, so it's already granular enough, right? Why would > it depend on PCI topology? > I was speculating. I don't have the full background on this. Need to consult the spec developers. >> As I said in my previous email, the right place to talk about this is UEFI >> forum. > > The way I would present the problem to he spec writers is that, although > the spec appears to be consistent, we've seen firmware vendors that made > the wrong assumptions about HEST/_OSC. Instead of describing AER > ownership with _OSC, they attempted to do it with HEST. So we should add > an implementation note, or clarification about this. I agree. > > Alex >
On 11/19/2018 02:33 PM, Sinan Kaya wrote: > True. I was trying to get it out in a rush. I omitted words. Sounds like you'd make an top notch spec writer! :p > However; table assumes governance about for which entities firmware first > should be enabled. There is no cross reference to _OSC or permission > negotiation like _OST. Well, from an OSPM perspective, is FFS something that can be enabled or disabled? FFS seems to be static to OSPM, which would change the sort of assumptions we can reasonably make here. >>> As I said in my previous email, the right place to talk about this is UEFI >>> forum. >> >> The way I would present the problem to he spec writers is that, although >> the spec appears to be consistent, we've seen firmware vendors that made >> the wrong assumptions about HEST/_OSC. Instead of describing AER >> ownership with _OSC, they attempted to do it with HEST. So we should add >> an implementation note, or clarification about this. > > I agree. Cool. While the UEFI Secret Society debates, can we figure out if/how [patch 1/2] breaks those systems, or is it only [patch 2/2] of this series that we suspect? Alex
On 11/19/2018 6:49 PM, Alex_Gagniuc@Dellteam.com wrote: > On 11/19/2018 02:33 PM, Sinan Kaya wrote: >> However; table assumes governance about for which entities firmware first >> should be enabled. There is no cross reference to _OSC or permission >> negotiation like _OST. > > Well, from an OSPM perspective, is FFS something that can be enabled or > disabled? FFS seems to be static to OSPM, which would change the sort of > assumptions we can reasonably make here. IMO, it can be enabled/disabled in BIOS. I have seen this implementation before. If the trigger is the presence of a statically compiled ACPI HEST table (as the current code does); presence of FFS would be static from OSPM perspective. BIOS could patch this table or hide it during boot. If FFS were to be negotiated via _OSC as indirectly implied in this series, then same BIOS could patch the ACPI table to return different values for the _OSC return. > > >>>> As I said in my previous email, the right place to talk about this is UEFI >>>> forum. >>> >>> The way I would present the problem to he spec writers is that, although >>> the spec appears to be consistent, we've seen firmware vendors that made >>> the wrong assumptions about HEST/_OSC. Instead of describing AER >>> ownership with _OSC, they attempted to do it with HEST. So we should add >>> an implementation note, or clarification about this. >> >> I agree. > > Cool. While the UEFI Secret Society debates, can we figure out if/how > [patch 1/2] breaks those systems, or is it only [patch 2/2] of this > series that we suspect? I went back and looked at both patches. Both of them are removing references to HEST table. I think both patches are impacted by this discussion. > > Alex >
On 11/19/2018 07:54 PM, Sinan Kaya wrote: > On 11/19/2018 6:49 PM, Alex_Gagniuc@Dellteam.com wrote: >> On 11/19/2018 02:33 PM, Sinan Kaya wrote: >>> However; table assumes governance about for which entities firmware first >>> should be enabled. There is no cross reference to _OSC or permission >>> negotiation like _OST. >> >> Well, from an OSPM perspective, is FFS something that can be enabled or >> disabled? FFS seems to be static to OSPM, which would change the sort of >> assumptions we can reasonably make here. > > IMO, it can be enabled/disabled in BIOS. I have seen this implementation before. > If the trigger is the presence of a statically compiled ACPI HEST table (as the > current code does); presence of FFS would be static from OSPM perspective. > BIOS could patch this table or hide it during boot. > > If FFS were to be negotiated via _OSC as indirectly implied in this series, then > same BIOS could patch the ACPI table to return different values for the _OSC > return. It is theoretically possible to have proprietary BIOS settings to disable FFS. The platform vendors that I've spoken to do not offer this option. Though even if, hypothetically, BIOS clears the FFS bit in HEST, it won't stop it from commandeering the CPU and doing whatever it wants. Although, I'm not quite sure why we'd want to negotiate FFS itself. FFS is too big of a can of worms (goes far beyond AER error reporting), when what we really care about is if OS can use a specific feature or not. >> Cool. While the UEFI Secret Society debates, can we figure out if/how >> [patch 1/2] breaks those systems, or is it only [patch 2/2] of this >> series that we suspect? > > I went back and looked at both patches. Both of them are removing references to > HEST table. I think both patches are impacted by this discussion. I'd prefer "sure" instead of "think". "I think it breaks some system I'm not telling you about" doesn't help much in figuring out how not to break said system(s). :) Alex
On 11/20/2018 3:44 PM, Alex_Gagniuc@Dellteam.com wrote: > I'd prefer "sure" instead of "think". "I think it breaks some system I'm > not telling you about" doesn't help much in figuring out how not to > break said system(s).:) Sorry, I thought I mentioned why it would break but let me repeat. The systems I have seen rely on the HEST table presence as an indicator to the OS that firmware first is enabled. If you go look at the _OSC bits on such systems, it still says OS owns the AER service. The assumption here is that HEST table has precedence over the _OSC bits. That's what needs to be clarified in the UEFI forum. If this code is to go in and ignore the HEST table presence, then firmware will think that it owns AER service and OS will think that it owns AER service too.
On Tue, Nov 20, 2018 at 04:02:21PM -0500, Sinan Kaya wrote: > On 11/20/2018 3:44 PM, Alex_Gagniuc@Dellteam.com wrote: > > I'd prefer "sure" instead of "think". "I think it breaks some system I'm > > not telling you about" doesn't help much in figuring out how not to > > break said system(s).:) > > Sorry, I thought I mentioned why it would break but let me repeat. > > The systems I have seen rely on the HEST table presence as an indicator > to the OS that firmware first is enabled. If you go look at the _OSC bits > on such systems, it still says OS owns the AER service. > > The assumption here is that HEST table has precedence over the _OSC bits. > That's what needs to be clarified in the UEFI forum. > > If this code is to go in and ignore the HEST table presence, then firmware > will think that it owns AER service and OS will think that it owns AER > service too. How does that work? If the OS takes control, it sets up MSIs that FW don't react to, and disables system errors through PCIe Root Control. Aren't those sys errs the mechanism FW knows it has something to do, which means the OS can effectively fence it off?
On 11/20/2018 03:02 PM, Sinan Kaya wrote: > On 11/20/2018 3:44 PM, Alex_Gagniuc@Dellteam.com wrote: >> I'd prefer "sure" instead of "think". "I think it breaks some system I'm >> not telling you about" doesn't help much in figuring out how not to >> break said system(s).:) > > Sorry, I thought I mentioned why it would break but let me repeat. Why, yes, but bets are still being placed on the systems allegedly suffering from this. > The systems I have seen rely on the HEST table presence as an indicator > to the OS that firmware first is enabled. If you go look at the _OSC bits > on such systems, it still says OS owns the AER service. > > The assumption here is that HEST table has precedence over the _OSC bits. > That's what needs to be clarified in the UEFI forum. > > If this code is to go in and ignore the HEST table presence, then firmware > will think that it owns AER service and OS will think that it owns AER > service too. So this seems like exactly the scenario we were hypothesizing. * System boots up with FFS enabled. Everything is fine so far. * OSPM requests control of AER (set bit 3 in _OSC) * FW grants OS control of AER (set bit 3 in _OSC reply) That's how things are designed to work. Now, let's assume, for the sake of argument, that the firmware on those system's is broken, and it didn't intend to give the OS control of AER. OSPM checking HEST instead of _OSC is still wrong, according to the spec. Two wrongs don't make a right, they just don't crash. I think the correct way is to identify those broken systems, and add quirks for them. Continuing to have inconsistent and over-complicated logic that is not spec compliant is not any better. Alex
On 11/20/2018 4:46 PM, Alex_Gagniuc@Dellteam.com wrote: > Now, let's assume, for the sake of argument, that the firmware on those > system's is broken, and it didn't intend to give the OS control of AER. > OSPM checking HEST instead of _OSC is still wrong, according to the > spec. Two wrongs don't make a right, they just don't crash. > > I think the correct way is to identify those broken systems, and add > quirks for them. Continuing to have inconsistent and over-complicated > logic that is not spec compliant is not any better. Remember that both _OSC and HEST are in the ACPI specification. I don't think there is a consensus on what is "wrong". There is certainly a need for spec clarification. One version is: "if HEST table is present, ignore _OSC" or Another version is: "if HEST table is present, make sure that FW sets _OSC bit for AER to false. Otherwise, warn like crazy that this BIOS is broken and needs an update and can cause all sorts of trouble" I can see both points of view. The second one can also be worked around by an SMBIOS quirk too as you suggested. Counting the number of quirks and random bug reports will be an interesting exercise / regression. I followed the ASWG thread yesterday. There will be a meeting next week to discuss this. My preference is not to introduce new behavior/regression to the kernel.
On 11/20/2018 4:42 PM, Keith Busch wrote: > How does that work? If the OS takes control, it sets up MSIs that FW don't > react to, and disables system errors through PCIe Root Control. Aren't > those sys errs the mechanism FW knows it has something to do, which > means the OS can effectively fence it off? I think this is all implementation detail and doesn't necessarily apply to all firmware-first implementation flavors. Assumptions are: 1. both FW and OS are listening to MSI interrupts 2. FW monitors the system errors Some FF implementation could route the AER interrupt to a higher privilege level. Some other implementation could use INTx or a side-band channel interrupt for firmware-interrupt too. I have seen all 3 except MSI :) and also firmware never monitored the system error bits. I was curious if anybody ever used those legacy bits. Now, I know someone is using it.
On 11/20/2018 04:28 PM, Sinan Kaya wrote: > On 11/20/2018 4:42 PM, Keith Busch wrote: >> How does that work? If the OS takes control, it sets up MSIs that FW >> don't >> react to, and disables system errors through PCIe Root Control. Aren't >> those sys errs the mechanism FW knows it has something to do, which >> means the OS can effectively fence it off? > > I think this is all implementation detail and doesn't necessarily apply > to all firmware-first implementation flavors. > > Assumptions are: > 1. both FW and OS are listening to MSI interrupts On hax86, I'm not sure FW can listen to MSI interrups. FW only exists in SMM, not ring 1-4. > 2. FW monitors the system errors > > Some FF implementation could route the AER interrupt to a higher privilege > level. Some other implementation could use INTx or a side-band channel > interrupt > for firmware-interrupt too. > > I have seen all 3 except MSI :) and also firmware never monitored the > system > error bits. I was curious if anybody ever used those legacy bits. Now, I > know > someone is using it.
On 11/20/2018 04:08 PM, Sinan Kaya wrote: > One version is: > "if HEST table is present, ignore _OSC" > or > Another version is: > "if HEST table is present, make sure that FW sets _OSC bit for AER to > false. Otherwise, warn like crazy that this BIOS is broken and needs > an update and can cause all sorts of trouble" ACPI 6.2 Ch 6.2.11.3, Table 6-197 PCI Express Advanced Error Reporting control The firmware sets this bit to 1 to grant control over PCI Express Advanced Error Reporting. If firmware allows the OS control of this feature, then in the context of the _OSC method it must ensure that error messages are routed to device interrupts as described in the PCI Express Base Specification. Additionally, after control is transferred to the OS, firmware must not modify the Advanced Error Reporting Capability. If control of this feature was requested and denied or was not requested, firmware returns this bit set to 0.
On 11/20/2018 04:08 PM, Sinan Kaya wrote: > I followed the ASWG thread yesterday. There will be a meeting next week to > discuss this. Any updates on the meeting?
On 11/27/2018 1:22 PM, Alex_Gagniuc@Dellteam.com wrote: > On 11/20/2018 04:08 PM, Sinan Kaya wrote: >> I followed the ASWG thread yesterday. There will be a meeting next week to >> discuss this. > > Any updates on the meeting? > > Tyler should be able to get an update.
On Tue, Nov 27, 2018 at 1:32 PM Sinan Kaya <okaya@kernel.org> wrote: > > On 11/27/2018 1:22 PM, Alex_Gagniuc@Dellteam.com wrote: > > On 11/20/2018 04:08 PM, Sinan Kaya wrote: > >> I followed the ASWG thread yesterday. There will be a meeting next week to > >> discuss this. > > > > Any updates on the meeting? > > > > > > Tyler should be able to get an update. As per Harb, the meetings are on Thursdays so it will be discussed this Thursday since last week was a holiday. Thanks, Tyler
On Thu, Nov 15, 2018 at 05:16:01PM -0600, Alexandru Gagniuc wrote: > Thanks to Keith for pointing out that it doesn't make sense to disable > AER services when only one device has a FIRMWARE_FIRST HEST. > > AER ownership is an interesting issue brought in by FFS (firmware-first) > model. In a nutshell if FFS handles AER, then OS should not touch any > of the AER bits. FW might set things up so that it receives AER > notifications via SMI. It's theoretically possible to receive SCIs, > but the exact mechanism is platform-dependent. OS touching AER bits > when firmware owns them may interfere with these notifications. > > The ACPI mechanism for negotiating control of AER is _OSC, and is > described in detail in ACPI 6.2 Ch. 6.2.11.3. _OSC is negotiated at > the root bus level. Any root port, switch, or endpoint under the bus > would have its AER ownership negotiated in one _OSC call. > > Then there is HEST, which is part of ACPI Platform Error Interfaces > (APEI). HEST tables describe the errors that FW may report to the OS. > A types 6,7 and 7 HEST tables describe AER errors from PCIe devices. > As part of this description, we're told if the error source is FFS. > > Information in HEST seems to be redundant, as each error reported by > FW will have a CPER table that describes it in detail. > > Because HEST describes an error source as firmware-first or not, we've > taken this to mean ownership of AER. Because AER ownership and error > reporting are coupled, _OSC and HEST usually agree on the matter of > ownership. However, that doesn't seem to be required by ACPI. > > I've asked around a few people at Dell and they unanimously agree that > _OSC is the correct way to determine ownership of AER. In linux, we > use the result of _OSC to enable AER services, but we use HEST to > determine AER ownership. That's inconsistent. This series drops the > use of HEST in favor of _OSC. > > [1] https://lkml.org/lkml/2018/11/15/62 > > Alexandru Gagniuc (2): > PCI/AER: Do not use APEI/HEST to disable AER services globally > PCI/AER: Determine AER ownership based on _OSC instead of HEST > > drivers/acpi/pci_root.c | 9 +---- > drivers/pci/pcie/aer.c | 82 ++-------------------------------------- > include/linux/pci-acpi.h | 6 --- > 3 files changed, 5 insertions(+), 92 deletions(-) I'm pretty sure we do need to do something here, but there was quite a lot of discussion that didn't seem to really get resolved, so I'm dropping these for now. Please repost them with any relevant updates and we'll see if we can get a consensus that we're going the right direction. Bjorn