Message ID | 1369924769-17183-2-git-send-email-betty.dall@hp.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
[+cc Bob for ACPI HEST spec questions] On Thu, May 30, 2013 at 8:39 AM, Betty Dall <betty.dall@hp.com> wrote: > The function aer_hest_parse() is called to determine if the given > PCI device is firmware first or not. The code loops through each > section of the HEST table to look for a match. The bug is that > the function always returns whether the last HEST section is firmware > first. The fix stops the iteration once the info.firmware_first > variable is set. This is similar to how the function aer_hest_parse_aff() > stops the iteration. > > Signed-off-by: Betty Dall <betty.dall@hp.com> > --- > > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c > index 5194a7d..39b8671 100644 > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) > u8 bridge = 0; > int ff = 0; > > + if (info->firmware_first) > + return 0; > + > switch (hest_hdr->type) { > case ACPI_HEST_TYPE_AER_ROOT_PORT: > pcie_type = PCI_EXP_TYPE_ROOT_PORT; Not related directly to your patch, Betty, but I can't figure out why the ACPI spec defines the HEST structures for PCIe as it does. I'm looking at ACPI 5.0, sec 18.3.2.3 - 18.3.2.5. 1) The PCIe Root Port, PCIe Device, and PCIe/PCI-X Bridge structures all include Bus, Device, and Function fields. But there's no Segment. The current Linux code (hest_match_pci()) assumes HEST records can only apply to PCI domain 0. Is Linux missing something, or is the HEST really this limited? 2) Doesn't the fact that the HEST structures include a bus number mean that the OS cannot reassign bus numbers? I guess maybe we could still reassign them if we kept track of the mapping back to the original values. 3) It's interesting that the only choices seem to be "global" or "list every device." There's no "apply to this device and the subtree under it," so I guess if you want HEST to apply to hot-added devices, "global" is the only thing that makes sense? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 2013-06-01 at 18:38 -0600, Bjorn Helgaas wrote: > [+cc Bob for ACPI HEST spec questions] > > On Thu, May 30, 2013 at 8:39 AM, Betty Dall <betty.dall@hp.com> wrote: > > The function aer_hest_parse() is called to determine if the given > > PCI device is firmware first or not. The code loops through each > > section of the HEST table to look for a match. The bug is that > > the function always returns whether the last HEST section is firmware > > first. The fix stops the iteration once the info.firmware_first > > variable is set. This is similar to how the function aer_hest_parse_aff() > > stops the iteration. > > > > Signed-off-by: Betty Dall <betty.dall@hp.com> > > --- > > > > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++ > > 1 files changed, 3 insertions(+), 0 deletions(-) > > > > > > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c > > index 5194a7d..39b8671 100644 > > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c > > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c > > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) > > u8 bridge = 0; > > int ff = 0; > > > > + if (info->firmware_first) > > + return 0; > > + > > switch (hest_hdr->type) { > > case ACPI_HEST_TYPE_AER_ROOT_PORT: > > pcie_type = PCI_EXP_TYPE_ROOT_PORT; > > Not related directly to your patch, Betty, but I can't figure out why > the ACPI spec defines the HEST structures for PCIe as it does. I'm > looking at ACPI 5.0, sec 18.3.2.3 - 18.3.2.5. > 1) The PCIe Root Port, PCIe Device, and PCIe/PCI-X Bridge structures > all include Bus, Device, and Function fields. But there's no Segment. > The current Linux code (hest_match_pci()) assumes HEST records can > only apply to PCI domain 0. Is Linux missing something, or is the > HEST really this limited? You are right that the HEST table does not have the Segment for the PCIe sources. The Linux code uses the Generic Source type that points to a UEFI CPER record. Those do have the Segment. The code in acpi/apei/ghes.c that parses the HEST and invokes the aer_recover_queue() is using the segment from the CPER record. The hest_match_pci() code is only looking at the PCIe error source types because that is where the flags field is defined to indicate "firmware first". Flags is not defined in the Generic error source. Firmware first platforms today must be using GLOBAL error sources because the code is written in such a way that any specific PCIe match will cause the entire platform to be firmware first. Linux only supports either AER or firmware first right now, and it would be an enhancement to do both. > 2) Doesn't the fact that the HEST structures include a bus number mean > that the OS cannot reassign bus numbers? I guess maybe we could still > reassign them if we kept track of the mapping back to the original > values. Yes, I think there needs to be a mapping between the HEST/CPER bus number and the OS assigned bus numbers, if the OS is reassigning bus numbers. > 3) It's interesting that the only choices seem to be "global" or "list > every device." There's no "apply to this device and the subtree under > it," so I guess if you want HEST to apply to hot-added devices, > "global" is the only thing that makes sense? "Global" is the one one that makes sense to me too. The size of the HEST table is fixed at boot and firmware strives to keep it as small as possible because firmware memory is a scarce resource. Betty > Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 30, 2013 at 08:39:27AM -0600, Betty Dall wrote: > Date: Thu, 30 May 2013 08:39:27 -0600 > From: Betty Dall <betty.dall@hp.com> > To: rjw@sisk.pl, bhelgaas@google.com > Cc: ying.huang@intel.com, linux-acpi@vger.kernel.org, > linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Betty Dall > <betty.dall@hp.com> > Subject: [PATCH v2 1/3] PCI/AER: Fix incorrect return from aer_hest_parse() > X-Mailer: git-send-email 1.7.7.6 > > The function aer_hest_parse() is called to determine if the given > PCI device is firmware first or not. The code loops through each > section of the HEST table to look for a match. The bug is that > the function always returns whether the last HEST section is firmware > first. The fix stops the iteration once the info.firmware_first > variable is set. This is similar to how the function aer_hest_parse_aff() > stops the iteration. > > Signed-off-by: Betty Dall <betty.dall@hp.com> The patch is good. But I have further concern based on your patch. 1) aer_hest_parse never checks the 2nd input parameter (void *data), which means once it is NULL, it will crash the kernel. 2) both aer_hest_parse and aer_hest_parse_aff utilize some flag as shortcut, if so, why not adding similar logic in apei_hest_parse to stop meaningless loops once FFM is confirmed as enabled. > --- > > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c > index 5194a7d..39b8671 100644 > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) > u8 bridge = 0; > int ff = 0; > > + if (info->firmware_first) > + return 0; > + > switch (hest_hdr->type) { > case ACPI_HEST_TYPE_AER_ROOT_PORT: > pcie_type = PCI_EXP_TYPE_ROOT_PORT; > -- > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jun 3, 2013 at 3:18 PM, Betty Dall <betty.dall@hp.com> wrote: > On Sat, 2013-06-01 at 18:38 -0600, Bjorn Helgaas wrote: >> [+cc Bob for ACPI HEST spec questions] >> >> On Thu, May 30, 2013 at 8:39 AM, Betty Dall <betty.dall@hp.com> wrote: >> > The function aer_hest_parse() is called to determine if the given >> > PCI device is firmware first or not. The code loops through each >> > section of the HEST table to look for a match. The bug is that >> > the function always returns whether the last HEST section is firmware >> > first. The fix stops the iteration once the info.firmware_first >> > variable is set. This is similar to how the function aer_hest_parse_aff() >> > stops the iteration. >> > >> > Signed-off-by: Betty Dall <betty.dall@hp.com> >> > --- >> > >> > drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++ >> > 1 files changed, 3 insertions(+), 0 deletions(-) >> > >> > >> > diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c >> > index 5194a7d..39b8671 100644 >> > --- a/drivers/pci/pcie/aer/aerdrv_acpi.c >> > +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c >> > @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) >> > u8 bridge = 0; >> > int ff = 0; >> > >> > + if (info->firmware_first) >> > + return 0; >> > + >> > switch (hest_hdr->type) { >> > case ACPI_HEST_TYPE_AER_ROOT_PORT: >> > pcie_type = PCI_EXP_TYPE_ROOT_PORT; >> >> Not related directly to your patch, Betty, but I can't figure out why >> the ACPI spec defines the HEST structures for PCIe as it does. I'm >> looking at ACPI 5.0, sec 18.3.2.3 - 18.3.2.5. > >> 1) The PCIe Root Port, PCIe Device, and PCIe/PCI-X Bridge structures >> all include Bus, Device, and Function fields. But there's no Segment. >> The current Linux code (hest_match_pci()) assumes HEST records can >> only apply to PCI domain 0. Is Linux missing something, or is the >> HEST really this limited? > You are right that the HEST table does not have the Segment for the PCIe > sources. The Linux code uses the Generic Source type that points to a > UEFI CPER record. Those do have the Segment. The code in > acpi/apei/ghes.c that parses the HEST and invokes the > aer_recover_queue() is using the segment from the CPER record. My question has nothing to do with CPER. Drivers can enable error reporting for their devices with pci_enable_pcie_error_reporting(). If the device is marked "firmware-first" in the HEST, this call fails without enabling reporting. The HEST can only mark devices in domain 0 as "firmware-first." Therefore pci_enable_pcie_error_reporting() will enable reporting for: - domain 0 devices not marked "firmware-first" and - all devices in other domains. This inconsistency seems like a hole in the spec, but maybe I'm missing something. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pcie/aer/aerdrv_acpi.c b/drivers/pci/pcie/aer/aerdrv_acpi.c index 5194a7d..39b8671 100644 --- a/drivers/pci/pcie/aer/aerdrv_acpi.c +++ b/drivers/pci/pcie/aer/aerdrv_acpi.c @@ -42,6 +42,9 @@ static int aer_hest_parse(struct acpi_hest_header *hest_hdr, void *data) u8 bridge = 0; int ff = 0; + if (info->firmware_first) + return 0; + switch (hest_hdr->type) { case ACPI_HEST_TYPE_AER_ROOT_PORT: pcie_type = PCI_EXP_TYPE_ROOT_PORT;
The function aer_hest_parse() is called to determine if the given PCI device is firmware first or not. The code loops through each section of the HEST table to look for a match. The bug is that the function always returns whether the last HEST section is firmware first. The fix stops the iteration once the info.firmware_first variable is set. This is similar to how the function aer_hest_parse_aff() stops the iteration. Signed-off-by: Betty Dall <betty.dall@hp.com> --- drivers/pci/pcie/aer/aerdrv_acpi.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html