Message ID | 20231206224231.732765-3-helgaas@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Bjorn Helgaas |
Series | PCI/AER: Clean up logging |
On Wed, 6 Dec 2023 16:42:30 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> From: Bjorn Helgaas <bhelgaas@google.com>
>
> When a device with AER detects an error, it logs error information in its
> own AER Error Status registers. It may send an Error Message to the Root
> Port (RCEC in the case of an RCiEP), which logs the fact that an Error
> Message was received (Root Error Status) and the Requester ID of the
> message source (Error Source Identification).
>
> aer_print_port_info() prints the Requester ID from the Root Port Error
> Source in the usual Linux "bb:dd.f" format, but when find_source_device()
> finds no error details in the hierarchy below the Root Port, it printed the
> raw Requester ID without decoding it.
>
> Decode the Requester ID in the usual Linux format so it matches other
> messages.
>
> Sample message changes:
>
> - pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
> - pcieport 0000:00:1c.5: AER: can't find device of ID00e5
> + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

LGTM

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

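For readers following the sample messages, the mapping from the raw "ID00e5" to "0000:00:1c.5" can be sketched in user space roughly as below. This is only an illustration, not code from the patch: `format_requester_id()` is a hypothetical helper, and the three macros are local copies of the kernel's `PCI_BUS_NUM()`, `PCI_SLOT()` and `PCI_FUNC()` definitions so the fragment builds outside the kernel tree. The domain is passed in by the caller because the Requester ID itself carries only bus, device and function.

```c
#include <stdint.h>
#include <stdio.h>

/* Local copies of the kernel macros, reproduced so this builds standalone. */
#define PCI_BUS_NUM(x)   (((x) >> 8) & 0xff)     /* upper byte: bus number */
#define PCI_SLOT(devfn)  (((devfn) >> 3) & 0x1f) /* bits 7:3: device number */
#define PCI_FUNC(devfn)  ((devfn) & 0x07)        /* bits 2:0: function number */

/*
 * Hypothetical helper (not a kernel function): format a 16-bit Requester ID
 * as "dddd:bb:dd.f". Returns the snprintf() result.
 */
static int format_requester_id(char *buf, size_t len,
			       unsigned int domain, uint16_t id)
{
	unsigned int bus = PCI_BUS_NUM(id);
	unsigned int devfn = id & 0xff;

	return snprintf(buf, len, "%04x:%02x:%02x.%u",
			domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
}
```

Calling `format_requester_id(buf, sizeof(buf), 0, 0x00e5)` fills `buf` with `0000:00:1c.5`: devfn 0xe5 splits into device 0x1c and function 5.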
LGTM

On 12/6/23 16:42, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> When a device with AER detects an error, it logs error information in its
> own AER Error Status registers. It may send an Error Message to the Root
> Port (RCEC in the case of an RCiEP), which logs the fact that an Error
> Message was received (Root Error Status) and the Requester ID of the
> message source (Error Source Identification).
>
> aer_print_port_info() prints the Requester ID from the Root Port Error
> Source in the usual Linux "bb:dd.f" format, but when find_source_device()
> finds no error details in the hierarchy below the Root Port, it printed the
> raw Requester ID without decoding it.
>
> Decode the Requester ID in the usual Linux format so it matches other
> messages.
>
> Sample message changes:
>
> - pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
> - pcieport 0000:00:1c.5: AER: can't find device of ID00e5
> + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/pcie/aer.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 20db80018b5d..2ff6bac9979f 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -740,7 +740,7 @@ static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  	u8 bus = info->id >> 8;
>  	u8 devfn = info->id & 0xff;
>
> -	pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n",
> +	pci_info(dev, "%s%s error message received from %04x:%02x:%02x.%d\n",
>  		 info->multi_error_valid ? "Multiple " : "",
>  		 aer_error_severity_string[info->severity],
>  		 pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn),
> @@ -929,7 +929,12 @@ static bool find_source_device(struct pci_dev *parent,
>  		pci_walk_bus(parent->subordinate, find_device_iter, e_info);
>
>  	if (!e_info->error_dev_num) {
> -		pci_info(parent, "can't find device of ID%04x\n", e_info->id);
> +		u8 bus = e_info->id >> 8;
> +		u8 devfn = e_info->id & 0xff;
> +
> +		pci_info(parent, "found no error details for %04x:%02x:%02x.%d\n",
> +			 pci_domain_nr(parent->bus), bus, PCI_SLOT(devfn),
> +			 PCI_FUNC(devfn));
>  		return false;
>  	}
>  	return true;

On 12/6/2023 2:42 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> When a device with AER detects an error, it logs error information in its
> own AER Error Status registers. It may send an Error Message to the Root
> Port (RCEC in the case of an RCiEP), which logs the fact that an Error
> Message was received (Root Error Status) and the Requester ID of the
> message source (Error Source Identification).
>
> aer_print_port_info() prints the Requester ID from the Root Port Error
> Source in the usual Linux "bb:dd.f" format, but when find_source_device()
> finds no error details in the hierarchy below the Root Port, it printed the
> raw Requester ID without decoding it.
>
> Decode the Requester ID in the usual Linux format so it matches other
> messages.
>
> Sample message changes:
>
> - pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
> - pcieport 0000:00:1c.5: AER: can't find device of ID00e5
> + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Except for the suggestion given below, it looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

> ---
>  drivers/pci/pcie/aer.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 20db80018b5d..2ff6bac9979f 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -740,7 +740,7 @@ static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  	u8 bus = info->id >> 8;
>  	u8 devfn = info->id & 0xff;
>
> -	pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n",
> +	pci_info(dev, "%s%s error message received from %04x:%02x:%02x.%d\n",
>  		 info->multi_error_valid ? "Multiple " : "",
>  		 aer_error_severity_string[info->severity],
>  		 pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn),
> @@ -929,7 +929,12 @@ static bool find_source_device(struct pci_dev *parent,
>  		pci_walk_bus(parent->subordinate, find_device_iter, e_info);
>
>  	if (!e_info->error_dev_num) {
> -		pci_info(parent, "can't find device of ID%04x\n", e_info->id);
> +		u8 bus = e_info->id >> 8;
> +		u8 devfn = e_info->id & 0xff;

You can use PCI_BUS_NUM(e_info->id) for getting bus number. Since
you are extracting this info in more than one place, maybe you can
also define a macro PCI_DEVFN(id) (following PCI_BUS_NUM()).

> +
> +		pci_info(parent, "found no error details for %04x:%02x:%02x.%d\n",
> +			 pci_domain_nr(parent->bus), bus, PCI_SLOT(devfn),
> +			 PCI_FUNC(devfn));
>  		return false;
>  	}
>  	return true;

On Tue, Jan 02, 2024 at 11:22:53AM -0800, Kuppuswamy Sathyanarayanan wrote:
> On 12/6/2023 2:42 PM, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> >
> > When a device with AER detects an error, it logs error information in its
> > own AER Error Status registers. It may send an Error Message to the Root
> > Port (RCEC in the case of an RCiEP), which logs the fact that an Error
> > Message was received (Root Error Status) and the Requester ID of the
> > message source (Error Source Identification).
> >
> > aer_print_port_info() prints the Requester ID from the Root Port Error
> > Source in the usual Linux "bb:dd.f" format, but when find_source_device()
> > finds no error details in the hierarchy below the Root Port, it printed the
> > raw Requester ID without decoding it.
> >
> > Decode the Requester ID in the usual Linux format so it matches other
> > messages.
> >
> > Sample message changes:
> >
> > - pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
> > - pcieport 0000:00:1c.5: AER: can't find device of ID00e5
> > + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5
> > + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5
> >
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> Except for the suggestion given below, it looks good to me.
>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Thanks for taking a look!

> > @@ -740,7 +740,7 @@ static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
> >  	u8 bus = info->id >> 8;
> >  	u8 devfn = info->id & 0xff;
> >
> > -	pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n",
> > +	pci_info(dev, "%s%s error message received from %04x:%02x:%02x.%d\n",
> >  		 info->multi_error_valid ? "Multiple " : "",
> >  		 aer_error_severity_string[info->severity],
> >  		 pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn),
> > @@ -929,7 +929,12 @@ static bool find_source_device(struct pci_dev *parent,
> >  		pci_walk_bus(parent->subordinate, find_device_iter, e_info);
> >
> >  	if (!e_info->error_dev_num) {
> > -		pci_info(parent, "can't find device of ID%04x\n", e_info->id);
> > +		u8 bus = e_info->id >> 8;
> > +		u8 devfn = e_info->id & 0xff;
>
> You can use PCI_BUS_NUM(e_info->id) for getting bus number. Since
> you are extracting this info in more than one place, maybe you can
> also define a macro PCI_DEVFN(id) (following PCI_BUS_NUM()).

Thanks, both good ideas. We already have a PCI_DEVFN() that
*combines* slot + func into devfn, so we'd have to come up with a
different name.

I'll add a patch to use PCI_BUS_NUM() in the two places here and in
pme.c. I think I'll wait with these until after the v6.7 release.

> > +		pci_info(parent, "found no error details for %04x:%02x:%02x.%d\n",
> > +			 pci_domain_nr(parent->bus), bus, PCI_SLOT(devfn),
> > +			 PCI_FUNC(devfn));
> >  		return false;
> >  	}
> >  	return true;
>
> --
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 20db80018b5d..2ff6bac9979f 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -740,7 +740,7 @@ static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
 	u8 bus = info->id >> 8;
 	u8 devfn = info->id & 0xff;
 
-	pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n",
+	pci_info(dev, "%s%s error message received from %04x:%02x:%02x.%d\n",
 		 info->multi_error_valid ? "Multiple " : "",
 		 aer_error_severity_string[info->severity],
 		 pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn),
@@ -929,7 +929,12 @@ static bool find_source_device(struct pci_dev *parent,
 		pci_walk_bus(parent->subordinate, find_device_iter, e_info);
 
 	if (!e_info->error_dev_num) {
-		pci_info(parent, "can't find device of ID%04x\n", e_info->id);
+		u8 bus = e_info->id >> 8;
+		u8 devfn = e_info->id & 0xff;
+
+		pci_info(parent, "found no error details for %04x:%02x:%02x.%d\n",
+			 pci_domain_nr(parent->bus), bus, PCI_SLOT(devfn),
+			 PCI_FUNC(devfn));
 		return false;
 	}
 	return true;
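
On the naming point raised in review, a small user-space sketch may help show why a devfn-extracting macro cannot reuse the `PCI_DEVFN()` name: the existing macro goes the other way, packing slot and function into a devfn. `PCI_DEVFN()` and `PCI_BUS_NUM()` below are local copies of the kernel definitions. `REQ_ID_DEVFN()` and `make_requester_id()` are made-up names used purely for illustration; the thread does not settle on a name for the new helper, and nothing here is from a posted patch.

```c
#include <stdint.h>

/* Local copies of existing kernel macros, reproduced to build standalone. */
#define PCI_DEVFN(slot, func)  ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_BUS_NUM(x)         (((x) >> 8) & 0xff)

/*
 * Hypothetical macro (name invented for this sketch): take the devfn byte
 * out of a 16-bit Requester ID, mirroring what PCI_BUS_NUM() does for the
 * bus byte.
 */
#define REQ_ID_DEVFN(id)       ((id) & 0xff)

/* Hypothetical helper: assemble a Requester ID from bus, slot and function. */
static uint16_t make_requester_id(uint8_t bus, uint8_t slot, uint8_t func)
{
	return (uint16_t)((bus << 8) | PCI_DEVFN(slot, func));
}
```

With these definitions, `make_requester_id(0x00, 0x1c, 5)` yields 0x00e5, the raw ID from the old sample message, and `PCI_BUS_NUM()` plus `REQ_ID_DEVFN()` split it back into bus 0x00 and devfn 0xe5.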