Message ID | f7960a4dee0e417eedd7d2e031d04ac9016c6686.1634825082.git.naveennaidu479@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Series | Unify PCI error response checking |
On Thursday 21 October 2021 20:37:26 Naveen Naidu wrote:
> An MMIO read from a PCI device that doesn't exist or doesn't respond
> causes a PCI error. There's no real data to return to satisfy the
> CPU read, so most hardware fabricates ~0 data.
>
> Add a PCI_ERROR_RESPONSE definition for that and use it where
> appropriate to make these checks consistent and easier to find.
>
> Also add helper definitions SET_PCI_ERROR_RESPONSE and
> RESPONSE_IS_PCI_ERROR to make the code more readable.
>
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>

Reviewed-by: Pali Rohár <pali@kernel.org>

> ---
>  include/linux/pci.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..689c8277c584 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
>  /* The number of legacy PCI INTx interrupts */
>  #define PCI_NUM_INTX 4
>
> +/*
> + * Reading from a device that doesn't respond typically returns ~0. A
> + * successful read from a device may also return ~0, so you need additional
> + * information to reliably identify errors.
> + */
> +#define PCI_ERROR_RESPONSE (~0ULL)
> +#define SET_PCI_ERROR_RESPONSE(val) (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
> +
>  /*
>   * pci_power_t values must match the bits in the Capabilities PME_Support
>   * and Control/Status PowerState fields in the Power Management capability.
> --
> 2.25.1
>
On Thu, Oct 21, 2021 at 08:37:26PM +0530, Naveen Naidu wrote:
> An MMIO read from a PCI device that doesn't exist or doesn't respond
> causes a PCI error. There's no real data to return to satisfy the
> CPU read, so most hardware fabricates ~0 data.
>
> Add a PCI_ERROR_RESPONSE definition for that and use it where
> appropriate to make these checks consistent and easier to find.
>
> Also add helper definitions SET_PCI_ERROR_RESPONSE and
> RESPONSE_IS_PCI_ERROR to make the code more readable.
>
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> ---
>  include/linux/pci.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index cd8aa6fce204..689c8277c584 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
>  /* The number of legacy PCI INTx interrupts */
>  #define PCI_NUM_INTX 4
>
> +/*
> + * Reading from a device that doesn't respond typically returns ~0. A
> + * successful read from a device may also return ~0, so you need additional
> + * information to reliably identify errors.
> + */
> +#define PCI_ERROR_RESPONSE (~0ULL)
> +#define SET_PCI_ERROR_RESPONSE(val) (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))

Beautiful! I really like this.

I would prefer the macros to start with "PCI_", e.g.,
PCI_SET_ERROR_RESPONSE().

I think "RESPONSE_IS_PCI_ERROR()" is too strong because (as the
comment says), ~0 *may* indicate an error. Or it may be a successful
read of a register that happens to contain ~0.

Possibilities to convey the idea that this isn't definitive:

  PCI_POSSIBLE_ERROR_RESPONSE(val)  # a little long
  PCI_LIKELY_ERROR(val)             # we really have no idea whether
  PCI_PROBABLE_ERROR(val)           # likely or probable
  PCI_POSSIBLE_ERROR(val)           # promising?

Can you rebase to my "main" branch (v5.16-rc1), tweak the above, and
collect up the acks/reviews?

We should also browse drivers outside drivers/pci for places we could
use these. Not necessarily as part of this series, although if
authors there object, it would be good to learn that earlier than
later.

Drivers that implement pci_error_handlers might be a fruitful place to
start. But you've done a great job finding users of ~0 and 0xffff...
in drivers/pci/, too.

> +
>  /*
>   * pci_power_t values must match the bits in the Capabilities PME_Support
>   * and Control/Status PowerState fields in the Power Management capability.
> --
> 2.25.1
>
> _______________________________________________
> Linux-kernel-mentees mailing list
> Linux-kernel-mentees@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
On 17/11, Bjorn Helgaas wrote:
> On Thu, Oct 21, 2021 at 08:37:26PM +0530, Naveen Naidu wrote:
> > An MMIO read from a PCI device that doesn't exist or doesn't respond
> > causes a PCI error. There's no real data to return to satisfy the
> > CPU read, so most hardware fabricates ~0 data.
> >
> > Add a PCI_ERROR_RESPONSE definition for that and use it where
> > appropriate to make these checks consistent and easier to find.
> >
> > Also add helper definitions SET_PCI_ERROR_RESPONSE and
> > RESPONSE_IS_PCI_ERROR to make the code more readable.
> >
> > Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> > Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
> > ---
> >  include/linux/pci.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index cd8aa6fce204..689c8277c584 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -154,6 +154,15 @@ enum pci_interrupt_pin {
> >  /* The number of legacy PCI INTx interrupts */
> >  #define PCI_NUM_INTX 4
> >
> > +/*
> > + * Reading from a device that doesn't respond typically returns ~0. A
> > + * successful read from a device may also return ~0, so you need additional
> > + * information to reliably identify errors.
> > + */
> > +#define PCI_ERROR_RESPONSE (~0ULL)
> > +#define SET_PCI_ERROR_RESPONSE(val) (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
> > +#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
>
> Beautiful! I really like this.
>

Thank you very much for the review ^^

> I would prefer the macros to start with "PCI_", e.g.,
> PCI_SET_ERROR_RESPONSE().
>

ACK

> I think "RESPONSE_IS_PCI_ERROR()" is too strong because (as the
> comment says), ~0 *may* indicate an error. Or it may be a successful
> read of a register that happens to contain ~0.
>
> Possibilities to convey the idea that this isn't definitive:
>
>   PCI_POSSIBLE_ERROR_RESPONSE(val)  # a little long
>   PCI_LIKELY_ERROR(val)             # we really have no idea whether
>   PCI_PROBABLE_ERROR(val)           # likely or probable
>   PCI_POSSIBLE_ERROR(val)           # promising?
>

ACK. Will use PCI_POSSIBLE_ERROR()

> Can you rebase to my "main" branch (v5.16-rc1), tweak the above, and
> collect up the acks/reviews?
>

ACK

> We should also browse drivers outside drivers/pci for places we could
> use these. Not necessarily as part of this series, although if
> authors there object, it would be good to learn that earlier than
> later.
>
> Drivers that implement pci_error_handlers might be a fruitful place to
> start. But you've done a great job finding users of ~0 and 0xffff...
> in drivers/pci/, too.
>

A quick grep showed that there are around 80 drivers which have
pci_error_handlers. I was thinking that it would be better if we handle
these drivers in another patch series since the current patch series is
itself 25 patches long. And in my short tenure reading LKML, I gathered
that folks generally are not so kind to a long list of patches in a
single patch series ^^' (I might be wrong though, Apologies)

The consensus on the patch series does seem slightly positive so
ideally, I was hoping that we would not have the case where a author
does not like the way we are handling this patch. Then again, I'm pretty
sure that I might be wrong ^^'

I hope it would be okay that I send in a new patch series with the
suggested changes and handle the other changes in another patch series ^^

Thanks,
Naveen

> > +
> >  /*
> >   * pci_power_t values must match the bits in the Capabilities PME_Support
> >   * and Control/Status PowerState fields in the Power Management capability.
> > --
> > 2.25.1
> >
> > _______________________________________________
> > Linux-kernel-mentees mailing list
> > Linux-kernel-mentees@lists.linuxfoundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
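
The renaming agreed in the review (PCI_SET_ERROR_RESPONSE() and PCI_POSSIBLE_ERROR()) and the reason for the softer name can be illustrated with a small userspace sketch. It assumes the macro bodies stay exactly as posted and only the names change; `struct fake_dev`, `fake_readl()` and `fake_read_failed()` are hypothetical helpers invented for the illustration and are not kernel APIs. `__typeof__` is the GNU C spelling of the kernel's `typeof`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Names follow the review; bodies are assumed unchanged from the patch. */
#define PCI_ERROR_RESPONSE		(~0ULL)
#define PCI_SET_ERROR_RESPONSE(val)	(*(val) = ((__typeof__(*(val))) PCI_ERROR_RESPONSE))
#define PCI_POSSIBLE_ERROR(val)		((val) == ((__typeof__(val)) PCI_ERROR_RESPONSE))

/* Hypothetical device model, for illustration only. */
struct fake_dev {
	bool present;	/* stands in for the "additional information" */
	uint32_t reg;	/* register contents when the device responds */
};

/* Models a 32-bit read: an absent device yields the fabricated ~0 data. */
static inline uint32_t fake_readl(const struct fake_dev *dev)
{
	uint32_t val;

	if (!dev->present) {
		PCI_SET_ERROR_RESPONSE(&val);
		return val;
	}
	return dev->reg;
}

/* ~0 alone is only a hint; combine it with independent state to decide. */
static inline bool fake_read_failed(const struct fake_dev *dev, uint32_t val)
{
	return PCI_POSSIBLE_ERROR(val) && !dev->present;
}

/* Both reads look identical to PCI_POSSIBLE_ERROR(); only one failed. */
static inline void possible_error_selfcheck(void)
{
	struct fake_dev gone = { .present = false, .reg = 0 };
	struct fake_dev all_ones = { .present = true, .reg = 0xffffffffu };

	assert(PCI_POSSIBLE_ERROR(fake_readl(&gone)));
	assert(PCI_POSSIBLE_ERROR(fake_readl(&all_ones)));
	assert(fake_read_failed(&gone, fake_readl(&gone)));
	assert(!fake_read_failed(&all_ones, fake_readl(&all_ones)));
}
```

A working device whose register really contains 0xffffffff passes the same check as a missing device, which is why "possible" describes the result better than "is error".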
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cd8aa6fce204..689c8277c584 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -154,6 +154,15 @@ enum pci_interrupt_pin {
 /* The number of legacy PCI INTx interrupts */
 #define PCI_NUM_INTX 4

+/*
+ * Reading from a device that doesn't respond typically returns ~0. A
+ * successful read from a device may also return ~0, so you need additional
+ * information to reliably identify errors.
+ */
+#define PCI_ERROR_RESPONSE (~0ULL)
+#define SET_PCI_ERROR_RESPONSE(val) (*(val) = ((typeof(*(val))) PCI_ERROR_RESPONSE))
+#define RESPONSE_IS_PCI_ERROR(val) ((val) == ((typeof(val)) PCI_ERROR_RESPONSE))
+
 /*
  * pci_power_t values must match the bits in the Capabilities PME_Support
  * and Control/Status PowerState fields in the Power Management capability.
An MMIO read from a PCI device that doesn't exist or doesn't respond
causes a PCI error. There's no real data to return to satisfy the
CPU read, so most hardware fabricates ~0 data.

Add a PCI_ERROR_RESPONSE definition for that and use it where
appropriate to make these checks consistent and easier to find.

Also add helper definitions SET_PCI_ERROR_RESPONSE and
RESPONSE_IS_PCI_ERROR to make the code more readable.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
---
 include/linux/pci.h | 9 +++++++++
 1 file changed, 9 insertions(+)
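
As a rough illustration of why the helpers cast through typeof, the following userspace sketch copies the three definitions as posted and applies them to 16-bit and 32-bit values. It is a sketch under stated assumptions, not kernel code: `__typeof__` replaces the kernel's `typeof` so it builds outside the kernel, and `fake_read_word()`, `fake_read_dword()`, `word_looks_like_error()` and `naive_word_check()` are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Definitions as posted in the patch, with __typeof__ for userspace. */
#define PCI_ERROR_RESPONSE		(~0ULL)
#define SET_PCI_ERROR_RESPONSE(val)	(*(val) = ((__typeof__(*(val))) PCI_ERROR_RESPONSE))
#define RESPONSE_IS_PCI_ERROR(val)	((val) == ((__typeof__(val)) PCI_ERROR_RESPONSE))

/* Hypothetical accessors modelling reads from a device that never answers. */
static inline void fake_read_word(uint16_t *val)
{
	SET_PCI_ERROR_RESPONSE(val);	/* cast truncates ~0ULL to 0xffff */
}

static inline void fake_read_dword(uint32_t *val)
{
	SET_PCI_ERROR_RESPONSE(val);	/* cast truncates ~0ULL to 0xffffffff */
}

/* The typeof cast makes the comparison happen at the width of val. */
static inline int word_looks_like_error(uint16_t val)
{
	return RESPONSE_IS_PCI_ERROR(val);
}

/* Without the cast, a 16-bit 0xffff never equals the 64-bit constant. */
static inline int naive_word_check(uint16_t val)
{
	return (unsigned long long)val == PCI_ERROR_RESPONSE;
}

static inline void width_selfcheck(void)
{
	uint16_t w = 0;
	uint32_t d = 0;

	fake_read_word(&w);
	fake_read_dword(&d);
	assert(w == 0xffff);
	assert(d == 0xffffffffu);
	assert(word_looks_like_error(w));
	assert(!naive_word_check(w));
}
```

One 64-bit constant can therefore serve 8-, 16-, 32- and 64-bit accessors, because each macro narrows it to the type of its argument before storing or comparing.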